121 sats \ 1 reply \ @carter OP 24 Aug \ on: xAI releases its grok-2 model open source AI
I really need to stop being so lazy and figure out how to run these myself instead of just waiting for ollama to implement it.
What I do to simply test a `safetensors` model (though this one is huge, so you need proper hardware) through HF/torch:

env, prereqs and model download:
```
uv venv
. .venv/bin/activate
uv pip install torch transformers accelerate
# optional: hf auth login
hf download "<org/repo>"  # e.g. "google/gemma-3-270m-it"
```
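Since something grok-2 sized is huge, it's worth confirming what accelerator (if any) torch can actually see before you bother downloading; `device_map="auto"` will quietly fall back to CPU otherwise. A minimal check with plain torch (nothing model-specific):

```python
import torch

# what will accelerate's device_map="auto" actually find?
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1e9:.1f} GB VRAM")
elif torch.backends.mps.is_available():
    print("Apple MPS backend available")
else:
    print("no GPU detected, generation will run on CPU (slow for big models)")
```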
example usage:
```python
from transformers import pipeline

model_name = "google/gemma-3-270m-it"  # org/repo format, same as used in hf download

chat = [
    {"role": "system", "content": "You're a helpful assistant."},
    {"role": "user", "content": "Explain consciousness in simple, concise terms."},
]

# device_map="auto" lets accelerate place the weights on GPU/CPU as available
pipe = pipeline(task="text-generation", model=model_name, device_map="auto")
response = pipe(chat, max_new_tokens=512)

# the pipeline returns the full chat history; the last message is the assistant's reply
print(response[0]["generated_text"][-1]["content"])
```
run it with `uv run yourfile.py`, example output:

```
% uv run test.py
Consciousness is the state of being aware of yourself and your surroundings.
It's like having a personal identity and internal world.
```
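If you want more control than the pipeline gives you (dtype, tokenizer settings, generation params), the lower-level transformers API works the same way. A minimal sketch, assuming the same gemma model; `apply_chat_template` and `generate` are standard transformers calls, but check the model card for the exact chat format a given model expects:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "google/gemma-3-270m-it"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

chat = [
    {"role": "system", "content": "You're a helpful assistant."},
    {"role": "user", "content": "Explain consciousness in simple, concise terms."},
]

# render the chat with the model's own template and tokenize in one step
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)

# drop the prompt tokens and decode only the newly generated reply
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```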