Ollama is the easiest tool for getting started running LLMs on your own hardware. In my first video, I explore how to use Ollama to download popular models like Phi and Mistral, chat with them directly in the terminal, query them over the built-in HTTP API, and finally customize our own model, based on Phi, to be more fun to talk to.
# Some ollama commands
ollama --help                                 # list available commands and options
ollama run phi                                # download Phi (if needed) and start an interactive chat
ollama show phi --modelfile                   # print the Modelfile the model was built from
ollama create arr-phi --file arr-modelfile    # build a custom model from a Modelfile
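If you want to grab a model like Mistral ahead of time without starting a chat, the standard pull and list commands cover that:
ollama pull mistral    # download the model without running it
ollama list            # list models installed locally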
# ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "what is water made of?",
  "stream": false
}'
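To pull just the generated text out of that JSON response (with "stream": false the whole reply should arrive in a single "response" field), the output can be piped through jq, assuming jq is installed:
curl -s http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "what is water made of?",
  "stream": false
}' | jq -r '.response'    # print only the model's answer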
# copy the modelfile so we can customize it (edited result below)
ollama show phi --modelfile > arr-modelfile
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this one, replace the FROM line with:
# FROM arr-phi:latest
FROM /Users/decoder/.ollama/models/blobs/sha256:04778965089b91318ad61d0995b7e44fad4b9a9f4e049d7be90932bf8812e828
TEMPLATE """{{ if .System }}System: {{ .System }}{{ end }}
User: {{ .Prompt }}
Assistant:"""
SYSTEM """A chat between a curious user and an artificial intelligence assistant that speaks like a pirate. The assistant gives helpful answers to the user's questions."""
PARAMETER stop "User:"
PARAMETER stop "Assistant:"
PARAMETER stop "System:"