LinuxCommandLibrary

ollama

runs large language models locally

TLDR

Run a model interactively
$ ollama run [llama3]
Run a model with a prompt
$ ollama run [llama3] "[What is the capital of France?]"
List installed models
$ ollama list
Pull a model
$ ollama pull [mistral]
Show model info
$ ollama show [llama3]
List running models
$ ollama ps
Remove a model
$ ollama rm [model_name]
Start the API server
$ ollama serve

SYNOPSIS

ollama [command] [options]

DESCRIPTION

ollama runs large language models locally. It handles model downloads, serving via a REST API, and interactive chat sessions. It supports a wide range of open models, including Llama, Mistral, Gemma, Phi, Qwen, DeepSeek, and others; models are pulled from the Ollama registry and cached locally.

The API server provides OpenAI-compatible endpoints for chat completions, embeddings, and model management. Custom models can be created using Modelfiles that specify base models, system prompts, parameters, and adapter layers.
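As a sketch of the OpenAI-compatible API, the request below targets the chat completions endpoint on the default port. It assumes a server started with `ollama serve` and a pulled llama3 model; the model name and prompt are illustrative.

```shell
# Compose a chat request for Ollama's OpenAI-compatible endpoint.
BODY='{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'
echo "$BODY"
# Uncomment to send the request once the server is running:
# curl -s http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d "$BODY"
```

Ollama also exposes its own native endpoints (such as /api/chat) alongside the OpenAI-compatible ones.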

PARAMETERS

run MODEL [PROMPT]
Run model interactively or with a one-off prompt.
pull MODEL
Download model from registry.
push MODEL
Push model to registry.
list (or ls)
List locally available models.
show MODEL
Show model information (architecture, parameters, license).
ps
List currently running models.
stop MODEL
Stop a running model.
rm MODEL
Remove a local model.
cp SOURCE DESTINATION
Copy a model locally under a new name.
serve
Start the Ollama API server (default port 11434).
create NAME -f MODELFILE
Create a custom model from a Modelfile.
--help
Display help information.

CAVEATS

Requires sufficient RAM/VRAM depending on model size. GPU acceleration is supported (NVIDIA, AMD, Apple Silicon). The API server listens on localhost:11434 by default; configure with OLLAMA_HOST environment variable.
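For example, to bind the server to all interfaces instead of localhost (a sketch; the address and port are illustrative):

```
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```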

HISTORY

Ollama was created by Jeffrey Morgan and first released in 2023. Built on llama.cpp, it simplifies the process of downloading, running, and managing open-source language models locally. The project quickly gained popularity as interest in running LLMs without cloud APIs grew.
