Install
docker model run ai/gemma3
$
Docker Model Runner pulls, runs, and serves LLMs locally via the docker model CLI, sourcing models from Docker Hub, OCI registries, or Hugging Face and exposing them through OpenAI- and Ollama-compatible APIs. It can package GGUF and Safetensors files as OCI artifacts, and lets you tune settings like context size. Models are cached, loaded only at runtime, and unloaded when idle. Use it for local inference without a cloud provider.