Free AI

Run Local AI Models

Run OpenClaw (formerly Moltbot) completely free with local AI models. Use Ollama, LM Studio, or any OpenAI-compatible server.

Why Run Local Models?
  • Completely Free - No API costs, no subscription fees, forever
  • Total Privacy - All data stays on your machine, never leaves
  • Works Offline - Use OpenClaw without internet after setup
  • No Rate Limits - Send as many messages as your hardware allows
  • Model Choice - Run any open model, switch anytime

Local AI Runners

Ollama

Recommended

Most popular and easiest to set up. Available on macOS, Linux, and Windows.

https://ollama.ai

Default port: 11434

LM Studio

GUI-based with model browser. Great for beginners on any platform.

https://lmstudio.ai

Default port: 1234

LocalAI

OpenAI-compatible API. Good for Docker deployments.

https://localai.io

Default port: 8080

llama.cpp server

Direct llama.cpp HTTP server. Maximum control and performance.

https://github.com/ggerganov/llama.cpp

Default port: 8080

vLLM

High-throughput inference server. Best for production deployments.

https://vllm.ai

Default port: 8000
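Each runner listens on a local TCP port, so you can check which one is actually running by probing the defaults listed above. A minimal sketch (this helper is illustrative, not part of OpenClaw):

```python
import socket

def runner_listening(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the default ports of the runners above
for name, port in [("Ollama", 11434), ("LM Studio", 1234),
                   ("LocalAI / llama.cpp", 8080), ("vLLM", 8000)]:
    status = "listening" if runner_listening(port) else "not detected"
    print(f"{name} (:{port}): {status}")
```

Note that LocalAI and the llama.cpp server share the same default port, so only one of them can use 8080 at a time.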

Recommended Models

Llama 3.2 3B

llama3.2:3b

RAM: 4 GB

Quality: Good for simple tasks

Speed: Very Fast

Llama 3.1 8B

Recommended
llama3.1:8b

RAM: 8 GB

Quality: Best balance

Speed: Fast

Mistral 7B

mistral:7b

RAM: 8 GB

Quality: Great for coding

Speed: Fast

Qwen2.5 14B

qwen2.5:14b

RAM: 16 GB

Quality: Strong reasoning

Speed: Medium

Llama 3.1 70B

llama3.1:70b

RAM: 48 GB

Quality: Near GPT-4

Speed: Slow

Quick Setup with Ollama

1. Install Ollama

curl -fsSL https://ollama.ai/install.sh | sh

For Windows/Mac GUI: Download from ollama.ai

2. Download a Model

ollama pull llama3.1:8b

3. Configure OpenClaw

{
  "agent": {
    "provider": "ollama",
    "model": "llama3.1:8b",
    "baseUrl": "http://localhost:11434"
  }
}
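With the config above in place, you can sanity-check the endpoint outside OpenClaw. Ollama also serves an OpenAI-style API under /v1, so a plain HTTP request works; here is a minimal sketch using only the Python standard library (the helper names are ours, not OpenClaw's):

```python
import json
import urllib.request

def build_chat_request(prompt: str,
                       model: str = "llama3.1:8b",
                       base_url: str = "http://localhost:11434") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for Ollama's /v1 endpoint."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(prompt: str, **kwargs) -> str:
    """Send the request; requires a running Ollama server."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

If the server is up, `chat("Hello")` should return a completion from the local model.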

LM Studio Configuration

LM Studio provides an OpenAI-compatible API:

{
  "agent": {
    "provider": "openai-compatible",
    "model": "local-model",
    "baseUrl": "http://localhost:1234/v1",
    "apiKey": "not-needed"
  }
}

Start LM Studio, load a model, and enable the server in Settings > Local Server.

Model Failover

Configure multiple local models with automatic failover:

{
  "agent": {
    "provider": "ollama",
    "model": "llama3.1:70b",
    "baseUrl": "http://localhost:11434",
    "fallbackModels": [
      "llama3.1:8b",
      "mistral:7b"
    ],
    "fallbackOnError": true,
    "fallbackOnTimeout": true,
    "timeout": 30000
  }
}

  • fallbackModels - List of backup models to try
  • fallbackOnError - Switch on model errors
  • fallbackOnTimeout - Switch if model is too slow
  • timeout - Timeout in milliseconds before failover
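Together, these options describe a simple try-in-order loop. A sketch of that behavior (illustrative logic only — `ask` stands in for whatever call is made to the model; OpenClaw's actual implementation may differ):

```python
import concurrent.futures

def ask_with_failover(models, ask, timeout_ms=30000):
    """Try each model in order, moving on when a call errors or exceeds the timeout."""
    errors = {}
    for model in models:
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        try:
            # timeout / fallbackOnTimeout: abandon a model that is too slow
            return pool.submit(ask, model).result(timeout=timeout_ms / 1000)
        except concurrent.futures.TimeoutError:
            errors[model] = "timeout"
        except Exception as exc:
            # fallbackOnError: any failure moves us to the next fallback model
            errors[model] = str(exc)
        finally:
            pool.shutdown(wait=False)
    raise RuntimeError(f"all models failed: {errors}")
```

With the config above, the 70B model is tried first, then the 8B, then Mistral.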

Hybrid: Local + Cloud Fallback

Use local models primarily, fall back to cloud for complex tasks:

{
  "agent": {
    "provider": "ollama",
    "model": "llama3.1:8b",
    "baseUrl": "http://localhost:11434",
    "cloudFallback": {
      "provider": "anthropic",
      "model": "claude-3-5-sonnet-20241022",
      "apiKey": "sk-ant-...",
      "triggerOn": ["complex_reasoning", "long_context"]
    }
  }
}

Stay mostly free with local AI, and use the cloud only when needed.
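The routing decision amounts to a predicate over each request. A sketch of how such triggers might work (the heuristics below are our own placeholders — the actual triggerOn conditions are evaluated by OpenClaw):

```python
def needs_cloud(prompt, history_tokens,
                triggers=("complex_reasoning", "long_context")):
    """Placeholder heuristics for the triggerOn conditions above."""
    if "long_context" in triggers and history_tokens > 8000:
        return True
    if "complex_reasoning" in triggers and any(
        kw in prompt.lower() for kw in ("prove", "step by step", "analyze")
    ):
        return True
    return False

def route(prompt, history_tokens, ask_local, ask_cloud):
    """Prefer the free local model; escalate only when a trigger fires."""
    return ask_cloud(prompt) if needs_cloud(prompt, history_tokens) else ask_local(prompt)
```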

Hardware Requirements
Model Size    | RAM Needed | Best For
3B models     | 4 GB       | Basic tasks, older hardware
7-8B models   | 8 GB       | Most users, good balance
13-14B models | 16 GB      | Better quality, modern laptops
70B models    | 48+ GB     | Near cloud quality, high-end

Apple Silicon Macs are particularly efficient due to unified memory.
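These figures follow a rough rule of thumb: a 4-bit quantized model needs about half a byte per parameter for its weights, plus headroom for the KV cache and runtime. A sketch of that estimate (the 1.5x overhead factor is our assumption, not an official formula, and the table lists total system RAM, so its numbers run higher):

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: int = 4,
                    overhead: float = 1.5) -> float:
    """Rough memory footprint of a quantized model: weights plus runtime headroom."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

# e.g. an 8B model at 4-bit quantization needs roughly 6 GB
```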

For detailed local model configuration, see the OpenClaw Local Models Documentation.

Local AI Running!

Now connect your messaging apps and chat for free.