Project Overview
ClawLama is a CPU-optimized, multi-arch (amd64/arm64), single-container AI assistant that bundles OpenClaw and Ollama into one self-contained Docker image. Zero-cost, fully local, privacy-first — GPU accelerated when available, fully functional without it.
Image: docker.io/casjaysdevdocker/clawlama:latest
Base: debian:bookworm-slim
Platforms: linux/amd64, linux/arm64
Single container: Both Ollama and OpenClaw run inside one image, no compose required.
Source Reference: Based on iam-veeramalla's OpenClaw + Ollama guide.
Problem Statement
Running OpenClaw with a local Ollama model requires manual multi-step setup: installing OpenClaw, installing Ollama, pulling a model, writing a JSON config, and wiring everything together. ClawLama eliminates this friction by packaging everything into a single container — just docker run.
Architecture
┌───────────────────────────────────────────────────────────┐
│ docker.io/casjaysdevdocker/clawlama:latest │
│ debian:bookworm-slim | linux/amd64, linux/arm64 │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ entrypoint.sh │ │
│ │ • Detect arch (amd64/arm64) + GPU (nvidia-smi) │ │
│ │ • Generate openclaw.json from env vars │ │
│ │ • Pull model if not cached │ │
│ │ • Start Ollama (background) │ │
│ │ • Wait for Ollama health │ │
│ │ • Start OpenClaw gateway (foreground) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌───────────────────┐ │
│ │ OpenClaw │───▶│ Ollama (CPU/GPU) │ │
│ │ Gateway + │ │ localhost:11434 │ │
│ │ Agent │ │ Auto-detects GPU │ │
│ │ :18789 │ │ at runtime │ │
│ └──────────────┘ └───────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────┐ ┌────────────────┐ │
│ │ /data/ │ │ /data/ │ │
│ │ workspace/ │ │ ollama/ │ │
│ │ (volume) │ │ (volume) │ │
│ └─────────────┘ └────────────────┘ │
└───────────────────────────────────────────────────────────┘
Single-Image Architecture
Both Ollama and OpenClaw run inside one container. The entrypoint manages process lifecycle:
- Ollama starts as a background process, binding to `localhost:11434`.
- OpenClaw starts as the foreground process after Ollama is healthy, connecting to `http://localhost:11434/v1`.
- If Ollama crashes, the entrypoint detects it and exits. The container then restarts only if a Docker restart policy is set (for example `--restart unless-stopped`).
- Signals (SIGTERM/SIGINT) are forwarded to both processes for clean shutdown.
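The "start OpenClaw only after Ollama is healthy" ordering comes down to a bounded retry loop. The following is a minimal sketch in POSIX shell, not the actual entrypoint: the helper name `wait_until`, the retry count and delay, and the choice of Ollama's `/api/version` endpoint as the probe are all assumptions made for illustration.

```sh
#!/bin/sh
# wait_until RETRIES DELAY CMD [ARGS...]
# Runs CMD until it succeeds, sleeping DELAY seconds between attempts.
# Returns 0 on the first success, 1 once RETRIES attempts have failed.
# (Hypothetical helper; the real entrypoint may be structured differently.)
wait_until() {
    retries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$retries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Possible use in an entrypoint (commented out so the sketch has no side effects):
#   ollama serve &
#   wait_until 60 2 curl -fsS http://localhost:11434/api/version || exit 1
#   ...then start the OpenClaw gateway in the foreground
```

Passing the probe as a command keeps the loop independent of how health is defined, so the same helper could wait on either service.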
GPU Runtime Detection
The image ships no GPU libraries. GPU acceleration is achieved through NVIDIA Container Toolkit runtime passthrough:
- At startup, the entrypoint runs `nvidia-smi` to detect GPU availability.
- GPU found: Ollama automatically uses CUDA via the mounted NVIDIA runtime. Logs report GPU model and VRAM.
- No GPU: Ollama falls back to CPU inference. No errors, no warnings; this is the expected default path.
- The user enables GPU by passing `--gpus all` to `docker run` (requires NVIDIA Container Toolkit on host).
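The detection step can be sketched as a small shell function. This is an illustration under stated assumptions: the function name `detect_gpu` and its output format are made up for this example, and the real entrypoint may log differently. The `--query-gpu` and `--format` options are standard `nvidia-smi` flags.

```sh
#!/bin/sh
# detect_gpu: prints "gpu: <name>, <memory>" when nvidia-smi is present and
# reports at least one device, otherwise prints "cpu".
# (Function name and output format are illustrative assumptions.)
detect_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Query the first GPU's name and total memory as one CSV line.
        info=$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null | head -n 1)
        if [ -n "$info" ]; then
            echo "gpu: $info"
            return 0
        fi
    fi
    # Missing binary or no device reported: CPU is the expected default path.
    echo "cpu"
}

echo "Inference backend: $(detect_gpu)"
```

A missing `nvidia-smi` is treated as the normal case rather than an error, which matches the CPU-first design described here.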
Ports
- `18789`: OpenClaw gateway (exposed)
- `11434`: Ollama API (internal only by default; for debugging, publish it with `-p 11434:11434` and set `OLLAMA_HOST=0.0.0.0:11434` so Ollama listens beyond loopback)
Installation Method (No curl | sh)
- Ollama: Latest release binary downloaded directly from GitHub at build time: `https://github.com/ollama/ollama/releases/latest/download/ollama-linux-${TARGETARCH}`. No version pinning; each build gets the newest stable release.
- OpenClaw: Installed via `npm install -g openclaw@latest`. Always gets the newest published version.
- Node.js: Current LTS from NodeSource apt repository (`node_lts.x`) with GPG key verification. Auto-advances to the next LTS major (e.g., 22 → 24) when Node promotes it.
Features
- Single container, single command: `docker run -d docker.io/casjaysdevdocker/clawlama:latest` launches both Ollama and OpenClaw. No compose file required for basic use.
- debian:bookworm-slim base: Minimal Debian with glibc for full Ollama SIMD compatibility (AVX/AVX2 on amd64, NEON on arm64).
- Direct binary installation (no curl | sh):
  - Ollama: latest release binary from GitHub releases (`/releases/latest/download/`), selected per `TARGETARCH`.
  - OpenClaw: `npm install -g openclaw@latest`.
  - Node.js: current LTS from NodeSource apt repo (`node_lts.x`) with GPG key verification.
- Pre-configured OpenClaw ↔ Ollama wiring: OpenClaw config auto-generated at startup pointing to `http://localhost:11434/v1` with zero-cost pricing.
- Persistent volumes: Two mount points, `/data/ollama` (model store) and `/data/workspace` (OpenClaw), survive restarts.
- Default model `gpt-oss:20b`: Automatically pulled on first launch before OpenClaw starts.
- Health checks: Container `HEALTHCHECK` verifies both Ollama API and OpenClaw gateway.
- Environment variable overrides:
  - `CLAWLAMA_MODEL`: Model to pull and use (default: `gpt-oss:20b`)
  - `CLAWLAMA_CONTEXT_WINDOW`: Context window size (default: `131072`)
  - `CLAWLAMA_MAX_TOKENS`: Max output tokens (default: `8192`)
  - `CLAWLAMA_MAX_CONCURRENT`: Agent concurrency (default: `4`)
  - `CLAWLAMA_SUBAGENT_CONCURRENT`: Subagent concurrency (default: `8`)
  - `CLAWLAMA_OPENCLAW_PORT`: OpenClaw gateway port (default: `18789`)
  - `OLLAMA_NUM_THREADS`: CPU threads for inference (default: auto-detect physical cores)
  - `OLLAMA_NUM_PARALLEL`: Max parallel requests (default: `1`)
  - `OLLAMA_MAX_LOADED_MODELS`: Models in memory (default: `1`)
  - `OLLAMA_HOST`: Ollama bind address (default: `127.0.0.1:11434`)
- Multi-arch (amd64 + arm64): Single manifest tag built with `docker buildx`. Ollama binary selected by `TARGETARCH`. All scripts POSIX shell.
- Runtime GPU detection: Entrypoint probes `nvidia-smi`. GPU used automatically if available via `--gpus all`. No GPU libs in image; NVIDIA Container Toolkit handles passthrough. CPU is the default and primary path.
- Model swap without rebuild: Changing `CLAWLAMA_MODEL` and restarting pulls the new model and regenerates config.
- CPU performance auto-tuning: Entrypoint auto-detects physical cores and available RAM, and sets `OLLAMA_NUM_THREADS` if unset. Logs detected values.
- Telegram integration helper: Optional `CLAWLAMA_TELEGRAM_BOT_TOKEN` env var auto-configures Telegram channel.
- Full tool profile with layered restrictions: All OpenClaw tools enabled via `profile: "full"`. Git and `/etc` restrictions enforced via TOOLS.md (soft) and container filesystem (hard).
- Startup banner: Prints connection info, detected arch, CPU cores, RAM, GPU status, and model to stdout.
- Quantized model recommendations: README documents CPU-friendly models by RAM tier:
  - 8 GB RAM: 7B Q4 variants
  - 16 GB RAM: `gpt-oss:20b` (default) or 13B Q5
  - 32+ GB RAM: 20B+ full or 34B Q4
- docker-compose.yml included: Provided for users who prefer compose, with volume mounts and restart policy pre-configured.
- Multi-model support: Comma-separated `CLAWLAMA_MODELS` env var configures multiple models in the OpenClaw provider config.
- Backup/restore scripts: Shell scripts to tar `/data` volumes for migration.
- Portainer/Dockge compatible: Compose file works with popular Docker management UIs.
- Architecture detection in logs: Logs detected arch and SIMD instruction sets (AVX, AVX2, AVX-512, NEON) for performance troubleshooting.
- GPU VRAM-aware model selection: When a GPU is detected, logs VRAM and suggests an optimal model/quantization for available resources.
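The CPU auto-tuning feature (set `OLLAMA_NUM_THREADS` from the physical core count unless the user already set it) could look roughly like the sketch below. The helper name `physical_cores` and the detection method are assumptions: it counts unique core/socket pairs from `lscpu` when that tool is available and otherwise falls back to the logical CPU count.

```sh
#!/bin/sh
# physical_cores: best-effort count of physical CPU cores.
# Uses lscpu (util-linux) when available, counting unique core/socket pairs;
# falls back to the logical CPU count from getconf, then to 1.
# (Hypothetical helper; the real entrypoint may detect cores differently.)
physical_cores() {
    cores=""
    if command -v lscpu >/dev/null 2>&1; then
        cores=$(lscpu -p=CORE,SOCKET 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ')
    fi
    if [ -z "$cores" ] || [ "$cores" -lt 1 ] 2>/dev/null; then
        cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    fi
    echo "$cores"
}

# Respect a user-supplied value; otherwise use the detected core count.
: "${OLLAMA_NUM_THREADS:=$(physical_cores)}"
export OLLAMA_NUM_THREADS
echo "OLLAMA_NUM_THREADS=$OLLAMA_NUM_THREADS"
```

Counting physical rather than logical cores is a common choice for inference workloads, since hyper-threads share execution units; on hosts without SMT the two counts are the same.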
File Structure
clawlama/
├── AI.md # This spec
├── TODO.AI.md # Task tracking
├── Dockerfile # Multi-stage, multi-arch (amd64 + arm64)
├── docker-compose.yml # Optional compose file for convenience
├── .env.example # Template environment variables
├── rootfs/
│ ├── usr/local/bin/
│ │ ├── entrypoint.sh # Main entrypoint: detect GPU, gen config, start services
│ │ └── healthcheck.sh # Health check script for HEALTHCHECK instruction
│ └── etc/clawlama/
│ ├── openclaw.template.json # OpenClaw config template (envsubst-ready)
│ └── TOOLS.md # Agent tool usage rules (git deny, /etc deny)
├── scripts/
│ ├── build.sh # Multi-arch buildx build + push
│ ├── backup.sh # Backup /data volumes
│ └── restore.sh # Restore /data volumes
└── README.md # User-facing documentation
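As a rough idea of what `backup.sh` and `restore.sh` might contain, the sketch below archives and restores a directory with `tar`. The function names are illustrative and the shipped scripts may differ; the commented `docker run` line shows one common way to apply the same `tar` command to the `clawlama-data` named volume used in this document.

```sh
#!/bin/sh
# backup_dir SRC_DIR ARCHIVE: write a gzip-compressed tar of SRC_DIR's contents.
# restore_dir ARCHIVE DEST_DIR: unpack such an archive into DEST_DIR.
# (Function names are illustrative; the shipped scripts may differ.)
backup_dir() {
    tar -czf "$2" -C "$1" .
}

restore_dir() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Possible use against a named volume, via a throwaway container:
#   docker run --rm -v clawlama-data:/data -v "$PWD":/backup debian:bookworm-slim \
#       tar -czf /backup/clawlama-data.tar.gz -C /data .
```

Stopping the container before taking a backup avoids archiving a model file that is still being downloaded.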
Quick Start
# CPU-only (default)
docker run -d \
--name clawlama \
-v clawlama-data:/data \
-p 18789:18789 \
docker.io/casjaysdevdocker/clawlama:latest
# With NVIDIA GPU acceleration
docker run -d \
--name clawlama \
--gpus all \
-v clawlama-data:/data \
-p 18789:18789 \
docker.io/casjaysdevdocker/clawlama:latest
# Custom model + expose Ollama API for debugging
docker run -d \
--name clawlama \
-v clawlama-data:/data \
-p 18789:18789 \
-p 11434:11434 \
-e CLAWLAMA_MODEL=qwen2:7b \
-e OLLAMA_HOST=0.0.0.0:11434 \
docker.io/casjaysdevdocker/clawlama:latest
Dockerfile Sketch
# ── Stage 1: Build dependencies ──────────────────────────────
FROM debian:bookworm-slim AS builder
ARG TARGETARCH
# Install Node.js LTS from NodeSource apt repo (no curl | sh)
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates curl gnupg gettext-base && \
mkdir -p /etc/apt/keyrings && \
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key \
| gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg && \
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_lts.x nodistro main" \
> /etc/apt/sources.list.d/nodesource.list && \
apt-get update && apt-get install -y --no-install-recommends nodejs && \
rm -rf /var/lib/apt/lists/*
# Download latest Ollama binary directly (no curl | sh, no version pinning)
RUN curl -fsSL -o /usr/local/bin/ollama \
"https://github.com/ollama/ollama/releases/latest/download/ollama-linux-${TARGETARCH}" && \
chmod +x /usr/local/bin/ollama
# Install latest OpenClaw via npm
RUN npm install -g openclaw@latest
# ── Stage 2: Runtime ─────────────────────────────────────────
FROM debian:bookworm-slim
# Install Node.js LTS runtime (same repo method, no dev packages)
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates curl gnupg tini procps gettext-base && \
mkdir -p /etc/apt/keyrings && \
curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key \
| gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg && \
echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_lts.x nodistro main" \
> /etc/apt/sources.list.d/nodesource.list && \
apt-get update && apt-get install -y --no-install-recommends nodejs && \
rm -rf /var/lib/apt/lists/*
# Copy Ollama binary
COPY --from=builder /usr/local/bin/ollama /usr/local/bin/ollama
# Copy OpenClaw global install
COPY --from=builder /usr/lib/node_modules /usr/lib/node_modules
COPY --from=builder /usr/bin/openclaw /usr/bin/openclaw
# Copy rootfs overlay
COPY rootfs/ /
# Create non-root user and data directories
RUN groupadd -r clawlama && useradd -r -g clawlama -m clawlama && \
mkdir -p /data/ollama /data/workspace && \
chown -R clawlama:clawlama /data
# Hard restriction: make /etc read-only for non-root users
RUN chmod -R a-w /etc
# Environment defaults (CPU-optimized)
ENV CLAWLAMA_MODEL=gpt-oss:20b \
CLAWLAMA_CONTEXT_WINDOW=131072 \
CLAWLAMA_MAX_TOKENS=8192 \
CLAWLAMA_MAX_CONCURRENT=4 \
CLAWLAMA_SUBAGENT_CONCURRENT=8 \
CLAWLAMA_OPENCLAW_PORT=18789 \
OLLAMA_HOST=127.0.0.1:11434 \
OLLAMA_MODELS=/data/ollama \
OLLAMA_NUM_PARALLEL=1 \
OLLAMA_MAX_LOADED_MODELS=1
VOLUME ["/data"]
EXPOSE 18789
HEALTHCHECK --interval=30s --timeout=10s --start-period=120s --retries=3 \
CMD /usr/local/bin/healthcheck.sh
ENTRYPOINT ["tini", "--"]
CMD ["/usr/local/bin/entrypoint.sh"]
Notes:
- `tini` is the PID 1 init for proper signal handling; no zombie processes.
- `TARGETARCH` is automatically set by `docker buildx` (`amd64` or `arm64`).
- `curl` is used only for the apt key download and the direct binary fetch, never piped to a shell.
- Builder stage is discarded; runtime image contains only what's needed.
- `OLLAMA_MODELS=/data/ollama` ensures models persist in the volume.
- `/etc` is made non-writable via `chmod`: hard enforcement of `/etc` write protection.
- Entrypoint copies `TOOLS.md` into `/data/workspace/TOOLS.md` on first run (OpenClaw reads this as agent instructions).
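The `HEALTHCHECK` instruction calls `/usr/local/bin/healthcheck.sh`, whose contents are not shown in this spec. One possible shape is sketched below. It assumes Ollama's `/api/version` endpoint as the liveness probe and, because no OpenClaw health endpoint is specified here, treats any HTTP response on the gateway port as "up"; both choices are assumptions.

```sh
#!/bin/sh
# healthcheck: succeed only if both services answer over HTTP.
# Assumptions: Ollama listens on its default 127.0.0.1:11434 and exposes
# /api/version; any HTTP response from the gateway port counts as healthy.
healthcheck() {
    port=${CLAWLAMA_OPENCLAW_PORT:-18789}
    # -f makes curl fail on HTTP error statuses from Ollama.
    curl -fsS -o /dev/null --max-time 5 "http://127.0.0.1:11434/api/version" || return 1
    # No -f here: any HTTP response from the gateway is accepted.
    curl -sS -o /dev/null --max-time 5 "http://127.0.0.1:${port}/" || return 1
    return 0
}

# A real healthcheck.sh would end by calling the function:
#   healthcheck
```

Docker marks the container unhealthy after the configured number of consecutive non-zero exits, so the script only needs to return 0 or 1.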
OpenClaw Configuration
The following JSON config is generated at container startup from environment variables via envsubst:
{
"models": {
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434/v1",
"apiKey": "ollama-local",
"api": "openai-completions",
"models": [
{
"id": "${CLAWLAMA_MODEL}",
"name": "${CLAWLAMA_MODEL}",
"reasoning": false,
"input": ["text"],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": ${CLAWLAMA_CONTEXT_WINDOW},
"maxTokens": ${CLAWLAMA_MAX_TOKENS}
}
]
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "ollama/${CLAWLAMA_MODEL}"
},
"workspace": "/data/workspace",
"maxConcurrent": ${CLAWLAMA_MAX_CONCURRENT},
"subagents": {
"maxConcurrent": ${CLAWLAMA_SUBAGENT_CONCURRENT}
}
}
},
"tools": {
"profile": "full",
"exec": {
"security": "full",
"ask": "off",
"backgroundMs": 10000,
"timeoutSec": 1800,
"applyPatch": {
"enabled": true,
"workspaceOnly": true
}
},
"fs": {
"workspaceOnly": false
},
"elevated": {
"enabled": false
}
}
}
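Rendering this template with `envsubst` might look like the sketch below. The function name `render_config` is hypothetical; the template and output paths in the usage comment are the ones named elsewhere in this document. Listing the variables explicitly is a design choice so that any other `$` text in the template is left untouched. Note that `envsubst` only sees exported variables.

```sh
#!/bin/sh
# render_config TEMPLATE OUTPUT
# Substitutes only the listed CLAWLAMA_* variables, leaving any other
# "$..." text in the template untouched. Requires envsubst (gettext-base).
# (Hypothetical helper; the real entrypoint may generate the file differently.)
render_config() {
    template=$1
    output=$2
    mkdir -p "$(dirname "$output")" || return 1
    envsubst '${CLAWLAMA_MODEL} ${CLAWLAMA_CONTEXT_WINDOW} ${CLAWLAMA_MAX_TOKENS} ${CLAWLAMA_MAX_CONCURRENT} ${CLAWLAMA_SUBAGENT_CONCURRENT}' \
        < "$template" > "$output"
}

# Possible use with the paths named in this document:
#   render_config /etc/clawlama/openclaw.template.json /data/workspace/.openclaw/openclaw.json
```

Because `contextWindow`, `maxTokens`, and the concurrency values are unquoted in the template, an unset variable would produce invalid JSON, so validating that each variable is a number before rendering is worth considering.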
Tool Permission Policy
OpenClaw's tool policy operates at the tool level (allow/deny entire tools like exec, read, write), not at the command or path level. To enforce the desired restrictions (no git commit/push/reset --hard, no writes to /etc), ClawLama uses a layered approach:
| Layer | Mechanism | What It Enforces |
|---|---|---|
| Tool profile | `tools.profile: "full"` | All tools enabled: `group:fs`, `group:runtime`, `group:ui`, `group:sessions`, `group:memory`, `group:automation`, `web_search`, `web_fetch` |
| Exec security | `tools.exec.security: "full"` | Shell commands auto-approved (no prompt). Agent has full exec access. |
| Workspace TOOLS.md | Agent instruction file | Soft restrictions: instructs agent to never run `git commit`, `git push`, `git reset --hard` |
| Container filesystem | Dockerfile `RUN chmod` / read-only mounts | Hard restriction: `/etc` read-only at container level, preventing writes regardless of agent behavior |
| Elevated mode | `tools.elevated.enabled: false` | No host-level exec breakout (good hygiene) |
| Workspace scope | `tools.exec.applyPatch.workspaceOnly: true` | `apply_patch` operations restricted to workspace directory |
Tool Groups (Reference)
OpenClaw's built-in tool groups for use in tools.allow / tools.deny:
| Group | Tools |
|---|---|
| `group:runtime` | `exec`, `bash`, `process` |
| `group:fs` | `read`, `write`, `edit`, `apply_patch` |
| `group:sessions` | `sessions_list`, `sessions_history`, `sessions_send`, `sessions_spawn`, `session_status` |
| `group:memory` | `memory_search`, `memory_get` |
| `group:ui` | `browser`, `canvas` |
| `group:automation` | `cron`, `gateway` |
| `group:messaging` | `message` |
| `group:nodes` | `nodes` |
Git Restrictions (via TOOLS.md)
Since OpenClaw has no command-pattern deny list for exec, git restrictions are enforced via the workspace TOOLS.md file — an agent instruction document that OpenClaw injects into the system prompt:
<!-- /data/workspace/TOOLS.md -->
# Tool Usage Rules
## Git Restrictions (MANDATORY)
- NEVER run `git commit` in any form
- NEVER run `git push` in any form
- NEVER run `git reset --hard` in any form
- All other git commands are allowed (status, diff, log, add, branch, checkout, clone, pull, stash, etc.)
## Filesystem Restrictions
- Do NOT write to or delete files in /etc/
- The /etc directory is read-only at the container level
Important: TOOLS.md is a soft restriction. The LLM is instructed not to run these commands, but they are not technically blocked by OpenClaw's tool policy engine. For hard enforcement, users should enable exec approvals (`tools.exec.ask: "always"`). The `/etc` write protection, applied with `chmod` in the Dockerfile, is a hard restriction for the non-root user regardless.
/etc Protection (via Container)
Since OpenClaw's tools.fs.workspaceOnly is an all-or-nothing toggle, and we want reads everywhere but writes denied only to /etc, this is enforced at the Docker layer:
RUN chmod -R a-w /etc
Override Path
Users can override tool policy by bind-mounting a custom config:
docker run -v ./my-openclaw.json:/data/workspace/.openclaw/openclaw.json ...
Technical Constraints
- CPU-optimized, GPU-optional: Image ships zero GPU libraries. Ollama runs CPU inference by default. GPU is activated automatically when the user passes `--gpus all` and NVIDIA Container Toolkit is installed on the host. No image rebuild needed.
- Direct binary installation only: No `curl | sh` or `curl | bash` anywhere in the Dockerfile. All software is installed via apt packages (Node.js via NodeSource repo with GPG key), direct binary download (Ollama from GitHub releases), and the npm package manager (OpenClaw).
- No version pinning: Ollama uses `/releases/latest/download/`, OpenClaw uses `@latest`, Node.js uses `node_lts.x`. Each `docker build` picks up the newest stable versions. No version ARGs to maintain.
- Multi-arch manifest: Published image contains both `linux/amd64` and `linux/arm64`. Built with `docker buildx`.
- Platform-specific performance:
  - amd64: Ollama leverages AVX/AVX2/AVX-512 SIMD when available (most x86_64 CPUs from 2013+).
  - arm64: Ollama leverages NEON SIMD (all ARMv8+). Apple Silicon performs well; Raspberry Pi 5 is functional but slower.
- Node.js LTS required for OpenClaw (currently ≥ 22).
- Single-entrypoint pattern: Ollama runs as a background process, OpenClaw as the foreground process. The entrypoint handles process supervision, signal forwarding, and crash detection. No external supervisor (s6, supervisord) required.
- Ollama must be healthy before OpenClaw starts: Entrypoint polls `localhost:11434` with a retry loop before launching OpenClaw.
- No external API keys required: the entire stack is zero-cost by design.
- Model storage can be large (20B model ≈ 12-15 GB); the `/data/ollama` volume mount is mandatory, not tmpfs.
- RAM requirements (CPU inference):
  - 7B Q4 model: ~4-6 GB RAM minimum
  - 13B Q4 model: ~8-10 GB RAM minimum
  - 20B model: ~16 GB RAM minimum (default)
  - Recommend at least 2 GB headroom above model size for OpenClaw + Node.js + OS.
- RAM requirements (GPU inference): For full GPU inference the model must fit in VRAM; otherwise Ollama automatically splits the model between CPU and GPU (partial offload).
- `apiKey: "ollama-local"`: Ollama doesn't require auth but OpenClaw config requires a non-empty value; this is a dummy placeholder.
- Soft vs hard restrictions: OpenClaw's tool policy operates at tool granularity only. Git command restrictions are soft (TOOLS.md). `/etc` write protection is hard (OS-level chmod). For maximum safety, enable `tools.exec.ask: "always"`.
- Image size target: Under 500 MB compressed (excluding pulled models).
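The CPU-inference RAM minimums listed above can be turned into a simple pre-flight check. In this sketch the tier labels (`7b-q4`, `13b-q4`, `20b`) and function names are hypothetical, and where the document gives a range the upper end is used as a conservative choice; that choice is an assumption, not part of the spec.

```sh
#!/bin/sh
# min_ram_gb TIER: documented minimum RAM in GB for a model tier.
# Upper end of each documented range is used (a conservative assumption).
min_ram_gb() {
    case $1 in
        7b-q4)  echo 6 ;;
        13b-q4) echo 10 ;;
        20b)    echo 16 ;;
        *)      return 1 ;;
    esac
}

# fits_in_ram TIER TOTAL_GB: exit 0 if TOTAL_GB meets the tier's minimum,
# 1 if it does not, 2 if the tier label is unknown.
fits_in_ram() {
    min=$(min_ram_gb "$1") || return 2
    [ "$2" -ge "$min" ]
}

# total_ram_gb: MemTotal from /proc/meminfo (kB), rounded to the nearest GiB.
# Linux only; prints nothing if /proc/meminfo is unavailable.
total_ram_gb() {
    if [ -r /proc/meminfo ]; then
        awk '/^MemTotal:/ { printf "%d\n", ($2 + 524288) / 1048576 }' /proc/meminfo
    fi
}
```

For example, `fits_in_ram 20b "$(total_ram_gb)"` could gate a startup warning when the default model is unlikely to load on the current host.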
Security Considerations
- OpenClaw gateway should NOT be exposed to the public internet without authentication.
- Ollama API binds to `127.0.0.1` inside the container by default, so it is not accessible from the host unless explicitly exposed.
- All data stays local: no telemetry, no cloud calls, no API key leakage.
- Container runs as non-root user (`clawlama`) where possible. Entrypoint drops privileges after setup.
- OpenClaw's prompt injection surface is inherited; users should review OpenClaw's security docs before enabling messaging integrations.
- When GPU passthrough is enabled (`--gpus all`), the container gains access to host GPU devices; the standard NVIDIA Container Toolkit security model applies.
Success Criteria
- `docker pull docker.io/casjaysdevdocker/clawlama:latest` succeeds on both amd64 and arm64 hosts.
- `docker run -d -v clawlama-data:/data -p 18789:18789 docker.io/casjaysdevdocker/clawlama:latest` brings up both services with no manual intervention.
- The Ollama model is pulled automatically on first run and served with CPU inference.
- OpenClaw agent responds to queries using the local Ollama model within 120 seconds of container start (CPU inference baseline; faster with GPU).
- Changing the `CLAWLAMA_MODEL` env var and restarting pulls the new model and regenerates config.
- Container restart preserves all workspace data and downloaded models via the `/data` volume.
- `docker stop && docker start` recovers to working state.
- Runs successfully on: x86_64 Linux server (cloud VM), Apple Silicon Mac (Docker Desktop), Raspberry Pi 5 (arm64).
- When launched with `--gpus all` on an NVIDIA host, Ollama detects and uses the GPU, as verified in logs.
- When launched without `--gpus` on any host, Ollama runs CPU-only, with no GPU-related errors in logs.
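The "pull on first run, pull again when `CLAWLAMA_MODEL` changes" behavior in these criteria amounts to a pull-if-missing check. A minimal sketch follows, assuming the Ollama server is already running and that `ollama list` prints a header row followed by one model per line with the name in the first column. The function name `ensure_model` is hypothetical.

```sh
#!/bin/sh
# ensure_model MODEL: pull MODEL only if `ollama list` does not already show it.
# Assumes the Ollama server is running. Note that models pulled without an
# explicit tag are listed as "<name>:latest", so pass the full name:tag.
# (Hypothetical helper; the real entrypoint may check the cache differently.)
ensure_model() {
    model=$1
    # Skip the header row, take the NAME column, and look for an exact match.
    if ollama list | awk 'NR > 1 { print $1 }' | grep -qxF -- "$model"; then
        echo "model $model already cached"
    else
        echo "pulling $model"
        ollama pull "$model"
    fi
}

# Possible use in an entrypoint, after Ollama reports healthy:
#   ensure_model "$CLAWLAMA_MODEL"
```

Because the check runs on every start, a restart with the same model costs one `ollama list` call, while a changed `CLAWLAMA_MODEL` triggers a fresh pull into the `/data/ollama` volume.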
Out of Scope
- Baking GPU libraries into the image — GPU support is via NVIDIA Container Toolkit runtime passthrough only.
- AMD ROCm / Intel Arc GPU support — Only NVIDIA GPUs supported via container toolkit.
- Custom OpenClaw skill development (users add their own post-deploy).
- Building or fine-tuning custom Ollama models.
- Production-grade reverse proxy / TLS termination (user's responsibility).
- OpenClaw's built-in onboarding wizard (`openclaw onboard`): replaced by container auto-config.
- iMessage / BlueBubbles / platform-specific integrations requiring host OS access.