English | 中文 | 한국어 | 日本語 | Français | Deutsch | Español | Português
CheetahClaws: A Fast and Easy-to-Use Agent Harness Infrastructure for Long-Horizon, Multi-Model, and Tool-Using AI Systems
Website · Scaling the Harness · Issue · The newest source of Claude Code
curl -fsSL https://raw.githubusercontent.com/SafeRL-Lab/cheetahclaws/main/scripts/install.sh | bashAfter installation:
source ~/.zshrc # macOS
# or: source ~/.bashrc # Linux
cheetahclaws # start chatting!Other install methods: pip install | uv install | run from source | full details
- May 31, 2026 (latest): QQ bot bridge —
/qqconnects cheetahclaws to QQ groups + C2C private chats via the officialqq-botpySDK (PR #121). Details: docs/guides/bridges.md · docs/news.md. - May 12, 2026: Security hardening sweep — env-var bot tokens, web CSRF cookie, terminal session owner-binding, and plugin/MCP/filesystem sandboxing (two CRITICAL + HIGH rounds, 2347 tests green). Details: docs/guides/security.md · docs/news.md.
- May 12, 2026: Daemon foundation roadmap — all nine F-1…F-9 items landed: subprocess agent runners, on-crash restart policy, daemonized Telegram/Slack/WeChat bridges, and budget guardrails. Details: docs/news.md.
For more news, see here.
CheetahClaws: A Lightweight and Easy-to-Use Python Reimplementation of Claude Code Supporting Any Model, such as Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, MiniMax, and local open-source models via Ollama or any OpenAI-compatible endpoint.
- Why CheetahClaws
- CheetahClaws vs OpenClaw
- Features
- Supported Models
- Installation
- Usage: Closed-Source API Models
- Usage: Open-Source Models (Local)
- Model Name Format
- Trading Agent
- Web UI
- Documentation (guides for all features)
- Contributing · FAQ · Citation
More animated demos (code review,
/research,/brainstorm,/lab, Telegram/WeChat/Slack bridges) live indocs/media/.
Claude Code is a powerful, production-grade AI coding assistant — but its source is a compiled ~12 MB TypeScript/Node bundle (~1,300 files, ~283K lines), tightly coupled to the Anthropic API, hard to modify, and impossible to run against a local or alternative model.
CheetahClaws reimplements the same core loop in ~90K lines of readable Python — keeping what you need, dropping what you don't, and adding multi-provider + local-model support. Full comparison: docs/guides/comparison.md.
| Dimension | Claude Code (TypeScript) | CheetahClaws (Python) |
|---|---|---|
| Language | TypeScript + React/Ink | Python 3.8+ |
| Source files / LoC | ~1,332 files / ~283K | ~315 files / ~90K (core; ~127K with tests) |
| Built-in tools / commands | 44+ / 88 | 27 / 50+ |
| Model providers | Anthropic only | 8+ (Anthropic · OpenAI · Gemini · Kimi · Qwen · DeepSeek · MiniMax · …) |
| Local models | No | Yes — Ollama, LM Studio, vLLM, any OpenAI-compatible endpoint |
| Build step | Yes (Bun + esbuild) | No — python cheetahclaws.py |
| Extensibility | Closed (compile-time) | Open — register_tool() at runtime, Markdown skills, git plugins, MCP |
| Voice input | Proprietary WebSocket (OAuth) | Local Whisper / OpenAI — works offline |
Where Claude Code wins: richer React/Ink UI, more built-in tools, enterprise features (MDM, team permission sync, OAuth/keychain), AI-driven memory extraction, single-binary production reliability.
Where CheetahClaws wins: any-model switching (--model//model, no recompile) incl. full local/offline support; a readable agent loop in one file (agent.py, ~740 lines); zero build; runtime tool registration + MCP + git plugins + Markdown skills; task dependency graph (blocks/blocked_by); two-layer context compression; offline voice; cloud session sync; bridges to Telegram/WeChat/Slack/QQ.
Who it's for: developers who want a local/non-Anthropic coding assistant, researchers studying how agentic assistants work, and teams who need a hackable baseline — without a Node.js build chain.
OpenClaw is another popular open-source assistant (TypeScript/Node). The two have different primary goals — OpenClaw is a personal life-assistant across messaging channels; CheetahClaws is a developer/coding tool.
| Dimension | OpenClaw (TypeScript) | CheetahClaws (Python) |
|---|---|---|
| Lines of code | ~245K (~10,349 files) | ~90K core (~315 files) |
| Primary focus | Personal assistant across channels | AI coding assistant / dev tool |
| Architecture | Always-on Gateway daemon + apps | Zero-install terminal REPL |
| Messaging channels | 20+ (WhatsApp · Signal · iMessage · Discord · Matrix · …) | Terminal + Telegram · WeChat · Slack · QQ bridges |
| Local / offline models | Limited | Full — Ollama · vLLM · LM Studio · any OpenAI-compatible |
| Code editing tools | Browser control, Canvas | Read · Write · Edit · Bash · Glob · Grep · NotebookEdit · GetDiagnostics |
| Mobile / Live Canvas | Yes (menu bar + iOS/Android, A2UI) | — |
| MCP support | — | Yes (stdio/SSE/HTTP) |
| Hackability | 245K lines, harder to modify | ~90K lines — agent loop in one file |
| If you want… | Use |
|---|---|
| A personal assistant on WhatsApp/Signal/Discord, mobile-first, browser automation + Canvas | OpenClaw |
| An AI coding assistant in your terminal, full offline/local models, multi-provider switching, source you can read in an afternoon | CheetahClaws |
Full comparison — both sides' wins + key design differences (agent loop, tool registration, context compression, memory): docs/guides/comparison.md.
| Feature | Details |
|---|---|
| Multi-provider | Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · MiniMax · Ollama · LM Studio · Custom endpoint |
| Agent loop | Streaming API + automatic tool-use loop; the whole loop is in agent.py |
| 27 built-in tools | Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch · NotebookEdit · GetDiagnostics · Memory* · Agent/SendMessage · Skill · AskUserQuestion · Task* · SleepTimer · EnterPlanMode/ExitPlanMode · (MCP + plugin tools auto-added) |
| MCP integration | Connect any MCP server (stdio/SSE/HTTP); tools auto-registered — see extensions guide |
| Plugin system | Install/enable/update plugins from git URLs or local paths; multi-scope; recommendation engine |
| Task management | TaskCreate/Update/Get/List, sequential IDs, dependency edges, persisted to .cheetahclaws/tasks.json |
| Context compression | Four cooperating layers — dynamic max_tokens cap, per-model context-window registry, two-layer snip + AI summarize at 70%, and auto-fanout for oversized tool outputs. Details |
| Persistent memory | Dual-scope (user + project), 4 types, confidence/source metadata, conflict detection, recency-weighted search, /memory consolidate |
| Multi-agent | Spawn typed sub-agents (coder/reviewer/researcher/…), git-worktree isolation, background mode |
| Permission system | auto / accept-all / manual / plan modes |
| Checkpoints & plan mode | Auto-snapshot conversation + files each turn (/checkpoint, /rewind); /plan read-only analysis mode |
| Slash commands & themes | 50+ slash commands with Tab-complete; /theme offers 15 curated palettes |
| Brainstorm → Worker | /brainstorm runs an N-persona debate → todo_list.txt; /worker auto-implements the pending tasks |
| SSJ Developer Mode | /ssj — persistent power menu chaining Brainstorm, Worker, Review, Trading, Agent, Video/TTS, Monitor, etc. |
| Trading agent | /trading multi-agent analysis, backtesting, paper-trade calibration, MV portfolios. Guide |
| Monitor | /monitor subscribes to AI-monitored topics on a schedule (arxiv / stock / crypto / news / custom), pushes reports to bridges/console |
| Research (multi-source) | /research fans out to 20 sources with attention heat table, entity extraction, trend sparkline, comparison mode. Guide |
| Autonomous agents | /agent background loops from Markdown templates; iteration summaries pushed via bridge; stagnation-stop guard |
| Bridges + remote control | Telegram · WeChat · Slack · QQ — chat round-trip, slash passthrough, per-bridge job queue (!jobs/!retry/!cancel). Guide |
| Voice / Vision / Video / TTS | Offline Whisper /voice; /image clipboard vision (local + cloud); /video + /tts content factories. Guide |
| Web UI | --web — multi-user browser chat + PTY terminal. Guide |
| More | Tmux integration · !cmd shell escape · proactive monitoring · 3×Ctrl+C force-quit · session persistence · /cloudsave GitHub-Gist sync · cost tracking · --print non-interactive mode |
Full feature reference — every row above with complete detail (context-compression layers, auto-fanout, 15 themes, the full Trading/Research/Agents writeups, …): docs/guides/features.md.
| Provider | Example models | Context | API Key Env |
|---|---|---|---|
| Anthropic | claude-opus-4-6 · claude-sonnet-4-6 · claude-haiku-4-5-20251001 |
200k | ANTHROPIC_API_KEY |
| OpenAI | gpt-4o · gpt-4.1 · gpt-5 · o3 · o4-mini |
128–200k | OPENAI_API_KEY |
gemini-2.5-pro · gemini-2.0-flash · gemini-1.5-pro |
1–2M | GEMINI_API_KEY |
|
| Moonshot (Kimi) | moonshot-v1-8k / -32k / -128k |
8–128k | MOONSHOT_API_KEY |
| Alibaba (Qwen) | qwen-max · qwen-plus · qwen-turbo · qwq-32b |
32k–1M | DASHSCOPE_API_KEY |
| Zhipu (GLM) | glm-4-plus · glm-4 · glm-4-flash (free tier) |
128k | ZHIPU_API_KEY |
| DeepSeek | deepseek-chat · deepseek-reasoner |
64k | DEEPSEEK_API_KEY |
| MiniMax | MiniMax-Text-01 · MiniMax-VL-01 · abab6.5s-chat |
256k–1M | MINIMAX_API_KEY |
| AWS Bedrock / Azure / Vertex (via litellm) | litellm/<provider>/<model> |
varies | provider-specific |
litellm/adapter: routes to 100+ providers behind one SDK — mainly for upstreams with awkward auth (Bedrock SigV4, Azure deployment routing, Vertex service-account JWTs). For plain OpenAI-shaped endpoints, prefer the zero-dependencycustom/adapter. Install withpip install ".[litellm]". See recipes.md.
| Model | Size | Strengths | Pull |
|---|---|---|---|
qwen2.5-coder |
7B / 32B | Best for coding | ollama pull qwen2.5-coder |
llama3.3 / llama3.2 |
70B / 3B–11B | General purpose | ollama pull llama3.3 |
deepseek-r1 |
7B–70B | Reasoning, math | ollama pull deepseek-r1 |
mistral / mixtral |
7B / 8x7B | Fast / strong MoE | ollama pull mistral |
phi4 · gemma3 · codellama |
14B · 4–27B · 7–34B | Reasoning / open / code | ollama pull phi4 |
llava · llama3.2-vision |
7–13B · 11B | Vision | ollama pull llava |
Tool calling needs a function-calling model — recommended:
qwen2.5-coder,llama3.3,mistral,phi4. Reasoning models (deepseek-r1,qwen3,gemma4) stream native<think>blocks; enable with/verbose+/thinking.
curl -fsSL https://raw.githubusercontent.com/SafeRL-Lab/cheetahclaws/main/scripts/install.sh | bash
# or:
pip install cheetahclawsWorks on Linux, macOS, WSL2, and Android (Termux) (Python 3.10+). First run guides you through provider + API-key setup; re-run anytime with cheetahclaws --setup.
Windows: native Windows is not supported — use WSL2. Android/Termux:
pkg install python git && pip install cheetahclaws.
git clone https://github.com/SafeRL-Lab/cheetahclaws.git
cd cheetahclaws
pip install . # then: cheetahclaws
git pull && pip install --force-reinstall . # to updatepip install ".[voice]" # voice input (sounddevice + faster-whisper)
pip install ".[vision]" # clipboard image capture (Pillow)
pip install ".[autosuggest]"# typing-time slash autosuggest (prompt_toolkit)
pip install ".[browser]" # headless browser (playwright); then: playwright install chromium
pip install ".[files]" # PDF + Excel reading (pymupdf, openpyxl)
pip install ".[ocr]" # image OCR (pytesseract)
pip install ".[trading]" # trading agent (yfinance, rank-bm25)
pip install ".[qq]" # QQ bot bridge (qq-botpy)
pip install ".[litellm]" # AWS Bedrock / Azure / Vertex auth via litellm
pip install ".[all]" # everything abovegit clone https://github.com/SafeRL-Lab/cheetahclaws.git && cd cheetahclaws
uv tool install ".[all]" # minimal: uv tool install .
uv tool install ".[all]" --reinstall # update · uv tool uninstall cheetahclawsgit clone https://github.com/SafeRL-Lab/cheetahclaws.git && cd cheetahclaws
pip install -r requirements.txt
python cheetahclaws.py # changes take effect immediatelyEvery cloud provider follows the same pattern — export its API key (see the Supported Models table for the env-var name), then select a model:
export ANTHROPIC_API_KEY=sk-ant-... # or OPENAI_API_KEY / GEMINI_API_KEY / DEEPSEEK_API_KEY / …
cheetahclaws # default model
cheetahclaws --model gpt-4o # pick any model
cheetahclaws --model deepseek-chat --thinking --verboseProvider get-key pages: Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · MiniMax.
AWS Bedrock / Azure / Vertex use the litellm/<provider>/<model> form (pip install ".[litellm]") — full env-var recipes in recipes.md.
Full per-provider guide — every provider's get-key page + example model commands, plus Bedrock/Azure/Vertex env-var recipes: docs/guides/usage.md.
curl -fsSL https://ollama.com/install.sh | sh # install
ollama pull qwen2.5-coder # pull a tool-calling model
ollama serve # http://localhost:11434 (auto-starts on macOS)
cheetahclaws --model ollama/qwen2.5-coder # run (use `ollama list` to see local models)Download LM Studio, grab a GGUF model, start its Local Server (port 1234), then:
cheetahclaws --model lmstudio/<model-name>python -m vllm.entrypoints.openai.api_server \
--model Qwen/Qwen2.5-Coder-32B-Instruct --port 8000 \
--enable-auto-tool-choice --tool-call-parser hermes
export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=token-abc123 # any non-empty string if the server has no auth
cheetahclaws --model custom/Qwen2.5-Coder-32B-InstructThe name after custom/ must match the server's --served-model-name. For the Web UI, --web --model custom/<name> persists the model before the server starts. Remote server? Point CUSTOM_BASE_URL at its IP.
Full local-model guide — Ollama step-by-step, LM Studio, vLLM + Web UI: docs/guides/usage.md.
Three equivalent forms are accepted:
cheetahclaws --model gpt-4o # 1. auto-detect by prefix
cheetahclaws --model ollama/qwen2.5-coder # 2. provider/model
cheetahclaws --model kimi:moonshot-v1-32k # 3. provider:modelAuto-detection by prefix: claude-→anthropic · gpt-/o1/o3→openai · gemini-→gemini · moonshot-/kimi-→kimi · qwen/qwq-→qwen · glm-→zhipu · deepseek-→deepseek · MiniMax-/abab→minimax · llama/mistral/phi/gemma/mixtral/codellama→ollama.
A built-in AI trading analysis + backtesting module (pip install "cheetahclaws[trading]").
/trading analyze NVDA # 5-phase pipeline: data → Bull/Bear debate → Judge → Risk panel → PM decision
/trading backtest AAPL dual_ma # backtest a strategy (or let AI pick); Sharpe/Sortino/Calmar/drawdown/win-rate4 strategies (dual_ma, rsi_mean_reversion, bollinger_breakout, macd_crossover), BM25 memory of past situations, US/HK/A-share + crypto markets with no-API-key data fallbacks. Guided sub-menu via /ssj → Trading.
Full guide: docs/guides/trading.md
A production-ready browser interface — real user accounts (bcrypt + JWT), SQLite-backed history, ops endpoints — served by Python stdlib + ten vanilla-JS modules (no Node.js / React / build step).
pip install 'cheetahclaws[web]'
cheetahclaws --web # auto-picks a free port (tries 8080)
cheetahclaws --web --port 9000 --host 0.0.0.0 # bind explicitly / open to LAN
cheetahclaws --web --no-auth # skip login (localhost dev only)Open http://localhost:<port>/chat — first account becomes admin. Includes streaming chat (WS) + SSE slash commands, persistent sessions with folders/search/Markdown export, tool cards, inline permission approval, settings panel, light/dark/system theme, and /health + /metrics endpoints. A full xterm.js PTY terminal lives at / (100% CLI parity).
Full guide: docs/guides/web-ui.md · Docker / home server: docs/guides/docker.md
Detailed guides live in docs/guides/ to keep this README focused:
| Guide | What's inside |
|---|---|
| Features (full) | The complete feature table — every row with full detail (context compression, auto-fanout, themes, Trading/Research/Agents writeups) |
| Usage (all providers) | Per-provider setup + example commands: Anthropic/OpenAI/Gemini/Kimi/Qwen/Zhipu/DeepSeek/MiniMax/litellm, and local Ollama/LM Studio/vLLM |
| Web UI | Chat UI, PTY terminal, API endpoints, settings, auth, SSE streaming |
| Docker / Home Server | Dockerfile + compose: web UI + bridges in one container, host Ollama, workspace mount |
| Reference | CLI, 50+ commands, 33 built-in tools, session search, error classification, tool cache |
| Extensions | Memory, Skills, Sub-Agents, MCP servers, Plugins, Monitor, Autonomous Agents |
| Bridges | Telegram, WeChat, Slack, QQ setup + remote control from your phone |
| Security & env vars | Threat model, CHEETAHCLAWS_* vars, bot-token handling, Bash denylist, fs sandbox, CSRF |
| Voice & Video | Offline Whisper voice input, Video factory, TTS factory |
| Trading | Multi-agent analysis, backtesting, BM25 memory, data fallbacks, SSJ integration |
| Advanced | Brainstorm, SSJ, Tmux, proactive monitoring, checkpoints, plan mode, sessions, cloud sync |
| Comparison | Full positioning vs Claude Code and OpenClaw — at-a-glance tables, both sides' wins, key design differences |
| Recipes | 12 step-by-step examples: code review, remote control, research, bug fix, browse, email, PDF/Excel |
| FAQ | The full FAQ (MCP, models/providers, CLI/scripting, voice) |
| Plugin Authoring · Example | Build a plugin: tools, commands, skills, MCP; starter template |
| Research Lab | /lab start <topic> — autonomous multi-agent paper writing with sandboxed experiments |
| Agent OS · RFC index | The cc_kernel/ layer + all design notes (RFC 0001-0032) |
| Contributing | Project structure, architecture guide, PR checklist |
cheetahclaws [OPTIONS] [PROMPT]
-p, --print Non-interactive: run prompt and exit
-m, --model MODEL Override model (e.g. gpt-4o, ollama/llama3.3)
--accept-all Auto-approve all operations (no permission prompts)
--verbose Show thinking blocks and per-turn token counts
--thinking Enable Extended Thinking (Claude only)
--web Start web server (Chat UI + PTY terminal in browser)
--port / --host Web server port / host (default 8080 / 127.0.0.1)
--no-auth Disable web password (local use only)
--version / -h Print version / show help
cheetahclaws # interactive REPL, default model
cheetahclaws -m ollama/deepseek-r1:32b # pick a model
cheetahclaws -p "Write a Python fibonacci function" # non-interactive
cheetahclaws --accept-all -p "Init a pyproject.toml" # CI / automation
cheetahclaws --web --port 8008 --no-auth # browser chat + terminalSee the Reference Guide for all 50+ slash commands, tools, and config options.
We welcome contributions! See the Contributing Guide for architecture, conventions, and the PR checklist.
git clone https://github.com/SafeRL-Lab/cheetahclaws.git && cd cheetahclaws
pip install -r requirements.txt && pip install pytest
python -m pytest tests/ -x -q # 341+ tests should pass
python cheetahclaws.py # run the REPLBuilding a plugin? See the Plugin Authoring Guide and the example template.
A few common questions — the full FAQ is in docs/guides/faq.md.
Q: How do I add an MCP server?
/mcp add git uvx mcp-server-git # or create .mcp.json in your project, then /mcp reload
Q: Tool calls don't work with my local Ollama model.
Not all models support function calling — use qwen2.5-coder, llama3.3, mistral, or phi4.
Q: How do I connect to a remote GPU server running vLLM?
/config custom_base_url=http://your-server-ip:8000/v1
/config custom_api_key=your-token
/model custom/your-model-name
Q: How do I check my API cost? Run /cost (shows input/output tokens + estimated USD).
Q: Can I use multiple API keys in one session? Yes — set all keys upfront (env or /config), then switch models freely; each call uses the active provider's key.
Q: How do I set a default model across projects? Add keys to ~/.bashrc/~/.zshrc and set { "model": "claude-sonnet-4-6" } in ~/.cheetahclaws/config.json.
Q: Can I pipe input to cheetahclaws?
cat error.log | cheetahclaws -p "What is causing this error?"Q: How do I set up voice input? pip install sounddevice faster-whisper numpy, then /voice in the REPL (downloads a ~150 MB Whisper model on first use). See the full FAQ for languages + keyterm tuning.
If you find the repository useful, please cite the study
@article{gu2026model,
title={From Model Scaling to System Scaling: Scaling the Harness in Agentic AI},
author={Gu, Shangding},
journal={arXiv preprint arXiv:2605.26112},
year={2026}
}
@article{cheetahclaws2026,
title={CheetahClaws: A Fast and Easy-to-Use Agent Harness Infrastructure for Long-Horizon, Multi-Model, and Tool-Using AI Systems},
author={CheetahClaws Team},
journal={github},
year={2026}
}


