Caveman

At a Glance

An MIT-licensed Agent Skills package that instructs Claude Code and 40+ other AI coding agents to respond in terse, article-dropped prose, self-reporting a 65% output token reduction while preserving code blocks and technical terms unchanged.

  • Type: open-source
  • Pricing: open-source
  • License: MIT
  • Adoption fit: small

What It Does

Caveman is a Claude Code skill — a small Agent Skills-formatted package — that instructs the AI agent to respond in minimal, caveman-style language. It strips filler phrases (“I’d be happy to help…”), hedging language, articles (a, an, the), and pleasantries from prose responses while leaving code blocks, technical terms, error messages, file paths, commands, and URLs completely unchanged. The result is shorter, denser responses that the project claims preserve full technical accuracy.
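The stripping rules can be sketched in a few lines. The snippet below is an illustrative approximation only, not the skill itself (the skill is a prompt instruction, not code); the filler-phrase list and the `compress_prose` name are invented for this example.

```python
import re

# Hypothetical filler phrases; the skill's actual list is defined in its prompt.
FILLER = [
    r"\bI'd be happy to help\b[,.!]?\s*",
    r"\bCertainly[,!]?\s*",
    r"\bOf course[,!]?\s*",
]
ARTICLES = re.compile(r"\b(a|an|the)\b\s+", flags=re.IGNORECASE)

def compress_prose(text: str) -> str:
    """Drop filler and articles from prose lines, leaving fenced
    code blocks untouched."""
    out, in_code = [], False
    for line in text.splitlines():
        if line.strip().startswith("```"):
            in_code = not in_code      # entering or leaving a code fence
            out.append(line)
            continue
        if in_code:                    # code passes through verbatim
            out.append(line)
            continue
        for pat in FILLER:
            line = re.sub(pat, "", line)
        out.append(ARTICLES.sub("", line))
    return "\n".join(out)
```

A line-by-line fence toggle like this is the simplest way to honor the "code blocks unchanged" rule; inline code and file paths would need additional masking that this sketch omits.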

The project also ships a companion Python utility called caveman-compress that applies similar compression to CLAUDE.md project memory files, reducing the input token cost of loading project context at session start. The tool creates a backup (CLAUDE.original.md) before overwriting, which is a responsible design choice given the risk of lossy compression.
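The backup-then-overwrite flow can be mirrored in a few lines. This is a hypothetical sketch of that behavior, not caveman-compress itself; the `compress` placeholder here only drops articles, whereas the real tool applies fuller compression.

```python
import re
import shutil
from pathlib import Path

def compress(text: str) -> str:
    # Placeholder compression: drop articles only. The real tool's rules
    # are richer and self-report ~45% token reduction.
    return re.sub(r"\b(a|an|the)\b\s+", "", text, flags=re.IGNORECASE)

def compress_claude_md(path: str = "CLAUDE.md") -> None:
    # Mirror the described safety step: write CLAUDE.original.md before
    # overwriting, and never clobber an earlier backup.
    src = Path(path)
    backup = src.with_name("CLAUDE.original.md")
    if not backup.exists():
        shutil.copy2(src, backup)
    src.write_text(compress(src.read_text()))
```

Keeping the first backup immutable means repeated runs stay reversible: the original is always one `mv` away, whatever the compressed file has become.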

Key Features

  • Three compression levels: Lite (minimal filler removal, grammatically coherent), Full (default; dropped articles, fragment sentences), Ultra (maximum compression with abbreviations)
  • Selective preservation: Code blocks, inline code, technical terms, error messages, URLs, file paths, and commit messages are explicitly excluded from compression
  • Cross-agent compatibility: Packaged as an Agent Skills module; activates via npx skills add JuliusBrussee/caveman and works across Claude Code, GitHub Copilot, Cursor, Windsurf, Cline, and 35+ other agents. Also available as a Codex plugin ($caveman trigger)
  • Natural language triggers: Activated by /caveman, “talk like caveman,” “caveman mode,” or “less tokens please.” Deactivated with “stop caveman” or “normal mode”
  • Caveman Compress companion tool: Python utility that compresses CLAUDE.md input context files, self-reporting ~45% reduction in project memory file token counts
  • Reasoning token agnostic: Caveman affects only output prose — Claude’s reasoning/thinking tokens (if extended thinking is enabled) are not reduced

Use Cases

  • Interactive CLI sessions: Developers using Claude Code interactively who want shorter, faster responses during debugging or exploration sessions. Output tokens are a meaningful fraction of cost and latency in non-agentic interactive use.
  • High-volume developer tooling: Teams running many short Claude Code sessions per day where output verbosity is a noticeable cost factor.
  • Learning token dynamics: Teams wanting a concrete, installable demonstration of how output verbosity affects token counts — useful as an educational tool even if not production-deployed.

Adoption Level Analysis

Small teams (<20 engineers): Marginally fits — mostly as a developer quality-of-life preference rather than a cost reduction mechanism. Token savings are real but modest in absolute terms for small teams.

Medium orgs (20–200 engineers): Does not meaningfully fit. At this scale, input token accumulation across long agent conversations, tool call results, and context windows dominates cost — not output verbosity. Caveman addresses the wrong part of the token budget for agentic workloads.

Enterprise (200+ engineers): Does not fit. Enterprise LLM cost optimization requires gateway-level routing, caching, and model tiering — not style constraints on individual sessions. Caveman is not a substitute for structured cost governance.
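The input-dominance point is easy to see with back-of-envelope arithmetic. The prices and token counts below are illustrative assumptions, not actual Anthropic rates:

```python
# Assumed per-token prices and session sizes, chosen only to illustrate
# the shape of the budget; real numbers vary by model and workload.
IN_PRICE = 3.00 / 1_000_000    # $/input token (assumed)
OUT_PRICE = 15.00 / 1_000_000  # $/output token (assumed)

def session_cost(in_tokens: int, out_tokens: int, out_reduction: float = 0.0) -> float:
    """Total cost with a fractional reduction applied to output tokens only."""
    return in_tokens * IN_PRICE + out_tokens * (1 - out_reduction) * OUT_PRICE

base = session_cost(20_000, 4_000)                      # no compression
caveman = session_cost(20_000, 4_000, out_reduction=0.65)
```

With these numbers a 65% output cut trims the total bill by only about a third, because input tokens carry most of the cost; in long agentic sessions the input share is typically far larger, shrinking the savings further.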

Alternatives

  • LiteLLM: gateway-level token budget enforcement, model routing, cost tracking. Prefer when you need organization-wide token cost control.
  • Prompt engineering (system prompt): craft a concise system prompt once per deployment. Prefer when you want brevity without installing a skill dependency.
  • LLM Gateway Pattern: architectural pattern for proxy-based cost governance. Prefer when you need cross-team, cross-model cost enforcement.
  • LLMlingua (Microsoft): algorithmic prompt compression preserving semantic information. Prefer when you need input token compression with measurable accuracy guarantees.
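For the system-prompt alternative, a single instruction block suffices. The wording below is a hand-written example, not taken from any of the projects above:

```python
# Hand-written example of a brevity-enforcing system prompt; adjust to taste.
CONCISE_SYSTEM_PROMPT = (
    "Answer tersely. No greetings, hedging, or restating the question. "
    "Keep code blocks, error messages, file paths, and commands verbatim."
)
```

Baking the constraint into the deployment's system prompt gives the same output-token savings as a skill, with no install step and no per-session trigger phrases.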

Evidence & Sources

Notes & Caveats

  • Self-reported benchmarks only. The author disclosed on Hacker News that the headline “~75%” figure (later revised to “~65% average”) “needs proper benchmarking before credibility.” All benchmark data is from a single run with no variance statistics, baseline controls, or independent replication.
  • Output tokens are usually not the bottleneck. In agentic Claude Code workflows, the input context window (tool call results, file contents, conversation history) dominates token costs — not output verbosity. Multiple Hacker News commenters flagged this as the fundamental limitation of the approach. The Caveman Compress tool partially addresses this, but also carries the risk of lossy compression degrading agent behavior.
  • Style constraints may affect reasoning quality. Constraining an LLM to respond in a particular style can reduce the quality of multi-step reasoning — the model may “think” in fewer tokens than optimal. The cited arXiv paper (2604.00025) does support brevity improving accuracy in some cases, but that paper studied decoding-level constraints, not style-mimicry prompt instructions, so applicability is indirect.
  • Compression of CLAUDE.md is a human DX risk. Compressed caveman-style project instructions are harder for humans to read, maintain, and debug when agent behavior deviates. The backup file mitigates data loss but not cognitive load.
  • Started as a joke. The author explicitly described the project as joke-originated on Hacker News. It has since gained genuine traction (coverage in multiple tech media outlets) but the engineering rigor expected of a production cost-reduction tool is not present.
  • Minimal security surface. The core skill is a markdown file with no executable code. The companion caveman-compress tool, however, is a Python script that modifies local files; as with any third-party script, read the source before running it.

Related