What It Does

Mini Coding Agent is a minimal, readable Python implementation of a coding agent harness created by Sebastian Raschka (PhD, author of “Build a Large Language Model (From Scratch)”). It is explicitly designed as an educational reference, not a production tool. The repository contains a single Python file with no dependencies beyond the standard library and uses Ollama for local model inference.

The project implements all six architectural components that Raschka identifies as central to production coding agents such as Claude Code and Codex CLI: live workspace context collection, stable-prefix prompt architecture for cache reuse, structured tool validation with approval gates, context minimization via clipping and deduplication, persistent session memory, and bounded subagent delegation. Its value is as a readable, bottom-up explanation of how those production systems work, not as a replacement for them.
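
The stable-prefix idea mentioned above can be illustrated with a minimal sketch. It is not taken from the repository: the function names (build_prompt, prefix_fingerprint), the max_recent parameter, and the prompt layout are assumptions chosen for illustration. The point is only that the system prompt, tool descriptions, and workspace summary are assembled identically on every turn, so a backend that caches prompt prefixes can reuse them.

```python
import hashlib


def build_prompt(system_prompt, tool_descriptions, workspace_summary,
                 transcript, user_input, max_recent=6):
    """Return (stable_prefix, dynamic_suffix) for one turn.

    Hypothetical helper: the real project may lay out its prompt differently.
    """
    # Sort tools so the prefix text does not depend on dict insertion order.
    tools = "\n".join(
        f"- {name}: {desc}" for name, desc in sorted(tool_descriptions.items())
    )
    stable_prefix = "\n\n".join([
        system_prompt,
        "Tools:\n" + tools,
        "Workspace:\n" + workspace_summary,
    ])
    # Only the most recent transcript entries and the new request change per turn.
    recent = transcript[-max_recent:]
    dynamic_suffix = "\n".join(recent + [f"User: {user_input}"])
    return stable_prefix, dynamic_suffix


def prefix_fingerprint(prefix):
    """Short hash used here only to show that the prefix stays unchanged."""
    return hashlib.sha256(prefix.encode("utf-8")).hexdigest()[:12]


tools = {"read_file": "Read a file", "list_files": "List files"}
p1, s1 = build_prompt("You are a coding agent.", tools, "branch: main",
                      ["User: hi", "Agent: hello"], "fix the bug")
p2, s2 = build_prompt("You are a coding agent.", tools, "branch: main",
                      ["User: hi", "Agent: hello", "User: fix the bug"],
                      "now add tests")
```

In this sketch the two turns share one prefix fingerprint while the suffix differs, which is the property a prefix cache depends on.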

Key Features

Zero external dependencies: Standard library only; run with python mini_coding_agent.py or uv run, with no pip install required
Ollama model backend: Uses locally running models (default: qwen3.5:4b); the model is selectable via a CLI flag
Live workspace snapshot: Collects git status, directory tree, and project documentation at the start of each session
Stable prefix architecture: Separates system prompt, tool descriptions, and workspace summary (stable, cache-friendly) from recent transcript and user input (dynamic)
Structured tool validation: Pre-defined tool set with argument validation and workspace path confinement checks; tools include list_files, read_file, search, shell_command, write_file
Approval gates: Risky tools (shell commands, file writes) blocked by default pending user confirmation; configurable to allow-all
Context management: Clips long tool outputs, deduplicates repeated file reads, compresses older transcript entries
Dual memory model: Full transcript (complete history) and distilled working memory (current task summary, important files, recent notes) persisted across sessions
Bounded subagent delegation: Can spawn scoped child agent instances for isolated subtasks
Session resumption: Persists transcript and memory to disk; resumable via CLI flag
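
The tool validation, path confinement, and approval gate features listed above can be sketched roughly as follows. Only the five tool names come from the feature list; the function names (resolve_in_workspace, check_tool_call), the "path" argument key, and the approve callback are hypothetical, and the project's own checks may be structured differently.

```python
from pathlib import Path

KNOWN_TOOLS = {"list_files", "read_file", "search", "shell_command", "write_file"}
RISKY_TOOLS = {"shell_command", "write_file"}  # gated behind approval


def resolve_in_workspace(workspace, user_path):
    """Resolve user_path against the workspace root; reject anything outside it."""
    root = Path(workspace).resolve()
    candidate = (root / user_path).resolve()  # collapses ".." and absolute paths
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"path escapes workspace: {user_path}")
    return candidate


def check_tool_call(name, args, workspace, approve=lambda name, args: False):
    """Return (allowed, reason). The default approve callback denies risky tools."""
    if name not in KNOWN_TOOLS:
        return False, f"unknown tool: {name}"
    if "path" in args:
        try:
            resolve_in_workspace(workspace, args["path"])
        except ValueError as exc:
            return False, str(exc)
    if name in RISKY_TOOLS and not approve(name, args):
        return False, "denied: approval required"
    return True, "ok"
```

An interactive harness would typically pass an approve callback that prompts the user, while an allow-all mode would pass one that always returns True.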

Use Cases

Educational use: Understanding the architecture of production coding agents (Claude Code, Codex CLI) through a minimal, annotated implementation
Experimentation platform: Testing harness design decisions (context management strategies, approval policies, memory architectures) with a codebase simple enough to modify in an afternoon
Local development: Small personal coding tasks using Ollama-hosted models without sending code to external APIs
Course material: Raschka’s “Ahead of AI” newsletter and forthcoming book use this as reference material for coding agent architecture
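
As a starting point for the context management experiments mentioned above, clipping and read deduplication might be sketched as follows. The transcript entry shape (dicts with "tool", "path", and "output" keys), the function names, and the placeholder strings are assumptions for illustration, not the project's actual data model.

```python
def clip_output(text, max_chars=2000):
    """Keep the head and tail of a long tool output and mark the omitted middle."""
    if max_chars < 2:
        raise ValueError("max_chars must be at least 2")
    if len(text) <= max_chars:
        return text
    head = max_chars // 2
    tail = max_chars - head
    omitted = len(text) - max_chars
    return f"{text[:head]}\n[... {omitted} characters clipped ...]\n{text[-tail:]}"


def dedupe_file_reads(entries):
    """Replace all but the latest read of each file with a short placeholder."""
    last_read = {}
    for index, entry in enumerate(entries):
        if entry.get("tool") == "read_file":
            last_read[entry["path"]] = index
    result = []
    for index, entry in enumerate(entries):
        if entry.get("tool") == "read_file" and last_read[entry["path"]] != index:
            # Copy the entry so the original transcript is left unmodified.
            result.append({**entry, "output": "[superseded by a later read of this file]"})
        else:
            result.append(entry)
    return result
```

Keeping both the head and the tail is one possible policy; error messages often sit at the end of an output, so head-only clipping can hide them.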

Adoption Level Analysis

Small teams (<20 engineers): Educational and personal use only. The agent is intentionally minimal and not optimized for robustness or performance. For serious team coding work, use Claude Code, Codex CLI, Gemini CLI, or OpenCode. Mini Coding Agent is the “read the source to understand how it works” tool, not the “ship with this” tool.

Medium orgs (20-200 engineers): Not recommended for production team use. The single-file architecture and Ollama-only backend limit extensibility. If you want a production-grade harness you control, evaluate Pi Coding Agent or OpenHands.

Enterprise (200+ engineers): Not applicable.

Alternatives

Alternative	Key Difference	Prefer when…
Pi Coding Agent	Production-grade, TypeScript, multi-provider, extensible	You want a minimal but production-capable harness you can extend
Codex CLI	OpenAI-backed, Rust, full feature set	You want a real coding agent, not an educational one
Claude Code	Anthropic-backed, full production agent	You want the best-in-class terminal agent experience
OpenHands	Production platform, multi-model, GUI + SDK	You want a full platform for autonomous coding agents

Evidence & Sources

Mini Coding Agent GitHub Repository — source code and documentation
Components of A Coding Agent (Ahead of AI, Sebastian Raschka) — companion article explaining the design
Ahead of AI Newsletter — Sebastian Raschka’s independent ML/AI newsletter with 150k+ subscribers

Notes & Caveats

Intentionally incomplete: The project README explicitly states it is “intentionally small and optimized for readability, not robustness.” Do not treat star count or Raschka’s credibility as a proxy for production suitability.
Ollama dependency for inference: Requires Ollama running locally with a compatible model. Performance depends entirely on local hardware and model choice. The default model (qwen3.5:4b) is too small for serious coding tasks; the README recommends qwen3.5:9b or larger.
Single-file architecture limits extensibility: The design choice of a single Python file makes it easy to read but hard to extend for production use cases. Adding authentication, logging, multi-user support, or cloud model backends requires significant restructuring.
Not actively maintained for production use: As a companion to educational content, the project is updated to reflect architectural points Raschka wants to illustrate, not to track production agent feature development.
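
To make the Ollama dependency noted above concrete, here is a minimal standard-library sketch of a call to a local Ollama server through its /api/generate endpoint. It assumes Ollama is listening on its default address (localhost:11434) and that the named model has already been pulled; the helper names are hypothetical, and the project's own request code may differ.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address


def build_generate_request(model, prompt):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling generate("qwen3.5:9b", "Say hello") would raise urllib.error.URLError if no server is running, which is a quick way to confirm the dependency before starting a session.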

Related

Deep Agents

Mistral Vibe

Aider

Claude Northstar