# Codex CLI: OpenAI's Local Coding Agent — Architecture and Practical Assessment
## Summary
Codex CLI is OpenAI’s open-source, terminal-based AI coding agent implemented in Rust. With 74.5k GitHub stars and 696+ releases (v0.120.0 released April 11, 2026), it is one of the most actively developed AI coding agents available today. The project provides a safety-first local execution environment wrapping OpenAI’s models with file read/write, shell execution, an interactive TUI, and cloud sandbox integration. This review covers the architecture, sandbox model, MCP integration, and practical trade-offs as of the v0.120.0 release.
## What It Is
Codex CLI is a multi-crate Rust workspace with four main components:
- `core/` — Business logic library, reusable for building native applications on top of Codex capabilities
- `exec/` — Headless CLI for programmatic, non-interactive operation and automation workflows
- `tui/` — Interactive full-screen terminal interface built with Ratatui
- `cli/` — Multitool that consolidates the above as subcommands
The project is distinct from Codex Web (chatgpt.com/codex) and Codex App (desktop). The CLI is the local, developer-controlled execution path.
## Key Technical Details

### Sandbox Architecture
The sandbox model has three tiers with platform-specific implementations:
| Mode | Access | Platform Implementation |
|---|---|---|
| `read-only` | Default; no file writes | macOS Seatbelt, Linux Landlock, Windows elevated policy |
| `workspace-write` | Writes scoped to the project directory | Same platform backends |
| `danger-full-access` | Unrestricted; explicit opt-in required | No sandbox |
Platform-native sandbox backends (Seatbelt on macOS, Landlock on Linux) are a genuine differentiator. Unlike approaches that rely on restricted shell environments, Codex CLI uses OS-level enforcement. Docker-based local sandboxing is also available for stronger network isolation.
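A minimal sketch of selecting a sandbox tier in Codex CLI's TOML configuration. The file location (`~/.codex/config.toml`) and the `sandbox_mode` key reflect common usage, but treat the exact schema as an assumption and confirm it against the shipped documentation:

```toml
# Hypothetical ~/.codex/config.toml fragment; the key name is an assumption.
# Valid values mirror the table above.
sandbox_mode = "workspace-write"   # read-only | workspace-write | danger-full-access
```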
### Approval Modes
Three execution approval modes control how the agent acts:
- `ask` (default) — Prompts for each potentially dangerous action; highest human oversight
- `auto` — Executes with guardrails; suitable for trusted workspaces
- `never` — Read-only mode; no writes or command execution permitted
### MCP Integration
Codex CLI operates in a dual MCP role:
- MCP client — Connects to external MCP servers, enabling tool extensions
- MCP server (experimental) — Exposes Codex capabilities so other agents can use it as a tool
The v0.120.0 release adds `outputSchema` details for MCP tools in code mode, improving structured-output reliability for agent-to-agent chaining scenarios.
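As an MCP client, Codex is typically pointed at external servers through configuration. The sketch below is illustrative: the `[mcp_servers.*]` table name, its fields, and the server package are assumptions, not a verified schema:

```toml
# Hypothetical config fragment registering an external MCP server.
[mcp_servers.docs]
command = "npx"
args = ["-y", "@example/docs-mcp-server"]   # placeholder package name
```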
### Configuration
Configuration uses TOML format (replacing legacy JSON). Settings include sandbox policies, model selection, custom hook scripts, and approval mode defaults.
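Putting those settings together, a configuration file might look like the following. Every key name and the model identifier here are assumptions for illustration, not a documented schema:

```toml
# Illustrative config.toml; keys and values are assumptions.
model = "gpt-5-codex"        # model selection (hypothetical identifier)
approval_policy = "ask"      # ask | auto | never (see Approval Modes)
sandbox_mode = "read-only"   # default sandbox tier
```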
### Context and Model Support
- 192K context window (nominal; community reports document exhaustion in practice well before that ceiling)
- Supports OpenAI models; no multi-provider routing
- Prompt caching: 75% discount on cached prefixes ($1.50 per million cached input tokens vs. $6 uncached)
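The cache economics can be made concrete with a little arithmetic. The sketch below uses the quoted rates ($6 per million uncached input tokens, $1.50 cached) on an invented session shape; both the rates' applicability and the token counts are illustrative:

```python
# Sketch of the prompt-cache economics described above, using the quoted
# rates. Rates and the session shape are illustrative assumptions.

UNCACHED_PER_M = 6.00   # USD per million uncached input tokens
CACHED_PER_M = 1.50     # USD per million cached-prefix tokens (75% off)

def input_cost(total_tokens: int, cached_tokens: int) -> float:
    """Input-side cost in USD for one request with a cached prefix."""
    uncached = total_tokens - cached_tokens
    return (uncached * UNCACHED_PER_M + cached_tokens * CACHED_PER_M) / 1_000_000

# A long agentic session: each turn resends a growing prefix that the
# cache mostly absorbs.
no_cache = input_cost(150_000, 0)          # every token billed at full rate
with_cache = input_cost(150_000, 140_000)  # large stable prefix cached

print(f"uncached: ${no_cache:.2f}, cached: ${with_cache:.2f}")
```

With a 140K-token stable prefix, the input-side cost of a 150K-token turn drops from $0.90 to $0.27, so the bulk of a session's repeated context rides the discount.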
### Cloud Sandbox
The Codex Web/cloud execution environment pre-loads repositories for parallel background task execution. This is powerful for delegating independent tasks but sends repository content to OpenAI infrastructure — a meaningful concern for IP-sensitive codebases.
## v0.120.0 Notable Changes (April 11, 2026)
- Realtime V2: Background agent progress streams while work continues; follow-up responses queue until active work completes
- TUI improvements: Hook activity display is more scannable; live hooks shown separately; completed output retained only when relevant
- Thread title support: Custom TUI status lines can include renamed thread titles
- MCP tool typing: Code-mode tools now include `outputSchema` for precise structured results
- SessionStart hooks: Can now distinguish sessions created via `/clear` from fresh startups or resumed sessions
- Bug fixes: Windows elevated sandbox handling, symlinked writable root permissions, remote WebSocket stability, tool search result ordering, Stop-hook prompt timing, MCP cleanup on disconnect
## Critical Assessment

### Strengths
Safety model is the clearest of the major terminal coding agents. Three-tier approval with platform-native sandbox enforcement is more rigorous than ad-hoc restricted shell approaches. The explicit danger-full-access opt-in creates a meaningful UX barrier to unsafe operation.
Active development cadence is exceptional. 696+ releases with 15+ contributors per release point to a team actively responding to production use. The Realtime V2 streaming in v0.120.0 addresses a real UX limitation: blocking on long-running tasks.
Rust binary eliminates runtime dependency issues. No Python environment, no Node.js version conflicts. Cross-platform binaries distributed via GitHub Releases with Homebrew cask support.
MCP dual role is architecturally interesting. Acting as both client and server enables Codex to be composed into larger agent pipelines, not just used as a standalone tool.
Cost economics favor ChatGPT subscribers. The 75% prompt cache discount plus subscription-included access makes Codex CLI one of the lower-cost paths for OpenAI API usage in extended agentic sessions.
### Weaknesses
Context window exhaustion contradicts the nominal 192K claim. Community reports are consistent: the auto-compression mechanism does not always trigger appropriately, and long sessions exceed practical limits before the theoretical ceiling. This is a reliability concern for complex codebase tasks.
Cloud sandbox creates data governance exposure. Sending repository content to OpenAI’s cloud for parallel execution is unacceptable for many enterprise or IP-sensitive environments. The local execution path avoids this, but removes the parallel task capability.
OpenAI vendor lock-in is absolute. There is no configuration path to route to alternative model providers. For organizations that require provider flexibility, open-source alternatives like OpenCode or Goose are better positioned.
Rust implementation limits hackability. The Apache-2.0 license permits modification, but meaningful customization requires Rust expertise. Python or TypeScript-based alternatives are more accessible to teams wanting to extend or inspect the harness.
Enterprise story remains underdeveloped. No centralized policy management, no audit logging, no enterprise-scoped configuration management across developer instances. ChatGPT Enterprise may address some concerns at the subscription level, but the CLI itself lacks enterprise-grade operational controls.
## Comparison to Direct Competitors
| Agent | Implementation | Multi-provider | Sandbox | Context | Notable |
|---|---|---|---|---|---|
| Codex CLI | Rust, local | No (OpenAI only) | OS-native + Docker | 192K nominal | Dual MCP, platform sandbox |
| Claude Code | TypeScript, local | No (Anthropic only) | Restricted shell | 200K | Enterprise plans, strongest task quality |
| Gemini CLI | TypeScript, local | Google only | Basic | 1M | Free 1k req/day tier |
| OpenCode | TypeScript, multi-provider | Yes | Basic | Varies by model | MIT, provider-agnostic |
| Goose | Python, local | Yes (MCP-native) | Basic | Varies | Neutral governance (AAIF) |
## Radar Recommendation
Assess — The safety model, active development, and MCP dual-role capability make Codex CLI worth hands-on evaluation for teams already in the OpenAI ecosystem. The context exhaustion issues and cloud sandbox data exposure require explicit evaluation before adoption. Not recommended as a default choice for organizations prioritizing vendor neutrality or enterprise controls.
The trajectory is promising: the Realtime V2 streaming, improved TUI, and MCP output schema work in v0.120.0 represent meaningful maturity improvements. Watch for resolution of the context management issues and enterprise policy controls before moving to Trial.