StrongDM Leash — Container-Based Policy Enforcement for AI Coding Agents

StrongDM Leash — Critical Review

TL;DR

Leash is an open-source (Apache 2.0) container-based sandbox that wraps AI coding agents (Claude, Codex, Gemini, Qwen, OpenCode) in monitored containers. It intercepts syscalls at the kernel level via eBPF, enforces Cedar policies on file access, network connections, process execution, and MCP tool calls, and exposes a Control UI for real-time observability. It is written in Go and backed by StrongDM (acquired by Delinea in March 2026).

What’s Interesting

  1. Kernel-level enforcement via eBPF: Unlike application-layer sandboxes, Leash hooks at the OS level, intercepting syscalls for filesystem, network, and process operations. The project claims under 1 ms of overhead per policy decision. This is a fundamentally different (and stronger) enforcement model than application-layer sandboxing, because the agent cannot opt out of checks that run in the kernel.

  2. Cedar as the policy substrate: Cedar (Amazon’s open-source policy language) is the single source of truth. Policies are transpiled to eBPF rules and HTTP proxy configs — “Cedar is the only persisted artifact. Generated IR never touches disk.” This is elegant: operators write declarative intent, enforcement happens at multiple layers automatically.

  3. MCP-native governance: First tool I’ve seen that directly parses Model Context Protocol traffic, correlates MCP tool calls with filesystem/network telemetry, and lets you write Cedar policies against specific MCP servers and tools. This is the right level of abstraction for agent governance.

  4. Developer-friendly onramp: npm install -g @strongdm/leash && leash --open claude gets you running. Automatic API key forwarding, bind-mount management, and a web UI at localhost:18080. The friction-to-value ratio is excellent.

  5. Default-deny posture: Everything is forbidden unless explicitly permitted. Forbid rules always win over permit rules. This is the only sane default for agent sandboxing.
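
The default-deny and forbid-wins semantics in point 5 can be sketched in a few lines. This is a minimal illustration of the evaluation order only, not Leash's or Cedar's implementation; the Rule type, the action labels ("file.open", "net.connect"), and the trailing-"*" prefix match are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rule:
    effect: str    # "permit" or "forbid"
    action: str    # hypothetical action label, e.g. "file.open"
    resource: str  # exact resource, or a prefix ending in "*"

    def matches(self, action: str, resource: str) -> bool:
        if self.action != action:
            return False
        if self.resource.endswith("*"):
            return resource.startswith(self.resource[:-1])
        return self.resource == resource


def decide(rules: Iterable[Rule], action: str, resource: str) -> str:
    matched = [r for r in rules if r.matches(action, resource)]
    if any(r.effect == "forbid" for r in matched):
        return "deny"   # a matching forbid overrides any permit
    if any(r.effect == "permit" for r in matched):
        return "allow"
    return "deny"       # nothing matched: default deny


# Hypothetical policy: the workspace is readable, except for one secrets file.
RULES = [
    Rule("permit", "file.open", "/workspace/*"),
    Rule("forbid", "file.open", "/workspace/.env"),
    Rule("permit", "net.connect", "api.example.com"),
]
```

With these rules, a request for /workspace/.env is denied even though the broader permit also matches it, and anything outside /workspace is denied because no rule mentions it.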

Critical Assessment

Strengths

  • Correct threat model: Treats AI agents as untrusted workloads, not trusted tools. Container isolation + syscall monitoring + policy enforcement is defense-in-depth done right.
  • Cedar policy language: Declarative, auditable, version-controllable. The transpilation to eBPF is a strong engineering choice.
  • Multi-agent support out of the box: Ships with Claude, Codex, Gemini, Qwen, OpenCode in a single container image. No per-agent setup friction.
  • MCP integration: First-mover on governing MCP tool calls at the infrastructure level rather than relying on the agent’s self-restraint.
  • Hot-reloadable policies: Can iterate on access rules without restarting agent sessions.
  • Open source (Apache 2.0): No license traps, auditable code, community-extensible.

Weaknesses & Gaps

  • MCP permit rules are informational-only in V1: Only forbid is enforced for MCP; permit just generates a linter warning. This means you can block MCP tools but can’t build a true allowlist yet.
  • No per-principal enforcement: All policies apply at the container/cgroup level. You can’t differentiate between two agents running in the same container.
  • No IPv6 or CIDR support: Network policies are host-based only. No subnet-level rules yet.
  • No argument filtering for ProcessExec: You can allow/deny /usr/bin/curl but can’t restrict what arguments it’s called with. A permitted binary can still do damage.
  • Linux/macOS only: WSL supported for Windows, but no native Windows containers.
  • Telemetry mismatch: The TELEMETRY.md doc only covers Statsig analytics events (start/session). The actual syscall monitoring and audit trail format is undocumented publicly — you have to read the Go source.
  • Delinea acquisition risk: StrongDM was acquired by Delinea (March 2026). Open-source commitment post-acquisition is unclear. The Apache 2.0 license provides a floor, but active maintenance could slow.
  • Young project: 512 stars, 75 commits, 5 open issues. Production readiness is unproven at scale.
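
To make the first gap concrete, here is a small model of forbid-only MCP enforcement. It is a sketch under stated assumptions: the McpRule type, the server and tool names, and the warning text are hypothetical and do not come from Leash's source or documentation.

```python
from typing import NamedTuple


class McpRule(NamedTuple):
    effect: str  # "permit" or "forbid"
    server: str  # hypothetical MCP server name
    tool: str    # hypothetical tool name


def lint(rules: list[McpRule]) -> list[str]:
    """Return a warning for every rule that is parsed but not enforced."""
    return [
        f"permit for {r.server}/{r.tool} is informational only"
        for r in rules
        if r.effect == "permit"
    ]


def is_blocked(rules: list[McpRule], server: str, tool: str) -> bool:
    """Only forbid rules affect the outcome; permit rules are ignored."""
    return any(
        r.effect == "forbid" and r.server == server and r.tool == tool
        for r in rules
    )


MCP_RULES = [
    McpRule("permit", "docs", "search"),
    McpRule("forbid", "shell", "run"),
]
```

In this model, shell/run is blocked, but a tool that no rule mentions (say shell/delete) still goes through. That is the difference between a blocklist and a true allowlist.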

Architecture Observations

The three-layer enforcement model is the most interesting design choice:

Cedar policy (declarative intent)
    ↓ transpile
eBPF rules (kernel-level file/net/proc enforcement)
HTTP proxy rules (header injection, TLS interception)
MCP observer (tool call interception)

This means a single Cedar policy file controls enforcement across three different mechanisms. The transpiler is the critical path — any bugs there are security bugs.
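
A rough sketch of why the transpiler sits on the critical path: one rule list fans out to several enforcement layers, and a rule that is routed to the wrong layer or silently dropped is never enforced. The action labels and layer names below are assumptions for illustration; Leash's actual intermediate representation is not described in this review.

```python
from collections import defaultdict

# Hypothetical mapping from action family to the layer that would enforce it.
LAYER_FOR_ACTION = {
    "file.open": "ebpf",
    "proc.exec": "ebpf",
    "net.connect": "ebpf",
    "http.rewrite": "http_proxy",
    "mcp.call": "mcp_observer",
}


def transpile(policy: list[dict]) -> dict[str, list[dict]]:
    """Group rules by enforcement layer.

    Raises ValueError on an unknown action so that no rule is silently dropped.
    """
    plan = defaultdict(list)
    for rule in policy:
        layer = LAYER_FOR_ACTION.get(rule["action"])
        if layer is None:
            raise ValueError(f"no enforcement layer for action {rule['action']!r}")
        plan[layer].append(rule)
    return dict(plan)


# Hypothetical policy expressed as plain dicts.
POLICY = [
    {"effect": "permit", "action": "file.open", "resource": "/workspace/*"},
    {"effect": "forbid", "action": "proc.exec", "resource": "/usr/bin/curl"},
    {"effect": "forbid", "action": "mcp.call", "resource": "shell/run"},
]
```

Failing loudly on an unrecognized action is the design choice worth noting: in a security transpiler, an unhandled rule should stop the reload rather than degrade to "not enforced".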

The container model uses bind-mounts for the project directory, which means file-level Cedar policies are the primary defense against agents modifying files outside the allowed scope. This is adequate but not as strong as a VM boundary.
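
The kind of scope check a file-level policy has to get right can be sketched as follows. This is an illustrative, purely lexical check and an assumption about the general technique, not Leash's code: it collapses "." and ".." but does not resolve symlinks, which real enforcement would have to handle at the kernel level.

```python
import posixpath


def within_scope(path: str, allowed_root: str) -> bool:
    """True if `path`, after collapsing '.' and '..', lies under `allowed_root`.

    Relative paths are interpreted relative to `allowed_root`.
    """
    root = posixpath.normpath(allowed_root)
    target = posixpath.normpath(posixpath.join(root, path))
    # Compare against root + "/" so "/workspace-other" is not mistaken
    # for a child of "/workspace".
    return target == root or target.startswith(root + "/")
```

For example, within_scope("../etc/passwd", "/workspace") is false because the path normalizes to /etc/passwd, while within_scope("src/main.go", "/workspace") is true.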

Competitive Landscape

Tool                     | Approach                            | Key Difference
Leash                    | Container + eBPF + Cedar policies   | Policy-first, kernel-level enforcement, MCP-native
E2B                      | Firecracker microVMs                | Stronger isolation (VM boundary), but no policy language, 24h session limit
Daytona                  | Docker-compatible sandboxes         | Fastest provisioning (<90ms), but no policy enforcement layer
Northflank               | Kata Containers + gVisor            | Production-grade multi-tenant, but not agent-specific
Modal                    | gVisor + Python-native              | Great for Python workloads, no MCP or agent-specific governance
Native agent permissions | Claude’s built-in permission system | Convenient but relies on agent self-enforcement, not OS-level

Leash occupies a unique niche: it’s the only tool combining container isolation with a declarative policy language AND MCP-level governance. E2B has stronger isolation (VM vs container) but no policy expressiveness. Native agent permission systems (like Claude Code’s) work at the application layer and can theoretically be bypassed.

Who Should Care

  • Platform/DevOps teams deploying AI coding agents across an engineering org — Leash gives centralized policy control
  • Security teams that need audit trails of agent actions at the OS level
  • Regulated industries where “what did the AI agent do?” must be answerable
  • Anyone running agents against production infrastructure where a runaway agent could cause real damage

Who Should Wait

  • Teams needing Windows-native support
  • Organizations requiring per-principal (per-agent) policy differentiation
  • Anyone needing MCP allowlist enforcement (V1 only supports blocklist)
  • Teams that need IPv6 or subnet-level (CIDR) network policies

Verdict

Assess. The architecture is compelling and the threat model is right, but V1 limitations (MCP permit rules are informational only, no per-principal enforcement, no ProcessExec argument filtering) and the young codebase (75 commits) mean it is not yet production-ready for high-stakes environments. The Delinea acquisition adds uncertainty. Worth tracking closely and experimenting with in dev environments. If MCP governance is a priority for your team, this is currently the only real option.
