OpenHands

★ New · AI/ML · open-source · MIT · freemium (trial)

What It Does

OpenHands is an open-source platform for building and running autonomous AI coding agents. Agents interact with codebases the way a human developer would: reading and editing files, running terminal commands, browsing the web, and executing multi-step development tasks end-to-end. The platform provides a sandboxed Docker runtime for safe code execution, supports multiple LLM providers (Anthropic Claude, OpenAI GPT, Google Gemini, DeepSeek, Qwen, local Ollama models), and ships four distinct interfaces: a CLI, a local web GUI, a Python SDK for programmatic agent orchestration, and a hosted cloud platform.
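The read-edit-run loop described above can be sketched conceptually. This is an illustrative toy, not the actual OpenHands implementation: the function names are invented, the "LLM" is a scripted stub, and the workspace is a plain temporary directory rather than the Docker sandbox the real platform uses.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Toy agent loop: a model proposes actions (edit a file, run a command),
# an executor applies them in a workspace, and the observation is fed
# back until the model signals completion.

def scripted_llm(history):
    """Stand-in for a real LLM call: returns the next action dict."""
    step = len(history)
    if step == 0:
        return {"type": "edit", "path": "hello.py", "content": "print('hi')\n"}
    if step == 1:
        return {"type": "run", "command": [sys.executable, "hello.py"]}
    return {"type": "finish"}

def run_agent(llm, workspace: Path, max_steps: int = 10):
    history = []
    for _ in range(max_steps):
        action = llm(history)
        if action["type"] == "finish":
            break
        if action["type"] == "edit":
            (workspace / action["path"]).write_text(action["content"])
            observation = f"wrote {action['path']}"
        elif action["type"] == "run":
            result = subprocess.run(
                action["command"], cwd=workspace,
                capture_output=True, text=True, timeout=30,
            )
            observation = result.stdout + result.stderr
        history.append((action, observation))
    return history

with tempfile.TemporaryDirectory() as d:
    trace = run_agent(scripted_llm, Path(d))
    for action, obs in trace:
        print(action["type"], "->", obs.strip())
```

The real platform's value is in what this sketch omits: the Docker isolation around the `run` step, the browser tool, and the prompting that makes a frontier model produce sensible action sequences.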

Originally called OpenDevin, the project emerged from CMU and UIUC research and was published at ICLR 2025. The commercial entity All Hands AI provides the cloud and enterprise tiers while the core remains MIT-licensed.

Key Features

  • Docker-sandboxed code execution environment isolating agent actions from host system
  • Model-agnostic architecture supporting Claude, GPT, Gemini, DeepSeek, Qwen, and local models via Ollama
  • Software Agent SDK (Python + REST API) for defining custom agents with built-in tools (file editor, terminal, task tracker)
  • CLI interface comparable to Claude Code or Codex for interactive terminal-based development
  • Local web GUI with real-time observation of agent reasoning and actions
  • Cloud platform with GitHub, GitLab, Bitbucket, Slack, Jira, and Linear integrations
  • Public skills marketplace for distributing reusable agent capabilities
  • OpenHands Index — a multi-domain benchmark evaluating LLMs across five software engineering task types (issue resolution, greenfield dev, frontend dev, test generation, information gathering)
  • SWE-bench Verified score of 77.6% (as of early 2026), claimed to be the top open-source agent on the leaderboard
  • Enterprise self-hosted deployment via Kubernetes Helm charts with RBAC and multi-tenancy
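The model-agnostic design is driven by configuration. A hedged sketch of a `config.toml` follows; the `[llm]` section and its keys reflect the project's documented config format but should be verified against current docs, and the model strings follow LiteLLM conventions (which is how the platform addresses many providers through one interface):

```toml
[llm]
model = "anthropic/claude-3-5-sonnet-20241022"  # example frontier model ID
api_key = "sk-..."                               # elided

# Swapping providers is intended to be a config change, e.g. a local
# Ollama model instead (see the caveat on local-model quality below):
# model = "ollama/qwen2.5-coder:32b"
# base_url = "http://localhost:11434"
```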

Use Cases

  • Automated bug fixing and PR creation from issue trackers (GitHub Issues, Jira, Linear)
  • Code migration and dependency upgrades across microservice fleets
  • Vulnerability triage and automated patching at scale
  • Parallel agent orchestration for large refactoring or migration campaigns
  • Research and evaluation platform for testing new LLMs on software engineering benchmarks
  • Enterprise teams needing model-agnostic, self-hosted AI coding infrastructure to avoid vendor lock-in
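The "issues in, PRs out" use case usually needs a triage layer deciding which issues are worth handing to an agent. A minimal sketch: the selection heuristic is hypothetical, the issue dicts mirror the GitHub REST API shape (`GET /repos/{owner}/{repo}/issues`), and `dispatch_to_agent` is a stub standing in for an OpenHands cloud or SDK call.

```python
# Heuristic triage: small, well-labeled bugs are the best agent targets;
# open-ended or heavily debated issues go to humans.

def eligible_for_agent(issue, max_comments=5):
    labels = {label["name"] for label in issue.get("labels", [])}
    return (
        issue.get("state") == "open"
        and "bug" in labels
        and "needs-design" not in labels              # skip open-ended work
        and issue.get("comments", 0) <= max_comments  # skip long debates
    )

def dispatch_to_agent(issue):
    # Placeholder: a real integration would create an agent task that
    # checks out the repo, attempts a fix, and opens a draft PR.
    return f"queued agent task for issue #{issue['number']}"

issues = [
    {"number": 101, "state": "open", "comments": 2,
     "labels": [{"name": "bug"}]},
    {"number": 102, "state": "open", "comments": 14,
     "labels": [{"name": "bug"}]},
    {"number": 103, "state": "open", "comments": 0,
     "labels": [{"name": "bug"}, {"name": "needs-design"}]},
]

queued = [dispatch_to_agent(i) for i in issues if eligible_for_agent(i)]
print(queued)  # only #101 qualifies
```

Filtering like this also bounds spend, which matters given the per-task costs discussed below.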

Adoption Level Analysis

Small teams (<20 engineers): Possible but with friction. The CLI and local GUI work well for individual developers. However, useful autonomous coding requires frontier LLM API access (Claude, GPT-4+), which costs $3+/task based on real-world reports. Local models via Ollama produce dramatically worse results — 14-32B models managed only 1-2 actions before losing context in independent testing. Docker dependency for the sandbox adds setup overhead. Cost-effective for occasional use, but not a game-changer for small teams at current LLM pricing.

Medium orgs (20-200 engineers): Good fit. The cloud platform and SDK enable shared infrastructure for AI-assisted development. GitHub/GitLab integrations and multi-user support make it viable as a team tool. The model-agnostic architecture provides negotiating leverage with LLM providers. Cost management becomes important — heavy usage runs $100-200/month per active developer in LLM API costs alone.
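The per-task and per-developer figures above imply a rough budget model. A quick sanity check, assuming the ~$3/task figure and a 22-workday month: the $100-200/month range corresponds to roughly 2-3 agent tasks per developer per workday.

```python
# Back-of-envelope budget check using the figures cited above.
COST_PER_TASK = 3.0  # USD per task, from real-world reports

def monthly_cost(tasks_per_day, workdays=22):
    return tasks_per_day * workdays * COST_PER_TASK

for tasks in (1, 2, 3):
    print(f"{tasks} task(s)/day -> ${monthly_cost(tasks):.0f}/month")
```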

Enterprise (200+ engineers): Viable but enterprise product is still maturing. Self-hosted Kubernetes deployment via Helm charts exists but is self-described as having “gotchas.” The PostgreSQL-backed multi-tenancy migration was targeted for April 2026 completion. For organizations with strict data residency or air-gapped requirements, this is one of the few open-source options. However, enterprises should evaluate RBAC maturity, audit logging completeness, and the dual-license model (MIT core + commercial enterprise directory) before committing.

Alternatives

| Alternative | Key Difference | Prefer when… |
| --- | --- | --- |
| Claude Code | Single-model (Anthropic), CLI-only, more polished autonomous coding experience | You are committed to the Anthropic ecosystem and want the most refined CLI agent experience |
| Codex (OpenAI) | Single-model (OpenAI), async task delegation model | You want fire-and-forget task delegation with OpenAI models |
| Devin (Cognition) | Fully managed SaaS, proprietary, most autonomous | You want maximum autonomy without infrastructure management |
| Goose (Block) | MCP-native, lighter weight, community-governed via AAIF | You want a simpler agent with strong MCP ecosystem integration |
| OpenCode | MIT-licensed, TUI + desktop app, lighter footprint | You want a simpler open-source alternative without sandboxed execution overhead |

Notes & Caveats

  • Benchmark score nuance: The 77.6% SWE-bench Verified score reflects the combined system (OpenHands harness + frontier LLM). Performance collapses dramatically with smaller or local models. The score is heavily model-dependent, not platform-dependent.
  • SWE-bench Verified vs Live gap: Across all agents, SWE-bench Verified scores (60%+) far exceed SWE-bench Live scores (~19%), suggesting possible memorization effects in the static benchmark. METR found roughly half of test-passing SWE-bench PRs would not be merged by maintainers.
  • Local model quality: Independent testing found Ollama models (7B-70B) effectively unusable for autonomous coding with OpenHands. Only frontier models produce useful results.
  • Enterprise maturity: The Helm chart for self-hosted deployment is acknowledged as work-in-progress by the project itself. PostgreSQL-backed multi-tenancy targeted April 2026 completion.
  • Credentials and secrets: No native secrets management. GitHub tokens work via the web interface, but other credentials require workarounds (pasting secrets into prompts or passing them as environment variables), creating security exposure.
  • Git operations: Multiple independent reports of agents struggling with git operations — pushing to wrong branches, failing to use credentials correctly, inability to interact with PR comments programmatically.
  • Cost at scale: ~$3/task for simple microservice upgrades. Heavy usage estimated at $100-200/month per developer in LLM API costs. The platform cost is secondary to the LLM cost.
  • Dual licensing: Core MIT, enterprise directory source-available with commercial license. Docker images are MIT. This is a legitimate open-core model but teams should understand what requires a paid license.
  • Name history: Project was originally called “OpenDevin” before rebranding to OpenHands, which may cause confusion in older references and search results.