Hermes Agent

★ New
assess
AI / ML · MIT open-source

What It Does

Hermes Agent is an open-source (MIT) self-improving AI agent built by Nous Research. Its core differentiator is autonomous skill creation: when the agent completes a complex task, it writes a reusable Markdown skill file that it can reference in future sessions. This is retrieval-based learning (not model weight retraining) — the agent accumulates procedural knowledge as files, persists them across sessions, and retrieves them via FTS5 full-text search combined with LLM summarization.
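
The retrieval half of this loop can be sketched with SQLite's built-in FTS5 module, which ships with Python's standard library. The table name, columns, and skills below are hypothetical, not Hermes' actual schema:

```python
import sqlite3

# Hypothetical FTS5 index over accumulated skill files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE skills USING fts5(name, body)")
conn.executemany(
    "INSERT INTO skills (name, body) VALUES (?, ?)",
    [
        ("deploy-docker", "Steps for building and deploying a Docker image."),
        ("rotate-logs", "How to rotate and compress log files with cron."),
    ],
)

def search_skills(query: str, limit: int = 3) -> list[str]:
    """Rank stored skills by FTS5 relevance for the current task."""
    rows = conn.execute(
        "SELECT name FROM skills WHERE skills MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [name for (name,) in rows]

print(search_skills("docker deploy"))  # → ['deploy-docker']
```

In the described design, the top FTS5 hits would then be passed to the LLM for summarization before being injected into the agent's context.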

The agent operates as a Python process that connects to LLM providers (OpenRouter, OpenAI, Nous Portal, z.ai/GLM, Kimi/Moonshot, MiniMax), executes 40+ built-in tools, manages cross-session memory, and communicates across messaging platforms (Telegram, Discord, Slack, WhatsApp, Signal, Email) and a terminal UI. It supports six deployment backends: local, Docker, SSH, Daytona, Singularity, and Modal. The project integrates Honcho (by Plastic Labs) for dialectic user modeling and is compatible with the Agent Skills specification (agentskills.io).
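
Several of the listed providers (OpenRouter, OpenAI, Moonshot) expose OpenAI-compatible HTTP APIs, so provider switching largely reduces to swapping a base URL and API key. A minimal sketch, with an illustrative mapping and environment-variable names that are assumptions rather than Hermes' actual configuration:

```python
import os

# Illustrative base URLs for OpenAI-compatible providers; this mapping
# and the env-var names are assumptions, not Hermes config.
PROVIDERS = {
    "openai": ("https://api.openai.com/v1", "OPENAI_API_KEY"),
    "openrouter": ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
    "moonshot": ("https://api.moonshot.ai/v1", "MOONSHOT_API_KEY"),
}

def resolve_provider(name: str) -> tuple[str, str]:
    """Return (base_url, api_key) for a provider, erroring on unknown names."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider: {name}")
    base_url, key_env = PROVIDERS[name]
    return base_url, os.environ.get(key_env, "")

base_url, api_key = resolve_provider("openrouter")
print(base_url)  # → https://openrouter.ai/api/v1
```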

Key Features

  • Autonomous skill creation: After complex tasks, the agent writes reusable Markdown skill files stored persistently. Skills are agent-curated, not community-contributed.
  • Cross-session memory: FTS5-based full-text search with LLM summarization for session recall. Memory persists across restarts and sessions.
  • Multi-platform messaging gateway: Telegram, Discord, Slack, WhatsApp, Signal, Email, and terminal UI from a single agent process.
  • Multi-provider LLM support: OpenRouter (300+ models), OpenAI, Nous Portal, z.ai, Kimi/Moonshot, and MiniMax, with command-line switching.
  • Six deployment backends: Local, Docker, SSH, Daytona, Singularity, Modal. Serverless hibernation support via Modal and Daytona.
  • Subagent spawning: Parallel workstream execution via subagent creation (details sparse in documentation).
  • Honcho integration: Dialectic user modeling from Plastic Labs for persistent personality and preference understanding.
  • Agent Skills spec compatibility: Skills conform to the agentskills.io open standard for cross-agent portability.
  • Cron scheduling: Built-in task scheduler for automated recurring agent operations.
  • Tinker-Atropos RL: Optional reinforcement learning training integration via Git submodule.
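
Skill files in the agentskills.io style are plain Markdown with YAML frontmatter. The exact fields Hermes emits are not documented here, so the following is an illustrative sketch rather than a real generated skill:

```markdown
---
name: rotate-logs
description: Rotate and compress log files on a Linux host using logrotate.
---

# Rotate logs

1. Check existing configuration under /etc/logrotate.d/.
2. Add a stanza for the new log path with `compress` and `rotate 7`.
3. Dry-run with `logrotate -d` before applying.
```

Because the format is just frontmatter plus Markdown, skills written by one agent are, in principle, portable to any other agent that implements the spec.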

Use Cases

  • Personal AI assistant across platforms: Individuals wanting a single agent that grows smarter over time, accessible from any messaging app, with persistent memory of past interactions and preferences.
  • Technical task automation: Developers automating repetitive tasks (file management, code generation, API calls) where the agent learns and improves its approach across repeated sessions.
  • Self-hosted private agent: Privacy-conscious users or organizations needing an agent that runs entirely on their infrastructure (local models via Ollama + $5 VPS) with no data leaving their environment.
  • Multi-agent research workflows: Researchers using subagent spawning and the RL integration (Tinker-Atropos) for agent behavior experiments.
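
For the self-hosted path, Ollama exposes a local HTTP API by default. A minimal sketch of building a request for its chat endpoint from the standard library; the endpoint and response shape are Ollama defaults, while the model name and message are illustrative and not Hermes-specific wiring:

```python
import json

# Ollama's default local chat endpoint (started via `ollama serve`).
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, user_msg: str) -> dict:
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "stream": False,  # ask for one JSON response instead of a stream
    }

body = json.dumps(build_chat_request("llama3.1", "Summarize today's notes"))
print(body)
# POSTing `body` to OLLAMA_URL (with the model pulled locally) returns
# a JSON object of the form {"message": {"role": "assistant", ...}, ...}.
```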

Adoption Level Analysis

Small teams (<20 engineers): Good fit. MIT license, minimal hardware requirements ($5 VPS claim is technically true for the orchestration layer), extensive model provider options including free tiers. The self-improving skill system reduces repeat configuration. Main risk: Python 3.11+ dependency and the need to self-host and debug the agent process.

Medium orgs (20-200 engineers): Moderate fit with caveats. The multi-channel messaging gateway is useful for teams wanting a shared AI assistant across Slack and other platforms. However, there are no published enterprise governance features, no multi-tenancy, and no audit logging. The single-process Python architecture may bottleneck under concurrent usage. No published scaling benchmarks exist.

Enterprise (200+ engineers): Does not fit without significant engineering investment. No commercial support, no SLA, no SOC2, no compliance features. The crypto/token association (NOUS token by Nous Research) may raise compliance concerns in regulated industries. The agent’s self-improving nature (writing its own skill files) introduces unpredictability that enterprise compliance teams may resist.

Alternatives

  • OpenClaw: Node.js gateway, 5,400+ community skills, 25+ channels, Mission Control dashboard. Prefer when you need the largest skills ecosystem and broadest channel support, and accept the security risks.
  • Claude Code: Anthropic's CLI agent with Auto-Dream memory and a layered context system. Prefer when you are in a coding-focused workflow and want tight Anthropic model integration with proven memory consolidation.
  • Goose: MCP-native, donated to AAIF, broader extension architecture. Prefer when you want a neutral governance model (Linux Foundation) and deep MCP integration.
  • OpenHands: Model-agnostic coding platform with SDK, CLI, GUI, and cloud. Prefer when you need a mature platform (70k+ stars) with commercial cloud tiers and SWE-bench leadership.

Notes & Caveats

  • “Self-improving” is retrieval-based, not weight-based. The agent does not retrain its underlying model. It writes Markdown skill files and retrieves them via FTS5 search. This is valuable but should not be confused with model-level improvement. The marketing framing as a “learning loop” overstates the mechanism.
  • Mid-session memory lag. Edits to MEMORY.md or USER.md made during a session only take effect in the next session, not mid-conversation. This is a meaningful UX friction point for long sessions.
  • Crypto/token association. Nous Research has a NOUS token; the project is funded by crypto VC Paradigm at a reported $1B valuation. The agent may serve as a user acquisition funnel for the token ecosystem. This does not diminish the technical quality but introduces a potential misalignment of incentives.
  • No independent security audit. Unlike OpenClaw (which has extensive CVE documentation, SafeClaw-R audit, and academic security analyses), Hermes Agent has not been subject to published independent security review. The 40+ tools and file system access represent a significant attack surface.
  • Limited scaling evidence. No published case studies document Hermes Agent running at organizational scale. The 24.7k stars indicate strong individual adoption but not necessarily production reliability.
  • Tinker-Atropos RL is optional and research-grade. The RL training integration is a Git submodule, not part of the default installation. Its maturity and applicability to typical users are unclear.
  • Python 3.11+ requirement. Excludes environments stuck on older Python versions, which is common in enterprise Linux distributions.