What It Does

ClawKeeper is a three-layer real-time security framework designed specifically for OpenClaw autonomous agents. It addresses the well-documented security vulnerabilities that arise when AI agents have broad operational privileges (shell execution, file access, tool integration) by implementing defense-in-depth across three complementary architectural layers: skill-based protection at the instruction level, plugin-based enforcement at runtime, and watcher-based independent monitoring at the system level.

The project originated from academic research by a team with ties to Microsoft Research Asia and Beijing University of Posts and Telecommunications. It was released alongside arXiv paper 2603.24414 in March 2026. The watcher-based layer is the most architecturally distinctive feature: it operates as decoupled middleware that monitors the agent’s state as it evolves and can halt high-risk actions or require human confirmation for them, without coupling to the agent’s internal logic.
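To make the watcher idea concrete, here is a minimal sketch of a decoupled monitor that sees only a stream of observed agent events and returns one of three verdicts: allow, require human confirmation, or halt. It is not ClawKeeper's code or API. Every name in it (AgentEvent, Verdict, Rule, evaluate) and both example rules are assumptions made for illustration.

```typescript
// Hypothetical types: none of these names come from ClawKeeper or OpenClaw.
type Verdict = "allow" | "confirm" | "halt";

interface AgentEvent {
  kind: "shell" | "file_write" | "tool_call";
  detail: string; // e.g. the command line or the target path
}

// A rule inspects one observed event and returns the verdict it would apply.
type Rule = (event: AgentEvent) => Verdict;

// Ordering used to pick the strictest verdict when several rules fire.
const severity: Record<Verdict, number> = { allow: 0, confirm: 1, halt: 2 };

// Illustrative rules only; real policies would need to be far more careful.
const rules: Rule[] = [
  (e) => (e.kind === "shell" && /\brm\s+-rf\b/.test(e.detail) ? "halt" : "allow"),
  (e) => (e.kind === "file_write" && e.detail.indexOf(".ssh") !== -1 ? "confirm" : "allow"),
];

// The watcher works from the event alone; it never reads the agent's
// prompts or internal reasoning, which is what keeps it decoupled.
function evaluate(event: AgentEvent, active: Rule[] = rules): Verdict {
  let worst: Verdict = "allow";
  for (const rule of active) {
    const verdict = rule(event);
    if (severity[verdict] > severity[worst]) worst = verdict;
  }
  return worst;
}
```

Taking the strictest verdict across all rules means a single halt rule cannot be outvoted by permissive ones, which seems the safer default for a gate of this kind.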

Key Features

Three-layer defense architecture: Skill-based (instruction injection), plugin-based (runtime enforcer), watcher-based (decoupled system monitor) — each can be deployed independently or together
Real-time action gating: Evaluates agent actions before execution, blocking high-risk behaviors including prompt injection attempts and credential leakage
Behavioral profiling: Establishes baselines for agent operations and detects anomalies in behavior patterns
Intent enforcement: Monitors multi-turn interactions to detect and prevent goal drift across conversation turns
Configuration integrity monitoring: Alerts when security-weakening configuration changes are made to the OpenClaw instance
Third-party skill monitoring: Inspects community-contributed skills for malicious behavior before and during execution
Comprehensive audit logging: Records all agent actions for compliance and post-incident analysis
Cross-platform support: TypeScript core (87.9% of the codebase) with Swift, Kotlin, and Shell components for macOS, Linux, and Windows
Cloud and local deployment: Watcher layer supports both self-hosted and cloud deployment models
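As an illustration of the configuration integrity feature listed above, the following sketch compares a trusted configuration snapshot with the current one and reports only changes that loosen a control. It is an assumption-laden example, not ClawKeeper's implementation: the AgentConfig shape, its three keys, and the findWeakeningChanges function are all hypothetical and do not reflect OpenClaw's real settings.

```typescript
// Hypothetical config shape; these keys are illustrative, not OpenClaw's real settings.
interface AgentConfig {
  sandboxEnabled: boolean;
  requireApprovalForShell: boolean;
  allowedHosts: string[];
}

interface Alert {
  key: keyof AgentConfig;
  message: string;
}

// Compare a trusted snapshot with the current config and report only
// changes that loosen a control; tightening changes produce no alert.
function findWeakeningChanges(trusted: AgentConfig, current: AgentConfig): Alert[] {
  const alerts: Alert[] = [];
  if (trusted.sandboxEnabled && !current.sandboxEnabled) {
    alerts.push({ key: "sandboxEnabled", message: "sandbox was disabled" });
  }
  if (trusted.requireApprovalForShell && !current.requireApprovalForShell) {
    alerts.push({ key: "requireApprovalForShell", message: "shell approval was removed" });
  }
  // Any host present now but absent from the trusted snapshot widens network access.
  const added = current.allowedHosts.filter((h) => trusted.allowedHosts.indexOf(h) === -1);
  if (added.length > 0) {
    alerts.push({ key: "allowedHosts", message: `hosts added: ${added.join(", ")}` });
  }
  return alerts;
}
```

Checking direction rather than mere difference keeps the alert stream quiet when an operator tightens settings, so that each alert corresponds to a genuine loss of protection.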

Use Cases

Hardening personal OpenClaw deployments: Individual developers or small teams running OpenClaw locally who want defense against prompt injection, credential leakage, and malicious skill execution without changing their OpenClaw setup.
Research testbed for agent security: Academic teams studying AI agent safety who need a reference implementation of multi-layer defense to benchmark against or extend.
OpenClaw pilot projects with security requirements: Organizations evaluating OpenClaw for internal use cases that require demonstrable security controls before approval.

Adoption Level Analysis

Small teams (<20 engineers): Reasonable fit for teams already using OpenClaw. The skill-based layer is zero-cost to try (just inject markdown into agent context). The plugin and watcher layers require Node.js deployment alongside OpenClaw. MIT license and straightforward setup lower the barrier. However, this is a v1.0 research release — expect rough edges, limited documentation, and no commercial support.

Medium orgs (20-200 engineers): Poor fit today. No production case studies, no enterprise features (RBAC, multi-tenant, centralized management), and no evidence of scaling beyond single-agent deployments. Medium orgs needing agent security should evaluate commercial options like StrongDM Leash or NVIDIA NanoClaw, or wait for ClawKeeper to mature.

Enterprise (200+ engineers): Does not fit. No audit certifications, no SLA, no commercial support, no integration with enterprise security tooling (SIEM, SOAR). The research paper quality is encouraging but insufficient for enterprise adoption. Enterprises should look at purpose-built commercial agent security platforms.

Alternatives

Alternative	Key Difference	Prefer when…
Leash by StrongDM	eBPF-based kernel-level interception + Cedar policies; commercial backing	You need proven kernel-level enforcement and policy-as-code for AI coding agents
SafeClaw-R (arXiv 2603.28807)	Execution graph mediation; 97.8% malicious skill detection	You need strong third-party skill vetting, or want an academic point of comparison for ClawKeeper
“Don’t Let the Claw Grip Your Hand” (arXiv 2603.10387)	MITRE ATLAS/ATT&CK-derived rules + semantic judge + HITL	You want defense aligned with established threat frameworks (MITRE)
RAD Security clawkeeper	Bash CLI host auditor (42 checks); completely different scope	You need host-level security auditing, not agent-runtime protection
NVIDIA NanoClaw	Enterprise security wrapper with OS-level sandboxing + YAML policy engine	You need enterprise-grade, vendor-backed agent security

Evidence & Sources

ClawKeeper GitHub Repository (SafeAI-Lab-X)
arXiv 2603.24414: ClawKeeper Paper — 22 pages, 14 figures, 5 tables; preprint, not peer-reviewed
HuggingFace Paper Page — 169 upvotes, active discussion
arXiv 2603.10387: Don’t Let the Claw Grip Your Hand — competing defense framework
arXiv 2603.28807: SafeClaw-R — competing approach with execution graph mediation
arXiv 2603.27517: Systematic Taxonomy of OpenClaw Vulnerabilities — independent vulnerability classification
Chaozhuo Li’s Academic Profile — MSRA 2020-2024, 100+ papers

Notes & Caveats

OpenClaw-specific: ClawKeeper is tightly coupled to OpenClaw’s architecture. It is not a general-purpose agent security framework. If you migrate away from OpenClaw, ClawKeeper does not follow.
Self-benchmarked only: The 140-instance benchmark was constructed by the same team that built ClawKeeper. No independent reproduction or third-party validation exists. “Optimal defense performance” is an unverified claim.
Skill-based layer limitations: The instruction-injection approach (skill-based layer) relies on the LLM voluntarily obeying security instructions in its context. This is fundamentally the same mechanism that prompt injection attacks exploit — injecting competing instructions. Sophisticated adversaries may bypass this layer.
Name collision with RAD Security project: A completely separate commercial project also named “clawkeeper” exists at github.com/rad-security/clawkeeper. This is a bash-based host auditing CLI, not a runtime security framework. The name collision will cause confusion in searches.
No production deployments documented: As of April 2026, no case studies, post-mortems, or testimonials from real-world ClawKeeper deployments exist. The project has 305 GitHub stars, indicating community interest but not production validation.
Rapidly crowded space: At least four independent research groups published OpenClaw security papers in March 2026 alone. ClawKeeper is one of several competing approaches, and the “winner” in this space is far from determined.
Preprint status: The paper has not been peer-reviewed. arXiv preprints do not undergo the same scrutiny as published conference or journal papers. The 169 HuggingFace upvotes indicate community interest but not academic validation.

ClawKeeper

At a Glance