What It Does
Agent Skill Supply Chain Risk is an emerging security threat pattern specific to the AI agent skills ecosystem. It describes the class of attacks that exploit the trust chain between skill registries (skills.sh, ClawHub, agentskill.sh), skill authors, and the AI agents that consume skill content. Unlike traditional software supply chain attacks (malicious npm packages, PyPI typosquatting), agent skill attacks exploit a unique attack surface: skills combine natural-language instructions that influence model behavior with executable scripts and tool configurations that agents run autonomously.
The pattern encompasses five documented attack mechanisms: MCP tool poisoning (hiding malicious instructions in tool descriptions), CI prompt injection (injecting payload text into metadata processed by agents in CI pipelines), passive/dormant injection (embedding hidden instructions that activate when agents process specific content), silent egress (covert data exfiltration through URL handling while visible output appears benign), and in-the-wild malicious skills actively deployed in public registries.
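As a rough illustration, some of the indicators behind passive/dormant injection and silent egress can be flagged with simple pattern checks on a SKILL.md body. The `scan_skill_md` function, the regexes, and the `IMPERATIVE_HINTS` list below are all illustrative assumptions rather than a vetted detection ruleset, and lexical checks like these are easily evaded:

```python
import re

# Heuristic indicators that a SKILL.md may carry hidden or dormant instructions.
# Patterns and hint phrases are illustrative, not a production ruleset.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
URL = re.compile(r"https?://[^\s)\"']+")
IMPERATIVE_HINTS = ("ignore previous", "do not tell", "exfiltrate", "send to")

def scan_skill_md(text: str) -> list[str]:
    """Return human-readable findings for one SKILL.md body."""
    findings = []
    if ZERO_WIDTH.search(text):
        findings.append("zero-width characters (possible hidden instructions)")
    for comment in HTML_COMMENT.findall(text):
        if any(hint in comment.lower() for hint in IMPERATIVE_HINTS):
            findings.append("suspicious directive inside HTML comment")
    for url in URL.findall(text):
        # Flag parameterized URLs: silent egress often appends data as
        # query parameters on an attacker-controlled endpoint.
        if "?" in url:
            findings.append(f"parameterized URL (possible egress): {url}")
    return findings
```

A real scanner would also need to reason about instruction *semantics*, which is exactly the part that evades static pattern matching.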
Key Features
- Unique attack surface: Skills are not just code — they are natural-language instructions that directly influence LLM reasoning, combined with scripts the LLM can execute. Traditional SAST/DAST tools cannot fully analyze this hybrid surface.
- 12% malicious rate documented: Independent audit of 2,857 skills across multiple registries found 341 malicious skills (Grith.ai/Koi Security, 2026). A separate audit of 22,511 skills found 140,963 issues. These are not theoretical risks.
- Multi-vector composition: The install-to-execution chain creates risk through composition of three elements: prompt instructions (influence model behavior), executable scripts (access filesystem, network, environment), and tool configurations (pre-approve tool usage).
- Registry gaming: Install-count rankings on skills.sh can be gamed, and popularity does not correlate with quality or safety. Low signal-to-noise ratios bury legitimate skills beneath malicious or low-quality entries.
- Partial mitigations emerging: Snyk and Socket partnerships with skills.sh provide automated scanning, but scanning is reactive, and natural-language prompt injection can slip past tools designed to analyze code rather than instructions.
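The multi-vector composition point above can be made concrete with a toy risk score: one point per risk element present, where the combination of all three is qualitatively worse than any single element. `SkillManifest` and `composition_risk` are hypothetical names for illustration; a real scanner would weigh far more signals than presence/absence:

```python
from dataclasses import dataclass

@dataclass
class SkillManifest:
    """Minimal, hypothetical view of an installed skill's contents."""
    instructions: str            # natural-language SKILL.md body
    script_paths: list[str]      # executable scripts bundled with the skill
    preapproved_tools: list[str] # tool names the skill asks to auto-approve

def composition_risk(skill: SkillManifest) -> int:
    """Illustrative 0-3 score: one point per risk element present.

    The point of the pattern is that risk comes from the composition:
    instructions steer the model toward the scripts, and pre-approved
    tools let those scripts run without a human in the loop.
    """
    score = 0
    if skill.instructions.strip():
        score += 1
    if skill.script_paths:
        score += 1
    if skill.preapproved_tools:
        score += 1
    return score
```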
Use Cases
- Security review of agent skill adoption: Before installing third-party skills into development environments, teams should evaluate the supply-chain risk profile using this pattern as a threat model.
- Enterprise skills governance: Organizations establishing internal agent skills registries should implement scanning, review, and approval workflows informed by these documented attack vectors.
- Security tool development: Vendors building agent security tools need to understand the unique hybrid attack surface (NL instructions + executable code) to build effective detection.
- Incident response planning: Teams using agent skills should have playbooks for compromised skill discovery, including credential rotation, audit trail review, and skill quarantine procedures.
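A quarantine step from such a playbook might look like the sketch below: move the suspect skill out of the agent's search path so it cannot activate again, and record an audit entry. The directory layout, log format, and `quarantine_skill` helper are assumptions for illustration; adapt them to wherever your agent actually loads skills from:

```python
import json
import shutil
import time
from pathlib import Path

def quarantine_skill(skill_dir: Path, quarantine_root: Path) -> Path:
    """Move a suspect skill out of the agent's search path and log the action."""
    quarantine_root.mkdir(parents=True, exist_ok=True)
    # Timestamp the destination so repeated quarantines never collide.
    dest = quarantine_root / f"{skill_dir.name}-{int(time.time())}"
    shutil.move(str(skill_dir), str(dest))  # removal stops further activation
    entry = {
        "skill": skill_dir.name,
        "quarantined_to": str(dest),
        "ts": time.time(),
    }
    with (quarantine_root / "quarantine.log").open("a") as fh:
        fh.write(json.dumps(entry) + "\n")
    return dest
```

Quarantining preserves the skill's contents for forensic review, which a plain delete would destroy.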
Adoption Level Analysis
Small teams (<20 engineers): High relevance. Small teams are most likely to install community skills without vetting. Mitigation: stick to vendor-published skills from trusted sources (Microsoft, Anthropic, framework authors). Review SKILL.md content before installation. Do not install skills that include scripts without understanding what the scripts do.
Medium orgs (20-200 engineers): High relevance. At this scale, multiple developers may independently install skills, creating an uncoordinated attack surface. Mitigation: establish a vetted skills allowlist, use the skills CLI in controlled CI environments, and integrate security scanning (Snyk, Socket) into skill installation workflows.
Enterprise (200+ engineers): Critical relevance. The combination of autonomous AI agents with unvetted third-party instructions is an enterprise security concern. Mitigation: maintain an internal skills registry with mandatory security review, enforce skill installation policies via CI/CD gates, use execution-layer sandboxing (E2B, Zerobox, Leash) to limit blast radius of compromised skills, and monitor agent behavior for anomalous file access or network activity.
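The allowlist and CI/CD-gate mitigations above can be sketched as a small check that fails the pipeline whenever an unapproved skill is present. The `.skills` directory layout, the `skills-allowlist.txt` format (one approved name per line, `#` comments), and the function names are assumed conventions for illustration, not part of any registry's tooling:

```python
import sys
from pathlib import Path

def check_allowlist(installed_dir: Path, allowlist_file: Path) -> list[str]:
    """Return names of installed skills absent from the vetted allowlist.

    Assumes one skill per subdirectory of installed_dir and a plain-text
    allowlist with one approved skill name per line.
    """
    approved = {
        line.strip()
        for line in allowlist_file.read_text().splitlines()
        if line.strip() and not line.startswith("#")
    }
    installed = {p.name for p in installed_dir.iterdir() if p.is_dir()}
    return sorted(installed - approved)

def main() -> int:
    violations = check_allowlist(Path(".skills"), Path("skills-allowlist.txt"))
    for name in violations:
        print(f"unapproved skill: {name}", file=sys.stderr)
    return 1 if violations else 0  # non-zero exit fails the CI gate
```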
Alternatives
| Alternative | Key Difference | Prefer when… |
|---|---|---|
| Agent Runtime Security (defense-in-depth) | Broader pattern covering all runtime threats, not just skill-specific | You need comprehensive agent security beyond skills |
| Internal skills registry | Avoids public registry risk entirely | Enterprise environments with compliance requirements |
| No skills (custom prompts only) | Eliminates supply chain entirely at cost of portability | Maximum security posture in sensitive environments |
Evidence & Sources
- Grith.ai: We Audited 2,857 Agent Skills. 12% Were Malicious. — primary independent audit documenting attack types and prevalence
- Snyk: Securing the Agent Skill Ecosystem — Snyk’s analysis of the threat landscape and partnership with Vercel
- Socket: Supply Chain Security for skills.sh — Socket’s approach to detecting malicious skills
- Vercel: Automated security audits for skills.sh — Vercel’s security response
- The New Stack: What a Security Audit of 22,511 AI Coding Skills Found — independent journalism covering the larger audit
- Vibecoding: Skills.sh Review — community review documenting quality concerns
Notes & Caveats
- This is an emerging pattern, not a solved problem. The agent skills ecosystem is growing faster than security tooling can keep up. The 12% malicious rate is from early 2026; the rate may improve as Snyk/Socket scanning matures, or may worsen as attackers adapt.
- Natural language attacks are hard to detect. Unlike malicious code (which can be statically analyzed), malicious SKILL.md instructions that subtly steer agent behavior toward data exfiltration or credential exposure may evade automated scanning. This is fundamentally a harder problem than traditional supply chain security.
- The “ClawHavoc” incident. The ClawHub registry experienced an incident in which 1,184 malicious skills were detected. It remains the largest documented agent skill supply chain compromise to date.
- Execution-layer enforcement is the strongest mitigation. Security researchers recommend OS-level enforcement (evaluating file reads, commands, and network requests before operations complete) rather than relying solely on prompt hardening or advisory scanning. Tools like Zerobox, E2B, and Leash provide this layer.
- The problem is not unique to skills.sh. All agent skill registries face these risks. The pattern applies equally to ClawHub, agentskill.sh, Skills Directory, and any other marketplace that aggregates third-party agent instructions.
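As a minimal model of the execution-layer enforcement idea above, the sketch below evaluates an outbound URL against a host allowlist before any request is made. A real enforcement layer (such as the sandboxing tools named above) operates at the OS or network level rather than in-process; the `ALLOWED_HOSTS` set and `check_egress` helper are purely illustrative:

```python
from urllib.parse import urlparse

# Illustrative policy: only these hosts may receive outbound requests from
# skill-originated code. Real tools enforce this outside the process so a
# compromised skill cannot simply bypass the check.
ALLOWED_HOSTS = {"api.github.com", "pypi.org"}

class EgressDenied(Exception):
    """Raised before a request completes when the destination is not allowed."""

def check_egress(url: str) -> str:
    """Validate a destination against policy before the operation proceeds."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        raise EgressDenied(f"blocked outbound request to {host!r}")
    return url
```

Evaluating the operation *before* it completes, rather than auditing afterward, is what distinguishes this approach from advisory scanning.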