
Manifest LLM Router

New · Assess
AI / ML · open-source · MIT

At a Glance

An open-source, MIT-licensed, Docker-deployed LLM router for personal AI agents. It scores each request with a 23-dimension keyword algorithm and routes it to the cheapest capable model, drawing on 300+ models from 13+ providers.

Type: open-source
Pricing: open-source
License: MIT
Adoption fit: small teams
What It Does

Manifest is an open-source LLM router designed specifically for personal AI agents (primarily OpenClaw and Hermes Agent). It deploys as a Docker container that acts as an OpenAI-compatible proxy: agents point their API endpoint at the local Manifest instance, and Manifest scores each incoming request using a 23-dimension keyword algorithm, classifies it into one of four complexity tiers (simple, standard, complex, reasoning), and routes it to the cheapest configured model in that tier.
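The score-then-tier flow can be sketched as follows. This is an illustrative reconstruction, not Manifest's actual algorithm: the keywords, tier thresholds, and model names below are invented for the sketch, and the real scorer uses 23 dimensions rather than the handful shown.

```python
# Illustrative sketch of keyword-based complexity tiering (NOT Manifest's
# real 23-dimension scorer; keywords and tier-to-model mappings are invented).

TIER_KEYWORDS = {
    "reasoning": ["prove", "step by step", "derive", "trade-off"],
    "complex":   ["refactor", "architecture", "debug", "analyze"],
    "standard":  ["summarize", "translate", "explain"],
}

def classify(prompt: str) -> str:
    """Return the highest tier whose keywords appear in the prompt."""
    text = prompt.lower()
    for tier in ("reasoning", "complex", "standard"):
        if any(kw in text for kw in TIER_KEYWORDS[tier]):
            return tier
    return "simple"  # default tier for heartbeats and short lookups

# Each tier maps to the cheapest configured model for that tier.
TIER_MODELS = {
    "simple":    "deepseek/deepseek-r1:free",
    "standard":  "qwen/qwen-2.5-72b",
    "complex":   "openai/gpt-4o",
    "reasoning": "anthropic/claude-3-7-sonnet",
}

print(TIER_MODELS[classify("heartbeat ping")])        # simple tier → free model
print(TIER_MODELS[classify("Refactor this module")])  # complex tier → gpt-4o
```

The point of the design is that an agent's many cheap requests (heartbeats, lookups) never touch an expensive model; only prompts that trip the higher-tier keywords do.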

The project originated as a “backend-as-a-file” YAML micro-backend framework (NestJS + TypeORM + SQLite generating REST APIs from a single backend.yml) and pivoted in 2025 to its current LLM routing identity. The npm package is deprecated; Docker is now the only supported distribution. A PostgreSQL database stores routing metadata and dashboard analytics. An optional cloud-hosted tier (app.manifest.build) routes through Manifest’s servers but claims to retain only metadata (model name, token counts, latency), not prompt content.

Key Features

  • 4-tier complexity routing: Classifies each request as simple, standard, complex, or reasoning using a 23-dimension keyword frequency score. Configurable model per tier across connected providers.
  • Specificity routing (opt-in): 9 task-type categories (coding, web_browsing, data_analysis, image_generation, etc.) that override complexity tiers based on task-type keyword heuristics.
  • Up to 5 fallback models per tier: If the primary model fails, the next model in the tier’s fallback chain handles the request automatically.
  • Budget controls: Spending limits with email alerts (notification rules) and hard request blocking (block rules, returns HTTP 429) when thresholds are reached.
  • Dashboard analytics: Per-agent, per-model, per-message cost, token count, and latency breakdown stored in PostgreSQL.
  • 300+ model support: Integrates with OpenAI, Anthropic, Google Gemini, DeepSeek, xAI, Mistral, Qwen, MiniMax, Kimi, Z.ai, GitHub Copilot, OpenRouter, Ollama, and custom OpenAI-compatible endpoints.
  • Local Ollama integration: Connects to host-installed Ollama, vLLM, or LM Studio via the Docker bridge network for fully local model inference.
  • Privacy-by-default in self-hosted mode: All traffic flows agent → local container → LLM provider with no Manifest-controlled intermediary. Prompt content never leaves the user’s machine.
  • Single-command setup: One bash script installs Docker Compose, generates secrets, and launches the full stack at localhost:2099.
  • OpenAI-compatible API: Drop-in replacement endpoint — existing agents using the OpenAI SDK require only a base URL change.
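The fallback behavior in the list above can be sketched as a simple chain walk. Everything here is hypothetical (the function names, error type, and chain contents are invented); the real router applies up to five fallbacks per tier on the server side:

```python
# Hypothetical sketch of per-tier fallback: try the primary model, then walk
# the fallback chain (Manifest caps this at 5 fallbacks per tier).

class ModelError(Exception):
    """Stand-in for a provider failure (timeout, 5xx, rate limit)."""

def call_model(model: str, prompt: str) -> str:
    # Placeholder for a real provider call; one model fails here to
    # demonstrate the fallback walk.
    if model == "primary-model":
        raise ModelError("provider unavailable")
    return f"{model} answered: {prompt!r}"

def route_with_fallback(chain: list[str], prompt: str) -> str:
    last_err = None
    for model in chain[:6]:  # primary + up to 5 fallbacks
        try:
            return call_model(model, prompt)
        except ModelError as err:
            last_err = err  # try the next model in the tier's chain
    raise last_err

print(route_with_fallback(["primary-model", "fallback-1"], "ping"))
```

Because the chain lives inside the router, the calling agent sees a single OpenAI-compatible endpoint and never handles provider failover itself.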

Use Cases

  • Personal AI agent cost reduction: Running OpenClaw or Hermes Agent continuously generates many low-complexity requests (heartbeats, simple lookups). Manifest routes these to free or cheap models (DeepSeek R1 free, Qwen free tier) while reserving expensive models (GPT-4o, Claude 3.7 Sonnet) for complex reasoning tasks.
  • Local privacy-first agent deployment: Developers who cannot or will not route prompts through a third-party SaaS proxy (OpenRouter) but want multi-provider model access with cost visibility.
  • Home lab / indie developer setups: Single-developer or small team running persistent AI agents on a home server or VPS without the operational complexity of LiteLLM (no Python environment, no Redis, just Docker Compose).
  • Subscription leverage: Routing paid ChatGPT Plus or Claude Pro subscription traffic through Manifest to avoid additional API usage charges on top of existing subscriptions.

Adoption Level Analysis

Small teams (<20 engineers): Reasonable fit for individual developers running personal AI agents (OpenClaw, Hermes, custom OpenAI-SDK bots). Docker Compose is manageable for a solo developer. Budget controls and cost dashboards provide genuine value. However, the product is in beta with anonymous authorship and a recent pivot history, creating support and continuity risk. Not suitable for production applications serving end users.

Medium orgs (20-200 engineers): Poor fit. Platform teams at this scale need enterprise gateway features: SSO, audit logs, fine-grained RBAC, SLA guarantees, and production-grade support. LiteLLM or Portkey are significantly better choices. Manifest’s rule-based routing would require per-org tuning, and the PostgreSQL analytics do not integrate with standard observability stacks (no OpenTelemetry export found).

Enterprise (200+ engineers): Not suitable. No enterprise support tier, no published SLA, anonymous maintainers, beta status, and feature set focused on personal rather than organizational use.

Alternatives

| Alternative | Key difference | Prefer when… |
| --- | --- | --- |
| OpenRouter | Fully managed SaaS, 300+ models, ~5% markup, no infrastructure | You want zero ops overhead and can accept a third-party data intermediary |
| LiteLLM | Python proxy with virtual keys, team budgets, load balancing, 100+ providers | You need multi-team governance, Python ecosystem integration, or production throughput |
| Portkey AI | Go-based enterprise gateway, guardrails, MCP governance, $18M raised | You need production SLAs, compliance controls, or high-throughput performance |
| RouteLLM | ML-based learned routing (not keyword heuristics), open-source by lm-sys | You need higher routing accuracy and can invest in model training/calibration |
| each::labs | Pre-seed startup LLM router tightly integrated with klaw.sh agent orchestration | You want routing + agent fleet management in one tool |


Notes & Caveats

  • Beta status and recent pivot. The project was a completely different product (YAML micro-backend framework) before 2025. The LLM router is explicitly in beta. This means breaking changes, missing features, and uncertain long-term support are realistic risks.
  • Anonymous maintainers. No named individuals, corporate entity, or funding information is publicly disclosed for the current product. The prior product was associated with a Paris-based agency (Buddyweb) incubated at Station F. Funding status for the LLM router is unknown. Critical infrastructure running on anonymous, unfunded open-source projects carries meaningful bus-factor risk.
  • Unverified cost savings claims. The 70% cost reduction headline is marketing, not a benchmark. Actual savings depend entirely on the distribution of request complexity in your workload, which model tiers you configure, and whether the keyword scorer correctly classifies requests. Workloads dominated by complex reasoning tasks will see minimal savings.
  • Keyword-based routing accuracy is unmeasured. No published confusion matrix, mis-routing rate, or accuracy evaluation exists for the 23-dimension scorer. A request that scores as "simple coding" and lands on a cheap model can still fail if the task requires tool use or structured output that the cheap model does not support.
  • npm package deprecated. The original Node.js manifest npm package is deprecated. Docker is the only supported distribution as of 2025. Teams evaluating the old backend framework product should treat it as unmaintained.
  • Tool-call and structured output routing edge case. If an agent uses tool-calling or JSON structured output and Manifest routes the request to a model tier that does not support tools or response_format, the downstream call will fail. Documentation does not address this scenario.
  • No OpenTelemetry or observability integration found. The dashboard is PostgreSQL-backed with no documented export to Langfuse, OpenLLMetry, or other OTel-compatible observability tools. Teams wanting unified LLM observability will need to augment with a separate solution.
  • Migration path if you outgrow it is straightforward. Because Manifest exposes a standard OpenAI-compatible API, switching to LiteLLM, Portkey, or direct provider APIs requires only updating the base URL. No vendor lock-in in the API layer.
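One way to defend against the tool-call caveat above is a client-side guard that bypasses the router whenever a request carries tool definitions or a response_format, sending it straight to a model known to support them. This is a hypothetical mitigation sketched here, not a documented Manifest feature; the endpoint URLs and model name are assumptions:

```python
# Hypothetical client-side guard: requests that include tools or structured
# output skip the router and go directly to a tool-capable model.

ROUTER_URL = "http://localhost:2099/v1"   # local Manifest endpoint (default port)
DIRECT_URL = "https://api.openai.com/v1"  # provider known to support tools
TOOL_SAFE_MODEL = "gpt-4o"                # assumption: supports tool calls

def pick_endpoint(request: dict) -> tuple[str, str]:
    """Return (base_url, model); route tool-bearing calls around the router."""
    if request.get("tools") or request.get("response_format"):
        return DIRECT_URL, TOOL_SAFE_MODEL
    return ROUTER_URL, request.get("model", "auto")

url, model = pick_endpoint({"tools": [{"type": "function"}]})
print(url, model)
```

The trade-off is that bypassed requests lose the router's cost optimization and dashboard visibility, so the guard should trigger only on requests that genuinely need tool support.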
