LiteLLM: Open-Source AI Gateway for 100+ LLM Providers



Source: litellm.ai | Author: BerriAI (vendor page) | Published: 2026-04-04 | Category: vendor-analysis | Credibility: low

Executive Summary

  • LiteLLM is an open-source Python SDK and proxy server (AI Gateway) by BerriAI (YC W23) that provides a unified OpenAI-compatible API for calling 100+ LLM providers, with cost tracking, load balancing, fallbacks, and virtual key management.
  • The project has significant community traction (41k+ GitHub stars, 1,300+ contributors, 240M+ Docker pulls) and notable enterprise users (Netflix, Lemonade, RocketMoney), but suffered a critical supply chain attack in March 2026 that compromised PyPI releases for ~40 minutes.
  • Production-scale deployment reveals meaningful operational challenges: Python GIL-bound throughput limits, PostgreSQL log storage bottleneck, memory leaks requiring worker recycling, and a rapid release cadence that creates stability risk. The company has only raised $2.1M and employs fewer than 20 people, raising sustainability questions for enterprise-critical infrastructure.
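The fallback behavior summarized above (routing a failed request to the next provider) can be sketched in plain Python. The provider names and the `call_provider` stub below are illustrative stand-ins, not LiteLLM's actual internals:

```python
from typing import Callable

def call_with_fallbacks(providers: list[str],
                        call_provider: Callable[[str, str], str],
                        prompt: str) -> str:
    """Try each provider in order, falling back to the next on failure.

    Illustrative sketch of gateway-style fallback routing; a real
    gateway also handles retries, cooldown windows, and load balancing.
    """
    last_error: Exception | None = None
    for provider in providers:
        try:
            return call_provider(provider, prompt)
        except Exception as exc:  # in practice, catch provider-specific errors
            last_error = exc
    raise RuntimeError(f"all providers failed: {last_error}")

# Usage: a fake backend where the primary provider is down.
def fake_backend(provider: str, prompt: str) -> str:
    if provider == "openai":
        raise TimeoutError("primary provider unavailable")
    return f"{provider}: ok"

print(call_with_fallbacks(["openai", "anthropic"], fake_backend, "hi"))
# → anthropic: ok
```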

Critical Analysis

Claim: “240+ million Docker pulls, over 1 billion requests served”

  • Evidence quality: vendor-sponsored
  • Assessment: The Docker pull count is verifiable via GitHub Container Registry and is plausible given LiteLLM’s age (since 2023) and position as the default open-source LLM proxy. The “1 billion requests served” likely refers to LiteLLM Cloud (their hosted offering) but is presented ambiguously — it could be interpreted as across all deployments, which would be unverifiable. These are vanity metrics; Docker pulls include CI/CD pipelines and do not equate to active production deployments.
  • Counter-argument: Docker pull counts are notoriously inflated by automated CI/CD pipelines. The number says nothing about how many of those pulls resulted in sustained production use. A more meaningful metric would be monthly active proxy instances or monthly API calls through the gateway, which LiteLLM does not publish.

Claim: “Netflix uses LiteLLM for rapid model deployment”

  • Evidence quality: anecdotal (vendor-reported)
  • Assessment: Netflix is listed on the LiteLLM website as a user, and multiple independent sources reference “Netflix, Lemonade, and RocketMoney” as users. However, no detailed Netflix case study, engineering blog post, or conference talk was found that describes how Netflix uses LiteLLM, at what scale, or whether it is used for production inference vs. internal developer tooling. The claim “rapid model deployment” likely refers to day-zero access to new model releases through the unified API, not to Netflix deploying their own models.
  • Counter-argument: Vendor logo walls are among the weakest forms of evidence. A Fortune 500 team of 3 engineers experimenting with LiteLLM in a dev environment qualifies as “Netflix uses LiteLLM” in vendor marketing. Without a published case study, the scale and criticality of use is unknown.

Claim: “80% uptime” (from website metrics)

  • Evidence quality: vendor-sponsored
  • Assessment: This is a remarkably poor uptime number to advertise. 80% uptime means 20% downtime, or roughly 73 days per year of outages. For any infrastructure component, this would be unacceptable. This likely refers to LiteLLM Cloud’s historical uptime during early periods and may have been a transparency measure rather than a boast. Alternatively, the metric may be stale or measured oddly. Either way, advertising 80% uptime undermines the credibility of the offering as production infrastructure.
  • Counter-argument: If the 80% figure is accurate for their cloud offering, it is disqualifying for any production use case. Most enterprises require 99.9% (8.7 hours downtime/year) minimum. Self-hosted deployments would have independent uptime characteristics, but the cloud figure signals organizational maturity concerns.
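The downtime figures quoted above follow directly from the percentages:

```python
HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours(uptime_pct: float) -> float:
    """Annual downtime implied by an uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_pct / 100)

print(downtime_hours(80.0) / 24)          # 80% uptime → 73.0 days of downtime/year
print(round(downtime_hours(99.9), 2))     # 99.9% uptime → ~8.76 hours/year
```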

Claim: Unified API for 100+ LLM providers with cost tracking and load balancing

  • Evidence quality: benchmark (community-verified via open-source usage)
  • Assessment: This is LiteLLM’s core value proposition and is well-evidenced. The GitHub repository demonstrates active support for major providers (OpenAI, Anthropic, Azure, AWS Bedrock, Google Vertex AI, Cohere, HuggingFace, vLLM, NVIDIA NIM, and many others). The OpenAI-compatible API format is a genuine simplification. Cost tracking per virtual key, team, and organization is a real feature confirmed by independent reviews. However, “100+ providers” includes many niche and self-hosted providers, and the quality of integration varies by provider.
  • Counter-argument: The provider count is somewhat misleading. Many “providers” are deployment variants of the same model (e.g., OpenAI direct vs. Azure OpenAI vs. OpenAI via proxy). The real differentiation is closer to 15-20 distinct model families accessible through a unified interface. Integration quality also varies — some providers have full feature parity (streaming, function calling, vision) while others have partial support.
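Per-key cost tracking of the kind described can be sketched as a simple accumulator. The rates and key names here are invented for illustration and are not LiteLLM's actual pricing tables or API:

```python
from collections import defaultdict

# Hypothetical per-1K-token rates; a real gateway loads these per model/provider.
RATES_PER_1K_TOKENS = {"gpt-4o": 0.005, "claude-sonnet": 0.003}

class CostTracker:
    """Accumulate spend per virtual key, as a gateway might."""
    def __init__(self) -> None:
        self.spend: dict[str, float] = defaultdict(float)

    def record(self, virtual_key: str, model: str, tokens: int) -> float:
        cost = RATES_PER_1K_TOKENS[model] * tokens / 1000
        self.spend[virtual_key] += cost
        return cost

tracker = CostTracker()
tracker.record("team-a-key", "gpt-4o", 2000)         # $0.010
tracker.record("team-a-key", "claude-sonnet", 1000)  # $0.003
print(round(tracker.spend["team-a-key"], 4))         # → 0.013
```

A production tracker would additionally distinguish prompt from completion tokens and aggregate by team and organization, as the assessment above notes.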

Claim: Enterprise-grade security with virtual keys, JWT auth, SSO, and audit logs

  • Evidence quality: vendor-sponsored
  • Assessment: The enterprise security features exist but must be weighed against the March 2026 supply chain attack. Compromised PyPI packages (v1.82.7 and v1.82.8) harvested environment variables, SSH keys, cloud credentials, Kubernetes tokens, and database passwords. The attack was active for approximately 40 minutes on March 24, 2026, and affected the entire downstream ecosystem (DSPy, MLflow, CrewAI, OpenHands). While LiteLLM’s response was reasonable (engaged Mandiant, rotated credentials, rebuilt CI/CD), the incident demonstrates that the security surface of a Python-based proxy is larger than many teams appreciate.
  • Counter-argument: The supply chain attack was not a vulnerability in LiteLLM’s code itself — it was stolen PyPI credentials used to publish backdoored packages. This is a class of attack that affects any PyPI-distributed software. However, for a product positioned as enterprise security infrastructure (controlling access to AI models and managing API keys), having its own distribution channel compromised is particularly damaging to trust. Docker image users were unaffected, which suggests that container-based deployment is the only defensible production approach.
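Given the compromised releases named above (v1.82.7 and v1.82.8), one minimal guard is to refuse known-bad versions before deployment. The helper below is an illustrative sketch, not an official tool; hash-pinned installs and container-based deployment remain the more robust defenses:

```python
# Releases compromised in the March 2026 PyPI incident (per the analysis above).
COMPROMISED_LITELLM_VERSIONS = {"1.82.7", "1.82.8"}

def is_safe_version(version: str) -> bool:
    """Reject versions known to carry the backdoored payload."""
    return version.lstrip("v") not in COMPROMISED_LITELLM_VERSIONS

assert not is_safe_version("v1.82.7")
assert is_safe_version("1.82.9")
print("version check passed")
```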

Credibility Assessment

  • Author background: This is a vendor homepage by BerriAI, a YC W23 company. The company has raised only $2.1M total (seed from YC, Pioneer Fund, Gravity Fund, FoundersX Ventures). Team size is reportedly under 20 people. The project is maintained primarily by the open-source community (1,300+ contributors), which is both a strength (broad development) and a risk (limited core team for security and stability).
  • Publication bias: Vendor website — pure marketing content. Every claim is designed to maximize perceived adoption and capability. The 80% uptime figure is the only moment of inadvertent honesty.
  • Verdict: low — Vendor marketing page with unverifiable adoption claims, no independent case studies linked, and the security/stability context is omitted entirely. Independent research reveals significant production challenges and a recent critical supply chain compromise that the homepage does not mention.

Entities Extracted

| Entity             | Type        | Catalog Entry |
| ------------------ | ----------- | ------------- |
| LiteLLM            | open-source | link          |
| LLM Gateway Pattern | pattern    | link          |
| OpenRouter         | vendor      | link (exists) |
| Portkey AI         | vendor      | link          |
| Vercel AI Gateway  | vendor      | link (exists) |