
Milvus

★ New
trial
Database · open-source · Apache-2.0

At a Glance

Apache-2.0 distributed vector database for billion-scale similarity search, built for cloud-native Kubernetes deployment, with GPU acceleration, multiple index types (HNSW, DiskANN, IVF), and sparse+dense hybrid search. At 44k+ GitHub stars, it is one of the most widely adopted open-source vector databases.

Type
open-source
Pricing
open-source
License
Apache-2.0
Adoption fit
medium, enterprise

What It Does

Milvus is an open-source, cloud-native vector database designed for high-dimensional similarity search at billion-vector scale. Built in Go and C++, it stores dense and sparse embeddings alongside metadata, enabling semantic search, recommendation systems, and RAG pipelines across massive datasets. The project is maintained under the LF AI & Data Foundation with Zilliz as the primary commercial backer.

Milvus runs as a distributed system on Kubernetes, with separate components for query nodes, data nodes, index nodes, and coordinators. Data is persisted to object storage (S3/MinIO/GCS), and etcd is required alongside it. Version 2.6 introduced Woodpecker, a cloud-native WAL that eliminates the previously required Kafka or Pulsar cluster. A lightweight “Milvus Lite” variant (pip-installable) works for local development and prototyping, but is not suitable for production at scale.
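
For local prototyping, a Milvus Lite session might look like the sketch below. It assumes `pymilvus` is installed with Milvus Lite support; the file name `milvus_demo.db`, the collection name `demo`, the 8-dimensional random vectors, and the helper names `make_rows` and `run_demo` are illustrative assumptions, not values from this page.

```python
import random


def make_rows(n, dim, seed=0):
    """Build n toy rows with random vectors (stand-ins for real embeddings)."""
    rng = random.Random(seed)
    return [
        {"id": i, "vector": [rng.random() for _ in range(dim)], "text": f"doc-{i}"}
        for i in range(n)
    ]


def run_demo(db_path="./milvus_demo.db", dim=8):
    """Insert toy rows into a local Milvus Lite file and run one search."""
    # Imported lazily so make_rows works even without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(db_path)  # a local file path selects Milvus Lite
    if client.has_collection(collection_name="demo"):
        client.drop_collection(collection_name="demo")
    client.create_collection(collection_name="demo", dimension=dim)

    rows = make_rows(100, dim)
    client.insert(collection_name="demo", data=rows)

    # Query with the first row's own vector and ask for the 3 closest rows.
    return client.search(
        collection_name="demo",
        data=[rows[0]["vector"]],
        limit=3,
        output_fields=["text"],
    )
```

Calling `run_demo()` creates the database file in the working directory. Pointing the same client at a server URI instead of a file path is the usual route from prototype to a real deployment, though the collection schema and index settings should be revisited at that point.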

Key Features

  • Multiple index types: HNSW, IVF_FLAT, IVF_SQ8, IVF_PQ, SCANN, DiskANN, and GPU-accelerated variants; configurable tradeoffs between recall, latency, and memory
  • Hybrid search: Simultaneous dense vector search and sparse (BM25-style) full-text search, enabling combined semantic + keyword retrieval in a single query
  • Woodpecker WAL (v2.6+): Cloud-native write-ahead log persisting to object storage, replacing Kafka/Pulsar; 450 MB/s in local filesystem mode, 3.5x faster than Kafka in vendor benchmarks
  • GPU acceleration: NVIDIA GPU-accelerated index building and search for CAGRA, IVF_FLAT, IVF_PQ, and BF indexes
  • Hot/cold tiered storage: Automatic data tiering to object storage for cost optimization at scale
  • Multi-tenancy: Role-based access control, resource groups, flexible isolation strategies (collection-level, partition-level)
  • Real-time streaming inserts: Data queryable as soon as inserted without requiring index rebuilds
  • Milvus Lite: Single-process pip-installable variant for development and small-scale inference; not for production
  • Milvus Operator: Kubernetes CRD-based operator for production deployment lifecycle management
  • LangChain/LlamaIndex integrations: First-class integrations with major RAG orchestration frameworks
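
To make the hybrid search idea concrete, the sketch below fuses a dense ranking and a sparse ranking with reciprocal rank fusion (RRF), a common way to combine result lists. It is a standalone illustration, not Milvus's internal implementation or its client API; the function name `rrf_fuse`, the constant `k=60`, and the document IDs are assumptions chosen for the example.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, breaking ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["a", "b", "c"]   # hypothetical dense (semantic) ranking
sparse_hits = ["b", "d", "a"]  # hypothetical sparse (keyword) ranking
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that appear in both lists ("a" and "b") rise above those found by only one retriever, which is the behavior combined semantic + keyword retrieval is meant to deliver.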

Use Cases

  • Large-scale semantic search: Enterprise search over billions of documents. Use Milvus when you need distributed horizontal scaling and are willing to operate Kubernetes-native infrastructure
  • RAG at scale: Vector retrieval backend for RAG systems processing large corpora (10M+ chunks) where ChromaDB or single-node alternatives have hit their ceiling
  • Recommendation systems: Real-time item-to-item or user-to-item similarity at platform scale (Reddit, e-commerce, media)
  • Multi-modal search: Combined text, image, and audio embedding search using multi-vector support and hybrid sparse+dense queries
  • Self-hosted alternative to Zilliz Cloud: Teams with existing Kubernetes platform investment who want to avoid managed-service vendor lock-in
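
A rough way to judge whether a workload has outgrown a single node is to estimate the raw vector footprint. The sketch below assumes float32 embeddings and a hypothetical dimension of 768; it counts only the raw vectors and ignores index overhead, metadata, and replicas, so real memory needs will be higher.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed for the raw vectors alone (float32 by default)."""
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


size = raw_vector_bytes(10_000_000, 768)
print(f"{to_gib(size):.1f} GiB")  # 28.6 GiB
```

At 10M chunks the raw vectors still fit in the memory of one large machine, while the same arithmetic at a billion vectors lands in the terabyte range, which is where distributed sharding and disk-based indexes become relevant.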

Adoption Level Analysis

Small teams (<20 engineers): Does not fit without caveats. Milvus Lite or a single-node Docker Compose setup works for prototyping, but production Milvus requires Kubernetes, etcd, and object storage. Without a dedicated platform engineer, the operational overhead is significant. ChromaDB or Qdrant are better choices at this scale. Zilliz Cloud eliminates the ops burden if the team can absorb the cost.

Medium orgs (20–200 engineers): Fits with platform engineering support. Teams with existing Kubernetes infrastructure and at least one engineer comfortable with Helm/operators can run Milvus reliably. The 2.6 Woodpecker WAL reduces external dependency count. Evaluate against Qdrant Cloud (simpler ops, lower cost at this scale) unless you specifically need Milvus’s distributed multi-node scaling.

Enterprise (200+ engineers): Strong fit. Milvus is designed for this tier: distributed horizontal scaling, RBAC, a SOC 2-compliant managed option (Zilliz Cloud), and production use at Reddit, Shopee, and Grab scale. The Apache-2.0 license eliminates commercial licensing risk. LF AI & Data governance provides project continuity assurance beyond Zilliz’s commercial interests.

Alternatives

Alternative | Key Difference | Prefer when…
Qdrant | Rust-native, single-binary, lower ops overhead, Rust filtering engine | You want strong performance without Kubernetes complexity; medium-scale workloads
Weaviate | BSL-1.1 license, GraphQL API, Engram agent memory layer, agentic AI focus | You need tight AI agent integration and can accept BSL license restrictions
ChromaDB | Minimal setup, Python-native, prototyping-first, not billion-scale | Development, RAG prototyping, small-scale production under ~10M vectors
Pinecone | Fully managed serverless, no ops, proprietary, scales to billions | You want zero ops and are willing to pay proprietary vendor pricing
pgvector | Postgres extension, SQL-native, unified relational+vector | You’re already on Postgres and want to avoid a separate vector service
Zilliz Cloud | Managed Milvus with Zilliz’s Cardinal engine, 99.95% SLA | You want Milvus semantics without Kubernetes ops overhead


Notes & Caveats

  • Kubernetes is non-negotiable for production: The standalone Docker Compose deployment is explicitly documented as unsuitable for production. Any team evaluating Milvus must factor in Kubernetes operational costs from day one.
  • etcd disk performance is critical: etcd is sensitive to disk latency and should run on local NVMe SSDs; slower disks cause frequent leader elections that degrade the entire Milvus cluster. This is an easy misconfiguration to overlook in initial deployments.
  • Woodpecker is new (v2.6, early 2026): The elimination of Kafka/Pulsar is a genuine improvement, but Woodpecker’s long-term production characteristics are unproven. Teams upgrading from 2.5.x should follow the documented upgrade path carefully.
  • Zilliz is the primary contributor: Despite LF AI & Data governance, Zilliz employs the majority of active Milvus contributors. The project’s roadmap is heavily influenced by Zilliz’s commercial interests (Zilliz Cloud feature parity). This is not a red flag but should inform your assessment of the project’s independence.
  • VectorDBBench results favor Milvus: Zilliz maintains VectorDBBench, and its published leaderboard consistently shows Zilliz Cloud near the top. Independent analysis (benchANT) found methodological issues that distort comparisons. Run your own benchmarks with your actual data before making architecture decisions.
  • Migration out is possible but non-trivial: Milvus uses its own data formats and API. Moving to Qdrant or Pinecone requires re-ingestion. Zilliz provides a Vector Transport Service for migration between Milvus deployments and to Zilliz Cloud, but cross-vendor migration is your engineering effort.
  • License history is clean: Apache-2.0 with no known BSL-style switches. Zilliz Cloud is the commercial monetization path, not license restrictions on the open-source project.
  • Milvus is under LF AI & Data Foundation: This provides project continuity beyond Zilliz’s commercial fate, though effective neutrality depends on community contributions remaining diverse.
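
For the advice above about running your own benchmarks, the core measurement is recall@k against exact (brute-force) results on your own data. The sketch below is a minimal, database-agnostic harness: the random vectors stand in for real embeddings, and `candidate` stands in for whatever IDs the system under test returns. All names here are illustrative assumptions.

```python
import random


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(query, vectors, k):
    """Brute-force nearest neighbours: the ground truth to compare against."""
    order = sorted(range(len(vectors)), key=lambda i: l2_sq(query, vectors[i]))
    return order[:k]


def recall_at_k(returned_ids, true_ids):
    """Fraction of the true top-k that the system under test returned."""
    if not true_ids:
        return 0.0
    return len(set(returned_ids) & set(true_ids)) / len(true_ids)


rng = random.Random(42)
vectors = [[rng.random() for _ in range(16)] for _ in range(500)]
query = [rng.random() for _ in range(16)]
truth = exact_top_k(query, vectors, k=10)

# Stand-in for an ANN result: 8 correct hits plus 2 IDs outside the true top 10.
candidate = truth[:8] + [i for i in range(500) if i not in truth][:2]
print(recall_at_k(candidate, truth))  # 0.8
```

In a real evaluation, replace `candidate` with the IDs returned by each database for the same queries, and record latency and throughput alongside recall so that index settings are compared at equal accuracy.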
