
VectorDBBench

New · Assess · Testing · open-source · Apache-2.0

At a Glance

Open-source benchmarking tool for vector databases, covering 30+ databases with both a CLI and a web UI; maintained by Zilliz, with documented methodological limitations that systematically favor distributed architectures like Milvus over in-memory-first designs.

  • Type: open-source
  • Pricing: free (open-source)
  • License: Apache-2.0
  • Adoption fit: small, medium

What It Does

VectorDBBench is an open-source benchmarking tool for evaluating and comparing vector database performance and cost-effectiveness. Built and maintained by Zilliz (the company behind Milvus), it tests 30+ vector databases across insertion performance, search latency, throughput (QPS), and filtered search scenarios. It provides both a CLI and a web UI for running tests and generating comparative reports.

The tool runs tests against real-world public datasets (SIFT-1M, GIST-1M, Cohere embeddings, OpenAI embeddings) at various scales and dimensions. Results feed into a publicly hosted leaderboard at zilliz.com/vdbbench-leaderboard. While open-source and reproducible, the methodology has documented limitations that make the published leaderboard results unreliable for production planning without independent reproduction.

Key Features

  • 30+ supported databases: Milvus, Zilliz Cloud, Qdrant, Pinecone, Weaviate, Elasticsearch, pgvector, pgvectorscale, Redis, MongoDB, Chroma, Vespa, and more
  • Multiple test scenarios: Capacity tests, search performance (variable dataset sizes), filtered search performance, and streaming insertion scenarios
  • Public datasets: SIFT-1M (128-dim), GIST-1M (960-dim), Cohere (768-dim), OpenAI (1536-dim) embeddings for reproducible cross-database comparisons
  • CLI + Web UI: Command-line for automation and integration; browser-based interface for visualizing results
  • Cost-effectiveness analysis: Reports cost-per-query metrics for cloud-based database services
  • Timeout thresholds: Applies realistic timeouts to disqualify databases that cannot meet production latency budgets
  • Public leaderboard: Hosted at zilliz.com with regularly updated results (note: managed by Zilliz)
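
The cost-per-query metric is simple enough to sanity-check by hand. A minimal sketch with hypothetical pricing numbers (this is the arithmetic, not VectorDBBench's actual implementation):

```python
def cost_per_1k_queries(monthly_cost_usd: float, sustained_qps: float) -> float:
    """Cost per 1,000 queries for a managed service billed monthly,
    assuming the given QPS is sustained around the clock."""
    seconds_per_month = 30 * 24 * 3600  # 2,592,000
    total_queries = sustained_qps * seconds_per_month
    return monthly_cost_usd / total_queries * 1000

# e.g. a hypothetical $500/month cluster sustaining 100 QPS:
print(round(cost_per_1k_queries(500, 100), 4))  # → 0.0019
```

Note the assumption doing the work: published cost-per-query figures assume sustained utilization, so a bursty workload pays far more per query than the metric suggests.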

Use Cases

  • Pre-selection screening: Running VectorDBBench as a first-pass filter across multiple vector databases before deeper evaluation — useful for identifying obvious under-performers, not for final architecture decisions
  • Reproducing published results: Re-running specific test scenarios from the Zilliz leaderboard against your hardware/cloud configuration to verify they hold for your environment
  • Custom dataset benchmarking: Using the tool’s framework to benchmark with your own embeddings and collection sizes — more reliable than published results since you control the data
  • Vendor evaluation starting point: Gives a reproducible baseline for comparing database options before building application-specific load tests
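
A custom benchmark needs two ingredients the published leaderboard cannot supply: your own vectors and an exact ground truth to score against. A minimal recall@k sketch in pure Python (toy 2-d data for illustration; a real run would use your embeddings and the IDs returned by the database's ANN search):

```python
def brute_force_topk(query, corpus, k):
    """Exact nearest neighbours by squared L2 distance -- the ground truth."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    return sorted(range(len(corpus)), key=lambda i: dist(query, corpus[i]))[:k]

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours the ANN index actually returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# Toy 2-d "embeddings"; in practice these are your 768/1536-dim vectors.
corpus = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
exact = brute_force_topk([0.2, 0.1], corpus, k=2)  # → [0, 1]
print(recall_at_k([0, 3], exact, k=2))             # ANN missed one neighbour: 0.5
```

Scoring against brute-force ground truth on your own data is what makes recall numbers trustworthy, regardless of who publishes the QPS figures.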

Adoption Level Analysis

Small teams (<20 engineers): Useful tool for quick comparisons during proof-of-concept phases. Run it yourself rather than relying on published leaderboard results. The CLI setup is straightforward with Docker.

Medium orgs (20–200 engineers): Suitable as a first-pass benchmark. Must supplement with application-specific load testing. The single-client latency limitation is particularly problematic at this scale — real production latency under concurrent load will differ significantly.

Enterprise (200+ engineers): Insufficient as a standalone procurement benchmark. Use it as a starting point alongside application-specific benchmarks, hardware-matched testing, and independent third-party evaluations. Commission independent testing (e.g., benchANT) before major vector database infrastructure decisions.

Alternatives

| Alternative | Key difference | Prefer when… |
| --- | --- | --- |
| benchANT / VectorDBBench fork | Independent fork with methodology corrections | You want benchmarks without Zilliz's organizational conflict of interest |
| Qdrant's ANN benchmarks | Independent, open benchmarks from Qdrant | Evaluating Qdrant specifically; well-documented methodology |
| ann-benchmarks | Academic ANN benchmarks, no cloud database support | Pure algorithm comparison without infrastructure overhead |
| Custom load testing (k6, Locust) | Application-specific with realistic concurrency | Final production validation before architecture decisions |

Notes & Caveats

  • Conflict of interest is structural: VectorDBBench is maintained by Zilliz, which commercially benefits from Milvus/Zilliz Cloud ranking well. The organization has financial incentive to optimize benchmark parameters that favor their architecture. This does not mean results are fabricated, but methodological choices accumulate in ways that favor distributed systems.
  • QPS and latency are not comparable: The published QPS_max is calculated by running queries at varying concurrency levels and taking the maximum. Published latency figures are measured under single-client (one query at a time) load. These two numbers cannot be directly compared — you do not know what the latency is at the concurrency level that produces maximum QPS. This is the most significant methodological flaw.
  • Post-ingestion testing only (standard scenarios): Most VectorDBBench scenarios test performance after all data has been ingested and indexes are fully built. Production databases serve reads and writes simultaneously; mixed-load performance is not captured in standard test scenarios.
  • Rewards distributed architectures: The benchmark’s timeout and QPS methodology naturally favors distributed systems (Milvus, Zilliz Cloud) over in-memory-first systems (Qdrant, Redis Vector) that may have better tail latencies under real concurrent load.
  • Custom tests are more valuable than published results: VectorDBBench’s framework is more trustworthy than its leaderboard. Running it with your own embeddings, dataset sizes, and on your target infrastructure eliminates many of the publishing bias concerns.
  • Last major release: VDBBench 1.0.20, February 12, 2026 — actively maintained.
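
The QPS-vs-latency caveat is easy to see with a toy closed-loop queueing model (hypothetical 5 ms service time and a single server; this is not a measurement of any real database):

```python
# Deterministic closed-loop model: one server, fixed 5 ms service time,
# `clients` concurrent closed-loop clients with zero think time.
SERVICE_MS = 5.0

def closed_loop(clients: int) -> tuple[float, float]:
    """Return (qps, per-query latency in ms) at the given concurrency."""
    qps = 1000.0 / SERVICE_MS          # the server saturates at 200 QPS
    latency_ms = clients * SERVICE_MS  # each query queues behind the others
    return qps, latency_ms

for c in (1, 8, 32):
    qps, lat = closed_loop(c)
    print(f"{c:>2} clients: {qps:.0f} QPS, {lat:.0f} ms per query")
```

Even in this idealized model, QPS_max (200) is reached while per-query latency is many times the single-client figure (5 ms at 1 client, 160 ms at 32), which is why a leaderboard pairing of max-concurrency QPS with single-client latency describes no operating point you can actually run at.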
