What It Does
Weaviate is a source-available (BSL-1.1 licensed) vector database designed for AI-native applications. It stores data objects alongside their vector embeddings and enables combined vector, keyword, and hybrid search. Founded in 2019 and headquartered in Amsterdam, Weaviate provides both a self-hosted database and a managed cloud service (Weaviate Cloud). The database is written in Go and supports automatic vectorization through integrations with embedding model providers (OpenAI, Cohere, Hugging Face, etc.), so users can insert raw text and have vectors generated automatically.
Weaviate is increasingly positioning itself as infrastructure for AI agents, not just a search database. In 2025-2026, the company launched Agent Skills (tools for coding agents to interact with Weaviate), a Query Agent, and Engram (an agent memory layer in preview). This represents a strategic pivot from “vector database” to “agentic AI infrastructure.”
Key Features
- Hybrid search: Combines vector (semantic) search with BM25 keyword search in a single query, with configurable alpha weighting
- ACORN filtered search: Filtered vector search built on the ACORN algorithm, maintaining performance under restrictive filters; reported to rank top-3 in independent benchmarks
- Automatic vectorization: Built-in modules for OpenAI, Cohere, Hugging Face, and other embedding providers — no pre-processing pipeline needed
- Multi-tenancy: Native support for isolating data per tenant, critical for SaaS and multi-agent applications
- Horizontal scaling: Sharding and replication with configurable consistency levels
- GraphQL and REST APIs: Both query interfaces available, with a Python/TypeScript/Go/Java client ecosystem
- MCP server: Official Model Context Protocol server for AI agent integration with Weaviate data
- Agent Skills: Open-source repository of tools enabling coding agents (Claude Code, Cursor, Copilot) to generate Weaviate-targeting code
- Generative search: RAG built into the database layer — retrieve objects and pass them to an LLM in a single query
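The alpha weighting in hybrid search blends keyword and vector rankings. A minimal sketch of relative score fusion, one of the fusion strategies Weaviate offers, is below; this is an illustration of the idea, not Weaviate's exact internals. Scores for each ranking are min-max normalized, then combined so that `alpha=1.0` is pure vector search and `alpha=0.0` is pure BM25:

```python
def min_max(scores):
    """Normalize a list of scores to [0, 1]; a constant list maps to all 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_fuse(bm25, vector, alpha=0.5):
    """Blend per-document BM25 and vector scores (dicts: doc id -> raw score).

    Documents missing from one ranking contribute 0 for that component.
    Returns (doc_id, fused_score) pairs, best first.
    """
    b = dict(zip(bm25, min_max(list(bm25.values())))) if bm25 else {}
    v = dict(zip(vector, min_max(list(vector.values())))) if vector else {}
    docs = set(bm25) | set(vector)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: -kv[1])

# With alpha=0.75, a doc that tops the vector ranking outranks
# one that only tops the BM25 ranking:
ranked = hybrid_fuse({"a": 2.0, "b": 1.0}, {"b": 0.9, "c": 0.5}, alpha=0.75)
```

In the real client, the equivalent knob is the `alpha` parameter on a hybrid query; the fusion happens server-side.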
Use Cases
- RAG pipelines: Store document chunks as vectors, retrieve semantically relevant context for LLM prompts
- AI agent memory: Persistent semantic memory for AI agents across sessions (via Engram or direct integration)
- Semantic search applications: Product search, content discovery, knowledge base search where keyword matching is insufficient
- Recommendation systems: Content or product recommendations based on embedding similarity
- Multi-modal search: Image, text, and cross-modal search using appropriate embedding models
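The retrieval step shared by most of these use cases reduces to nearest-neighbor search over embeddings. A toy illustration with brute-force cosine similarity follows; the hand-made vectors and linear scan are stand-ins for real embedding models and Weaviate's ANN index:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, top_k=2):
    """chunks: list of (text, embedding). Return the top_k most similar texts."""
    scored = sorted(chunks, key=lambda c: -cosine(query_vec, c[1]))
    return [text for text, _ in scored[:top_k]]

# Toy corpus: 2-d "embeddings" standing in for model output.
chunks = [("billing docs", [1.0, 0.0]),
          ("search docs", [0.0, 1.0]),
          ("mixed docs", [1.0, 1.0])]
top = retrieve([1.0, 0.1], chunks, top_k=2)
```

In a RAG pipeline the `top` texts would be concatenated into the LLM prompt as context; Weaviate's generative search collapses the retrieve-then-prompt round trip into one query.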
Adoption Level Analysis
Small teams (<20 engineers): Possible but not ideal. Self-hosting Weaviate means operating a stateful database that needs monitoring, backup, and scaling. Weaviate Cloud’s free tier is limited. Small teams doing straightforward RAG may prefer simpler options like Chroma (in-process) or Pinecone (fully managed with generous free tier). Weaviate becomes worthwhile for small teams only if they need hybrid search or multi-tenancy.
Medium orgs (20-200 engineers): Good fit. Weaviate Cloud reduces operational burden. The hybrid search capability, multi-tenancy, and embedding integrations serve teams building multiple AI-powered products well. The BSL-1.1 license is not a concern at this scale since it only restricts offering Weaviate as a competing managed service.
Enterprise (200+ engineers): Growing fit. Weaviate Cloud offers enterprise tiers with SLAs. However, users on community forums report memory issues at scale (300k+ records with high-dimensional vectors can trigger OOM errors), disk space management requires monitoring, and cluster coordination can be finicky. Enterprise teams should plan for dedicated Weaviate operations expertise. The BSL-1.1 license may be a governance concern for organizations with strict open-source policies.
Alternatives
| Alternative | Key Difference | Prefer when… |
|---|---|---|
| Pinecone | Fully managed, serverless, no self-host option | You want zero operational overhead and don’t need self-hosting |
| Qdrant | Rust-based, Apache-2.0 license, strong filtering | You need truly open-source licensing or advanced filtering performance |
| Chroma | In-process Python, lightweight, for prototyping | You need an embedded vector store for development or small-scale use |
| Milvus | Distributed architecture, Kubernetes-native | You need massive scale (billions of vectors) with distributed computing |
| pgvector | PostgreSQL extension, familiar SQL interface | You already use PostgreSQL and want vectors without a new database |
Evidence & Sources
- Weaviate G2 Reviews 2025
- Weaviate Gartner Peer Insights 2026
- Vector Databases 2026: Complete Guide (Calmops)
- Weaviate Official Documentation
- Weaviate Community Forum - Performance Issues
- Weaviate in 2025 Blog Post
Notes & Caveats
- BSL-1.1 license, not truly open source: Weaviate uses the Business Source License 1.1, which converts to Apache-2.0 after 4 years. Under BSL, you cannot offer Weaviate as a managed database service. This is the same licensing model used by CockroachDB; the BSL itself originated with MariaDB. Organizations with strict “OSI-approved licenses only” policies should be aware.
- Memory management at scale: Community reports indicate memory issues when indexing 300k+ records with high-dimensional vectors (1536-dim). HNSW indices are memory-intensive by design. Production deployments need careful capacity planning and monitoring.
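The memory pressure is easy to estimate with back-of-the-envelope arithmetic. A common rule of thumb (an approximation, not an official Weaviate figure) is roughly twice the raw float32 vector size once HNSW graph links and bookkeeping are included:

```python
def hnsw_memory_gib(n_objects, dims, bytes_per_dim=4, overhead_factor=2.0):
    """Rough memory estimate for an HNSW index over float32 vectors.

    overhead_factor=2.0 approximates graph links and bookkeeping on top
    of the raw vectors; a rule of thumb, not an exact Weaviate number.
    """
    return n_objects * dims * bytes_per_dim * overhead_factor / 2**30

# The community-reported trouble spot: 300k objects at 1536 dimensions.
# Raw vectors alone are ~1.7 GiB; with index overhead, ~3.4 GiB of RAM.
estimate = hnsw_memory_gib(300_000, 1536)
```

Even this small-sounding dataset needs several GiB of headroom before OS, Go runtime, and query buffers are counted, which is consistent with the OOM reports above. Product quantization or other compression, where enabled, changes the arithmetic.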
- Disk space operational risk: Clusters can enter a read-only state when disk space is exhausted, requiring manual intervention to recover. Automated disk monitoring and alerting is essential.
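A minimal disk-watermark check can sketch what such alerting looks like. The path and threshold below are illustrative assumptions (in production you would point this at Weaviate's data directory and wire it into your monitoring stack; Weaviate also has its own configurable disk-use thresholds):

```python
import shutil

def disk_alert(path=".", threshold_pct=80.0):
    """Return (used_pct, alert) for the filesystem containing `path`.

    `path` and `threshold_pct` are illustrative defaults; alert well
    before the database's own read-only cutoff is reached.
    """
    usage = shutil.disk_usage(path)
    used_pct = 100.0 * usage.used / usage.total
    return used_pct, used_pct >= threshold_pct

used, should_alert = disk_alert(".", threshold_pct=80.0)
```

Alerting at a watermark below the read-only cutoff leaves time to add capacity or prune data before writes stop.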
- Funding and runway: $67.6M total funding (Series C at ~$200M valuation, October 2025). Team of ~81 employees as of February 2026. The company is well-funded relative to its size but is not yet profitable (assumption based on stage). The pivot toward agentic AI infrastructure (Engram, Agent Skills) suggests the company is seeking new growth vectors beyond the crowded vector database market.
- Agentic pivot is strategic but unproven: Weaviate’s move from “vector database” to “agentic AI infrastructure” (Engram, Agent Skills, Query Agent) is ambitious but the products are early (Engram is in preview). It is unclear whether a vector database vendor can successfully compete with purpose-built agent memory systems (Mem0, Zep, Letta) that are building on top of multiple storage backends.