Milvus

What it is

Milvus is an open-source, high-performance vector database built for scalable similarity search and AI applications. Developed by Zilliz and hosted by the Linux Foundation (LF AI & Data), it is designed to manage, index, and search massive collections of vector embeddings.

What problem it solves

Traditional databases are not optimized for the high-dimensional vector data produced by machine learning models. Milvus provides a specialized engine that can perform approximate nearest neighbor (ANN) searches across billions of vectors with millisecond latency. It solves the challenge of scaling vector search from local prototypes to massive, distributed enterprise production environments.
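
To make the search problem concrete, here is a minimal pure-Python sketch of exact (brute-force) nearest-neighbor search by cosine similarity. It is not how Milvus works internally: ANN indexes exist to avoid this full scan, trading a little recall for speed. The function names and toy vectors are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    The cost grows linearly with the number of stored vectors, which is
    the scan an ANN index is designed to avoid.
    """
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)
    return [(i, score) for score, i in scored[:k]]

# Toy 3-dimensional "embeddings" (hypothetical data).
stored = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
top = brute_force_top_k([1.0, 0.0, 0.0], stored, k=2)
```

Here `top` holds `(index, score)` pairs for the two stored vectors closest to the query, best first.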

Where it fits in the stack

Category: Infrastructure / Vector Databases

Typical use cases

  • Enterprise RAG: Storing and retrieving billions of document chunks for large-scale Retrieval-Augmented Generation.
  • Multimodal Search: Enabling search across different data types (text-to-image, image-to-video) using a shared vector space.
  • Molecular Similarity Search: Finding similar chemical structures in drug discovery.
  • Personalized Recommendations: Powering real-time recommendation engines for e-commerce and social media.

Strengths

  • Massive Scalability: Designed with a cloud-native, distributed architecture that can scale to tens of billions of vectors.
  • High Performance: Often ranks among the fastest vector databases in published benchmarks, with reported gains of 2-5x over some competitors; results depend on workload, index type, and hardware.
  • Flexible Deployment: Offers Milvus Lite (Python library), Standalone (Docker), and Distributed (Kubernetes) modes.
  • Rich Data Modeling: Supports various data types including JSON, arrays, and multiple vector types (dense, sparse, binary).
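
The three vector families mentioned above differ mainly in how they are represented and in which distance makes sense for them. The following is a rough pure-Python illustration, not Milvus code: the helper names are hypothetical, and the metrics actually available depend on the field and index type, so check the Milvus documentation before relying on any of them.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two dense vectors (lists of floats)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sparse_inner_product(a, b):
    """Inner product of two sparse vectors stored as {index: weight} dicts.

    Only indices present in both vectors contribute to the result.
    """
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[i] for i, w in a.items() if i in b)

def hamming_distance(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

dense = l2_distance([0.0, 0.0], [3.0, 4.0])
sparse = sparse_inner_product({1: 0.5, 7: 2.0}, {7: 1.5, 9: 1.0})
binary = hamming_distance(bytes([0b10101010]), bytes([0b10100000]))
```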

Limitations

  • Operational Complexity: The distributed version requires significant infrastructure knowledge (Kubernetes, S3, etcd, Pulsar/Kafka).
  • Resource Intensive: High-performance indexing and searching require substantial CPU and RAM, especially for large datasets.
  • Learning Curve: The feature set and architectural components are more complex than simpler alternatives like Pinecone.
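
A back-of-the-envelope estimate shows why memory becomes the dominant cost at scale. The sketch below counts only raw float32 vector storage; it ignores index structures, replicas, and metadata, and quantized or disk-based indexes would change the numbers. The function name is hypothetical.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed to hold the raw vectors alone (float32 by default)."""
    return num_vectors * dim * bytes_per_value

# One billion 768-dimensional float32 vectors.
total_bytes = raw_vector_bytes(1_000_000_000, 768)
total_tib = total_bytes / 2**40

print(f"{total_bytes:,} bytes, roughly {total_tib:.2f} TiB before any index overhead")
```

At this size the raw vectors alone exceed the RAM of a single typical server, which is one reason billion-scale deployments are distributed.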

When to use it

  • When you need a high-performance, open-source vector database that you can self-host.
  • For billion-scale vector search requirements where performance and cost-efficiency are critical.
  • When you require complex data modeling or hybrid search capabilities beyond simple vector-metadata pairs.

When not to use it

  • For small projects or early-stage prototypes, where a full Milvus deployment adds overhead that Milvus Lite or a simpler managed service like Pinecone would avoid.
  • If you lack the DevOps resources to maintain a distributed Kubernetes-based database.

Getting started

Installation (Milvus Lite)

Ideal for local development and prototyping.

pip install pymilvus

Basic Setup

from pymilvus import MilvusClient

# Initialize a local Milvus Lite instance
client = MilvusClient("milvus_demo.db")
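
The same client class is used for every deployment mode; what changes is the URI. The helper below is a hypothetical sketch of that choice. It assumes a server reachable on the default Milvus port 19530, and the host, port, and mode names are placeholders to adapt to your environment.

```python
def milvus_uri(mode, host="localhost", port=19530, lite_path="milvus_demo.db"):
    """Return the URI string to hand to MilvusClient for a deployment mode.

    "lite" maps to a local file path; "standalone" and "distributed" both
    map to a network endpoint (assumed here to be plain HTTP).
    """
    if mode == "lite":
        return lite_path
    if mode in ("standalone", "distributed"):
        return f"http://{host}:{port}"
    raise ValueError(f"unknown deployment mode: {mode!r}")

# Usage (requires pymilvus and, for server modes, a running Milvus instance):
#   from pymilvus import MilvusClient
#   client = MilvusClient(uri=milvus_uri("standalone"))
```

Keeping the URI in one place lets a prototype move from a local file to a server without touching the rest of the code.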

Technical Examples

1. Creating a Collection with Schema

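# Quick setup: passing only a name and a dimension lets Milvus create a
# default schema with an "id" primary key and a "vector" field.
client.create_collection(
    collection_name="demo_collection",
    dimension=768,  # Common for open-source embedding models; must match yours
)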
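
When re-running a demo script, the collection may already exist from a previous run. One possible way to start clean is to drop and recreate it, as in the sketch below. `recreate_collection` is a hypothetical helper, and dropping a collection deletes its data, so this pattern only suits throwaway demos.

```python
def recreate_collection(client, name, dimension):
    """Drop the collection if it exists, then create it again with quick setup.

    `client` is expected to be a pymilvus MilvusClient (or anything exposing
    has_collection, drop_collection, and create_collection).
    """
    if client.has_collection(collection_name=name):
        client.drop_collection(collection_name=name)  # irreversible: removes all rows
    client.create_collection(collection_name=name, dimension=dimension)

# Usage with the client created earlier:
#   recreate_collection(client, "demo_collection", 768)
```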

2. Inserting Data

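# "..." stands for the remaining values: each vector needs exactly 768 floats
# to match the collection's dimension.
data = [
    {"id": 0, "vector": [0.1, 0.2, ...], "subject": "history"},
    {"id": 1, "vector": [0.3, 0.4, ...], "subject": "biology"},
]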

res = client.insert(
    collection_name="demo_collection",
    data=data
)
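
The `...` in the vectors above stands in for values that are not shown. For a quick smoke test without an embedding model, one option is to fill the rows with random placeholder vectors of the right length. `make_demo_rows` is a hypothetical helper; random vectors carry no meaning, so search results over them are only useful for checking that the pipeline runs.

```python
import random

def make_demo_rows(subjects, dim=768, seed=0):
    """Build insert-ready rows with random placeholder vectors.

    Each row has the same keys as the example data: "id", "vector", "subject".
    A fixed seed keeps the output reproducible between runs.
    """
    rng = random.Random(seed)
    return [
        {"id": i, "vector": [rng.random() for _ in range(dim)], "subject": subject}
        for i, subject in enumerate(subjects)
    ]

rows = make_demo_rows(["history", "biology"])
# rows can then be passed as data=rows to client.insert(...)
```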

3. Vector Search

res = client.search(
    collection_name="demo_collection",
    data=[[0.1, 0.2, ...]], # Query vector
    limit=5,
    output_fields=["subject"]
)

for result in res[0]:
    print(f"ID: {result['id']}, Score: {result['distance']}, Subject: {result['entity']['subject']}")
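
Vector search can also be combined with a metadata condition through the `filter` argument of `search`. The wrapper below is a minimal sketch that reuses the `demo_collection` and `subject` field from the examples above; `search_by_subject` is a hypothetical name, and the filter string is built by simple interpolation, so it should not be fed untrusted input as written.

```python
def search_by_subject(client, query_vector, subject, limit=5):
    """Vector search restricted to rows whose "subject" equals the given value.

    `client` is expected to be a pymilvus MilvusClient connected to a
    database that contains "demo_collection".
    """
    return client.search(
        collection_name="demo_collection",
        data=[query_vector],                # one query vector per inner list
        filter=f'subject == "{subject}"',   # scalar condition applied with the search
        limit=limit,
        output_fields=["subject"],
    )

# Usage with the client created earlier (query_vector must have 768 floats):
#   hits = search_by_subject(client, query_vector, "history")
```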

Related

  • Pinecone: managed cloud-native vector database.
  • Weaviate: open-source vector database with GraphQL support.
  • Vector DB Comparison: choosing the right vector store.
  • RAG Pattern: architectural overview of retrieval systems.
  • Search Patterns: implementing effective search strategies.
  • Zilliz: the commercial entity behind Milvus, providing a managed cloud version.

Sources / references

Contribution Metadata

  • Last reviewed: 2026-06-06
  • Confidence: high