FerresDB delivers sub-millisecond vector search with hybrid BM25 retrieval, gRPC streaming, tiered storage, and enterprise-grade RBAC — all powered by Rust for uncompromising performance in RAG, semantic search, and recommendation systems.
From RAG pipelines to real-time recommendations, FerresDB powers the most demanding vector workloads.
Transform user queries into meaning-based results with Cosine, Euclidean, or Dot Product similarity. Combine with metadata filters for precision retrieval.
Hybrid vector + BM25 search in a single query with weighted or RRF fusion. Ground your LLM responses with the most relevant context from your knowledge base.
Real-time similarity matching with WebSocket streaming. Dot Product distance optimized for recommendation models. Auto-batching up to 1000 points/request.
A complete vector database with enterprise-grade features, built from the ground up in Rust for maximum performance and reliability.
P50 search at 100-500μs, P95 at 200-1000μs. No GC pauses — Rust delivers predictable, low-latency execution with zero runtime overhead.
Combine dense vector search with BM25 text retrieval using weighted fusion or Reciprocal Rank Fusion (RRF). Tunable alpha parameter for precision control.
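A minimal sketch of switching between the two fusion modes with the TypeScript SDK shown in the quick start below. Only `alpha` appears in that example; the `fusion` field name here is an assumption, not a documented parameter.

```ts
import { VectorDBClient } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({ baseUrl: "http://localhost:8080", apiKey: "ferres_sk_..." });
const embedding = [0.1, 0.2, 0.3]; // placeholder query embedding

// Weighted fusion: alpha blends the two score lists (1 = vector only, 0 = BM25 only).
const weighted = await client.hybridSearch("documents", {
  query_text: "deploy with docker",
  query_vector: embedding,
  limit: 10,
  alpha: 0.7,
});

// Reciprocal Rank Fusion merges by rank instead of score.
// The `fusion` field name is an assumption for illustration.
const rrf = await client.hybridSearch("documents", {
  query_text: "deploy with docker",
  query_vector: embedding,
  limit: 10,
  fusion: "rrf",
});
```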
REST API for simplicity, gRPC with bidirectional streaming for high-throughput, and WebSocket for real-time applications. All protocols run in parallel.
Automatically move vectors between RAM (Hot), memory-mapped (Warm), and disk (Cold) tiers based on access frequency. HNSW graph stays in memory for speed.
Role-based access control with Admin, Editor, and Viewer roles. Granular per-collection permissions with metadata restrictions. Daily-rotated audit logs.
Rebuild HNSW indexes in the background. Searches continue on the old index until the new one is ready. Auto-triggers when tombstones exceed 20%.
Write-Ahead Log with periodic snapshots every 1000 ops. Automatic crash recovery replays the WAL from the latest snapshot. Auto-save every 30 seconds.
Prometheus metrics endpoint, query profiling with /search/explain, slow query tracking, cost estimation with budget_ms, and a built-in web dashboard.
Fully-typed TypeScript SDK with Zod validation and WebSocket support. Async Python SDK with httpx. Both feature auto-retry, auto-batching, and structured logging.
Every layer of FerresDB is designed for performance, safety, and operational excellence.
REST, gRPC (port 50051), WebSocket — all running in parallel
RBAC with Admin/Editor/Viewer roles, per-collection permissions
Cosine/Euclidean/DotProduct metrics, metadata filters, hybrid fusion
Hot (RAM) / Warm (mmap) / Cold (disk), auto-save every 30s
Metrics, query profiling, slow queries, daily audit trail (JSONL)
The Hierarchical Navigable Small World index is tuned for an optimal balance of speed and recall.
| Parameter | Purpose |
|---|---|
| m | Max connections per layer |
| ef_construction | Index build quality |
| ef_search | Query search width |

Benchmarked with Criterion.rs — real numbers, not marketing claims.
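A minimal sketch of setting these parameters at collection creation with the TypeScript SDK. The `hnsw_config` object and its field names are assumptions for illustration; they mirror the parameters in the table above rather than a documented API.

```ts
import { VectorDBClient, DistanceMetric } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({
  baseUrl: "http://localhost:8080",
  apiKey: "ferres_sk_...",
});

// hnsw_config and its field names are assumptions for illustration.
await client.createCollection({
  name: "articles",
  dimension: 384,
  distance: DistanceMetric.Cosine,
  hnsw_config: {
    m: 16,                // max connections per layer: graph density vs memory
    ef_construction: 200, // candidate list size at build time: quality vs build speed
    ef_search: 100,       // candidate list size at query time: recall vs latency
  },
});
```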
No GC pauses, zero-cost abstractions, memory safety without runtime overhead. Compiled to native machine code.
Multi-layer graph with O(log N) search complexity. Optimized for high recall with configurable ef_search.
Thread-safe design with parallelized batch operations. Ready for multi-threaded servers and concurrent requests.
Optional caching for repeated queries. Configurable cache size eliminates redundant computation on hot queries.
See how FerresDB compares to conventional vector databases
| Aspect | FerresDB | Others |
|---|---|---|
| Language | Pure Rust — zero GC, native performance | Python, Go, or Java with GC overhead |
| Search Latency | P50: 100–500μs (sub-millisecond) | Typically 1–50ms per query |
| Search Types | Vector + BM25 hybrid (weighted & RRF fusion) | Often vector-only focus |
| Protocols | REST + gRPC (streaming) + WebSocket | Usually REST or gRPC only |
| Storage | WAL + snapshots + tiered (Hot/Warm/Cold) | Not all offer WAL + crash recovery |
| Security | RBAC + API Keys + JWT + Audit Trail | Varies — often basic API keys only |
| Deployment | Single Docker container, no cloud lock-in | Many are managed-only or heavier |
| Observability | Prometheus + query profiling + dashboard | Depends on the product |
Deploy the full stack with Docker Compose or run individual containers
Recommended — runs Backend + Dashboard together
# 1. Pull both images
docker pull ferresdb/ferres-db-core:latest
docker pull ferresdb/ferres-db-frontend:latest
# 2. Run the backend
docker run -d -p 8080:8080 \
-e FERRESDB_API_KEYS=sk-your-key \
-e CORS_ORIGINS=http://localhost:3000 \
-v ferres-data:/data \
ferresdb/ferres-db-core:latest
# 3. Run the dashboard
docker run -d -p 3000:80 \
-e VITE_API_BASE_URL=http://localhost:8080 \
-e VITE_API_KEY=sk-your-key \
  ferresdb/ferres-db-frontend:latest

# TypeScript SDK
pnpm add @ferresdb/typescript-sdk

# Python SDK
pip install ferres-db-python

From zero to vector search in under 10 lines of code
import { VectorDBClient, DistanceMetric } from "@ferresdb/typescript-sdk";
// Initialize client with auto-retry and timeout
const client = new VectorDBClient({
baseUrl: "http://localhost:8080",
apiKey: "ferres_sk_...",
maxRetries: 3,
});
// Create a collection with hybrid search enabled
await client.createCollection({
name: "documents",
dimension: 384,
distance: DistanceMetric.Cosine,
enable_bm25: true,
});
// Upsert vectors with metadata (auto-batches > 1000)
await client.upsertPoints("documents", [
{ id: "doc-1", vector: [0.1, 0.2, ...], metadata: { text: "Hello" } },
]);
// Hybrid search: vector + BM25 with weighted fusion
const results = await client.hybridSearch("documents", {
query_text: "how to deploy",
query_vector: [0.1, 0.2, ...],
limit: 5,
alpha: 0.5, // 0 = BM25 only, 1 = vector only
});

Security, compliance, and operational features built-in — not bolted on.
API keys (SHA-256 hashed, stored in SQLite) for programmatic access. JWT tokens for dashboard sessions, with passwords hashed using Argon2.
Admin, Editor, Viewer roles with per-collection permissions. Restrict access to specific metadata fields and allowed values.
Every action logged: searches, mutations, logins, user management. Daily-rotated JSONL files with user, IP, duration, and result.
Use /search/explain to understand query execution. /search/estimate for cost prediction. Slow query tracking for optimization.
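A sketch of calling the explain endpoint over plain REST. The exact route layout, auth header, and response shape are assumptions; only the /search/explain and /search/estimate endpoint names come from the feature description above.

```ts
// Profile a query with /search/explain instead of only getting results back.
// Route layout and the x-api-key header are assumptions for illustration.
const res = await fetch("http://localhost:8080/collections/documents/search/explain", {
  method: "POST",
  headers: { "content-type": "application/json", "x-api-key": "ferres_sk_..." },
  body: JSON.stringify({ vector: [0.1, 0.2, 0.3], limit: 5 }),
});
const plan = await res.json(); // per-stage timings and candidate counts (response shape assumed)
console.log(plan);
```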
Rich filter operators: $eq, $ne, $in, $gt, $lt, $gte, $lte. Combine with vector search for precise, scoped retrieval.
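A minimal sketch of a scoped hybrid search using these operators with the TypeScript SDK. The `filter` field name is an assumption; the operators themselves come from the list above.

```ts
import { VectorDBClient } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({ baseUrl: "http://localhost:8080", apiKey: "ferres_sk_..." });

// Scope retrieval to one tenant and recent documents.
// The `filter` field name is an assumption for illustration.
const scoped = await client.hybridSearch("documents", {
  query_text: "quarterly revenue",
  query_vector: [0.1, 0.2, 0.3], // placeholder query embedding
  limit: 5,
  alpha: 0.5,
  filter: {
    tenant_id: { $eq: "acme" },
    year: { $gte: 2023 },
    category: { $in: ["finance", "reports"] },
  },
});
```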
Set budget_ms on any search query. Automatically fails with 422 if the latency budget is exceeded — perfect for SLA enforcement.
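A sketch of enforcing a latency budget, assuming `budget_ms` is passed alongside the normal search parameters and that the SDK surfaces the 422 response as a thrown error (both details are assumptions).

```ts
import { VectorDBClient } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({ baseUrl: "http://localhost:8080", apiKey: "ferres_sk_..." });

try {
  const results = await client.hybridSearch("documents", {
    query_text: "incident runbook",
    query_vector: [0.1, 0.2, 0.3], // placeholder query embedding
    limit: 5,
    alpha: 0.5,
    budget_ms: 2, // abort if the query cannot finish within 2 ms
  });
  console.log(results);
} catch (err) {
  // The server answers 422 when the budget is exceeded; how the SDK surfaces
  // that status is an assumption, so treat this as the fallback path.
  console.warn("Latency budget exceeded, serving cached results instead", err);
}
```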
Join developers building the next generation of AI applications with FerresDB. Self-hosted, no cloud lock-in.