FerresDB Documentation

Complete reference for the FerresDB high-performance vector search engine. Learn how to install, configure, and integrate FerresDB into your applications.

Getting Started

FerresDB is a high-performance vector search engine written in Rust, designed for semantic search, RAG (Retrieval Augmented Generation), and recommendation systems. It features sub-millisecond search latency, HNSW-based approximate nearest-neighbor search, and hybrid BM25 text search.

Quick Start

1. Run FerresDB Core with Docker (see Docker Installation)
2. Optionally run the FerresDB Dashboard for a visual UI
3. Install an SDK (TypeScript or Python) or use the REST API directly
4. Create a collection, upsert vectors, and start searching (see the curl sketch below)
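
With the defaults used in this guide (API key sk-abcd1234, port 8080), a minimal end-to-end session looks roughly like the following sketch; the quickstart collection name and 4-dimensional vectors are illustrative only.

terminal
# Create a collection (dimension shortened for illustration)
curl -X POST http://localhost:8080/api/v1/collections \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"name": "quickstart", "dimension": 4, "distance": "Cosine"}'

# Upsert a point (vector length must match the collection dimension)
curl -X POST http://localhost:8080/api/v1/collections/quickstart/points \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"points": [{"id": "doc-1", "vector": [0.1, 0.2, -0.1, 0.4], "metadata": {"text": "Hello world"}}]}'

# Search for the 5 nearest neighbors
curl -X POST http://localhost:8080/api/v1/collections/quickstart/search \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2, -0.1, 0.4], "limit": 5}'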

Key Features

Sub-millisecond P50 search latency
HNSW approximate nearest-neighbor
Hybrid vector + BM25 text search
API Key auth + RBAC + JWT dashboard
WAL + snapshots with crash recovery
Prometheus metrics + query profiling
Dashboard with user management
TypeScript, Python & Rust SDKs
Optional gRPC API (streaming, port 50051)
Background reindex & tiered storage (Hot/Warm/Cold)

Docker Installation

The fastest way to get FerresDB running is with Docker. Official images are published to Docker Hub.

FerresDB Core (Backend API)

The Core image runs the FerresDB REST API server.

terminal
# Pull the latest image
docker pull ferresdb/ferres-db-core:latest

# Run the container
docker run -d \
  -p 8080:8080 \
  -e PORT=8080 \
  -e CORS_ORIGINS=http://localhost:3000 \
  -e FERRESDB_API_KEYS=sk-abcd1234 \
  -v ferres-data:/data \
  ferresdb/ferres-db-core:latest
Result: API available at http://localhost:8080. Verify with curl http://localhost:8080/health

FerresDB Dashboard (Frontend)

The Dashboard provides a web UI for managing collections, browsing data, and monitoring performance.

terminal
# Pull the latest image
docker pull ferresdb/ferres-db-frontend:latest

# Run the container
docker run -d \
  -p 3000:80 \
  -e VITE_API_BASE_URL=http://localhost:8080 \
  -e VITE_API_KEY=sk-abcd1234 \
  ferresdb/ferres-db-frontend:latest
Result: Dashboard available at http://localhost:3000. Default login: root / ferresdb

Docker Compose (Full Stack)

Run both backend and frontend together with Docker Compose.

docker-compose.yml
version: "3.8"
services:
  backend:
    image: ferresdb/ferres-db-core:latest
    ports:
      - "${BACKEND_PORT:-8080}:${BACKEND_PORT:-8080}"
    volumes:
      - ferres-data:/data
    environment:
      - HOST=0.0.0.0
      - PORT=${BACKEND_PORT:-8080}
      - STORAGE_PATH=/data
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - CORS_ORIGINS=${CORS_ORIGINS:-http://localhost:3000}
      - FERRESDB_API_KEYS=${FERRESDB_API_KEYS:-sk-abcd1234}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:${BACKEND_PORT:-8080}/health"]
      interval: 30s
      timeout: 3s
      retries: 3

  frontend:
    image: ferresdb/ferres-db-frontend:latest
    ports:
      - "${FRONTEND_PORT:-3000}:80"
    environment:
      - VITE_API_BASE_URL=${VITE_API_BASE_URL:-http://localhost:8080}
      - VITE_API_KEY=${VITE_API_KEY:-sk-abcd1234}
    depends_on:
      backend:
        condition: service_healthy

volumes:
  ferres-data:
terminal
docker compose up -d

Configuration

FerresDB is configured through environment variables. These take precedence over values in config.toml.

Variable | Default | Description
HOST | 0.0.0.0 | Server bind host
PORT | 8080 | Server port
STORAGE_PATH | /data | Path for persistent data (collections, WAL, indexes)
LOG_LEVEL | info | Log level: trace, debug, info, warn, error
FERRESDB_API_KEYS | - | Comma-separated bootstrap API keys
FERRESDB_JWT_SECRET | (dev default) | JWT secret for dashboard sessions (change in production!)
CORS_ORIGINS | localhost:* | Comma-separated allowed CORS origins
OTEL_EXPORTER_OTLP_ENDPOINT | localhost:4317 | OpenTelemetry OTLP endpoint (when the otel feature is enabled)
GRPC_PORT | 50051 | gRPC server port (when the grpc feature is enabled)

Precedence: Environment variables > config.toml > defaults

config.toml
host = "0.0.0.0"
port = 8080
storage_path = "/data"
log_level = "info"
api_keys = "sk-my-secret-key"
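
Because environment variables win, you can keep a baseline config.toml and override individual settings at launch. For example, with the Docker image from the installation section:

terminal
# PORT from the environment overrides port = 8080 in config.toml
docker run -d -p 9090:9090 -e PORT=9090 -v ferres-data:/data ferresdb/ferres-db-core:latest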

Authentication

FerresDB uses two authentication methods: API Keys for programmatic access and JWT tokens for the dashboard.

API Key Authentication

All protected endpoints require an API key in the Authorization header.

HTTP request
Authorization: Bearer sk-abcd1234

Bootstrap keys: Set via FERRESDB_API_KEYS environment variable (comma-separated).

Managed keys: Create/delete via the API Keys endpoints. Keys are hashed with SHA-256 and stored in SQLite. The raw key is returned only once on creation.

Key format: ferres_sk_... or sk-... (legacy).
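
For example, listing collections with a bootstrap key:

terminal
curl -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/collections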

JWT Authentication (Dashboard)

POST /api/v1/auth/login
{
  "username": "root",
  "password": "ferresdb"
}

// Response:
{
  "token": "eyJhbGc...",
  "user": {
    "username": "root",
    "role": "admin"
  }
}
Warning: Change the default credentials (root / ferresdb) and FERRESDB_JWT_SECRET in production.
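
With curl, the login request above looks like this (default credentials shown for local testing only):

terminal
curl -X POST http://localhost:8080/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "root", "password": "ferresdb"}'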

RBAC (Role-Based Access Control)

Role | Permissions
Admin | Full access: users, audit, all collections, API keys
Editor | Read, write, create collections
Viewer | Read-only: search, get, list

Granular permissions can restrict access to specific collections and metadata fields:

RBAC permission example
{
  "username": "analyst",
  "role": "viewer",
  "permissions": [
    {
      "resource": { "type": "collection", "name": "sales-data" },
      "actions": ["read"],
      "metadata_restriction": {
        "field": "department",
        "allowed_values": ["sales"]
      }
    }
  ]
}

API Reference

Base URL: http://localhost:8080/api/v1. All protected endpoints require Authorization: Bearer <api-key>.

Public endpoints (no auth required): GET /health, GET /metrics, POST /api/v1/auth/login

Collections

POST /api/v1/collections (Auth): Create collection
GET /api/v1/collections (Auth): List all collections
GET /api/v1/collections/{name} (Auth): Get collection details
DELETE /api/v1/collections/{name} (Auth): Delete collection

POST /api/v1/collections
{
  "name": "my_collection",
  "dimension": 384,
  "distance": "Cosine",
  "enable_bm25": false,
  "bm25_text_field": "text"
}

// distance options: "Cosine", "Euclidean", "DotProduct"
// enable_bm25: optional (default: false)
// bm25_text_field: optional (default: "text")

Reindex (Background)

Rebuild the ANN index in the background without downtime. Useful when tombstone count grows (e.g. after many deletions). Searches continue against the old index until the new one is built and swapped. Only one reindex job per collection at a time. Auto-reindex is triggered when tombstones exceed 20% of indexed points.

POST /api/v1/collections/{name}/reindex (Auth): Start reindex job (returns 202 Accepted)
GET /api/v1/collections/{name}/reindex/{job_id} (Auth): Get reindex job status
GET /api/v1/collections/{name}/reindex (Auth): List reindex jobs for collection

POST /api/v1/collections/{name}/reindex - Response 202
{
  "job_id": "uuid",
  "status": "building",
  "message": "Reindex started"
}
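
A sketch of starting a reindex and polling the returned job (my_collection and the job ID are placeholders):

terminal
# Start a background reindex (returns 202 with a job_id)
curl -X POST -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/reindex

# Poll the job status using the returned job_id
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/reindex/<job_id>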

Points

POST /api/v1/collections/{name}/points (Auth): Upsert points (max 1000/batch)
GET /api/v1/collections/{name}/points (Auth): List points (paginated)
GET /api/v1/collections/{name}/points/{id} (Auth): Get point by ID
DELETE /api/v1/collections/{name}/points (Auth): Delete points by IDs

POST /api/v1/collections/{name}/points
{
  "points": [
    {
      "id": "doc-1",
      "vector": [0.1, 0.2, -0.1, ...],
      "metadata": {
        "text": "Hello world",
        "category": "greeting"
      }
    },
    {
      "id": "doc-2",
      "vector": [0.3, -0.1, 0.5, ...],
      "metadata": {
        "text": "Goodbye world",
        "category": "farewell"
      }
    }
  ]
}

// Max 1000 points per batch
// Vector dimension must match collection
// Existing IDs are updated (upsert)

Search

POST /api/v1/collections/{name}/search (Auth): Vector search
POST /api/v1/collections/{name}/search/hybrid (Auth): Hybrid search (vector + BM25)
POST /api/v1/collections/{name}/search/explain (Auth): Search with explanation
POST /api/v1/collections/{name}/search/estimate (Auth): Estimate search cost

POST /api/v1/collections/{name}/search
{
  "vector": [0.1, 0.2, -0.1, ...],
  "limit": 5,
  "filter": {
    "category": "tech",
    "price": { "$gte": 10, "$lte": 100 }
  },
  "budget_ms": 50
}

// filter: optional metadata filter
// budget_ms: optional, fails with 422 if exceeded
// Filter operators: $eq, $ne, $in, $gt, $lt, $gte, $lte
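
As a sketch, the same search sent with curl (the 3-element vector is illustrative; it must match the collection dimension):

terminal
curl -X POST http://localhost:8080/api/v1/collections/my_collection/search \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2, -0.1], "limit": 5, "filter": {"category": "tech"}}'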

Stats & Metrics

GET /health: Health check
GET /metrics: Prometheus metrics
GET /api/v1/stats/global: Global statistics
GET /api/v1/stats/queries: Recent queries (24h)
GET /api/v1/stats/slow-queries: Slow queries
GET /api/v1/stats/feedback: Query feedback stats
GET /api/v1/collections/{name}/stats (Auth): Collection statistics
GET /api/v1/debug/query-profile/{query_id}: Query execution profile
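
For example (the public endpoints need no key; collection stats are shown with the bootstrap key):

terminal
curl http://localhost:8080/health
curl http://localhost:8080/metrics
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/stats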

API Keys Management

GET /api/v1/keys (Auth): List API keys (metadata only)
POST /api/v1/keys (Auth): Create new API key
DELETE /api/v1/keys/{id} (Auth): Delete API key

POST /api/v1/keys - Response
{
  "id": "key_abc123",
  "key": "ferres_sk_full_raw_key_shown_once",
  "prefix": "ferres_sk_full...",
  "created_at": "2026-02-07T12:00:00Z"
}

// The raw key is returned ONLY on creation.
// Store it securely - it cannot be retrieved later.
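
Listing and deleting keys with curl (the key ID comes from the creation response above):

terminal
curl -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/keys
curl -X DELETE -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/keys/key_abc123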

Users Management

GET /api/v1/users (Auth): List users (Admin only)
POST /api/v1/users (Auth): Create user (Admin only)
DELETE /api/v1/users/{id} (Auth): Delete user (Admin only)
PUT /api/v1/users/{username}/password (Auth): Update password (Admin only)
PUT /api/v1/users/{username}/permissions (Auth): Update permissions (Admin only)

POST /api/v1/users
{
  "username": "analyst",
  "password": "secure-password",
  "role": "viewer",
  "permissions": [
    {
      "resource": { "type": "collection", "name": "sales" },
      "actions": ["read"]
    }
  ]
}

// Roles: "admin", "editor", "viewer"
// Passwords hashed with Argon2
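
A sketch of the same request with curl, assuming an admin-scoped Bearer credential:

terminal
curl -X POST http://localhost:8080/api/v1/users \
  -H "Authorization: Bearer <admin-credential>" \
  -H "Content-Type: application/json" \
  -d '{"username": "analyst", "password": "secure-password", "role": "viewer"}'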

Audit Trail

GET /api/v1/audit (Auth): Query audit trail (Admin only)

The audit trail logs all significant actions: logins, searches, data mutations, user management, and more. Logs are stored in daily-rotated JSONL files at {STORAGE_PATH}/logs/audit-{date}.jsonl.

Audit log entry example
{
  "timestamp": "2026-02-07T14:30:00Z",
  "user_id": "analyst",
  "action": "search",
  "resource": "collection:sales-data",
  "details": { "query_id": "...", "limit": 10 },
  "result": "success",
  "ip_address": "192.168.1.100",
  "duration_ms": 3
}
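
A minimal query with curl (any filter parameters the endpoint accepts are not documented here):

terminal
curl -H "Authorization: Bearer <admin-credential>" http://localhost:8080/api/v1/audit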

Persistence

POST /api/v1/save (Auth): Force save all collections to disk

FerresDB automatically saves dirty collections every 30 seconds and on graceful shutdown. Use the save endpoint to force an immediate flush.
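
For example:

terminal
curl -X POST -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/save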

Tiered Storage

Opt-in feature to move vectors between Hot (RAM), Warm (mmap), and Cold (disk) tiers based on access frequency. The HNSW graph stays in memory; only point data is tiered. Use this endpoint to inspect distribution and estimated memory per tier.

GET /api/v1/collections/{name}/tiers (Auth): Get tier distribution and estimated memory
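
For example:

terminal
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/tiers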

WebSocket

GET /api/v1/ws (Auth): WebSocket endpoint (HTTP upgrade)

Authenticate via query parameter ?token=sk-xxx or Authorization header. Limits: 100 max connections, 10 MB max message size, 30s heartbeat, 5 min inactivity timeout.
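
Any WebSocket client works; as one sketch, using the wscat CLI with query-parameter auth:

terminal
wscat -c "ws://localhost:8080/api/v1/ws?token=sk-abcd1234"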

gRPC API

FerresDB can expose a native gRPC API (behind the grpc feature flag) as a high-performance alternative to REST, with bidirectional streaming. The gRPC server listens on port 50051 (configurable via GRPC_PORT) and, when enabled, runs in parallel with the REST API. Its RPCs mirror the REST API: collections CRUD, points upsert/delete/list, Search, HybridSearch, and ExplainSearch, plus the streaming StreamUpsert and StreamSearch. Metadata and filters are sent as JSON strings. See docs/api.md and crates/server/proto/ferresdb.proto in the core repo for the full mapping and client generation (Python, TypeScript, Go).

Default port: 50051 (set GRPC_PORT to override). Building with the grpc feature enabled requires protoc.
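
As a sketch, you can probe the running gRPC server with grpcurl (listing services this way requires server reflection; otherwise supply the proto file via grpcurl's -proto flag):

terminal
grpcurl -plaintext localhost:50051 list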

Python SDK

PyPI
pip install ferres-db-python

The Python SDK provides an async client built on httpx with automatic retry, structured logging, and type hints. Includes RealtimeClient for WebSocket streaming (upsert, subscribe to collection events). Requires Python 3.8+. Optional dependency: websockets>=12.0 for RealtimeClient.

example.py
import asyncio
from vector_db_client import VectorDBClient, Point, DistanceMetric

async def main():
    async with VectorDBClient(
        base_url="http://localhost:8080",
        api_key="ferres_sk_...",
        timeout=30,
    ) as client:
        # create_collection, upsert_points, search, etc.
        pass

asyncio.run(main())

SDK Features

Full type hints
Automatic retry with exponential backoff
Structured logging (structlog)
Auto-batching (>1000 points)
Async/await (httpx)
Python 3.8+ support
RealtimeClient: WebSocket upsert + event subscription
estimate_search_cost, search_explain
Quantization (SQ8) & tiered storage (Hot/Warm/Cold)
API keys: list_keys, create_key, delete_key
Reindex: start_reindex, get_reindex_job, list_reindex_jobs

TypeScript SDK

npm
pnpm add @ferresdb/typescript-sdk   # or: npm install / yarn add

The TypeScript SDK provides a fully-typed client with automatic retry, runtime validation via Zod, and RealtimeClient for WebSocket streaming (upsert, subscribe to collection events). Works with Node.js 18+, Deno, and browser environments.

setup.ts
import { VectorDBClient, DistanceMetric } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({
  baseUrl: "http://localhost:8080",
  apiKey: "ferres_sk_...",
  timeout: 30000,
  maxRetries: 3,
  retryDelay: 1000,
});

SDK Features

Full TypeScript types
Runtime validation with Zod
Automatic retry with backoff
Auto-batching (>1000 points)
RealtimeClient: WebSocket upsert + event subscription
ESM and CJS exports
estimateSearchCost, searchExplain
Quantization (SQ8) & tiered storage config
API keys: listKeys, createKey, deleteKey
Reindex: startReindex, getReindexJob, listReindexJobs

Data Model

FerresDB organizes data into collections of points. Each point contains a vector, a unique ID, and arbitrary JSON metadata.

Collections

A collection is a named container with a fixed vector dimension and distance metric. All points in a collection must have vectors of the same dimension.

Property | Type | Description
name | string | Unique name (a-zA-Z0-9_-)
dimension | integer | Vector dimension (1 to 4096)
distance | enum | Cosine, Euclidean, or DotProduct
enable_bm25 | boolean | Enable BM25 text index (default: false)
bm25_text_field | string | Metadata field for BM25 indexing (default: "text")

Points

Point structure
{
  "id": "unique-string-id",
  "vector": [0.1, 0.2, -0.1, ...],  // f32 values, length = collection dimension
  "metadata": {                       // arbitrary JSON object
    "text": "content for BM25",
    "category": "example",
    "price": 42.5
  }
}

Distance Metrics

Cosine

Range: [0, 1] (1 = identical)

Best for: NLP embeddings, semantic similarity. Vectors are L2-normalized internally.

Euclidean

Range: [0, ∞)

Best for: Spatial data, image features. Measures straight-line distance.

Dot Product

Range: (-∞, ∞)

Best for: Pre-normalized vectors, recommendation systems.
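
For reference, the standard definitions behind these metrics are given below; FerresDB's reported scores may be normalized (e.g. cosine mapped into [0, 1]) rather than these raw values:

$$\mathrm{cos}(a,b)=\frac{a\cdot b}{\lVert a\rVert\,\lVert b\rVert},\qquad d_{\mathrm{euclidean}}(a,b)=\sqrt{\textstyle\sum_i (a_i-b_i)^2},\qquad a\cdot b=\textstyle\sum_i a_i b_i$$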

HNSW Index

FerresDB uses the Hierarchical Navigable Small World (HNSW) algorithm for approximate nearest-neighbor search. It builds a multi-layer graph structure enabling sub-millisecond queries even on large datasets.

Parameter | Default | Description
m | 16 | Max connections per layer
ef_construction | 200 | Search width during index construction
ef_search | 50 | Search width during query

Persistence Model

FerresDB uses a WAL (Write-Ahead Log) with periodic snapshots for durability. On crash recovery, the latest snapshot is loaded and the WAL is replayed.

Storage layout
{STORAGE_PATH}/
├── collections/
│   └── {name}/
│       ├── points.jsonl       # Current state
│       ├── wal.jsonl          # Write-ahead log (append-only)
│       ├── snapshot.jsonl     # Periodic snapshots (every 1000 ops)
│       └── index.bin          # HNSW index (binary)
├── api_keys.db                # SQLite: API keys (SHA-256 hashed)
├── users.db                   # SQLite: users (Argon2 passwords)
└── logs/
    ├── queries.log            # Query log (JSONL)
    └── audit-{date}.jsonl     # Daily audit trail
Auto-save: Dirty collections are saved every 30 seconds in the background. All collections are saved on graceful shutdown. Snapshots are created every 1000 write operations to keep WAL size bounded.

© 2024 FerresDB. Built with Rust.