FerresDB Documentation

Complete reference for the FerresDB high-performance vector search engine. Learn how to install, configure, and integrate FerresDB into your applications.

Getting Started

FerresDB is a high-performance vector search engine written in Rust, designed for semantic search, RAG (Retrieval Augmented Generation), and recommendation systems. It features sub-millisecond search latency, HNSW-based approximate nearest-neighbor search, and hybrid BM25 text search.

Quick Start

1. Run FerresDB Core with Docker (see Docker Installation)
2. Optionally run the FerresDB Dashboard for a visual UI
3. Install an SDK (TypeScript or Python) or use the REST API directly
4. Create a collection, upsert vectors, and start searching (see the curl sketch below)
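
With the defaults used in this guide (API key sk-abcd1234, port 8080), a minimal end-to-end session looks roughly like the following sketch; the quickstart collection name and 4-dimensional vectors are illustrative only.

terminal
# Create a collection (dimension shortened for illustration)
curl -X POST http://localhost:8080/api/v1/collections \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"name": "quickstart", "dimension": 4, "distance": "Cosine"}'

# Upsert a point (vector length must match the collection dimension)
curl -X POST http://localhost:8080/api/v1/collections/quickstart/points \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"points": [{"id": "doc-1", "vector": [0.1, 0.2, -0.1, 0.4], "metadata": {"text": "Hello world"}}]}'

# Search for the 5 nearest neighbors
curl -X POST http://localhost:8080/api/v1/collections/quickstart/search \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2, -0.1, 0.4], "limit": 5}'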

Key Features

Sub-millisecond P50 search latency
HNSW approximate nearest-neighbor
Hybrid vector + BM25 text search
API Key auth + RBAC + JWT dashboard
WAL + snapshots with crash recovery
Prometheus metrics + query profiling
Dashboard with user management
TypeScript, Python & Rust SDKs
Optional gRPC API (streaming, port 50051)
Background reindex & tiered storage (Hot/Warm/Cold)

Docker Installation

The fastest way to get FerresDB running is with Docker. Official images are published to Docker Hub.

FerresDB Core (Backend API)

The Core image runs the FerresDB REST API server.

terminal
# Pull the latest image
docker pull ferresdb/ferres-db-core:latest

# Run the container
docker run -d \
  -p 8080:8080 \
  -e PORT=8080 \
  -e CORS_ORIGINS=http://localhost:3000 \
  -e FERRESDB_API_KEYS=sk-abcd1234 \
  -v ferres-data:/data \
  ferresdb/ferres-db-core:latest
Result: API available at http://localhost:8080. Verify with curl http://localhost:8080/health

FerresDB Dashboard (Frontend)

The Dashboard provides a web UI for managing collections, browsing data, and monitoring performance.

terminal
# Pull the latest image
docker pull ferresdb/ferres-db-frontend:latest

# Run the container
docker run -d \
  -p 3000:80 \
  -e VITE_API_BASE_URL=http://localhost:8080 \
  -e VITE_API_KEY=sk-abcd1234 \
  ferresdb/ferres-db-frontend:latest
Result: Dashboard available at http://localhost:3000. Default login: root / ferresdb

Docker Compose (Full Stack)

Run both backend and frontend together with Docker Compose.

docker-compose.yml
version: "3.8"
services:
  backend:
    image: ferresdb/ferres-db-core:latest
    ports:
      - "${BACKEND_PORT:-8080}:${BACKEND_PORT:-8080}"
    volumes:
      - ferres-data:/data
    environment:
      - HOST=0.0.0.0
      - PORT=${BACKEND_PORT:-8080}
      - STORAGE_PATH=/data
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - CORS_ORIGINS=${CORS_ORIGINS:-http://localhost:3000}
      - FERRESDB_API_KEYS=${FERRESDB_API_KEYS:-sk-abcd1234}
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:${BACKEND_PORT:-8080}/health"]
      interval: 30s
      timeout: 3s
      retries: 3

  frontend:
    image: ferresdb/ferres-db-frontend:latest
    ports:
      - "${FRONTEND_PORT:-3000}:80"
    environment:
      - VITE_API_BASE_URL=${VITE_API_BASE_URL:-http://localhost:8080}
      - VITE_API_KEY=${VITE_API_KEY:-sk-abcd1234}
    depends_on:
      backend:
        condition: service_healthy

volumes:
  ferres-data:
terminal
docker compose up -d

Configuration

FerresDB is configured through environment variables. These take precedence over values in config.toml.

Variable | Default | Description
HOST | 0.0.0.0 | Server bind host
PORT | 8080 | Server port
STORAGE_PATH | /data | Path for persistent data (collections, WAL, indexes)
LOG_LEVEL | info | Log level: trace, debug, info, warn, error
FERRESDB_API_KEYS | - | Comma-separated bootstrap API keys
FERRESDB_JWT_SECRET | (dev default) | JWT secret for dashboard sessions (change in production!)
CORS_ORIGINS | localhost:* | Comma-separated allowed CORS origins
OTEL_EXPORTER_OTLP_ENDPOINT | localhost:4317 | OpenTelemetry OTLP endpoint (when the otel feature is enabled)
GRPC_PORT | 50051 | gRPC server port (when the grpc feature is enabled)

Precedence: Environment variables > config.toml > defaults

config.toml
host = "0.0.0.0"
port = 8080
storage_path = "/data"
log_level = "info"
api_keys = "sk-my-secret-key"
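
Because environment variables win, you can keep a baseline config.toml and override individual settings at launch. For example, with the Docker image from the installation section:

terminal
# PORT from the environment overrides port = 8080 in config.toml
docker run -d -p 9090:9090 -e PORT=9090 -v ferres-data:/data ferresdb/ferres-db-core:latest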

Authentication

FerresDB uses two authentication methods: API Keys for programmatic access and JWT tokens for the dashboard.

API Key Authentication

All protected endpoints require an API key in the Authorization header.

HTTP request
Authorization: Bearer sk-abcd1234

Bootstrap keys: Set via FERRESDB_API_KEYS environment variable (comma-separated).

Managed keys: Create/delete via the API Keys endpoints. Keys are hashed with SHA-256 and stored in SQLite. The raw key is returned only once on creation.

Key format: ferres_sk_... or sk-... (legacy).
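
For example, listing collections with a bootstrap key:

terminal
curl -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/collections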

JWT Authentication (Dashboard)

POST /api/v1/auth/login
{
  "username": "root",
  "password": "ferresdb"
}

// Response:
{
  "token": "eyJhbGc...",
  "user": {
    "username": "root",
    "role": "admin"
  }
}
Warning: Change the default credentials (root / ferresdb) and FERRESDB_JWT_SECRET in production.
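
With curl, the login request above looks like this (default credentials shown for local testing only):

terminal
curl -X POST http://localhost:8080/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "root", "password": "ferresdb"}'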

RBAC (Role-Based Access Control)

Role | Permissions
Admin | Full access: users, audit, all collections, API keys
Editor | Read, write, create collections
Viewer | Read-only: search, get, list

Granular permissions can restrict access to specific collections and metadata fields:

RBAC permission example
{
  "username": "analyst",
  "role": "viewer",
  "permissions": [
    {
      "resource": { "type": "collection", "name": "sales-data" },
      "actions": ["read"],
      "metadata_restriction": {
        "field": "department",
        "allowed_values": ["sales"]
      }
    }
  ]
}

API Reference

Base URL: http://localhost:8080/api/v1. All protected endpoints require Authorization: Bearer <api-key>.

Public endpoints (no auth required): GET /health, GET /metrics, POST /api/v1/auth/login

Collections

POST /api/v1/collections (Auth): Create collection
GET /api/v1/collections (Auth): List all collections
GET /api/v1/collections/{name} (Auth): Get collection details
DELETE /api/v1/collections/{name} (Auth): Delete collection

POST /api/v1/collections
{
  "name": "my_collection",
  "dimension": 384,
  "distance": "Cosine",
  "enable_bm25": false,
  "bm25_text_field": "text"
}

// distance options: "Cosine", "Euclidean", "DotProduct"
// enable_bm25: optional (default: false)
// bm25_text_field: optional (default: "text")

Reindex (Background)

Rebuild the ANN index in the background without downtime. Useful when tombstone count grows (e.g. after many deletions). Searches continue against the old index until the new one is built and swapped. Only one reindex job per collection at a time. Auto-reindex is triggered when tombstones exceed 20% of indexed points.

POST /api/v1/collections/{name}/reindex (Auth): Start reindex job (returns 202 Accepted)
GET /api/v1/collections/{name}/reindex/{job_id} (Auth): Get reindex job status
GET /api/v1/collections/{name}/reindex (Auth): List reindex jobs for collection

POST /api/v1/collections/{name}/reindex - Response 202
{
  "job_id": "uuid",
  "status": "building",
  "message": "Reindex started"
}
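
A sketch of starting a reindex and polling the returned job (my_collection and the job ID are placeholders):

terminal
# Start a background reindex (returns 202 with a job_id)
curl -X POST -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/reindex

# Poll the job status using the returned job_id
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/reindex/<job_id>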

Points

POST /api/v1/collections/{name}/points (Auth): Upsert points (max 1000/batch)
GET /api/v1/collections/{name}/points (Auth): List points (paginated)
GET /api/v1/collections/{name}/points/{id} (Auth): Get point by ID
DELETE /api/v1/collections/{name}/points (Auth): Delete points by IDs

POST /api/v1/collections/{name}/points
{
  "points": [
    {
      "id": "doc-1",
      "vector": [0.1, 0.2, -0.1, ...],
      "metadata": {
        "text": "Hello world",
        "category": "greeting"
      }
    },
    {
      "id": "doc-2",
      "vector": [0.3, -0.1, 0.5, ...],
      "metadata": {
        "text": "Goodbye world",
        "category": "farewell"
      }
    }
  ]
}

// Max 1000 points per batch
// Vector dimension must match collection
// Existing IDs are updated (upsert)

Search

POST /api/v1/collections/{name}/search (Auth): Vector search
POST /api/v1/collections/{name}/search/hybrid (Auth): Hybrid search (vector + BM25)
POST /api/v1/collections/{name}/search/explain (Auth): Search with explanation
POST /api/v1/collections/{name}/search/estimate (Auth): Estimate search cost

POST /api/v1/collections/{name}/search
{
  "vector": [0.1, 0.2, -0.1, ...],
  "limit": 5,
  "filter": {
    "category": "tech",
    "price": { "$gte": 10, "$lte": 100 }
  },
  "budget_ms": 50
}

// filter: optional metadata filter
// budget_ms: optional, fails with 422 if exceeded
// Filter operators: $eq, $ne, $in, $gt, $lt, $gte, $lte
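
As a sketch, the same search sent with curl (the 3-element vector is illustrative; it must match the collection dimension):

terminal
curl -X POST http://localhost:8080/api/v1/collections/my_collection/search \
  -H "Authorization: Bearer sk-abcd1234" \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2, -0.1], "limit": 5, "filter": {"category": "tech"}}'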

Stats & Metrics

GET /health: Health check
GET /metrics: Prometheus metrics
GET /api/v1/stats/global: Global statistics
GET /api/v1/stats/queries: Recent queries (24h)
GET /api/v1/stats/slow-queries: Slow queries
GET /api/v1/stats/feedback: Query feedback stats
GET /api/v1/collections/{name}/stats (Auth): Collection statistics
GET /api/v1/debug/query-profile/{query_id}: Query execution profile
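
For example (the public endpoints need no key; collection stats are shown with the bootstrap key):

terminal
curl http://localhost:8080/health
curl http://localhost:8080/metrics
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/stats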

API Keys Management

GET /api/v1/keys (Auth): List API keys (metadata only)
POST /api/v1/keys (Auth): Create new API key
DELETE /api/v1/keys/{id} (Auth): Delete API key

POST /api/v1/keys - Response
{
  "id": "key_abc123",
  "key": "ferres_sk_full_raw_key_shown_once",
  "prefix": "ferres_sk_full...",
  "created_at": "2026-02-07T12:00:00Z"
}

// The raw key is returned ONLY on creation.
// Store it securely - it cannot be retrieved later.
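
Listing and deleting keys with curl (the key ID comes from the creation response above):

terminal
curl -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/keys
curl -X DELETE -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/keys/key_abc123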

Users Management

GET /api/v1/users (Auth): List users (Admin only)
POST /api/v1/users (Auth): Create user (Admin only)
DELETE /api/v1/users/{id} (Auth): Delete user (Admin only)
PUT /api/v1/users/{username}/password (Auth): Update password (Admin only)
PUT /api/v1/users/{username}/permissions (Auth): Update permissions (Admin only)

POST /api/v1/users
{
  "username": "analyst",
  "password": "secure-password",
  "role": "viewer",
  "permissions": [
    {
      "resource": { "type": "collection", "name": "sales" },
      "actions": ["read"]
    }
  ]
}

// Roles: "admin", "editor", "viewer"
// Passwords hashed with Argon2
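
A sketch of the same request with curl, assuming an admin-scoped Bearer credential:

terminal
curl -X POST http://localhost:8080/api/v1/users \
  -H "Authorization: Bearer <admin-credential>" \
  -H "Content-Type: application/json" \
  -d '{"username": "analyst", "password": "secure-password", "role": "viewer"}'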

Audit Trail

GET /api/v1/audit (Auth): Query audit trail (Admin only)

The audit trail logs all significant actions: logins, searches, data mutations, user management, and more. Logs are stored in daily-rotated JSONL files at {STORAGE_PATH}/logs/audit-{date}.jsonl.

Audit log entry example
{
  "timestamp": "2026-02-07T14:30:00Z",
  "user_id": "analyst",
  "action": "search",
  "resource": "collection:sales-data",
  "details": { "query_id": "...", "limit": 10 },
  "result": "success",
  "ip_address": "192.168.1.100",
  "duration_ms": 3
}
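
A minimal query with curl (any filter parameters the endpoint accepts are not documented here):

terminal
curl -H "Authorization: Bearer <admin-credential>" http://localhost:8080/api/v1/audit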

Persistence

POST /api/v1/save (Auth): Force save all collections to disk

FerresDB automatically saves dirty collections every 30 seconds and on graceful shutdown. Use the save endpoint to force an immediate flush.
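
For example:

terminal
curl -X POST -H "Authorization: Bearer sk-abcd1234" http://localhost:8080/api/v1/save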

Tiered Storage

Opt-in feature to move vectors between Hot (RAM), Warm (mmap), and Cold (disk) tiers based on access frequency. The HNSW graph stays in memory; only point data is tiered. Use this endpoint to inspect distribution and estimated memory per tier.

GET /api/v1/collections/{name}/tiers (Auth): Get tier distribution and estimated memory
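
For example:

terminal
curl -H "Authorization: Bearer sk-abcd1234" \
  http://localhost:8080/api/v1/collections/my_collection/tiers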

WebSocket

GET /api/v1/ws (Auth): WebSocket endpoint (HTTP upgrade)

Authenticate via query parameter ?token=sk-xxx or Authorization header. Limits: 100 max connections, 10 MB max message size, 30s heartbeat, 5 min inactivity timeout.
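
Any WebSocket client works; as one sketch, using the wscat CLI with query-parameter auth:

terminal
wscat -c "ws://localhost:8080/api/v1/ws?token=sk-abcd1234"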

gRPC API

FerresDB can expose a native gRPC API (behind the grpc feature flag) as a high-performance alternative to REST, with bidirectional streaming. The gRPC server listens on port 50051 (configurable via GRPC_PORT) and, when enabled, runs in parallel with the REST API. Its RPCs mirror the REST API: collections CRUD, points upsert/delete/list, Search, HybridSearch, and ExplainSearch, plus the streaming StreamUpsert and StreamSearch. Metadata and filters are sent as JSON strings. See docs/api.md and crates/server/proto/ferresdb.proto in the core repo for the full mapping and client generation (Python, TypeScript, Go).

Default port: 50051 (set GRPC_PORT to override). Building with the grpc feature enabled requires protoc.
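
As a sketch, you can probe the running gRPC server with grpcurl (listing services this way requires server reflection; otherwise supply the proto file via grpcurl's -proto flag):

terminal
grpcurl -plaintext localhost:50051 list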

Python SDK

PyPI
pip install ferres-db-python

The Python SDK provides an async client built on httpx with automatic retry, structured logging, and type hints. Includes RealtimeClient for WebSocket streaming (upsert, subscribe to collection events). Requires Python 3.8+. Optional dependency: websockets>=12.0 for RealtimeClient.

example.py
import asyncio
from vector_db_client import VectorDBClient, Point, DistanceMetric

async def main():
    async with VectorDBClient(
        base_url="http://localhost:8080",
        api_key="ferres_sk_...",
        timeout=30,
    ) as client:
        # create_collection, upsert_points, search, etc.
        pass

asyncio.run(main())

SDK Features

Full type hints
Automatic retry with exponential backoff
Structured logging (structlog)
Auto-batching (>1000 points)
Async/await (httpx)
Python 3.8+ support
RealtimeClient: WebSocket upsert + event subscription
estimate_search_cost, search_explain
Quantization (SQ8) & tiered storage (Hot/Warm/Cold)
API keys: list_keys, create_key, delete_key
Reindex: start_reindex, get_reindex_job, list_reindex_jobs

TypeScript SDK

npm
pnpm add @ferresdb/typescript-sdk   # or: npm install / yarn add

The TypeScript SDK provides a fully-typed client with automatic retry, runtime validation via Zod, and RealtimeClient for WebSocket streaming (upsert, subscribe to collection events). Works with Node.js 18+, Deno, and browser environments.

setup.ts
import { VectorDBClient, DistanceMetric } from "@ferresdb/typescript-sdk";

const client = new VectorDBClient({
  baseUrl: "http://localhost:8080",
  apiKey: "ferres_sk_...",
  timeout: 30000,
  maxRetries: 3,
  retryDelay: 1000,
});

SDK Features

Full TypeScript types
Runtime validation with Zod
Automatic retry with backoff
Auto-batching (>1000 points)
RealtimeClient: WebSocket upsert + event subscription
ESM and CJS exports
estimateSearchCost, searchExplain
Quantization (SQ8) & tiered storage config
API keys: listKeys, createKey, deleteKey
Reindex: startReindex, getReindexJob, listReindexJobs

Data Model

FerresDB organizes data into collections of points. Each point contains a vector, a unique ID, and arbitrary JSON metadata.

Collections

A collection is a named container with a fixed vector dimension and distance metric. All points in a collection must have vectors of the same dimension.

Property | Type | Description
name | string | Unique name (a-zA-Z0-9_-)
dimension | integer | Vector dimension (1 to 4096)
distance | enum | Cosine, Euclidean, or DotProduct
enable_bm25 | boolean | Enable BM25 text index (default: false)
bm25_text_field | string | Metadata field for BM25 indexing (default: "text")

Points

Point structure
{
  "id": "unique-string-id",
  "vector": [0.1, 0.2, -0.1, ...],  // f32 values, length = collection dimension
  "metadata": {                       // arbitrary JSON object
    "text": "content for BM25",
    "category": "example",
    "price": 42.5
  }
}

Distance Metrics

Cosine

Range: [0, 1] (1 = identical)

Best for: NLP embeddings, semantic similarity. Vectors are L2-normalized internally.

Euclidean

Range: [0, ∞)

Best for: Spatial data, image features. Measures straight-line distance.

Dot Product

Range: (-∞, ∞)

Best for: Pre-normalized vectors, recommendation systems.
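
For reference, the standard definitions behind these metrics are given below; FerresDB's reported scores may be normalized (e.g. cosine mapped into [0, 1]) rather than these raw values:

$$\mathrm{cos}(a,b)=\frac{a\cdot b}{\lVert a\rVert\,\lVert b\rVert},\qquad d_{\mathrm{euclidean}}(a,b)=\sqrt{\textstyle\sum_i (a_i-b_i)^2},\qquad a\cdot b=\textstyle\sum_i a_i b_i$$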

HNSW Index

FerresDB uses the Hierarchical Navigable Small World (HNSW) algorithm for approximate nearest-neighbor search. It builds a multi-layer graph structure enabling sub-millisecond queries even on large datasets.

Parameter | Default | Description
m | 16 | Max connections per layer
ef_construction | 200 | Search width during index construction
ef_search | 50 | Search width during query

Persistence Model

FerresDB uses a WAL (Write-Ahead Log) with periodic snapshots for durability. On crash recovery, the latest snapshot is loaded and the WAL is replayed.

Storage layout
{STORAGE_PATH}/
├── collections/
│   └── {name}/
│       ├── points.jsonl       # Current state
│       ├── wal.jsonl          # Write-ahead log (append-only)
│       ├── snapshot.jsonl     # Periodic snapshots (every 1000 ops)
│       └── index.bin          # HNSW index (binary)
├── api_keys.db                # SQLite: API keys (SHA-256 hashed)
├── users.db                   # SQLite: users (Argon2 passwords)
└── logs/
    ├── queries.log            # Query log (JSONL)
    └── audit-{date}.jsonl     # Daily audit trail
Auto-save: Dirty collections are saved every 30 seconds in the background. All collections are saved on graceful shutdown. Snapshots are created every 1000 write operations to keep WAL size bounded.

© 2024 FerresDB. Built with Rust.