Hey HN, I'm excited to share Antfly: a distributed document database and search engine written in Go that combines full-text, vector, and graph search. Use it for distributed multimodal search and memory, or for local dev and small deployments.
I built this to give developers a single-binary deployment with native ML inference (via a built-in service called Termite), meaning you don't need external API calls for vector search unless you want to use them.
Some things that might interest this crowd:
Capabilities: Multimodal indexing (images, audio, video), MongoDB-style in-place updates, and streaming RAG.
Distributed Systems: Multi-Raft setup built on etcd's library, backed by Pebble (CockroachDB's storage engine). Metadata and data shards get their own Raft groups.
Single Binary: antfly swarm gives you a single-process deployment with everything running. Good for local dev and small deployments. Scale out by adding nodes when you need to.
Ecosystem: Ships with a Kubernetes operator and an MCP server for LLM tool use.
Native ML inference: Antfly ships with Termite. Think of it as a built-in Ollama that also covers non-generative models (embeddings, reranking, chunking) alongside text generation. No external API calls are needed, but external providers (OpenAI, Ollama, Bedrock, Gemini, etc.) are supported too.
License: I went with Elastic License v2, which is not an OSI-approved license. I know that's a topic with strong feelings here. The practical upshot: you can use it, modify it, self-host it, and build products on top of it; you just can't offer Antfly itself as a managed service. It felt like the right tradeoff for sustainability while still making the source available.
Happy to answer questions about the architecture, the Raft implementation, or anything else. Feedback welcome!