# QuickRAG
A fast RAG tool that indexes documents using your choice of embedding provider and stores them in LanceDB for efficient similarity search.
## Quick Start

```bash
# Create a config
$ quickrag init

# Index documents
$ quickrag index gutenberg/ --output gutenberg.rag
✔ Parsing documents from gutenberg/... (using recursive-token chunker)
✔ Detecting embedding dimensions
✔ Initializing database
✔ Finding files to index
✔ Removing deleted files from index
✔ Preparing for indexing
✔ Indexing files
✔ Finalizing
Indexing complete!
Processed 622 chunks across 2 files.
Removed 1 deleted file.
Added 619 new chunks (3 already existed).
Total chunks in database: 619

# Search
$ quickrag query gutenberg.rag "Who is Sherlock Holmes?"
```
## Features

- Multiple embedding providers (VoyageAI, OpenAI, Ollama)
- Token-based recursive chunking (default) or character-based chunking
- LanceDB vector storage with persistent `.rag` files
- Idempotent indexing (tracks indexed files, skips unchanged)
- Automatic cleanup of deleted files from index
- UTF-8 sanitization for PDF conversions
- TypeScript & Bun
## Installation

Via Homebrew:

```bash
brew install statico/quickrag/quickrag
```

Or download a prebuilt binary:

```bash
# macOS (Apple Silicon)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-darwin-arm64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# macOS (Intel)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-darwin-x64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# Linux (ARM64)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-linux-arm64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag

# Linux (x64)
curl -L https://github.com/statico/quickrag/releases/latest/download/quickrag-linux-x64 -o /usr/local/bin/quickrag
chmod +x /usr/local/bin/quickrag
```

Note: macOS binaries are not codesigned. You may need to run `xattr -d com.apple.quarantine /usr/local/bin/quickrag` to bypass Gatekeeper.
## Running from Source

Requires [Bun](https://bun.sh).

```bash
git clone https://github.com/statico/quickrag.git
cd quickrag
bun install
bun run dev --help
```
## Configuration

Run `quickrag init` to create `~/.config/quickrag/config.yaml` with the defaults:

```yaml
provider: ollama
model: nomic-embed-text
baseUrl: http://localhost:11434
chunking:
  strategy: recursive-token
  chunkSize: 500
  chunkOverlap: 50
  minChunkSize: 50
```
Edit `~/.config/quickrag/config.yaml` to set API keys and preferences:

```yaml
provider: openai
apiKey: sk-your-key-here
model: text-embedding-3-small
chunking:
  strategy: recursive-token
  chunkSize: 500
  chunkOverlap: 50
  minChunkSize: 50
```
Then index and query:

```bash
quickrag index ./documents --output my-docs.rag
quickrag query my-docs.rag "What is the main topic?"
```

Configuration options:
- `provider`: Embedding provider (`openai`, `voyageai`, or `ollama`)
- `apiKey`: API key (can also use environment variables)
- `model`: Model name for the embedding provider
- `baseUrl`: Base URL for Ollama (default: `http://localhost:11434`)
- `chunking.strategy`: `recursive-token` (default) or `simple`
- `chunking.chunkSize`: Tokens (for `recursive-token`, default: 500) or characters (for `simple`, default: 1000)
- `chunking.chunkOverlap`: Tokens (for `recursive-token`, default: 50) or characters (for `simple`, default: 200)
- `chunking.minChunkSize`: Minimum chunk size in tokens (default: 50). Chunks smaller than this are filtered out to prevent tiny fragments.
## Chunking Strategies

### `recursive-token` (default)

Token-based splitting that respects semantic boundaries. It splits at paragraph breaks first, then line breaks, then sentence endings, and finally word boundaries. Chunks are sized by estimated tokens (default: 500), aligning with embedding model expectations, and maintain a configurable overlap (default: 50 tokens, ~10%).
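The gist of the strategy, as a minimal TypeScript sketch (illustrative only: the exact separator order, the 4-characters-per-token estimate, and the omission of overlap handling are assumptions, not QuickRAG's actual implementation):

```typescript
// Separators from coarsest to finest: paragraph > line > sentence > word.
const SEPARATORS = ["\n\n", "\n", ". ", " "];

// Rough heuristic: ~4 characters per token.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

// Recursively break text at the coarsest separator until every piece
// fits the token budget. (Separators are dropped here for brevity.)
function split(text: string, maxTokens: number, sepIndex = 0): string[] {
  if (estimateTokens(text) <= maxTokens || sepIndex >= SEPARATORS.length) {
    return [text];
  }
  return text
    .split(SEPARATORS[sepIndex])
    .flatMap((piece) => split(piece, maxTokens, sepIndex + 1));
}

// Greedily merge pieces back into chunks up to the budget, then drop
// fragments below minChunkSize so tiny scraps never get indexed.
function chunk(text: string, maxTokens = 500, minTokens = 50): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const piece of split(text, maxTokens)) {
    if (current && estimateTokens(current + piece) > maxTokens) {
      chunks.push(current.trim());
      current = "";
    }
    current += piece + " ";
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks.filter((c) => estimateTokens(c) >= minTokens);
}
```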
### `simple`

Character-based chunking kept for backward compatibility. Chunks are sized by characters (default: 1000) with sentence-boundary detection, and overlap is character-based (default: 200).
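For comparison, a character-window sketch of the `simple` strategy with overlap (again illustrative; the 20%-of-window sentence-boundary heuristic is an assumption):

```typescript
// Fixed-size character windows with overlap, snapping each cut back to
// the previous sentence boundary when one falls near the end of the window.
function simpleChunk(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + size, text.length);
    // Prefer to end on a sentence boundary within the last 20% of the window.
    const boundary = text.lastIndexOf(". ", end);
    if (boundary > start + size * 0.8) end = boundary + 1;
    chunks.push(text.slice(start, end).trim());
    if (end >= text.length) break;
    start = end - overlap; // step back so consecutive chunks overlap
  }
  return chunks;
}
```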
## Benchmark

Benchmarked on a test corpus (2 files: sherlock-holmes.txt, frankenstein.txt):

| Metric | Recursive Token | Simple |
|---|---|---|
| Chunks Created | 622 | 2,539 (4.1x more) |
| Indexing Time | ~19 seconds | ~37 seconds |
| Query Quality | ✔ Better semantic matches, more context | - |
Recommendation: Use `recursive-token` for production. The indexing-time difference is negligible compared to the improved retrieval quality.
## Recommended Settings

Most use cases:

- `strategy: recursive-token`
- `chunkSize: 400-512` (tokens): research-backed sweet spot for 85-90% recall
- `chunkOverlap: 50-100` (tokens, ~10-20%)

Technical documentation:

- `strategy: recursive-token`
- `chunkSize: 500-600` (tokens)
- `chunkOverlap: 75-100` (tokens)

Narrative text:

- `strategy: recursive-token`
- `chunkSize: 400-500` (tokens)
- `chunkOverlap: 50-75` (tokens)

Academic papers:

- `strategy: recursive-token`
- `chunkSize: 600-800` (tokens)
- `chunkOverlap: 100-150` (tokens)
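As an example, here is the academic-papers preset expressed as a config. The chunking values are picked from the middle of the ranges above; the provider block simply reuses the earlier OpenAI example:

```yaml
provider: openai
apiKey: sk-your-key-here
model: text-embedding-3-small
chunking:
  strategy: recursive-token
  chunkSize: 700
  chunkOverlap: 120
  minChunkSize: 50
```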
## Indexing

```bash
# Basic indexing
quickrag index ./documents --output my-docs.rag

# Override chunking parameters
quickrag index ./documents --chunker recursive-token --chunk-size 500 --chunk-overlap 50 --min-chunk-size 50 --output my-docs.rag

# Use a different provider
quickrag index ./documents --provider openai --model text-embedding-3-small --output my-docs.rag

# Clear the existing index
quickrag index ./documents --clear --output my-docs.rag
```
Note: QuickRAG automatically detects and removes deleted files from the index. If a file was previously indexed but no longer exists in the source directory, it will be removed from the database during the next indexing run.
## Querying

```bash
quickrag query my-docs.rag "What is the main topic?"
```

Or start an interactive session:

```bash
quickrag interactive my-docs.rag
```
## Provider Examples

VoyageAI:

```yaml
provider: voyageai
apiKey: your-voyage-api-key
model: voyage-3
```

OpenAI:

```yaml
provider: openai
apiKey: sk-your-openai-key
model: text-embedding-3-small
```

Ollama:

```yaml
provider: ollama
model: nomic-embed-text
baseUrl: http://localhost:11434
```
## Supported File Types

- `.txt` - Plain text files
- `.md` - Markdown files
- `.markdown` - Markdown files
## Development

```bash
bun install
bun run dev index ./documents --provider ollama --output test.rag
bun run build
bun run typecheck
```
## Requirements

- Bun >= 1.0.0
- TypeScript >= 5.0.0
- For Ollama: a running Ollama instance with an embedding model installed (e.g., `ollama pull nomic-embed-text`)
## License

This is free and unencumbered software released into the public domain. For more information, see UNLICENSE or visit https://unlicense.org.