TegDB is a lightweight, embedded database engine with a SQL-like interface, designed for simplicity, performance, and reliability. It provides ACID transactions, crash recovery, and efficient relational storage.
Design Philosophy: TegDB prioritizes simplicity and reliability over complexity. It uses a single-threaded design to eliminate concurrency bugs, reduce memory overhead, and provide predictable performance - making it ideal for embedded systems and applications where resource efficiency matters more than parallel processing.
- Key->offset B+tree index; values are stored on disk, with small values inlined to reduce I/O
- Bounded cache (configurable cap) to boost hot-read hit rates
- Primary key optimized queries (O(log n) lookups)
- Streaming query processing with early LIMIT termination
- Efficient binary serialization
- Atomicity: All-or-nothing transaction execution
- Consistency: Schema validation and constraint enforcement
- Isolation: Write-through with snapshot-like behavior
- Durability: Write-ahead logging with commit markers
- Crash recovery from write-ahead log
- File locking prevents concurrent access corruption
- Graceful handling of partial writes and corruption
- Automatic rollback on transaction drop
- Strong durability by default: per-transaction fsync, configurable group commit
- Observability: metrics for bytes read/written, cache hits/misses, fsync counts
- Single-threaded architecture eliminates race conditions
- Minimal dependencies (only `fs2` for file locking)
- Clean separation of concerns across layers
- Extensive test coverage including ACID compliance
- PostgreSQL-inspired plugin architecture
- Built-in string functions: UPPER, LOWER, LENGTH, TRIM, SUBSTR, REPLACE, CONCAT, REVERSE
- Built-in math functions: ABS, CEIL, FLOOR, ROUND, SQRT, POW, MOD, SIGN
- Create custom scalar and aggregate functions
- Type-safe function signatures with validation
This walkthrough uses released builds and the CLI tools; no code required.
- Install the CLIs
```bash
# Clone the repository
cd tegdb

# Build both CLI tools (tg and tgstream)
cargo build --release

# Or build individually:
# cargo build --release --bin tg
# cargo build --release --bin tgstream

# Copy binaries to PATH (or add target/release to your PATH)
cp target/release/tg ~/.cargo/bin/
cp target/release/tgstream ~/.cargo/bin/

# Ensure ~/.cargo/bin is on your PATH
export PATH="$HOME/.cargo/bin:$PATH"
```
Alternative: If you prefer installing from crates.io:
```bash
cargo install tegdb --version 0.3.0 --bin tg
cargo install tegdb --version 0.3.0 --bin tgstream
```
- Start MinIO locally and create a bucket
```bash
# Run MinIO (standard MinIO quickstart; ports published for the API and Console)
docker run -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin -e MINIO_ROOT_PASSWORD=minioadmin \
  quay.io/minio/minio server /data --console-address ":9001"

# Create a bucket using the MinIO Console at http://localhost:9001 (Login: minioadmin/minioadmin)
# In the Console: Buckets → Create Bucket → Name: tegdb-backups
```
- Configure AWS-compatible env vars for MinIO
```bash
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin
export AWS_REGION=us-east-1
export AWS_ENDPOINT_URL=http://127.0.0.1:9000
export TGSTREAM_BUCKET=tegdb-backups
```
- Create and query a database with the `tg` CLI
```bash
# Use an absolute file URL ending with .teg
DB=file:///$(pwd)/quickstart.teg

# Create table and insert a row
tg "$DB" --command "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT(32));"
tg "$DB" --command "INSERT INTO users (id, name) VALUES (1, 'Alice');"

# Query
tg "$DB" --command "SELECT * FROM users;"
```
- Enable continuous cloud backup to MinIO with `tgstream`
```bash
# Create config file (use absolute path - replace /path/to with your actual path)
cat > tgstream.toml <<EOF
database_path = "$(pwd)/quickstart.teg"

[s3]
bucket = "tegdb-backups"
prefix = "dbs/quickstart"
region = "us-east-1"
endpoint = "http://127.0.0.1:9000"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"

[base]
interval_minutes = 1
segment_size_mb = 50

[segment]
min_bytes = 1024
debounce_ms = 1500

[retention]
bases = 3
max_segments_bytes = 107374182400
gzip = true
EOF

# Start replication (best run under a supervisor/tmux)
tgstream run --config tgstream.toml

# In another terminal, verify backup is working:
tgstream list --config tgstream.toml
# With interval_minutes = 1, you should see new base snapshots appearing every minute
```
- Restore database from backup
```bash
# List available backups
tgstream list --config tgstream.toml

# Restore to latest state (subcommand as in the diagram below; confirm flags with `tgstream --help`)
tgstream restore --config tgstream.toml

# Verify restored data
tg "$DB" --command "SELECT * FROM users;"
# Should show: Alice
```
Example restore scenario: If your original database gets corrupted or deleted, you can restore it from MinIO:
```bash
# Original database is lost/corrupted
rm quickstart.teg

# Restore from backup (flags as above; confirm with `tgstream --help`)
tgstream restore --config tgstream.toml

# Continue using the restored database
tg "$DB" --command "SELECT * FROM users;"
```
Add TegDB to your Cargo.toml:
```toml
[dependencies]
tegdb = "0.3.0"
```
```rust
use tegdb::Database;

fn main() -> tegdb::Result<()> {
    // Open or create a database (must use absolute file:// path ending in .teg)
    let mut db = Database::open("file:///tmp/my_app.teg")?;

    // Create a table
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT(32), age INTEGER)")?;

    // Insert data
    db.execute("INSERT INTO users (id, name, age) VALUES (1, 'Alice', 30)")?;
    db.execute("INSERT INTO users (id, name, age) VALUES (2, 'Bob', 25)")?;

    // Query data
    let result = db.query("SELECT name, age FROM users WHERE age > 25")?;
    println!("Found {} users:", result.len());
    for row in result.rows() {
        if let (Some(name), Some(age)) = (row.first(), row.get(1)) {
            println!("User: {:?}, Age: {:?}", name, age);
        }
    }

    Ok(())
}
```
```rust
use tegdb::Database;

fn main() -> tegdb::Result<()> {
    let mut db = Database::open("file:///tmp/bank.teg")?;

    // Create accounts table
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT(32), balance INTEGER)")?;
    db.execute("INSERT INTO accounts (id, name, balance) VALUES (1, 'Alice', 1000)")?;
    db.execute("INSERT INTO accounts (id, name, balance) VALUES (2, 'Bob', 500)")?;

    // Transfer funds using an explicit transaction
    let mut tx = db.begin_transaction()?;

    // Debit from Alice's account
    tx.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")?;

    // Credit to Bob's account
    tx.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")?;

    // Commit the transaction (or it will auto-rollback on drop)
    tx.commit()?;

    println!("Transfer completed successfully!");
    Ok(())
}
```
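Because transactions roll back automatically when dropped (see the reliability features above), aborting one is just letting it fall out of scope without committing. A minimal sketch reusing the `db` handle from the example above:

```rust
// Start a transaction but never commit it.
{
    let mut tx = db.begin_transaction()?;
    tx.execute("UPDATE accounts SET balance = 0 WHERE id = 1")?;
    // `tx` is dropped here without commit(), so the update is rolled back.
}

// The aborted transaction left the data untouched.
let result = db.query("SELECT balance FROM accounts WHERE id = 1")?;
println!("rows after rollback: {}", result.len());
```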
TegDB supports a comprehensive subset of SQL:
```sql
-- Create tables with constraints (TEXT columns take a length, see data types below)
CREATE TABLE products (
    id INTEGER PRIMARY KEY,
    name TEXT(64) NOT NULL,
    price REAL,
    category TEXT(32)
);

-- Drop tables
DROP TABLE IF EXISTS old_table;
```
```sql
-- Insert single or multiple rows
INSERT INTO products (id, name, price) VALUES (1, 'Widget', 19.99);
INSERT INTO products (id, name, price) VALUES (2, 'Gadget', 29.99), (3, 'Tool', 39.99);

-- Update with conditions
UPDATE products SET price = 24.99 WHERE name = 'Widget';

-- Delete with conditions
DELETE FROM products WHERE price < 20.00;

-- Query with filtering and limits
SELECT name, price FROM products WHERE category = 'Electronics' LIMIT 10;
```
```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT; -- or ROLLBACK;
```
- `INTEGER` - 64-bit signed integers
- `REAL` - 64-bit floating point numbers
- `TEXT` - UTF-8 strings (requires length specification, e.g., `TEXT(100)`)
- `NULL` - Null values
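A small schema touching each type, as a sketch reusing the `Database` API from the quick start (the table and column names here are made up for illustration):

```rust
// INTEGER primary key, TEXT with its required length, REAL value.
db.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, label TEXT(100), value REAL)")?;
db.execute("INSERT INTO measurements (id, label, value) VALUES (1, 'temp', 21.5)")?;
// NULL where a value is absent.
db.execute("INSERT INTO measurements (id, label, value) VALUES (2, 'pressure', NULL)")?;
```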
TegDB provides a PostgreSQL-inspired extension system for adding custom functions. Extensions can be loaded via SQL commands (PostgreSQL-style) or programmatically via the Rust API.
The recommended way to manage extensions is using SQL commands:
```sql
-- Load built-in extensions
CREATE EXTENSION tegdb_string;
CREATE EXTENSION tegdb_math;

-- Use extension functions in SQL
SELECT UPPER('hello'), SQRT(144);

-- Load custom extension from dynamic library
CREATE EXTENSION my_extension;

-- Or specify explicit path
CREATE EXTENSION my_extension WITH PATH '/path/to/libmy_extension.so';

-- Extensions persist automatically - they'll be loaded on next database open

-- Remove an extension
DROP EXTENSION my_extension;
```
You can also register extensions programmatically:
```rust
use tegdb::{Database, StringFunctionsExtension, MathFunctionsExtension, SqlValue};

let mut db = Database::open("file:///tmp/test.teg")?;

// Register built-in extensions
db.register_extension(Box::new(StringFunctionsExtension))?;
db.register_extension(Box::new(MathFunctionsExtension))?;

// Call functions directly
let result = db.call_function("UPPER", &[SqlValue::Text("hello".to_string())])?;
assert_eq!(result, SqlValue::Text("HELLO".to_string()));

let result = db.call_function("SQRT", &[SqlValue::Integer(144)])?;
assert_eq!(result, SqlValue::Real(12.0));
```
String Functions (tegdb_string):
- `UPPER(text)` - Convert to uppercase
- `LOWER(text)` - Convert to lowercase
- `LENGTH(text)` - String length
- `TRIM(text)` / `LTRIM(text)` / `RTRIM(text)` - Trim whitespace
- `SUBSTR(text, start, length)` - Extract substring
- `REPLACE(text, from, to)` - Replace occurrences
- `CONCAT(text, ...)` - Concatenate strings (variadic)
- `REVERSE(text)` - Reverse string
Math Functions (tegdb_math):
- `ABS(number)` - Absolute value
- `CEIL(number)` / `FLOOR(number)` - Ceiling/floor
- `ROUND(number, decimals)` - Round to decimal places
- `SQRT(number)` - Square root
- `POW(base, exponent)` - Power
- `MOD(a, b)` - Modulo
- `SIGN(number)` - Sign (-1, 0, or 1)
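Once an extension is loaded (via `CREATE EXTENSION` as shown earlier), its functions can appear in any SQL expression. A brief sketch through the Rust API, assuming DDL-style statements go through `execute` like other SQL and that the functions behave as listed above:

```rust
db.execute("CREATE EXTENSION tegdb_math")?;

// Mix math functions into an ordinary query.
let result = db.query("SELECT ABS(-5), ROUND(2.7182, 2), POW(2, 10)")?;
// Expect one row: 5, 2.72, 1024
println!("rows: {}", result.len());
```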
Define custom extensions by implementing the Extension trait:
```rust
use tegdb::{Extension, ScalarFunction, FunctionSignature, ArgType, DataType, SqlValue};

// Define a custom function
struct DoubleFunction;

impl ScalarFunction for DoubleFunction {
    fn name(&self) -> &'static str {
        "DOUBLE"
    }

    fn signature(&self) -> FunctionSignature {
        FunctionSignature::new(vec![ArgType::Numeric], DataType::Real)
    }

    fn execute(&self, args: &[SqlValue]) -> Result<SqlValue, String> {
        match &args[0] {
            SqlValue::Integer(i) => Ok(SqlValue::Integer(i * 2)),
            SqlValue::Real(r) => Ok(SqlValue::Real(r * 2.0)),
            _ => Err("Expected numeric argument".to_string()),
        }
    }
}

// Define an extension
struct MyExtension;

impl Extension for MyExtension {
    fn name(&self) -> &'static str {
        "my_extension"
    }

    fn version(&self) -> &'static str {
        "1.0.0"
    }

    fn scalar_functions(&self) -> Vec<Box<dyn ScalarFunction>> {
        vec![Box::new(DoubleFunction)]
    }
}

// Register and use
db.register_extension(Box::new(MyExtension))?;
let result = db.call_function("DOUBLE", &[SqlValue::Integer(21)])?;
assert_eq!(result, SqlValue::Integer(42));
```
Via SQL:
```sql
-- Extensions are automatically persisted and loaded on database open
-- No need to re-run CREATE EXTENSION after restarting
```
Via Rust API:
```rust
// List registered extensions
for (name, version) in db.list_extensions() {
    println!("{} v{}", name, version);
}

// Check if a function exists
if db.has_function("UPPER") {
    println!("UPPER function is available");
}

// Unregister an extension
```
To create a dynamic library extension that can be loaded via CREATE EXTENSION:
- Create a Rust library project:

```bash
cargo new --lib my_extension
cd my_extension
```
- Configure `Cargo.toml`:

```toml
[lib]
crate-type = ["cdylib"]

[dependencies]
tegdb = { path = "../tegdb" } # Or from crates.io
```
- Implement the extension:

```rust
use tegdb::{Extension, ExtensionWrapper, ScalarFunction, SqlValue, FunctionSignature, ArgType, DataType};

struct MyFunction;

impl ScalarFunction for MyFunction {
    fn name(&self) -> &'static str {
        "MY_FUNCTION"
    }

    fn signature(&self) -> FunctionSignature {
        FunctionSignature::new(vec![ArgType::TextLike], DataType::Text(None))
    }

    fn execute(&self, args: &[SqlValue]) -> Result<SqlValue, String> {
        match &args[0] {
            SqlValue::Text(s) => Ok(SqlValue::Text(s.to_uppercase())),
            _ => Err("Expected text argument".to_string()),
        }
    }
}

struct MyExtension;

impl Extension for MyExtension {
    fn name(&self) -> &'static str {
        "my_extension"
    }

    fn version(&self) -> &'static str {
        "1.0.0"
    }

    fn scalar_functions(&self) -> Vec<Box<dyn ScalarFunction>> {
        vec![Box::new(MyFunction)]
    }
}

#[no_mangle]
pub extern "C" fn create_extension() -> *mut ExtensionWrapper {
    Box::into_raw(Box::new(ExtensionWrapper {
        extension: Box::new(MyExtension),
    }))
}
```
- Build and install:

```bash
cargo build --release
cp target/release/libmy_extension.so ~/.tegdb/extensions/
# Or place in ./extensions/ relative to your database
```
- Use in SQL:

```sql
CREATE EXTENSION my_extension;
SELECT MY_FUNCTION('hello');
```
For complete examples, see examples/extension_demo.rs and examples/extension_template.rs.
- Primary key lookups: O(log n)
- Range scans: O(log n + k) where k = result size
- Inserts/Updates/Deletes: O(log n)
- Schema operations: O(1) with caching
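These are the expected asymptotics; to eyeball point-lookup latency on your own data, a rough timing sketch works (illustrative only, not the project's benchmark suite):

```rust
use std::time::Instant;

// Assumes `db` and the `users` table from the quick-start example.
let start = Instant::now();
let result = db.query("SELECT name FROM users WHERE id = 1")?; // primary key lookup
println!("pk lookup: {:?}, rows: {}", start.elapsed(), result.len());
```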
- In-memory index: BTreeMap holds key -> value offset (values on disk); small values can be stored inline
- Bounded cache: Byte-capped value/page cache for hot data
- Lazy allocation: Undo logs only allocated when needed
- Streaming queries: LIMIT processed without loading full result
- Fixed header: 64-byte header with magic `TEGDB\0`, version (1), limits, flags
- Append-only log: Fast writes after the header, no seek overhead
- Binary serialization: Compact data representation
- Key->offset layout: B+tree holds offsets; values stored in the data region, small values inline
- Automatic compaction: Reclaims space from old entries while preserving header
- Crash recovery: Replay from last commit marker
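Given the fixed header described above, a file can be sanity-checked without opening the engine. A minimal sketch assuming only what the list states (a 64-byte header beginning with the magic `TEGDB\0`):

```rust
use std::fs::File;
use std::io::Read;

// Returns true if the file begins with a plausible TegDB header.
fn looks_like_tegdb(path: &str) -> std::io::Result<bool> {
    let mut header = [0u8; 64]; // fixed 64-byte header
    File::open(path)?.read_exact(&mut header)?;
    Ok(header.starts_with(b"TEGDB\0")) // magic bytes at offset 0
}
```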
TegDB includes tgstream, a standalone streaming backup tool that continuously replicates your database to cloud storage (S3, MinIO, etc.), similar to Litestream for SQLite.
- Incremental Replication: Tracks file offsets and uploads only committed changes
- Base Snapshots: Periodic full database snapshots for fast recovery
- Automatic Rotation Detection: Handles database compaction/rotation automatically
- Point-in-Time Recovery: Restore to any previous state using base + segments
- Retention Policies: Configurable retention for snapshots and segments
- Compression: Optional gzip compression to reduce storage costs
Install from crates.io:
```bash
cargo install tegdb --version 0.3.0 --bin tgstream

# Or install both binaries:
cargo install tegdb --version 0.3.0 --bin tg --bin tgstream
```

Create a configuration file `tgstream.toml`:
```toml
database_path = "/absolute/path/to/your/database.teg"

[s3]
bucket = "my-backup-bucket"
prefix = "dbs/mydb"
region = "us-east-1"
# Optional: For MinIO or custom S3-compatible storage
endpoint = "http://localhost:9000"   # MinIO endpoint
access_key_id = "minioadmin"         # MinIO access key
secret_access_key = "minioadmin"     # MinIO secret key

[base]
interval_minutes = 60   # Create new base snapshot every hour
segment_size_mb = 100   # Or after 100MB of segments

[segment]
min_bytes = 1024        # Minimum segment size to upload
debounce_ms = 2000      # Wait 2 seconds before uploading

[retention]
bases = 3                          # Keep last 3 base snapshots
max_segments_bytes = 107374182400  # 100GB max segments
gzip = true                        # Enable compression
```
For AWS S3, you can omit endpoint, access_key_id, and secret_access_key and use environment variables or IAM roles instead:
```bash
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
export AWS_REGION=us-east-1
```
```bash
# Run continuous replication
tgstream run --config tgstream.toml

# Create a one-off snapshot
tgstream snapshot --config tgstream.toml

# Restore database from backup
tgstream restore --config tgstream.toml

# List available snapshots
tgstream list --config tgstream.toml

# Prune old snapshots
tgstream prune --config tgstream.toml

# (Subcommand names follow the comments above; confirm flags with `tgstream --help`.)
```
- Monitoring: tgstream monitors your `.teg` file for new committed transactions
- Segment Uploads: After each commit, new data is uploaded as incremental segments
- Base Snapshots: Periodically (every N minutes or after N MB of segments), a full snapshot is created
- State Tracking: Local state file tracks progress, file metadata, and prevents duplicate uploads
- Restore: Downloads base snapshot + all subsequent segments to reconstruct the database
The tool is designed to be run as a background service alongside your application, providing continuous off-site backup with minimal overhead.
```mermaid
flowchart LR
    subgraph App
        TG[tg CLI / App]
    end
    DB[(.teg file)]
    TS[tgstream]
    S3[(S3/MinIO Bucket)]
    TG -- SQL --> DB
    TS -- monitor commits --> DB
    TS -- base snapshots --> S3
    TS -- incremental segments --> S3
    S3 -- base+segments --> Restore[tgstream restore]
```
TegDB implements a clean layered architecture with four distinct layers:
```mermaid
flowchart TB
    A[Database API\nSQLite-like interface\nwith schema caching]
    B[SQL Executor\nQuery optimization\n+ execution]
    C[SQL Parser\nnom-based AST]
    D[Storage Engine\nKV + WAL + TX]
    A --> B --> C --> D
```
- Storage Engine: BTreeMap-based in-memory storage with append-only log persistence
- Transaction System: Write-through transactions with undo logging and commit markers
- SQL Support: Full SQL parser and executor supporting DDL and DML operations
- Index-Organized Tables: Primary key optimization with direct key lookups
- Schema Caching: Database-level schema caching for improved performance
- Crash Recovery: WAL-based recovery using transaction commit markers
See ARCHITECTURE.md for detailed information about:
- Layer-by-layer implementation details
- Storage format and recovery mechanisms
- Memory management and performance optimizations
- Transaction system and ACID guarantees
- Query optimization and execution strategies
```rust
use std::time::Duration;
use tegdb::storage_engine::{DurabilityConfig, DurabilityLevel, EngineConfig};

let config = EngineConfig {
    max_key_size: 1024,         // 1KB max key size
    max_value_size: 256 * 1024, // 256KB max value size
    auto_compact: true,         // Auto-compact on open
    initial_capacity: None,     // No default key cap; set Some(...) to enforce a hard key cap
    preallocate_size: None,     // No default disk preallocation; set Some(bytes) to cap/preallocate

    // Durability: default is Immediate (fsync every commit)
    durability: DurabilityConfig {
        level: DurabilityLevel::Immediate,
        group_commit_interval: Duration::from_millis(0), // Set >0 to enable group commit
    },

    // Inline hot/small values to avoid disk reads; the rest spill to data pages.
    inline_value_threshold: 64,        // bytes
    cache_size_bytes: 8 * 1024 * 1024, // value/page cache cap

    // Compaction: absolute threshold + ratio + min delta bytes since last compact
    compaction_absolute_threshold_bytes: 10 * 1024 * 1024,
    compaction_ratio: 2.0,
    compaction_min_delta_bytes: 2 * 1024 * 1024,
    ..Default::default()
};
// Note: Custom config requires low-level API
```
Key defaults:
- Values are stored on disk; the B+tree indexes key -> value offset. Small values (<= `inline_value_threshold`) stay inline.
- A byte-capped value cache (`cache_size_bytes`) keeps hot values in memory.
- Durability defaults to per-commit `fsync`; set `DurabilityLevel::GroupCommit` with a non-zero `group_commit_interval` to coalesce flushes.
- Default compaction uses an absolute threshold (10 MiB), fragmentation ratio (2.0), and a minimum written delta (2 MiB) since the last compaction.
- No default hard cap on key count or disk size; set `initial_capacity` and `preallocate_size` to enforce limits in production (see the sketch below).
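For example, opting into hard caps touches only the two fields named in the last bullet (reusing the imports above; the values here are illustrative):

```rust
// Enforce production limits on top of the defaults.
let capped = EngineConfig {
    initial_capacity: Some(1_000_000),         // hard cap on the number of keys
    preallocate_size: Some(512 * 1024 * 1024), // preallocate/cap 512 MiB on disk
    ..Default::default()
};
```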
Metrics (observability):
```rust
let metrics = engine.metrics();
println!(
    "bytes_read={}, bytes_written={}, cache_hits={}, cache_misses={}, fsyncs={}",
    metrics.bytes_read,
    metrics.bytes_written,
    metrics.cache_hits,
    metrics.cache_misses,
    metrics.fsync_count
);
```
For advanced use cases, you can access low-level APIs via module paths:
```rust
use tegdb::storage_engine::{StorageEngine, EngineConfig};

// Direct key-value operations (requires absolute PathBuf)
let mut engine = StorageEngine::new(std::path::PathBuf::from("/tmp/data.teg"))?;
engine.set(b"key", b"value".to_vec())?;
let value = engine.get(b"key");

// Transaction control
let mut tx = engine.begin_transaction();
tx.set(b"key1", b"value1".to_vec())?;
tx.set(b"key2", b"value2".to_vec())?;
tx.commit()?;
```
```bash
# Standard build
cargo build

# Run tests
cargo test

# Run benchmarks
cargo bench
```
TegDB includes comprehensive tests covering:
- ACID transaction properties
- Crash recovery scenarios
- SQL parsing and execution
- Performance benchmarks
- Edge cases and error conditions
```bash
# Run the full native test suite
./run_all_tests.sh

# Run with verbose output
./run_all_tests.sh --verbose

# CI-friendly run (preserves test output)
./run_all_tests.sh --ci
```
Run a log server that exposes the WAL over Cap'n Proto RPC and connect with the `rpc://` protocol using the low-level engine API.
```bash
# Build with RPC support
cargo build --features rpc --bin tglogd

# Start the log server (absolute path recommended)
./target/debug/tglogd --listen 127.0.0.1:9000 --db /absolute/path/to/db.teg

# Run the RPC smoke example (in another terminal)
cargo run --features rpc --example rpc_log_smoke
```
Use the following commands to keep the tree clean:
```bash
# Format source code
cargo fmt --all

# Run Clippy with the same settings as CI
cargo clippy --all-targets --all-features -- -D warnings

# Run the full CI-equivalent precheck suite
./ci_precheck.sh
```
Run performance benchmarks against other embedded databases:
```bash
cargo bench
```
Included benchmarks compare against:
- SQLite
- sled
- redb
- Simplicity First: Prefer simple, understandable solutions
- Reliability: Prioritize correctness over performance optimizations
- Standard Library: Use std library when possible to minimize dependencies
- Single Threaded: Eliminate concurrency complexity and bugs
- Resource Efficient: Optimize for memory and CPU usage
- Single-threaded: No concurrent access support
- No secondary indexes: Only primary key optimization
- Limited SQL: Subset of full SQL standard
- No foreign keys: Basic constraint support only
- No joins: Single table queries only
- Secondary index support
- JOIN operation support
- More SQL features (subqueries, aggregation)
- Compression for large values
- Streaming for very large result sets
- Enhanced backup features (multi-DB support, encryption-at-rest)
Licensed under AGPL-3.0. See LICENSE for details.
The AGPL-3.0 ensures that any modifications to TegDB remain open source and available to the community.
Contributions welcome! Please:
- Follow the design principles above
- Include comprehensive tests
- Update documentation for new features
- Ensure benchmarks still pass
See CONTRIBUTING.md for detailed guidelines.