Beacon is intentionally small — it’s not about lines of code,
it’s about where they belong.
The core fits in a few hundred lines because it doesn’t need to fight Python — it flows with it.
Beacon is a lightweight Python library for experiment management in machine learning and data science.
It provides flexible configuration management, experiment tracking, and hyperparameter optimization —
all without the complexity or overhead of heavy frameworks.
- True Namespace Isolation: MultiScope provides independent config contexts (unique to Beacon!)
- Configuration Transparency: Visualize exact config merge order - debug configs with the `manual` command
- Built-in Experiment Tracking: SQLite-based tracking with no external services required
- Structural Hashing: Track experiment structure changes automatically
- Zero Boilerplate: Auto-nested configs, lazy evaluation, attribute access
- CLI-first Design: Configure experiments from command line without touching code
- Framework Agnostic: Works with PyTorch, TensorFlow, JAX, or pure Python
pip install beacon-python

    from beacon.scope import Scope

    scope = Scope()

    @scope.observe(default=True)
    def config(cfg):
        cfg.lr = 0.001
        cfg.batch_size = 32
        cfg.model = 'resnet50'

    @scope
    def train(cfg):
        print(f"Training {cfg.model} with lr={cfg.lr}")
        # Your training code here

    if __name__ == '__main__':
        train()

    # python train.py
    # Override from CLI: python train.py lr=0.01 model=%resnet101%

- ADict: Enhanced Dictionary
- Scope: Configuration Management
- MultiScope: Namespace Isolation ⭐ Unique to Beacon
- Config Documentation & Debugging ⭐ Unique to Beacon
- SQL Tracker: Experiment Tracking
- Hyperparameter Optimization
- Best Practices
- Comparison with Hydra
ADict is an enhanced dictionary designed for managing experiment configurations. It combines the simplicity of Python dictionaries with powerful features for ML workflows.
These are the fundamental capabilities that make ADict powerful for experiment management:
| Feature | Description | Why It Matters |
|---|---|---|
| Structural Hashing | Hash based on keys + types, not values | Track when experiment structure changes |
| Nested Access | Dot notation for nested configs | config.model.lr instead of config['model']['lr'] |
| Format Agnostic | Load/save JSON, YAML, TOML, XYZ | Work with any config format |
| Safe Updates | `update_if_absent()` method | Prevent accidental overwrites |
These utilities maximize developer productivity and reduce boilerplate:
| Feature | Description | Benefit |
|---|---|---|
| Auto-nested (`ADict.auto()`) | Infinite depth lazy creation | `config.a.b.c = 1` just works - no KeyError |
| Attribute-style Assignment | `config.lr = 0.1` | Cleaner, more readable code |
| Conditional Updates | Only update missing keys | Merge configs safely |

    from beacon.adict import ADict

    # Structural hashing - track config structure changes
    config1 = ADict(lr=0.1, epochs=100, model='resnet50')
    config2 = ADict(lr=0.01, epochs=200, model='resnet101')
    print(config1.get_structural_hash() == config2.get_structural_hash())  # True

    config3 = ADict(lr=0.1, epochs='100', model='resnet50')  # epochs is str!
    print(config1.get_structural_hash() == config3.get_structural_hash())  # False

    # Load/save any format
    config = ADict.from_file('config.json')
    config.dump('config.yaml')

    # Safe updates
    config.update_if_absent(lr=0.01, scheduler='cosine')  # Only adds scheduler

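To make "hash based on keys + types, not values" concrete, here is a minimal standard-library sketch. It is not Beacon's implementation: the helper names `structure_of` and `structural_hash`, and the choice of SHA-256 over a sorted key/type description, are assumptions made purely for illustration.

```python
import hashlib


def structure_of(obj):
    """Describe a config by its keys and value types, ignoring the values."""
    if isinstance(obj, dict):
        # Sort keys so that insertion order does not affect the hash.
        return {key: structure_of(obj[key]) for key in sorted(obj)}
    return type(obj).__name__


def structural_hash(config):
    """Hash the shape of a config (hypothetical helper, not Beacon's API)."""
    description = repr(structure_of(config))
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


config1 = {"lr": 0.1, "epochs": 100, "model": "resnet50"}
config2 = {"lr": 0.01, "epochs": 200, "model": "resnet101"}
config3 = {"lr": 0.1, "epochs": "100", "model": "resnet50"}  # epochs is a str

same_structure = structural_hash(config1) == structural_hash(config2)
changed_structure = structural_hash(config1) == structural_hash(config3)
print(same_structure, changed_structure)  # True False
```

Two configs that differ only in values share a hash, while changing a value's type (or adding a key) produces a different one.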
The most loved feature - no more manual nesting:

    # ❌ Traditional way
    config = ADict()
    config.model = ADict()
    config.model.backbone = ADict()
    config.model.backbone.layers = [64, 128, 256]

    # ✅ With ADict.auto()
    config = ADict.auto()
    config.model.backbone.layers = [64, 128, 256]  # Just works!
    config.data.augmentation.brightness = 0.2

Perfect for Scope integration:

    from beacon.scope import Scope

    scope = Scope()

    @scope.observe(default=True)
    def config(cfg):
        # No pre-definition needed!
        cfg.training.optimizer.name = 'AdamW'
        cfg.training.optimizer.lr = 0.001
        cfg.model.encoder.num_layers = 12

Works with CLI:

    python train.py model.backbone.resnet.depth=50 data.batch_size=32


    # Attribute-style access
    config.lr = 0.1
    print(config.lr)  # Instead of config['lr']

    # Nested access
    print(config.model.backbone.type)  # Clean and readable

    # Conditional updates - merge configs safely
    base_config.update_if_absent(**experiment_config)

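The conditional-update behaviour can be pictured with an ordinary dictionary. The function below is an illustrative sketch built on `dict.setdefault`; it borrows the name `update_if_absent` from the text above but is not Beacon's code.

```python
def update_if_absent(config, **new_values):
    """Add only the keys that are missing; existing values stay untouched."""
    for key, value in new_values.items():
        config.setdefault(key, value)
    return config


base_config = {"lr": 0.1, "epochs": 100}
update_if_absent(base_config, lr=0.01, scheduler="cosine")
print(base_config)  # {'lr': 0.1, 'epochs': 100, 'scheduler': 'cosine'}
```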
Scope solves configuration complexity through priority-based merging and CLI integration. No more scattered config files or hard-coded parameters.

    Default Configs (priority=0)
        ↓
    Named Configs (priority=0+)
        ↓
    CLI Arguments (highest priority)
        ↓
    Lazy Configs (computed after CLI)


    from beacon.scope import Scope

    scope = Scope()

    @scope.observe()
    def my_config(config):
        config.dataset = 'cifar10'
        config.lr = 0.001
        config.batch_size = 32

    @scope
    def train(config):
        print(f"Training on {config.dataset}")
        # Your code here

    if __name__ == '__main__':
        train()


    @scope.observe(default=True)  # Always applied
    def defaults(cfg):
        cfg.lr = 0.001
        cfg.epochs = 100

    @scope.observe(priority=1)  # Applied after defaults
    def high_lr(cfg):
        cfg.lr = 0.01

    @scope.observe(priority=2)  # Applied last
    def long_training(cfg):
        cfg.epochs = 300

Named configs are applied only when they are listed on the command line:

    python train.py                        # lr=0.001, epochs=100
    python train.py high_lr                # lr=0.01, epochs=100
    python train.py high_lr long_training  # lr=0.01, epochs=300

Override any parameter from command line:

    # Simple values
    python train.py lr=0.01 batch_size=64

    # Nested configs
    python train.py model.backbone=%resnet101% model.depth=101

    # Lists and complex types
    python train.py layers=[64,128,256,512] dropout=0.5

    # Combine with named configs
    python train.py my_config lr=0.001 batch_size=128

Note: Wrap strings with % (e.g., %resnet101%) instead of quotes.
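The sketch below shows one plausible way to turn such `key=value` arguments into a nested config. It is an assumption-laden illustration, not Beacon's parser: the helpers `parse_override` and `set_dotted` are hypothetical, and values are converted with the standard library's `ast.literal_eval`.

```python
import ast


def parse_override(argument):
    """Split 'key=value' and convert the value to a Python object."""
    key, _, raw = argument.partition("=")
    if len(raw) >= 2 and raw.startswith("%") and raw.endswith("%"):
        return key, raw[1:-1]  # %text% marks a string literal
    return key, ast.literal_eval(raw)  # numbers, lists, booleans, ...


def set_dotted(config, dotted_key, value):
    """Assign into nested dicts, creating intermediate levels as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


config = {}
for arg in ["lr=0.01", "model.backbone=%resnet101%", "layers=[64,128,256,512]"]:
    key, value = parse_override(arg)
    set_dotted(config, key, value)
print(config)
# {'lr': 0.01, 'model': {'backbone': 'resnet101'}, 'layers': [64, 128, 256, 512]}
```

A dedicated string delimiter is useful because most shells strip quotes before the program sees them: `model='resnet50'` arrives as `model=resnet50`, which can no longer be told apart from a bare identifier.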
Sometimes you need configs that depend on other values set via CLI:

    @scope.observe()
    def base_config(cfg):
        cfg.model = 'resnet50'
        cfg.dataset = 'imagenet'

    @scope.observe(lazy=True)  # Evaluated AFTER CLI args
    def computed_config(cfg):
        # Adjust based on dataset
        if cfg.dataset == 'imagenet':
            cfg.num_classes = 1000
            cfg.image_size = 224
        elif cfg.dataset == 'cifar10':
            cfg.num_classes = 10
            cfg.image_size = 32


    python train.py dataset=%cifar10% computed_config
    # Results in: num_classes=10, image_size=32

Python 3.11+ Context Manager:

    @scope.observe()
    def my_config(cfg):
        cfg.model = 'resnet50'
        cfg.num_layers = 50

        with Scope.lazy():  # Evaluated after CLI
            if cfg.model == 'resnet101':
                cfg.num_layers = 101

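The merge order described in this section (defaults, named configs by priority, CLI arguments, then lazy configs) can be modelled in a few lines of plain Python. This is a simplified sketch under stated assumptions: the `resolve` function and the dictionary-based view records are invented for illustration and do not reflect Beacon's internals.

```python
def resolve(views, selected, cli_overrides):
    """Return (config, order) after applying views in a Scope-like order."""
    active = [v for v in views if v.get("default") or v["name"] in selected]
    eager = sorted(
        (v for v in active if not v.get("lazy")),
        # Defaults first, then named views by ascending priority.
        key=lambda v: (not v.get("default", False), v.get("priority", 0)),
    )
    lazy = [v for v in active if v.get("lazy")]

    config, order = {}, []
    for view in eager:
        view["fn"](config)
        order.append(view["name"])
    config.update(cli_overrides)  # CLI values override every eager view
    order.append("(CLI Inputs)")
    for view in lazy:  # lazy views run last and can read the CLI values
        view["fn"](config)
        order.append(view["name"])
    return config, order


def base_config(cfg):
    cfg["dataset"] = "imagenet"
    cfg["lr"] = 0.001


def high_lr(cfg):
    cfg["lr"] = 0.01


def computed_config(cfg):
    cfg["num_classes"] = 1000 if cfg["dataset"] == "imagenet" else 10


views = [
    {"name": "base_config", "fn": base_config, "default": True},
    {"name": "high_lr", "fn": high_lr, "priority": 1},
    {"name": "computed_config", "fn": computed_config, "lazy": True},
]

config, order = resolve(views, ["high_lr", "computed_config"], {"dataset": "cifar10"})
print(" → ".join(order))  # base_config → high_lr → (CLI Inputs) → computed_config
print(config)             # {'dataset': 'cifar10', 'lr': 0.01, 'num_classes': 10}
```

Because `computed_config` runs after the CLI override, it sees `dataset='cifar10'` and derives `num_classes=10`; had it run eagerly, it would have used the default dataset.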
Unique to Beacon: Manage completely separate configuration namespaces. Unlike Hydra's config groups, MultiScope provides true namespace isolation with independent priority systems.
| Challenge | Hydra's Approach | Beacon's MultiScope |
|---|---|---|
| Separate model/data configs | Config groups in one namespace | Independent scopes with own priorities |
| Avoid key collisions | Manual prefixing (`model.lr`, `train.lr`) | Automatic namespace isolation |
| Different teams/modules | Single config file | Each scope can be owned separately |
| Priority conflicts | Global priority system | Per-scope priority system |

    from beacon.scope import Scope, MultiScope

    model_scope = Scope(name='model')
    data_scope = Scope(name='data')
    scope = MultiScope(model_scope, data_scope)

    @model_scope.observe(default=True)
    def model_config(model):
        model.backbone = 'resnet50'
        model.pretrained = True

    @data_scope.observe(default=True)
    def data_config(data):
        data.dataset = 'cifar10'
        data.batch_size = 32

    @scope
    def train(model, data):  # Named parameters match scope names
        print(f"Training {model.backbone} on {data.dataset}")

Different team members can own different scopes without conflicts:

    # team_model.py - ML team owns this
    from beacon.scope import Scope

    model_scope = Scope(name='model')

    @model_scope.observe(default=True)
    def resnet_default(model):
        model.backbone = 'resnet50'
        model.lr = 0.1  # Model-specific learning rate

    @model_scope.observe(priority=1)
    def resnet101(model):
        model.backbone = 'resnet101'
        model.lr = 0.05  # Different lr for bigger model


    # team_data.py - Data team owns this
    from beacon.scope import Scope

    data_scope = Scope(name='data')

    @data_scope.observe(default=True)
    def cifar_default(data):
        data.dataset = 'cifar10'
        data.lr = 0.001  # Data augmentation learning rate (no conflict!)

    @data_scope.observe(priority=1)
    def imagenet(data):
        data.dataset = 'imagenet'
        data.workers = 16


    # train.py - Integration point
    from beacon.scope import MultiScope
    from team_model import model_scope
    from team_data import data_scope

    scope = MultiScope(model_scope, data_scope)

    @scope
    def train(model, data):
        # Both have 'lr' but in separate namespaces!
        print(f"Model LR: {model.lr}, Data LR: {data.lr}")

Key advantage: model.lr and data.lr are completely independent. No need for naming conventions like model_lr vs data_lr.
Override each scope independently:

    # Override model scope only
    python train.py model.backbone=%resnet101%

    # Override data scope only
    python train.py data.dataset=%imagenet%

    # Override both
    python train.py model.backbone=%resnet101% data.dataset=%imagenet%

    # Call named configs per scope
    python train.py resnet101 imagenet

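One way to picture per-scope overrides is as routing each `scope.key=value` argument to the namespace named by its prefix. The snippet below is a toy model of that idea with a hypothetical `route_overrides` helper; it does not describe how MultiScope is implemented, and it leaves values as raw strings for brevity.

```python
def route_overrides(arguments, scope_names):
    """Group 'scope.key=value' strings by their leading scope name."""
    routed = {name: {} for name in scope_names}
    for argument in arguments:
        target, _, value = argument.partition("=")
        scope_name, _, key = target.partition(".")
        if scope_name not in routed:
            raise KeyError(f"unknown scope: {scope_name}")
        routed[scope_name][key] = value  # value conversion omitted in this sketch
    return routed


# Two independent namespaces that both define 'lr'.
model = {"backbone": "resnet50", "lr": 0.1}
data = {"dataset": "cifar10", "lr": 0.001}

routed = route_overrides(["model.lr=0.05", "data.dataset=imagenet"], ["model", "data"])
model.update(routed["model"])
data.update(routed["data"])
print(model)  # {'backbone': 'resnet50', 'lr': '0.05'}
print(data)   # {'dataset': 'imagenet', 'lr': 0.001}
```

Overriding `model.lr` leaves `data.lr` untouched because each namespace is a separate mapping.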
Beacon supports importing configs from multiple frameworks:

    @scope.observe()
    def load_external(config):
        # Load from any format
        config.load('experiments/baseline.json')
        config.load('models/resnet.yaml')

        # Export to any format
        config.dump('output/final_config.toml')

        # Import OpenMMLab configs - handles _base_ inheritance automatically
        config.load_mm_config('mmdet_configs/faster_rcnn.py')

OpenMMLab compatibility is built-in:
- Automatically resolves `_base_` inheritance chains
- Supports `_delete_` keys for config overriding
- Makes migration from MMDetection/MMSegmentation/etc. seamless
Hydra-style config composition is also built-in via `compose_hierarchy`:

    from beacon.adict import ADict

    # Hydra-style directory structure:
    # configs/
    # ├── config.yaml          # base config
    # ├── model/
    # │   ├── resnet50.yaml
    # │   └── resnet101.yaml
    # └── data/
    #     ├── cifar10.yaml
    #     └── imagenet.yaml

    config = ADict.compose_hierarchy(
        root='configs',
        config_filename='config',
        select={
            'model': 'resnet50',  # or ['resnet50', 'resnet101'] for multiple
            'data': 'imagenet'
        },
        overrides={
            'model.lr': 0.01,
            'data.batch_size': 64
        },
        required=['model.backbone', 'data.dataset'],  # Validation
        on_missing='warn'  # or 'error'
    )

Key features:
- Config groups (model/, data/, optimizer/, etc.)
- Automatic file discovery (tries .yaml, .json, .toml, .xyz)
- Dotted overrides (`model.lr=0.01`)
- Required key validation
- Flexible error handling
Mix Beacon with existing argparse code:

    import argparse

    from beacon.scope import Scope

    scope = Scope(use_external_parser=True)

    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu', type=int, default=0)
    parser.add_argument('--seed', type=int, default=42)

    @scope.observe(default=True)
    def config(cfg):
        cfg.lr = 0.001
        cfg.batch_size = 32

    @scope
    def train(cfg):
        print(f"GPU: {cfg.gpu}, LR: {cfg.lr}")

    if __name__ == '__main__':
        parser.parse_args()  # Merges argparse with scope
        train()

One of Beacon's most powerful features: Auto-generate documentation AND visualize the exact order of configuration application.

    @scope.manual
    def config_docs(cfg):
        cfg.lr = 'Learning rate for optimizer'
        cfg.batch_size = 'Number of samples per batch'
        cfg.model = 'Model architecture (resnet50, resnet101, etc.)'

Run the script with the `manual` command:

    python train.py manual

Output:

    --------------------------------------------------
    [Scope "config"]
    (The Applying Order of Views)
    defaults → (CLI Inputs) → lazy_config → main
    (User Manuals)
    config.lr: Learning rate for optimizer
    config.batch_size: Number of samples per batch
    config.model: Model architecture (resnet50, resnet101, etc.)
    --------------------------------------------------

The applying order visualization shows you exactly how your configs are merged:
- Which config functions are applied (in order)
- When CLI inputs override values
- Where lazy configs are evaluated
- The final function that uses the config
This prevents configuration bugs by making the merge order explicit and debuggable.
For complex projects with multiple scopes, `manual` shows each scope separately:

    from beacon.scope import Scope, MultiScope

    model_scope = Scope(name='model')
    train_scope = Scope(name='train')
    scope = MultiScope(model_scope, train_scope)

    @model_scope.observe(default=True)
    def model_defaults(model):
        model.backbone = 'resnet50'
        model.num_layers = 50

    @model_scope.observe(priority=1)
    def model_advanced(model):
        model.pretrained = True

    @model_scope.observe(lazy=True)
    def model_lazy(model):
        if model.backbone == 'resnet101':
            model.num_layers = 101

    @train_scope.observe(default=True)
    def train_defaults(train):
        train.lr = 0.001
        train.epochs = 100

    @model_scope.manual
    def model_docs(model):
        model.backbone = 'Model backbone architecture'
        model.num_layers = 'Number of layers in the model'

    @train_scope.manual
    def train_docs(train):
        train.lr = 'Learning rate for optimizer'
        train.epochs = 'Total training epochs'

    @scope
    def main(model, train):
        print(f"Training {model.backbone} with lr={train.lr}")

    if __name__ == '__main__':
        main()

Run:

    python train.py manual

Output:

    --------------------------------------------------
    [Scope "model"]
    (The Applying Order of Views)
    model_defaults → model_advanced → (CLI Inputs) → model_lazy → main
    (User Manuals)
    model.backbone: Model backbone architecture
    model.num_layers: Number of layers in the model
    --------------------------------------------------
    [Scope "train"]
    (The Applying Order of Views)
    train_defaults → (CLI Inputs) → main
    (User Manuals)
    train.lr: Learning rate for optimizer
    train.epochs: Total training epochs
    --------------------------------------------------

This is especially valuable when debugging why a config value isn't what you expect:

    @scope.observe(default=True)
    def defaults(cfg):
        cfg.lr = 0.001

    @scope.observe(priority=1)
    def experiment_config(cfg):
        cfg.lr = 0.01

    @scope.observe(priority=2)
    def another_config(cfg):
        cfg.lr = 0.1

    @scope.observe(lazy=True)
    def adaptive_lr(cfg):
        if cfg.batch_size > 64:
            cfg.lr = cfg.lr * 2

When you run `python train.py manual`, you see:

    (The Applying Order of Views)
    defaults → experiment_config → another_config → (CLI Inputs) → adaptive_lr → main

Now it's clear why `lr` is 0.1 and not 0.01: `another_config` is applied after `experiment_config` and overwrites its value.
Lightweight experiment tracking using SQLite - no external services, no setup complexity.
- Zero Setup: Just a SQLite file, no servers
- Full History: Track all runs, metrics, and artifacts
- Smart Search: Find similar experiments by config structure
- Code Versioning: Track code changes via fingerprints

    Project (my_ml_project)
    ├── Experiment (run_1)
    │   ├── config: {...}
    │   ├── structural_hash: "abc123..."
    │   ├── Metrics: [loss, accuracy, ...]
    │   ├── Artifacts: [model.pt, plots/*, ...]
    │   └── Fingerprints: [model_forward, train_step, ...]
    ├── Experiment (run_2)
    └── ...


    from beacon.db_routers.sql.manager import SQLLogger
    from beacon.adict import ADict

    # Setup config
    config = ADict(
        experiment=ADict(
            project_name='image_classification',
            sql=ADict(db_path='sqlite:///experiments.db')
        ),
        # Your hyperparameters
        lr=0.001,
        batch_size=32,
        model='resnet50'
    )

    # Create logger
    logger = SQLLogger(config)

    # Start experiment run
    run_id = logger.run(tags=['baseline', 'resnet50', 'cifar10'])

    # Training loop
    best_acc, best_epoch = float('-inf'), None
    for epoch in range(100):
        # Your training code
        train_loss = train_one_epoch()
        val_acc = validate()

        # Log metrics
        logger.log_metric('train_loss', train_loss, step=epoch)
        logger.log_metric('val_accuracy', val_acc, step=epoch)

        # Remember the best epoch for the artifact metadata below
        if val_acc > best_acc:
            best_acc, best_epoch = val_acc, epoch

    # Log artifacts
    logger.log_artifact(run_id, 'checkpoints/model_best.pt',
                        data_type='model', metadata={'epoch': best_epoch})

    # Finish run
    logger.finish(status='completed')


    from beacon.db_routers.sql.manager import SQLFinder

    finder = SQLFinder(config)

    # Get all runs in project
    runs = finder.get_runs_in_project('image_classification')
    for run in runs:
        print(f"Run {run.id}: {run.config.model} - {run.status}")

    # Find best performing run
    best_run = finder.find_best_run(
        project_name='image_classification',
        metric_key='val_accuracy',
        mode='max'  # or 'min' for loss
    )
    print(f"Best config: {best_run.config}")

    # Find similar experiments (same config structure)
    similar = finder.find_similar_runs(run_id=123)
    print(f"Found {len(similar)} runs with similar config structure")

    # Trace statistics (code fingerprints)
    stats = finder.get_trace_statistics('image_classification', trace_id='model_forward')
    print(f"Model forward pass has {stats['static_trace_versions']} versions")

To compare the impact of hyperparameters across runs:

    # Compare hyperparameter impact
    finder = SQLFinder(config)
    runs = finder.get_runs_in_project('my_project')

    for run in runs:
        # Get the best validation accuracy logged for this run
        final_metrics = [m for m in run.metrics if m.key == 'val_accuracy']
        best_acc = max(m.value for m in final_metrics) if final_metrics else 0
        print(f"LR: {run.config.lr}, Batch: {run.config.batch_size} → Acc: {best_acc:.2%}")

| Feature | Description |
|---|---|
| Structural Hash | Auto-track config structure changes |
| Metric Logging | Time-series metrics with step tracking |
| Artifact Management | Track model checkpoints, plots, data files |
| Fingerprint Tracking | Version control for code (static & runtime) |
| Smart Search | Find similar configs, best runs, statistics |
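To show why a single SQLite file is enough for this kind of tracking, here is a self-contained sketch using only Python's built-in `sqlite3` module. The two-table schema and the helpers `start_run`, `log_metric`, and `best_run` are assumptions chosen for the example; Beacon's actual schema and API may differ.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist runs
conn.executescript("""
    CREATE TABLE runs (id INTEGER PRIMARY KEY, config TEXT, status TEXT);
    CREATE TABLE metrics (run_id INTEGER, key TEXT, value REAL, step INTEGER);
""")


def start_run(config):
    """Store the config as JSON and return the new run id."""
    cursor = conn.execute(
        "INSERT INTO runs (config, status) VALUES (?, ?)",
        (json.dumps(config), "running"),
    )
    return cursor.lastrowid


def log_metric(run_id, key, value, step):
    conn.execute(
        "INSERT INTO metrics VALUES (?, ?, ?, ?)", (run_id, key, value, step)
    )


def best_run(metric_key):
    """Return (run_id, best value) for the run with the highest metric."""
    return conn.execute(
        "SELECT run_id, MAX(value) FROM metrics WHERE key = ? "
        "GROUP BY run_id ORDER BY MAX(value) DESC LIMIT 1",
        (metric_key,),
    ).fetchone()


first = start_run({"lr": 0.001})
second = start_run({"lr": 0.01})
log_metric(first, "val_accuracy", 0.81, step=0)
log_metric(first, "val_accuracy", 0.88, step=1)
log_metric(second, "val_accuracy", 0.92, step=0)
print(best_run("val_accuracy"))  # (2, 0.92)
```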
Built-in Hyperband algorithm for efficient hyperparameter search with early stopping.
Beacon's hyperopt module is built for extensibility and reusability:
| Component | Purpose | Benefit |
|---|---|---|
| `GridSpaceMixIn` | Parameter sampling logic | Reusable across different algorithms |
| `HyperOpt` | Base optimization class | Easy to implement custom strategies |
| `DistributedMixIn` | Distributed training support | Optional, composable |
This design makes it straightforward to implement custom search algorithms:

    import random

    from beacon.hyperopt.base import GridSpaceMixIn, HyperOpt

    class RandomSearch(GridSpaceMixIn, HyperOpt):
        def main(self, func):
            # Reuse GridSpaceMixIn.prepare_distributions()
            configs = self.prepare_distributions(self.config, self.search_spaces)

            # Implement random sampling
            random.shuffle(configs)

            results = []
            for config in configs[:10]:  # Sample 10 random configs
                metric = func(config)
                results.append((config, metric))

            # Return the (config, metric) pair with the highest metric
            return max(results, key=lambda x: x[1])

Hyperband uses successive halving:
- Start with many configs, train briefly
- Keep top performers, discard poor ones
- Train survivors longer
- Repeat until one winner remains
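The steps above can be sketched in plain Python. This is a generic successive-halving loop written for illustration, not Beacon's `HyperBand` class: the function name, the `keep_fraction` and `min_survivors` parameters, and the toy scoring function are all assumptions.

```python
import math


def successive_halving(configs, evaluate, keep_fraction=0.3, min_survivors=3):
    """Repeatedly score configs and keep the best fraction (higher is better)."""
    survivors = list(configs)
    round_index = 0
    while len(survivors) > min_survivors:
        scored = sorted(
            survivors, key=lambda c: evaluate(c, round_index), reverse=True
        )
        keep = max(min_survivors, math.ceil(len(scored) * keep_fraction))
        keep = min(keep, len(scored) - 1)  # always drop at least one config
        survivors = scored[:keep]
        round_index += 1
    # Final comparison among the few remaining configs.
    return max(survivors, key=lambda c: evaluate(c, round_index))


def toy_score(config, round_index):
    # Hypothetical objective that peaks at lr = 1e-3. A real objective would
    # train for longer as round_index grows and return a validation metric.
    return -abs(math.log10(config["lr"]) + 3)


candidates = [{"lr": 10 ** -exponent} for exponent in range(1, 7)]
best = successive_halving(candidates, toy_score)
print(best)  # {'lr': 0.001}
```

In a real search, `evaluate` would use `round_index` to decide how long to train, so that early rounds are cheap and only promising configs receive a large budget.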

    from beacon.adict import ADict
    from beacon.hyperopt.hyperband import HyperBand
    from beacon.scope import Scope

    scope = Scope()

    # Define search space
    search_spaces = ADict(
        lr=ADict(
            param_type='FLOAT',
            param_range=(1e-5, 1e-1),
            num_samples=20,
            space_type='LOG'  # Logarithmic spacing
        ),
        batch_size=ADict(
            param_type='INTEGER',
            param_range=(16, 128),
            num_samples=5,
            space_type='LOG'
        ),
        model=ADict(
            param_type='CATEGORY',
            categories=['resnet50', 'resnet101', 'efficientnet_b0']
        )
    )

    # Create Hyperband optimizer
    hyperband = HyperBand(
        scope,
        search_spaces,
        halving_rate=0.3,   # Keep top 30% each round
        num_min_samples=3,  # Stop when <= 3 configs remain
        mode='max'          # Maximize metric (use 'min' for loss)
    )

    @hyperband.main
    def train(config):
        # Your training code
        model = create_model(config.model)
        optimizer = Adam(lr=config.lr)

        # Use __num_halved__ for early stopping
        num_epochs = compute_epochs(config.__num_halved__)

        # Train and return metric
        val_acc = train_and_evaluate(model, optimizer, num_epochs)
        return val_acc

    if __name__ == '__main__':
        # Run hyperparameter search
        best_result = train()
        print(f"Best config: {best_result.config}")
        print(f"Best metric: {best_result.metric}")

Let Hyperband compute optimal training steps:

    hyperband = HyperBand(scope, search_spaces, halving_rate=0.3, num_min_samples=4)

    max_steps = 100000
    steps_per_generation = hyperband.compute_optimized_initial_training_steps(max_steps)
    # Example output: [27, 88, 292, 972, 3240, 10800, 36000, 120000]

    # Use in training
    @hyperband.main
    def train(config):
        generation = config.__num_halved__
        num_steps = steps_per_generation[generation]
        metric = train_for_n_steps(num_steps)
        return metric

| Type | Description | Example |
|---|---|---|
| `FLOAT` | Continuous values | Learning rate, dropout |
| `INTEGER` | Discrete integers | Batch size, num layers |
| `CATEGORY` | Categorical choices | Model type, optimizer |
Space types:
- `LOG`: Logarithmic spacing (good for learning rates)
- `LINEAR`: Linear spacing (default)
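The difference between the two space types is easiest to see numerically. The helpers below are a generic illustration of evenly spaced versus logarithmically spaced grids; how Beacon itself draws samples from `param_range` is not specified here, so treat `linear_space` and `log_space` as hypothetical.

```python
import math


def linear_space(low, high, num_samples):
    """Evenly spaced values from low to high (requires num_samples >= 2)."""
    step = (high - low) / (num_samples - 1)
    return [low + i * step for i in range(num_samples)]


def log_space(low, high, num_samples):
    """Values evenly spaced in log10, so each decade gets equal coverage."""
    exponents = linear_space(math.log10(low), math.log10(high), num_samples)
    return [10 ** exponent for exponent in exponents]


print(linear_space(16, 128, 5))  # [16.0, 44.0, 72.0, 100.0, 128.0]
print(log_space(1e-5, 1e-1, 5))  # roughly 1e-05, 1e-04, 1e-03, 1e-02, 1e-01
```

A linear grid over `(1e-5, 1e-1)` would place almost every sample above `1e-2`, which is why logarithmic spacing is the usual choice for learning rates.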
Beacon supports distributed hyperparameter optimization out of the box:

    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    from beacon.hyperopt.hyperband import DistributedHyperBand

    # Initialize distributed training
    dist.init_process_group(backend='nccl')
    rank = dist.get_rank()
    world_size = dist.get_world_size()

    # Create distributed hyperband (scope and search_spaces as defined earlier)
    hyperband = DistributedHyperBand(
        scope,
        search_spaces,
        halving_rate=0.3,
        num_min_samples=3,
        mode='max',
        rank=rank,
        world_size=world_size,
        backend='pytorch'
    )

    @hyperband.main
    def train(config):
        # Your distributed training code
        model = create_model(config)
        model = DDP(model, device_ids=[rank])
        metric = train_and_evaluate(model)
        return metric

    if __name__ == '__main__':
        result = train()
        if rank == 0:
            print(f"Best config: {result.config}")

Key features:
- Automatic work distribution across GPUs
- Synchronized config selection via `broadcast_object_from_root`
- Results aggregation with `all_gather_object`
- Compatible with PyTorch DDP, FSDP, DeepSpeed

    my_project/
    ├── configs/
    │   ├── default.py    # Default config with @scope.observe(default=True)
    │   ├── models.py     # Model-specific configs
    │   └── datasets.py   # Dataset configs
    ├── train.py          # Main training script
    ├── experiments.db    # SQLite experiment tracking
    └── experiments/
        ├── run_001/
        │   ├── checkpoints/
        │   └── logs/
        └── run_002/


    # configs/default.py
    from beacon.adict import ADict
    from beacon.scope import Scope

    scope = Scope()

    @scope.observe(default=True)
    def defaults(cfg):
        # Data
        cfg.data = ADict(
            dataset='cifar10',
            batch_size=32,
            num_workers=4
        )

        # Model
        cfg.model = ADict(
            backbone='resnet50',
            pretrained=True
        )

        # Training
        cfg.train = ADict(
            lr=0.001,
            epochs=100,
            optimizer='adam'
        )

        # Experiment tracking
        cfg.experiment = ADict(
            project_name='my_project',
            sql=ADict(db_path='sqlite:///experiments.db')
        )


    # train.py
    from beacon.db_routers.sql.manager import SQLLogger
    from configs.default import scope

    @scope
    def train(cfg):
        # Setup experiment tracking
        logger = SQLLogger(cfg)
        run_id = logger.run(tags=[cfg.model.backbone, cfg.data.dataset])

        try:
            # Training loop
            for epoch in range(cfg.train.epochs):
                loss = train_epoch()
                acc = validate()
                logger.log_metric('loss', loss, epoch)
                logger.log_metric('accuracy', acc, epoch)
            logger.finish(status='completed')
        except Exception:
            # Record the failure, then re-raise with the original traceback
            logger.finish(status='failed')
            raise

    if __name__ == '__main__':
        train()

- ✅ Use structural hashing to track config changes
- ✅ Log all hyperparameters to SQLLogger
- ✅ Tag experiments with meaningful labels
- ✅ Track artifacts (checkpoints, plots)
- ✅ Use lazy configs for derived parameters
- ✅ Document configs with `@scope.manual`
- Python >= 3.7
- SQLAlchemy (for SQL Tracker)
- PyYAML, toml (for config serialization)
See pyproject.toml for full dependencies.
MIT License
Contributions are welcome! Please feel free to submit issues or pull requests.

    git clone https://github.com/yourusername/beacon.git
    cd beacon
    pip install -e .

| Feature | Beacon | MLflow | W&B | Hydra |
|---|---|---|---|---|
| Core Features | ||||
| Zero setup | ✅ | ❌ | ❌ | ✅ |
| Offline-first | ✅ | Partial | ❌ | ✅ |
| Config priority system | ✅ Explicit | Partial (Tags) | Partial (Run params) | ✅ Override |
| True namespace isolation | ✅ MultiScope | ❌ | ❌ | ❌ Config groups only |
| Config merge visualization | ✅ `manual` | ❌ | ❌ | Partial (--cfg tree) |
| Structural hashing | ✅ | ❌ | ❌ | ❌ |
| Built-in HyperOpt | ✅ Hyperband | ❌ | ✅ Sweeps | Plugins (Optuna) |
| CLI-first design | ✅ | ❌ | ❌ | ✅ |
| Compatibility | ||||
| Framework agnostic | ✅ | ✅ | ✅ | ✅ |
| Distributed training | ✅ Native + DDP/FSDP(1) | ✅ | ✅ | ✅ |
| Distributed HyperOpt | ✅ `DistributedHyperBand` | ❌ | Partial | Plugins |
| Hydra-style composition | ✅ `compose_hierarchy` | N/A | N/A | Native |
| OpenMMLab configs | ✅ `load_mm_config` | ❌ | ❌ | ❌ |
| Visualization & UI | ||||
| Web dashboard | 🔜 Planned | ✅ | ✅ | ❌ |
| Real-time metrics | 🔜 Planned | ✅ | ✅ | ❌ |
| Interactive plots | 🔜 Planned | ✅ | ✅ | ❌ |
| Metric comparison UI | 🔜 Planned | ✅ | ✅ | ❌ |
| Advanced Features | ||||
| Model registry | 🔜 Planned | ✅ | ✅ | ❌ |
| Dataset versioning | 🔜 Planned | Partial | ✅ | ❌ |
| Team collaboration | ✅ MultiScope(2) | ✅ Platform | ✅ Platform | ❌ |
(1) Native distributed hyperparameter optimization via DistributedHyperBand. Regular training is compatible with any distributed framework (DDP, FSDP, DeepSpeed) - just integrate logging, no special code needed.
(2) Team collaboration via MultiScope: separate config ownership per team (e.g., Team A owns model scope, Team B owns data scope) without naming conflicts.
Note on config compatibility: Beacon provides built-in support for other config frameworks:
- Hydra-style composition: `compose_hierarchy()` supports config groups, select, and overrides
- OpenMMLab configs: `load_mm_config()` handles `_base_` inheritance and `_delete_` keys
- Migration from existing projects: import your existing configs and continue from there
While Hydra is excellent for config composition, Beacon provides unique features:
| Aspect | Hydra | Beacon |
|---|---|---|
| Namespace isolation | Config groups share namespace | ✅ MultiScope with independent namespaces (no key collisions) |
| Priority system | Single global override system | ✅ Per-scope priority + lazy evaluation |
| Config merge debugging | Tree view (`--cfg`): shows final config | ✅ `manual` command: shows merge order & execution flow |
| Experiment tracking | Requires external tools (MLflow/W&B) | ✅ Built-in SQL tracker |
| Team workflow | Single config file ownership | ✅ Separate scope ownership per team(3) |
(3) Example: Team A defines model_scope, Team B defines data_scope, both can use model.lr and data.lr without conflicts.
Use Beacon over Hydra when:
- Multiple teams need independent config ownership (MultiScope)
- You want to avoid key collision issues (no manual prefixing needed)
- You need to debug why a config value was set (`manual` command)
- You want experiment tracking without adding MLflow/W&B
- You're migrating from OpenMMLab projects
Use Hydra when:
- You have very deep config hierarchies with complex inheritance
- You prefer YAML over Python
- You need the mature plugin ecosystem (Ray, Joblib, etc.)
- You don't need namespace isolation
Why not both?
- Beacon has built-in Hydra-style composition via `compose_hierarchy()`
- You can use Hydra's directory structure and config groups directly in Beacon
- Get MultiScope + experiment tracking + merge debugging on top of Hydra's composition
- For composition-only usage, migration mostly means replacing `hydra.compose()` with `ADict.compose_hierarchy()`
Beacon is for you if:
- You want lightweight, offline-first experiment tracking
- You need true namespace isolation for team collaboration
- You want to debug config merge order visually (unique to Beacon!)
- You prefer simple Python over complex frameworks
- You want reproducibility without overhead