Each robot independently runs one of several monocular front-ends (MASt3R-SLAM, VGGT-SLAM 2.0, Pi3, or LoGeR), each of which produces scale-ambiguous output. MR.ScaleMaster fuses the per-robot trajectories and point clouds into a single, metrically consistent global map using Sim(3) loop-closure constraints optimized with g2o.
graph LR
R1[Robot 1] --> LC[Heterogeneous Front-end]
R2[Robot 2] --> LC
RN[Robot N] --> LC
LC --> G2O[Sim3 Graph Optimization]
G2O --> MAP[Consistent Global Map]
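To make the Sim(3) idea concrete, here is a minimal NumPy sketch of how a similarity transform (scale `s`, rotation `R`, translation `t`) maps one robot's scale-ambiguous points into a shared frame. The helper names (`make_sim3`, `apply_sim3`, `sim3_scale`) are hypothetical and are not part of the MR.ScaleMaster API; the 4x4 layout with `s * R` in the upper-left block is assumed to follow the "scale encoded in rotation" convention used in the dataset format described later.

```python
import numpy as np

def make_sim3(scale, rotation, translation):
    """Pack (s, R, t) into a 4x4 matrix whose upper-left block is s * R."""
    T = np.eye(4)
    T[:3, :3] = scale * np.asarray(rotation, dtype=float)
    T[:3, 3] = np.asarray(translation, dtype=float)
    return T

def apply_sim3(T, points):
    """Map (N, 3) points through x -> s * R @ x + t."""
    points = np.asarray(points, dtype=float)
    return points @ T[:3, :3].T + T[:3, 3]

def sim3_scale(T):
    """Recover s from the s * R block (det(s * R) = s**3 for a proper rotation)."""
    return float(np.cbrt(np.linalg.det(T[:3, :3])))

# Illustrative values: a robot map at half the metric size, rotated 90 degrees
# about z, and offset by one metre along x.
Rz = np.array([[0.0, -1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0]])
T_global_from_robot = make_sim3(2.0, Rz, [1.0, 0.0, 0.0])
local_points = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0]])
global_points = apply_sim3(T_global_from_robot, local_points)
print(global_points)
print("scale:", sim3_scale(T_global_from_robot))
```

In the actual system the per-robot transforms are estimated by the graph optimization; this sketch only shows what applying one such transform means.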
| Component | Specification |
|---|---|
| OS | Ubuntu 22.04 / 24.04 |
| GPU | CUDA-capable (tested on RTX 5090) |
| Python | 3.11+ |
git clone git@github.com:team-aprl/MR.ScaleMaster.git
cd MR.ScaleMaster
💡 Tip: `scripts/setup.bash` (environment + build) and `scripts/download_checkpoint.sh` (checkpoints) are independent; we recommend running them in parallel in two separate terminals to save time.
🖥️ Terminal 1: Environment & Build
./scripts/setup.bash
🖥️ Terminal 2: Checkpoint Download
./scripts/download_checkpoint.sh
- Installs uv (if missing)
- Creates a Python 3.11 virtual environment (`.venv/`)
- Detects your CUDA version and installs the matching PyTorch
- Clones and installs MASt3R-SLAM (into `../MASt3R-SLAM/`)
- Downloads model checkpoints (~3.0 GB)
- Installs all Python dependencies
- Builds the C++ Sim(3) optimizer
# Run
./scripts/run.sh examples/Exp1 --fps 2.0 --config config/Exp1.yaml
./scripts/run.sh examples/Exp2 --fps 2.0 --config config/Exp2.yaml
./scripts/run.sh examples/kitti_00 --fps 1.0 --config config/KITTI.yaml  # If you downloaded KITTI datasets.
💡 Tip: `scripts/run.sh` activates the virtual environment and sets `PYTHONPATH` automatically; no manual `source` or `export` is needed.
| Argument | Default | Description |
|---|---|---|
| `data_root` | `./examples/Exp1` | Path to dataset folder |
| `--fps` | `1.0` | Playback rate (keyframes per second per robot) |
🖼️ A GUI will open. Click ▶ Start to begin loading. Stop pauses, and ▶ Start resumes from where it left off.
No calibration required. Just bring your videos.
graph LR
V["📹 video.mp4"] -->|"÷ N"| R1["robot_01"]
V -->|"÷ N"| R2["robot_02"]
V -->|"÷ N"| RN["robot_N"]
R1 --> L1["LoGeR"]
R2 --> L2["LoGeR"]
RN --> LN["LoGeR"]
L1 -->|"pose + pcd"| M["MR.ScaleMaster"]
L2 -->|"pose + pcd"| M
LN -->|"pose + pcd"| M
M --> O["🗺️ Global Map"]
./scripts/install_loger.sh
# Usage: ./scripts/do_collaborative_mapping.sh <input_video> <num_robots> [options]
# Example:
./scripts/do_collaborative_mapping.sh your_videos/your_video.mp4 4
Open config/default.yaml and adjust a few parameters if needed (frontend, image resolution, etc.).
The script automatically splits the input video into one keyframe sequence per robot, runs LoGeR on each sequence, and fuses the results with MR.ScaleMaster.
💡 Tip: For longer videos, increase `subsample` in `scripts/do_collaborative_mapping.sh` (default: `2`) to speed up processing.
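As a rough illustration of the "÷ N" split in the diagram above, the sketch below divides a video's frame indices among robots. It assumes that `subsample` acts as a fixed frame stride and that each robot receives one contiguous, nearly equal segment; the real script may split differently, and `split_frames` is a hypothetical helper, not a function from the repository.

```python
def split_frames(num_frames, num_robots, subsample=2):
    """Return one list of frame indices per robot.

    Assumption: frames are first thinned with a fixed stride, then cut into
    contiguous, nearly equal segments, one per robot.
    """
    if num_robots < 1 or subsample < 1:
        raise ValueError("num_robots and subsample must be >= 1")
    kept = list(range(0, num_frames, subsample))
    base, extra = divmod(len(kept), num_robots)
    segments, start = [], 0
    for robot in range(num_robots):
        # The first `extra` robots take one additional frame each.
        size = base + (1 if robot < extra else 0)
        segments.append(kept[start:start + size])
        start += size
    return segments

segments = split_frames(num_frames=20, num_robots=4, subsample=2)
for i, seg in enumerate(segments, start=1):
    print(f"robot_{i:02d}: frames {seg}")
```

Under this assumption, doubling `subsample` halves the number of frames each front-end has to process, which is why a larger value speeds up long videos.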
Exp1 and Exp2 are included in the repository.
Additional KITTI sequences are available on 🤗 Hugging Face.
source .venv/mrscalemaster/bin/activate

# Each archive is extracted in a subshell so the working directory stays at the repository root.
hf download --repo-type dataset hyoseokju/examples kitti_00.tar.gz --local-dir examples/
(cd examples && tar -xzf kitti_00.tar.gz && rm kitti_00.tar.gz)

hf download --repo-type dataset hyoseokju/examples kitti_02.tar.gz --local-dir examples/
(cd examples && tar -xzf kitti_02.tar.gz && rm kitti_02.tar.gz)

hf download --repo-type dataset hyoseokju/examples kitti_05.tar.gz --local-dir examples/
(cd examples && tar -xzf kitti_05.tar.gz && rm kitti_05.tar.gz)

hf download --repo-type dataset hyoseokju/examples kitti_07.tar.gz --local-dir examples/
(cd examples && tar -xzf kitti_07.tar.gz && rm kitti_07.tar.gz)

hf download --repo-type dataset hyoseokju/examples kitti_08.tar.gz --local-dir examples/
(cd examples && tar -xzf kitti_08.tar.gz && rm kitti_08.tar.gz)
Dataset Format
examples/
└── kitti_00/
    ├── robot_01/
    │   ├── kf_000000/
    │   │   ├── image.png
    │   │   ├── pose_4x4.npy          # Sim(3) pose (4×4, scale encoded in rotation)
    │   │   ├── pointcloud_local.npz  # keys: xyz (N,3), colors (N,3) uint8
    │   │   └── pose_tum.txt          # timestamp tx ty tz qx qy qz qw
    │   └── kf_000001/ ...
    ├── robot_02/ ...
    └── robot_03/ ...
Arbitrary folder names (e.g., go2, hand_held_01) are also supported; they are sorted alphabetically and assigned robot IDs automatically.
🎯 Bring your own data: Prepare your data in the format above and MR.ScaleMaster will work out of the box.
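When preparing your own data, it can help to check that a folder matches this layout before launching the pipeline. The sketch below scans a dataset root, assigns robot IDs in alphabetical folder order, and loads one keyframe. It is an illustration only: `scan_dataset` and `load_keyframe` are hypothetical names, the 1-based ID numbering is an assumption, and the scale is recovered under the assumption that the upper-left 3x3 block of `pose_4x4.npy` equals `s * R`. The demo writes a tiny synthetic dataset to a temporary directory so that it runs without any real data.

```python
from pathlib import Path
import tempfile

import numpy as np

def scan_dataset(data_root):
    """Map robot IDs (assumed 1-based, alphabetical folder order) to keyframe dirs."""
    root = Path(data_root)
    robots = sorted((p for p in root.iterdir() if p.is_dir()), key=lambda p: p.name)
    return {
        robot_id: sorted((kf for kf in robot_dir.iterdir() if kf.is_dir()),
                         key=lambda p: p.name)
        for robot_id, robot_dir in enumerate(robots, start=1)
    }

def load_keyframe(kf_dir):
    """Load one keyframe's Sim(3) pose and local point cloud."""
    kf_dir = Path(kf_dir)
    pose = np.load(kf_dir / "pose_4x4.npy")
    with np.load(kf_dir / "pointcloud_local.npz") as cloud:
        xyz, colors = cloud["xyz"], cloud["colors"]
    # det(s * R) = s**3, so the cube root of the determinant gives the scale.
    scale = float(np.cbrt(np.linalg.det(pose[:3, :3])))
    return {"pose": pose, "scale": scale, "xyz": xyz, "colors": colors}

# Demo on synthetic data with arbitrary folder names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("hand_held_01", "go2"):
        kf = Path(tmp) / name / "kf_000000"
        kf.mkdir(parents=True)
        pose = np.eye(4)
        pose[:3, :3] *= 1.5  # identity rotation with scale 1.5
        np.save(kf / "pose_4x4.npy", pose)
        np.savez(kf / "pointcloud_local.npz",
                 xyz=np.zeros((5, 3)),
                 colors=np.zeros((5, 3), dtype=np.uint8))
    dataset = scan_dataset(tmp)
    robot_names = {rid: kfs[0].parent.name for rid, kfs in dataset.items()}
    first = load_keyframe(dataset[1][0])

print(robot_names)
print("scale of first keyframe:", first["scale"])
```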
Each experiment has its own config file under config/ (Exp1.yaml, Exp2.yaml, KITTI.yaml).
Full configuration structure
device: "cuda:0"

paths:
  mast3r_config: "./MASt3R-SLAM/config/base.yaml"
  model_weights: "./MASt3R-SLAM/checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth"
  retriever_weights: "./MASt3R-SLAM/checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth"
  loop_vis_save_dir: "./MASt3R-SLAM/retrieval_test/loop_closures"

image:
  height: 384
  width: 512

# noise values are 1-sigma: information = 1 / sigma^2
# rotation unit: degrees, translation unit: meters
graph:
  odometry:
    t_noise: [0.1, 0.1, 0.1]      # translation sigma (m)
    r_noise: [5.0, 5.0, 5.0]      # rotation sigma (deg)
    s_noise: 0.05                 # scale sigma
  loop:
    t_noise: [1.0, 1.0, 1.0]
    r_noise: [25.0, 25.0, 25.0]
    s_noise: 1.0
  loop_inhibit_window: 5          # frames to suppress repeated loop edges between the same pair
  same_robot_min_seq_gap: 90      # minimum seq-index gap for same-robot loop closure

matching:
  retrieval_k: 3                  # top-k candidates from image retrieval
  retrieval_min_thresh: 0.025     # minimum retrieval score to consider a candidate
  quality_threshold: 1.5          # Qk threshold for valid 3-D correspondences
  match_frac_min: 0.1             # minimum fraction of valid matches to attempt Sim3

point_cloud:
  local_voxel_size: 0.3           # voxel size for local map downsampling (m)
  pre_slice_rate: 10              # stride applied before voxel downsampling
  global_voxel_size: 0.3          # voxel size for global map rebuild after optimization (m)

anchor:
  scale_min: 0.5                  # reject child anchor if scale < this
  scale_max: 4.0                  # reject child anchor if scale > this

optimization:
  stage1_iters: 10
  stage2_iters: 50
  verbose: 0

vis:
  point_radii: 0.15               # rerun point cloud radius
  trajectory_radii: 0.15          # rerun trajectory line radius
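The comment in the configuration states that noise values are 1-sigma and that information = 1 / sigma^2. As a worked example, the sketch below turns the odometry and loop noise settings into per-axis information weights. Two things here are assumptions rather than facts about the optimizer: the ordering of the seven entries (translation, rotation, scale) and the conversion of rotation sigmas from degrees to radians before inversion. `information_diagonal` is a hypothetical helper.

```python
import math

def information_diagonal(t_noise, r_noise_deg, s_noise):
    """Return 7 diagonal information entries for one Sim(3) edge.

    Assumptions: entries are ordered [translation xyz, rotation xyz, scale],
    and rotation sigmas are converted from degrees to radians first.
    """
    sigmas = list(t_noise) + [math.radians(r) for r in r_noise_deg] + [s_noise]
    return [1.0 / (sigma ** 2) for sigma in sigmas]

# Values copied from the graph.odometry and graph.loop sections above.
odometry = information_diagonal([0.1, 0.1, 0.1], [5.0, 5.0, 5.0], 0.05)
loop = information_diagonal([1.0, 1.0, 1.0], [25.0, 25.0, 25.0], 1.0)

print("odometry:", [round(v, 2) for v in odometry])
print("loop:    ", [round(v, 2) for v in loop])
```

With the default values, every odometry weight is larger than the corresponding loop weight, so odometry edges are trusted more than individual loop closures.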
⚠️ Update `paths.mast3r_config`, `paths.model_weights`, and `paths.retriever_weights` to match your MASt3R-SLAM installation path.
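Two of the gating parameters in the configuration above can be read as simple predicates. The sketch below is one plausible interpretation based only on the inline comments for `same_robot_min_seq_gap`, `anchor.scale_min`, and `anchor.scale_max`; the function names are hypothetical, and whether the bounds are inclusive in the actual implementation is an assumption.

```python
def loop_candidate_allowed(robot_a, seq_a, robot_b, seq_b, same_robot_min_seq_gap=90):
    """Skip same-robot pairs whose keyframes are too close in sequence index."""
    if robot_a != robot_b:
        return True  # the sequence-gap rule only applies within one robot
    return abs(seq_a - seq_b) >= same_robot_min_seq_gap

def anchor_scale_valid(scale, scale_min=0.5, scale_max=4.0):
    """Accept an estimated relative scale only inside [scale_min, scale_max]."""
    return scale_min <= scale <= scale_max

checks = [
    loop_candidate_allowed(1, 10, 1, 50),   # same robot, gap 40
    loop_candidate_allowed(1, 10, 1, 100),  # same robot, gap 90
    loop_candidate_allowed(1, 10, 2, 12),   # different robots
]
scales_ok = [anchor_scale_valid(s) for s in (0.3, 1.0, 4.5)]
print(checks, scales_ok)
```

Raising `same_robot_min_seq_gap` filters out more near-in-time matches within a single trajectory, while widening the scale bounds admits anchors whose estimated scale differs more strongly from the parent.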
Directory tree
MR.ScaleMaster/
├── main.py                          # Entry point
├── requirements.txt
│
├── scripts/
│   ├── run.sh                       # Launch script (activates venv, sets PYTHONPATH)
│   ├── setup.bash                   # One-time environment & build setup
│   ├── download_checkpoint.sh       # Model checkpoint downloader
│   ├── install_loger.sh             # LoGeR front-end setup (optional)
│   ├── do_collaborative_mapping.sh  # Try Your Own Data! pipeline
│   └── demo_viser_for_mrscalemaster.py
│
├── config/
│   ├── Exp1.yaml                    # Config for Exp1
│   ├── Exp2.yaml                    # Config for Exp2
│   └── KITTI.yaml                   # Config for KITTI sequences
│
├── cores/                           # Main Python package
│   ├── dataloader.py                # Dataset scanner & Qt data loader thread
│   ├── slam_backend.py              # MASt3R inference + Sim(3) loop detection
│   ├── optimizer_worker.py          # g2o optimization thread
│   ├── inference_thread.py          # Per-robot inference & map update
│   ├── retrieval.py                 # Image retrieval for loop candidates
│   ├── visualizer.py                # Rerun-based 3D visualization
│   ├── config.py                    # Colors, timing utilities
│   ├── math_utils.py
│   ├── io_utils.py
│   └── gui/
│       ├── monitor_window.py        # Main Qt window
│       └── robot_panel.py           # Per-robot status panel
│
└── cpp/                             # C++ Sim(3) optimizer (pybind11)
    ├── build.sh
    ├── CMakeLists.txt
    ├── cmake/
    ├── src/
    ├── include/
    └── thirdparty/
        ├── g2o/                     # git submodule
        └── cnpy/                    # git submodule
cd cpp
./build.sh          # Release build
./build.sh debug    # Debug build
./build.sh clean    # Remove build/
./build.sh rebuild  # Clean + rebuild
📦 The compiled `.so` is placed directly in `cores/` so Python can import it as `import g2o_multirobot as g2o`.
If you use this work, please cite:
@article{ju2026mrscalemaster,
  title   = {{MR.ScaleMaster}: Scale-Consistent Collaborative Mapping from Crowd-Sourced Monocular Videos},
  author  = {Ju, Hyoseok and Kim, Giseop},
  journal = {arXiv preprint arXiv:2604.11372},
  year    = {2026}
}
BibTeX entries for underlying front-ends
@inproceedings{mast3rslam,
  title     = {MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors},
  author    = {Murai, Riku and others},
  booktitle = {CVPR},
  year      = {2025}
}

@article{maggio2026vggtslam2,
  title   = {VGGT-SLAM 2.0: Real-time Dense Feed-forward Scene Reconstruction},
  author  = {Maggio, Dominic and Carlone, Luca},
  journal = {arXiv preprint arXiv:2601.19887},
  year    = {2026}
}

@article{wang2025pi3,
  title   = {$\pi^3$: Permutation-Equivariant Visual Geometry Learning},
  author  = {Wang, Yifan and Zhou, Jianjun and Zhu, Haoyi and others},
  journal = {arXiv preprint arXiv:2507.13347},
  year    = {2025}
}

@article{zhang2026loger,
  title   = {LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory},
  author  = {Zhang, Junyi and Herrmann, Charles and Hur, Junhwa and Sun, Chen and Yang, Ming-Hsuan and Cole, Forrester and Darrell, Trevor and Sun, Deqing},
  journal = {arXiv preprint arXiv:2603.03269},
  year    = {2026}
}