TIMS-Bench: Towards community standards for benchmarking untargeted trapped ion mobility metabolomics tools and datasets

This repository contains code and data described in detail in our paper (Rajkumar et al., 2026).

Citation
Requirements
Data
Repository structure
How to run
Notebook overview

Citation

If you have found our manuscript useful in your work, please consider citing:

Rajkumar, P, et al. (2026). TIMS-Bench: Towards community standards for benchmarking untargeted trapped ion mobility metabolomics tools and datasets.

Requirements

Python >= 3.13.5
UV for environment management
A Linux machine is recommended for running the DreaMS embeddings (see note in How to run)

Data

Datasets are publicly available and can be directly downloaded from Zenodo (DOI: TBD). Unzip the downloaded files and place them under the data/ directory as described in Repository structure.
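The unzip step can also be scripted. The following is a minimal sketch, assuming the Zenodo downloads are ordinary `.zip` archives saved in a single folder; the function name `extract_archives` and the folder names in the usage note are hypothetical, not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir: Path, data_dir: Path) -> list[Path]:
    """Extract every .zip archive found in download_dir into data_dir.

    Returns the list of archives that were extracted, in sorted order.
    """
    data_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(download_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            # Keeps the folder structure stored inside the archive.
            zf.extractall(data_dir)
        extracted.append(archive)
    return extracted
```

For example, `extract_archives(Path("downloads"), Path("data"))` would unpack all archives from a hypothetical `downloads/` folder into `data/`. Whether the archives already contain the top-level folder names is not stated here, so check the result against the layout in Repository structure.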

Repository structure

.
├── benchmarking/ # Python package with shared utilities
│ ├── harmonizer/ # Tool-specific parsers (MetaboScape, MS-DIAL, MZmine)
│ ├── metrics/ # Benchmarking metrics (base metrics, clique analysis)
│ ├── similarity/ # Spectral similarity methods (cosine, entropy, DreaMS)
│ ├── constants.py
│ ├── loader.py
│ ├── plots.py
│ └── utils.py
├── data/ # Downloaded from Zenodo (not tracked by git)
│ ├── groundtruth_dataset/ # MSV000098263, plant_spikein, nist_srm
│ ├── library_spectra/ # Reference library files (.parquet, .pq)
│ └── public_dataset/ # e.g. MSV000084402, MSV000090327, ...
├── figures/ # Output figures for the manuscript
├── notebooks/ # Analysis notebooks — run in numbered order
│ ├── 01a_harmonization.ipynb
│ ├── 01b_annotations.ipynb
│ ├── 01b_annotations.py
│ ├── 01c_dataset_qc.ipynb
│ ├── 02_tolerance_selection.ipynb
│ ├── 03_base_metrics.ipynb
│ ├── 03_groundtruth_metrics.ipynb
│ ├── 04a_reframe_based_metrics.ipynb
│ ├── 04b_reframe_css_evaluation.ipynb
│ ├── 04c_reframe_mirror_plots.ipynb
│ ├── 05_nist_srm_based_metrics.ipynb
│ ├── 06_plant_spikein_base_metrics.ipynb
│ └── 06_plant_spikein_overlap.ipynb
└── pyproject.toml

Each dataset folder under groundtruth_dataset/ and public_dataset/ follows the same layout:

{dataset}/
├── raw/ # Original tool exports (MetaboScape, MS-DIAL, MZmine)
├── harmonized/ # Unified parquet files per tool
├── annotated_cosine_similarity/
├── annotated_spectral_entropy/
├── annotated_dreams_similarity/
└── embeddings/ # DreaMS embedding files (.npz)
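Before running the notebooks, it may help to confirm that each dataset folder matches this layout. The sketch below is one way to do that with the standard library only; the helper names `missing_subdirs` and `check_data_root` are hypothetical. It also assumes that every listed subfolder should already exist, whereas some of them may only be created once the notebooks have run.

```python
from pathlib import Path

# Subfolder names taken from the dataset layout shown above.
EXPECTED_SUBDIRS = (
    "raw",
    "harmonized",
    "annotated_cosine_similarity",
    "annotated_spectral_entropy",
    "annotated_dreams_similarity",
    "embeddings",
)


def missing_subdirs(dataset_dir: Path) -> list[str]:
    """Return the expected subfolders that are absent from one dataset folder."""
    return [name for name in EXPECTED_SUBDIRS if not (dataset_dir / name).is_dir()]


def check_data_root(data_root: Path) -> dict[str, list[str]]:
    """Map each dataset folder to its missing subfolders (empty list = complete)."""
    report = {}
    for group in ("groundtruth_dataset", "public_dataset"):
        group_dir = data_root / group
        if not group_dir.is_dir():
            continue  # group folder not downloaded yet
        for dataset_dir in sorted(p for p in group_dir.iterdir() if p.is_dir()):
            report[f"{group}/{dataset_dir.name}"] = missing_subdirs(dataset_dir)
    return report
```

Calling `check_data_root(Path("data"))` returns a dictionary keyed by dataset path, so incomplete datasets can be spotted at a glance.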

The library spectra folder looks like this:

.
├── all_sorted_library_spectra.parquet
├── all_sorted_library_spectra.npz # DreaMS embedding files (.npz)
├── nist_srm_spikein_lib.pq
├── plant_spikein_lib.pq
├── reframe_ms2s_with_ccs.parquet
└── reframe_spikein_lib.pq

How to run

Clone the repository:

git clone https://github.com/enveda/benchmarking-untargeted-metabolomics-software.git
cd benchmarking-untargeted-metabolomics-software

Prepare the data/ directory as described in the Data and Repository structure sections.
Install dependencies using UV:

uv sync

Run the notebooks in numbered order. Select the UV virtual environment as the kernel, or launch Jupyter directly:

uv run jupyter notebook
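If you prefer to run the notebooks non-interactively, the numbered order can be derived from the file names. This is only a sketch: it assumes that `jupyter nbconvert` is available in the UV environment, which this README does not state, and the helper names are hypothetical. Notebooks that share a numeric prefix are ordered alphabetically.

```python
import subprocess
from pathlib import Path


def notebooks_in_order(notebook_dir: Path) -> list[Path]:
    """Return the .ipynb files sorted by file name (01a, 01b, 01c, 02, ...)."""
    # Prefixes are zero-padded, so a plain lexicographic sort is sufficient.
    return sorted(notebook_dir.glob("*.ipynb"), key=lambda p: p.name)


def nbconvert_command(notebook: Path) -> list[str]:
    """Build a command that executes one notebook in place via nbconvert."""
    return [
        "uv", "run", "jupyter", "nbconvert",
        "--to", "notebook", "--execute", "--inplace",
        str(notebook),
    ]


def run_all(notebook_dir: Path) -> None:
    """Execute every notebook in order, stopping at the first failure."""
    for notebook in notebooks_in_order(notebook_dir):
        subprocess.run(nbconvert_command(notebook), check=True)
```

`run_all(Path("notebooks"))` would then execute the notebooks one after another. Running them interactively, as described above, remains the documented route.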

For standalone Python scripts (used only for running DreaMS matching):

uv run python notebooks/01b_annotations.py

NOTE: The DreaMS embeddings and matching were run independently on a Linux server. Make sure your environment is configured as described in the DreaMS GitHub repository.

Notebook overview

01a_harmonization - Reads the raw output files of each tool and generates harmonized feature tables for analysis.
01b_annotations - Annotates harmonized feature tables using multiple similarity approaches (Spectral Entropy and Cosine) against a spectral library.
- 01b_annotations.py - Python script used to run the DreaMS similarity search. Works only in a Linux environment.
01c_dataset_qc - Merges and performs quality control on feature tables from public and internal datasets across multiple tools with configurable similarity thresholds.
02_tolerance_selection - Identifies optimal MS1 and MS2 tolerance parameters by testing varied tolerance values on the ReFRAME library dataset and comparing annotation results.
03a_base_metrics - Computes and visualizes base performance metrics across 10 public metabolomics datasets, comparing detection and annotation performance across analysis tools.
03b_groundtruth_metrics - Calculates and visualizes base metrics across three ground-truth datasets (ReFRAME, NIST SRM, plant spike-in) with radar plots comparing tool performance.
04a_reframe_based_metrics - Analyzes ReFRAME spike-in library performance using precision-recall curves, F1 scores, and CCS error distributions across different similarity thresholds and annotation methods.
04b_reframe_css_evaluation - Evaluates CCS-based discrimination of structural isomers from the ReFRAME library using relative CCS differences and ion mobility separation thresholds.
04c_reframe_mirror_plots - Generates spectral mirror plots comparing experimental MS2 spectra against ReFRAME library reference spectra to visually validate annotations.
05_nist_srm_based_metrics - Computes precision-recall curves and R² distributions for the NIST SRM spike-in dataset to evaluate annotation accuracy and correlation with expected concentrations.
06a_plant_spikein_base_metrics - Analyzes plant spike-in dataset performance using precision-recall metrics, R² distributions, and concentration-dependent recovery curves across analysis tools.
06b_plant_spikein_overlap - Visualizes compound detection overlap across analysis tools at different spike-in concentrations using Venn diagrams and identifies compounds detected at all concentration levels.
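Several of the notebooks above report precision, recall, and F1 at different similarity thresholds. The sketch below illustrates the general idea only and is not the implementation used in the notebooks. The `Annotation` record and the function name are hypothetical, and the sketch assumes that each correct annotation corresponds to a distinct expected compound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    score: float      # similarity score assigned to the match
    is_correct: bool  # True if the match is one of the expected compounds


def precision_recall_f1(
    annotations: list[Annotation], n_expected: int, threshold: float
) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for matches scoring >= threshold."""
    kept = [a for a in annotations if a.score >= threshold]
    true_positives = sum(a.is_correct for a in kept)
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / n_expected if n_expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def sweep(
    annotations: list[Annotation], n_expected: int, thresholds: list[float]
) -> list[tuple[float, float, float, float]]:
    """Return (threshold, precision, recall, F1) rows for a precision-recall curve."""
    return [(t, *precision_recall_f1(annotations, n_expected, t)) for t in thresholds]
```

Raising the threshold usually trades recall for precision, which is why the metrics are evaluated across a range of thresholds rather than at a single cut-off.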
