enveda/tims-bench

TIMS-Bench: Towards community standards for benchmarking untargeted trapped ion mobility metabolomics tools and datasets

This repository contains code and data described in detail in our paper (Rajkumar et al., 2026).

Citation

If you have found our manuscript useful in your work, please consider citing:

Rajkumar, P, et al. (2026). TIMS-Bench: Towards community standards for benchmarking untargeted trapped ion mobility metabolomics tools and datasets.

Requirements

  • Python >= 3.13.5
  • UV for environment management
  • A Linux machine is recommended for running the DreaMS embeddings (see note in How to run)

Data

Datasets are publicly available and can be directly downloaded from Zenodo (DOI: TBD). Unzip the downloaded files and place them under the data/ directory as described in Repository structure.

Repository structure

.
├── benchmarking/ # Python package with shared utilities
│ ├── harmonizer/ # Tool-specific parsers (MetaboScape, MS-DIAL, MZmine)
│ ├── metrics/ # Benchmarking metrics (base metrics, clique analysis)
│ ├── similarity/ # Spectral similarity methods (cosine, entropy, DreaMS)
│ ├── constants.py
│ ├── loader.py
│ ├── plots.py
│ └── utils.py
├── data/ # Downloaded from Zenodo (not tracked by git)
│ ├── groundtruth_dataset/ # MSV000098263, plant_spikein, nist_srm
│ ├── library_spectra/ # Reference library files (.parquet, .pq)
│ └── public_dataset/ # e.g. MSV000084402, MSV000090327, ..
├── figures/ # Output figures for the manuscript
├── notebooks/ # Analysis notebooks — run in numbered order
│ ├── 01a_harmonization.ipynb
│ ├── 01b_annotations.ipynb
│ ├── 01b_annotations.py
│ ├── 01c_dataset_qc.ipynb
│ ├── 02_tolerance_selection.ipynb
│ ├── 03_base_metrics.ipynb
│ ├── 03_groundtruth_metrics.ipynb
│ ├── 04a_reframe_based_metrics.ipynb
│ ├── 04b_reframe_css_evaluation.ipynb
│ ├── 04c_reframe_mirror_plots.ipynb
│ ├── 05_nist_srm_based_metrics.ipynb
│ ├── 06_plant_spikein_base_metrics.ipynb
│ └── 06_plant_spikein_overlap.ipynb
└── pyproject.toml

Each dataset folder under groundtruth_dataset/ and public_dataset/ follows the same layout:

{dataset}/
├── raw/ # Original tool exports (MetaboScape, MS-DIAL, MZmine)
├── harmonized/ # Unified parquet files per tool
├── annotated_cosine_similarity/
├── annotated_spectral_entropy/
├── annotated_dreams_similarity/
└── embeddings/ # DreaMS embedding files (.npz)
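The per-dataset layout above can be checked before running the notebooks. The following is a minimal sketch, not part of the repository: the function names `missing_subfolders` and `check_data_root` are hypothetical, and it assumes the data sits under `data/` as shown in the repository structure. The README does not say which subfolders ship with the Zenodo download and which are produced by the notebooks, so a folder reported as missing is not necessarily an error.

```python
from pathlib import Path

# Subfolder names taken from the dataset layout described above.
EXPECTED_SUBFOLDERS = (
    "raw",
    "harmonized",
    "annotated_cosine_similarity",
    "annotated_spectral_entropy",
    "annotated_dreams_similarity",
    "embeddings",
)


def missing_subfolders(dataset_dir: Path) -> list[str]:
    """Return the expected subfolders that are absent from dataset_dir."""
    return [name for name in EXPECTED_SUBFOLDERS if not (dataset_dir / name).is_dir()]


def check_data_root(data_root: Path) -> dict[str, list[str]]:
    """Map each dataset folder (as 'group/dataset') to its missing subfolders."""
    report: dict[str, list[str]] = {}
    for group in ("groundtruth_dataset", "public_dataset"):
        group_dir = data_root / group
        if not group_dir.is_dir():
            continue  # group folder not downloaded yet
        for dataset_dir in sorted(p for p in group_dir.iterdir() if p.is_dir()):
            report[f"{group}/{dataset_dir.name}"] = missing_subfolders(dataset_dir)
    return report


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for dataset, missing in check_data_root(Path("data")).items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{dataset}: {status}")
```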

The library spectra folder looks like this:

.
├── all_sorted_library_spectra.parquet
├── all_sorted_library_spectra.npz # DreaMS embedding files (.npz)
├── nist_srm_spikein_lib.pq
├── plant_spikein_lib.pq
├── reframe_ms2s_with_ccs.parquet
└── reframe_spikein_lib.pq

How to run

  1. Clone the repository:
git clone https://github.com/enveda/benchmarking-untargeted-metabolomics-software.git
cd benchmarking-untargeted-metabolomics-software
  2. Prepare the data/ directory as described in the Data and Repository structure sections.

  3. Install dependencies using UV:

uv sync
  4. Run the notebooks in numbered order. Select the UV virtual environment as the kernel, or launch Jupyter directly:
uv run jupyter notebook

For standalone Python scripts (used only for running DreaMS matching):

uv run python notebooks/01b_annotations.py

NOTE: The DreaMS embeddings and matching were run independently on a Linux server. Ensure your environment is configured as described in the DreaMS GitHub repository.
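Because the notebook file names carry zero-padded numeric prefixes, the "numbered order" mentioned above is the same as plain alphabetical order of the file names. A minimal sketch for listing that order follows; the helper name `notebooks_in_order` is hypothetical and the `notebooks/` path is assumed from the repository structure.

```python
from pathlib import Path


def notebooks_in_order(notebook_dir: Path) -> list[Path]:
    """Return the .ipynb files in notebook_dir sorted by file name."""
    return sorted(notebook_dir.glob("*.ipynb"), key=lambda p: p.name)


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for notebook in notebooks_in_order(Path("notebooks")):
        print(notebook.name)
```

This only prints the order; it does not execute the notebooks.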

Notebook overview

  • 01a_harmonization - Reads the raw output files of each tool and generates harmonized feature tables for analysis.
  • 01b_annotations - Annotates harmonized feature tables against a spectral library using two similarity approaches (spectral entropy and cosine).
    • 01b_annotations.py - Python script used to run the DreaMS similarity search. Works only in a Linux environment.
  • 01c_dataset_qc - Merges and performs quality control on feature tables from public and internal datasets across multiple tools with configurable similarity thresholds.
  • 02_tolerance_selection - Identifies optimal MS1 and MS2 tolerance parameters by testing varied tolerance values on the ReFRAME library dataset and comparing annotation results.
  • 03a_base_metrics - Computes and visualizes base performance metrics across 10 public metabolomics datasets, comparing detection and annotation performance across analysis tools.
  • 03b_groundtruth_metrics - Calculates and visualizes base metrics across three ground-truth datasets (ReFRAME, NIST SRM, plant spike-in) with radar plots comparing tool performance.
  • 04a_reframe_based_metrics - Analyzes ReFRAME spike-in library performance using precision-recall curves, F1 scores, and CCS error distributions across different similarity thresholds and annotation methods.
  • 04b_reframe_css_evaluation - Evaluates CCS-based discrimination of structural isomers from the ReFRAME library using relative CCS differences and ion mobility separation thresholds.
  • 04c_reframe_mirror_plots - Generates spectral mirror plots comparing experimental MS2 spectra against ReFRAME library reference spectra to visually validate annotations.
  • 05_nist_srm_based_metrics - Computes precision-recall curves and R² distributions for the NIST SRM spike-in dataset to evaluate annotation accuracy and correlation with expected concentrations.
  • 06a_plant_spikein_base_metrics - Analyzes plant spike-in dataset performance using precision-recall metrics, R² distributions, and concentration-dependent recovery curves across analysis tools.
  • 06b_plant_spikein_overlap - Visualizes compound detection overlap across analysis tools at different spike-in concentrations using Venn diagrams and identifies compounds detected at all concentration levels.
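Several notebooks above report precision, recall, and F1 at different similarity thresholds. As an illustration of how such metrics are commonly defined for a spike-in dataset, here is a minimal sketch. It is not the code used in the notebooks: the function name `precision_recall_f1`, the `(compound_id, score)` input structure, and the convention of returning 0.0 when a denominator is empty are all assumptions made for this example.

```python
def precision_recall_f1(scored_annotations, truth, threshold):
    """Compute precision, recall, and F1 at one similarity threshold.

    scored_annotations: iterable of (compound_id, similarity_score) pairs
        (hypothetical structure, one pair per annotated feature).
    truth: set of compound ids known to be present, e.g. spiked-in compounds.
    threshold: annotations with score >= threshold count as predictions.
    """
    predicted = {cid for cid, score in scored_annotations if score >= threshold}
    true_positives = len(predicted & truth)
    # Convention assumed here: an empty denominator yields 0.0.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    annotations = [("A", 0.9), ("B", 0.8), ("C", 0.6), ("D", 0.4)]
    spiked_in = {"A", "C", "E"}
    for threshold in (0.5, 0.7, 0.9):
        p, r, f1 = precision_recall_f1(annotations, spiked_in, threshold)
        print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Sweeping the threshold and plotting recall against precision gives a precision-recall curve of the kind the notebook descriptions mention.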
