Mowgli is a novel method for the integration of paired multi-omics data with any type and number of omics, combining integrative Nonnegative Matrix Factorization and Optimal Transport. Read the paper!
Mowgli is implemented as a Python package seamlessly integrated within the scverse ecosystem, in particular Muon and Scanpy.
On all operating systems, the easiest way to install Mowgli is via PyPI. Installation should typically take a minute and is continuously tested with Python 3.10 on an Ubuntu virtual machine.
pip install mowgli
Alternatively, to install from a clone of the repository:
git clone git@github.com:cantinilab/Mowgli.git
pip install ./Mowgli/
To run the tests:
pytest .
Mowgli takes a Muon object as input and populates its obsm and uns fields with the embeddings and dictionaries, respectively. Visit mowgli.rtfd.io for more documentation and tutorials.
You may download a preprocessed 10X Multiome demo dataset here.
A GPU is not required for small datasets, but is strongly recommended above 1,000 cells. On CPU, the cell lines demo (206 cells) should run in under 5 minutes and the PBMC demo (500 cells) should run in under 10 minutes (tested on an Ubuntu 20.04 machine with an 11th gen i7 processor).
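The hardware guidance above can be captured in a small helper. This is a minimal sketch and not part of Mowgli: the function name `pick_device` and its `threshold` parameter are hypothetical, and the 1,000-cell cutoff is simply the figure quoted above. The returned strings follow PyTorch device naming; whether and how the model accepts such a string (for example through a `device` argument) is an assumption to check against the documentation.

```python
import warnings


def pick_device(n_cells: int, gpu_available: bool, threshold: int = 1000) -> str:
    """Suggest a device string for a dataset of n_cells cells.

    Hypothetical helper: uses a GPU whenever one is available, and warns
    when a dataset above the recommended size has to run on CPU.
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    if gpu_available:
        return "cuda"
    if n_cells > threshold:
        # Above the quoted cutoff a GPU is strongly recommended.
        warnings.warn(
            f"{n_cells} cells exceeds {threshold}; training on CPU may be slow."
        )
    return "cpu"


# The cell lines demo (206 cells) on a machine without a GPU.
print(pick_device(206, gpu_available=False))  # cpu
```

In practice, `gpu_available` could come from `torch.cuda.is_available()` and `n_cells` from `mdata.n_obs`.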
import mowgli
import mudata as md
import scanpy as sc

# Load data into a Muon object.
mdata = md.read_h5mu("my_data.h5mu")

# Initialize and train the model.
model = mowgli.models.MowgliModel(latent_dim=15)
model.train(mdata)

# Visualize the embedding with UMAP.
sc.pp.neighbors(mdata, use_rep="W_OT")
sc.tl.umap(mdata)
sc.pl.umap(mdata)
@article{huizing2023paired,
  title={Paired single-cell multi-omics data integration with Mowgli},
  author={Huizing, Geert-Jan and Deutschmann, Ina Maria and Peyr{\'e}, Gabriel and Cantini, Laura},
  journal={Nature Communications},
  volume={14},
  number={1},
  pages={7711},
  year={2023},
  publisher={Nature Publishing Group UK London}
}
If you're looking for the repository with code to reproduce the experiments in our preprint, here it is!