Review
Genome Biol. 2024 Aug 9;25(1):213.
doi: 10.1186/s13059-024-03343-2.

Genomic reproducibility in the bioinformatics era

Pelin Icer Baykal et al. Genome Biol. 2024.

Abstract

In biomedical research, validating a scientific discovery hinges on the reproducibility of its experimental results. However, in genomics, the definition and implementation of reproducibility remain imprecise. We argue that genomic reproducibility, defined as the ability of bioinformatics tools to maintain consistent results across technical replicates, is essential for advancing scientific knowledge and medical applications. Initially, we examine different interpretations of reproducibility in genomics to clarify terms. Subsequently, we discuss the impact of bioinformatics tools on genomic reproducibility and explore methods for evaluating these tools regarding their effectiveness in ensuring genomic reproducibility. Finally, we recommend best practices to improve genomic reproducibility.

Keywords: Reproducibility; bioinformatics tools; genomics; synthetic replicates; technical replicates.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Schematic representation of three key concepts: technical replicates, methods reproducibility, and genomic reproducibility. The same sample is processed (library preparation) and sequenced multiple times, possibly in different laboratories, but using the same experimental protocols and sequencing platform. The outputs of these sequencing runs are technical replicates, represented as FASTQ files. Data analysis is performed multiple times on each technical replicate to assess the consistency of genomic results; this is methods reproducibility. Genomic reproducibility, on the other hand, evaluates the consistency of genomic results across technical replicates.
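One way to make the distinction in Fig. 1 concrete is to score how much the result sets agree. The sketch below is an illustration only: the source does not prescribe a metric, so the use of the Jaccard index, the function names `jaccard` and `mean_pairwise_jaccard`, and the toy variant tuples are all assumptions. Applied to repeated runs on one replicate it approximates methods reproducibility; applied to runs on different technical replicates it approximates genomic reproducibility.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two result sets (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty result sets agree trivially
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(result_sets):
    """Average Jaccard index over all pairs of result sets."""
    pairs = list(combinations(result_sets, 2))
    if not pairs:
        raise ValueError("need at least two result sets to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical variant calls, one set per technical replicate,
# each variant written as (chromosome, position, ref, alt).
replicate_calls = [
    {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "A")},
    {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 90, "T", "C")},
]

# Agreement across technical replicates (genomic reproducibility).
across_replicates = mean_pairwise_jaccard(replicate_calls)

# Agreement across repeated runs on the same replicate (methods reproducibility);
# here the repeated runs are assumed to return identical calls.
within_replicate = mean_pairwise_jaccard([replicate_calls[0], replicate_calls[0]])
```

A set-based score suits discrete outputs such as variant calls; continuous outputs such as expression estimates would need a different measure, for example a correlation coefficient.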
Fig. 2
Schematic representation of generating synthetic replicates. Based on a given dataset consisting of five reads R1, ..., R5 (left), four different types of synthetic replicates (right) are created by randomly shuffling the order of the five reads (a), by taking the reverse complement of each read (b), by bootstrapping, i.e., resampling the five reads with replacement (c), or by subsampling, i.e., selecting a subset of three reads from the original five (d).
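The four operations in Fig. 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: reads are plain sequence strings rather than full FASTQ records, the function names and the five toy reads are hypothetical, and a seeded `random.Random` is used so that each synthetic replicate can itself be regenerated.

```python
import random

# Complement table for DNA bases; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement of one read sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shuffle_reads(reads, rng):
    """(a) Same reads in a random order."""
    shuffled = list(reads)
    rng.shuffle(shuffled)
    return shuffled


def reverse_complement_reads(reads):
    """(b) Every read replaced by its reverse complement."""
    # For real FASTQ records the quality string would also need reversing.
    return [reverse_complement(r) for r in reads]


def bootstrap_reads(reads, rng):
    """(c) Resample the reads with replacement, keeping the same count."""
    return rng.choices(reads, k=len(reads))


def subsample_reads(reads, k, rng):
    """(d) Select k distinct reads without replacement."""
    return rng.sample(reads, k)


# Five toy reads standing in for R1, ..., R5 (hypothetical sequences).
reads = ["ACGTA", "GGCTT", "TTAGC", "CATGN", "AACCG"]
rng = random.Random(0)  # fixed seed so the replicates are repeatable

synthetic_replicates = {
    "shuffled": shuffle_reads(reads, rng),
    "reverse_complement": reverse_complement_reads(reads),
    "bootstrap": bootstrap_reads(reads, rng),
    "subsample": subsample_reads(reads, 3, rng),
}
```

Each synthetic replicate would then be passed through the tool under evaluation, and the outputs compared with those obtained from the original reads.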
