Nat Commun. 2020 May 19;11(1):2500.
doi: 10.1038/s41467-020-16366-7.

Precise phylogenetic analysis of microbial isolates and genomes from metagenomes using PhyloPhlAn 3.0

Francesco Asnicar et al. Nat Commun. 2020.

Abstract

Microbial genomes are available at an ever-increasing pace, as cultivation and sequencing become cheaper and obtaining metagenome-assembled genomes (MAGs) becomes more effective. Phylogenetic placement methods to contextualize hundreds of thousands of genomes must thus be efficiently scalable and sensitive from closely related strains to divergent phyla. We present PhyloPhlAn 3.0, an accurate, rapid, and easy-to-use method for large-scale microbial genome characterization and phylogenetic analysis at multiple levels of resolution. PhyloPhlAn 3.0 can assign genomes from isolate sequencing or MAGs to species-level genome bins built from >230,000 publicly available sequences. For individual clades of interest, it reconstructs strain-level phylogenies from among the closest species using clade-specific maximally informative markers. At the other extreme of resolution, it scales to large phylogenies comprising >17,000 microbial species. Examples including Staphylococcus aureus isolates, gut metagenomes, and meta-analyses demonstrate the ability of PhyloPhlAn 3.0 to support genomic and metagenomic analyses.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. PhyloPhlAn 3.0 phylogenetically places microbial isolate or metagenomic assemblies.
PhyloPhlAn 3.0 provides strain-to-phylum level phylogenies built from newly generated microbial genomes (isolate or metagenomic assemblies) in the context of over 80,000 existing isolate genomes and 150,000 metagenomic assemblies. It automatically selects the most informative loci on a clade-specific basis, handles incomplete or fragmented assemblies, and can be configured to provide the resulting multiple-sequence alignment, the phylogenetic tree, and, optionally, estimated mutation rates.
Fig. 2. Accurate reconstruction of Staphylococcus aureus phylogenies using PhyloPhlAn 3.0.
a Phylogenetic tree of 135 S. aureus strains from a pediatric hospital reconstructed by PhyloPhlAn 3.0 using 2127 automatically identified core genes (rendered by GraPhlAn; see Supplementary Fig. 2 for a full comparison). Green circles represent methicillin-sensitive S. aureus (MSSA), while red circles represent methicillin-resistant S. aureus (MRSA). Blue circles internal to the phylogeny identify subtrees with bootstrap >80%. b Normalized phylogenetic distances in the PhyloPhlAn 3.0-reconstructed tree and in a manually curated phylogeny from ref., highlighting strong consistency between the automated PhyloPhlAn 3.0 results and the curated tree (Pearson's correlation coefficient of 0.992). c Multidimensional scaling ordination of pairwise phylogenetic distances from the tree integrating the 135 S. aureus isolates (crosses) with 1000 automatically selected S. aureus reference genomes (circles, Supplementary Fig. 1). The ten most prevalent sequence types (STs) are highlighted in different colors.
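The comparison in panel b can be illustrated with a minimal sketch, which is not the authors' code. It assumes each tree has already been reduced to a symmetric matrix of pairwise tip-to-tip distances, and it normalizes each set of distances by its maximum (an assumption, since the caption does not state the normalization used). The function names and toy matrices are hypothetical.

```python
from math import sqrt

def upper_triangle(matrix):
    """Return the pairwise distances above the diagonal as a flat list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def normalize(values):
    """Scale distances to [0, 1] by the largest value (assumed normalization)."""
    top = max(values)
    if top == 0:
        raise ValueError("all distances are zero; cannot normalize")
    return [v / top for v in values]

def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)

def tree_distance_correlation(dist_a, dist_b):
    """Correlate normalized pairwise distances from two trees on the same tips."""
    return pearson(normalize(upper_triangle(dist_a)),
                   normalize(upper_triangle(dist_b)))

# Toy example: three tips, second tree has the same shape with doubled branch lengths.
toy_a = [[0, 1, 4], [1, 0, 3], [4, 3, 0]]
toy_b = [[0, 2, 8], [2, 0, 6], [8, 6, 0]]
print(tree_distance_correlation(toy_a, toy_b))
```

Because Pearson's coefficient is invariant to positive rescaling, the normalization step does not change the correlation itself; it only puts the two sets of distances on a shared scale for plotting.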
Fig. 3. Phylogenetic analysis of MAGs from 50 rural Ethiopian metagenomes.
a Occurrence of the 20 most prevalent SGBs among 50 previously sequenced Ethiopian gut metagenomes highlights the presence of many previously identified but largely uncharacterized species-level genome bins (uSGBs) and the identification of few additional MAGs (unassigned) that are not recapitulated in any already defined SGB. The presence/absence profiles are clustered using average linkage with Euclidean distances. b Multidimensional scaling ordination using the t-SNE algorithm on phylogenetic distances from PhyloPhlAn 3.0's tree of eight Ethiopian E. coli MAGs (kSGB 10068) integrated with 200 automatically selected E. coli reference genomes using 3246 UniRef90 gene families for phylogenetic reconstruction. c PhyloPhlAn 3.0 phylogeny of Ethiopian MAGs assigned to uSGB ID 19436, including all reference genomes for the closest phyla (589 in total) according to the prokaryotic tree-of-life in Fig. 4. Phylogeny reconstruction used 400 universal markers selected by PhyloPhlAn 3.0 for deep-branching phylogenies. Collapsed portions of the tree are labeled, and numbers in parentheses give the number of genomes in each collapsed subtree. The uncollapsed phylogeny is available in Supplementary Fig. 4.
Fig. 4. PhyloPhlAn 3.0 microbial tree-of-life with 17,672 species-representative genomes from 51 known and 84 candidate phyla.
With 17,672 species-dereplicated isolate genomes and MAGs as input (see "Methods"), PhyloPhlAn 3.0 used 400 optimized universal marker sequences to produce a pan-microbial phylogeny in approximately 10 days (~24,000 CPU-hours on 100 parallel cores). The underlying multiple-sequence alignment comprised 4522 amino acid positions from among 1,872,710 in the untrimmed concatenated marker alignments.
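The figures quoted in this caption can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the wall-clock time is taken as exactly 10 days even though the caption says "approximately".

```python
# Numbers taken from the Fig. 4 caption.
CORES = 100                    # parallel cores
WALL_DAYS = 10                 # approximate wall-clock time in days
TRIMMED_POSITIONS = 4522       # amino acid positions kept in the final alignment
UNTRIMMED_POSITIONS = 1_872_710  # positions in the untrimmed concatenated alignments

# CPU-hours = cores x days x 24 hours per day.
cpu_hours = CORES * WALL_DAYS * 24

# Share of untrimmed alignment positions that survive trimming.
retained_fraction = TRIMMED_POSITIONS / UNTRIMMED_POSITIONS

print(f"CPU-hours: {cpu_hours}")
print(f"Retained positions: {retained_fraction:.2%}")
```

The product matches the ~24,000 CPU-hours stated in the caption, and the ratio shows that about 0.24% of the untrimmed positions are retained in the final alignment.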
