PLoS Genet. 2020 Feb 13;16(2):e1008576.
doi: 10.1371/journal.pgen.1008576. eCollection 2020 Feb.

A molecular barcode to inform the geographical origin and transmission dynamics of Plasmodium vivax malaria

Ernest Diez Benavente et al. PLoS Genet. 2020.

Abstract

Although Plasmodium vivax parasites are the predominant cause of malaria outside of sub-Saharan Africa, they are not always prioritised by elimination programmes. P. vivax is resilient and poses challenges through its ability to re-emerge from dormancy in the human liver. With growing drug resistance and increasing reports of life-threatening infections, new tools to inform elimination efforts are needed. To halt transmission and design targeted interventions, we need to better understand the dynamics of transmission, the movement of parasites, and the reservoirs of infection. Molecular genetics and epidemiology have been applied successfully to track and study P. falciparum populations, and here we sought to develop a molecular genetic tool for P. vivax. By assembling the largest set of P. vivax whole genome sequences (n = 433) spanning 17 countries, and applying a machine learning approach, we created a 71 SNP barcode with high predictive ability to identify geographic origin (91.4%). Further, due to the inclusion of markers for within-population variability, the barcode may also distinguish local transmission networks. By using P. vivax data from a low-transmission setting in Malaysia, we demonstrate the potential ability to infer outbreak events. By characterising the barcoding SNP genotypes in P. vivax DNA sourced from UK travellers (n = 132) to ten malaria-endemic countries predominantly not used in the barcode construction, we correctly predicted the geographic region of infection origin. Overall, the 71 SNP barcode outperforms previously published genotyping methods and, when rolled out within new portable platforms, is likely to be an invaluable tool for informing targeted interventions towards elimination of this resilient human malaria.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Principal component (PC) analysis plot generated using 720,340 high quality SNPs across 433 P. vivax isolates reveals geographic clustering.
Isolates are coloured according to country of origin. Clustering by region can be observed, with Southeast Asian isolates appearing to group at the bottom right of the plot, Oceania at the top right, and South American isolates on the centre left. A relative degree of clustering by country can be observed, especially for isolates from Oceania and to a lesser extent Southeast Asia. The percentage of variation explained for each PC is shown in the axis labels. Additional region-specific plots are shown for clarity.
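As an illustration of how a principal component analysis separates isolates, the following is a minimal sketch using power iteration on a toy 0/1 genotype matrix (rows are isolates, columns are SNPs). The function name, the toy data, and the choice of power iteration are assumptions made for this example; the caption does not state how the PCA was computed, and a real analysis of 720,340 SNPs would use a dedicated numerical library.

```python
import math


def first_principal_component(genotypes, iterations=200):
    """Return (scores, explained_fraction) for PC1 of an isolates x SNPs matrix.

    Hypothetical helper: genotypes are coded 0 (reference) / 1 (alternative).
    """
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])

    # Centre each SNP column so the PCs describe variation around the mean.
    col_means = [sum(row[j] for row in genotypes) / n_samples for j in range(n_snps)]
    centred = [[row[j] - col_means[j] for j in range(n_snps)] for row in genotypes]
    total_variance = sum(x * x for row in centred for x in row)
    if total_variance == 0:
        raise ValueError("no variation between isolates")

    # Deterministic, non-uniform start vector, normalised to unit length.
    v = [float(j + 1) for j in range(n_snps)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    # Power iteration on X^T X without forming it: v <- X^T (X v), renormalised.
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(n_snps)) for r in centred]
        w = [sum(centred[i][j] * scores[i] for i in range(n_samples))
             for j in range(n_snps)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("start vector has no component along the data")
        v = [x / norm for x in w]

    scores = [sum(r[j] * v[j] for j in range(n_snps)) for r in centred]
    explained = sum(s * s for s in scores) / total_variance
    return scores, explained


# Toy data: two isolates from one made-up population, two from another.
toy_genotypes = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
]
pc1_scores, pc1_explained = first_principal_component(toy_genotypes)
print(pc1_scores, pc1_explained)
```

In this toy case the first two isolates receive PC1 scores of one sign and the last two the opposite sign, which is the kind of separation the plot shows by region.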
Fig 2. The sub-setting of SNPs by minor allele frequency (MAF) reveals the strong explanatory power of high frequency SNPs in Plasmodium vivax.
Three equally sized groups of SNPs were constructed from the distribution of the minor allele frequencies: (Left) [MAF 0–0.2%], (Centre) [MAF 0.2–0.7%] and (Right) [MAF 0.7–50%]. (Top) Each of these subsets was used to construct a neighbour-joining tree, revealing clear geographic clustering only in the high frequency SNP group [MAF 0.7–50%]; (Middle) The Pearson’s r² correlation between the genome distance calculated using all SNPs and that calculated using each subset separately reveals a poor correlation for the low frequency SNPs (r²: [MAF 0–0.2%; left] 0.25, [MAF 0.2–0.7%; centre] 0.27) and a strong correlation for the high frequency subset ([MAF 0.7–50%; right] r² = 0.94); (Bottom) A Bland-Altman analysis comparing the differences in genetic distance between using whole genome SNPs ("gold standard") and each of the SNP subsets. This reveals that subsets of SNPs with low MAF tend to overestimate the distance (panels left and centre, with mean differences -7.99 and -1.81 respectively; SD = standard deviation). In the high MAF subset (right), by contrast, the genetic distance was underestimated (mean of differences: 0.113).
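The sub-setting described in this caption can be sketched as follows: compute the MAF of each SNP, split the SNPs into equally sized groups by MAF rank, and correlate pairwise distances from each group with those from all SNPs. The function names, the 0/1 genotype coding, the use of a simple count of differing SNPs as the genetic distance, and the toy data are all assumptions for illustration; the caption does not specify the distance measure used in the study.

```python
from itertools import combinations


def minor_allele_frequency(column):
    """MAF of one biallelic SNP coded 0 (reference) / 1 (alternative)."""
    alt = sum(column) / len(column)
    return min(alt, 1.0 - alt)


def split_by_maf(genotypes, n_groups=3):
    """Split SNP indices into n_groups groups of (near-)equal size by MAF rank."""
    n_snps = len(genotypes[0])
    mafs = [minor_allele_frequency([g[i] for g in genotypes]) for i in range(n_snps)]
    order = sorted(range(n_snps), key=lambda i: mafs[i])
    size = n_snps // n_groups
    groups = [order[k * size:(k + 1) * size] for k in range(n_groups - 1)]
    groups.append(order[(n_groups - 1) * size:])  # last group takes any remainder
    return groups


def pairwise_distances(genotypes, snp_indices):
    """Number of differing SNPs for every pair of isolates, over the chosen SNPs."""
    return [
        sum(a[i] != b[i] for i in snp_indices)
        for a, b in combinations(genotypes, 2)
    ]


def pearson_r2(x, y):
    """Squared Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Toy data: 5 isolates x 6 SNPs (made-up values).
toy_genotypes = [
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 1, 1],
]
all_snps = list(range(len(toy_genotypes[0])))
reference = pairwise_distances(toy_genotypes, all_snps)
for group in split_by_maf(toy_genotypes):
    subset = pairwise_distances(toy_genotypes, group)
    print(group, round(pearson_r2(reference, subset), 3))
```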
Fig 3. Geographic clustering of Plasmodium vivax isolates using the 71 SNP barcode.
(Top) A principal component (PC) analysis plot shows clustering by region and country when using the 71 SNP barcode. The percentage of variation explained by each PC is shown in the axis labels; (Middle) A strong Pearson’s r² correlation of 0.898 was observed between the genetic distances based on genome-wide (n = 720k) and 71 barcoding SNPs, revealing the potential for the barcode to identify closely related intra-border isolates; (Bottom) A Bland-Altman analysis comparing the differences in genetic distance between using whole genome SNPs ("gold standard") and the 71 SNP barcode.
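A Bland-Altman analysis summarises agreement between two measurements of the same quantity by the mean of their paired differences (the bias) and limits of agreement around it. The sketch below is a generic version under stated assumptions: the function name and toy numbers are invented, the difference is taken as reference minus candidate (so a negative bias means the candidate overestimates), and both sets of distances are assumed to already be on a common scale.

```python
from statistics import mean, stdev


def bland_altman(reference, candidate):
    """Bias and 95% limits of agreement for paired measurements.

    Hypothetical helper. Differences are reference - candidate, so a
    negative bias means the candidate method overestimates on average.
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must be paired and of equal length")
    diffs = [r - c for r, c in zip(reference, candidate)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


# Toy pairwise distances (made-up values on a shared scale).
whole_genome = [10, 12, 14, 16]
barcode = [11, 14, 15, 18]
print(bland_altman(whole_genome, barcode))
```

With these toy values the bias is negative, meaning the candidate distances are larger than the reference distances on average.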
Fig 4. Use of the 71 SNP barcode in Plasmodium vivax isolates from Sabah, Malaysia reveals patterns of transmission.
A dataset of 60 isolates from a near-elimination setting that has been exhaustively characterised by whole genome sequencing in [31] was analysed here by means of a principal component analysis (PCA) using the 71 SNP barcode from our study. (A) The principal component (PC) analysis revealed the previously reported outbreak population (K2, yellow). However, there was one "K2" isolate showing distant clustering (ERR1475456, highlighted with arrows in the three panels); (B) The distribution of pairwise genome SNP distances for each of the isolates in the outbreak, showing that ERR1475456 is not as closely related to the outbreak as indicated by microsatellite genotyping in [31]; (C) A neighbour-joining tree revealed isolates from the West Coast Division in Sabah (dashed lines) clustering together; isolates are coloured in the tree according to cluster.
Fig 5. The principal component (PC) analysis plot of the 565 P. vivax isolates, constructed using the 71 SNP barcode. The isolates include the 433 used in the design of the barcode (circles) and the 132 prospective UK traveller samples (stars).
The plot shows clear geographic region clustering, with the traveller samples from each region (strong star-dots for each colour) overlapping with the previously sequenced data (light circular-dots for each colour). The percentage of variation explained for each PC is shown in the axis labels.
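To make the idea of assigning a new sample to a region concrete, here is a minimal nearest-neighbour sketch that compares a query barcode with labelled reference barcodes by Hamming distance. This is not the machine learning approach used in the study, which is not described in this record; the function names, the region labels, and the 8-letter toy barcodes (standing in for 71 SNPs) are all invented for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def predict_region(query, reference, k=3):
    """Majority region among the k reference barcodes closest to the query.

    `reference` is a list of (barcode, region) pairs. Hypothetical helper.
    """
    ranked = sorted(reference, key=lambda item: hamming(query, item[0]))
    votes = Counter(region for _, region in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference panel: made-up barcodes with made-up region labels.
toy_reference = [
    ("AACCGGTT", "Region A"),
    ("AACCGGTA", "Region A"),
    ("AACCGGAA", "Region A"),
    ("TTGGCCAA", "Region B"),
    ("TTGGCCAT", "Region B"),
    ("TTGGCCTT", "Region B"),
]
print(predict_region("AACCGGTC", toy_reference))
print(predict_region("TTGGCCAC", toy_reference))
```

A real barcode would also need a policy for missing or mixed genotype calls, which this sketch leaves out.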

References

    1. Howes RE, Battle KE, Mendis KN, Smith DL, Cibulskis RE, Baird JK, et al. Global Epidemiology of Plasmodium vivax. Am J Trop Med Hyg. 2016;95: 15–34. doi: 10.4269/ajtmh.16-0141
    2. WHO. World Malaria Report 2017. Geneva; 2017.
    3. Tjitra E, Anstey NM, Sugiarto P, Warikar N, Kenangalem E, Karyana M, et al. Multidrug-Resistant Plasmodium vivax Associated with Severe and Fatal Malaria: A Prospective Study in Papua, Indonesia. PLOS Med. 2008;5: e128. doi: 10.1371/journal.pmed.0050128
    4. Poespoprodjo JR, Fobia W, Kenangalem E, Lampah DA, Warikar N, Seal A, et al. Adverse Pregnancy Outcomes in an Area Where Multidrug-Resistant Plasmodium vivax and Plasmodium falciparum Infections Are Endemic. Clin Infect Dis. 2008;46: 1374–1381. doi: 10.1086/586743
    5. Poespoprodjo JR, Fobia W, Kenangalem E, Lampah DA, Hasanuddin A, Warikar N, et al. Vivax Malaria: A Major Cause of Morbidity in Early Infancy. Clin Infect Dis. 2009;48: 1704–1712. doi: 10.1086/599041
