Genome Res. 2017 Feb;27(2):234-245.
doi: 10.1101/gr.205146.116. Epub 2016 Nov 15.

microRNA target prediction programs predict many false positives

Natalia Pinzón et al. Genome Res. 2017 Feb.

Abstract

According to the current view, each microRNA regulates hundreds of genes. Computational tools aim at identifying microRNA targets, usually selecting evolutionarily conserved microRNA binding sites. While the false positive rates have been evaluated for some prediction programs, that information is rarely put forward in studies making use of their predictions. Here, we provide evidence that such predictions are often biologically irrelevant. Focusing on miR-223-guided repression, we observed that it is often smaller than inter-individual variability in gene expression among wild-type mice, suggesting that most predicted targets are functionally insensitive to that microRNA. Furthermore, we found that human haplo-insufficient genes tend to bear the most highly conserved microRNA binding sites. It thus appears that biological functionality of microRNA binding sites depends on the dose-sensitivity of their host gene and that, conversely, it is unlikely that every predicted microRNA target is dose-sensitive enough to be functionally regulated by microRNAs. We also observed that some mRNAs can efficiently titrate microRNAs, providing a reason for microRNA binding site conservation for inefficiently repressed targets. Finally, many conserved microRNA binding sites are conserved in a microRNA-independent fashion: Sequence elements may be conserved for other reasons, while being fortuitously complementary to microRNAs. Collectively, our data suggest that the role of microRNAs in normal and pathological conditions has been overestimated due to the frequent overlooking of false positive rates.

Figures

Figure 1.
Inter-individual variability in miR-223 target expression is frequently larger than miR-223-guided repression. (A) Principle of the experiment. (B) The measured microarray signal is the sum of the underlying biological value and technical noise (here illustrated with the Styx gene). Measured signals (m1–m5) are deconvoluted using the measured technical variability (see Supplemental Experimental Procedures). (C) For each predicted miR-223 target, the amplitude of miR-223-guided repression is compared to the amplitude of gene expression variability across the five mice (here illustrated with the Styx gene). The p-value measures the probability that the underlying biological variability is smaller than miRNA-guided repression. We used the median repression value measured by Baek et al. (2008) (represented by the red horizontal bar) to estimate miR-223-guided repression. (D) Genes whose inter-individual fluctuations are not significantly larger than miR-223-guided repression (p ≥ 0.05). Middle column: fold-change due to miR-223-mediated repression according to data of Baek et al. (2008) (note that some genes have a fold-change < 1, thus appearing to be up-regulated by the miRNA: these genes may be indirectly affected by miR-223). Right column: p-value, measured as in panel C (median across all probe sets for that gene).
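The additive model in panel B (measured signal = biological value + technical noise) can be illustrated with a simple variance subtraction. This is only a minimal sketch, assuming the technical noise is independent of the biological value; it is not the deconvolution procedure from the Supplemental Experimental Procedures. The function name `biological_variance`, the signal values, and the technical variance are all hypothetical.

```python
import statistics

def biological_variance(measured, technical_variance):
    """Estimate biological variance across individuals.

    Assumes measured = biological + technical noise, with the noise
    independent of the biological value, so that
    var(measured) = var(biological) + var(technical).
    """
    total = statistics.variance(measured)  # sample variance across individuals
    # A variance cannot be negative, so clip the estimate at zero.
    return max(0.0, total - technical_variance)

# Hypothetical log2 signals for one gene in five mice (not data from the paper).
signals = [8.0, 8.5, 7.5, 8.25, 7.75]
bio_var = biological_variance(signals, technical_variance=0.05)
print(bio_var)
```

Under this assumption, the square root of the estimate gives a biological standard deviation that can be set against the amplitude of miRNA-guided repression for the same gene.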
Figure 2.
Human haplo-insufficient genes tend to bear the most highly conserved miRNA binding sites. (A) Known haplo-insufficient genes in humans (Dang et al. 2008) exhibit more conserved miRNA binding sites than other human genes. (B) Conservation of miRNA binding sites correlates with the probability that a human gene is haplo-insufficient, as calculated by Huang et al. (2010) (genes were binned into boxplots according to their PCT for clarity). In every boxplot in this figure, the number of genes in each category is indicated inside the boxes. Note: Even though the PCT was initially defined as a probability (Friedman et al. 2009), values in the latest PCT data set (in TargetScan v7, described in Agarwal et al. [2015]) can be larger than 1.
Figure 3.
Identification of candidate miRNA-titrating mRNAs in differentiating C2C12 cells. (A) Left lanes: synthetic miR-1a and miR-206 (for calibration). (M) Size marker. Right lanes: 20 μg total RNA from differentiating C2C12 cells. (B) Quantification of three biological replicates of the experiment shown in panel A for each miRNA family (mean ± standard error). (C) Experimental identification of miR-1a/miR-206 and miR-133 targets in C2C12 cells. Cells were transfected with 2′-O-Me oligonucleotides directed against miR-1a and miR-206, against miR-133, or against no murine miRNA ("anti-Ø"). mRNAs immunoprecipitated with AGO proteins were quantified by poly(A)-independent RNA-seq. (D) Identified miRNA targets for miR-1a/miR-206 (top panel) and miR-133 (bottom panel). Red: mRNAs with 3′ UTR perfect seed matches. Blue: mRNAs whose best 3′ UTR match is one of the top three enriched imperfect matches (CNATTCC, CATNCC, or CNTTCC for miR-1a/miR-206; GACCANA, GNACCAA, or GACNCAA for miR-133). (E) Free and bound miRNA concentrations were calculated from our measures, and after conceptual loss of the miRNA binding site of interest. (F) Binding sites that exert the highest titrating activity (>10% increase in free miRNA concentration if site is lost).
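The imperfect matches listed in panel D use N as a wildcard for any nucleotide. As a rough illustration of how such a motif can be located in a 3′ UTR, the sketch below converts the motif into a regular expression. The function name `find_motif` and the example sequence are hypothetical, and the sequence is assumed to be uppercase DNA; this is not the authors' pipeline.

```python
import re

def find_motif(sequence, motif):
    """Return 0-based start positions of `motif` in `sequence`.

    'N' in the motif matches any of A, C, G, T. A lookahead is used so
    that overlapping occurrences are reported as well.
    """
    pattern = "".join("[ACGT]" if base == "N" else base for base in motif)
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]

# Made-up 3' UTR fragment, for illustration only.
utr = "GGCAATTCCTTACGTTCCAA"
print(find_motif(utr, "CNATTCC"))
print(find_motif(utr, "CNTTCC"))
```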
Figure 4.
Tmsb4x efficiently titrates miR-1a/miR-206 in differentiated C2C12 cells. (A) Mutagenesis strategy. A luciferase reporter and G418 resistance cassette was introduced 286 bp downstream from the Tmsb4x poly(A) signal, and the Tmsb4x 3′ UTR was replaced by a copy where the miR-1a/miR-206 seed match is either replaced by itself ("wt") or by a hexamer that is not matched by any known murine miRNA seed ("mutant"). (B) Luciferase activity was assessed in each of the five wild-type and four mutant polyclonal cell lines after differentiation. Each cell line was analyzed as 12 technical replicates (replicates for the same cell line are represented by the same symbol and same color). Mixed-effect linear modeling (taking into account heteroscedasticity within each genotype) shows that genotype of the Tmsb4x miR-1a/miR-206 binding site has a significant effect on reporter activity (p = 0.0285).
Figure 5.
The most highly expressed genes tend to bear the most highly conserved miRNA binding sites. (A) Volcano plots represent correlation coefficients between microarray signal and the aggregate probability of conserved targeting (PCT) for each mRNA (x-axis), and their p-values (y-axis). Each miRNA family is represented by a circle. p-values were adjusted using the Benjamini-Hochberg correction. The dotted red line indicates an adjusted p-value of 0.05 and the dotted black line indicates a correlation coefficient of zero. Adjusted p-values lower than 2.2 × 10^−16 were set to 2.2 × 10^−16 for graphical clarity. (B) Same conventions as in panel A, but miRNA families with highly specific expression patterns are colored (red: miRNA specific for another tissue than the one analyzed; blue: miRNA specific for the analyzed tissue).
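For readers unfamiliar with the Benjamini-Hochberg correction mentioned in panel A, the standard step-up adjustment can be sketched in a few lines. This is a generic textbook implementation, not the authors' code; the function name `benjamini_hochberg` and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, scaling each by
    # n / rank and keeping the adjusted values monotonic.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values, one per miRNA family, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print(adj)
```

An adjusted value below 0.05 would fall above the dotted red line in the volcano plots.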
Figure 6.
Computationally identified conserved seed matches are frequently more conserved than miRNA seeds themselves. (A) The miR-134 family is specific to placental mammals, but its predicted binding site in USP9X is more broadly conserved. (B) Four vertebrate clades had enough clade-specific miRNA families for a detailed statistical analysis (10 Hominidae-specific families, 14 Catarrhini-specific families, 14 Boreoeutheria-specific families, 10 Euteolostomi-specific families) (see Supplemental Fig. S7). Each point in the boxplot represents an miRNA seed family. The proportion of overconserved 3′ UTR seed matches is defined as the fraction of matches that are conserved in at least one species outside the clade of interest. (C) Proportion of overconserved seed matches among the predictions of several miRNA target prediction programs. Note that PicTar2 ignores Hominidae- and Catarrhini-specific miRNAs, while TargetScan predicts and ranks targets of mammalian-specific miRNAs without using phylogenetic conservation. In order to make every program output comparable, analyses were restricted to perfect seed matches in 3′ UTRs, excluding matches that overlap exon–exon junctions (see Supplemental Table S5 for detailed statistics). (D) 3′ UTR seed matches were analyzed as in panel B, but each group of clade-specific seeds was scored for conserved seed matches outside each of the four clades. Nonseed hexamers (i.e., hexamers that do not constitute the seed of any vertebrate miRNA in miRBase 21) were analyzed identically.
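The definition in panel B (a seed match is overconserved if it is conserved in at least one species outside the clade of interest) translates directly into a set operation. The sketch below assumes each seed match is represented by the set of species in which it is conserved; the function name `overconserved_fraction`, the species names, and the data are hypothetical.

```python
def overconserved_fraction(matches, clade):
    """Fraction of seed matches conserved in at least one species outside `clade`.

    `matches` is a list of sets, one per seed match, each holding the
    species in which that match is conserved. `clade` is the set of
    species belonging to the clade of interest.
    """
    if not matches:
        return 0.0
    # A non-empty set difference means conservation outside the clade.
    over = sum(1 for species in matches if species - clade)
    return over / len(matches)

# Made-up example: a three-species clade and four seed matches.
clade = {"human", "chimp", "gorilla"}
matches = [
    {"human", "chimp"},
    {"human", "mouse"},
    {"human", "chimp", "dog"},
    {"human"},
]
print(overconserved_fraction(matches, clade))
```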

References

    1. Agarwal V, Bell GW, Nam JW, Bartel DP. 2015. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4: e05005.
    2. Alvarez-Saavedra E, Horvitz HR. 2010. Many families of C. elegans microRNAs are not essential for development or viability. Curr Biol 20: 367–373.
    3. Ameres SL, Horwich MD, Hung JH, Xu J, Ghildiyal M, Weng Z, Zamore PD. 2010. Target RNA–directed trimming and tailing of small silencing RNAs. Science 328: 1534–1539.
    4. Baek D, Villén J, Shin C, Camargo FD, Gygi SP, Bartel DP. 2008. The impact of microRNAs on protein output. Nature 455: 64–71.
    5. Bartel DP. 2009. MicroRNAs: target recognition and regulatory functions. Cell 136: 215–233.