EMBO J. 2013 May 15;32(10):1478-88.
doi: 10.1038/emboj.2013.79. Epub 2013 Apr 12.

Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology

Catharina Steentoft et al. EMBO J.

Abstract

Glycosylation is the most abundant and diverse posttranslational modification of proteins. While several types of glycosylation can be predicted by the protein sequence context, and substantial knowledge of these glycoproteomes is available, our knowledge of the GalNAc-type O-glycosylation is highly limited. This type of glycosylation is unique in being regulated by 20 polypeptide GalNAc-transferases attaching the initiating GalNAc monosaccharides to Ser and Thr (and likely some Tyr) residues. We have developed a genetic engineering approach using human cell lines to simplify O-glycosylation (SimpleCells) that enables proteome-wide discovery of O-glycan sites using 'bottom-up' ETD-based mass spectrometric analysis. We implemented this on 12 human cell lines from different organs, and present a first map of the human O-glycoproteome with almost 3000 glycosites in over 600 O-glycoproteins as well as an improved NetOGlyc4.0 model for prediction of O-glycosylation. The finding of unique subsets of O-glycoproteins in each cell line provides evidence that the O-glycoproteome is differentially regulated and dynamic. The greatly expanded view of the O-glycoproteome should facilitate the exploration of how site-specific O-glycosylation regulates protein function.

Conflict of interest statement

The authors declare that they have no conflict of interest.

Figures

Figure 1
The SimpleCell O-GalNAc glycoproteomics strategy. (A) SimpleCell lines originating from different organs, as illustrated, were generated by ZFN-mediated knockout of COSMC. (B) SimpleCells (SC) express homogeneous truncated O-glycans (Tn/STn). After protease digestion with trypsin or chymotrypsin, glycopeptides from cell lysates and from conditioned media are treated with neuraminidase to remove sialic acids if necessary. The resulting GalNAc-glycopeptides are isolated by LWAC and separated by IEF prior to nLC-MS/MS analysis. *Glycoproteins from conditioned media were pre-concentrated by passing the media over a short VVA column (see Extended Experimental Procedures). (C) The SimpleCell strategy identified 629 glycoproteins (not including possible GlcNAc cases), with only 48 in common with glycoproteins previously reported in UniProt (106 total) and O-GLYCBASE (78 total). Of the 771 glycoproteins in these three data sets, only 526 are predicted to have a signal peptide by the SignalP predictor.
Figure 2
Number of O-glycoproteins and O-glycosites identified in individual SimpleCell lines. The distributions of glycoproteins (A) and unambiguous O-glycosites (B) identified in the secretome, in TCLs, and in both are shown. Total numbers are given for each cell line, with identifications unique to that cell line in parentheses. Chymotrypsin data (Capan-1 and HEK293) are omitted.
Figure 3
Cellular distribution of O-glycoproteins and O-glycosites. (A) Distribution of identified glycoproteins and sites by number of cell lines. Chymotrypsin data (Capan-1 and HEK293) are omitted, and only unambiguously assigned sites are included. (B) Number of O-glycosites identified per number of proteins (all sites included).
Figure 4
Cellular component GO analysis. Cellular components that are significantly over-represented (A) or under-represented (B) in the SimpleCell data set compared to the entire human proteome, as determined with the BinGO plugin for Cytoscape (www.cytoscape.org). The individual annotations for each protein have not been validated manually, as the analysis is intended only to show relative representation. EC, extracellular; ECM, extracellular matrix.
Figure 5
Graphic depiction of glycosites in proteins by a novel GlycoDomainViewer. Representative examples of selected O-glycoproteins are depicted using the GlycoDomainViewer. O-GalNAc sites identified in SC are shown on the upper side of each protein, while sites predicted by NetOGlyc4.0 are shown on the lower side. N-glycan sites, experimentally verified as well as predicted, are obtained from UniProt. Designations are as follows: SP, signal peptide; TNFR, tumour necrosis factor receptor; TM, transmembrane; alpha_CA, carbonic anhydrase alpha; FN3, fibronectin type 3; HSP70, heat shock protein 70; MIR, domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Glyco_Hydro_47, glycosyl hydrolase family 47; PG_binding_1, putative peptidoglycan-binding domain; ZnMc_MMP, zinc-dependent metalloprotease; Hemop, hemopexin; FZ, frizzled; LDLa, low-density lipoprotein receptor domain class A; EGF, calcium-binding EGF domain; LY, low-density lipoprotein-receptor YWTD domain. The listed proteins are TNF1B (tumour necrosis factor receptor superfamily member 1B, P20333), PTPRG (receptor-type tyrosine-protein phosphatase gamma, P23470), GRP78 (78 kDa glucose-regulated protein, P11021), SDF2L (stromal cell-derived factor 2-like protein 1, Q9HCN8), MAN1C1 (mannosyl-oligosaccharide 1,2-alpha-mannosidase IC, Q9NR34), MMP15 (matrix metalloproteinase-15, P51511), FZD2 (frizzled-2, Q14332), LDLR (low-density lipoprotein receptor, P01130), and VLDLR (very low-density lipoprotein receptor, P98155).
Figure 6
NetOGlyc4.0 predictor performance. (A) A comparison of the performance of NetOGlyc 3.1 and the novel v4.0 predictors on three different data sets. CV MCC values indicated. The v3.1 shows poor sensitivity with the SimpleCell data set. By comparison, v4.0 exhibits a marked general improvement in sensitivity when testing with the O-GLYCBASE subset used to train v3.1, annotated experimental O-GalNAc from the curated subset of UniProt, as well as the SimpleCell data set. (B) Comparative analysis of predictors on the SignalP proteome (human curated UniProt). (C) Comparative analysis of O-glycoprotein and O-glycosite predictions.
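
The Figure 6 caption compares predictors by MCC and by sensitivity. As a minimal sketch of how these two metrics are conventionally computed from per-site predictions, the Python below uses the standard textbook definitions of the Matthews correlation coefficient and of sensitivity. The function names and the example counts are hypothetical and are not taken from the paper, and the source does not state exactly how its values were calculated.

```python
import math


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/fn: true glycosites predicted as sites / missed.
    fp/tn: non-sites predicted as sites / correctly rejected.
    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true sites that the predictor recovers."""
    total = tp + fn
    return tp / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the paper).
    tp, fp, tn, fn = 30, 20, 940, 10
    print("MCC:", matthews_corrcoef(tp, fp, tn, fn))
    print("Sensitivity:", sensitivity(tp, fn))
```

MCC is often preferred over plain accuracy for site prediction because glycosylated Ser/Thr residues are far outnumbered by unmodified ones, and MCC stays informative under that class imbalance.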
Figure 7
In-vitro analysis of GalNAc-T isoform substrate specificities. (A) One hundred and eighty-one peptide substrates from the SimpleCell data set were tested by in-vitro enzyme assays with recombinant GalNAc-Ts. Seventy-three peptides were not glycosylated by any of GalNAc-T1, T2, or T3. When five additional enzymes (GalNAc-T5, T11, T12, T14, and T16) were tested, only five more peptides were glycosylated. (B) Number of unique peptide substrates for each GalNAc-T isoform.
