TRG_NLS_MonoExtN_4

Accession:	ELME000271
Functional site class:	NLS classical Nuclear Localization Signals
Functional site description:	Many nuclear proteins possess a nuclear localisation signal (NLS) that is recognised by the importer protein importin-alpha. The NLS motif is primarily composed of basic residues and is found in two main variants: a monopartite form and a bipartite form with two short basic segments separated by a flexible linker. Importin-alpha is itself an adaptor for the nuclear transport receptor importin-beta. The latter is docked on the cytosolic side of the nuclear pore via repetitive FG, FxFG and GLFG linear motifs found in several nucleoporin proteins (FG-Nups) (Terry,2009). The cargo loaded importin complexes translocate through the nuclear pore while remaining attached to the flexible FG-Nups. Finally, binding of RanGTP to importin-beta drives cargo release, with the importin-alpha still being bound to nucleoporins located on the nucleoplasmic side. Importin-alpha must be returned to the cytosol to repeat the process.
ELMs with same func. site:	TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 TRG_NLS_MonoExtN_4
ELM Description:	The monopartite nuclear localisation signal (mNLS) binds to the major site of importin-alpha. It always has a short basic cluster of lysines and arginines. Originally, the main positions were assigned as P1-P5. Now, the four positions P2-P5 are considered to form the core, with three of these positions always occupied by basic residues. Complementary charges as well as the size of the individual binding pockets in importin-alpha strictly control the P2 and P3 basic side chain preference. The P2 position is critical to anchor the motif: it must be occupied by a basic residue. Furthermore, if P2 is arginine, P3 must be lysine, whereas if P2 is lysine, P3 may be arginine or lysine. At least one of P4 and P5 must be a basic residue; there are preferences for hydrophobic residues or proline and against acidic residues at P4 and P5. Outside the core motif, there are additional preferences which, though weaker, clearly play a role in NLS binding affinity, especially at P1, P6 and P7. P1 and P6 both have a preference for basic, Pro and hydrophobic residues. Acidic residues are almost never found in P1 and P6, whereas P7 has a clear preference for acidic residues: e.g. for c-myc (320-PAAKRVKLD-328), the lysine at P5 forms an intramolecular salt bridge with the aspartate residue (position 328) at P7, thus stabilizing the P5 residue by neutralizing its charge (Conti,2000). If positions with weaker preferences are all unfavourable, it is likely that P2-P5 must be four consecutive basic residues to achieve the required binding energy (possible alternative: Pro in P4). The rejection of acidic residues in most positions around the NLS may allow some NLS activities to be regulated by phosphorylation, as Ser and Thr residues are quite often found in the non-core positions. For ease of understanding, the monopartite NLS has been split into three regular expressions which collectively capture the core motif and the adjacent preferences.
Pattern:	`(([PKR].{0,1}[^DE])|([PKR]))((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE]`
Pattern Probability:	`0.0012764`
Present in taxon:	Eukaryota
Interaction Domain:	Arm (PF00514) Armadillo/beta-catenin-like repeat (Stoichiometry: 1 : 1) PDB Structure: 1EE4
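
As a minimal sketch of how the pattern above can be applied, the following Python snippet compiles the expression (written with plain `|` alternation) and scans two subsequences copied from the instance table of this entry. The names `MONO_EXT_N` and `find_matches` are hypothetical and not part of any ELM tool; a regular expression hit is only a candidate site and says nothing about accessibility or experimental support.

```python
import re

# Pattern of this entry, written with plain alternation bars.
MONO_EXT_N = re.compile(
    r"(([PKR].{0,1}[^DE])|([PKR]))((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE]"
)


def find_matches(sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group(0)) for m in MONO_EXT_N.finditer(sequence)]


# Subsequences taken from the instance table (SV40 large T antigen, human MYC).
sv40 = "SQHSTPPKKKRKVEDPKDFP"
myc = "STRKDYPAAKRVKLDSVRVL"

print(find_matches(sv40))
print(find_matches(myc))
```

The reported start positions are offsets within the 20-residue subsequence, not protein coordinates. Because the search returns the leftmost match, a hit may extend further N-terminally than the annotated instance.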

o Abstract

Proteins are synthesised in the cytosol, so they must travel across membranes to reach other cell compartments. Proteins that function in the nucleus pass through the nuclear envelope via the nuclear pores. Small proteins might be capable of diffusing through the pores, although the process is likely to be inefficient. Therefore, almost all well-studied nuclear proteins are transported into the nucleus using active translocation through the pore. This requires a targeting motif to bind the transport machinery. Proteins that enter the nucleus in preformed complexes may not require their own targeting motif (Dingwall,1982). Nevertheless, it is clear that most proteins do specify their own import signal and, of these, the vast majority have an NLS that binds to importin-alpha (also termed karyopherin-alpha). Importin-alpha constitutes a multiprotein family in Metazoa with generally similar but perhaps not identical binding specificities (Mason,2009). Importin-alpha is an adaptor protein for importin-beta, which interacts with nuclear pore components to effect transport into the nucleus. Several other proteins such as snurportin and transportin may be considered as specialised importin-beta adaptors for specific cargoes like snRNP complexes, while some proteins may interact directly with importin-beta to effect their import (e.g. the viral protein HIV Rev (Henderson,1997)).
A striking feature of protein transfer through the nuclear pore is that key roles are played by several linear motifs. Besides the NLS for nuclear import, the CRM1-binding NES motif is found in proteins that are re-exported, while the FG, FxFG and GLFG repeating motifs are found in a subset of nucleoporins (FG-Nups) and are docking sites for the transfer complexes (Terry,2009). The FG-Nups are large, predominantly natively disordered proteins that line the inner pore, extending projections into both the cytosol and nucleoplasm. Importin-beta makes multivalent interactions with motifs in the FG-Nups. Importin-beta stays docked to the FG-Nups while it translocates through the pore. Additional regulatory interactions involving globular interfaces, e.g. with the RAN GTPase, effect allosteric rearrangements of the transport receptors. Overall the nuclear transfer systems are highly cooperative, which is the hallmark of robust cellular operations requiring multiple regulatory inputs to carefully control each step in the process.
The "classical" or "conventional" importin-alpha binding NLS is found in the majority of nuclear proteins. It was one of the earliest linear motifs to be described. The dominant feature of the motif is its highly basic nature. It exists in two main variants: a "monopartite" form (mNLS) with a single cluster of basic amino acids, originally found in the sequences of the large T Antigen (SV40) and E1A (Adenovirus) (Kalderon,1984; Smith,1985), and a "bipartite" form (bNLS) with two basic clusters separated by a short linker region, first analysed in Nucleoplasmin (Dingwall,1991). NLS motifs can be found anywhere within a protein sequence provided that the location is well exposed: in practice, they are nearly always in segments of native disorder. Possession of an NLS is sufficient for targeting a protein located in the cytosol into the nucleus. As a consequence, proteins whose localization is normally restricted to the cytoplasm can be experimentally directed to the nucleus by fusing an NLS motif onto them. Nuclear import may be very fast but can also be slow, particularly if it is regulated. For example, the NLS of the Polyomavirus VP1 protein mediated nuclear localization within one minute (Chang,1992). In contrast, for the Adenovirus E1A protein, nuclear localization required up to 30 minutes (Lyons,1987).
In nuclear proteins, sequences matching the mNLS are more common than those matching the bNLS. Structures of NLS peptides in complex with importin-alpha show that the site binding the mNLS also binds the second basic cluster of the bNLS. In the ELM NLS entries, we will refer to this binding site as the "major importin site", in contrast to its interacting counterpart, the "major NLS site" (and similarly to the "minor importin site" and the "minor NLS site" respectively). The NLS itself is known to bind in a predominantly extended conformation (Fontes,2003).
Although basic amino acids predominate in the NLS, other amino acids can contribute to the binding interaction, resulting in sufficient variation that accurate definition of motif patterns has not been straightforward. Historically, five positions P1-P5 were used to designate the major importin site (Fontes,2000). Earlier NLS pattern search applications either assumed a fuzzy distribution of the basic residues at these five positions (Nakai,1999) or else used very restricted definitions (Chelsky,1989). Although it continued to be used for designing pattern search algorithms (Seiler,2006), such a fuzzy characterization remained unsatisfying from a biochemical point of view. The availability of several crystal structures of NLS peptides in complex with importin-alpha (Conti,2000; Fontes,2003; Tarendeau,2007) has led to a better understanding, with the positions P2-P5 now designated as the core of the major importin site. The binding interaction is anchored at positions P2 and P3: these are the only positions where basic amino acids are obligatory. In positions P4 and P5, basic residues are preferred to the extent that one of these positions must be Arg or Lys. Pro and hydrophobic residues are reasonable substitutes at P4 and P5. As P4 is the position of the NLS core where most amino acid variation is allowed, we regard [KR][KR]X[KR] to be stronger than [KR][KR][KR]X. Examination of binding peptides, mutational studies and sequence alignments of NLS-containing proteins indicates that adjacent positions modulate the major site by the presence or absence of favoured residues: when a hydrophobic residue is found at P4 or P5, favoured residues must be present either in residues preceding P2-P5, or else in positions P6 and P7. As a third option, if these adjacent positions lack favoured residues, then the NLS must be bipartite, with a pair of basic residues occupying the minor site.
There are also strongly disfavoured amino acids - in particular the acidic residues (Asp, Glu) appear to be rejected from P1, P4, P5 and P6.
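The core rules for P2-P5 described in this entry can be written down as a small check. The Python sketch below is one reading of the prose (P2 basic; Arg at P2 requires Lys at P3; at least one of P4 and P5 basic; no acidic residue at P4 or P5) and is not ELM's own implementation. The function name `core_follows_rules` is hypothetical, and the weaker flanking preferences at P1, P6 and P7 are deliberately ignored.

```python
BASIC = set("KR")
ACIDIC = set("DE")


def core_follows_rules(core):
    """Check a four-residue string (P2, P3, P4, P5) against the stated core rules."""
    if len(core) != 4:
        raise ValueError("core must contain exactly four residues (P2-P5)")
    p2, p3, p4, p5 = core
    # P2 anchors the motif and must be basic.
    if p2 not in BASIC:
        return False
    # If P2 is Arg, P3 must be Lys; if P2 is Lys, P3 may be Arg or Lys.
    if p2 == "R" and p3 != "K":
        return False
    if p2 == "K" and p3 not in BASIC:
        return False
    # At least one of P4 and P5 must be basic.
    if p4 not in BASIC and p5 not in BASIC:
        return False
    # Acidic residues are rejected at P4 and P5.
    if p4 in ACIDIC or p5 in ACIDIC:
        return False
    return True


# c-myc core (KRVK) taken from the description; the others are made-up test strings.
for core in ("KRVK", "RKRK", "RRVK", "KRVL", "KRDK"):
    print(core, core_follows_rules(core))
```

These conditions correspond to the `((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))` portion of the pattern, so the function is mainly useful for seeing which individual rule a candidate core fails.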
In principle, it would be possible to assemble the NLS amino acid preferences into a single regular expression. However, this would be overly complicated and difficult to understand. Therefore, for the NLS in ELM, the residue preferences have been split into four patterns: three mNLS and one bNLS. Often a given NLS will match multiple ELM NLS patterns.
For proteins that shuttle in and out of the nucleus, it is likely that the NLS can be conditionally inactivated; otherwise they would be immediately reimported into the nucleus without executing their cytosolic function. Conditionally inactive NLSes may also be required for the many transcription factors used in transient signalling pathways that are initially targeted to the cytosolic side of the plasma membrane (Fabbro,2003). Upon cell surface receptor stimulation, they are transferred to the nucleus, where they transiently regulate target genes before being ubiquitinated and destroyed by the proteasome. Such proteins must be able to remain outside the nucleus indefinitely, and therefore the NLS is likely to be activated only when the transfer signal is made. Most probably, the main way to control NLS availability is by phosphorylation of the motif at positions that cannot be negatively charged, e.g. P1, P4, P5, P6, and possibly at other nearby residue positions (Fontes,2003; Sorokin,2007).
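To illustrate the idea that Ser and Thr residues in or near the motif could act as regulatory sites, the sketch below lists Ser/Thr residues inside each pattern match and within a few flanking residues. The function name `ser_thr_candidates` and the window size `flank=3` are arbitrary assumptions made for illustration. The presence of Ser or Thr does not by itself indicate that a site is phosphorylated or that the NLS is regulated.

```python
import re

# Pattern of this entry, written with plain alternation bars.
MONO_EXT_N = re.compile(
    r"(([PKR].{0,1}[^DE])|([PKR]))((K[RK])|(RK))(([^DE][KR])|([KR][^DE]))[^DE]"
)


def ser_thr_candidates(sequence, flank=3):
    """List (0-based index, residue) for Ser/Thr in or near each pattern match."""
    hits = []
    for m in MONO_EXT_N.finditer(sequence):
        # Extend the matched span by `flank` residues on both sides,
        # clipped to the ends of the sequence.
        lo = max(0, m.start() - flank)
        hi = min(len(sequence), m.end() + flank)
        hits.extend((i, sequence[i]) for i in range(lo, hi) if sequence[i] in "ST")
    return hits


# SV40 large T antigen subsequence from the instance table.
print(ser_thr_candidates("SQHSTPPKKKRKVEDPKDFP"))
```

Indices are offsets within the supplied string; mapping them to protein coordinates requires knowing where the subsequence starts in the full-length protein.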

o 21 selected references:

The use of additive and subtractive approaches to examine the nuclear localization sequence of the polyomavirus major capsid protein VP1.
Chang D, Haynes JI 2nd, Brady JN, Consigli RA
Virology 1992 Aug; 189 (2), 821-7
PMID: 1322607
Nuclear targeting sequences--a consensus?
Dingwall C, Laskey RA
Trends Biochem Sci 1991 Dec; 16 (12), 478-81
PMID: 1664152
Two interdependent basic domains in nucleoplasmin nuclear targeting sequence: identification of a class of bipartite nuclear targeting sequence.
Robbins J, Dilworth SM, Laskey RA, Dingwall C
Cell 1991 Feb 8; 64 (3), 615-23
PMID: 1991323
Sequence requirements for synthetic peptide-mediated translocation to the nucleus.
Chelsky D, Ralph R, Jonak G
Mol Cell Biol 1989 Jun; 9 (6), 2487-92
PMID: 2668735
The nuclear location signal.
Smith AE, Kalderon D, Roberts BL, Colledge WH, Edge M, Gillett P, Markham A, Paucha E, Richardson WD
Proc R Soc Lond B Biol Sci 1985 Oct 22; 226 (1242), 43-58
PMID: 2866523
Pentapeptide nuclear localization signal in adenovirus E1a.
Lyons RH, Ferguson BQ, Rosenberg M
Mol Cell Biol 1987 Jul; 7 (7), 2451-6
PMID: 3614197
A short amino acid sequence able to specify nuclear location.
Kalderon D, Roberts BL, Richardson WD, Smith AE
Cell 1984 Dec; 39 (3), 499-509
PMID: 6096007
A polypeptide domain that specifies migration of nucleoplasmin into the nucleus.
Dingwall C, Sharnick SV, Laskey RA
Cell 1982 Sep; 30 (2), 449-58
PMID: 6814762
Interactions between HIV Rev and nuclear import and export factors: the Rev nuclear localisation signal mediates specific binding to human importin-beta.
Henderson BR, Percipalle P
J Mol Biol 1997 Dec 19; 274 (5), 693-707
PMID: 9405152
PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization.
Nakai K, Horton P
Trends Biochem Sci 1999 Jan; 24 (1), 34-6
PMID: 10087920
Crystallographic analysis of the specific yet versatile recognition of distinct nuclear localization signals by karyopherin alpha.
Conti E, Kuriyan J
Structure 2000 Mar 15; 8 (3), 329-38
PMID: 10745017
Structural basis of recognition of monopartite and bipartite nuclear localization sequences by mammalian importin-alpha.
Fontes MR, Teh T, Kobe B
J Mol Biol 2000 Apr 14; 297 (5), 1183-94
PMID: 10764582
Regulation of tumor suppressors by nuclear-cytoplasmic shuttling.
Fabbro M, Henderson BR
Exp Cell Res 2003 Jan 15; 282 (2), 59-69
PMID: 12531692
Structural basis for the specificity of bipartite nuclear localization sequence binding by importin-alpha.
Fontes MR, Teh T, Jans D, Brinkworth RI, Kobe B
J Biol Chem 2003 Jul 25; 278 (30), 27981-7
PMID: 12695505
Role of flanking sequences and phosphorylation in the recognition of the simian-virus-40 large T-antigen nuclear localization sequences by importin-alpha.
Fontes MR, Teh T, Toth G, John A, Pavo I, Jans DA, Kobe B
Biochem J 2003 Oct 15; 375, 339-49
PMID: 12852786
Phospholipid scramblase 1 contains a nonclassical nuclear localization signal with unique binding site in importin alpha.
Chen MH, Ben-Efraim I, Mitrousis G, Walker-Kopp N, Sims PJ, Cingolani G
J Biol Chem 2005 Mar 18; 280 (11), 10599-606
PMID: 15611084
The 3of5 web application for complex and comprehensive pattern matching in protein sequences.
Seiler M, Mehrle A, Poustka A, Wiemann S
BMC Bioinformatics 2006; 7, 144
PMID: 16542452
Structure and nuclear import function of the C-terminal domain of influenza virus polymerase PB2 subunit.
Tarendeau F, Boudet J, Guilligay D, Mas PJ, Bougault CM, Boulo S, Baudin F, Ruigrok RW, Daigle N, Ellenberg J, Cusack S, Simorre JP, Hart DJ
Nat Struct Mol Biol 2007 Mar; 14 (3), 229-33
PMID: 17310249
Nucleocytoplasmic transport of proteins.
Sorokin AV, Kim ER, Ovchinnikov LP
Biochemistry (Mosc) 2007 Dec; 72 (13), 1439-57
PMID: 18282135
Evolution of the metazoan-specific importin alpha gene family.
Mason DA, Stage DE, Goldfarb DS
J Mol Evol 2009 Apr; 68 (4), 351-65
PMID: 19308634
Flexible gates: dynamic topologies and functions for FG nucleoporins in nucleocytoplasmic transport.
Terry LJ, Wente SR
Eukaryot Cell 2009 Dec; 8 (12), 1814-27
PMID: 19801417

o 8 GO-Terms:

Biological Process:
Protein-Nucleus Import (also annotated in these classes: MOD_PRMT_GGRGG_1 MOD_SUMO_for_1 MOD_SUMO_rev_2 TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Nucleocytoplasmic Transport (also annotated in these classes: TRG_NES_CRM1_1 TRG_NESrev_CRM1_2 TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Nls-Bearing Substrate Import Into Nucleus (also annotated in these classes: TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Cellular Compartment:
Nuclear Pore (also annotated in these classes: LIG_GLEBS_BUB3_1 TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Nucleus (also annotated in these classes: CLV_C14_Caspase3-7 CLV_Separin_Fungi CLV_Separin_Metazoa CLV_TASPASE1 DEG_APCC_DBOX_1 DEG_APCC_KENBOX_2 DEG_APCC_TPR_1 DEG_Cend_DCAF12_1 DEG_Cend_FEM1AC_1 DEG_Cend_FEM1B_2 DEG_Cend_KLHDC2_1 DEG_Cend_TRIM7_1 DEG_COP1 DEG_COP1_1 DEG_CRL4_CDT2_1 DEG_CRL4_CDT2_2 DEG_Kelch_Keap1_1 DEG_Kelch_Keap1_2 DEG_MDM2_SWIB_1 DEG_ODPH_VHL_1 DEG_SCF_COI1_1 DEG_SCF_FBW7_1 DEG_SCF_FBW7_2 DEG_SCF_FBXO31_1 DEG_SCF_SKP2-CKS1_1 DEG_SCF_TIR1_1 DEG_SCF_TRCP1_1 DEG_SIAH_1 DEG_SPOP_SBC_1 DOC_ANK_TNKS_1 DOC_CDC14_PxL_1 DOC_CKS1_1 DOC_CYCLIN_D_Helix_1 DOC_CYCLIN_RevRxL_6 DOC_CYCLIN_RxL_1 DOC_CYCLIN_yClb1_LxF_4 DOC_CYCLIN_yClb3_PxF_3 DOC_CYCLIN_yClb5_NLxxxL_5 DOC_CYCLIN_yCln2_LP_2 DOC_MAPK_DCC_7 DOC_MAPK_FxFP_2 DOC_MAPK_gen_1 DOC_MAPK_GRA24_9 DOC_MAPK_HePTP_8 DOC_MAPK_JIP1_4 DOC_MAPK_MEF2A_6 DOC_MAPK_NFAT4_5 DOC_MAPK_RevD_3 DOC_PIKK_1 DOC_PP1_MyPhoNE_1 DOC_PP1_RVXF_1 DOC_PP1_SILK_1 DOC_PP2A_B56_1 DOC_PP2A_KARD_1 DOC_PP2B_LxvP_1 DOC_PP2B_PxIxIT_1 DOC_PP4_FxxP_1 DOC_PP4_MxPP_1 DOC_USP7_MATH_1 DOC_USP7_MATH_2 DOC_USP7_UBL2_3 DOC_WW_Pin1_4 LIG_14-3-3_CanoR_1 LIG_14-3-3_ChREBP_3 LIG_14-3-3_CterR_2 LIG_ANK_PxLPxL_1 LIG_APCC_ABBA_1 LIG_APCC_Cbox_1 LIG_APCC_Cbox_2 LIG_ARL_BART_1 LIG_ARS2_EDGEI_1 LIG_BRCT_BRCA1_1 LIG_BRCT_BRCA1_2 LIG_BRCT_MDC1_1 LIG_CaM_1-14-15-16_REV_1 LIG_CaMK_CASK_1 LIG_CORNRBOX LIG_CSL_BTD_1 LIG_CtBP_PxDLS_1 LIG_CtBP_RRT_2 LIG_DCNL_PONY_1 LIG_EF_ALG2_ABM_1 LIG_EF_ALG2_ABM_2 LIG_EH1_1 LIG_FHA_1 LIG_FHA_2 LIG_GLEBS_BUB3_1 LIG_HCF-1_HBM_1 LIG_HOMEOBOX LIG_HP1_1 LIG_IRF7_LxLS_2 LIG_IRFs_LxIS_1 LIG_KEPE_1 LIG_KEPE_2 LIG_KEPE_3 LIG_LEDGF_IBM_1 LIG_LSD1_SNAG_1 LIG_MAD2 LIG_Menin_MBM1_1 LIG_MLH1_MIPbox_1 LIG_MSH2_SHIPbox_1 LIG_MTR4_AIM_1 LIG_Mtr4_Air2_1 LIG_Mtr4_Trf4_1 LIG_Mtr4_Trf4_2 LIG_MYND_1 LIG_MYND_2 LIG_MYND_3 LIG_NBox_RRM_1 LIG_NRBOX LIG_Nrd1CID_NIM_1 LIG_PALB2_WD40_1 LIG_PCNA_APIM_2 LIG_PCNA_PIPBox_1 LIG_PCNA_TLS_4 LIG_PCNA_yPIPBox_3 LIG_PTAP_UEV_1 LIG_RBL1_LxSxE_2 LIG_RB_LxCxE_1 LIG_RB_pABgroove_1 LIG_REV1ctd_RIR_1 LIG_RPA_C_Plants 
LIG_RPA_C_Vert LIG_RRM_PRI_1 LIG_Rrp6Rrp47_Mtr4_1 LIG_Sin3_1 LIG_Sin3_2 LIG_Sin3_3 LIG_SUFU_1 LIG_SUMO_SIM_anti_2 LIG_SUMO_SIM_par_1 LIG_TPR LIG_Trf4_IWRxY_1 LIG_TRFH_1 LIG_UBA3_1 LIG_ULM_U2AF65_1 LIG_VCP_SHPBox_1 LIG_VCP_VBM_3 LIG_VCP_VIM_2 LIG_WD40_WDR5_VDV_1 LIG_WD40_WDR5_VDV_2 LIG_WD40_WDR5_WIN_1 LIG_WD40_WDR5_WIN_2 LIG_WD40_WDR5_WIN_3 LIG_WRPW_1 LIG_WRPW_2 LIG_WW_2 MOD_AAK1BIKe_LxxQxTG_1 MOD_CDC14_SPxK_1 MOD_CDK_SPK_2 MOD_CDK_SPxK_1 MOD_CDK_SPxxK_3 MOD_CK1_1 MOD_CK2_1 MOD_DYRK1A_RPxSP_1 MOD_GSK3_1 MOD_NEK2_1 MOD_NEK2_2 MOD_PIKK_1 MOD_PKA_1 MOD_PKA_2 MOD_PKB_1 MOD_PLK MOD_Plk_1 MOD_Plk_2-3 MOD_Plk_4 MOD_PRMT_GGRGG_1 MOD_ProDKin_1 MOD_SUMO_for_1 MOD_SUMO_rev_2 ELM:old_LIG_14-3-3_1 ELM:old_LIG_14-3-3_2 ELM:old_LIG_14-3-3_3 TRG_NES_CRM1_1 TRG_NESrev_CRM1_2 TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Nls-Dependent Protein Nuclear Import Complex (also annotated in these classes: TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Molecular Function:
Protein Binding (also annotated in these classes: CLV_C14_Caspase3-7 CLV_Separin_Fungi CLV_Separin_Metazoa DEG_APCC_TPR_1 DEG_Cend_DCAF12_1 DEG_Cend_FEM1AC_1 DEG_Cend_FEM1B_2 DEG_Cend_KLHDC2_1 DEG_Cend_TRIM7_1 DEG_COP1 DEG_COP1_1 DEG_CRBN_cyclicCter_1 DEG_CRL4_CDT2_1 DEG_CRL4_CDT2_2 DEG_ODPH_VHL_1 DEG_SCF_COI1_1 DEG_SCF_FBW7_1 DEG_SCF_FBW7_2 DEG_SCF_FBXO31_1 DEG_SCF_SKP2-CKS1_1 DEG_SCF_TIR1_1 DEG_SCF_TRCP1_1 DEG_SIAH_1 DOC_AGCK_PIF_1 DOC_AGCK_PIF_2 DOC_AGCK_PIF_3 DOC_ANK_TNKS_1 DOC_CKS1_1 DOC_MAPK_DCC_7 DOC_MAPK_GRA24_9 DOC_MAPK_HePTP_8 DOC_MAPK_JIP1_4 DOC_MAPK_MEF2A_6 DOC_MAPK_NFAT4_5 DOC_PIKK_1 DOC_PP1_MyPhoNE_1 DOC_PP1_RVXF_1 DOC_PP1_SILK_1 DOC_PP2A_B56_1 DOC_PP2A_KARD_1 DOC_PP2B_LxvP_1 DOC_RSK_DDVF_1 DOC_SPAK_OSR1_1 DOC_WD40_RPTOR_TOS_1 LIG_14-3-3_ChREBP_3 LIG_ActinCP_CPI_1 LIG_ActinCP_TwfCPI_2 LIG_ANK_PxLPxL_1 LIG_AP2alpha_1 LIG_AP2alpha_2 LIG_APCC_Cbox_1 LIG_APCC_Cbox_2 LIG_AP_GAE_1 LIG_ARL_BART_1 LIG_ARS2_EDGEI_1 LIG_BH_BH3_1 LIG_BIR_II_1 LIG_BIR_III_1 LIG_BIR_III_2 LIG_BIR_III_3 LIG_BIR_III_4 LIG_CaM_IQ_9 LIG_CaMK_CASK_1 LIG_CNOT1_NIM_1 LIG_deltaCOP1_diTrp_1 LIG_DLG_GKlike_1 LIG_Dynein_DLC8_1 LIG_EABR_CEP55_1 LIG_EF_ALG2_ABM_1 LIG_EF_ALG2_ABM_2 LIG_EH_1 LIG_eIF4E_1 LIG_eIF4E_2 LIG_EVH1_1 LIG_EVH1_2 LIG_FAT_LD_1 LIG_FHA_1 LIG_FHA_2 LIG_FXI_DFP_1 LIG_GLEBS_BUB3_1 LIG_HCF-1_HBM_1 LIG_IBAR_NPY_1 LIG_Integrin_isoDGR_2 LIG_IRF7_LxLS_2 LIG_IRFs_LxIS_1 LIG_KLC1_Yacidic_2 LIG_LEDGF_IBM_1 LIG_LIR_Apic_2 LIG_LIR_Gen_1 LIG_LIR_LC3C_4 LIG_LIR_Nem_3 LIG_LRP6_Inhibitor_1 LIG_LSD1_SNAG_1 LIG_LYPXL_L_2 LIG_LYPXL_S_1 LIG_LYPXL_SIV_4 LIG_LYPXL_yS_3 LIG_MAD2 LIG_Menin_MBM1_1 LIG_MLH1_MIPbox_1 LIG_MSH2_SHIPbox_1 LIG_MTR4_AIM_1 LIG_Mtr4_Air2_1 LIG_Mtr4_Trf4_1 LIG_Mtr4_Trf4_2 LIG_MYND_3 LIG_Nrd1CID_NIM_1 LIG_NRP_CendR_1 LIG_OCRL_FandH_1 LIG_PALB2_WD40_1 LIG_PDZ_Class_1 LIG_PDZ_Class_2 LIG_PDZ_Class_3 LIG_PDZ_Wminus1_1 LIG_Pex14_1 LIG_Pex14_2 LIG_Pex3_1 LIG_PTB_Apo_2 LIG_PTB_Phospho_1 LIG_RBL1_LxSxE_2 LIG_RB_pABgroove_1 LIG_REV1ctd_RIR_1 LIG_RPA_C_Plants LIG_RPA_C_Vert 
LIG_RuBisCO_WRxxL_1 LIG_SH2_CRK LIG_SH2_GRB2like LIG_SH2_NCK_1 LIG_SH2_SFK_2 LIG_SH2_SFK_CTail_3 LIG_SH2_STAP1 LIG_SH3_1 LIG_SH3_2 LIG_SH3_3 LIG_SH3_4 LIG_SH3_CIN85_PxpxPR_1 LIG_SH3_PxxDY_5 LIG_SPRY_1 LIG_SUFU_1 LIG_TRAF2like_MATH_loPxQ_2 LIG_TRAF2like_MATH_shPxQ_1 LIG_TRAF3_MATH_PxP_3 LIG_TRAF4_MATH_1 LIG_TRAF6_MATH_1 LIG_Trf4_IWRxY_1 LIG_UFM1_UFIM_1 LIG_VCP_SHPBox_1 LIG_VCP_VBM_3 LIG_VCP_VIM_2 LIG_Vh1_VBS_1 LIG_WD40_WDR5_VDV_1 LIG_WD40_WDR5_VDV_2 LIG_WD40_WDR5_WIN_1 LIG_WD40_WDR5_WIN_2 LIG_WD40_WDR5_WIN_3 LIG_WH1 LIG_WRC_WIRS_1 LIG_WW_1 LIG_WW_2 LIG_WW_3 MOD_Plk_2-3 MOD_Plk_4 MOD_PRMT_GGRGG_1 TRG_AP2beta_CARGO_1 TRG_Cilium_Arf4_1 TRG_Cilium_RVxP_2 TRG_DiLeu_BaEn_1 TRG_DiLeu_BaEn_2 TRG_DiLeu_BaEn_3 TRG_DiLeu_BaEn_4 TRG_DiLeu_BaLyEn_6 TRG_DiLeu_LyEn_5 TRG_ER_diLys_1 TRG_ER_FFAT_1 TRG_ER_FFAT_2 TRG_Golgi_diPhe_1 TRG_LysEnd_APsAcLL_1 TRG_LysEnd_APsAcLL_3 TRG_LysEnd_GGAAcLL_1 TRG_LysEnd_GGAAcLL_2 TRG_NES_CRM1_1 TRG_NESrev_CRM1_2 TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )
Nuclear Localization Sequence Binding (also annotated in these classes: TRG_NLS_Bipartite_1 TRG_NLS_MonoCore_2 TRG_NLS_MonoExtC_3 )

o 28 Instances for TRG_NLS_MonoExtN_4

Acc., Gene-, Name	Start	End	Subsequence	Logic	#Ev.	Organism	Notes
A0A0H3LYM0 BB4970 A0A0H3LYM0_BORBR	155	160	GTMLALPEKKKTKARSAEKA	TP	2	Bordetella bronchiseptica RB50
Q5ZUS4 legAS4 Q5ZUS4_LEGPH	22	28	RSKNDSKLKKKSALQSKFKE	TP	6	Legionella pneumophila subsp. pneumophila str. Philadelphia 1
Q13309 SKP2 SKP2_HUMAN	67	72	GHPESPPRKRLKSKGSDKDF	TP	8	Homo sapiens (Human)	2 2
O15527 OGG1 OGG1_HUMAN	333	339	SRHAQEPPAKRRKGSKGPEG	TP	3	Homo sapiens (Human)	1
P13051 UNG UNG_HUMAN	15	21	SFFSPSPARKRHAPSPEPAV	TP	2	Homo sapiens (Human)	1
Q9UBP0 SPAST SPAST_HUMAN	7	13	MNSPGGRGKKKGSGGASNPV	TP	2	Homo sapiens (Human)
Q62315 Jarid2 JARD2_MOUSE	106	111	DFEEGPSRKRPRLQAQRKFA	TP	2	Mus musculus (House mouse)
P38398 BRCA1 BRCA1_HUMAN	606	611	NIHNSKAPKKNRLRRKSSTR	TP	3	Homo sapiens (Human)
P38398 BRCA1 BRCA1_HUMAN	504	509	PLTNKLKRKRRPTSGLHPED	TP	3	Homo sapiens (Human)	1
P25054 APC APC_HUMAN	2049	2054	CISSAMPKKKKPSRLKGDNE	TP	2	Homo sapiens (Human)
P19838 NFKB1 NFKB1_HUMAN	361	366	KDKEEVQRKRQKLMPNFSDS	TP	2	Homo sapiens (Human)
P18870 JUN JUN_CHICK	251	257	NRIAASKCRKRKLERIARLE	TP	2	Gallus gallus (Chicken)
P0C1C7 P/V/C W_NIPAV	437	443	MFEDHPPTKKARVSMRRMSN	TP	3	Nipah virus
P05411 JUN JUN_AVIS1	224	230	NRIAASKSRKRKLERIARLE	TP	2	Avian sarcoma virus 17
P03096 Minor capsid VP2_POVMA	314	319	TPTWATVIEEDGPQKKKRRL	TP	2	Murine polyomavirus strain A2
P03090 Major capsid VP1_POVMA	3	8	MAPKRKSGVSKCETKCTKAC	TP	2	Murine polyomavirus strain A2
P03073 Large T antig LT_POVMA	191	196	PPRTPVSRKRPRPAGATGGG	TP	2	Murine polyomavirus strain A2
P03070 Large T antig LT_SV40	127	132	SQHSTPPKKKRKVEDPKDFP	TP	2	Simian virus 40	1 2
P02406 RPL28 RL28_YEAST	7	13	MPSRFTKTRKHRGHVSAGKG	TP	2	Saccharomyces cerevisiae (Baker's yeast)
P02293 HTB1 H2B1_YEAST	31	36	TSTSTDGKKRSKARKETYSS	TP	2	Saccharomyces cerevisiae (Baker's yeast)
P01106 MYC MYC_HUMAN	320	327	STRKDYPAAKRVKLDSVRVL	TP	3	Homo sapiens (Human)	1EE4 1EE4 1
P16046 UL80 PPR_SCMVC	455	460	EADHGKARKRLKAHHGRDNN	TP	2	Cercopithecine herpesvirus 5
P09959 SWI6 SWI6_YEAST	161	167	HRELGSPLKKLKIDTSVIDA	TP	3	Saccharomyces cerevisiae S288c	1
Q14118 DAG1 DAG1_HUMAN	778	783	AMICYRKKRKGKLTLEDQAT	TP	8	Homo sapiens (Human)
P16790 UL44 VPAP_HCMVA	425	432	DSEDSVTFEFVPNTKKQKCG	TP	2	Human herpesvirus 5 strain AD169	1
P25054 APC APC_HUMAN	2048	2054	CISSAMPKKKKPSRLKGDNE	TP	2	Homo sapiens (Human)
P38398 BRCA1 BRCA1_HUMAN	501	508	ERPLTNKLKRKRRPTSGLHP	TP	3	Homo sapiens (Human)	1
P03070 Large T antig LT_SV40	126	132	SQHSTPPKKKRKVEDPKDFP	TP	3	Simian virus 40	1Q1T 1Q1T 2

Please cite: ELM-the Eukaryotic Linear Motif resource-2024 update. (PMID:37962385)

ELM data can be downloaded & distributed for non-commercial use according to the ELM Software License Agreement

feedback@elm.eu.org