Proc Natl Acad Sci U S A. 2024 Nov 26;121(48):e2412719121.
doi: 10.1073/pnas.2412719121. Epub 2024 Nov 20.

Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much

Maria Lazou et al. Proc Natl Acad Sci U S A.

Abstract

The goal of this paper is to predict the conformational distributions of ligand binding sites using the AlphaFold2 (AF2) protein structure prediction program with stochastic subsampling of the multiple sequence alignment (MSA). We explored the opening of cryptic ligand binding sites in 16 proteins, where the closed and open conformations define the expected extreme points of the conformational variation. Because the Protein Data Bank (PDB) contains many structures of these proteins, we were able to study whether the distribution of X-ray structures affects the distribution of AF2 models. We found that AF2 generates both a cluster of open models and a cluster of closed models for proteins that have comparable numbers of open and closed structures in the PDB and not too many other conformations. This was observed even with default MSA parameters, that is, without further subsampling. In contrast, with the exception of a single protein, AF2 did not yield multiple clusters of conformations for proteins that had imbalanced numbers of open and closed structures in the PDB or had substantial numbers of other structures. Subsampling improved the results for only a single protein, and very shallow MSAs led to incorrect structures. The ability to generate both open and closed conformations for 6 of the 16 proteins agrees with the success rates of similar studies reported in the literature. However, we showed that this partial success is due to AF2 "remembering" the conformational distributions in the PDB and that the approach fails to predict rarely seen conformations.
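As an illustration of what stochastic MSA subsampling means, the following is a minimal sketch in plain Python. It is not AF2's actual implementation (which also clusters sequences); it only assumes that the query is always kept and that the remaining sequences are drawn at random up to the `max_seq` and `max_extra_seq` limits named in the figure captions. The function name `subsample_msa` and the `seed` argument are hypothetical.

```python
import random


def subsample_msa(msa, max_seq, max_extra_seq, seed=None):
    """Randomly split an MSA into a 'selected' and an 'extra' set.

    Assumption: msa[0] is the query sequence and is always kept first.
    This is a simplified stand-in for AF2's subsampling, not its real code.
    """
    if not msa:
        raise ValueError("MSA must contain at least the query sequence")
    if max_seq < 1 or max_extra_seq < 0:
        raise ValueError("max_seq must be >= 1 and max_extra_seq >= 0")

    rng = random.Random(seed)          # a seeded RNG makes runs reproducible
    query, others = msa[0], list(msa[1:])
    rng.shuffle(others)                # stochastic step: random order of homologs

    n_selected = max_seq - 1           # one slot is reserved for the query
    selected = [query] + others[:n_selected]
    extra = others[n_selected:n_selected + max_extra_seq]
    return selected, extra


if __name__ == "__main__":
    # Toy alignment: a query plus 100 placeholder homologs.
    toy_msa = ["QUERY"] + [f"HOMOLOG_{i}" for i in range(100)]
    selected, extra = subsample_msa(toy_msa, max_seq=8, max_extra_seq=16, seed=0)
    print(len(selected), len(extra))
```

Running the sketch with different seeds yields different shallow alignments from the same full MSA, which is the source of model-to-model variation that the study relies on.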

Keywords: binding hot spot; conformational change; machine learning; protein mapping; protein structure prediction.

Conflict of interest statement

Competing interests statement: The authors declare no competing interest.

Figures

Fig. 1.
Distributions of binding site conformations and pocket volumes in X-ray structures and AF2 models of Group 1 proteins with balanced distributions of open and closed states. (A) Bovine β-lactoglobulin. (B) KRAS. (C) MAPK. (D) Pyruvate dehydrogenase kinase. Each column includes the same four subpanels. Top: RMSD of the moving fragment to open and closed reference structures in the X-ray structures of the PDB ensemble. Point shapes indicate the presence or absence of a ligand at the binding site, and color indicates the druggability score. Second row: Same as the top panel for the predicted structures in the AF2 ensemble. Larger scatter points indicate cluster centroid structures, with color indicating druggability scores. Third row: Binding pocket volumes in the X-ray structures of the PDB ensemble. Bottom: Binding pocket volumes in the predicted structures of the AF2 ensemble.
Fig. 2.
Distributions of binding site conformations and pocket volumes in X-ray structures and AF2 models of Group 2 proteins with imbalanced distributions of open and closed states. (A) TEM β-lactamase. (B) cAMP-dependent protein kinase. (C) Glutamate receptor 2. (D) AmpC β-lactamase. Each column includes the same four subpanels as in Fig. 1.
Fig. 3.
Distributions of binding site conformations and pocket volumes in X-ray structures and AF2 models of Group 3 proteins with many conformations distant from open and closed states. (A) Myosin II. (B) Ricin. (C) Androgen receptor. (D) Hsp90. Each column includes the same four subpanels as in Fig. 1.
Fig. 4.
RMSD of the moving fragment to open and closed reference structures with various levels of MSA reduction. (A) Bovine β-lactoglobulin. (B) TEM β-lactamase. (C) Myosin II. For each protein, the four panels show RMSD distributions of AF2-generated models with the (max_seq, max_extra_seq) pairs of (156, 512), (64, 128), (32, 64), and (8, 16) from left to right.
Fig. 5.
Properties of AF2 models obtained using reduced MSAs for the three groups of proteins considered in this study. (A) Group 1 proteins with balanced numbers of open and closed states. (B) Group 2 proteins with imbalanced numbers of open and closed states. (C) Group 3 proteins with many structures in neither open nor closed states. Left panels show the normalized diversity distances as functions of the max_seq parameter. Middle panels show the average global RMSD values to the ligand-bound reference state, also as functions of max_seq. Right panels show average pLDDT values as functions of the global RMSD.
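The RMSD comparison to open and closed reference structures that underlies Figs. 1 to 4 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the structures have already been superimposed on the static part of the protein, so that the moving fragment's coordinates can be compared directly, and the 2.0 Å `cutoff` used to label a model as "other" is a hypothetical value chosen for the example. The function names are likewise illustrative.

```python
import math


def fragment_rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) atom coordinates.

    Assumption: both structures are already superimposed, so no fitting is done.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def label_conformation(model, open_ref, closed_ref, cutoff=2.0):
    """Label a model 'open', 'closed', or 'other' by its nearest reference.

    The cutoff (in the same units as the coordinates) is a hypothetical threshold.
    Returns the label together with both RMSD values.
    """
    d_open = fragment_rmsd(model, open_ref)
    d_closed = fragment_rmsd(model, closed_ref)
    if min(d_open, d_closed) > cutoff:
        return "other", d_open, d_closed
    label = "open" if d_open <= d_closed else "closed"
    return label, d_open, d_closed


if __name__ == "__main__":
    # Toy two-atom fragment; the closed reference is shifted 3 units along z.
    open_ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    closed_ref = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    model = [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)]
    print(label_conformation(model, open_ref, closed_ref))
```

Plotting the two RMSD values of every model against each other gives scatter plots of the kind described in the figure captions, where clusters near either axis correspond to open-like or closed-like models.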

