Protein Allosteric Site Identification Using Machine Learning and Per Amino Acid Residue Reported Internal Protein Nanoenvironment Descriptors

Authors: Folorunsho Bright Omage, José Augusto Salim, Ivan Mazoni, Inácio Henrique Yano, Luiz Borro, Jorge Enrique Hernández Gonzalez, Fabio Rogerio de Moraes, Poliana Fernanda Giachetto, Ljubica Tasic, Raghuvir Krishnaswamy Arni, Goran Neshich

Published in: Computational and Structural Biotechnology Journal, Elsevier

Abstract: Allosteric regulation plays a crucial role in modulating protein functions and represents a promising strategy in drug development, offering enhanced specificity and reduced toxicity compared to traditional active site inhibition. Existing computational methods for predicting allosteric sites on proteins often rely on static protein surface pocket features, normal mode analysis, or extensive molecular dynamics simulations encompassing both the protein function modulator and the protein itself. In this study, we introduce an innovative methodology that employs a per amino acid residue classifier to distinguish allosteric site-forming residues (AFRs) from non-allosteric, or free residues (FRs). Our model, STINGAllo, exhibits robust performance, achieving Distance Center Center (DCC) success rate when all AFRs were predicted within pockets identified by FPocket, overall DCC, F1 score, and a Matthews correlation coefficient…
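
Note: the metrics named in this abstract (DCC, F1 score, Matthews correlation coefficient) can be illustrated with a minimal Python sketch for a per-residue binary labelling (1 = AFR, 0 = FR). This is not the authors' evaluation code. The function names, the use of plain coordinate centroids as site centers, and the `cutoff` parameter are assumptions made for illustration, and the abstract does not state the threshold actually used.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary per-residue labels (1 = AFR, 0 = FR)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def dcc(pred_coords, true_coords):
    """Distance between the predicted-site center and the true-site center."""
    return math.dist(centroid(pred_coords), centroid(true_coords))

def dcc_success(pred_coords, true_coords, cutoff):
    """A prediction counts as a success if DCC <= cutoff.
    `cutoff` (in angstroms) is a hypothetical parameter, not a value from the paper."""
    return dcc(pred_coords, true_coords) <= cutoff

# Toy example with made-up labels and coordinates
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f1_score(tp, fp, fn), mcc(tp, fp, fn, tn))
print(dcc([(0, 0, 0), (2, 0, 0)], [(4, 4, 0), (4, 4, 0)]))
```

A DCC success rate over a benchmark is then the fraction of proteins for which `dcc_success` returns True at the chosen cutoff.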

STINGAllo: a web server for high-throughput prediction of allosteric site-forming residues using internal protein nanoenvironment descriptors

Authors: Folorunsho Bright Omage, José Augusto Salim, Ivan Mazoni, Inácio Henrique Yano, Luiz Borro, Jorge Enrique Hernández Gonzalez, Fabio Rogerio de Moraes, Poliana Fernanda Giachetto, Ljubica Tasic, Raghuvir Krishnaswamy Arni, Goran Neshich

Published in: Briefings in Bioinformatics, Volume 26, Issue 4, Oxford Academic

Publication Date: July 2025

DOI: 10.1093/bib/bbaf424

Abstract: Allosteric regulation is essential for modulating protein function and represents a promising target for therapeutic intervention, yet the complex dynamics of the protein nanoenvironment hinder the reliable identification of allosteric sites. Traditional pocket-based predictors miss 18% of experimentally confirmed sites that lie outside surface invaginations. To overcome this limitation, we developed STINGAllo, an interactive web server that introduces a residue-centric machine-learning model. Using 54 optimized internal protein nanoenvironment descriptors, STINGAllo predicts allosteric site-forming residues at single-residue resolution. By integrating hydrophobic interaction networks, local density, graph connectivity, and a unique "sponge effect" metric, STINGAllo detects allosteric sites independently of surface geometry, including concave pockets, flat surfaces, or even cryptic regions. It achieves a success rate of 78% on benchmark datasets, substantially outperforming existing methods with a 60.2% overall success rate compared with 21.1%–24.2% for contemporary pocket-based predictors. Our analysis further reveals that nearly 52.7% of unique proteins in the Protein Data Bank [(PDB); 119 851 entries, 14 November 2024] contain at least one chain with a predicted allosteric site. STINGAllo accepts protein structures via PDB identifiers or custom uploads, provides interactive 3D visualization of predicted pockets, and supports integration into computational pipelines through a RESTful application programming interface. Overall, STINGAllo bridges advanced computational prediction with user-friendly design, offering a robust tool expected to deepen understanding of protein regulation and accelerate allosteric drug discovery. The server is freely accessible at https://www.stingallo.cbi.cnptia.embrapa.br/.
