Intrinsically disordered proteins (IDPs), which are central to neurodegenerative diseases and aggressive cancers such as prostate cancer, have long been considered ‘undruggable’ because traditional structure-prediction tools return static structures, whereas IDPs function and interact through constantly shifting conformations. ToposBio has introduced Topos-1, an all-atom generative model that addresses this gap by predicting realistic conformational ensembles and that outperforms existing tools across global and local structural evaluations.

Issues with ‘Undruggable Proteins’ and the Need for a Generative Model

Many proteins, especially intrinsically disordered proteins (IDPs), constantly shift between 3D structures rather than existing in a single fixed state. Although this flexibility is important for binding, catalysis, and regulation, it also makes these proteins hard to study and makes it hard to design drugs against the diseases linked to them, which is why they are labelled ‘undruggable’.

As IDPs make up nearly a third of the human proteome, ignoring their flexibility leaves a major gap in drug discovery. Current machine learning models, such as AlphaFold, are not designed to generate conformational ensembles, that is, the collection of shapes a protein can adopt.

Topos-1, an all-atom generative model, addresses this gap by predicting realistic conformational ensembles. The generated structures are not just random protein conformations: they obey the physical and chemical rules that govern molecular behaviour, which helps ensure that they align with experimental data and known biophysical principles.

An Introduction to Topos-1

Topos-1 is a generative AI model designed specifically for structure prediction of highly flexible proteins. It is trained on what is described as the largest IDP-focused simulation corpus to date.

Rather than being trained mainly on structured proteins, such as those in the Protein Data Bank (PDB), Topos-1 uses an architecture and training pipeline tailored to dynamic motifs, combining in-house physics-based simulations with experimental IDP data.

As a result, it produces physically realistic conformational ensembles that generalize well to novel IDPs and achieve best-in-class accuracy across both global properties (large-scale structural features) and local properties (fine-grained details crucial for drug design).

Topos-1’s performance continues to improve predictably as model size, computation, and training data increase.

Scaling Potential and Efficiency of Topos-1

Molecular dynamics (MD) simulations are the traditional gold standard for studying proteins that lack a stable 3D structure, but they are extremely computationally expensive, requiring days of GPU time.

A major advantage of Topos-1 is its computational efficiency. Because it can be parallelized across many GPUs, it generates ensembles about 1000 times faster than molecular dynamics. This makes it not only accurate but also more practical for large-scale biomedical applications such as drug discovery.

Evaluations of Topos-1: Setting a New Standard for Global and Local Properties

The model’s performance was validated against an evaluation dataset of experimental small-angle X-ray scattering (SAXS) measurements, restricted to IDPs shorter than 200 residues because shorter sequences are computationally manageable. To ensure fairness, none of the proteins in the evaluation set was included in the training data. The researchers also used MMseqs2 linclust to exclude any sequence with more than 50% similarity to a protein in the training data.

The goal was to prevent the model from being tested on proteins too close to what it has already learned.
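The leakage check described above can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline. It assumes that the training and evaluation sequences were clustered together at a 50% identity threshold and that the result is available as (representative, member) pairs, the two-column layout of an MMseqs2 cluster TSV. The function name `filter_eval_set` and all sequence IDs are hypothetical.

```python
def filter_eval_set(cluster_pairs, train_ids, eval_ids):
    """Drop evaluation sequences that share a cluster with any training sequence.

    cluster_pairs: iterable of (representative_id, member_id) tuples
                   (assumed format; one row per clustered sequence).
    """
    # Map every sequence to the representative of its cluster.
    member_to_rep = {member: rep for rep, member in cluster_pairs}

    # Clusters that contain at least one training sequence.
    train_clusters = {member_to_rep[t] for t in train_ids if t in member_to_rep}

    # Keep only evaluation sequences whose cluster holds no training sequence.
    # IDs missing from the clustering output are kept unchanged.
    return [e for e in eval_ids if member_to_rep.get(e) not in train_clusters]


# Hypothetical IDs: evalX clusters with trainA, evalY sits in its own cluster.
pairs = [("trainA", "trainA"), ("trainA", "evalX"), ("evalY", "evalY")]
kept = filter_eval_set(pairs, train_ids=["trainA"], eval_ids=["evalX", "evalY"])
print(kept)
```

Filtering at the cluster level is stricter than removing exact duplicates only, which is the point of the 50% threshold.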

  • Global Properties

Global properties describe the overall size and shape of the ensemble. The key metric here is the radius of gyration (Rg), which reflects how compact or extended the protein conformations are on average.

For each protein, the researchers generated multiple conformations, calculated Rg for each, and then averaged the values to represent the ensemble. Errors were computed for each protein and then aggregated across the dataset of 104 IDPs.
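As a rough sketch of this procedure, the code below computes Rg for each conformation and averages over the ensemble. It uses the standard definition of Rg (root of the mass-weighted mean squared distance from the centre of mass). The function names and the equal-mass default are assumptions made for illustration, not details from the Topos-1 report.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Rg of one conformation; coords is a list of (x, y, z) tuples."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights if no masses given
    total = sum(masses)
    # Mass-weighted centre of the conformation.
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from that centre.
    msd = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)


def ensemble_mean_rg(ensemble, masses=None):
    """Average Rg over all conformations in an ensemble."""
    return sum(radius_of_gyration(c, masses) for c in ensemble) / len(ensemble)


# Toy two-atom "conformations": one compact, one extended.
compact = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
extended = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
print(ensemble_mean_rg([compact, extended]))  # Rg values 1.0 and 3.0 average to 2.0
```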

When predicted Rg values were compared against the evaluation dataset, Topos-1 reduced the normalized Rg error by 43% relative to BioEmu, 60% relative to Chai-1, 71% relative to Boltz-2x, and 79% relative to AlphaFold-2.

These improvements show that Topos-1 captures global ensemble properties far more accurately than existing models.
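To make the percentages concrete, here is a small sketch of how a normalized Rg error and a relative error reduction could be computed. The article does not state the exact normalization, so dividing the absolute error by the experimental Rg and averaging over proteins is an assumption, as are the function names. The numbers in the usage lines are invented for illustration and are not results from the report.

```python
def normalized_rg_error(predicted, experimental):
    """Mean of |Rg_pred - Rg_exp| / Rg_exp over proteins (assumed normalization)."""
    errors = [abs(p - e) / e for p, e in zip(predicted, experimental)]
    return sum(errors) / len(errors)


def relative_reduction(model_error, baseline_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


# Invented Rg values for two proteins, for illustration only.
experimental = [20.0, 30.0]
baseline_pred = [24.0, 24.0]
model_pred = [22.0, 27.0]

baseline_err = normalized_rg_error(baseline_pred, experimental)
model_err = normalized_rg_error(model_pred, experimental)
print(f"relative reduction: {relative_reduction(model_err, baseline_err):.0%}")
```

Under this reading, a "43% reduction vs BioEmu" would mean that the Topos-1 error is 57% of the BioEmu error on the same proteins.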

  • Local Properties

Local properties capture finer details of the protein backbone and side chains, which are quantified using chemical shifts from NMR experiments. The researchers compared predicted chemical shifts against experimental ones. BioEmu was excluded from this comparison because it does not predict side-chain atoms. Among the remaining models, Topos-1 achieved the lowest error, outperforming Boltz-2x by 29%, AlphaFold-2 by 46%, and Chai-1 by 50%.

These results show that Topos-1 captures fine-grained local conformational details of IDPs much more accurately than general predictors.
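A comparison of this kind can be sketched as follows. For a disordered protein, an NMR chemical shift is usually treated as an average over the conformations the protein visits, so the sketch first averages per-conformer predictions and then scores them against experiment. The use of an unweighted average and of RMSE as the error measure are assumptions, since the article does not name the metric. Per-conformer shifts would come from a separate shift predictor, which is outside this sketch.

```python
import math


def ensemble_average_shifts(per_conformer_shifts):
    """Average predicted shifts over conformers.

    per_conformer_shifts: list of lists, one inner list of shifts (ppm) per
    conformer, with atoms in the same order in every conformer.
    """
    n_conf = len(per_conformer_shifts)
    n_atoms = len(per_conformer_shifts[0])
    return [
        sum(conf[i] for conf in per_conformer_shifts) / n_conf
        for i in range(n_atoms)
    ]


def shift_rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental shifts."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Invented shifts (ppm) for two atoms in two conformers, for illustration only.
per_conformer = [[56.0, 58.0], [58.0, 60.0]]
experimental = [57.0, 61.0]
averaged = ensemble_average_shifts(per_conformer)
print(averaged, shift_rmse(averaged, experimental))
```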

Proof of Impact: Case Studies

  • Parkinson’s disease: α‑synuclein is a classic IDP that misfolds and aggregates, driving Parkinson’s pathology. Topos-1 generated ensembles that matched experimental secondary structures and long-timescale MD simulations, while also achieving superior accuracy in fine-grained geometric structures.
  • Prostate cancer: The androgen receptor N-terminal domain (AR NTD) is a validated but difficult drug target because of its disordered nature. Using Topos-1 ensembles as a structural basis for ligand evaluation, researchers found that predicted ligand rankings correlated strongly with experimental potencies measured in cell assays.

Together, these case studies show that Topos-1 can accurately model dynamic ensembles of disordered proteins and enable drug discovery for proteins previously considered ‘undruggable’.
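Agreement between a predicted ligand ranking and measured potencies, as in the prostate cancer case study, is commonly summarized with a rank correlation. The sketch below computes Spearman's coefficient from scratch. The article does not say which statistic the researchers used, so this choice is an assumption, and the scores in the usage lines are invented.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented predicted scores and measured potencies for four ligands.
predicted_scores = [0.9, 0.4, 0.7, 0.1]
measured_potency = [8.1, 6.0, 6.5, 5.2]
print(spearman(predicted_scores, measured_potency))
```

A rank-based measure suits this setting because a structure-based score and a cell-assay potency are on different scales; only the ordering of ligands needs to agree.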

Article Source: Reference | Reference Article

Disclaimer:
The research discussed in this article was conducted and published by the authors of the referenced paper. CBIRT has no involvement in the research itself. This article is intended solely to raise awareness about recent developments and does not claim authorship or endorsement of the research.

Important Note: The referenced report may not have undergone peer review. It should therefore not be considered conclusive evidence, nor should it be used to direct clinical practice or influence health-related behavior. The information presented in such reports is not yet considered established or confirmed.

Saniya is a graduating Chemistry student at Amity University Mumbai with a strong interest in computational chemistry, cheminformatics, and AI/ML applications in healthcare. She aspires to pursue a career as a researcher, computational chemist, or AI/ML engineer. Through her writing, she aims to make complex scientific concepts accessible to a broad audience and support informed decision-making in healthcare.