Recombinant proteins must be stable to be useful in biotechnological or medical applications. Deep-learning protein design methods such as ProteinMPNN have performed very well both in designing new proteins and in stabilizing existing ones. However, because the natural proteins in the training set are only marginally stable in biophysical terms, the stability of the resulting designs is unlikely to be appreciably higher. Hyperthermophiles and mesophiles differ markedly in amino acid composition, so the researchers collected predicted protein structures from hyperthermophiles. Notably, ProteinMPNN does not recover this distinct amino acid composition. Here, researchers from Leipzig University, Germany, show that HyperMPNN, a network retrained on predicted protein structures from hyperthermophiles, not only recovers this composition but can also be applied to proteins from non-hyperthermophiles.

Introduction

In nature, proteins maintain a careful balance between stability and function, which lets them tolerate unfavorable changes without losing activity. Natural proteins are only marginally stable, with a melting temperature somewhat above the optimal growth temperature (OGT) of the source organism, because additional stability offers no evolutionary advantage. Moreover, protein activity requires a degree of energetic frustration, which gives rise to flexibility and is linked to the source organism’s physiological temperature. Industrial processes, however, often run at temperatures above these OGTs to speed up reactions and lower the risk of microbial contamination. Because insufficient stability frequently limits the use of proteins as therapeutics or biocatalysts, improving a protein’s initial stability is a common goal of protein engineering campaigns.

Recent advances in computational prediction are making deep learning methods for identifying stabilizing mutations more accurate. However, improving the stability of proteins longer than about 100 amino acids remains difficult because of the scarcity of data and the amount of redesign required. Most of the available data cover only single-point mutations; combinations of mutations are not considered. Proteins can also be stabilized experimentally, through mutagenesis studies or directed evolution.

Study Background

Organisms that live at high temperatures have evolved ways to thermally stabilize their proteins. The study found that ProteinMPNN was unable to recover the distinct amino acid composition of hyperthermophile proteins. To take advantage of hyperthermophiles’ evolutionary adaptation to high temperatures, the researchers therefore retrained ProteinMPNN on predicted structures from these organisms, yielding HyperMPNN.

HyperMPNN successfully transferred the distinct amino acid composition of hyperthermophile proteins to proteins from other organisms, showing that the behavior of both ProteinMPNN and HyperMPNN is independent of the input protein’s source organism. The strategy was evaluated experimentally by redesigning the I53-50B.4PT1 pentamer of the icosahedral protein nanoparticle I53-50. The study highlights how amino acid composition contributes to thermal stability and how proteins adapt to high temperatures.
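The comparison of amino acid compositions described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the toy sequences are invented for illustration, and it simply reports which residues are enriched in one set of sequences relative to another.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequences):
    """Return the fraction of each standard amino acid across all sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore anything that is not one of the 20 standard residues.
        counts.update(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def composition_shift(designed, reference):
    """Per-residue difference in fraction (designed minus reference)."""
    d = composition(designed)
    r = composition(reference)
    return {aa: d[aa] - r[aa] for aa in AMINO_ACIDS}


# Toy sequences, invented for illustration only (not from the study).
reference = ["MKTAYIAKQR", "GSSHHHHHHS"]
designed = ["MKEEYIKKLR", "GEEKKRRKES"]

shift = composition_shift(designed, reference)
# Print the three residues most enriched in the designed set.
for aa, delta in sorted(shift.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(f"{aa}: {delta:+.2f}")
```

Applied to real data, the same comparison could be run on native sequences versus designs from each network to see whether a design method shifts composition toward the hyperthermophile profile.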

Protein thermal stabilization has been investigated using biophysical scoring functions, evolutionary data, and supervised machine learning. Approaches based on evolutionary data need a considerable amount of sequence data to be effective at filtering out unfavorable mutations while preserving function. Supervised methods have become more accurate thanks to high-throughput data collection; one example is ThermoMPNN, which predicts ΔΔG values for mutations using features learned by ProteinMPNN. However, because the underlying training data have low diversity and consist mostly of small proteins, supervised methods are limited in their ability to generalize to larger folds. The hyperthermophile sequence dataset, by contrast, spans a wide range of sizes, folds, and functions, which makes self-supervised learning of a broadly applicable method possible.

Looking into HyperMPNN

HyperMPNN has been shown to considerably increase the thermal stability of proteins, with the design outperforming the parent protein at temperatures as high as 95°C. This is especially relevant for vaccine technologies, such as mRNA vaccines, and for drug delivery systems. The sequence design approach used by HyperMPNN may prove useful for creating protein nanoparticle variants with improved thermal resistance. However, the HyperMPNN construct expressed at lower levels than the parent or ProteinMPNN sequences, possibly because of misfolding during expression at temperatures 30 to 50°C below the typical growth temperature of hyperthermophilic organisms. Heterologous expression in Thermus thermophilus, a thermophilic host closer to the native folding environment, might improve folding, stability, and overall expression yield.

Because high temperatures make industrial processes more efficient, HyperMPNN’s potential for creating thermostable enzymes is especially noteworthy. The designed I53-50B variant carries an overall positive charge relative to the parent or ProteinMPNN design, which may improve nanoparticle assembly when it is combined with the I53-50A trimer and negatively charged cargo.

Limitations

One drawback of the method is its reliance on predicted structures from the AlphaFold database. To exclude low-quality predictions, the collected structures were filtered using both local and overall pLDDT. In addition, because the approach rests on a general shift in the amino acid composition of surface residues, it may require redesigning more residues than approaches that focus on single-point mutations. Another possible limitation is the application of HyperMPNN to protein folds that are unique to mammals, since the underlying training data come exclusively from prokaryotes.
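The pLDDT filtering step can be sketched as follows. AlphaFold database PDB files store the per-residue pLDDT in the B-factor column, so the scores can be read directly from the coordinate records. The thresholds and function names below are assumptions chosen for illustration, not the values or code used in the study, and "local" pLDDT is interpreted here as the fraction of residues below a per-residue cutoff.

```python
def residue_plddt(pdb_text):
    """Read per-residue pLDDT from the B-factor column of CA atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def passes_filter(scores, min_mean=80.0, min_local=70.0, max_low_fraction=0.1):
    """Keep a structure only if overall and local confidence are both high.

    All three thresholds are hypothetical defaults for this sketch.
    """
    if not scores:
        return False
    mean = sum(scores) / len(scores)
    low_fraction = sum(s < min_local for s in scores) / len(scores)
    return mean >= min_mean and low_fraction <= max_low_fraction


def toy_ca_line(resseq, plddt):
    """Build a minimal, correctly aligned CA ATOM record for testing."""
    return (
        "ATOM  " + f"{resseq:5d}" + " " + " CA " + " " + "ALA" + " " + "A"
        + f"{resseq:4d}" + "    " + f"{0.0:8.3f}" * 3
        + f"{1.0:6.2f}" + f"{plddt:6.2f}"
    )


# Toy structures: one uniformly confident, one with a poorly predicted region.
good = "\n".join(toy_ca_line(i, 92.5) for i in range(1, 11))
mixed = "\n".join(toy_ca_line(i, 95.0 if i <= 8 else 40.0) for i in range(1, 11))

print(passes_filter(residue_plddt(good)))
print(passes_filter(residue_plddt(mixed)))
```

The second toy structure has a high mean score but fails because of its low-confidence region, which shows why a local criterion is useful in addition to the overall average.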

Conclusion

Taking inspiration from hyperthermophilic species, the researchers developed a new technique to (re-)design proteins for high thermal stability. In a first proof of concept, they stabilized a vaccine scaffold: when the method was applied to a protein nanoparticle with a melting temperature of 65°C, the designs remained stable at 95°C, an increase of 30°C. In short, the work uses data from hyperthermophiles and self-supervised learning to provide a novel approach for designing highly thermostable proteins.

Article Source: Reference Paper | Code for reproducing the results can be found on GitHub.

Disclaimer:
The research discussed in this article was conducted and published by the authors of the referenced paper. CBIRT has no involvement in the research itself. This article is intended solely to raise awareness about recent developments and does not claim authorship or endorsement of the research.

Important Note: bioRxiv releases preprints that have not yet undergone peer review. As a result, it is important to note that these papers should not be considered conclusive evidence, nor should they be used to direct clinical practice or influence health-related behavior. It is also important to understand that the information presented in these papers is not yet considered established or confirmed.

Deotima

Deotima is a consulting scientific content writing intern at CBIRT. She is currently pursuing a Master's in Bioinformatics at Maulana Abul Kalam Azad University of Technology. As an emerging scientific writer, she is eager to apply her expertise to making intricate scientific concepts comprehensible to readers from diverse backgrounds. Deotima has a particular passion for Structural Bioinformatics and Molecular Dynamics.
