An artificial intelligence program called CLEAN (contrastive learning–enabled enzyme annotation) can predict enzyme activities from their amino acid sequences, even when the enzymes are unfamiliar or poorly understood. The researchers report that CLEAN outperforms the most advanced existing tools in precision, consistency, and sensitivity. A deeper understanding of enzymes and their roles would benefit a number of disciplines, including genetics, chemistry, pharmaceuticals, medicine, and industrial materials.

Much as ChatGPT uses written-language data to generate predictive text, the scientists treat protein sequences as a language and use it to predict how the proteins function. Almost every scientist who encounters a new protein sequence wants to know right away what the protein does. The tool should also help researchers quickly identify the enzymes needed to manufacture chemicals and materials for applications in biology, medicine, or industry.

Thanks to breakthroughs in genomics, numerous enzymes have been identified and sequenced. Still, the researchers observed that the functions of these enzymes remain poorly understood. The research findings were published in the journal Science.

Existing computational methods predict enzyme function by comparing a query sequence against a catalog of characterized enzymes, identifying similar sequences, and assigning an enzyme commission (EC) number accordingly. However, the authors note that these approaches are less effective for enzymes that are less studied, incompletely described, or multifunctional.
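
To make the similarity-based approach concrete, here is a minimal Python sketch, not the method used by any specific tool. It assumes a shared k-mer (Jaccard) score as a crude stand-in for a real sequence alignment, and the function names, the threshold, and the toy sequences are hypothetical. The sequences are invented, and their pairing with enzyme commission (EC) numbers is for illustration only.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate_by_similarity(query, reference, k=3, min_similarity=0.3):
    """Transfer the EC number of the most similar reference sequence.

    Returns (ec_number, similarity), or (None, best_similarity) when no
    reference is similar enough to justify a transfer.
    """
    query_kmers = kmers(query, k)
    best_ec, best_sim = None, 0.0
    for ref_seq, ec in reference:
        sim = jaccard(query_kmers, kmers(ref_seq, k))
        if sim > best_sim:
            best_ec, best_sim = ec, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_ec, best_sim


# Toy reference set (assumption): invented sequences, illustrative EC labels.
REFERENCE = [
    ("MKTAYIAKQRQISFVKSHFSRQ", "1.1.1.1"),
    ("MSTNPKPQRKTKRNTNRRPQDV", "3.4.21.4"),
]

close_query = "MKTAYIAKQRQISFVKSHFSRA"  # one residue away from the first reference
distant_query = "GGGGSGGGGSGGGGSGGGGS"  # shares no 3-mers with either reference

print(annotate_by_similarity(close_query, REFERENCE)[0])  # 1.1.1.1
print(annotate_by_similarity(distant_query, REFERENCE))   # (None, 0.0)
```

The second query shows the weakness the authors describe: when nothing in the reference set resembles the query, similarity-based transfer has no annotation to offer.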

While previous studies have also used artificial intelligence (AI) methods to predict enzyme commission numbers, this research is the first to use the deep-learning approach known as contrastive learning. The results indicate that the CLEAN algorithm is more accurate than other AI tools. Even so, outperforming two or three other methods does not guarantee a precise prediction in every case.
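
The article does not describe CLEAN's internals, so the following is only a generic Python illustration of the idea behind contrastive learning, not the authors' implementation. It assumes sequences have already been mapped to numeric vectors (embeddings). The triplet margin loss shown is one common contrastive objective, and the nearest-center lookup, the function names, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss.

    The loss is zero once the negative example (different function) is at
    least `margin` farther from the anchor than the positive example
    (same function). Training pushes embeddings toward that arrangement.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_label(query, embeddings_by_label):
    """Assign the label whose cluster center lies closest to the query."""
    centers = {label: centroid(vs) for label, vs in embeddings_by_label.items()}
    return min(centers, key=lambda label: euclidean(query, centers[label]))


# Toy 2-D embeddings (assumption): two well-separated functional groups.
EMBEDDINGS = {
    "group_A": [[0.0, 0.0], [0.0, 2.0]],
    "group_B": [[10.0, 10.0], [10.0, 12.0]],
}

# Well-arranged triplet: the negative is already far away, so no loss.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 5.0]))  # 0.0
# Badly arranged triplet: the negative is closer than the positive.
print(triplet_loss([0.0, 0.0], [0.0, 3.0], [0.0, 1.0]))  # 3.0
print(nearest_label([1.0, 1.0], EMBEDDINGS))             # group_A
```

In this framing, a new sequence can be assigned a function by its position relative to learned clusters rather than by direct sequence similarity to any single known enzyme.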

The scientists validated their tool both computationally and with in vitro experiments. They found that it not only accurately identified enzymes with two or more functions but also corrected the classification of enzymes that leading tools had mislabeled. The tool was also able to predict the functions of previously uncharacterized enzymes.

The research team has made CLEAN accessible online for fellow researchers seeking to characterize an enzyme or assess its potential to facilitate a specific reaction.

The researchers hope that the broader scientific community will use the tool widely. Through the online interface, a researcher simply enters a sequence into a search box, much like using a search engine, and views the results.

The scientists plan to extend the AI behind CLEAN to other types of proteins, including binding proteins. The research team is also working to improve the machine learning algorithms so that, when a user searches for a given reaction, the AI can propose the most suitable enzyme.

Cells contain many uncharacterized binding proteins, such as receptors and transcription factors, and the researchers aim to predict their potential functions. Their ultimate objective is to understand the complete cell and investigate how it may be used in biotechnology or medicine. By predicting the function of each protein, the researchers hope to obtain a clear picture of all the proteins within the cell and make informed decisions about how to engineer them.


The study underlines the potential of AI in drug development and industrial processes while demonstrating the effectiveness of CLEAN in predicting enzyme function. AI-based techniques should continue to become more accurate as more data becomes available, bringing fresh perspectives and scientific advances in enzymology.



Dr. Tamanna Anwar is a Scientist and Co-founder of the Centre of Bioinformatics Research and Technology (CBIRT). She is a passionate bioinformatics scientist and a visionary entrepreneur. Dr. Tamanna has worked as a Young Scientist at Jawaharlal Nehru University, New Delhi. She has also worked as a Postdoctoral Fellow at the University of Saskatchewan, Canada. She has several scientific research publications in high-impact research journals. Her latest endeavor is the development of a platform that acts as a one-stop solution for all bioinformatics related information as well as developing a bioinformatics news portal to report cutting-edge bioinformatics breakthroughs.

