Friday, June 12, 2026
BioReason-Pro Allows Structured Reasoning for Protein Function Predictions

Image Description: Overview of BioReason-Pro. Image Source: https://doi.org/10.64898/2026.03.19.712954

A team of scientists from Canada's University Health Network, the Arc Institute, and the Vector Institute developed BioReason-Pro to tackle one of the greatest challenges in biology: protein function prediction. Unlike existing AI tools, which give opaque black-box predictions, BioReason-Pro generates structured reasoning that explains why a protein is assigned a function. By integrating protein embeddings and Gene Ontology (GO) knowledge via GO-GPT, it achieved 73.6% Fmax on GO term prediction and scored 8/10 from LLM judges on functional summaries, results that point to expert-level biological reasoning.
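For readers unfamiliar with Fmax, the sketch below shows the protein-centric definition commonly used in CAFA-style benchmarks: precision and recall are averaged over proteins at each score threshold, and the best resulting F-score is reported. Whether the paper uses exactly this variant is an assumption, and the function name `fmax` and the input layout are illustrative choices rather than anything taken from the BioReason-Pro code.

```python
def fmax(predictions, truths, thresholds=None):
    """Protein-centric Fmax.

    predictions: list of dicts mapping GO term -> confidence score in [0, 1]
    truths:      list of sets of true GO terms, aligned with predictions
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for scores, true_terms in zip(predictions, truths):
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over every protein in the benchmark.
            recalls.append(hits / len(true_terms) if true_terms else 0.0)
        if not precisions:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Toy, made-up inputs purely to exercise the function.
toy_predictions = [{"GO:A": 0.9, "GO:B": 0.4}, {"GO:C": 0.8}]
toy_truths = [{"GO:A"}, {"GO:C", "GO:D"}]
print(round(fmax(toy_predictions, toy_truths), 3))
```

Because the threshold is chosen after the fact, Fmax rewards a model whose confidence scores rank correct terms above incorrect ones, not just a model that outputs the right set at one fixed cutoff.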

Assigning a function to a protein means predicting what the protein does from its sequence, its similarity to known proteins, and its structure. This matters because most proteins discovered in genomes have unknown functions, and accurate annotation speeds up drug discovery, disease research, and the development of new medicines. The challenge is that less than 1% of known protein sequences have experimentally verified functions, leaving a massive gap in biological knowledge.

Genomic sequencing produces millions of protein sequences, far more than labs can test experimentally, which makes AI critical for computational protein function annotation. Computational approaches already exist, but they often rely on shallow sequence similarity or treat prediction as a set of isolated classification tasks.

These approaches miss the integrative reasoning across sequence, structure, domains, and interactions that human experts use. This is where BioReason-Pro stands out: it is not just another AI tool but a multimodal reasoning LLM for protein function prediction.

How Researchers Built a Multimodal Biology Language Model

Domain biologists determine a protein’s function by combining evidence from its sequence, structure, domain architecture, evolution, and interactions. BioReason-Pro is designed to mimic the chain-of-thought reasoning that expert biologists use and to reason about a protein’s function in natural language.

To achieve this, the researchers built a multimodal large language model from the following components, which ground BioReason-Pro in biological data rather than text alone:

  • ESM3 Residue Embedding
    ESM3, a deep learning model trained on millions of protein sequences, provides an embedding vector (a mathematical fingerprint) for each residue (amino acid), capturing its role in protein structure and function.
  • GO graph encoder
    The GO graph encoder turns the Gene Ontology graph into embeddings that give BioReason-Pro the relationships between terms, so the model does not treat functions as isolated labels but takes the hierarchy into account.
  • GO-GPT predictions
    GO-GPT is a specialized autoregressive transformer model that predicts a protein's GO terms one after another, producing first guesses about its functions. This allows the model to consider dependencies across the three GO aspects (molecular function, cellular component, and biological process) instead of treating prediction as a flat classification problem.
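To make the point about hierarchy concrete, here is a minimal sketch of why GO terms are not isolated labels: annotating a protein with a specific term implies all of that term's ancestors. The tiny parent map below is a simplified toy excerpt (the real ontology has intermediate terms and uses GO identifiers), and `with_ancestors` is a hypothetical helper, not the learned GO graph encoder described above.

```python
# Simplified toy excerpt of the molecular-function hierarchy.
# Each term maps to its direct parents; a term may have several.
PARENTS = {
    "protein kinase activity": ["kinase activity"],
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "ATP binding": ["binding"],
    "binding": ["molecular_function"],
    "molecular_function": [],
}


def with_ancestors(terms, parents):
    """Return the given terms plus every ancestor reachable through parents."""
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term in closed:
            continue
        closed.add(term)
        stack.extend(parents.get(term, []))
    return closed


print(sorted(with_ancestors({"protein kinase activity"}, PARENTS)))
```

A flat classifier has to learn each of these implied labels separately, whereas a model that sees the graph can share information between a term and its ancestors.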

Step-by-Step Pipeline: How BioReason-Pro Assigns Functions to Proteins

The dataset was compiled from databases such as UniProt, InterPro, STRING, and PDB and covers about 133k proteins across 3k organisms. Sequences were truncated to 2,000 amino acids to fit into ESM3.
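The truncation step can be pictured with a short sketch. The article does not say which end of the sequence is kept, so keeping the first 2,000 residues (the N-terminal end) is an assumption, and `truncate_sequence` is a hypothetical helper name.

```python
MAX_RESIDUES = 2000  # length limit quoted in the article


def truncate_sequence(sequence, max_len=MAX_RESIDUES):
    """Keep at most max_len residues, counted from the N-terminus (assumed)."""
    return sequence[:max_len]


long_protein = "M" + "A" * 2499  # made-up 2,500-residue sequence
print(len(truncate_sequence(long_protein)))
```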

The ESM3 embeddings capture each protein's sequence and structure, while the GO encoder captures the Gene Ontology (GO), which acts as a biological vocabulary for the model.

The small model, GO-GPT, makes initial guesses about the protein's likely functions one by one, respecting the hierarchy. These predictions serve as draft hypotheses. BioReason-Pro then reads the embeddings together with the GO-GPT predictions and reasons about why the protein should have a certain function.
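Predicting terms "one by one" can be illustrated with a toy greedy decoding loop, in which each new term is chosen conditional on the terms already generated. Everything here is a stand-in: `toy_next_term_scores` is a hard-coded lookup table with made-up scores, not the GO-GPT network, and greedy selection is only one of several possible decoding strategies.

```python
END = "<end>"


def toy_next_term_scores(generated):
    """Hypothetical stand-in for a trained model: scores for the next term."""
    table = {
        (): {"catalytic activity": 0.9, END: 0.1},
        ("catalytic activity",): {"kinase activity": 0.7, END: 0.3},
        ("catalytic activity", "kinase activity"): {
            "protein kinase activity": 0.6,
            END: 0.4,
        },
    }
    return table.get(tuple(generated), {END: 1.0})


def greedy_decode(score_fn, max_terms=10):
    """Generate terms one at a time, each conditioned on the previous ones."""
    generated = []
    for _ in range(max_terms):
        scores = score_fn(generated)
        best = max(scores, key=scores.get)
        if best == END:
            break
        generated.append(best)
    return generated


print(greedy_decode(toy_next_term_scores))
```

Because every step sees the earlier outputs, a general term can make a more specific descendant more likely, which is the dependency a flat classifier cannot express.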

The model was first trained on synthetic reasoning traces written by GPT-5, which taught it how to explain itself. It was then fine-tuned with reinforcement learning to make the reasoning more accurate.

Human Experts Prefer BioReason-Pro Over Global References

The automated metrics and LLM judges gave promising results, but they cannot show whether the annotations are genuinely useful to biologists. Therefore, 27 molecular biologists were recruited to review 162 randomly selected test proteins. The evaluators did not know whether an annotation came from BioReason-Pro (SFT or RL) or from UniProt. In addition, they had access to external sources and literature, so they could judge the model realistically on a five-point scale.

Surprisingly, in the majority of cases, human experts found BioReason-Pro’s predictions to be as good as or better than curated UniProt annotations. Each prediction was also rated on seven axes, using a 1-10 scale.

BioReason-Pro’s Performance

BioReason-Pro SFT and RL: Compared with UniProt, SFT achieved a 79% tie-or-exceed rate and an average score of 8.0/10, while RL achieved a 73% tie-or-exceed rate and an average score of 7.4/10.
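A tie-or-exceed rate is simply the share of head-to-head comparisons in which the model's annotation was judged at least as good as the reference. The sketch below shows the arithmetic on made-up verdicts; the verdict labels and the helper name `tie_or_exceed_rate` are illustrative assumptions, and the numbers are not the study's data.

```python
from collections import Counter


def tie_or_exceed_rate(verdicts):
    """Fraction of comparisons where the model tied with or beat the reference."""
    if not verdicts:
        raise ValueError("at least one verdict is required")
    counts = Counter(verdicts)
    return (counts["model_better"] + counts["tie"]) / len(verdicts)


# Made-up verdicts for ten comparisons, purely for illustration.
toy_verdicts = ["model_better"] * 3 + ["tie"] * 5 + ["reference_better"] * 2
print(tie_or_exceed_rate(toy_verdicts))
```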

Statistical Test: McNemar’s exact test indicated that RL improved GO term accuracy without reducing reasoning quality. Moreover, Spearman’s correlations were weak and non-significant (|ρ| < 0.06, p > 0.05), showing no significant dependence of performance on how similar a test protein was to the training set proteins.
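For context, McNemar's exact test compares two methods on the same items by looking only at the discordant pairs, that is, the cases where one method was right and the other was wrong. Under the null hypothesis those disagreements split evenly, so the p-value comes from a binomial distribution with probability 0.5. The sketch below is a generic standard-library implementation with made-up counts; how the authors set up their contingency table is not described in the article.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: items method A got right and method B got wrong
    c: items method B got right and method A got wrong
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided as observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up discordant counts, purely for illustration.
print(mcnemar_exact(1, 9))
```

Items that both methods get right or both get wrong do not enter the calculation, which is why the test suits paired comparisons such as SFT versus RL on the same proteins.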

Supervised Fine-Tuning (SFT) vs Reinforcement Learning (RL): The researchers preferred SFT outputs for their mechanistic depth, while RL outputs were more conservative, with fewer hallucinations and errors.

Important Takeaways

The research suggests that BioReason-Pro isn’t just another AI model for protein annotation but a tool that can speed up real biological discoveries, with annotations found to be on par with or better than gold-standard databases in most cases. It is a multimodal LLM that reasons about protein function from sequence, structure, domains, and interactions, balancing depth and reliability.

Article Source: Reference Paper | Web-server

Disclaimer:
The research discussed in this article was conducted and published by the authors of the referenced paper. CBIRT has no involvement in the research itself. This article is intended solely to raise awareness about recent developments and does not claim authorship or endorsement of the research.

Important Note: bioRxiv releases preprints that have not yet undergone peer review. As a result, it is important to note that these papers should not be considered conclusive evidence, nor should they be used to direct clinical practice or influence health-related behavior. It is also important to understand that the information presented in these papers is not yet considered established or confirmed.


Saniya is a graduating Chemistry student at Amity University Mumbai with a strong interest in computational chemistry, cheminformatics, and AI/ML applications in healthcare. She aspires to pursue a career as a researcher, computational chemist, or AI/ML engineer. Through her writing, she aims to make complex scientific concepts accessible to a broad audience and support informed decision-making in healthcare.
