Researchers from Insilico Medicine have developed inClinico to predict Phase II clinical trial outcomes. The transformer-based artificial intelligence platform combines an ensemble of clinical trial outcome prediction (CTOP) engines that harness generative artificial intelligence and multimodal data spanning omics, text, clinical trial design, and small molecule properties. By integrating multimodal data sources, multiple scoring approaches, and biological background with deep learning models, inClinico generates a probability of success (PoS) for a Phase II clinical trial, and thereby an estimate of the likelihood of transition from Phase II to Phase III. Pharmaceutical companies can use the inClinico CTOP models to prioritize funding and improve productivity.

Clinical Trial Outcome Prediction: Approaches and Advantages

Successful drug discovery and development is a blessing for civilization, but the cost and risk in terms of labor, money, and time across the entire process, from the pre-discovery stage to marketing, are enormous. Computer-aided techniques now routinely complement the pre-discovery and discovery stages.

After that, the clinical trial phase alone takes 6-7 years. Generally, up to several hundred volunteers are clinically tested with the drug in Phase II. An early estimate of the likelihood of success in a clinical trial helps a pharma company's R&D department work more effectively and productively, and helps cut spending on programs that are likely to fail as well as further investment in them.

Several attempts have been made to develop artificial intelligence technologies that predict whether a clinical trial will succeed. These include toxicity prediction with a machine learning scoring ensemble based on drug descriptors and target features, prediction of drug-induced pathway activation and the consequent side effects of a drug, and a deep neural network that predicts clinical trial outcomes from multimodal data on the molecule tested.

Previous models are limited by small datasets and by validation that was only retrospective. To address these limitations, the Insilico team combined multiple scoring approaches, multimodal data sources, biological background, and state-of-the-art deep learning models into a framework for clinical trial outcome prediction (CTOP).

However, pharmaceutical organizations have so far made virtually no use of AI-based approaches to predict the probability of technical and regulatory success (PTRS), that is, the feasibility of a drug achieving approval. Nonetheless, the paper reports the performance of an investment portfolio based on prospective inClinico forecasts, which showed a good return on investment.

Image Description: inClinico workflow.

inClinico: Potential to Assist Pharmaceutical Companies

The inClinico framework is built on machine learning models whose features represent two modalities: omics features for drugs and targets, and clinical trial attributes. Insilico's proprietary training dataset, which comprises 5,653 unique Phase II clinical trials, was curated by biomedical domain experts with the help of GPT-3.5, accessed via the API provided by OpenAI.

Here, the generative language model GPT-3.5 accelerates data creation by extracting clinical trial results from published literature and press releases. Biomedical experts then review the resulting dataset. An NLP (Natural Language Processing) pipeline based on the state-of-the-art Drug and Disease Interpretation Learning with Biomedical Entity Representation Transformer (DILBERT) maps trials to therapies and conditions. It consists of two modules: a named entity recognition (NER) module and an entity linking (EL) module.
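To make the two-module idea concrete, here is a minimal sketch of an NER step followed by an EL step. It is not DILBERT: the real pipeline is transformer-based, whereas this stand-in uses a plain dictionary lookup. The vocabulary, the identifiers such as DRUG:0001, and the function names are all hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-vocabulary: surface form -> (entity type, canonical ID).
# The IDs are made-up placeholders, not real ontology identifiers.
VOCAB = {
    "lnp023": ("DRUG", "DRUG:0001"),
    "paroxysmal nocturnal hemoglobinuria": ("CONDITION", "COND:0001"),
    "pnh": ("CONDITION", "COND:0001"),
}


@dataclass(frozen=True)
class Mention:
    text: str
    start: int
    end: int
    entity_type: str


def recognize(text):
    """NER step: find vocabulary terms in the text, longest terms first."""
    mentions = []
    taken = [False] * len(text)  # characters already covered by a mention
    for term in sorted(VOCAB, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            if not any(taken[m.start():m.end()]):
                for i in range(m.start(), m.end()):
                    taken[i] = True
                mentions.append(
                    Mention(m.group(0), m.start(), m.end(), VOCAB[term][0])
                )
    return sorted(mentions, key=lambda mention: mention.start)


def link(mention):
    """EL step: map a recognized mention to its canonical identifier."""
    return VOCAB[mention.text.lower()][1]


def map_trial(title):
    """Map a trial title to the sets of therapy and condition IDs it mentions."""
    result = {"DRUG": set(), "CONDITION": set()}
    for mention in recognize(title):
        result[mention.entity_type].add(link(mention))
    return result


title = "A Phase II study of LNP023 in paroxysmal nocturnal hemoglobinuria (PNH)"
print(map_trial(title))
```

In this toy version, the full disease name and its abbreviation resolve to the same identifier, which is the point of entity linking: different surface forms of one therapy or condition end up under a single key.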

inClinico ensembles two predictive models, trial design and target choice, and feeds them into a meta-model that combines their predictions to produce the final probability of success. Importantly, each prediction comes with SHapley Additive exPlanations (SHAP) values, which represent the impact of each trial design feature on the probability of success. SHAP values can also be used to identify the weakest points of each clinical trial.
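The following sketch illustrates both ideas under loud assumptions. The meta-model here is a simple weighted combination of the two base probabilities in logit space, and the trial design model is a toy scorer; the weights, coefficients, and feature names (randomized, has_biomarker, many_endpoints) are invented and are not taken from the paper. The attribution is an exact brute-force Shapley computation over all feature orderings, which is the quantity that SHAP approximates for larger models; it does not use the shap library.

```python
import math
from itertools import permutations


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical meta-model weights (illustrative only, not from the paper).
W_TARGET, W_DESIGN, BIAS = 0.7, 0.3, 0.0


def meta_pos(p_target, p_design):
    """Combine two base-model probabilities into one probability of success."""
    return sigmoid(W_TARGET * logit(p_target) + W_DESIGN * logit(p_design) + BIAS)


def design_model(x):
    """Toy trial design scorer with made-up coefficients."""
    z = -0.5 + 0.8 * x["randomized"] + 0.4 * x["has_biomarker"]
    z -= 0.6 * x["many_endpoints"]
    # Interaction term, so the attributions are not trivially additive.
    z += 0.3 * x["randomized"] * x["has_biomarker"]
    return sigmoid(z)


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    names = list(instance)
    totals = dict.fromkeys(names, 0.0)
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)  # start from the reference trial
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}


instance = {"randomized": 1, "has_biomarker": 1, "many_endpoints": 1}
baseline = {"randomized": 0, "has_biomarker": 0, "many_endpoints": 0}
phi = shapley_values(design_model, instance, baseline)

for name, value in sorted(phi.items(), key=lambda item: item[1]):
    print(f"{name:>15}: {value:+.3f}")  # most negative value first
print("combined PoS:", round(meta_pos(0.8, design_model(instance)), 3))
```

The Shapley values sum to the difference between the model output for the trial and for the baseline, so each value can be read as that feature's share of the shift in predicted probability. The feature with the most negative value is the trial's weakest point in this toy setup.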

Through these explanations, inClinico could offer valuable insights into why clinical trials fail. inClinico was also validated in retrospective, quasi-prospective, and prospective studies; the platform achieved 0.88 ROC AUC in predicting the transition from Phase II to Phase III on a quasi-prospective validation dataset. Another important finding from the quasi-prospective validation study is that target choice is the most significant modality for predicting the outcome of a Phase II clinical trial, with only a marginal contribution from trial design.

This observation suggests that clinical trials mostly fail because of a lack of efficacy. Remarkably, inClinico predicted the success of the first-in-class factor B inhibitor LNP023 in the rare disease paroxysmal nocturnal hemoglobinuria, suggesting that inClinico can perform well even without prior knowledge of the clinical relevance of the drug's mechanism of action in the disease. Overall, the study demonstrates the potential of inClinico for practical use in the pharma industry, supporting better-informed decisions about allocating funds to clinical programs and about investor financing.
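For readers unfamiliar with the ROC AUC metric quoted above, the sketch below computes it from first principles: it is the probability that a randomly chosen successful trial receives a higher predicted score than a randomly chosen failed one, with ties counted as half. The labels and scores are made-up toy values, not data from the study, and the function name is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (success, failure) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # success ranked above failure
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = trial advanced to Phase III, 0 = it did not.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(roc_auc(labels, scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of successes from failures, which gives a sense of scale for the reported 0.88.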


inClinico aims to predict the success of Phase II, that is, the green light that precedes the last clinical trial phase. It combines machine learning with GPT-3.5 for data creation and NLP. Its ability to indicate both the likelihood of Phase II success and the aspects most likely responsible for a failed trial suggests that it could support the optimization and productivity strategies of big pharma companies.

Article Source: Reference Paper

Aditi is a consulting scientific writing intern at CBIRT, specializing in explaining interdisciplinary and intricate topics. As a student pursuing an Integrated PG in Biotechnology, she is driven by a deep passion for experiencing multidisciplinary research fields. Aditi is particularly fond of the dynamism, potential, and integrative facets of her major. Through her articles, she aspires to decipher and articulate current studies and innovations in the Bioinformatics domain, aiming to captivate the minds and hearts of readers with her insightful perspectives.

