The goal of drug discovery and development is to bring new medicines to patients suffering from critical illnesses. Drug discovery has long been a tedious process, and bringing a drug to market still takes 10 to 15 years. As a result, there is considerable interest in new technological approaches to developing drugs. Machine learning tools and techniques are proving their value at every stage of drug discovery, reducing risk and lowering the cost of clinical trials. They play a crucial role in QSAR analysis, de novo drug design, hit discovery, target validation, prognostic biomarker identification, digital pathology, and more.

A drug goes through a lengthy procedure before reaching the market. The stages are discovery and development, pre-clinical research, clinical research, FDA review, and FDA post-market safety monitoring.

Process of Drug Discovery

The drug discovery process still has many gaps. There is a shortage of promising drug discoveries for chronic and deadly diseases.

The implementation of recent technological advances in the drug discovery process provides tailored solutions to meet challenges in the field.

The drug discovery process costs about 2.6 billion dollars and takes more than ten years to reach the people who need it the most. It is highly resource-intensive, and those who contribute most of the resources and time expect a maximum return. Research on cancer drugs is therefore advancing faster than research in most other fields, as it can give high returns. As a result, more than 90% of rare diseases remain incurable or lack any effective treatment, either because the expected return is low or because the drug discovery process for them is especially complex.

The process of drug discovery is complex. It goes through these four steps:

1. Drug Discovery or Target Identification

Target identification is the first step in discovering a drug. It usually takes two years. This step is all about understanding the targets that are responsible for a particular disease, which includes identifying disease-causing microorganisms, DNA mutations, and misfolded proteins.

The challenge in this step is that so many processes, compounds, changes, and mutations are observed at this early stage that it becomes difficult for a human being to make sense of the huge number of possible combinations.

2. Pre-clinical

This step involves screening and testing compounds and establishing proof of their activity. It has three sub-steps:

  • Pre-clinical – Lead Discovery

This step involves screening thousands of compounds that could interact with the disease target. The aim is to narrow the bulk of compounds down to a small set that is worth pursuing. This step usually takes 1-2 years.

  • Pre-clinical – Medicinal Chemistry

This step tests how the screened compounds interact with the disease target. The compounds are analyzed for how strongly they interact with the target, taking their 3D configurations into account, and the most strongly interacting compounds are further optimized. This step usually takes one to two years or more.

  • Pre-clinical – In vitro studies

This step seeks proof that the compound really is the one that interacts strongly with the disease target. Compounds are tested in a cell system. The major challenge is that these tests rely on 2D models of the cell system, whereas real tissue is three-dimensional. This mismatch contributes to the high failure rate of the entire drug discovery process. This step usually takes more than two years.

3. Clinical Review

In this step, trials are performed to validate the compound’s performance against the target disease. It has two sub-steps:

  • Clinical Review – Animal Studies

In animal (in vivo) studies, the compound is tested on animal models such as rats or mice before being given the green light for testing in humans. This step provides more detailed results than in vitro studies in 2D models, but it is also more expensive. Failure rates at this step are also very high. This step usually takes more than two years to complete.

  • Clinical – Trials

If all the above steps show a positive response, the compound is tested in clinics on human participants. The objective of clinical trials is to validate the safety and efficacy of the compound. This is the lengthiest and most expensive step. It is carried out in three phases and usually takes more than six years. This long duration makes it possible to detect any long-term side effects of the drug.

4. FDA Review and FDA Post-Market Safety Monitoring

When all the testing is completed, the drug is submitted to the FDA for review. This review usually takes more than one year. Once approved, the drug can be used commercially in clinics and hospitals and sold in pharmacies. Companies typically get 20 years of patent rights over the drug to protect it from competition. The initial price of the drug is often very high because its research and development is so costly and time-consuming. The company holding the patent is responsible for regular updates and for ensuring that quality does not degrade, and the FDA checks this.
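
As a rough sanity check, the stage durations quoted above can be added up. The Python sketch below is only an illustration: it takes the lower bound of each range given in the text (treating "more than two years" as two years, for example) and assumes the stages run strictly one after another. The names STAGE_MIN_YEARS, total_min_years, and share_of_total are hypothetical.

```python
# Lower-bound duration (in years) for each stage, taken from the text above.
# Treating "more than N years" as exactly N is a simplifying assumption.
STAGE_MIN_YEARS = {
    "Target identification": 2,
    "Lead discovery": 1,
    "Medicinal chemistry": 1,
    "In vitro studies": 2,
    "Animal studies": 2,
    "Clinical trials": 6,
    "FDA review": 1,
}


def total_min_years(stages):
    """Sum the lower-bound durations, assuming stages run one after another."""
    return sum(stages.values())


def share_of_total(stages, name):
    """Fraction of the minimum timeline spent in one stage."""
    return stages[name] / total_min_years(stages)


if __name__ == "__main__":
    total = total_min_years(STAGE_MIN_YEARS)
    print(f"Minimum end-to-end time: {total} years")
    for name, years in STAGE_MIN_YEARS.items():
        share = share_of_total(STAGE_MIN_YEARS, name)
        print(f"  {name}: {years} y ({share:.0%})")
```

Under these assumptions the lower bounds add up to 15 years, and clinical trials account for the largest single share of the timeline.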

Technological Advances in Drug Discovery

Drug discovery is a data-driven process that requires processing a huge amount of data. Done manually, this would take years. The advent of artificial intelligence and machine learning has accelerated the pace of drug discovery. These tools scan and analyze the data, including genomic data, images, and metabolite profiles. Machine learning combined with deep learning correlates, integrates, and processes the patterns found in the data.

1. AI and Machine Learning in Drug Discovery

Since the 1960s, artificial intelligence has been used in medicinal chemistry to design molecules for drug discovery. Machine-learning technologies such as quantitative structure-activity relationship (QSAR) modeling have identified promising candidate molecules from among millions of compounds. Beyond drug discovery, AI is used for a variety of activities ranging from robotics control to image analysis and logistics. Within drug discovery, it has been applied throughout the process, from target selection to hit detection, lead optimization, preclinical investigations, and clinical trials.
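
To make the idea of QSAR modeling concrete, here is a deliberately tiny Python sketch. It fits a straight line relating a single molecular descriptor to measured activity and then ranks untested compounds by predicted activity. Real QSAR models use many descriptors and more capable learners; the data, the compound names, and the functions fit_line, predict, and rank_candidates are all made up for illustration.

```python
# Toy QSAR sketch: fit activity = slope * descriptor + intercept by least squares.
# All numbers below are invented for illustration; they are not real measurements.


def fit_line(xs, ys):
    """Ordinary least squares for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predicted activity for a descriptor value x."""
    slope, intercept = model
    return slope * x + intercept


def rank_candidates(model, candidates):
    """Sort (name, descriptor) pairs by predicted activity, highest first."""
    return sorted(candidates, key=lambda item: predict(model, item[1]), reverse=True)


# Hypothetical training set: descriptor values and measured activities.
TRAIN_X = [1.0, 2.0, 3.0, 4.0]
TRAIN_Y = [5.1, 5.9, 7.1, 7.9]

# Hypothetical untested compounds: (name, descriptor value).
CANDIDATES = [("cmpd_A", 1.5), ("cmpd_B", 3.5), ("cmpd_C", 2.5)]

if __name__ == "__main__":
    model = fit_line(TRAIN_X, TRAIN_Y)
    for name, x in rank_candidates(model, CANDIDATES):
        print(name, round(predict(model, x), 2))
```

The same pattern (learn from compounds with known activity, then score compounds that have not been tested) is what lets a QSAR model prioritize a small number of candidates from a very large library.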

Machine learning means teaching machines to learn from data. Images, data, and information are first fed into the machine, which is trained to perform tasks in a way loosely modeled on the human brain. Using an algorithm, it extracts patterns from this training data, and it then uses those learned patterns to process new inputs. Artificial intelligence can therefore be described as a field that deals with a wide range of data and information, processes it with algorithms, and extracts useful knowledge from it. It is related to fields like statistics, machine learning, pattern recognition, neural networks, and computational intelligence.

Machine learning approaches in drug discovery can be applied to:

  • Predict the structure of the target protein accurately
  • Identify and optimize hits
  • Explore drug-protein interactions
  • Design models to predict the properties of drug candidates

Target protein analysis is tedious for biologists and other scientists because it involves a rigorous screening procedure. Machine learning can now do this job much faster while also lowering the overall cost.

Deep learning underpins a method called DeepBAR, which estimates how strongly a drug candidate binds to its target. The stronger the predicted binding, the more likely the candidate is to act on that target. Conventional calculations of this kind are built up step by step, with each step relying on the results of the previous one. DeepBAR can calculate binding free energy accurately.

The ‘BAR’ in DeepBAR stands for “Bennett acceptance ratio,” an algorithm for free energy calculations that dates back decades. Today, it is computed with the help of machine learning.

The Bennett acceptance ratio uses two end states of the system: one in which the drug molecule is bound to the protein and one in which it is not. Conventionally, information about intermediate states between these two is also needed for the calculation. With machine learning, DeepBAR calculates the Bennett acceptance ratio without using any intermediate-state information.

DeepBAR avoids the intermediate states by using deep generative models. It calculates binding free energy accurately and much faster than previous methods.
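
For readers who want to see what a Bennett acceptance ratio calculation looks like, the Python sketch below applies the classic estimator to a toy problem. It is not DeepBAR: there is no generative model and no molecular simulation. Two one-dimensional harmonic wells stand in for the two end states, chosen because their free energy difference is known exactly. The function names and the toy system are assumptions made for illustration.

```python
import math
import random


def fermi(x):
    """Fermi function 1 / (1 + exp(x)), written to avoid overflow."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(work_forward, work_reverse, beta=1.0, lo=-50.0, hi=50.0, tol=1e-8):
    """Solve Bennett's self-consistent equation for delta_F by bisection.

    work_forward: U_B(x) - U_A(x) for samples x drawn from state A.
    work_reverse: U_A(x) - U_B(x) for samples x drawn from state B.
    """
    m = math.log(len(work_forward) / len(work_reverse))

    def imbalance(delta_f):
        left = sum(fermi(beta * (w - delta_f) + m) for w in work_forward)
        right = sum(fermi(beta * (w + delta_f) - m) for w in work_reverse)
        return left - right  # increases monotonically with delta_f

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def demo(n=5000, k_a=1.0, k_b=4.0, seed=0):
    """Toy end states: wells U(x) = k * x**2 / 2 sampled at beta = 1."""
    rng = random.Random(seed)
    # At beta = 1 the Boltzmann distribution of each well is Gaussian, variance 1/k.
    xs_a = [rng.gauss(0.0, 1.0 / math.sqrt(k_a)) for _ in range(n)]
    xs_b = [rng.gauss(0.0, 1.0 / math.sqrt(k_b)) for _ in range(n)]
    work_forward = [0.5 * (k_b - k_a) * x * x for x in xs_a]
    work_reverse = [0.5 * (k_a - k_b) * x * x for x in xs_b]
    estimate = bar_free_energy(work_forward, work_reverse)
    exact = 0.5 * math.log(k_b / k_a)  # analytic F_B - F_A for harmonic wells
    return estimate, exact


if __name__ == "__main__":
    estimate, exact = demo()
    print(f"BAR estimate: {estimate:.3f}  exact: {exact:.3f}")
```

The estimator works well here because samples from the two wells overlap heavily. When the end states overlap poorly, as bound and unbound states of a real drug-protein system typically do, the conventional remedy is to add intermediate states, which is the step DeepBAR is described as removing.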

In the future, DeepBAR could be used in protein design and engineering, since it could model interactions among multiple proteins, an approach made feasible by techniques borrowed from computer vision.

Binding free energy measures the affinity between a drug molecule and a target protein. The lower (more negative) the binding free energy, the stronger the binding. This implies that a drug with lower binding free energy will be more successful in competing against other molecules for the target.

Scientists have also been using machine learning to classify digital images of cells, each treated with a different experimental compound. The algorithms can quickly group compounds with similar effects just by looking at the images.

The drug discovery process, which usually takes more than ten years, can be accelerated by machine learning, artificial intelligence, and related computer science technologies.
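
The link between binding free energy and binding strength can be made quantitative with the standard thermodynamic relation ΔG = RT ln(Kd / 1 M), where Kd is the dissociation constant (a smaller Kd means tighter binding). The short Python sketch below converts between the two. The function names are illustrative, and the default temperature of 298.15 K is an assumption.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """Kd in mol/L from the standard binding free energy in kcal/mol.

    Uses delta_G = R * T * ln(Kd / 1 M), so more negative delta_G gives smaller Kd.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from Kd in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    for delta_g in (-6.0, -8.0, -10.0, -12.0):
        kd = dissociation_constant(delta_g)
        print(f"delta_G = {delta_g:6.1f} kcal/mol  ->  Kd = {kd:.2e} M")
```

Because the relation is exponential, each additional 1.4 kcal/mol or so of binding free energy at room temperature tightens binding by roughly a factor of ten, which is why small errors in the calculated energy matter so much.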

Some of the current innovations in this field are:

  • Insilico Medicine

Insilico Medicine is an artificial intelligence-based startup working in the field of pharma and medicine. The company reported discovering a new drug candidate in just 46 days, covering every task from identifying and synthesizing the molecule to validating it.

  • AlphaFold

AlphaFold is an AI system developed by DeepMind, a subsidiary of Alphabet, Google’s parent company. It predicts the 3D structure of a protein. Knowing the structure of a protein is crucial for identifying and designing drugs. AlphaFold can predict protein 3D structures quickly and with high accuracy, and AlphaFold 2 increased the accuracy further. The results of AlphaFold 2 at CASP14 (the 14th Critical Assessment of Techniques for Protein Structure Prediction) were described as astounding and transformational. AlphaFold 2, in combination with high-performance computing, has been applied by researchers to genome-scale protein structure and function prediction.

  • Variational Autoencoders (VAE)

Synbiolic is an AI-powered end-to-end rational drug design platform that aims to make drugs more accessible to people all around the world. Synbiolic leverages Variational Autoencoders (VAE), a generative machine learning algorithm. A generative model is an AI architecture that generates new outputs resembling the data it was trained on. The neural network of a Variational Autoencoder has two components:

  • The encoder encodes the input, reducing the dimension of the data. The resulting compact representation is known as the latent space representation.
  • The decoder decodes the encoded data. It is the reverse of the encoder: it reconstructs the data in its original dimension from the latent representation.
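
The encoder and decoder roles can be sketched in a few lines of Python. The toy model below is untrained and uses random weights, so its outputs are meaningless; it only shows how data flows from the input through a smaller latent space and back to the original dimension. It is not Synbiolic's implementation. TinyVAE, its method names, and the 8-bit "fingerprint" input are all hypothetical.

```python
import math
import random


def linear(weights, bias, vector):
    """Dense layer: weights is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Random weights and zero biases for an untrained layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class TinyVAE:
    """Untrained toy VAE: input_dim -> latent_dim -> input_dim."""

    def __init__(self, input_dim, latent_dim, seed=0):
        self.rng = random.Random(seed)
        self.w_mu, self.b_mu = random_layer(self.rng, latent_dim, input_dim)
        self.w_logvar, self.b_logvar = random_layer(self.rng, latent_dim, input_dim)
        self.w_dec, self.b_dec = random_layer(self.rng, input_dim, latent_dim)

    def encode(self, x):
        """Map the input to the mean and log-variance of a latent Gaussian."""
        mu = linear(self.w_mu, self.b_mu, x)
        logvar = linear(self.w_logvar, self.b_logvar, x)
        return mu, logvar

    def sample_latent(self, mu, logvar):
        """Reparameterization: z = mu + sigma * eps with eps drawn from N(0, 1)."""
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z):
        """Map a latent vector back to the input dimension (sigmoid outputs)."""
        return [1.0 / (1.0 + math.exp(-a)) for a in linear(self.w_dec, self.b_dec, z)]


def kl_divergence(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)), the regularizer in the VAE loss."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


if __name__ == "__main__":
    fingerprint = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical 8-bit molecular fingerprint
    vae = TinyVAE(input_dim=8, latent_dim=2)
    mu, logvar = vae.encode(fingerprint)
    z = vae.sample_latent(mu, logvar)
    print("latent vector:", z)
    print("reconstruction:", vae.decode(z))
    print("KL term:", kl_divergence(mu, logvar))
```

In a trained model, the weights would be fitted so that reconstructions match the inputs while the KL term keeps the latent space well-behaved. New molecules can then be proposed by decoding points sampled from the latent space.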

Artificial intelligence, together with machine learning and deep learning, is advancing the fields of medicine and biology. With the advances in machine learning tools and techniques, the lengthy and costly drug discovery process is becoming shorter, more cost-effective, and more feasible.

2. Automation in Drug Discovery

In the area of drug discovery, automation enables pharmaceutical companies to make better decisions faster. The need for automation and robots in drug discovery is not new, but the field is making continuous progress. Implementing high-throughput techniques requires automation. Automation and robotics were first adopted to increase throughput, but the consistency of automated procedures and the improved data quality made them even more desirable. Human error is reduced, and an audit trail enables traceability in the event that questions emerge. Scientists can devote their time to other tasks while the automation is running.

Recent advancements in fields like microfluidics-assisted chemical synthesis and biological testing, as well as artificial intelligence systems that refine a design hypothesis through feedback analysis, are laying the groundwork for more automation in this process.

Advancements in areas like ‘organ-on-a-chip’ technologies and artificial intelligence are also laying the groundwork for more widespread use of semi-autonomous or even fully autonomous processes that assist project teams in screening and optimizing tool and hit compounds in drug discovery.

Large pharmaceutical corporations initially used automation for high-throughput screening (HTS) trials. HTS allowed researchers to test libraries of small-molecule compounds under one or more experimental conditions to see whether they might be used to treat a specific ailment. HTS has progressed to the point that it can screen libraries of millions of compounds, but the high cost of equipment long limited automation to large pharmaceutical corporations.
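
One common way to turn raw HTS readouts into a list of hits is to normalize each well against control wells and apply a cutoff. The Python sketch below shows that idea under simple assumptions: a lower signal means more inhibition, and a 50% inhibition threshold defines a hit. The plate values, compound names, threshold, and function names are all invented for illustration.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def percent_inhibition(signal, neg_ctrl_mean, pos_ctrl_mean):
    """Scale a raw well signal so the negative control maps to 0%
    and the positive (fully inhibited) control maps to 100%."""
    return 100.0 * (neg_ctrl_mean - signal) / (neg_ctrl_mean - pos_ctrl_mean)


def call_hits(wells, neg_ctrl, pos_ctrl, threshold=50.0):
    """Return {compound: percent inhibition} for wells at or above the threshold."""
    neg_mean, pos_mean = mean(neg_ctrl), mean(pos_ctrl)
    hits = {}
    for compound, signal in wells.items():
        inhibition = percent_inhibition(signal, neg_mean, pos_mean)
        if inhibition >= threshold:
            hits[compound] = inhibition
    return hits


# Hypothetical plate data (arbitrary signal units).
NEG_CTRL = [1000.0, 980.0, 1020.0]  # no compound: uninhibited signal
POS_CTRL = [100.0, 90.0, 110.0]     # reference inhibitor: fully inhibited signal
WELLS = {
    "cmpd_001": 950.0,
    "cmpd_002": 400.0,
    "cmpd_003": 120.0,
    "cmpd_004": 700.0,
}

if __name__ == "__main__":
    for compound, inhibition in sorted(call_hits(WELLS, NEG_CTRL, POS_CTRL).items()):
        print(f"{compound}: {inhibition:.1f}% inhibition")
```

Because the same calculation is applied to every well, it scales directly from a four-compound example to a library of millions, which is where automation becomes essential.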

These days, new types of robots, together with sophisticated software tools, have helped to democratize access to automation, allowing pharma and biotechnology companies of all sizes to implement these solutions in their labs.

Automation is being used in drug discovery for a variety of purposes, including parallel and combinatorial synthesis, virtual screening, molecular design for medicinal chemistry discovery cycles, and compound libraries. It could also be used to develop stable formulations for small molecules, proteins, RNA-based medications, and cell therapies in drug formulation studies.

In the drug discovery sector, automated liquid handlers are one of the most popular types of automated equipment. These machines range from simple liquid dispensers to more complex systems that allow variable-driven volumetric transfers, error handling to cope with experimental variability, and even the integration of additional lab instruments and devices. Manual liquid handling that needs accuracy and precision can be time-consuming and raises the risk of repetitive stress injuries, so these technologies are a great help to scientists. Automating the process reduces risk and increases the quality and integrity of the data acquired from experiments.

As automation becomes more widespread in drug discovery, management software capable of coordinating the multiple instruments and systems in a lab has become increasingly important. This sort of automation software keeps expanding to provide more functionality and reliable capture of a variety of parameters, including experimental operations and ambient conditions.

3. Robotics in Drug Discovery

Since the 1980s, robots have been used in biomedical research, mostly for sample processing. Their role in the drug discovery process has remained limited, which is one reason the development, testing, and commercialization process still takes well over a decade on average.

Robotics is now making an increasing impact in the pharmaceutical industry. The use of robotics in drug discovery provides more options to drug developers around the world.

Recent improvements in laboratory automation and robotics, as well as advances in artificial intelligence and machine learning, have ushered in a new era in life science and pharmaceutical research. Tasks can now be completed at speeds and precision that surpass human capabilities.

A robot scientist is a laboratory system that automates scientific research using artificial intelligence (AI). It generates hypotheses on its own, prepares experiments, conducts them using laboratory automation equipment, analyzes the results, and repeats the process.

AstraZeneca’s drug discovery robot ‘NiCoLA-B’ at the UK Centre for Lead Discovery tests more than 300,000 compounds per day in a ballet of operations led by a central mechanical arm. The company claims it is the world’s fastest robot of its kind.

A digital biotech start-up, ‘3Scan’, is working on a robotic microscopy platform for tissue morphology, drug response, discovery, and development, as well as automated cell counting and diagnostics. It creates a digital 3D spatial map of the tissue using its own robotic microscope and computer vision tools. It can also assess the vascular system and blood flow. Thanks to the system, life scientists will be able to work with larger data sets derived from tissue samples. Its knife-edge scanning microscope can cut a tissue sample into more than 3,000 slices each hour.

Eve, the robot scientist, has been assembled and is presently in use at the Chalmers University of Technology. Eve’s first objective is to find and test medications that are effective against COVID-19.


Evidently, drug discovery is a difficult task that necessitates skillful navigation across a multidimensional, multimodal search space. The implementation of recent technological advances in drug discovery provides tailored solutions to meet challenges in the field. Embracing the new technologies discussed in this article (AI, machine learning, automation, and robotics) for planning and performing compound design, synthesis, and testing, without fearing a loss of control, could substantially improve the effectiveness of drug discovery.



Anjali Soni is a Scientific Content Writing Intern at CBIRT. She's a student at Global Engineering College at Jabalpur, a writer by day, and a reader by night. She loves to listen to soft music and breathe in cold spring evenings alone under the tree, lost in her own world. Read, eat, travel, and enjoy are all that she wishes for in Life.

