Computer programming is crucial for life scientists, as it enables them to perform many essential research tasks. However, learning to code can be challenging for life science students and researchers. Recently, a study evaluated the effectiveness of one large language model (LLM), ChatGPT, in completing basic-to-moderate-level programming exercises from an introductory bioinformatics course. The results were remarkable and highlighted ChatGPT’s potential to provide life science students and researchers with meaningful coding assistance!
The Increasing Need for Computational Tools for Life Scientists
There is a pressing need for life scientists to possess computing skills in order to formalize scientific processes, accelerate research progress, and improve their job prospects. In fact, a 2016 survey revealed that nearly 90% of life science researchers had a current or impending need to use computational methods in their work. As a result, a growing number of courses, workshops, and training programs aim to equip life science researchers and students with computing skills.
What makes coding increasingly crucial for researchers is that it allows them to address tasks that cannot be accomplished with existing tools, to customize algorithms to suit their needs, and to effectively organize extensive and complex datasets. Currently, the programming language Python is a popular choice due to its ease of use and the availability of libraries for a wide range of tasks. However, finding effective ways to learn coding skills remains a significant challenge in the life sciences.
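To give a flavor of what such entry-level tasks look like, here is a minimal illustration of our own (not taken from the study) showing how readable Python can be for a classic bioinformatics computation, the GC content of a DNA sequence:

```python
def gc_content(sequence: str) -> float:
    """Return the GC content of a DNA sequence as a percentage."""
    sequence = sequence.upper()
    gc_count = sequence.count("G") + sequence.count("C")
    return 100 * gc_count / len(sequence)

print(gc_content("ATGCGCGTTA"))  # 50.0
```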
The Potential of ChatGPT
In recent years, advancements in AI, such as OpenAI’s large language model ChatGPT, have raised the prospect of aiding, or even replacing, the coding that life scientists do themselves. This could have significant implications for life sciences research and education. A recent paper posted on arXiv evaluates the effectiveness of ChatGPT in performing bioinformatics programming tasks, suggesting that machine-learning models can produce usable solutions and may transform how life science researchers and students approach coding in their careers.
The authors chose ChatGPT for this study because it is a recently developed language model, it is accessible through a web browser, it can communicate with users in a conversational fashion, and it is extremely popular, having reached more than 100 million monthly active users within months of its release!
The study evaluated and documented ChatGPT’s effectiveness by having it solve Python programming exercises from an introductory bioinformatics course taught to undergraduate university students. ChatGPT’s abilities to interpret prompts, respond to human feedback, and generate functional code were all examined.
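To make this concrete, here is a hypothetical exercise in the style of such a course (not taken from the paper), along with the kind of solution a student, or ChatGPT, might produce: write a function that returns the reverse complement of a DNA sequence.

```python
# Map each nucleotide to its complementary base
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))

print(reverse_complement("ATGC"))  # GCAT
```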
Performance of ChatGPT
The study assessed ChatGPT’s ability to solve 184 Python programming problems. The authors found that ChatGPT solved 139 problems (75.5%) on its first attempt; after further interaction and additional information from the user, it solved a staggering 97.3% of the exercises (179 of 184) within a few attempts. Only 5 of the 184 exercises remained unsolved, owing to their complexity and the need to combine multiple programming skills.
Furthermore, the study evaluated the factors that influenced ChatGPT’s success rate, including the difficulty level of the exercises, the length of the instructors’ solutions, and the length of the exercise prompts. Fascinatingly, the code length for the exercises ChatGPT solved did not differ much from that of the unsolved ones. However, the length of the instructors’ solutions correlated strongly with the length of ChatGPT’s solutions, in both characters and lines. It is also worth noting that prompt length did not significantly influence ChatGPT’s success rate.
The study also analyzed the number of attempts ChatGPT needed to solve each exercise. This number correlated positively with the length of both the instructor’s solution and the prompt, even for the unsolved exercises. While this metric should be interpreted cautiously given ChatGPT’s stochastic nature, fewer attempts may suggest that the model can provide valid solutions more quickly, thus saving user time. Moreover, the authors found that 53.3% of the prompts were framed in a biological context and that these prompts were longer than the others. Furthermore, for 24 exercises, the prompt exceeded the model’s maximum input length and had to be truncated; remarkably, ChatGPT still solved all of these exercises despite the missing information.
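For readers curious what such a correlation analysis might look like in practice, here is a minimal sketch, using made-up line counts rather than the study’s data, of how one could relate instructor-solution length to ChatGPT-solution length with SciPy:

```python
from scipy.stats import pearsonr

# Hypothetical data: line counts of instructor solutions and of
# ChatGPT's solutions for the same exercises (not from the paper).
instructor_lines = [5, 12, 8, 20, 15, 7, 30, 10]
chatgpt_lines = [6, 14, 7, 22, 13, 9, 28, 11]

r, p_value = pearsonr(instructor_lines, chatgpt_lines)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")
```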
Limitations
Despite ChatGPT’s remarkable performance, the authors highlighted several challenges it faced when interpreting programming prompts. In some exercises, ChatGPT used correct logic but produced outputs that differed from the expected ones. In other exercises, it generated code that resulted in logic or runtime errors. In two exercises, the generated code did not directly address the prompt at all.
ChatGPT also exhibited several practical issues. For instance, for 22.2% of the exercises, it used at least one programming technique that would not have been familiar to most students in the course. Another notable limitation was that ChatGPT often generated code addressing only the latter parts of a prompt, possibly because it ‘forgot’ earlier parts of the conversation or was ‘distracted’ by subsequent inputs.
Implications and Conclusion
The remarkable findings of this study signal a new era for life science researchers as machine-learning models are increasingly used for programming tasks. Models such as ChatGPT can accurately translate natural-language descriptions into code for basic-to-moderate-level programming tasks, representing a significant advance over prior models and opening new possibilities for researchers and students in the field.
The authors acknowledge that the conversations with ChatGPT were sometimes awkward and required cognitive effort. Essentially, they stress that while machine-learning models can provide valuable learning assistance, they should not be relied upon exclusively. Human feedback remains crucial, notably for predicting the output of code and for redirecting the model when it addresses only the latter parts of a prompt, ‘forgets’ earlier parts of the conversation, or is ‘distracted’ by subsequent inputs.
To conclude, the study results demonstrate that ChatGPT represents a considerable advance in the life sciences field, with applications and implications for education and research!
Article Source: Reference Article
Important Note: arXiv releases preprints that have not yet undergone peer review. As a result, these papers should not be considered conclusive evidence, nor should they be used to direct clinical practice or influence health-related behavior; the information they present is not yet considered established or confirmed.
Diyan Jain is a second-year undergraduate majoring in Biotechnology at Imperial College London and currently interning as a scientific content writer at CBIRT. His passion for writing and science has led him to pursue this opportunity to communicate cutting-edge research and discoveries to a broader public in an engaging way. Diyan is also working on a personal research project evaluating the potential for genome sequencing studies and GWAS to identify disease likelihood and determine personalized treatments. With his fascination for bioinformatics and science communication, he is committed to delivering high-quality content at CBIRT.