Scientists from the University of California Developed a Highly Accurate Algorithm, the ‘La Jolla Assembler (LJA)’, to Scale Up Complete Genome Assembly

March 12, 2022

An international team of researchers led by the University of California San Diego’s Department of Computer Science and Engineering has demonstrated that the La Jolla Assembler (LJA), a new genome assembly algorithm, vastly improves large genome reconstruction. Genome reconstruction, the process of assembling DNA fragments into complete genomes, is a crucial part of genomic sequencing.

LJA also lowers error rates and makes it easier to scale complete human genome assembly. This will simplify massive population studies, in which the genomes of thousands or millions of people are sequenced and compared to better understand the genetic factors that lead to disease. The findings were published in the journal Nature Biotechnology this week.

Image Description: UC San Diego computer scientist Pavel Pevzner’s team adopted a computational approach called de Bruijn graphs, which helped them assemble millions of “reads” into complete genomes. This technique models a genome as a complex road network that connects various cities (short genomic fragments) and finds ways to traverse the network while using each road.
Image Source: https://ucsdnews.ucsd.edu/pressrelease/algorithm-scales-ability-to-assemble-complete-genomes

“We used LJA to completely reconstruct almost half of the chromosomes in the human genome in a completely automatic fashion,” says Pavel Pevzner, the Ronald R. Taylor Distinguished Professor of Computer Science and senior author on the paper. LJA is a fast tool that combines a Bloom filter, sparse de Bruijn graphs, and disjointig generation to enable automated assembly of long HiFi reads.

Compared to previous methods for assembling long, high-fidelity (HiFi) reads, LJA reduced assembly errors five-fold. This precision will be useful in large population studies of complex and poorly studied regions of the human genome, such as centromeres or antibody-generating regions.
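To give a feel for one of the components named above, here is a minimal Python sketch of a Bloom filter used to remember which k-mers (length-k substrings of a read) have been seen. It is an illustration only and not LJA's implementation: the class name `BloomFilter`, the helper `kmers`, the choice of BLAKE2b hashing, and all parameter values are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Compact set membership: may report false positives, never false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # one bit per slot

    def _positions(self, item: str):
        # Derive num_hashes bit positions from salted BLAKE2b digests.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def kmers(read: str, k: int):
    """Yield every length-k substring of a read, left to right."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


# Usage: record the 4-mers of a toy read, then query one of them.
seen = BloomFilter(num_bits=1024, num_hashes=3)
for kmer in kmers("ACGTACGTGA", 4):
    seen.add(kmer)
print("ACGT" in seen)  # True: an inserted item is always reported as present
```

The appeal of this structure is that it stores only a fixed number of bits regardless of how long the k-mers are, at the cost of occasionally answering "present" for a k-mer that was never inserted.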

Genome assemblers are computer programs that reconstruct genomes from a set of shorter sequences (reads). For many years, researchers relied almost exclusively on short-read methods, which generate reads of up to 300 nucleotides. These produced critical genomic data, but they also left gaps in genomic sequences, many of them in biomedically significant regions. As a result, the Human Genome Project, which was completed two decades ago, left thousands of unassembled sections: unknown DNA with clinical and scientific implications.

“This incomplete human genome assembly produced a revolution in biology and medicine 20 years ago,” says Anton Bankevich, a postdoctoral researcher in the Department of Computer Science and Engineering and first author on the paper. “However, the missing pieces of the genome may hold many more secrets.”

Long HiFi reads (greater than 10,000 nucleotides) have recently become popular among scientists, allowing them to sequence complete human and animal genomes. Last year, the Telomere-to-Telomere (T2T) consortium produced the first complete human genome, a significant milestone. This effort, however, required a great deal of manual labor and would be nearly impossible to scale to hundreds, let alone millions, of genomes.

To automate the process and boost speed and accuracy, Pevzner’s team used de Bruijn graphs, a computational approach that helped them assemble millions of reads into complete genomes. The approach represents a genome as a complex road network linking numerous cities (short genomic fragments) and finds ways to traverse the network while using each road. It builds on graphs devised by Dutch mathematician Nicolaas de Bruijn and has since become a sequencing workhorse. In some ways, history was repeating itself: Pevzner and others employed de Bruijn graphs to make sense of short reads more than 20 years ago.
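The road-network picture can be made concrete with a toy Python sketch. It assumes error-free reads from a single strand and a genome in which an Eulerian path (a route that uses every road exactly once) exists; it is the textbook de Bruijn construction, not LJA's multiplex de Bruijn graph. The function names `build_de_bruijn`, `eulerian_path`, and `spell`, as well as the example reads, are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Cities are (k-1)-mers; each distinct k-mer is one road from its prefix to its suffix."""
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:  # overlapping reads repeat k-mers; keep one road each
                seen.add(kmer)
                graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm. Assumes the graph is non-empty and has an Eulerian path."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for v in targets:
            in_deg[v] += 1
    # Start where out-degree exceeds in-degree by one, else at any city with roads.
    start = next(
        (u for u, targets in graph.items() if len(targets) - in_deg[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(targets) for u, targets in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused road
        else:
            path.append(stack.pop())  # dead end: commit this city to the path
    return path[::-1]


def spell(path):
    """Glue consecutive cities together: the first in full, then one new letter each."""
    return path[0] + "".join(node[-1] for node in path[1:])


# Usage: three overlapping reads from the toy genome ATGGCGTGCA.
reads = ["ATGGCGT", "GGCGTGC", "GCGTGCA"]
graph = build_de_bruijn(reads, k=4)
print(spell(eulerian_path(graph)))  # ATGGCGTGCA
```

In this toy case every 3-letter fragment occurs only once, so the route is unique. Real genomes contain long repeats that create many branching routes, which is one reason the choice of k and the handling of repeats matter so much in practice.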

“Although it looks like simply applying this 20-year old technique to HiFi reads would lead to excellent human genome assemblies, all previously developed algorithmic ideas fall apart when faced with constructing the enormously complex de Bruijn graph of the human genome,” said Andrey Bzikadze, a co-author on the paper and a graduate student in the Bioinformatics and Systems Biology Program at UC San Diego. “Reusing old methods would require a prohibitive amount of computer memory, making them impossible to implement.”

LJA solves this problem by reducing both the data footprint and the number of assembly errors. It paves the way for faster and more accurate large-scale population studies, in which scientists will need to assemble millions of genomes to find the gene sequences that cause disease or confer good health.

Assembling a single genome is not enough to drive biological progress. By comparing different genomes, scientists can learn how they function and how they are linked to disease. This requires scaling genome assembly efforts and developing algorithms that produce assemblies of the same quality as the T2T human genome but run automatically.

Story Source: Bankevich, A., Bzikadze, A.V., Kolmogorov, M. et al. Multiplex de Bruijn graphs enable genome assembly from long, high-fidelity reads. Nat Biotechnol (2022). https://doi.org/10.1038/s41587-022-01220-6

https://ucsdnews.ucsd.edu/pressrelease/algorithm-scales-ability-to-assemble-complete-genomes

Dr. Tamanna Anwar

Dr. Tamanna Anwar is a Scientist and Co-founder of the Centre of Bioinformatics Research and Technology (CBIRT). She is a passionate bioinformatics scientist and a visionary entrepreneur. Dr. Tamanna has worked as a Young Scientist at Jawaharlal Nehru University, New Delhi. She has also worked as a Postdoctoral Fellow at the University of Saskatchewan, Canada. She has several scientific research publications in high-impact research journals. Her latest endeavor is the development of a platform that acts as a one-stop solution for all bioinformatics related information as well as developing a bioinformatics news portal to report cutting-edge bioinformatics breakthroughs.