- Nvidia and research partners introduce Evo 2.
- Evo 2 can identify disease-causing mutations and assist in synthetic genome design.
Nvidia and its research partners have developed an artificial intelligence model designed to analyse genetic sequences at an unprecedented scale.
Announced on February 19, the Evo 2 AI is built to read and design genetic code from different life forms. By finding patterns in DNA and RNA sequences, Evo 2 can process biological data in ways that would take researchers years of manual work.
The model was designed to detect disease-causing mutations in human genes, and it can also generate synthetic genomes as complex as those found in bacteria. Scientists believe that the model’s ability to analyse data at scale could speed research in medicine, genetics, and bio-engineering.
Expanding AI’s role in biology
Evo 2 builds on its predecessor, Evo 1, which focused on the genomes of single-celled organisms. The newer version was trained on 9.3 trillion nucleotides, the fundamental components of genetic material, sourced from more than 128,000 whole genomes.
The training set also includes metagenomic data, extending the model's knowledge base beyond bacteria, archaea, and phages to genetic information from humans, plants, and other multicellular species.
According to the researchers, such a model can recognise complex patterns in genetic sequences that would be difficult for traditional methods to detect. One of its primary applications is to identify dangerous mutations, like those associated with genetic illnesses.
In early tests, Evo 2 correctly identified 90% of potentially harmful mutations in BRCA1, a breast cancer-linked gene. Scientists believe that this capability could support the development of targeted gene therapies, allowing treatments to target only specific cells while lowering the risk of unintended genetic modifications.
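To make the 90% figure concrete: a detection rate of this kind is usually computed as the share of known harmful variants that a model flags, often called recall or sensitivity. The sketch below assumes that reading of the result. The labels and predictions are invented toy values for illustration, not BRCA1 data or Evo 2 output, and `detection_rate` is a hypothetical helper name.

```python
def detection_rate(is_harmful, flagged):
    """Return the share of truly harmful variants that were flagged (recall)."""
    if len(is_harmful) != len(flagged):
        raise ValueError("inputs must have the same length")
    harmful_total = sum(is_harmful)
    if harmful_total == 0:
        raise ValueError("no harmful variants to evaluate")
    # Count variants that are both harmful and flagged by the model.
    caught = sum(1 for truth, pred in zip(is_harmful, flagged) if truth and pred)
    return caught / harmful_total


# Toy example (made-up values): 10 harmful variants, 9 of them flagged,
# plus 2 benign variants, one of which is wrongly flagged.
toy_truth = [True] * 10 + [False] * 2
toy_flags = [True] * 9 + [False] + [False, True]
print(detection_rate(toy_truth, toy_flags))  # 0.9
```

Note that this measure ignores benign variants that are wrongly flagged, so it is normally reported alongside other metrics.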
Patrick Hsu, co-founder of the Arc Institute and senior researcher on Evo 2, described the model as a step toward generative biology, in which AI can “read, write, and think in the language of nucleotides.” He said Evo 2 has a broad understanding of genetic structures, making it useful for tasks like identifying disease-causing mutations and designing artificial genetic sequences for scientific research.
Computing power behind Evo 2
Evo 2 was trained over several months on 2,000 Nvidia H100 GPUs, using the Nvidia DGX Cloud AI platform on AWS infrastructure. The model can process genetic sequences of up to 1 million nucleotides at once, allowing it to analyse complex relationships across entire genomes. To support this degree of processing, researchers developed a new AI architecture called StripedHyena 2, which is designed to handle large-scale biological datasets efficiently.
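As a rough illustration of what a 1-million-nucleotide limit means in practice, the sketch below splits a longer sequence into windows that each fit within that limit. It is not Evo 2's API or its actual preprocessing: `iter_windows` and the `overlap` parameter are hypothetical names introduced here, and only the 1,000,000 figure comes from the article.

```python
from typing import Iterator, Tuple

MAX_CONTEXT = 1_000_000  # nucleotides per window, the limit quoted above


def iter_windows(
    sequence: str, size: int = MAX_CONTEXT, overlap: int = 0
) -> Iterator[Tuple[int, str]]:
    """Yield (start_index, window) pairs covering the whole sequence.

    `overlap` is a hypothetical knob: the number of nucleotides shared
    between consecutive windows so context is not cut abruptly.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in the range [0, size)")
    step = size - overlap
    for start in range(0, len(sequence), step):
        yield start, sequence[start:start + size]
        # Stop once a window has reached the end of the sequence.
        if start + size >= len(sequence):
            break


# Toy usage with a short sequence and a small window size.
for start, window in iter_windows("ACGTACGTACGT", size=5):
    print(start, window)
```

A sequence of 2.5 million nucleotides would yield three windows with the default settings, while anything up to 1 million nucleotides fits in a single window.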
According to the team, the architecture enabled Evo 2 to be trained on 30 times more data than Evo 1 and to analyse eight times as many nucleotides at a time. Greg Brockman, co-founder of OpenAI, worked on the project during a sabbatical, helping to optimise the AI for large-scale biological research.
Applications beyond medicine
While Evo 2 has shown promise in medical research, scientists believe the model could also advance fields such as agriculture, environmental science, and synthetic biology. Potential applications include:
- Developing crops that are more resilient to climate change, with improved resistance to drought, pests, and extreme weather conditions.
- Engineering organisms capable of breaking down environmental pollutants, offering new approaches to reducing industrial and agricultural waste.
- Studying genetic adaptations in different species to better understand evolutionary biology and biodiversity.
Collaborative research effort
The project combined Nvidia’s computing capabilities with research from the Arc Institute, a nonprofit organisation dedicated to addressing long-term scientific challenges. The institute was established in 2021 with $650 million in funding, and works with Stanford University, UC Berkeley, and UC San Francisco to advance research in bio-engineering, medicine, and genetics.
Evo 2 is now freely available to researchers worldwide through Nvidia’s BioNeMo research platform, which includes various AI-powered tools for analysing and modelling biological data. By making the model accessible, the research team hopes to speed innovation in genomics, synthetic biology, and other fields that rely on large-scale genetic analysis.