Stanford Researchers Unveil Evo: AI for Synthetic Genome Generation

Edited by: Надежда Садикова

Researchers from Stanford University and the Arc Institute have developed Evo, an artificial intelligence capable of generating synthetic genomic sequences from scratch. This tool, trained on microbial genomic data, has potential applications in biotechnology and synthetic biology.

AI models have transformed methodologies across various disciplines, particularly in biomedical sciences and molecular biology. Recent years have seen the emergence of innovative AI tools with broad applications, including therapeutic target identification and protein structure prediction.

A study published in the journal Science introduced Evo, a new AI tool that can generate genome-scale DNA sequences from scratch. Evo is built on the same kind of technology as large language models, but it is trained on DNA rather than text, which lets it generate sequences as long as entire bacterial genomes. Its development presents new opportunities for the design of proteins and synthetic genomes using artificial intelligence.

Evo is a generative AI model that learns patterns in DNA and uses them to produce new sequences. Led by Dr. Brian Hie, the team trained Evo on data from about 2.7 million prokaryotic genomes, bacteriophages, and plasmids, allowing it to pick up evolutionary patterns in DNA.

A significant advance of Evo over previous AI models lies in its long context length: it can process roughly 131,000 nucleotides at a time, at single-nucleotide resolution. While other models analyze short DNA fragments, Evo can take in stretches long enough to span many genes, which improves its ability to identify relationships between genes and the surrounding genomic sequence.

After building Evo, the research team evaluated how well it predicts the impact of mutations on protein function. They considered specific mutations in prokaryotic genes and compared Evo's predictions with results from other studies that had generated the same mutations in the laboratory. The results indicated that Evo predicts mutation effects more accurately than other AI models trained on DNA.
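To make the idea of scoring mutations with a sequence model concrete, here is a minimal sketch in Python. It is not Evo and not the authors' code. It assumes a toy model that only counts which nucleotide tends to follow each 3-letter context, and it scores a mutation by how much the model's log-likelihood of the sequence changes when the mutation is applied, a common approach with sequence language models. The function names, the training sequences, and the choice of k = 3 are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"


def train_kmer_model(sequences, k=3):
    """Count which nucleotide follows each k-letter context (toy stand-in for a DNA language model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return counts


def log_likelihood(counts, seq, k=3, pseudocount=1.0):
    """Sum of log P(base | previous k bases), smoothed so unseen bases get nonzero probability."""
    total = 0.0
    for i in range(k, len(seq)):
        context_counts = counts.get(seq[i - k:i], Counter())
        numerator = context_counts[seq[i]] + pseudocount
        denominator = sum(context_counts.values()) + pseudocount * len(ALPHABET)
        total += math.log(numerator / denominator)
    return total


def mutation_score(counts, wild_type, position, new_base, k=3):
    """Log-likelihood of the mutant minus that of the original; lower means 'less natural' to the model."""
    mutant = wild_type[:position] + new_base + wild_type[position + 1:]
    return log_likelihood(counts, mutant, k) - log_likelihood(counts, wild_type, k)


# Hypothetical training data: five copies of one sequence plus a variant with A at position 30.
wild_type = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
variant = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTACCCGATAG"
counts = train_kmer_model([wild_type] * 5 + [variant])

score_seen = mutation_score(counts, wild_type, 30, "A")    # variant observed in training
score_unseen = mutation_score(counts, wild_type, 30, "T")  # variant never observed
print(score_seen, score_unseen)
```

In this toy setting, the substitution that also appears in the training data receives a higher (less negative) score than one the model has never seen. An evaluation like the one described above would then compare such scores against experimentally measured effects of the same mutations.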

In a second phase of the study, Dr. Hie's team assessed Evo's potential for designing new versions of the Cas9 protein, a crucial tool for gene editing via CRISPR. The authors further trained (fine-tuned) the model on over 70,000 bacterial sequences coding for Cas proteins and their associated RNAs.

After this additional training, Evo generated millions of synthetic sequences encoding Cas9 enzymes. Dr. Hie and his team analyzed these sequences and selected the 11 most promising versions for laboratory synthesis and evaluation. Results showed that some of the Cas9 proteins designed by Evo were as efficient as the commercial version.

Historically, obtaining more effective versions of Cas9 posed a significant challenge for researchers, requiring the discovery of bacteria evolved to possess more potent enzyme variants. 'We don't have to wait for evolution to create a new Cas9,' explained Dr. Hie.

Generating synthetic genomes remains a major challenge in synthetic biology. Dr. Hie and his team asked whether Evo could generate complete synthetic genomic sequences. The AI produced sequences containing much of what a genome requires, including genes essential for cellular function, but it omitted critical genomic regions necessary for survival.

Another limitation appeared when Evo generated synthetic Cas9 protein sequences: some of the sequences it proposed were non-functional. Errors of this kind are common in other generative AI systems based on large language models, including ChatGPT.

Despite its limitations, Evo represents a significant advancement in the use of generative AI tools. Future research will aim to enhance this tool for the design of proteins and synthetic genomes.
