AI Creates New Fluorescent Protein From Scratch, A ‘ChatGPT Moment’ for Biology

Just as ChatGPT generates text by predicting the next word in a sequence, a new AI model, ESM3, can create entirely new proteins from scratch. The model, developed by researchers at EvolutionaryScale, was trained on a dataset of 2.78 billion proteins, learning their sequences, structures, and functions. ESM3 is a language model of the same general kind as OpenAI’s GPT-4, the engine behind ChatGPT. Researchers trained it by masking parts of the protein information and having the model predict the missing pieces. The work builds on earlier research by the same team, who in 2022 developed a precursor to ESM3 that predicted the structures of previously uncharacterized microbial proteins.

While other AI models have been trained on protein data, ESM3 stands out for its scale and its ability to generate novel proteins with specific functions. The model can be prompted to design proteins with desired properties, such as fluorescence. This capability has led some to describe it as a “ChatGPT moment for biology.”

In a new study, researchers used ESM3 to create a new fluorescent protein that glows a distinct shade of green. Fluorescent proteins are essential tools in biological research: scientists attach them to other molecules in order to track and visualize those molecules. The model generated 96 proteins with structures likely to produce fluorescence, and the researchers selected the one that was least similar to naturally occurring fluorescent proteins. Although this protein was initially less bright than natural green fluorescent proteins, the researchers used ESM3 to improve its brightness iteratively, ultimately creating a green fluorescent protein never seen in nature, named “esmGFP.” The AI took just moments to generate it; the researchers estimate that evolution would have needed about 500 million years to produce a protein this different from known ones.

While ESM3 has enormous potential to change protein engineering and synthetic biology, some scientists caution that our understanding of how newly designed proteins behave in living systems is still limited. Despite these unknowns, the ability to create proteins with unprecedented speed and precision opens up exciting possibilities for fields like drug discovery, biomaterial development, and environmental remediation.
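
For readers curious about what "masking parts of protein information and having it predict the missing pieces" looks like in practice, here is a toy Python sketch of the idea. It is only an illustration, not ESM3's actual training code: the function names, the `_` mask symbol, the 15% mask fraction, and the example string are all assumptions made for this sketch, and the real model also works with structure and function information rather than sequence letters alone.

```python
import random

MASK = "_"  # hypothetical mask symbol; not an amino-acid letter


def mask_sequence(seq, mask_fraction=0.15, seed=0):
    """Hide a random fraction of residues.

    Returns the masked string and a dict {position: hidden residue}
    that a model would be asked to recover.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(seq) * mask_fraction))
    positions = sorted(rng.sample(range(len(seq)), n_mask))
    masked = list(seq)
    targets = {}
    for pos in positions:
        targets[pos] = seq[pos]
        masked[pos] = MASK
    return "".join(masked), targets


def score_predictions(predictions, targets):
    """Fraction of masked positions where the guess matches the hidden residue."""
    correct = sum(1 for pos, aa in targets.items() if predictions.get(pos) == aa)
    return correct / len(targets)


# An arbitrary toy amino-acid string, used only for illustration.
toy_seq = "MSKGEELFTGVVPILVELDG"
masked_seq, hidden = mask_sequence(toy_seq)
print(masked_seq)  # the sequence with some letters replaced by "_"
print(hidden)      # the answers a model would be trained to predict
```

During training, a model's guesses for the hidden positions are compared with the true residues (here via `score_predictions`), and the model is adjusted to guess better. Generating a new protein works the same way in reverse: start from a mostly masked prompt and let the model fill in the blanks.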
