In the third year of the COVID-19 pandemic, researchers reported another worrying virus. Identified in 35 people in eastern China since 2018, Langya henipavirus causes breathing problems, fever and other troubling symptoms.
Langya hasn’t been linked to any deaths yet. But it is related to some deadly viruses, so researchers were keen to develop vaccines. There was just one problem: a viral protein that could form the basis of a jab seemed impossible to make in the lab.
“If we can’t even study the protein, how are we going to understand how it works, and how are we going to make a vaccine?” says David Veesler, a structural virologist at the University of Washington (UW) in Seattle.
Now, thanks to artificial intelligence (AI), biologists have cracked Langya’s nut. In work posted on the bioRxiv preprint server in August, Veesler’s team describes using the prediction tool AlphaFold to map the structure of a protein with which Langya invades cells [1]. Another AI tool identified mutations that transformed the unruly molecule into a suitable vaccine candidate.
The research, which has not yet been peer reviewed, is part of nascent efforts to use groundbreaking advances in AI, such as AlphaFold and large language models, to prepare for future pandemics. Funders are pouring money into this approach, which is already bearing fruit. In a Nature paper published on 11 October, researchers report a machine-learning tool that can predict the evolution of viruses with the potential to cause a pandemic [2]. This information could improve the resilience of vaccines, including those against COVID-19, and could give the world a head start when the next pandemic threat appears.
“Does machine learning give us new arrows in our quivers? Yes, absolutely,” says Neil King, a structural biologist at UW. “But it’s still early days.”
‘We got lucky’
During the COVID-19 pandemic, a stroke of luck meant that researchers were well prepared to develop vaccines. Previous research on other coronaviruses, including the one that causes Middle East respiratory syndrome (MERS), gave them a good idea of how to turn SARS-CoV-2’s genetic sequence into a vaccine. “We got totally lucky, hugely lucky,” says King.
But there are no such insights for many other viruses with pandemic potential. That’s where machine learning is, increasingly, playing a part.
Langya virus is part of a family called henipaviruses. These include the highly lethal Nipah virus and Hendra virus, which causes deadly outbreaks in horses and can be fatal to people. But Langya is sufficiently genetically distinct that countermeasures against its relatives — including a Hendra vaccine approved for horses and trialled in humans — are unlikely to work, says Veesler.
His team set out to map the structure of Langya’s version of the ‘G protein’ that is the target of infection-blocking antibodies in other henipaviruses, and the basis of the Hendra vaccine.
Initial efforts to coax human cells to make Langya’s G protein for study failed, so the researchers used AlphaFold to predict what the protein looks like. Another AI tool identified mutations that made the protein stable enough to investigate in the lab. The AI-tweaked viral protein is now the basis of a prototype vaccine for Langya, says Veesler. Machine learning, he says, “enabled something that would not have been possible otherwise”.
Faster vaccine design
Jason McLellan, a structural biologist at the University of Texas at Austin, has started using some of the same AI tools to study viral proteins and modify them for vaccine design. In some cases, they’ve identified changes that he would have missed otherwise.
“These new techniques are changing our everyday life, making things so much faster,” says Clara Schoeder, who works on computational design of therapies at Leipzig University in Germany.
She is part of an effort, funded by the Coalition for Epidemic Preparedness Innovations (CEPI) in Oslo, to use machine-learning tools to help create a library of potential vaccines for a set of worrying viruses. Earlier this year, the US National Institute of Allergy and Infectious Diseases announced US$100 million in funding for similar work on vaccine libraries.
Schoeder and her colleagues are exploring whether a type of neural network called a protein language model can help with vaccine design. These networks, inspired by large language models such as ChatGPT, are trained on vast troves of protein sequences. This could help the networks identify evolutionarily plausible changes that could be used to make better vaccines, says Brian Hie, a computational biologist at Stanford University in California. “It’s a very promising area,” he says.
Other researchers are using AI to make vaccines based on designer proteins. King’s team is leaning heavily on machine-learning tools based on image-generating AIs such as Midjourney. One creation that the group has not yet published is a designer protein that holds a snippet of a malaria molecule, forcing it into a shape that attracts a powerful antibody response. “These methods have totally changed the game in terms of what is possible to design,” King says.
Staying ahead
Researchers are also beginning to use machine learning to design vaccines that are one or two steps ahead of the most troubling viruses. COVID-19 vaccines were wildly successful at protecting against the disease at first. But the emergence of the immune-evading Omicron variant and its descendants has sapped the vaccines’ strength and durability. Updated vaccines — the most recent ones based on the XBB.1.5 lineage, which emerged in late 2022 — still lag months behind viral evolution.
Debora Marks at Harvard University in Cambridge, Massachusetts, applies AI to biological problems. In the Nature paper [2], she and her team describe developing a deep-learning network able to predict mutations that help SARS-CoV-2 to spread and overcome immunity. They trained the model, called EVEscape, on coronavirus sequences generated before January 2020, when SARS-CoV-2 was first identified in Wuhan, China.
EVEscape is an impressive SARS-CoV-2 soothsayer. Half of the mutations the model predicted in a region of the cell-invading spike protein most prone to change have already been observed in real-world SARS-CoV-2 variants, a figure that should grow as the virus continues to evolve [2].
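A figure such as "half of the predicted mutations have been observed" comes down to a simple overlap calculation. The sketch below is illustrative only: the function name and the mutation labels are invented placeholders, not EVEscape output or real variant data, and it does not reflect how the team actually scored its predictions.

```python
def fraction_observed(predicted, observed):
    """Return the share of predicted mutations that also appear in the observed set."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("no predicted mutations supplied")
    return len(predicted & set(observed)) / len(predicted)


# Invented placeholder labels (assumption): not real predictions or real variants.
predicted_mutations = ["mut_a", "mut_b", "mut_c", "mut_d"]
observed_mutations = ["mut_b", "mut_d", "mut_e"]

share = fraction_observed(predicted_mutations, observed_mutations)
print(f"{share:.0%} of predicted mutations observed")
```

Because the predicted set stays fixed while the observed set keeps growing as new variants are sequenced, this share can only hold steady or rise over time.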
In work posted on bioRxiv this week [3], the team used the model to create a set of potential sequences for the SARS-CoV-2 spike protein, some containing as many as 46 mutations from the ancestral strain, with the hope of anticipating the virus’s future evolution and contributing to the development of experimental vaccines.
The model isn’t limited to SARS-CoV-2. Marks and her colleagues found that it could also predict the evolution of HIV, influenza, Nipah and the virus that causes Lassa haemorrhagic fever. When a new virus with pandemic potential pops up, the team hopes to be ready with predictions for its evolution — and perhaps even vaccines based on those predictions.
Melanie Saville, CEPI’s executive director of vaccine research and development, says that machine learning has the potential to vastly increase the number of vaccine designs that can be used quickly to identify the most promising candidates. “We just need to be cautious not to say this computational design is going to be the solution to everything,” she says. “But it’s a great starting point.”