‘For me, the concern was just how easy it was to do’
It took less than six hours for drug-developing AI to invent 40,000 potentially lethal molecules. For a biological arms control conference, researchers put AI normally used to search for helpful drugs into a kind of “bad actor” mode to show how easily it could be abused.
All the researchers had to do was tweak their methodology to seek out, rather than weed out, toxicity. The AI came up with tens of thousands of new substances, some of which are similar to VX, the most potent nerve agent ever developed. Shaken, they published their findings this month in the journal Nature Machine Intelligence.
The paper had us at The Verge a little shook, too. So, to figure out how worried we should be, The Verge spoke with Fabio Urbina, lead author of the paper. He’s also a senior scientist at Collaborations Pharmaceuticals, Inc., a company that focuses on finding drug treatments for rare diseases.
This interview has been lightly edited for length and clarity.
This paper seems to flip your normal work on its head. Tell me about what you do in your day-to-day job.
Primarily, my job is to implement new machine learning models in the area of drug discovery. A large fraction of these machine learning models that we use are meant to predict toxicity. No matter what kind of drug you’re trying to develop, you need to make sure that they’re not going to be toxic. If it turns out that you have this wonderful drug that lowers blood pressure fantastically, but it hits one of these really important, say, heart channels — then basically, it’s a no-go because that’s just too dangerous.
So then, why did you do this study on biochemical weapons? What was the spark?
We got an invite to the Convergence conference by the Swiss Federal Institute for Nuclear, Biological and Chemical Protection, Spiez Laboratory. The idea of the conference is to inform the community at large of new developments with tools that may have implications for the Chemical/Biological Weapons Convention.
We got this invite to talk about machine learning and how it can be misused in our space. It’s something we never really thought about before. But it was just very easy to realize that as we’re building these machine learning models to get better and better at predicting toxicity in order to avoid toxicity, all we have to do is sort of flip the switch around and say, “You know, instead of going away from toxicity, what if we do go toward toxicity?”
Can you walk me through how you did that — moved the model to go toward toxicity?
I’ll be a little vague with some details because we were told basically to withhold some of the specifics. Broadly, the way it works for this experiment is that we have a lot of datasets historically of molecules that have been tested to see whether they’re toxic or not.
In particular, the one that we focus on here is VX. It is an inhibitor of what’s known as acetylcholinesterase. Whenever you do anything muscle-related, your neurons use acetylcholine as a signal to basically say “go move your muscles,” and acetylcholinesterase is the enzyme that clears that signal afterward. The way VX is lethal is it actually stops your diaphragm, your lung muscles, from being able to move so your lungs become paralyzed.
Obviously, this is something you want to avoid. So historically, experiments have been done with different types of molecules to see whether they inhibit acetylcholinesterase. And so, we built up these large datasets of these molecular structures and how toxic they are.
We can use these datasets in order to create a machine learning model, which basically learns what parts of the molecular structure are important for toxicity and which are not. Then we can give this machine learning model new molecules, potentially new drugs that maybe have never been tested before. And it will tell us this is predicted to be toxic, or this is predicted not to be toxic. This is a way for us to virtually screen very, very fast a lot of molecules and sort of kick out ones that are predicted to be toxic. In our study here, what we did is we inverted that, obviously, and we use this model to try to predict toxicity.
The other key part of what we did here are these new generative models. We can give a generative model a whole lot of different structures, and it learns how to put molecules together. And then we can, in a sense, ask it to generate new molecules. Now it can generate new molecules all over the space of chemistry, and they’re just sort of random molecules. But one thing we can do is we can actually tell the generative model which direction we want to go. We do that by giving it a little scoring function, which gives it a high score if the molecules it generates are towards something we want. Instead of giving a low score to toxic molecules, we give a high score to toxic molecules.
Now we see the model start producing all of these molecules, a lot of which look like VX and also like other chemical warfare agents.
Tell me more about what you found. Did anything surprise you?
We weren’t really sure what we were going to get. Our generative models are fairly new technologies. So we haven’t used them widely.
The biggest thing that jumped out at first was that a lot of the generated compounds were predicted to be actually more toxic than VX. And the reason that’s surprising is because VX is basically one of the most potent compounds known. Meaning you need a very, very, very little amount of it to be lethal.
Now, these are predictions that we haven’t verified, and we certainly don’t want to verify that ourselves. But the predictive models are generally pretty good. So even if there’s a lot of false positives, we’re afraid that there are some more potent molecules in there.
Second, we actually looked at a lot of the structures of these newly generated molecules. And a lot of them did look like VX and other warfare agents, and we even found some that were generated from the model that were actual chemical warfare agents. These were generated from the model having never seen these chemical warfare agents. So we knew we were sort of in the right space here and that it was generating molecules that made sense because some of them had already been made before.
For me, the concern was just how easy it was to do. A lot of the things we used are out there for free. You can go and download a toxicity dataset from anywhere. If you have somebody who knows how to code in Python and has some machine learning capabilities, then in probably a good weekend of work, they could build something like this generative model driven by toxic datasets. So that was the thing that got us really thinking about putting this paper out there; it was such a low barrier of entry for this type of misuse.
Your paper says that by doing this work, you and your colleagues “have still crossed a gray moral boundary, demonstrating that it is possible to design virtual potential toxic molecules without much in the way of effort, time or computational resources. We can easily erase the thousands of molecules we created, but we cannot delete the knowledge of how to recreate them.” What was running through your head as you were doing this work?
This was quite an unusual publication. We’ve been back and forth a bit about whether we should publish it or not. This is a potential misuse that didn’t take as much time to perform. And we wanted to get that information out since we really didn’t see it anywhere in the literature. We looked around, and nobody was really talking about it. But at the same time, we didn’t want to give the idea to bad actors.
At the end of the day, we decided that we kind of want to get ahead of this. Because if it’s possible for us to do it, it’s likely that some adversarial agent somewhere is maybe already thinking about it or in the future is going to think about it. By then, our technology may have progressed even beyond what we can do now. And a lot of it’s just going to be open source — which I fully support: the sharing of science, the sharing of data, the sharing of models. But it’s one of these things where we, as scientists, should take care that what we release is done responsibly.