Automating chem labs is only the first step. Once our workflows include robots, we can unleash machine learning to optimize the process.
This fall, researchers down the street at the University of Illinois Urbana-Champaign report, and I quote:
“a simple closed-loop workflow that leverages data-guided matrix down-selection, uncertainty-minimizing machine learning, and robotic experimentation to discover general reaction conditions.”
([1], p.1)
Or, as the press release puts it, “powerful AI and a molecule-making machine to find the best conditions for automated complex chemistry”. [2]
If I understand correctly, the point is to discover general processes for synthesizing a desired molecule. There are many starting places, and many paths, so a brute force search is difficult. In practice, chemists usually start from a convenient input (e.g., from a small set of commonly used conditions), which works but is not optimal for most purposes.
The goal of this research is to figure out a better set of starting places.
The researchers report that they tried to use machine learning on the literature, seeking optimal starting points. This effort failed for the interesting reason that negative results are generally not reported. The ML could not learn without these results.
(!)
We’ve been telling you to publish negative results, and now we have a really good reason: they are useful!
The logical thing to do, then, is to connect the ML with a robot lab, and let it generate its own results, successful and not. At least if you work at my old Alma Mater, that’s the logical thing to do. : – )
Basically, this game is a search for short, fast, and cheap paths through a ginormous space of possible chemical building blocks and ways to combine them, looking to get to the desired molecule.
First, the system used published data to guide the search by “down-selecting” the starting points, i.e., picking out a small, strategic set of good places to start. The selected set was used in “seeding” experiments to further prune this set into “only” 512 initial conditions.
Second, machine learning was used to guide the search. This apparently involved some nonstandard data wrangling and probabilistic reasoning: not-quite-Bayesian plus “active learning”, minimizing uncertainty. (Honestly, I don’t know this math at all. I’m taking their word for it.)
Third, let’s close the loop. The search above marches through samples of experimental conditions. At each iteration, the robot executes the experiments and returns the results to the ML model. (The system prioritized experiments that were “unexplored”.) After only relatively few rounds, the uncertainty was no longer changing.
The result is a list of initial conditions and predicted yields. The ML actually explored a range of yields. The top yields known from the literature were near the top, but there were also reactions with even higher yields than previous reports.
Cool!
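For readers who think in code, here is a toy sketch of what a closed loop of this general kind might look like. To be clear, this is my own illustration and not the authors' method: the 8×8 grid of conditions, the simulated "robot" (`run_experiment`), the distance-based surrogate (`predict`), and every parameter are assumptions invented for the example. It uses plain uncertainty sampling, i.e., always test the conditions the model knows least about, and stops when the uncertainty no longer drops.

```python
import math
import random


def run_experiment(cond, rng):
    """Hypothetical stand-in for the robot: simulated yield (0-100) for a condition."""
    x, y = cond
    true_yield = 80.0 * math.exp(-((x - 5) ** 2 + (y - 2) ** 2) / 8.0)
    return max(0.0, min(100.0, true_yield + rng.gauss(0.0, 2.0)))


def predict(cond, observed):
    """Crude surrogate model (an assumption, not the paper's model).

    Returns (predicted yield, uncertainty), where the prediction is an
    inverse-distance-weighted mean of tested yields and the uncertainty is
    simply the distance to the nearest tested condition.
    """
    if cond in observed:
        return observed[cond], 0.0
    weight_sum, weighted_yield, nearest = 0.0, 0.0, float("inf")
    for other, measured in observed.items():
        d = math.dist(cond, other)
        nearest = min(nearest, d)
        w = 1.0 / d ** 2
        weight_sum += w
        weighted_yield += w * measured
    return weighted_yield / weight_sum, nearest


def closed_loop(conditions, seeds, batch_size=4, max_rounds=20, tol=1.0, seed=0):
    """Run seed experiments, then repeatedly test the most uncertain conditions."""
    rng = random.Random(seed)
    observed = {c: run_experiment(c, rng) for c in seeds}
    history = []  # largest remaining uncertainty at the start of each round
    for _ in range(max_rounds):
        untested = [c for c in conditions if c not in observed]
        if not untested:
            break
        # Rank untested conditions from most to least uncertain.
        ranked = sorted(untested, key=lambda c: predict(c, observed)[1], reverse=True)
        max_uncertainty = predict(ranked[0], observed)[1]
        history.append(max_uncertainty)
        if max_uncertainty <= tol:
            break  # nothing left that the model is very unsure about
        # "Robot" executes the batch; results go straight back to the model.
        for c in ranked[:batch_size]:
            observed[c] = run_experiment(c, rng)
    return observed, history


conditions = [(x, y) for x in range(8) for y in range(8)]
seeds = [(0, 0), (0, 7), (7, 0), (7, 7)]
observed, history = closed_loop(conditions, seeds)
print(f"tested {len(observed)} of {len(conditions)} conditions")
print("largest uncertainty per round:", [round(u, 2) for u in history])
```

Note that a loop like this ends up with plenty of low-yield measurements in its dataset, which is exactly the kind of "negative result" the literature lacks. The real system's model and selection rule are surely more sophisticated than a nearest-neighbor distance.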
The researchers note that the dataset includes a wide range of results, showing that the search “learned by probing both low- and high-yielding conditions”, unlike the published literature which is “heavily skewed toward positive outcomes”. ([1], p.6)
Looking at the progress of the model, it seems to look for good reactions first, then look wider for other, even better candidates, and later focus on reducing uncertainty by exploring “negative results”.
The resulting conditions are “higher-yielding general conditions”, i.e., good places to start for many different reactions. The model’s selections were tested as inputs to generate a variety of molecules. The results were substantially higher yields than the standard benchmarks, with the top achieving double the benchmark value. In addition, more of the samples achieved a practical minimum yield, i.e., would be useable.
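As I read it, "general" here means a condition set does well across many target molecules, not just one. A minimal sketch of how one might score that, assuming two simple metrics (average yield, and the fraction of targets reaching a practical minimum yield). The function name `generality`, the 20% threshold, and the numbers below are all made up for illustration; none of them come from the paper.

```python
def generality(yields_by_condition, min_useful_yield=20.0):
    """Summarize each condition set as (mean yield, fraction of usable targets).

    yields_by_condition maps a condition name to a list of percent yields,
    one per target molecule. The threshold is a hypothetical cutoff.
    """
    summary = {}
    for name, yields in yields_by_condition.items():
        mean_yield = sum(yields) / len(yields)
        usable = sum(1 for y in yields if y >= min_useful_yield) / len(yields)
        summary[name] = (mean_yield, usable)
    return summary


# Invented example data: four target molecules under two condition sets.
example = {
    "benchmark": [10.0, 30.0, 20.0, 0.0],
    "candidate": [40.0, 50.0, 30.0, 0.0],
}
for name, (mean_yield, usable) in generality(example).items():
    print(f"{name}: mean yield {mean_yield:.1f}%, usable fraction {usable:.2f}")
```

In this invented example the candidate doubles the benchmark's average yield and makes more of the targets usable, which mirrors the shape of the reported result without reproducing any of its actual numbers.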
Nice work all. Very neat stuff!
- Nicholas H. Angello, Vandana Rathore, Wiktor Beker, Agnieszka Wołos, Edward R. Jira, Rafał Roszak, Tony C. Wu, Charles M. Schroeder, Alán Aspuru-Guzik, Bartosz A. Grzybowski, and Martin D. Burke, Closed-loop optimization of general reaction conditions for heteroaryl Suzuki-Miyaura coupling. Science, 378 (6618):399-405, October 28, 2022. https://doi.org/10.1126/science.adc8743
- Liz Ahlberg Touchstone, Artificial intelligence and molecule machine join forces to generalize automated chemistry, in Illinois Research News, October 28, 2022. https://news.illinois.edu/view/6367/1723467564