Here is a provocative perspective: emerging technologies in the research lab—like automation, machine learning, and artificial intelligence—are not revolutionary. They are complementary tools that enable the advancement of research.
This can be explained by looking at the historical progress of how experiments are performed: new technologies and methods are always emerging.
I will use examples from chemistry, because I can benchmark these against my own experiences, but the trend and paradigms are common across all disciplines.
In my undergrad organic synthesis labs, we performed syntheses in 100 mL round bottom flasks with about a gram of starting materials.
Today, the practice has evolved to be more efficient. Scanning lab manuals across different universities, I see that some labs still conduct experiments at this scale, but a majority of schools have scaled down quantities to microscale glassware, running experiments in a 10 mL round bottom flask or even in a 5 mL conical vial.
This reduces waste and cost, increases safety, and teaches students how reactions are done in graduate labs.
Automation transforms single experiments into high throughput experimentation
When you use smaller equipment, you can fit more experiments into a fume hood. Each scientist usually gets only one fume hood, so the space needs to be used wisely. I recall a colleague in graduate school remarking that she had completed four reactions in one day. That was quite a bit of work, and it was not something one did on a daily basis… back then.
Compare that experience to a presentation I saw at an industry symposium last November by the “High Throughput Experimentation (HTE) Group” of a global pharmaceutical company. This group is responsible for conducting large numbers of reactions to find the best reaction conditions to make a given compound. They screen multiple variables at a time, such as the catalyst, solvent, base, and temperature, so the number of reactions that need to be performed is quite large.
To respond to the need for more experiments, they are now performing high throughput experiments in 96 well plates.
Multiwell plates have been common in life science research for over two decades. These plates have multiple wells, most often 96 to 1536 or more, serving as small “test tubes” with a different experiment in each well. A 1536 well plate can thus run up to 1536 experiments at one time, which is what happens in a high throughput screening (HTS) campaign. An earlier post described how this is done.
What we see now is technology crossing over into other disciplines.
When adapted to the chemistry lab, each well in the plate accommodates a conical vial, allowing 96 experiments on each plate. This allows them to screen, for example, 12 catalysts x 4 solvents x 2 bases all on one plate. To control temperature, each plate is wrapped in a temperature jacket that heats the vials. Hence, a stack of three 96 well plates examines reactions at three different temperatures. That’s 288 experiments at one time!
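The combinatorics of this kind of plate screen are easy to sketch in code. The catalyst, solvent, and base names below are placeholders I made up for illustration, not the actual conditions from the presentation:

```python
from itertools import product

# Hypothetical screening variables -- names are illustrative only.
catalysts = [f"cat_{i}" for i in range(1, 13)]  # 12 catalysts
solvents = ["toluene", "THF", "DMF", "MeCN"]    # 4 solvents
bases = ["K2CO3", "Et3N"]                       # 2 bases
temperatures = [40, 60, 80]                     # one 96 well plate per temperature

# Full factorial enumeration: 12 x 4 x 2 = 96 wells per plate,
# times 3 plates in the stack = 288 experiments in one run.
experiments = [
    {"catalyst": c, "solvent": s, "base": b, "temp_C": t}
    for t, c, s, b in product(temperatures, catalysts, solvents, bases)
]

wells_per_plate = len(catalysts) * len(solvents) * len(bases)
print(wells_per_plate)    # 96
print(len(experiments))   # 288
```

The point of the enumeration is that the experimental design is just a cross product of the variables; the robot then works through the list well by well.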
Once the scientist designs the experiments, anyone can go to the lab to collect the equipment and set it up on the robotic system, a process that takes just 30 minutes. Then the automation takes over.
Specialized software for this type of high throughput experimentation is emerging to run the robotics, manage the course of the reactions, record the reaction conditions, and collect the data. I am aware of Chemspeed and ACD Labs providing such software, and I am sure there are others.
The company used to take a week or more to find the optimal reaction conditions to scale up a compound. With this type of high throughput parallel experimentation, it can take as little as one day to find the optimal conditions.
In another application, performing a Design of Experiment (DOE) is usually a lengthy process, because one needs to conduct dozens of reactions to map out a response surface. With this type of automation, a DOE is done in one shot when you can do 96 experiments or more at one time.
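To make the DOE idea concrete, here is a minimal sketch of fitting a quadratic response surface to a 96-run, two-factor design and locating its optimum. The factors, ranges, and simulated yields are entirely hypothetical; a real DOE would use measured yields from the plate:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-factor DOE: temperature and reagent equivalents,
# laid out as a 12 x 8 grid -- 96 runs, filling one plate.
temp = np.linspace(20, 100, 12)
equiv = np.linspace(0.5, 2.0, 8)
T, E = np.meshgrid(temp, equiv)
x1, x2 = T.ravel(), E.ravel()

# Simulated yields with a known optimum at (70, 1.2), for illustration only.
y = 90 - 0.02 * (x1 - 70) ** 2 - 40 * (x2 - 1.2) ** 2 + rng.normal(0, 1, x1.size)

# Fit a full quadratic response surface by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Stationary point of the fitted surface: solve grad = 0 for the optimum.
b1, b2, b12, b11, b22 = coef[1:]
A = np.array([[2 * b11, b12], [b12, 2 * b22]])
opt = np.linalg.solve(A, -np.array([b1, b2]))
print(opt)  # close to (70, 1.2)
```

With all 96 runs collected in parallel, the response surface comes from a single round of automation rather than weeks of sequential reactions.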
What this means for the scientist is that it vastly simplifies the workflow, and it also changes the work of the scientist. In the example of my colleague back in her day, the vast majority of human effort was spent in the lab setting up the glassware, performing manual operations, and watching over the experiments. In the automated high throughput mode, the scientist’s time is now spent on thinking about what experiments can be done, on the interpretation of the results, and on thinking about science. In short, it improves productivity.
This automation used to be a luxury afforded only by the largest pharmaceutical and specialty chemical companies, but not anymore. The costs are coming down, and automation is becoming mainstream. Can you imagine a university core facility equipped with something like this?
Automation as a tool for discovery
The above example was using automation to optimize. Automation can also be used to discover.
The mechanism of a chemical reaction is traditionally studied using stopped flow chemistry, a technique developed in 1940. This technique has its limitations, but for the longest time it was one of the few methods available.
A better method is to be able to analyze a reaction as it is happening in real time in the reaction vessel, in situ, so you can see intermediates as they are formed and converted into other species. However, most analytical instruments are too large and expensive to allow this to be done on a routine basis… until now.
It is only recently that analytical instruments with greater power, lower cost, and smaller size have become available commercially to allow rapid sampling and multiple analyses right at the benchtop.
Professor Jason Hein of the University of British Columbia specializes in in situ analysis for chemical discovery.
Jason’s group uses automated sampling. They configure different analytical instruments on the fly, like Lego blocks, as needed, to analyze the chemical species in a reaction. These could be a temperature probe, an infrared spectroscopy probe, gas chromatography, mass spectrometry, or bench top nuclear magnetic resonance.
They use Gilson liquid handlers, driven by Trilution software. They write their own custom Python scripts. There is a real “garage” feel to the way this new chemistry is being improvised and assembled to study nature and to solve problems. This is the way it feels when one is working at the cutting edge of methodology.
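To give a flavor of what such an improvised script looks like, here is a bare-bones sampling loop in Python. The instrument functions are stubs I wrote for illustration; they are not the real Trilution or EasySampler API, which I have not seen:

```python
import time

# Placeholder instrument calls -- NOT a real vendor API, just stand-ins
# to show the shape of a "garage-built" automated sampling workflow.
def withdraw_sample(reactor_id):
    return {"reactor": reactor_id, "t": time.time()}

def quench(sample):
    sample["quenched"] = True  # stop the reaction in the aliquot
    return sample

def inject_hplc(sample):
    return {"sample": sample, "area": 0.0}  # stub chromatogram result

def run_campaign(reactor_id, interval_s, n_samples):
    """Withdraw, quench, and analyze aliquots on a fixed schedule."""
    results = []
    for i in range(n_samples):
        s = quench(withdraw_sample(reactor_id))
        results.append(inject_hplc(s))
        if i < n_samples - 1:
            time.sleep(interval_s)
    return results

data = run_campaign("R1", interval_s=0.01, n_samples=3)
print(len(data))  # 3
```

The real scripts are of course far more involved, but the structure is the same: a scheduler loop that coordinates off-the-shelf instruments into one kinetic experiment.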
As a case example (all of this work is published): they studied a reaction that takes 4 days in the lab to reach completion. In a manufacturing plant, this reaction takes 3 weeks. Can they speed this up?
They used a Mettler Toledo EasySampler to automatically withdraw reaction samples, quench them, and run HPLC, all done online. They obtained full kinetic data on the reaction in 24 hours. The data allowed them to identify intermediates that don’t appear in the final product, and these intermediates allowed them to determine the reaction mechanism.
Based on their understanding of the mechanism that they just discovered, they used a parallel synthesizer containing multiple 10 mL reaction vessels with automated valves to add reactants at different rates. This allowed them to find a new process that reduces the reaction time from 3 weeks to 2 days in the manufacturing plant.
The current direction of Jason’s group is to use AI and machine learning to self-optimize new reactions.
One of Jason’s postdocs has a PhD in art history. Her role is visualizing the graphical results. I described in previous posts (here and here) the importance of data visualization. Here is a research group that is expanding into more data utilization, and they understand how data visualization is critical to the communication of their results.
Automation integrates multiple emerging technologies
At another industry symposium last October, I saw a presentation from the “Analytical Enabling Technologies (AET) Group” of a large pharmaceutical company.
Like the prior examples, this group performs “data rich experimentation” with the aim of improving process chemistry (the reactions used for large scale manufacturing).
Their essential tools are chromatography, imaging, and chemometrics.
Mettler Toledo is becoming a leading vendor in this space. The AET group uses its probes to monitor reactions: particle vision (and Focused Beam Reflectance Measurement), Raman spectroscopy, infrared spectroscopy, pH, and dissolved oxygen.
As a case example, a 50-person process chemistry team spent a year optimizing the reaction sequence to make 50 metric tons of product per year. Despite this effort, the process still releases substantial carbon monoxide side product. This toxic gas requires capital investment in expensive scrubbers at the plant to remove it. In fact, the environmental regulations in certain jurisdictions will not even permit this type of manufacturing.
Can they reduce the carbon monoxide output? They set up an automated sampling system similar to that described in the prior example. They used GC headspace sampling with an infrared probe to detect the carbon monoxide. They used dual mass flow controllers and a Mettler Toledo EasySampler to automatically withdraw reaction samples.
Within 12 hours, the automated system produced a 0-100 calibration curve. Normally, manual experimentation would take longer to obtain a calibration curve, and the range would be much narrower.
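A calibration curve of this kind is just a regression of detector response against known standards, which the automation collects point by point. A minimal sketch, with made-up numbers standing in for the real standards and responses:

```python
# Hypothetical CO calibration data: known standard concentrations vs.
# detector response. Values are illustrative, not from the presentation.
conc = [0, 20, 40, 60, 80, 100]               # % CO in headspace standard
signal = [0.02, 4.1, 8.0, 12.1, 15.9, 20.0]   # detector response (a.u.)

# Ordinary least-squares fit of signal = slope * conc + intercept.
n = len(conc)
mean_x = sum(conc) / n
mean_y = sum(signal) / n
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, signal)) \
        / sum((x - mean_x) ** 2 for x in conc)
intercept = mean_y - slope * mean_x

def to_concentration(s):
    """Invert the calibration line: detector signal -> % CO."""
    return (s - intercept) / slope

print(round(slope, 3))                  # 0.199
print(round(to_concentration(10.0), 1))  # 49.9
```

Once the curve is in hand, every subsequent headspace measurement converts directly to a CO concentration, which is what makes the continuous monitoring quantitative.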
Through online sampling by HPLC, they discovered that the reaction mechanism has a transient intermediate. By changing process conditions to minimize this intermediate, they were able to reduce the carbon monoxide side product. This saved tens of millions in capital investment for carbon monoxide scrubbers and also enabled the manufacturing to be transferable to different countries if necessary.
Again, we see these systems vastly simplifying the workflow and the work of the scientist. In prior times, one could spend considerable amounts of time collecting calibration curves. Now, one can set it up and let the automation do this task.
In this example, that 50-person team won’t be eliminated. Their work will focus on developing manufacturing that is more efficient and with a lower environmental impact.
The scientists that use the machines will replace the scientists that don't
This concludes the series on how emerging technologies are changing the practice of science and the career of scientists:
- The first post showed that accessing information and databases is becoming a critical skill.
- The second post made the point that electronic lab notebooks will be a standard practice of the modern lab, and that a lab’s data infrastructure will be critical to its capability.
- The third post described why statistics, modelling and simulation, and data visualization are the three broad capabilities where scientists need skills to manage their data.
- The fourth post and this one present the perspective that automation, machine learning and AI in the research lab are but additional facets of a broader expansion of research into more data and data analytics.
New technologies continue to arise all the time. An underlying theme about many of the current emerging technologies is that they either generate or utilize large amounts of data. My point is that this will require scientists to have additional skill sets and capabilities to work with data.
Last year, I heard a quote more than once going around in conferences and in the press:
“It’s not the machines that will replace the chemists, it’s the chemists that use the machines that will replace the chemists that don’t.”
It was a quote that a New York Times article attributed to popular pharmaceutical industry blogger Derek Lowe. Derek told me that he heard it from someone else, but he forgot the source.
I think one reason this quote was popular is because it spoke to the feeling that there is some big revolution happening in the world, being driven by automation and artificial intelligence. I can’t speak for what revolutionary change these will bring for the future, and I don’t believe anyone else can either.
The lesson that I think can be learned from these words is more modest. It is that data tools are becoming the common tools of science, and they will be the tools of the craft that practitioners will need to know.
It is important to note that having these skill sets and infrastructure will confer formidable capabilities. Within industry, companies at the forefront of acquiring these capabilities will out-compete those that are not. Among start-up companies in the physical sciences, having these capabilities is essential.