In a quiet corner of a Cambridge lab, a sputtering machine was layering thin coatings, guided by software trained on thousands of research papers most scientists will never read. The machine wasn't simply carrying out tasks; it was making suggestions. It had examined failed recipes, flagged minute irregularities, and proposed next steps. This, in essence, is what companies like Lila Sciences are aiming for: a future in which AI doesn't merely compute what ought to occur, but adjusts based on what didn't work.
A new scientific perspective has emerged in recent years, propelled by companies that treat error as signal rather than noise. Among those spearheading the shift are Periodic Labs and Radical AI, which are building systems in which machines learn from failure as well as success. Their AI agents comb through discarded data, hunting for useful clues among the forgotten and the flawed.
This change did not begin with a Silicon Valley keynote or a whiteboard brainstorm. It originated at the laboratory bench: in the persistent frustration of materials that refused to form as expected, batteries that degraded too soon, and compounds that looked stable on paper but collapsed the moment they met moisture. Somewhere along the way came the realization: what if the failures themselves were the dataset?
By treating failure as training data, these startups are building AI systems that are adept at spotting anomalies. The agents surface patterns, whether a temperature fluctuation that destroys conductivity or a mixing ratio that consistently underperforms, and then they try again, deliberately varying the approach.
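A minimal sketch of that idea, not any company's actual pipeline: keep the failed recipes in the training set, fit a model that learns which parameter regions correlate with failure, and make the retry a deliberate perturbation away from them. The column names, thresholds, and perturbation rule below are illustrative assumptions.

```python
# Hypothetical sketch: learn from failed recipes, then propose a perturbed retry.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy experiment log: [deposition_temp_C, mix_ratio]; label 1 = failed, 0 = worked.
X = rng.uniform([200, 0.1], [600, 0.9], size=(200, 2))
y = ((X[:, 0] > 450) | (X[:, 1] < 0.25)).astype(int)  # stand-in failure mechanism

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def propose_retry(failed_recipe, n_candidates=500, scale=(25.0, 0.05)):
    """Perturb a failed recipe and keep the variant the model deems least likely to fail."""
    candidates = failed_recipe + rng.normal(0, scale, size=(n_candidates, 2))
    p_fail = model.predict_proba(candidates)[:, 1]
    return candidates[np.argmin(p_fail)], p_fail.min()

retry, risk = propose_retry(np.array([480.0, 0.2]))
print(f"suggested retry: temp={retry[0]:.0f} C, ratio={retry[1]:.2f} "
      f"(predicted failure risk {risk:.2f})")
```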
At Periodic Labs in San Francisco, this strategy is built into the foundation. Cofounder Ekin Dogus Cubuk helped create DeepMind's enormous library of theoretical materials; his cofounder, Liam Fedus, contributed to ChatGPT. Together, they are developing tools that plan and direct synthesis, turning static instructions into a running dialogue between past mistakes and the next experiment.

They are not starting with robots. They are starting with people, equipped with models that have digested the history of chemistry. These AI agents do more than predict results; they flag anomalies, anticipate problems, and prompt further investigation. The effect is to make discovery less linear and more iterative, closer to jazz than to classical.
Radical AI, an offshoot of Berkeley's A-Lab, has pushed the idea further with a fully autonomous platform that iterates through physical experiments without human input. The system learns in real time, improving with every cycle. In a matter of minutes, its robotic arms can mix powders, measure outcomes, and adjust the plan. Speed matters, but direction matters just as much: with each compound that fails, the machine's next prediction gets a little sharper.
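The closed loop described here can be sketched in a few lines: propose, run, measure, update, repeat. Everything in this sketch is a stand-in; the run_experiment function represents the robotic synthesis and measurement step, and the Gaussian-process surrogate is just one common way such a loop might choose its next trial, not a description of Radical AI's actual system.

```python
# Sketch of a closed-loop, self-driving experiment cycle (illustrative assumptions).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(1)

def run_experiment(temperature_c):
    """Stand-in for the robotic synthesis + measurement step (returns a yield score)."""
    return float(np.exp(-((temperature_c - 430.0) / 60.0) ** 2) + rng.normal(0, 0.02))

candidates = np.linspace(200, 600, 201).reshape(-1, 1)
tried_x, tried_y = [], []

# Seed the loop with two arbitrary starting conditions.
for t in (250.0, 550.0):
    tried_x.append([t]); tried_y.append(run_experiment(t))

for cycle in range(8):
    gp = GaussianProcessRegressor(normalize_y=True).fit(tried_x, tried_y)
    mean, std = gp.predict(candidates, return_std=True)
    # Upper-confidence-bound acquisition: balance promising regions with uncertain ones.
    pick = candidates[np.argmax(mean + 1.5 * std)][0]
    result = run_experiment(pick)
    tried_x.append([pick]); tried_y.append(result)
    print(f"cycle {cycle}: tried {pick:.0f} C -> yield {result:.2f}")
```

Each pass through the loop folds the newest measurement back into the model, so the next pick is informed by every success and failure before it.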
There is still tension, of course. Watching an early-stage experiment at Lila Sciences, I saw a technician painstakingly adjust the sputtering angle by hand, a reminder that human oversight still matters. Yet the software beside her had flagged the run not because it was likely to succeed, but because it was unusual enough to be interesting. That urge, that curiosity, was now something the machines were modeling too.
For startups like these, the practical challenge is how to scale something as abstract as scientific curiosity. Venture funding does not reward elegant failure; it rewards outputs: platforms, patents, products. The good news is that these systems don't need to discover miracle materials to matter. A slightly more efficient magnet, or a biodegradable material that breaks down a little faster, can be enough, and even modest advances of that kind can reshape supply chains.
Since DeepMind's list of theoretically stable crystals was made public, researchers have struggled to validate even a small portion of it in the lab. Real materials behave unevenly, especially under non-ideal conditions. This is where agents trained on experimental data outperform pure simulation: they fold human messiness, contamination, humidity, undetected impurities, into the process and build in resilience against it.
The strategy is quietly divisive. Some researchers resist the idea of letting algorithms generate hypotheses, arguing that real discoveries more often come from intuition than from pattern recognition. But that view is fading. Increasingly, labs use AI not to replace scientists but to extend their peripheral vision: to spot what has been overlooked and to propose what is statistically promising but not yet proven.
By mining rejected data points, these firms give labs that often run on tight budgets a surprisingly cheap source of direction. Instead of rushing through dozens of experiments, they run fewer, smarter trials. For early-stage materials research, where each test can be costly and time-consuming, that shift matters most.
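One way to read "fewer, smarter trials" is as a budgeted selection problem: score every untested recipe and spend the budget only on the most informative ones. The scoring rule, costs, and budget below are invented for illustration, not a description of any particular company's method.

```python
# Illustrative sketch: pick a small batch of trials under a fixed budget,
# ranking candidates by expected information per dollar (all numbers invented).
import numpy as np

rng = np.random.default_rng(2)

n_candidates = 40
predicted_value = rng.uniform(0.0, 1.0, n_candidates)   # model's estimate of payoff
uncertainty     = rng.uniform(0.0, 0.5, n_candidates)   # how unsure the model is
cost_usd        = rng.uniform(200, 2000, n_candidates)  # cost to run each trial

# Favor trials that are promising *or* poorly understood, relative to their cost.
score = (predicted_value + uncertainty) / cost_usd

budget = 5000.0
chosen = []
for idx in np.argsort(score)[::-1]:          # best score-per-dollar first
    if cost_usd[idx] <= budget:
        chosen.append(idx)
        budget -= cost_usd[idx]

print(f"selected {len(chosen)} of {n_candidates} trials, ${budget:.0f} left unspent")
```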
