OpenAI's Kevin Weil on the Future of Scientific Discovery
Takeaways from a fireside chat between the OpenAI VP of Science and a16z speedrun visiting partner Sam Shank
Earlier this month, OpenAI’s VP of Science Kevin Weil sat down for a fireside chat before an audience of a16z speedrun founders, hosted by speedrun visiting partner Sam Shank. After leading product teams at Twitter, Facebook, and Instagram, Kevin is now focused on accelerating scientific discovery with AI.
The conversation ranged from how AI capabilities are progressing to the future of robotic labs, how founders should think about building with AI, and why this might be the best moment in history to start a company.
Watch the talk in full below, or read on for five moments that stood out to us.
1) On the rapid nature of AI capability increases
Weil opened the conversation by pointing out that AI models have now solved previously open mathematics problems. When Sam Shank asked him to unpack how AI capabilities are progressing, Weil offered a framework that applies far beyond science:
“If there’s one thing I’ve learned over the last few years, it’s that you go very quickly from ‘models could never do this thing, it is beyond the capability of AI today’ to ‘models can just barely do this thing, and it kind of sucks at it, and it’s wrong most of the time, but you get these glimmers.’ Maybe it only works 5 or 10% of the time. And then six to twelve months later, it’s like ‘models are great at this thing, and I would always use AI anytime I ever do that again.’ In eval language, you go very quickly from zero to 5 or 10% to 60 or 80%.”
The implication for founders: if you see a capability that barely works today, it’s probably six to twelve months away from working well. Weil says we’re in exactly that middle phase right now with frontier science.
2) The science of 2050, but in 2030
Weil’s biggest bet at OpenAI is that AI will accelerate scientific discovery in ways we can physically feel: new materials, personalized medicine, fusion power. He laid out a specific vision for how that happens:
“The science of the future will definitely involve robotic labs and reinforcement learning loops that go through the real world. The model is thinking, maybe running a simulation, thinking some more, refining the experiment it can run using the best possible parameters, and then sending that to a bunch of robotic labs, which you can scale horizontally. The experiments run in real life. The results come back to the model. The model thinks, runs more simulations, thinks. You have multiple loops: tight loops with the model thinking and the simulation, and longer loops that go through the real world.”
“You have robotic labs that can scale horizontally, that can run 24 hours a day. They’re not grad students pipetting things who need to take breaks and sleep. And then the grad students can do things that are much more leveraging of what makes us human than pipetting things.”
Weil says this isn’t speculative. Robotic labs already exist, open math problems are already being solved by AI, and the pace of progress makes him confident the full loop won’t take long to close.
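The two nested loops Weil describes can be sketched in a few lines of Python. This is only an illustrative toy, not anything shown in the talk: `simulate`, `run_in_lab`, `refine_in_simulation`, and `discovery_loop` are hypothetical names, the "experiment" is a made-up one-parameter function, and a thread pool stands in for robotic labs that scale horizontally.

```python
from concurrent.futures import ThreadPoolExecutor

# All functions below are hypothetical stand-ins, assumed for illustration only.

def simulate(x: float) -> float:
    """Cheap, approximate model of the experiment (deliberately a little biased)."""
    return -(x - 2.8) ** 2

def run_in_lab(x: float) -> float:
    """Slow, ground-truth experiment; in this toy the true optimum is x = 3.0."""
    return -(x - 3.0) ** 2

def refine_in_simulation(center: float, width: float, steps: int = 20) -> list[float]:
    """Tight loop: score a grid of candidates in simulation and keep the best four."""
    candidates = [center - width + 2 * width * i / steps for i in range(steps + 1)]
    return sorted(candidates, key=simulate, reverse=True)[:4]

def discovery_loop(start: float, rounds: int = 5, labs: int = 4) -> float:
    """Long loop: run shortlisted candidates in parallel 'labs', then recentre the search."""
    best_x, best_y = start, run_in_lab(start)
    width = 4.0
    with ThreadPoolExecutor(max_workers=labs) as pool:
        for _ in range(rounds):
            shortlist = refine_in_simulation(best_x, width)
            results = list(pool.map(run_in_lab, shortlist))  # experiments run side by side
            for x, y in zip(shortlist, results):
                if y > best_y:  # real-world results decide what counts as better
                    best_x, best_y = x, y
            width /= 2  # narrow the search around the best real result
    return best_x

if __name__ == "__main__":
    print(round(discovery_loop(start=0.0), 2))
```

In this toy the simulator is never retrained, so the only feedback from the "real world" is where the next search is centred. A fuller version would presumably also update the simulator or the model from lab results.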
3) “I just wasted an hour”
When Shank asked about the moment Weil realized the world had fundamentally shifted, Weil didn’t point to a weekend project. He pointed to a meeting:
“I was sitting in a meeting and I closed my laptop because we’re all trying to be better about not multitasking. I had not gotten a Codex job running before I closed my laptop, and I was like, ‘Shit, I just wasted an hour.’ Not because of the meeting, but because I could have been multitasking during that hour. My Codex agent could have been fixing a bug or implementing a feature or doing something for me.”
“The same thing before you go to bed at night: ‘Okay, what really hard task can I give Codex and just let it chunk away for ten hours?’ If you’re really good at it, you’re not just juggling one job. You’ve got three or four things running in parallel across different work trees.”
This is a concrete picture of how the highest-performing people at OpenAI are already working: always running background tasks, always parallelizing, treating dead time as lost compute.
4) “The most fertile ground for startups that has ever been”
When the conversation turned to why he is bullish on the current startup environment, Weil made the case that the sheer novelty of AI capabilities creates an unprecedented window for founders:
“Today, models can do something that computers have never been able to do in the history of computers. And in another month, that’s going to happen again. And in another month, it’s going to happen again. We don’t always know what’s coming. Sometimes these things are emergent capabilities of models as we build them. And when you’re surprised, everyone in the world that uses a model has this new capability that nobody has ever had before.”
“One of the reasons I’m so bullish about startups right now: there are so many new capabilities and the world doesn’t quite know what’s possible. OpenAI doesn’t always know what’s possible. We’re not going to have all the ideas. It is just the most fertile ground for startups that there has ever been.”
Weil connects this to a practical point: this moment selects for people who are high agency. If you have an interesting idea, you have no excuse not to get Codex working on it in the background while you do whatever you were going to do that morning. You might have a working prototype by end of day.
5) On mistakes founders make when building with AI
Shank asked what mistakes Weil sees founders make when building on top of AI. Weil’s answer was specific and practical:
“A lot of things today turn out best when you use an ensemble of models. You might have an initial model that’s orchestrating, putting a plan together, understanding what you should do to answer the question. Then you have different models, maybe cheaper models that are trained to do one thing really well. The orchestration model is calling the other models. I don’t see people doing that enough.”
“Behind the scenes, we use ensembles of models in lots of places. Trying to use small models where you can, bigger models where you need to, and then have them all work together versus just prompt-engineering one giant prompt and hoping the answer is right.”
For now, the gap between one-shotting a complex flow and reliably handling it with an ensemble is significant, and Weil sees too many startups leaving reliability on the table.
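As a rough sketch of the orchestration pattern Weil describes, the Python below has a planning step choose an ordered list of sub-tasks and hand each one to a narrow specialist. Everything here is assumed for illustration: the "models" are plain functions, and names such as `orchestrator_plan`, `small_extract_model`, and `SPECIALISTS` are hypothetical. In a real system each function would wrap a call to an actual model.

```python
from typing import Callable

# Hypothetical stand-ins for model calls; none of these names come from the talk.

def small_extract_model(text: str) -> str:
    """Cheap specialist: pull the integers out of free text."""
    tokens = text.replace(",", " ").split()
    return ",".join(tok for tok in tokens if tok.isdigit())

def small_math_model(numbers: str) -> str:
    """Cheap specialist: sum a comma-separated list of integers."""
    return str(sum(int(n) for n in numbers.split(",") if n))

def large_writer_model(value: str) -> str:
    """Bigger model reserved for the step that needs fluent output."""
    return f"The total is {value}."

SPECIALISTS: dict[str, Callable[[str], str]] = {
    "extract": small_extract_model,
    "sum": small_math_model,
    "write": large_writer_model,
}

def orchestrator_plan(question: str) -> list[str]:
    """Stand-in for the orchestrating model: decide which steps to run, in order."""
    if any(ch.isdigit() for ch in question):
        return ["extract", "sum", "write"]
    return ["write"]

def answer(question: str) -> str:
    """Run the plan, passing each specialist's output to the next step."""
    result = question
    for step in orchestrator_plan(question):
        result = SPECIALISTS[step](result)
    return result

if __name__ == "__main__":
    print(answer("Add 12 and 30 please"))
```

The design choice being illustrated is that each step can be tested, swapped, or moved to a cheaper model on its own, which is harder to do when the whole flow lives in one large prompt.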
Thanks to OpenAI’s Kevin Weil for speaking with our founders. And for more weekly dives into the world of early stage startups, subscribe below.




