
Logan Kilpatrick on Who Ships AGI, DeepMind and the Problem With More Software
Transcript
Simon Maple: Back in November, we hosted the first-ever in-person AI Native DevCon in New York. This June 1st and 2nd, we are bringing it to London. It's a two-day build for AI-native developers and engineering teams. One day full of hands-on workshops and one full of practical talks on agent skills, context engineering, agent orchestration and enablement platforms, and how teams are actually shipping AI in production.
Join us at The Brewery in London, near the Barbican, for all of that, plus networking parties, giveaways, and a room full of people building the future of AI-native development. You can also join us from anywhere in the world via the livestream. As a listener of this podcast, you get 30% off your ticket with code POD30.
Just head to ainative.devcon.io, and we'll see you in London.
Hello and welcome to another episode of the AI Native Dev, and we have a special episode today. We actually have Logan Kilpatrick from Google DeepMind joining us today. Logan, a massive thank you and welcome to the episode. How are you?
Logan Kilpatrick: I'm hanging in there. It's another week of chaos in the AI ecosystem, so I'm trying to stay on top of it all.
Simon Maple: There have been wildfires all over Twitter and the social spaces already this week, haven't there? It's been quite the ride. And for those listening, it's April 1st today.
But this is the week in which Anthropic, knowingly or unknowingly, open-sourced Claude Code and things like that. So, lots of fun things happening in the world right now. Logan, for those of you who don't know Logan, Logan was previously a leader of the developer relations team at OpenAI and is currently a member of the technical staff at Google DeepMind.
So, Logan, why don't you tell us a little bit about, I guess, some of your time at OpenAI and moving over to Google, and what your day-to-day looks like.
Logan Kilpatrick: Yeah, there are lots of parallels. I do think the ecosystem has changed so much. I've been at Google; actually, today's my two-year anniversary at Google, which is crazy.
Simon Maple: Oh, congratulations.
Logan Kilpatrick: Yeah, thank you. I joined OpenAI at the end of 2022, a week after ChatGPT had launched and sort of got to launch GPT-4 and ship a bunch of, you know, the Chat Completions API, the Assistants API, plugins, GPTs, a bunch of other stuff across the ecosystem. And just seeing how fast things have changed and how much progress we've made.
I think my most interesting reaction is, you know, all the agent stuff 12 months ago. There were lots of conversations about agents in the developer ecosystem and more broadly, and to see them actually work now. People are building agent tech products; people are using agents to do the things that were sort of the promise, and it's interesting to see this, and we should talk more about it because it was kind of like a joke 12 months ago.
It was like, "Oh, you're doing agents, ha ha ha." Like, that doesn't really work. It's kind of science fiction. And I think today it's very real. And so I'm always thinking about what that next set of things is. I spend a lot of time with our team thinking about what those next steps are: things that we need to be thinking about that are sort of a little bit science fiction today.
We have a bunch of these "research to reality" monikers internally that we sort of think about, like how it's our team's responsibility to bridge research to reality. Because we sit in a research org inside of DeepMind, and our job is to build products for builders and developers.
So, the day-to-day looks like shipping products and shipping features and building a platform so that we can make that research-to-reality flywheel happen.
Simon Maple: That's amazing and such an exciting time. And I'd love to actually jump in on today; we'll start off with today, and then we'll kind of lean into tomorrow towards the end of the episode.
So, one of the super interesting pieces that has come out of Google recently is AI Studio. And of course many are already familiar with, you know, more of the terminal UI style of using Gemini and things like this. So first of all, talk us through a little bit about what AI Studio is and why people would maybe lean into AI Studio more than Gemini, and when people would lean into the terminal versus the UI.
Logan Kilpatrick: Yeah, it's a great question. I think one of the beauties and one of the challenges of Google is that we have a very wide ecosystem. We're sort of touching every part of the developer lifecycle. We're touching every part of the different ecosystems across web and mobile and cloud, et cetera, et cetera.
AI Studio, I think the most simple way to frame it is a platform, a set of things that help you go from prompt to prototype to production and do that really fast using all of our AI infrastructure. So historically, AI Studio was like a UI playground that you could sort of test all of Google DeepMind's latest models and experiment with them.
And then actually take them into the API and go and build products and tools and apps on top of the Gemini API. And I think what's happened in the last six months is we've sort of continued to evolve the platform. We still have that playground functionality so you can test all the latest models on day one and sort of experiment with them.
But we also now have an entire vibe coding experience. And I think part of this is it's a natural extension of the work that we were doing with the playground, which is like: help people understand how to actually get these models into products and bring their ideas to life. I think the main thing that's also shifted is there's just way more people now.
It's not just developers. It's this sort of next-gen developer-builder persona who historically couldn't use these tools, who now can. And part of our mission, in AI Studio and broadly inside of DeepMind, is to bring this technology to as many people as possible. So now you can go to AIStudio/build.
You can vibe-code entire apps that are built on top of, you know, Firebase, which is our sort of storage solution in parts of Google. You can deploy using Cloud Run, you can sort of use Google Search, you can have grounding data with Google Maps and sort of, we're exposing all these bits of the Google ecosystem and lots more cool stuff coming on that front in a way that is just simple.
You can click a couple of buttons, you don't need to sign up for eight different accounts. And we're sort of bringing together the ecosystem and building this platform for people to build on top of, which has been really, really exciting. So, lots of cool things.
To answer the question specifically: Gemini the app versus AI Studio, Gemini CLI versus AI Studio, and then even Antigravity versus AI Studio. The Gemini app is sort of like a personal assistant for your everyday use. So like, I was asking a bunch of medical questions to the Gemini app this morning.
The Gemini CLI is the CLI form factor. So if you're a developer, you use CLIs and that feels native to you; it's a great product experience as a developer to go and use that. And then Antigravity is sort of this full-stack, agentic IDE developer platform where you can actually go in and, similar to what you would do inside of Cursor or Claude Code, et cetera, build apps.
So if you're a developer, it's sort of your daily driver to go and write code and like really complex codebases, et cetera, et cetera. AI Studio sort of has a batteries-included app builder and then a bunch of underlying infrastructure that folks can build on top of.
Simon Maple: And do you think that a lot of this is very kind of dependent on how the engineer or the developer wants to work?
Some people prefer terminal UI, some people prefer more of the IDE approach. There's a version that people believe where sometimes, you know, vibe coding is maybe more for non-developers. And actually, senior engineers don't really need to change the way they work and things like that.
Do you feel like we will actually converge on a single approach? If you look at development five years ago, 99% of developers were using an IDE to develop; today that's much more split. Do you feel like when we get to a stage where agentic development is just the norm, we will have a single typical way people develop code?
Or do you still feel like there will be the more technical folks that maybe want to lean into the terminal, other people will maybe want a visual aspect? Are we in a moment in time or do you think that's gonna persist?
Logan Kilpatrick: My assumption is that it will persist. And part of this is back to the thread of personal preference. This has always been the case. There were developers, pre-AI era, who were using IDEs. I've always been sort of an IDE user myself. I was using VS Code, and before that, pre-Microsoft acquisition, I used a bunch of editors that I really liked. And at the same time there have always been Vim and Emacs and a bunch of these more terminal, CLI-centric tools.
And I think this is just like a developer comfort thing. There's like tradeoffs from an ergonomic perspective or tradeoffs from a productivity perspective. I think that will actually increase with all this stuff. And I think part of the worldview of this is the ability to create new software, the cost is going down.
So I think there'll be even more exploration. There'll be even more weird stuff. Like, you could imagine each developer actually has this very customised experience of how they like to build software, how their mental model, like the way I think about software, maybe is different because we learned in different languages and we sort of, you know, come from a different place or whatever it is.
So you can really customise that, and I think developers will actually probably end up building a lot of this stuff for themselves. I also think there'll be platforms that have this level of extensibility where, you know, you could imagine that in the future every developer has their own fork of VS Code, for example.
And it's like, there's the editor part of that, but then there's an agentic part of that as well that they sort of build themselves and customise. That's maybe like an extreme worldview, but I think the answer is probably something in the middle. And I think you'll see a lot of this developer choice meet people where they are.
All these different ecosystems require different stuff. People's level of comfort and familiarity with AI and these tools is also different. So it warrants a different product experience, and it's actually hard to build that highly extensible, customised product experience, at least today, if you're one of the teams building these products.
Simon Maple: Yeah, no, absolutely. I'd love to kind of go a little bit deeper into how we can vibe-code well. And I guess the one thing that you mentioned was if you want 30 things as part of your vibe-coding task, ask for 30 things in the first prompt. The model is now smart enough to handle that logic, and that was kind of a little bit different to the original form where we had to essentially ask for something small and build up because given too much information, too much context essentially would overwhelm that model. Do we need to rethink the way we were kind of originally taught to develop with AI?
Logan Kilpatrick: Yeah, a hundred percent. My framing of this is that this phenomenon is going to continue. It's part of the challenge of this moment, and I fall into the bucket of folks who have difficulty with this as well. So I'm not saying that I've figured it out, but you need to have the mental plasticity to just change with the ecosystem.
The way to use AI tools today is different from the way three months ago, which was different again from three months before that. If you want the frontier-level productivity gains, you need to continue to evolve. And that's hard. It's just difficult as a human to keep doing that.
But that is the reality of what needs to happen. And this "ask for 30 things" example is a very acute way of feeling the difference. Truly, for me, 12 months ago I was like, "Let me ask for the bare minimum thing possible, because otherwise the model and the agent will fumble over itself, not be able to actually do what I ask."
And now I'm constantly kicking myself to be like, "Maybe I should ask for three extra things," or four extra things, or five extra things, or all 30 things that I want. And the rate limit is now how quickly can I ask for things. And that's just a very different world to be in. And I think that it's been literally in the last six months that the shift has happened.
I think this was not the case a year ago. So, yeah, you kind of need to, the rules are being rewritten under our feet and you can either sort of ride the wave or you're not gonna get these frontier-level productivity gains.
Simon Maple: Another thing that has really shown its face here is context. And I think prompt engineering was a thing which we discussed and talked about a couple of years ago, but seems to have died off now in terms of the importance compared to something like context engineering.
And in particular, I guess, skills: over the last six months or so, skills have been built up and used so heavily by people for productivity gains. How much do you hold context and skills as part of a real efficiency gain in your work, in AI Studio and in Gemini?
Logan Kilpatrick: We have a ton of skills. When I do engineering work and use Antigravity internally to build AI Studio, I lean on the tons of skills our engineering teams have built, which is great for me because I'm not a day-to-day engineer at Google, so I don't know how a lot of the systems work. They can bring that context into the skill and influence the architecture decisions and stuff like that, which is really helpful.
My worldview has always been that prompt engineering was a bug. If you go and talk to users, they don't want to prompt engineer, and actually the things that you're asking them to add in already exist somewhere else. So your job as the human using AI systems, and sort of like the LLM or AI app 1.0 era, was to do the context engineering.
That was the value add of what you were providing as the human in that experience: going and finding all these disparate sources, bringing it into a little chat box, and then sending it off to the model so it could do something useful. I think my worldview sort of shifted when Deep Research came out, which was you could sort of take this really ill-formed question or idea or hypothesis and you could sort of send it off to Deep Research.
And Gemini would sort of go and browse the web and find all the different data sources and do this context engineering on the fly to really bring all the information in. I had this magic "aha" moment where the Deep Research UI would show you all the different sites it was visiting, and you'd see hundreds of sites and thousands of sites, even in some cases, being visited.
And it was just the epiphany to me that this is clearly the way that the products are going to end up going, which is: people do the thing that humans do, which is we ask our very context-thin questions, comments, requests, and then the system actually goes and does the work in order to go and find that.
And you see that now actually with coding tools, which is like: I can ask for some change. I'm in a million-line codebase somewhere, and the model will then go grab all the files and sort of look through and try to find and pull in the right pieces of context. I don't need to say, "And I think it's sitting in this random HTML file on this folder here." That is a complete bug.
You shouldn't have to do that. You should be able to, with pretty minimal context, go and do these things and the model should be able to figure it out. And so I'm very happy that on the coding side we've seen that as the direction of travel, and I think hopefully we'll see that in a bunch of other domains as well. But this is the unlock: the model on the fly doing context engineering.
It's an interesting thread on skills because they're obviously very helpful today. I think it will be a similar direction of travel, which is: the models will probably learn how to make skills on the fly. And actually, the reason this will end up being the case is that it's a token-saving efficiency thing. If you can get some really solid skill that has the right context, it saves the model from having to figure a bunch of this stuff out itself.
Without one, it's just gonna fumble its way through; it's gonna send a hundred requests to the Google Drive API, none of them will work, and it'll self-correct a bunch of times and read 20 web pages and look at 70 examples. It'll figure it out, but it just takes a lot of time and it wastes your tokens. So skills, I think in the short term, are a helpful way to get around that.
But I would expect over time that the model just pre-writes a bunch of this stuff or pulls from some repository of domain authority skills, whatever it is, and then solves this problem so that humans aren't handcrafting skills in the way that I think they are today.
Simon Maple: Yeah, it's super interesting and I think I love the fact that you called prompt engineering a bug there, and I think that's absolutely the right way of thinking about it because there really was that disconnect between human and LLM, whereby typically it's the human not asking for something in a way that the LLM really expected to be asked about a certain thing.
Context is very interesting, particularly with skills, because the model will maybe not understand or know exactly how that user or that developer wants something done. A skill isn't an industry average of how things work; it's how a specific company works: maybe its policies, maybe very specific conventions.
I guess the model can find that out locally. But I really like your approach there of saying it could do all that research, but if the skills are just there, they could be handcrafted once, or they could be built by the LLM, but once they're there, you might as well just have that in some repository and just pull that as needed. So I love that approach.
I'd love to ask about a previous quote of yours, where you said software volume is going to be a million times what it is now in 10 years. That's exciting, amazing, and scary at the same time. What do you feel that's going to do to the value of development or a developer? How's that gonna change when we have to look after and deal with that much code?
Logan Kilpatrick: Yeah, it's an interesting question because I think the landscape is shifting so quickly and so dynamically. I think there are a few things. In 10 years, what we consider to be development will likely look significantly different.
There will still be things that are similar; people will still be fixing problems with software. I don't think that's gonna go away. I think the way in which the tools are wielded, the scope, and the level of detail, I think a lot of those things are up in the air in my head as far as how it's going to shake out.
The interesting thing, and the reason I am relatively bullish on software engineering, or just engineering as a discipline in general, is that as the amount of software in the world increases, and as the number of people who are creating software and having software created on their behalf increases, there are going to be lots of problems.
There are going to be lots of things that don't work. There are going to be lots of edge cases. There's always going to be a frontier of what the tools can do that the average person wielding them can't do. And the difference between those two things is where I think engineering or traditional software engineering developers will add a huge amount of value because that gap is going to change, but the gap is also going to be ever-present, even if the models and the tools get really, really good.
I also think people get caught up in a lot of the pedagogy of this conversation. When I think about software engineering, I think about a way of solving problems, a way of thinking about the world, a way of approaching problems.
I'm not, at least personally, attached to the idea that I type keys and characters show up on the screen, and those characters represent some formal structured programming language like Python or JavaScript or whatever. I think it's more general.
And actually, if you look at computer science education, there are differences in how it's taught in different places, but a lot of it takes exactly this problem-solving, way-of-thinking-about-the-world approach, which is what I'm talking about.
AI doesn't minimise the value of that. I think it actually accelerates the value of it. And that value of the way of thinking about the world, the way of problem-solving, I think is going to be super, super valuable. Again, I think the value of typing keys that make characters render is probably going to go down, though I still think there'll be reasons to do that and there'll be value in it.
But this way of solving problems, I think, is going to persist, and I'm grateful that I spent the time to think about those things because it manifests in the way that I build stuff today.
Yeah.
Simon Maple: Do you think there's a mechanical sympathy there? Essentially, in terms of us understanding how things have been built to allow us to actually architect and make applications, you know, more reliable, more robust going forward.
Logan Kilpatrick: A hundred percent. Yeah. I think someone needs to have the level of depth to understand all these things. I think it's like, does everyone need that level of depth? Probably not. But you want to have experts to go deep in these different areas and think really deeply about the systems and know the right questions to ask.
Also, a lot of it is that as the means to build software have dramatically increased, lots of these technical decisions don't have a single right answer. They have somebody who has a strong opinion. And so I think you still need somebody to have the opinion, based on their own experiences or their sense of the direction you want to go, to make a bunch of these technical decisions.
There's often not a single right technical direction. There are many possible technical directions, and it's based on people's own lived experiences and intuition and understanding of technical constraints that they make a bunch of decisions. So I think all of that will continue on in the future.
Simon Maple: Interesting. Let's continue looking into the future a little bit and talk a little bit about the wonderful topic of AGI. Now, a couple of years ago, you asked on Twitter (on X), "How long until AGI?" and Musk replied, "Next year." That didn't happen. And I actually really like your approach of sitting underneath the hype.
You said someone is going to weave together the right components at the product level with a model that's really smart, and people are going to call that AGI. I think that's really interesting because obviously the AGI timeline predictions that many have thrown around haven't really aged that well. So I guess where do you feel like we are today versus a year ago on the path to AGI?
Logan Kilpatrick: This conversation has gotten even more complex over time. Because when a lot of these conversations started a few years ago, and even 10 years ago, you didn't have tools that could do any of these things.
It was very academic, very philosophical. I think a lot of the definitions and preconceived notions are grounded in that initial stage where we didn't have tools that had any of these capabilities. Or the most advanced version was AI playing games.
But it didn't generalise to a bunch of other things yet. I think the product adoption and the tooling and the way that this is impacting our lives have changed so dramatically in the last three years that I've almost decoupled myself from some of these AGI conversations. I think it's important that somebody has these conversations and has this more academically rigorous point of view.
I don't have that point of view. My worldview is very grounded in how the average person is going to interact with this stuff. And I think for the average person, if you took the technology we have today and brought it back three years, they'd be like, "Holy crap, that's the future. These systems are so smart and can do everything."
So in some sense, the goalpost keeps moving. In some sense, I think AI coding is an example of so much value being created. All these research tools, all these other things, it feels like we're close to what folks will feel as this narrow superintelligence. I think that's closer to the way of describing it. If you could have a model or a system that could build anything with code, I would think about that as narrow superintelligence.
Humans can't compete on the same level in that respect. From the academically rigorous point of view, this AGI question is grounded in whether or not the models have the generality.
And I think they still don't have generality. At the same time that I can basically build anything I want with software today using AI coding tools, I can trip the models up with all of these goofy things that humans are able to really easily do. So I think the academically rigorous argument is that we don't have this general intelligence until the models don't get tripped up with these things. It's pretty easy to beat the models at poker or chess or any of these things that humans can be relatively good at.
My worldview is that for the average person, it doesn't matter. For the average person, these tools are super impactful. They're already creating tons of economic value.
From an AGI perspective, my worldview is that AGI is not going to be a model; it's going to be a product that somebody creates. I think folks would probably agree with this worldview, even those who have the very academically rigorous point of view, just given the way that the models are now. To do coding, as an example, you need an agent harness, and you need a product, whether it's a CLI or an IDE, to bring the model to life.
You need all these things. So I think this worldview is likely tracking the progress that we're making. My last quick comment is that there's this capability overhang, and that's part of my worldview of why it likely won't be that some model lab just dropped a new model and then everyone thinks it's AGI.
I think it's going to be that there was a model that came out three months ago, and some really smart product and engineering team found a way to do something really interesting with it and put it into a system that I think people generally think is AGI. Because there's this huge capability overhang that exists today.
Simon Maple: And that resonates so strongly with me because if you think about what's actually unlocked development today with AI, it's not a model that has got significantly better than any other model.
It's the fact that agents have been layered on top of the models. And it's that interaction between an agent, itself driven by an LLM, and a backend LLM that has made agentic development so much more powerful.
And I think when you say AGI will probably feel like a product release, it's that environmental aspect, how you actually build a system that has various products in it, versus a single LLM that says, "Okay, ask me anything you want. I'm AGI."
That doesn't make sense, but what you're saying, a product or a feeling with a set of products in an environment, makes a ton more sense. If AGI does arrive more as a product experience than a single model, who ends up building that? Is it one company, one individual, or more a combination across an ecosystem of products?
Logan Kilpatrick: That's a good question. I think this is one of the interesting threads about being at Google. We have product distribution across so many different verticals and places that you might expect to need to be good at if we really were to have this general level of intelligence.
So yeah, I don't know. It'll be interesting to see. I don't know if I have the answer, but I would expect the Gemini app, as a personal assistant that helps you in your everyday life, should, if it's artificially generally intelligent, be able to dispatch and work with all of these different tools and ecosystems to help me complete any task that I want.
And I think we're trending in that direction with tool use and all these other things that are coming to the app. And I think that paradigm will continue as you'll likely have this product experience where one system may rely on many other systems.
But I think it is underappreciated, probably, the level of breadth and depth in the complexity of solving this artificial general intelligence thing.
I would be very surprised if we end up with just this chat UI where you go in and you just ask your question and everyone's like, "This is AGI because it can do anything." I think it's going to probably be very orchestrated, with lots of different stuff, a ton of UI complexity, and all these different things. I think it is going to be much more verbose than this simple AGI idea that folks have. That's at least my personal feeling of how things will shake out.
Simon Maple: Hey everyone. Hope you're enjoying the episode so far. Our team is working really hard behind the scenes to bring you the best guests so we can have the most informative conversations about agentic development, whether that's talking about the latest tools, the most efficient workflows, or defining best practices.
But for whatever reason, many of you have yet to subscribe to the channel. If you are enjoying the podcast and want us to continue to bring you the very best content, please do us a favor and hit that subscribe button. It really does make a difference and lets us continue to improve the quality of our guests and build an even better product for you.
Alright, back to the episode. Cool. Logan, this has been awesome. I want to jump into some quick-fire questions to wrap up this episode. First of all, and honestly, Logan: OpenAI or Google? Where did you have more fun?
Logan Kilpatrick: I'm having more fun at Google these days. I think it's an incredible place to build products, so.
Simon Maple: Awesome. When you were at OpenAI, what was one thing Sam Altman got right that no one really gives him credit for?
Logan Kilpatrick: I think people give Sam a lot of credit for a lot of things. I think he was right about the level of compute. I think Sam was maniacally focused on making sure that they were resourced for the level of AI consumption.
And I think he was right to make that the focus. Even basically the first meeting I was ever in with Sam, he was talking about that. So, he was definitely right about it.
Simon Maple: Amazing. There are a lot of big labs out there now, all jostling for position. If you had to bet on one lab not existing in five years, which one would it be?
Logan Kilpatrick: That's a good question. I think the bet would more so be that the labs as we see them today will look different. One of these labs will likely evolve into something that probably doesn't look like the lab of today's era. My sense is there'll be many winners in the ecosystem, but people will pivot into different areas. There's a good analogy here, actually. I don't know if the social media analogy is perfect, but a lot of the early social media products looked very similar, and over time they ended up diverging.
It's very clear to most people that Snapchat's a completely different business and product than Instagram, than X and Twitter, et cetera, et cetera. So I think we end up in that kind of setup a little bit.
Simon Maple: Awesome. What would you say is most overrated today: agents, RAG, or prompt engineering?
Logan Kilpatrick: I think people have moved on from RAG and prompt engineering. So they perhaps were overrated before, but I feel like they're probably adequately rated now, in that people don't put a lot of stock in them. They're obviously still important from a conceptual point of view, but I think the frontier has moved on, and people are focused on other things.
Simon Maple: Sounds good. What is the most over-hyped AI benchmark today?
Logan Kilpatrick: I like SWE-bench a lot. My only comment is that there are better versions now, and the folks who built SWE-bench have done a great job of continuing to iterate on different versions of it. But some of the original versions of SWE-bench are completely out of distribution of how I think most people do development work.
I might be off on this, and I don't remember the specific numbers, but something like 40% of the original SWE-bench is the model setting up Django. Which is fine, and it's great that it can measure that, but that's probably not how developers spend 40% of their time.
So maybe it's not that benchmarks are overhyped; the maintainers have done a great job of meeting the moment with some of the new SWE-bench versions.
Simon Maple: Yeah. In 10 years' time, will there be more or fewer software developers than today?
Logan Kilpatrick: I think the absolute number will probably be very similar, but there'll be new role profiles, and maybe 10 to 100x more people who are touching code on a daily basis.
This "member of the technical staff" profile is a good example: those people are probably mostly doing other things, but they still touch code. So the absolute number of developers stays about the same, but the number of people touching code, I would expect, to be dramatically more than what it is today.
Simon Maple: There have been a lot of amazing research papers over the last few years. What would you say is the most important AI paper of the last couple of years?
Logan Kilpatrick: It's probably scaling laws or the original transformer paper. Just sort of setting up this industry that we're in today. And those two things have really held true and had a huge impact.
Simon Maple: Which AI company would you say is most underestimated right now?
Logan Kilpatrick: I think people underestimate DeepMind, actually. I'm there, and every six months I ask myself, "Is Google the best place in the world to be doing this work?" And every six months the answer is the same: yes, it is.
I think there are just so many great things about being at Google and the talent and the way Demis runs DeepMind; I think it all deeply resonates with me.
So, it's hard. Google's a big ship, but when you steer it in the right direction, things go really well. So I'm excited.
Simon Maple: And the last question, Logan: What's the one thing you wish developers would stop asking you?
Logan Kilpatrick: I wish they would stop asking about rate limits. And it's not that I don't have a lot of empathy for folks asking this question. We're doing a lot on the product side, both in AI Studio and across other Google products, to solve this problem.
The future version of the world that I want is abundant compute: developers can just do the things they want, without worrying about rate limits or quotas. I think we need to do a bunch of work from a product perspective to make that possible. So I selfishly hope they stop asking because we've solved the problem, not because I'm annoyed by them asking.
Simon Maple: That sounds amazing. That's a future we can definitely live in. Logan, this has been absolutely great fun. Thank you so, so much for taking the time out to speak with us, and I appreciate it.
Logan Kilpatrick: Thank you for all the thoughtful questions. This was a ton of fun.
Simon Maple: Awesome. Thanks everyone for listening.
Tune into the next episode soon. Bye for now.
In this episode
"If you could have a system that could build anything with code, humans can't compete on the same level. That's narrow superintelligence, and we're close."
In this episode of AI Native Dev, Simon Maple sits down with Logan Kilpatrick, who spent years at OpenAI working alongside Sam Altman before moving to Google DeepMind as Group Product Manager.
They get into:
- There will be 100x more developers in the world because of AI
- AGI will be a product, not a model
- The way you used AI tools three months ago is already wrong
- What's actually changing inside the Gemini team and why it matters for developers building with it today
The developers who win won't be the ones who mastered today's tools. They'll be the ones who never stopped learning the new ones.
Prompt Engineering Was a Bug: Context Engineering and the Future of AI Development
Twelve months ago, agent-based development was something of a joke in developer circles. The technology existed in demos but failed in practice. Today, agents work. Products ship. The shift happened faster than most predictions suggested, and understanding what changed matters for anyone building with AI.
In a recent episode of the AI Native Dev podcast, Simon Maple spoke with Logan Kilpatrick, a member of the technical staff at Google DeepMind and formerly a developer relations leader at OpenAI. The conversation ranged from practical tips for working with current tools to longer-term perspectives on AGI, but a consistent thread ran through it: the rules of AI development are being rewritten continuously, and the developers who adapt fastest capture the most value.
From Minimal Prompts to Maximum Requests
The conventional wisdom of early LLM usage involved asking for the bare minimum. Models would fumble with complex requests, so developers learned to break tasks into small pieces, guide each step, and carefully constrain scope. That guidance is now outdated.
"Twelve months ago I was like, let me ask for the bare minimum thing possible, because otherwise the model and the agent will fumble over itself," Logan explained. "And now I'm constantly kicking myself to be like, maybe I should ask for three extra things, or four extra things, or all 30 things that I want."
The bottleneck has shifted. With current models, the constraint is how quickly developers can articulate what they want rather than how much the model can handle. This represents a fundamental change in the interaction pattern, and developers still operating under the old mental model are leaving productivity gains on the table.
The adjustment is harder than it sounds. Habits built over months of careful prompting do not disappear overnight. Logan admitted falling into the same pattern himself, even knowing intellectually that models can now handle far more complexity. The practical advice is to push harder on scope and complexity with each request, treating it as deliberate practice rather than expecting the old habits to update automatically.
Why Prompt Engineering Was a Bug
The framing that stuck most from the conversation was Logan's characterization of prompt engineering as a bug rather than a feature. The entire practice emerged because models could not retrieve context on their own, forcing humans to do the work of assembling relevant information into a chat box.
"Your job as the human using AI systems was to do the context engineering," Logan observed. "Going and finding all these disparate sources, bringing it into a little chat box, and then sending it off to the model so it could do something useful."
The evolution toward better tools means that burden is shifting. Deep Research demonstrated the pattern: users submit loosely formed questions, and the system handles the context gathering, visiting hundreds or thousands of sites to assemble what it needs. The same dynamic now applies to coding tools that can search codebases, pull in relevant files, and construct their own context from a minimal starting prompt.
This trajectory suggests that skills, which have become central to [context engineering](/blog/context-engineering-guide) workflows, represent a transitional phase rather than an endpoint. They provide efficiency by front-loading context that models would otherwise need to discover through trial and error. But the longer-term direction points toward models that can generate and manage their own skills dynamically, pulling from repositories of domain knowledge or building what they need on the fly.
"Skills, I think in the short term, are a helpful way to just get around that," Logan noted. "But I would expect over time that the model just pre-writes a bunch of this stuff or pulls from some repository of domain authority skills."
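The shift Logan describes can be caricatured in a few lines of Python. This is an illustrative sketch only: the corpus, the keyword retriever, and every function name here are invented for the example, and a real system would use an actual model with search tools rather than naive keyword matching.

```python
# Toy contrast between manual context assembly (the prompt-engineering
# era) and a system that gathers its own context. Everything here is
# a hypothetical stand-in, not a real API.

CORPUS = {
    "auth.md": "Tokens expire after 3600 seconds.",
    "billing.md": "Invoices are issued monthly.",
    "limits.md": "The default rate limit is 60 requests per minute.",
}

STOPWORDS = {"what", "is", "the", "a", "an"}

def keywords(question: str) -> set[str]:
    # Strip punctuation and drop filler words.
    return {w.strip("?.,!").lower() for w in question.split()} - STOPWORDS

def old_workflow(question: str, hand_picked: list[str]) -> str:
    """Prompt-engineering era: the human finds the sources and
    pastes them into the chat box."""
    context = "\n".join(CORPUS[name] for name in hand_picked)
    return f"CONTEXT:\n{context}\n\nQUESTION: {question}"

def agentic_workflow(question: str) -> str:
    """Context-engineering era: the system retrieves context itself.
    Naive keyword search stands in for a model searching a codebase
    or the web on its own."""
    words = keywords(question)
    relevant = [t for t in CORPUS.values()
                if any(w in t.lower() for w in words)]
    context = "\n".join(relevant)
    return f"CONTEXT:\n{context}\n\nQUESTION: {question}"

q = "What is the rate limit?"
manual = old_workflow(q, hand_picked=["limits.md"])  # human did the work
auto = agentic_workflow(q)                           # system did the work
print(auto)
```

The point of the toy is the division of labor: in the first function the human supplies `hand_picked`, in the second the system discovers the relevant source itself from a loosely formed question.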
What Development Tools Will Look Like
The conversation addressed the proliferation of development environments, from IDE-based tools like Cursor to CLI approaches like Gemini CLI to browser-based builders like AI Studio. The question of whether these will converge or diverge prompted a clear prediction: divergence will likely increase rather than decrease.
Developer preferences for interaction patterns existed before AI tools, with some preferring IDEs while others used Vim or Emacs. AI adds another dimension of customization. Logan suggested that developers might eventually have highly personalised environments, potentially even forks of tools like VS Code that include custom agentic capabilities.
The cost of creating software continues to drop, which enables more experimentation and specialization. Rather than converging on a single optimal workflow, the ecosystem may produce an expanding variety of approaches tuned to different mental models, languages, and problem domains.
For organizations, this implies flexibility in tooling policies. Forcing a single approach may sacrifice individual productivity gains that come from letting developers work in their preferred environments. The tradeoff between standardization benefits and individual optimization is shifting toward the latter as tools become more capable.
Software Volume and Developer Value
Logan made a striking claim: software volume could be a million times larger in ten years than it is today. The natural follow-up question concerns what that means for developer value.
The answer draws on a distinction between coding as typing and software engineering as problem-solving. The value of typing characters that compile into programs is likely to decline. The value of understanding problems deeply, making architectural decisions, and having strong opinions based on experience is likely to persist or increase.
"AI doesn't minimise the value of that way of thinking about the world, the way of problem-solving," Logan argued. "I think it actually accelerates the value of it."
There will always be a frontier of capability beyond what average users of AI tools can achieve. The gap between what tools can theoretically do and what most people can accomplish with them creates space for developers who understand the systems deeply enough to push toward that frontier. Edge cases, debugging complex failures, and making technical decisions where no single right answer exists all require human judgment even when AI handles implementation.
The pedagogical implication matters as well. Computer science education that emphasises problem-solving approaches and ways of thinking transfers to AI-augmented development. Education focused narrowly on syntax and manual coding translates less directly.
AGI as Product, Not Model
The conversation touched briefly on AGI, where Logan offered a framework worth considering. His view is that AGI will emerge as a product experience rather than a model release. The distinction matters because products involve orchestration, UI, integration with other systems, and all the complexity that goes into making capabilities accessible and useful.
"I would be very surprised if we end up with just this chat UI where you go in and you just ask your question and everyone's like, this is AGI because it can do anything," Logan predicted. "I think it's going to probably be very orchestrated, with lots of different stuff, a ton of UI complexity, and all these different things."
This perspective aligns with what happened with coding agents. The models capable of writing code existed before the agent harnesses that made them useful in practice. The product layer that orchestrates model capabilities, manages context, handles tool use, and presents appropriate interfaces is what transformed potential into productivity.
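That orchestration layer can be sketched as a simple loop: the harness routes the model's tool requests, feeds results back, and repeats until the model produces an answer. Everything below is a hypothetical stand-in; `stub_model` is a hard-coded stub rather than a real LLM call, and the single `read_file` tool returns a canned string.

```python
# Minimal sketch of an agent harness: the product layer that sits
# between a model and its tools. All names here are invented.

def stub_model(history: list[dict]) -> dict:
    """Stand-in for a model: requests a tool once, then answers
    using the tool result the harness fed back."""
    tool_msgs = [m for m in history if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool": "read_file", "args": {"path": "config.toml"}}
    content = tool_msgs[-1]["content"]
    return {"answer": f"The config sets {content.replace(' ', '')}."}

# Tool registry: in a real product, file search, web browsing, etc.
TOOLS = {"read_file": lambda path: "timeout = 30"}

def run_agent(question: str) -> str:
    """The harness loop: call the model, execute any requested tool,
    append the result to the context, and repeat until done."""
    history = [{"role": "user", "content": question}]
    while True:
        step = stub_model(history)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])
        history.append({"role": "tool", "content": result})

print(run_agent("What is the timeout?"))
```

Even in this toy form, the model never touches the filesystem: the harness executes tools and manages the growing context, which is exactly the product work the passage describes.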
For developers and companies building on AI, the implication is that capability improvements in underlying models matter, but so does the entire stack of product work that makes those capabilities usable. The gap between what models can do in principle and what products deliver in practice represents both an opportunity and a necessary investment.
The full conversation covers additional ground on Google's AI Studio platform, the evolution of SWE-bench and other benchmarks, and predictions about the lab landscape over the next five years. Worth a listen for anyone tracking how the major platforms are thinking about developer tools and the trajectory of AI capabilities.