
Logan Kilpatrick on Who Ships AGI, DeepMind and the Problem With More Software
Transcript
Simon Maple: Back in November, we hosted the first-ever in-person AI Native DevCon in New York. This June 1st and 2nd, we are bringing it to London. It's a two-day build for AI-native developers and engineering teams. One day full of hands-on workshops and one full of practical talks on agent skills, context engineering, agent orchestration and enablement platforms, and how teams are actually shipping AI in production.
Join us at The Brewery in London, near the Barbican, for all of that, plus networking parties, giveaways, and a room full of people building the future of AI-native development. You can also join us from anywhere in the world via the livestream. As a listener of this podcast, you get 30% off your ticket with code POD30.
Just head to ainative.devcon.io, and we'll see you in London.
Hello and welcome to another episode of the AI Native Dev, and we have a special episode today. We actually have Logan Kilpatrick from Google DeepMind joining us today. Logan, a massive thank you and welcome to the episode. How are you?
Logan Kilpatrick: I'm hanging in there. It's another week of chaos in the AI ecosystem, so I'm trying to stay on top of it all.
Simon Maple: There have been wildfires all over Twitter and the social spaces already this week, haven't there? It's been quite the ride. And for those listening, it's April 1st today.
But this is the week in which Anthropic, knowingly or unknowingly, open-sourced Claude Code and things like that. So, lots of fun things happening in the world right now. Logan, for those of you who don't know Logan, Logan was previously a leader of the developer relations team at OpenAI and is currently a member of the technical staff at Google DeepMind.
So, Logan, why don't you tell us a little bit about, I guess, some of your time at OpenAI and moving over to Google, and what your day-to-day looks like.
Logan Kilpatrick: Yeah, there are lots of parallels. I do think the ecosystem has changed so much. I've been at Google; actually, today's my two-year anniversary at Google, which is crazy.
Simon Maple: Oh, congratulations.
Logan Kilpatrick: Yeah, thank you. I joined OpenAI at the end of 2022, a week after ChatGPT had launched and sort of got to launch GPT-4 and ship a bunch of, you know, the Chat Completions API, the Assistants API, plugins, GPTs, a bunch of other stuff across the ecosystem. And just seeing how fast things have changed and how much progress we've made.
I think my most interesting reaction is, you know, all the agent stuff 12 months ago. There were lots of conversations about agents in the developer ecosystem and more broadly, and to see them actually work now. People are building agent tech products; people are using agents to do the things that were sort of the promise, and it's interesting to see this, and we should talk more about it because it was kind of like a joke 12 months ago.
It was like, "Oh, you're doing agents, ha ha ha." Like, that doesn't really work. It's kind of science fiction. And I think today it's very real. And so I'm always thinking about what that next set of things is. I spend a lot of time with our team thinking about what those next steps are: things that we need to be thinking about that are sort of a little bit science fiction today.
We have a bunch of these "research to reality" monikers internally that we sort of think about, like how it's our team's responsibility to bridge research to reality. Because we sit in a research org inside of DeepMind, and our job is to build products for builders and developers.
So, the day-to-day looks like shipping products and shipping features and building a platform so that we can make that research-to-reality flywheel happen.
Simon Maple: That's amazing and such an exciting time. And I'd love to actually jump in on today; we'll start off with today, and then we'll kind of lean into tomorrow towards the end of the episode.
So, one of the super interesting pieces that has come out of Google recently is AI Studio. And of course many are already familiar with, you know, more of the terminal UI style of using Gemini and things like this. So first of all, talk us through a little bit about what AI Studio is and why people would maybe lean into AI Studio more than Gemini, and when people would lean into the terminal versus the UI.
Logan Kilpatrick: Yeah, it's a great question. I think one of the beauties and one of the challenges of Google is that we have a very wide ecosystem. We're sort of touching every part of the developer lifecycle. We're touching every part of the different ecosystems across web and mobile and cloud, et cetera, et cetera.
AI Studio, I think the most simple way to frame it is a platform, a set of things that help you go from prompt to prototype to production and do that really fast using all of our AI infrastructure. So historically, AI Studio was like a UI playground that you could sort of test all of Google DeepMind's latest models and experiment with them.
And then actually take them into the API and go and build products and tools and apps on top of the Gemini API. And I think what's happened in the last six months is we've sort of continued to evolve the platform. We still have that playground functionality so you can test all the latest models on day one and sort of experiment with them.
But we also now have an entire vibe coding experience. And I think part of this is it's a natural extension of the work that we were doing with the playground, which is like: help people understand how to actually get these models into products and bring their ideas to life. I think the main thing that's also shifted is there's just way more people now.
It's not just developers. It's this sort of next-gen developer-builder persona who historically couldn't use these tools, who now can. And part of our mission, in AI Studio and broadly inside of DeepMind, is to bring this technology to as many people as possible. So now you can go to AIStudio/build.
You can vibe-code entire apps that are built on top of, you know, Firebase, which is our sort of storage solution in parts of Google. You can deploy using Cloud Run, you can sort of use Google Search, you can have grounding data with Google Maps and sort of, we're exposing all these bits of the Google ecosystem and lots more cool stuff coming on that front in a way that is just simple.
You can click a couple of buttons, you don't need to sign up for eight different accounts. And we're sort of bringing together the ecosystem and building this platform for people to build on top of, which has been really, really exciting. So, lots of cool things.
To answer the question specifically: Gemini the app versus AI Studio, Gemini CLI versus AI Studio, and then even Antigravity versus AI Studio. The Gemini app is sort of like a personal assistant for your everyday use. So like, I was asking a bunch of medical questions to the Gemini app this morning.
The Gemini CLI is the CLI form factor. So if you're a developer, you use CLIs and that feels native to you; it's a great product experience as a developer to go and use that. And then Antigravity is sort of this full-stack, agentic IDE developer platform where you can actually go in and, similar to what you would do inside of Cursor or Claude Code, et cetera, build apps.
So if you're a developer, it's sort of your daily driver to go and write code and like really complex codebases, et cetera, et cetera. AI Studio sort of has a batteries-included app builder and then a bunch of underlying infrastructure that folks can build on top of.
Simon Maple: And do you think that a lot of this is very kind of dependent on how the engineer or the developer wants to work?
Some people prefer terminal UI, some people prefer more of the IDE approach. There's a version that people believe where sometimes, you know, vibe coding is maybe more for non-developers. And actually, senior engineers don't really need to change the way they work and things like that.
Do you feel like we will actually converge on a single approach? If you look at development five years ago, 99% of developers were using an IDE to develop; today that's much more split. Do you feel like when we get to a stage where agentic development is just the norm, we will have a single typical way people develop code?
Or do you still feel like there will be the more technical folks that maybe want to lean into the terminal, other people will maybe want a visual aspect? Are we in a moment in time or do you think that's gonna persist?
Logan Kilpatrick: My assumption is that it will persist. And part of this is back to the thread of personal preference. This has always been the case. There were developers, pre-AI era, who were using IDEs. I've always been sort of an IDE user myself. I was using VS Code, and before that, pre-Microsoft acquisition, I used a bunch of editors that I really liked. And at the same time there have always been Vim and Emacs and a bunch of these more terminal, CLI-centric tools.
And I think this is just like a developer comfort thing. There's like tradeoffs from an ergonomic perspective or tradeoffs from a productivity perspective. I think that will actually increase with all this stuff. And I think part of the worldview of this is the ability to create new software, the cost is going down.
So I think there'll be even more exploration. There'll be even more weird stuff. Like, you could imagine each developer actually has this very customised experience of how they like to build software, how their mental model, like the way I think about software, maybe is different because we learned in different languages and we sort of, you know, come from a different place or whatever it is.
So you can really customise that, and I think developers will actually probably end up building a lot of this stuff for themselves. I also think there'll be platforms that have this level of extensibility where, you know, you could imagine that in the future every developer has their own fork of VS Code, for example.
And it's like, there's the editor part of that, but then there's an agentic part of that as well that they sort of build themselves and customise. That's maybe like an extreme worldview, but I think the answer is probably something in the middle. And I think you'll see a lot of this developer choice meet people where they are.
All these different ecosystems require different stuff. People's level of comfort and familiarity with AI and these tools is also different. So it warrants a different product experience, and it's actually hard to build that highly extensible, customised product experience, at least today, if you're one of the teams building these products.
Simon Maple: Yeah, no, absolutely. I'd love to kind of go a little bit deeper into how we can vibe-code well. And I guess the one thing that you mentioned was if you want 30 things as part of your vibe-coding task, ask for 30 things in the first prompt. The model is now smart enough to handle that logic, and that was kind of a little bit different to the original form where we had to essentially ask for something small and build up because given too much information, too much context essentially would overwhelm that model. Do we need to rethink the way we were kind of originally taught to develop with AI?
Logan Kilpatrick: Yeah, a hundred percent. My framing of this is that this phenomenon is going to continue. It's part of the challenge of this moment, and I fall into the bucket of folks who have difficulty with this as well. So I'm not saying that I've figured it out, but you need to have the mental plasticity to just change with the ecosystem.
The way to use AI tools today is different from the way three months ago, which was different again from three months before that. If you want the frontier-level productivity gains, you need to continue to evolve. And that's hard. It's just difficult as a human to keep doing that.
But that is the reality of what needs to happen. And this "ask for 30 things" example is a very acute way of feeling the difference. Truly, for me, 12 months ago I was like, "Let me ask for the bare minimum thing possible, because otherwise the model and the agent will fumble over itself, not be able to actually do what I ask."
And now I'm constantly kicking myself to be like, "Maybe I should ask for three extra things," or four extra things, or five extra things, or all 30 things that I want. And the rate limit is now how quickly can I ask for things. And that's just a very different world to be in. And I think that it's been literally in the last six months that the shift has happened.
I think this was not the case a year ago. So, yeah, you kind of need to, the rules are being rewritten under our feet and you can either sort of ride the wave or you're not gonna get these frontier-level productivity gains.
Simon Maple: Another thing that has really shown its face here is context. And I think prompt engineering was a thing which we discussed and talked about a couple of years ago, but seems to have died off now in terms of the importance compared to something like context engineering.
And in particular, I guess, skills: over the last six months or so, skills have been built up and used so heavily by people for productivity gains. How much do you hold context and skills as part of a real efficiency gain in your work, in AI Studio and in Gemini?
Logan Kilpatrick: We have a ton of skills. When I do engineering work and use Antigravity internally to build AI Studio, I lean on the tons of skills our engineering teams have built, which is great for me because I'm not a day-to-day engineer at Google, so I don't know how a lot of the systems work. They can bring that context into the skill and influence the architecture decisions and stuff like that, which is really helpful.
My worldview has always been that prompt engineering was a bug. If you go and talk to users, they don't want to prompt engineer, and actually the things that you're asking them to add in already exist somewhere else. So your job as the human using AI systems, and sort of like the LLM or AI app 1.0 era, was to do the context engineering.
That was the value add of what you were providing as the human in that experience: going and finding all these disparate sources, bringing it into a little chat box, and then sending it off to the model so it could do something useful. I think my worldview sort of shifted when Deep Research came out, which was you could sort of take this really ill-formed question or idea or hypothesis and you could sort of send it off to Deep Research.
And Gemini would sort of go and browse the web and find all the different data sources and do this context engineering on the fly to really bring all the information in. I had this magic "aha" moment where the Deep Research UI would show you all the different sites it was visiting, and you'd see hundreds of sites and thousands of sites, even in some cases, being visited.
And it was just the epiphany to me that this is clearly the way that the products are going to end up going, which is: people do the thing that humans do, which is we ask our very context-thin questions, comments, requests, and then the system actually goes and does the work in order to go and find that.
And you see that now actually with coding tools, which is like: I can ask for some change. I'm in a million-line codebase somewhere, and the model will then go grab all the files and sort of look through and try to find and pull in the right pieces of context. I don't need to say, "And I think it's sitting in this random HTML file on this folder here." That is a complete bug.
You shouldn't have to do that. You should be able to, with pretty minimal context, go and do these things and the model should be able to figure it out. And so I'm very happy that on the coding side we've seen that as the direction of travel, and I think hopefully we'll see that in a bunch of other domains as well. But this is the unlock: the model on the fly doing context engineering.
It's an interesting thread on skills because they're obviously very helpful today. I think it will be a similar direction of travel, which is: the models will probably learn how to make skills on the fly. And actually, the reason this will end up being the case is that it's a token-saving efficiency thing. If you can get some really solid skill that has the right context, it saves the model from having to figure a bunch of this stuff out itself.
Without one, it's just gonna fumble its way through; it's gonna send a hundred requests to the Google Drive API, none of them will work, and it'll self-correct a bunch of times and read 20 web pages and look at 70 examples. It'll figure it out, but it just takes a lot of time and it wastes your tokens. So skills, I think in the short term, are a helpful way to get around that.
But I would expect over time that the model just pre-writes a bunch of this stuff or pulls from some repository of domain authority skills, whatever it is, and then solves this problem so that humans aren't handcrafting skills in the way that I think they are today.
Simon Maple: Yeah, it's super interesting and I think I love the fact that you called prompt engineering a bug there, and I think that's absolutely the right way of thinking about it because there really was that disconnect between human and LLM, whereby typically it's the human not asking for something in a way that the LLM really expected to be asked about a certain thing.
Context is very interesting, particularly with skills, because the model will maybe not understand or know exactly how that user or that developer wants something done. A skill isn't an industry average of how things work; it's how a specific company works: maybe its policies, maybe very specific conventions.
I guess the model can find that out locally. But I really like your approach there of saying it could do all that research, but if the skills are just there, they could be handcrafted once, or they could be built by the LLM, but once they're there, you might as well just have that in some repository and just pull that as needed. So I love that approach.
I'd love to ask about a previous quote of yours, where you said software volume is going to be a million times what it is now in 10 years. That's exciting, amazing, and scary at the same time. What do you feel that's going to do to the value of development or a developer? How's that gonna change when we have to look after and deal with that much code?
Logan Kilpatrick: Yeah, it's an interesting question because I think the landscape is shifting so quickly and so dynamically. I think there are a few things. In 10 years, what we consider to be development will likely look significantly different.
There will still be things that are similar; people will still be fixing problems with software. I don't think that's gonna go away. I think the way in which the tools are wielded, the scope, and the level of detail, I think a lot of those things are up in the air in my head as far as how it's going to shake out.
The interesting thing, and the reason I am relatively bullish on software engineering, or just engineering as a discipline in general, is that as the amount of software in the world increases, and as the number of people who are creating software and having software created on their behalf increases, there are going to be lots of problems.
There are going to be lots of things that don't work. There are going to be lots of edge cases. There's always going to be a frontier of what the tools can do that the average person wielding them can't do. And the difference between those two things is where I think engineering or traditional software engineering developers will add a huge amount of value because that gap is going to change, but the gap is also going to be ever-present, even if the models and the tools get really, really good.
I also think people get caught up in a lot of the pedagogy of this conversation. When I think about software engineering, I think about a way of solving problems, a way of thinking about the world, a way of approaching problems.
I'm not, at least personally, attached to the idea that I type keys and characters show up on the screen, and those characters represent some formal structured programming language like Python or JavaScript or whatever. I think it's more general.
And actually, if you look at computer science education, there are differences in how it's taught in different places, but a lot of it takes exactly this problem-solving, way-of-thinking-about-the-world approach, which is what I'm talking about.
AI doesn't minimise the value of that. I think it actually accelerates the value of it. And that value of the way of thinking about the world, the way of problem-solving, I think is going to be super, super valuable. Again, I think the value of typing keys that make characters render is probably going to go down, though I still think there'll be reasons to do that and there'll be value in it.
But this way of solving problems, I think, is going to persist, and I'm grateful that I spent the time to think about those things because it manifests in the way that I build stuff today.
Yeah.
Simon Maple: Do you think there's a mechanical sympathy there? Essentially, in terms of us understanding how things have been built to allow us to actually architect and make applications, you know, more reliable, more robust going forward.
Logan Kilpatrick: A hundred percent. Yeah. I think someone needs to have the level of depth to understand all these things. I think it's like, does everyone need that level of depth? Probably not. But you want to have experts to go deep in these different areas and think really deeply about the systems and know the right questions to ask.
Also, a lot of it is that as the means to build software have dramatically increased, lots of these technical decisions don't have a single right answer. They have somebody who has a strong opinion. And so I think you still need somebody to have the opinion, based on their own experiences or their sense of the direction you want to go, to make a bunch of these technical decisions.
There's often not a single right technical direction. There are many possible technical directions, and it's based on people's own lived experiences and intuition and understanding of technical constraints that they make a bunch of decisions. So I think all of that will continue on in the future.
Simon Maple: Interesting. Let's continue looking into the future a little bit and talk a little bit about the wonderful topic of AGI. Now, a couple of years ago, you asked on Twitter (on X), "How long until AGI?" and Musk replied, "Next year." That didn't happen. And I actually really like your approach of sitting underneath the hype.
You said someone is going to weave together the right components at the product level with a model that's really smart, and people are going to call that AGI. I think that's really interesting because obviously the AGI timeline predictions that many have thrown around haven't really aged that well. So I guess where do you feel like we are today versus a year ago on the path to AGI?
Logan Kilpatrick: This conversation has gotten even more complex over time. Because when a lot of these conversations started a few years ago, and even 10 years ago, you didn't have tools that could do any of these things.
It was very academic, very philosophical. I think a lot of the definitions and preconceived notions are grounded in that initial stage where we didn't have tools that had any of these capabilities. Or the most advanced version was AI playing games.
But it didn't generalise to a bunch of other things yet. I think the product adoption and the tooling and the way that this is impacting our lives have changed so dramatically in the last three years that I've almost decoupled myself from some of these AGI conversations. I think it's important that somebody has these conversations and has this more academically rigorous point of view.
I don't have that point of view. My worldview is very grounded in how the average person is going to interact with this stuff. And I think for the average person, if you took the technology we have today and brought it back three years, they'd be like, "Holy crap, that's the future. These systems are so smart and can do everything."
So in some sense, the goalpost keeps moving. In some sense, I think AI coding is an example of so much value being created. All these research tools, all these other things, it feels like we're close to what folks will feel as this narrow superintelligence. I think that's closer to the way of describing it. If you could have a model or a system that could build anything with code, I would think about that as narrow superintelligence.
Humans can't compete on the same level in that respect. From the academically rigorous point of view, this AGI question is grounded in whether or not the models have the generality.
And I think they still don't have generality. At the same time that I can basically build anything I want with software today using AI coding tools, I can trip the models up with all of these goofy things that humans are able to really easily do. So I think the academically rigorous argument is that we don't have this general intelligence until the models don't get tripped up with these things. It's pretty easy to beat the models at poker or chess or any of these things that humans can be relatively good at.
My worldview is that for the average person, it doesn't matter. For the average person, these tools are super impactful. They're already creating tons of economic value.
From an AGI perspective, my worldview is that AGI is not going to be a model; it's going to be a product that somebody creates. I think folks would probably agree with this worldview, even those who have the very academically rigorous point of view, just given the way that the models are now. To do coding, as an example, you need an agent harness, and you need a product, whether it's a CLI or an IDE, to bring the model to life.
You need all these things. So I think this worldview is likely tracking the progress that we're making. My last quick comment is that there's this capability overhang, and that's part of my worldview of why it likely won't be that some model lab just dropped a new model and then everyone thinks it's AGI.
I think it's going to be that there was a model that came out three months ago, and some really smart product and engineering team found a way to do something really interesting with it and put it into a system that I think people generally think is AGI. Because there's this huge capability overhang that exists today.
Simon Maple: And that resonates so strongly with me because if you think about what's actually unlocked development today with AI, it's not a model that has got significantly better than any other model.
It's the fact that agents have been layered on top of the models. And it's that interaction between an agent, itself driven by an LLM, and a backend LLM that has made agentic development so much more powerful.
And I think when you say AGI will probably feel like a product release, it's that environmental aspect, how you actually build a system that has various products in it, versus a single LLM that says, "Okay, ask me anything you want. I'm AGI."
That doesn't make sense, but what you're saying, a product or a feeling with a set of products in an environment, makes a ton more sense. If AGI does arrive more as a product experience than a single model, who ends up building that? Is it one company, one individual, or more a combination across an ecosystem of products?
Logan Kilpatrick: That's a good question. I think this is one of the interesting threads about being at Google. We have product distribution across so many different verticals and places that you might expect to need to be good at if we really were to have this general level of intelligence.
So yeah, I don't know. It'll be interesting to see. I don't know if I have the answer, but I would expect the Gemini app, as a personal assistant that helps you in your everyday life, should, if it's artificially generally intelligent, be able to dispatch and work with all of these different tools and ecosystems to help me complete any task that I want.
And I think we're trending in that direction with tool use and all these other things that are coming to the app. And I think that paradigm will continue as you'll likely have this product experience where one system may rely on many other systems.
But I think it is underappreciated, probably, the level of breadth and depth in the complexity of solving this artificial general intelligence thing.
I would be very surprised if we end up with just this chat UI where you go in and you just ask your question and everyone's like, "This is AGI because it can do anything." I think it's going to probably be very orchestrated, with lots of different stuff, a ton of UI complexity, and all these different things. I think it is going to be much more verbose than this simple AGI idea that folks have. That's at least my personal feeling of how things will shake out.
Simon Maple: Hey everyone. Hope you're enjoying the episode so far. Our team is working really hard behind the scenes to bring you the best guests so we can have the most informative conversations about agentic development, whether that's talking about the latest tools, the most efficient workflows, or defining best practices.
But for whatever reason, many of you have yet to subscribe to the channel. If you are enjoying the podcast and want us to continue to bring you the very best content, please do us a favor and hit that subscribe button. It really does make a difference and lets us continue to improve the quality of our guests and build an even better product for you.
Alright, back to the episode. Cool. Logan, this has been awesome. I want to jump into some quick-fire questions to wrap up this episode. First of all, and honestly, Logan: OpenAI or Google? Where did you have more fun?
Logan Kilpatrick: I'm having more fun at Google these days. I think it's an incredible place to build products, so.
Simon Maple: Awesome. When you were at OpenAI, what was one thing Sam Altman got right that no one really gives him credit for?
Logan Kilpatrick: I think people give Sam a lot of credit for a lot of things. I think he was right about the level of compute. I think Sam was maniacally focused on making sure that they were resourced for the level of AI consumption.
And I think he was right to make that the focus. Even basically the first meeting I was ever in with Sam, he was talking about that. So, he was definitely right about it.
Simon Maple: Amazing. There are a lot of big labs out there now, all jostling for position. If you had to bet on one lab not existing in five years, which one would it be?
Logan Kilpatrick: That's a good question. I think the bet would more so be that the labs as we see them today will look different. One of these labs will likely evolve into something that probably doesn't look like the lab of today's era. My sense is there'll be many winners in the ecosystem, but people will pivot into different areas. There's a good analogy here, actually. I don't know if the social media analogy is perfect, but a lot of the early social media products looked very similar, and over time they ended up diverging.
It's very clear to most people that Snapchat's a completely different business and product than Instagram, than X and Twitter, et cetera, et cetera. So I think we end up in that kind of setup a little bit.
Simon Maple: Awesome. What would you say is most overrated today: agents, RAG, or prompt engineering?
Logan Kilpatrick: I think people have moved on from RAG and prompt engineering. So they perhaps were overrated before, but I feel like they're probably adequately rated now, in that people don't put a lot of stock in them. They're obviously still important from a conceptual point of view, but I think the frontier has moved on, and people are focused on other things.
Simon Maple: Sounds good. What is the most over-hyped AI benchmark today?
Logan Kilpatrick: I like SWE-bench a lot. My only comment is that there are better versions now, and the folks who built SWE-bench have done a great job of continuing to iterate on different versions of it. But some of the original versions of SWE-bench are completely out of distribution of how I think most people do development work.
I might be off on this, and I don't remember the specific numbers, but something like 40% of the original SWE-bench is the model setting up Django. Which is fine, and it's great that it can measure that, but that's probably not how developers spend 40% of their time.
So maybe it's not that benchmarks are overhyped; the maintainers have done a great job of meeting the moment with some of the new SWE-bench versions.
Simon Maple: Yeah. In 10 years' time, will there be more or fewer software developers than today?
Logan Kilpatrick: I think the absolute number will probably be very similar, but there'll be new role profiles, and maybe 10 to 100x more people who are touching code on a daily basis.
This "member of the technical staff" profile is a good example: those people are probably mostly doing other things, but they still touch code. So the absolute number of developers stays about the same, but the number of people touching code, I would expect, to be dramatically more than what it is today.
Simon Maple: There have been a lot of amazing research papers over the last few years. What would you say is the most important AI paper of the last couple of years?
Logan Kilpatrick: It's probably scaling laws or the original transformer paper. Just sort of setting up this industry that we're in today. And those two things have really held true and had a huge impact.
Simon Maple: Which AI company would you say is most underestimated right now?
Logan Kilpatrick: I think people underestimate DeepMind, actually. I'm there, and every six months I ask myself, "Is Google the best place in the world to be doing this work?" And every six months the answer is the same: yes, it is.
I think there are just so many great things about being at Google and the talent and the way Demis runs DeepMind; I think it all deeply resonates with me.
So, it's hard. Google's a big ship, but when you steer it in the right direction, things go really well. So I'm excited.
Simon Maple: And the last question, Logan: What's the one thing you wish developers would stop asking you?
Logan Kilpatrick: I wish they would stop asking about rate limits. And it's not that I don't have a lot of empathy for folks asking this question. We're doing a lot on the product side, both in AI Studio and across other Google products, to solve this problem.
The future version of the world that I want is abundant compute: developers can just do the things they want, without worrying about rate limits or quotas. I think we need to do a bunch of work from a product perspective to make that possible. So I selfishly hope they stop asking because we've solved the problem, not because I'm annoyed by them asking.
Simon Maple: That sounds amazing. That's a future we can definitely live in. Logan, this has been absolutely great fun. Thank you so, so much for taking the time out to speak with us, and I appreciate it.
Logan Kilpatrick: Thank you for all the thoughtful questions. This was a ton of fun.
Simon Maple: Awesome. Thanks everyone for listening.
Tune into the next episode soon. Bye for now.
In this episode
"If you could have a system that could build anything with code, humans can't compete on the same level. That's narrow superintelligence, and we're close."
In this episode of AI Native Dev, Simon Maple sits down with Logan Kilpatrick, who spent years at OpenAI working alongside Sam Altman before moving to Google DeepMind as Group Product Manager.
They get into:
- There will be 100x more developers in the world because of AI
- AGI will be a product, not a model
- The way you used AI tools three months ago is already wrong
- What's actually changing inside the Gemini team and why it matters for developers building with it today
The developers who win won't be the ones who mastered today's tools. They'll be the ones who never stopped learning the new ones.
Prompt Engineering Was a Bug: Context Engineering and the Future of AI Development
Twelve months ago, agent-based development was something of a joke in developer circles. The technology existed in demos but failed in practice. Today, agents work. Products ship. The shift happened faster than most predictions suggested, and understanding what changed matters for anyone building with AI.
In a recent episode of the AI Native Dev podcast, Simon Maple spoke with Logan Kilpatrick, a member of the technical staff at Google DeepMind and formerly a developer relations leader at OpenAI. The conversation ranged from practical tips for working with current tools to longer-term perspectives on AGI, but a consistent thread ran through it: the rules of AI development are being rewritten continuously, and the developers who adapt fastest capture the most value.
From Minimal Prompts to Maximum Requests
The conventional wisdom of early LLM usage involved asking for the bare minimum. Models would fumble with complex requests, so developers learned to break tasks into small pieces, guide each step, and carefully constrain scope. That guidance is now outdated.
"Twelve months ago I was like, let me ask for the bare minimum thing possible, because otherwise the model and the agent will fumble over itself," Logan explained. "And now I'm constantly kicking myself to be like, maybe I should ask for three extra things, or four extra things, or all 30 things that I want."
The bottleneck has shifted. With current models, the constraint is how quickly developers can articulate what they want rather than how much the model can handle. This represents a fundamental change in the interaction pattern, and developers still operating under the old mental model are leaving productivity gains on the table.
The adjustment is harder than it sounds. Habits built over months of careful prompting do not disappear overnight. Logan admitted falling into the same pattern himself, even knowing intellectually that models can now handle far more complexity. The practical advice is to push harder on scope and complexity with each request, treating it as deliberate practice rather than expecting the old habits to update automatically.
Why Prompt Engineering Was a Bug
The framing that stuck most from the conversation was Logan's characterization of prompt engineering as a bug rather than a feature. The entire practice emerged because models could not retrieve context on their own, forcing humans to do the work of assembling relevant information into a chat box.
"Your job as the human using AI systems was to do the context engineering," Logan observed. "Going and finding all these disparate sources, bringing it into a little chat box, and then sending it off to the model so it could do something useful."
The evolution toward better tools means that burden is shifting. Deep Research demonstrated the pattern: users submit loosely formed questions, and the system handles the context gathering, visiting hundreds or thousands of sites to assemble what it needs. The same dynamic now applies to coding tools that can search codebases, pull in relevant files, and construct their own context from a minimal starting prompt.
This trajectory suggests that skills, which have become central to [context engineering](/blog/context-engineering-guide) workflows, represent a transitional phase rather than an endpoint. They provide efficiency by front-loading context that models would otherwise need to discover through trial and error. But the longer-term direction points toward models that can generate and manage their own skills dynamically, pulling from repositories of domain knowledge or building what they need on the fly.
"Skills, I think in the short term, are a helpful way to just get around that," Logan noted. "But I would expect over time that the model just pre-writes a bunch of this stuff or pulls from some repository of domain authority skills."
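The shift Logan describes can be caricatured in a few lines of Python. This is an illustrative sketch only: the corpus, the keyword retriever, and every function name here are invented for the example, and a real system would use an actual model with search tools rather than naive keyword matching.

```python
# Toy contrast between manual context assembly (the prompt-engineering
# era) and a system that gathers its own context. Everything here is
# a hypothetical stand-in, not a real API.

CORPUS = {
    "auth.md": "Tokens expire after 3600 seconds.",
    "billing.md": "Invoices are issued monthly.",
    "limits.md": "The default rate limit is 60 requests per minute.",
}

STOPWORDS = {"what", "is", "the", "a", "an"}

def keywords(question: str) -> set[str]:
    # Strip punctuation and drop filler words.
    return {w.strip("?.,!").lower() for w in question.split()} - STOPWORDS

def old_workflow(question: str, hand_picked: list[str]) -> str:
    """Prompt-engineering era: the human finds the sources and
    pastes them into the chat box."""
    context = "\n".join(CORPUS[name] for name in hand_picked)
    return f"CONTEXT:\n{context}\n\nQUESTION: {question}"

def agentic_workflow(question: str) -> str:
    """Context-engineering era: the system retrieves context itself.
    Naive keyword search stands in for a model searching a codebase
    or the web on its own."""
    words = keywords(question)
    relevant = [t for t in CORPUS.values()
                if any(w in t.lower() for w in words)]
    context = "\n".join(relevant)
    return f"CONTEXT:\n{context}\n\nQUESTION: {question}"

q = "What is the rate limit?"
manual = old_workflow(q, hand_picked=["limits.md"])  # human did the work
auto = agentic_workflow(q)                           # system did the work
print(auto)
```

The point of the toy is the division of labor: in the first function the human supplies `hand_picked`, in the second the system discovers the relevant source itself from a loosely formed question.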
What Development Tools Will Look Like
The conversation addressed the proliferation of development environments, from IDE-based tools like Cursor to CLI approaches like Gemini CLI to browser-based builders like AI Studio. The question of whether these will converge or diverge prompted a clear prediction: divergence will likely increase rather than decrease.
Developer preferences for interaction patterns existed before AI tools, with some preferring IDEs while others used Vim or Emacs. AI adds another dimension of customization. Logan suggested that developers might eventually have highly personalised environments, potentially even forks of tools like VS Code that include custom agentic capabilities.
The cost of creating software continues to drop, which enables more experimentation and specialization. Rather than converging on a single optimal workflow, the ecosystem may produce an expanding variety of approaches tuned to different mental models, languages, and problem domains.
For organizations, this implies flexibility in tooling policies. Forcing a single approach may sacrifice individual productivity gains that come from letting developers work in their preferred environments. The tradeoff between standardization benefits and individual optimization is shifting toward the latter as tools become more capable.
Software Volume and Developer Value
Logan made a striking claim: software volume could be a million times larger in ten years than it is today. The natural follow-up question concerns what that means for developer value.
The answer draws on a distinction between coding as typing and software engineering as problem-solving. The value of typing characters that compile into programs is likely to decline. The value of understanding problems deeply, making architectural decisions, and having strong opinions based on experience is likely to persist or increase.
"AI doesn't minimise the value of that way of thinking about the world, the way of problem-solving," Logan argued. "I think it actually accelerates the value of it."
There will always be a frontier of capability beyond what average users of AI tools can achieve. The gap between what tools can theoretically do and what most people can accomplish with them creates space for developers who understand the systems deeply enough to push toward that frontier. Edge cases, debugging complex failures, and making technical decisions where no single right answer exists all require human judgment even when AI handles implementation.
The pedagogical implication matters as well. Computer science education that emphasises problem-solving approaches and ways of thinking transfers to AI-augmented development. Education focused narrowly on syntax and manual coding translates less directly.
AGI as Product, Not Model
The conversation touched briefly on AGI, where Logan offered a framework worth considering. His view is that AGI will emerge as a product experience rather than a model release. The distinction matters because products involve orchestration, UI, integration with other systems, and all the complexity that goes into making capabilities accessible and useful.
"I would be very surprised if we end up with just this chat UI where you go in and you just ask your question and everyone's like, this is AGI because it can do anything," Logan predicted. "I think it's going to probably be very orchestrated, with lots of different stuff, a ton of UI complexity, and all these different things."
This perspective aligns with what happened with coding agents. The models capable of writing code existed before the agent harnesses that made them useful in practice. The product layer that orchestrates model capabilities, manages context, handles tool use, and presents appropriate interfaces is what transformed potential into productivity.
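That orchestration layer can be sketched as a simple loop: the harness routes the model's tool requests, feeds results back, and repeats until the model produces an answer. Everything below is a hypothetical stand-in; `stub_model` is a hard-coded stub rather than a real LLM call, and the single `read_file` tool returns a canned string.

```python
# Minimal sketch of an agent harness: the product layer that sits
# between a model and its tools. All names here are invented.

def stub_model(history: list[dict]) -> dict:
    """Stand-in for a model: requests a tool once, then answers
    using the tool result the harness fed back."""
    tool_msgs = [m for m in history if m["role"] == "tool"]
    if not tool_msgs:
        return {"tool": "read_file", "args": {"path": "config.toml"}}
    content = tool_msgs[-1]["content"]
    return {"answer": f"The config sets {content.replace(' ', '')}."}

# Tool registry: in a real product, file search, web browsing, etc.
TOOLS = {"read_file": lambda path: "timeout = 30"}

def run_agent(question: str) -> str:
    """The harness loop: call the model, execute any requested tool,
    append the result to the context, and repeat until done."""
    history = [{"role": "user", "content": question}]
    while True:
        step = stub_model(history)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])
        history.append({"role": "tool", "content": result})

print(run_agent("What is the timeout?"))
```

Even in this toy form, the model never touches the filesystem: the harness executes tools and manages the growing context, which is exactly the product work the passage describes.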
For developers and companies building on AI, the implication is that capability improvements in underlying models matter, but so does the entire stack of product work that makes those capabilities usable. The gap between what models can do in principle and what products deliver in practice represents both an opportunity and a necessary investment.
The full conversation covers additional ground on Google's AI Studio platform, the evolution of SWE-bench and other benchmarks, and predictions about the lab landscape over the next five years. Worth a listen for anyone tracking how the major platforms are thinking about developer tools and the trajectory of AI capabilities.