
Why Your Agent Needs Memory, Not Just Context
Transcript
[00:00:00] Simon Maple: Before we jump into this episode, I wanted to let you know that this podcast is for developers building with AI at the core. So whether that's exploring the latest tools, the workflows, or the best practices, this podcast is for you. A really quick ask: 90% of people who are listening to this haven't yet subscribed.
[00:00:24] Simon Maple: So if this content has helped you build smarter, hit that subscribe button and maybe a like. Alright, back to the episode. Hello, and welcome to another episode of the AI Native Dev. My name is Simon Maple. I'm your host for the show, and joining me today is Richmond Alake, who is the Director of AI Developer Experience at Oracle.
[00:00:43] Simon Maple: Welcome, Richmond. On this episode, we're going to be talking about context management and memory management, and where you draw the line between the two. We were also going to be talking about whether developers should experiment with their data storage and also a wonderful question about how we differentiate between onboarding humans and onboarding AI agents.
[00:01:01] Simon Maple: Richmond, welcome to the episode.
[00:01:03] Richmond Alake: Thank you for having me, Simon.
[00:01:04] Simon Maple: Oh, it's an absolute pleasure to have you, Richmond. And you were in London, so we thought we had to invite you around to the podcast studio. And look, we have got a beautiful, nice new background for our podcast studio.
[00:01:16] Richmond Alake: That's very good. I'm jealous.
[00:01:17] Simon Maple: I know, right? I think I need to take some of this home and put it in my office as well. It's making my home office look bad. So Richmond, let's give a little bit of background about yourself. Let's talk a little bit about your career and things like that.
[00:01:29] Simon Maple: You worked with many companies such as Nvidia and Neptune AI in the past, and you previously worked for MongoDB before joining Oracle. Now, I am a big Java person in my background, so I worked with Oracle a lot on the Java side and things like that. But today you are the Director of AI Developer Experience at Oracle.
[00:01:50] Simon Maple: Tell us a little bit about what is happening in Oracle in and around AI, but tell us a little bit about what an AI developer experience role looks like.
[00:02:03] Richmond Alake: Yeah, so again, thanks for having me on. We're going to have a very good conversation.
[00:02:08] Richmond Alake: So yeah, you're right. I've had a very interesting trajectory across my career, but one thing is very important across the different companies I've worked at: I've always kept the developer front and center of everything I do.
[00:02:26] Richmond Alake: When I worked over at Nvidia, it was as a writer when I was speaking to machine learning engineers and data scientists, and I was writing about algorithms and all the good stuff we had around computer vision and deep learning at the time. And in MongoDB, I was a developer advocate. So again, focused on developers. And what that led me to is to understand what developers really want from technology and how they want to use technology and experience technology, especially within the domain of AI.
[00:02:50] Richmond Alake: And that's led me to be one of the key folks within Oracle Database that helps shape our products and features around reaching developers.
[00:03:00] Simon Maple: That's pretty cool actually, that you went from Mongo to Oracle as well. Because I remember back in the day, of course, when document-style databases came out, you saw the big rush to folks like Mongo and stuff like that.
[00:03:11] Simon Maple: But obviously, Oracle is so present in so many different organisations. It's a big area. Now if we think about one area that you're very familiar with today, it's agent memory. In fact, why don't we start by defining a few things? When we talk about agent memory and memory management, what do we actually mean?
[00:03:31] Richmond Alake: Okay. So with agent memory, it's actually a term that describes a bunch of systems and techniques working together to allow agents to remember and adapt.
[00:03:44] Richmond Alake: As simply as I can keep it. But if we just double-click into the systems and the techniques: the systems within agent memory are your embedding model, which does the encoding of different data objects into numerical representations that allow you to do semantic search.
[00:03:58] Richmond Alake: Then you have your database, which stores all the information and allows you to retrieve information as well. Then you have your LLM, which has a form of parametric memory, which is static. It doesn't change unless you fine-tune it.
[00:04:13] Richmond Alake: So these are the systems. Then techniques are what we talk about, such as context management, memory engineering, and all the context engineering techniques.
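The systems Richmond lists here, an embedding model that encodes data for semantic search plus a store to retrieve from, can be sketched in miniature. The bag-of-words `embed` below is only a toy stand-in for a real embedding model; the function and variable names are illustrative:

```python
from collections import Counter
import math

def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    # A production system would call a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

memory_store = {}  # the "database" system: doc_id -> (raw text, encoding)

def remember(doc_id, text):
    memory_store[doc_id] = (text, embed(text))

def recall(query, k=1):
    # Semantic search: rank stored memories by similarity to the query.
    q = embed(query)
    ranked = sorted(memory_store, key=lambda d: cosine(q, memory_store[d][1]), reverse=True)
    return ranked[:k]
```

The third system Richmond names, the LLM's parametric memory, sits outside this sketch: it is baked into the model's weights and only changes with fine-tuning.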
[00:04:21] Simon Maple: And so when we talk about the context management or context engineering, what's the difference between context management and memory management? And I guess we have prompt engineering as a role that people are like, "Oh, we need a prompt engineer for this."
[00:04:37] Simon Maple: Do you feel like, or have you seen, and do you feel like there's space for a context engineer role or a memory engineer role?
[00:04:44] Richmond Alake: So, one thing about AI in the field is it's moved so quickly. And people are trying to understand this space. And for developers, it's trying to understand the job to be done.
[00:04:54] Simon Maple: Yeah.
[00:04:55] Richmond Alake: And because we're moving so quickly, we tend to just latch onto something that we think is something and then start to define roles around it. Prompt engineering emerged probably like two years ago, which is nothing.
[00:05:08] Simon Maple: Yeah, right. It feels like about two decades.
[00:05:11] Richmond Alake: It feels like about two decades in AI terms.
[00:05:12] Richmond Alake: Yeah. And "prompt engineering" was meant to describe the key job to be done that developers were doing, which was actually using a bunch of linguistic patterns to actually steer the LLMs into a desired outcome. Right. We wanted the output to have a certain representation that would come out in a certain way, and we started to put in a bunch of prompts, and prompt engineering became the job to be done.
[00:05:36] Simon Maple: Mm.
[00:05:36] Richmond Alake: But the reasoning capabilities and the abilities of these large language models actually improved, and that led us to concentrate more on what we actually put in rather than the robustness of the prompt itself. So we went into context engineering, which is actually thinking very systematically about the information you're curating to pass into the context window. Because the context window is limited.
[00:06:03] Simon Maple: Mm-hmm.
[00:06:04] Richmond Alake: So to answer your question, are there going to be context engineers and memory engineers? I would say yes. But would they be called memory engineers or context engineers?
[00:06:17] Richmond Alake: It's something we're yet to see. But memory engineering I could speak to because it's nothing new.
[00:06:21] Simon Maple: Hmm.
[00:06:21] Richmond Alake: Memory engineering is just a cross-section between a couple of disciplines, which include database engineers, search optimisation engineers that work within the database domain, software engineers, and a bit of AI engineering.
[00:06:35] Richmond Alake: If you just cross-pollinate some of the techniques and disciplines within the roles I mentioned, you're going to go into memory engineering, which is a set of engineers that really focus on optimising the retrieval pipelines or the latency of the retrieval pipelines within generative systems. And that's what they solely focus on.
[00:06:54] Simon Maple: Yeah.
[00:06:55] Richmond Alake: But they need to understand software engineering and databases as well.
[00:06:58] Simon Maple: And would you say memory management is a superset of context management generally?
[00:07:03] Richmond Alake: Yeah.
[00:07:03] Simon Maple: Generally.
[00:07:04] Richmond Alake: It's an interesting debate, right? Which one encapsulates the other?
[00:07:08] Richmond Alake: Is it memory or context? So look, I'm very biased, right? I see everything as memory. And that's because I always look to how humans work and use human analogies to actually describe what I'm seeing in tech today because I communicate with both technical and non-technical people.
[00:07:29] Richmond Alake: So when I use human analogies, folks understand me. The general populace understands me. Yeah. So memory actually is a good way to bridge the gap between people's knowledge of non-technical and technical. So when I say "memory" to my mom or my grandma, they understand what I'm talking about. But if I say "context," they don't have context.
[00:07:49] Simon Maple: They don't have context.
[00:07:50] Richmond Alake: Exactly. There we go. But memory really brings people into that same understanding landscape where you can start to build upon it. So when I say, "Hey, we have agents in our phone, and they need to remember," every piece of information you're giving to an agent, that's actually memory.
[00:08:09] Simon Maple: Yeah.
[00:08:10] Richmond Alake: So it starts to allow you to think about ways you can improve the chances of that specific information being recalled. You start to think about ways you can optimise your retrieval pipelines. So this is where we're heading, and that's why I see everything as memory. But you can argue maybe everything is context.
[00:08:30] Simon Maple: It's an interesting way to frame it. I guess it massively depends on your perspective in terms of how you look at it.
[00:08:35] Richmond Alake: Exactly.
[00:08:36] Simon Maple: One thing I'd love to chat about: we've talked about skills now for the last three or four weeks on the podcast.
[00:08:44] Simon Maple: I'd love to hear your thoughts in terms of, okay, skills, where do they sit in relation to context? Even at Tessl, we have that question: when we put stuff on the website, should we say "skills" and then "context"? The parts overlap as well.
[00:09:00] Simon Maple: Where do skills sit in?
[00:09:02] Richmond Alake: It is very important to get the naming right. Within this space, naming is hard. It is very hard. And because it moves so quickly, you are like, "Oh, okay, we cannot use that anymore this month."
[00:09:11] Simon Maple: We will say skills.
[00:09:11] Richmond Alake: We will say skills; next month we will say something else.
[00:09:14] Richmond Alake: And no one was saying skills early last year, right? But where do we see skills in this whole thing? Again, I am a very simple man. Let me go back to humans. Skills are nothing new. Humans have the equivalent, and they are called SOPs: Standard Operating Procedures, which are documentation that you have within organisations that define how you do certain tasks.
[00:09:37] Richmond Alake: And the reason why we have SOPs is because you might have a certain way of doing things, and let us say you hire someone else and you want to be able to transfer that knowledge to someone. The best way we saw as humans to do that was to document it, and really document it in a structured way where you define the step-by-step process you have to take to achieve a certain outcome.
[00:10:00] Richmond Alake: And we called them SOPs. We have them in organisations today. Skills are SOPs for agents.
[00:10:09] Simon Maple: Yeah.
[00:10:09] Richmond Alake: Right. It is just a way of telling an agent, "There is a task, and I am going to give an arbitrary name to the task, and then I am going to describe the task at a certain length.
[00:10:22] Richmond Alake: Then I am going to give you step-by-step instructions and maybe the locations of tools and scripts or APIs that allow you to achieve the outcome that we want from this task execution." And that is what skills are.
[00:10:44] Richmond Alake: So where do they fall into the domain? If you come into the world of agent memory, you are going to realise that skills are actually a form of procedural memory. Humans have procedural memory. In our brain, there is a part of our brain responsible for storing and understanding skills that we have. For example, if you can do a backflip, awesome if you can do a backflip. I do not know if you could do a backflip, Simon.
[00:11:00] Simon Maple: I wish. No, not at my age. Come on. I put my back out getting my dog on the lead these days. I do not know if I could do a backflip.
[00:11:08] Richmond Alake: I do not know; I have seen some old people do backflips. But if you can do a backflip, where is that knowledge actually stored? There is something called a cerebellum in your brain.
[00:11:18] Richmond Alake: And that stores a bunch of procedural knowledge that you have: the ways you do routines, tasks, and skills. If you look at things from the agent memory perspective, skills are procedural knowledge for your agents.
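Richmond's description of a skill, an arbitrary task name, a description, step-by-step instructions, and the locations of tools, maps naturally onto a small data structure. A minimal sketch (the field names are illustrative, not any particular framework's schema):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Skill:
    """An SOP for an agent: a named task, a description, steps, and tool locations."""
    name: str                 # arbitrary name for the task
    description: str          # what the task achieves
    steps: List[str]          # step-by-step instructions
    tools: List[str] = field(default_factory=list)  # scripts/APIs the agent may call

    def to_prompt(self) -> str:
        # Render the SOP as text the agent can load into its context.
        lines = [f"Skill: {self.name}", f"Description: {self.description}", "Steps:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.steps, 1)]
        if self.tools:
            lines.append("Tools: " + ", ".join(self.tools))
        return "\n".join(lines)
```

Rendering the skill to text before handing it to the agent mirrors how an SOP document is handed to a new hire: structured, but ultimately consumed as prose.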
[00:11:30] Simon Maple: I love the parallels that you are drawing between humans and agents here.
[00:11:34] Simon Maple: In fact, I will call out our marketing team here, do not tell them I like marketing messages because I will lose all credibility, but one of the things that they mentioned was effectively saying if you had a new person join the team.
[00:11:51] Richmond Alake: Yeah,
[00:11:51] Simon Maple: A human, let us call it.
[00:11:52] Simon Maple: If you had a human join the team, you would have to onboard them. And one of the things that we are doing obviously with context is we are onboarding agents. And so it is about how humans need onboarding, and agents need that onboarding too. You have to provide them with that context. You have to provide them with those skills and things like that.
[00:12:08] Simon Maple: It is exactly the same thing. And one of the things that I love about what you said was how organisations already have those procedures, the standard operating procedures. They already have them written down.
[00:12:24] Richmond Alake: Yeah.
[00:12:25] Simon Maple: If they want consistency across their organisation, they need to explain to developers, "This is the way we do things.
[00:12:31] Simon Maple: And if you want to follow a procedure, this is how you do it. "All you really need to do is take that and 'SOP-ify' it, if that is a word. It is now; if we use it enough, it will be in the Oxford English Dictionary. So it is really about wrapping it up. To expect a human to be able to perform the way that we want them to, or to expect a new developer to do that without giving them that information.
[00:12:56] Simon Maple: We would think, are we being stupid trying to expect a human to do that? Why would we ever expect an agent to do exactly the same without providing it that context, that memory, or that skill?
[00:13:05] Richmond Alake: Yeah.
[00:13:05] Simon Maple: That leads us on to what agents need to be successful. I think a lot of the time we blame agents unnecessarily, whereas actually we are not giving the agent the level of context or the level of information it needs to actually be successful.
[00:13:18] Simon Maple: In the same way, if I just said to a new developer who is joining the team, "I need to do these things," but I did not give them the information, and then they come back and say, "Well, I have done this thing, and it satisfied what you asked."
[00:13:34] Simon Maple: I might say, "No, we need to do it this way. Where are the tests? We need to follow this style guide," et cetera. We would not do that to a person. So what do we need to provide to an agent so that it can be successful?
[00:13:45] Richmond Alake: So, a very simple answer to that is information.
[00:13:51] Simon Maple: Mm-hmm.
[00:13:52] Richmond Alake: We need to provide agents as much information as we can.
[00:13:54] Simon Maple: Mm-hmm.
[00:13:55] Richmond Alake: Then the second answer to that question would be data. Information is data, and organisations are sitting on top of tons and tons of data.
[00:14:07] Simon Maple: Mm-hmm.
[00:14:08] Richmond Alake: Then you come into my world, and I just tell you, "You need to give them memory."
[00:14:13] Richmond Alake: You need to give them existing memory that contains information that you have previously, and you need to give them a robust system that enables them to actually adapt and continuously evolve and learn.
[00:14:30] Simon Maple: Mm-hmm.
[00:14:30] Richmond Alake: I happen to work at Oracle, and it will be no surprise that pretty much all the Fortune 500 companies, or most of them, are using an Oracle database.
[00:14:41] Simon Maple: I heard Oracle does some work in data.
[00:14:44] Richmond Alake: Yeah, I heard we did.
[00:14:45] Simon Maple: I thought it was a rumor, but yeah.
[00:14:47] Richmond Alake: Well, the thing about the unique position of where Oracle is, and this is very rare for any company in tech, it has existed for over four decades.
[00:14:56] Simon Maple: Mm-hmm.
[00:14:58] Richmond Alake: And across those four decades, we have seen different crucible moments and paradigm shifts. We have seen the internet era, the cloud transformation, and the data era. Now we are in AI, but because we are students of change itself, we understand what we need to do to evolve and meet the needs of our customers and developers.
[00:15:19] Richmond Alake: And that is why most of the data you see in Fortune 500 companies is sitting in Oracle. So Oracle understands data. That means we understand what agents need to actually be successful, which is memory. And we understand the robustness and the techniques we have to build around the agent itself to allow it to continuously learn.
[00:15:44] Simon Maple: Now, a couple of points here. Because I think one interesting thing, which I think humans are very good at, and which agents are perhaps not as good at, is as a human I can kind of understand when I am being overwhelmed with data.
[00:16:00] Richmond Alake: Mm-hmm.
[00:16:00] Simon Maple: I can also understand when I need to seek data and when I make a decision as to, "Do you know what?
[00:16:06] Simon Maple: I may have enough information to just about be able to complete this, but I know I am not going to be able to complete it well without more information." Agents, on the other hand, when you overload them with data, you see their performance degrade. And sometimes they get just enough data that they can possibly form an answer.
[00:16:26] Simon Maple: Sometimes they rush that and jump into the answer before actually getting all the data they need. And I guess one of the wonderful things about Oracle customers is that they will have so much data.
[00:16:45] Simon Maple: How does that... I guess my two questions are: how do you get an agent to the right data given that there is so much? And let us start with that one. My second one as a follow-on will be: how do you make sure that the agent is smarter about getting enough data so that it is not so much that it overwhelms it, but it is not so little that it does not have the right information and capability to actually perform the task well?
[00:17:10] Richmond Alake: So your first question was: how do we get an agent to the right data?
[00:17:14] Richmond Alake: And so I will answer your first question, which is: it is a very hard thing to solve. And as we were talking about a few days ago, OpenAI put out a piece on how they built their data agent. And one of the techniques they mention is a single agent that actually scans the data, and it works at different levels.
[00:17:37] Richmond Alake: So one way, and this answers the question, and one way the agent actually works is, firstly, it scans through all the tables in the database to understand what they have in the schema definition and how they relate to each other. And that allows the agent to understand how data is stored and how it is used.
[00:17:58] Richmond Alake: But then they have the same agent scan through the code that is used, the Python code, and the Java code that is used to actually construct these tables and manage these tables. Now the agent understands the thinking that went behind the generation of these tables, or what sort of data is stored, or the transformations that happen.
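The first level of the scan described here, reading every table's schema so the agent understands how data is stored and related, can be sketched with Python's built-in sqlite3 as a stand-in database. The OpenAI piece does not publish code, so this is only an illustration of the idea:

```python
import sqlite3

def describe_schema(conn: sqlite3.Connection) -> dict:
    """Scan every table so an agent can learn how the data is laid out."""
    summary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        summary[table] = [(name, col_type) for _, name, col_type, *_ in cols]
    return summary
```

A second pass over the application code that builds these tables, as Richmond describes next, would then add the "why" behind each column.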
[00:18:18] Simon Maple: Mm-hmm.
[00:18:18] Richmond Alake: So how do you get an agent to the right data? It will be the answer to the last question, which was giving the agent a lot of memory of what has been happening in your organisation. You have a bunch of institutional knowledge, and you have a bunch of information and tribal knowledge. The best thing you can do is have a very secure agent go through and traverse this information.
[00:18:44] Simon Maple: Mm-hmm.
[00:18:44] Richmond Alake: It has the capacity to understand, and when it is doing this traversal of information, it is able to store it in a system that allows it to retrieve information efficiently. This is what I would say. In production, the data you encounter has a heterogeneous nature.
[00:19:05] Simon Maple: Mm-hmm.
[00:19:05] Richmond Alake: So you have vectors, you have JSON data, and you have unstructured, structured, and semi-structured data.
[00:19:14] Richmond Alake: You have knowledge graphs, but you just need to really equip your agents with the capabilities to work with all of this different variety of data in terms of storage and retrieval, and allow your agent to be able to scan through different aspects of your enterprise data securely.
[00:19:40] Richmond Alake: When you actually do this properly with a very good, robust agent memory system, you put yourself on a trajectory of really building a reliable, believable, and capable agent. It is not going to be one shot. We are not going to one-shot our way through this. It is going to be an iterative, experimental process of improving the agents we have.
[00:19:58] Simon Maple: And "securely" was a really key word there as well.
[00:20:01] Simon Maple: Because obviously there is so much data. How do you allow the agent to request the permissions that a user should...?
[00:20:08] Richmond Alake: It is such a huge room for us in Oracle because, again, we have existed for over four decades, and you can imagine the amount of times we have seen the nature of privacy change.
[00:20:22] Simon Maple: Mm-hmm.
[00:20:24] Richmond Alake: The requirements to actually secure data change as well. Different regulations from different governing bodies on how data should be stored and retrieved securely as well. And this is embedded in Oracle.
[00:20:37] Simon Maple: Yeah.
[00:20:39] Richmond Alake: Or I should say, the Oracle Database.
[00:20:39] Simon Maple: You have all the usual role-based access built in. So it is about the agent space.
[00:20:46] Richmond Alake: So when I speak to developers and AI developers today, I'm going to be very honest: on the list of things we care about, security comes in...
[00:20:56] Simon Maple: You surprise me. You surprise me.
[00:20:58] Richmond Alake: Second
[00:20:59] Simon Maple: To last. Second to last. I thought you were going to say "second" for a minute.
[00:21:01] Richmond Alake: Say it comes second to last.
[00:21:02] Simon Maple: Oh, come on. We can't. What? What's last? Well,
[00:21:06] Richmond Alake: To be honest, I don't know.
[00:21:07] Simon Maple: That's going to embarrass the team, whatever it is that's last.
[00:21:10] Richmond Alake: The second thing is, you have to think about, and when I say AI developers, I mean AI developers within the industry or across it. When I speak to them, and you can see it, just look at today, right?
[00:21:21] Richmond Alake: The popular thing we have now is OpenClaw, or Clawdbot, or Moltbot, whatever the name is now.
[00:21:30] Simon Maple: Yeah.
[00:21:30] Richmond Alake: The security issues with that.
[00:21:35] Simon Maple: Yeah.
[00:21:35] Richmond Alake: It's an open-source framework, and that's all I'm seeing on my timeline.
[00:21:38] Simon Maple: Yeah.
[00:21:39] Richmond Alake: Because we are AI developers, we love to experiment. We love to get stuff working, but one thing I actually see with Oracle developers is this: because of Oracle, we take security and privacy.
[00:21:53] Richmond Alake: We keep it number one.
[00:21:54] Simon Maple: It's third from the bottom. Oh no, it's number one.
[00:21:56] Richmond Alake: Okay. It's number one. Security is number one for Oracle. Yeah. And data privacy is number one for Oracle. Yeah. And when you use an Oracle database, it means that you can move as fast as you want and you get to worry less because it's built in.
[00:22:10] Richmond Alake: Security is built in; it's developed already. So you know that you have that infrastructure for security. In the Oracle database. In Oracle Cloud. Yeah. Which allows your developers to just do what they do best.
[00:22:24] Simon Maple: Nice.
[00:22:25] Richmond Alake: Just move fast and innovate and experiment, because AI is very experimental in nature.
[00:22:31] Simon Maple: So let's talk about that user, a single user. We'll start with a single user, and then we'll jump into kind of multi-user and team. As a single user, what do you see as the typical context that will make a single user more effective?
[00:22:43] Richmond Alake: A single user. So when you say a single user, do you mean a single user interacting with an agent, or an agent understanding a single user?
[00:22:49] Simon Maple: A single user interacting with an agent outside of a team. Okay.
[00:22:51] Richmond Alake: So let's say a single user interacting with an agent. And I will speak from the agent perspective.
[00:22:57] Simon Maple: Let's do some role play. You be the agent.
[00:23:00] Richmond Alake: I am the agent. I'll be the agent; you be the user. Okay. This is good. So one thing that agent will actually need is a form of memory we refer to as entity memory.
[00:23:09] Richmond Alake: So the agent needs to have a very good understanding of Simon, and it can get that in two ways. One, we can front-load this information about Simon, because you have data in all the tools you're using, and we can just ingest that into the agent's memory, and it's able to then create a persona of you within its entity memory.
[00:23:33] Richmond Alake: That's where we're going to store it. We store it as an entity. Humans have this as well. We're going to store it as entity memory, or we can just have back-and-forth interaction, and we begin to formulate an understanding of Simon within the agent's entity memory. Or we have both, where we start with some information about Simon and we build on top of that. We evolve what we know about Simon.
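The two routes Richmond describes, front-loading existing data about a user and then evolving the persona through interaction, might look like this in miniature (the class and method names are illustrative, not any product's API):

```python
class EntityMemory:
    """Stores what an agent knows about each entity (e.g. a user)."""

    def __init__(self):
        self.entities = {}  # entity name -> profile dict

    def frontload(self, name, profile):
        # Route 1: ingest existing data about the entity from other tools.
        self.entities.setdefault(name, {}).update(profile)

    def observe(self, name, key, value):
        # Route 2: evolve the persona from back-and-forth interaction.
        self.entities.setdefault(name, {})[key] = value

    def recall(self, name):
        return self.entities.get(name, {})
```

The third option Richmond mentions, starting with some information and building on top of it, is just both methods used together: `frontload` once, then `observe` on every interaction.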
[00:23:58] Simon Maple: And this would be things like my preferences, my way of working, my stack, my capability with certain languages. Yes. The libraries I like to use.
[00:24:07] Richmond Alake: Exactly.
[00:24:07] Richmond Alake: And it's very important we start to identify the distinction of these different types of memories because.
[00:24:13] Richmond Alake: We don't need to reinvent the wheel here.
[00:24:16] Richmond Alake: Neuroscience has been trying to do this for decades, right? Neuroscience has been trying to understand the human brain, and the best way that they've seen they can understand it is by really distinctively identifying what parts of the human brain are responsible for what.
[00:24:31] Richmond Alake: So that way you can actually have a table within the Oracle database just for entity memory.
[00:24:38] Simon Maple: Mm.
[00:24:38] Richmond Alake: And in there you have Simon; you have a row for Simon, and it has all your likes and preferences. And you have different forms of data. You can have JSON data, and you can have an embedding representation of your likes and preferences.
[00:24:53] Richmond Alake: And that allows us to do semantic search whenever you ask a query, and all of this information is how you can start to make a single agent be useful.
[00:25:01] Richmond Alake: For an end user. Because with entity memory, you solve one thing, which is the believability of the agent. How believable is it that I'm interacting with an intelligent entity?
[00:25:12] Richmond Alake: And you see this in a lot of commercial tools today. So if you use ChatGPT, I always get this box that says, Do you like this personality?
[00:25:22] Simon Maple: Yeah.
[00:25:22] Richmond Alake: I'm like, no.
[00:25:24] Simon Maple: Yeah,
[00:25:25] Richmond Alake: I'm joking.
[00:25:26] Simon Maple: Do better. Do better.
[00:25:26] Richmond Alake: Yeah.
[00:25:27] Simon Maple: Yeah.
[00:25:27] Richmond Alake: But you see that even people on the frontier of this space are trying to understand how we start to make these agents and computational entities believable.
[00:25:37] Simon Maple: Mm-hmm.
[00:25:37] Richmond Alake: We're trying to understand a persona, and we're trying to modify this, and this is entity memory.
[00:25:42] Simon Maple: You know, the agents are being asked the same questions, right? They're like, do you like the personality of this user? No, I would like a new user, please. Yeah. So I think from the point of view of, so what we've talked a lot about, there is kind of like a way of working, right?
[00:25:56] Simon Maple: Yeah. So it's almost like how effective a user can be with their agent is based on, because ultimately, the user is the reviewer there, right? The user is the person who, when the agent provides something, says it works, but I want it to work like this. And so knowing and understanding those ways of working, the personality, and how a person wants to see things done will allow the agent to provide the user with an outcome that the user will be happy with.
[00:26:23] Simon Maple: I guess there's also very objective answers where it's like this is true or this is false. And that's another style of context, which is potentially a truism. Like if you're going to use this library, this is the API. Right. We're talking maybe through hallucinations or version problems, but it's about giving context to the agent to make it actually do things accurately.
[00:26:46] Simon Maple: And so if we think about the lifecycle of this context, the first kind we talked about probably changes a lot less frequently, because the way someone wants to do something is more about who they are and the way they like to work than an API, which may change two months later.
[00:27:08] Simon Maple: And then again another week later, and another two weeks after that. And you get updates that the agent needs to be aware of. So when we think about context generally, what do we think of as the lifecycle or the longevity of a piece of context? And what do we need in place to keep up with the changes, the updates, the refresh of that context?
[00:27:35] Richmond Alake: I'm going to answer your question, but I'll answer it by talking about the lifecycle of memory; you can switch out memory for context. So this answer applies to the lifecycle of context and the lifecycle of memory alike.
[00:27:48] Simon Maple: Mm-hmm.
[00:27:49] Richmond Alake: The first thing when we think about the lifecycle, and now I'll speak from a technical perspective, is we need to ingest data, right? So ingest context, ingest memory. Most data scientists and data analysts will be familiar with this. You ingest data, you do data cleaning, and then you start to do encoding. This is where we start to pass that data through embedding models and generate different representations.
[00:28:15] Richmond Alake: And then we go into the storage of this data, and then you store it in an Oracle database. And then we start to think about how we can retrieve the data. Do we want to retrieve it using vector search? Do we want to retrieve it using normal lexical search or maybe both, which we would call hybrid search?
[00:28:34] Richmond Alake: But then this is where you start to see a difference from traditional data pipelines: you have to have a way to forget.
[00:28:43] Simon Maple: Mm-hmm.
[00:28:44] Richmond Alake: And that's very important. When you work with context, forgetting or suppressing information doesn't come naturally. But when I speak to developers and I say, if you think about things from the perspective of memory, you immediately start to know that you need to implement a way of forgetting information.
[00:29:15] Simon Maple: Mm-hmm.
[00:29:16] Richmond Alake: And in that lifecycle, you need a way to suppress information, or a way to make how information is recalled be affected by other information within the storage. There's a paper called Generative Agents by some folks at Stanford, from 2023, where they talk about ways to forget information or forget memory, and they have this weighted score you can add to the attributes of each memory unit. But this is the lifecycle here.
[00:29:49] Simon Maple: Mm-hmm.
[00:29:49] Richmond Alake: So we ingest the data, encode the data, store the data, retrieve the data, and forget or remember or reinforce the remembrance of the data, and then we go through that cycle again. And there is an augmentation aspect of it.
[00:30:03] Richmond Alake: So not so long ago in computer vision or deep learning, we had a technique called data augmentation. So this is where we introduced diversity into our dataset by augmenting the image.
[00:30:27] Richmond Alake: So we rotated the image to let the convolutional neural network see a different perspective of ways that an image could be presented to it in production. This is called data augmentation. We can leverage those techniques in memory engineering or in agent engineering today because the information that you get out of an LLM can actually be augmented by an LLM looking at the domain and looking at the information.
[00:30:52] Richmond Alake: And you can prompt an LLM to put in more information that might be missing or might be important to include, and then store it again, which will improve the way that information is recalled or forgotten. I do this with MCPs or scripts that I have, where I have functions or MCPs within my system resources. And I use an LLM to augment the description of the function.
[00:31:30] Richmond Alake: And understand the function itself. And then I use an embedding model to encode this information, to generate an embedding representation, a much richer embedding representation of the script so that I can then use semantic discovery for some of the tools I have within my system whenever I'm working with an LLM.
[00:31:55] Simon Maple: Yeah, really interesting. And I think so far we've talked about single-user.
[00:32:01] Richmond Alake: Yeah.
[00:32:01] Simon Maple: When we move to multi-user, the thing that makes this problem a little bit smaller for the single individual is when there is context that goes out of date, it's the individual that can identify and fix it.
[00:32:15] Simon Maple: or prompt things to update and fix the data. And when we get into the multi-team perspective, and you have a team of 10, 20, or 30, how big is Oracle these days?
[00:32:31] Richmond Alake: It's bigger than you can imagine. But still operates at the speed of innovation that you would see at any startup today.
[00:32:40] Simon Maple: Good answer. Good answer. So when we talk about bigger teams, you then have the problem whereby different people could have different contexts if they're holding things locally. It can get stale. I guess there's the problem then of discovery and sharing across teams.
[00:32:59] Simon Maple: What would you say are some of the other use cases or things that we need to be mindful of when thinking about a multi-team AI developer organisation whereby context is important and context needs to be shared? How do we go about that?
[00:33:15] Richmond Alake: The difference between now and the days before, when you had large companies and people had tribal knowledge, and when they left, the knowledge left, is that agents can be omnipresent.
[00:33:32] Richmond Alake: When you start to see success, when the agent is going through processes or workflows or messages, and it's starting to see successful events or see other workers do successful events, what the agent can do is document that for you and store that within some form of skill.
[00:34:01] Richmond Alake: Maybe generate a new skill MD for this specific workflow that wasn't documented by the person that did the successful task, or you could just put it in workflow memory essentially.
[00:34:23] Richmond Alake: Agents today are omnipresent. They can go through different systems and understand different systems, and they can see interactions within a secure environment. Therefore, we don't have the problem of loss of knowledge.
[00:34:43] Richmond Alake: If you do this properly, you start to have a scribe that is following you around, always writing down things you're doing successfully, or even things you're not doing successfully, and then share ways to improve. That's the difference.
[00:34:54] Simon Maple: I love that loop of constantly trying to optimise.
[00:34:57] Richmond Alake: Exactly.
[00:34:58] Simon Maple: Informed maybe even by how we react or how the agent does, finding issues as well. Let's talk a little bit about the database.
[00:35:10] Richmond Alake: Yeah.
[00:35:10] Simon Maple: Let's talk a little bit about the file system as well.
[00:35:13] Richmond Alake: Yeah.
[00:35:13] Simon Maple: Because a lot of the time agents love to see what's on the file system.
[00:35:18] Richmond Alake: Yes.
[00:35:19] Simon Maple: And sometimes when we say, "Can you use this MCP server over here?" if it has something locally, it can sometimes grab that.
[00:35:23] Simon Maple: And if it's still valid context, that might be okay. But what would you say are the differences or benefits of the approach of a file system versus context that is from a database?
[00:35:39] Richmond Alake: The key benefit of using a file system that I've seen with AI developers today is the speed at which they can build.
[00:35:49] Richmond Alake: When you don't have to worry about infrastructure components and tool selection, you don't have to worry about what database I should be using or what tools I should have in my stack, and you're just using files, you remove the thinking around tool selection. The speed is there.
[00:36:08] Simon Maple: Yeah.
[00:36:09] Richmond Alake: But that speed comes at a cost.
[00:36:11] Simon Maple: Mm-hmm.
[00:36:12] Richmond Alake: And the cost is security.
[00:36:15] Richmond Alake: Right. The cost is what we've spent over four decades really solving effectively. Now, a file system is a good interface for LLMs and agents in general, because they're trained on a bunch of shell commands and bash commands and all of this information, like how to read and write files, right?
[00:36:41] Richmond Alake: That's part of the training data. So they have this natural affinity for working with files that we're taking advantage of today. But it shouldn't stop there because, again, we've come a long way from the seventies. And I wasn't alive in the seventies.
[00:36:59] Simon Maple: I was just,
[00:37:00] Richmond Alake: Yeah, you were just alive. So maybe you remember, I don't know. But there was a time in the seventies when there was this phrase from the Unix philosophy: "Everything is a file," or "Everything is a file descriptor." And whenever I look at the space of technology, it's almost like Groundhog Day, because the same things keep repeating now and again.
[00:37:24] Richmond Alake: In 2026, I'm seeing articles where everyone is saying files are all you need. I'm like, "Yeah, we did that in the seventies, and I think we've come a long way since then." We realised that you actually need concurrency, you need ACID transactions, you need security, and you need data privacy. And when you start to actually implement all of this on top of file systems, you realise that you've just built your own database.
[00:37:52] Simon Maple: Yeah. On the file system. Yeah,
[00:37:53] Richmond Alake: Absolutely. Exactly. So this is what I tell my developers.
[00:37:56] Simon Maple: Yeah. Awesome.
[00:37:58] Richmond Alake: Don't experiment with your tool selection.
[00:38:03] Simon Maple: Right.
[00:38:04] Richmond Alake: The same reason why file systems are very useful in terms of the natural affinity with the LLM or how easy it is to interface with it is the same reason why Oracle Database is very useful for most enterprise AI workloads.
[00:38:21] Richmond Alake: We have all that you need to really build these agents today. Any form of data can be stored in Oracle. Not just JSON data, not just relational data, not just knowledge graphs. You can store all of that data in Oracle and have all the retrieval mechanisms for each of those data types. So it falls into that category of a truly general-purpose database.
[00:38:48] Richmond Alake: At the same time, we give you that enterprise-grade security. So you can have the Oracle AI database on your local machine, work with it with your agent, and you start to get all of the richness that we've spent four decades just actually working on.
[00:39:06] Simon Maple: Yeah. What would you say, looking forward now, to wrap up? Where would you say context and memory management are going as models improve? Because I remember a couple of years ago we were talking about prompt engineering all the time, and I don't think I've ever said "prompt engineering" as much as I have in this last hour or so.
[00:39:28] Simon Maple: But when we talk about context engineering so much now, context engineering is the hot topic, or memory engineering is the hot topic for today to make agents more accurate, friendlier to work with, and less frustrating. What's the future path of this?
[00:39:51] Simon Maple: Do you feel as models get better, there is a space where actually we rely on context a lot less? Or do you feel like there's no way models can really, truly understand what we as individuals or what we as an organisation or team want to do without this data?
[00:40:07] Richmond Alake: I think the space of context and memory engineering, you are putting me into that prediction thing, right?
[00:40:13] Simon Maple: Yeah.
[00:40:14] Simon Maple: Yeah. We're going to,
[00:40:15] Richmond Alake: Predictions in AI are so bad because they're usually wrong.
[00:40:18] Simon Maple: We'll have you back in a year.
[00:40:20] Richmond Alake: Yeah, exactly.
[00:40:20] Simon Maple: We'll grill you on whatever you say next.
[00:40:22] Richmond Alake: So this is what I would say. Over in Oracle, there is this principle that we try to operate by, which is that we always try to be six months ahead of our customers, especially within AI.
[00:40:33] Simon Maple: Mm-hmm.
[00:40:34] Richmond Alake: So everyone's talking about context engineering. We were talking about that before.
[00:40:39] Simon Maple: Mm-hmm.
[00:40:39] Richmond Alake: We've been talking about it for loads of years. You need data.
[00:40:46] Richmond Alake: Being six months ahead of our customers, we start to realise that you don't just need context. You need a way to ensure that context is retrieved efficiently.
[00:40:58] Richmond Alake: We want these systems to work in near real-time or real-time scenarios. Now, how do we start to make that happen? How do we let these agents process data in the background and also work in real time simultaneously?
[00:41:18] Simon Maple: Mm.
[00:41:19] Richmond Alake: It all comes to infrastructure. It all comes down to scalability, and that's what we pride ourselves on over at Oracle, both Oracle Database and Oracle Cloud.
[00:41:30] Simon Maple: Mm-hmm.
[00:41:30] Richmond Alake: I'll tell you where things are going or where I see things are going. We spoke about agent memory. We spoke about memory engineering. The way things are going is we're going to go into a world where continuous learning is the norm.
[00:41:45] Simon Maple: Mm-hmm.
[00:41:46] Richmond Alake: Where you're no longer building a system thinking, "Do I get the right context?" You're more thinking, "How do I get the right context or the right information out of this agent loop that I can then pass into the training loop of the actual model?" And then I could replace the model maybe two weeks down the line, because now I've retrained it, and the core latent memory aspect that allows it to work efficiently has been improved.
[00:42:17] Simon Maple: Mm-hmm.
[00:42:17] Richmond Alake: With the new data that I've just collected. And we're already seeing traces of this. For example, a few months ago, Cursor wrote about how they were taking all the agent traces they were collecting from IDEs and using them to fine-tune their embedding model to improve performance.
[00:42:40] Richmond Alake: Right. And that is a form of continuous learning.
[00:42:44] Simon Maple: Mm-hmm.
[00:42:44] Richmond Alake: So where we're going in the next six months is looking for how we can merge this agent loop with the training loop efficiently.
[00:42:54] Simon Maple: Yeah.
[00:42:55] Richmond Alake: That's where things will be.
[00:42:56] Simon Maple: It's interesting because I guess there's context, which is very specific and will change depending on who's using it.
[00:43:00] Simon Maple: And there's context, which is almost like that learning, like you say, where it's like actually 99 percent of people are doing it like this. This is what they need. And I guess it's that piece that can always get pushed back into that cycle of learning.
[00:43:20] Simon Maple: And then it's almost like the customisation-style context, like the skills, right? Where it's like this is a specific skill for me, and there are other skills out there for other teams, organisations, or individuals. And if they want to do it other ways, they can do that. But I love the idea of that continuous learning.
[00:43:36] Richmond Alake: That's where it's going. We're just going to get the agent loop and the training loop together.
[00:43:40] Simon Maple: Yeah.
[00:43:41] Richmond Alake: To be honest, it is a lot harder than it seems. Right. But again, we're looking six months ahead, and Oracle being six months ahead of our customers means we can give them essentially tomorrow, today.
[00:43:55] Simon Maple: Yeah.
[00:43:56] Richmond Alake: Yeah, that's where we see things going.
[00:43:59] Simon Maple: Yeah. Richmond, it's been an absolute pleasure. I really appreciate you coming down, and also, we didn't mention it at the start, but you were one of our speakers at AI Native Dev in New York.
[00:44:14] Simon Maple: And actually, for those who are listening, of course we have the London event coming up soon. I think it will have been announced by the time we go live, so I'll say it: June 1st and 2nd in London. Maybe we'll see you there. Awesome. Thank you very much for this episode, and I really appreciate everyone tuning in. So thanks very much, and tune in next time.
Chapters
In this episode
Not onboarding your agent is on you.
Richmond Alake, Director of AI Developer Experience at Oracle, joins Simon Maple to make the case that most agent failures come down to one thing: memory. Not the model, not the infrastructure. Memory.
On the docket:
- why skills are just SOPs your organisation already has written down
- the job title that is replacing prompt engineers
- file systems vs databases for agent memory (and why one gets you hacked)
- the memory trick that makes agents feel actually intelligent
- why the agent loop and training loop are about to become one
The developers who figure this out first are going to be very hard to compete with.
Connect with us here:
- Richmond Alake: https://www.linkedin.com/in/richmondalake/
- Simon Maple: https://www.linkedin.com/in/simonmaple/
- Tessl: https://www.linkedin.com/company/tesslio/
Memory Engineering: The Infrastructure Layer of Context Engineering
Context engineering has become the defining practice of agentic development, but there is a layer beneath it that determines whether agents can actually retrieve and apply the right information at the right time. In a recent episode of the AI Native Dev podcast, Simon Maple sat down with Richmond Alake, Director of AI Developer Experience at Oracle, to explore how memory systems, retrieval pipelines, and data infrastructure shape agent capabilities.
The conversation offered a framework for thinking about context that goes beyond prompt construction, grounding the discussion in how enterprise organisations can build agents that remember, adapt, and learn continuously.
What Is Memory Engineering?
Memory engineering is the discipline of optimising how agents store, retrieve, and forget information. It sits at the intersection of database engineering, search optimisation, software engineering, and AI engineering. Unlike prompt engineering, which focuses on how to phrase requests, or context engineering, which addresses what information to provide, memory engineering concerns itself with the infrastructure that makes retrieval efficient and reliable.
Richmond offered a useful framing: "Memory engineering is just a cross-section between a couple of disciplines. If you just cross-pollinate some of the techniques and disciplines within database engineers, search optimisation engineers, software engineers, and a bit of AI engineering, you're going to go into memory engineering."
This matters for development teams because the systems that store and retrieve context determine whether agents can access the right information quickly enough to be useful in practice. A well-designed memory system handles the encoding, storage, retrieval, and crucially, the forgetting of information as circumstances change.
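To make that lifecycle concrete, here is a minimal sketch in Python. The `embed` function and the in-memory record list are crude stand-ins for a real embedding model and a real database; all names here are illustrative, not from the episode.

```python
import math
import time

def embed(text):
    # Stand-in for a real embedding model: hash words into a small unit vector.
    vec = [0.0] * 8
    for word in text.lower().split():
        vec[hash(word) % 8] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class MemoryStore:
    """Toy memory system: ingest -> encode -> store -> retrieve -> forget."""

    def __init__(self):
        self.records = []  # each: {"text", "vector", "created", "active"}

    def ingest(self, text):
        self.records.append({
            "text": text,
            "vector": embed(text),   # encoding step
            "created": time.time(),  # kept for recency-based forgetting later
            "active": True,
        })

    def retrieve(self, query, k=3):
        # Cosine similarity over active records (vectors are unit length).
        qv = embed(query)
        scored = [
            (sum(a * b for a, b in zip(qv, r["vector"])), r["text"])
            for r in self.records if r["active"]
        ]
        return [text for _, text in sorted(scored, reverse=True)[:k]]

    def forget(self, predicate):
        # Suppress rather than delete: forgotten records stop being retrievable.
        for r in self.records:
            if predicate(r):
                r["active"] = False
```

The `forget` method is the piece traditional data pipelines lack: it takes a predicate (for example, matching records marked deprecated) and suppresses matching memories without destroying the audit trail.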
Skills as Procedural Memory for Agents
One of the more instructive parts of the conversation addressed where skills fit within the broader memory architecture. Richmond drew a direct parallel to human cognition: just as the cerebellum stores procedural knowledge like how to perform physical tasks, agents need dedicated systems for storing and retrieving procedural knowledge about how to accomplish specific workflows.
"Skills are SOPs for agents," Richmond explained, referencing Standard Operating Procedures. "It's just a way of telling an agent, 'There's a task, and I'm going to give an arbitrary name to the task, and then I'm going to describe the task. Then I'm going to give you step-by-step instructions and maybe the locations of tools and scripts or APIs that allow you to achieve the outcome.'"
This framing connects skill development to established organisational practices. Most enterprises already have documented procedures for how work gets done. The challenge is translating those procedures into formats that agents can consume and apply. This aligns with the broader shift toward spec-driven development (https://claude.ai/blog/spec-driven-development-guide), where explicit specifications guide agent behaviour rather than relying on implicit assumptions.
The distinction between different types of memory proves useful here. Entity memory stores information about users and their preferences. Procedural memory, where skills reside, stores information about how to accomplish tasks. Separating these concerns allows teams to optimise each independently and reason about what information needs to be retrieved for different types of agent interactions.
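The separation Richmond describes can be sketched with two small data structures: one for entity memory, one for procedural memory. The `Skill` fields mirror his description of an SOP (a name, a description, step-by-step instructions, and tool locations); the concrete deployment example is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class EntityMemory:
    """Entity memory: who the user is, and facts and preferences about them."""
    user_id: str
    preferences: dict = field(default_factory=dict)

@dataclass
class Skill:
    """Procedural memory: an SOP the agent can follow step by step."""
    name: str
    description: str
    steps: list
    tools: list = field(default_factory=list)  # locations of scripts or APIs

# A hypothetical skill, in the shape described in the conversation.
deploy_skill = Skill(
    name="deploy-staging",
    description="Deploy the current branch to the staging environment.",
    steps=[
        "Run the test suite and stop if anything fails.",
        "Build the container image.",
        "Apply the staging manifest and watch the rollout.",
    ],
    tools=["scripts/run_tests.sh", "scripts/deploy.sh"],
)
```

Keeping the two types in separate structures means each can be retrieved, versioned, and forgotten on its own schedule: preferences change with the person, procedures change with the process.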
The Memory Lifecycle: Ingest, Encode, Store, Retrieve, Forget
The conversation surfaced a lifecycle model for thinking about how context flows through agent systems. It starts with ingestion, where raw data enters the system. Then encoding, where embedding models transform that data into representations suitable for semantic search. Storage follows, typically in a database optimised for the relevant data types. Retrieval mechanisms determine how that stored information gets surfaced when needed.
But the critical addition to traditional data pipelines is forgetting. "When you work with context, forgetting or suppressing information doesn't come naturally," Richmond noted. "But when I speak to developers and I say, if you think about things from the perspective of memory, you immediately start to know that you need to implement a way of forgetting information."
This has practical implications for teams building agent systems. Context that was accurate six months ago may now be outdated or actively misleading. Without mechanisms to deprecate stale information, agents continue to retrieve and apply knowledge that no longer reflects reality. The research on generative agents from Stanford suggests approaches like weighted scoring that can affect how likely information is to be recalled, allowing systems to naturally phase out outdated context.
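The weighted-score idea can be sketched as follows. The Generative Agents paper combines recency, importance, and relevance into a single retrieval score; the equal weighting and the 0.99-per-hour decay rate used here are illustrative assumptions, not values taken from the paper.

```python
import time

def recall_score(memory, relevance, now=None, decay_per_hour=0.99):
    """Combine recency, importance, and relevance into one retrieval score.

    `memory` is a dict with a "created" timestamp and an "importance"
    rating (assumed 1-10, assigned at ingest time). Higher score means
    the memory is more likely to be recalled; stale memories fade out.
    """
    now = now or time.time()
    hours_old = (now - memory["created"]) / 3600.0
    recency = decay_per_hour ** hours_old      # exponential decay with age
    importance = memory["importance"] / 10.0   # normalise to 0-1
    return recency + importance + relevance    # equal weights (assumption)

def recall(memories, relevance_fn, k=3):
    """Return the k memories most worth surfacing right now."""
    scored = sorted(memories,
                    key=lambda m: recall_score(m, relevance_fn(m)),
                    reverse=True)
    return scored[:k]
```

With this shape, "forgetting" falls out of the scoring function: a low-importance memory from months ago simply stops winning retrieval, without anyone having to delete it explicitly.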
File Systems Versus Databases for Agent Memory
The conversation addressed a tension familiar to anyone building agent systems: the ease of file-based context versus the robustness of database-backed storage. File systems offer speed and simplicity. Agents have a natural affinity for working with files, given their training data. Development moves faster when teams do not have to think about infrastructure.
But that speed comes with costs. "Don't experiment with your tool selection," Richmond advised. As teams add requirements for concurrency, transactions, security, and data privacy to file-based systems, they often find themselves rebuilding database functionality from scratch.
The recommendation was clear: for production systems, particularly in enterprise environments, purpose-built databases provide security, scalability, and retrieval mechanisms that file systems cannot match. The experimentation phase benefits from file-based approaches, but production deployments need infrastructure designed for the retrieval patterns agents require.
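The trade-off is easy to see with Python's built-in `sqlite3` module: even a minimal embedded database hands you atomic transactions and conflict-safe writes, which are exactly the properties teams end up hand-rolling once multiple writers share plain files. A sketch, with hypothetical table and key names:

```python
import sqlite3

# An in-memory SQLite database already provides ACID transactions,
# the guarantee you would otherwise rebuild yourself on top of files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (key TEXT PRIMARY KEY, value TEXT)")

def upsert(key, value):
    # Either the statement commits or it rolls back; no torn writes.
    with conn:  # the connection context manager wraps a transaction
        conn.execute(
            "INSERT INTO memory (key, value) VALUES (?, ?) "
            "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
            (key, value),
        )

upsert("deploy-notes", "use blue-green")
upsert("deploy-notes", "use canary")  # safe overwrite, no stale duplicate
row = conn.execute("SELECT value FROM memory WHERE key = ?",
                   ("deploy-notes",)).fetchone()
```

Doing the equivalent with one file per key means inventing your own locking, your own atomic rename dance, and your own conflict policy, which is the "you've just built your own database" point from the conversation.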
Agents as Omnipresent Knowledge Capturers
One forward-looking observation from the conversation concerned how agents might solve the tribal knowledge problem that plagues large organisations. When experienced developers leave, their undocumented knowledge often leaves with them. Agents, by contrast, can observe workflows across an organisation and document successful patterns automatically.
"Agents today are omnipresent," Richmond observed. "They can go through different systems and understand different systems, and they can see interactions within a secure environment. Therefore, we don't have the problem of loss of knowledge."
The implication is that memory systems need to support not just retrieval but continuous learning. When an agent observes a successful workflow that was not previously documented, it can generate new procedural knowledge, perhaps a new skill file, and store that for future use. This creates a feedback loop where agent usage improves the context available for future agent interactions.
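That feedback loop can be sketched as a function an agent might call after observing a successful, previously undocumented workflow. The markdown layout and the `skills/` directory are hypothetical conventions, not a fixed standard.

```python
from pathlib import Path

def record_skill(name, description, observed_steps, skills_dir="skills"):
    """Persist an observed successful workflow as a new skill file.

    Writes a simple markdown SOP so future agent runs can retrieve
    and follow the same procedure.
    """
    path = Path(skills_dir) / f"{name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"# Skill: {name}", "", description, "", "## Steps"]
    lines += [f"{i}. {step}" for i, step in enumerate(observed_steps, 1)]
    path.write_text("\n".join(lines) + "\n")
    return path
```

Each call turns one observed success into durable procedural memory, which is the scribe-following-you-around behaviour Richmond describes.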
Toward Continuous Learning
The conversation pointed toward a future where agent loops and training loops merge. Rather than treating context as static information that agents consume, the trajectory appears to be toward systems that continuously learn from agent interactions. Cursor's approach of using agent traces to fine-tune embedding models represents an early example of this pattern.
For development teams working with agents today, the practical takeaway is to build memory systems that can evolve. Context will change. Requirements will shift. The teams that succeed will be those whose infrastructure supports iteration rather than requiring wholesale replacement when circumstances change.
The full conversation covers additional ground on entity memory for personalisation, security considerations for enterprise deployments, and the parallels between human and agent cognition. Worth a listen for teams thinking about the infrastructure layer of their agentic systems.