How AI Agents Actually Work
AI agents are not magic. They're five concepts stacked on top of each other, and once you get them you can build any agent from scratch.
These are the five layers. Each one builds on the last:

1. A stateless text generator. Text in, text out. No memory between calls; each request is completely independent. The model has no idea what you asked five seconds ago.
2. A system prompt, which turns the general model into a specialist.
3. Tool calling, so the model can act instead of just describe.
4. The agent loop, which chains tool calls until the problem is solved.
5. Memory, so context survives between sessions.
LLMs are stateless
Start here, because everything else depends on this.
An LLM takes text in and generates text out. That's it. It has no memory. It doesn't “know” what you asked three messages ago unless you paste the entire conversation history back into every single call.
This seems limiting. It is. But it also makes the architecture simple: every agent capability is something you build on top of this stateless core. No hidden state to fight with.
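To make the statelessness concrete, here is a minimal sketch. `call_model` is a hypothetical stand-in for a real LLM API call, not any provider's SDK; the point is that the "conversation" lives entirely in a list your code maintains and resends every turn.

```python
# Statelessness sketch: conversation state lives in YOUR code, not the model.
# call_model is a hypothetical stand-in for a real LLM API call.

def call_model(messages: list[dict]) -> str:
    """Stateless stand-in: it only 'knows' what is in `messages` right now."""
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    return f"I can see {len(user_turns)} user message(s) in this request."

history: list[dict] = []

# Every turn: append the new message, then send the ENTIRE history.
history.append({"role": "user", "content": "My name is Ada."})
reply = call_model(history)
history.append({"role": "assistant", "content": reply})

history.append({"role": "user", "content": "What is my name?"})
reply = call_model(history)  # drop the history, and the model has no idea

print(reply)  # the stand-in sees both user turns only because we resent them
```

If you sent only the latest message instead of `history`, the second call would see one user message and nothing else. That resend is exactly what chat UIs do for you behind the scenes.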
System prompts turn a general model into a specialist
A system prompt sits before every user message. It shapes how the model responds and what role it plays. Same model, completely different behavior.
System: You are a Python tutor. Never give direct answers. Ask guiding questions instead.

User: How do I reverse a list in Python?

Assistant: Good question! Before I answer, what do you already know about list methods? Have you tried looking at what happens when you call .sort() on a list?
General model + system prompt = specialist agent. A translation agent, a code reviewer, a math tutor: same model, different text in front. Your system prompt is the single biggest lever you have. Vague instructions produce vague results.
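In request terms, the lever is one field. A sketch of the payload shape, assuming a generic chat-completion format (the model name and field names are illustrative, not tied to a specific provider):

```python
# Same model, different specialist, just by swapping the system prompt.
# The payload shape and model name here are illustrative placeholders.

def build_request(system_prompt: str, user_message: str) -> dict:
    return {
        "model": "some-model",    # placeholder model name
        "system": system_prompt,  # sits in front of every user message
        "messages": [{"role": "user", "content": user_message}],
    }

tutor = build_request(
    "You are a Python tutor. Never give direct answers. Ask guiding questions instead.",
    "How do I reverse a list in Python?",
)
reviewer = build_request(
    "You are a strict code reviewer. Point out bugs and style issues.",
    "How do I reverse a list in Python?",
)

# Identical user message, identical model; only the system prompt differs.
assert tutor["messages"] == reviewer["messages"]
assert tutor["system"] != reviewer["system"]
```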
Tool calling
Without tools, an LLM can only talk. It can describe how to multiply numbers. It can explain APIs. But it can't actually do anything.
Function calling fixes that.
You send the prompt + tools to the API:

```shell
curl https://api.anthropic.com/v1/messages -d '{
  "messages": [{ "role": "user", "content": "What is 15 × 8?" }],
  "tools": [{ "name": "calculator", "input_schema": { ... } }]
}'
```

The LLM never executes code. It says what it wants to call and with what arguments; your code does the actual work.
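Here is the other half of that handshake, sketched in Python. The reply shape is simplified and illustrative, not an exact provider payload; the point is that the model only names a tool and its arguments, and your code looks the function up and runs it.

```python
# Tool-calling handshake sketch. The model only *names* a tool and arguments;
# your code executes it. The reply shape below is simplified, not a real payload.

def calculator(a: float, b: float, op: str) -> float:
    ops = {"add": a + b, "mul": a * b}
    return ops[op]

TOOLS = {"calculator": calculator}  # your registry of real functions

# Pretend the model replied with a tool-use request for "What is 15 × 8?":
model_reply = {"tool": "calculator", "args": {"a": 15, "b": 8, "op": "mul"}}

# Your code, not the model, does the actual work:
result = TOOLS[model_reply["tool"]](**model_reply["args"])
print(result)  # 120
```

You then send `result` back to the model as a tool result, and it writes the final answer in plain language.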
The agent loop
One tool call is useful but limited. Real problems need multiple steps. So you put it in a loop.
This is called the ReAct pattern (Reasoning + Acting). The LLM thinks about what it needs, calls a tool, looks at the result, then decides what to do next. It keeps going until it has enough to answer.
Step through a real example. Watch how the agent breaks a word problem into individual tool calls:
A coffee shop sells 3 drinks. A latte costs $5, a cappuccino costs $4, and a drip coffee costs $3. Yesterday they sold 40 lattes, 25 cappuccinos, and 60 drip coffees. What was the total revenue?
Most production agents work this way: a while loop with an LLM inside, calling functions.
Where it gets complicated is in the details: how many iterations to allow, how to handle tool errors, when to bail out. But the core pattern fits in ten lines of code.
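Those ten lines look roughly like this. To keep the example self-contained, the "model" is scripted to solve the coffee-shop problem above; in a real agent, `next_step` would be an LLM call that either requests a tool or returns a final answer.

```python
# Toy ReAct-style agent loop. The "model" is scripted so the example runs
# offline; in a real agent, next_step would be an LLM API call.

def calculator(expression: str):
    # Real agents validate tool input; eval on trusted demo strings only.
    return eval(expression, {"__builtins__": {}})

SCRIPT = iter([
    {"tool": "calculator", "input": "40 * 5"},   # latte revenue
    {"tool": "calculator", "input": "25 * 4"},   # cappuccino revenue
    {"tool": "calculator", "input": "60 * 3"},   # drip coffee revenue
    {"tool": "calculator", "input": "200 + 100 + 180"},
    {"answer": "Total revenue: $480"},
])

def next_step(history):
    return next(SCRIPT)  # stand-in for the LLM deciding what to do next

history = [{"role": "user", "content": "What was the total revenue?"}]
for _ in range(10):                     # iteration cap: a confused model can't loop forever
    step = next_step(history)
    if "answer" in step:                # the model has enough to answer
        final = step["answer"]
        break
    result = calculator(step["input"])  # run the requested tool
    history.append({"role": "tool", "content": str(result)})  # feed result back

print(final)  # Total revenue: $480
```

The `range(10)` cap, the error handling you would wrap around `calculator`, and the bail-out logic are exactly the details that grow in production; the loop itself stays this small.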
Memory
Everything so far happens inside a single conversation. The agent has no idea who you are between sessions. Memory fixes that, but it's less obvious than it sounds.
The simplest version is what you saw earlier: stuff facts into the system prompt. But that's a static hack. Real memory is a read-write system. The agent doesn't just read stored context, it decides what to write, what to update, and what to delete.
The hard part is not storing memories. It's deciding what deserves to be stored, when to update stale facts, and when to forget. Storing everything leads to noise, and noise degrades retrieval quality over time. Every token spent on memory is a token not available for reasoning.
There's a lot more to memory than fits here. Different memory types (episodic, semantic, procedural), storage architectures, retrieval strategies, the forgetting problem. It deserves its own deep dive.
For that, see Memory in AI Agents: how different memory types work, how they complement each other, and how to build your own.
That's the foundation. Production agents layer more on top, but underneath it's the same thing. Text in, text out, a loop with some function calls, and a place to write things down between sessions.
Sources
- Making Sense of Memory in AI Agents by Leonie Monigatti
- From RAG to Agent Memory by Leonie Monigatti
- Exploring Anthropic's Memory Tool by Leonie Monigatti
- Agent Memory: Filesystem vs Database by Leonie Monigatti
- 7 Steps to Mastering Memory in Agentic AI Systems by Bala Priya C
- Building Effective Agents by Anthropic
crafted by bart stefanski