Agent SDK Course Intro: Why This Exists Right Now
Part 1 of 7

The intro video to our Anthropic Agent SDK video series.

  • 1 The Agent SDK gives you the exact same infrastructure that powers Claude Code - arguably the best agent on the market
  • 2 You can build custom agents for any use case, not just coding - like a HIPAA compliance auditor
  • 3 What used to take weeks to build can now be done in a few hours with the Agent SDK

  • 1 An agent is just three things: an LLM (the brain), a set of tools (actions it can take), and a loop
  • 2 At the code level, an agent is literally a while loop that calls the LLM and executes tool calls until done
  • 3 Frameworks like Vercel's AI SDK abstract the boilerplate, but the Anthropic Agent SDK goes further with the exact same tools and architecture as Claude Code

  • 1 The Agent SDK comes with pre-made tools (read, edit, bash, grep, glob, web search) - the same ones Claude Code uses
  • 2 You can use your existing Anthropic subscription instead of per-token API billing for personal use
  • 3 The $200/month Claude Code subscription gives you over $2,000 worth of API usage - massively underpriced
  • 4 Built-in features like sub-agents and automatic context compaction are handled for you out of the box

  • 1 The SDK requires Claude Code CLI under the hood - the npm package bundles it automatically
  • 2 Use allowed_tools to control what the agent can do - start with read-only tools, then add edit and bash when ready
  • 3 Set max_turns to limit agent loops (5-20 for typical tasks, up to 250 for complex agents) to prevent runaway costs
  • 4 The system prompt shapes agent behavior - you can even use Claude Code's own system prompt for coding agents

  • 1 Custom tools are added through MCP (Model Context Protocol) - a standardized way for agents to connect to external services
  • 2 Tool descriptions are critical - they determine whether the agent picks the right tool for the task
  • 3 Be strategic about what tools return - only send back what's immediately relevant to preserve the context window
  • 4 Remote MCP servers from services like Notion and Granola can be connected directly without writing boilerplate

  • 1 Skills are markdown files with instructions that the agent auto-discovers and loads only when relevant
  • 2 Skills use progressive disclosure - only loaded when needed, so dozens of skills won't bloat the context window
  • 3 You don't pass skills into the query - they're autodiscovered from the skills folder structure
  • 4 The skill description in the YAML frontmatter is what the agent uses to decide whether to load it

  • 1 Sessions let you resume conversations - save the session ID and pass it back to pick up exactly where you left off
  • 2 You can fork a session to try a different approach without losing the original conversation
  • 3 For long-term memory, build custom tools (memory_save_critical, memory_get_critical) backed by a database
  • 4 The combination of sessions, Claude MD for project context, and custom memory tools gives you a solid memory architecture

Why the Agent SDK Matters

Anthropic’s Agent SDK isn’t just another framework for building AI agents. It’s the same infrastructure that powers Claude Code - which, if you’ve used it, you know is one of the best coding agents out there. The SDK opens that up so you can build custom agents for any use case, not just coding. We’re using it to build a HIPAA compliance auditor. Stuff that would’ve taken weeks is now a few hours of work.

What This Series Covers

This series goes from zero to building real agents. We start with fundamentals - what an agent actually is, how the loop works - then get hands-on with the SDK’s built-in tools, custom tool creation through MCP, skills for modular behavior, and memory systems for conversation persistence. It’s designed to be practical enough that you can start building your own agent alongside it.

The Agent Loop

An AI agent is simpler than it sounds. It’s three things: an LLM (the brain), a set of tools (actions it can take), and a loop. The agent decides which tool to use, executes it, checks the result, and repeats until the task is done. In code, it’s literally a while loop that calls the model and runs tool calls until there’s nothing left to do.

That simplicity gets messy fast once you add real functionality - conversation history management, tool execution, error handling, permissions. Frameworks like Vercel’s AI SDK clean up the boilerplate, but the Anthropic Agent SDK takes it further by giving you the exact same tools and architecture that Claude Code uses internally.
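The loop itself is small enough to sketch in a few lines. This is a toy version: the "LLM" is a scripted stub so the example runs without an API key, and the message shapes are simplified for illustration. In a real agent, call_llm would hit the model API and return actual tool-use requests.

```python
# A toy version of the agent loop: call the LLM, run the tool it asked for,
# feed the result back, and repeat until there are no more tool calls.
def make_stub_llm(script):
    """Returns a fake LLM that replays a fixed script of responses."""
    responses = iter(script)
    def call_llm(messages):
        return next(responses)
    return call_llm

def run_agent(call_llm, tools, task):
    messages = [{"role": "user", "content": task}]
    while True:
        response = call_llm(messages)
        messages.append({"role": "assistant", "content": response})
        if response.get("tool") is None:        # no more tools to call: done
            return response["text"], messages
        result = tools[response["tool"]](response["args"])    # execute the tool
        messages.append({"role": "tool", "content": result})  # feed result back

# Demo: one tool call, then a final answer.
tools = {"glob": lambda args: ["main.py", "README.md"]}
llm = make_stub_llm([
    {"tool": "glob", "args": "*", "text": ""},
    {"tool": None, "text": "Found 2 files."},
])
answer, history = run_agent(llm, tools, "What files are here?")
print(answer)  # Found 2 files.
```

Everything a framework adds - history management, error handling, permissions - is layered on top of this same loop.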

Built-In Tools You Don’t Have to Build

The biggest difference between the Agent SDK and other frameworks is what comes out of the box. With Vercel’s AI SDK, you define every tool yourself. The Agent SDK ships with pre-made tools - read, edit, bash, grep, glob, web search - the same ones Claude Code uses. Anthropic spent serious engineering time making these work well with their models, and it shows.

The Subscription Advantage

Here’s what gets people excited: you can use your existing Anthropic subscription with the Agent SDK instead of paying per token. Someone did the math - the $200/month Claude Code plan gets you over $2,000 worth of API usage. For personal projects and experimentation, that’s a no-brainer. If you’re deploying an agent for other people to use, you’ll need API billing, but for building and testing, the subscription is absurdly good value.

Getting Started

Setup is straightforward - install the npm or Python package, and the Claude Code CLI gets bundled automatically. If you’re using API billing, set your key. You can also run this on AWS Bedrock or Google Vertex if you need HIPAA-level security.

The simplest possible agent is just a few lines of code. Give it the glob tool, point it at a directory, and it’ll tell you what’s there. It doesn’t look like much, but a year ago that same functionality would’ve been hundreds of lines.

Configuration That Matters

Three settings shape everything your agent does. allowed_tools controls what it can access - start read-only with glob, grep, and read, then add edit and bash when you’re ready. The system_prompt defines behavior - you can even use Claude Code’s own system prompt for coding agents. And max_turns prevents runaway loops. We use 5-20 for typical tasks and up to 250 for complex agents like our HIPAA auditor.
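The three settings can be sketched together. This is a sketch only - ClaudeAgentOptions and the option names shown here match the Python SDK docs at the time of writing, but the SDK changes frequently, so verify against the current documentation before relying on it.

```python
# Sketch: the three settings that shape agent behavior. Exact names are
# from the SDK docs at time of writing -- verify before use.
from claude_agent_sdk import ClaudeAgentOptions

options = ClaudeAgentOptions(
    allowed_tools=["Read", "Grep", "Glob"],  # read-only to start; add Edit/Bash later
    system_prompt="You are a careful code reviewer. Report issues, do not fix them.",
    max_turns=20,                            # hard stop so the loop can't run away
)
```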

Custom Tools Through MCP

The built-in tools get you surprisingly far, but the real power comes from custom tools via MCP (Model Context Protocol). MCP is a standardized way for agents to connect to external services - think Slack, Notion, databases, whatever your agent needs to interact with.

The tool description is the most important part to get right. It’s what the agent reads to decide whether to use that tool for a given task. It doesn’t need to be complex, just descriptive enough that the agent knows when to reach for it. And be strategic about return values - only send back what’s immediately relevant. Flooding the context window with unnecessary data is one of the easiest mistakes to make.
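Both points can be shown in one small sketch. fetch_slack_messages is a hypothetical stand-in for a real Slack API call; the point is the docstring that tells the agent when to use the tool, and the return value that keeps only the fields the agent needs instead of the raw API payload.

```python
# Sketch of a well-described tool that trims its return value.
def fetch_slack_messages():
    # Imagine this came back from the Slack API: lots of metadata per message.
    return [
        {"user": "U123", "text": "Deploy is broken!", "ts": "1700000000.1",
         "team": "T1", "blocks": [], "client_msg_id": "abc", "urgent": True},
        {"user": "U456", "text": "Lunch?", "ts": "1700000001.2",
         "team": "T1", "blocks": [], "client_msg_id": "def", "urgent": False},
    ]

def check_urgent_messages(args):
    """Check Slack for urgent messages.

    Use this when the user asks about unread, urgent, or recent Slack
    activity. Returns only the sender and text of urgent messages.
    """
    raw = fetch_slack_messages()
    # Strategic return: drop everything the agent doesn't need to see.
    return [{"from": m["user"], "text": m["text"]} for m in raw if m["urgent"]]

print(check_urgent_messages({}))  # [{'from': 'U123', 'text': 'Deploy is broken!'}]
```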

Services like Notion and Granola already have remote MCP servers you can point to directly, which means no boilerplate at all for those integrations.

Skills: Modular Instructions That Scale

Skills are one of the most underrated parts of the Agent SDK. Instead of cramming every possible instruction into one massive system prompt, you write separate markdown files - each one covering a specific capability - and drop them into a skills folder. The agent discovers them automatically and only loads the ones it needs.

This is called progressive disclosure. On startup, the SDK reads just the name and description from each skill file and adds those to the system prompt. When a user asks something like “check Slack for urgent messages,” the agent recognizes the Slack skill is relevant and pulls in the full instructions. You can have dozens of skills installed without bloating the context window, because only the relevant ones get loaded.

The gotcha that trips everyone up: skills aren’t passed as a parameter. They’re autodiscovered from the folder structure. You also need to add skill as an allowed tool and define your sources - without both of those, the system won’t find anything.
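The discovery mechanism can be sketched end to end. This toy version follows the folder convention described above (skills/<skill-name>/SKILL.md with YAML frontmatter); the frontmatter parser is deliberately minimal and the file contents are made up for the demo.

```python
# Minimal sketch of progressive disclosure: at startup, read only the name
# and description from each SKILL.md; load the full body only on demand.
import os, tempfile

def read_frontmatter(path):
    """Parse just the key: value pairs from simple YAML frontmatter."""
    meta = {}
    with open(path) as f:
        lines = f.read().split("\n")
    for line in lines[1:]:          # lines[0] is the opening "---"
        if line == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

def discover_skills(skills_dir):
    """Startup pass: only names and descriptions go into the system prompt."""
    catalog = {}
    for name in os.listdir(skills_dir):
        skill_file = os.path.join(skills_dir, name, "SKILL.md")
        if os.path.exists(skill_file):
            meta = read_frontmatter(skill_file)
            catalog[meta["name"]] = meta["description"]
    return catalog

def load_skill(skills_dir, name):
    """Called only when the agent decides the skill is relevant."""
    with open(os.path.join(skills_dir, name, "SKILL.md")) as f:
        return f.read().split("---", 2)[2]  # full instructions after frontmatter

# Demo with a throwaway skills folder.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "slack-integration"))
with open(os.path.join(root, "slack-integration", "SKILL.md"), "w") as f:
    f.write("---\nname: slack-integration\n"
            "description: Check and summarize Slack messages\n---\n"
            "Full instructions: connect to Slack, filter by urgency...\n")

print(discover_skills(root))
```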

Sessions and Conversation Continuity

Every conversation in the Agent SDK creates a session with a unique ID. Save that ID and you can resume the exact conversation hours or even days later - the SDK loads the full history and picks up where you left off. You can even fork a session to try a different approach without losing the original thread. For building real applications, this is essential.
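The resume/fork pattern is easy to make concrete. The real SDK manages session storage for you (capture the session ID from the init message, then pass it back to resume); this toy store just shows the bookkeeping, including why forking copies the history so the original stays untouched.

```python
# Toy session store showing the resume and fork pattern.
import copy, uuid

class SessionStore:
    def __init__(self):
        self._sessions = {}

    def start(self):
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = []          # fresh conversation history
        return session_id

    def append(self, session_id, message):
        self._sessions[session_id].append(message)

    def resume(self, session_id):
        """Pick up exactly where you left off."""
        return self._sessions[session_id]

    def fork(self, session_id):
        """Branch: deep-copy the history so the original thread is untouched."""
        new_id = str(uuid.uuid4())
        self._sessions[new_id] = copy.deepcopy(self._sessions[session_id])
        return new_id

store = SessionStore()
sid = store.start()
store.append(sid, {"role": "user", "content": "Audit this codebase"})
fork_id = store.fork(sid)
store.append(fork_id, {"role": "user", "content": "Try a different approach"})
print(len(store.resume(sid)), len(store.resume(fork_id)))  # 1 2
```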

Building Long-Term Memory

Sessions handle continuity within a conversation, but what about remembering things across different sessions? The SDK has a /memory command in beta, but we’ve found it’s better to build your own system. We use a database (Convex, Postgres, or Supabase) and give the agent custom tools like memory_save_critical and memory_get_critical to store and retrieve important context - user preferences, ongoing projects, whatever matters.

Pair that with Claude MD files for project-level context that should be available every session, and you’ve got a solid memory architecture: session IDs for continuity, Claude MD for project context, and custom memory tools for long-term storage. There’s no single right way to do this - the best approach depends entirely on what your agent needs to do.
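A minimal version of those memory tools can be sketched with SQLite standing in for Convex, Postgres, or Supabase so the example runs anywhere. The tool names come from the text above; the key/value schema is an assumption for illustration.

```python
# Sketch of memory_save_critical / memory_get_critical backed by SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (key TEXT PRIMARY KEY, value TEXT)")

def memory_save_critical(key, value):
    """Tool the agent calls to persist something it must remember."""
    conn.execute(
        "INSERT INTO memories (key, value) VALUES (?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
        (key, value),
    )
    conn.commit()
    return f"saved: {key}"

def memory_get_critical(key):
    """Tool the agent calls to retrieve saved context, or None if absent."""
    row = conn.execute("SELECT value FROM memories WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

memory_save_critical("user_preference", "prefers concise summaries")
print(memory_get_critical("user_preference"))  # prefers concise summaries
```

Expose these two functions to the agent as custom tools (via MCP, as covered earlier), and describe in the system prompt or a skill when it should save and retrieve.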

0:00 Why the Agent SDK exists

AI agents have been around for a while. People have been building them with Langchain and Vercel's AI SDK. There's no shortage of frameworks, but most of them required a lot of hacking things together, and the results were kind of hit or miss. But then Anthropic released their agent SDK, which is the exact same infrastructure that powers Claude Code. If you've used Claude Code before, you know it's a really good agent, arguably one of the best agents on the market. The agent SDK gives you the exact same power, but now you can build your own custom agents on top of it, and it does not have to be for coding. You can build your own assistant or, in our case, a HIPAA compliance auditor agent. And this is stuff that would have taken us weeks to do that you can now do in a few hours.

0:39 What this course covers

So we wanted to put together this crash course on the agent SDK. We're going to start with the fundamentals and then we're going to get more hands-on with things like tool calling, memory deployment, then finish off with cost tracking and real world use cases we're building right now. We'll make this as practical as possible so you can jump in and start building your own agent on top of this. And I highly recommend checking out the agent SDK documentation as a supplement to these videos.

0:00 What is an agent?

You've probably been hearing the word agent everywhere. And let me explain because it's actually a lot simpler than it sounds. It's an LLM like Claude or GPT, which is basically the brain, a set of tools, and this is basically what the AI agent can do, actions that it can perform. This can be stuff like read a file, send a message, check Slack, and lastly, it's a loop. The agent goes through a loop where it decides what tool to use. It executes it. It checks, okay, how did I perform on that? And then it keeps going over and over again until it feels like it successfully executed the task. At the core, that's really it. That's what an agent is.

0:33 The agent loop in code

Agents have been around for a pretty long time - basically since LLMs became a thing. In terms of code, this is the most basic form. This is what an agent is. It's literally a while loop. You call the LLM. It's going to return and decide which tool to use. We add the result of that tool call back to the conversation, and then we loop again. And when it has no more tools to call, that's when we're done. It can technically get a lot more complex, especially once you add custom tooling. And over time it can become a lot of boilerplate. You've got to manage the conversation history yourself, tool execution, dealing with edge cases, errors, permissions, all of that stuff. It gets messy really fast.

1:04 Why frameworks exist

And that's why people decided to build frameworks. Vercel's AI SDK is probably the most popular framework out there for building agents. It wraps all of the functionality of manually building this loop, calling the LLM, and handling tool calls into a bunch of pre-made functions you can import and use. Here's the code where we build a very similar agent, but using the AI SDK from Vercel. If you look at it, it's way cleaner, but it's the same exact concept. It's an LLM. You're giving it tools and you're having it run in a loop. It's just abstracted away a little bit.

1:34 Enter the Anthropic Agent SDK

So, here's where things get really interesting because then Anthropic released their agent SDK, which is kind of like the AI SDK from Vercel, but takes it a lot further. It doesn't just give you the functions to call the loop and call these tools. It actually gives you the exact same tools and architecture as the agent inside of Claude Code. So now that you have a basic understanding of what an agent is under the hood, in the next video we'll cover what Anthropic's agent SDK is and why it's so special, especially when something like AI SDK from Vercel already exists.

0:00 Built-in tools from Claude Code

In the last video, we covered the basics of agents, what they are, which is basically an LLM, giving it a set of tools, and having it go through a loop. And we also mentioned that there are frameworks like the AI SDK from Vercel that make this stuff way easier to deal with. But the agent SDK that Anthropic released is categorically different. On the surface, it looks really similar to the AI SDK. They both can let you create agents, but let's explain why they're a bit different. The first major difference is with Vercel's AI SDK, you have to define the tools yourself. But with the agent SDK from Anthropic, sure you do have to define some tools and you have this ability, but it also gives you a bunch of pre-made tools out of the box. And these tools are special because they're the exact tools that Claude Code uses. And more importantly, Anthropic spent probably millions of dollars of engineering time to make sure that these tools work really well with their models.

0:50 The tools breakdown

Let me go through some of these tools. So we have the read tool, which allows the agent to read files. We have an edit tool, so it can edit files, which is really great if you wanted to take notes during a long running session, for example. It can run terminal commands with bash. It can search with grep and glob. And it even has a built-in web search tool, which is one of the best that I've seen in any agent. So, these are built-in. You don't have to write these from scratch like you would have to with the AI SDK.

1:13 Code example and built-in features

And here's what the code looks like to build a basic agent with the agent SDK and feeding in these tools. Technically with this, the agent can read whatever codebase is on your machine. It can do a lot of what Claude Code can do with just these minimal lines of code. You don't have to handle the context window. You don't have to handle sending messages in and managing the state. All of this is handled directly by the agent SDK. There are a ton of other things that the agent SDK is going to give you. For example, it already has built-in sub-agents. It can handle compaction automatically. So when the context window is being eaten up, when you have a long conversation, it will automatically compact and compress things for you.

2:02 Using your Anthropic subscription

Now the other big difference and a very big reason why people are fascinated by the agent SDK is you can actually use your existing Anthropic subscription with the agent SDK. When you're using something like the AI SDK from Vercel, you need to pick a provider like OpenAI or Anthropic and you need to pay per token used. So the more you use it, the more tokens you're going to use up, the more expensive it's going to get. If you're paying for Claude Code, you already have an Anthropic subscription and Anthropic allows you to use the same subscription with the agent SDK, which is huge. You can just pay a flat subscription and the tokens are going to come out of that. At the time of recording the video, the Anthropic subscription is massively underpriced. Someone did the math and said that if you're paying for the $200 a month Claude Code subscription, you basically get over $2,000 of API usage.

3:02 API vs subscription billing

Now, you can, and to be honest, you are supposed to actually use the API pricing with Anthropic's Agent SDK if you plan on releasing this as some sort of agent that other people can use. But for personal use, it's completely fine to use your Anthropic subscription. Now, it is worth noting that if you actually do plan on deploying your agent and letting other people use it, you cannot use your Anthropic subscription to power this. You have to get an API key and you have to use their API billing for that. But if you're just experimenting or building a personal agent for yourself, it is a no-brainer to just use this subscription.

0:00 Setting up the Agent SDK

Let's try to actually build something. Now, fair warning, the agent SDK is constantly being updated. So, some of this might be out of date, but I'm sure at a high level, most of this should still hold even by the time you're watching this video. So, the first step in using the agent SDK is you have to install it on your machine. If you're using TypeScript, here is the command for this. Or if you're using Python, here's the command for that. Now, one thing to note is that the SDK requires the Claude Code CLI under the hood. The npm package bundles it automatically. For Python, it's also bundled, but if you want a specific version, you can install it separately.

0:46 API keys and providers

Next, if you're using the API based billing and you're not going to use your Anthropic subscription, you need to set your API key. So, you can do this here. And running this command is going to set this API key on your machine. Quick note, you can also use this with AWS Bedrock and Google Vertex if you need a bit more enhanced security. So, if you're building something that needs HIPAA compliance, that's the way to do that.

0:57 Building your first agent

Here is the code for probably the simplest possible agent that you can build. When you run this, the agent is going to use the glob tool to scan your directory and tell you what's there. Nothing crazy, but technically you just built an agent. And I know it doesn't look like a lot, but this is actually a pretty powerful agent, especially compared to a year ago where this exact thing would have taken hundreds if not thousands of lines of code to replicate its functionality.

1:22 Code review agent example

Now, let's make this more interesting. Let's say you want to analyze a code base. And here's the code for that. So now the agent will actually read through your files, search for patterns, and give you a real code review. And this is all with the built-in tools that come with the agent SDK, which again, this is the same tools that come with Claude Code. There's no custom tool definitions needed.

1:42 Key configuration options

Now, a couple things I want to point out. Allowed tools controls what the agent can do. You can start with read-only tools like read, grep, and glob, and then add things like edit and bash, which can write and delete files, when you're ready to make further changes. Then there's the system prompt, which really shapes the agent's behavior - you can actually even use Claude Code's direct system prompt if you plan on building a coding agent. For max turns, this limits how many steps the agent can take, which is really important because you don't want this thing looping forever - because it will. Typically, for most tasks we allow about five to 20 turns, and for some really complex things, like our HIPAA compliance agent, we have it at around 250 max turns.

0:00 Custom tools through MCP

If you've been following the other videos, you should have a good base understanding of the Agent SDK, how to set it up, and how to use it with the built-in tools. But the real power comes once you give it custom tools. The agent SDK supports custom tools through MCP, which is model context protocol. If you're not familiar with MCP, it's a standardized way for agents to connect to external services. Now, here's the code to add a custom tool through the agent SDK. You use these two functions, which are the tool and the create SDK MCP server functions.

0:32 Slack API tool example

Here's an example for a custom MCP tool to hook into the Slack API. Now, we're going to assume that the Slack API is already defined here, but this is how you would define the tool so the agent SDK can use it. The big thing to notice here is that we have a tool description. This is really important to get right because this is going to determine whether or not the agent is going to call this tool for the specific use case. This doesn't need to be overly complex, but it needs to be descriptive enough where an agent can accurately just know when to pick it up.

0:58 Passing tools and remote MCP servers

And after you define the tool, this is how you're going to pass it in. You're going to define it as one of the allowed tools in this array. And notice the tool naming convention: mcp__server_name__tool_name. That's how the SDK knows which MCP server to route the tool call to. And you can also connect to remote MCP servers. So if you use a service like Notion or Granola, they already have MCP servers defined and you can just point to it like this. I absolutely love these external MCP servers because now the agent just has access to them and I don't have to write any boilerplate code to define these tools myself.

1:29 Tool return values matter

One tip is to be very careful and strategic about what the tool actually returns because the agent is going to process the results of that tool. So if you just send in a ton of data, it's going to have to crunch all of that and you're going to waste a lot of the precious context window. Really only return what is immediately relevant for that tool call.

0:00 What are skills?

Let's talk about skills in the agent SDK. Skills are probably one of the most interesting parts of the agent SDK, and I'm surprised not a lot of people take advantage of them. They usually just focus on trying to make the best tools and then modifying the system prompt. But at the core, a skill is a markdown file with specific instructions. You drop it in a specific folder and your agent automatically discovers it and pulls it in whenever it's relevant.

0:24 Skill file structure

And here's the folder structure. Each skill lives in its own folder with a skill MD file. And here's what one actually looks like. So, this is a Slack integration skill. The YAML frontmatter at the top is critical. That name field needs to be lowercase with hyphens, and the description is super important because it's what the agent uses to decide if it should load the skill or not.

0:43 Skills are autodiscovered

Here's another key thing which trips people up. You do not pass the skill into the query. There's no skill parameter option. Skills are completely autodiscovered as long as they are in that folder. So, it needs to actually follow this folder structure. This trips up so many people. This tripped me up the first time as well.

0:59 Progressive disclosure

And here's how it works. When the agent starts up, the SDK is going to scan this skills folder, read just the name and description of each of the skill MD files, and it adds those descriptions to the system prompt. So then when the user asks something like check Slack for urgent messages, the agent's going to look at those descriptions, recognize that the Slack integration skill is relevant here, and then it's going to load the full instructions. This is called progressive disclosure. It's only load what you need when you need it. So you can have dozens of skills installed, and it won't bloat the context window.

1:27 Enabling skills properly

To enable skills properly, you need two things. First, the setting sources option needs to be defined so the SDK will properly load skills from the skills folder. Without this, the skills tool is available, but it just won't have any skills. And second, and this one really trips people up, skill must be added as an allowed tool. Skills are super underrated. There is a great talk from some Anthropic engineers on how skills work and why they think they're important. They are how you make sure that agents perform consistently across tasks without having to cram everything into one giant system prompt.

0:00 The memory problem

One of the biggest limitations with LLMs - and the agent SDK is no stranger to this - is memory. They really struggle to remember things. And in the case of the agent SDK, every conversation basically starts from scratch. But there are some things in the agent SDK that'll allow you to build real memory and conversation persistence.

0:17 Sessions and conversation continuity

So let's start with sessions. Now when you start a query, the SDK is actually going to create a session. That session has an ID and you are able to capture it. The first message you get back is always a system init message with a session ID. So this is the point where you can start saving this and later which could be a few minutes later, hours or even days later, you can resume the exact conversation as long as you have that session ID. And it's literally just as simple as passing and resume with the session ID here. The SDK is going to load the full conversation history and context and pick up exactly where you left off, which is huge for building real applications.

0:53 Forking sessions

You can even fork a session and create a branch so you can try a whole different approach without losing the original. Here's the code to do that. Now, sessions will handle conversation continuity. But what about long-term memory, especially across different sessions?

1:02 Long-term memory with custom tools

Now, at the time of recording, there is a /memory command that is in beta. So, you can actually have Claude Code manage and maintain its own memory system, which will work across different conversations. But in our experience, we found it's pretty good to just build your own system for this. We typically use a database like Convex, Postgres, Supabase to store the important information and then we give the agent custom tools to save memories and retrieve memories. For example, we created tools like memory_save_critical and memory_get_critical that the agent can call to store and retrieve things that it should remember. This is user preferences, ongoing projects, important context.

1:39 Building a memory architecture

You can define when it should save and retrieve critical memories in the system prompt or as a skill. Just like Claude Code, this agent SDK also has a Claude MD file which is basically a persistent instruction file or the brain of a project. This is great for project level context that should be available on every single session. The combination of conversation continuity with the session IDs, Claude MD for project context and custom memory tools for long-term storage gives you a really solid basic architecture for memory. There's a thousand ways to do this. This is just to give you an idea of what is possible, but I highly recommend you create your own memory system depending on what you want the agent to do.