AI Agents
ReAct agent loops, tool calling architecture, multi-step reasoning, planning patterns, multi-agent systems, human-in-the-loop, memory management, and cost control — building autonomous AI systems that act, not just answer.
Principles
1. What Makes an Agent an Agent
An agent is an LLM that can take actions, observe results, and decide what to do next. A chatbot answers questions. An agent completes tasks.
The difference:
- Chat — User asks → LLM responds → done
- Agent — User defines goal → LLM reasons → calls tool → observes result → reasons again → calls another tool → ... → delivers result
Agents use tool calling to interact with the world (databases, APIs, file systems) and loops to chain multiple steps together. The LLM decides when to use tools, which tools to use, and when the task is complete.
When to use an agent:
- Tasks requiring multiple steps that depend on each other
- Tasks where the path is not known in advance (research, investigation)
- Workflows that combine information from multiple sources
- Tasks requiring judgment calls between steps
When NOT to use an agent:
- Single-step operations (extraction, classification, translation)
- Tasks with a fixed, known pipeline (use regular code)
- Simple Q&A (use RAG or direct generation)
- Tasks where determinism is required (agents are inherently non-deterministic)
2. The ReAct Loop
ReAct (Reason + Act) is the foundational agent pattern. The LLM alternates between reasoning about what to do and taking action via tools.
Loop:
1. THINK — "I need to find the customer's order status"
2. ACT — Call lookupOrder(orderId: "ORD-123")
3. OBSERVE — Order is "shipped", tracking: "1Z999..."
4. THINK — "Now I need to get the delivery estimate"
5. ACT — Call getTracking(trackingNumber: "1Z999...")
6. OBSERVE — Estimated delivery: March 5
7. THINK — "I have all the info. I can answer now."
8. RESPOND — "Your order ORD-123 has shipped and is estimated to arrive March 5."
With the Vercel AI SDK, the ReAct loop is built into streamText and generateText via maxSteps:
import { streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
const result = streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a customer support agent.
Gather all necessary information before responding to the customer.
Think step by step about what you need to look up.`,
messages,
tools: {
lookupOrder: tool({
description: 'Look up order details by order ID',
parameters: z.object({ orderId: z.string() }),
execute: async ({ orderId }) => await getOrder(orderId),
}),
getTracking: tool({
description: 'Get shipping tracking details',
parameters: z.object({ trackingNumber: z.string() }),
execute: async ({ trackingNumber }) => await getTracking(trackingNumber),
}),
},
maxSteps: 5, // Allow up to 5 reason-act cycles
});
Each "step" is one round: the model generates tool calls, the SDK runs each tool's execute function, and the results go back to the model. maxSteps caps how many rounds can run. 3-5 is typical for most use cases; 10 or more is for complex research tasks.
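To make the loop concrete, here is a minimal sketch of the cycle that a step limit bounds, written without any SDK. The types, `runReactLoop`, and the scripted model are hypothetical names used only for illustration; a real agent would call an LLM where `scriptedModel` replays the order-status trace.

```typescript
// Hypothetical shapes for illustration; these are not AI SDK types.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelTurn = { thought: string; call?: ToolCall; answer?: string };
type Observation = { tool: string; result: Record<string, string> };
type Model = (history: Observation[]) => ModelTurn;
type ToolSet = Record<string, (args: Record<string, string>) => Record<string, string>>;

function runReactLoop(model: Model, tools: ToolSet, maxSteps: number) {
  const history: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(history); // THINK: the model sees every observation so far
    if (!turn.call) {
      // No tool requested, so the model considers the task complete.
      return { answer: turn.answer ?? '', steps: step, finished: true };
    }
    const impl = tools[turn.call.tool];
    // ACT: run the tool; an unknown tool becomes an error observation, not a crash.
    const result = impl ? impl(turn.call.args) : { error: `Unknown tool: ${turn.call.tool}` };
    history.push({ tool: turn.call.tool, result }); // OBSERVE
  }
  return { answer: 'Step limit reached before the task was complete.', steps: maxSteps, finished: false };
}

// A scripted stand-in for the LLM that replays the order-status trace.
const scriptedModel: Model = (history) => {
  if (history.length === 0) {
    return { thought: 'Find the order status', call: { tool: 'lookupOrder', args: { orderId: 'ORD-123' } } };
  }
  if (history.length === 1) {
    const trackingNumber = history[0]?.result.trackingNumber ?? '';
    return { thought: 'Get the delivery estimate', call: { tool: 'getTracking', args: { trackingNumber } } };
  }
  const estimate = history[1]?.result.estimate ?? 'soon';
  return { thought: 'I have all the info', answer: `Your order has shipped and should arrive ${estimate}.` };
};

const demoTools: ToolSet = {
  lookupOrder: () => ({ status: 'shipped', trackingNumber: '1Z999' }),
  getTracking: () => ({ estimate: 'March 5' }),
};

const reactResult = runReactLoop(scriptedModel, demoTools, 5); // finishes after 2 tool calls
const cappedResult = runReactLoop(scriptedModel, demoTools, 1); // stops at the step limit
```

The second call shows why the limit matters: with only one step allowed, the loop ends before the model has what it needs, and the caller gets an explicit "not finished" result instead of an open-ended run.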
3. Tool Calling Architecture
Tools are the agent's hands. Well-designed tools make agents reliable. Poorly designed tools make agents unpredictable.
Design principles for agent tools:
- One responsibility — each tool does one thing. "searchAndUpdate" should be two tools.
- Descriptive names — lookupCustomerByEmail beats findUser. The model reads tool names.
- Detailed descriptions — the description is the model's documentation. Include what the tool does, when to use it, and what it returns.
- Narrow parameters — accept the minimum input needed. Don't pass entire objects when an ID suffices.
- Structured returns — return consistent shapes. Include error information in the return, don't throw.
- Safe by default — read-only tools should be the majority. Write tools should require explicit confirmation.
// GOOD tool design
const tools = {
searchKnowledgeBase: tool({
description: `Search the help center knowledge base by semantic similarity.
Returns the top 5 most relevant articles with titles, URLs, and content previews.
Use this when the customer asks about product features, troubleshooting, or how-to questions.`,
parameters: z.object({
query: z.string().describe('The search query — use the customer\'s exact words or a refined version'),
category: z.enum(['getting-started', 'billing', 'api', 'integrations', 'troubleshooting'])
.optional()
.describe('Filter by article category if the topic is clear'),
}),
execute: async ({ query, category }) => {
const results = await searchDocs(query, { category, limit: 5 });
return {
found: results.length,
articles: results.map((r) => ({
title: r.title,
url: r.url,
preview: r.content.slice(0, 200),
relevance: r.score,
})),
};
},
}),
// BAD tool design
doStuff: tool({
description: 'Does various things', // Too vague
parameters: z.object({
data: z.any(), // Too broad
}),
execute: async ({ data }) => {
// Multiple responsibilities
const user = await findUser(data.email);
await updateUser(user.id, data.updates);
await sendEmail(user.email, data.message);
return 'done';
},
}),
};
Tool categories:
| Category | Examples | Safety Level |
|---|---|---|
| Read-only | Search, lookup, calculate, format | Safe — always allow |
| Write (reversible) | Create draft, add to cart, bookmark | Medium — log actions |
| Write (irreversible) | Send email, place order, delete | High — require confirmation |
| External | Call third-party API, post to social | High — require confirmation |
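One way to enforce the table in code is a small policy lookup that maps each tool to a gate. This is a sketch under stated assumptions: the `SafetyLevel` and `Gate` names, the registry contents, and `gateFor` are all hypothetical, and how a gate is enforced (logging, confirmation UI) is left to the application.

```typescript
type SafetyLevel = 'read-only' | 'write-reversible' | 'write-irreversible' | 'external';
type Gate = 'allow' | 'allow-and-log' | 'require-confirmation';

// Mirrors the "Safety Level" column of the table.
const GATE_BY_LEVEL: Record<SafetyLevel, Gate> = {
  'read-only': 'allow',
  'write-reversible': 'allow-and-log',
  'write-irreversible': 'require-confirmation',
  external: 'require-confirmation',
};

// Hypothetical registry: tool name -> safety level.
const TOOL_LEVELS: Record<string, SafetyLevel> = {
  searchKnowledgeBase: 'read-only',
  addToCart: 'write-reversible',
  sendEmail: 'write-irreversible',
  postToSocial: 'external',
};

function gateFor(toolName: string): Gate {
  const level = TOOL_LEVELS[toolName];
  // Unregistered tools get the strictest gate, so forgetting to classify a tool fails safe.
  return level ? GATE_BY_LEVEL[level] : 'require-confirmation';
}
```

Defaulting unknown tools to the strictest gate is a design choice: a newly added write tool that nobody classified will ask for confirmation rather than run silently.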
4. Planning Patterns
For complex tasks, have the agent plan before executing. Plan-then-execute produces more reliable results than diving straight into tools.
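Part of what makes plan-then-execute more reliable is that a plan is plain data, so it can be checked before any tool runs. The sketch below assumes a hypothetical plan shape (numbered steps that each name one tool) and a hypothetical `validatePlan` helper; it is an illustration, not part of any SDK.

```typescript
// Hypothetical plan shape: numbered steps, each naming one tool.
interface PlanStep { step: number; action: string; tool: string }
interface Plan { goal: string; steps: PlanStep[] }

// Returns a list of human-readable problems; an empty list means the plan looks executable.
function validatePlan(plan: Plan, toolNames: string[], maxSteps: number): string[] {
  const problems: string[] = [];
  if (plan.steps.length === 0) problems.push('Plan has no steps');
  if (plan.steps.length > maxSteps) {
    problems.push(`Plan has ${plan.steps.length} steps; limit is ${maxSteps}`);
  }
  plan.steps.forEach((s, i) => {
    if (s.step !== i + 1) problems.push(`Step at position ${i + 1} is numbered ${s.step}`);
    if (!toolNames.includes(s.tool)) problems.push(`Step ${s.step} uses unknown tool "${s.tool}"`);
  });
  return problems;
}

const knownTools = ['lookupOrder', 'getTracking'];

const goodPlan: Plan = {
  goal: 'Report the delivery date',
  steps: [
    { step: 1, action: 'Find the order', tool: 'lookupOrder' },
    { step: 2, action: 'Get the delivery estimate', tool: 'getTracking' },
  ],
};

const badPlan: Plan = {
  goal: 'Report the delivery date',
  steps: [
    { step: 1, action: 'Find the order', tool: 'lookupOrder' },
    { step: 2, action: 'Email the customer', tool: 'sendEmail' }, // not an available tool
  ],
};

const goodProblems = validatePlan(goodPlan, knownTools, 5);
const badProblems = validatePlan(badPlan, knownTools, 5);
```

A failed check can be fed back to the planner as a re-plan prompt, which is usually cheaper than discovering the problem halfway through execution.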
Simple planning with a structured output step:
import { generateObject, streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
const PlanSchema = z.object({
goal: z.string().describe('The overall goal to accomplish'),
steps: z.array(
z.object({
step: z.number(),
action: z.string().describe('What to do in this step'),
tool: z.string().describe('Which tool to use'),
reasoning: z.string().describe('Why this step is needed'),
})
),
estimatedSteps: z.number().describe('Expected total number of tool calls'),
});
async function planAndExecute(
task: string,
tools: Record<string, any>,
messages: any[]
) {
// Phase 1: Plan
const { object: plan } = await generateObject({
model: anthropic('claude-sonnet-4-20250514'),
schema: PlanSchema,
prompt: `Create a step-by-step plan for this task:
TASK: ${task}
AVAILABLE TOOLS:
${Object.entries(tools)
.map(([name, t]) => `- ${name}: ${t.description}`)
.join('\n')}
Create a concrete plan. Each step should use one specific tool.`,
});
console.log('Plan:', plan);
// Phase 2: Execute with the plan as context
const result = streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are executing a plan step by step.
PLAN:
${plan.steps.map((s) => `${s.step}. ${s.action} (using ${s.tool})`).join('\n')}
Execute each step in order. After each tool result, check if the plan needs adjustment.
If a step fails, explain why and suggest an alternative.`,
messages,
tools,
maxSteps: Math.min(plan.estimatedSteps + 2, 10), // Buffer for retries, capped so a bad estimate cannot run away
});
return result;
}
5. Multi-Agent Systems
For complex workflows, decompose into multiple specialized agents coordinated by a supervisor. Each agent has its own tools, system prompt, and expertise.
Supervisor pattern:
// lib/agents/supervisor.ts
import { generateObject, generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
interface Agent {
name: string;
description: string;
execute: (task: string) => Promise<string>;
}
const RoutingDecision = z.object({
reasoning: z.string(),
agent: z.string().describe('Name of the agent to handle this task'),
task: z.string().describe('Specific task description for the selected agent'),
});
export async function supervisorRoute(
userMessage: string,
agents: Agent[]
): Promise<{ agent: string; result: string }> {
// Step 1: Route to the right agent
const { object: routing } = await generateObject({
model: anthropic('claude-sonnet-4-20250514'),
temperature: 0,
schema: RoutingDecision,
prompt: `Route this user request to the most appropriate agent.
USER REQUEST: "${userMessage}"
AVAILABLE AGENTS:
${agents.map((a) => `- ${a.name}: ${a.description}`).join('\n')}
Choose the best agent and formulate a specific task for them.`,
});
// Step 2: Execute with the selected agent
const agent = agents.find((a) => a.name === routing.agent);
if (!agent) throw new Error(`Agent not found: ${routing.agent}`);
const result = await agent.execute(routing.task);
return { agent: routing.agent, result };
}
// Example: Multi-agent content pipeline
const researchAgent: Agent = {
name: 'researcher',
description: 'Searches the web and knowledge base for information on a topic',
execute: async (task) => {
const result = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: 'You are a research assistant. Find comprehensive information on the given topic.',
prompt: task,
tools: { webSearch, knowledgeBaseSearch },
maxSteps: 5,
});
return result.text;
},
};
const writerAgent: Agent = {
name: 'writer',
description: 'Writes polished content based on research and guidelines',
execute: async (task) => {
const result = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: 'You are a professional content writer. Write clear, engaging content following the brand style guide.',
prompt: task,
});
return result.text;
},
};
const editorAgent: Agent = {
name: 'editor',
description: 'Reviews and improves content for clarity, accuracy, and style',
execute: async (task) => {
const result = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: 'You are a senior editor. Review content for accuracy, clarity, grammar, and style. Return the improved version.',
prompt: task,
});
return result.text;
},
};
Use multi-agent when: agents need different tools, expertise, or system prompts. Do not use multi-agent when a single agent with multiple tools would work. The routing overhead adds latency and cost.
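The supervisor above throws when the model names an agent that does not exist. A more tolerant alternative, sketched here with a hypothetical `resolveAgent` helper, normalizes the name and falls back to a default agent. Whether a fallback is appropriate depends on the workflow; for high-stakes routing, failing loudly may be the better choice.

```typescript
// Only the fields needed for lookup; the full Agent type would also carry execute().
interface NamedAgent { name: string; description: string }

function resolveAgent<T extends NamedAgent>(requested: string, agents: T[], fallbackName: string): T {
  // Tolerate casing and stray whitespace in the model's output.
  const wanted = requested.trim().toLowerCase();
  const match = agents.find((a) => a.name.toLowerCase() === wanted);
  if (match) return match;

  // A missing fallback is a configuration error, so this still throws.
  const fallback = agents.find((a) => a.name === fallbackName);
  if (!fallback) throw new Error(`Fallback agent not configured: ${fallbackName}`);
  return fallback;
}

const registry: NamedAgent[] = [
  { name: 'researcher', description: 'Searches for information on a topic' },
  { name: 'writer', description: 'Writes polished content' },
  { name: 'editor', description: 'Reviews and improves content' },
];

const exact = resolveAgent('Researcher ', registry, 'writer'); // matches despite casing and whitespace
const fallbackChoice = resolveAgent('seo-expert', registry, 'writer'); // unknown name, falls back
```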
6. Memory and State
Agents need memory to handle multi-turn interactions and learn from past actions.
Short-term memory — the conversation history. Managed by useChat on the client and passed as messages to each call:
// This is handled automatically by useChat + streamText
const result = streamText({
model,
messages, // Full conversation history including previous tool calls and results
tools,
maxSteps: 5,
});
Working memory — intermediate state within a task. Stored in tool results that flow back to the model:
// Each tool result becomes part of the conversation context
// The model "remembers" previous tool results within a task
tools: {
addToCart: tool({
description: 'Add a product to the shopping cart',
parameters: z.object({ productId: z.string(), quantity: z.number() }),
execute: async ({ productId, quantity }) => {
const cart = await addToCart(userId, productId, quantity);
// Return full cart state so the model knows what's in the cart
return {
added: { productId, quantity },
cartTotal: cart.total,
itemCount: cart.items.length,
items: cart.items,
};
},
}),
}
Long-term memory — facts that persist across conversations. Stored in a database and retrieved at the start of each conversation:
// lib/agents/memory.ts
import { db } from '@/lib/db';
import { embed } from 'ai';
import { openai } from '@ai-sdk/openai';
// Store a memory
export async function saveMemory(
userId: string,
content: string,
category: 'preference' | 'fact' | 'instruction'
) {
const { embedding } = await embed({
model: openai.embedding('text-embedding-3-small'),
value: content,
});
await db.agentMemory.create({
data: {
userId,
content,
category,
embedding: JSON.stringify(embedding),
},
});
}
// Retrieve relevant memories for a conversation
export async function recallMemories(
userId: string,
currentMessage: string,
limit: number = 5
): Promise<string[]> {
const { embedding } = await embed({
model: openai.embedding('text-embedding-3-small'),
value: currentMessage,
});
const memories = await db.$queryRaw<Array<{ content: string }>>`
SELECT content
FROM "AgentMemory"
WHERE "userId" = ${userId}
ORDER BY embedding <=> ${JSON.stringify(embedding)}::vector
LIMIT ${limit}
`;
return memories.map((m) => m.content);
}
// Use in the agent's system prompt
export async function buildAgentSystemPrompt(
userId: string,
currentMessage: string
): Promise<string> {
const memories = await recallMemories(userId, currentMessage);
const memoryBlock = memories.length > 0
? `\nUSER CONTEXT (from previous conversations):\n${memories.map((m) => `- ${m}`).join('\n')}\n`
: '';
return `You are a personal assistant.
${memoryBlock}
Use the above context to personalize your responses.
If the user shares a preference or important fact, note it for future reference.`;
}
7. Human-in-the-Loop
Not every action should be autonomous. High-stakes actions need human approval before execution.
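The core idea can be sketched without a database or SDK: a write tool records a pending action and returns a token instead of executing, and the action only becomes available once a human confirms with that token. Everything here is a hypothetical in-memory illustration (`createConfirmationStore`, `propose`, `confirm`); the token generator and clock are injected so the behavior is easy to test, and the counter-based tokens in the demo are not suitable for production, where unguessable random tokens are needed.

```typescript
interface PendingAction {
  type: string;
  params: Record<string, unknown>;
  userId: string;
  expiresAt: number;
}

function createConfirmationStore(
  newToken: () => string,
  now: () => number = Date.now,
  ttlMs: number = 5 * 60 * 1000
) {
  const pending = new Map<string, PendingAction>();
  return {
    // Called by the write tool instead of performing the action.
    propose(type: string, params: Record<string, unknown>, userId: string) {
      const token = newToken();
      pending.set(token, { type, params, userId, expiresAt: now() + ttlMs });
      return { status: 'pending_confirmation' as const, token };
    },
    // Called after the human approves; returns the action for the caller to execute.
    confirm(token: string, userId: string): PendingAction {
      const action = pending.get(token);
      if (!action) throw new Error('Confirmation not found');
      if (action.userId !== userId) throw new Error('Confirmation belongs to another user');
      pending.delete(token); // single use: a token cannot be replayed
      if (action.expiresAt < now()) throw new Error('Confirmation expired');
      return action;
    },
  };
}

// Demo wiring with a fake clock and predictable tokens (demo only).
let demoClock = 0;
let demoCounter = 0;
const confirmations = createConfirmationStore(() => `demo-token-${++demoCounter}`, () => demoClock);

const proposal = confirmations.propose('refund', { orderId: 'ORD-123', amount: 20 }, 'user-1');
const confirmedAction = confirmations.confirm(proposal.token, 'user-1');
```

Because the tool itself never executes the write, a model that ignores the "ask first" instruction in its prompt still cannot issue a refund on its own.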
Pattern: confirmation before write actions:
import { streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
const result = streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a customer support agent.
CONFIRMATION REQUIRED for these actions:
- Issuing refunds
- Cancelling subscriptions
- Deleting accounts
- Sending emails to customers
For these actions, describe what you plan to do and ask the user to confirm.
Only proceed when the user explicitly says "yes" or "confirm".`,
messages,
tools: {
// Read tools — no confirmation needed
lookupCustomer: tool({
description: 'Look up customer details',
parameters: z.object({ email: z.string() }),
execute: async ({ email }) => await findCustomer(email),
}),
// Write tool — the model should ask for confirmation
issueRefund: tool({
description: 'Issue a refund to a customer. REQUIRES user confirmation before executing.',
parameters: z.object({
orderId: z.string(),
amount: z.number(),
reason: z.string(),
}),
execute: async ({ orderId, amount, reason }) => {
// Double-check: the frontend should also validate confirmation
const result = await processRefund(orderId, amount, reason);
return { success: true, refundId: result.id, amount };
},
}),
},
maxSteps: 5,
});
Server-side confirmation gate — for critical actions, add a server-side check that requires an explicit confirmation token:
// lib/agents/confirmation.ts
import { db } from '@/lib/db';
import { randomUUID } from 'crypto';
export async function createConfirmation(action: {
type: string;
params: Record<string, any>;
userId: string;
}): Promise<string> {
const token = randomUUID();
await db.pendingAction.create({
data: {
token,
actionType: action.type,
params: action.params,
userId: action.userId,
expiresAt: new Date(Date.now() + 5 * 60 * 1000), // 5 min expiry
},
});
return token;
}
export async function executeConfirmed(token: string) {
const action = await db.pendingAction.findUnique({ where: { token } });
if (!action) throw new Error('Confirmation not found');
if (action.expiresAt < new Date()) throw new Error('Confirmation expired');
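// Consume the token first so the same confirmation cannot be replayed
await db.pendingAction.delete({ where: { token } });
// Execute the action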
switch (action.actionType) {
case 'refund':
return await processRefund(action.params.orderId, action.params.amount, action.params.reason);
case 'cancel':
return await cancelSubscription(action.params.subscriptionId);
default:
throw new Error(`Unknown action type: ${action.actionType}`);
}
}
8. Error Recovery and Cost Control
Agents can loop, fail, and burn through API credits. Build guardrails from day one.
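One cheap guardrail for the looping failure mode is to notice when the agent keeps issuing the same tool call with the same arguments. The sketch below uses a hypothetical `createLoopDetector` helper; the threshold of three repeats is an arbitrary assumption, and the comparison relies on `JSON.stringify`, so two argument objects with the same keys in a different order count as different calls.

```typescript
interface ToolCallRecord { toolName: string; args: unknown }

function createLoopDetector(maxRepeats: number = 3) {
  const counts = new Map<string, number>();
  return {
    // Returns true once this exact call has been seen maxRepeats times or more.
    record(call: ToolCallRecord): boolean {
      const key = `${call.toolName}:${JSON.stringify(call.args)}`;
      const seen = (counts.get(key) ?? 0) + 1;
      counts.set(key, seen);
      return seen >= maxRepeats;
    },
  };
}

const detector = createLoopDetector(3);
const repeatedCall = { toolName: 'lookupOrder', args: { orderId: 'ORD-123' } };

// The third identical call trips the detector.
const loopFlags = [
  detector.record(repeatedCall),
  detector.record(repeatedCall),
  detector.record(repeatedCall),
];

// Same tool with different arguments is a new call, not a loop.
const differentArgsFlag = detector.record({ toolName: 'lookupOrder', args: { orderId: 'ORD-999' } });
```

When the detector fires, the run can be stopped or the model can be told that repeating the call will not change the result.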
Cost control:
// lib/agents/cost.ts
interface AgentBudget {
maxSteps: number;
maxTokens: number;
maxCostUsd: number;
maxDurationMs: number;
}
const DEFAULT_BUDGET: AgentBudget = {
maxSteps: 10,
maxTokens: 50000,
maxCostUsd: 0.50,
maxDurationMs: 60000, // 1 minute
};
export function createCostTracker(budget: Partial<AgentBudget> = {}) {
const limits = { ...DEFAULT_BUDGET, ...budget };
let totalTokens = 0;
let totalCost = 0;
let steps = 0;
const startTime = Date.now();
return {
recordStep(usage: { promptTokens: number; completionTokens: number }) {
steps++;
totalTokens += usage.promptTokens + usage.completionTokens;
// Approximate cost (adjust for your model pricing)
totalCost +=
(usage.promptTokens / 1_000_000) * 3 +
(usage.completionTokens / 1_000_000) * 15;
},
checkBudget(): { withinBudget: boolean; reason?: string } {
if (steps >= limits.maxSteps) {
return { withinBudget: false, reason: `Step limit reached (${steps}/${limits.maxSteps})` };
}
if (totalTokens >= limits.maxTokens) {
return { withinBudget: false, reason: `Token limit reached (${totalTokens}/${limits.maxTokens})` };
}
if (totalCost >= limits.maxCostUsd) {
return { withinBudget: false, reason: `Cost limit reached ($${totalCost.toFixed(4)}/$${limits.maxCostUsd})` };
}
if (Date.now() - startTime >= limits.maxDurationMs) {
return { withinBudget: false, reason: `Duration limit reached` };
}
return { withinBudget: true };
},
getSummary() {
return {
steps,
totalTokens,
totalCost: Math.round(totalCost * 10000) / 10000,
durationMs: Date.now() - startTime,
};
},
};
}
Error handling in tools:
// Tools should never throw — return error objects
const safeTool = tool({
description: 'Fetch data from external API',
parameters: z.object({ endpoint: z.string() }),
execute: async ({ endpoint }) => {
try {
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 10000); // 10s timeout
const response = await fetch(endpoint, { signal: controller.signal });
clearTimeout(timeout);
if (!response.ok) {
return {
error: `API returned ${response.status}: ${response.statusText}`,
suggestion: 'Try a different endpoint or check if the service is available',
};
}
return { data: await response.json() };
} catch (error) {
return {
error: error instanceof Error ? error.message : 'Unknown error',
suggestion: 'The request failed. You may want to try an alternative approach.',
};
}
},
});
LLM Instructions
AI AGENT DEVELOPMENT INSTRUCTIONS
1. BUILD A REACT AGENT:
- Use streamText with maxSteps (3-5 for simple agents, 5-10 for complex ones)
- Define tools with clear descriptions, Zod parameters, and execute functions
- Set a system prompt that instructs the model to think step-by-step
- Return result.toDataStreamResponse() for streaming to the client
- Use onFinish callback to log usage, tool calls, and cost
2. DESIGN AGENT TOOLS:
- One tool = one responsibility (no multi-action tools)
- Write detailed descriptions: what it does, when to use it, what it returns
- Use specific Zod schemas with .describe() on every field
- Return structured objects (never raw strings)
- Return error objects instead of throwing exceptions
- Add timeouts to all external calls (10 seconds default)
- Categorize tools as read-only, write-reversible, or write-irreversible
3. ADD HUMAN-IN-THE-LOOP:
- Identify which actions require human approval (writes, deletes, external sends)
- Add confirmation instructions to the system prompt
- For critical actions, implement server-side confirmation tokens
- Set expiry on pending confirmations (5 minutes default)
- Log all confirmed and rejected actions for audit
4. IMPLEMENT AGENT MEMORY:
- Short-term: pass full messages array (handled by useChat)
- Working: return rich context from tool results
- Long-term: store key facts in pgvector, retrieve relevant ones per conversation
- Use embed() to store and retrieve memories by semantic similarity
- Clean up outdated memories periodically
5. CONTROL AGENT COST:
- Set maxSteps on every agent (never unlimited)
- Track tokens per step and total cost per agent run
- Set budget limits: max steps, max tokens, max cost, max duration
- Abort the agent if any limit is reached
- Log cost per agent run for monitoring and billing
- Use cheaper models (gpt-4o-mini) for planning and routing, expensive models for execution
Examples
Example 1: ReAct Agent with Vercel AI SDK
A customer support agent that looks up orders, checks policies, and resolves issues.
// app/api/agent/support/route.ts
import { streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
import { db } from '@/lib/db';
import { auth } from '@/lib/auth';
export const maxDuration = 60;
export async function POST(req: Request) {
const session = await auth();
if (!session?.user) return new Response('Unauthorized', { status: 401 });
const { messages } = await req.json();
const result = streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a customer support agent for Acme SaaS.
PROCESS:
1. Understand the customer's issue
2. Look up relevant information (orders, account, knowledge base)
3. Determine the resolution
4. If the resolution involves a refund or account change, explain what you'll do and ask for confirmation
5. Execute the resolution
RULES:
- Always look up the customer's account before making assumptions
- Search the knowledge base for how-to questions before answering from memory
- Never promise features or timelines not in the documentation
- For refunds over $50, recommend the customer contact billing directly
- Be empathetic but solution-oriented`,
messages,
tools: {
lookupAccount: tool({
description:
'Look up the current user\'s account details including subscription tier, usage, and billing status',
parameters: z.object({}),
execute: async () => {
const account = await db.user.findUnique({
where: { id: session.user.id },
include: {
subscription: true,
_count: { select: { orders: true } },
},
});
return {
name: account?.name,
email: account?.email,
tier: account?.subscription?.tier,
status: account?.subscription?.status,
totalOrders: account?._count.orders,
};
},
}),
searchOrders: tool({
description:
'Search the customer\'s orders by status, date range, or order ID',
parameters: z.object({
orderId: z.string().optional().describe('Specific order ID to look up'),
status: z.enum(['all', 'pending', 'shipped', 'delivered', 'cancelled', 'refunded']).default('all'),
limit: z.number().default(5),
}),
execute: async ({ orderId, status, limit }) => {
const orders = await db.order.findMany({
where: {
userId: session.user.id,
...(orderId && { id: orderId }),
...(status !== 'all' && { status }),
},
orderBy: { createdAt: 'desc' },
take: limit,
});
return {
count: orders.length,
orders: orders.map((o) => ({
id: o.id,
total: o.total,
status: o.status,
date: o.createdAt.toISOString().split('T')[0],
items: o.items,
})),
};
},
}),
searchKnowledgeBase: tool({
description:
'Search help center articles for product documentation, troubleshooting guides, and how-to instructions',
parameters: z.object({
query: z.string().describe('Search query'),
}),
execute: async ({ query }) => {
const articles = await semanticSearch(query, { limit: 3 });
return {
articles: articles.map((a) => ({
title: a.title,
content: a.content.slice(0, 500),
url: a.sourceUrl,
})),
};
},
}),
requestRefund: tool({
description:
'Initiate a refund for an order. ONLY call this after the customer has confirmed they want a refund.',
parameters: z.object({
orderId: z.string(),
amount: z.number().describe('Refund amount in dollars'),
reason: z.string().describe('Reason for the refund'),
}),
execute: async ({ orderId, amount, reason }) => {
if (amount > 50) {
return {
error: 'Refunds over $50 must be processed by the billing team',
suggestion: 'Please direct the customer to billing@acme.com',
};
}
const refund = await processRefund(orderId, amount, reason);
return {
success: true,
refundId: refund.id,
amount,
estimatedDays: 5,
};
},
}),
},
maxSteps: 6,
onFinish: async ({ usage, steps }) => {
await db.agentLog.create({
data: {
userId: session.user.id,
agentType: 'support',
steps: steps.length,
toolCalls: steps.flatMap((s) => s.toolCalls.map((tc) => tc.toolName)),
promptTokens: usage.promptTokens,
completionTokens: usage.completionTokens,
},
});
},
});
return result.toDataStreamResponse();
}
Example 2: Research Agent with Plan-and-Execute
An agent that plans its research, executes the plan, and synthesizes findings.
// lib/agents/research.ts
import { generateObject, generateText, streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
const ResearchPlan = z.object({
topic: z.string(),
questions: z.array(z.string()).min(2).max(5).describe('Key questions to answer'),
searchQueries: z.array(z.string()).min(2).max(8).describe('Search queries to run'),
estimatedSteps: z.number(),
});
export async function researchTopic(topic: string): Promise<string> {
// Phase 1: Plan
const { object: plan } = await generateObject({
model: openai('gpt-4o-mini'), // Cheap model for planning
schema: ResearchPlan,
prompt: `Create a research plan for: "${topic}"
Generate specific questions to answer and search queries to find the information.
Focus on factual, verifiable information.`,
});
console.log('Research plan:', plan);
// Phase 2: Execute research
const { text: researchNotes } = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a research agent. Execute the research plan by searching for information.
PLAN:
Questions to answer:
${plan.questions.map((q, i) => `${i + 1}. ${q}`).join('\n')}
Search queries to run:
${plan.searchQueries.map((q, i) => `${i + 1}. ${q}`).join('\n')}
For each search, note the key findings. Cite sources.
After completing research, summarize your findings organized by question.`,
prompt: `Research topic: ${topic}`,
tools: {
searchWeb: tool({
description: 'Search the web for current information',
parameters: z.object({
query: z.string(),
}),
execute: async ({ query }) => {
// Implement with your preferred search API
const results = await webSearch(query);
return results.slice(0, 5).map((r) => ({
title: r.title,
snippet: r.snippet,
url: r.url,
}));
},
}),
readPage: tool({
description: 'Read the content of a web page',
parameters: z.object({
url: z.string().url(),
}),
execute: async ({ url }) => {
const content = await fetchAndExtract(url);
return { content: content.slice(0, 3000), url };
},
}),
},
maxSteps: Math.min(plan.estimatedSteps + 2, 10), // Buffer for retries, capped so a bad estimate cannot run away
});
// Phase 3: Synthesize
const { text: synthesis } = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: 'You are a research synthesizer. Create a clear, well-organized summary from research notes.',
prompt: `Synthesize the following research notes into a comprehensive summary.
Organize by key themes. Include citations.
RESEARCH NOTES:
${researchNotes}`,
});
return synthesis;
}
Example 3: Customer Support Agent with Escalation
An agent that handles common support tasks and escalates when needed.
// lib/agents/support.ts
import { streamText, tool, generateObject, type CoreMessage } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
import { db } from '@/lib/db';
// getAccountDetails, searchHelpCenter, and createEscalation are app-specific helpers assumed to exist elsewhere
const EscalationDecision = z.object({
shouldEscalate: z.boolean(),
reason: z.string(),
department: z.enum(['billing', 'engineering', 'management', 'legal']).optional(),
priority: z.enum(['normal', 'urgent', 'critical']).optional(),
});
export function createSupportAgent(
userId: string,
conversationId: string,
messages: CoreMessage[] // Conversation history supplied by the caller
) {
return streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a Level 1 support agent for Acme SaaS.
YOUR CAPABILITIES:
- Answer product questions using the knowledge base
- Look up order status and account details
- Process simple refunds (under $50)
- Update account preferences
ESCALATION TRIGGERS (hand off to a human):
- Customer is angry and asks to speak to a manager
- Technical issues you cannot diagnose
- Billing disputes over $50
- Account security concerns
- Legal threats or compliance questions
- Customer has contacted support 3+ times about the same issue
When escalating, summarize the conversation and hand off gracefully.`,
messages,
tools: {
lookupAccount: tool({
description: 'Look up customer account details',
parameters: z.object({}),
execute: async () => {
return await getAccountDetails(userId);
},
}),
searchKnowledge: tool({
description: 'Search help center for answers',
parameters: z.object({ query: z.string() }),
execute: async ({ query }) => {
return await searchHelpCenter(query);
},
}),
checkEscalation: tool({
description: 'Evaluate whether this conversation should be escalated to a human agent. Call this when the customer seems frustrated, has a complex issue, or triggers an escalation rule.',
parameters: z.object({
conversationSummary: z.string(),
customerSentiment: z.enum(['positive', 'neutral', 'frustrated', 'angry']),
issueType: z.string(),
}),
execute: async ({ conversationSummary, customerSentiment, issueType }) => {
// Log escalation check
await db.escalationLog.create({
data: {
conversationId,
userId,
summary: conversationSummary,
sentiment: customerSentiment,
issueType,
},
});
// Auto-escalate for angry customers or repeated issues
const recentTickets = await db.ticket.count({
where: {
userId,
createdAt: { gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) },
},
});
if (customerSentiment === 'angry' || recentTickets >= 3) {
return {
shouldEscalate: true,
reason: customerSentiment === 'angry'
? 'Customer is frustrated — needs human empathy'
: `Customer has ${recentTickets} tickets in the past week`,
department: 'management',
priority: 'urgent',
};
}
return { shouldEscalate: false, reason: 'Issue can be handled by AI agent' };
},
}),
escalateToHuman: tool({
description: 'Transfer the conversation to a human support agent. Only call after checkEscalation returns shouldEscalate: true.',
parameters: z.object({
department: z.enum(['billing', 'engineering', 'management', 'legal']),
priority: z.enum(['normal', 'urgent', 'critical']),
summary: z.string().describe('Summary of the issue and what has been tried'),
}),
execute: async ({ department, priority, summary }) => {
await createEscalation({
conversationId,
userId,
department,
priority,
summary,
});
return {
escalated: true,
message: `Transferred to ${department} team with ${priority} priority`,
estimatedWait: priority === 'critical' ? '5 minutes' : '15 minutes',
};
},
}),
},
maxSteps: 6,
});
}
Example 4: Multi-Agent Content Pipeline
Three specialized agents that collaborate to produce a blog post: research, write, edit.
// lib/agents/content-pipeline.ts
import { generateText, generateObject, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
const ContentBrief = z.object({
title: z.string(),
outline: z.array(z.object({
heading: z.string(),
keyPoints: z.array(z.string()),
})),
targetLength: z.number(),
tone: z.string(),
keywords: z.array(z.string()),
});
const EditFeedback = z.object({
overallQuality: z.number().min(1).max(10),
issues: z.array(z.object({
type: z.enum(['grammar', 'clarity', 'accuracy', 'flow', 'tone']),
location: z.string(),
issue: z.string(),
suggestion: z.string(),
})),
improvedContent: z.string(),
});
export async function generateBlogPost(
topic: string,
style: string = 'informative'
) {
// Agent 1: Researcher — gathers information
console.log('Phase 1: Research');
const { text: research } = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: 'You are a thorough researcher. Gather key facts, statistics, and expert opinions on the given topic.',
prompt: `Research this topic thoroughly for a blog post: "${topic}"
Include:
- Key facts and statistics (with sources)
- Common questions people have about this topic
- Expert perspectives
- Recent developments
- Practical tips and actionable advice`,
tools: {
webSearch: tool({
description: 'Search the web for information',
parameters: z.object({ query: z.string() }),
execute: async ({ query }) => await webSearch(query),
}),
},
maxSteps: 5,
});
// Agent 2: Writer — creates the first draft
console.log('Phase 2: Writing');
const { object: brief } = await generateObject({
model: openai('gpt-4o-mini'),
schema: ContentBrief,
prompt: `Create a content brief for a blog post about "${topic}".
Style: ${style}
Research notes: ${research.slice(0, 3000)}`,
});
const { text: draft } = await generateText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a professional blog writer. Write engaging, well-structured content.
Follow the outline provided. Target ${brief.targetLength} words.
Tone: ${brief.tone}
Include the keywords naturally: ${brief.keywords.join(', ')}`,
prompt: `Write a blog post based on this brief:
TITLE: ${brief.title}
OUTLINE:
${brief.outline.map((s) => `## ${s.heading}\n${s.keyPoints.map((p) => `- ${p}`).join('\n')}`).join('\n\n')}
RESEARCH:
${research}`,
});
// Agent 3: Editor — reviews and improves
console.log('Phase 3: Editing');
const { object: editResult } = await generateObject({
model: anthropic('claude-sonnet-4-20250514'),
schema: EditFeedback,
prompt: `Review and improve this blog post draft.
Check for:
- Grammar and spelling errors
- Clarity and readability
- Logical flow between sections
- Factual accuracy
- Engaging tone and voice
- SEO keyword usage (keywords: ${brief.keywords.join(', ')})
DRAFT:
${draft}
Provide specific feedback and return an improved version.`,
});
return {
title: brief.title,
content: editResult.improvedContent,
qualityScore: editResult.overallQuality,
issues: editResult.issues,
brief,
research: research.slice(0, 500),
};
}
Example 5: Agent with Long-Term Memory
An assistant that remembers user preferences and past interactions across conversations.
// app/api/agent/personal/route.ts
import { streamText, tool } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';
import { recallMemories, saveMemory } from '@/lib/agents/memory';
import { auth } from '@/lib/auth';
export const maxDuration = 30;
export async function POST(req: Request) {
const session = await auth();
if (!session?.user) return new Response('Unauthorized', { status: 401 });
const { messages } = await req.json();
const latestMessage = messages[messages.length - 1].content;
// Recall relevant memories
const memories = await recallMemories(session.user.id, latestMessage, 5);
const memoryContext = memories.length > 0
? `\nTHINGS I REMEMBER ABOUT YOU:\n${memories.map((m) => `- ${m}`).join('\n')}\n`
: '';
const result = streamText({
model: anthropic('claude-sonnet-4-20250514'),
system: `You are a personal AI assistant with memory.
${memoryContext}
MEMORY INSTRUCTIONS:
- Use remembered facts to personalize responses
- When the user shares preferences, habits, or important facts, use the rememberFact tool
- When information you remember seems outdated, ask the user to confirm
- Be natural about using memories — don't list everything you remember`,
messages,
tools: {
rememberFact: tool({
description:
'Save an important fact about the user for future conversations. Use for preferences, goals, frequently referenced info.',
parameters: z.object({
fact: z.string().describe('The fact to remember, written in third person'),
category: z.enum(['preference', 'fact', 'instruction']),
}),
execute: async ({ fact, category }) => {
await saveMemory(session.user.id, fact, category);
return { saved: true, fact };
},
}),
recallContext: tool({
description:
'Search memories for information about the user that might be relevant to the current topic',
parameters: z.object({
topic: z.string().describe('What topic to search memories for'),
}),
execute: async ({ topic }) => {
const relevant = await recallMemories(session.user.id, topic, 3);
return { memories: relevant };
},
}),
},
maxSteps: 4,
});
return result.toDataStreamResponse();
}
Common Mistakes
1. No Step Limit
Wrong:
const result = streamText({
model,
messages,
tools,
maxSteps: 100, // Or no limit at all
});
Fix: Set maxSteps to the minimum the task needs: typically 3-5 for simple agents and 5-10 for complex ones. With a very high limit (or none), an agent can keep looping: calling tools, getting results it doesn't understand, and calling more tools. This burns tokens and money.
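To make the effect of a step cap concrete, here is a minimal, dependency-free sketch of a bounded loop. It is not how the SDK implements maxSteps; runWithStepLimit, StepOutcome, and RunResult are hypothetical names used only for illustration.

```typescript
// One iteration of the loop: either the agent is done, or it wants another step.
type StepOutcome = { done: true; answer: string } | { done: false };

type RunResult =
  | { finished: true; steps: number; answer: string }
  | { finished: false; steps: number; reason: 'step_limit_reached' };

// Runs `step` until it reports completion or the cap is hit, whichever comes first.
function runWithStepLimit(
  step: (stepNumber: number) => StepOutcome,
  maxSteps: number
): RunResult {
  for (let n = 1; n <= maxSteps; n++) {
    const outcome = step(n);
    if (outcome.done) {
      return { finished: true, steps: n, answer: outcome.answer };
    }
  }
  // The cap was reached without an answer: stop and surface that fact to the caller.
  return { finished: false, steps: maxSteps, reason: 'step_limit_reached' };
}

// A confused agent that never finishes is cut off after 5 steps.
const stuckRun = runWithStepLimit(() => ({ done: false }), 5);

// An agent that finishes on step 3 stops early and never uses steps 4 and 5.
const finishedRun = runWithStepLimit(
  (n) => (n === 3 ? { done: true, answer: 'shipped' } : { done: false }),
  5
);
```

Returning an explicit step_limit_reached result, rather than throwing, lets the caller decide whether to apologize to the user, retry with a different prompt, or escalate.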
2. Vague Tool Descriptions
Wrong:
tool({
description: 'Gets data',
parameters: z.object({ id: z.string() }),
execute: async ({ id }) => await getData(id),
});
Fix: Tool descriptions are the agent's documentation. Write them like API docs: what the tool does, when to use it, what it returns, and what inputs it needs. The model reads these descriptions to decide which tool to call.
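For contrast with the vague version, here is a sketch of a more descriptive tool. To keep it runnable without dependencies it uses a plain-TypeScript stand-in (ToolSpec) instead of the SDK's tool() helper and zod. The order data, the lookupOrder wording, and the inputDescriptions field are all illustrative assumptions.

```typescript
// Plain-TypeScript stand-in for the SDK's tool() shape, so the sketch has no dependencies.
interface ToolSpec<Input, Output> {
  description: string;
  // One human-readable description per input field (zod's .describe() plays this role).
  inputDescriptions: { [K in keyof Input]: string };
  execute: (input: Input) => Promise<Output>;
}

interface OrderStatus {
  orderId: string;
  status: 'processing' | 'shipped' | 'delivered';
}

// Hypothetical in-memory data standing in for a real order database.
const orderTable: Record<string, OrderStatus['status']> = { 'ORD-123': 'shipped' };

const lookupOrder: ToolSpec<{ orderId: string }, OrderStatus | { error: string }> = {
  // States what the tool does, when to use it, what it returns, and when NOT to use it.
  description:
    'Look up the current status of a single customer order by its ID. ' +
    'Use when the user asks where an order is or whether it has shipped. ' +
    'Returns { orderId, status } where status is processing, shipped, or delivered, ' +
    'or { error } if no order matches. Do not use for searching orders by customer name.',
  inputDescriptions: {
    orderId: 'Order ID in the form ORD-<number>, for example "ORD-123"',
  },
  execute: async ({ orderId }) => {
    const status = orderTable[orderId];
    return status
      ? { orderId, status }
      : { error: `No order found with ID ${orderId}` };
  },
};
```

The same four elements (purpose, trigger, return shape, exclusions) carry over directly to the description string and .describe() calls used in the examples above.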
3. Tools That Do Too Much
Wrong: A single tool that searches, filters, sorts, creates a draft, sends an email, and logs the result.
Fix: One tool, one responsibility. A tool that does too much is unpredictable — the model might call it when it only needs one of its six functions. Split into focused tools: search, createDraft, sendEmail, logAction.
4. No Error Handling in Tools
Wrong:
execute: async ({ url }) => {
const res = await fetch(url); // No timeout, no error handling
return await res.json(); // Crashes on non-JSON response
}
Fix: Every tool should handle errors gracefully. Use try/catch, set a timeout (for example, 10 seconds), and return error objects instead of throwing. The agent can reason about errors if you return { error: "API unavailable", suggestion: "Try another approach" }.
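One way to apply this is sketched below. The names safeFetchJson, withTimeout, Fetcher, and MinimalResponse are hypothetical, and the fetcher is injected so the sketch runs without network access. In real code you would pass a wrapper around the global fetch.

```typescript
// What the model sees: either data, or a structured error it can reason about.
type ToolResult =
  | { ok: true; data: unknown }
  | { ok: false; error: string; suggestion: string };

// Minimal response surface used here; the global fetch Response satisfies it.
interface MinimalResponse {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

type Fetcher = (url: string) => Promise<MinimalResponse>;

// Rejects if `promise` has not settled within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

async function safeFetchJson(
  url: string,
  fetcher: Fetcher,
  timeoutMs = 10000
): Promise<ToolResult> {
  try {
    const res = await withTimeout(fetcher(url), timeoutMs);
    if (!res.ok) {
      return { ok: false, error: `HTTP ${res.status}`, suggestion: 'Try another source or retry later' };
    }
    const body = await res.text();
    try {
      return { ok: true, data: JSON.parse(body) };
    } catch {
      // Non-JSON bodies (HTML error pages, empty responses) end up here instead of crashing.
      return { ok: false, error: 'Response was not valid JSON', suggestion: 'Try a different endpoint' };
    }
  } catch (err) {
    // Network failures and timeouts are returned as data, never thrown at the agent loop.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message, suggestion: 'Try another approach' };
  }
}
```

Note that this timeout only stops waiting; it does not cancel the underlying request. With the real fetch, passing signal: AbortSignal.timeout(10000) in the request options aborts the request itself.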
5. Using an Agent for Simple Tasks
Wrong: Building an agent with a ReAct loop to answer "What are your business hours?" — a question that needs zero tool calls.
Fix: Only use agents when the task requires multiple steps, dynamic decision-making, or tool interaction. For simple Q&A, use generateText or RAG. For extraction, use generateObject. Agents add latency (each step = another LLM call) and cost.
6. No Human Oversight
Wrong: An agent that can send emails, process refunds, and delete accounts with zero approval gates.
Fix: Categorize tools by risk level. Read-only tools can run freely. Write actions that affect users (emails, charges, deletions) should require confirmation — either from the user in chat or from a human reviewer. Log every action for audit.
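The risk-level idea can be sketched as a small gate in front of tool execution. This is an illustration under assumptions, not an SDK feature: GatedTool, callWithGate, the two risk levels, and the example tools are all hypothetical.

```typescript
type RiskLevel = 'read' | 'write';

interface GatedTool<Input, Output> {
  name: string;
  risk: RiskLevel;
  run: (input: Input) => Output;
}

interface AuditEntry {
  tool: string;
  input: unknown;
  outcome: 'executed' | 'needs_approval';
}

// Every attempted call is recorded, whether it ran or was held for approval.
const auditLog: AuditEntry[] = [];

type GateResult<Output> =
  | { status: 'executed'; result: Output }
  | { status: 'needs_approval'; message: string };

function callWithGate<Input, Output>(
  tool: GatedTool<Input, Output>,
  input: Input,
  approved: boolean
): GateResult<Output> {
  // Write actions never run without an explicit approval flag.
  if (tool.risk === 'write' && !approved) {
    auditLog.push({ tool: tool.name, input, outcome: 'needs_approval' });
    return {
      status: 'needs_approval',
      message: `"${tool.name}" changes user data and needs confirmation before it runs.`,
    };
  }
  auditLog.push({ tool: tool.name, input, outcome: 'executed' });
  return { status: 'executed', result: tool.run(input) };
}

// Hypothetical tools: one read-only, one that moves money.
const getOrder: GatedTool<{ orderId: string }, { orderId: string; total: number }> = {
  name: 'getOrder',
  risk: 'read',
  run: ({ orderId }) => ({ orderId, total: 42 }),
};

const processRefund: GatedTool<{ orderId: string; amount: number }, { refunded: number }> = {
  name: 'processRefund',
  risk: 'write',
  run: ({ amount }) => ({ refunded: amount }),
};
```

Calling callWithGate(processRefund, { orderId: 'ORD-123', amount: 20 }, false) returns a needs_approval result that the agent can relay to the user. The same call with approved set to true executes the refund. Where the approval flag comes from (a chat confirmation or a reviewer queue) is left to the application.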
7. Unbounded Memory
Wrong: Storing every fact from every conversation without cleanup, growing the memory database indefinitely.
Fix: Limit memory size per user (50-100 facts). Add a relevance decay — memories not accessed in 30 days get archived. Deduplicate — if the user says "I prefer dark mode" twice, store it once. Periodically consolidate related memories.
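A minimal sketch of these three rules (archive after 30 idle days, deduplicate, cap per user) might look like the following. MemoryRecord and pruneMemories are hypothetical names. Deduplication here is a simple case-insensitive exact match, an assumption made to keep the sketch short; a real system might compare embeddings instead.

```typescript
interface MemoryRecord {
  fact: string;
  lastAccessedAt: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;

function pruneMemories(
  memories: MemoryRecord[],
  now: number,
  maxFacts = 100,
  maxIdleDays = 30
): { active: MemoryRecord[]; archived: MemoryRecord[] } {
  const cutoff = now - maxIdleDays * DAY_MS;
  const newestByFact = new Map<string, MemoryRecord>();
  const archived: MemoryRecord[] = [];

  for (const memory of memories) {
    // Relevance decay: anything not accessed since the cutoff is archived.
    if (memory.lastAccessedAt < cutoff) {
      archived.push(memory);
      continue;
    }
    // Deduplicate on normalized text, keeping the most recently accessed copy.
    const key = memory.fact.trim().toLowerCase();
    const existing = newestByFact.get(key);
    if (!existing || memory.lastAccessedAt > existing.lastAccessedAt) {
      newestByFact.set(key, memory);
    }
  }

  // Most recently used first, so the cap drops the stalest facts.
  const ranked = Array.from(newestByFact.values()).sort(
    (a, b) => b.lastAccessedAt - a.lastAccessedAt
  );
  return {
    active: ranked.slice(0, maxFacts),
    archived: archived.concat(ranked.slice(maxFacts)),
  };
}
```

Running a function like this on a schedule, or whenever a new memory is saved, keeps each user's active set bounded. Consolidating related memories into a single fact would be a separate step, likely involving an LLM call.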
8. No Cost Tracking
Wrong: Deploying an agent with no visibility into how many tokens it uses, how many steps it takes, or how much it costs per run.
Fix: Log every agent run: total tokens, step count, tool calls, duration, and estimated cost. Set budget alerts. Show cost data in your admin dashboard. Agents are significantly more expensive than single LLM calls: a 5-step agent costs at least 5x a single call, and usually more, because each step resends the growing conversation history as input.
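As a rough sketch of per-run cost accounting: the function below sums per-step token usage and applies per-million-token prices. The field names (promptTokens, completionTokens), the summarizeRun helper, and the prices in the example are assumptions for illustration. Map them to whatever usage data and pricing your provider actually reports.

```typescript
interface StepUsage {
  promptTokens: number;
  completionTokens: number;
}

// Prices in USD per million tokens. Supply your provider's real rates.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

interface RunSummary {
  stepCount: number;
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
  estimatedCostUsd: number;
  overBudget: boolean;
}

function summarizeRun(steps: StepUsage[], pricing: Pricing, budgetUsd: number): RunSummary {
  const promptTokens = steps.reduce((sum, s) => sum + s.promptTokens, 0);
  const completionTokens = steps.reduce((sum, s) => sum + s.completionTokens, 0);
  const estimatedCostUsd =
    (promptTokens * pricing.inputPerMillion + completionTokens * pricing.outputPerMillion) /
    1000000;
  return {
    stepCount: steps.length,
    promptTokens,
    completionTokens,
    totalTokens: promptTokens + completionTokens,
    estimatedCostUsd,
    overBudget: estimatedCostUsd > budgetUsd,
  };
}

// Input tokens grow each step because earlier tool calls and results are resent.
const exampleRun = summarizeRun(
  [
    { promptTokens: 1000, completionTokens: 200 },
    { promptTokens: 1400, completionTokens: 200 },
    { promptTokens: 1800, completionTokens: 200 },
  ],
  { inputPerMillion: 3, outputPerMillion: 15 }, // hypothetical prices
  0.05
);
```

In the example, the three steps consume 4,200 input tokens in total even though the first step used only 1,000, which is why per-run cost grows faster than the step count.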
9. Over-Engineering with Multi-Agent
Wrong: Building a supervisor, router, three specialist agents, and a synthesis agent for a task that one agent with three tools could handle.
Fix: Start with a single agent. Only split into multiple agents when you have concrete evidence that: (1) a single agent's tools are too many (10+), (2) different steps need fundamentally different system prompts, or (3) you need parallel execution. Multi-agent adds latency and complexity.
10. No Logging or Observability
Wrong: Agents running in production with no visibility into what tools they call, what reasoning they follow, or why they produce certain outputs.
Fix: Log every step: the model's reasoning, tool calls with parameters, tool results, and the final response. Store logs by conversation ID for replay and debugging. Alert on anomalies: high step counts, repeated tool calls, error-heavy sessions. Use structured logging (JSON) for queryability.
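The sketch below shows one possible shape for structured step logs and a simple anomaly check over them. StepLog, toLogLine, detectAnomalies, and the thresholds are hypothetical choices, not part of any SDK.

```typescript
interface StepLog {
  conversationId: string;
  step: number;
  reasoning?: string;
  toolName?: string;
  toolArgs?: unknown;
  toolResult?: unknown;
  error?: string;
}

// One JSON object per line keeps logs easy to query and replay by conversationId.
function toLogLine(entry: StepLog): string {
  return JSON.stringify(entry);
}

type Anomaly = 'high_step_count' | 'repeated_tool_call' | 'error_heavy';

function detectAnomalies(steps: StepLog[], maxSteps = 10, maxRepeats = 3): Anomaly[] {
  const anomalies: Anomaly[] = [];

  if (steps.length > maxSteps) anomalies.push('high_step_count');

  // Count identical calls (same tool, same arguments): a common sign of a stuck loop.
  const callCounts = new Map<string, number>();
  for (const s of steps) {
    if (!s.toolName) continue;
    const key = `${s.toolName}:${JSON.stringify(s.toolArgs)}`;
    callCounts.set(key, (callCounts.get(key) ?? 0) + 1);
  }
  if (Array.from(callCounts.values()).some((count) => count >= maxRepeats)) {
    anomalies.push('repeated_tool_call');
  }

  // Flag sessions where more than half of the steps ended in an error.
  const errorCount = steps.filter((s) => s.error !== undefined).length;
  if (steps.length > 0 && errorCount / steps.length > 0.5) {
    anomalies.push('error_heavy');
  }

  return anomalies;
}
```

A check like this can run when a conversation finishes and feed whatever alerting system you already use. The thresholds are starting points to tune against real traffic.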
See also: LLM-Patterns | Prompt-Engineering | RAG | Backend/Background-Jobs | Backend/Error-Handling-Logging
Last reviewed: 2026-03
By Ryan Lind, Assisted by Claude Code and Google Gemini.