The word 'agent' is overloaded. Three distinct shapes get called agents: (1) a fixed chain of LLM calls (workflow), (2) a loop where the LLM decides what to do next from a tool list (true agent), (3) a multi-agent system (orchestrator + subagents). Each has different costs, debugging stories, and failure modes. Naming them precisely lets you pick the right shape; conflating them ships the wrong architecture for your use case.
Workflow: a deterministic sequence, for example classify → retrieve → answer → cite. Predictable cost and latency, easy to debug.
Agent: a loop in which the model has a tool list and decides at each step whether to call one. Variable cost, variable latency, opaque reasoning.
Multi-agent: an orchestrator that dispatches subagents. Most powerful, most expensive, most failure modes.
Most production 'agents' are actually workflows in disguise. Don't ship a true agent when a workflow will do.
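To make the cost difference concrete, here is a minimal sketch that bounds the number of LLM calls per request for each shape. The function names and parameters (`steps`, `max_iters`, `n_subagents`, `calls_per_subagent`) are illustrative assumptions rather than part of any library, and the figures count calls only, not tokens or latency.

```python
from typing import Tuple


def workflow_calls(steps: int) -> Tuple[int, int]:
    """A workflow makes the same number of model calls on every run."""
    return (steps, steps)


def agent_calls(max_iters: int) -> Tuple[int, int]:
    """An agent loop makes at least one call and at most the iteration cap."""
    return (1, max_iters)


def multi_agent_calls(n_subagents: int, calls_per_subagent: Tuple[int, int]) -> Tuple[int, int]:
    """One planning call, one synthesis call, plus every subagent's own calls."""
    low, high = calls_per_subagent
    return (2 + n_subagents * low, 2 + n_subagents * high)


if __name__ == "__main__":
    print("workflow   :", workflow_calls(2))             # (2, 2)
    print("agent      :", agent_calls(6))                # (1, 6)
    print("multi-agent:", multi_agent_calls(3, (1, 6)))  # (5, 20)
```

The workflow's bounds coincide, which is what "predictable cost" means in practice. The agent's range widens with its iteration cap, and the multi-agent range is that same spread multiplied by the number of subagents.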
Use these three in order. Each builds on the one before.
Define workflow vs agent vs multi-agent. Give a use case for each.
Walk me through the LLM-decides-next-step loop. What signals does the model use, and what makes the loop terminate?
I have a customer support 'agent' that's actually 4 sequential LLM calls. Help me decide: keep it as a workflow, or convert to a true agent for flexibility?
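As a companion to the second prompt, the sketch below isolates the two ways a tool-use loop can end: the model signals that it is finished, or the iteration cap runs out. It runs offline. `run_loop`, `scripted_model`, and `stubborn_model` are hypothetical stand-ins for a real model call, and the dictionary shape they exchange is an assumption made for this illustration only.

```python
from typing import Callable, Dict, List


def run_loop(next_step: Callable[[List[Dict]], Dict], max_iters: int = 6) -> str:
    """Drive a tool-use loop until the model stops or the cap is hit.

    next_step stands in for the model call: it receives the history and
    returns a dict with "stop_reason" and "text" keys.
    """
    history: List[Dict] = []
    for i in range(max_iters):
        step = next_step(history)
        history.append(step)
        if step["stop_reason"] == "end_turn":
            return f"answered after {i + 1} call(s): {step['text']}"
        # Any other stop reason is treated as a tool request; a real loop
        # would run the tool here and append its actual result.
        history.append({"tool_result": "stub"})
    return f"gave up after {max_iters} call(s)"


def scripted_model(history: List[Dict]) -> Dict:
    """Requests a tool on its first two calls, then answers."""
    calls_so_far = sum(1 for h in history if "stop_reason" in h)
    if calls_so_far < 2:
        return {"stop_reason": "tool_use", "text": ""}
    return {"stop_reason": "end_turn", "text": "42"}


def stubborn_model(history: List[Dict]) -> Dict:
    """Never stops requesting tools, so only the cap ends the loop."""
    return {"stop_reason": "tool_use", "text": ""}


if __name__ == "__main__":
    print(run_loop(scripted_model))   # answered after 3 call(s): 42
    print(run_loop(stubborn_model))   # gave up after 6 call(s)
```

The cap is the only termination condition the caller fully controls, which is why a hard limit belongs in every agent loop even when the model usually stops on its own.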
# WORKFLOW (deterministic): predictable cost + latency
def workflow_answer(question):
    intent = classify(question)          # call 1
    docs = retrieve(question, intent)    # not a model call
    response = generate(question, docs)  # call 2
    return response
# Total: 2 LLM calls, deterministic shape, easy to trace.
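
# AGENT (loop): model chooses tools
# TOOLS and dispatch(name, input) are assumed to be defined elsewhere;
# dispatch is assumed to return a string.
import anthropic
client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def agentic_answer(question):
    messages = [{"role": "user", "content": question}]
    for _ in range(6):  # cap iterations
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,  # required by the Messages API
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":
            # end_turn (or any other non-tool stop): return the text the model produced
            return "".join(b.text for b in resp.content if b.type == "text")
        # otherwise execute every tool the model requested and send all
        # results back together in a single user message
        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                result = dispatch(block.name, block.input)
                tool_results.append({"type": "tool_result", "tool_use_id": block.id, "content": result})
        messages.append({"role": "user", "content": tool_results})
    raise RuntimeError("agent hit the 6-iteration cap without finishing")
# Total: 1-6 LLM calls, variable shape, harder to trace.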

# MULTI-AGENT (orchestrator + subagents)
def multi_agent_answer(question):
    plan = orchestrator_plan(question)                        # call 1
    subresults = parallel([subagent(t) for t in plan.tasks])  # N subagents, up to M calls each
    return orchestrator_synthesize(question, subresults)      # final call; at most 2 + N×M in total

Run the demo:
python3 main.py
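
The `parallel(...)` helper in the multi-agent sketch is not defined in the lesson. One possible way to fill it in, assuming the subagents are blocking, I/O-bound calls, is a thread pool from the standard library. Everything named here (`run_parallel`, `fake_subagent`, the task strings) is a hypothetical stand-in so the sketch runs without an API key; a real subagent would call the model instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_parallel(worker: Callable[[str], str], tasks: List[str], max_workers: int = 4) -> List[str]:
    """Run worker(task) for every task concurrently; results keep task order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises a worker's exception
        # when its result is consumed.
        return list(pool.map(worker, tasks))


def fake_subagent(task: str) -> str:
    """Hypothetical stand-in; a real subagent would run its own model loop."""
    return f"done: {task}"


if __name__ == "__main__":
    results = run_parallel(fake_subagent, ["search docs", "check billing", "draft reply"])
    print(results)
```

Note the design choice of passing the worker function and the task list separately. A list comprehension such as `[subagent(t) for t in plan.tasks]` runs each subagent eagerly, one after another, unless `subagent` returns a coroutine or future, so the work would already be finished before `parallel` sees it.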