Picking application shapes — assistant, copilot, agent, autonomous

hard

Why this matters

Not every LLM-powered product is an 'agent'. There are four common shapes: assistant (chat with retrieval), copilot (suggestions embedded in an existing workflow, such as an IDE), agent (multi-step with tools), and autonomous (runs without a user in the loop). Each scales differently: assistant cost grows with the number of users, copilot cost grows with editor sessions, and autonomous costs more per task but serves fewer interactive users. The shape determines the cost curve and the failure modes.

Demo

Examples: ChatGPT = assistant. GitHub Copilot = copilot. Devin / Manus = autonomous agent. A customer-support bot could be an assistant or an agent, depending on tool access. Each shape has a different bottleneck: assistants are limited by chat history (memory), copilots by latency (must respond in <1s), agents by iteration count (multi-step is expensive), and autonomous systems by episode duration (long-running tasks need orchestration and checkpointing).

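# Helpers such as retrieve, llm_call, llm_call_streaming, run_agent_loop, load_task,
# plan_next, execute, checkpoint, needs_human_review, and pause_for_review are
# placeholders you supply; ASSISTANT_PROMPT is assumed to be a string.

# Assistant: chat + retrieval, low iteration count
def assistant_handler(query, history, user_id):
    chunks = retrieve(query, k=5)  # assumed to return a list of strings
    # Join the chunks into the system prompt (str + list would raise a TypeError)
    system = ASSISTANT_PROMPT + "\n\n" + "\n".join(chunks)
    return llm_call(system=system, history=history + [query])

# Copilot: embedded in a workflow, latency-critical
def copilot_handler(context):
    # < 500ms p99 target
    # `...` stands for earlier messages left out of the demo
    return llm_call_streaming(model="claude-haiku-4-5-20251001", messages=[..., context])

# Agent: multi-step with tools
def agent_handler(query, user_id):
    return run_agent_loop(query, user_id, max_steps=6)   # 5-15 seconds typical

# Autonomous: runs without user in the loop
def autonomous_handler(task_id):
    # may run for minutes to hours
    # needs: checkpointing, recovery on crash, progress reporting
    task = load_task(task_id)  # hypothetical helper that looks up the task by id
    while not task.complete:
        next_action = plan_next(task)
        result = execute(next_action)
        checkpoint(task, result)  # persist progress after every step
        if needs_human_review(result):
            pause_for_review(task)
            return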

# Decision tree
# - sync user-facing chat?              -> assistant or agent
# - embedded in another tool?            -> copilot
# - long-running background work?        -> autonomous
# - high-volume cheap queries?           -> probably assistant
# - few queries, deeply researched?      -> agent or autonomous

Run: python3 main.py
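
The decision tree in the comments above could also be written as a small function. This is a minimal sketch, not part of the lesson's main.py: the name pick_shape and its boolean parameters are hypothetical, and the order in which the rules are checked (embedded first, then background work, then volume) is one assumed reading of the tree.

```python
def pick_shape(embedded_in_tool: bool, long_running_background: bool,
               sync_chat: bool, high_volume_cheap: bool) -> list[str]:
    """Return candidate shapes, following the decision-tree comments.

    Hypothetical helper: the parameter names and the precedence of the
    rules are assumptions made for illustration.
    """
    if embedded_in_tool:
        return ["copilot"]
    if long_running_background:
        return ["autonomous"]
    if high_volume_cheap:
        return ["assistant"]
    if sync_chat:
        return ["assistant", "agent"]
    # Remaining case: few queries, deeply researched
    return ["agent", "autonomous"]


print(pick_shape(False, False, True, True))    # high-volume chat product
print(pick_shape(True, False, False, False))   # editor plugin
```

The function returns a list because several branches of the tree name two candidates; choosing between them still depends on tool access and cost.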

Try it yourself

Identify which shape your product actually is. Be honest: many self-described 'agents' are assistants.

For each shape, identify the bottleneck early: chat history for assistants, latency for copilots, iterations for agents, episode duration for autonomous.
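
To make the "iterations for agents" bottleneck concrete, here is a rough sketch of a loop with a hard step budget. It assumes each step is one model call plus tool execution; run_bounded_loop, step_fn, and fake_step are hypothetical names used only for illustration, not the lesson's run_agent_loop.

```python
from typing import Callable


def run_bounded_loop(query: str,
                     step_fn: Callable[[str, list[str]], tuple[str, bool]],
                     max_steps: int = 6) -> dict:
    """Call step_fn until it reports done or the step budget is spent.

    step_fn stands in for one model call plus tool execution (hypothetical);
    it returns (observation, done).
    """
    observations: list[str] = []
    for step in range(1, max_steps + 1):
        observation, done = step_fn(query, observations)
        observations.append(observation)
        if done:
            return {"steps": step, "finished": True, "observations": observations}
    # Budget exhausted: report it instead of looping indefinitely
    return {"steps": max_steps, "finished": False, "observations": observations}


def fake_step(query: str, observations: list[str]) -> tuple[str, bool]:
    # Stub: pretends the task is done on the third step
    n = len(observations) + 1
    return f"step {n} for {query!r}", n >= 3


result = run_bounded_loop("find the cheapest flight", fake_step)
print(result["steps"], result["finished"])
```

Logging the step count per request is one cheap way to see early whether iteration count, rather than latency or history size, is what drives cost.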

Mix shapes only when needed. A SaaS might have an assistant frontend + autonomous background agents for long-running tasks.
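
One way to picture an assistant frontend combined with autonomous background work is a request handler that replies inline for quick questions and enqueues anything long-running. This is only a sketch under assumptions: handle_request, answer_now, and the long_running flag are hypothetical, and an in-process queue.Queue stands in for whatever job system a real deployment would use.

```python
import queue
import uuid

# In-process stand-in for a real job queue (assumption for illustration)
background_tasks = queue.Queue()


def answer_now(message: str) -> str:
    # Stub for the assistant path; a real system would call an LLM here
    return f"(assistant) reply to: {message}"


def handle_request(message: str, long_running: bool) -> dict:
    """Answer inline, or enqueue work for a background autonomous worker."""
    if long_running:
        task_id = str(uuid.uuid4())
        background_tasks.put({"task_id": task_id, "goal": message})
        return {"kind": "queued", "task_id": task_id}
    return {"kind": "reply", "text": answer_now(message)}


print(handle_request("What plan am I on?", long_running=False)["kind"])
print(handle_request("Audit last year's invoices", long_running=True)["kind"])
print(background_tasks.qsize())
```

Keeping the two paths separate means the chat frontend never blocks on a task that may run for minutes or hours.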

Cost-per-task varies wildly. Autonomous can be $1+/task; assistant is $0.01/query. Plan accordingly.
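
A quick back-of-the-envelope calculation shows why the per-unit figures above matter. The sketch below takes them as dollar amounts (1 per autonomous task as a lower bound, 0.01 per assistant query); the monthly volumes are made-up illustration values, not measurements.

```python
# Per-unit costs come from the lesson; the volumes are assumed for illustration.
ASSISTANT_COST_PER_QUERY = 0.01   # dollars
AUTONOMOUS_COST_PER_TASK = 1.00   # dollars, lower bound ("1+")


def monthly_cost(unit_cost: float, units_per_month: int) -> float:
    """Total monthly spend for a given per-unit cost and volume."""
    return unit_cost * units_per_month


assistant = monthly_cost(ASSISTANT_COST_PER_QUERY, 500_000)  # assumed volume
autonomous = monthly_cost(AUTONOMOUS_COST_PER_TASK, 5_000)   # assumed volume
print(f"assistant:  ${assistant:,.2f}/month")
print(f"autonomous: ${autonomous:,.2f}/month")
```

With these assumed volumes the two shapes cost about the same per month, even though the autonomous system handles 100 times fewer units of work.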

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

Compare assistant, copilot, agent, autonomous. Where does each one bottleneck at scale?

2. Why it works (the mechanism)

Walk me through latency vs iteration trade-offs across the four shapes.

3. Advanced — application & what's next

I'm building a 'project planning' AI. Should it be assistant, agent, or autonomous? Walk through the design.