Context engineering vs prompt engineering

easy

Learn with your AI

Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.

Open in Claude Open in ChatGPT

Why this matters

Prompt engineering is about the words you write. Context engineering is about everything the model can see when it generates the next token — the system prompt, the conversation so far, retrieved documents, tool results, and the formatting that holds them together. As soon as you move past one-shot chat into RAG, agents, or long sessions, the prompt is a tiny fraction of the context, and the hard problems all live in what else you put in the window and in what order. Treating the context as a budgeted, engineered artifact — not a string you concatenate — is the difference between a demo and a system that holds up.

Demo

Here's the whole discipline in one frame: a model call is f(context) -> tokens, where context is a list of messages plus a system prompt, all measured in tokens. Prompt engineering optimizes one message. Context engineering optimizes the whole list: what's included, what's dropped, what order it's in, and how it's formatted. The snippet below shows the same question answered with a bare prompt vs. an engineered context — same model, very different reliability.

Try it yourself

Run both calls against a fact only your domain knows. The bare call will hedge or invent; the engineered one answers from the SPECS block.
Delete the 'Answer ONLY from the SPECS' rule and re-run. Watch the model start blending training-data guesses with the provided facts.
Move the SPECS block from the system role into the user message. Note whether grounding gets weaker — role placement is a context decision.
Count the tokens in each request (the bare one is tiny). The engineered call costs more per request — that trade-off is what you're learning to manage.

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

In one paragraph, explain the difference between prompt engineering and context engineering, and why the distinction starts to matter once I build RAG or agents.

2. Why it works (the mechanism)

A model call is f(context) -> tokens. Walk me through everything that counts as 'context' in a chat API request, and how each part influences the next-token distribution.

3. Advanced — application & what's next

I'm getting inconsistent answers from an LLM feature even though my prompt is good. Give me a checklist of context-level causes (ordering, role placement, stale history, missing grounding) to diagnose before I touch the prompt.