Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
It's tempting to dump everything into the window 'just in case' — the model is smart, let it sort it out. But irrelevant context actively hurts: it dilutes attention, raises the chance the model latches onto the wrong passage, increases latency and cost, and pushes important material toward positions the model attends to less. The skill is subtractive. A tight context of five relevant chunks beats a sprawling one of fifty, almost every time. This is the single most counterintuitive lesson for people coming from prompt engineering, where 'add more instructions' usually helps.
This is measurable, not philosophical. Take a question with one correct supporting passage, then answer it twice: once with just that passage, once with the passage buried among 40 distractors. Accuracy and latency both move in the wrong direction as you add noise.
Use these three prompts in order. Each builds on the one before.
Explain why adding more context to a prompt can make answers worse, not better. What does 'attention dilution' mean in plain terms?
Walk me through what happens inside the model's attention when relevant evidence is surrounded by irrelevant passages. Why does position within the context matter?
Design an experiment to measure how my RAG system's accuracy degrades as I increase k (chunks retrieved). What metrics would I track and how would I find the 'knee' where more context stops helping?
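For the third prompt, one possible way to locate the knee is to measure accuracy at several values of k and pick the last k before the gain per step drops below a threshold. The sketch below assumes you already have accuracy figures from your own evaluation set. The function name `find_knee`, the `min_gain` threshold, and the numbers in `example` are all hypothetical placeholders, not measurements.

```python
def find_knee(accuracy_by_k, min_gain=0.01):
    """Return the last k before extra chunks stop adding at least min_gain accuracy.

    accuracy_by_k: dict mapping k (chunks retrieved) -> accuracy in [0, 1].
    min_gain: assumed threshold for a "worthwhile" improvement; tune to taste.
    """
    ks = sorted(accuracy_by_k)
    for prev, nxt in zip(ks, ks[1:]):
        # Stop at the first step where accuracy flattens out or drops.
        if accuracy_by_k[nxt] - accuracy_by_k[prev] < min_gain:
            return prev
    # Accuracy kept improving across the whole range; the knee lies beyond it.
    return ks[-1]


# Made-up illustrative numbers only; substitute results from your own system.
example = {1: 0.62, 3: 0.78, 5: 0.84, 10: 0.84, 20: 0.80, 40: 0.71}
print(find_knee(example))
```

A threshold on the step-to-step gain is only one simple heuristic. With noisy measurements you would likely want several trials per k, averaged before looking for the knee.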
from anthropic import Anthropic
import time

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

GOLD = "The warranty on the X200 is 36 months from date of purchase."
DISTRACTORS = [f"Unrelated spec note #{i}: tolerance is 0.5mm." for i in range(40)]

def ask(context_blocks):
    """Ask the warranty question against the given context; return (answer, seconds)."""
    ctx = "\n".join(context_blocks)
    t = time.time()
    r = client.messages.create(
        model="claude-sonnet-4-6", max_tokens=100,
        system=f"Answer only from CONTEXT.\nCONTEXT:\n{ctx}",
        messages=[{"role": "user", "content": "How long is the X200 warranty?"}],
    )
    return r.content[0].text.strip(), round(time.time() - t, 2)

# Tight context: only the passage that answers the question.
print("tight: ", ask([GOLD]))
# Noisy context: the same passage buried in the middle of 40 distractors.
print("noisy: ", ask(DISTRACTORS[:20] + [GOLD] + DISTRACTORS[20:]))

Run it:

python3 main.py
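The demo compares only two contexts. Since both the amount of noise and the position of the relevant passage are claimed to matter, a natural extension is to sweep over both. The sketch below is one way to do that, under some assumptions: `place_gold`, `is_correct`, and `sweep` are hypothetical helpers, and `ask_fn` is assumed to behave like the demo's `ask()`, taking a list of context blocks and returning `(answer, seconds)`. It runs offline against a stand-in function, so no API key is needed to try it.

```python
def place_gold(gold, distractors, n_noise, depth):
    """Build a context of n_noise distractors with the gold passage at a relative depth.

    depth is a fraction in [0, 1]: 0.0 puts the gold passage first, 1.0 last.
    """
    noise = list(distractors[:n_noise])
    idx = round(depth * len(noise))
    return noise[:idx] + [gold] + noise[idx:]


def is_correct(answer, expected="36 months"):
    """Crude substring check; a real evaluation would likely need a stricter grader."""
    return expected.lower() in answer.lower()


def sweep(ask_fn, gold, distractors, noise_levels=(0, 10, 40), depths=(0.0, 0.5, 1.0)):
    """Call ask_fn over a grid of noise levels and gold positions; return one row per call."""
    rows = []
    for n in noise_levels:
        # With no noise every depth yields the same context, so test it once.
        for d in (depths if n else depths[:1]):
            answer, latency = ask_fn(place_gold(gold, distractors, n, d))
            rows.append({"noise": n, "depth": d,
                         "correct": is_correct(answer), "latency": latency})
    return rows


gold = "The warranty on the X200 is 36 months from date of purchase."
noise = [f"Unrelated spec note #{i}: tolerance is 0.5mm." for i in range(40)]

# Offline stand-in so the sketch runs as-is; swap in the real ask() to measure anything.
def fake_ask(blocks):
    return "36 months", 0.0

for row in sweep(fake_ask, gold, noise):
    print(row)
```

With the real `ask()` in place of `fake_ask`, each row records whether the answer was right and how long the call took, which makes it easy to see whether accuracy or latency shifts with noise level or gold position. A single call per cell is noisy, so repeating each cell a few times would give steadier numbers.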