Context Windows and Tokens

easy

Why this matters

Every model has a context window, measured in tokens, not words. When your prompt exceeds it, either the request fails or the application in front of the model silently drops content, usually the oldest first. This one mechanical fact explains much of the odd behavior you'll see in long conversations.
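
To make the "silently dropped" case concrete, here is a minimal sketch of how a chat application might trim history to fit a token budget. It is not any particular framework's implementation: `rough_tokens`, `trim_history`, and the 4-characters-per-token estimate are illustrative assumptions.

```python
def rough_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token in English (an assumption)."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the newest messages that fit in `budget` tokens; drop the oldest."""
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so the most recent turns survive.
    for message in reversed(messages):
        cost = rough_tokens(message)
        if used + cost > budget:
            break  # this message and everything older is dropped
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["a" * 400, "b" * 400, "c" * 400]  # about 100 tokens each
print(len(trim_history(history, budget=250)))  # 2: the oldest message is gone
```

Nothing in this sketch tells the user that a message was dropped, which is why the model can appear to "forget" the start of a long conversation.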

Demo

A token is roughly ¾ of a word in English. "Hello, world!" is 4 tokens in OpenAI's tokenizer; other tokenizers may split it differently. A 2,000-word document is about 2,700 tokens (2,000 ÷ 0.75). Claude's default window is 200K tokens; GPT-4o's is 128K; open-source models often range 8K–32K.
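
The rule of thumb above can be turned into a quick estimator. The sketch below is only a heuristic, not a real tokenizer; the function names are made up for illustration, and the 4-characters-per-token figure is a second common approximation for English text.

```python
def tokens_from_words(text: str) -> int:
    """Estimate tokens assuming one token is about 0.75 English words."""
    return round(len(text.split()) / 0.75)


def tokens_from_chars(text: str) -> int:
    """Estimate tokens assuming about 4 characters per token."""
    return round(len(text) / 4)


doc = "word " * 2000  # a synthetic 2,000-word document
print(tokens_from_words(doc))  # 2667
print(tokens_from_chars(doc))  # 2500
```

The two estimates disagree by a few percent, which is expected. Treat either as a ballpark and use the model's own tokenizer when you need an exact count.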

If you paste a 500-page PDF into a 128K-token window, something has to give. Either the API rejects you, or the framework silently trims. Knowing your model's window is the first step to not being surprised.
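
A back-of-the-envelope check shows how far over the limit such a PDF would be. The 400 words per page is an assumed figure for illustration, and `fits` is a hypothetical helper, not part of any API.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English text


def fits(prompt_tokens: int, window: int, reserved_for_output: int = 0) -> bool:
    """True if the prompt fits once room for the model's reply is set aside."""
    return prompt_tokens <= window - reserved_for_output


pages = 500
words_per_page = 400  # assumption; real PDFs vary widely
pdf_tokens = round(pages * words_per_page / WORDS_PER_TOKEN)

print(pdf_tokens)                        # 266667
print(fits(pdf_tokens, window=128_000))  # False
print(fits(30_000, window=32_000, reserved_for_output=4_000))  # False
```

The last line illustrates a separate trap: a 30K-token prompt is under a 32K window, but not once 4K tokens are reserved for the reply.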

Try it yourself

Go to OpenAI's tokenizer and paste something you commonly ask about. Count the tokens. Now count your typical chat history. Are you anywhere near the limit?

Prompt your AI

Rough-count the tokens in the text below without running any tool (use the heuristic: ~4 characters per token in English). Then tell me:
1. Estimated token count.
2. Whether it fits in a 32K-token context window with 4K reserved for output.
3. What I'd cut if it doesn't fit.

Text:
"""
[PASTE HERE]
"""
