Embeddings — what they are, what to use

easy

Why this matters

An embedding is a vector representation of text that places semantically similar text close together in N-dimensional space. The choice of embedding model sets your retrieval ceiling: with a bad embedding, 'how do I get a refund?' lands miles from 'cancel my order'. For 2026, sensible defaults: OpenAI text-embedding-3-small ($0.02/1M tokens, 1536 dims) for general English; Voyage-3 or BGE-large for stronger English retrieval; Cohere embed-multilingual-v3.0 for non-English or cross-lingual text. Run a handful of comparison queries against your own data before committing. Switching models later means re-embedding the whole corpus, so a 20-minute experiment now saves a months-long migration.
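To make "close together" concrete, here is a minimal sketch of cosine similarity, the usual closeness measure for embeddings. The three-dimensional vectors are made up for illustration and do not come from any real model; only the relative ordering of the two scores matters.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings". Real models return hundreds or thousands
# of dimensions; these values are invented for the example.
refund_question = [0.9, 0.1, 0.2]   # "how do I get a refund?"
cancel_order = [0.8, 0.2, 0.3]      # "cancel my order"
weather_report = [0.1, 0.9, 0.1]    # unrelated text

sim_related = cosine_similarity(refund_question, cancel_order)
sim_unrelated = cosine_similarity(refund_question, weather_report)

print(f"refund vs cancel:  {sim_related:.3f}")
print(f"refund vs weather: {sim_unrelated:.3f}")
```

A good embedding model should score the related pair well above the unrelated one. With a real model you would replace the hard-coded lists with the vectors returned by the provider's API.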

Demo

Trade-offs: larger embeddings (1536-3072 dims) capture more nuance at higher storage and retrieval cost. Smaller embeddings (256-768 dims) are cheaper and faster, and often 'good enough' for narrow domains. Newer Matryoshka-style embeddings (offered by Voyage and in OpenAI's text-embedding-3 models) let you truncate vectors to fit your size budget without retraining: store 256 dims and keep roughly 90% of 1536-dim quality, though the exact figure depends on the model and your data. The unsexy truth: differences between embedding models matter less than chunking and reranking. Pick a reasonable default and move on.
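The storage side of this trade-off is simple arithmetic, and truncation itself is only a slice plus a rescale. The sketch below assumes float32 storage (4 bytes per value), ignores index overhead, and uses a made-up 8-dim vector in place of a real 1536-dim one. The function names are illustrative, not from any library. Truncation only preserves quality for models trained with Matryoshka-style objectives.

```python
import math

def truncate_and_normalize(vector, dims):
    """Keep the first `dims` components, then rescale to unit length.

    Cosine similarity ignores vector length, but re-normalizing matters
    if your vector store ranks by raw dot product.
    """
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]

def storage_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw vector storage, assuming float32 and no index overhead."""
    return num_vectors * dims * bytes_per_value

# Made-up vector standing in for a full-size embedding.
full = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
short = truncate_and_normalize(full, 4)

full_size = storage_bytes(1_000_000, 1536)
small_size = storage_bytes(1_000_000, 256)

print(f"1M vectors at 1536 dims: {full_size / 1e9:.2f} GB")
print(f"1M vectors at 256 dims:  {small_size / 1e9:.2f} GB")
```

Going from 1536 to 256 dims cuts raw vector storage by a factor of six. Whether the quality loss is acceptable is something only your own eval set can tell you.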

Try it yourself

Pick 20 realistic (question, correct-doc) pairs from your domain. Add 20 hard negatives (each question paired with a semantically nearby but wrong doc). This is your eval set.
Run two or three embedding models against the eval set. Compare the margin for each model (cosine similarity to the correct doc minus cosine similarity to the hard negative). As a rule of thumb, a margin difference of 0.05 or more is meaningful; less is noise.
Try Matryoshka truncation: take 1536-dim vectors, truncate to 512, re-score. Often you keep 90%+ of the margin at 1/3 the storage.
Compute the monthly bill: tokens per request × requests per month × price per token. text-embedding-3-small is ~$0.02/1M tokens; Voyage-3 is ~$0.12/1M; the differences add up at scale. List prices change, so check the providers' pricing pages before you budget.
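The steps above can be sketched in a few lines. This is a minimal outline, not a full eval harness: the 2-dim vectors are invented stand-ins for real model output, the function names are hypothetical, and the traffic figures (500 tokens per request, 2 million requests per month) are assumptions chosen to keep the arithmetic easy to follow. The two prices are the ones quoted in the list.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_margin(examples):
    """Average of (positive cosine - negative cosine) over the eval set.

    Each example is (question_vec, correct_doc_vec, hard_negative_vec).
    """
    margins = [cosine(q, pos) - cosine(q, neg) for q, pos, neg in examples]
    return sum(margins) / len(margins)

def monthly_cost_usd(tokens_per_request, requests_per_month, price_per_million):
    """Embedding bill: total tokens per month times the per-token price."""
    total_tokens = tokens_per_request * requests_per_month
    return total_tokens * price_per_million / 1_000_000

# Invented vectors: in practice, embed each text with the model under test.
eval_set = [
    ([1.0, 0.0], [0.9, 0.1], [0.5, 0.5]),
    ([0.0, 1.0], [0.2, 0.8], [0.6, 0.4]),
]
margin = mean_margin(eval_set)
print(f"mean margin: {margin:.3f}")

# Assumed traffic: 500 tokens per request, 2M requests per month.
cost_small = monthly_cost_usd(500, 2_000_000, 0.02)
cost_voyage = monthly_cost_usd(500, 2_000_000, 0.12)
print(f"text-embedding-3-small: ${cost_small:.2f}/month")
print(f"Voyage-3:               ${cost_voyage:.2f}/month")
```

To compare models, build one `eval_set` per model from the same texts and compare the resulting margins. To test truncation, slice every vector to the shorter length before calling `mean_margin` and see how much of the margin survives.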
