Reranking with Cohere or Voyage

medium

Why this matters

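Hybrid retrieval returns roughly the top 50 candidates, but the LLM only sees the top 5, so the ordering of those candidates matters enormously. A reranker (Cohere Rerank-3, Voyage Rerank-2, or a self-hosted BGE Reranker) is a cross-encoder: it reads the query and each document together and scores every (query, doc) pair, which is more precise than comparing precomputed embeddings. Managed rerankers cost about $1 per 1,000 queries and usually lift answer accuracy by 10-25%.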

Demo

Workflow: hybrid retrieval → top-50 → fetch chunk text → rerank → top-5 → feed to the LLM. Latency: ~80-150 ms to rerank 50 docs. Cost: ~$0.001/query for Cohere. Self-hosting BGE Reranker on a small GPU box ($50-100/mo) is not free, but it has no per-query fee, so at these prices it becomes the cheaper option above roughly 50,000-100,000 queries per month. For the project, Cohere/Voyage is the fastest path.
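
The rerank step of that workflow can be sketched as below. This is a minimal illustration, not the Cohere or Voyage API: `rerank` and `toy_score` are hypothetical names, and `toy_score` is a crude token-overlap stand-in for the cross-encoder so the example runs offline. In a real pipeline the scoring function would call a managed reranker or a self-hosted model instead.

```python
from typing import Callable, List, Tuple


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder.

    Returns the fraction of query tokens that also appear in the document.
    A real reranker would score the (query, doc) pair with a model instead.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score every (query, doc) pair and keep the top_k highest-scoring docs.

    `candidates` is the chunk text of the retrieval results (e.g. top-50),
    in retrieval order. The sort is stable, so ties keep retrieval order.
    """
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    docs = [
        "how to reset a password",
        "billing and invoices",
        "reset your account password in settings",
    ]
    for doc, score in rerank("reset password", docs, top_k=2):
        print(f"{score:.2f}  {doc}")
```

Passing the scorer in as `score_fn` keeps the pipeline independent of the provider, so swapping Cohere, Voyage, or a self-hosted BGE Reranker only changes that one function.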
