Test doubles are the tools that make unit testing possible when your code has dependencies (databases, HTTP APIs, queues, file systems). But 'mock everything' is a trap: tests full of mock setup become brittle contracts that break when implementation details change, not when behavior changes. The three main tools have different jobs: a fake is a working implementation (in-memory database instead of real DB), a stub returns canned responses (always returns user 42), and a mock records interactions and asserts on them (verify send_email was called exactly once with the right arguments). Using the wrong tool causes tests that pass when they should fail, or tests so tightly coupled to implementation that any refactor breaks them.
Each double suits a different situation: use a fake (a working in-memory implementation, such as an in-memory database) when you need realistic behavior across multiple tests; a stub (a hardcoded return value) when you only need to control the code path under test; and a mock (a recorder of calls) when the exact interactions are what you want to assert on. Reaching for a mock when a stub would do, or a stub when a fake would give more confidence, is the most common test-double mistake. It leads either to brittle tests that break on every refactor or to tests that never verify the behavior they claim to cover.
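The "passes when it should fail" failure mode can be sketched in a few lines. This is a minimal illustration under stated assumptions, not part of the lesson's demo: `BuggyNotifier`, `RecordingSender`, and both helper functions are hypothetical names invented here. The service has a deliberate bug (the subject omits the user's name); an interaction-only check on a mock does not notice, while a state check against a fake does.

```python
from unittest.mock import Mock


class BuggyNotifier:
    """Hypothetical service with a bug: the subject omits the user's name."""

    def __init__(self, sender):
        self._sender = sender

    def notify_welcome(self, email: str, name: str) -> bool:
        return self._sender.send(to=email, subject="Welcome!", body=f"Hi {name}")


class RecordingSender:
    """Hypothetical fake: a tiny working sender that keeps messages in memory."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject, body):
        self.sent.append({"to": to, "subject": subject, "body": body})
        return True


def weak_mock_check_passes() -> bool:
    """Interaction-only check: succeeds even though the subject is wrong."""
    sender = Mock()
    BuggyNotifier(sender).notify_welcome("alice@test.com", "Alice")
    sender.send.assert_called_once()  # says nothing about the arguments
    return True


def subject_is_correct_according_to_fake() -> bool:
    """State check against the fake: compares what was actually sent."""
    sender = RecordingSender()
    BuggyNotifier(sender).notify_welcome("alice@test.com", "Alice")
    return sender.sent[0]["subject"] == "Welcome, Alice!"
```

Here `weak_mock_check_passes()` completes without raising, while `subject_is_correct_according_to_fake()` reports the mismatch. The mock was not the wrong tool as such; the assertion placed on it was too weak to cover the behavior the test claimed to verify.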
import pytest
from unittest.mock import Mock, patch, call
from typing import Protocol


class EmailSender(Protocol):
    def send(self, to: str, subject: str, body: str) -> bool: ...


class UserRepository(Protocol):
    def find(self, user_id: int) -> dict | None: ...


class NotificationService:
    def __init__(self, user_repo: UserRepository, email_sender: EmailSender):
        self._repo = user_repo
        self._email = email_sender

    def notify_welcome(self, user_id: int) -> bool:
        user = self._repo.find(user_id)
        if not user:
            return False
        return self._email.send(
            to=user["email"],
            subject=f"Welcome, {user['name']}!",
            body=f"Hi {user['name']}, your account is ready.",
        )
# ── Type 1: FAKE — a working in-memory implementation ────────────────────────
class FakeUserRepo:
    def __init__(self): self._db = {}
    def add(self, user_id, user): self._db[user_id] = user
    def find(self, user_id): return self._db.get(user_id)


class FakeEmailSender:
    def __init__(self): self.sent = []
    def send(self, to, subject, body):
        self.sent.append({"to": to, "subject": subject})
        return True


def test_welcome_email_sent_to_correct_address():
    # FAKE: in-memory repo and email catcher
    repo = FakeUserRepo()
    email = FakeEmailSender()
    repo.add(1, {"email": "alice@test.com", "name": "Alice"})
    svc = NotificationService(repo, email)
    svc.notify_welcome(1)
    assert len(email.sent) == 1
    assert email.sent[0]["to"] == "alice@test.com"
# ── Type 2: STUB — canned return value, no verification ──────────────────────
def test_returns_false_when_user_not_found():
    stub_repo = Mock(spec=UserRepository)
    stub_repo.find.return_value = None  # always returns None (stub)
    stub_email = Mock(spec=EmailSender)
    svc = NotificationService(stub_repo, stub_email)
    result = svc.notify_welcome(999)
    assert result is False
    # The repo is the stub here; this last check is a mock-style assertion on the sender
    stub_email.send.assert_not_called()  # email not sent for missing user
# ── Type 3: MOCK — verify interactions (HOW it was called) ───────────────────
def test_email_called_with_correct_subject():
    mock_repo = Mock(spec=UserRepository)
    mock_email = Mock(spec=EmailSender)
    mock_repo.find.return_value = {"email": "bob@test.com", "name": "Bob"}
    mock_email.send.return_value = True
    svc = NotificationService(mock_repo, mock_email)
    svc.notify_welcome(1)
    # Assert on HOW the mock was called (not just that it was called)
    mock_email.send.assert_called_once_with(
        to="bob@test.com",
        subject="Welcome, Bob!",
        body="Hi Bob, your account is ready.",
    )

python3 main.py

Change mock_email.send.assert_called_once_with(...) to mock_email.send.assert_called_once() (no argument check). What does this test now verify? Is it weaker or stronger? This shows the difference between 'it was called' (interaction) vs 'it was called correctly' (contract).
Add a test in which notify_welcome is called twice for the same user. Assert that send is called exactly twice. Then change the implementation to deduplicate (skip the second send). Which version of the assertion catches this behavioral change?

Use these three in order. Each builds on the one before.
In one paragraph, explain the difference between a stub and a mock. What does a stub control (return value) and what does a mock verify (interactions)? Give a concrete example using an email sender.
Walk me through when you'd use each test double type: fake (complex, realistic behavior needed for multiple tests), stub (you only care about the return value), mock (you need to verify HOW a dependency was called). Give one real-world scenario for each.
I'm reviewing a PR where 80% of the test code is setting up mock expectations: every collaborator mocked, every method return value specified, assertions on the exact order of calls. The developer says 'this is very thoroughly tested.' Walk me through why this test suite is fragile, what it will cost in maintenance, and how to restructure it using fakes and a smaller number of targeted mocks.
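The fragility described in the last prompt can be sketched as follows. This is an illustrative example under assumptions, not the lesson's own code: `InMemoryUsers`, `Outbox`, `notify_v1`, `notify_v2`, and the two check functions are hypothetical names. `notify_v2` is a behavior-preserving refactor of `notify_v1` (same message, positional instead of keyword arguments). The over-specified mock check rejects the refactor, while the fake-based check accepts both versions.

```python
from unittest.mock import Mock, call


class InMemoryUsers:
    """Hypothetical fake repository backed by a dict."""

    def __init__(self, users):
        self._users = dict(users)

    def find(self, user_id):
        return self._users.get(user_id)


class Outbox:
    """Hypothetical fake sender that records delivered messages."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject, body):
        self.sent.append((to, subject, body))
        return True


def notify_v1(repo, sender, user_id):
    """Original implementation: passes keyword arguments."""
    user = repo.find(user_id)
    if not user:
        return False
    return sender.send(
        to=user["email"],
        subject=f"Welcome, {user['name']}!",
        body="Your account is ready.",
    )


def notify_v2(repo, sender, user_id):
    """Refactor with identical behavior: same message, positional arguments."""
    user = repo.find(user_id)
    if not user:
        return False
    subject = f"Welcome, {user['name']}!"
    return sender.send(user["email"], subject, "Your account is ready.")


BOB = {"email": "bob@test.com", "name": "Bob"}


def overspecified_mock_check(notify) -> bool:
    """Pins the exact call shape on every collaborator."""
    repo, sender = Mock(), Mock()
    repo.find.return_value = BOB
    notify(repo, sender, 1)
    try:
        assert repo.mock_calls == [call.find(1)]
        sender.send.assert_called_once_with(
            to="bob@test.com",
            subject="Welcome, Bob!",
            body="Your account is ready.",
        )
    except AssertionError:
        return False
    return True


def fake_based_check(notify) -> bool:
    """Checks only the observable outcome: what ended up in the outbox."""
    repo, outbox = InMemoryUsers({1: BOB}), Outbox()
    notify(repo, outbox, 1)
    return outbox.sent == [("bob@test.com", "Welcome, Bob!", "Your account is ready.")]
```

With these definitions, `overspecified_mock_check` accepts `notify_v1` but rejects `notify_v2`, even though both send the same message; `fake_based_check` accepts both. A targeted mock still earns its place where the interaction itself is the requirement, for example "send exactly one email".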