Mocking, stubbing, and faking — when to use which

medium

Why this matters

Test doubles are the tools that make unit testing possible when your code has dependencies (databases, HTTP APIs, queues, file systems). But 'mock everything' is a trap: tests full of mock setup become brittle contracts that break when implementation details change, not when behavior changes. The three main tools have different jobs: a fake is a working implementation (in-memory database instead of real DB), a stub returns canned responses (always returns user 42), and a mock records interactions and asserts on them (verify send_email was called exactly once with the right arguments). Using the wrong tool causes tests that pass when they should fail, or tests so tightly coupled to implementation that any refactor breaks them.
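To make the brittleness claim concrete, here is a minimal sketch. Every name in it (`FakeCatalog`, `cart_total_v1`, `cart_total_v2`, and the two check helpers) is hypothetical and exists only for illustration. Two implementations return the same total. A check built on a fake accepts both, while a check that pins down the exact calls accepts only the first.

```python
from unittest.mock import Mock, call


class FakeCatalog:
    """Hypothetical in-memory price catalog (a fake)."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, sku):
        return self._prices[sku]

    def prices_of(self, skus):
        return {sku: self._prices[sku] for sku in skus}


def cart_total_v1(catalog, skus):
    # Original implementation: one lookup per item.
    return sum(catalog.price_of(sku) for sku in skus)


def cart_total_v2(catalog, skus):
    # Refactor: one batched lookup. The caller sees the same result.
    prices = catalog.prices_of(skus)
    return sum(prices[sku] for sku in skus)


def behavior_check(cart_total):
    # State-based: looks only at the result, so it survives the refactor.
    catalog = FakeCatalog({"apple": 3, "pear": 5})
    return cart_total(catalog, ["apple", "pear"]) == 8


def interaction_check(cart_total):
    # Interaction-based: pins down which method is called, and in what order.
    prices = {"apple": 3, "pear": 5}
    catalog = Mock()
    catalog.price_of.side_effect = lambda sku: prices[sku]
    catalog.prices_of.return_value = prices
    cart_total(catalog, ["apple", "pear"])
    try:
        catalog.price_of.assert_has_calls([call("apple"), call("pear")])
        return True
    except AssertionError:
        return False
```

Under these assumptions, `interaction_check(cart_total_v2)` fails even though the behavior is unchanged, which is the kind of false alarm the paragraph above warns about.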

Demo

The demo tests one NotificationService three ways. The first test uses fakes (an in-memory repository and an email catcher) and asserts on the resulting state. The second uses a stub to force the "user not found" path. The third uses a mock to assert on the exact arguments passed to send. Reaching for a mock when a stub would do, or a stub when a fake would give more confidence, is a common test double mistake: it leads either to brittle tests that break on every refactor or to tests that never verify the behavior they claim to cover.

# Requires Python 3.10+ for the `dict | None` annotation below.
from typing import Protocol
from unittest.mock import Mock

class EmailSender(Protocol):
    def send(self, to: str, subject: str, body: str) -> bool: ...

class UserRepository(Protocol):
    def find(self, user_id: int) -> dict | None: ...

class NotificationService:
    def __init__(self, user_repo: UserRepository, email_sender: EmailSender):
        self._repo = user_repo
        self._email = email_sender

    def notify_welcome(self, user_id: int) -> bool:
        user = self._repo.find(user_id)
        if not user:
            return False
        return self._email.send(
            to=user["email"],
            subject=f"Welcome, {user['name']}!",
            body=f"Hi {user['name']}, your account is ready.",
        )

# ── Type 1: FAKE — a working in-memory implementation ────────────────────────
class FakeUserRepo:
    def __init__(self): self._db = {}
    def add(self, user_id, user): self._db[user_id] = user
    def find(self, user_id): return self._db.get(user_id)

class FakeEmailSender:
    def __init__(self): self.sent = []
    def send(self, to, subject, body):
        # Record the full message so tests can assert on any field.
        self.sent.append({"to": to, "subject": subject, "body": body})
        return True

def test_welcome_email_sent_to_correct_address():
    # FAKE: in-memory repo and email catcher
    repo  = FakeUserRepo()
    email = FakeEmailSender()
    repo.add(1, {"email": "alice@test.com", "name": "Alice"})
    svc = NotificationService(repo, email)

    svc.notify_welcome(1)

    assert len(email.sent) == 1
    assert email.sent[0]["to"] == "alice@test.com"

# ── Type 2: STUB — canned return value steers the code path ──────────────────
def test_returns_false_when_user_not_found():
    stub_repo = Mock(spec=UserRepository)
    stub_repo.find.return_value = None     # stub: always reports "no such user"
    email = Mock(spec=EmailSender)
    svc = NotificationService(stub_repo, email)

    result = svc.notify_welcome(999)

    assert result is False                 # the stub-driven outcome under test
    # Extra interaction check: this line uses the email double as a mock, not a stub.
    email.send.assert_not_called()

# ── Type 3: MOCK — verify interactions (HOW it was called) ───────────────────
def test_email_called_with_correct_subject():
    mock_repo  = Mock(spec=UserRepository)
    mock_email = Mock(spec=EmailSender)
    mock_repo.find.return_value = {"email": "bob@test.com", "name": "Bob"}
    mock_email.send.return_value = True
    svc = NotificationService(mock_repo, mock_email)

    svc.notify_welcome(1)

    # Assert on HOW the mock was called (not just that it was called)
    mock_email.send.assert_called_once_with(
        to="bob@test.com",
        subject="Welcome, Bob!",
        body="Hi Bob, your account is ready.",
    )

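Run: python3 -m pytest main.py -v (requires pytest; running python3 main.py directly defines the tests but executes none of them)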
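The demo passes spec= to every Mock. The sketch below illustrates one thing that buys you. The `buggy_notify` function and the `passes_with` helper are hypothetical: the caller uses a method name that the interface does not define. A bare Mock() accepts the call silently, while Mock(spec=EmailSender) raises AttributeError.

```python
from typing import Protocol
from unittest.mock import Mock


class EmailSender(Protocol):
    def send(self, to: str, subject: str, body: str) -> bool: ...


def buggy_notify(email_sender):
    # Hypothetical bug: the interface defines send(), not send_mail().
    return email_sender.send_mail(to="a@test.com", subject="Hi", body="Hello")


def passes_with(double):
    # Returns True if the buggy code runs without complaint against this double.
    try:
        buggy_notify(double)
        return True
    except AttributeError:
        return False


loose_result = passes_with(Mock())                    # unspecced mock hides the bug
strict_result = passes_with(Mock(spec=EmailSender))   # specced mock exposes it
```

An unspecced mock is one way to end up with a test that passes when it should fail. A hand-written fake gives similar protection, because it only has the methods you wrote.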

Try it yourself

Change mock_email.send.assert_called_once_with(...) to mock_email.send.assert_called_once() (no argument check). What does this test now verify? Is it weaker or stronger? This shows the difference between 'it was called' (interaction) vs 'it was called correctly' (contract).

Add a test: notify_welcome is called twice for the same user. Assert that send is called exactly twice. Then change the implementation to deduplicate (skip the second send). Which version of the assertion catches this behavioral change?

Find a test in your codebase that mocks a function whose return value isn't used by the test (a 'mock that does nothing'). Is this a mock, stub, or fake? Could it be replaced with a simpler in-memory fake?

Read 'Mocks Aren't Stubs' by Martin Fowler, which contrasts classical (state-based) and mockist (interaction-based) testing, and look for the arguments against mocking internal collaborators. What is the line between 'mock the external dependency' (good) vs 'mock everything to isolate every class' (harmful)?
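As a starting point for the first exercise, here is a small sketch of how the two assertion styles differ when the caller gets an argument wrong. The `send_with_wrong_subject` function and the `check` helper are hypothetical and stand in for a buggy implementation.

```python
from unittest.mock import Mock


def send_with_wrong_subject(email_sender):
    # Hypothetical buggy caller: the subject line contains a typo.
    email_sender.send(to="bob@test.com", subject="Welcom, Bob!", body="Hi Bob")


def check(assertion):
    # Run the buggy caller against a fresh mock, then apply one assertion.
    mock_email = Mock()
    send_with_wrong_subject(mock_email)
    try:
        assertion(mock_email)
        return "pass"
    except AssertionError:
        return "fail"


weak = check(lambda m: m.send.assert_called_once())
strong = check(
    lambda m: m.send.assert_called_once_with(
        to="bob@test.com", subject="Welcome, Bob!", body="Hi Bob"
    )
)
```

Here `weak` is "pass" and `strong` is "fail": the call-count assertion cannot see the typo, while the argument-checking one can.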

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

In one paragraph, explain the difference between a stub and a mock. What does a stub control (return value) and what does a mock verify (interactions)? Give a concrete example using an email sender.

2. Why it works (the mechanism)

Walk me through when you'd use each test double type: fake (complex, realistic behavior needed for multiple tests), stub (you only care about the return value), mock (you need to verify HOW a dependency was called). Give one real-world scenario for each.

3. Advanced — application & what's next

I'm reviewing a PR where 80% of the test code is setting up mock expectations: every collaborator mocked, every method return value specified, assertions on the exact order of calls. The developer says 'this is very thoroughly tested.' Walk me through why this test suite is fragile, what it will cost in maintenance, and how to restructure it using fakes and a smaller number of targeted mocks.