Capstok — learn by doing

Why this matters

Every action the bot takes on your behalf either earns trust or spends it. A well-drafted reply that lands cleanly earns a small amount. A weird auto-send that confuses a friend spends a large amount. A wrongly deleted email spends a catastrophic amount. Trust is not a linear meter — it's asymmetric, path-dependent, and easily zeroed out. This means the design goal is not 'maximize helpful actions' but 'never take an action whose worst-case trust cost exceeds its expected value'. In practice: aggressive automations on low-stakes, high-frequency tasks (delete newsletter spam, categorize routine mail) earn small amounts of trust steadily; automations on high-stakes tasks (auto-reply to your investor) must be conservative or gated because a single misfire zeroes months of earned trust. The rest of the course is about building the second kind while never letting it destroy the first.
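One way to make the asymmetry concrete is to ask how often an automation may misfire before it spends more trust than it earns. The sketch below assumes each action has only two outcomes (clean, or its worst case) and borrows its numbers from the demo's scoring matrix. The function name `break_even_misfire_rate` is hypothetical and not part of the lesson code.

```python
def break_even_misfire_rate(gain: float, worst_cost: float) -> float:
    """Largest misfire probability p at which expected trust is still >= 0.

    Solves (1 - p) * gain - p * worst_cost = 0 for p.
    Both arguments are positive magnitudes.
    """
    return gain / (gain + worst_cost)


# (gain from a clean run, cost of the worst outcome), taken from the demo matrix
examples = {
    "low    (clean +0.5, user_override -0.5)": (0.5, 0.5),
    "medium (clean +2.0, user_override -3.0)": (2.0, 3.0),
    "high   (clean +5.0, burned -50.0)": (5.0, 50.0),
}

for label, (gain, cost) in examples.items():
    rate = break_even_misfire_rate(gain, cost)
    print(f"{label}: break-even misfire rate = {rate:.1%}")
```

Under these assumptions a low-stakes automation can be wrong about half the time and still break even, while a high-stakes one stops paying for itself at roughly a 9% misfire rate. This is only a rough reading of the "worst case versus expected value" rule, not a complete policy.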

Demo

A trust-ledger primitive: every bot action gets a signed trust delta based on its stakes and outcome. Watch how a single high-stakes misfire (-50) cancels what 100 clean low-stakes actions (+0.5 each) earned.

Try it yourself

Run the simulation and compare the two printed totals. The drop caused by the one bad event is the gap your risk tolerance has to absorb.
Change the matrix so 'catastrophic' outcomes cost -1000 instead of -200. Does it change your design? (It shouldn't; a bot that can burn 1000 points of trust in one action should never be allowed to take that action autonomously.)
For each action type in your delegation audit, place it on the stakes axis (low / medium / high / catastrophic).
Identify the ONE action in your list where a single misfire would wipe out months of earned trust. That's the action that gets the tightest gates.
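To put numbers on the exercises above, the following sketch counts how many clean low-stakes actions it would take to earn back a single loss. It assumes the demo's value of +0.5 per clean low-stakes action. The helper `wins_to_recover` is a hypothetical name introduced only for illustration.

```python
import math

LOW_STAKES_WIN = 0.5  # trust earned per clean low-stakes action in the demo matrix


def wins_to_recover(loss: float, gain_per_win: float = LOW_STAKES_WIN) -> int:
    """Number of clean wins needed to offset one loss (loss may be given as negative)."""
    return math.ceil(abs(loss) / gain_per_win)


scenarios = [
    ("high, burned", -50.0),
    ("catastrophic, burned", -200.0),
    ("catastrophic, burned (raised to -1000)", -1000.0),
]

for label, loss in scenarios:
    print(f"{label}: {wins_to_recover(loss)} clean low-stakes actions to recover")
```

With these values the counts are 100, 400, and 2000. Raising the catastrophic cost from -200 to -1000 changes how long recovery takes, not whether the action needs a gate.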

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

In one paragraph, explain why trust with an automated agent is asymmetric — small wins earn slowly, one bad event costs a lot.

2. Why it works (the mechanism)

Walk me through the mechanism: why can 120 clean automations be undone by one bad send, in terms of memory salience, storytelling, and stakeholder response?

3. Advanced — application & what's next

Design a policy: below what trust threshold should the bot enter 'quiet mode' (draft-only, no auto-sends) until the human resets it?
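As a starting point for that last prompt, here is a minimal sketch of a latching quiet-mode gate. Everything in it is an assumption made for illustration: the class name `QuietModeGate`, the mode labels, and the threshold value of 100 are not part of the lesson's code.

```python
from dataclasses import dataclass


@dataclass
class QuietModeGate:
    threshold: float = 0.0  # hypothetical cutoff; choose it from your own audit
    quiet: bool = False

    def update(self, trust: float) -> None:
        """Latch into quiet mode when trust drops below the threshold."""
        if trust < self.threshold:
            self.quiet = True  # stays set even if trust later recovers

    def reset(self) -> None:
        """Only an explicit human reset leaves quiet mode."""
        self.quiet = False

    def mode(self) -> str:
        return "draft_only" if self.quiet else "auto_send"


gate = QuietModeGate(threshold=100.0)
gate.update(142.5)   # trust total before the misfire in the demo
print(gate.mode())   # auto_send
gate.update(92.5)    # trust total after the misfire
print(gate.mode())   # draft_only
gate.update(150.0)   # trust climbs back, but the latch holds
print(gate.mode())   # draft_only
gate.reset()         # the human re-enables auto-sends
print(gate.mode())   # auto_send
```

The latch is the design choice worth debating: recovering trust on its own does not re-enable auto-sends, so the human stays in control of when the bot acts autonomously again.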



from dataclasses import dataclass
from datetime import datetime

@dataclass
class TrustEvent:
    when: datetime
    action: str
    stakes: str        # "low" | "medium" | "high" | "catastrophic"
    outcome: str       # "clean" | "minor_fix" | "user_override" | "burned"
    delta: float = 0.0  # not used below; the demo recomputes deltas with score()

def score(event: TrustEvent) -> float:
    """Return the signed trust delta for one event, keyed by (stakes, outcome)."""
    matrix = {
        ("low", "clean"): +0.5,
        ("low", "minor_fix"): +0.1,
        ("low", "user_override"): -0.5,
        ("medium", "clean"): +2.0,
        ("medium", "minor_fix"): +0.5,
        ("medium", "user_override"): -3.0,
        ("high", "clean"): +5.0,
        ("high", "minor_fix"): -1.0,
        ("high", "user_override"): -10.0,
        ("high", "burned"): -50.0,
        ("catastrophic", "burned"): -200.0,
    }
    # Pairs missing from the matrix, e.g. ("low", "burned"), score 0.0
    return matrix.get((event.stakes, event.outcome), 0.0)

# Simulate a month of a good bot
events = []
for _ in range(120):  # 120 low-stakes wins
    events.append(TrustEvent(datetime.now(), "delete_newsletter", "low", "clean"))
for _ in range(40):  # 40 medium wins
    events.append(TrustEvent(datetime.now(), "draft_reply", "medium", "clean"))
for _ in range(5):   # 5 drafts that needed a minor fix
    events.append(TrustEvent(datetime.now(), "draft_reply", "medium", "minor_fix"))

trust = sum(score(e) for e in events)
print(f"After 165 clean/minor-fix events: trust = {trust:+.1f}")

# One high-stakes misfire
events.append(TrustEvent(datetime.now(), "auto_reply_investor", "high", "burned"))
trust = sum(score(e) for e in events)
print(f"After one high-stakes misfire:    trust = {trust:+.1f}")
print("\nDesign implication: high-stakes actions MUST be gated by a grace window plus escalation.")

Run: python3 main.py

Trust budget: what your bot earns vs. spends