Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
Without a cost cap, your first viral tweet costs you X-per-user × N — capped at your budget. The cap goes on before the launch announcement. Cheap to add early; expensive to retrofit after a runaway bill.
Layered: (1) Per-user request rate limit (60/min). (2) Per-user daily request cap (varies by tier). (3) Per-user daily cap (panic stop). (5) Per-tool call rate (expensive tools tighter). Implementation: Redis SETEX counters. Action on cap: 429 with a clear 'you've hit your daily limit'.
Use these three in order. Each builds on the one before.
Name 4 cost-control layers for an agentic API at Day 1.
Walk me through 'reserve estimate, then reconcile actual'. Why pre-estimate vs reconcile-only?
Design cost caps for a free-tier-heavy SaaS: 50 free queries/day, 90% free users, want abuse-resistant. What's the policy?
import redis, time
r = redis.Redis()
def rate_limit(key, limit, window_s):
now = time.time()
pipe = r.pipeline()
pipe.zremrangebyscore(key, 0, now - window_s)
pipe.zcard(key)
pipe.zadd(key, {str(now): now})
pipe.expire(key, window_s)
_, count, _, _ = pipe.execute()
return count < limit
def cost_cap(user_id, additional_cents):
key = f"cost:{user_id}:{today()}"
spent = int(r.get(key) or 0)
if spent + additional_cents > daily_cap_cents(user_id):
return False
r.incrby(key, additional_cents)
r.expire(key, 86400 * 2)
return True
def daily_cap_cents(user_id):
tier = get_user_tier(user_id)
return {"free": 10, "pro": 200, "team": 2000, "ent": 20000}[tier]
# Global panic stop
def global_budget_ok():
today_spent = int(r.get(f"global_cost:{today()}") or 0)
return today_spent < int(os.environ.get("GLOBAL_DAILY_CAP_CENTS", "100000"))
# In the request handler
@app.post("/api/chat")
async def chat(req, user = Depends(get_current_user)):
if not global_budget_ok():
raise HTTPException(503, "global cost cap reached; service paused")
if not rate_limit(f"u:{user['id']}", 60, 60):
raise HTTPException(429)
estimated_cents = 1 # rough pre-estimate
if not cost_cap(user["id"], estimated_cents):
raise HTTPException(429, "daily cost cap reached")
# ... run agent
actual_cents = compute_actual_cost(usage)
r.incrby(f"cost:{user['id']}:{today()}", actual_cents - estimated_cents)
r.incrby(f"global_cost:{today()}", actual_cents)python3 main.py