Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
Every request your agent serves gets logged with a full trace, starting at request #1, not request #10,000. You'll need it for debugging, abuse review, compliance, billing reconciliation, and the inevitable 'why did the agent say X?' customer question. It is cheap to add early and impossible to backfill: a request that was never logged cannot be reconstructed later.
Schema: trace_id, user_id, tenant_id, request_at, model, input (or hash), tools called, output (or hash), token usage, cost.
Storage: hot in DB (30 days), cold in S3 (90+ days), then delete.
PII: redact at write, OR hash with a salt so a value can still be matched exactly in queries but cannot be read back.
Use it: pull the trace for any complaint in <60 seconds.
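The schema and the two PII options above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `AuditRecord`, `redact_emails`, `keyed_hash`, and the `AUDIT_HASH_KEY` environment variable are hypothetical names, the email regex is deliberately simple, and HMAC-SHA256 is swapped in here as a keyed variant of "hash with a salt".

```python
import hashlib
import hmac
import os
import re
import time
import uuid
from dataclasses import asdict, dataclass, field

# Assumption: the key comes from the environment; the fallback is for local demos only.
HASH_KEY = os.environ.get("AUDIT_HASH_KEY", "dev-only-key").encode()

# Deliberately simple pattern; real PII detection needs more than one regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Option 1: redact at write time (emails only in this sketch)."""
    return EMAIL_RE.sub("[EMAIL]", text)


def keyed_hash(text: str) -> str:
    """Option 2: keyed hash. Equal inputs give equal digests, so rows stay matchable."""
    return hmac.new(HASH_KEY, text.encode(), hashlib.sha256).hexdigest()


@dataclass
class AuditRecord:
    """One row per request, mirroring the schema fields listed in the lesson."""
    user_id: str
    tenant_id: str
    model: str
    input_hash: str
    output_hash: str
    tools_called: list = field(default_factory=list)
    token_usage: dict = field(default_factory=dict)
    cost_cents: float = 0.0
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    request_at: float = field(default_factory=time.time)


record = AuditRecord(
    user_id="u_1",
    tenant_id="t_1",
    model="example-model",
    input_hash=keyed_hash("my email is jane@example.com"),
    output_hash=keyed_hash("noted"),
)
row = asdict(record)  # plain dict, ready for json.dumps or a DB insert
```

Hashing only supports exact-match lookups ("did this user send this exact text?"); if support staff need to read what was said, a redacted copy is the option that keeps the text usable.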
Use these three prompts in order. Each builds on the one before.
Why is audit logging non-negotiable at Day 1? Name 4 use cases.
Walk me through 'log preview + hash full text' for PII-safe logging. Trade-offs?
Design an audit log for a multi-tenant SaaS with compliance requirements (SOC 2, GDPR). What's stored, what's not, and what's the retention period?
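import hashlib
import json
import os
import time
import uuid

# Assumption: the salt is supplied via an environment variable; the name is illustrative.
SALT = os.environ.get("AUDIT_SALT", "")

def hash_pii(text):
    # Salted SHA-256, truncated to 16 hex chars: stable for lookups, not readable.
    return hashlib.sha256((SALT + text).encode()).hexdigest()[:16]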

async def chat_with_audit(req, user):
    trace_id = str(uuid.uuid4())
    started_at = time.time()
    audit = {
        "trace_id": trace_id,
        "user_id": user["id"],
        "tier": user["tier"],
        "started_at": started_at,
        "input_hash": hash_pii(req["message"]),  # or full text if your policy allows
        "input_preview": req["message"][:120],  # truncated, NOT redacted: may still contain PII
    }
    answer, usage, tool_calls = await run_agent(req["message"], user)
    ended_at = time.time()  # capture once so ended_at and duration_ms agree
    audit.update({
        "ended_at": ended_at,
        "duration_ms": int((ended_at - started_at) * 1000),
        "tool_calls": tool_calls,
        "usage": usage,
        "cost_cents": compute_cost(usage),
        "answer_preview": answer[:200],
    })
    # write to DB (or to a queue + worker, for high volume)
    with db.cursor() as cur:
        cur.execute("INSERT INTO audit_logs (data) VALUES (%s)", (json.dumps(audit),))
    db.commit()
    return {"trace_id": trace_id, "answer": answer}

# Retention policy: archive rows older than 30 days to S3 first if needed, then delete
def cleanup_audit():
    with db.cursor() as cur:
        cur.execute(
            "DELETE FROM audit_logs "
            "WHERE (data->>'started_at')::float < extract(epoch from now() - interval '30 days')"
        )
    db.commit()
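
# Pull trace by id (for support / debugging)
def trace_by_id(trace_id):
    # Same %s placeholder style as the INSERT above, rather than $1
    with db.cursor() as cur:
        cur.execute("SELECT data FROM audit_logs WHERE data->>'trace_id' = %s", (trace_id,))
        row = cur.fetchone()
    return row[0] if row else None

python3 main.py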
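
To try the retention and lookup steps end to end without a Postgres instance, here is a small self-contained sketch. It swaps in an in-memory SQLite database for the real store and a local JSONL file as a stand-in for S3; the table layout and the names `insert_audit`, `archive_and_delete`, and `get_trace` are hypothetical, chosen for illustration only.

```python
import json
import os
import sqlite3
import tempfile
import time

RETENTION_SECONDS = 30 * 24 * 3600  # hot tier: 30 days, as in the lesson

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE audit_logs (trace_id TEXT PRIMARY KEY, started_at REAL, data TEXT)"
)


def insert_audit(audit: dict) -> None:
    """Store the full record as JSON, with trace_id and started_at as real columns."""
    db.execute(
        "INSERT INTO audit_logs (trace_id, started_at, data) VALUES (?, ?, ?)",
        (audit["trace_id"], audit["started_at"], json.dumps(audit)),
    )
    db.commit()


def archive_and_delete(archive_path: str, now: float) -> int:
    """Append expired rows to a JSONL file (stand-in for S3), then delete them."""
    cutoff = now - RETENTION_SECONDS
    rows = db.execute(
        "SELECT data FROM audit_logs WHERE started_at < ?", (cutoff,)
    ).fetchall()
    with open(archive_path, "a", encoding="utf-8") as f:
        for (data,) in rows:
            f.write(data + "\n")
    # Delete only after the archive write has completed without raising.
    db.execute("DELETE FROM audit_logs WHERE started_at < ?", (cutoff,))
    db.commit()
    return len(rows)


def get_trace(trace_id: str):
    """Return the audit record for a trace id, or None if it is not in the hot tier."""
    row = db.execute(
        "SELECT data FROM audit_logs WHERE trace_id = ?", (trace_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


now = time.time()
insert_audit({"trace_id": "old", "started_at": now - 40 * 24 * 3600})
insert_audit({"trace_id": "new", "started_at": now})

archive_path = os.path.join(tempfile.mkdtemp(), "audit_archive.jsonl")
archived = archive_and_delete(archive_path, now)
```

Keeping `trace_id` and `started_at` as their own columns, rather than only inside the JSON blob, is a design choice in this sketch: it lets both the lookup and the cleanup use ordinary column comparisons, which tends to matter once the table is large.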