Open this lesson in your favourite AI. It'll walk you through the why, explain the demo, and quiz you on the try-it list.
Adding AI to a product is never free — every AI feature levies three taxes that a plain feature doesn't. There's a money tax (you pay per token, every call, forever), a latency tax (model calls are slow, often seconds, breaking the snappy interactions users expect), and a trust tax (the feature can be confidently wrong, and one bad answer poisons the whole feature's credibility). You must price all three into the decision up front, because a feature that's 'cool' but adds a 4-second wait, costs a dollar per use, and is wrong 10% of the time is a net negative. Naming these taxes turns vague unease into a checklist.
Tally the three taxes for a candidate feature and compare against the value it delivers. The exercise is forcing yourself to put a number on latency and an error rate on trust, not just cost.
Use these three in order. Each builds on the one before.
In one paragraph, explain the three hidden costs (cost, latency, trust) of adding AI to a product.
Walk me through how each of cost, latency, and trust actually accrues in a production AI feature.
Given a feature with great value but high latency and a 10% error rate, how would you decide whether to ship, redesign, or kill it?
Working within free-tier limits. Free / low-tier provider keys rate-limit aggressively, and eval or agent loops that fan out calls will hit
429 Too Many Requestsfast. Survive it: readRetry-Afterand thex-ratelimit-*headers and back off (exponential backoff with jitter + a max-retry cap) instead of hammering; cap in-flight requests with a small concurrency limiter so you stay under the RPM/TPM ceiling; cache identical requests so retries don't re-spend quota; downshift to a smaller/cheaper model for practice runs; use the provider Batch API for non-interactive jobs; or sidestep hosted limits entirely by running a small model locally (Ollama / llama.cpp) or on a free Colab/Kaggle GPU while you learn.
def feature_taxes(calls_per_user_month, cost_per_call, p95_latency_s, error_rate, value_per_use):
money = calls_per_user_month * cost_per_call
latency_penalty = "breaks flow" if p95_latency_s > 2 else "acceptable"
trust_penalty = "high — needs UX safeguards" if error_rate > 0.05 else "manageable"
net_value = (value_per_use * calls_per_user_month) - money
return {"monthly_cost_per_user": round(money, 2), "latency": latency_penalty,
"trust": trust_penalty, "net_value_per_user": round(net_value, 2)}
print(feature_taxes(calls_per_user_month=50, cost_per_call=0.02,
p95_latency_s=3.5, error_rate=0.08, value_per_use=0.10))python3 main.py