Capstok — learn by doing

Why this matters

Once you understand the serving problem, the real decision is what to build yourself and what to adopt. Rolling your own gives you maximum control and no license fees, but you end up reimplementing batching, metrics, versioning, and GPU sharing, which are exactly the pieces Triton already provides. Triton is the open, framework-agnostic platform that you assemble and tune yourself. NIM is the opinionated, prepackaged, optimized microservice that you can deploy in minutes, at the price of less flexibility and a dependency on its licensing and container registry. Choosing well means honestly weighing four things: the engineering time you can spend, how heterogeneous your model fleet is, your latency targets, and your tolerance for vendor coupling. This task frames the trade-off that the rest of the course equips you to make.

Demo

The demo scores the three options on the dimensions that decide the choice: engineering effort, flexibility (scored as `control` in the code), time-to-production, and lock-in. It then applies weights that you set, so the ranking follows from stated priorities rather than intuition.

Try it yourself

Re-weight the dimensions for a tiny team that needs to ship this week, and confirm the ranking shifts toward NIM.
Re-weight for a team with one exotic model and strict control needs, and see roll-your-own or Triton rise.
Add a 'fleet_heterogeneity' dimension and reason about how a many-framework fleet favors Triton.
Write the one sentence you'd tell leadership justifying the top-ranked option for your real situation.
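
One possible way to work through the first three exercises is sketched below. The base scores are copied from the lesson's matrix; the weight profiles (`tiny_team`, `control_first`, `mixed_fleet`) and the `fleet_heterogeneity` scores are illustrative assumptions, not values from the course, so substitute numbers that reflect your own situation.

```python
# Base scores copied from the lesson's demo matrix (1-5, higher = better for that option).
options = {
    "roll-your-own": {"control": 5, "time_to_prod": 1, "eng_cost": 1, "lock_in_freedom": 5},
    "triton":        {"control": 4, "time_to_prod": 3, "eng_cost": 3, "lock_in_freedom": 4},
    "nim":           {"control": 2, "time_to_prod": 5, "eng_cost": 5, "lock_in_freedom": 2},
}

def rank(opts, weights):
    """Return option names sorted by weighted score, best first."""
    def score(name):
        return sum(opts[name][dim] * w for dim, w in weights.items())
    return sorted(opts, key=score, reverse=True)

# ASSUMED weights: a tiny team that must ship this week cares most about speed and effort.
tiny_team = {"control": 1, "time_to_prod": 5, "eng_cost": 4, "lock_in_freedom": 1}

# ASSUMED weights: one exotic model, strict control needs.
control_first = {"control": 5, "time_to_prod": 1, "eng_cost": 1, "lock_in_freedom": 4}

# HYPOTHETICAL scores for the new dimension: how well each option copes with a
# fleet spanning many frameworks. These numbers are placeholders for your own judgment.
hetero_scores = {"roll-your-own": 2, "triton": 5, "nim": 2}
options_h = {
    name: {**dims, "fleet_heterogeneity": hetero_scores[name]}
    for name, dims in options.items()
}
# ASSUMED weights for a team whose main concern is a many-framework fleet.
mixed_fleet = {"control": 1, "time_to_prod": 2, "eng_cost": 2,
               "lock_in_freedom": 1, "fleet_heterogeneity": 4}

print("tiny team:    ", rank(options, tiny_team))
print("control first:", rank(options, control_first))
print("mixed fleet:  ", rank(options_h, mixed_fleet))
```

Keeping the scores and the weights in separate dictionaries is what makes these experiments cheap: the scores record what you believe about each option, while the weights record what your team values right now.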

Prompt your AI

Use these three in order. Each builds on the one before.

1. Basics & terminology

Explain the build-vs-buy choice for model serving: rolling your own, using Triton, or using NIM.

2. Why it works (the mechanism)

Walk me through the dimensions (engineering cost, flexibility, time-to-production, lock-in) that distinguish roll-your-own vs. Triton vs. NIM and how they trade off.

3. Advanced — application & what's next

Given my team size, model-fleet heterogeneity, latency targets, and lock-in tolerance, how should I reason about choosing among roll-your-own, Triton, and NIM?

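# A decision matrix for the build-vs-buy question. Scores run 1-5; higher = better for that option.
# So a high eng_cost score means LOW engineering effort, and a high lock_in_freedom score means LITTLE vendor coupling.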
options = {
    "roll-your-own": {"control": 5, "time_to_prod": 1, "eng_cost": 1, "lock_in_freedom": 5},
    "triton":        {"control": 4, "time_to_prod": 3, "eng_cost": 3, "lock_in_freedom": 4},
    "nim":           {"control": 2, "time_to_prod": 5, "eng_cost": 5, "lock_in_freedom": 2},
}
# Weight the dimensions by what YOUR team values right now:
weights = {"control": 1, "time_to_prod": 3, "eng_cost": 2, "lock_in_freedom": 1}

def score(opt):
    return sum(opt[k] * weights[k] for k in weights)

ranked = sorted(options, key=lambda o: score(options[o]), reverse=True)
for o in ranked:
    print(f"{o:>14}: {score(options[o])}")
print("-> with these weights, prefer:", ranked[0])

Run: python3 main.py
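
A single set of weights can hide how fragile a ranking is. As a rough sketch, the snippet below reuses the matrix above, holds the other weights at the demo's values, and sweeps only the weight on `control` to show where the winner changes. The sweep range of 1 to 10 and the helper name `best` are assumptions made for illustration.

```python
# Same base scores as the demo matrix (1-5, higher = better for that option).
options = {
    "roll-your-own": {"control": 5, "time_to_prod": 1, "eng_cost": 1, "lock_in_freedom": 5},
    "triton":        {"control": 4, "time_to_prod": 3, "eng_cost": 3, "lock_in_freedom": 4},
    "nim":           {"control": 2, "time_to_prod": 5, "eng_cost": 5, "lock_in_freedom": 2},
}

def best(weights):
    """Return the top-scoring option; ties go to whichever option is listed first."""
    return max(options, key=lambda o: sum(options[o][k] * w for k, w in weights.items()))

# Hold the demo's other weights fixed and vary only the weight on control.
base = {"control": 1, "time_to_prod": 3, "eng_cost": 2, "lock_in_freedom": 1}
winners = {c: best({**base, "control": c}) for c in range(1, 11)}

for c, name in winners.items():
    print(f"control weight {c:>2}: {name}")
```

With these scores, NIM leads while control carries little weight, Triton takes over once the control weight reaches roughly 4 to 5, and roll-your-own wins only when control outweighs everything else by a wide margin (around 9 to 10). If your honest weights sit near one of those crossover points, the decision deserves more scrutiny than the top line of the ranking suggests.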