Every honest conversation about HE deployment has to start with the cost numbers, because they shape every other design decision. Plaintext arithmetic in 2024 is roughly basic operations per second per CPU core. HE multiplication on a fresh ciphertext, even in a state-of-the-art library, is roughly – basic plaintext-equivalent operations per second — that is, somewhere between five and six orders of magnitude slower. Bootstrapping costs another order or two on top of that. Compared to garbled-circuit MPC, HE is more compact (no communication) but slower per gate; compared to secret-sharing MPC, HE shifts cost from network rounds to local CPU. The right tool depends on whether your bottleneck is bandwidth, latency, or compute. Without these numbers in your head, you'll either dismiss HE as 'too slow' for problems where it's actually fine, or oversell it for problems where MPC would have shipped years ago.
We compare three numbers across three primitives: throughput per core (ops/sec), ciphertext expansion (bytes of ciphertext per byte of plaintext), and round complexity (network round trips per multiplication). Plaintext is the baseline. HE has zero round complexity (evaluation is a single-shot, non-interactive computation) but huge ciphertext expansion and slow operations. MPC sends small messages but needs many rounds. The 2024-era ballpark is below.
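The ciphertext-expansion metric can be estimated from scheme parameters alone. The sketch below assumes an RLWE ciphertext made of two polynomials with tightly packed coefficients, and CKKS-style packing with half as many slots as the ring dimension. The ring dimension 2^14 comes from the demo comment; the 438-bit modulus and the 8 bytes per slot are illustrative assumptions, not measurements from any particular library.

```go
package main

import "fmt"

// ciphertextBytes estimates the size of one fresh RLWE ciphertext:
// two polynomials of ringDim coefficients, logQ bits per coefficient,
// tightly packed. Real libraries usually store coefficients in 64-bit
// words per RNS limb, so actual sizes tend to be larger.
func ciphertextBytes(ringDim, logQ int) int {
	return 2 * ringDim * logQ / 8
}

// plaintextBytes is the payload carried by one ciphertext, assuming
// CKKS-style packing: ringDim/2 slots of bytesPerSlot bytes each.
func plaintextBytes(ringDim, bytesPerSlot int) int {
	return ringDim / 2 * bytesPerSlot
}

// expansion is ciphertext bytes divided by plaintext bytes.
func expansion(ringDim, logQ, bytesPerSlot int) float64 {
	return float64(ciphertextBytes(ringDim, logQ)) /
		float64(plaintextBytes(ringDim, bytesPerSlot))
}

func main() {
	const (
		ringDim      = 1 << 14 // ring dimension mentioned in the demo
		logQ         = 438     // assumed modulus size in bits (illustrative)
		bytesPerSlot = 8       // assumed: one float64 per slot
	)
	fmt.Printf("ciphertext: %d bytes\n", ciphertextBytes(ringDim, logQ))
	fmt.Printf("plaintext:  %d bytes\n", plaintextBytes(ringDim, bytesPerSlot))
	fmt.Printf("expansion:  %.3fx (fully packed)\n", expansion(ringDim, logQ, bytesPerSlot))
}
```

Under these assumptions the ratio depends only on the modulus size when every slot is used. A ciphertext that carries a single value pays the same byte cost for far less payload, so the effective expansion grows by the number of unused slots.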
Use these three prompts in order. Each builds on the one before.
In one paragraph, explain why HE is roughly 'a million times slower than plaintext' but can still be the right answer for some applications. What property does it offer that compensates?
Walk me through where the cost actually goes in a single CKKS multiplication: NTTs, modulus switching, key-switching, relinearization. Which of these dominate the runtime at production parameters?
Compare HE and secret-sharing-MPC for a depth-100 circuit between two parties on a 100ms-RTT link. Build the rough cost model and find the crossover circuit size where HE beats MPC and vice versa.
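As a possible starting skeleton for the third prompt, the sketch below models HE as one round trip plus local compute, and secret-sharing MPC as one round trip per multiplicative layer plus local compute. It ignores bandwidth, ciphertext size, and HE slot batching. The per-multiplication costs in `main` are hypothetical placeholders to be replaced with your own measurements, and the type and function names are invented for this example.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// Workload describes a layered circuit and the link between the parties.
type Workload struct {
	Depth        int           // multiplicative depth
	MulsPerLayer int           // circuit width: multiplications per layer
	RTT          time.Duration // network round-trip time
}

// heTime models HE: one round trip (send inputs, receive the result)
// plus local compute for every multiplication.
func heTime(w Workload, perMul time.Duration) time.Duration {
	return w.RTT + time.Duration(w.Depth*w.MulsPerLayer)*perMul
}

// mpcTime models secret-sharing MPC: one round trip per layer
// plus local compute for every multiplication.
func mpcTime(w Workload, perMul time.Duration) time.Duration {
	return time.Duration(w.Depth)*w.RTT +
		time.Duration(w.Depth*w.MulsPerLayer)*perMul
}

// crossoverWidth returns the width W at which both models cost the same.
// Setting RTT + D*W*h = D*RTT + D*W*m and solving for W gives
// W = (D-1)*RTT / (D*(h-m)). Below it HE is faster, above it MPC is.
func crossoverWidth(depth int, rtt, hePerMul, mpcPerMul time.Duration) float64 {
	if hePerMul <= mpcPerMul {
		return math.Inf(1) // HE never loses in this simplified model
	}
	return float64(depth-1) * float64(rtt) /
		(float64(depth) * float64(hePerMul-mpcPerMul))
}

func main() {
	w := Workload{Depth: 100, MulsPerLayer: 10, RTT: 100 * time.Millisecond}
	hePerMul := 10 * time.Millisecond // hypothetical placeholder
	mpcPerMul := time.Microsecond     // hypothetical placeholder

	fmt.Printf("HE estimate:  %v\n", heTime(w, hePerMul))
	fmt.Printf("MPC estimate: %v\n", mpcTime(w, mpcPerMul))
	fmt.Printf("crossover width: %.2f multiplications per layer\n",
		crossoverWidth(w.Depth, w.RTT, hePerMul, mpcPerMul))
}
```

The model makes the shape of the trade-off visible: MPC pays a fixed latency bill proportional to depth, while HE pays a compute bill proportional to total gate count. A fuller model would add a bandwidth term and divide the HE compute cost by the number of slots used per ciphertext.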
// main.go (run with: go run main.go)
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	nOps   = 100_000
	nHeOps = 1_000
	mask60 = (1 << 60) - 1
	mask30 = (1 << 30) - 1
)

func main() {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))

	// Plaintext baseline: integer mult.
	// Note: the timed loop includes the random-number generation.
	start := time.Now()
	var c int64
	for i := 0; i < nOps; i++ {
		a := r.Int63() & mask30
		b := r.Int63() & mask30
		c = a * b
	}
	_ = c
	plainDt := time.Since(start).Seconds()
	fmt.Printf("plaintext mult x%d: %.1f ms -> %.2e ops/sec\n",
		nOps, plainDt*1000, float64(nOps)/plainDt)

	// Toy HE-style mult: ciphertext as length-2 polynomial, schoolbook polymult,
	// then reduce mod degree-2 ring polynomial. Mimics one CKKS-style mult
	// at ring dim 2 (real schemes use n=2^14).
	start = time.Now()
	var r0, r1 int64
	for i := 0; i < nHeOps; i++ {
		a0 := r.Int63() & mask60
		a1 := r.Int63() & mask60
		b0 := r.Int63() & mask60
		b1 := r.Int63() & mask60
		// poly mult; the 60-bit products wrap around int64, which is
		// harmless here because only the low 60 bits are kept below
		c0 := a0 * b0
		c1 := a0*b1 + a1*b0
		c2 := a1 * b1
		// reduce mod x^2 + 1 (x^2 = -1), coefficients mod 2^60
		r0 = (c0 - c2) & mask60
		r1 = c1 & mask60
	}
	_ = r0
	_ = r1
	heDt := time.Since(start).Seconds()
	fmt.Printf("toy HE mult x%d: %.1f ms -> %.2e ops/sec\n",
		nHeOps, heDt*1000, float64(nHeOps)/heDt)

	slowdown := (heDt / nHeOps) / (plainDt / nOps)
	fmt.Printf("slowdown (toy ring-dim 2 vs plaintext): ~%.1fx\n", slowdown)
	fmt.Println("real HE at ring-dim 16384 is ~10000x slower again on top of this.")
}
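To see how the toy multiplication scales with ring dimension, the ring-dim-2 formula in the demo can be generalised to schoolbook multiplication in Z_q[x]/(x^n + 1). This is a minimal sketch, not how production libraries work (they use NTT-based multiplication over RNS moduli). The function name `negacyclicMul`, the modulus 65537, and the ring sizes timed in `main` are choices made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// negacyclicMul multiplies a and b in Z_q[x]/(x^n + 1) using the
// O(n^2) schoolbook method, where n = len(a) = len(b).
// Coefficients must lie in [0, q) and q must be below 2^32 so that
// every intermediate value fits in a uint64.
func negacyclicMul(a, b []uint64, q uint64) []uint64 {
	n := len(a)
	out := make([]uint64, n)
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			p := a[i] * b[j] % q
			k := i + j
			if k < n {
				out[k] = (out[k] + p) % q
			} else {
				// x^n = -1, so terms that wrap around are subtracted.
				out[k-n] = (out[k-n] + q - p) % q
			}
		}
	}
	return out
}

func main() {
	const q = 65537 // small prime modulus, chosen only for illustration
	for _, n := range []int{2, 64, 1024} {
		a := make([]uint64, n)
		b := make([]uint64, n)
		for i := range a {
			a[i] = uint64(i+1) % q
			b[i] = uint64(2*i+3) % q
		}
		start := time.Now()
		res := negacyclicMul(a, b, q)
		fmt.Printf("n=%4d: %v (first coefficient %d)\n",
			n, time.Since(start), res[0])
	}
}
```

With n = 2 this computes the same r0 = a0*b0 - a1*b1 and r1 = a0*b1 + a1*b0 as the demo, just with a different modulus. The quadratic growth in the inner loop is the reason real schemes replace schoolbook multiplication with NTTs.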