bonsai
← Curriculum
Intermediate
~12 min
cost
latency
caching

Cost and latency as quality signals

A perfect answer the user never waited for is a failed answer.

Step 1 of 14

Cost and latency are not 'ops concerns' to be optimized after launch — they are quality signals. A 95th-percentile response that takes 14 seconds is broken for the user even if the content is perfect. A judge that costs $80 per release is an eval pipeline that won't run on every PR. Treat both as first-class metrics in your eval suite.