Advanced
3 questions · ~6 min
Production, drift, and frontier topics
Online quality, drift detection, and what's hard in 2026.
Q1
You ship a new prompt. Offline evals look good, but 4 hours later the online judge's quality score has dropped 6% while error rates are flat. What is the most likely cause?
Q2
Which is the BEST input to a self-improving eval pipeline?
Q3