Intermediate
~12 min
structured-output
tool-use
json-schema
Evaluating structured outputs
Parse rate is not correctness — they're two different evals.
Structured outputs (JSON responses, tool calls, function arguments) are table stakes for any AI system that does more than chat. They look easy to evaluate (just call JSON.parse and count the successes), and that is the trap. Parse rate measures syntax: did the output parse? Semantic accuracy measures intent: does the parsed output say the right thing? Conflating the two is how teams report 99% 'success' while only 30% of outputs are actually correct.
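To make the distinction concrete, here is a minimal sketch that scores the same set of outputs twice, once for parse rate and once for semantic accuracy. Everything in it is an assumption made for illustration: the four eval cases, the weather-style argument shape, and the helper names tryParse, matchesExpected, and evaluate are hypothetical, and the exact-match comparison stands in for whatever correctness check a real eval would use.

```javascript
// Hypothetical eval cases: raw model output paired with the expected arguments.
const cases = [
  // Parses and matches the expected arguments.
  { raw: '{"city": "Paris", "unit": "celsius"}', expected: { city: "Paris", unit: "celsius" } },
  // Parses, but one value is wrong.
  { raw: '{"city": "Paris", "unit": "fahrenheit"}', expected: { city: "Paris", unit: "celsius" } },
  // Parses, but a required field is missing.
  { raw: '{"city": "Berlin"}', expected: { city: "Berlin", unit: "celsius" } },
  // Truncated output: not valid JSON.
  { raw: '{"city": "Rome", "unit": ', expected: { city: "Rome", unit: "celsius" } },
];

// Returns the parsed value, or undefined when the text is not valid JSON.
// (JSON.parse never yields undefined for valid input, so it is a safe sentinel.)
function tryParse(raw) {
  try {
    return JSON.parse(raw);
  } catch {
    return undefined;
  }
}

// Assumed correctness check: shallow exact match on flat objects.
// A real eval might normalize values or compare nested structures instead.
function matchesExpected(actual, expected) {
  if (actual === null || typeof actual !== "object" || Array.isArray(actual)) {
    return false;
  }
  const actualKeys = Object.keys(actual);
  const expectedKeys = Object.keys(expected);
  return (
    actualKeys.length === expectedKeys.length &&
    expectedKeys.every((key) => actual[key] === expected[key])
  );
}

// Computes both metrics over the same cases so they can be compared directly.
function evaluate(evalCases) {
  let parsedCount = 0;
  let correctCount = 0;
  for (const { raw, expected } of evalCases) {
    const parsed = tryParse(raw);
    if (parsed === undefined) continue; // syntax failure: counts against both metrics
    parsedCount += 1;
    if (matchesExpected(parsed, expected)) correctCount += 1;
  }
  return {
    parseRate: parsedCount / evalCases.length,
    semanticAccuracy: correctCount / evalCases.length,
  };
}

const report = evaluate(cases);
console.log(report);
```

On these four cases, three outputs parse but only one matches the expected arguments, so the two numbers diverge on the same data. Reporting them side by side, rather than a single 'success' figure, is what keeps the gap visible.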