
Frontier topics: multi-modal, long-horizon, self-improving evals

Where the field is heading in 2026 and beyond.

The QA-for-AI playbook is being rewritten as systems become multi-modal, take longer-horizon actions, and increasingly evaluate themselves. This survey covers where the field stands in 2026: what is working, what is broken, and what is still unresolved.