GPT-5 finds 16 gaps, Opus 11, Sonnet 9. GPT-5 excels at exhaustive
state space enumeration; Opus finds convention-vs-enforcement gaps;
Sonnet adequate but less thorough.
Key insight: state machine completeness is a GPT-5 sweet spot due to
reasoning tokens enabling systematic combinatorial coverage.