New task type: analyzing prescriptive/specification documents for completeness.
- GPT-5 dominates with exhaustive enumeration (34 findings)
- Opus traces gaps to consequences (routing failures, compiler issues)
- Sonnet surface-level (not recommended for thorough analysis)
Key insight: GPT-5 found internal contradiction (telemetry verb rule vs example)
that neither Claude model caught. Opus unique in tracing PubSub collision
to actual routing failure scenario.
Task taxonomy: convention gap analysis follows same pattern as architecture
docs - GPT-5 for coverage, Opus for consequences.