Tested GPT-5, Opus, and Sonnet on specid-lot-selection.md (125 lines)
for implementation specification gaps.
Key findings:
- Opus most cost-effective (4.6 gaps/1K tokens vs 1.8 for GPT-5, ~2.6x)
- GPT-5 catches operational/financial edge cases (fees, multi-execution)
- Opus catches design-level binding ambiguities
- Sonnet too shallow for serious spec review
New lens, distinct from hidden assumptions and race conditions:
it focuses on ambiguity of intent rather than on risks.