New analytical lens: observability gap analysis, which asks 'when
something goes wrong, can you SEE it?' rather than 'what can go wrong?'
Results on aggregation.md (239 lines):
- GPT-5: 23 findings (12 unique), exhaustive telemetry architecture
- Opus: 14 findings (6 unique), operator-behavioral insights
- Sonnet: 11 findings (0 unique), no added value
Key insight: GPT-5 designs the instrumentation; Opus identifies where
available signals mislead operators toward wrong remediations.
Sonnet's 0 unique findings mean dropping it loses no coverage, so the
two-model combination (GPT-5 + Opus) is optimal for this task type.
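
The "unique" counts above can be sketched as a set difference, assuming
"unique" means a finding reported by exactly one model. The function name
and the finding IDs below are hypothetical and for illustration only; they
are not the actual findings from aggregation.md.

```python
from typing import Dict, Set


def unique_findings(by_model: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per model, the findings that no other model reported."""
    result: Dict[str, Set[str]] = {}
    for model, findings in by_model.items():
        # Union of every other model's findings (empty if there are none).
        others = set().union(
            *(f for m, f in by_model.items() if m != model)
        )
        result[model] = findings - others
    return result


# Hypothetical finding IDs (assumption, not the real results).
toy = {
    "gpt5": {"F1", "F2", "F3"},
    "opus": {"F2", "F4"},
    "sonnet": {"F2", "F3"},
}

unique = unique_findings(toy)

# A model with no unique findings can be dropped without shrinking the
# combined set of findings.
full = set().union(*toy.values())
without_sonnet = set().union(
    *(f for m, f in toy.items() if m != "sonnet")
)
```

In this toy data "sonnet" has no unique findings, so `without_sonnet`
equals `full`, which mirrors the reasoning behind the two-model choice.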