Tested on signal-lifecycle.md (111 lines). Results:
- GPT-5: 17 gaps (7,744 reasoning tokens)
- Opus: 11 gaps (design-level focus)
- Sonnet: 8 gaps (fastest, protocol-level)
Key insight: the union of all models (~26 gaps) far exceeds any single model (max 17). Only 5 gaps were found by all three; highly differentiated outputs make multi-model runs valuable for interface documents.
Boundary Contract Analysis — Finding #62
Date: 2026-05-10
Lens: Boundary Contract Analysis (NEW)
Document: gargoyle's signal-lifecycle.md (111 lines)
Models: GPT-5, Claude Opus 4, Claude Sonnet 4
Summary
A new analytical lens that examines implicit contracts at component interfaces: what one component promises or expects that another must deliver or understand. Unlike assumption-finding (what must be true) or race condition analysis (temporal interleavings), it focuses specifically on the assumptions each side makes at an interface.
Results
| Model | Time | Output tokens | Reasoning tokens | Gaps found | Critical | High |
|---|---|---|---|---|---|---|
| GPT-5 | 125s | 2,062 | 7,744 | 17 | 5 | 4 |
| Claude Opus 4 | ~74s | 2,243 | (internal) | 11 | 3 | 4 |
| Claude Sonnet 4 | ~40s | 947 | (internal) | 8 | 2 | 3 |
Key Findings
Common Ground (all 3 found)
- Action normalization responsibility and position state dependency (CRITICAL)
- Instrument ID resolution timing across corporate actions
- stop_loss semantic transfer from signal to PortfolioMonitor
- Quantity/units interpretation for options vs stocks (100x sizing error)
- Audit log write failure handling
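The quantity/units gap in the list above can be made concrete with a small sketch. It assumes a hypothetical `Signal` type carrying an explicit `unit` field and the conventional multiplier of 100 shares per US equity option contract; the field names and the `shares_exposure` helper are illustrative and do not come from signal-lifecycle.md.

```python
from dataclasses import dataclass
from enum import Enum


class Unit(Enum):
    SHARES = "shares"
    CONTRACTS = "contracts"


# Assumption: a standard US equity option contract covers 100 shares.
OPTION_MULTIPLIER = 100


@dataclass(frozen=True)
class Signal:
    # Hypothetical fields; the real signal schema may differ.
    instrument_id: str
    quantity: int
    unit: Unit


def shares_exposure(signal: Signal) -> int:
    """Convert a signal's quantity into underlying-share exposure."""
    if signal.unit is Unit.SHARES:
        return signal.quantity
    if signal.unit is Unit.CONTRACTS:
        return signal.quantity * OPTION_MULTIPLIER
    raise ValueError(f"unknown unit: {signal.unit!r}")
```

Making the unit part of the signal means the consumer never has to guess whether `quantity=5` means 5 shares or 5 contracts, which is where the 100x sizing error would otherwise arise.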
GPT-5 Unique (5 most significant)
- Signal fan-out double-execution (CRITICAL) — "one signal can appear under many decisions" creates execution-level hazard with no dedupe contract
- Signal replay/dedup gap — pipeline processes duplicates normally, only audit-level symptoms
- Instrument resolution trust boundary — wrong-but-known instrument_id passes through
- Late-arriving signals silently re-grouped — no notification or audit
- Ticker vs instrument_id mismatch — misleading observability
Opus Unique
- Entry price reconciliation — multiple signals with different entry_prices aggregate; which wins?
- Aggregator group identification key — not specified in signal fields
- Backpressure expiration criteria — FIFO without priority could drop risk-critical signals
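To illustrate the backpressure gap, the sketch below shows one alternative to plain FIFO dropping: a bounded buffer that evicts the lowest-priority signal first and, among equal priorities, the oldest. The `PriorityBuffer` class and the integer priority scale (higher means more risk-critical) are assumptions for illustration, not something the analyzed document specifies.

```python
import heapq
import itertools
from typing import Optional


class PriorityBuffer:
    """Bounded buffer that evicts the lowest-priority, oldest signal when full."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        # Min-heap of (priority, arrival_seq, signal_id).
        self._heap: list[tuple[int, int, str]] = []
        self._seq = itertools.count()

    def push(self, signal_id: str, priority: int) -> Optional[str]:
        """Add a signal; return the evicted signal_id, or None if nothing was dropped."""
        heapq.heappush(self._heap, (priority, next(self._seq), signal_id))
        if len(self._heap) > self._capacity:
            return heapq.heappop(self._heap)[2]
        return None
```

Under this policy a low-priority arrival can be evicted immediately when the buffer is full of higher-priority signals, so a stop-loss close is never displaced by a routine entry.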
Sonnet Unique
- Signal ordering contract — close signal could arrive before buy signal
- Signal ID generation entropy — poor entropy could cause collisions
Model Strengths for This Lens
| Model | Strength | Best For |
|---|---|---|
| GPT-5 | Exhaustive validation gap enumeration | Comprehensive boundary audits |
| Opus | Design-level incompleteness | "Model is fundamentally underspecified" |
| Sonnet | Protocol/temporal assumptions | Quick first-pass screening |
Key Insight
The union of all findings (~26 distinct gaps) significantly exceeds any single model's output (17, 11, 8). Only 5 gaps were found by all three models. This lens produces highly differentiated outputs across models — run all three for architecture documents describing component interfaces.
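The union, common-ground, and per-model-unique counts can be computed with plain set arithmetic once each model's gaps have been mapped to shared labels. The sketch below assumes that mapping has already been done by hand; `overlap_stats` is a hypothetical helper, and the labels in the usage example are placeholders rather than the actual findings.

```python
def overlap_stats(findings: dict[str, set[str]]) -> dict[str, object]:
    """Summarize overlap between models' gap sets."""
    if not findings:
        raise ValueError("findings must contain at least one model")
    all_gaps = set().union(*findings.values())
    common = set.intersection(*findings.values())
    # A gap is unique to a model if no other model reported it.
    unique = {
        model: gaps - set().union(*(g for m, g in findings.items() if m != model))
        for model, gaps in findings.items()
    }
    return {"union": all_gaps, "common": common, "unique": unique}


# Placeholder labels only; not the real gap lists.
toy = {
    "model_a": {"g1", "g2", "g3"},
    "model_b": {"g1", "g4"},
    "model_c": {"g1", "g2", "g5"},
}
stats = overlap_stats(toy)
```

The hard part in practice is deciding when two differently worded gaps are the same gap, which is why the union above is reported as approximate.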
Practical Application
For documents that describe component interfaces, boundary contract analysis is high-value:
- Run Sonnet first for quick temporal/protocol screening (~40s, cheap)
- Run GPT-5 for exhaustive validation/semantic gaps (125s, thorough)
- Run Opus for design-level coherence gaps (~74s, insightful)
The combination catches significantly more issues than any single pass.
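The three-pass workflow above could be wired together roughly as follows. This is a sketch under stated assumptions: each analyzer is a hypothetical callable that takes the document text and yields gap labels, no real model API is invoked, and gaps are matched by a naive lowercase comparison that a real run would replace with manual or semantic matching.

```python
from typing import Callable, Iterable

# Hypothetical analyzer signature: document text in, gap labels out.
Analyzer = Callable[[str], Iterable[str]]


def run_passes(
    document: str, passes: list[tuple[str, Analyzer]]
) -> dict[str, list[str]]:
    """Run analyzers in order; map each gap label to the models that reported it."""
    found_by: dict[str, list[str]] = {}
    for model_name, analyze in passes:
        for gap in analyze(document):
            key = gap.strip().lower()  # naive normalization, an assumption
            reporters = found_by.setdefault(key, [])
            if model_name not in reporters:
                reporters.append(model_name)
    return found_by


# Stub analyzers standing in for real model calls.
passes = [
    ("sonnet", lambda doc: ["Ordering"]),
    ("gpt-5", lambda doc: ["ordering ", "Fan-out"]),
    ("opus", lambda doc: ["fan-out"]),
]
merged = run_passes("document text", passes)
```

Keeping the list of reporting models per gap makes it easy to see afterwards which gaps were common ground and which came from a single model.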