# Finding #42: Failure Propagation Chain Analysis on system-overview.md

**Date:** 2026-05-07
**Analytical lens:** Failure propagation chain analysis (NEW)
**Document:** gargoyle's `system-overview.md` (323 lines), a high-level architecture overview
**Models:** GPT-5, Claude Opus 4.6, Claude Sonnet 4.6

## Summary

This experiment introduces a new analytical lens: identifying failure propagation chains, that is, sequences where a failure in one component silently corrupts, degrades, or destabilizes another component's behavior WITHOUT triggering explicit error handling or alarms.

## Results

| Model | Time | Output tokens | Reasoning tokens | Findings |
|---|---|---|---|---|
| GPT-5 | 88s | 9,027 | 6,720 | 10 |
| Claude Opus 4.6 | 97s | 4,044 | (internal) | 10 |
| Claude Sonnet 4.6 | 35s | 1,605 | (internal) | 8 |

## Key Findings

### Common Ground (all 3 identified)

- Shared tick event bus (EVT) as a cross-user failure propagation path, violating the claimed user isolation (Invariant 12)
- BrokerAdapter fill misattribution and cross-user contamination through the shared port
- Stale or incorrect `instrument_id` resolution propagating silently through the pipeline
- Exact-arithmetic boundary violation at the float-to-decimal conversion during ingestion
- Recovery ordering hazards where reconciliation completes but derived state is inconsistent

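
The float-to-decimal boundary item can be illustrated in isolation. The sketch below is not taken from `system-overview.md`; it only assumes that a price reaches ingestion as a binary float (the variable names are hypothetical) and shows why converting that float directly into a decimal carries the representation error forward without raising anything.

```python
from decimal import Decimal

# Hypothetical price as it might arrive from a feed that parses numbers as floats.
price_as_float = 0.1

# Converting the float directly preserves its binary representation error.
from_float = Decimal(price_as_float)

# Converting from the original text keeps the exact decimal value.
from_text = Decimal("0.1")

# The two values differ, and no exception or warning is raised.
print(from_float == from_text)

# Exact arithmetic only holds for the text-derived value.
print(from_text * 3 == Decimal("0.3"))
print(from_float * 3 == Decimal("0.3"))
```

Because nothing fails at the conversion site, every downstream component that assumes exact arithmetic inherits the error silently, which is what makes this a propagation chain rather than a local bug.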
### GPT-5 Unique Findings

- **Duplicate fills after reconnect:** BrokerAdapter replays fills on reconnect with no idempotency key → duplicate lots, inflated positions. Reconciliation only helps at startup, not steady-state reconnection.
- **Dual feed ingestion:** Live + replay adapters simultaneously connected (port substitutability permits this) → duplicate ticks → double decisions → double exposure. No "single active feed" mutual exclusion.
- **Missing live fills during steady state:** Dropped fills undetected until next restart. No continuous reconciliation specified. Positions silently drift.
- **PortfolioMonitor close-only outliving its trigger:** No documented lifecycle for clearing → OrderManager blocks new orders indefinitely after trigger resolves.
- **Instrument identity drift between market data and broker:** Corporate action causes disagreement between ingestion and adapter → fills recorded against wrong instrument lineage.

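
To make the duplicate-fill chain concrete, here is a minimal sketch of what an idempotency key would prevent. It is not gargoyle's implementation: `Fill`, `PositionBook`, and the `fill_id` field are hypothetical names, and the sketch assumes the broker supplies a stable identifier per execution.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Fill:
    fill_id: str          # hypothetical broker-assigned execution ID
    instrument_id: str
    quantity: Decimal


@dataclass
class PositionBook:
    positions: dict = field(default_factory=dict)
    seen_fill_ids: set = field(default_factory=set)

    def apply(self, fill: Fill) -> bool:
        """Apply a fill at most once; return False if it was already applied."""
        if fill.fill_id in self.seen_fill_ids:
            return False
        self.seen_fill_ids.add(fill.fill_id)
        current = self.positions.get(fill.instrument_id, Decimal("0"))
        self.positions[fill.instrument_id] = current + fill.quantity
        return True


book = PositionBook()
fill = Fill("exec-001", "INSTR-1", Decimal("100"))
first = book.apply(fill)    # initial delivery is applied
replay = book.apply(fill)   # replay after a reconnect is ignored
print(first, replay, book.positions["INSTR-1"])
```

Without the `seen_fill_ids` check, the replayed fill would be added a second time and the position would be inflated with no error raised anywhere.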
### Claude Opus Unique Findings

- **PortfolioMonitor/Ledger divergence:** PortfolioMonitor (PM) runs as a background process with its own fill feed and NO reconciliation against the authoritative Ledger lot state. PM's position view can drift → spurious close-only or missed close-only. This is the most architecturally significant finding: it identifies that PM maintains a PARALLEL position model with no convergence mechanism.
- **Signal rejection asymmetry:** SignalRisk rejections are invisible to the Aggregator (only approvals flow downstream), so the Aggregator forms decisions on a systematically biased subset. Opus identifies this as a design-level information asymmetry.
- **Kill switch + fill precedence invariant deadlock:** Kill switch engages while an order is partially filled → remaining fills forced by Invariant 6 → position grows during kill switch → PortfolioMonitor's close-only blocked by Invariant 8 → UNMANAGEABLE POSITION during a crisis. This is a genuine deadlock between two stated invariants.
- **Corporate action lot adjustment bypasses risk pipeline:** Split doubles quantity → exceeds limits → no re-evaluation, because the risk pipeline only validates decisions, not external state changes.

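
The PortfolioMonitor/Ledger divergence amounts to two position maps with no comparison between them. As a hedged illustration only, the sketch below shows the kind of convergence check whose absence the finding describes; `position_drift` and the two dictionaries are hypothetical and do not correspond to anything named in `system-overview.md`.

```python
from decimal import Decimal


def position_drift(ledger: dict, monitor: dict) -> dict:
    """Return {instrument_id: (ledger_qty, monitor_qty)} for every mismatch."""
    zero = Decimal("0")
    drift = {}
    # Check instruments known to either side, treating a missing entry as flat.
    for instrument_id in ledger.keys() | monitor.keys():
        ledger_qty = ledger.get(instrument_id, zero)
        monitor_qty = monitor.get(instrument_id, zero)
        if ledger_qty != monitor_qty:
            drift[instrument_id] = (ledger_qty, monitor_qty)
    return drift


# Hypothetical snapshots: the monitor missed or double-counted a fill on INSTR-2.
ledger_view = {"INSTR-1": Decimal("100"), "INSTR-2": Decimal("50")}
monitor_view = {"INSTR-1": Decimal("100"), "INSTR-2": Decimal("80")}

drift = position_drift(ledger_view, monitor_view)
print(drift)
```

A check of this shape, run periodically, would turn silent drift into an explicit signal before it produces a spurious or missed close-only.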
### Claude Sonnet Findings

- 8 findings, all of which GPT-5 or Opus also identified, and in more depth. Zero unique insights.
- One finding (audit log corruption) was based on an architectural misunderstanding.

## Analysis

### Opus's Token Efficiency

Opus produced 10 findings in 4,044 output tokens, making it roughly **2.2x more token-efficient** than GPT-5 (10 findings in 9,027 output tokens). This is the first experiment in which Opus MATCHED GPT-5's finding count while using significantly fewer tokens. In previous experiments Opus found fewer issues but with higher insight density. Here it delivered equal count AND higher density.

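
The 2.2x figure follows directly from the Results table. The short calculation below reproduces it from the reported output tokens and finding counts; reasoning tokens are left out because they are only reported for GPT-5.

```python
# Output tokens and finding counts copied from the Results table.
results = {
    "GPT-5": {"output_tokens": 9027, "findings": 10},
    "Claude Opus 4.6": {"output_tokens": 4044, "findings": 10},
    "Claude Sonnet 4.6": {"output_tokens": 1605, "findings": 8},
}

# Output tokens spent per finding, per model.
tokens_per_finding = {
    model: row["output_tokens"] / row["findings"]
    for model, row in results.items()
}

# How many times more output tokens GPT-5 spent per finding than Opus.
ratio = tokens_per_finding["GPT-5"] / tokens_per_finding["Claude Opus 4.6"]

for model, value in tokens_per_finding.items():
    print(f"{model}: {value:.1f} output tokens per finding")
print(f"GPT-5 / Opus ratio: {ratio:.2f}")
```

Tokens per finding is a raw count and says nothing about uniqueness: Sonnet has the lowest figure of the three yet contributed no unique insights, so the metric is only meaningful alongside the deduplicated findings.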
### Document Level Matters

Overview/architecture documents are IDEAL for failure propagation analysis because they expose the boundaries and shared resources that component-level docs hide. Suggested document-level → lens matching:

- **Overview docs** → failure propagation, blast radius, isolation verification
- **Component specs** → race conditions, invariant violations, hidden assumptions
- **Cross-cutting docs** → temporal ordering, recovery hazards

### Dominant Failure Vector

The shared-infrastructure contradiction (EVT and BrokerAdapter (BA) as single shared nodes despite claimed per-user isolation) is the single most important finding. All three models caught it, and each explored different consequences:

- GPT-5: backpressure propagation, duplicate feed ingestion
- Opus: fill misattribution, PortfolioMonitor parallel state
- Sonnet: tick corruption (most obvious variant)

## Practical Implications

- Run **Opus** for the highest insight density and for identifying design tensions (10 findings, 97s, 4K tokens)
- Run **GPT-5** for operational/runtime hazards the architecture doesn't consider (10 findings, 88s, 9K tokens)
- **Sonnet is redundant** for this task: it provides no unique value over the other two
- Total unique findings after deduplication: ~14 distinct propagation chains from a 323-line document