model-research/findings/2026-05-06-34-information-flow-hazard-analysis.md
Rodin 4a69a99d05 finding #34: information flow hazard analysis on lot-accounting.md
New analytical lens: where data propagation creates stale, contradictory,
or misleading views for different consumers.

Key result: highest model convergence (45% common ground) due to document's
explicit failure mode table. GPT-5 finds event-level provenance gaps; Opus
identifies strategy attribution dimension. Sonnet adds zero unique value.
Two-model stack (GPT-5 + Opus) optimal.
2026-05-06 18:29:06 -07:00


# Finding #34: Information Flow Hazard Analysis on lot-accounting.md
**Date:** 2026-05-06
**Document:** `docs/domain/contexts/ledger/lot-accounting.md` (181 lines)
**Lens:** Information flow hazards (NEW) — where data propagation creates stale,
contradictory, incomplete, or misleading views for different consumers.
## Setup
The same document and the same focused analytical prompt went to all 3 models via HAI proxy.
Specified 5 categories: staleness propagation, fan-out inconsistency, causal
ordering violations, write amplification gaps, provenance opacity. Required
Flow/Category/Scenario/Impact/Severity format per finding. No tools, no project
context beyond the document.
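
The required per-finding format can be sketched as a small record type. This is a minimal illustration, not the harness used in the experiment: the names `HazardFinding`, `Category`, `Severity`, and `parse_finding` are hypothetical, the `Medium`/`Low` severity levels are assumed (only Critical and High appear in this finding), and the example field text is illustrative wording based on common-ground item 1.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The five hazard categories named in the prompt.
    STALENESS_PROPAGATION = "staleness propagation"
    FAN_OUT_INCONSISTENCY = "fan-out inconsistency"
    CAUSAL_ORDERING_VIOLATION = "causal ordering violation"
    WRITE_AMPLIFICATION_GAP = "write amplification gap"
    PROVENANCE_OPACITY = "provenance opacity"


class Severity(Enum):
    # Assumed scale: only Critical and High are attested in this finding.
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


@dataclass(frozen=True)
class HazardFinding:
    flow: str
    category: Category
    scenario: str
    impact: str
    severity: Severity


def parse_finding(raw: dict[str, str]) -> HazardFinding:
    """Build a HazardFinding from string fields; unknown values raise ValueError."""
    return HazardFinding(
        flow=raw["Flow"],
        category=Category(raw["Category"].strip().lower()),
        scenario=raw["Scenario"],
        impact=raw["Impact"],
        severity=Severity(raw["Severity"].strip().capitalize()),
    )


# Illustrative wording only, not quoted model output.
example = parse_finding({
    "Flow": "lot writes -> Position aggregate",
    "Category": "Write amplification gap",
    "Scenario": "Crash after lot writes, before the Position update",
    "Impact": "Position aggregate is stale",
    "Severity": "critical",
})
```

Parsing into enums rejects findings that drift outside the five requested categories, which makes per-model counts easier to compare.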
## Results
| Model | Time | Output tokens | Reasoning tokens | Findings |
|---|---|---|---|---|
| GPT-5 | 94s | 8,246 | 6,016 | 11 |
| Claude Opus 4.6 | 66s | 3,318 | (internal) | 7 |
| Claude Sonnet 4.6 | 77s | 4,163 | (internal) | 6 |
## Common Ground (all 3 identified)
1. Position aggregate staleness after crash (write amplification gap) — Critical
2. Position.realized_pnl accumulator drift from LotClosed sum — Critical
3. Multi-lot close walk creating fan-out inconsistency (intermediate states visible) — High
4. Corporate action arrival ordering race creating permanently incorrect immutable LotClosed — Critical
5. Provenance opacity of Position.average_cost (no freshness metadata) — High
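
Common-ground item 2 can be made concrete with a drift check that re-derives realized P&L from `LotClosed` events and compares it to the Position accumulator. This is a sketch under assumptions: the `LotClosed` field layout, the function name `realized_pnl_drift`, and the sample amounts are hypothetical; only the `LotClosed` and `Position.realized_pnl` names come from the finding.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LotClosed:
    # Hypothetical event shape; only the event name comes from the document.
    lot_id: str
    realized_pnl: Decimal


def realized_pnl_drift(position_realized_pnl: Decimal, closes: list[LotClosed]) -> Decimal:
    """Return the accumulator minus the sum re-derived from LotClosed events.

    A result of zero means the accumulator agrees with the event history.
    """
    derived = sum((c.realized_pnl for c in closes), Decimal("0"))
    return position_realized_pnl - derived


# Made-up amounts: the accumulator says 9.00, the events sum to 9.25.
closes = [LotClosed("lot-1", Decimal("12.50")), LotClosed("lot-2", Decimal("-3.25"))]
drift = realized_pnl_drift(Decimal("9.00"), closes)
print(drift)  # -0.25
```

`Decimal` is used so that the comparison is exact; a float accumulator could report spurious drift from rounding alone.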
## GPT-5 Unique Findings
- **Lot ledger unavailability as information flow hazard**: Position freezes at pre-sell state indefinitely during documented outage. Risk/UI operate on stale exposure for entire outage+recovery window.
- **Wash sale timing at closing**: Document only checks at Opening (buy), not Closing (sell). Disqualifying buy that already exists when loss sale executes creates immutable LotClosed with disallowed loss amount.
- **LotClosed per-fill completeness gap**: No end-of-fill marker or expected event count. Consumers cannot distinguish "all events written" from "more coming."
- **Reconciliation non-atomicity**: Re-derivation writes may not be atomic across Position fields, creating transient internal inconsistency.
- **Opening write amplification**: Lot exists before Position update, creating underreported-exposure window.
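
One possible shape for closing the per-fill completeness gap is to have each `LotClosed` event carry its position within the fill and the total expected count. This is a sketch of that idea only: the document specifies no such fields, and `fill_id`, `sequence`, `expected_count`, and `fill_is_complete` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LotClosedEvent:
    fill_id: str
    sequence: int        # 1-based position within the fill (hypothetical field)
    expected_count: int  # total LotClosed events planned for the fill (hypothetical field)


def fill_is_complete(events: list[LotClosedEvent], fill_id: str) -> bool:
    """True only when every expected LotClosed event for the fill has been observed."""
    seen = [e for e in events if e.fill_id == fill_id]
    if not seen:
        return False  # nothing observed yet, so completeness cannot be claimed
    expected = seen[0].expected_count
    if any(e.expected_count != expected for e in seen):
        raise ValueError(f"inconsistent expected_count for fill {fill_id}")
    # Complete when the observed sequence numbers are exactly 1..expected.
    return {e.sequence for e in seen} == set(range(1, expected + 1))


partial = [LotClosedEvent("fill-1", 1, 3), LotClosedEvent("fill-1", 2, 3)]
full = partial + [LotClosedEvent("fill-1", 3, 3)]
print(fill_is_complete(partial, "fill-1"))  # False
print(fill_is_complete(full, "fill-1"))     # True
```

A consumer that gates on this check can tell "all events written" apart from "more coming", which is the distinction the finding says is currently impossible.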
## Opus Unique Findings
- **Strategy P&L attribution fan-out**: During multi-LotClosed write, per-strategy P&L consumers see inconsistent attribution. Strategy-level stop-losses or rebalancing may fire incorrectly on partial data.
## Sonnet Unique Findings
- None truly unique — all findings were variations of what the other models found.
## Key Insight: High Model Convergence
This lens produced the **highest convergence rate** across all experiments:
- 5 of GPT-5's 11 findings (45%) were common ground (vs typical 25-35%)
- Only 6 of GPT-5's 11 findings (55%) fell outside the common ground (vs typical 60-75% unique)
**Why:** The document includes an explicit "Failure Modes" table that all models
effectively re-derive as information flow hazards. The unique findings come from
models going BEYOND the document's own failure analysis.
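
The convergence percentages can be reproduced from the counts in this finding: 5 common-ground items and 11 GPT-5 findings. The variable names are illustrative.

```python
common_ground = 5    # items listed under "Common Ground (all 3 identified)"
gpt5_findings = 11   # GPT-5 row of the Results table

common_share = common_ground / gpt5_findings
outside_share = (gpt5_findings - common_ground) / gpt5_findings

print(f"{common_share:.1%}")   # 45.5%
print(f"{outside_share:.1%}")  # 54.5%
```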
## Practical Implications
1. **Information flow analysis is most valuable on documents WITHOUT explicit failure mode tables** — documents that describe data architecture without self-analyzing their failure properties.
2. **Two-model stack (GPT-5 + Opus) is optimal** for this lens. Sonnet contributed no unique findings in this run.
3. **GPT-5 finds hazards outside the document's own frame** (event-level completeness, wash sale timing).
4. **Opus excels at dimensional analysis** (strategy attribution dimension) and produces the most concrete, test-case-ready scenarios.
## Model Characterization for This Lens
- **GPT-5**: Broadest coverage; finds event-level and lifecycle-timing hazards the document doesn't address
- **Opus**: Most precise scenario construction with concrete dollar amounts; identifies dimensions the document doesn't analyze
- **Sonnet**: Adequate but redundant; produces elaborated versions of what the other two already find