# Finding #34: Information Flow Hazard Analysis on lot-accounting.md

**Date:** 2026-05-06
**Document:** `docs/domain/contexts/ledger/lot-accounting.md` (181 lines)
**Lens:** Information flow hazards (NEW): where data propagation creates stale,
contradictory, incomplete, or misleading views for different consumers.

## Setup

All 3 models received the same document and the same focused analytical prompt
via the HAI proxy. The prompt specified 5 categories: staleness propagation,
fan-out inconsistency, causal ordering violations, write amplification gaps,
and provenance opacity. Each finding had to follow a
Flow/Category/Scenario/Impact/Severity format. The models had no tools and no
project context beyond the document.

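The required finding format can be pictured as a small record type. The sketch below is illustrative only: the class and field names are hypothetical, the example flow label is invented for illustration, and the `Medium` and `Low` severity levels are an assumption (this report itself only uses Critical and High).

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    # The five hazard categories named in the prompt.
    STALENESS_PROPAGATION = "staleness propagation"
    FAN_OUT_INCONSISTENCY = "fan-out inconsistency"
    CAUSAL_ORDERING_VIOLATION = "causal ordering violation"
    WRITE_AMPLIFICATION_GAP = "write amplification gap"
    PROVENANCE_OPACITY = "provenance opacity"


# Assumed severity scale; only Critical and High appear in this report.
SEVERITIES = ("Critical", "High", "Medium", "Low")


@dataclass(frozen=True)
class Finding:
    """One finding in Flow/Category/Scenario/Impact/Severity form."""
    flow: str
    category: Category
    scenario: str
    impact: str
    severity: str

    def __post_init__(self):
        # Reject severities outside the assumed scale.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")


# Example record; the flow label is a hypothetical name for illustration.
example = Finding(
    flow="lot ledger -> Position aggregate",
    category=Category.WRITE_AMPLIFICATION_GAP,
    scenario="Position aggregate staleness after crash",
    impact="Consumers read a stale Position",
    severity="Critical",
)
```
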
## Results

| Model | Time | Output tokens | Reasoning tokens | Findings |
|---|---|---|---|---|
| GPT-5 | 94s | 8,246 | 6,016 | 11 |
| Claude Opus 4.6 | 66s | 3,318 | (internal) | 7 |
| Claude Sonnet 4.6 | 77s | 4,163 | (internal) | 6 |

## Common Ground (all 3 identified)

1. Position aggregate staleness after crash (write amplification gap): Critical
2. `Position.realized_pnl` accumulator drift from the `LotClosed` sum: Critical
3. Multi-lot close walk creating fan-out inconsistency (intermediate states visible): High
4. Corporate action arrival-order race that writes a permanently incorrect, immutable `LotClosed`: Critical
5. Provenance opacity of `Position.average_cost` (no freshness metadata): High

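Common-ground item 2 amounts to an invariant: the `Position.realized_pnl` accumulator should equal the sum re-derived from `LotClosed` events. A minimal sketch of the check a reconciliation job might run, assuming each `LotClosed` event carries a per-lot realized P&L amount (the function name and the list-of-amounts input are hypothetical, not taken from the analyzed document):

```python
from decimal import Decimal
from typing import Iterable


def realized_pnl_drift(position_realized_pnl: Decimal,
                       lot_closed_pnls: Iterable[Decimal]) -> Decimal:
    """Return the accumulator minus the sum re-derived from LotClosed events.

    Zero means the two views agree; any other value is drift.
    """
    rederived = sum(lot_closed_pnls, Decimal("0"))
    return position_realized_pnl - rederived


# The accumulator missed one of two closes, so it lags the event sum by 50.00.
drift = realized_pnl_drift(Decimal("100.00"),
                           [Decimal("100.00"), Decimal("50.00")])
```

`Decimal` is used instead of `float` so that the comparison against zero is exact for monetary amounts.
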
## GPT-5 Unique Findings

- **Lot ledger unavailability as information flow hazard**: Position freezes at its pre-sell state indefinitely during the documented outage. Risk and UI consumers operate on stale exposure for the entire outage and recovery window.
- **Wash sale timing at closing**: The document checks for wash sales only at Opening (buy), not at Closing (sell). A disqualifying buy that already exists when the loss sale executes produces an immutable `LotClosed` carrying a disallowed loss amount.
- **LotClosed per-fill completeness gap**: There is no end-of-fill marker or expected event count. Consumers cannot distinguish "all events written" from "more coming."
- **Reconciliation non-atomicity**: Re-derivation writes may not be atomic across Position fields, creating transient internal inconsistency.
- **Opening write amplification**: The lot exists before the Position update, creating a window of underreported exposure.

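To make the per-fill completeness gap concrete, here is one possible mitigation, sketched under loud assumptions: the `fill_id`, `seq`, and `expected` fields are hypothetical and are not described in the analyzed document, which is exactly why the finding exists. If each event carried its position and the expected total, a consumer could tell a partial fill from a complete one.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LotClosedEvent:
    fill_id: str
    seq: int       # hypothetical: 1-based position of this event within the fill
    expected: int  # hypothetical: total LotClosed events the fill will produce


def fill_is_complete(events: Iterable[LotClosedEvent], fill_id: str) -> bool:
    """True only when every expected LotClosed event for the fill is present."""
    for_fill = [e for e in events if e.fill_id == fill_id]
    expected_counts = {e.expected for e in for_fill}
    if len(expected_counts) != 1:
        # No events yet, or events disagree on the expected count.
        return False
    (n,) = expected_counts
    return {e.seq for e in for_fill} == set(range(1, n + 1))


# Two of three events written: a consumer can see that more are coming.
partial = [LotClosedEvent("fill-1", 1, 3), LotClosedEvent("fill-1", 2, 3)]
complete = partial + [LotClosedEvent("fill-1", 3, 3)]
```
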
## Opus Unique Findings

- **Strategy P&L attribution fan-out**: During multi-LotClosed write, per-strategy P&L consumers see inconsistent attribution. Strategy-level stop-losses or rebalancing may fire incorrectly on partial data.

## Sonnet Unique Findings

- None truly unique: all findings were variations of what the other models found.

## Key Insight: High Model Convergence

This lens produced the **highest convergence rate** across all experiments:
- 45% of GPT-5's findings were common ground (5 of 11, vs. a typical 25-35%)
- Only 6 of GPT-5's 11 findings fell outside the common ground (55%, vs. a typical 60-75%)

**Why:** The document includes an explicit "Failure Modes" table that all models
effectively re-derive as information flow hazards. The unique findings come from
models going BEYOND the document's own failure analysis.

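The shares above follow directly from the counts in this report (5 common-ground findings, 11 GPT-5 findings). A minimal sketch of the arithmetic, with hypothetical variable names:

```python
# Counts taken from the Results table and the Common Ground list.
COMMON_GROUND = 5
GPT5_TOTAL = 11

# Share of GPT-5's findings that all three models also identified.
common_share = round(100 * COMMON_GROUND / GPT5_TOTAL, 1)

# Share of GPT-5's findings that fall outside the common ground.
outside_count = GPT5_TOTAL - COMMON_GROUND
outside_share = round(100 * outside_count / GPT5_TOTAL, 1)

print(common_share, outside_count, outside_share)  # 45.5 6 54.5
```
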
## Practical Implications

1. **Information flow analysis is most valuable on documents WITHOUT explicit failure mode tables**, that is, documents that describe data architecture without analyzing their own failure properties.
2. **Two-model stack (GPT-5 + Opus) is optimal** for this lens. Sonnet adds zero unique value.
3. **GPT-5 finds hazards outside the document's own frame** (event-level completeness, wash sale timing).
4. **Opus excels at dimensional analysis** (strategy attribution dimension) and produces the most concrete, test-case-ready scenarios.

## Model Characterization for This Lens

- **GPT-5**: Broadest coverage; finds event-level and lifecycle-timing hazards the document doesn't address
- **Opus**: Most precise scenario construction with concrete dollar amounts; identifies dimensions the document doesn't analyze
- **Sonnet**: Adequate but redundant; produces elaborated versions of what the other two already find