From 4a69a99d051be9e11722c530cc1b6aea4e65818f Mon Sep 17 00:00:00 2001
From: Rodin
Date: Wed, 6 May 2026 18:29:06 -0700
Subject: [PATCH] finding #34: information flow hazard analysis on lot-accounting.md

New analytical lens: where data propagation creates stale, contradictory,
or misleading views for different consumers.

Key result: highest model convergence (45% common ground) due to
document's explicit failure mode table. GPT-5 finds event-level
provenance gaps; Opus identifies strategy attribution dimension. Sonnet
adds zero unique value. Two-model stack (GPT-5 + Opus) optimal.
---
 ...-06-34-information-flow-hazard-analysis.md | 69 +++++++++++++++++++
 1 file changed, 69 insertions(+)
 create mode 100644 findings/2026-05-06-34-information-flow-hazard-analysis.md

diff --git a/findings/2026-05-06-34-information-flow-hazard-analysis.md b/findings/2026-05-06-34-information-flow-hazard-analysis.md
new file mode 100644
index 0000000..51660c2
--- /dev/null
+++ b/findings/2026-05-06-34-information-flow-hazard-analysis.md
@@ -0,0 +1,69 @@
+# Finding #34: Information Flow Hazard Analysis on lot-accounting.md
+
+**Date:** 2026-05-06
+**Document:** `docs/domain/contexts/ledger/lot-accounting.md` (181 lines)
+**Lens:** Information flow hazards (NEW): where data propagation creates stale,
+contradictory, incomplete, or misleading views for different consumers.
+
+## Setup
+
+Same document + same focused analytical prompt to all 3 models via HAI proxy.
+Specified 5 categories: staleness propagation, fan-out inconsistency, causal
+ordering violations, write amplification gaps, provenance opacity. Required
+Flow/Category/Scenario/Impact/Severity format per finding. No tools, no project
+context beyond the document.
+
+## Results
+
+| Model | Time | Output tokens | Reasoning tokens | Findings |
+|---|---|---|---|---|
+| GPT-5 | 94s | 8,246 | 6,016 | 11 |
+| Claude Opus 4.6 | 66s | 3,318 | (internal) | 7 |
+| Claude Sonnet 4.6 | 77s | 4,163 | (internal) | 6 |
+
+## Common Ground (all 3 identified)
+
+1. Position aggregate staleness after crash (write amplification gap): Critical
+2. Position.realized_pnl accumulator drift from LotClosed sum: Critical
+3. Multi-lot close walk creating fan-out inconsistency (intermediate states visible): High
+4. Corporate action arrival ordering race creating permanently incorrect immutable LotClosed: Critical
+5. Provenance opacity of Position.average_cost (no freshness metadata): High
+
+## GPT-5 Unique Findings
+
+- **Lot ledger unavailability as information flow hazard**: Position freezes at pre-sell state indefinitely during documented outage. Risk/UI operate on stale exposure for entire outage+recovery window.
+- **Wash sale timing at closing**: Document only checks at Opening (buy), not Closing (sell). Disqualifying buy that already exists when loss sale executes creates immutable LotClosed with disallowed loss amount.
+- **LotClosed per-fill completeness gap**: No end-of-fill marker or expected event count. Consumers cannot distinguish "all events written" from "more coming."
+- **Reconciliation non-atomicity**: Re-derivation writes may not be atomic across Position fields, creating transient internal inconsistency.
+- **Opening write amplification**: Lot exists before Position update, creating underreported-exposure window.
+
+## Opus Unique Findings
+
+- **Strategy P&L attribution fan-out**: During multi-LotClosed write, per-strategy P&L consumers see inconsistent attribution. Strategy-level stop-losses or rebalancing may fire incorrectly on partial data.
+
+## Sonnet Unique Findings
+
+- None truly unique: all findings were variations of what the other models found.
+
+## Key Insight: High Model Convergence
+
+This lens produced the **highest convergence rate** across all experiments:
+- 45% of GPT-5's findings were common ground (vs typical 25-35%)
+- Only 6 unique findings across GPT-5's 11 (55% unique vs typical 60-75%)
+
+**Why:** The document includes an explicit "Failure Modes" table that all models
+effectively re-derive as information flow hazards. The unique findings come from
+models going BEYOND the document's own failure analysis.
+
+## Practical Implications
+
+1. **Information flow analysis is most valuable on documents WITHOUT explicit failure mode tables**, i.e. documents that describe data architecture without self-analyzing their failure properties.
+2. **Two-model stack (GPT-5 + Opus) is optimal** for this lens. Sonnet adds zero unique value.
+3. **GPT-5 finds hazards outside the document's own frame** (event-level completeness, wash sale timing).
+4. **Opus excels at dimensional analysis** (strategy attribution dimension) and produces the most concrete, test-case-ready scenarios.
+
+## Model Characterization for This Lens
+
+- **GPT-5**: Broadest coverage; finds event-level and lifecycle-timing hazards the document doesn't address
+- **Opus**: Most precise scenario construction with concrete dollar amounts; identifies dimensions the document doesn't analyze
+- **Sonnet**: Adequate but redundant; produces elaborated versions of what the other two already find
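
Review note (outside the hunk, not part of the patch): the convergence figures in the Key Insight section appear to come from the Results table (11 GPT-5 findings) and the Common Ground list (5 shared items). A minimal sketch of that arithmetic follows. It assumes the convergence rate is simply common findings divided by total findings, and the function name `convergence` is illustrative rather than anything from the experiment tooling. Under that assumption 5 of 11 rounds to 45% and the remaining 6 of 11 rounds to 55%.

```python
def convergence(total_findings: int, common_findings: int) -> tuple[int, int]:
    """Return (common %, unique %) as whole percentages.

    Assumption: every finding is either common ground or unique,
    so the two shares are complementary.
    """
    if total_findings <= 0 or not 0 <= common_findings <= total_findings:
        raise ValueError("need 0 <= common_findings <= total_findings and total_findings > 0")
    common_pct = round(100 * common_findings / total_findings)
    return common_pct, 100 - common_pct


# GPT-5 row of the Results table: 11 findings, 5 of them in Common Ground.
common_pct, unique_pct = convergence(total_findings=11, common_findings=5)
print(f"common: {common_pct}%, unique: {unique_pct}%")
```

Deriving the unique share as the complement keeps the two percentages summing to 100, which independent rounding or truncation of each share does not guarantee.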