From 7c64712c2fdc1804b7d7aef04d99f95005d6c5ae Mon Sep 17 00:00:00 2001
From: Rodin
Date: Sun, 10 May 2026 11:48:41 -0700
Subject: [PATCH] Add finding #65: concurrent write hazards in event sourcing

New analytical lens testing concurrent write hazards against event-catalog.md.
GPT-5 found 19 hazards, Opus 11, Sonnet 12. Union ~27 distinct findings.
Key insight: this lens is high-value for event sourcing docs because replay
correctness depends on ordering invariants that are often implicit.
---
 ...concurrent-write-hazards-event-sourcing.md | 82 +++++++++++++++++++
 1 file changed, 82 insertions(+)
 create mode 100644 findings/2026-05-10-65-concurrent-write-hazards-event-sourcing.md

diff --git a/findings/2026-05-10-65-concurrent-write-hazards-event-sourcing.md b/findings/2026-05-10-65-concurrent-write-hazards-event-sourcing.md
new file mode 100644
index 0000000..9fe0cff
--- /dev/null
+++ b/findings/2026-05-10-65-concurrent-write-hazards-event-sourcing.md
@@ -0,0 +1,82 @@
+# Finding #65: Concurrent Write Hazards in Event Sourcing
+
+**Date:** 2026-05-10
+**Document:** `gargoyle/docs/impl/event-catalog.md` (108 lines)
+**Analytical Lens:** Concurrent write hazards in aggregate reconstruction
+
+## Summary
+
+All three models found genuine concurrency hazards with moderate overlap. GPT-5 was most
+exhaustive (19 hazards), Opus identified design-level flaws, and Sonnet covered core issues
+fastest. The union (~27 distinct hazards) far exceeds any single model's output.
+
+## Metrics
+
+| Model | Time | Output tokens | Reasoning tokens | Hazards | Critical | High | Medium |
+|---|---|---|---|---|---|---|---|
+| GPT-5 | 93s | 2,569 | 4,480 | 19 | 6 | 7 | 5 |
+| Claude Opus 4 | 64s | 3,250 | (internal) | 11 | 4 | 5 | 2 |
+| Claude Sonnet 4 | 33s | 1,631 | (internal) | 12 | 4 | 5 | 3 |
+
+## Common Ground (All 3 Identified)
+
+1. **Order fill vs cancellation race** (CRITICAL) — fill precedence rule doesn't specify
+   timestamp authority
+2. **Position update concurrency** (CRITICAL) — no optimistic concurrency control
+3. **Cross-stream atomicity** (CRITICAL) — OrderFilled/LotOpened not atomic
+4. **Kill switch toggle race** (CRITICAL) — global singleton without concurrency control
+5. **Lot closure idempotency** (HIGH) — LotPartiallyClosed can double-apply
+6. **Partial fill accumulation** (HIGH) — duplicates can double-count fills
+
+## GPT-5 Unique Findings
+
+- **pending_cancel vs pending_replace race** — no precedence between competing non-terminal
+  transitions (HIGH)
+- **Terminal-to-nonterminal regression** — fill precedence omits rejected state (HIGH)
+- **Fill events lack unique fill_id** — no idempotency key on fill events (CRITICAL)
+- **OrderPartiallyFilled + OrderFilled collision** — race between partial and terminal (HIGH)
+- **PositionUpdated after PositionClosed** — no precedence blocking (HIGH)
+- **LotPartiallyClosed vs LotFullyClosed race** — competing closures (HIGH)
+- Plus 5 more MEDIUM findings on state transitions and event delivery
+
+## Opus Unique Findings
+
+- **Cost basis non-determinism** — concurrent partial fills produce different lot cost bases
+  depending on application order (HIGH) — qualitatively different from quantity accumulation
+- **Order state machine transition matrix undefined** — "handles out-of-order" insufficient (HIGH)
+- **User ID collision in stream ID** — multi-tenant collision risk (HIGH/MEDIUM)
+- **DecisionFormed references unpersisted signals** (MEDIUM)
+- **Fill ID uniqueness unspecified** (HIGH)
+
+## Sonnet Unique Findings
+
+- **Broker vs system ordering mismatch** — broker may process in reverse order from system (HIGH)
+- **Stream ID generation race** — algorithms could generate same order_id (HIGH)
+- **Position resurrection** — delayed fill updates position after PositionClosed (MEDIUM)
+
+## Key Insight
+
+Concurrent write hazard analysis is a high-value lens for event sourcing documents because:
+
+1. Event sourcing inherently involves concurrent writes (producers, brokers, timers)
+2. Replay correctness depends on ordering invariants that are often implicit
+3. Cross-stream dependencies are common but atomicity is hard to achieve
+4. Idempotency requirements are frequently under-specified
+
+## Model Strengths
+
+- **GPT-5:** Exhaustive enumeration across all hazard categories. Best for comprehensive audits.
+- **Opus:** Design-level hazards where the model is underspecified (cost basis determinism).
+- **Sonnet:** External consistency (broker ordering) and upstream hazards (ID generation).
+
+## Practical Implication
+
+For event sourcing architecture documents, run all three models. GPT-5 for exhaustive coverage,
+Opus for design gaps, Sonnet for fast screening. The union provides comprehensive coverage that
+no single model achieves.
+
+## Efficiency
+
+- GPT-5: 135 tokens/hazard
+- Opus: 295 tokens/hazard (more detailed scenarios)
+- Sonnet: 136 tokens/hazard (similar to GPT-5, but faster)
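
Reviewer note (not part of the patch): two of the hazards listed in the finding, the missing optimistic concurrency control and the missing idempotency key on fill events, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `InMemoryEventStore`, `ConcurrencyError`, `filled_quantity`, the `expected_version` and `idempotency_key` parameters, and the `qty` field are hypothetical names and do not come from `event-catalog.md`. Only the event type names are taken from the finding.

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when a writer's expected stream version is stale."""


@dataclass
class Stream:
    events: list = field(default_factory=list)
    seen_keys: set = field(default_factory=set)


class InMemoryEventStore:
    """Hypothetical single-process store; a real one would need the
    check-and-append below to be atomic (e.g. a conditional write)."""

    def __init__(self):
        self._streams = {}

    def read(self, stream_id):
        return list(self._streams.get(stream_id, Stream()).events)

    def append(self, stream_id, event, expected_version, idempotency_key):
        stream = self._streams.setdefault(stream_id, Stream())
        # Idempotency: a redelivered event (same key) is a no-op, not an error.
        if idempotency_key in stream.seen_keys:
            return False
        # Optimistic concurrency: reject writers that read an older version.
        if expected_version != len(stream.events):
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, "
                f"actual {len(stream.events)}"
            )
        stream.events.append(event)
        stream.seen_keys.add(idempotency_key)
        return True


def filled_quantity(store, stream_id):
    """Replay fill events to rebuild the filled quantity of one order."""
    fill_types = ("OrderPartiallyFilled", "OrderFilled")
    return sum(e["qty"] for e in store.read(stream_id) if e["type"] in fill_types)


store = InMemoryEventStore()
fill = {"type": "OrderPartiallyFilled", "qty": 40}
store.append("order-1", fill, expected_version=0, idempotency_key="fill-a")
# The broker redelivers the same fill: it is ignored, so replay still sums to 40.
store.append("order-1", fill, expected_version=0, idempotency_key="fill-a")
print(filled_quantity(store, "order-1"))
```

In this sketch the duplicate check runs before the version check, so a retried delivery carrying a stale version is dropped quietly, while a genuinely new write against a stale version (for example a cancellation racing the fill) raises `ConcurrencyError` and forces the writer to re-read the stream. Whether that ordering suits the catalog's fill-precedence rule is a design decision the finding leaves open.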