Add finding #65: concurrent write hazards in event sourcing
New analytical lens testing concurrent write hazards against event-catalog.md. GPT-5 found 19 hazards, Opus 11, Sonnet 12. Union ~27 distinct findings. Key insight: this lens is high-value for event sourcing docs because replay correctness depends on ordering invariants that are often implicit.
@@ -0,0 +1,82 @@
# Finding #65: Concurrent Write Hazards in Event Sourcing

**Date:** 2026-05-10
**Document:** `gargoyle/docs/impl/event-catalog.md` (108 lines)
**Analytical Lens:** Concurrent write hazards in aggregate reconstruction

## Summary

All three models found genuine concurrency hazards with moderate overlap. GPT-5 was the most
exhaustive (19 hazards), Opus identified design-level flaws, and Sonnet covered the core issues
fastest. The union (~27 distinct hazards) far exceeds any single model's output.

## Metrics

| Model | Time | Output tokens | Reasoning tokens | Hazards | Critical | High | Medium |
|---|---|---|---|---|---|---|---|
| GPT-5 | 93s | 2,569 | 4,480 | 19 | 6 | 7 | 5 |
| Claude Opus 4 | 64s | 3,250 | (internal) | 11 | 4 | 5 | 2 |
| Claude Sonnet 4 | 33s | 1,631 | (internal) | 12 | 4 | 5 | 3 |

## Common Ground (All 3 Identified)

1. **Order fill vs cancellation race** (CRITICAL): the fill precedence rule doesn't specify
   which timestamp is authoritative
2. **Position update concurrency** (CRITICAL): no optimistic concurrency control
3. **Cross-stream atomicity** (CRITICAL): `OrderFilled` and `LotOpened` are not written atomically
4. **Kill switch toggle race** (CRITICAL): global singleton without concurrency control
5. **Lot closure idempotency** (HIGH): `LotPartiallyClosed` can double-apply
6. **Partial fill accumulation** (HIGH): duplicate events can double-count fills
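
As an illustration of the optimistic concurrency control that item 2 reports as missing, here is a minimal sketch of an expected-version check on append. The `InMemoryEventStore` class, its method names, and the `position-AAPL` stream ID are hypothetical and not taken from `event-catalog.md`. A real store would need to make the check and the append a single atomic operation.

```python
class ConcurrencyError(Exception):
    """Raised when a stream has moved past the version the writer expected."""


class InMemoryEventStore:
    """Hypothetical store used only to illustrate expected-version checks."""

    def __init__(self):
        self._streams = {}  # stream_id -> list of events

    def version(self, stream_id):
        # The version is simply the number of events already in the stream.
        return len(self._streams.get(stream_id, []))

    def append(self, stream_id, event, expected_version):
        events = self._streams.setdefault(stream_id, [])
        if len(events) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected v{expected_version}, found v{len(events)}"
            )
        events.append(event)
        return len(events)


store = InMemoryEventStore()
seen = store.version("position-AAPL")  # two writers both read version 0

store.append("position-AAPL", {"type": "PositionUpdated", "qty": 100}, seen)
try:
    # The second writer still believes the stream is at version 0.
    store.append("position-AAPL", {"type": "PositionUpdated", "qty": 50}, seen)
    conflict = False
except ConcurrencyError:
    conflict = True

print(conflict, store.version("position-AAPL"))
```

The losing writer gets an error instead of silently overwriting state, and can reload the stream and retry.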

## GPT-5 Unique Findings

- **pending_cancel vs pending_replace race**: no precedence between competing non-terminal
  transitions (HIGH)
- **Terminal-to-nonterminal regression**: fill precedence omits the rejected state (HIGH)
- **Fill events lack unique fill_id**: no idempotency key on fill events (CRITICAL)
- **OrderPartiallyFilled + OrderFilled collision**: race between partial and terminal fills (HIGH)
- **PositionUpdated after PositionClosed**: no precedence rule blocks the late update (HIGH)
- **LotPartiallyClosed vs LotFullyClosed race**: competing closures (HIGH)
- Plus 5 more MEDIUM findings on state transitions and event delivery
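
To make the `fill_id` finding concrete, the sketch below shows how an idempotency key could stop a redelivered fill from being counted twice. The `OrderAggregate` class and its fields are hypothetical. The catalog's actual event schema is not reproduced here, and this assumes the broker or producer supplies a stable `fill_id` per fill.

```python
from dataclasses import dataclass, field


@dataclass
class OrderAggregate:
    """Hypothetical aggregate; tracks which fills have already been applied."""
    order_qty: int
    filled_qty: int = 0
    seen_fill_ids: set = field(default_factory=set)

    def apply_fill(self, fill_id: str, qty: int) -> bool:
        # A fill_id seen before means a duplicate delivery: ignore it.
        if fill_id in self.seen_fill_ids:
            return False
        self.seen_fill_ids.add(fill_id)
        self.filled_qty += qty
        return True


order = OrderAggregate(order_qty=100)
first = order.apply_fill("fill-1", 40)
duplicate = order.apply_fill("fill-1", 40)  # same fill redelivered
second = order.apply_fill("fill-2", 60)

print(order.filled_qty)
```

Without the key, the three deliveries would accumulate to 140 against an order quantity of 100.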

## Opus Unique Findings

- **Cost basis non-determinism**: concurrent partial fills produce different lot cost bases
  depending on application order (HIGH); qualitatively different from quantity accumulation
- **Order state machine transition matrix undefined**: "handles out-of-order" is insufficient (HIGH)
- **User ID collision in stream ID**: multi-tenant collision risk (HIGH/MEDIUM)
- **DecisionFormed references unpersisted signals** (MEDIUM)
- **Fill ID uniqueness unspecified** (HIGH)
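
The cost basis finding can be shown with a small worked example. The sketch below assumes one lot is opened per buy fill and that lots are closed first-in, first-out. Both are assumptions made for illustration, since the lot-matching rule in `event-catalog.md` is not quoted here. The function name and the prices are hypothetical.

```python
from collections import deque


def realized_pnl_fifo(buy_fills, sell_qty, sell_price):
    """Open one lot per buy fill in application order, then close lots FIFO."""
    lots = deque(buy_fills)  # each item is (qty, price)
    pnl = 0.0
    remaining = sell_qty
    while remaining > 0 and lots:
        qty, price = lots.popleft()
        closed = min(qty, remaining)
        pnl += closed * (sell_price - price)
        remaining -= closed
        if qty > closed:
            # Put the unclosed remainder of the lot back at the front.
            lots.appendleft((qty - closed, price))
    return pnl


fill_a = (50, 100.0)  # 50 shares at 100
fill_b = (50, 102.0)  # 50 shares at 102

# Same two fills, applied in opposite orders, followed by the same sell.
pnl_ab = realized_pnl_fifo([fill_a, fill_b], sell_qty=50, sell_price=105.0)
pnl_ba = realized_pnl_fifo([fill_b, fill_a], sell_qty=50, sell_price=105.0)

print(pnl_ab, pnl_ba)  # 250.0 150.0
```

The total quantity is 100 in both orderings, so a quantity check would pass, yet realized P&L differs. This is why the hazard is distinct from quantity accumulation.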

## Sonnet Unique Findings

- **Broker vs system ordering mismatch**: the broker may process requests in the reverse of the
  order the system issued them (HIGH)
- **Stream ID generation race**: algorithms could generate the same order_id (HIGH)
- **Position resurrection**: a delayed fill updates the position after PositionClosed (MEDIUM)
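
One way to make the position resurrection hazard explicit is a transition table that treats `closed` as terminal. This is a sketch only: the state names and the `apply_event` helper are hypothetical. Whether a late fill should be rejected, parked for review, or allowed to reopen the position is a design decision the catalog would need to state.

```python
# Allowed (state, event) -> next state. Anything not listed is rejected.
ALLOWED = {
    ("open", "PositionUpdated"): "open",
    ("open", "PositionClosed"): "closed",
}


def apply_event(state, event_type):
    """Return (new_state, accepted) for a single event."""
    next_state = ALLOWED.get((state, event_type))
    if next_state is None:
        return state, False  # no rule: keep the state, flag the event
    return next_state, True


state = "open"
log = []
# The last event models a delayed fill arriving after the position closed.
for event_type in ["PositionUpdated", "PositionClosed", "PositionUpdated"]:
    state, accepted = apply_event(state, event_type)
    log.append((event_type, accepted))

print(state, log[-1])
```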

## Key Insight

Concurrent write hazard analysis is a high-value lens for event sourcing documents because:

1. Event sourcing inherently involves concurrent writes (producers, brokers, timers)
2. Replay correctness depends on ordering invariants that are often implicit
3. Cross-stream dependencies are common but atomicity is hard to achieve
4. Idempotency requirements are frequently under-specified

## Model Strengths

- **GPT-5:** Exhaustive enumeration across all hazard categories. Best for comprehensive audits.
- **Opus:** Design-level hazards where the event model is underspecified (e.g. cost basis determinism).
- **Sonnet:** External consistency (broker ordering) and upstream hazards (ID generation).

## Practical Implication

For event sourcing architecture documents, run all three models: GPT-5 for exhaustive coverage,
Opus for design gaps, and Sonnet for fast screening. The union provides coverage that no single
model achieves.

## Efficiency

- GPT-5: 135 output tokens/hazard
- Opus: 295 output tokens/hazard (more detailed scenarios)
- Sonnet: 136 output tokens/hazard (similar to GPT-5, but faster)
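
These figures are output tokens divided by hazard count, using the values in the Metrics table. A short check of the arithmetic (reasoning tokens are excluded, since they are only reported for GPT-5):

```python
# (output tokens, hazards) per model, copied from the Metrics table.
metrics = {
    "GPT-5": (2569, 19),
    "Opus": (3250, 11),
    "Sonnet": (1631, 12),
}

tokens_per_hazard = {
    model: round(tokens / hazards) for model, (tokens, hazards) in metrics.items()
}

print(tokens_per_hazard)  # {'GPT-5': 135, 'Opus': 295, 'Sonnet': 136}
```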