# Boundary Violation Risk Analysis on Context-Level README

**Date:** 2026-05-10
**Document:** gargoyle Risk context README (149 lines)
**Task Type:** Boundary violation risk identification
**Status:** Partial experiment (heavy models timed out)

## Experiment Design

First experiment targeting a bounded context README rather than a mechanism specification. The analytical lens (boundary violations) matches the document type (architecture/boundaries).

### Prompt Categories

1. Anti-corruption layer gaps
2. Event contract underspecification
3. Service responsibility creep
4. Invariant enforcement gaps
5. Temporal coupling risks

### Models Tested

| Model | Time | Output tokens | Status |
|---|---|---|---|
| Claude Sonnet 4.6 | 19s | 1,302 | Complete |
| Claude Opus | - | - | TIMEOUT (300s) |
| GPT-5 | - | - | TIMEOUT (300s) |

## Results (Sonnet)

12 findings across 5 categories:

### Critical Severity (2)

- **LotClosed for PDT counting:** No definition of "same day" across time zones, no partial close handling
- **Kill Switch monotonicity:** No mechanism to prevent concurrent escalation level updates

### High Severity (8)

- Tick event translation rules missing (symbol mapping, price normalization)
- FillReceived currency/lot identification mapping undefined
- KillSwitchEngaged/Disengaged idempotency semantics missing
- Liquidation Sizing requires wash sale/liquidity data not provided via events
- Portfolio Evaluation requires correlation data not provided via events
- PDT Counting concurrent update atomicity undefined
- Continuous Monitoring/MarketOpened sequencing unspecified
- Portfolio Evaluation price staleness tolerance unspecified

### Medium Severity (2)

- MarketDataStale/Fresh staleness threshold undefined
- Kill switch disengage race between automatic and manual

## Key Insights

### Context-level analysis is different from spec-level

Findings are about RELATIONSHIPS (how contexts communicate) rather than MECHANISMS (how components work internally).
Several findings identify where the README CLAIMS a service has responsibilities but doesn't document the DATA FLOW enabling them.

### Document type shapes finding character

Spec-level experiments find implementation gaps (ETS ownership, race conditions). Context-level experiments find BOUNDARY gaps (event contracts, cross-context timing).

### Sonnet handles context-level analysis well

With structured prompts (5 categories), Sonnet produces well-organized boundary violation analysis in 19s. The feedback loop finding (#12 - pro-forma staleness causing immediate escalation) shows genuine architectural reasoning.

## Pending Work

- Retry Opus and GPT-5 during lower-contention period
- Compare whether heavy models find different boundary risks
- Hypothesis: Opus may excel here (boundary/tension reasoning is its strength)

## Practical Implication

For context-level architecture review, Sonnet is sufficient for first-pass boundary violation scanning. Structured prompts keep analysis focused. Worth running Opus as second pass for design-tension identification.
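
For anyone rerunning the first-pass scan, the following is a rough sketch of how a five-category structured prompt could be assembled. The category names and severity labels are taken from this report; the prompt wording, the `build_boundary_prompt` function name, and the README delimiter are assumptions made for illustration, since the actual prompt text used in the experiment is not recorded here.

```python
# Hypothetical sketch: category names and severity labels come from this
# report; the prompt wording and function name are assumptions.
CATEGORIES = [
    "Anti-corruption layer gaps",
    "Event contract underspecification",
    "Service responsibility creep",
    "Invariant enforcement gaps",
    "Temporal coupling risks",
]

SEVERITIES = ("Critical", "High", "Medium")


def build_boundary_prompt(readme_text: str, categories=CATEGORIES) -> str:
    """Assemble one review prompt that asks for findings per category."""
    # Number the categories so findings can be grouped under them.
    numbered = "\n".join(
        f"{i}. {name}" for i, name in enumerate(categories, start=1)
    )
    return (
        "Identify boundary violation risks in the bounded context README below.\n"
        "Report findings under each of these categories:\n"
        f"{numbered}\n"
        f"Label each finding with a severity: {', '.join(SEVERITIES)}.\n"
        "--- README ---\n"
        f"{readme_text}\n"
    )
```

Keeping the categories in a single list means a second-pass run with a heavier model can reuse the same prompt, so any difference in findings is attributable to the model rather than to the prompt.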