diff --git a/findings/2026-05-07-43-opus-narrow-contradiction-detection.md b/findings/2026-05-07-43-opus-narrow-contradiction-detection.md
new file mode 100644
index 0000000..7d324f8
--- /dev/null
+++ b/findings/2026-05-07-43-opus-narrow-contradiction-detection.md
@@ -0,0 +1,134 @@
+# Finding #43: Opus + narrow framing produces qualitatively different contradiction type than GPT-5; neither dominates
+
+**Date:** 2026-05-07
+**Document:** `docs/domain/contexts/risk/escalation-policy.md` (228 lines)
+**Task type:** Internal logical consistency / self-contradiction detection
+**Models:** Claude Opus 4.6, GPT-5, Claude Sonnet 4.6 (all narrow framing)
+**Open question tested:** "Would Opus + narrow framing match GPT-5 for self-contradiction detection?" (from Finding #39)
+
+## Experiment Design
+
+Finding #39 showed that Sonnet + narrow framing does NOT close the gap with GPT-5 for
+contradiction detection — Sonnet found 3 contradictions but only 1 was genuine (2 misreadings).
+The open question: does Opus's deeper internal reasoning suffice for the verification step
+that Sonnet lacks?
+
+Three conditions, same document, same narrow prompt:
+
+| Condition | Model | Time | Output tokens | Reasoning tokens | Contradictions |
+|---|---|---|---|---|---|
+| A | GPT-5 | 52s | 6,415 | 6,208 | 1 |
+| B | Claude Opus 4.6 | 12s | 468 | (internal) | 1 |
+| C | Claude Sonnet 4.6 | 26s | 1,451 | (internal) | 3 |
+
+## What They Found
+
+### GPT-5 (1 genuine contradiction):
+
+**Broker-unavailable timing conflict:** The prose says broker unreachability leads to kill
+switch only after "continued consecutive breaches" (N more evaluations). The table says
+broker unavailable → "Immediate kill switch escalation." Both describe the same scenario
+(broker unavailable during liquidation) but prescribe different timing: debounce-gated vs
+immediate. Severity: High.
+
+### Claude Opus 4.6 (1 genuine contradiction):
+
+**Debounce reset paradox:** The document states "A single clear evaluation resets the breach
+counter." But the Liquidation Sizing section says if liquidation is insufficient, "the next
+evaluation cycle can trigger additional liquidation — but only after the debounce count resets
+and fires again." If the metric NEVER clears (liquidation was insufficient, metric still
+breaches), the counter can never reset per the stated rule. Yet the document says additional
+liquidation requires the counter to reset. These cannot both be true for a
+continuously-breaching metric. Severity: High.
+
+### Claude Sonnet 4.6 (3 claimed, assessment below):
+
+1. **Failure modes "Automatic" vs manual de-escalation** — Claims "Automatic" recovery in
+   the failure modes table contradicts "manual only" de-escalation from liquidate.
+   **Assessment: MISREAD.** The "Automatic" column describes how the system HANDLES the
+   failure scenario (auto-retries, escalates to kill switch), not downward de-escalation.
+   The system's autonomous recovery is escalation UPWARD (to kill switch), which is
+   consistent with manual-only downward de-escalation.
+
+2. **Debounce defaults vs calibration guidance** — Restrict→Liquidate defaults to 3 but
+   calibration says volatile metrics need 5-8.
+   **Assessment: TENSION, not contradiction.** The document explicitly says "These are
+   configurable per metric" — the defaults don't need to match the guidance for specific
+   metric types. The calibration section explains HOW to override defaults, not what the
+   defaults must be. This is advice vs defaults, not statement vs statement.
+
+3. **Kill switch immediate trigger vs "post-liquidation" event description** — Same finding
+   as GPT-5's: broker-unavailable immediate escalation conflicts with the event described
+   as "post-liquidation."
+   **Assessment: GENUINE.** This is the same contradiction GPT-5 found but arrived at via
+   a different evidence path (event description rather than prose/table conflict).
+
+**Sonnet accuracy: 1 genuine + 1 tension + 1 misread out of 3 claimed = 33-67% precision.**
+
+## Analysis
+
+### GPT-5's finding vs Opus's finding — different types of contradiction:
+
+GPT-5 found a **surface-level specification conflict**: two statements about the same
+scenario (broker unavailable) prescribe different behaviors (wait N breaches vs immediate).
+This is the type of contradiction you'd find during careful proofreading — it's where the
+document says "X" in one place and "not-X" in another about the same thing.
+
+Opus found a **logical impossibility**: the interaction between two stated rules creates a
+situation that can never resolve. The debounce reset rule (requires a clear evaluation) and
+the re-triggering mechanism (needs the counter to reset) cannot both work as described when
+the metric continuously breaches. This is NOT a statement-vs-statement conflict — it's a
+logical consequence that the author likely didn't reason through.
+
+These are qualitatively different:
+- GPT-5's type: "you said conflicting things about the same scenario" (specification bug)
+- Opus's type: "your rules, when combined, produce an impossible requirement" (logic bug)
+
+### Does Opus match GPT-5?
+
+**No — but not because it's worse.** They find different things. GPT-5's 6,208 reasoning
+tokens went toward exhaustively checking statement pairs for direct conflicts. Opus's
+internal reasoning went toward understanding the LOGICAL INTERACTION between rules.
+
+GPT-5 missed the debounce reset paradox (likely because it requires multi-step logical
+reasoning about rule interactions rather than statement comparison). Opus missed the
+broker-unavailable timing conflict (likely because it's a more surface-level inconsistency
+between prose and table that doesn't involve logical deduction).
+
+### Sonnet's continued weakness:
+
+Consistent with Finding #39: Sonnet found 3 contradictions but only 1 was genuine (the
+broker-unavailable one, same as GPT-5). The failure-modes misread shows Sonnet doesn't
+reliably verify whether two statements ACTUALLY conflict — it pattern-matches on surface
+similarity ("Automatic" and "manual only" appear to conflict) without reasoning about
+whether they refer to the same thing. The debounce/calibration "contradiction" confuses
+advisory guidance with specification (a type confusion that reasoning models avoid).
+
+## Key Insight — Two distinct contradiction-finding modes:
+
+| Mode | Best model | What it catches | Cognitive demand |
+|---|---|---|---|
+| Specification conflicts | GPT-5 | Same scenario, different prescriptions | Statement comparison + verification |
+| Logical impossibilities | Opus | Rules that can't coexist under all conditions | Multi-step logical deduction |
+
+This explains why the open question ("does Opus match GPT-5?") has no clean yes/no answer.
+They're not attempting the same thing. GPT-5 exhaustively compares statement pairs. Opus
+reasons about what the stated rules IMPLY when combined. Both modes catch real bugs that
+the other misses.
+
+## Practical Implication
+
+For self-contradiction detection in architecture documents:
+- Run BOTH GPT-5 and Opus — they catch fundamentally different types of contradictions
+- GPT-5 catches specification bugs (conflicting statements about the same thing)
+- Opus catches logic bugs (rules whose interactions produce impossible conditions)
+- Sonnet remains unreliable — too many false positives from surface-pattern matching
+- The cost is minimal (12s + 468 tokens for Opus vs 52s + 6,415 for GPT-5)
+
+## Updated Answer to Open Question
+
+> "Would Opus + narrow framing match GPT-5 for self-contradiction detection?"
+
+**Wrong question.** Opus doesn't try to match GPT-5 — it finds a different class of
+contradiction. The right framing: Opus + GPT-5 together catch more than either alone,
+and the contradictions they find don't overlap. Run both.
diff --git a/open-questions.md b/open-questions.md
index 4e89821..e268e10 100644
--- a/open-questions.md
+++ b/open-questions.md
@@ -22,11 +22,18 @@ cross-doc contradictions are easy to verify once spotted (reducing GPT-5's
 verification advantage)? Or because boundary reasoning (Opus's strength) is
 the primary skill needed?
 
-### Opus + narrow framing for contradiction detection (from Finding #39)
-Would Opus + narrow framing match GPT-5 for self-contradiction detection?
+### ~~Opus + narrow framing for contradiction detection (from Finding #39)~~ → ANSWERED (Finding #43)
+~~Would Opus + narrow framing match GPT-5 for self-contradiction detection?
 Finding #39 showed Sonnet can't do it even with narrow framing (reasoning depth
 issue). Opus has strong cross-boundary reasoning — does its internal
-reasoning depth suffice for the verification step that Sonnet lacks?
+reasoning depth suffice for the verification step that Sonnet lacks?~~
+
+**WRONG QUESTION.** Opus doesn't try to match GPT-5 — it finds a different CLASS
+of contradiction. GPT-5 finds specification conflicts (same scenario, conflicting
+prescriptions via statement comparison). Opus finds logical impossibilities (rules
+whose interaction produces impossible conditions via deductive reasoning). Neither
+dominates — they don't overlap. Run both for complete coverage. Sonnet remains
+unreliable (~33% precision on contradiction detection).
 
 ### ~~Sonnet + narrow framing = GPT-5 level? (from Finding #5)~~ → ANSWERED (Finding #39)
 ~~Would Sonnet catch semantic issues if given a narrower "check for logical