finding #43: opus + narrow framing for contradiction detection

Tests the open question from Finding #39: does Opus's internal reasoning
depth suffice for self-contradiction verification?

Key result: wrong question. Opus finds a different CLASS of contradiction
than GPT-5. GPT-5 finds specification conflicts (statement comparison).
Opus finds logical impossibilities (deductive rule interaction). Neither
dominates — they don't overlap. Sonnet remains unreliable (~33% precision).

Document tested: escalation-policy.md (228 lines)
Models: GPT-5, Claude Opus 4.6, Claude Sonnet 4.6
claw
2026-05-07 16:05:14 -07:00
parent 296bb21eb7
commit d8a030d9e9
2 changed files with 144 additions and 3 deletions
@@ -0,0 +1,134 @@
# Finding #43: Opus + narrow framing produces qualitatively different contradiction type than GPT-5; neither dominates
**Date:** 2026-05-07
**Document:** `docs/domain/contexts/risk/escalation-policy.md` (228 lines)
**Task type:** Internal logical consistency / self-contradiction detection
**Models:** Claude Opus 4.6, GPT-5, Claude Sonnet 4.6 (all narrow framing)
**Open question tested:** "Would Opus + narrow framing match GPT-5 for self-contradiction detection?" (from Finding #39)
## Experiment Design
Finding #39 showed that Sonnet + narrow framing does NOT close the gap with GPT-5 for
contradiction detection — Sonnet found 3 contradictions but only 1 was genuine (2 misreadings).
The open question: does Opus's deeper internal reasoning suffice for the verification step
that Sonnet lacks?
Three conditions, same document, same narrow prompt:
| Condition | Model | Time | Output tokens | Reasoning tokens | Contradictions |
|---|---|---|---|---|---|
| A | GPT-5 | 52s | 6,415 | 6,208 | 1 |
| B | Claude Opus 4.6 | 12s | 468 | (internal) | 1 |
| C | Claude Sonnet 4.6 | 26s | 1,451 | (internal) | 3 |
## What They Found
### GPT-5 (1 genuine contradiction):
**Broker-unavailable timing conflict:** The prose says broker unreachability leads to kill
switch only after "continued consecutive breaches" (N more evaluations). The table says
broker unavailable → "Immediate kill switch escalation." Both describe the same scenario
(broker unavailable during liquidation) but prescribe different timing: debounce-gated vs
immediate. Severity: High.
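The shape of this conflict can be sketched as a same-scenario comparison. This is only an illustration of the contradiction type, not a description of how GPT-5 found it; the dictionary keys, the scenario label, and the function name are hypothetical, and the timing strings paraphrase the two passages quoted above.

```python
from collections import defaultdict

# Hypothetical encoding of the two statements described above.
statements = [
    {"source": "prose",
     "scenario": "broker unavailable during liquidation",
     "kill_switch_timing": "after N more consecutive breaches"},
    {"source": "table",
     "scenario": "broker unavailable during liquidation",
     "kill_switch_timing": "immediate"},
]

def find_specification_conflicts(statements):
    """Return scenarios for which more than one distinct timing is prescribed."""
    by_scenario = defaultdict(list)
    for statement in statements:
        by_scenario[statement["scenario"]].append(statement)

    conflicts = {}
    for scenario, group in by_scenario.items():
        timings = {s["kill_switch_timing"] for s in group}
        if len(timings) > 1:
            conflicts[scenario] = sorted(timings)
    return conflicts

conflicts = find_specification_conflicts(statements)
```

Here `conflicts` holds one entry: the broker-unavailable scenario with both prescribed timings. The hard part in practice is deciding that two passages describe the same scenario, which this sketch takes as given.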
### Claude Opus 4.6 (1 genuine contradiction):
**Debounce reset paradox:** The document states "A single clear evaluation resets the breach
counter." But the Liquidation Sizing section says if liquidation is insufficient, "the next
evaluation cycle can trigger additional liquidation — but only after the debounce count resets
and fires again." If the metric NEVER clears (liquidation was insufficient, metric still
breaches), the counter can never reset per the stated rule. Yet the document says additional
liquidation requires the counter to reset. These cannot both be true for a continuously-
breaching metric. Severity: High.
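A minimal simulation makes the paradox concrete, assuming a literal reading of the two quoted rules. The function name, the `threshold` default of 3, and the `armed` flag are hypothetical modelling choices, not taken from escalation-policy.md.

```python
def count_liquidations(evaluations, threshold=3):
    """Count liquidation triggers under a literal reading of the quoted rules.

    evaluations: iterable of booleans, True = metric breached on that cycle.
    Rule 1: a single clear evaluation resets the breach counter.
    Rule 2: after a liquidation fires, another may fire only once the
            counter has reset and reached the threshold again.
    """
    counter = 0
    armed = True   # hypothetical flag: a trigger is allowed to fire
    fired = 0
    for breached in evaluations:
        if not breached:
            counter = 0    # Rule 1: the only stated way to reset
            armed = True   # Rule 2: a reset re-enables triggering
            continue
        counter += 1
        if armed and counter >= threshold:
            fired += 1
            armed = False  # Rule 2: must reset before firing again
    return fired

continuous = count_liquidations([True] * 12)
with_clear = count_liquidations([True] * 3 + [False] + [True] * 3)
```

Under this reading, twelve consecutive breaches fire exactly one liquidation (`continuous == 1`), while a single clear evaluation in between allows a second (`with_clear == 2`). The case where the first liquidation was insufficient is precisely the case where no clear evaluation occurs, so the additional liquidation the document promises can never fire.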
### Claude Sonnet 4.6 (3 claimed, assessment below):
1. **Failure modes "Automatic" vs manual de-escalation** — Claims "Automatic" recovery in
the failure modes table contradicts "manual only" de-escalation from liquidate.
**Assessment: MISREAD.** The "Automatic" column describes how the system HANDLES the
failure scenario (auto-retries, escalates to kill switch), not downward de-escalation.
The system's autonomous recovery is escalation UPWARD (to kill switch), which is
consistent with manual-only downward de-escalation.
2. **Debounce defaults vs calibration guidance** — Restrict→Liquidate defaults to 3 but
calibration says volatile metrics need 5-8.
**Assessment: TENSION, not contradiction.** The document explicitly says "These are
configurable per metric" — the defaults don't need to match the guidance for specific
metric types. The calibration section explains HOW to override defaults, not what the
defaults must be. This is advice vs defaults, not statement vs statement.
3. **Kill switch immediate trigger vs "post-liquidation" event description** — Same finding
as GPT-5's: broker-unavailable immediate escalation conflicts with the event described
as "post-liquidation."
**Assessment: GENUINE.** This is the same contradiction GPT-5 found but arrived at via
a different evidence path (event description rather than prose/table conflict).
**Sonnet accuracy: 1 genuine + 1 tension + 1 misread out of 3 claimed = 33% precision counting only the genuine finding, 67% if the tension is also credited.**
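The two ends of that range come from how the "tension" finding is scored. A small sketch of the arithmetic, using the three assessment labels above (the function and variable names are illustrative):

```python
def precision(labels, count_as_correct):
    """Fraction of claimed findings whose label is in count_as_correct."""
    correct = sum(1 for label in labels if label in count_as_correct)
    return correct / len(labels)

# Assessments of Sonnet's three claimed contradictions, in order.
sonnet_labels = ["misread", "tension", "genuine"]

strict = precision(sonnet_labels, {"genuine"})              # 1 of 3
lenient = precision(sonnet_labels, {"genuine", "tension"})  # 2 of 3

print(f"strict={strict:.0%} lenient={lenient:.0%}")
# prints: strict=33% lenient=67%
```

The ~33% figure quoted elsewhere in this finding is the strict number.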
## Analysis
### GPT-5's finding vs Opus's finding — different types of contradiction:
GPT-5 found a **surface-level specification conflict**: two statements about the same
scenario (broker unavailable) prescribe different behaviors (wait N breaches vs immediate).
This is the type of contradiction you'd find during careful proofreading — it's where the
document says "X" in one place and "not-X" in another about the same thing.
Opus found a **logical impossibility**: the interaction between two stated rules creates a
situation that can never resolve. The debounce reset rule (requires a clear evaluation) and
the re-triggering mechanism (needs the counter to reset) cannot both work as described when
the metric continuously breaches. This is NOT a statement-vs-statement conflict — it's a
logical consequence that the author likely didn't reason through.
These are qualitatively different:
- GPT-5's type: "you said conflicting things about the same scenario" (specification bug)
- Opus's type: "your rules, when combined, produce an impossible requirement" (logic bug)
### Does Opus match GPT-5?
**No, but not because it's worse.** They find different things. Judging by its output,
GPT-5's 6,208 reasoning tokens went toward exhaustively checking statement pairs for direct
conflicts, while Opus's internal reasoning went toward the LOGICAL INTERACTION between rules.
GPT-5 missed the debounce reset paradox (likely because it requires multi-step logical
reasoning about rule interactions rather than statement comparison). Opus missed the
broker-unavailable timing conflict (likely because it's a more surface-level inconsistency
between prose and table that doesn't involve logical deduction).
### Sonnet's continued weakness:
Consistent with Finding #39: Sonnet found 3 contradictions but only 1 was genuine (the
broker-unavailable one, same as GPT-5). The failure-modes misread shows Sonnet doesn't
reliably verify whether two statements ACTUALLY conflict — it pattern-matches on surface
similarity ("Automatic" and "manual only" appear to conflict) without reasoning about
whether they refer to the same thing. The debounce/calibration "contradiction" confuses
advisory guidance with specification (a type confusion that GPT-5 and Opus both avoided here).
## Key Insight — Two distinct contradiction-finding modes:
| Mode | Best model | What it catches | Cognitive demand |
|---|---|---|---|
| Specification conflicts | GPT-5 | Same scenario, different prescriptions | Statement comparison + verification |
| Logical impossibilities | Opus | Rules that can't coexist under all conditions | Multi-step logical deduction |
This explains why the open question ("does Opus match GPT-5?") has no clean yes/no answer.
They're not attempting the same thing. GPT-5 exhaustively compares statement pairs. Opus
reasons about what the stated rules IMPLY when combined. Both modes catch real bugs that
the other misses.
## Practical Implication
For self-contradiction detection in architecture documents:
- Run BOTH GPT-5 and Opus — they catch fundamentally different types of contradictions
- GPT-5 catches specification bugs (conflicting statements about the same thing)
- Opus catches logic bugs (rules whose interactions produce impossible conditions)
- Sonnet remains unreliable — too many false positives from surface-pattern matching
- The added cost of running Opus is minimal (12s and 468 output tokens, vs 52s and 6,415 output tokens for GPT-5)
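A sketch of what "run both" amounts to once each model's findings have been reviewed: take the union and check for overlap. The finding labels are shorthand for the two contradictions described above. Matching findings by exact string is an assumption made for brevity; in practice, deciding that two differently worded findings are the same contradiction needs a human or semantic check.

```python
def merge_findings(findings_by_model):
    """Map each finding to the list of models that reported it."""
    merged = {}
    for model, findings in findings_by_model.items():
        for finding in findings:
            merged.setdefault(finding, []).append(model)
    return merged

# Genuine findings from this experiment (Sonnet omitted as unreliable).
findings_by_model = {
    "GPT-5": ["broker-unavailable timing conflict"],
    "Claude Opus 4.6": ["debounce reset paradox"],
}

merged = merge_findings(findings_by_model)
overlap = [finding for finding, models in merged.items() if len(models) > 1]
```

For this document, `merged` contains two findings and `overlap` is empty: each model contributed one contradiction the other missed.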
## Updated Answer to Open Question
> "Would Opus + narrow framing match GPT-5 for self-contradiction detection?"
**Wrong question.** Opus doesn't try to match GPT-5; it finds a different class of
contradiction. The right framing: Opus + GPT-5 together catch more than either alone,
and on this document the contradictions they found did not overlap. Run both.
+10 -3
@@ -22,11 +22,18 @@ cross-doc contradictions are easy to verify once spotted (reducing GPT-5's
verification advantage)? Or because boundary reasoning (Opus's strength)
is the primary skill needed?
### Opus + narrow framing for contradiction detection (from Finding #39)
Would Opus + narrow framing match GPT-5 for self-contradiction detection?
### ~~Opus + narrow framing for contradiction detection (from Finding #39)~~ → ANSWERED (Finding #43)
~~Would Opus + narrow framing match GPT-5 for self-contradiction detection?
Finding #39 showed Sonnet can't do it even with narrow framing (reasoning
depth issue). Opus has strong cross-boundary reasoning — does its internal
reasoning depth suffice for the verification step that Sonnet lacks?
reasoning depth suffice for the verification step that Sonnet lacks?~~
**WRONG QUESTION.** Opus doesn't try to match GPT-5 — it finds a different CLASS
of contradiction. GPT-5 finds specification conflicts (same scenario, conflicting
prescriptions via statement comparison). Opus finds logical impossibilities (rules
whose interaction produces impossible conditions via deductive reasoning). Neither
dominates — they don't overlap. Run both for complete coverage. Sonnet remains
unreliable (~33% precision on contradiction detection).
### ~~Sonnet + narrow framing = GPT-5 level? (from Finding #5)~~ → ANSWERED (Finding #39)
~~Would Sonnet catch semantic issues if given a narrower "check for logical