finding #43: opus + narrow framing for contradiction detection
Tests the open question from Finding #39: does Opus's internal reasoning depth suffice for self-contradiction verification?

Key result: wrong question. Opus finds a different CLASS of contradiction than GPT-5. GPT-5 finds specification conflicts (statement comparison). Opus finds logical impossibilities (deductive rule interaction). Neither dominates, and their findings don't overlap. Sonnet remains unreliable (~33% precision).

Document tested: escalation-policy.md (228 lines)
Models: GPT-5, Claude Opus 4.6, Claude Sonnet 4.6
@@ -0,0 +1,134 @@
# Finding #43: Opus + narrow framing finds a qualitatively different type of contradiction than GPT-5; neither dominates

**Date:** 2026-05-07
**Document:** `docs/domain/contexts/risk/escalation-policy.md` (228 lines)
**Task type:** Internal logical consistency / self-contradiction detection
**Models:** Claude Opus 4.6, GPT-5, Claude Sonnet 4.6 (all narrow framing)
**Open question tested:** "Would Opus + narrow framing match GPT-5 for self-contradiction detection?" (from Finding #39)

## Experiment Design

Finding #39 showed that Sonnet + narrow framing does NOT close the gap with GPT-5 for
contradiction detection: Sonnet reported 3 contradictions, but only 1 was genuine (the other 2 were misreadings).
The open question: does Opus's deeper internal reasoning suffice for the verification step
that Sonnet lacks?

Three conditions, same document, same narrow prompt:

| Condition | Model | Time | Output tokens | Reasoning tokens | Contradictions reported |
|---|---|---|---|---|---|
| A | GPT-5 | 52s | 6,415 | 6,208 | 1 |
| B | Claude Opus 4.6 | 12s | 468 | (internal) | 1 |
| C | Claude Sonnet 4.6 | 26s | 1,451 | (internal) | 3 |

## What They Found

### GPT-5 (1 genuine contradiction):

**Broker-unavailable timing conflict:** The prose says broker unreachability leads to the kill
switch only after "continued consecutive breaches" (N more evaluations). The table says
broker unavailable → "Immediate kill switch escalation." Both describe the same scenario
(broker unavailable during liquidation) but prescribe different timing: debounce-gated vs
immediate. Severity: High.
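
The timing gap can be made concrete with a small sketch. This is an illustration under stated assumptions, not the policy's actual logic: the function names, the debounce count `N = 3`, and the reading of "continued consecutive breaches" as N consecutive evaluations with the broker down are all hypothetical.

```python
from typing import Optional


def first_escalation_prose(broker_down: list[bool], n: int) -> Optional[int]:
    """Prose reading (assumed): escalate once n consecutive evaluations see the broker down."""
    streak = 0
    for i, down in enumerate(broker_down):
        streak = streak + 1 if down else 0
        if streak >= n:
            return i
    return None


def first_escalation_table(broker_down: list[bool]) -> Optional[int]:
    """Table reading: escalate at the first evaluation that sees the broker down."""
    for i, down in enumerate(broker_down):
        if down:
            return i
    return None


N = 3  # hypothetical debounce count, not taken from escalation-policy.md
timeline = [True, True, True, True]  # broker unavailable at every evaluation

prose_index = first_escalation_prose(timeline, N)
table_index = first_escalation_table(timeline)
print(f"prose rule escalates at evaluation {prose_index}, table rule at {table_index}")
```

For the same input, the two readings escalate at different evaluations, which is the conflict GPT-5 flagged.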

### Claude Opus 4.6 (1 genuine contradiction):

**Debounce reset paradox:** The document states "A single clear evaluation resets the breach
counter." But the Liquidation Sizing section says that if liquidation is insufficient, "the next
evaluation cycle can trigger additional liquidation — but only after the debounce count resets
and fires again." If the metric NEVER clears (liquidation was insufficient, so the metric still
breaches), the counter can never reset under the stated rule. Yet the document says additional
liquidation requires the counter to reset. Both statements cannot hold for a continuously
breaching metric. Severity: High.
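
A minimal simulation shows the deadlock, assuming a literal reading of the two quoted rules. The function name, the `armed` flag, and the debounce count of 3 are hypothetical modelling choices, not details from the policy document.

```python
def count_liquidations(breaches: list[bool], n: int) -> int:
    """Count liquidation triggers under a literal reading of the two rules.

    Rule 1: a single clear evaluation resets the breach counter.
    Rule 2: after a trigger, another trigger needs the counter to reset first.
    """
    counter = 0
    armed = True  # hypothetical flag: False after a trigger until the counter resets
    triggers = 0
    for breached in breaches:
        if not breached:
            counter = 0   # Rule 1: only a clear evaluation resets the counter
            armed = True  # Rule 2: the reset is what allows the next trigger
            continue
        counter += 1
        if armed and counter >= n:
            triggers += 1
            armed = False
    return triggers


never_clears = [True] * 12                       # liquidation was insufficient
clears_once = [True] * 3 + [False] + [True] * 3  # one clear evaluation in between

print(count_liquidations(never_clears, n=3))  # 1
print(count_liquidations(clears_once, n=3))   # 2
```

With a metric that never clears, the first trigger is also the last: the additional liquidation the document promises cannot fire, because the reset it depends on never happens.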

### Claude Sonnet 4.6 (3 claimed, assessment below):

1. **Failure modes "Automatic" vs manual de-escalation:** Claims that "Automatic" recovery in
   the failure modes table contradicts "manual only" de-escalation from liquidate.
   **Assessment: MISREAD.** The "Automatic" column describes how the system HANDLES the
   failure scenario (auto-retries, escalates to kill switch), not downward de-escalation.
   The system's autonomous recovery is escalation UPWARD (to kill switch), which is
   consistent with manual-only downward de-escalation.

2. **Debounce defaults vs calibration guidance:** Restrict→Liquidate defaults to 3, but
   calibration says volatile metrics need 5-8.
   **Assessment: TENSION, not contradiction.** The document explicitly says "These are
   configurable per metric", so the defaults don't need to match the guidance for specific
   metric types. The calibration section explains HOW to override defaults, not what the
   defaults must be. This is advice vs defaults, not statement vs statement.

3. **Kill switch immediate trigger vs "post-liquidation" event description:** Same finding
   as GPT-5's: broker-unavailable immediate escalation conflicts with the event described
   as "post-liquidation."
   **Assessment: GENUINE.** This is the same contradiction GPT-5 found, but reached via
   a different evidence path (event description rather than prose/table conflict).

**Sonnet accuracy: 1 genuine + 1 tension + 1 misread out of 3 claimed = 33% precision counting only the genuine finding, 67% if the tension is also credited.**
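
The two ends of that range come from how the "tension" verdict is scored. A short worked calculation, using the three assessments above (the variable names are illustrative only):

```python
from fractions import Fraction

# Verdicts assigned to Sonnet's three claimed contradictions, in order.
verdicts = ["misread", "tension", "genuine"]

# Strict scoring: only a genuine contradiction counts as a true positive.
strict = Fraction(verdicts.count("genuine"), len(verdicts))

# Lenient scoring: a real tension is also credited as a useful finding.
lenient = Fraction(
    verdicts.count("genuine") + verdicts.count("tension"), len(verdicts)
)

print(f"strict precision:  {float(strict):.0%}")   # 33%
print(f"lenient precision: {float(lenient):.0%}")  # 67%
```

The ~33% figure quoted elsewhere for Sonnet corresponds to the strict scoring.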

## Analysis

### GPT-5's finding vs Opus's finding (different types of contradiction):

GPT-5 found a **surface-level specification conflict**: two statements about the same
scenario (broker unavailable) prescribe different behaviors (wait N breaches vs immediate).
This is the type of contradiction you'd find during careful proofreading: the
document says "X" in one place and "not-X" in another about the same thing.

Opus found a **logical impossibility**: the interaction between two stated rules creates a
situation that can never resolve. The debounce reset rule (requires a clear evaluation) and
the re-triggering mechanism (needs the counter to reset) cannot both work as described when
the metric continuously breaches. This is NOT a statement-vs-statement conflict; it is a
logical consequence that the author likely didn't reason through.

These are qualitatively different:

- GPT-5's type: "you said conflicting things about the same scenario" (specification bug)
- Opus's type: "your rules, when combined, produce an impossible requirement" (logic bug)

### Does Opus match GPT-5?

**No, but not because it's worse.** They find different things. GPT-5's 6,208 reasoning
tokens appear to have gone toward exhaustively checking statement pairs for direct conflicts. Opus's
internal reasoning appears to have gone toward understanding the LOGICAL INTERACTION between rules.

GPT-5 missed the debounce reset paradox (likely because it requires multi-step logical
reasoning about rule interactions rather than statement comparison). Opus missed the
broker-unavailable timing conflict (likely because it's a more surface-level inconsistency
between prose and table that doesn't involve logical deduction).

### Sonnet's continued weakness:

Consistent with Finding #39: Sonnet reported 3 contradictions but only 1 was genuine (the
broker-unavailable one, same as GPT-5). The failure-modes misread shows Sonnet doesn't
reliably verify whether two statements ACTUALLY conflict: it pattern-matches on surface
similarity ("Automatic" and "manual only" appear to conflict) without reasoning about
whether they refer to the same thing. The debounce/calibration "contradiction" confuses
advisory guidance with specification (a type confusion that GPT-5 and Opus both avoided here).

## Key Insight: two distinct contradiction-finding modes

| Mode | Best model | What it catches | Cognitive demand |
|---|---|---|---|
| Specification conflicts | GPT-5 | Same scenario, different prescriptions | Statement comparison + verification |
| Logical impossibilities | Opus | Rules that can't coexist under all conditions | Multi-step logical deduction |

This explains why the open question ("does Opus match GPT-5?") has no clean yes/no answer.
They're not attempting the same thing. GPT-5 exhaustively compares statement pairs. Opus
reasons about what the stated rules IMPLY when combined. Both modes catch real bugs that
the other misses.

## Practical Implication

For self-contradiction detection in architecture documents:

- Run BOTH GPT-5 and Opus: they catch fundamentally different types of contradictions
- GPT-5 catches specification bugs (conflicting statements about the same thing)
- Opus catches logic bugs (rules whose interactions produce impossible conditions)
- Sonnet remains unreliable: too many false positives from surface-pattern matching
- The added cost of the Opus run is small (12s and 468 output tokens, vs 52s and 6,415 output tokens for GPT-5)
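
One way to combine the two runs is to merge their findings and check for overlap. The sketch below is hypothetical throughout: the `Finding` structure, the `key` labels, and `merge_findings` are illustrative, and it makes no model API calls. It only records the two findings reported in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    model: str
    kind: str      # "specification" or "logic", the two modes described above
    key: str       # hypothetical normalized label used to detect duplicates
    severity: str


def merge_findings(*runs: list[Finding]) -> dict[str, list[Finding]]:
    """Group findings from several model runs by key, so duplicates collapse."""
    merged: dict[str, list[Finding]] = {}
    for run in runs:
        for finding in run:
            merged.setdefault(finding.key, []).append(finding)
    return merged


gpt5_run = [Finding("GPT-5", "specification", "broker-unavailable-timing", "High")]
opus_run = [Finding("Claude Opus 4.6", "logic", "debounce-reset-paradox", "High")]

merged = merge_findings(gpt5_run, opus_run)
# Keys reported by more than one model; empty when the runs are complementary.
overlap = [key for key, found in merged.items() if len({f.model for f in found}) > 1]

print(f"{len(merged)} distinct contradictions, {len(overlap)} found by both models")
```

Grouping by a normalized key rather than by raw text matters because two models can describe the same contradiction through different evidence paths, as Sonnet and GPT-5 did for the broker-unavailable case.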

## Updated Answer to Open Question

> "Would Opus + narrow framing match GPT-5 for self-contradiction detection?"

**Wrong question.** Opus doesn't try to match GPT-5; it finds a different class of
contradiction. The right framing: Opus + GPT-5 together catch more than either alone,
and on this document the contradictions they found don't overlap. Run both.
+10
-3
@@ -22,11 +22,18 @@ cross-doc contradictions are easy to verify once spotted (reducing GPT-5's
verification advantage)? Or because boundary reasoning (Opus's strength)
is the primary skill needed?

### ~~Opus + narrow framing for contradiction detection (from Finding #39)~~ → ANSWERED (Finding #43)
~~Would Opus + narrow framing match GPT-5 for self-contradiction detection?
Finding #39 showed Sonnet can't do it even with narrow framing (reasoning
depth issue). Opus has strong cross-boundary reasoning: does its internal
reasoning depth suffice for the verification step that Sonnet lacks?~~

**WRONG QUESTION.** Opus doesn't try to match GPT-5; it finds a different CLASS
of contradiction. GPT-5 finds specification conflicts (same scenario, conflicting
prescriptions via statement comparison). Opus finds logical impossibilities (rules
whose interaction produces impossible conditions via deductive reasoning). Neither
dominates, and their findings don't overlap. Run both for complete coverage. Sonnet remains
unreliable (~33% precision on contradiction detection).

### ~~Sonnet + narrow framing = GPT-5 level? (from Finding #5)~~ → ANSWERED (Finding #39)
~~Would Sonnet catch semantic issues if given a narrower "check for logical