Rodin 2b10595bff Finding #68: Cross-context contract coherence analysis
GPT-5 outperforms Sonnet on cross-context integration analysis:
- GPT-5: 10 findings (4 Critical) in 191s with 7,744 reasoning tokens
- Sonnet: 7 findings (1 Critical) in 23s

Key insight: Cross-context contract verification benefits from extended
reasoning (in contrast to Finding #67, where Sonnet was better at
inter-doc contradictions). Flow tracing and subscription gap detection
require the systematic verification that GPT-5's exhaustive style
excels at.

Discovered actual spec gaps in the gargoyle domain model (FillReceived
missing fields, no liquidation instruction event, Risk not subscribing
to LotOpened for PDT, etc.).
2026-05-10 21:47:27 -07:00

Model Findings — Analytical & Research Work

Tracking what actually works (and doesn't) when using AI models for research, analysis, bias detection, and document review — not coding.

Started: 2026-04-26

Context

We use multiple models in different roles: Claude Code (Opus/Sonnet) for generation, Sonnet + GPT-5 for independent dual review, smaller models for focused analytical tasks. Most public discussion is about coding. We found almost no published methodology for using models in analytical research tasks (searched 2026-04-26). That gap is why we're tracking this.

Each experiment lives in its own file. See individual finding files below.