Rodin 98304604ac Finding 58: State machine completeness analysis on kill-switch.md
GPT-5 finds 16 gaps, Opus 11, Sonnet 9. GPT-5 excels at exhaustive
state space enumeration; Opus finds convention-vs-enforcement gaps;
Sonnet adequate but less thorough.

Key insight: state machine completeness is a GPT-5 sweet spot due to
reasoning tokens enabling systematic combinatorial coverage.
2026-05-09 15:06:32 -07:00

Model Findings — Analytical & Research Work

Tracking what actually works (and doesn't) when using AI models for research, analysis, bias detection, and document review — not coding.

Started: 2026-04-26

Context

We use multiple models in different roles: Claude Code (Opus/Sonnet) for generation, Sonnet and GPT-5 for independent dual review, and smaller models for focused analytical tasks. Most public discussion of model use is about coding; when we searched on 2026-04-26, we found almost no published methodology for using models in analytical research tasks. That gap is why we're tracking this.

Each experiment lives in its own file; see the individual finding files below.