Rodin ce4801e8a3 Add Finding #62: Boundary contract analysis (new analytical lens)
Tested on signal-lifecycle.md (111 lines). Results:
- GPT-5: 17 gaps (7,744 reasoning tokens)
- Opus: 11 gaps (design-level focus)
- Sonnet: 8 gaps (fastest, protocol-level)

Key insight: Union of all models (~26 gaps) far exceeds any single
model (max 17). Only 5 gaps found by all three — highly differentiated
outputs make multi-model runs valuable for interface documents.
2026-05-09 23:35:36 -07:00
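
The union-versus-intersection comparison in the commit message above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling behind Finding #62: the function name `overlap_summary`, the model labels, and the gap IDs (`G1`, `G2`, ...) are hypothetical placeholders, and it assumes each model's gaps have already been deduplicated by hand into shared identifiers.

```python
from collections import Counter


def overlap_summary(gaps_by_model):
    """Summarize how gap sets reported by several models overlap.

    gaps_by_model maps a model label to an iterable of gap IDs.
    Matching equivalent gaps across models is assumed to be done already.
    """
    sets = {model: set(gaps) for model, gaps in gaps_by_model.items()}
    union = set().union(*sets.values())
    shared_by_all = set.intersection(*sets.values()) if sets else set()
    # Count how many models reported each gap, to isolate single-model finds.
    seen = Counter(gap for gaps in sets.values() for gap in gaps)
    unique = {
        model: {gap for gap in gaps if seen[gap] == 1}
        for model, gaps in sets.items()
    }
    best_single = max((len(gaps) for gaps in sets.values()), default=0)
    return {
        "union": union,
        "shared_by_all": shared_by_all,
        "unique": unique,
        "best_single": best_single,
    }


# Placeholder data only: these are NOT the gaps from signal-lifecycle.md.
example = {
    "model_a": {"G1", "G2", "G3", "G4"},
    "model_b": {"G1", "G2", "G5"},
    "model_c": {"G1", "G6"},
}
summary = overlap_summary(example)
print(len(summary["union"]), summary["best_single"], sorted(summary["shared_by_all"]))
```

A large difference between `len(summary["union"])` and `summary["best_single"]`, together with a small `shared_by_all`, is the pattern the commit message describes as highly differentiated outputs.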

Model Findings — Analytical & Research Work

Tracking what actually works (and doesn't) when using AI models for research, analysis, bias detection, and document review — not coding.

Started: 2026-04-26

Context

We use multiple models in different roles: Claude Code (Opus/Sonnet) for generation, Sonnet + GPT-5 for independent dual review, and smaller models for focused analytical tasks. Most public discussion of AI models is about coding. A search on 2026-04-26 turned up almost no published methodology for using models in analytical research tasks. That gap is why we're tracking this.

Each experiment lives in its own finding file, listed below.