Rodin ac55ecdb98 Finding 28: Regulatory compliance analysis on wash sale tracking
- GPT-5 most comprehensive on IRS-specific rules (18 findings, 9600 reasoning tokens)
- Sonnet fast first-pass (14 findings in 25s)
- Opus high-density actionable (11 findings with clear remediation)
- New insight: domain expertise tasks favor GPT-5 reasoning depth
- Updated model assignment for compliance review workflow
2026-05-11 00:29:12 -07:00

Model Findings — Analytical & Research Work

Tracking what actually works (and doesn't) when using AI models for research, analysis, bias detection, and document review — not coding.

Started: 2026-04-26

Context

We use multiple models in different roles: Claude Code (Opus/Sonnet) for generation, Sonnet + GPT-5 for independent dual review, and smaller models for focused analytical tasks. Most public discussion is about coding. We found almost no published methodology for using models in analytical research tasks (as of a search on 2026-04-26). That gap is why we're tracking this.
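As a rough illustration of the role split described above, here is a minimal Python sketch. It is not the project's actual tooling: the `ROLES` table, the function names, and the model labels are hypothetical placeholders, not real API model identifiers. It shows one way findings from two independent reviewers could be merged so that points raised by both stand out from points raised by only one.

```python
# Hypothetical role table. Keys and model labels are illustrative only,
# taken from the prose above; they are not real API model identifiers.
ROLES = {
    "generation": ["opus", "sonnet"],
    "dual_review": ["sonnet", "gpt-5"],
    "focused_analysis": ["small-model"],
}


def merge_independent_reviews(reviews):
    """Map each normalized finding to the set of reviewers that raised it.

    `reviews` is a dict of reviewer label -> list of finding strings.
    Normalization here is just strip + lowercase, which is an assumption;
    real findings would likely need fuzzier matching.
    """
    merged = {}
    for reviewer, findings in reviews.items():
        for text in findings:
            key = text.strip().lower()
            merged.setdefault(key, set()).add(reviewer)
    return merged


def agreed(merged, min_reviewers=2):
    """Return findings raised by at least `min_reviewers` reviewers, sorted."""
    return sorted(k for k, v in merged.items() if len(v) >= min_reviewers)
```

With made-up sample strings, `merge_independent_reviews({"sonnet": ["No lot tracking"], "gpt-5": ["no lot tracking "]})` groups both entries under one key, and `agreed(...)` then returns that single finding. Exact-match grouping is the simplest possible choice and would miss paraphrased findings.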

Each experiment lives in its own file; see the individual finding files below.