claw b5b5b64a40 finding #46: operational blind spot analysis — new task type
Novel experiment testing 'what's invisible to operators' on gargoyle's
observability.md (563 lines). GPT-5 (18 findings), Opus (12), Sonnet (10).

Key discovery: 'actively misleads' category (observability creating false
confidence) is highest-value and Opus-dominated. Distinct from assumption-
finding, race conditions, or gap analysis — requires reasoning about
negation (what ISN'T instrumented vs what production needs).
2026-05-08 00:27:23 -07:00

Model Findings — Analytical & Research Work

Tracking what actually works (and doesn't) when using AI models for research, analysis, bias detection, and document review — not coding.

Started: 2026-04-26

Context

We use multiple models in different roles: Claude Code (Opus/Sonnet) for generation, Sonnet + GPT-5 for independent dual review, and smaller models for focused analytical tasks. Most public discussion of model use focuses on coding. When we searched on 2026-04-26, we found almost no published methodology for using models in analytical research tasks. That gap is why we're tracking this.

Each experiment lives in its own file; see the individual finding files below.