model-research

Author	SHA1	Message	Date
Rodin	a3aebc7cc1	docs(readme): add Reports section with links to REPORT.md and LESSONS.md. Explains what each file contains, that they're auto-regenerated weekly, and includes generation timestamps.	2026-05-06 07:29:03 -07:00
Rodin	6af8a6ee10	refactor(findings): split ALL-FINDINGS.md into per-experiment files. Break the monolithic 3249-line findings file into 29 individual files, one per experiment. Each file is named YYYY-MM-DD-NN-slug.md for easy chronological sorting and discovery. No content changes; purely structural reorganization.	2026-05-06 07:15:50 -07:00
Rodin	1b108ff66e	Initial publish: 29 findings, 6 prompts, methodology, open questions. Full comparative analysis of GPT-5, Claude Opus 4.6, Claude Sonnet 4.6, GPT-4.1, and GPT-4.1 Mini on analytical tasks (not coding). Contents: findings/ALL-FINDINGS.md (complete 3,249-line research log with all 29 findings, methodology notes, and open questions); prompts/ (6 exact prompts used across experiments); methodology.md (experimental setup and evaluation criteria); open-questions.md (unanswered questions for future work); README.md (overview and summary table). Key findings: Cross-document consistency: Opus is 2.4x faster with more findings; Gap-finding: GPT-5 reasoning tokens find domain-specific gaps; Race conditions: Opus excels at temporal interaction reasoning; Bias detection: Signal-to-noise ratio > model capability; Adversarial analysis: GPT-5 exhaustive, Opus qualitatively different. Signed-off-by: Rodin	2026-05-05 19:13:03 -07:00
rodin	4aea0d004b	Initial commit	2026-05-06 02:10:14 +00:00

4 Commits