refactor(findings): split ALL-FINDINGS.md into per-experiment files

Break the monolithic 3249-line findings file into 29 individual files, one per experiment. Each file is named YYYY-MM-DD-NN-slug.md for easy chronological sorting and discovery. No content changes — purely structural reorganization.
2026-05-06 07:15:50 -07:00
parent 1b108ff66e
commit 6af8a6ee10
32 changed files with 3232 additions and 3254 deletions
@@ -0,0 +1,16 @@
+# Finding 1: Different models catch different things (confirmed)
+
+**Date:** 2026-04-26
+**Task:** PR reviews on DDD reference docs (~6,600 lines across 18 files)
+**How we used them:** Both models got the same task via the pr-review skill:
+fetch diff, fetch full file content for changed files, review against PR
+description and linked issue acceptance criteria. Rich context: full diff,
+project CLAUDE.md conventions, issue body. Each reviewer ran independently
+in its own sub-agent with its own Gitea token. No cross-pollination.
+
+- GPT-5 caught SUMMARY.md verdict mismatches (Commanded classification,
+  small teams classification) that Sonnet missed entirely (PR #375)
+- Sonnet caught a broken cross-reference link that GPT-5 missed (PR #378)
+- **Takeaway:** Different blind spots are real. Neither model is strictly better
+  for analytical review — they complement each other. This is why we run two
+  independent reviewers from different model families.