refactor(findings): split ALL-FINDINGS.md into per-experiment files
Break the monolithic 3249-line findings file into 29 individual files, one per experiment. Each file is named YYYY-MM-DD-NN-slug.md for easy chronological sorting and discovery. No content changes — purely structural reorganization.
@@ -0,0 +1,18 @@
# Finding 4: GPT-5 defaults to delegation; Claude defaults to doing the work

**Date:** 2026-04-26
**Task:** PR review delegation to sub-agents
**How we used them:** Both spawned as sub-agents from main session with
same task description, same pr-review skill file, same Gitea credentials.
Difference: GPT-5 got model override to gpt5, Sonnet used default model.
Both got full skill instructions.

- GPT-5 first attempt: spawned sub-sub-agents and timed out
- GPT-5 with "do it yourself, no sub-agents" + step-by-step: worked
- Even with constraints, GPT-5 sometimes dumps raw tool output instead of
  synthesizing; needs explicit output format instructions
- Claude (Sonnet/Opus) given the same kind of task does the work directly
- **Takeaway:** GPT interprets complex task descriptions as delegation
  opportunities. Claude interprets them as work to do. For GPT: explicit
  single-actor instructions + output format. For Claude: can give broader
  mandate. Same skill file, very different behavior.
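
One way the takeaway could be applied when spawning sub-agents is to wrap the shared task description per model. The sketch below is illustrative only: `ModelProfile`, `PROFILES`, `build_task_prompt`, and the wording of both guidance strings are hypothetical and do not come from the pr-review skill file. Only the `gpt5` override name is taken from the setup described above.

```python
from dataclasses import dataclass

# Hypothetical guidance text; the exact wording is an assumption.
SINGLE_ACTOR = (
    "Do this task yourself. Do not spawn sub-agents.\n"
    "Work through the steps in order."
)
OUTPUT_FORMAT = (
    "Output format: a short summary, then a bulleted list of findings. "
    "Do not paste raw tool output."
)


@dataclass(frozen=True)
class ModelProfile:
    """Which extra constraints a model needs around the shared task text."""
    single_actor: bool
    explicit_output_format: bool


# "gpt5" gets both constraints; the default model gets the task unchanged.
PROFILES = {
    "gpt5": ModelProfile(single_actor=True, explicit_output_format=True),
    "default": ModelProfile(single_actor=False, explicit_output_format=False),
}


def build_task_prompt(task: str, model: str = "default") -> str:
    """Return the task text wrapped with the constraints for `model`."""
    # Unknown model names fall back to the unconstrained default profile.
    profile = PROFILES.get(model, PROFILES["default"])
    parts = []
    if profile.single_actor:
        parts.append(SINGLE_ACTOR)
    parts.append(task)
    if profile.explicit_output_format:
        parts.append(OUTPUT_FORMAT)
    return "\n\n".join(parts)
```

Calling `build_task_prompt(task, "gpt5")` puts the single-actor instruction before the task and the output format after it, while the default profile passes the task through untouched, so the skill file itself can stay identical for both models.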