Rodin 6af8a6ee10 refactor(findings): split ALL-FINDINGS.md into per-experiment files

Break the monolithic 3249-line findings file into 29 individual files,
one per experiment. Each file is named YYYY-MM-DD-NN-slug.md for easy
chronological sorting and discovery.

No content changes — purely structural reorganization.

2026-05-06 07:15:50 -07:00

Finding 3: GPT-5 times out on complex multi-step analytical tasks (confirmed pattern)

Date: 2026-04-26

Task: Full PR review of #382 (research document rewrite)

How we used it: pr-review skill, multi-phase (fetch diff, fetch files, check CI, analyze against AC, post inline comments, post summary). 7 phases, many curl calls to Gitea API, large diff context. Heavy tool-use workflow through SAP proxy (adds latency vs direct API). 300s timeout.

Timed out 3 times at 300s (17, 6, 6 tool calls respectively)
Bottleneck was model processing time, not network (~0.3s Gitea API latency)
Takeaway: Break analytical tasks into focused, bounded pieces. Twelve small deep reviews > one rushed big one. The issue isn't GPT-5's analysis quality; it's that multi-phase, tool-heavy workflows burn too much of the time budget on mechanics. Separate the data gathering from the analysis.
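
One way to picture the takeaway is the sketch below. It is not the pr-review skill's actual implementation: `ReviewChunk`, `split_diff_by_file`, `review_in_pieces`, and the `analyze` callback are hypothetical names introduced for illustration. It assumes the diff has already been fetched by plain code (no model involved), splits it into one chunk per file, and hands each chunk to a separate, small analysis call while tracking an overall time budget.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ReviewChunk:
    """One bounded unit of review work (hypothetical structure)."""
    path: str
    diff: str


def split_diff_by_file(diff_text: str) -> List[ReviewChunk]:
    """Split a unified git diff into one chunk per file.

    Data gathering stays mechanical: this is plain string handling.
    Paths containing the literal " b/" would need more careful parsing.
    """
    chunks: List[ReviewChunk] = []
    path = None
    lines: List[str] = []
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            if path is not None:
                chunks.append(ReviewChunk(path, "\n".join(lines)))
            # Header looks like "diff --git a/<path> b/<path>"; keep the b/ side.
            path = line.split(" b/", 1)[-1]
            lines = [line]
        elif path is not None:
            lines.append(line)
    if path is not None:
        chunks.append(ReviewChunk(path, "\n".join(lines)))
    return chunks


def review_in_pieces(
    chunks: List[ReviewChunk],
    analyze: Callable[[ReviewChunk], str],
    budget_s: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Dict[str, str], List[str]]:
    """Run one small analysis call per chunk within an overall budget.

    `analyze` stands in for a single focused model call (assumption).
    Chunks that would start after the budget is spent are reported as
    skipped, so partial results survive instead of the whole run timing out.
    """
    start = clock()
    results: Dict[str, str] = {}
    skipped: List[str] = []
    for chunk in chunks:
        if clock() - start >= budget_s:
            skipped.append(chunk.path)
            continue
        results[chunk.path] = analyze(chunk)
    return results, skipped
```

The budget check only prevents new pieces from starting. It does not interrupt a call already in flight, so a per-call timeout would still have to come from whatever client makes the model request.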
