refactor: skill v2 — thinking frameworks, NEVER list, fallbacks

Evaluated via skill-judge: 106/120 (B grade). - Thinking framework before analysis - Prioritization criteria (what's worth digging into) - 10-item NEVER list from 5-repo experience - Commands externalized to references/commands.md - Fallbacks for edge cases (no PRs, private repos, young repos) - Each phase teaches what matters, not how to grep
2026-04-30 11:56:54 -07:00
parent 38a9d66072
commit 946119a3cc
2 changed files with 300 additions and 121 deletions
@@ -16,178 +16,235 @@ description: >-

 # Codebase Analysis

-Extract architectural conventions from open source repositories through
-a structured multi-phase process. Output goes to Gitea convention repos.
+Extract architectural conventions from open source repositories.
+Output: Gitea repos under `rodin/<project>-conventions`.

-## Naming Convention
+## Naming

- **Language patterns** (`go-patterns`, `elixir-patterns`): stdlib/language
-  idioms verified from source
- **Project conventions** (`<project>-conventions`): how a specific codebase
-  does things — e.g., `temporal-conventions`, `cockroachdb-conventions`
+- `*-patterns` = language-level (go-patterns, elixir-patterns)
+- `*-conventions` = project-specific (temporal-conventions, etc.)
+
+## Thinking Framework
+
+Before starting any analysis, ask:
+
+1. **What is this project's essence?** A trading system is a state
+   machine where the state is money. A workflow engine is a tree of
+   state machines. Name the essence — the patterns follow from it.
+2. **What forces shaped it?** Team size, age, performance constraints,
+   backward compatibility obligations. These predict WHERE conventions
+   will be strict vs relaxed.
+3. **What would surprise me?** The interesting findings are never "they
+   use interfaces" — it's "they have 566 dynamic config settings" or
+   "zero TODOs in 3.8M of code." Surprise = insight.
+
+## Prioritization: What to Dig Into
+
+Not everything is interesting. Focus on patterns that:
+
+- **Appear >50 times** — this is a conscious convention, not a one-off
+- **Have a dedicated package** — someone thought it was important enough
+  to abstract
+- **Other projects solve differently** — reveals a real design tradeoff
+- **Have a surprising name** — indicates the team had to invent
+  vocabulary for a novel concept
+- **Were introduced recently with many PR comments** — active design
+  decisions with recorded rationale
+
+Skip patterns that are:
+- Standard library usage (unless the project wraps/extends it)
+- Single-use internal helpers
+- Generated code
+- Exact copies of well-known open-source patterns without modification

 ## Phases

-Execute in order. Each phase builds on the previous.
+### Phase 1: Shape (5 min)

-### Phase 1: Clone & Shape
+Clone on forge (`~/src/analysis/<name>`). Full clone — never shallow.

-Clone the repo on forge (`~/src/analysis/<name>`). Measure:
+Measure: size, files, commits, contributors, top-level dirs.

-```
- Total size, file count, commit count, contributor count
- Top-level directory structure
- Language breakdown (if polyglot)
-```
+**What matters here:** The ratio of test files to production files.
+The presence/absence of `internal/` vs flat structure. Whether there's
+a single `pkg/` or many top-level packages. These reveal organizational
+philosophy before you read a single line.

-### Phase 2: Import Hierarchy
+### Phase 2: What the Codebase Values (10 min)

-Find the most-depended-upon packages. This reveals what the codebase
-considers foundational:
+Find the most-imported internal packages. The top 5 are the
+project's definition of "foundational."

-```bash
-# Go: grep imports, extract internal packages, count
-grep -rh '"<module>/' --include="*.go" | sed ... | sort | uniq -c | sort -rn
+**Ask:** Why these? What do they share? Usually: logging, errors,
+config, and one domain-specific abstraction that IS the project.
+That domain-specific one is where the real conventions live.

-# Elixir: grep aliases/imports
-grep -rh "alias\|import\|use" --include="*.ex" | ...
-```
+See `references/commands.md` for grep patterns by language.

-The top 5-10 packages are where architectural decisions live.
+### Phase 3: Interface Contracts (10 min)

-### Phase 3: Interface Contracts
+Find interfaces/behaviours/protocols — but don't list them all.

-Find the key abstractions:
+**Focus on:** Interfaces with >3 implementations (these are real
+extension points). Interfaces in constructor signatures (these are
+dependency injection boundaries). Interfaces that appear in BOTH
+production and test code (these are the testability seams).

-```bash
-# Go
-grep -rn "type.*interface {" --include="*.go" | grep -v test | grep -v mock
+**Skip:** One-method interfaces (usually just for mocking). Interfaces
+only used in one place (not yet conventions).

-# Elixir
-grep -rn "@callback\|@behaviour\|defprotocol" --include="*.ex"
-```
+### Phase 4: Quality Fingerprint (5 min)

-### Phase 4: Error Handling & Quality Markers
+Measure: TODO count, FIXME count, HACK count, test count, mock count.

-```bash
-# Error style
-grep -rh "fmt.Errorf.*%w" | wc -l        # wrapping
-grep -rh "errors.New\|errors.Wrap" | wc -l # sentinel/pkg
+**What to notice:**
+- TODO format reveals discipline: `TODO(owner):` = accountability,
+  `TODO:` = aspirational, version-gated = systematic cleanup
+- Zero TODOs in a large codebase means active cleanup culture
+- High mock count relative to test count suggests heavy DI
+- HACK count > 0 is honest; HACK count = 0 in a large project is
+  suspicious (they probably use different words)

-# Quality markers
-grep -rn "TODO\|FIXME\|HACK" --include="*.go" | grep -v test | wc -l
-```
+### Phase 5: Unique Patterns (15 min)

-Note TODO format — does the project use `TODO(owner):`? Plain `TODO:`?
-Version-gated TODOs?
+Look for infrastructure NOT in stdlib. Categories:

-### Phase 5: Unique Patterns
+- **Concurrency:** goroutine handles, schedulers, shutdown primitives
+- **Testing:** custom assertions, fake registries, golden file systems
+- **Configuration:** dynamic config, feature flags, runtime toggles
+- **Error handling:** custom error types, assertion systems, panic
+  recovery patterns
+- **Extension:** plugin registration, hook systems, middleware chains

-Look for project-specific infrastructure that's NOT in stdlib:
- Custom concurrency primitives (handles, schedulers, pools)
- Testing infrastructure (custom assertions, harnesses)
- Configuration systems (dynamic config, feature flags)
- Plugin/extension systems
+**The test for uniqueness:** Would you be surprised to find this in
+another project of similar size? If yes → convention worth documenting.
+If no → standard practice, skip.

-### Phase 6: Git Archaeology (requires full clone)
+### Phase 6: Git Archaeology (20 min)

-This is where the real insight lives. For each interesting pattern:
+For each unique pattern found in Phase 5:

-```bash
-# When was it introduced?
-git log --all --oneline --diff-filter=A -- path/to/pattern
+1. Find the commit that introduced it (`git log --diff-filter=A`)
+2. Read the commit message — the "why" is usually there
+3. Check if it replaced something (`git log -S "old_name"`)
+4. Note the date and author — context for why shortcuts were taken

-# Who wrote it and why?
-git log --format="%an%n%ad%n%s%n%b" <commit> -1
+**The insight is always WHY, not WHAT.** A bare goroutine with a
+TODO is uninteresting as a listing. A bare goroutine introduced during
+a complex 20-file admission control feature, tagged by the author in
+the same commit, that survived 3 years because nobody touched the
+function — that's a lesson about how real codebases evolve.

-# What did it replace?
-git log -p -S "old_pattern_name" -- relevant/path
-```
+See `references/commands.md` for git archaeology patterns.

-### Phase 7: PR Discussions
+**If the repo is on a forge without PR history** (self-hosted, mailing
+list-based): Fall back to commit messages and CHANGELOG. The commit
+body IS the PR description for these projects. Look for "Reviewed-by"
+trailers and linked issues.

-Find the PR where a key pattern was introduced:
+### Phase 7: PR Discussions (20 min)

-```bash
-gh api "search/issues?q=repo:<org>/<repo>+<search>+type:pr" \
-  --jq '.items[] | {number, title}'
-```
-
-Then read:
- PR body (the "why" from the author)
+Find PRs where key patterns were introduced. Read:
+- The PR body (author's motivation)
 - Review comments (the debate)
- Issue comments (the resolution)
+- The resolution

-Key questions to answer:
-1. What problem motivated this pattern?
-2. What alternatives were considered?
-3. What did reviewers push back on?
-4. How was it migrated in (big-bang vs gradual)?
+**What to extract from discussions:**
+- What the author was defending (= where the real insight is)
+- What reviewers pushed back on (= non-obvious tradeoffs)
+- Whether it was "merge and iterate" vs "perfect before merge"
+- Whether external validation was cited (benchmarks, user feedback)
+- The migration strategy (big-bang vs gradual coexistence)

-### Phase 8: Synthesis & Output
+**The highest-value finding:** When a reviewer says "I wish we'd done
+X instead" and the author explains why X doesn't work. That tradeoff
+reasoning is pure expert knowledge.

-Produce two artifacts:
+### Phase 8: Synthesis

-**1. `analysis.md`** — Full architectural analysis:
- Repository shape and import hierarchy
- Key patterns with code examples
- PR discussion summaries
- Cross-ecosystem comparisons
- Code quality metrics
+Produce two files. Push to Gitea.

-**2. `conventions.md`** — Extracted patterns with:
- Pattern name
- Location in source
- Code example
- When to use / when NOT to use
- Origin story (from PR discussion if available)
+**`analysis.md`** — the full story:
+1. Repo shape and organizational philosophy
+2. Import hierarchy (what it values)
+3. Key patterns with code examples + origin stories
+4. PR discussion excerpts (attributed quotes)
+5. Cross-ecosystem comparisons (prior art, independent invention)
+6. Quality metrics in context (not bare numbers)

-## Output Repos
+**`conventions.md`** — the reference:
+For each pattern:
+- Name and location in source
+- Code example (real, not simplified)
+- When to use / When NOT to use
+- Origin (commit date, author, PR# if available)

-Push to Gitea under `rodin/<project>-conventions`:
-
-```bash
-GITEA_TOKEN=$(cat ~/.openclaw/credentials/gitea-rodin)
-# Create repo if needed:
-curl -s -X POST "https://gitea.weiker.me/api/v1/user/repos" \
-  -H "Authorization: token $GITEA_TOKEN" \
-  -H "Content-Type: application/json" \
-  -d '{"name": "<project>-conventions", "auto_init": true, "default_branch": "master", "license": "MIT"}'
-```
-
-## Quality Criteria
-
-Before pushing, verify:
- Each pattern has a real code example from the repo
- "When to use / when NOT to use" sections exist
- PR discussion quotes are attributed
- Cross-ecosystem comparisons note prior art
- Files end with `<!-- PATTERN_COMPLETE -->` sentinel
+End both files with `<!-- PATTERN_COMPLETE -->` sentinel.

 ## Cross-Ecosystem Observations

-Always note when a pattern exists in multiple repos. Example: Temporal's
-`goro.Handle` (2021) predates CockroachDB's `stop.Handle` (2025) —
-same solution, invented independently. These connections are the most
-valuable findings.
+Always note when a pattern exists in multiple repos. These
+independent inventions reveal forces that transcend project context:
+- Temporal goro.Handle (2021) ↔ CockroachDB stop.Handle (2025)
+- Ecto zero TODOs (version-gated) ↔ Oban zero TODOs (2-week cleanup)
+- Prometheus init() plugins ↔ Temporal init() plugins

 ## The 4 Categories of Pattern Breaks

-When you find impurity (pattern violations), classify them:
+When you find convention violations, classify:

-1. **Ship behavior, fix plumbing later** — time pressure, author knows it's
-   wrong (tagged with TODO)
-2. **Better tooling exposed the limitation** — old pattern worked but was
-   invisible to profiling/tracing
-3. **Removal cost > carrying cost** — technically debt but zero interest rate
+1. **Ship behavior, fix plumbing later** — tagged with TODO same commit
+2. **Better tooling exposed limitation** — observability, not correctness
+3. **Removal cost > carrying cost** — zero-interest debt
 4. **Context needs different pattern** — not actually a break

-See `references/pattern-breaks.md` for detailed examples.
+See `references/pattern-breaks.md` for real examples with git history.
+
+## NEVER
+
+- **NEVER analyze with a shallow clone** and assume full picture —
+  archaeology requires full history
+- **NEVER present patterns from one file as repo-wide conventions** —
+  verify frequency across the codebase first
+- **NEVER skip PR discussions** — code without context is just syntax;
+  the discussion IS the insight
+- **NEVER report bare numbers** ("738 TODOs") — always contextualize
+  (per 1000 files, vs comparable projects, trending up/down)
+- **NEVER confuse "the maintainer likes X" with "X is the right
+  pattern"** — solo-maintained projects reflect one person's taste;
+  team projects reflect negotiated conventions
+- **NEVER present a pattern as "unique" without checking** if stdlib
+  has it or if it's a well-known library pattern
+- **NEVER list patterns without when-NOT-to-use** — that's where the
+  expertise actually lives
+- **NEVER quote PR discussions without attribution** — who said it
+  matters (maintainer vs drive-by contributor)
+- **NEVER analyze repos <1000 commits** — not enough history for
+  meaningful archaeology
+- **NEVER conflate language patterns with project conventions** — `go-
+  patterns` is stdlib idiom; `temporal-conventions` is project choice
+
+## Output Repos
+
+Push to Gitea under `rodin/<project>-conventions`. See
+`references/commands.md` for repo creation and push commands.
+
+## Fallbacks
+
+- **No PR discussions?** Use commit messages as primary source.
+  Many projects (Linux, PostgreSQL) do all review in commit messages
+  and mailing lists.
+- **Repo too large to clone fully?** Clone shallow first, do Phase
+  1-5, then `git fetch --unshallow` only if Phase 6-7 are needed.
+- **Private repo / no GitHub API?** Skip Phase 7. Phase 6 (local git
+  history) still works.
+- **<3000 commits?** Reduce Phase 6-7 expectations. Younger projects
+  have less archaeology to mine — focus on Phase 5 (unique patterns)
+  and the project's README/docs for rationale.

 ## Execution Notes

- Clone on **forge** (`host="node", node="forge"`) — has disk space
- Use full clones (not `--depth 1`) for archaeology
+- Clone on **forge** (`host="node", node="forge"`)
 - `gh api` for GitHub PR lookups (authenticated on all nodes)
 - One repo at a time for focused analysis
 - Markdownlint all output before pushing