Rodin bd9790caa1 feat: repeatable mechanical method for patterns mode
5 steps: Quantify → Extract one → Decision tree → Cross-refs → Hyperlinks.
Delegation strategy (per-entry, not per-file).
Discovery greps for Go, Elixir, Rust, Python.
Hyperlink scripts per language.
2026-04-30 14:46:41 -07:00

---
name: codebase-analysis
description: >-
Analyze open source repositories to extract conventions or patterns.
Two modes: "conventions" (how a project works architecturally) and
"patterns" (how to write idiomatic code in that language/ecosystem).
Use when asked to "analyze a repo", "extract patterns from", "what
conventions does X use", "how should I write X", "what's idiomatic",
"add X to the analysis repos", or "how does X do Y architecturally".
Do NOT use for: code review of specific PRs (use pr-review), security
audits (use vuln-scout), or reading a single file for a quick answer.
---
# Codebase Analysis
Extract conventions or idiomatic patterns from open source repos.
## Mode
Set `MODE` when invoking (or infer from request):
| Mode | Question | Output | Repo suffix |
|------|----------|--------|-------------|
| `conventions` | "How does this project work?" | Architecture, governance, unique infra | `*-conventions` |
| `patterns` | "How should I write code like this?" | Prescriptive rules for users | `*-patterns` |
**Default:** `conventions` unless the request says "idiomatic",
"how to write", "style guide", or "patterns for users".
**Both modes share Phases 1-7.** They diverge at Phase 8 (synthesis).
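The default rule above can be sketched as a small shell function. This is only an illustration, not part of the skill: `infer_mode` is a hypothetical helper, and it matches just the four trigger phrases named in the default rule.

```bash
# infer_mode is a hypothetical helper; $1 is the free-text request.
infer_mode() {
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *idiomatic*|*"how to write"*|*"style guide"*|*"patterns for users"*)
      echo patterns ;;
    *)
      # Anything else falls back to the documented default.
      echo conventions ;;
  esac
}
```

An explicit `MODE` should still take precedence over anything inferred this way.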
## Configuration
Set these in your workspace context (TOOLS.md, AGENTS.md, or pass
explicitly when invoking the skill):
| Parameter | Description | Example |
|-----------|-------------|----------|
| `CLONE_DIR` | Directory to clone repos into | `~/src/analysis/` |
| `CLONE_HOST` | Machine with disk + git for cloning | `forge`, `localhost` |
| `GIT_REMOTE` | Where convention repos are pushed | `https://git.example.com` |
| `GIT_ORG` | Org/user for convention repos | `myorg`, `username` |
| `GIT_TOKEN_PATH` | Path to auth token for pushing | `~/.credentials/git-token` |
**Minimum required:** `CLONE_DIR` and `GIT_REMOTE`. If others are
omitted:
- `CLONE_HOST` defaults to localhost (current machine)
- `GIT_ORG` defaults to the authenticated user
- `GIT_TOKEN_PATH` uses default git credential helper
**Example in TOOLS.md:**
```markdown
## Codebase Analysis
- CLONE_DIR: ~/src/analysis/
- CLONE_HOST: my-dev-server (ssh user@host)
- GIT_REMOTE: https://git.example.com
- GIT_ORG: my-patterns
- GIT_TOKEN_PATH: ~/.credentials/git-token
```
If not explicitly provided, infer from workspace context (TOOLS.md,
shell environment, or git remote configuration).
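As a rough sketch, the required/optional split above could be enforced in a wrapper script like this. `resolve_config` is a hypothetical helper, and it assumes the parameters arrive as shell variables rather than being read from TOOLS.md.

```bash
# resolve_config is a hypothetical helper, not part of the skill.
resolve_config() {
  # Required parameters: abort with a message if unset or empty.
  : "${CLONE_DIR:?CLONE_DIR is required}"
  : "${GIT_REMOTE:?GIT_REMOTE is required}"
  # Documented default: clone on the current machine.
  : "${CLONE_HOST:=localhost}"
  # GIT_ORG and GIT_TOKEN_PATH stay unset on purpose: their defaults
  # (authenticated user, git credential helper) are resolved by the
  # forge and by git, not by this script.
  echo "clone: ${CLONE_HOST}:${CLONE_DIR} push: ${GIT_REMOTE}/${GIT_ORG:-<authenticated user>}"
}
```

In a non-interactive shell, the `${VAR:?message}` form exits immediately when a required parameter is missing, so a misconfigured run fails before any cloning starts.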
## Naming
- `*-patterns` = prescriptive (how users should write code)
- `*-conventions` = descriptive (how a specific codebase works)
A language can have both: `go-patterns` (write Go like this) AND
`golang-conventions` (how the Go team builds Go itself).
## Thinking Framework
Before starting any analysis, ask:
1. **What is this project's essence?** A trading system is a state
machine where the state is money. A workflow engine is a tree of
state machines. Name the essence — the patterns follow from it.
2. **What forces shaped it?** Team size, age, performance constraints,
backward compatibility obligations. These predict WHERE conventions
will be strict vs relaxed.
3. **What would surprise me?** The interesting finding is never "they
use interfaces"; it is "they have 566 dynamic config settings" or
"zero TODOs in 3.8M of code." Surprise = insight.
## Prioritization: What to Dig Into
Not everything is interesting. Focus on patterns that:
- **Appear >50 times** — this is a conscious convention, not a one-off
- **Have a dedicated package** — someone thought it was important enough
to abstract
- **Other projects solve differently** — reveals a real design tradeoff
- **Have a surprising name** — indicates the team had to invent
vocabulary for a novel concept
- **Were introduced recently with many PR comments** — active design
decisions with recorded rationale
Skip patterns that are:
- Standard library usage (unless the project wraps/extends it)
- Single-use internal helpers
- Generated code
- Exact copies of well-known open-source patterns without modification
## Phases
### Phase 1: Shape (5 min)
Clone to `CLONE_DIR/<name>` on `CLONE_HOST`. Full clone — never shallow.
Measure: size, files, commits, contributors, top-level dirs.
**What matters here:** The ratio of test files to production files.
The presence/absence of `internal/` vs flat structure. Whether there's
a single `pkg/` or many top-level packages. These reveal organizational
philosophy before you read a single line.
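The measurements above could be collected with something like the following sketch. `repo_shape` is a hypothetical helper, and the `_test.go` suffix assumes a Go repo; other languages need a different test-file pattern.

```bash
# repo_shape is a hypothetical helper; $1 is the path to a full clone.
# The *_test.go suffix assumes a Go repo; swap it for other languages.
repo_shape() {
  repo=$1
  all_files=$(git -C "$repo" ls-files)
  test_files=$(printf '%s\n' "$all_files" | grep -c '_test\.go$')
  go_files=$(printf '%s\n' "$all_files" | grep -c '\.go$')
  echo "size_kb: $(du -sk "$repo" | cut -f1)"
  echo "files: $(printf '%s\n' "$all_files" | grep -c .)"
  echo "commits: $(git -C "$repo" rev-list --count HEAD)"
  # Passing HEAD keeps shortlog from waiting on stdin in scripts.
  echo "contributors: $(git -C "$repo" shortlog -sn HEAD | grep -c .)"
  echo "top_dirs: $(git -C "$repo" ls-tree -d --name-only HEAD | tr '\n' ' ')"
  echo "test_files: $test_files"
  echo "prod_files: $((go_files - test_files))"
}
```

The `test_files` and `prod_files` lines give the test-to-production ratio directly. The contributor count groups by author name, so one person committing under two names is counted twice.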
### Phase 2: What the Codebase Values (10 min)
Find the most-imported internal packages. The top 5 are the
project's definition of "foundational."
**Ask:** Why these? What do they share? Usually: logging, errors,
config, and one domain-specific abstraction that IS the project.
That domain-specific one is where the real conventions live.
See `references/commands.md` for grep patterns by language.
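For a Go repo, a minimal version of this count might look like the sketch below. `top_internal_imports` is a hypothetical helper and the module path is an assumption you supply; the patterns in `references/commands.md` may differ.

```bash
# top_internal_imports is a hypothetical helper.
# $1 = repo path, $2 = the repo's own Go module path (an assumption
# you supply, e.g. the module line from go.mod).
top_internal_imports() {
  repo=$1
  module=$2
  # -h: no file names, -o: print only the quoted import path.
  grep -rhoE --include='*.go' "\"$module/[^\"]+\"" "$repo" \
    | sort | uniq -c | sort -rn | head -n 5
}
```

This counts every quoted occurrence of the module path, so string literals and comments that happen to contain it are included. Treat the numbers as a ranking, not an exact import graph.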
### Phase 3: Interface Contracts (10 min)
Find interfaces/behaviours/protocols — but don't list them all.
**Focus on:** Interfaces with >3 implementations (these are real
extension points). Interfaces in constructor signatures (these are
dependency injection boundaries). Interfaces that appear in BOTH
production and test code (these are the testability seams).
**Skip:** One-method interfaces (usually just for mocking). Interfaces
only used in one place (not yet conventions).
### Phase 4: Quality Fingerprint (5 min)
Measure: TODO count, FIXME count, HACK count, test count, mock count.
**What to notice:**
- TODO format reveals discipline: `TODO(owner):` = accountability,
`TODO:` = aspirational, version-gated = systematic cleanup
- Zero TODOs in a large codebase means active cleanup culture
- High mock count relative to test count suggests heavy DI
- HACK count > 0 is honest; HACK count = 0 in a large project is
suspicious (they probably use different words)
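One way to gather the marker counts, sketched under the assumption that the repo is a git checkout. `quality_fingerprint` is a hypothetical helper; it also reports each count per 1000 tracked files so the numbers come with context, and it leaves out test and mock counts because those are language-specific.

```bash
# quality_fingerprint is a hypothetical helper; $1 is the repo path.
quality_fingerprint() {
  repo=$1
  files=$(git -C "$repo" ls-files | grep -c .)
  for marker in TODO FIXME HACK; do
    # -w: whole word, -I: skip binary files; one count per matching line.
    count=$(git -C "$repo" grep -wI -e "$marker" | grep -c .)
    rate=$(awk -v c="$count" -v f="$files" \
      'BEGIN { printf "%.1f", (f > 0 ? c * 1000 / f : 0) }')
    echo "$marker $count ($rate per 1000 files)"
  done
  # Owner-tagged TODOs such as TODO(alice): signal accountability.
  owned=$(git -C "$repo" grep -IE -e 'TODO\([^)]+\):' | grep -c .)
  echo "TODO(owner) $owned"
}
```

Comparing the `TODO(owner)` line with the plain `TODO` line shows at a glance which TODO format dominates.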
### Phase 5: Unique Patterns (15 min)
Look for infrastructure NOT in stdlib. Categories:
- **Concurrency:** goroutine handles, schedulers, shutdown primitives
- **Testing:** custom assertions, fake registries, golden file systems
- **Configuration:** dynamic config, feature flags, runtime toggles
- **Error handling:** custom error types, assertion systems, panic
recovery patterns
- **Extension:** plugin registration, hook systems, middleware chains
**The test for uniqueness:** Would you be surprised to find this in
another project of similar size? If yes → convention worth documenting.
If no → standard practice, skip.
### Phase 6: Git Archaeology (20 min)
For each unique pattern found in Phase 5:
1. Find the commit that introduced it (`git log --diff-filter=A -- <path>`)
2. Read the commit message — the "why" is usually there
3. Check if it replaced something (`git log -S "old_name"`)
4. Note the date and author — context for why shortcuts were taken
**The insight is always WHY, not WHAT.** A bare goroutine with a
TODO is uninteresting as a listing. A bare goroutine introduced during
a complex 20-file admission control feature, tagged by the author in
the same commit, that survived 3 years because nobody touched the
function — that's a lesson about how real codebases evolve.
See `references/commands.md` for git archaeology patterns.
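Steps 1 and 3 above can be wrapped in two small functions. Both names are hypothetical, and the sketch assumes the pattern lives in a known file or under a known identifier; the patterns in `references/commands.md` may differ.

```bash
# Both helpers are hypothetical; $1 is the repo path.

# pattern_origin: commit that first added the file at $2 (follows renames).
pattern_origin() {
  git -C "$1" log --diff-filter=A --follow --date=short \
    --format='%h %ad %an: %s' -- "$2" | tail -n 1
}

# first_mention: oldest commit that changed how often the string $2 occurs.
first_mention() {
  git -C "$1" log -S "$2" --reverse --date=short \
    --format='%h %ad %an: %s' | head -n 1
}
```

Each function prints the short hash, date, author, and subject on one line, which covers the date and author asked for in step 4. Running `first_mention` on the old name shows where the replaced code first appeared.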
**If the repo is on a forge without PR history** (self-hosted, mailing
list-based): Fall back to commit messages and CHANGELOG. The commit
body IS the PR description for these projects. Look for "Reviewed-by"
trailers and linked issues.
### Phase 7: PR Discussions (20 min)
Find PRs where key patterns were introduced. Read:
- The PR body (author's motivation)
- Review comments (the debate)
- The resolution
**What to extract from discussions:**
- What the author was defending (= where the real insight is)
- What reviewers pushed back on (= non-obvious tradeoffs)
- Whether it was "merge and iterate" vs "perfect before merge"
- Whether external validation was cited (benchmarks, user feedback)
- The migration strategy (big-bang vs gradual coexistence)
**The highest-value finding:** When a reviewer says "I wish we'd done
X instead" and the author explains why X doesn't work. That tradeoff
reasoning is pure expert knowledge.
### Phase 8: Synthesis
Produce output based on MODE. Push to `GIT_REMOTE`.
---
#### MODE: conventions
Output: `<project>-conventions` repo.
**`analysis.md`** — the full story:
1. Repo shape and organizational philosophy
2. Import hierarchy (what it values)
3. Key patterns with code examples + origin stories
4. PR discussion excerpts (attributed quotes)
5. Cross-ecosystem comparisons (prior art, independent invention)
6. Quality metrics in context (not bare numbers)
**`conventions.md`** — the reference:
For each unique pattern:
- Name and location in source
- Code example (real, not simplified)
- When to use / When NOT to use
- Origin (commit date, author, PR# if available)
**Tone:** Descriptive. "This project does X because Y."
---
#### MODE: patterns
Output: `<language>-patterns` or `<ecosystem>-patterns` repo.
**Synthesis question:** "What should a developer copy from this
codebase?" Filter everything through: "If I were writing new code
in this language/ecosystem, what rules does this source teach me?"
**This is iterative, not one-shot.** The method produces quality
through decomposition, not through asking one agent to "write a
good file." Each step is bounded, mechanical, and verifiable.
### The Repeatable Method
**Step 1: Quantify** (5 min per topic)
For each topic area, run frequency grep commands to find patterns.
The goal is COUNTS — how often does this pattern appear?
```
# Example: error handling in Go
grep -rn "^var Err" --include="*.go" . | grep -v test | wc -l        # → 55
grep -rn "fmt.Errorf.*%w" --include="*.go" . | grep -v test | wc -l  # → 115
grep -rn "errors\.Is\|errors\.As" --include="*.go" . | wc -l         # → 212
```
Output: a numbered list of pattern names + counts. This IS the
table of contents for that topic file.
**Step 2: Extract one** (5-10 min per pattern)
For EACH pattern from the list, in order:
1. Find the best example (grep → pick the clearest one)
2. Read 10 lines of surrounding context (understand WHY)
3. Write one pattern entry (40-80 lines, all required sections)
4. Move to the next pattern
The key constraint: **write one pattern entry completely before
starting the next.** Never read all patterns then write all entries.
This prevents context exhaustion and ensures each entry is complete.
**Step 3: Decision tree** (5 min per topic)
After all patterns are written, add a decision tree at the end.
Format: "If X, use pattern A. If Y, use pattern B."
**Step 4: Cross-references** (2 min per topic)
Add `See also:` links to related topic files.
**Step 5: Hyperlinks** (mechanical, scriptable)
Convert all source references to clickable permalinks:
```bash
HEAD=$(git rev-parse HEAD)
BASE="https://github.com/OWNER/REPO/blob/${HEAD}"
sed -i -E "s|\`(path/file\.ext):([0-9]+)\`|[\1#L\2](${BASE}/\1#L\2)|g" file.md
```
### Delegation Strategy
When using sub-agents:
- **DO:** One agent per pattern entry (bounded: read one, write one)
- **DO:** Give the agent the grep output as input (they don't discover,
they deepen a known pattern)
- **DO:** Include one complete example entry in the prompt as the
quality reference
- **DON'T:** Ask one agent to write an entire topic file
- **DON'T:** Ask agents to "discover patterns" (they'll find 5 obvious
ones and miss 10 important ones)
- **DON'T:** Let agents choose their own structure (give them the
template)
**Template for sub-agent task:**
```
Write pattern entry for: [PATTERN NAME]
Source repo: [REPO] at commit [SHA]
Access: [SSH command to get to the source]
Permalink base: [URL]
Grep that found this: [the grep command + sample output]
Reference quality: [paste ONE complete pattern entry as example]
Write to: [output path]
```
### Parallelism
- Step 1 (quantify): run for ALL topics in parallel (just grep)
- Step 2 (extract): run per-pattern entries in parallel (max 5)
- Steps 3-5: sequential (need all entries to exist first)
### Done Criteria
A topic file is done when:
- [ ] Every pattern from Step 1's list has an entry
- [ ] Each entry has ALL required sections (source, why, when to use
with before/after, when NOT to use with over-application)
- [ ] Decision tree exists at the end
- [ ] All source refs are hyperlinked
- [ ] PATTERN_COMPLETE sentinel at EOF
- [ ] File is 500-1000 lines (if shorter, entries are too shallow)
A language is done when:
- [ ] 8-12 topic files exist
- [ ] Each topic has 10-15+ patterns
- [ ] Total is 5,000-10,000+ lines
- [ ] No grep scan reveals patterns not yet documented
- [ ] smells.md covers anti-patterns found in the source
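The mechanically checkable topic-file criteria (line count, sentinel, required sections) could be verified with a script along these lines. `check_topic_file` is a hypothetical helper; it assumes entries use the `## N. Name` heading form and that the sentinel is the literal `<!-- PATTERN_COMPLETE -->` comment on the last line.

```bash
# check_topic_file is a hypothetical helper; $1 is a patterns/<topic>.md file.
check_topic_file() {
  file=$1
  rc=0
  lines=$(grep -c '' "$file")
  if [ "$lines" -lt 500 ] || [ "$lines" -gt 1000 ]; then
    echo "line count $lines is outside 500-1000"; rc=1
  fi
  if [ "$(tail -n 1 "$file")" != '<!-- PATTERN_COMPLETE -->' ]; then
    echo "missing PATTERN_COMPLETE sentinel at EOF"; rc=1
  fi
  # Every numbered entry ("## N. Name") needs each required section once.
  entries=$(grep -cE '^## [0-9]+\. ' "$file")
  for section in '### Source' '### Why' '### When to Use' '### When NOT to Use'; do
    found=$(grep -c "^$section" "$file")
    if [ "$found" -ne "$entries" ]; then
      echo "$section: $found sections for $entries entries"; rc=1
    fi
  done
  return "$rc"
}
```

The check is purely structural: it cannot tell whether an entry has real before/after examples, and heading-like lines inside fenced code blocks would skew the counts.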
**Output structure — one file per topic:**
`patterns/<topic>.md` — topics include (but aren't limited to):
- Error handling (sentinel errors, error types, wrapping, multi-error)
- Naming conventions (packages, types, functions, receivers)
- Concurrency patterns (goroutines, channels, mutexes, sync primitives)
- Testing patterns (table-driven, helpers, fixtures, benchmarks, examples)
- Interface/protocol design (size, composition, assertion, extension)
- Module/package organization (layout, internal/, visibility)
- Documentation conventions (godoc, deprecation, package-level)
- Performance idioms (pooling, preallocate, append, zero-alloc)
- Configuration patterns (functional options, config structs, defaults)
- Extension/plugin patterns (registration, middleware, hooks)
- Struct patterns (constructors, zero values, embedding, tags)
- API design (backwards compat, versioning, deprecation strategy)
**Start with 8-10 topics for a language stdlib; add more if the
source shows distinct patterns in additional areas.** Each topic
should map to a real problem domain that developers face.
**File naming:** Use lowercase, hyphenated names that describe the
topic clearly: `error-handling.md`, `testing-advanced.md`,
`api-conventions.md`, `concurrency.md`.
**Each pattern entry requires ALL of these sections:**
### `## N. Pattern Name`
Short, linkable heading (no generic names like "Pattern 1").
### `### Source:`
Hyperlinked to the exact file and line on the forge.
Format: `[src/io/io.go#L86](https://github.com/golang/go/blob/COMMIT_SHA/src/io/io.go#L86)`
Use permalink format (commit SHA) for stability.
### Real source example
The actual code from the source, with file:line comments. Not
simplified, not invented. This IS the evidence.
### `### Why`
The force that makes this the right choice. Not "because the
stdlib does it" — explain the FORCE (testability, allocation
cost, readability under diff, composability).
### `### When to Use`
**Triggers:** — bullet list of specific situations that call for this.
**Example — before:** — code showing the problem WITHOUT the pattern.
This is critical. Readers must recognize their own bad code here.
**Example — after:** — code showing the same problem WITH the pattern.
The before/after pair is what makes patterns teachable.
### `### When NOT to Use`
**Don't use this when:** — bullet list of boundary conditions.
**Over-application example:** — code showing what happens when you
use this pattern where it doesn't belong. This prevents cargo-culting.
**Better alternative:** — what to do instead in those cases.
### `### Anti-pattern` (when relevant)
Explicit `DON'T:` block showing the wrong approach with a comment
explaining why it's wrong, followed by `DO:` showing the fix.
---
**Each topic file ALSO needs:**
- **Summary/Decision Tree at the end** — "If X, use pattern A. If Y,
use pattern B." Readers should be able to skip to the decision
tree and find their situation.
- **Cross-references** — link to related patterns in other topic files.
e.g., error-handling links to interfaces when discussing error types.
---
**Quality bar:** Each pattern entry should be 40-80 lines including
code examples. A topic file with 10 patterns should be 500-900 lines.
If entries are shorter than 40 lines, they're missing before/after
examples or anti-patterns.
---
**`smells.md`** — anti-patterns found in the source:
- What it looks like (with real code)
- Why it exists (technical debt? deliberate tradeoff? historical?)
- What to do instead (with code showing the fix)
- How to detect it (grep pattern or linter rule)
**Tone:** Prescriptive. "Write it this way because X."
**Key difference from conventions mode:** Skip governance, team
structure, TODO culture, and project history unless they directly
inform HOW to write code. Focus on patterns a user should copy.
**Done criteria:** You've scanned every major directory in the source.
No new patterns emerge from further grep/read. Each topic file has
10-15+ patterns, each with before/after examples, anti-patterns,
and decision guidance. Total output for a language stdlib should be
5,000-10,000+ lines across all topic files.
---
End all output files with `<!-- PATTERN_COMPLETE -->` sentinel.
## Cross-Ecosystem Observations
Always note when a pattern exists in multiple repos. These
independent inventions reveal forces that transcend project context:
- Temporal goro.Handle (2021) ↔ CockroachDB stop.Handle (2025)
- Ecto zero TODOs (version-gated) ↔ Oban zero TODOs (2-week cleanup)
- Prometheus init() plugins ↔ Temporal init() plugins
## The 4 Categories of Pattern Breaks
When you find convention violations, classify:
1. **Ship behavior, fix plumbing later** — tagged with TODO same commit
2. **Better tooling exposed limitation** — observability, not correctness
3. **Removal cost > carrying cost** — zero-interest debt
4. **Context needs different pattern** — not actually a break
See `references/pattern-breaks.md` for real examples with git history.
## NEVER
- **NEVER analyze with a shallow clone** and assume full picture —
archaeology requires full history
- **NEVER present patterns from one file as repo-wide conventions** —
verify frequency across the codebase first
- **NEVER skip PR discussions** — code without context is just syntax;
the discussion IS the insight
- **NEVER report bare numbers** ("738 TODOs") — always contextualize
(per 1000 files, vs comparable projects, trending up/down)
- **NEVER confuse "the maintainer likes X" with "X is the right
pattern"** — solo-maintained projects reflect one person's taste;
team projects reflect negotiated conventions
- **NEVER present a pattern as "unique" without checking** if stdlib
has it or if it's a well-known library pattern
- **NEVER list patterns without when-NOT-to-use** — that's where the
expertise actually lives
- **NEVER quote PR discussions without attribution** — who said it
matters (maintainer vs drive-by contributor)
- **NEVER analyze repos <1000 commits** — not enough history for
meaningful archaeology
- **NEVER conflate language patterns with project conventions:**
`go-patterns` is stdlib idiom; `temporal-conventions` is project choice
## Output Repos
Push to `GIT_REMOTE` under:
- **conventions mode:** `GIT_ORG/<project>-conventions`
- **patterns mode:** `GIT_ORG/<language>-patterns`
See `references/commands.md` for repo creation and push commands.
## Fallbacks
- **No PR discussions?** Use commit messages as primary source.
Many projects (Linux, PostgreSQL) do all review in commit messages
and mailing lists.
- **Repo too large to clone fully?** Clone shallow first, run Phases
1-5, then `git fetch --unshallow` only if Phases 6-7 are needed.
Commit and contributor counts from Phase 1 are unreliable until the
clone is unshallowed.
- **Private repo / no forge API?** Skip Phase 7. Phase 6 (local git
history) still works.
- **<3000 commits?** Reduce Phase 6-7 expectations. Younger projects
have less archaeology to mine — focus on Phase 5 (unique patterns)
and the project's README/docs for rationale.
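The shallow-first fallback might be scripted as follows. Both function names are hypothetical, and `--is-shallow-repository` needs a reasonably recent git (2.15 or later).

```bash
# Hypothetical helpers for the "too large to clone fully" fallback.

# clone_shallow: $1 = clone URL, $2 = target directory.
clone_shallow() {
  git clone --quiet --depth 1 "$1" "$2"
}

# deepen_for_archaeology: $1 = clone directory.
# Fetches the remaining history only when the clone is still shallow.
deepen_for_archaeology() {
  if [ "$(git -C "$1" rev-parse --is-shallow-repository)" = "true" ]; then
    git -C "$1" fetch --quiet --unshallow
  fi
}
```

Calling `deepen_for_archaeology` on an already complete clone is a no-op, so it is safe to run unconditionally before the history-dependent phases.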
## Execution Notes
- Clone on `CLONE_HOST` — needs disk space for full git history
- `gh api` or equivalent for forge PR lookups (requires authentication)
- One repo at a time for focused analysis
- Markdownlint all output before pushing