docs: richer README with methodology table, principles, and config

Rodin
2026-04-30 12:14:42 -07:00
parent 886a4d8b46
commit 68e35b0cea
+91 -1
@@ -1,3 +1,93 @@
# codebase-analysis
Skill for analyzing open source repositories to extract architectural conventions and patterns. 8-phase methodology: clone, imports, interfaces, quality markers, unique patterns, git archaeology, PR discussions, synthesis.
An [OpenClaw](https://github.com/openclaw/openclaw) skill for analyzing
open source repositories to extract architectural conventions, patterns,
and the design decisions behind them.
## What It Does
Given a repository URL, this skill produces two artifacts:
- **`analysis.md`** — the full story: repo shape, import hierarchy, key
patterns with code examples, PR discussion excerpts, cross-ecosystem
comparisons
- **`conventions.md`** — a reference of extracted patterns with when to
use / when NOT to use each one
Output is pushed to a convention repo (`<project>-conventions`).
## The Methodology
Eight phases, each building on the previous one:
| Phase | Focus | Time |
|-------|-------|------|
| 1. Shape | Clone, measure dimensions | 5 min |
| 2. Values | Most-imported packages reveal priorities | 10 min |
| 3. Interfaces | Key abstractions and extension points | 10 min |
| 4. Quality | TODOs, test ratios, discipline markers | 5 min |
| 5. Unique Patterns | Infrastructure not in stdlib | 15 min |
| 6. Git Archaeology | Trace WHY decisions were made | 20 min |
| 7. PR Discussions | Read the actual debates | 20 min |
| 8. Synthesis | Produce analysis + conventions docs | — |
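As an illustration of Phase 2, the following is a minimal sketch of counting the most-imported packages in a cloned Go repository. It is not the skill's own tooling: the function names `imports_in_source` and `most_imported` are hypothetical, the focus on Go is an assumption, and the regex-based parsing is a rough approximation that ignores edge cases such as commented-out imports.

```python
import re
from collections import Counter
from pathlib import Path

# Hypothetical helpers for illustration; not part of the skill itself.
# Grouped form:  import ( "fmt" \n alias "path/to/pkg" )
IMPORT_BLOCK = re.compile(r'^import\s*\((.*?)\)', re.MULTILINE | re.DOTALL)
# Single form:   import "fmt"   or   import alias "path/to/pkg"
IMPORT_SINGLE = re.compile(r'^import\s+(?:[\w.]+\s+)?"([^"]+)"', re.MULTILINE)
QUOTED = re.compile(r'"([^"]+)"')


def imports_in_source(source: str) -> list[str]:
    """Return the import paths found in one Go source file's text."""
    paths: list[str] = []
    for block in IMPORT_BLOCK.findall(source):
        paths.extend(QUOTED.findall(block))
    paths.extend(IMPORT_SINGLE.findall(source))
    return paths


def most_imported(root: str, top: int = 10) -> list[tuple[str, int]]:
    """Count import paths across every .go file under root."""
    counts: Counter[str] = Counter()
    for path in Path(root).rglob("*.go"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        counts.update(imports_in_source(text))
    return counts.most_common(top)
```

Calling `most_imported("/path/to/clone")` on a cloned repository yields the ranked list that Phase 2 reads as a signal of what the project values.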
## Key Principles
- **WHY > WHAT** — a pattern listing is boring; the commit history and
PR debate that explain *why* it exists are the real insight
- **Cross-ecosystem comparison** — note when multiple projects
independently invent the same solution (e.g., Temporal's
`goro.Handle` in 2021 ↔ CockroachDB's `stop.Handle` in 2025)
- **Contextualize everything** — "738 TODOs" means nothing without
knowing the repo has 20K files and 700 contributors
- **When NOT to use is where expertise lives** — every pattern has a
shadow; documenting it is the real value
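The "contextualize everything" principle can be sketched as a small normalization step. This is only an illustration using the figures quoted above (738 TODOs, 20K files, 700 contributors); the `RepoContext` type and `contextualize` function are hypothetical names, and the choice of per-1,000-files and per-contributor ratios is an assumption rather than something the skill prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoContext:
    """Size figures that give a raw count its meaning (hypothetical type)."""
    files: int
    contributors: int


def contextualize(raw_count: int, ctx: RepoContext) -> dict[str, float]:
    """Express a raw marker count relative to the size of the repository."""
    return {
        "per_1k_files": 1000 * raw_count / ctx.files,
        "per_contributor": raw_count / ctx.contributors,
    }


# Figures taken from the example in the principle above.
stats = contextualize(738, RepoContext(files=20_000, contributors=700))
print(f"{stats['per_1k_files']:.1f} TODOs per 1,000 files")      # 36.9
print(f"{stats['per_contributor']:.2f} TODOs per contributor")   # 1.05
```

Read this way, 738 TODOs works out to roughly one per contributor, which is a very different claim from the raw number on its own.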
## Configuration
Set these in your workspace (`TOOLS.md`, `AGENTS.md`, or the invocation
context):
| Parameter | Required | Description |
|-----------|----------|-------------|
| `CLONE_DIR` | Yes | Where to clone repos |
| `GIT_REMOTE` | Yes | Where convention repos are pushed |
| `CLONE_HOST` | No | Machine to clone on (default: localhost) |
| `GIT_ORG` | No | Org/user for repos (default: authenticated user) |
| `GIT_TOKEN_PATH` | No | Auth token path (default: git credential helper) |
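The table above can be read as a small validation rule: two parameters are mandatory and three fall back to defaults. The sketch below models that rule. How the skill actually reads these values is not specified here, so the `SkillConfig` type, the `load_config` function, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("CLONE_DIR", "GIT_REMOTE")


@dataclass(frozen=True)
class SkillConfig:
    """Hypothetical container mirroring the parameter table."""
    clone_dir: str
    git_remote: str
    clone_host: str = "localhost"
    git_org: Optional[str] = None         # None: use the authenticated user
    git_token_path: Optional[str] = None  # None: use the git credential helper


def load_config(values: Mapping[str, str]) -> SkillConfig:
    """Validate required parameters and apply the documented defaults."""
    missing = [name for name in REQUIRED if not values.get(name)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return SkillConfig(
        clone_dir=values["CLONE_DIR"],
        git_remote=values["GIT_REMOTE"],
        clone_host=values.get("CLONE_HOST") or "localhost",
        git_org=values.get("GIT_ORG") or None,
        git_token_path=values.get("GIT_TOKEN_PATH") or None,
    )


# Example values are placeholders, not recommended settings.
cfg = load_config({"CLONE_DIR": "/tmp/repos", "GIT_REMOTE": "https://git.example.com"})
print(cfg.clone_host)  # localhost
```

Failing early on a missing `CLONE_DIR` or `GIT_REMOTE` keeps the later phases from running against a half-configured workspace.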
## Naming Convention
- `*-patterns` — language-level idioms (how Go/Elixir wants you to write)
- `*-conventions` — project-specific (how a codebase chose to do it)
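A minimal sketch of the naming rule, assuming repo names are lowercased and space-free (the README does not state this). The `repo_name` function and its `kind` values are hypothetical and exist only for illustration.

```python
# Suffix per kind of output, following the two naming rules above.
SUFFIXES = {"language": "patterns", "project": "conventions"}


def repo_name(subject: str, kind: str) -> str:
    """Build a repo name such as 'go-patterns' or 'oban-conventions'."""
    if kind not in SUFFIXES:
        raise ValueError(f"kind must be one of {sorted(SUFFIXES)}")
    # Lowercasing and hyphenating are assumptions, not documented behavior.
    slug = subject.strip().lower().replace(" ", "-")
    return f"{slug}-{SUFFIXES[kind]}"


print(repo_name("Go", "language"))          # go-patterns
print(repo_name("CockroachDB", "project"))  # cockroachdb-conventions
```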
## Proven On
This methodology was developed and validated across five repositories:
- **CockroachDB** (845M, 117K commits) — stopper handles, error
wrapping purity, stale TODOs
- **Prometheus** (39M, 15K commits) — slog migration, AppenderV2,
global vars in hot paths
- **Temporal** (181M, 9K commits) — HSM framework, CHASM, soft
assertions, effect buffers
- **Ecto** (3.8M, 12K commits) — zero TODOs, protocol extensibility,
version-gated cleanup
- **Oban** (2.1M, 3K commits) — engine behaviour, inline testing,
2-week TODO cleanup
## Installation
Drop the skill directory into your OpenClaw workspace:
```
skills/codebase-analysis/
├── SKILL.md
└── references/
    ├── commands.md
    └── pattern-breaks.md
```
## License
MIT