
codebase-analysis

An OpenClaw skill for analyzing open source repositories to extract architectural conventions, patterns, and the design decisions behind them.

What It Does

Given a repository URL, this skill produces two artifacts:

  • analysis.md — the full story: repo shape, import hierarchy, key patterns with code examples, PR discussion excerpts, cross-ecosystem comparisons
  • conventions.md — a reference of extracted patterns with when to use / when NOT to use each one

Output is pushed to a convention repo (<project>-conventions).

The Methodology

8 phases, each building on the previous:

| Phase | Focus | Time |
|---|---|---|
| 1. Shape | Clone, measure dimensions | 5 min |
| 2. Values | Most-imported packages reveal priorities | 10 min |
| 3. Interfaces | Key abstractions and extension points | 10 min |
| 4. Quality | TODOs, test ratios, discipline markers | 5 min |
| 5. Unique Patterns | Infrastructure not in stdlib | 15 min |
| 6. Git Archaeology | Trace WHY decisions were made | 20 min |
| 7. PR Discussions | Read the actual debates | 20 min |
| 8. Synthesis | Produce analysis + conventions docs | |
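
As an illustration of Phase 2, the sketch below counts how many Go source files import each package. It is not the skill's own implementation (the real commands live in references/commands.md, which is not reproduced here); the function names `count_go_imports` and `top_imports` are hypothetical, and the regex-based parsing is a simplification that can be fooled by quoted strings inside comments in an import block.

```python
import re
from collections import Counter
from pathlib import Path

# Single-line form: import "fmt"  or  import alias "pkg/path"
SINGLE_IMPORT = re.compile(r'^\s*import\s+(?:[\w.]+\s+)?"([^"]+)"', re.MULTILINE)
# Parenthesized form: import ( ... ); captures the body between the parentheses
IMPORT_BLOCK = re.compile(r'^\s*import\s*\((.*?)\)', re.MULTILINE | re.DOTALL)
# One quoted import path inside a block body
BLOCK_ENTRY = re.compile(r'"([^"]+)"')


def count_go_imports(sources):
    """Count, per import path, how many source texts import it."""
    counts = Counter()
    for text in sources:
        # Use a set so a file contributes at most once per package.
        paths = set(SINGLE_IMPORT.findall(text))
        for body in IMPORT_BLOCK.findall(text):
            paths.update(BLOCK_ENTRY.findall(body))
        counts.update(paths)
    return counts


def top_imports(root, n=10):
    """Return the n most-imported packages under a cloned repo (hypothetical helper)."""
    sources = (
        p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(root).rglob("*.go")
        if p.is_file()
    )
    return count_go_imports(sources).most_common(n)
```

Calling `top_imports` on a directory under CLONE_DIR would give a ranked list to read as a rough signal of what the project values; the ranking is a starting point for the later phases, not a conclusion.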

Key Principles

  • WHY > WHAT — a pattern listing is boring; the commit history and PR debate that explains why it exists is the real insight
  • Cross-ecosystem comparison — note when multiple projects independently invent the same solution (e.g., Temporal's goro.Handle in 2021 ↔ CockroachDB's stop.Handle in 2025)
  • Contextualize everything — "738 TODOs" means nothing without knowing the repo has 20K files and 700 contributors
  • When NOT to use is where expertise lives — every pattern has a shadow; documenting it is the real value
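
The "contextualize everything" principle can be made concrete with a small normalization step. The sketch below is an assumption about how one might do it, not part of the skill: `todo_density` is a hypothetical helper, and the choice of "per 1,000 files" and "per contributor" as denominators is illustrative.

```python
def todo_density(todo_count, file_count, contributor_count):
    """Normalize a raw TODO count by repo size and contributor base."""
    if file_count <= 0 or contributor_count <= 0:
        raise ValueError("file_count and contributor_count must be positive")
    return {
        "per_1k_files": todo_count / file_count * 1000,
        "per_contributor": todo_count / contributor_count,
    }


# The example from the principle above: 738 TODOs, 20K files, 700 contributors.
density = todo_density(738, 20_000, 700)
# Roughly 36.9 TODOs per 1,000 files and about 1.05 per contributor.
```

Seen this way, 738 TODOs amounts to about one per contributor, which reads very differently from the raw number.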

Configuration

Set these in your workspace (TOOLS.md, AGENTS.md, or invocation context):

| Parameter | Required | Description |
|---|---|---|
| CLONE_DIR | Yes | Where to clone repos |
| GIT_REMOTE | Yes | Where convention repos are pushed |
| CLONE_HOST | No | Machine to clone on (default: localhost) |
| GIT_ORG | No | Org/user for repos (default: authenticated user) |
| GIT_TOKEN_PATH | No | Auth token path (default: git credential helper) |
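
The required/optional split above can be sketched as a small resolution step. How the skill actually reads these parameters is not specified here, so everything in this block is an assumption: `resolve_config` is a hypothetical function, and `None` is used as a stand-in for "fall back to the authenticated user" and "fall back to the git credential helper".

```python
REQUIRED = ("CLONE_DIR", "GIT_REMOTE")
DEFAULTS = {
    "CLONE_HOST": "localhost",
    "GIT_ORG": None,         # None: use the authenticated user (assumed encoding)
    "GIT_TOKEN_PATH": None,  # None: use the git credential helper (assumed encoding)
}


def resolve_config(params):
    """Validate required parameters and fill in defaults for optional ones."""
    missing = [key for key in REQUIRED if not params.get(key)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    config = dict(DEFAULTS)
    # Ignore unknown keys so stray workspace settings do not leak in.
    known = set(REQUIRED) | set(DEFAULTS)
    config.update({k: v for k, v in params.items() if k in known})
    return config


# Placeholder values, for illustration only.
config = resolve_config({"CLONE_DIR": "/tmp/repos", "GIT_REMOTE": "git@example.com"})
```

Failing fast on a missing CLONE_DIR or GIT_REMOTE keeps the error near the configuration rather than surfacing later, in the clone or push step.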

Naming Convention

  • *-patterns — language-level idioms (how Go/Elixir wants you to write)
  • *-conventions — project-specific (how a codebase chose to do it)
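
A minimal sketch of the naming rule, assuming repo names are built by joining a subject and a suffix. The helper `repo_name` and the lowercasing and space-to-hyphen normalization are assumptions made for illustration; the skill itself may name repos differently.

```python
SUFFIXES = ("patterns", "conventions")


def repo_name(subject, kind):
    """Build '<subject>-patterns' (language idioms) or '<subject>-conventions' (project-specific)."""
    if kind not in SUFFIXES:
        raise ValueError(f"kind must be one of {SUFFIXES}")
    # Lowercasing and hyphenating are assumed normalizations, not documented behavior.
    slug = subject.strip().lower().replace(" ", "-")
    return f"{slug}-{kind}"


language_repo = repo_name("Go", "patterns")
project_repo = repo_name("Oban", "conventions")
```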

Proven On

This methodology was developed and validated across 5 repos:

  • CockroachDB (845M, 117K commits) — stopper handles, error wrapping purity, stale TODOs
  • Prometheus (39M, 15K commits) — slog migration, AppenderV2, global vars in hot paths
  • Temporal (181M, 9K commits) — HSM framework, CHASM, soft assertions, effect buffers
  • Ecto (3.8M, 12K commits) — zero TODOs, protocol extensibility, version-gated cleanup
  • Oban (2.1M, 3K commits) — engine behaviour, inline testing, 2-week TODO cleanup

Installation

Drop the skill directory into your OpenClaw workspace:

skills/codebase-analysis/
├── SKILL.md
└── references/
    ├── commands.md
    └── pattern-breaks.md

License

MIT
