feat: repeatable mechanical method for patterns mode

5 steps: Quantify → Extract one → Decision tree → Cross-refs → Hyperlinks. Delegation strategy (per-entry, not per-file). Discovery greps for Go, Elixir, Rust, Python. Hyperlink scripts per language.
2026-04-30 14:46:41 -07:00
parent 0c51a9334f
commit bd9790caa1
2 changed files with 261 additions and 17 deletions
@@ -242,25 +242,106 @@ Output: `<language>-patterns` or `<ecosystem>-patterns` repo.
 codebase?" Filter everything through: "If I were writing new code
 in this language/ecosystem, what rules does this source teach me?"

-**This is iterative, not one-shot.** Keep extracting until you've
-identified ALL patterns the source demonstrates. A first pass finds
-the obvious ones. Second pass greps for variations and edge cases.
-Third pass finds the patterns that break. You're done when scanning
-the source no longer reveals new rules.
+**This is iterative, not one-shot.** Quality comes from decomposing
+the work into small steps, not from asking one agent to "write a
+good file." Each step is bounded, mechanical, and verifiable.

-**Process:**
-1. **Discovery pass** — scan the source by topic area, identify every
-   distinct pattern (aim for 15-30+ per topic in a large codebase)
-2. **Deepening pass** — for each pattern, grep for 5-10 real usages
-   across the codebase. Note variations. Find the best example.
-3. **Edge case pass** — find where each pattern DOESN'T apply.
-   Grep for violations — are they bugs, or legitimate exceptions?
-4. **Cross-reference pass** — which patterns interact? Which ones
-   conflict? Document the decision framework for choosing between
-   competing patterns.
+### The Repeatable Method

-Repeat until scanning the source yields no new patterns. A language
-stdlib should produce 50-200+ patterns across all topics.
+**Step 1: Quantify** (5 min per topic)
+
+For each topic area, run grep commands that count how often each
+candidate pattern appears. The goal is COUNTS, not examples: how
+often does this pattern appear?
+
+```
+# Example: error handling in Go
+grep -rn "^var Err" --include="*.go" | grep -v test | wc -l        # → 55
+grep -rn "fmt\.Errorf.*%w" --include="*.go" | grep -v test | wc -l  # → 115
+grep -rn "errors\.Is\|errors\.As" --include="*.go" | wc -l          # → 212
+```
+
+Output: a numbered list of pattern names + counts. This IS the
+table of contents for that topic file.
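A minimal sketch of how the Step 1 counts could be turned into that numbered list. The helper names `count_pattern` and `quantify_errors` are hypothetical, as are the pattern labels. It assumes a grep that supports `--exclude` and `--exclude-dir`, which filters by file name instead of the cruder `grep -v test`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count matches of a regex in non-test, non-vendor Go files.
# Usage: count_pattern <label> <regex> [dir]
count_pattern() {
  local name=$1 regex=$2 dir=${3:-.}
  local n
  n=$(grep -rn --include='*.go' --exclude='*_test.go' --exclude-dir=vendor \
        -e "$regex" "$dir" | wc -l)
  # Emit "label<TAB>count" so the output is easy to sort.
  printf '%s\t%d\n' "$name" "$n"
}

# Hypothetical driver for one topic: emit a numbered list, most frequent first.
quantify_errors() {
  local dir=${1:-.}
  {
    count_pattern 'sentinel errors (var Err...)' '^var Err' "$dir"
    count_pattern 'wrapping with %w' 'fmt\.Errorf.*%w' "$dir"
    count_pattern 'errors.Is / errors.As' 'errors\.Is\|errors\.As' "$dir"
  } | sort -t "$(printf '\t')" -k2,2nr | nl -w2 -s'. '
}
```

Running `quantify_errors path/to/repo > errors.toc` would give one line per pattern, which can then be walked in order during Step 2.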
+
+**Step 2: Extract one** (5-10 min per pattern)
+
+For EACH pattern from the list, in order:
+1. Find the best example (grep → pick the clearest one)
+2. Read 10 lines of surrounding context (understand WHY)
+3. Write one pattern entry (40-80 lines, all required sections)
+4. Move to the next pattern
+
+The key constraint: **write one pattern entry completely before
+starting the next.** Never read all patterns then write all entries.
+This prevents context exhaustion and ensures each entry is complete.
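Sub-step 2 ("read 10 lines of surrounding context") can be made mechanical with a small helper. This is only a sketch: `show_context` is a hypothetical name, and the default radius of 10 is taken from the step above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print RADIUS lines either side of FILE:LINE,
# prefixed with their line numbers.
# Usage: show_context <file> <line> [radius]
show_context() {
  local file=$1 line=$2 radius=${3:-10}
  local start=$(( line - radius ))
  local end=$(( line + radius ))
  # Clamp the window to the top of the file.
  if [ "$start" -lt 1 ]; then
    start=1
  fi
  awk -v s="$start" -v e="$end" '
    NR >= s && NR <= e { printf "%6d  %s\n", NR, $0 }
    NR > e { exit }
  ' "$file"
}
```

Given a grep hit such as `errors.go:58`, calling `show_context errors.go 58` prints lines 48 through 68, which is usually enough to see why the pattern is used there.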
+
+**Step 3: Decision tree** (5 min per topic)
+
+After all patterns are written, add a decision tree at the end.
+Format: "If X, use pattern A. If Y, use pattern B."
+
+**Step 4: Cross-references** (2 min per topic)
+
+Add `See also:` links to related topic files.
+
+**Step 5: Hyperlinks** (mechanical, scriptable)
+
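+Convert all source references to clickable permalinks (`OWNER/REPO`
+and `path/file\.ext` are placeholders to adapt; `sed -i` as written
+assumes GNU sed):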
+```bash
+HEAD=$(git rev-parse HEAD)
+BASE="https://github.com/OWNER/REPO/blob/${HEAD}"
+sed -i -E "s|\`(path/file\.ext):([0-9]+)\`|[\1#L\2](${BASE}/\1#L\2)|g" file.md
+```
+
+### Delegation Strategy
+
+When using sub-agents:
+
+- **DO:** One agent per pattern entry (bounded: read one, write one)
+- **DO:** Give the agent the grep output as input (they don't discover,
+  they deepen a known pattern)
+- **DO:** Include one complete example entry in the prompt as the
+  quality reference
+- **DON'T:** Ask one agent to write an entire topic file
+- **DON'T:** Ask agents to "discover patterns" (they'll find 5 obvious
+  ones and miss 10 important ones)
+- **DON'T:** Let agents choose their own structure (give them the
+  template)
+
+**Template for sub-agent task:**
+```
+Write pattern entry for: [PATTERN NAME]
+Source repo: [REPO] at commit [SHA]
+Access: [SSH command to get to the source]
+Permalink base: [URL]
+Grep that found this: [the grep command + sample output]
+Reference quality: [paste ONE complete pattern entry as example]
+Write to: [output path]
+```
+
+### Parallelism
+
+- Step 1 (quantify): run for ALL topics in parallel (just grep)
+- Step 2 (extract): run per-pattern entries in parallel (max 5)
+- Steps 3-5: sequential (need all entries to exist first)
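One way to enforce the "max 5" cap in Step 2 is `xargs -P`. The sketch below assumes a file with one pattern name per line. `extract_all` is a hypothetical wrapper, and the command passed to it stands in for whatever launches a single sub-agent task.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run one command per pattern name, at most 5 at a time.
# Usage: extract_all <list-file> <command> [args...]
extract_all() {
  local list=$1
  shift
  # NUL-delimit the names so spaces and quotes in a pattern name survive xargs.
  tr '\n' '\0' < "$list" | xargs -0 -n 1 -P 5 "$@"
}
```

For example, `extract_all patterns.txt ./write_entry.sh` would call `./write_entry.sh "<pattern name>"` once per line (the script name is illustrative). Output order is not guaranteed, so each task should write to its own file.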
+
+### Done Criteria
+
+A topic file is done when:
+- [ ] Every pattern from Step 1's list has an entry
+- [ ] Each entry has ALL required sections (source; why; when to use,
+  with before/after; when NOT to use, with over-application)
+- [ ] Decision tree exists at the end
+- [ ] All source refs are hyperlinked
+- [ ] PATTERN_COMPLETE sentinel at EOF
+- [ ] File is 500-1000 lines (if shorter, entries are too shallow)
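Several of these checks are mechanical, so they could be scripted. The sketch below is hypothetical and makes three assumptions: the sentinel is the literal string `PATTERN_COMPLETE` within the last three lines, the decision tree section contains the words "decision tree", and unlinked source refs look like a backticked `path:NN`. It does not check that every Step 1 pattern has an entry or that each entry has all required sections.

```bash
#!/usr/bin/env bash
# Hypothetical checker for the scriptable parts of the topic-file checklist.
# Returns 0 when all checks pass, 1 otherwise.
check_topic_file() {
  local f=$1 status=0 lines unlinked
  lines=$(wc -l < "$f" | tr -d ' ')
  if [ "$lines" -lt 500 ] || [ "$lines" -gt 1000 ]; then
    echo "FAIL: $f has $lines lines, expected 500-1000"
    status=1
  fi
  if ! grep -qi 'decision tree' "$f"; then
    echo "FAIL: $f has no decision tree"
    status=1
  fi
  if ! tail -n 3 "$f" | grep -q 'PATTERN_COMPLETE'; then
    echo "FAIL: $f is missing the PATTERN_COMPLETE sentinel at EOF"
    status=1
  fi
  # Count backticked path:NN refs that were never converted to links.
  unlinked=$(grep -cE '`[^` ]+:[0-9]+(-[0-9]+)?`' "$f" || true)
  if [ "$unlinked" -gt 0 ]; then
    echo "FAIL: $f has $unlinked line(s) with unlinked source refs"
    status=1
  fi
  return "$status"
}
```

A loop such as `for f in patterns/*.md; do check_topic_file "$f"; done` would report every topic file that still needs work.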
+
+A language is done when:
+- [ ] 8-12 topic files exist
+- [ ] Each topic has 10-15+ patterns
+- [ ] Total is 5,000-10,000+ lines
+- [ ] No grep scan reveals patterns not yet documented
+- [ ] smells.md covers anti-patterns found in the source

 **Output structure — one file per topic:**

@@ -124,3 +124,166 @@ git config user.email "<your-email>"
 # ... add files ...
 git add -A && git commit -m "docs: initial conventions from <org>/<project>" && git push
 ```
+
+## Patterns Mode: Step 1 Discovery Greps
+
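+Run these per topic to get pattern names + counts. Adapt them to your
+language and repository layout. The goal is a numbered list of patterns
+to write entries for. The `grep -v` filters are deliberately crude: they
+drop any matching line whose path or text contains the word (`test`,
+`vendor`, `internal`, `cmd`), so treat the counts as approximate.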
+
+### Go
+
+```bash
+# Error handling
+grep -rn "^var Err" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | grep -v cmd | wc -l
+grep -rn "fmt.Errorf.*%w" --include="*.go" | grep -v test | grep -v vendor | wc -l
+grep -rn "errors\.Is\|errors\.As" --include="*.go" | grep -v test | grep -v vendor | wc -l
+grep -rn "errors\.Join" --include="*.go" | grep -v test | grep -v vendor | wc -l
+grep -rn "func.*Error() string" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+
+# Interfaces
+grep -rn "type.*interface {" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | grep -v cmd | wc -l
+grep -rn "var _.*=" --include="*.go" | grep -v test | grep -v vendor | wc -l
+
+# Testing
+grep -rn "t\.Run(" --include="*.go" | grep -v vendor | wc -l
+grep -rn "t\.Helper()" --include="*.go" | grep -v vendor | wc -l
+grep -rn "func Example" --include="*.go" | grep -v vendor | wc -l
+grep -rn "func TestMain" --include="*.go" | grep -v vendor | wc -l
+grep -rn "func Benchmark" --include="*.go" | grep -v vendor | wc -l
+find . -name "testdata" -type d | grep -v vendor | wc -l
+grep -rn "func Fuzz" --include="*.go" | grep -v vendor | wc -l
+
+# Concurrency
+grep -rn "sync\.Mutex\|sync\.RWMutex" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "sync\.Once" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "sync\.Pool" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "context\.Context" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "sync\.WaitGroup" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+
+# Naming/Style
+grep -rn "^func New" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | grep -v cmd | wc -l
+grep -rn "func.*) Get[A-Z]" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+find . -name "doc.go" | grep -v vendor | grep -v internal | wc -l
+grep -rn "Deprecated:" --include="*.go" | grep -v test | grep -v vendor | wc -l
+
+# Configuration
+grep -rn "^func With" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | grep -v cmd | wc -l
+grep -rn "type.*Config struct\|type.*Options struct" --include="*.go" | grep -v test | grep -v vendor | wc -l
+grep -rn "^var Default\|^func Default" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "^func Register" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+
+# Performance
+grep -rn "sync\.Pool" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "func.*Append[A-Z]" --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+grep -rn "make(\[\].*,.*," --include="*.go" | grep -v test | grep -v vendor | grep -v internal | wc -l
+```
+
+### Elixir
+
+```bash
+# Error handling
+grep -rn "defexception" --include="*.ex" | wc -l
+grep -rn "{:ok,\|{:error," --include="*.ex" | grep -v test | wc -l
+grep -rn "raise\|reraise" --include="*.ex" | grep -v test | wc -l
+grep -rn "rescue" --include="*.ex" | grep -v test | wc -l
+
+# Protocols/Behaviours
+grep -rn "defprotocol" --include="*.ex" | wc -l
+grep -rn "@callback" --include="*.ex" | wc -l
+grep -rn "@behaviour" --include="*.ex" | wc -l
+grep -rn "defimpl" --include="*.ex" | wc -l
+
+# Testing
+grep -rn "describe\|test " --include="*.exs" | wc -l
+grep -rn "assert\|refute" --include="*.exs" | wc -l
+grep -rn "setup\|setup_all" --include="*.exs" | wc -l
+grep -rn "ExUnit.CaseTemplate" --include="*.ex" | wc -l
+grep -rn "doctest" --include="*.exs" | wc -l
+
+# Process/OTP
+grep -rn "use GenServer\|use Agent\|use Supervisor" --include="*.ex" | wc -l
+grep -rn "GenServer.call\|GenServer.cast" --include="*.ex" | grep -v test | wc -l
+grep -rn "DynamicSupervisor\|Registry" --include="*.ex" | grep -v test | wc -l
+
+# Documentation/Types
+grep -rn "@moduledoc" --include="*.ex" | wc -l
+grep -rn "@doc" --include="*.ex" | wc -l
+grep -rn "@spec" --include="*.ex" | wc -l
+grep -rn "@type\|@typep\|@opaque" --include="*.ex" | wc -l
+grep -rn "defguard" --include="*.ex" | wc -l
+
+# Macros/Metaprogramming
+grep -rn "defmacro" --include="*.ex" | wc -l
+grep -rn "quote do" --include="*.ex" | wc -l
+grep -rn "__using__\|__before_compile__" --include="*.ex" | wc -l
+```
+
+### Rust
+
+```bash
+# Error handling
+grep -rn "Result<" --include="*.rs" | grep -v test | wc -l
+grep -rn "impl.*Error" --include="*.rs" | grep -v test | wc -l
+grep -rn "thiserror::" --include="*.rs" | wc -l
+grep -rn "anyhow\|eyre" --include="*.rs" | wc -l
+
+# Traits
+grep -rn "pub trait" --include="*.rs" | grep -v test | wc -l
+grep -rn "impl.*for" --include="*.rs" | grep -v test | wc -l
+
+# Testing
+grep -rn "#\[test\]" --include="*.rs" | wc -l
+grep -rn "#\[cfg(test)\]" --include="*.rs" | wc -l
+grep -rn "proptest\|quickcheck" --include="*.rs" | wc -l
+
+# Async
+grep -rn "async fn" --include="*.rs" | grep -v test | wc -l
+grep -rn "\.await" --include="*.rs" | grep -v test | wc -l
+grep -rn "tokio\|async-std" --include="*.rs" | wc -l
+```
+
+### Python
+
+```bash
+# Error handling
+grep -rn "raise " --include="*.py" | grep -v test | wc -l
+grep -rn "class.*Exception\|class.*Error" --include="*.py" | wc -l
+grep -rn "except " --include="*.py" | grep -v test | wc -l
+
+# Types
+grep -rn "def.*->" --include="*.py" | grep -v test | wc -l
+grep -rn "@dataclass\|@attr\.s\|@attrs\.define\|@define" --include="*.py" | wc -l
+grep -rn "Protocol\|ABC" --include="*.py" | wc -l
+
+# Testing
+grep -rn "def test_" --include="*.py" | wc -l
+grep -rn "@pytest.fixture" --include="*.py" | wc -l
+grep -rn "@pytest.mark.parametrize" --include="*.py" | wc -l
+```
+
+## Patterns Mode: Step 5 Hyperlink Script
+
+```bash
+# Generic hyperlink conversion script
+# Adapt OWNER, REPO, HEAD, and source path format to your language
+
+HEAD=$(git rev-parse HEAD)
+BASE="https://github.com/OWNER/REPO/blob/${HEAD}"
+
+# Go: `src/path/file.go:NN` or `src/path/file.go` lines NN
+for f in patterns/*.md; do
+  sed -i -E "s|\`src/([^:\`]+):([0-9]+)(-[0-9]+)?\`|[src/\\1#L\\2](${BASE}/src/\\1#L\\2)|g" "$f"
+  sed -i -E "s|\`src/([^\`]+)\` lines? ([0-9]+)([–-][0-9]+)?|[src/\\1#L\\2](${BASE}/src/\\1#L\\2)|g" "$f"
+done
+
+# Elixir: `lib/elixir/lib/something.ex` lines NN
+for f in patterns/*.md; do
+  sed -i -E "s|\`(lib/[^:\`]+\\.ex[s]?):([0-9]+)(-[0-9]+)?\`|[\\1#L\\2](${BASE}/\\1#L\\2)|g" "$f"
+  sed -i -E "s|\`(lib/[^\`]+\\.ex[s]?)\` lines? ([0-9]+)|[\\1#L\\2](${BASE}/\\1#L\\2)|g" "$f"
+done
+
+# Rust: `src/path/file.rs:NN`
+for f in patterns/*.md; do
+  sed -i -E "s|\`(src/[^:\`]+\\.rs):([0-9]+)(-[0-9]+)?\`|[\\1#L\\2](${BASE}/\\1#L\\2)|g" "$f"
+done
+```
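Because `sed -i` rewrites files in place, it can help to preview the change first. The sketch below is a hypothetical wrapper, `preview_links`, that applies the Go-style `path:NN` rewrite from the script above to stdout and shows a unified diff. The base URL is passed in as an argument.

```bash
#!/usr/bin/env bash
# Hypothetical dry run: show what the Go-style rewrite would change in FILE
# without modifying it.
# Usage: preview_links <permalink-base> <file>
preview_links() {
  local base=$1 file=$2
  sed -E "s|\`src/([^:\`]+):([0-9]+)(-[0-9]+)?\`|[src/\\1#L\\2](${base}/src/\\1#L\\2)|g" "$file" \
    | diff -u "$file" - || true   # diff exits 1 when there are changes
}
```

If the diff looks right, the same expression can be run with `sed -i` as in the script above.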