
# Python Patterns Process

This file documents the workflow used to build and refine this repo so the extraction can be repeated without guesswork.

## Goal

Turn repeated patterns from mature Python codebases into concise, prescriptive docs with verifiable citations.

## Scope split

Keep this repo at the **language/library design** level.

Good fits:
- module/public-surface design
- exception design
- sync vs async API boundaries
- typing and protocol design
- data-model design
- Python testing patterns

Push framework/service-boundary concerns into a separate repo instead of muddying this one.

## Upstream selection rule

Choose mature repos that expose different parts of the problem space.

Current first-wave set:
- `python/cpython`
- `encode/httpx`
- `pytest-dev/pytest`
- `pydantic/pydantic`

Current refinement set:
- `python-attrs/attrs`
- `pallets/click`
- `sqlalchemy/sqlalchemy`

Selection criteria:
- respected and maintained
- stable public APIs
- enough repetition that a conclusion rests on several independent instances, not a single example
- examples/tests/internal structure that reveal tradeoffs, not just happy paths

## Directory contract

- `sources/` = raw evidence notes, one file per upstream repo
- `patterns/` = synthesized guidance from repeated evidence
- `smells/` = anti-patterns derived from the same evidence base
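
The contract above can be scaffolded with a few lines of Python. This is a minimal sketch, assuming the repo root is passed in explicitly; the `scaffold` function and the `LAYOUT` mapping are hypothetical names, not part of this repo.

```python
from pathlib import Path

# Directory names come from the contract above; the mapping itself is illustrative.
LAYOUT = {
    "sources": "raw evidence notes, one file per upstream repo",
    "patterns": "synthesized guidance from repeated evidence",
    "smells": "anti-patterns derived from the same evidence base",
}


def scaffold(root: Path) -> list[Path]:
    """Create the three contract directories under `root` and return their paths."""
    created = []
    for name in LAYOUT:
        path = root / name
        # exist_ok makes the call safe to repeat on an existing repo.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `scaffold(Path("."))` from the repo root would create any missing directory and leave existing ones untouched.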

## Step-by-step workflow

### 1) Split the problem cleanly
Do not mix Python-wide guidance with FastAPI/service conventions in one repo.

### 2) Clone or otherwise make upstream sources available locally
Work from local checkouts so citations can be verified quickly.
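
With a local checkout, a citation can be checked mechanically. The sketch below assumes citations are written as `relative/path.py:LINE` relative to the checkout root; `check_citation` is a hypothetical helper, and it only confirms that the line exists, not that it still says what the note claims.

```python
from pathlib import Path


def check_citation(checkout: Path, citation: str) -> bool:
    """Return True if `citation` ("relative/path.py:LINE") points at an existing line."""
    rel_path, sep, line_text = citation.rpartition(":")
    if not sep or not line_text.isdecimal():
        return False  # not in the assumed path:line form
    line_no = int(line_text)
    target = checkout / rel_path
    if line_no < 1 or not target.is_file():
        return False
    # Count lines lazily so large files are not loaded into memory at once.
    with target.open(encoding="utf-8", errors="replace") as handle:
        line_count = sum(1 for _ in handle)
    return line_no <= line_count
```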

### 3) Write source notes first
For each upstream repo, create `sources/<repo>.md` with:
- why this repo is useful
- repeated patterns
- caveats/counterexamples
- exact `file:line` citations
- pattern candidates supported by the evidence

Important: source notes are not polished guidance. They are reusable evidence.
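
One way to keep source notes uniform is to start each from the same skeleton. This is a sketch only: the heading titles are assumptions derived from the list above, and `source_note_skeleton` is a hypothetical helper.

```python
# Heading titles are assumed; they mirror the five items listed above.
SOURCE_NOTE_SECTIONS = [
    "Why this repo is useful",
    "Repeated patterns",
    "Caveats and counterexamples",
    "Citations",
    "Pattern candidates",
]


def source_note_skeleton(repo: str) -> str:
    """Return an empty source-note document for `repo` as Markdown text."""
    lines = [f"# {repo}", ""]
    for section in SOURCE_NOTE_SECTIONS:
        # Each section starts with a TODO marker to be replaced by evidence.
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)
```

The function returns text and leaves the choice of file name to the caller.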

### 4) Synthesize only after evidence exists
Turn strong repeated signals into `patterns/*.md` docs.

Each pattern doc should usually include:
- what the pattern is
- why it exists
- when to use it
- when not to use it
- preferred shapes
- counterexamples
- source signals/citations
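
A pattern doc can be checked against this list before review. The sketch below assumes each item maps to a Markdown heading with the titles shown in `REQUIRED_SECTIONS`; both the titles and the `missing_sections` helper are hypothetical.

```python
import re

# Assumed heading titles, one per item in the list above.
REQUIRED_SECTIONS = [
    "What",
    "Why",
    "When to use",
    "When not to use",
    "Preferred shapes",
    "Counterexamples",
    "Sources",
]

# Matches ATX headings such as "## When to use" and captures the title.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown: str, required: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return the required section titles that have no matching heading."""
    present = {match.group(1).lower() for match in HEADING.finditer(markdown)}
    return [title for title in required if title.lower() not in present]
```

Because the doc says sections are "usually" included, a non-empty result is a prompt for review, not a failure.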

### 5) Mine anti-patterns explicitly
After the positive patterns are stable enough, write smell docs by inverting the evidence:
- broad catch-all exceptions with no recovery meaning
- hidden resource lifetime
- accidental public APIs
- fake sync wrappers over async code
- overuse of `Any`
- mixed transport/business/persistence responsibilities in one model
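
To make the first smell concrete, here is an illustrative pair written for this doc, not taken from any upstream repo. All names (`load_config_smelly`, `load_config`, `ConfigError`) are hypothetical.

```python
import json


def load_config_smelly(text: str) -> dict:
    # Smell: every failure, including caller bugs such as passing None,
    # collapses into an empty dict with no recovery meaning.
    try:
        return json.loads(text)
    except Exception:
        return {}


class ConfigError(ValueError):
    """Raised when configuration text cannot be used."""


def load_config(text: str) -> dict:
    # Catch only the failure this function knows how to describe,
    # and keep the original error attached as the cause.
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ConfigError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ConfigError("top-level value must be an object")
    return data
```

In the second version a `TypeError` from a bad argument still propagates, so programming errors stay visible.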

### 6) Run a refinement wave before broadening source coverage
Once the main docs exist, do **not** immediately add more repos.

Instead:
- improve synthesized docs in fresh contexts
- tighten weak citations
- rewrite source-note files to be denser and more reusable
- reduce duplicated guidance across docs

That was the right move for this repo after the first bootstrap wave.

## Fresh-context refinement pattern

A good refinement split is:
- one fresh pass over `patterns/*.md`
- one fresh pass over `sources/*.md`
- one fresh pass doing citation audit and anti-vagueness cleanup

The point is to avoid simply echoing earlier wording.

## Review checklist

### For source notes
- Does the file separate repeated patterns from one-off examples?
- Are caveats preserved?
- Are citations exact and easy to re-check?
- Does it avoid vague claims like “mature code usually...” unless evidence is shown?
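
The last check can be partly automated by scanning for hedging phrases. This is a rough sketch: only "mature code usually" comes from the checklist, the other entries in `VAGUE_PHRASES` are assumed examples, and `find_vague_lines` is a hypothetical helper. A hit means "look for evidence here", not "delete this line".

```python
import re

# Only the first phrase appears in the checklist; the rest are assumed examples.
VAGUE_PHRASES = ["mature code usually", "generally speaking", "best practice"]


def find_vague_lines(text: str, phrases: list[str] = VAGUE_PHRASES) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line containing a listed phrase."""
    pattern = re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]
```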

### For pattern docs
- Does each rule follow from repeated evidence?
- Is the guidance prescriptive without pretending there are no tradeoffs?
- Are counterexamples concrete?
- Is the doc still Python-level rather than framework-specific?

## Local git workflow used here

When the repo is ready for human review:
1. initialize a local git repo
2. stage the current documentation set
3. create a single initial commit so review has a stable baseline

This repo intentionally avoids pushing or creating remotes unless explicitly requested.
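
The three steps can be scripted as below. This is a sketch under stated assumptions: `git` is on `PATH`, a committer identity is already configured, and staging everything with `git add .` is an acceptable stand-in for "the current documentation set". The function names and the default commit message are hypothetical. No remote or push command is included.

```python
import subprocess
from pathlib import Path


def baseline_commands(message: str = "Initial documentation baseline") -> list[list[str]]:
    """Return the git commands for the three steps, in order."""
    return [
        ["git", "init"],
        ["git", "add", "."],  # assumption: everything in the repo is documentation
        ["git", "commit", "-m", message],
    ]


def create_baseline(repo: Path, message: str = "Initial documentation baseline") -> None:
    """Run the baseline commands inside `repo`, stopping at the first failure."""
    for command in baseline_commands(message):
        subprocess.run(command, cwd=repo, check=True)
```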

## How to repeat this process next time

1. Define the scope split first.
2. Pick a small, high-signal upstream set.
3. Build `sources/` before `patterns/`.
4. Synthesize the strongest topics first.
5. Add smell docs after positive patterns exist.
6. Run a fresh-context refinement wave.
7. Initialize git only once the repo is reviewable.

## What to avoid

- writing docs from memory
- mixing Python and framework guidance together
- broadening source coverage before tightening weak docs
- flattening caveats away during synthesis
- leaving citations too vague to verify quickly
- treating source notes as polished prose instead of evidence storage