# Python Patterns Process

This file documents the workflow used to build and refine this repo so the extraction can be repeated without guesswork.

## Goal

Turn repeated patterns from mature Python codebases into concise, prescriptive docs with verifiable citations.

## Scope split

Keep this repo at the **language/library design** level.

Good fits:
- module/public-surface design
- exception design
- sync vs async API boundaries
- typing and protocol design
- data-model design
- Python testing patterns

Push framework/service-boundary concerns into a separate repo instead of muddying this one.

## Upstream selection rule

Choose mature repos that expose different parts of the problem space.

Current first-wave set:
- `python/cpython`
- `encode/httpx`
- `pytest-dev/pytest`
- `pydantic/pydantic`

Current refinement set:
- `python-attrs/attrs`
- `pallets/click`
- `sqlalchemy/sqlalchemy`

Selection criteria:
- respected and maintained
- stable public APIs
- enough repetition to support non-hand-wavy conclusions
- examples/tests/internal structure that reveal tradeoffs, not just happy paths

## Directory contract

- `sources/` = raw evidence notes, one file per upstream repo
- `patterns/` = synthesized guidance from repeated evidence
- `smells/` = anti-patterns derived from the same evidence base

## Step-by-step workflow

### 1) Split the problem cleanly

Do not mix Python-wide guidance with FastAPI/service conventions in one repo.

### 2) Clone or otherwise make upstream sources available locally

Work from local checkouts so citations can be verified quickly.

### 3) Write source notes first

For each upstream repo, create `sources/.md` with:
- why this repo is useful
- repeated patterns
- caveats/counterexamples
- exact `file:line` citations
- pattern candidates supported by the evidence

Important: source notes are not polished guidance. They are reusable evidence.

### 4) Synthesize only after evidence exists

Turn strong repeated signals into `patterns/*.md` docs.

Each pattern doc should usually include:
- what the pattern is
- why it exists
- when to use it
- when not to use it
- preferred shapes
- counterexamples
- source signals/citations

### 5) Mine anti-patterns explicitly

After the positive patterns are stable enough, write smell docs by inverting the evidence:
- broad catch-all exceptions with no recovery meaning
- hidden resource lifetime
- accidental public APIs
- fake sync wrappers over async code
- overuse of `Any`
- mixed transport/business/persistence responsibilities in one model

### 6) Run a refinement wave before broadening source coverage

Once the main docs exist, do **not** immediately add more repos.

Instead:
- improve synthesized docs in fresh contexts
- tighten weak citations
- rewrite source-note files to be denser and more reusable
- reduce duplicated guidance across docs

That was the right move for this repo after the first bootstrap wave.

## Fresh-context refinement pattern

A good refinement split is:
- one fresh pass over `patterns/*.md`
- one fresh pass over `sources/*.md`
- one fresh pass doing citation audit and anti-vagueness cleanup

The point is to avoid simply echoing earlier wording.

## Review checklist

### For source notes

- Does the file separate repeated patterns from one-off examples?
- Are caveats preserved?
- Are citations exact and easy to re-check?
- Does it avoid vague claims like “mature code usually...” unless evidence is shown?

### For pattern docs

- Does each rule follow from repeated evidence?
- Is the guidance prescriptive without pretending there were no tradeoffs?
- Are counterexamples concrete?
- Is the doc still Python-level rather than framework-specific?

## Local git workflow used here

When the repo is ready for human review:

1. initialize a local git repo
2. stage the current documentation set
3. create a single initial commit so review has a stable baseline

This repo intentionally avoids pushing or creating remotes unless explicitly requested.

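The three steps above can be sketched in Python as follows. This is a minimal sketch, not the script used for this repo: the function names, the default commit message, and the choice of `git add --all` to stage everything are assumptions. It deliberately issues no `git remote` or `git push` commands.

```python
import subprocess
from pathlib import Path


def baseline_commands(message="Initial documentation baseline"):
    """Return the git commands for a local-only review baseline.

    Hypothetical helper: the commit message and the use of `--all`
    are illustrative choices, not something this repo prescribes.
    """
    return [
        ["git", "init"],                    # 1. initialize a local git repo
        ["git", "add", "--all"],            # 2. stage the current documentation set
        ["git", "commit", "-m", message],   # 3. single initial commit as the baseline
    ]


def create_baseline(repo_dir, message="Initial documentation baseline"):
    """Run the baseline commands inside repo_dir, stopping on the first failure."""
    for argv in baseline_commands(message):
        # check=True raises CalledProcessError if git exits non-zero,
        # for example when user.name / user.email are not configured.
        subprocess.run(argv, cwd=Path(repo_dir), check=True)
```

Keeping the command list separate from the runner makes it easy to review exactly what will be executed before anything touches the working tree.
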
## How to repeat this process next time

1. Define the scope split first.
2. Pick a small, high-signal upstream set.
3. Build `sources/` before `patterns/`.
4. Synthesize only the strongest topics first.
5. Add smell docs after positive patterns exist.
6. Run a fresh-context refinement wave.
7. Initialize git only once the repo is reviewable.

## What to avoid

- writing docs from memory
- mixing Python and framework guidance together
- broadening source coverage before tightening weak docs
- flattening caveats away during synthesis
- leaving citations too vague to verify quickly
- treating source notes as prose polish instead of evidence storage
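
One way to keep citations quick to verify is a small audit script run against a local checkout. The sketch below is illustrative only: it assumes citations are written in backticks as a relative path, a colon, and a 1-based line number (for example `` `pkg/module.py:42` ``), and the names `find_citations`, `check_citation`, and `audit_note` are hypothetical. It only checks that the file exists and the line number is in range; whether the cited line still supports the claim needs a human read.

```python
import re
from pathlib import Path

# Assumed citation shape: `relative/path.ext:LINE` inside backticks.
CITATION_RE = re.compile(r"`([\w./-]+\.\w+):(\d+)`")


def find_citations(note_text):
    """Return (path, line) pairs for every citation found in a note."""
    return [(path, int(line)) for path, line in CITATION_RE.findall(note_text)]


def check_citation(checkout_root, path, line):
    """Return a problem description, or None if the citation resolves."""
    target = Path(checkout_root) / path
    if not target.is_file():
        return f"{path}:{line}: file not found"
    with target.open(encoding="utf-8", errors="replace") as fh:
        line_count = sum(1 for _ in fh)
    if not 1 <= line <= line_count:
        return f"{path}:{line}: file has only {line_count} lines"
    return None


def audit_note(note_text, checkout_root):
    """Collect problems for all citations in one source note."""
    problems = []
    for path, line in find_citations(note_text):
        problem = check_citation(checkout_root, path, line)
        if problem is not None:
            problems.append(problem)
    return problems
```

A typical use would be reading one file from `sources/`, passing its text and the matching upstream checkout directory to `audit_note`, and fixing every reported citation before the next refinement pass.
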