# Python Patterns Process

This file documents the workflow used to build and refine this repo so the extraction can be repeated without guesswork.

## Goal

Turn repeated patterns from mature Python codebases into concise, prescriptive docs with verifiable citations.

## Scope split

Keep this repo at the **language/library design** level.

Good fits:

- module/public-surface design
- exception design
- sync vs async API boundaries
- typing and protocol design
- data-model design
- Python testing patterns

Push framework/service-boundary concerns into a separate repo instead of muddying this one.

## Upstream selection rule

Choose mature repos that expose different parts of the problem space.

Current first-wave set:

- `python/cpython`
- `encode/httpx`
- `pytest-dev/pytest`
- `pydantic/pydantic`

Current refinement set:

- `python-attrs/attrs`
- `pallets/click`
- `sqlalchemy/sqlalchemy`

Selection criteria:

- respected and maintained
- stable public APIs
- enough repetition to support non-hand-wavy conclusions
- examples/tests/internal structure that reveal tradeoffs, not just happy paths

## Directory contract

- `sources/` = raw evidence notes, one file per upstream repo
- `patterns/` = synthesized guidance from repeated evidence
- `comparison/` = short notes for places where a library or framework bends the generic Python rule
- `smells/` = anti-patterns derived from the same evidence base

## Step-by-step workflow

### 1) Split the problem cleanly

Do not mix Python-wide guidance with FastAPI/service conventions in one repo.

### 2) Clone or otherwise make upstream sources available locally

Work from local checkouts so citations can be verified quickly.

### 3) Write source notes first

For each upstream repo, create `sources/.md` with:

- why this repo is useful
- repeated patterns
- caveats/counterexamples
- exact `file:line` citations
- pattern candidates supported by the evidence

Important: source notes are not polished guidance. They are reusable evidence.

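The `file:line` citations asked for in step 3 can be re-checked mechanically against a local checkout. The following is a minimal sketch, not part of this repo's tooling. It assumes citations are written as `` `relative/path.ext:LINE` `` inside backticks; that format and the names `find_citations` and `check_citation` are hypothetical.

```python
import re
from pathlib import Path

# Assumed citation shape: `relative/path.ext:LINE` wrapped in backticks.
# This format is an assumption for illustration, not a rule of this repo.
CITATION_RE = re.compile(r"`([^`\s:]+\.[A-Za-z0-9]+):(\d+)`")


def find_citations(note_text: str) -> list[tuple[str, int]]:
    """Return (relative_path, line_number) pairs found in a source note."""
    return [(path, int(line)) for path, line in CITATION_RE.findall(note_text)]


def check_citation(checkout: Path, rel_path: str, line_no: int) -> bool:
    """True if the cited file exists in the checkout and contains that line."""
    target = checkout / rel_path
    if not target.is_file():
        return False
    with target.open(encoding="utf-8", errors="replace") as handle:
        line_count = sum(1 for _ in handle)
    return 1 <= line_no <= line_count
```

A check like this only confirms that the cited location exists. Whether the cited line actually supports the claim still needs a human read, and line numbers drift between upstream commits, so it may help to record the commit a note was written against.
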
### 4) Synthesize only after evidence exists

Turn strong repeated signals into `patterns/*.md` docs.

Use `comparison/*.md` only when the generic Python rule still matters, but a specific library or framework changes how it should be applied.

Each pattern doc should usually include:

- what the pattern is
- why it exists
- when to use it
- when not to use it
- preferred shapes
- counterexamples
- source signals/citations

### 5) Mine anti-patterns explicitly

After the positive patterns are stable enough, write smell docs by inverting the evidence:

- broad catch-all exceptions with no recovery meaning
- hidden resource lifetime
- accidental public APIs
- fake sync wrappers over async code
- overuse of `Any`
- mixed transport/business/persistence responsibilities in one model

### 6) Run a refinement wave before broadening source coverage

Once the main docs exist, do **not** immediately add more repos.

Instead:

- improve synthesized docs in fresh contexts
- tighten weak citations
- rewrite source-note files to be denser and more reusable
- reduce duplicated guidance across docs

## Handling mixed-concern source code

Upstream code will often mix generic Python design with framework-specific behavior in the same function, block, or line.

When that happens:

- do not classify the whole snippet by repo name or file path alone
- split the evidence into **Python-level signal** and **framework-owned signal**
- keep only the reusable language/library-design lesson in `python-patterns`
- move the framework-owned lesson into the framework repo or comparison note

If the claim depends on request parsing, dependency injection, response modeling, or HTTP error semantics, it no longer belongs here as a base Python rule.

## Review checklist

### For source notes

- Does the file separate repeated patterns from one-off examples?
- Are caveats preserved?
- Are citations exact and easy to re-check?
- Does it avoid vague claims like “mature code usually...” unless evidence is shown?

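The last source-note check, avoiding vague claims unless evidence is shown, can be partly automated as a review aid. The sketch below is only a rough heuristic under stated assumptions: the phrase list in `VAGUE_PHRASES` is hypothetical, the function name `flag_unsupported_claims` is made up for illustration, and "evidence" is approximated as a backticked `path.ext:LINE` citation on the same line.

```python
import re

# Hypothetical phrase list; tune it to the wording that shows up in real notes.
VAGUE_PHRASES = ("usually", "typically", "in general", "most projects")

# Assumed citation shape: `path/to/file.ext:LINE` wrapped in backticks.
CITATION_RE = re.compile(r"`[^`\s:]+\.[A-Za-z0-9]+:\d+`")


def flag_unsupported_claims(note_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for vague lines with no citation on them."""
    flagged = []
    for number, line in enumerate(note_text.splitlines(), start=1):
        lowered = line.lower()
        is_vague = any(phrase in lowered for phrase in VAGUE_PHRASES)
        if is_vague and not CITATION_RE.search(line):
            flagged.append((number, line.strip()))
    return flagged
```

Requiring the citation on the same line is deliberately crude: a claim whose evidence sits on the next bullet will still be flagged. Treat the output as prompts for a reviewer, not as a pass/fail gate.
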
### For pattern docs

- Does each rule follow from repeated evidence?
- Is the guidance prescriptive without pretending there were no tradeoffs?
- Are counterexamples concrete?
- Is the doc still Python-level rather than framework-specific?

## Local git workflow used here

When the repo is ready for human review:

1. initialize a local git repo
2. stage the current documentation set
3. create a single initial commit so review has a stable baseline

This repo intentionally avoids pushing or creating remotes unless explicitly requested.

## What to avoid

- writing docs from memory
- mixing Python and framework guidance together
- broadening source coverage before tightening weak docs
- flattening caveats away during synthesis
- leaving citations too vague to verify quickly
- treating source notes as prose polish instead of evidence storage