Initial extracted documentation set

Rodin
2026-06-01 21:42:05 +00:00
commit 60ffec18e4
17 changed files with 1590 additions and 0 deletions
+149
@@ -0,0 +1,149 @@
# Python Patterns Process
This file documents the workflow used to build and refine this repo so the extraction can be repeated without guesswork.
## Goal
Turn repeated patterns from mature Python codebases into concise, prescriptive docs with verifiable citations.
## Scope split
Keep this repo at the **language/library design** level.
Good fits:
- module/public-surface design
- exception design
- sync vs async API boundaries
- typing and protocol design
- data-model design
- Python testing patterns
Push framework/service-boundary concerns into a separate repo instead of muddying this one.
## Upstream selection rule
Choose mature repos that expose different parts of the problem space.
Current first-wave set:
- `python/cpython`
- `encode/httpx`
- `pytest-dev/pytest`
- `pydantic/pydantic`
Current refinement set:
- `python-attrs/attrs`
- `pallets/click`
- `sqlalchemy/sqlalchemy`
Selection criteria:
- respected and maintained
- stable public APIs
- enough repetition to support non-hand-wavy conclusions
- examples/tests/internal structure that reveal tradeoffs, not just happy paths
## Directory contract
- `sources/` = raw evidence notes, one file per upstream repo
- `patterns/` = synthesized guidance from repeated evidence
- `smells/` = anti-patterns derived from the same evidence base
## Step-by-step workflow
### 1) Split the problem cleanly
Do not mix Python-wide guidance with FastAPI/service conventions in one repo.
### 2) Clone or otherwise make upstream sources available locally
Work from local checkouts so citations can be verified quickly.
### 3) Write source notes first
For each upstream repo, create `sources/<repo>.md` with:
- why this repo is useful
- repeated patterns
- caveats/counterexamples
- exact `file:line` citations
- pattern candidates supported by the evidence
Important: source notes are not polished guidance. They are reusable evidence.
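Because source notes carry exact `file:line` citations, part of the re-check can be mechanical. The sketch below is one possible approach, not the tooling used for this repo: `CITATION_RE`, `find_citations`, and `check_citation` are hypothetical names, and it assumes citations are written in backticks as `path:line` or `path:start-end` relative to a local checkout root.

```python
import re
import tempfile
from pathlib import Path

# Assumed citation shape: `path/to/file.py:12` or `path/to/file.py:12-34`, in backticks.
CITATION_RE = re.compile(r"`([\w./-]+):(\d+)(?:-(\d+))?`")


def find_citations(markdown: str) -> list[tuple[str, int, int]]:
    """Return (path, start, end) for each citation found in the text."""
    found = []
    for path, start, end in CITATION_RE.findall(markdown):
        # A single-line citation has no end group; treat it as start == end.
        found.append((path, int(start), int(end or start)))
    return found


def check_citation(root: Path, path: str, start: int, end: int) -> bool:
    """True if the file exists under root and the line range fits inside it."""
    target = root / path
    if not target.is_file():
        return False
    line_count = len(target.read_text(encoding="utf-8").splitlines())
    return 1 <= start <= end <= line_count


# Small demonstration against a throwaway "checkout".
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "pkg" / "mod.py").write_text("a = 1\nb = 2\nc = 3\n", encoding="utf-8")
    note = "- `pkg/mod.py:1-3` defines names.\n- `pkg/mod.py:9` is stale.\n"
    results = {c: check_citation(root, *c) for c in find_citations(note)}
```

A check like this only proves the range exists. Whether the cited lines still say what the note claims remains a human review step.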
### 4) Synthesize only after evidence exists
Turn strong repeated signals into `patterns/*.md` docs.
Each pattern doc should usually include:
- what the pattern is
- why it exists
- when to use it
- when not to use it
- preferred shapes
- counterexamples
- source signals/citations
### 5) Mine anti-patterns explicitly
After the positive patterns are stable enough, write smell docs by inverting the evidence:
- broad catch-all exceptions with no recovery meaning
- hidden resource lifetime
- accidental public APIs
- fake sync wrappers over async code
- overuse of `Any`
- mixed transport/business/persistence responsibilities in one model
### 6) Run a refinement wave before broadening source coverage
Once the main docs exist, do **not** immediately add more repos.
Instead:
- improve synthesized docs in fresh contexts
- tighten weak citations
- rewrite source-note files to be denser and more reusable
- reduce duplicated guidance across docs
That was the right move for this repo after the first bootstrap wave.
## Fresh-context refinement pattern
A good refinement split is:
- one fresh pass over `patterns/*.md`
- one fresh pass over `sources/*.md`
- one fresh pass doing citation audit and anti-vagueness cleanup
The point is to avoid simply echoing earlier wording.
## Review checklist
### For source notes
- Does the file separate repeated patterns from one-off examples?
- Are caveats preserved?
- Are citations exact and easy to re-check?
- Does it avoid vague claims like “mature code usually...” unless evidence is shown?
### For pattern docs
- Does each rule follow from repeated evidence?
- Is the guidance prescriptive without pretending there were no tradeoffs?
- Are counterexamples concrete?
- Is the doc still Python-level rather than framework-specific?
## Local git workflow used here
When the repo is ready for human review:
1. initialize a local git repo
2. stage the current documentation set
3. create a single initial commit so review has a stable baseline
This repo intentionally avoids pushing or creating remotes unless explicitly requested.
## How to repeat this process next time
1. Define the scope split first.
2. Pick a small, high-signal upstream set.
3. Build `sources/` before `patterns/`.
4. Synthesize only the strongest topics first.
5. Add smell docs after positive patterns exist.
6. Run a fresh-context refinement wave.
7. Initialize git only once the repo is reviewable.
## What to avoid
- writing docs from memory
- mixing Python and framework guidance together
- broadening source coverage before tightening weak docs
- flattening caveats away during synthesis
- leaving citations too vague to verify quickly
- treating source notes as prose polish instead of evidence storage
+71
@@ -0,0 +1,71 @@
# Python Patterns
**Prescriptive.** Follow these when writing Python code.
This repo captures reusable Python patterns extracted from mature upstream codebases, then turns them into concise guidance with citations.
A good pattern doc here includes:
- **Why** — the reasoning, not just the rule
- **When to use** — the trigger conditions
- **When NOT to use** — where the pattern causes harm
- **Preferred shapes** — examples of the intended form
- **Counterexamples** — what to avoid and why
- **Source citations** — verified `file:line` anchors from real codebases
These docs should be derived from what strong Python codebases actually do, not from generic style-blog advice.
## Structure
- `patterns/` — what to do
- `smells/` — what to avoid
- `sources/` — extracted source-study notes and upstream references
- `PROCESS.md` — the repeatable extraction/refinement workflow used to build this repo
## Current source base
Primary upstreams mined so far:
- `python/cpython`
- `encode/httpx`
- `pytest-dev/pytest`
- `pydantic/pydantic`
- `python-attrs/attrs`
- `pallets/click`
- `sqlalchemy/sqlalchemy`
Why this mix works:
- CPython: API boundaries, exceptions, context managers, typing surface design
- HTTPX: package facade design, sync/async split, transport seams, error taxonomy
- pytest: fixture lifetime, parametrization, test ergonomics
- Pydantic: validation and serialization boundaries
- attrs: lightweight value-object/data-model patterns
- Click: CLI-facing exception and public-surface patterns
- SQLAlchemy: explicit persistence/session lifetime and sync-vs-async caveats
## Current pattern set
- `patterns/module-design.md`
- `patterns/typing.md`
- `patterns/error-handling.md`
- `patterns/async-boundaries.md`
- `patterns/testing.md`
- `patterns/data-models.md`
- `smells/common-mistakes.md`
## Reviewing this repo
Recommended review order:
1. `README.md`
2. `PROCESS.md`
3. `sources/*.md` for evidence quality
4. `patterns/*.md` for synthesis quality
5. `smells/*.md` for anti-pattern coverage
Questions to ask during review:
- Is each claim grounded in repeated upstream evidence?
- Are caveats preserved instead of flattened away?
- Does the pattern stay at the Python level rather than drifting into framework guidance?
- Are citations specific enough to be re-checked quickly?
## Extraction rule
Do not write pattern docs from memory. First collect repeated source examples with `file:line` citations, then synthesize the rule.
+117
@@ -0,0 +1,117 @@
# Async Boundaries
Keep sync and async APIs as separate, explicit surfaces when their semantics differ.
## Why
Async stops being an implementation detail as soon as it changes:
- how resources are acquired and released
- whether methods must be awaited
- which transport or session types are valid
- how cancellation behaves
- whether the caller needs an event loop
Trying to hide that behind one magical API usually makes things worse. The common failure mode is a fake sync wrapper over async internals: it breaks inside existing event loops, hides resource lifetime, and muddies cancellation.
Mature libraries usually accept the split and make it visible.
## The pattern
1. Expose separate sync and async entrypoints when semantics differ.
2. Keep their shapes parallel where that helps learnability.
3. Keep resource and transport types distinct.
4. Make lifecycle visible with normal sync and async context management.
5. Do not smuggle event-loop control into a sync-looking API.
## When to use
Use this when:
- your library owns network, database, filesystem, or other long-lived resources
- sync and async variants need different transport or session implementations
- callers need predictable lifetime control
- the library must work in many runtime environments
## When not to use
Do not split APIs just for symmetry.
Skip the split when:
- async adds no meaningful semantic difference
- the operation is trivial and one-shot
- a separate async API would mostly duplicate noise
But if the alternative requires hidden `asyncio.run(...)`, loop-detection tricks, or silent runtime switching, the split is probably the cleaner design.
## Preferred shape
```python
class Client:
    def get(self, url: str) -> Response:
        ...


class AsyncClient:
    async def get(self, url: str) -> Response:
        ...
```
Make the surfaces parallel enough to learn once, but distinct enough that their lifecycle stays honest.
## Why this works
- callers immediately know whether `await` is involved
- transport types stay correct
- connection pooling and cleanup remain visible
- cancellation behavior is not hidden behind sync-looking calls
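A minimal runnable sketch of the parallel shape, including lifecycle. Everything here is hypothetical scaffolding rather than any library's real API: `Response` is a stand-in type, `asyncio.sleep(0)` is a placeholder for real non-blocking I/O, and the `example.invalid` URL is never contacted.

```python
import asyncio


class Response:
    """Hypothetical stand-in for a real response type."""

    def __init__(self, url: str) -> None:
        self.url = url


class Client:
    """Sync surface: plain methods, plain `with`."""

    def __init__(self) -> None:
        self.closed = False

    def get(self, url: str) -> Response:
        return Response(url)

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "Client":
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.close()


class AsyncClient:
    """Async surface: same method names, but awaited and used with `async with`."""

    def __init__(self) -> None:
        self.closed = False

    async def get(self, url: str) -> Response:
        await asyncio.sleep(0)  # placeholder for real non-blocking I/O
        return Response(url)

    async def aclose(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "AsyncClient":
        return self

    async def __aexit__(self, *exc_info: object) -> None:
        await self.aclose()


def sync_usage() -> tuple[str, bool]:
    with Client() as client:
        response = client.get("https://example.invalid/")
    return response.url, client.closed


async def async_usage() -> tuple[str, bool]:
    async with AsyncClient() as client:
        response = await client.get("https://example.invalid/")
    return response.url, client.closed
```

The two call sites read almost the same, but the type alone tells the caller whether `await` and `async with` are required.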
## Counterexamples
### Fake sync wrapper over async internals
```python
import asyncio

def get(url: str) -> Response:
    return asyncio.run(_async_get(url))
```
This breaks in environments that already have an event loop and hides lifecycle costs.
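The failure is easy to reproduce with the standard library. In this sketch `_async_get` is a hypothetical coroutine that returns a string instead of a real response, and the explicit `coro.close()` exists only to keep the demonstration free of "never awaited" warnings.

```python
import asyncio


async def _async_get(url: str) -> str:
    await asyncio.sleep(0)  # placeholder for real async I/O
    return f"fetched {url}"


def get(url: str) -> str:
    # The fake sync wrapper from the counterexample.
    coro = _async_get(url)
    try:
        return asyncio.run(coro)
    finally:
        # No-op after a successful run; cleans up if asyncio.run refused to start.
        coro.close()


async def handler_inside_loop() -> str:
    # Stands in for any code already running under an event loop.
    try:
        return get("https://example.invalid/")
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


outside = get("https://example.invalid/")     # works: no loop is running yet
inside = asyncio.run(handler_inside_loop())   # same call, now inside a running loop
```

Outside a loop the wrapper works, which is why the problem often goes unnoticed until the library is called from async application code.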
### One class with mode flags
```python
client = Client(async_mode=True)
```
Now methods, cleanup, and caller expectations depend on ambient configuration instead of the type.
### Shared transport types that are not really shared
If one path needs `BaseTransport` and the other needs `AsyncBaseTransport`, pretending they are interchangeable is lying to the caller.
## Source signals
### HTTPX
- `httpx/_client.py:594-661` defines `Client` as the sync entrypoint and documents that it can be shared between threads.
- `httpx/_client.py:639-660` types the sync constructor in sync-native terms, including `BaseTransport`.
- `httpx/_client.py:1275-1304` uses normal sync context management for lifecycle.
- `httpx/_client.py:1307-1375` defines `AsyncClient` separately and documents task-sharing semantics.
- `httpx/_client.py:1316-1318` shows async usage with `async with` and `await`.
- `httpx/_client.py:1353-1374` keeps the constructor parallel but switches to async-native types like `AsyncBaseTransport`.
- `httpx/_client.py:1445-1452` initializes `AsyncHTTPTransport` on the async path rather than pretending the sync transport is reusable.
### SQLAlchemy
- `examples/asyncio/async_orm.py:15-18` imports async-specific engine and session primitives.
- `examples/asyncio/async_orm.py:61-67` creates an `async_sessionmaker(...)` and enters explicit async session and transaction scopes.
- `examples/asyncio/async_orm.py:78-104` uses async-native query and commit methods.
- `examples/inheritance/joined.py:16` imports the sync `Session` separately.
- `examples/inheritance/joined.py:93-120` shows the corresponding sync session lifetime and explicit commit boundary.
## Bottom line
If sync and async usage have different semantics, give them different types.
Parallel APIs are good.
Pretending the difference does not exist is not.
+140
@@ -0,0 +1,140 @@
# Data Models
Treat data models as explicit boundary objects, and keep simple internal state in small value-like types.
## Why
A lot of messy Python comes from making one class do four jobs at once:
- validate external input
- represent internal state
- serialize output
- own persistence or business behavior
Mature codebases usually split those concerns.
The recurring pattern is:
- use explicit boundary models for validated input and output
- keep serialization explicit
- use lightweight value-like classes for simple internal state
- avoid turning every data container into a god object
## The pattern
1. Use dedicated boundary models where data enters or leaves the system.
2. Validate and serialize explicitly.
3. Use lightweight declarative classes for data-heavy, behavior-light state.
4. Keep persistence, transport, and business logic from collapsing into one model.
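The four steps can be sketched with the standard library alone. This is an illustration of the shape, not a replacement for a validation library: `InvalidUser`, `parse_user`, and `user_to_wire` are hypothetical names, and the validation rules are deliberately simplistic.

```python
from dataclasses import asdict, dataclass


class InvalidUser(ValueError):
    """Raised when external input does not satisfy the boundary contract."""


@dataclass(frozen=True)
class User:
    email: str
    name: str


def parse_user(payload: dict[str, object]) -> User:
    """Boundary entry: validate untrusted input once, then return a plain value."""
    email = payload.get("email")
    name = payload.get("name")
    if not isinstance(email, str) or "@" not in email:
        raise InvalidUser("email must be a string containing '@'")
    if not isinstance(name, str) or not name.strip():
        raise InvalidUser("name must be a non-empty string")
    return User(email=email, name=name.strip())


def user_to_wire(user: User) -> dict[str, str]:
    """Boundary exit: serialization is one named step, not a method on every model."""
    return asdict(user)
```

Persistence and business behavior stay out of `User` entirely; code past the boundary only ever sees an already-validated value.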
## When to use
Use validated boundary models when:
- data enters from HTTP, files, queues, or user input
- you need a predictable serialization contract
- callers need one clear place for validation and defaults
Use small value-like objects when:
- the object mainly carries state
- immutability helps reasoning
- behavior is narrow and local
## When not to use
Do not make every internal object a heavyweight validation model.
Do not scatter custom `to_dict()` logic across the codebase.
Do not let one model become schema, ORM wrapper, validator, service object, and side-effect manager at the same time.
## Preferred shapes
### Explicit boundary model
```python
from pydantic import BaseModel

class UserIn(BaseModel):
    email: str
    name: str

# `payload` is untrusted external input, e.g. a decoded JSON dict.
user = UserIn.model_validate(payload)
serialized = user.model_dump()
```
Why this works:
- validation is explicit
- serialization is explicit
- the model's job is clear
### Small value-like internal class
```python
from dataclasses import dataclass

# `Instant` is another small value type, defined elsewhere.
@dataclass(frozen=True)
class Duration:
    start: Instant
    stop: Instant
```
Why this works:
- state is simple
- mutation is constrained
- the class stays easy to trust
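A self-contained version of the same shape, to show what "mutation is constrained" means in practice. The `Instant` here is a hypothetical minimal definition (a float of seconds) and `elapsed` is an illustrative addition; neither is taken from any upstream implementation.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Instant:
    """A point in time, in seconds (hypothetical minimal version)."""

    seconds: float


@dataclass(frozen=True)
class Duration:
    start: Instant
    stop: Instant

    @property
    def elapsed(self) -> float:
        # Narrow, local behavior derived purely from the stored state.
        return self.stop.seconds - self.start.seconds


d = Duration(Instant(10.0), Instant(12.5))
try:
    d.stop = Instant(99.0)  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Frozen dataclasses also get value equality for free, so two `Duration` objects with the same fields compare equal.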
## Counterexamples
### Hand-rolled serialization everywhere
```python
class User:
    def to_dict(self):
        return {
            "id": str(self.id),
            "name": self.name,
            "created_at": self.created_at.isoformat(),
        }
```
Every model now invents its own boundary behavior.
### One class owns every concern
```python
class OrderModel:
    def validate(self): ...
    def save(self): ...
    def send_webhook(self): ...
    def render_html(self): ...
```
That is not a model. That is a junk drawer.
## Source signals
### Pydantic
- `pydantic/main.py:253-264` says `BaseModel.__init__` parses and validates input data and raises `ValidationError` on bad input.
- `pydantic/main.py:455-519` defines `model_dump(...)` as an explicit serialization step with caller-controlled include/exclude behavior.
- `pydantic/main.py:721-768` defines `model_validate(...) -> Self` as a named boundary-crossing API.
- `docs/index.md:82-107` pairs model creation with `model_dump()` instead of treating instances as already wire-ready.
- `docs/index.md:109-124` shows invalid external input failing loudly with `ValidationError`.
### Attrs
- `docs/examples.md:24-44` shows `@define` creating lightweight typed classes with generated constructor, repr, and equality behavior.
- `docs/examples.md:143-205` uses keyword-only fields to keep construction explicit at the call site.
- `docs/examples.md:209-220` uses `asdict(...)` as an intentional conversion step.
### Pytest
- `src/_pytest/timing.py:24-64` models `Instant` and `Duration` as frozen dataclasses for simple internal timing state.
## Bottom line
Boundary models should validate and serialize cleanly.
Internal models should stay small and honest.
If one model starts owning every concern in the system, split it before it turns to mud.
+140
@@ -0,0 +1,140 @@
# Error Handling
Use exception types to encode what callers can do next.
## Why
Good Python libraries do not collapse every failure into `RuntimeError` or `Exception`. They shape errors around recovery boundaries:
- one base type for “something in this subsystem failed”
- narrower subtypes when callers need different recovery
- structured fields when the branch depends on data, not wording
That gives callers a clean ladder:
- catch broadly at subsystem boundaries
- catch narrowly when retry/report/ignore differs
- surface better API or CLI errors without parsing strings
## The pattern
1. Define a subsystem-level base exception.
2. Add subtypes only when callers need different handling.
3. Put structured context on the exception when branching depends on it.
4. Translate internal failures at API, CLI, or transport boundaries.
## When to use
Use this when:
- a module or library exposes a public API
- failure modes need different handling
- an outer boundary must turn internal failures into user-facing errors
- retry, ignore, and abort decisions differ by failure kind
## When not to use
Do not build a hierarchy when:
- the code is tiny and has one obvious failure mode
- every failure is handled the same way
- the only distinction is wording, not behavior
- a normal return value like `None` is already the contract
Do not make callers parse exception text. If the distinction matters, make it a type or a field.
## Good shape
```python
class MailError(Exception):
    pass


class TemporaryMailError(MailError):
    pass


class PermanentMailError(MailError):
    pass


class MailRejected(PermanentMailError):
    def __init__(self, code: int, reason: str) -> None:
        super().__init__(reason)
        self.code = code
        self.reason = reason
```
Caller:
```python
try:
    send_mail(message)
except TemporaryMailError:
    retry_later(message)
except PermanentMailError as exc:
    mark_failed(message, reason=str(exc))
```
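When the branch depends on data, read the field rather than the message. A runnable sketch under stated assumptions: `deliver`, the returned action strings, and the `MAILBOX_FULL` value are all hypothetical and chosen only for illustration.

```python
from typing import Callable


class MailError(Exception):
    """Base type for the mail subsystem."""


class TemporaryMailError(MailError):
    pass


class PermanentMailError(MailError):
    pass


class MailRejected(PermanentMailError):
    def __init__(self, code: int, reason: str) -> None:
        super().__init__(reason)
        self.code = code
        self.reason = reason


# Hypothetical status code, used only to show field-based branching.
MAILBOX_FULL = 552


def deliver(send: Callable[[str], None], message: str) -> str:
    """Return the next action, decided from exception types and fields only."""
    try:
        send(message)
    except TemporaryMailError:
        return "retry"
    except MailRejected as exc:
        # More specific than PermanentMailError, so it must be listed first.
        if exc.code == MAILBOX_FULL:
            return "notify-mailbox-full"
        return "mark-failed"
    except PermanentMailError:
        return "mark-failed"
    return "sent"
```

Rewording `reason` cannot change the outcome, because nothing in `deliver` inspects the text.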
## Counterexamples
### Stringly-typed branching
```python
try:
    do_work()
except Exception as exc:
    if "timeout" in str(exc).lower():
        retry()
```
The recovery rule is hiding in fragile text matching.
### One catch-all with no domain meaning
```python
class AppError(Exception):
    pass


raise AppError("not found")
raise AppError("permission denied")
raise AppError("timeout")
```
Callers cannot branch meaningfully.
### Boundary types leaking into the core
```python
from fastapi import HTTPException


def charge_card(card: Card) -> Receipt:
    if card.expired:
        raise HTTPException(status_code=400, detail="expired card")
```
This couples domain logic to one transport. Raise a domain error here; translate to HTTP at the boundary.
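One way the corrected version could look, using only the standard library. All names are hypothetical (`PaymentError`, `CardExpired`, `CardDeclined`, `handle_charge_request`), the status codes are illustrative choices, and the `(status, body)` tuple stands in for whatever response type the outer framework actually uses.

```python
from dataclasses import dataclass


class PaymentError(Exception):
    """Base type for the (hypothetical) payments subsystem."""


class CardExpired(PaymentError):
    pass


class CardDeclined(PaymentError):
    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


@dataclass(frozen=True)
class Card:
    number: str
    expired: bool = False


def charge_card(card: Card) -> str:
    """Domain code: knows nothing about HTTP."""
    if card.expired:
        raise CardExpired("expired card")
    return f"receipt-for-{card.number[-4:]}"


def handle_charge_request(card: Card) -> tuple[int, dict[str, str]]:
    """Transport boundary: the only place that maps domain errors to status codes."""
    try:
        receipt = charge_card(card)
    except CardExpired as exc:
        return 400, {"detail": str(exc)}
    except CardDeclined as exc:
        return 402, {"detail": exc.reason}
    except PaymentError:
        # Broad catch is acceptable here because this is the subsystem boundary.
        return 500, {"detail": "payment failed"}
    return 200, {"receipt": receipt}
```

A CLI or queue worker can reuse `charge_card` unchanged and supply its own translation layer.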
## Source signals
### Stdlib / CPython
- `Lib/smtplib.py:69-71` defines `SMTPException` as the module-wide base type.
- `Lib/smtplib.py:88-100` defines `SMTPResponseException` and stores structured fields on the exception itself: `smtp_code` and `smtp_error`.
- `Lib/smtplib.py:102-125` adds subtype-specific payload like `sender` on `SMTPSenderRefused` and `recipients` on `SMTPRecipientsRefused`.
### HTTPX
- `httpx/_exceptions.py:74-90` defines `HTTPError` as a broad catch point and explicitly documents it as useful around request + `raise_for_status()` flows.
- `httpx/_exceptions.py:107-125` narrows that into `RequestError` and `TransportError` for request-time failures.
- `httpx/_exceptions.py:132-160` further splits timeout handling into `ConnectTimeout`, `ReadTimeout`, `WriteTimeout`, and `PoolTimeout`.
### Click
- `src/click/exceptions.py:35-65` defines `ClickException` with behavior, not just categorization: an exit code and a `show()` renderer.
- `src/click/exceptions.py:68-111` makes `UsageError` a narrower subtype with a different exit code and help-aware output.
## Bottom line
If callers need different behavior, give them different exception types.
If callers need details, attach fields.
If an outer layer needs user-facing output, translate there instead of pushing boundary concerns through the whole codebase.
+127
@@ -0,0 +1,127 @@
# Module Design
Design a small, stable public surface. Keep implementation modules movable behind it.
## Why
Python makes it easy to publish internals by accident:
- every file is importable
- helpers become de facto API once users depend on them
- refactors turn into breaking changes when file layout becomes the contract
Mature libraries push back on that. They usually:
- publish one obvious import surface
- re-export supported names deliberately
- keep internal modules non-authoritative
- use `__all__` when the boundary needs to be explicit
That buys two things:
- simpler imports for callers
- freedom to reorganize internals later
## The pattern
1. Decide what callers should import.
2. Re-export those names from the package boundary.
3. Keep implementation details in internal modules.
4. Use `__all__` when you want an explicit contract.
5. Treat internal file layout as private unless you intentionally publish it.
## When to use
Use this when:
- a package spans multiple modules
- internals will evolve faster than the public API
- you want callers thinking in domain concepts, not filenames
- compatibility matters across releases
## When not to use
Do not build a facade when:
- the package is tiny and direct imports are already clear
- the abstraction boundary is still moving fast
- re-exporting would turn `__init__.py` into a junk drawer
A curated surface is not the same as a flat surface. Keep structure where the concepts are meaningfully different.
## Preferred shapes
### Package facade over internal modules
```python
# mypkg/__init__.py
from ._client import Client
from ._errors import AppError
from ._models import Item

__all__ = ["Client", "AppError", "Item"]
```
Why this works:
- callers learn one stable import path
- internals can move without import churn
- the package advertises its real contract
### Explicit module contract
```python
__all__ = ["Number", "Complex", "Real", "Rational", "Integral"]
```
This says: these names are supported; everything else is implementation detail.
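The effect of `__all__` on star imports can be checked directly. This sketch builds a throwaway in-memory module so it runs anywhere; `demo_numbers` and the names inside it are hypothetical.

```python
import sys
import types

# Source for a throwaway module with one supported name and two unsupported ones.
source = '''
__all__ = ["Number"]

class Number: ...

def _helper(): ...

def undocumented(): ...
'''

module = types.ModuleType("demo_numbers")
exec(source, module.__dict__)
sys.modules["demo_numbers"] = module  # make it importable by name

# Run a star import into a fresh namespace and see which names arrive.
namespace: dict[str, object] = {}
exec("from demo_numbers import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))

# __all__ is a declared contract, not access control: the name is still reachable.
still_reachable = hasattr(module, "undocumented")
```

Only `Number` arrives through the star import. `undocumented` remains importable by explicit name, so `__all__` documents the boundary rather than enforcing it.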
## Counterexamples
### File layout becomes the API by accident
```python
from mypkg.utils import helper_a
from mypkg.impl_v2 import thing
from mypkg.more_helpers import other_thing
```
Refactoring internal modules now breaks users.
### Everything dumped into `__init__.py`
If `__init__.py` exports fifty unrelated names, you did not create a clean facade. You created autocomplete noise.
### Public API mirrors the folder tree too literally
If callers need to know today's internal layout to use the library, the boundary is still underdesigned.
## Source signals
### CPython
- `Lib/numbers.py:8-23` warns that published ABC APIs are hard to change and should be designed carefully.
- `Lib/numbers.py:35` publishes a narrow `__all__` rather than treating every helper as public.
- `Lib/operator.py:13-15` and `Lib/smtplib.py:55-58` do the same in stdlib modules with mixed public/internal names.
### Pytest
- `src/pytest/__init__.py:6-80` builds the top-level `pytest` API by importing from many `_pytest.*` internals.
- `src/pytest/__init__.py:98-186` then pins that facade with an explicit `__all__`.
### HTTPX
- `httpx/__init__.py:1-12` re-exports the package surface from internal modules such as `._client`, `._exceptions`, and `._models`.
- `httpx/__init__.py:29-100` defines the supported top-level export list explicitly.
- `httpx/__init__.py:103-106` rewrites each exported object's `__module__` to `httpx`, reinforcing the facade instead of leaking internal filenames.
### Click
- `src/click/__init__.py:10-75` exposes the package through re-exports.
- `src/click/__init__.py:77-126` keeps compatibility shims and deprecations at the boundary instead of freezing old internal layout forever.
## Bottom line
Make the public API intentional.
Callers should depend on your concepts, not your current file tree.
+137
@@ -0,0 +1,137 @@
# Testing
Use fixtures for reusable resource setup, parametrization for behavior matrices, and explicit boundary seams instead of ad hoc mocking.
## Why
Good Python tests optimize for three things at once:
- local readability
- cheap variation across inputs and modes
- reusable setup and cleanup without hiding intent
The mature pattern is not just “use pytest.” It is:
- model resources with fixtures
- make fixture lifetime visible with `yield` when cleanup matters
- use parametrization when one behavior should hold across several inputs
- test through boundary seams like transports instead of patching internals blindly
## The pattern
1. Use fixtures for shared setup and resources.
2. Use `yield` fixtures when setup and teardown both matter.
3. Use parametrization when the assertion shape is the same but inputs vary.
4. Prefer explicit seams over invasive mocking.
5. Keep the test body focused on behavior, not scaffolding.
## When to use
Use fixtures when:
- multiple tests need the same resource wiring
- setup or cleanup would otherwise dominate the body
- the setup is a dependency, not the behavior under test
Use parametrization when:
- one behavior should hold across several inputs or modes
- the data varies but the test story stays the same
Use transport or injected seams when:
- the behavior crosses I/O boundaries
- you want realistic flow without spinning up the whole world
## When not to use
Do not hide essential behavior behind a giant fixture tower.
Do not parametrize cases that deserve different narratives or different assertions.
Do not call fixtures directly like helper functions; if you want a helper, write a helper.
Do not mock deep internals when a cleaner external seam exists.
## Preferred shapes
### Yield fixture for lifecycle
```python
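import pytest


@pytest.fixture
def resource():
    obj = make_resource()  # setup runs before the test
    yield obj              # the test receives `obj` here
    obj.close()            # teardown runs after the test finishes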
```
This keeps setup and teardown obvious.
### Parametrization for behavior matrices
```python
import pytest

@pytest.mark.parametrize("mode", ["prepend", "append", "importlib"])
def test_import_behavior(mode: str) -> None:
    ...
```
One behavior, several inputs, no duplicated body.
### Boundary seam instead of monkeypatch soup
```python
transport = httpx.MockTransport(handler)
client = httpx.Client(transport=transport)
```
This is usually cleaner than patching internals in three places.
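The same idea applies to code you own: accept the I/O seam as a parameter so tests can substitute it without patching. A standard-library sketch in which every name (`Transport`, `StatusChecker`, `make_fake_transport`) is hypothetical and no network access occurs:

```python
from typing import Callable

# A transport is anything that turns a URL into a (status, body) pair.
Transport = Callable[[str], tuple[int, str]]


class StatusChecker:
    """Hypothetical client that takes its I/O seam as a constructor argument."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def is_healthy(self, url: str) -> bool:
        status, _body = self._transport(url)
        return 200 <= status < 300


def make_fake_transport(responses: dict[str, int]) -> Transport:
    """Test double: answers from a dict instead of the network."""

    def transport(url: str) -> tuple[int, str]:
        # Unknown URLs behave like a missing resource.
        return responses.get(url, 404), ""

    return transport


checker = StatusChecker(
    make_fake_transport(
        {
            "https://example.invalid/ok": 200,
            "https://example.invalid/down": 503,
        }
    )
)
```

The test exercises the real `is_healthy` logic end to end; only the outermost I/O step is replaced.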
## Counterexamples
### Repeated setup in every test
```python
def test_a():
    client = make_client()
    tmpdir = make_tmpdir()
    seed_db()


def test_b():
    client = make_client()
    tmpdir = make_tmpdir()
    seed_db()
```
The scaffolding overwhelms the behavior.
### Fixture tower opacity
If understanding the test requires opening six fixtures before reading one assertion, the abstraction has gone too far.
### Calling fixtures directly
Pytest explicitly rejects this because fixtures are injected dependencies, not disguised utility functions.
## Source signals
### Pytest
- `src/_pytest/fixtures.py:1378-1440` makes the contract explicit twice: calling a fixture directly is an error, and `yield` fixtures run teardown code after the test outcome.
- `testing/test_threadexception.py:84-91` shows a real `yield` fixture with post-test cleanup work after the `yield`.
- `testing/acceptance_test.py:158-169` uses `@pytest.mark.parametrize(...)` to check one behavior across multiple import modes without cloning the test body.
- `testing/acceptance_test.py:561-574` shows another compact parametrized case where only the example data changes.
### HTTPX
- `httpx/_transports/asgi.py:63-83` exposes `ASGITransport` as an in-process integration seam and even documents `raise_app_exceptions=False` for testing 500 responses.
- `httpx/_transports/mock.py:15-43` exposes `MockTransport` as a first-class request/response seam for tests.
- `httpx/_client.py:639-660` accepts `transport=` directly on `Client`, which is what makes transport substitution a normal testing path instead of a hack.
## Bottom line
A good Python test makes the behavior easy to see and the environment cheap to vary.
Use fixtures for lifetime.
Use parametrization for variation.
Use explicit seams instead of brittle patching.
+119
@@ -0,0 +1,119 @@
# Typing
Use types to describe accepted shapes and behavioral contracts, not to pretend Python is a different language.
## Why
Good Python typing improves APIs in two ways:
- callers can see which shapes are accepted
- maintainers can preserve real boundaries without smearing `Any` everywhere
The mature pattern is not “make everything maximally abstract.” It is:
- use structural typing when capability matters more than inheritance
- use explicit aliases and unions for ergonomic public inputs
- keep public APIs typed even when internals stay dynamic
That gives users a real contract without freezing implementation choices too early.
## The pattern
1. Type public APIs precisely.
2. Prefer `Protocol` when callers care about behavior, not ancestry.
3. Use explicit unions and aliases for user-facing flexibility.
4. Keep dynamic internals from leaking into the public contract.
5. Avoid `Any` unless you truly mean “anything goes.”
## When to use
Use this when:
- multiple implementations can satisfy one behavioral need
- callers naturally have more than one valid input shape
- you want strong editor and type-checker help at public boundaries
- internals are dynamic but the public contract is still stable
## When not to use
Do not use a protocol when a concrete type is the real contract.
Do not use broad unions just to avoid choosing a better API.
Do not over-trust `@runtime_checkable`: CPython is explicit that runtime protocol checks verify only attribute presence, not signature correctness.
## Preferred shapes
### Structural typing for capability-based contracts
```python
from typing import Protocol

class Writer(Protocol):
    def write(self, data: bytes) -> int: ...
```
If the caller only needs `write()`, do not require a specific base class.
### Explicit flexible public inputs
```python
URLInput = URL | str
```
This is better than either extreme:
- forcing callers to pre-wrap everything
- accepting `Any` and hoping for the best
## Counterexamples
### Inheritance-only abstraction
```python
class BaseStore:
    ...


def persist(store: BaseStore) -> None:
    ...
```
This is too rigid when the function only needs a small capability surface.
### Type surrender
```python
from typing import Any

def send(data: Any, options: Any) -> Any:
    ...
```
The API contract disappeared.
### Runtime protocol overconfidence
If runtime safety matters, attribute-presence checks are not enough. Protocols do most of their work at static-analysis time.
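The limitation is easy to demonstrate. The classes below are hypothetical; the point is that `isinstance` against a `@runtime_checkable` protocol accepts an object whose `write` has the wrong signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Writer(Protocol):
    def write(self, data: bytes) -> int: ...


class GoodWriter:
    def write(self, data: bytes) -> int:
        return len(data)


class WrongSignature:
    # Has a `write` attribute, but it takes no data and returns nothing.
    def write(self) -> None:
        return None


class NoWrite:
    pass


checks = {
    "good": isinstance(GoodWriter(), Writer),
    "wrong_signature": isinstance(WrongSignature(), Writer),  # passes the runtime check
    "no_write": isinstance(NoWrite(), Writer),
}
```

A static type checker would reject `WrongSignature` where a `Writer` is expected; the runtime check cannot.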
## Source signals
### CPython / typing
- `Lib/typing.py:2132-2157` defines `Protocol` around structural subtyping and explicitly frames it as static duck typing.
- `Lib/typing.py:2155-2157` states that `@runtime_checkable` protocols check only attribute presence, ignoring type signatures.
- `Lib/typing.py:2190-2250` shows `Annotated` as “type plus metadata,” not a new underlying runtime type.
### HTTPX
- `httpx/_client.py:639-660` gives `Client` precise constructor types for auth, params, headers, cookies, timeouts, transports, and `base_url`.
- `httpx/_client.py:1353-1374` mirrors that precision on `AsyncClient` instead of collapsing to untyped arguments.
### Pydantic
- `pydantic/main.py:156-205` exposes typed `ClassVar[...]` metadata for config, fields, serializer, and validator state even though framework internals are dynamic.
- `pydantic/main.py:253-264` makes model construction validate `**data: Any` immediately instead of pretending arbitrary inputs are already safe.
- `pydantic/main.py:721-768` gives `model_validate(...) -> Self` an explicit typed boundary contract.
## Bottom line
Use typing to make public boundaries clearer.
Be flexible where callers need flexibility.
Be precise where contracts matter.
Do not hide uncertainty behind `Any`.
+198
@@ -0,0 +1,198 @@
# Common Mistakes in Python
This file captures recurring non-idiomatic patterns, especially code that reads like Java or TypeScript in a Python costume.
## 1. Boolean-flag methods that hide two behaviors
### Smell
```python
def fetch_user(user_id: str, include_deleted: bool = False) -> User | None:
    ...
```
### Why it is a smell
Boolean mode flags often mean one function is doing two jobs. They create call sites like `fetch_user(id, True)` that say almost nothing.
### Better shape
- split into separate functions when the behaviors are meaningfully different
- or promote the difference to a real enum or config object when there are several modes
## 2. Broad `except Exception:` without a boundary reason
### Smell
```python
try:
    do_work()
except Exception:
    return None
```
### Why it is a smell
This erases recovery meaning. Mature libraries like HTTPX and Click use exception hierarchies so callers can catch broadly or narrowly depending on what they can recover from.
### Better shape
- catch the specific failures you can handle
- use broad catch points only at real process, API, or CLI boundaries
- translate internal failures into a clear outer error contract
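One way these three points can fit together, as a sketch. The `PaymentError` hierarchy and the `charge` function are hypothetical; they only illustrate the shape:

```python
class PaymentError(Exception):
    """Broad catch point for this hypothetical subsystem."""


class PaymentTimeout(PaymentError):
    """Transient failure: callers may retry."""


class CardDeclined(PaymentError):
    """Permanent failure: retrying will not help."""


def charge(attempt: int) -> str:
    # Hypothetical operation that times out on the first attempt only.
    if attempt == 0:
        raise PaymentTimeout("gateway did not answer")
    return "charged"


def charge_with_retry(max_attempts: int = 2) -> str:
    for attempt in range(max_attempts):
        try:
            return charge(attempt)
        except PaymentTimeout:
            continue  # the only failure this layer knows how to recover from
    # Translate exhaustion into the outer error contract.
    raise PaymentError(f"gave up after {max_attempts} attempts")


def main() -> int:
    # Process boundary: the one place where a broad catch is justified.
    try:
        print(charge_with_retry())
    except PaymentError as exc:
        print(f"error: {exc}")
        return 1
    return 0
```

`CardDeclined` is deliberately not caught in the retry loop, so it propagates to the boundary unchanged.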
### Source signals
- `httpx/_exceptions.py:74-90` defines `HTTPError` as a broad catch point.
- `httpx/_exceptions.py:107-160` then narrows request, transport, and timeout failures for recovery.
- `src/click/exceptions.py:35-111` ties exception subtypes to different exit codes and help output.
## 3. Hidden I/O in constructors or properties
### Smell
```python
class User:
    def __init__(self, user_id: str):
        self.profile = requests.get(...).json()
```
### Why it is a smell
Object creation now performs surprising network I/O, which makes testing, lifetime, and failure handling muddy.
### Better shape
- keep I/O in explicit factory or boundary methods
- make resource acquisition visible
- separate data containers from loading logic
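A sketch of the explicit-factory shape. The `load` classmethod and the injected `fetch_profile` callable are illustrative assumptions; a real loader would perform the network call:

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    profile: dict[str, str]

    @classmethod
    def load(
        cls, user_id: str, fetch_profile: Callable[[str], dict[str, str]]
    ) -> User:
        # The I/O is visible in the method name and injectable for tests.
        return cls(user_id=user_id, profile=fetch_profile(user_id))


def fake_fetch_profile(user_id: str) -> dict[str, str]:
    # Hypothetical stand-in for a network call.
    return {"name": f"user-{user_id}"}


user = User("42", {"name": "Ada"})           # plain construction, no I/O
loaded = User.load("7", fake_fetch_profile)  # loading is an explicit step
```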
## 4. Import-time side effects
### Smell
```python
client = connect_to_db()
load_config_from_network()
```
### Why it is a smell
Importing a module should usually define names, not secretly talk to the outside world. Import-time side effects make startup order brittle and tests unpredictable.
### Better shape
- move initialization into explicit startup paths
- create resources in app setup, dependency wiring, or main entrypoints
## 5. Mixing validation, transport, and business logic in one class
### Smell
```python
class OrderModel(BaseModel):
    def save(self): ...
    def send_webhook(self): ...
    def render_html(self): ...
```
### Why it is a smell
Boundary schema, persistence logic, and business behavior cannot evolve independently anymore.
### Better shape
- let Pydantic, attrs, or dataclasses own data-shape concerns
- keep boundary translation explicit
- split long-lived behavior into services or domain objects when needed
### Source signals
- `pydantic/main.py:253-264` treats model construction as input validation.
- `pydantic/main.py:455-519` makes dumping an explicit serialization step.
- `src/_pytest/timing.py:24-64` uses frozen dataclasses for small internal value objects.
## 6. Fake sync wrappers around async code
### Smell
```python
def get(url: str) -> Response:
    return asyncio.run(async_get(url))
```
### Why it is a smell
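`asyncio.run()` raises `RuntimeError` when the calling thread already has a running event loop, so this wrapper fails as soon as it is called from async code. It also hides real async lifetime and cancellation semantics.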
### Better shape
- provide separate sync and async entrypoints when semantics differ
- keep them parallel, not secretly interchangeable
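The failure mode can be reproduced with the standard library alone. In this sketch `async_get` is a hypothetical coroutine that sleeps instead of doing network I/O:

```python
import asyncio


async def async_get(url: str) -> str:
    # Hypothetical async fetch; no real network I/O happens here.
    await asyncio.sleep(0)
    return f"response from {url}"


def get(url: str) -> str:
    # The smell: a sync facade that starts its own event loop.
    return asyncio.run(async_get(url))


async def handler() -> tuple[bool, str]:
    # Inside a running loop the wrapper raises RuntimeError. Python may
    # also warn that the abandoned coroutine was never awaited.
    try:
        get("https://example.invalid")
        wrapper_ok = True
    except RuntimeError:
        wrapper_ok = False
    # The async entrypoint has to be awaited directly instead.
    return wrapper_ok, await async_get("https://example.invalid")


wrapper_ok, body = asyncio.run(handler())
```

The wrapper works from plain synchronous code and fails from async code, which is exactly the kind of hidden mode a separate pair of entrypoints avoids.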
### Source signals
- `httpx/_client.py:594-661` and `httpx/_client.py:1307-1375` define separate `Client` and `AsyncClient` types.
- `httpx/_client.py:639-660` vs. `httpx/_client.py:1353-1374` keep the constructor shapes parallel while changing transport types.
- `examples/asyncio/async_orm.py:61-67` vs. `examples/inheritance/joined.py:93-120` show the same split in SQLAlchemy session usage.
## 7. Overuse of `Any`
### Smell
```python
def send(data: Any, options: Any) -> Any:
    ...
```
### Why it is a smell
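The API contract disappeared: when every parameter and the return value are `Any`, the signature no longer tells callers or type checkers what is accepted or returned.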
### Better shape
- use concrete types when the contract is specific
- use `Protocol` when callers care about capability
- use explicit unions or aliases for ergonomic flexibility
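A sketch of the same function with a stated contract. `SupportsWrite` and `Payload` are hypothetical names chosen for this example:

```python
import io
from typing import Protocol, Union


class SupportsWrite(Protocol):
    # Capability the caller must provide; any object with a matching
    # write() method satisfies it structurally.
    def write(self, data: bytes, /) -> int: ...


# An explicit alias documents the accepted input shapes.
Payload = Union[bytes, str]


def send(data: Payload, sink: SupportsWrite) -> int:
    raw = data.encode("utf-8") if isinstance(data, str) else data
    return sink.write(raw)


buffer = io.BytesIO()
written = send("hello", buffer)
```

`io.BytesIO` never inherits from `SupportsWrite`; it qualifies because it has the right method, which is the flexibility `Any` was being used to fake.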
### Source signals
- `Lib/typing.py:2132-2157` frames `Protocol` around structural subtyping.
- `Lib/typing.py:2155-2157` warns that runtime protocol checks ignore signatures.
- `httpx/_client.py:639-660` shows rich public typing instead of `Any`-shaped parameters.
## 8. Accidental public APIs through file layout
### Smell
```python
from mypkg.utils import helper_a
from mypkg.impl_v2 import thing
```
### Why it is a smell
Internal module layout becomes the public contract by accident, which makes refactors painful.
### Better shape
- publish a deliberate import surface
- re-export supported names
- use `__all__` when the boundary should be explicit
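The effect of `__all__` on star-imports can be checked directly. This sketch builds a hypothetical `mypkg` module from a source string so that it runs as a single file; in a real project the source would live in `mypkg/__init__.py`:

```python
import sys
import types

# Hypothetical package source: one supported name plus internal helpers.
SOURCE = '''
__all__ = ["connect"]

def _build_url(host):
    return f"db://{host}"

def helper_a(host):  # importable by path, but not part of the declared surface
    return _build_url(host)

def connect(host):
    return helper_a(host)
'''

mypkg = types.ModuleType("mypkg")
exec(SOURCE, mypkg.__dict__)
sys.modules["mypkg"] = mypkg

# A star-import only pulls in the names listed in __all__.
namespace: dict[str, object] = {}
exec("from mypkg import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Note that `__all__` declares intent rather than enforcing privacy: `from mypkg import helper_a` still works, so the underscore prefix and documentation remain part of the contract.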
### Source signals
- `src/pytest/__init__.py:6-80` builds a top-level facade over `_pytest.*` internals.
- `src/pytest/__init__.py:98-186` pins that facade with `__all__`.
- `httpx/__init__.py:1-12` and `httpx/__init__.py:29-106` do the same while also rewriting exported objects to appear under `httpx`.
## Heuristic
If the code:
- hides resource lifetime
- hides write boundaries
- hides which failures matter
- hides what the public API really is
…it is usually fighting Python instead of using it.
# Source Notes
This directory stores the reusable evidence behind the pattern docs.
## What belongs here
One note per upstream repo, with:
- why the repo is useful
- repeated patterns, not a reading diary
- caveats and counterexamples
- exact `file:line` anchors
- pattern candidates supported by the evidence
## Current notes
- `cpython.md`
- `httpx.md`
- `pytest.md`
- `pydantic.md`
- `attrs.md`
- `click.md`
- `sqlalchemy.md`
## Quality bar
A source note is good when it makes later synthesis cheap:
- repeated evidence is separated from one-off examples
- vague claims are avoided
- citations are fast to verify
- caveats survive instead of being flattened away
Read `../PROCESS.md` for the full repeatable workflow.
# Attrs source notes
Repo: `python-attrs/attrs`
Local checkout: `/home/ubuntu/repos/rodin-sources/attrs`
## Why this repo is useful
- `attrs` is a strong source for explicit data-carrier design: generated methods, constructor-shape control, and deliberate conversion at boundaries.
- Its docs are especially valuable because they include caveats and failure cases, not just happy-path examples.
## Declarative fields are the default for data-heavy classes
### Repeated evidence
- `docs/examples.md:24-44` shows `@define` with typed fields immediately generating constructor, repr, and equality behavior.
- `docs/examples.md:31-44` makes the generated behavior visible at the REPL rather than implicit.
- `docs/examples.md:51-58` shows the same declarative shape without relying on type annotations, via `field()`.
### Why it matters
Repeated signal: when a class mostly carries data, `attrs` prefers declaring fields and letting the library generate the boilerplate. The code emphasizes structure and invariants over handwritten dunder noise.
### Caveat / counterexample
`docs/examples.md:60-62` warns that declaring any attribute with `field()` but without a type annotation flips `attrs` into a no-typing mode, in which annotation-only attributes can be ignored. That is a sharp edge worth preserving in synthesis: declarative does not mean "mix styles freely."
## Keyword-only fields are used to protect call-site clarity and inheritance
### Repeated evidence
- `docs/examples.md:147-157` shows `field(kw_only=True)` forcing explicit construction at the call site.
- `docs/examples.md:159-172` shows decorator-level `@define(kw_only=True)` applying the same rule to the whole class.
- `docs/examples.md:176-191` shows the practical inheritance payoff: subclasses can add required fields even when the base class already has defaults.
- `docs/examples.md:193-205` shows the counterexample when `kw_only=True` is omitted: invalid attribute ordering raises a `ValueError`.
### Why it matters
Repeated signal: keyword-only fields are not just cosmetic. They are a tool for making constructor calls self-describing and for avoiding inheritance-order traps.
## Serialization is explicit and filterable
### Repeated evidence
- `docs/examples.md:211-217` shows `asdict(...)` as a deliberate conversion step from object to plain data.
- `docs/examples.md:219-235` shows `asdict(..., filter=...)` excluding sensitive fields like passwords.
- `docs/examples.md:238-253` shows built-in include/exclude helpers for more reusable serialization control.
### Why it matters
Repeated signal: even value-like objects are not assumed to be wire-ready. `attrs` makes the serialization boundary explicit and gives callers hooks to shape what crosses it.
## Pattern candidates supported by this repo
- use declarative field definitions for data-carrier classes
- prefer keyword-only construction when call-site clarity or inheritance safety matters
- keep serialization as an explicit step
- preserve caveats about mixed declaration styles and constructor ordering
# Click source notes
Repo: `pallets/click`
Local checkout: `/home/ubuntu/repos/rodin-sources/click`
## Why this repo is useful
- Click is a strong source for CLI API design: stable top-level exports, context-passing conventions, and user-facing exception behavior.
- It is especially useful because the implementation ties API design directly to operator experience at the terminal.
## The package root is a curated facade with compatibility shims
### Repeated evidence
- `src/click/__init__.py:10-75` re-exports commands, decorators, exceptions, types, and terminal helpers from internal modules.
- `src/click/__init__.py:77-124` uses `__getattr__` to keep deprecated compatibility names (`BaseCommand`, `MultiCommand`, `OptionParser`, `__version__`) working while emitting warnings.
### Why it matters
Repeated signal: mature libraries often keep the package root stable even while internal layout evolves. Click treats the package root as the user-facing API and places compatibility logic there deliberately.
### Caveat / counterexample
Compatibility shims are useful, but they are debt. Click's use of deprecation warnings is the important pattern: keep compatibility explicit and time-bounded rather than silently permanent.
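A sketch of the shim mechanism using module-level `__getattr__` (PEP 562). This is not Click's code: the `mycli` module and the `Command` and `LegacyCommand` names are hypothetical, and the module object is built by hand only to keep the example runnable as one file:

```python
import types
import warnings


class Command:
    """Current name in this hypothetical package."""


def _module_getattr(name: str):
    # Module-level __getattr__ runs only when normal attribute lookup fails.
    if name == "LegacyCommand":
        warnings.warn(
            "'LegacyCommand' is deprecated; use 'Command' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return Command
    raise AttributeError(f"module 'mycli' has no attribute {name!r}")


# In a real package this function would be named __getattr__ and live in
# mycli/__init__.py.
mycli = types.ModuleType("mycli")
mycli.Command = Command
mycli.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = mycli.LegacyCommand  # old name still resolves, with a warning
    current = mycli.Command       # supported name: no warning
```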
## Command state is passed through context objects, not globals
### Repeated evidence
- `docs/complex.md:53-61` explains that callbacks do not receive context unless they opt in, and that `Context.invoke` mediates invocation.
- `docs/complex.md:92-99` shows a root command storing application state on `ctx.obj`.
- `docs/complex.md:107-113` states directly that `Context.obj` is the place commands are supposed to remember what they need to pass to children.
- `src/click/decorators.py:51-93` implements `make_pass_decorator(...)` by searching the linked context chain for the nearest object of the desired type and invoking the callback with it.
### Why it matters
Repeated signal: Click favors explicit, nestable context propagation over module globals or hidden singletons. That matters for complex CLIs with subcommands and plugins.
### Caveat / counterexample
`docs/complex.md:143-163` points out the interleaved-command problem: plugin layers can replace `ctx.obj`. That is why `make_pass_decorator(...)` exists; plain `pass_obj` is not always enough once commands are nested by third parties.
## Exceptions encode user-visible behavior, not just categorization
### Repeated evidence
- `src/click/exceptions.py:35-65` defines `ClickException` with an exit code, cached color behavior, and a `show()` method for terminal rendering.
- `src/click/exceptions.py:68-111` defines `UsageError` with a different exit code and help-aware rendering that prints usage plus a "Try '--help'" hint when context is available.
- `src/click/exceptions.py:114-118` documents `BadParameter` as a subtype that gains parameter context automatically.
### Why it matters
Repeated signal: in CLI libraries, exceptions often need to carry exit semantics and presentation rules, not just messages. Click's hierarchy is built around what the operator should see next.
## Pattern candidates supported by this repo
- expose a stable package-level facade over internal modules
- use explicit compatibility shims with deprecation warnings
- pass CLI state through typed/named context objects rather than globals
- design exception types around exit behavior and user guidance
# CPython source notes
Repo: `python/cpython`
Local checkout: `/home/ubuntu/repos/rodin-sources/cpython`
## Why this repo is useful
- CPython is a strong source for patterns that survive long-term compatibility pressure.
- It is especially useful for API-boundary choices and error-shape choices because stdlib modules must stay understandable to a huge caller base.
- Caveat: stdlib code is not stylistically uniform. Treat repeated shapes across modules as signal; do not treat a single module's convention as "the Python way."
## Public API boundaries are often declared explicitly
### Repeated evidence
- `Lib/operator.py:13-20` declares a tight `__all__` list instead of exporting every helper or alias in the module namespace.
- `Lib/smtplib.py:55-58` does the same for the SMTP module, listing public exceptions and the `SMTP` client.
- `Lib/warnings.py:3-18` likewise enumerates the supported warning helpers instead of relying on incidental module globals.
### Why it matters
Repeated signal: when a module has helper names, aliases, or internal scaffolding, CPython often freezes the supported surface explicitly. That makes the public API a maintained decision rather than an accident of file layout.
### Caveat / counterexample
This is common, not universal. Some stdlib modules still expose names without a curated `__all__`, so the useful pattern is not "always add `__all__`" but "use it when the file contains more than the intended public surface."
## Exception trees are shaped around recovery needs, not just taxonomy
### Repeated evidence
- `Lib/smtplib.py:69-71` defines `SMTPException` as the broad catch point for the module.
- `Lib/smtplib.py:73-86` adds semantic subtypes for unsupported commands and disconnected-session failures.
- `Lib/smtplib.py:88-100` defines `SMTPResponseException` and stores structured payload on the exception itself via `smtp_code` and `smtp_error`.
- `Lib/smtplib.py:102-125` and `Lib/smtplib.py:128-142` narrow further into sender, recipient, data, connect, HELO, and auth failures.
### Why it matters
Repeated signal: CPython exception trees usually give callers two useful options at once:
- catch broadly at the subsystem boundary, or
- branch narrowly on structured failure data when recovery differs.
This is stronger than a flat set of unrelated exception classes and stronger than raising plain `OSError`/`ValueError` with message parsing.
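Both options can be exercised against the real `smtplib` classes without a server. In this sketch `describe_failure` and `simulate` are hypothetical helpers, and the exceptions are constructed by hand instead of coming from a live SMTP session:

```python
import smtplib


def describe_failure(exc: smtplib.SMTPException) -> str:
    # Narrow branch: structured fields, no message parsing.
    if isinstance(exc, smtplib.SMTPResponseException):
        return f"server replied {exc.smtp_code}"
    # Broad branch: anything else the module can raise.
    return "smtp failure"


def simulate(error: Exception) -> str:
    try:
        raise error
    except smtplib.SMTPException as exc:  # one catch point for the subsystem
        return describe_failure(exc)


refused = simulate(
    smtplib.SMTPSenderRefused(550, b"mailbox unavailable", "a@example.invalid")
)
dropped = simulate(smtplib.SMTPServerDisconnected("connection lost"))
```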
## Resource-owning helpers usually make lifetime visible
### Repeated evidence
- `Lib/contextlib.py:31-43` defines the abstract context-manager protocol directly in the stdlib.
- `Lib/tempfile.py:487-539` wraps temporary files so `__enter__` returns the wrapper and `__exit__` guarantees cleanup.
- `Lib/tempfile.py:758-761` also guards context entry on closed temporary files, showing that lifetime rules are enforced, not just documented.
### Why it matters
Repeated signal: when a stdlib helper owns cleanup-sensitive state, CPython prefers explicit context-manager boundaries over ambient cleanup assumptions.
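A short illustration with `tempfile.NamedTemporaryFile`, where the `with` block is the whole lifetime of the file:

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(mode="w+", suffix=".txt") as handle:
    path = handle.name
    handle.write("scratch data")
    handle.flush()
    existed_inside = os.path.exists(path)

# Leaving the block closed the handle and removed the file.
existed_after = os.path.exists(path)
closed_after = handle.closed
```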
## Pattern candidates supported by this repo
- declare public module surfaces explicitly when helper names would otherwise leak
- build exception hierarchies around caller recovery paths
- attach structured data to exceptions when callers need to branch without string parsing
- make resource lifetime obvious with context-manager boundaries
# HTTPX source notes
Repo: `encode/httpx`
Local checkout: `/home/ubuntu/repos/rodin-sources/httpx`
## Why this repo is useful
- HTTPX is a strong source for modern boundary-design patterns in Python libraries: sync/async separation, transport seams, and caller-oriented exception design.
- It is especially useful because the same conceptual API is implemented twice (sync and async), making repeated design choices easy to spot.
## Sync and async APIs are parallel types, not a mode flag
### Repeated evidence
- `httpx/_client.py:594-660` defines `Client` as the synchronous entrypoint with `BaseTransport` and thread-sharing semantics.
- `httpx/_client.py:1307-1374` defines `AsyncClient` separately with the same broad constructor shape but `AsyncBaseTransport` and task-sharing semantics.
- `httpx/_client.py:688-696` and `httpx/_client.py:1402-1410` show the same transport-initialization flow in each class, reinforcing that the APIs are intentionally parallel rather than conditionally branching inside one type.
### Why it matters
Repeated signal: mature Python networking libraries keep sync and async usage obviously separate in the type system while preserving familiar parameter shapes. That lowers cognitive overhead without hiding execution-model differences.
### Caveat / counterexample
The pattern is not "duplicate everything." HTTPX keeps shared behavior in `BaseClient`; the duplication is at the public entrypoint where transport types and calling style genuinely differ.
## Exceptions are layered for catch-broadly / recover-narrowly behavior
### Repeated evidence
- `httpx/_exceptions.py:74-90` makes `HTTPError` the top-level catch point and explicitly documents `try/except httpx.HTTPError` as a supported usage pattern.
- `httpx/_exceptions.py:107-120` defines `RequestError` for failures that occur while issuing a request and explains why request context may be attached later.
- `httpx/_exceptions.py:123-160` narrows transport failures into `TransportError` and timeout-specific subclasses.
- `httpx/_exceptions.py:167-178` continues the layering into network-specific failures.
### Why it matters
Repeated signal: the exception tree is organized around what callers do next:
- catch one broad library exception for "request failed somehow"
- catch a narrower transport or timeout subtype for retry/backoff behavior
- still access `exc.request` when request context has been attached
## Testing and embedding happen at the transport boundary
### Repeated evidence
- `httpx/_transports/asgi.py:63-97` exposes `ASGITransport` as a first-class transport for routing requests directly into an ASGI app.
- `httpx/_transports/asgi.py:78-83` explicitly calls out `raise_app_exceptions=False` for testing 500 responses instead of surfacing app exceptions immediately.
- `httpx/_transports/mock.py:15-43` defines `MockTransport` as a shared sync/async seam that accepts a handler and adapts it through the transport interface.
### Why it matters
Repeated signal: HTTPX prefers substitutable transports over monkeypatching internal request code. That is a strong pattern for any client library that needs both real I/O and test-time embedding.
### Caveat / counterexample
Transport seams are great for boundary tests, but they are not a full replacement for end-to-end network tests. The strong pattern is to make the boundary swappable, not to pretend boundary tests cover every production behavior.
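The seam itself can be sketched without HTTPX. Everything below is hypothetical (`Transport`, `CallableTransport`, `Client`, `fake_server`) and much simpler than the HTTPX transport interface; it only shows a client that depends on a substitutable transport:

```python
from typing import Callable, Protocol


class Transport(Protocol):
    def handle(self, method: str, url: str) -> tuple[int, str]: ...


class CallableTransport:
    """Test double: adapts a plain function to the transport interface."""

    def __init__(self, handler: Callable[[str, str], tuple[int, str]]) -> None:
        self._handler = handler

    def handle(self, method: str, url: str) -> tuple[int, str]:
        return self._handler(method, url)


class Client:
    def __init__(self, transport: Transport) -> None:
        # The client never opens sockets itself; it delegates to the seam.
        self._transport = transport

    def get(self, url: str) -> tuple[int, str]:
        return self._transport.handle("GET", url)


def fake_server(method: str, url: str) -> tuple[int, str]:
    return (200, "pong") if url.endswith("/ping") else (404, "not found")


client = Client(CallableTransport(fake_server))
```

Tests swap the transport at construction time, so no request internals need to be monkeypatched.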
## Pattern candidates supported by this repo
- split sync and async public APIs into separate types
- keep constructor shapes parallel across sync/async variants
- design exception trees around recovery decisions
- expose transport seams for testing, embedding, and alternate runtimes
# Pydantic source notes
Repo: `pydantic/pydantic`
Local checkout: `/home/ubuntu/repos/rodin-sources/pydantic`
## Why this repo is useful
- Pydantic is a strong source for boundary-object patterns: validating incoming data, preserving typed state, and serializing back out explicitly.
- It is also useful for validation-hook design because the docs distinguish several validator phases and call out their tradeoffs clearly.
## Models are explicit validation + serialization boundaries
### Repeated evidence
- `pydantic/main.py:119-145` defines `BaseModel` as the central abstraction and documents that models carry schema, field metadata, and decorator metadata.
- `pydantic/main.py:201-205` explicitly exposes model-level serializer and validator machinery as core parts of the abstraction.
- `docs/index.md:68-82` shows external data entering through model construction.
- `docs/index.md:82-89` immediately turns the model back into a plain data structure with `model_dump()`.
- `docs/index.md:109-152` shows invalid boundary data raising `ValidationError` with structured per-field errors instead of silently degrading.
### Why it matters
Repeated signal: Pydantic models are meant to sit at I/O boundaries. Input is validated/coerced at construction time; output is serialized through an explicit dump step.
### Caveat / counterexample
The strong pattern is not "models are your whole domain model." The evidence here is boundary-oriented: construct from external data, then call `model_dump()` when leaving the boundary again.
## Validators are narrow and phase-aware
### Repeated evidence
- `docs/concepts/validators.md:91-114` shows an `after` field validator that checks one parsed field and must return the validated value.
- `docs/concepts/validators.md:160-167` explains that `before` validators run prior to internal parsing and therefore receive raw input.
- `docs/concepts/validators.md:220-252` demonstrates a `before` validator that reshapes raw input and then lets normal item validation continue.
### Why it matters
Repeated signal: the best validator hooks are small in scope and explicit about phase:
- `before` for raw-input normalization
- `after` for post-parse invariants
This prevents validation logic from becoming an opaque second parser.
## Validator mode choice has real behavioral consequences
### Repeated evidence
- `docs/concepts/validators.md:160-164` warns that `before` validators should not mutate the input value in place if they may raise later, because the mutated value can be passed on to other validators when unions are involved.
- `docs/concepts/validators.md:254-255` states that `plain` validators terminate validation immediately.
- `docs/concepts/validators.md:273-283` shows the consequence directly: a `PlainValidator` can return `'invalid'` for a field annotated as `int`, and Pydantic will accept it.
### Why it matters
Repeated signal: validator mode is not just an implementation detail. It changes whether core type validation still runs.
### Caveat / counterexample
This is the sharpest anti-pattern in the repo: `plain` validators are powerful, but they can bypass the type guarantee a reader expects from the annotation. Use them only when terminating validation is the actual goal.
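The three modes can be compared side by side. This sketch assumes Pydantic v2 is installed; the `Order` and `Loose` models and their fields are hypothetical:

```python
from typing import Annotated, Any

from pydantic import BaseModel, PlainValidator, ValidationError, field_validator


class Order(BaseModel):
    tags: list[str]
    quantity: int

    @field_validator("tags", mode="before")
    @classmethod
    def split_csv(cls, value: Any) -> Any:
        # Raw input: normalize the shape, then normal list[str] validation runs.
        return value.split(",") if isinstance(value, str) else value

    @field_validator("quantity", mode="after")
    @classmethod
    def must_be_positive(cls, value: int) -> int:
        # Parsed input: the value is already an int here.
        if value <= 0:
            raise ValueError("quantity must be positive")
        return value


class Loose(BaseModel):
    # A plain validator replaces core validation, so the int annotation
    # is not enforced for this field.
    number: Annotated[int, PlainValidator(lambda v: v)]


order = Order(tags="a,b", quantity="3")
loose = Loose(number="invalid")

try:
    Order(tags=[], quantity=0)
except ValidationError as exc:
    error_count = exc.error_count()
```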
## Pattern candidates supported by this repo
- use typed models at I/O boundaries
- serialize explicitly with `model_dump()`
- keep validators field-scoped and phase-aware
- treat `plain` validators as an escape hatch, not the default
# Pytest source notes
Repo: `pytest-dev/pytest`
Local checkout: `/home/ubuntu/repos/rodin-sources/pytest`
## Why this repo is useful
- Pytest is a strong source for package-facade patterns and lifecycle-heavy test helper patterns.
- It is especially useful because the public package is deliberately small compared to the internal `_pytest.*` implementation tree.
## The public package is a curated facade over internal modules
### Repeated evidence
- `src/pytest/__init__.py:6-92` re-exports the supported testing API from many `_pytest.*` implementation modules.
- `src/pytest/__init__.py:98-186` defines `__all__` explicitly, turning the package root into a maintained public surface.
- `src/pytest/__init__.py:23-30` includes compatibility-minded exports like `yield_fixture`, showing that the facade also absorbs historical API pressure.
### Why it matters
Repeated signal: large libraries can keep internal structure fluid while giving users one stable import surface. The top-level package behaves like an API contract, not just a mirror of file layout.
### Caveat / counterexample
This pattern does create maintenance pressure: once the facade exports a name, deprecating or removing it becomes a public compatibility event. Pytest accepts that tradeoff intentionally.
## Fixture cleanup is modeled as an explicit lifetime boundary
### Repeated evidence
- `testing/test_monkeypatch.py:17-23` defines a fixture that snapshots global state, `yield`s the resource, then restores state after the test.
- `testing/test_threadexception.py:84-90` uses a `yield` fixture where the teardown action intentionally runs after test execution.
- `doc/en/how-to/fixtures.rst:551-553` marks `yield` fixtures as the recommended path.
- `doc/en/how-to/fixtures.rst:669-677` contrasts them with `addfinalizer`, explicitly framing finalizers as the alternative when needed.
### Why it matters
Repeated signal: pytest prefers resource lifetime that is legible in source order: setup before `yield`, teardown after `yield`. That shape scales better than hidden cleanup hooks.
### Caveat / counterexample
`request.addfinalizer(...)` still exists for cases where teardown must be registered dynamically, but pytest's own docs present it as the less straightforward option. That is important evidence that `yield` fixtures are the convention, not just one possible style.
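The setup, `yield`, teardown shape can be tried outside a test run with `contextlib.contextmanager`, used here as a standard-library analog of a `yield` fixture. The `temporary_env` helper and the `PATTERNS_DEMO_FLAG` variable are hypothetical. One difference from pytest: a context manager needs `try`/`finally` because the exception from the body is raised at the `yield`.

```python
import contextlib
import os
from collections.abc import Iterator


@contextlib.contextmanager
def temporary_env(name: str, value: str) -> Iterator[None]:
    previous = os.environ.get(name)  # setup: snapshot global state
    os.environ[name] = value
    try:
        yield  # the test body would run here
    finally:
        # teardown: restore the snapshot, in source order after the yield
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


with temporary_env("PATTERNS_DEMO_FLAG", "1"):
    inside = os.environ["PATTERNS_DEMO_FLAG"]

restored = os.environ.get("PATTERNS_DEMO_FLAG")
```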
## Failure expectations are explicit context boundaries
### Repeated evidence
- `testing/test_monkeypatch.py:31-32`, `testing/test_monkeypatch.py:50-51`, and `testing/test_monkeypatch.py:76-85` repeatedly use `with pytest.raises(...)` around the exact failing operation.
- `testing/acceptance_test.py:513-514` and `testing/test_pluginmanager.py:254-255` show the same shape in broader integration tests.
### Why it matters
Repeated signal: pytest encourages failure expectations that wrap the smallest relevant operation, keeping the expected failure boundary local and visible.
## Pattern candidates supported by this repo
- expose a stable top-level facade over private implementation packages
- use explicit `__all__`/re-export curation for public APIs
- model test resource lifetime with `yield` fixtures
- express expected failures with tight context-manager boundaries
# SQLAlchemy source notes
Repo: `sqlalchemy/sqlalchemy`
Local checkout: `/home/ubuntu/repos/rodin-sources/sqlalchemy`
## Why this repo is useful
- SQLAlchemy is a strong source for persistence-boundary patterns: explicit session lifetime, transaction visibility, and parallel sync/async APIs.
- It is especially useful because the examples make lifecycle boundaries visible in ordinary calling code rather than hiding them in framework glue.
## Sync and async persistence APIs are parallel but distinct
### Repeated evidence
- `examples/asyncio/async_orm.py:15-18` imports async-specific engine and session primitives.
- `examples/asyncio/async_orm.py:51-67` builds an async engine and `async_sessionmaker(...)`, then enters an async session and transaction block explicitly.
- `examples/inheritance/joined.py:7-17` imports sync engine/session primitives separately.
- `examples/inheritance/joined.py:90-120` uses `Session(engine)` with explicit add/commit calls in the synchronous path.
### Why it matters
Repeated signal: SQLAlchemy does not pretend sync and async persistence are the same execution model. The APIs are conceptually parallel, but the entrypoints stay distinct.
### Caveat / counterexample
The useful pattern is not "keep two unrelated APIs." `examples/asyncio/async_orm.py:78-79` explicitly notes that async execution uses the same 2.0-style ORM execution concepts as the sync API. The separation is at runtime model and lifecycle, not at overall mental model.
## Session and transaction lifetime are made visible in calling code
### Repeated evidence
- `examples/asyncio/async_orm.py:56-59` uses `engine.begin()` blocks for schema setup.
- `examples/asyncio/async_orm.py:65-74` nests `session.begin()` inside `async_session()` so write scope is easy to see.
- `examples/inheritance/joined.py:93-120` uses `with Session(engine) as session:` and makes the write boundary explicit with `session.add(...)` followed by `session.commit()`.
- `examples/inheritance/joined.py:133-135` shows a later mutation followed by another explicit `session.commit()` rather than hidden autoflush-as-commit semantics.
### Why it matters
Repeated signal: SQLAlchemy favors visible unit-of-work boundaries. You can usually point to the exact lines where a session starts, a transaction begins, and persistence becomes durable.
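The same idea of a visible write scope can be tried with the standard library's `sqlite3`, used here as a dependency-free analog and not as SQLAlchemy's API. The `item` table is hypothetical. A `sqlite3` connection used as a context manager commits on success and rolls back on an exception:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")
conn.commit()

with conn:  # write scope: commit on success, rollback on exception
    conn.execute("INSERT INTO item (name) VALUES (?)", ("kept",))

try:
    with conn:
        conn.execute("INSERT INTO item (name) VALUES (?)", ("discarded",))
        raise RuntimeError("simulated failure inside the write scope")
except RuntimeError:
    pass  # the insert above was rolled back

names = [row[0] for row in conn.execute("SELECT name FROM item ORDER BY name")]
conn.close()
```

As in the SQLAlchemy examples, a reader can point to the lines where each transaction begins and where it becomes durable.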
## Async examples surface async-specific caveats instead of hiding them
### Repeated evidence
- `examples/asyncio/async_orm.py:61-63` calls out `expire_on_commit=False` and explains the post-commit attribute-expiration consequence directly.
- `examples/asyncio/async_orm.py:75-80` notes that eager loading should be applied for relationship loading in the async example.
- `examples/asyncio/async_orm.py:106-107` shows the explicit `AsyncAttrs` path for lazy-loaded relationships via `awaitable_attrs`.
### Why it matters
Repeated signal: the async API is not just a renamed sync API. SQLAlchemy documents where async changes loading and object-lifetime behavior, which is exactly the kind of caveat future synthesis should preserve.
## Pattern candidates supported by this repo
- keep sync and async persistence entrypoints distinct but conceptually parallel
- make session and transaction scope visible in user code
- use explicit commit boundaries for writes
- preserve async-specific loading/lifecycle caveats rather than smoothing them over