Initial extracted documentation set

2026-06-01 21:42:05 +00:00
commit 60ffec18e4
17 changed files with 1590 additions and 0 deletions
@@ -0,0 +1,32 @@
+# Source Notes
+
+This directory stores the reusable evidence behind the pattern docs.
+
+## What belongs here
+
+One note per upstream repo, with:
+- why the repo is useful
+- repeated patterns, not a reading diary
+- caveats and counterexamples
+- exact `file:line` anchors
+- pattern candidates supported by the evidence
+
+## Current notes
+
+- `cpython.md`
+- `httpx.md`
+- `pytest.md`
+- `pydantic.md`
+- `attrs.md`
+- `click.md`
+- `sqlalchemy.md`
+
+## Quality bar
+
+A source note is good when it makes later synthesis cheap:
+- repeated evidence is separated from one-off examples
+- vague claims are avoided
+- citations are fast to verify
+- caveats survive instead of being flattened away
+
+Read `../PROCESS.md` for the full repeatable workflow.
@@ -0,0 +1,48 @@
+# Attrs source notes
+
+Repo: `python-attrs/attrs`
+Local checkout: `/home/ubuntu/repos/rodin-sources/attrs`
+
+## Why this repo is useful
+- `attrs` is a strong source for explicit data-carrier design: generated methods, constructor-shape control, and deliberate conversion at boundaries.
+- Its docs are especially valuable because they include caveats and failure cases, not just happy-path examples.
+
+## Declarative fields are the default for data-heavy classes
+
+### Repeated evidence
+- `docs/examples.md:24-44` shows `@define` with typed fields immediately generating constructor, repr, and equality behavior.
+- `docs/examples.md:31-44` makes the generated behavior visible at the REPL rather than implicit.
+- `docs/examples.md:51-58` shows the same declarative shape without relying on type annotations, via `field()`.
+
+### Why it matters
+Repeated signal: when a class mostly carries data, `attrs` prefers declaring fields and letting the library generate the boilerplate. The code emphasizes structure and invariants over handwritten dunder noise.
+
+### Caveat / counterexample
+`docs/examples.md:60-62` warns that mixing un-annotated `field()` declarations with annotated attributes flips `attrs` into a no-typing mode, in which annotation-only attributes can be ignored. That is a sharp edge worth preserving in synthesis: declarative does not mean "mix styles freely."
+
+## Keyword-only fields are used to protect call-site clarity and inheritance
+
+### Repeated evidence
+- `docs/examples.md:147-157` shows `field(kw_only=True)` forcing explicit construction at the call site.
+- `docs/examples.md:159-172` shows decorator-level `@define(kw_only=True)` applying the same rule to the whole class.
+- `docs/examples.md:176-191` shows the practical inheritance payoff: subclasses can add required fields even when the base class already has defaults.
+- `docs/examples.md:193-205` shows the counterexample when `kw_only=True` is omitted: invalid attribute ordering raises a `ValueError`.
+
+### Why it matters
+Repeated signal: keyword-only fields are not just cosmetic. They are a tool for making constructor calls self-describing and for avoiding inheritance-order traps.
+
+## Serialization is explicit and filterable
+
+### Repeated evidence
+- `docs/examples.md:211-217` shows `asdict(...)` as a deliberate conversion step from object to plain data.
+- `docs/examples.md:219-235` shows `asdict(..., filter=...)` excluding sensitive fields like passwords.
+- `docs/examples.md:238-253` shows built-in include/exclude helpers for more reusable serialization control.
+
+### Why it matters
+Repeated signal: even value-like objects are not assumed to be wire-ready. `attrs` makes the serialization boundary explicit and gives callers hooks to shape what crosses it.
+
+## Pattern candidates supported by this repo
+- use declarative field definitions for data-carrier classes
+- prefer keyword-only construction when call-site clarity or inheritance safety matters
+- keep serialization as an explicit step
+- preserve caveats about mixed declaration styles and constructor ordering
@@ -0,0 +1,50 @@
+# Click source notes
+
+Repo: `pallets/click`
+Local checkout: `/home/ubuntu/repos/rodin-sources/click`
+
+## Why this repo is useful
+- Click is a strong source for CLI API design: stable top-level exports, context-passing conventions, and user-facing exception behavior.
+- It is especially useful because the implementation ties API design directly to operator experience at the terminal.
+
+## The package root is a curated facade with compatibility shims
+
+### Repeated evidence
+- `src/click/__init__.py:10-75` re-exports commands, decorators, exceptions, types, and terminal helpers from internal modules.
+- `src/click/__init__.py:77-124` uses `__getattr__` to keep deprecated compatibility names (`BaseCommand`, `MultiCommand`, `OptionParser`, `__version__`) working while emitting warnings.
+
+### Why it matters
+Repeated signal: mature libraries often keep the package root stable even while internal layout evolves. Click treats the package root as the user-facing API and places compatibility logic there deliberately.
+
+### Caveat / counterexample
+Compatibility shims are useful, but they are debt. Click's use of deprecation warnings is the important pattern: keep compatibility explicit and time-bounded rather than silently permanent.
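A minimal sketch of the shim mechanism, assuming a hypothetical package called `toolkit` with a current name `Runner` and a deprecated alias `LegacyRunner`. It uses a module-level `__getattr__` (PEP 562) the way a package `__init__.py` could, but it is not Click's code and the names are invented for illustration.

```python
import types
import warnings

# Hypothetical package root; a real project would put this in __init__.py.
toolkit = types.ModuleType("toolkit")


class Runner:
    """Current public name."""


toolkit.Runner = Runner

# Deprecated name -> replacement object (hypothetical names).
_DEPRECATED = {"LegacyRunner": Runner}


def _module_getattr(name: str):
    # Called only when normal attribute lookup on the module fails.
    if name in _DEPRECATED:
        replacement = _DEPRECATED[name]
        warnings.warn(
            f"'{name}' is deprecated; use '{replacement.__name__}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return replacement
    raise AttributeError(f"module 'toolkit' has no attribute {name!r}")


toolkit.__getattr__ = _module_getattr
```

Because the hook only fires for names that are otherwise missing, the current API pays no lookup cost, and removing a shim later is a one-line change to the mapping.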
+
+## Command state is passed through context objects, not globals
+
+### Repeated evidence
+- `docs/complex.md:53-61` explains that callbacks do not receive context unless they opt in, and that `Context.invoke` mediates invocation.
+- `docs/complex.md:92-99` shows a root command storing application state on `ctx.obj`.
+- `docs/complex.md:107-113` states directly that `Context.obj` is the place commands are supposed to remember what they need to pass to children.
+- `src/click/decorators.py:51-93` implements `make_pass_decorator(...)` by searching the linked context chain for the nearest object of the desired type and invoking the callback with it.
+
+### Why it matters
+Repeated signal: Click favors explicit, nestable context propagation over module globals or hidden singletons. That matters for complex CLIs with subcommands and plugins.
+
+### Caveat / counterexample
+`docs/complex.md:143-163` points out the interleaved-command problem: plugin layers can replace `ctx.obj`. That is why `make_pass_decorator(...)` exists; plain `pass_obj` is not always enough once commands are nested by third parties.
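The nearest-object lookup can be modelled in a few lines. This is a simplified sketch, not Click's implementation: `Ctx`, `pass_nearest`, and `Repo` are hypothetical names, and the context is passed explicitly instead of being resolved from a current-context stack.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Any, Callable, Optional


@dataclass
class Ctx:
    """Hypothetical stand-in for a CLI context: one node in a parent chain."""

    obj: Any = None
    parent: Optional["Ctx"] = None

    def find_object(self, object_type: type) -> Any:
        # Walk from the current context outward to the root.
        node: Optional["Ctx"] = self
        while node is not None:
            if isinstance(node.obj, object_type):
                return node.obj
            node = node.parent
        return None


def pass_nearest(object_type: type) -> Callable:
    """Build a decorator that injects the nearest object of `object_type`."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(ctx: Ctx, *args: Any, **kwargs: Any) -> Any:
            found = ctx.find_object(object_type)
            if found is None:
                raise RuntimeError(f"no {object_type.__name__} in the context chain")
            return func(found, *args, **kwargs)

        return wrapper

    return decorator


class Repo:
    def __init__(self, home: str) -> None:
        self.home = home


@pass_nearest(Repo)
def show_home(repo: Repo) -> str:
    return repo.home


root = Ctx(obj=Repo("/srv/app"))
plugin = Ctx(obj={"plugin": "state"}, parent=root)  # a layer that replaced obj
result = show_home(plugin)
```

Searching by type lets the command find its state even though an intermediate layer stored something else on its own context.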
+
+## Exceptions encode user-visible behavior, not just categorization
+
+### Repeated evidence
+- `src/click/exceptions.py:35-65` defines `ClickException` with an exit code, cached color behavior, and a `show()` method for terminal rendering.
+- `src/click/exceptions.py:68-111` defines `UsageError` with a different exit code and help-aware rendering that prints usage plus a "Try '--help'" hint when context is available.
+- `src/click/exceptions.py:114-118` documents `BadParameter` as a subtype that gains parameter context automatically.
+
+### Why it matters
+Repeated signal: in CLI libraries, exceptions often need to carry exit semantics and presentation rules, not just messages. Click's hierarchy is built around what the operator should see next.
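A rough sketch of the same idea with hypothetical classes (`CliError`, `UsageProblem`, and `run` are invented here and are not Click's API): each exception type carries its own exit status and knows how to render itself for the operator.

```python
import io
from typing import Callable, Optional, TextIO


class CliError(Exception):
    """Hypothetical base error: carries an exit code and renders itself."""

    exit_code = 1

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message

    def show(self, file: TextIO) -> None:
        print(f"Error: {self.message}", file=file)


class UsageProblem(CliError):
    """Hypothetical usage error: different exit code plus a help hint."""

    exit_code = 2

    def __init__(self, message: str, command: Optional[str] = None) -> None:
        super().__init__(message)
        self.command = command

    def show(self, file: TextIO) -> None:
        if self.command is not None:
            print(f"Try '{self.command} --help' for help.", file=file)
        super().show(file)


def run(main: Callable[[], None], stderr: TextIO) -> int:
    """Top-level runner: turn errors into output plus an exit status."""
    try:
        main()
    except CliError as exc:
        exc.show(stderr)
        return exc.exit_code
    return 0


def bad_invocation() -> None:
    raise UsageProblem("missing argument NAME", command="tool")


captured = io.StringIO()
status = run(bad_invocation, captured)
```

The runner never inspects message text; the exception type alone decides both the exit status and what is printed.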
+
+## Pattern candidates supported by this repo
+- expose a stable package-level facade over internal modules
+- use explicit compatibility shims with deprecation warnings
+- pass CLI state through typed/named context objects rather than globals
+- design exception types around exit behavior and user guidance
@@ -0,0 +1,53 @@
+# CPython source notes
+
+Repo: `python/cpython`
+Local checkout: `/home/ubuntu/repos/rodin-sources/cpython`
+
+## Why this repo is useful
+- CPython is a strong source for patterns that survive long-term compatibility pressure.
+- It is especially useful for API-boundary choices and error-shape choices because stdlib modules must stay understandable to a huge caller base.
+- Caveat: stdlib code is not stylistically uniform. Treat repeated shapes across modules as signal; do not treat a single module's convention as "the Python way."
+
+## Public API boundaries are often declared explicitly
+
+### Repeated evidence
+- `Lib/operator.py:13-20` declares a tight `__all__` list instead of exporting every helper or alias in the module namespace.
+- `Lib/smtplib.py:55-58` does the same for the SMTP module, listing public exceptions and the `SMTP` client.
+- `Lib/warnings.py:3-18` likewise enumerates the supported warning helpers instead of relying on incidental module globals.
+
+### Why it matters
+Repeated signal: when a module has helper names, aliases, or internal scaffolding, CPython often freezes the supported surface explicitly. That makes the public API a maintained decision rather than an accident of file layout.
+
+### Caveat / counterexample
+This is common, not universal. Some stdlib modules still expose names without a curated `__all__`, so the useful pattern is not "always add `__all__`" but "use it when the file contains more than the intended public surface."
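The effect of a curated `__all__` can be checked directly against `operator`, one of the modules cited above. The snippet below is only an illustration of how `__all__` limits a star-import; the variable names are mine.

```python
import operator

# Names the module declares as its supported surface.
declared = set(operator.__all__)

# A star-import honours __all__, so only declared names are copied.
namespace: dict = {}
exec("from operator import *", namespace)
star_imported = {name for name in namespace if name != "__builtins__"}

# Dunder aliases such as operator.__add__ exist on the module
# but are not part of the declared surface.
undeclared = {name for name in vars(operator) if name not in declared}
```

Here `star_imported` matches `declared`, while aliases like `__add__` stay reachable by attribute access without being advertised as public.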
+
+## Exception trees are shaped around recovery needs, not just taxonomy
+
+### Repeated evidence
+- `Lib/smtplib.py:69-71` defines `SMTPException` as the broad catch point for the module.
+- `Lib/smtplib.py:73-86` adds semantic subtypes for unsupported commands and disconnected-session failures.
+- `Lib/smtplib.py:88-100` defines `SMTPResponseException` and stores structured payload on the exception itself via `smtp_code` and `smtp_error`.
+- `Lib/smtplib.py:102-125` and `Lib/smtplib.py:128-142` narrow further into sender, recipient, data, connect, HELO, and auth failures.
+
+### Why it matters
+Repeated signal: CPython exception trees usually give callers two useful options at once:
+- catch broadly at the subsystem boundary, or
+- branch narrowly on structured failure data when recovery differs.
+
+This is stronger than a flat set of unrelated exception classes, and stronger than raising plain `OSError`/`ValueError` and leaving callers to parse messages.
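A short sketch of how a caller might use the `smtplib` hierarchy, ordered from most specific to most general. The `describe` helper and its return strings are hypothetical; the exception classes and the `smtp_code` attribute are the stdlib's.

```python
import smtplib


def describe(exc: Exception) -> str:
    """Classify a failure from most specific to most general."""
    try:
        raise exc
    except smtplib.SMTPAuthenticationError as err:
        # Narrow branch: structured data, no message parsing needed.
        return f"check credentials (server code {err.smtp_code})"
    except smtplib.SMTPResponseException as err:
        return f"server replied {err.smtp_code}"
    except smtplib.SMTPServerDisconnected:
        return "reconnect and retry"
    except smtplib.SMTPException:
        # Broad branch: the catch point for the whole module.
        return "generic SMTP failure"


auth_outcome = describe(smtplib.SMTPAuthenticationError(535, b"Authentication failed"))
data_outcome = describe(smtplib.SMTPDataError(554, b"Rejected"))
```

Callers that only care whether sending failed can keep the last branch alone; callers with distinct recovery paths add the narrower ones above it.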
+
+## Resource-owning helpers usually make lifetime visible
+
+### Repeated evidence
+- `Lib/contextlib.py:31-43` defines the abstract context-manager protocol directly in the stdlib.
+- `Lib/tempfile.py:487-539` wraps temporary files so `__enter__` returns the wrapper and `__exit__` guarantees cleanup.
+- `Lib/tempfile.py:758-761` also guards context entry on closed temporary files, showing that lifetime rules are enforced, not just documented.
+
+### Why it matters
+Repeated signal: when a stdlib helper owns cleanup-sensitive state, CPython prefers explicit context-manager boundaries over ambient cleanup assumptions.
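Both halves of this (cleanup tied to the block, and entry rules being enforced) can be observed with `tempfile`. The sketch uses `TemporaryDirectory` and `SpooledTemporaryFile` as convenient examples; it does not claim these are the exact classes at the cited line ranges.

```python
import os
import tempfile

# The directory exists only inside the with-block.
with tempfile.TemporaryDirectory() as workdir:
    existed_inside = os.path.isdir(workdir)
exists_after = os.path.isdir(workdir)

# Entering a context on an already-closed spooled file is rejected.
spooled = tempfile.SpooledTemporaryFile()
spooled.close()
try:
    with spooled:
        reentry = "allowed"
except ValueError:
    reentry = "rejected"
```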
+
+## Pattern candidates supported by this repo
+- declare public module surfaces explicitly when helper names would otherwise leak
+- build exception hierarchies around caller recovery paths
+- attach structured data to exceptions when callers need to branch without string parsing
+- make resource lifetime obvious with context-manager boundaries
@@ -0,0 +1,54 @@
+# HTTPX source notes
+
+Repo: `encode/httpx`
+Local checkout: `/home/ubuntu/repos/rodin-sources/httpx`
+
+## Why this repo is useful
+- HTTPX is a strong source for modern boundary-design patterns in Python libraries: sync/async separation, transport seams, and caller-oriented exception design.
+- It is especially useful because the same conceptual API is implemented twice (sync and async), making repeated design choices easy to spot.
+
+## Sync and async APIs are parallel types, not a mode flag
+
+### Repeated evidence
+- `httpx/_client.py:594-660` defines `Client` as the synchronous entrypoint with `BaseTransport` and thread-sharing semantics.
+- `httpx/_client.py:1307-1374` defines `AsyncClient` separately with the same broad constructor shape but `AsyncBaseTransport` and task-sharing semantics.
+- `httpx/_client.py:688-696` and `httpx/_client.py:1402-1410` show the same transport-initialization flow in each class, reinforcing that the APIs are intentionally parallel rather than conditionally branching inside one type.
+
+### Why it matters
+Repeated signal: mature Python networking libraries keep sync and async usage obviously separate in the type system while preserving familiar parameter shapes. That lowers cognitive overhead without hiding execution-model differences.
+
+### Caveat / counterexample
+The pattern is not "duplicate everything." HTTPX keeps shared behavior in `BaseClient`; the duplication is at the public entrypoint where transport types and calling style genuinely differ.
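The shape can be sketched without HTTPX. The classes below are a hypothetical miniature (the transports are plain callables and the method names are invented): shared logic lives in a base class, while the two public entrypoints differ only where the execution model differs.

```python
import asyncio
from typing import Awaitable, Callable


class BaseClient:
    """Shared, execution-model-neutral behaviour (hypothetical)."""

    def __init__(self, base_url: str = "") -> None:
        self.base_url = base_url.rstrip("/")

    def build_url(self, path: str) -> str:
        return f"{self.base_url}/{path.lstrip('/')}"


class Client(BaseClient):
    """Synchronous entrypoint: takes a blocking transport."""

    def __init__(self, transport: Callable[[str], str], base_url: str = "") -> None:
        super().__init__(base_url)
        self._transport = transport

    def get(self, path: str) -> str:
        return self._transport(self.build_url(path))


class AsyncClient(BaseClient):
    """Asynchronous entrypoint: same constructor shape, awaitable transport."""

    def __init__(
        self, transport: Callable[[str], Awaitable[str]], base_url: str = ""
    ) -> None:
        super().__init__(base_url)
        self._transport = transport

    async def get(self, path: str) -> str:
        return await self._transport(self.build_url(path))


def sync_transport(url: str) -> str:
    return f"sync GET {url}"


async def async_transport(url: str) -> str:
    return f"async GET {url}"


sync_result = Client(sync_transport, base_url="https://example.org").get("/items")
async_result = asyncio.run(
    AsyncClient(async_transport, base_url="https://example.org").get("/items")
)
```

There is no `is_async` flag anywhere: the type a caller constructs already says which calling style applies.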
+
+## Exceptions are layered for catch-broadly / recover-narrowly behavior
+
+### Repeated evidence
+- `httpx/_exceptions.py:74-90` makes `HTTPError` the top-level catch point and explicitly documents `try/except httpx.HTTPError` as a supported usage pattern.
+- `httpx/_exceptions.py:107-120` defines `RequestError` for failures that occur while issuing a request and explains why request context may be attached later.
+- `httpx/_exceptions.py:123-160` narrows transport failures into `TransportError` and timeout-specific subclasses.
+- `httpx/_exceptions.py:167-178` continues the layering into network-specific failures.
+
+### Why it matters
+Repeated signal: the exception tree is organized around what callers do next:
+- catch one broad library exception for "request failed somehow"
+- catch a narrower transport or timeout subtype for retry/backoff behavior
+- still access `exc.request` when request context has been attached
+
+## Testing and embedding happen at the transport boundary
+
+### Repeated evidence
+- `httpx/_transports/asgi.py:63-97` exposes `ASGITransport` as a first-class transport for routing requests directly into an ASGI app.
+- `httpx/_transports/asgi.py:78-83` explicitly calls out `raise_app_exceptions=False` for testing 500 responses instead of surfacing app exceptions immediately.
+- `httpx/_transports/mock.py:15-43` defines `MockTransport` as a shared sync/async seam that accepts a handler and adapts it through the transport interface.
+
+### Why it matters
+Repeated signal: HTTPX prefers substitutable transports over monkeypatching internal request code. That is a strong pattern for any client library that needs both real I/O and test-time embedding.
+
+### Caveat / counterexample
+Transport seams are great for boundary tests, but they are not a full replacement for end-to-end network tests. The strong pattern is to make the boundary swappable, not to pretend boundary tests cover every production behavior.
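A minimal sketch of a swappable transport, assuming a hypothetical `handle_request(method, path)` interface and a `(status, body)` response tuple. `FakeTransport` and `ApiClient` are invented names, not HTTPX classes.

```python
from typing import Callable, List, Tuple

Response = Tuple[int, str]  # (status code, body); hypothetical shape
Handler = Callable[[str, str], Response]  # (method, path) -> Response


class FakeTransport:
    """Test double: delegates every request to a caller-supplied handler."""

    def __init__(self, handler: Handler) -> None:
        self._handler = handler
        self.calls: List[Tuple[str, str]] = []

    def handle_request(self, method: str, path: str) -> Response:
        self.calls.append((method, path))
        return self._handler(method, path)


class ApiClient:
    """Client code depends only on the transport interface."""

    def __init__(self, transport) -> None:
        self._transport = transport

    def is_healthy(self) -> bool:
        status, _body = self._transport.handle_request("GET", "/health")
        return status == 200


def failing_handler(method: str, path: str) -> Response:
    # Simulate a server-side failure without any network I/O.
    return 500, "internal error"


transport = FakeTransport(failing_handler)
healthy = ApiClient(transport).is_healthy()
```

The test exercises the client's real request path up to the boundary and replaces only what lies beyond it, so no internal function needs to be monkeypatched.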
+
+## Pattern candidates supported by this repo
+- split sync and async public APIs into separate types
+- keep constructor shapes parallel across sync/async variants
+- design exception trees around recovery decisions
+- expose transport seams for testing, embedding, and alternate runtimes
@@ -0,0 +1,56 @@
+# Pydantic source notes
+
+Repo: `pydantic/pydantic`
+Local checkout: `/home/ubuntu/repos/rodin-sources/pydantic`
+
+## Why this repo is useful
+- Pydantic is a strong source for boundary-object patterns: validating incoming data, preserving typed state, and serializing back out explicitly.
+- It is also useful for validation-hook design because the docs distinguish several validator phases and call out their tradeoffs clearly.
+
+## Models are explicit validation + serialization boundaries
+
+### Repeated evidence
+- `pydantic/main.py:119-145` defines `BaseModel` as the central abstraction and documents that models carry schema, field metadata, and decorator metadata.
+- `pydantic/main.py:201-205` explicitly exposes model-level serializer and validator machinery as core parts of the abstraction.
+- `docs/index.md:68-82` shows external data entering through model construction.
+- `docs/index.md:82-89` immediately turns the model back into a plain data structure with `model_dump()`.
+- `docs/index.md:109-152` shows invalid boundary data raising `ValidationError` with structured per-field errors instead of silently degrading.
+
+### Why it matters
+Repeated signal: Pydantic models are meant to sit at I/O boundaries. Input is validated/coerced at construction time; output is serialized through an explicit dump step.
+
+### Caveat / counterexample
+The strong pattern is not "models are your whole domain model." The evidence here is boundary-oriented: construct from external data, then call `model_dump()` when leaving the boundary again.
+
+## Validators are narrow and phase-aware
+
+### Repeated evidence
+- `docs/concepts/validators.md:91-114` shows an `after` field validator that checks one parsed field and must return the validated value.
+- `docs/concepts/validators.md:160-167` explains that `before` validators run prior to internal parsing and therefore receive raw input.
+- `docs/concepts/validators.md:220-252` demonstrates a `before` validator that reshapes raw input and then lets normal item validation continue.
+
+### Why it matters
+Repeated signal: the best validator hooks are small in scope and explicit about phase:
+- `before` for raw-input normalization
+- `after` for post-parse invariants
+
+This prevents validation logic from becoming an opaque second parser.
+
+## Validator mode choice has real behavioral consequences
+
+### Repeated evidence
+- `docs/concepts/validators.md:160-164` warns that a `before` validator should avoid mutating its input in place if it may raise later, especially with unions.
+- `docs/concepts/validators.md:254-255` states that `plain` validators terminate validation immediately.
+- `docs/concepts/validators.md:273-283` shows the consequence directly: a `PlainValidator` can return `'invalid'` for a field annotated as `int`, and Pydantic will accept it.
+
+### Why it matters
+Repeated signal: validator mode is not just an implementation detail. It changes whether core type validation still runs.
+
+### Caveat / counterexample
+This is the sharpest anti-pattern in the repo: `plain` validators are powerful, but they can bypass the type guarantee a reader expects from the annotation. Use them only when terminating validation is the actual goal.
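The difference between the modes can be modelled with a toy pipeline. This is a deliberately simplified sketch, not Pydantic's implementation: `parse_int` stands in for core validation of an `int` field, and `validate` with its `before`/`after`/`plain` parameters is hypothetical.

```python
from typing import Any, Callable, Optional

Hook = Optional[Callable[[Any], Any]]


def parse_int(value: Any) -> int:
    """Stand-in for core type validation of an `int` field."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise ValueError(f"not an integer: {value!r}")
    return int(value)  # raises ValueError for strings like 'invalid'


def validate(value: Any, *, before: Hook = None, after: Hook = None, plain: Hook = None) -> Any:
    if plain is not None:
        return plain(value)  # validation stops here; parse_int never runs
    if before is not None:
        value = before(value)  # sees raw input
    parsed = parse_int(value)
    if after is not None:
        parsed = after(parsed)  # sees the parsed, typed value
    return parsed


def strip_separators(raw: Any) -> Any:
    return raw.replace(",", "") if isinstance(raw, str) else raw


def require_even(number: int) -> int:
    if number % 2:
        raise ValueError(f"{number} is not even")
    return number


normalized = validate("1,000", before=strip_separators, after=require_even)
bypassed = validate("invalid", plain=lambda raw: raw)
```

In this model `normalized` is an `int`, while `bypassed` is still the string `'invalid'`: the plain hook returned before the type check had a chance to run, which is the consequence the notes warn about.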
+
+## Pattern candidates supported by this repo
+- use typed models at I/O boundaries
+- serialize explicitly with `model_dump()`
+- keep validators field-scoped and phase-aware
+- treat `plain` validators as an escape hatch, not the default
@@ -0,0 +1,50 @@
+# Pytest source notes
+
+Repo: `pytest-dev/pytest`
+Local checkout: `/home/ubuntu/repos/rodin-sources/pytest`
+
+## Why this repo is useful
+- Pytest is a strong source for package-facade patterns and lifecycle-heavy test helper patterns.
+- It is especially useful because the public package is deliberately small compared to the internal `_pytest.*` implementation tree.
+
+## The public package is a curated facade over internal modules
+
+### Repeated evidence
+- `src/pytest/__init__.py:6-92` re-exports the supported testing API from many `_pytest.*` implementation modules.
+- `src/pytest/__init__.py:98-186` defines `__all__` explicitly, turning the package root into a maintained public surface.
+- `src/pytest/__init__.py:23-30` includes compatibility-minded exports like `yield_fixture`, showing that the facade also absorbs historical API pressure.
+
+### Why it matters
+Repeated signal: large libraries can keep internal structure fluid while giving users one stable import surface. The top-level package behaves like an API contract, not just a mirror of file layout.
+
+### Caveat / counterexample
+This pattern does create maintenance pressure: once the facade exports a name, deprecating or removing it becomes a public compatibility event. Pytest accepts that tradeoff intentionally.
+
+## Fixture cleanup is modeled as an explicit lifetime boundary
+
+### Repeated evidence
+- `testing/test_monkeypatch.py:17-23` defines a fixture that snapshots global state, `yield`s the resource, then restores state after the test.
+- `testing/test_threadexception.py:84-90` uses a `yield` fixture where the teardown action intentionally runs after test execution.
+- `doc/en/how-to/fixtures.rst:551-553` marks `yield` fixtures as the recommended path.
+- `doc/en/how-to/fixtures.rst:669-677` contrasts them with `addfinalizer`, explicitly framing finalizers as the alternative when needed.
+
+### Why it matters
+Repeated signal: pytest prefers resource lifetime that is legible in source order: setup before `yield`, teardown after `yield`. That shape scales better than hidden cleanup hooks.
+
+### Caveat / counterexample
+`request.addfinalizer(...)` still exists for cases where teardown must be registered dynamically, but pytest's own docs present it as the less straightforward option. That is important evidence that `yield` fixtures are the convention, not just one possible style.
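The setup/`yield`/teardown shape can be shown with the standard library's `contextlib.contextmanager`, used here as a stand-in so the sketch runs without pytest. In a pytest suite the same generator body would sit under `@pytest.fixture` instead; the function name and the `SKETCH_ONLY_FLAG` variable are hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserved_environ():
    """Snapshot os.environ, hand control to the caller, then restore."""
    snapshot = dict(os.environ)  # setup: runs before the body
    try:
        yield os.environ
    finally:
        os.environ.clear()  # teardown: runs after the body, even on error
        os.environ.update(snapshot)


with preserved_environ() as env:
    env["SKETCH_ONLY_FLAG"] = "1"
    seen_inside = os.environ.get("SKETCH_ONLY_FLAG")
seen_after = os.environ.get("SKETCH_ONLY_FLAG")
```

Reading top to bottom gives the lifetime: everything above `yield` is setup, everything below is cleanup, and the resource is valid only in between.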
+
+## Failure expectations are explicit context boundaries
+
+### Repeated evidence
+- `testing/test_monkeypatch.py:31-32`, `testing/test_monkeypatch.py:50-51`, and `testing/test_monkeypatch.py:76-85` repeatedly use `with pytest.raises(...)` around the exact failing operation.
+- `testing/acceptance_test.py:513-514` and `testing/test_pluginmanager.py:254-255` show the same shape in broader integration tests.
+
+### Why it matters
+Repeated signal: pytest encourages failure expectations that wrap the smallest relevant operation, keeping the expected failure boundary local and visible.
+
+## Pattern candidates supported by this repo
+- expose a stable top-level facade over private implementation packages
+- use explicit `__all__`/re-export curation for public APIs
+- model test resource lifetime with `yield` fixtures
+- express expected failures with tight context-manager boundaries
@@ -0,0 +1,49 @@
+# SQLAlchemy source notes
+
+Repo: `sqlalchemy/sqlalchemy`
+Local checkout: `/home/ubuntu/repos/rodin-sources/sqlalchemy`
+
+## Why this repo is useful
+- SQLAlchemy is a strong source for persistence-boundary patterns: explicit session lifetime, transaction visibility, and parallel sync/async APIs.
+- It is especially useful because the examples make lifecycle boundaries visible in ordinary calling code rather than hiding them in framework glue.
+
+## Sync and async persistence APIs are parallel but distinct
+
+### Repeated evidence
+- `examples/asyncio/async_orm.py:15-18` imports async-specific engine and session primitives.
+- `examples/asyncio/async_orm.py:51-67` builds an async engine and `async_sessionmaker(...)`, then enters an async session and transaction block explicitly.
+- `examples/inheritance/joined.py:7-17` imports sync engine/session primitives separately.
+- `examples/inheritance/joined.py:90-120` uses `Session(engine)` with explicit add/commit calls in the synchronous path.
+
+### Why it matters
+Repeated signal: SQLAlchemy does not pretend sync and async persistence are the same execution model. The APIs are conceptually parallel, but the entrypoints stay distinct.
+
+### Caveat / counterexample
+The useful pattern is not "keep two unrelated APIs." `examples/asyncio/async_orm.py:78-79` explicitly notes that async execution uses the same 2.0-style ORM execution concepts as the sync API. The separation is at runtime model and lifecycle, not at overall mental model.
+
+## Session and transaction lifetime are made visible in calling code
+
+### Repeated evidence
+- `examples/asyncio/async_orm.py:56-59` uses `engine.begin()` blocks for schema setup.
+- `examples/asyncio/async_orm.py:65-74` nests `session.begin()` inside `async_session()` so write scope is easy to see.
+- `examples/inheritance/joined.py:93-120` uses `with Session(engine) as session:` and makes the write boundary explicit with `session.add(...)` followed by `session.commit()`.
+- `examples/inheritance/joined.py:133-135` shows a later mutation followed by another explicit `session.commit()` rather than hidden autoflush-as-commit semantics.
+
+### Why it matters
+Repeated signal: SQLAlchemy favors visible unit-of-work boundaries. You can usually point to the exact lines where a session starts, a transaction begins, and persistence becomes durable.
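The idea of a visible write boundary can be illustrated with the standard library's `sqlite3`, used here as a dependency-free stand-in and not as SQLAlchemy's API. A `with conn:` block marks the transaction scope: it commits on normal exit and rolls back if the block raises. The table and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")

# Transaction scope is the with-block: committed on normal exit.
with conn:
    conn.execute("INSERT INTO item VALUES (?)", ("kept",))

# Same scope, but an error inside the block rolls the write back.
try:
    with conn:
        conn.execute("INSERT INTO item VALUES (?)", ("discarded",))
        raise RuntimeError("abort this unit of work")
except RuntimeError:
    pass

rows = [row[0] for row in conn.execute("SELECT name FROM item ORDER BY name")]
conn.close()
```

A reader can point at the exact lines where each unit of work starts and where it becomes durable or is abandoned, which is the property the SQLAlchemy examples make visible with `Session` and `session.begin()`.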
+
+## Async examples surface async-specific caveats instead of hiding them
+
+### Repeated evidence
+- `examples/asyncio/async_orm.py:61-63` calls out `expire_on_commit=False` and explains the post-commit attribute-expiration consequence directly.
+- `examples/asyncio/async_orm.py:75-80` notes that relationships should be eager-loaded in the async example.
+- `examples/asyncio/async_orm.py:106-107` shows the explicit `AsyncAttrs` path for lazy-loaded relationships via `awaitable_attrs`.
+
+### Why it matters
+Repeated signal: the async API is not just a renamed sync API. SQLAlchemy documents where async changes loading and object-lifetime behavior, which is exactly the kind of caveat future synthesis should preserve.
+
+## Pattern candidates supported by this repo
+- keep sync and async persistence entrypoints distinct but conceptually parallel
+- make session and transaction scope visible in user code
+- use explicit commit boundaries for writes
+- preserve async-specific loading/lifecycle caveats rather than smoothing them over