Initial extracted documentation set

This commit is contained in:
Rodin
2026-06-01 21:42:05 +00:00
commit 60ffec18e4
17 changed files with 1590 additions and 0 deletions
@@ -0,0 +1,140 @@
# Data Models
Treat data models as explicit boundary objects, and keep simple internal state in small value-like types.
## Why
A lot of messy Python comes from making one class do four jobs at once:
- validate external input
- represent internal state
- serialize output
- own persistence or business behavior
Mature codebases usually split those concerns.
The recurring pattern is:
- use explicit boundary models for validated input and output
- keep serialization explicit
- use lightweight value-like classes for simple internal state
- avoid turning every data container into a god object
## The pattern
1. Use dedicated boundary models where data enters or leaves the system.
2. Validate and serialize explicitly.
3. Use lightweight declarative classes for data-heavy, behavior-light state.
4. Keep persistence, transport, and business logic from collapsing into one model.
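The four steps can be sketched end to end with only the standard library. This is an illustrative stand-in, not a recommendation to hand-write validation: `SignupRequest`, `Account`, `register`, and `account_to_wire` are hypothetical names, and the hand-written `parse` plays the role a validation library's model would normally fill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignupRequest:
    """Boundary model (hypothetical): the only place raw input is validated."""

    email: str
    name: str

    @classmethod
    def parse(cls, payload: dict) -> "SignupRequest":
        email = payload.get("email")
        name = payload.get("name")
        if not isinstance(email, str) or "@" not in email:
            raise ValueError("email must be a string containing '@'")
        if not isinstance(name, str) or not name.strip():
            raise ValueError("name must be a non-empty string")
        return cls(email=email.strip().lower(), name=name.strip())


@dataclass(frozen=True)
class Account:
    """Internal value-like state: no validation, persistence, or transport."""

    email: str
    display_name: str


def register(request: SignupRequest) -> Account:
    """Business logic lives in a plain function, not on either model."""
    return Account(email=request.email, display_name=request.name.title())


def account_to_wire(account: Account) -> dict:
    """Explicit serialization step for the outgoing boundary."""
    return {"email": account.email, "displayName": account.display_name}


request = SignupRequest.parse({"email": " Ada@Example.org ", "name": "ada lovelace"})
wire = account_to_wire(register(request))
```

Each object has one job: `SignupRequest` guards the way in, `Account` carries state, and `account_to_wire` defines the way out.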
## When to use
Use validated boundary models when:
- data enters from HTTP, files, queues, or user input
- you need a predictable serialization contract
- callers need one clear place for validation and defaults
Use small value-like objects when:
- the object mainly carries state
- immutability helps reasoning
- behavior is narrow and local
## When not to use
Do not make every internal object a heavyweight validation model.
Do not scatter custom `to_dict()` logic across the codebase.
Do not let one model become schema, ORM wrapper, validator, service object, and side-effect manager at the same time.
## Preferred shapes
### Explicit boundary model
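```python
from pydantic import BaseModel


class UserIn(BaseModel):
    email: str
    name: str


payload = {"email": "ada@example.org", "name": "Ada"}  # raw input, e.g. a decoded JSON body
user = UserIn.model_validate(payload)  # validate once, at the boundary
serialized = user.model_dump()  # serialize only when asked
```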
Why this works:
- validation is explicit
- serialization is explicit
- the model's job is clear
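A small sketch of the same boundary rejecting bad input, assuming Pydantic v2 is installed. `UserIn` mirrors the model above; the payloads are made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class UserIn(BaseModel):
    email: str
    name: str


# Valid input: validate once at the boundary, then serialize on request.
user = UserIn.model_validate({"email": "ada@example.org", "name": "Ada"})
full = user.model_dump()
email_only = user.model_dump(include={"email"})  # caller controls what leaves

# Invalid input: a missing field raises instead of producing a half-built object.
try:
    UserIn.model_validate({"email": "ada@example.org"})
except ValidationError as exc:
    missing_fields = [error["loc"] for error in exc.errors()]
else:
    missing_fields = []
```

The caller never has to guess whether a `UserIn` instance is valid: either construction succeeded or an exception named the offending field.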
### Small value-like internal class
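```python
from __future__ import annotations  # lets Duration name Instant without defining it here

from dataclasses import dataclass


@dataclass(frozen=True)
class Duration:
    start: Instant  # Instant is assumed to be another small frozen dataclass
    stop: Instant
```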
Why this works:
- state is simple
- mutation is constrained
- the class stays easy to trust
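To make the "mutation is constrained" point concrete, here is a minimal sketch using only the standard library. The `Instant` shown is an assumed stand-in with a single `time` field, and `elapsed` is a hypothetical helper; neither is copied from any library.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Instant:
    time: float  # seconds on some clock (assumed representation)


@dataclass(frozen=True)
class Duration:
    start: Instant
    stop: Instant

    @property
    def elapsed(self) -> float:
        """Narrow, local behavior derived purely from the stored state."""
        return self.stop.time - self.start.time


span = Duration(start=Instant(10.0), stop=Instant(12.5))

# Assignment is rejected, so a Duration cannot drift after creation.
try:
    span.stop = Instant(99.0)
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

# "Changing" a value means building a new one; the original is untouched.
longer = replace(span, stop=Instant(15.0))
```

Because frozen dataclasses also get generated equality and hashing, two `Duration` values with the same fields compare equal and can be used as dictionary keys or set members.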
## Counterexamples
### Hand-rolled serialization everywhere
```python
class User:
    def to_dict(self):
        return {
            "id": str(self.id),
            "name": self.name,
            "created_at": self.created_at.isoformat(),
        }
```
Every model now invents its own boundary behavior.
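One possible alternative, sketched with the standard library: keep the type-to-wire rules in a single function and pass every record through it. `to_wire` and this `User` dataclass are hypothetical names; a validation library's own dump step can serve the same purpose.

```python
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime, timezone
from uuid import UUID


@dataclass(frozen=True)
class User:
    id: UUID
    name: str
    created_at: datetime


def to_wire(value):
    """Single place that decides how internal types appear on the wire."""
    if is_dataclass(value) and not isinstance(value, type):
        return to_wire(asdict(value))
    if isinstance(value, dict):
        return {key: to_wire(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_wire(item) for item in value]
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, datetime):
        return value.isoformat()
    return value


user = User(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    name="Ada",
    created_at=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
wire_user = to_wire(user)
```

If the date format or ID encoding ever changes, there is one function to edit instead of one method per class.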
### One class owns every concern
```python
class OrderModel:
    def validate(self): ...
    def save(self): ...
    def send_webhook(self): ...
    def render_html(self): ...
```
That is not a model. That is a junk drawer.
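A hedged sketch of one way to split that class, using only the standard library. Every name here (`Order`, `validate_order`, `OrderRepository`, `send_order_webhook`, `render_order_html`) is hypothetical; the point is that each concern gets its own small home while the data object stays plain.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable


@dataclass(frozen=True)
class Order:
    """Plain state only."""

    order_id: str
    customer: str
    total_cents: int


def validate_order(order: Order) -> list[str]:
    """Validation returns problems instead of living on the model."""
    problems = []
    if not order.order_id:
        problems.append("order_id is required")
    if order.total_cents < 0:
        problems.append("total_cents must not be negative")
    return problems


class OrderRepository:
    """Persistence concern; an in-memory dict stands in for a real store."""

    def __init__(self) -> None:
        self._rows: dict[str, Order] = {}

    def save(self, order: Order) -> None:
        self._rows[order.order_id] = order

    def get(self, order_id: str) -> Order:
        return self._rows[order_id]


def send_order_webhook(order: Order, post: Callable[[dict], None]) -> None:
    """Side effect is injected, so the model never talks to the network."""
    post({"order_id": order.order_id, "total_cents": order.total_cents})


def render_order_html(order: Order) -> str:
    """Presentation concern; escapes user-controlled text."""
    return f"<p>{escape(order.customer)}: {order.total_cents / 100:.2f}</p>"


order = Order(order_id="A-1", customer="Ada & Co", total_cents=1250)
repo = OrderRepository()
sent: list[dict] = []
if not validate_order(order):
    repo.save(order)
    send_order_webhook(order, sent.append)  # list.append stands in for an HTTP call
html_snippet = render_order_html(order)
```

Each piece can now be tested or replaced without touching the others, and `Order` itself stays a value that is easy to construct in tests.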
## Source signals
### Pydantic
- `pydantic/main.py:253-264` says `BaseModel.__init__` parses and validates input data and raises `ValidationError` on bad input.
- `pydantic/main.py:455-519` defines `model_dump(...)` as an explicit serialization step with caller-controlled include/exclude behavior.
- `pydantic/main.py:721-768` defines `model_validate(...) -> Self` as a named boundary-crossing API.
- `docs/index.md:82-107` pairs model creation with `model_dump()` instead of treating instances as already wire-ready.
- `docs/index.md:109-124` shows invalid external input failing loudly with `ValidationError`.
### Attrs
- `docs/examples.md:24-44` shows `@define` creating lightweight typed classes with generated constructor, repr, and equality behavior.
- `docs/examples.md:143-205` uses keyword-only fields to keep construction explicit at the call site.
- `docs/examples.md:209-220` uses `asdict(...)` as an intentional conversion step.
### Pytest
- `src/_pytest/timing.py:24-64` models `Instant` and `Duration` as frozen dataclasses for simple internal timing state.
## Bottom line
Boundary models should validate and serialize cleanly.
Internal models should stay small and honest.
If one model starts owning every concern in the system, split it before it turns to mud.