# Data Models

Treat data models as explicit boundary objects, and keep simple internal state in small value-like types.

## Why

A lot of messy Python comes from making one class do four jobs at once:

- validate external input
- represent internal state
- serialize output
- own persistence or business behavior

Mature codebases usually split those concerns.

The recurring pattern is:

- use explicit boundary models for validated input and output
- keep serialization explicit
- use lightweight value-like classes for simple internal state
- avoid turning every data container into a god object

## The pattern

1. Use dedicated boundary models where data enters or leaves the system.
2. Validate and serialize explicitly.
3. Use lightweight declarative classes for data-heavy, behavior-light state.
4. Keep persistence, transport, and business logic from collapsing into one model.

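The four steps can be sketched with the standard library alone. Everything here is hypothetical and for illustration only: `SignupRequest`, `parse_signup`, `Account`, `account_to_wire`, and `InMemoryAccounts` are made-up names, and the hand-written checks stand in for what a validation library such as Pydantic would normally do.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SignupRequest:
    """Step 1: boundary model, the shape of data as it enters the system."""
    email: str
    name: str


def parse_signup(payload: dict) -> SignupRequest:
    """Step 2: validate explicitly, in one named place."""
    email = payload.get("email")
    name = payload.get("name")
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("email must be a string containing '@'")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("name must be a non-empty string")
    return SignupRequest(email=email.lower(), name=name.strip())


@dataclass(frozen=True)
class Account:
    """Step 3: internal state, data-heavy and behavior-light."""
    account_id: int
    email: str
    name: str


def account_to_wire(account: Account) -> dict:
    """Step 2 again: serialization is a deliberate, named call."""
    return asdict(account)


class InMemoryAccounts:
    """Step 4: persistence lives outside the models (a dict stands in for a database)."""

    def __init__(self) -> None:
        self._rows: dict[int, Account] = {}

    def add(self, request: SignupRequest) -> Account:
        account = Account(len(self._rows) + 1, request.email, request.name)
        self._rows[account.account_id] = account
        return account


accounts = InMemoryAccounts()
created = accounts.add(parse_signup({"email": "Ada@Example.com", "name": " Ada "}))
print(account_to_wire(created))
```

None of the three data-carrying pieces knows about the others' jobs: the boundary model does not store anything, the internal type does not parse anything, and the store does not decide what the wire format looks like.
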
## When to use

Use validated boundary models when:

- data enters from HTTP, files, queues, or user input
- you need a predictable serialization contract
- callers need one clear place for validation and defaults

Use small value-like objects when:

- the object mainly carries state
- immutability helps reasoning
- behavior is narrow and local

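A minimal sketch of the second case, using a frozen dataclass from the standard library. `Money` and its `plus` method are hypothetical names chosen for illustration; the point is that the object mostly carries state, cannot be mutated after construction, and has one narrow piece of local behavior.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Money:
    """Carries state; the only behavior is narrow and local."""
    amount_cents: int
    currency: str

    def plus(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        # Return a new value instead of mutating in place.
        return replace(self, amount_cents=self.amount_cents + other.amount_cents)


price = Money(1250, "EUR")
total = price.plus(Money(250, "EUR"))

# Equality and hashing are generated from the fields, so equal values
# compare equal and can be used as dict keys or set members.
assert total == Money(1500, "EUR")
assert len({price, Money(1250, "EUR")}) == 1

try:
    price.amount_cents = 0  # frozen: assignment is rejected
except FrozenInstanceError:
    print("price is unchanged:", price)
```

Because `price` can never change after it is built, any function that receives it can be read without wondering who else might be modifying it.
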
## When not to use

Do not make every internal object a heavyweight validation model.

Do not scatter custom `to_dict()` logic across the codebase.

Do not let one model become schema, ORM wrapper, validator, service object, and side-effect manager at the same time.

## Preferred shapes

### Explicit boundary model

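```python
from pydantic import BaseModel


class UserIn(BaseModel):
    email: str
    name: str


payload = {"email": "ada@example.com", "name": "Ada"}  # example external input
user = UserIn.model_validate(payload)  # validate where data enters
serialized = user.model_dump()  # serialize as an explicit step
```
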
Why this works:

- validation is explicit
- serialization is explicit
- the model’s job is clear

### Small value-like internal class

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    seconds: float  # hypothetical stand-in for a timestamp type


@dataclass(frozen=True)
class Duration:
    start: Instant
    stop: Instant
```

Why this works:

- state is simple
- mutation is constrained
- the class stays easy to trust

## Counterexamples

### Hand-rolled serialization everywhere

```python
class User:
    def to_dict(self):
        return {
            "id": str(self.id),
            "name": self.name,
            "created_at": self.created_at.isoformat(),
        }
```

Every model now invents its own boundary behavior.

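One way to avoid per-class `to_dict()` methods is a single conversion function shared by every internal type. The sketch below assumes the internal types are dataclasses; `to_wire` and `_encode` are hypothetical helper names, and a library such as Pydantic or attrs would normally fill this role.

```python
from dataclasses import asdict, dataclass, is_dataclass
from datetime import datetime, timezone
from uuid import UUID


def _encode(value):
    """Convert one value to a JSON-friendly form, recursing into containers."""
    if isinstance(value, UUID):
        return str(value)
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, dict):
        return {key: _encode(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [_encode(item) for item in value]
    return value


def to_wire(obj) -> dict:
    """The one place that decides how internal objects cross the boundary."""
    if not is_dataclass(obj) or isinstance(obj, type):
        raise TypeError(f"expected a dataclass instance, got {type(obj).__name__}")
    return _encode(asdict(obj))


@dataclass(frozen=True)
class User:
    id: UUID
    name: str
    created_at: datetime


user = User(
    id=UUID("12345678-1234-5678-1234-567812345678"),
    name="Ada",
    created_at=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(to_wire(user))
```

The rule for how a `UUID` or a `datetime` is written to the wire now lives in one function, so changing it does not mean hunting through every class.
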
### One class owns every concern

```python
class OrderModel:
    def validate(self): ...
    def save(self): ...
    def send_webhook(self): ...
    def render_html(self): ...
```

That is not a model. That is a junk drawer.

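A hedged sketch of one way to split such a class. The names `Order`, `validate_order`, `OrderStore`, and `render_order_html` are hypothetical, and an in-memory dict stands in for real persistence; the point is only that each concern gets its own small home.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Order:
    """Internal state only: no I/O, no rendering."""
    order_id: int
    customer: str
    total_cents: int


def validate_order(order: Order) -> None:
    """Business rule check, kept as a plain function."""
    if order.total_cents < 0:
        raise ValueError("total_cents must not be negative")


class OrderStore:
    """Persistence concern; an in-memory dict stands in for a database."""

    def __init__(self) -> None:
        self._orders: dict[int, Order] = {}

    def save(self, order: Order) -> None:
        validate_order(order)
        self._orders[order.order_id] = order

    def get(self, order_id: int) -> Order:
        return self._orders[order_id]


def render_order_html(order: Order) -> str:
    """Presentation concern; escapes user-controlled text."""
    return f"<p>Order {order.order_id} for {escape(order.customer)}</p>"


store = OrderStore()
store.save(Order(order_id=1, customer="Ada & Co", total_cents=4200))
print(render_order_html(store.get(1)))
```

Webhook delivery is left out of the sketch; it would be one more function or service that takes an `Order`, not another method on the data class.
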
## Source signals

### Pydantic

- `pydantic/main.py:253-264` says `BaseModel.__init__` parses and validates input data and raises `ValidationError` on bad input.
- `pydantic/main.py:455-519` defines `model_dump(...)` as an explicit serialization step with caller-controlled include/exclude behavior.
- `pydantic/main.py:721-768` defines `model_validate(...) -> Self` as a named boundary-crossing API.
- `docs/index.md:82-107` pairs model creation with `model_dump()` instead of treating instances as already wire-ready.
- `docs/index.md:109-124` shows invalid external input failing loudly with `ValidationError`.

### Attrs

- `docs/examples.md:24-44` shows `@define` creating lightweight typed classes with generated constructor, repr, and equality behavior.
- `docs/examples.md:143-205` uses keyword-only fields to keep construction explicit at the call site.
- `docs/examples.md:209-220` uses `asdict(...)` as an intentional conversion step.

### Pytest

- `src/_pytest/timing.py:24-64` models `Instant` and `Duration` as frozen dataclasses for simple internal timing state.

## Bottom line

Boundary models should validate and serialize cleanly.

Internal models should stay small and honest.

If one model starts owning every concern in the system, split it before it turns to mud.