Pydantic source notes
Repo: pydantic/pydantic
Local checkout: /home/ubuntu/repos/rodin-bootstrap/upstream/pydantic
Why this repo is useful
- Pydantic is a strong source for runtime boundary-object patterns: validating incoming data, coercing raw values into typed fields, and serializing back out explicitly.
- It is also useful for validation-hook design because the docs distinguish validator phases and call out where those phases change the guarantees readers should expect.
- It is not a generic argument that every Python data object should become a BaseModel. The strongest repeated signals are boundary-oriented.
Model construction is a validation boundary, not plain attribute assignment
Repeated evidence
- pydantic/main.py:253-264 says BaseModel.__init__ creates a model by parsing and validating input data and raises ValidationError if the input cannot form a valid model.
- pydantic/main.py:263-264 routes construction through self.__pydantic_validator__.validate_python(...), which is a much stronger runtime contract than normal Python object initialization.
- pydantic/main.py:721-768 exposes model_validate(...) -> Self as an explicit alternative validation entrypoint with knobs for strict, extra, from_attributes, by_alias, and by_name.
- docs/index.md:68-107 shows external data entering through model construction, being coerced into typed fields, and then becoming a typed model instance.
- docs/index.md:93-107 explicitly explains coercions such as strings to integers, strings to datetimes, and bytes keys to strings.
Why it matters
Repeated signal: once you choose Pydantic, object construction is no longer just “assign the fields.” It is a boundary-crossing operation that parses, coerces, and validates raw input.
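A minimal sketch of that boundary, assuming Pydantic v2 is installed. The Signup model and its fields are hypothetical and exist only for illustration; the calls used are ordinary model construction and model_validate(..., strict=True).

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError


class Signup(BaseModel):  # hypothetical model, for illustration only
    user_id: int
    created_at: datetime


# Raw external-style input: both values arrive as strings.
raw = {"user_id": "123", "created_at": "2024-01-02T03:04:05"}

# Construction parses and coerces the input; it is not plain attribute assignment.
signup = Signup(**raw)
coerced_types = (type(signup.user_id), type(signup.created_at))

# model_validate is the explicit entrypoint. With strict=True the string
# inputs are no longer coerced, so the same payload is rejected.
try:
    Signup.model_validate(raw, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True
```

The same payload is accepted in the default mode and rejected in strict mode, which is the kind of contract a plain __init__ does not provide.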
Caveat / counterexample
That makes Pydantic great for untrusted or external data. It does not automatically make it the right default for every small internal value object.
Field annotations are runtime parsing and schema instructions, not just static hints
Repeated evidence
- pydantic/main.py:156-205 exposes typed class-level metadata such as model_config, __pydantic_core_schema__, __pydantic_serializer__, __pydantic_validator__, and __pydantic_fields__.
- pydantic/main.py:167-168 documents a synthesized __init__ signature for the model.
- docs/index.md:93-104 explains field annotations in runtime terms: requiredness, accepted/coerced input shapes, and typed container expectations.
- docs/index.md:99-100 ties PositiveInt directly to an annotated constrained type, showing that type declarations are part of the runtime contract.
Why it matters
Repeated signal: in Pydantic, changing a field annotation can change runtime acceptance, coercion, and emitted schema behavior. These annotations are not mere editor decoration.
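As a rough illustration of the point above, the sketch below changes only one annotation (int to PositiveInt) and compares runtime acceptance and the emitted JSON schema. The two model names are hypothetical; Pydantic v2 is assumed.

```python
from pydantic import BaseModel, PositiveInt, ValidationError


class PlainOrder(BaseModel):  # hypothetical: unconstrained integer
    quantity: int


class PositiveOrder(BaseModel):  # hypothetical: same field, constrained annotation
    quantity: PositiveInt


# The unconstrained annotation accepts a negative value.
plain = PlainOrder(quantity=-5)

# The constrained annotation rejects the same input at construction time.
try:
    PositiveOrder(quantity=-5)
    rejected = False
except ValidationError:
    rejected = True

# The annotation also changes the emitted JSON schema for the field.
plain_schema = PlainOrder.model_json_schema()["properties"]["quantity"]
positive_schema = PositiveOrder.model_json_schema()["properties"]["quantity"]
```

Only the annotation differs between the two classes, yet both the accepted inputs and the schema differ.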
Models are strongest as explicit boundary objects
Repeated evidence
- docs/index.md:68-82 starts from external_data and immediately feeds it into a model.
- docs/index.md:82-89 then immediately uses model_dump() to cross back out of the model into plain data.
- pydantic/main.py:455-519 defines model_dump(...) as an explicit serialization API with include/exclude, alias, unset/default/none filtering, and error-handling controls.
- pydantic/main.py:521-569 provides model_dump_json(...) as the corresponding JSON-mode serialization boundary.
Why it matters
Repeated signal: Pydantic models are designed to sit at runtime boundaries where input validation and output shaping matter. The repo keeps showing “raw external data in, explicit dump back out,” not “all domain state everywhere should live inside BaseModel forever.”
Caveat / counterexample
The strong pattern is boundary ownership, not model monoculture. If an internal object only needs simple state and no runtime parsing or schema behavior, generic Python types may be clearer.
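One possible shape for "boundary ownership", sketched under the assumption that the internal object needs no runtime parsing. UserPayload, User, and handle are hypothetical names; the dataclass stands in for any plain Python type.

```python
from dataclasses import dataclass

from pydantic import BaseModel


class UserPayload(BaseModel):  # hypothetical boundary model
    id: int
    name: str


@dataclass(frozen=True)
class User:  # hypothetical internal value object: simple state, no parsing
    id: int
    name: str


def handle(raw: dict) -> dict:
    # Raw external data in: validated and coerced once, at the edge.
    payload = UserPayload.model_validate(raw)
    # Internal logic works on a plain object.
    user = User(id=payload.id, name=payload.name.strip())
    # Explicit dump back out at the other edge.
    return UserPayload(id=user.id, name=user.name).model_dump()


result = handle({"id": "7", "name": "  Ada "})
```

This is only one arrangement; the notes above support the boundary placement, not this specific split between model and dataclass.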
Serialization is explicit and configurable
Repeated evidence
- docs/index.md:82-89 uses model_dump() as the normal way to convert a model to a dictionary.
- pydantic/main.py:455-519 gives model_dump(...) explicit controls for aliasing, partial output, omission of unset/default/none values, round-tripping, and serialization error handling.
- pydantic/main.py:493-496 shows that serialization errors are configurable behavior, not an afterthought.
Why it matters
Repeated signal: Pydantic wants serialization to be named and explicit. That is stronger and safer than scattering hand-built dict shaping around the codebase.
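A short sketch of a few of those controls, assuming Pydantic v2. The Account model and its alias are hypothetical; by_alias, exclude_unset, and exclude_none are existing model_dump parameters.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Account(BaseModel):  # hypothetical model, for illustration only
    account_id: int = Field(alias="accountId")
    nickname: Optional[str] = None
    active: bool = True


# Only accountId is supplied; the other fields fall back to defaults.
account = Account.model_validate({"accountId": 1})

full = account.model_dump()                        # field names, all fields
by_alias = account.model_dump(by_alias=True)       # aliases as keys
only_set = account.model_dump(exclude_unset=True)  # only fields the caller set
no_none = account.model_dump(exclude_none=True)    # drop None values
as_json = account.model_dump_json(by_alias=True, exclude_none=True)  # JSON string
```

Each output shape is requested by name at the call site, rather than being assembled by hand-written dict code.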
Validation errors are structured boundary output
Repeated evidence
- pydantic/main.py:253-257 documents ValidationError on invalid model construction.
- pydantic/main.py:745-749 documents ValidationError on model_validate(...) as well.
- docs/index.md:109-152 shows one bad input producing a list of per-field errors with type, loc, msg, input, and documentation URL.
Why it matters
Repeated signal: Pydantic treats invalid input as a structured parsing result, not just a plain exception string. That is part of the contract callers and outer boundaries can build on.
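To make the structure concrete, here is a small sketch, assuming Pydantic v2, that triggers two field errors at once and reads them through ValidationError.errors(). The Item model is hypothetical.

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):  # hypothetical model, for illustration only
    id: int
    tags: list[str]


# One bad payload: id cannot be parsed as an integer and tags is missing.
try:
    Item.model_validate({"id": "not-a-number"})
    errors = []
except ValidationError as exc:
    # errors() returns one dict per failing field.
    errors = exc.errors()

# A caller can reshape the structured errors, for example by location and type.
summary = {err["loc"]: err["type"] for err in errors}
```

An outer boundary, such as an HTTP handler, could map these entries to per-field responses without parsing a message string.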
Validators are narrow and phase-aware
Repeated evidence
- docs/concepts/validators.md:94-114 shows an after field validator checking one parsed field and returning the validated value.
- docs/concepts/validators.md:160-167 says before validators run before internal parsing and validation.
- docs/concepts/validators.md:177-209 shows a before validator reshaping raw input while Pydantic still performs normal type validation afterward.
- docs/concepts/validators.md:201-206 explicitly warns that before validators receive arbitrary raw input and therefore must account for more cases.
Why it matters
Repeated signal: the best validator hooks are small in scope and explicit about phase:
- before for raw-input normalization
- after for parsed-value invariants
This keeps validator logic from becoming a second opaque parser.
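The split between the two phases might look like the sketch below, assuming Pydantic v2 and Python 3.9+. The Reading model, its field, and both validator names are hypothetical.

```python
from pydantic import BaseModel, ValidationError, field_validator


class Reading(BaseModel):  # hypothetical model, for illustration only
    celsius: float

    @field_validator("celsius", mode="before")
    @classmethod
    def strip_unit(cls, value):
        # Raw input can be anything, so only handle the case we understand
        # and pass everything else through to normal type validation.
        if isinstance(value, str):
            return value.strip().removesuffix("C").strip()
        return value

    @field_validator("celsius", mode="after")
    @classmethod
    def above_absolute_zero(cls, value: float) -> float:
        # Runs after parsing, so value is already a float here.
        if value < -273.15:
            raise ValueError("below absolute zero")
        return value


ok = Reading(celsius="21.5 C")


def is_rejected(raw) -> bool:
    try:
        Reading(celsius=raw)
        return False
    except ValidationError:
        return True


too_cold_rejected = is_rejected("-300C")   # fails the after-phase invariant
not_a_number_rejected = is_rejected("warm")  # fails core float parsing
```

The before validator only normalizes; the float parsing in between is still done by Pydantic, which is why "warm" is rejected without any custom check.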
Validator mode choice changes guarantees
Repeated evidence
- docs/concepts/validators.md:160-164 warns against careless mutation in before validators, especially when later raising errors and when unions are involved.
- docs/concepts/validators.md:254-255 states that plain validators terminate validation immediately.
- docs/concepts/validators.md:273-283 shows the consequence directly: a PlainValidator can return 'invalid' for a field annotated as int, and Pydantic will accept it.
- docs/concepts/validators.md:296-308 repeats the same consequence in decorator form.
Why it matters
Repeated signal: validator mode is not a cosmetic option. It changes whether core type validation still runs.
Caveat / counterexample
This is the sharpest anti-pattern in the repo: plain validators are powerful, but they can bypass the type guarantee a reader expects from the annotation. Use them only when terminating validation is the real goal.
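A minimal reproduction of that bypass, assuming Pydantic v2 and Python 3.9+. The passthrough function and both model names are hypothetical; PlainValidator is the existing Pydantic annotation marker.

```python
from typing import Annotated, Any

from pydantic import BaseModel, PlainValidator, ValidationError


def passthrough(value: Any) -> Any:
    # Hypothetical plain validator that performs no checks at all.
    return value


class Unchecked(BaseModel):
    # The plain validator replaces core validation for this field,
    # so the int annotation is no longer enforced at runtime.
    number: Annotated[int, PlainValidator(passthrough)]


class Checked(BaseModel):
    number: int


unchecked = Unchecked(number="invalid")

try:
    Checked(number="invalid")
    checked_rejected = False
except ValidationError:
    checked_rejected = True
```

A reader looking only at the int annotation on Unchecked.number would expect an integer, which is the mismatch the caveat warns about.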
Pattern candidates supported by this repo
- use Pydantic models at runtime input/output boundaries
- treat model construction as a validation step, not plain assignment
- treat field annotations as runtime parsing contracts when using BaseModel
- serialize explicitly with model_dump()/model_dump_json()
- keep validators field-scoped and phase-aware
- treat plain validators as an escape hatch, not the default