elixir-patterns/patterns/changesets.md
aweiker 10218813d3 docs: backfill TOC + decision trees, fix review findings
- Add ## Contents and ## Decision Tree to all 10 existing pattern files
- Fix embed_as/1 semantics inversion in types.md (:self → :dump)
- Fix fabricated __meta__.changes reference in changesets.md
- Fix default primary key type (:integer → :id) in schemas.md
- Combine @impl subsections into single "Minimal Callback Annotation"
2026-05-01 22:13:35 -07:00


# Changeset Patterns in Ecto
Patterns extracted from Ecto's source code for building safe, composable data pipelines.
## Contents
1. [`cast/4` — The External/Internal Data Boundary](#1-cast4--the-externalinternal-data-boundary)
2. [`change/2` — Internal-Only Modifications](#2-change2--internal-only-modifications)
3. [Validation Pipeline — Composable Validators](#3-validation-pipeline--composable-validators)
4. [`validate_change/3` — Custom Validators](#4-validate_change3--custom-validators)
5. [`add_error/4` — Manual Error Injection](#5-add_error4--manual-error-injection)
6. [`put_change/3` vs `force_change/3` — Tracked vs Forced Changes](#6-put_change3-vs-force_change3--tracked-vs-forced-changes)
7. [Constraints vs Validations — DB-level Safety](#7-constraints-vs-validations--db-level-safety)
8. [`prepare_changes/2` — Last-Mile DB-Aware Transforms](#8-prepare_changes2--last-mile-db-aware-transforms)
9. [`apply_action/2` — Schemaless Validation](#9-apply_action2--schemaless-validation)
10. [`cast_assoc/3` vs `put_assoc/4` — External vs Internal Association Changes](#10-cast_assoc3-vs-put_assoc4--external-vs-internal-association-changes)
---
## 1. `cast/4` — The External/Internal Data Boundary
**Source:** [lib/ecto/changeset.ex#L729](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L729)
**What it does:** `cast/4` is the entry point for all externally-supplied data (user input, API params, form submissions). It converts string keys to atom keys, coerces string values to the schema's declared types, and records which fields were changed. The permitted-fields list is an explicit allow-list — unlisted fields are silently dropped.
```elixir
# From the Ecto docs (lib/ecto/changeset.ex line 108-115):
def changeset(user, params \\ %{}) do
  user
  |> cast(params, [:name, :email, :age])
  |> validate_required([:name, :email])
  |> validate_format(:email, ~r/@/)
  |> validate_inclusion(:age, 18..100)
  |> unique_constraint(:email)
end
```
**Why:** External data is untrusted and untyped (form submissions arrive as strings). `cast/4` establishes a hard boundary: everything outside the permitted list is discarded, and everything inside is type-checked against the schema. Using `cast/4` for external data and `change/2` for internal data makes the trust boundary explicit and auditable — grep for `cast(` to find every place user data enters the system.
**Anti-pattern:** Using `change/2` for external data bypasses type coercion and the allow-list:
```elixir
# BAD — no type coercion, no field filtering, all params accepted
def changeset(user, params) do
  user
  |> change(params)
  |> validate_required([:name, :email])
end
```
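The coercion and allow-list behavior described above can be checked in isolation with a schemaless changeset. This is a minimal sketch, assuming Ecto is available as a dependency; the field names and params are hypothetical.

```elixir
import Ecto.Changeset

# Schemaless changeset: a {data, types} tuple stands in for a schema.
# Field names and params are hypothetical.
types = %{name: :string, age: :integer, admin: :boolean}
params = %{"name" => "Ada", "age" => "36", "admin" => "true"}

# Only :name and :age are permitted; "admin" is dropped without an error.
changeset = cast({%{}, types}, params, [:name, :age])
changeset.changes
#=> %{age: 36, name: "Ada"}

# A value that cannot be coerced becomes a changeset error, not an exception.
bad = cast({%{}, types}, %{"age" => "not a number"}, [:age])
bad.valid?
#=> false
```

Note that the string `"36"` arrives in `changes` as the integer `36`, and that the unpermitted `"admin"` key leaves no trace in either `changes` or `errors`.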
### When to Use
**Triggers:**
- Data originates from a controller `params` map, an API request body, or any user-submitted form
- Field values arrive as strings that need type coercion (integers, booleans, dates)
- You need an allow-list of the fields a user is permitted to set
**Example — before:**
```elixir
def update_profile(user, attrs) do
  # Directly merging external attrs skips coercion and allows mass assignment
  changeset = Ecto.Changeset.change(user, attrs)
  Repo.update(changeset)
end
```
**Example — after:**
```elixir
def update_profile(user, attrs) do
  user
  |> Ecto.Changeset.cast(attrs, [:name, :bio, :avatar_url])
  |> Ecto.Changeset.validate_required([:name])
  |> Repo.update()
end
```
### When NOT to Use
**Don't use this when:**
- The data originates from application code (system-generated timestamps, counters, foreign keys computed by your own logic)
- The values are already the correct Elixir types and have been validated elsewhere
- You're building a test fixture and want to set fields directly
**Over-application example:**
```elixir
# Overkill — casting internal data that's already correctly typed
def mark_confirmed(user) do
  user
  |> cast(%{confirmed_at: DateTime.utc_now()}, [:confirmed_at])
  |> Repo.update!()
end
```
**Better alternative:**
```elixir
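def mark_confirmed(user) do
  # change/2 does not truncate; if :confirmed_at is a :utc_datetime field,
  # pass DateTime.truncate(DateTime.utc_now(), :second) instead
  user
  |> Ecto.Changeset.change(confirmed_at: DateTime.utc_now())
  |> Repo.update!()
end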
```
**Why:** `cast/4` has a runtime cost (string key normalization, type coercion, allow-list filtering) that adds no value when the data is already correctly typed. It also signals to readers that external data is flowing in — misusing it for internal data erodes that signal.
---
## 2. `change/2` — Internal-Only Modifications
**Source:** [lib/ecto/changeset.ex#L491](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L491)
**What it does:** `change/2` wraps a struct in a changeset (or merges changes into an existing one) without any type coercion. It accepts a map or keyword list of already-typed values. No allow-list filtering occurs — all given fields are set directly.
```elixir
# Used for trusted, already-typed internal data
def activate_subscription(sub) do
  sub
  |> Ecto.Changeset.change(
    status: :active,
    activated_at: DateTime.utc_now(),
    trial_ends_at: nil
  )
  |> Repo.update!()
end
```
**Why:** When your application computes a value (a system timestamp, a foreign key returned from a previous query, a status atom), you already know its type. Running it through `cast/4` would add coercion work that can't succeed or fail in a meaningful way — the value either matches the schema type already or it's a bug in your code. `change/2` communicates "I trust this data" to readers.
**Anti-pattern:** Using `cast/4` for programmatic data creates phantom external-data semantics:
```elixir
# BAD — cast signals "user data" but this comes from application logic
def set_password_reset_token(user, token) do
  user
  |> cast(%{reset_token: token, reset_token_expires_at: hours_from_now(2)}, [:reset_token, :reset_token_expires_at])
  |> Repo.update!()
end
```
### When to Use
**Triggers:**
- Values come from your own application code, not from user input
- Types are already correct (atom status values, DateTime structs, integer IDs from prior queries)
- You're updating fields that users are never allowed to set directly (internal audit fields, system state)
**Example — before:**
```elixir
def record_login(user) do
  user
  |> cast(%{last_login_at: DateTime.utc_now(), login_count: user.login_count + 1}, [:last_login_at, :login_count])
  |> Repo.update!()
end
```
**Example — after:**
```elixir
def record_login(user) do
  user
  |> Ecto.Changeset.change(
    last_login_at: DateTime.utc_now(),
    login_count: user.login_count + 1
  )
  |> Repo.update!()
end
```
### When NOT to Use
**Don't use this when:**
- Any part of the data originates from user input or an external API response
- You need type coercion (e.g., converting a string "true" to boolean `true`)
- You want badly typed input reported as a changeset error instead of an exception at write time (use `cast/4` plus validators)
**Over-application example:**
```elixir
# Using change/2 for external data bypasses all safety checks
def create_user(params) do
  %User{}
  |> Ecto.Changeset.change(params) # params from controller: untrusted strings!
  |> Repo.insert()
end
```
**Better alternative:**
```elixir
def create_user(params) do
  %User{}
  |> Ecto.Changeset.cast(params, [:name, :email, :password])
  |> Ecto.Changeset.validate_required([:name, :email, :password])
  |> Repo.insert()
end
```
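**Why:** `change/2` skips type coercion entirely. Passing untyped external params through `change/2` means a string `"true"` stays a string instead of becoming boolean `true`, and a string `"42"` stays a string instead of becoming integer `42`. The changeset still reports `valid?: true`, so the mismatch only surfaces at write time, typically as an exception (`Ecto.ChangeError`) rather than as a changeset error the caller can render. String-keyed params, the usual shape of controller input, are rejected outright because `change/2` expects atom keys.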
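The difference is easy to observe without a database. The following is a small sketch, assuming Ecto is available as a dependency; the `types` map is a hypothetical stand-in for a schema.

```elixir
import Ecto.Changeset

# Hypothetical schemaless types; no Repo is involved.
types = %{age: :integer}

# cast/4 coerces the string to the declared type.
casted = cast({%{}, types}, %{"age" => "42"}, [:age])
casted.changes
#=> %{age: 42}

# change/2 stores the value exactly as given and still reports valid?: true.
changed = change({%{}, types}, %{age: "42"})
changed.changes
#=> %{age: "42"}
changed.valid?
#=> true
```

Nothing in the second changeset signals a problem, which is why the wrong type is only noticed once the value is handed to the database layer.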
---
## 3. Validation Pipeline — Composable Validators
**Source:** [lib/ecto/changeset.ex#L2596](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2596) (`validate_required`), [#L2920](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2920) (`validate_format`), [#L3083](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3083) (`validate_length`), [#L3275](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3275) (`validate_number`), [#L3451](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3451) (`validate_inclusion`), [#L2987](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2987) (`validate_subset`)
**What it does:** Every built-in validator takes a changeset as its first argument and returns a changeset, enabling pipe composition. Validators accumulate errors in `changeset.errors` rather than raising. Once `valid?` is false, subsequent validators still run, so all errors are collected in one pass. Validators built on `validate_change/3` skip a field that has no change, which is why a field that fails `validate_required` is not also reported as too short or out of range.
```elixir
def changeset(product, params) do
  product
  |> cast(params, [:name, :price, :category, :tags])
  |> validate_required([:name, :price])
  |> validate_length(:name, min: 2, max: 100)
  |> validate_number(:price, greater_than: 0)
  |> validate_inclusion(:category, ~w(electronics clothing food))
  |> validate_subset(:tags, ~w(sale featured new))
  |> validate_format(:name, ~r/\A[[:print:]]+\z/, message: "must be printable characters")
end
```
**Why:** The pipe model means you can add, remove, or reorder validators without changing surrounding code. Each validator is independent — it doesn't know about others. This lets you compose context-specific changeset functions (e.g., `admin_changeset/2` permitting extra fields, then calling the same validators as `user_changeset/2`). Validators run entirely in-process and BEFORE constraint checks, so users get fast feedback without a DB round-trip.
**Anti-pattern:** Manually checking fields and accumulating errors with ad-hoc conditionals:
```elixir
# BAD — imperative validation, hard to extend, misses Ecto's error format
def validate(params) do
  errors = []
  errors = if params[:name] == nil, do: [{:name, "can't be blank"} | errors], else: errors
  errors = if String.length(params[:name] || "") < 2, do: [{:name, "too short"} | errors], else: errors
  errors
end
```
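The error-accumulation behavior can be seen with a schemaless changeset. This is a sketch under the assumption that Ecto is a dependency; the product fields and the `validate` helper are hypothetical.

```elixir
import Ecto.Changeset

# Hypothetical schemaless product form.
types = %{name: :string, price: :integer}

validate = fn params ->
  {%{}, types}
  |> cast(params, [:name, :price])
  |> validate_required([:name, :price])
  |> validate_length(:name, min: 2)
  |> validate_number(:price, greater_than: 0)
end

# Both fields are present but invalid: one pass reports both problems.
both = validate.(%{"name" => "x", "price" => "-5"})
Enum.sort(Keyword.keys(both.errors))
#=> [:name, :price]

# :name is missing: validate_required reports it, and validate_length has
# no change to inspect, so :name carries exactly one error.
missing = validate.(%{"price" => "10"})
Keyword.get_values(missing.errors, :name) |> length()
#=> 1
```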
### When to Use
**Triggers:**
- After `cast/4`, you need to verify format, range, presence, or membership constraints
- You want all validation errors reported in a single pass (not stop on first failure)
- You're building a changeset function that other modules or contexts will reuse
**Example — before:**
```elixir
def create_account(params) do
  changeset = cast(%Account{}, params, [:username, :age])
  if get_change(changeset, :username) == nil do
    add_error(changeset, :username, "can't be blank")
  else
    changeset
  end
end
```
**Example — after:**
```elixir
def changeset(account, params) do
  account
  |> cast(params, [:username, :age])
  |> validate_required([:username])
  |> validate_length(:username, min: 3, max: 20)
  |> validate_number(:age, greater_than_or_equal_to: 18)
end
```
### When NOT to Use
**Don't use this when:**
- Validation requires a DB query (use a constraint instead, or `validate_change/3` with a query and accept the race condition risk)
- The rule is so context-specific it only applies once (inline `add_error/4` may be cleaner than a full custom validator)
- You're validating data that will never be persisted (prefer pure functions returning `{:ok, _} | {:error, _}`)
**Over-application example:**
```elixir
# Validating internal system config that can never be wrong at runtime
def system_config_changeset(config, attrs) do
  config
  |> cast(attrs, [:timeout_ms, :pool_size])
  |> validate_number(:timeout_ms, greater_than: 0)
  |> validate_number(:pool_size, greater_than: 0)
  # Config comes from compile-time env vars checked at boot; this is noise
end
```
**Better alternative:**
```elixir
# Validate config at application boot with a clear error, not at runtime
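def validate_config!(timeout_ms, pool_size) do
  unless is_integer(timeout_ms) and timeout_ms > 0 do
    raise ArgumentError, "timeout_ms must be a positive integer, got: #{inspect(timeout_ms)}"
  end
  unless is_integer(pool_size) and pool_size > 0 do
    raise ArgumentError, "pool_size must be a positive integer, got: #{inspect(pool_size)}"
  end
  :ok
end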
```
**Why:** Changeset validators are designed for user-facing feedback in multi-error scenarios. For internal configuration validated once at boot, a plain `raise` with a clear message is faster to write, easier to understand, and appropriate for code that should crash if misconfigured.
---
## 4. `validate_change/3` — Custom Validators
**Source:** [lib/ecto/changeset.ex#L2508](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2508)
**What it does:** The primitive that most built-in validators delegate to. Takes a `(field, value) -> [{field, message}]` function. If the field has no change (it is absent from `changeset.changes`) or the change is `nil`, the validator is skipped entirely. Returning `[]` means valid; returning a non-empty list adds errors and sets `valid?` to false.
```elixir
# From lib/ecto/changeset.ex lines 2508-2528:
def validate_change(%Changeset{} = changeset, field, validator) when is_atom(field) do
  %{changes: changes, types: types, errors: errors} = changeset
  ensure_field_exists!(changeset, types, field)
  value = Map.get(changes, field)
  new = if is_nil(value), do: [], else: validator.(field, value)
  case new do
    [] -> changeset
    [_ | _] -> %{changeset | errors: new ++ errors, valid?: false}
  end
end
```
**Why:** The "only called if the field has a change" behavior is load-bearing. On an update changeset where the user didn't touch a field, validators for that field don't run, which avoids re-validating unchanged data and prevents false errors when partial updates are allowed. If you need to validate the current value regardless of whether it changed, read it with `get_field/3` (which falls back to the struct's value) and report problems with `add_error/4`.
**Anti-pattern:** Reimplementing the "has change?" check inside the validator:
```elixir
# BAD — redundant check; validate_change already skips if :sku is unchanged
def changeset(item, params) do
  changeset = cast(item, params, [:sku])
  if Map.has_key?(changeset.changes, :sku) do
    value = get_change(changeset, :sku)
    if valid_sku?(value), do: changeset, else: add_error(changeset, :sku, "invalid format")
  else
    changeset
  end
end
```
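The skip-when-unchanged rule can be demonstrated directly. This is a sketch assuming Ecto is a dependency; the SKU rule, the stored value, and the `check_sku` function are hypothetical.

```elixir
import Ecto.Changeset

# Hypothetical SKU rule: must start with "SKU-".
types = %{sku: :string}
stored = %{sku: "legacy"}

check_sku = fn :sku, value ->
  if String.starts_with?(value, "SKU-"), do: [], else: [sku: "invalid format"]
end

# No change for :sku, so the validator is never called, even though the
# stored value would fail the rule.
untouched = {stored, types} |> cast(%{}, [:sku]) |> validate_change(:sku, check_sku)
untouched.valid?
#=> true

# A submitted value is validated.
edited = {stored, types} |> cast(%{"sku" => "oops"}, [:sku]) |> validate_change(:sku, check_sku)
edited.errors
#=> [sku: {"invalid format", []}]
```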
### When to Use
**Triggers:**
- None of the built-in validators (`validate_format`, `validate_length`, etc.) fit your rule
- Your validation logic requires calling a helper function with domain knowledge
- You're building a reusable custom validator to share across multiple changeset functions
**Example — before:**
```elixir
def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_change(:unit, fn :unit, value ->
    if value in ["kg", "lb", "oz", "g"] do
      []
    else
      [{:unit, "must be a known unit of weight"}]
    end
  end)
end
```
**Example — after:**
```elixir
@weight_units ~w(kg lb oz g)

def validate_weight_unit(changeset, field) do
  validate_change(changeset, field, fn _, value ->
    if value in @weight_units do
      []
    else
      [{field, {"must be a known unit of weight, got: %{value}", [value: value]}}]
    end
  end)
end

def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_weight_unit(:unit)
end
```
### When NOT to Use
**Don't use this when:**
- A built-in validator already covers the rule (`validate_inclusion` would handle the weight unit example above)
- The validation requires a DB query — use a constraint or accept that you need a repo call
- You need to validate the stored value (not the change) — read from the struct instead
**Over-application example:**
```elixir
# validate_change for something validate_format handles natively
validate_change(changeset, :email, fn :email, value ->
  if String.match?(value, ~r/@/), do: [], else: [{:email, "must contain @"}]
end)
```
**Better alternative:**
```elixir
validate_format(changeset, :email, ~r/@/)
```
**Why:** Built-in validators produce consistent error message structures (with interpolation keys for translation), are well-documented, and are immediately recognizable to Ecto users. Reach for `validate_change/3` only when you need logic the built-ins cannot express.
---
## 5. `add_error/4` — Manual Error Injection
**Source:** [lib/ecto/changeset.ex#L2460](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2460)
**What it does:** Directly adds an error tuple to `changeset.errors` and sets `valid?` to false. Takes `(changeset, key, message, opts)` where `opts` typically contains `validation:` metadata. Unlike `validate_change/3`, it always adds the error — there is no "only if the field changed" guard.
```elixir
# Adding a cross-field validation error
def changeset(event, params) do
  event
  |> cast(params, [:start_date, :end_date])
  |> validate_required([:start_date, :end_date])
  |> validate_date_range()
end

defp validate_date_range(changeset) do
  start_date = get_field(changeset, :start_date)
  end_date = get_field(changeset, :end_date)
  if start_date && end_date && Date.compare(end_date, start_date) == :lt do
    add_error(changeset, :end_date, "must be on or after start date")
  else
    changeset
  end
end
```
**Why:** Some validation rules span multiple fields or depend on context that isn't available to `validate_change/3`'s field-scoped callback. `add_error/4` gives you an escape hatch to inject errors after you've already determined the problem, without needing to wrap your logic in a `validate_change/3` call that only has access to one field's value.
**Anti-pattern:** Using `add_error/4` unconditionally (forgetting the conditional):
```elixir
# BAD — always adds the error, making the changeset permanently invalid
def changeset(user, params) do
  user
  |> cast(params, [:role])
  |> add_error(:role, "admin role requires approval") # Fires even for non-admin roles!
end
```
### When to Use
**Triggers:**
- Validation logic spans two or more fields (start/end dates, password/confirmation, price/discount)
- You've already computed whether an error applies (result of a service call, a complex business rule) and just need to record it
- You're building a wrapper that receives errors from an external system and maps them onto changeset fields
**Example — before:**
```elixir
def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_change(:seats, fn :seats, seats ->
    # The piped changeset is not in scope here, only the original struct,
    # so a :plan sent in the same request is invisible to this callback
    plan = subscription.plan
    max = plan_seat_limit(plan)
    if seats > max, do: [{:seats, "exceeds limit for #{plan} plan"}], else: []
  end)
end
```
**Example — after:**
```elixir
def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_required([:plan, :seats])
  |> validate_seat_limit()
end

defp validate_seat_limit(changeset) do
  plan = get_field(changeset, :plan)
  seats = get_field(changeset, :seats)
  if plan && seats && seats > plan_seat_limit(plan) do
    add_error(changeset, :seats, "exceeds limit for %{plan} plan", plan: plan)
  else
    changeset
  end
end
```
### When NOT to Use
**Don't use this when:**
- A built-in validator covers the rule — `add_error/4` bypasses standard validation metadata
- The error applies to only one field and doesn't depend on other fields (use `validate_change/3`)
- You want the error to be skipped if the field hasn't changed (use `validate_change/3`)
**Over-application example:**
```elixir
# Manually adding what validate_required already does
def changeset(user, params) do
  cs = cast(user, params, [:email])
  if get_change(cs, :email) == nil do
    add_error(cs, :email, "can't be blank")
  else
    cs
  end
end
```
**Better alternative:**
```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> validate_required([:email])
end
```
**Why:** Built-in validators attach a `validation:` key (for example `validation: :required`) to the error's opts, along with any interpolation values, and error-rendering or translation code may rely on that metadata. A bare `add_error/4` call carries none of it unless you pass it yourself in the fourth argument.
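When `add_error/4` is the right tool, the fourth argument can carry the same kind of metadata. The sketch below assumes Ecto is a dependency; the event fields and the `:date_range` validation name are hypothetical.

```elixir
import Ecto.Changeset

# Hypothetical event form; :date_range is a made-up validation name.
types = %{start_date: :date, end_date: :date}
params = %{"start_date" => "2024-05-10", "end_date" => "2024-05-01"}

changeset = cast({%{}, types}, params, [:start_date, :end_date])
start_date = get_field(changeset, :start_date)
end_date = get_field(changeset, :end_date)

# The error is added conditionally, with metadata in the fourth argument.
changeset =
  if Date.compare(end_date, start_date) == :lt do
    add_error(changeset, :end_date, "must be on or after start date", validation: :date_range)
  else
    changeset
  end

changeset.errors
#=> [end_date: {"must be on or after start date", [validation: :date_range]}]
```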
---
## 6. `put_change/3` vs `force_change/3` — Tracked vs Forced Changes
**Source:** [lib/ecto/changeset.ex#L1959](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L1959) (`put_change`), [#L2248](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2248) (`force_change`)
**What it does:** `put_change/3` compares the new value against the current struct field value; if they are equal, it does NOT add the change (the field stays absent from `changeset.changes`). `force_change/3` always adds the field to `changeset.changes` regardless of whether the value differs.
```elixir
user = %User{name: "Alice", role: :member}
cs = Ecto.Changeset.change(user)
# put_change: no-op if value matches current
cs1 = Ecto.Changeset.put_change(cs, :name, "Alice")
cs1.changes #=> %{} (no change recorded)
# force_change: always records the change
cs2 = Ecto.Changeset.force_change(cs, :name, "Alice")
cs2.changes #=> %{name: "Alice"} (change recorded, triggers DB write)
```
**Why:** Ecto uses `changeset.changes` to build the SQL `UPDATE` statement — only changed fields are included. If `put_change/3` detects no actual change, the field is excluded from the UPDATE, saving a write and avoiding spurious `updated_at` bumps. `force_change/3` marks the changeset dirty unconditionally, which triggers a DB write even if the row would be identical afterward. Default to `put_change/3`; reach for `force_change/3` only when you specifically need the DB write to happen (e.g., triggering a DB trigger, bumping `updated_at` intentionally).
**Anti-pattern:** Using `force_change/3` by default causes unnecessary DB writes on every update:
```elixir
# BAD — forces every field into changes even when nothing actually changed
def normalize_user(changeset) do
  name = get_field(changeset, :name)
  force_change(changeset, :name, String.trim(name || ""))
end
```
### When to Use `put_change/3`
**Triggers:**
- Setting a field to a computed value that might or might not differ from the current value
- Normalizing input (trim whitespace, downcase email) — the result may equal what's already stored
- Any programmatic field update where skipping unnecessary DB writes is correct behavior
**Example — before:**
```elixir
def normalize(changeset) do
  email = get_field(changeset, :email)
  # force_change even if email is already lowercase
  force_change(changeset, :email, String.downcase(email || ""))
end
```
**Example — after:**
```elixir
def normalize(changeset) do
  email = get_field(changeset, :email)
  put_change(changeset, :email, String.downcase(email || ""))
end
```
### When NOT to Use `put_change/3` (use `force_change/3`)
**Don't use `put_change/3` when:**
- You need to unconditionally bump `updated_at` even when no data changed
- A DB trigger or audit log depends on the UPDATE statement firing regardless
- You're implementing optimistic locking and need to force a version increment
**Over-application example:**
```elixir
# Using force_change everywhere "just to be safe"
def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> then(fn cs ->
    cs.changes
    |> Enum.reduce(cs, fn {field, value}, acc ->
      force_change(acc, field, value) # Redundant: these fields are already in changeset.changes
    end)
  end)
end
```
**Better alternative:**
```elixir
def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> validate_required([:name, :email])
end
```
**Why:** `cast/4` already populates `changeset.changes` with only the fields that differ from the struct. Re-applying those same fields with `force_change/3` is redundant work, and using `force_change/3` as the default for fields that have not changed marks every changeset dirty, which defeats Ecto's change detection and forces a write on every update.
---
## 7. Constraints vs Validations — DB-level Safety
**Source:** [lib/ecto/changeset.ex#L3916](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3916) (`unique_constraint`), [#L3987](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3987) (`foreign_key_constraint`), [#L3777](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3777) (`check_constraint`)
**What it does:** Constraint functions register a DB error name → changeset error mapping. They do NOT run a query themselves. When `Repo.insert/2` or `Repo.update/2` raises a DB constraint violation, Ecto catches it and converts it into a changeset error on the registered field, returning `{:error, changeset}` instead of raising.
```elixir
# From lib/ecto/changeset.ex (unique_constraint/3):
def unique_constraint(changeset, [first_field | _] = fields, opts) do
  name = opts[:name] || unique_index_name(changeset, fields)
  message = constraint_message(opts, "has already been taken")
  match_type = Keyword.get(opts, :match, :exact)
  error_key = Keyword.get(opts, :error_key, first_field)
  add_constraint(changeset, :unique, name, match_type, error_key, message, :unique)
end
```
**Why:** A uniqueness check via `validate_change/3` (selecting before inserting) has a race condition — two concurrent requests can both pass the select, then both attempt the insert, with one failing with an unhandled DB error. Constraints delegate to the DB's atomic enforcement. The constraint name in the changeset must exactly match the index name in the migration (Ecto generates `table_field_index` by default, or you can override with `name:`).
**Anti-pattern:** Using `unsafe_validate_unique/3` as a replacement for `unique_constraint/3`:
```elixir
# RISKY — has a TOCTOU race condition under concurrent load
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> unsafe_validate_unique(:email, MyApp.Repo)
  # Two concurrent registrations with the same email can both pass this check
end
```
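That a constraint function only registers a mapping, and checks nothing by itself, can be seen without a database. This is a sketch assuming Ecto is a dependency; the index name `"users_email_index"` is hypothetical and is passed explicitly because a schemaless changeset has no table name to derive it from.

```elixir
import Ecto.Changeset

# Schemaless changesets have no table name, so the index name is passed
# explicitly; "users_email_index" is a hypothetical name.
changeset =
  {%{}, %{email: :string}}
  |> cast(%{"email" => "ada@example.com"}, [:email])
  |> unique_constraint(:email, name: "users_email_index")

# Nothing has been checked yet: the changeset is still valid and only
# carries the registered mapping for the Repo to use later.
changeset.valid?
#=> true
length(changeset.constraints)
#=> 1
```

The registered entry is consulted only if the database rejects the write, which is why the matching index has to exist for the registration to have any effect.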
### When to Use
**Triggers:**
- Uniqueness must be guaranteed across concurrent inserts (always use `unique_constraint`, not `unsafe_validate_unique` alone)
- A foreign key references a row that might not exist (use `foreign_key_constraint`)
- The DB has a check constraint that corresponds to a business rule (use `check_constraint`)
**Example — before:**
```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  # No constraint registration: a violation raises Ecto.ConstraintError instead of returning {:error, changeset}
end
```
**Example — after:**
```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  |> unique_constraint(:email)
  |> unique_constraint(:username)
  |> foreign_key_constraint(:organization_id)
end
```
### When NOT to Use
**Don't use this when:**
- The DB index doesn't exist: the registration is then a silent no-op, and duplicates are written without any error
- You need immediate feedback before hitting the DB (use `unsafe_validate_unique` as a UX hint, then also add `unique_constraint`)
- The rule is application-level only and has no corresponding DB constraint
**Over-application example:**
```elixir
# Registering a unique_constraint for a field with no DB index
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
  # No index in the migration, so this constraint registration does nothing
end
```
**Better alternative:**
```elixir
# First, add the index in a migration:
# create unique_index(:orders, [:reference_number])
# Then register the constraint:
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
end
```
**Why:** If the DB index doesn't exist, the DB will never raise the expected constraint error and the `unique_constraint/3` registration is silently a no-op. Always add the migration index first, then register the constraint in the changeset.
---
## 8. `prepare_changes/2` — Last-Mile DB-Aware Transforms
**Source:** [lib/ecto/changeset.ex#L3706](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3706)
**What it does:** Registers a callback to run inside the database transaction, just before the DB operation executes. At callback time, `changeset.repo` is populated with the Repo module. Used for side effects that must be atomic with the main insert/update (counter-caches, denormalized fields, audit log entries).
```elixir
# From lib/ecto/changeset.ex lines 3689-3703:
def create_comment(comment, params) do
  comment
  |> cast(params, [:body, :post_id])
  |> prepare_changes(fn changeset ->
    if post_id = get_change(changeset, :post_id) do
      query = from Post, where: [id: ^post_id]
      changeset.repo.update_all(query, inc: [comment_count: 1])
    end
    changeset
  end)
end
```
**Why:** Counter-caches and denormalized aggregates must update atomically with the row that triggered them. If you increment `comment_count` outside the transaction, a crash between the insert and the increment leaves counts inconsistent. `prepare_changes/2` guarantees both operations commit or both roll back. The callback runs only for a valid changeset, immediately before the write, so it sees the values that are about to be persisted.
**Anti-pattern:** Performing side effects after `Repo.insert/update` outside a transaction:
```elixir
# BAD — comment is saved but counter may not update if the second call crashes
def create_comment(attrs) do
  with {:ok, comment} <- Repo.insert(Comment.changeset(%Comment{}, attrs)) do
    Repo.update_all(
      from(p in Post, where: p.id == ^comment.post_id),
      inc: [comment_count: 1]
    )
    {:ok, comment}
  end
end
```
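Because `prepare_changes/2` only registers the callback, building the changeset has no side effects of its own. The sketch below illustrates that, assuming Ecto is a dependency; the `:prepare_ran` message is a hypothetical marker used purely to observe whether the callback has executed.

```elixir
import Ecto.Changeset

changeset =
  {%{}, %{body: :string}}
  |> cast(%{"body" => "First!"}, [:body])
  |> prepare_changes(fn cs ->
    # Hypothetical marker message; a real callback would use cs.repo here.
    send(self(), :prepare_ran)
    cs
  end)

# The callback is stored on the changeset and has not been invoked.
length(changeset.prepare)
#=> 1

status =
  receive do
    :prepare_ran -> :ran
  after
    0 -> :not_yet
  end

status
#=> :not_yet
```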
### When to Use
**Triggers:**
- You need to update a related row (counter-cache, denormalized field) atomically with the main operation
- The side effect requires the Repo and must be inside the transaction
- You're computing a value that depends on data only readable through the Repo inside the transaction (e.g., a position derived from existing sibling rows)
**Example — before:**
```elixir
def create_order_item(item, params) do
  case Repo.insert(OrderItem.changeset(item, params)) do
    {:ok, item} ->
      # Race condition: order total might be stale if this crashes or races
      Repo.update_all(
        from(o in Order, where: o.id == ^item.order_id),
        inc: [total_cents: item.price_cents]
      )
      {:ok, item}
    error ->
      error
  end
end
```
**Example — after:**
```elixir
def changeset(item, params) do
  item
  |> cast(params, [:price_cents, :order_id, :product_id])
  |> validate_required([:price_cents, :order_id])
  |> prepare_changes(fn changeset ->
    if price = get_change(changeset, :price_cents) do
      order_id = get_field(changeset, :order_id)
      changeset.repo.update_all(
        from(o in Order, where: o.id == ^order_id),
        inc: [total_cents: price]
      )
    end
    changeset
  end)
end
```
### When NOT to Use
**Don't use this when:**
- The side effect doesn't need to be in the same transaction (e.g., sending an email: do that after the `Repo` call returns `{:ok, _}`)
- The computation doesn't need Repo access (use a regular `validate_change/3` or map transformation before the changeset reaches Repo)
- You need the side effect to run even when the changeset is invalid (callbacks only run when `valid?` is true)
**Over-application example:**
```elixir
# prepare_changes for something that doesn't need the transaction
def changeset(user, params) do
  user
  |> cast(params, [:name])
  |> prepare_changes(fn changeset ->
    # Email sending in a DB transaction holds the connection open: bad
    Mailer.send_welcome_email(get_field(changeset, :email))
    changeset
  end)
end
```
**Better alternative:**
```elixir
def create_user(attrs) do
  with {:ok, user} <- Repo.insert(User.changeset(%User{}, attrs)) do
    Mailer.send_welcome_email(user.email)
    {:ok, user}
  end
end
```
**Why:** `prepare_changes/2` callbacks run inside the transaction, so the checked-out DB connection stays busy until they return. Long-running operations (HTTP calls, email sending, file I/O) inside the callback can exhaust the connection pool and delay the commit. Reserve it strictly for quick DB operations that must be atomic with the main write.
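The point that callbacks only run inside a `Repo` operation can be checked without a database. The following is a minimal sketch, assuming the `:ecto` package is available (the `Mix.install/1` line and version requirement are assumptions for running it as a standalone script) and using a schemaless changeset purely for illustration:

```elixir
# Assumption: standalone script; inside a Mix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical schemaless data and types, for illustration only.
types = %{name: :string}

changeset =
  {%{}, types}
  |> cast(%{"name" => "Ada"}, [:name])
  |> prepare_changes(fn cs ->
    # Would run only inside Repo.insert/update/delete, within the transaction.
    send(self(), :prepare_ran)
    cs
  end)

# The callback is queued on the changeset, not executed.
queued = length(changeset.prepare)

# No Repo call happened, so the message was never sent.
status =
  receive do
    :prepare_ran -> :ran
  after
    0 -> :not_run
  end

IO.inspect({queued, status})
```

Because nothing runs until `Repo` picks up the changeset, building or validating a changeset never triggers the side effect by accident.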
---
## 9. `apply_action/2` — Schemaless Validation
**Source:** [lib/ecto/changeset.ex#L2332](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2332)
**What it does:** Applies the changeset's changes to the underlying data and returns `{:ok, struct}` (or a map, for schemaless changesets) if valid, or `{:error, changeset}` if invalid. No DB operation is performed. On the error path it sets `changeset.action` to the given atom, which Phoenix form helpers use to decide whether to show error messages.
```elixir
# From lib/ecto/changeset.ex lines 2332-2338:
def apply_action(%Changeset{} = changeset, action) when is_atom(action) do
  if changeset.valid? do
    {:ok, apply_changes(changeset)}
  else
    {:error, %{changeset | action: action}}
  end
end
```
**Why:** Not all changeset workflows end with a DB write. Form objects (search filters, multi-step wizards, command structs) need the full changeset pipeline (cast, validate, present errors) without ever touching a database. `apply_action/2` is the endpoint for those workflows — it returns the same `{:ok, value} | {:error, changeset}` shape that `Repo.insert/update` returns, so callers and controllers work identically regardless of whether data is persisted.
**Anti-pattern:** Checking `changeset.valid?` manually and calling `apply_changes/1` yourself:
```elixir
# BAD — duplicates apply_action logic, misses setting changeset.action
def submit_search(params) do
  cs = SearchForm.changeset(%SearchForm{}, params)

  if cs.valid? do
    {:ok, Ecto.Changeset.apply_changes(cs)}
  else
    # action not set, so Phoenix won't show errors in the template
    {:error, cs}
  end
end
```
### When to Use
**Triggers:**
- You're using an embedded schema or schemaless changeset for a form that doesn't persist to the DB
- You need the `{:ok, struct} | {:error, changeset}` return shape for compatibility with standard controller patterns
- You want `changeset.action` to be set so Phoenix form helpers render inline errors correctly
**Example — before:**
```elixir
def process_filter(params) do
  changeset = FilterForm.changeset(%FilterForm{}, params)

  if changeset.valid? do
    filters = Ecto.Changeset.apply_changes(changeset)
    {:ok, filters}
  else
    # action not set, so form errors won't render
    {:error, changeset}
  end
end
```
**Example — after:**
```elixir
def process_filter(params) do
  %FilterForm{}
  |> FilterForm.changeset(params)
  |> Ecto.Changeset.apply_action(:validate)
end
```
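The same call works when there is no schema module at all. The following is a minimal sketch of a schemaless changeset ending in `apply_action/2`; the `SearchParams` module, its field names, and the `Mix.install/1` line are hypothetical and only there to make the script runnable on its own:

```elixir
# Assumption: standalone script; inside a Mix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule SearchParams do
  # Hypothetical module; fields are illustrative only.
  import Ecto.Changeset

  @types %{query: :string, page: :integer}

  def parse(params) do
    # {data, types} stands in for a schema struct.
    {%{page: 1}, @types}
    |> cast(params, Map.keys(@types))
    |> validate_required([:query])
    |> validate_number(:page, greater_than: 0)
    |> apply_action(:validate)
  end
end

# Valid input: the result is a plain map with coerced values.
{:ok, filters} = SearchParams.parse(%{"query" => "ecto", "page" => "2"})
IO.inspect(filters)

# Invalid input: the changeset comes back with its action set.
{:error, changeset} = SearchParams.parse(%{"page" => "0"})
IO.inspect(changeset.action)
```

Callers get the same `{:ok, value} | {:error, changeset}` shape as with a schema-backed form, so controller code does not need to know which kind of changeset was used.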
### When NOT to Use
**Don't use this when:**
- You intend to persist the data — use `Repo.insert/2` or `Repo.update/2` directly
- You need to inspect `changeset.changes` rather than the applied struct; work with the changeset directly instead
- Downstream code depends on the changeset's existing `action`: on failure `apply_action/2` overwrites it, and atoms such as `:validate`, `:insert`, and `:update` have semantic meaning in some form libraries
**Over-application example:**
```elixir
# apply_action before persisting is redundant: Repo.insert already rejects
# invalid changesets and applies the changes itself
def create_user(attrs) do
  changeset = User.changeset(%User{}, attrs)

  with {:ok, user} <- Ecto.Changeset.apply_action(changeset, :insert) do
    Repo.insert(User.changeset(user, %{}))
  end
end
```
**Better alternative:**
```elixir
def create_user(attrs) do
  %User{}
  |> User.changeset(attrs)
  |> Repo.insert()
end
```
**Why:** `Repo.insert/2` already checks `valid?`, returns `{:error, changeset}` with the action set when the changeset is invalid, and applies the changes itself. Calling `apply_action/2` before `Repo.insert/2` builds the changeset twice and rebuilds the struct unnecessarily.
---
## 10. `cast_assoc/3` vs `put_assoc/4` — External vs Internal Association Changes
**Source:** [lib/ecto/changeset.ex#L1213](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L1213) (`cast_assoc`)
**What it does:** `cast_assoc/3` handles nested external data by invoking the associated schema's changeset function for each nested params map. `put_assoc/4` replaces the association directly with a struct or list of structs that your code has already built and trusts.
```elixir
# cast_assoc: external data flows through the association's changeset function
def changeset(post, params) do
  post
  |> cast(params, [:title, :body])
  |> cast_assoc(:comments, with: &Comment.changeset/2)
end

# put_assoc: trusted internal data replaces the association wholesale
def assign_tags(post, tags) when is_list(tags) do
  post
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:tags, tags)
end

# When updating an existing record, preload the association first;
# both functions raise if it is still %Ecto.Association.NotLoaded{}.
```
**Why:** `cast_assoc/3` delegates to the association's own changeset function, which applies its own cast, validate_required, constraints, etc. This means all the safety guarantees of the association's changeset apply to nested user input. `put_assoc/4` bypasses the association's changeset entirely — it trusts that you've already validated the data. Using `put_assoc/4` with raw external params would skip all of the association's validation.
**Anti-pattern:** Using `put_assoc/4` with external/untrusted nested params bypasses validation:
```elixir
# BAD — building Comment structs from raw params skips Comment's changeset
def changeset(post, params) do
  comments =
    Enum.map(params["comments"] || [], fn p ->
      struct(Comment, %{body: p["body"], post_id: post.id})
    end)

  post
  |> cast(params, [:title])
  |> put_assoc(:comments, comments)

  # Comment's validate_required(:body), validate_length(:body), etc. never run
end
```
### When to Use `cast_assoc/3`
**Triggers:**
- Nested params arrive from a user-submitted form or API body (e.g., a post with embedded comments)
- The association has its own changeset function with validations you want applied
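- You need nested validation errors to surface through the parent changeset (the child changesets are stored under the parent's `changes`, mark the parent invalid, and can be collected with `traverse_errors/2`)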
**Example — before:**
```elixir
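def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> put_assoc(:line_items, build_line_items(params["line_items"]))
end

defp build_line_items(nil), do: []

# No casting or validation happens here; struct/2 also silently discards
# keys that are not struct fields (including string keys from raw params).
defp build_line_items(items), do: Enum.map(items, &struct(LineItem, &1))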
```
**Example — after:**
```elixir
def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> cast_assoc(:line_items, with: &LineItem.changeset/2)
end
```
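Validation failures from `cast_assoc/3` stay on the child changesets rather than in the parent's own `errors` list. The following is a minimal sketch of how they can be collected with `traverse_errors/2`; the `Demo.Post` and `Demo.Comment` schemas and the `Mix.install/1` line are hypothetical, and no database is involved:

```elixir
# Assumption: standalone script; inside a Mix project, drop this line.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Demo.Comment do
  # Hypothetical schema for illustration.
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :body, :string
    belongs_to :post, Demo.Post
  end

  def changeset(comment, params) do
    comment
    |> cast(params, [:body])
    |> validate_required([:body])
  end
end

defmodule Demo.Post do
  # Hypothetical schema for illustration.
  use Ecto.Schema
  import Ecto.Changeset

  schema "posts" do
    field :title, :string
    has_many :comments, Demo.Comment
  end

  def changeset(post, params) do
    post
    |> cast(params, [:title])
    |> cast_assoc(:comments, with: &Demo.Comment.changeset/2)
  end
end

changeset =
  Demo.Post.changeset(%Demo.Post{}, %{
    "title" => "Hello",
    "comments" => [%{"body" => "first"}, %{"body" => ""}]
  })

# The parent is invalid because one child failed validate_required,
# yet the parent's own error list stays empty.
IO.inspect(changeset.valid?)
IO.inspect(changeset.errors)

# traverse_errors/2 walks the nested changesets and returns a nested map.
nested = Ecto.Changeset.traverse_errors(changeset, fn {msg, _opts} -> msg end)
IO.inspect(nested)
# %{comments: [%{}, %{body: ["can't be blank"]}]}
```

The list under `:comments` keeps one entry per submitted comment, so a form can match each error map to its row by position.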
### When to Use `put_assoc/4`
**Triggers:**
- You're loading existing records from the DB and associating them (e.g., tagging a post with existing Tag records)
- The associated structs were built by your application code, not by user input
- You want to replace the entire association with a pre-validated list
**Example:**
```elixir
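# Assumes `import Ecto.Query` in the enclosing module.
def update_post_tags(post, tag_ids) do
  tags = Repo.all(from t in Tag, where: t.id in ^tag_ids)

  post
  # put_assoc raises on an existing record whose association is not loaded
  |> Repo.preload(:tags)
  |> Ecto.Changeset.change()
  # Dropping existing tags also requires an :on_replace option on the association
  |> Ecto.Changeset.put_assoc(:tags, tags)
  |> Repo.update()
end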
```
### When NOT to Use
**Don't use `cast_assoc` when:**
- You have pre-built, validated structs — `put_assoc` is simpler and more explicit
- The association doesn't have (or need) a changeset function
- You're building a test fixture and want direct struct assignment
**Don't use `put_assoc` when:**
- Any part of the association data came from user input — use `cast_assoc` so the association's validators run
**Over-application example:**
```elixir
# put_assoc for external data: Profile.changeset validations never run
# (assuming atom-keyed params; struct/2 silently drops string keys)
def changeset(user, params) do
  profile = struct(Profile, params["profile"] || %{})

  user
  |> cast(params, [:email])
  |> put_assoc(:profile, profile)
end
```
**Better alternative:**
```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> cast_assoc(:profile, with: &Profile.changeset/2)
end
```
**Why:** `put_assoc/4` with externally-sourced structs is a silent validation bypass. If `Profile.changeset/2` validates that `:bio` must be under 500 characters, `put_assoc/4` with a user-supplied profile will store any length — the validation simply never ran.
---
## Decision Tree
Use this tree when deciding which Ecto changeset function to reach for:
- **Data comes from a user, form, or API?** → Use `cast/4` (type coercion + allow-list)
- **Data comes from your application code?** → Use `change/2` (no coercion, no filtering)
- **Need to validate a value without hitting the DB?** → Use `validate_required`, `validate_format`, `validate_length`, `validate_number`, `validate_inclusion`, `validate_subset`, or `validate_change/3` for custom rules
- **Need DB-level uniqueness guarantee (race-safe)?** → Use `unique_constraint/2` (not `unsafe_validate_unique/3` alone)
- **Setting a field to a computed value that might not have changed?** → Use `put_change/3` (skips write if unchanged); use `force_change/3` only when the DB write must occur regardless
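- **Need a quick DB operation to run atomically inside the same transaction as the write?** → Use `prepare_changes/2` (keep slow side effects such as email or HTTP calls outside it)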
- **Validating without persisting (form objects, embedded schemas)?** → Use `apply_action/2`
- **Nested association data comes from external params?** → Use `cast_assoc/3` (delegates to association's changeset function)
- **Nested association data is trusted internal data (loaded from DB)?** → Use `put_assoc/4`
<!-- PATTERN_COMPLETE -->