aweiker 10218813d3 docs: backfill TOC + decision trees, fix review findings

- Add ## Contents and ## Decision Tree to all 10 existing pattern files
- Fix embed_as/1 semantics inversion in types.md (:self → :dump)
- Fix fabricated __meta__.changes reference in changesets.md
- Fix default primary key type (:integer → :id) in schemas.md
- Combine @impl subsections into single "Minimal Callback Annotation"

2026-05-01 22:13:35 -07:00

Changeset Patterns in Ecto

Patterns extracted from Ecto's source code for building safe, composable data pipelines.

cast/4 — The External/Internal Data Boundary
change/2 — Internal-Only Modifications
Validation Pipeline — Composable Validators
validate_change/3 — Custom Validators
add_error/4 — Manual Error Injection
put_change/3 vs force_change/3 — Tracked vs Forced Changes
Constraints vs Validations — DB-level Safety
prepare_changes/2 — Last-Mile DB-Aware Transforms
apply_action/2 — Schemaless Validation
cast_assoc/3 vs put_assoc/4 — External vs Internal Association Changes

1. `cast/4` — The External/Internal Data Boundary

Source: lib/ecto/changeset.ex#L729

What it does: cast/4 is the entry point for all externally-supplied data (user input, API params, form submissions). It converts string keys to atom keys, coerces string values to the schema's declared types, and records which fields were changed. The permitted-fields list is an explicit allow-list — unlisted fields are silently dropped.

# From the Ecto docs (lib/ecto/changeset.ex line 108-115):
def changeset(user, params \\ %{}) do
  user
  |> cast(params, [:name, :email, :age])
  |> validate_required([:name, :email])
  |> validate_format(:email, ~r/@/)
  |> validate_inclusion(:age, 18..100)
  |> unique_constraint(:email)
end

Why: External data is untrusted and untyped (form submissions arrive as strings). cast/4 establishes a hard boundary: everything outside the permitted list is discarded, and everything inside is type-checked against the schema. Using cast/4 for external data and change/2 for internal data makes the trust boundary explicit and auditable — grep for cast( to find every place user data enters the system.
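A minimal sketch of that boundary, assuming a schemaless changeset so it runs without a database. The field names and params are hypothetical, and Mix.install is only there to fetch Ecto when the file is run as a script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical field types; {data, types} stands in for a schema struct.
types = %{name: :string, age: :integer, admin: :boolean}

# Params as they would arrive from a form: string keys, string values.
params = %{"name" => "Ada", "age" => "36", "admin" => "true"}

# Only :name and :age are permitted, so "admin" never reaches the changeset.
changeset = Ecto.Changeset.cast({%{}, types}, params, [:name, :age])
# changeset.changes == %{name: "Ada", age: 36}

# A value that cannot be coerced becomes a changeset error instead of raising.
bad = Ecto.Changeset.cast({%{}, types}, %{"age" => "abc"}, [:age])
# bad.valid? == false, with an error recorded under :age
```

The "admin" key is present in the params but absent from the changes, which is the allow-list doing its job.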

Anti-pattern: Using change/2 for external data bypasses type coercion and the allow-list:

# BAD — no type coercion, no field filtering, all params accepted
def changeset(user, params) do
  user
  |> change(params)
  |> validate_required([:name, :email])
end

When to Use

Triggers:

Data originates from a controller params map, an API request body, or any user-submitted form
Field values arrive as strings that need type coercion (integers, booleans, dates)
You need to whitelist which fields a user is allowed to set

Example — before:

def update_profile(user, attrs) do
  # Directly merging external attrs skips coercion and allows mass assignment
  changeset = Ecto.Changeset.change(user, attrs)
  Repo.update(changeset)
end

Example — after:

def update_profile(user, attrs) do
  user
  |> Ecto.Changeset.cast(attrs, [:name, :bio, :avatar_url])
  |> Ecto.Changeset.validate_required([:name])
  |> Repo.update()
end

When NOT to Use

Don't use this when:

The data originates from application code (system-generated timestamps, counters, foreign keys computed by your own logic)
The values are already the correct Elixir types and have been validated elsewhere
You're building a test fixture and want to set fields directly

Over-application example:

# Overkill — casting internal data that's already correctly typed
def mark_confirmed(user) do
  user
  |> cast(%{confirmed_at: DateTime.utc_now()}, [:confirmed_at])
  |> Repo.update!()
end

Better alternative:

def mark_confirmed(user) do
  user
  |> Ecto.Changeset.change(confirmed_at: DateTime.utc_now())
  |> Repo.update!()
end

Why: cast/4 has a runtime cost (string key normalization, type coercion, allow-list filtering) that adds no value when the data is already correctly typed. It also signals to readers that external data is flowing in — misusing it for internal data erodes that signal.

2. `change/2` — Internal-Only Modifications

Source: lib/ecto/changeset.ex#L491

What it does: change/2 wraps a struct in a changeset (or merges changes into an existing one) without any type coercion. It accepts a map or keyword list of already-typed values. No allow-list filtering occurs — all given fields are set directly.

# Used for trusted, already-typed internal data
def activate_subscription(sub) do
  sub
  |> Ecto.Changeset.change(
    status: :active,
    activated_at: DateTime.utc_now(),
    trial_ends_at: nil
  )
  |> Repo.update!()
end

Why: When your application computes a value (a system timestamp, a foreign key returned from a previous query, a status atom), you already know its type. Running it through cast/4 would add coercion work that can't succeed or fail in a meaningful way — the value either matches the schema type already or it's a bug in your code. change/2 communicates "I trust this data" to readers.
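The flip side of "I trust this data" can be sketched as follows: change/2 stores whatever it is given, so a mistyped value passes straight through. This is an illustration with hypothetical schemaless data, not a recommended usage.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical schemaless data; every field starts out as nil.
data = %{age: nil, active: nil}
types = %{age: :integer, active: :boolean}

# change/2 stores values exactly as given: no coercion, no allow-list.
changed = Ecto.Changeset.change({data, types}, %{age: "42", active: "true"})
# changed.changes == %{age: "42", active: "true"}

# cast/4 on the same values coerces them to the declared types.
casted = Ecto.Changeset.cast({data, types}, %{age: "42", active: "true"}, [:age, :active])
# casted.changes == %{age: 42, active: true}
```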

Anti-pattern: Using cast/4 for programmatic data creates phantom external-data semantics:

# BAD — cast signals "user data" but this comes from application logic
def set_password_reset_token(user, token) do
  user
  |> cast(%{reset_token: token, reset_token_expires_at: hours_from_now(2)}, [:reset_token, :reset_token_expires_at])
  |> Repo.update!()
end

When to Use

Triggers:

Values come from your own application code, not from user input
Types are already correct (atom status values, DateTime structs, integer IDs from prior queries)
You're updating fields that users are never allowed to set directly (internal audit fields, system state)

Example — before:

def record_login(user) do
  user
  |> cast(%{last_login_at: DateTime.utc_now(), login_count: user.login_count + 1}, [:last_login_at, :login_count])
  |> Repo.update!()
end

Example — after:

def record_login(user) do
  user
  |> Ecto.Changeset.change(
    last_login_at: DateTime.utc_now(),
    login_count: user.login_count + 1
  )
  |> Repo.update!()
end

When NOT to Use

Don't use this when:

Any part of the data originates from user input or an external API response
You need type coercion (e.g., converting a string "true" to boolean true)
You want validation errors to propagate via the changeset (use cast/4 + validators)

Over-application example:

# Using change/2 for external data bypasses all safety checks
def create_user(params) do
  %User{}
  |> Ecto.Changeset.change(params)   # params from controller — untrusted strings!
  |> Repo.insert()
end

Better alternative:

def create_user(params) do
  %User{}
  |> Ecto.Changeset.cast(params, [:name, :email, :password])
  |> Ecto.Changeset.validate_required([:name, :email, :password])
  |> Repo.insert()
end

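Why: change/2 skips type coercion entirely. A string "true" stays a string instead of becoming boolean true, and a string "42" stays a string instead of becoming integer 42. Controller params are also string-keyed, while change/2 expects atom field names, so in current Ecto versions the call typically raises before anything is stored. When atom-keyed but wrongly typed values do get through, the mismatch surfaces only at write time, usually as an Ecto.ChangeError raised when the repo dumps the value, far from the code that introduced it.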

3. Validation Pipeline — Composable Validators

Source: lib/ecto/changeset.ex#L2596 (validate_required), #L2920 (validate_format), #L3083 (validate_length), #L3275 (validate_number), #L3451 (validate_inclusion), #L2987 (validate_subset)

What it does: Every built-in validator takes a changeset as its first argument and returns a changeset, enabling pipe composition. Validators accumulate errors in changeset.errors rather than raising. Once valid? is false, subsequent validators still run, so all errors are collected in one pass. The practical exception is a missing field: it has no change, so validators built on validate_change/3 skip it, and validate_required is the one that reports it.

def changeset(product, params) do
  product
  |> cast(params, [:name, :price, :category, :tags])
  |> validate_required([:name, :price])
  |> validate_length(:name, min: 2, max: 100)
  |> validate_number(:price, greater_than: 0)
  |> validate_inclusion(:category, ~w(electronics clothing food))
  |> validate_subset(:tags, ~w(sale featured new))
  |> validate_format(:name, ~r/\A[[:print:]]+\z/, message: "must be printable characters")
end

Why: The pipe model means you can add, remove, or reorder validators without changing surrounding code. Each validator is independent — it doesn't know about others. This lets you compose context-specific changeset functions (e.g., admin_changeset/2 permitting extra fields, then calling the same validators as user_changeset/2). Validators run entirely in-process and BEFORE constraint checks, so users get fast feedback without a DB round-trip.
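As a rough illustration of error accumulation, the sketch below runs a pipeline over hypothetical schemaless data in which two fields fail. Both failures end up in changeset.errors.

```elixir
Mix.install([{:ecto, "~> 3.10"}])
alias Ecto.Changeset

# Hypothetical product fields; the params fail two independent rules.
data = %{name: nil, price: nil}
types = %{name: :string, price: :integer}
params = %{"name" => "x", "price" => "-5"}

changeset =
  {data, types}
  |> Changeset.cast(params, [:name, :price])
  |> Changeset.validate_required([:name, :price])
  |> Changeset.validate_length(:name, min: 2, max: 100)
  |> Changeset.validate_number(:price, greater_than: 0)

# Both failures are recorded; the first error does not stop the pipeline.
failed_fields = changeset.errors |> Keyword.keys() |> Enum.sort()
# failed_fields == [:name, :price]
```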

Anti-pattern: Manually checking fields and accumulating errors with ad-hoc conditionals:

# BAD — imperative validation, hard to extend, misses Ecto's error format
def validate(params) do
  errors = []
  errors = if params[:name] == nil, do: [{:name, "can't be blank"} | errors], else: errors
  errors = if String.length(params[:name] || "") < 2, do: [{:name, "too short"} | errors], else: errors
  errors
end

When to Use

Triggers:

After cast/4, you need to verify format, range, presence, or membership constraints
You want all validation errors reported in a single pass (not stop on first failure)
You're building a changeset function that other modules or contexts will reuse

Example — before:

def create_account(params) do
  changeset = cast(%Account{}, params, [:username, :age])
  if get_change(changeset, :username) == nil do
    add_error(changeset, :username, "can't be blank")
  else
    changeset
  end
end

Example — after:

def changeset(account, params) do
  account
  |> cast(params, [:username, :age])
  |> validate_required([:username])
  |> validate_length(:username, min: 3, max: 20)
  |> validate_number(:age, greater_than_or_equal_to: 18)
end

When NOT to Use

Don't use this when:

Validation requires a DB query (use a constraint instead, or validate_change/3 with a query and accept the race condition risk)
The rule is so context-specific it only applies once (inline add_error/4 may be cleaner than a full custom validator)
You're validating data that will never be persisted (prefer pure functions returning {:ok, _} | {:error, _})

Over-application example:

# Validating internal system config that can never be wrong at runtime
def system_config_changeset(config, attrs) do
  config
  |> cast(attrs, [:timeout_ms, :pool_size])
  |> validate_number(:timeout_ms, greater_than: 0)
  |> validate_number(:pool_size, greater_than: 0)
  # Config comes from compile-time env vars checked at boot — this is noise
end

Better alternative:

# Validate config at application boot with a clear error, not at runtime
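def validate_config!(timeout_ms, pool_size) do
  unless is_integer(timeout_ms) and timeout_ms > 0 do
    raise ArgumentError, "timeout_ms must be a positive integer, got: #{inspect(timeout_ms)}"
  end
  unless is_integer(pool_size) and pool_size > 0 do
    raise ArgumentError, "pool_size must be a positive integer, got: #{inspect(pool_size)}"
  end
  :ok
end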

Why: Changeset validators are designed for user-facing feedback in multi-error scenarios. For internal configuration validated once at boot, a plain raise with a clear message is faster to write, easier to understand, and appropriate for code that should crash if misconfigured.

4. `validate_change/3` — Custom Validators

Source: lib/ecto/changeset.ex#L2508

What it does: The primitive that most built-in validators delegate to. Takes a (field, value) -> [{field, message}] function. If the field has no change (it is absent from changeset.changes, or the change is nil), the validator is skipped entirely. Returning [] means valid; returning a non-empty list adds errors and sets valid? to false.

# From lib/ecto/changeset.ex lines 2508-2528:
def validate_change(%Changeset{} = changeset, field, validator) when is_atom(field) do
  %{changes: changes, types: types, errors: errors} = changeset
  ensure_field_exists!(changeset, types, field)
  value = Map.get(changes, field)
  new = if is_nil(value), do: [], else: validator.(field, value)
  case new do
    [] -> changeset
    [_ | _] -> %{changeset | errors: new ++ errors, valid?: false}
  end
end

Why: The "only called if the field has a change" behavior is load-bearing. On an update changeset where the user didn't touch a field, validators for that field don't run. This avoids re-validating unchanged data and prevents false errors when partial updates are allowed. If you need to validate the current value regardless of whether it changed, read it with get_field/3 (or from changeset.data) and report problems with add_error/4.
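The skip behavior can be observed directly. In this sketch the validator (a hypothetical reject_all function) rejects every value it sees, yet the changeset stays valid when the field is not part of the params.

```elixir
Mix.install([{:ecto, "~> 3.10"}])
alias Ecto.Changeset

# Hypothetical validator that rejects every value it is asked about.
reject_all = fn field, _value -> [{field, "is never valid"}] end

data = %{sku: "OLD-1"}
types = %{sku: :string}

# No :sku in params, so there is no change and the validator is not called.
untouched =
  {data, types}
  |> Changeset.cast(%{}, [:sku])
  |> Changeset.validate_change(:sku, reject_all)

# A new :sku value is a change, so the validator runs and adds its error.
touched =
  {data, types}
  |> Changeset.cast(%{"sku" => "NEW-2"}, [:sku])
  |> Changeset.validate_change(:sku, reject_all)

# untouched.valid? == true, touched.valid? == false
```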

Anti-pattern: Reimplementing the "has change?" check inside the validator:

# BAD — redundant check; validate_change already skips if :sku is unchanged
def changeset(item, params) do
  changeset = cast(item, params, [:sku])

  if Map.has_key?(changeset.changes, :sku) do
    value = get_change(changeset, :sku)
    if valid_sku?(value), do: changeset, else: add_error(changeset, :sku, "invalid format")
  else
    changeset
  end
end

When to Use

Triggers:

None of the built-in validators (validate_format, validate_length, etc.) fit your rule
Your validation logic requires calling a helper function with domain knowledge
You're building a reusable custom validator to share across multiple changeset functions

Example — before:

def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_change(:unit, fn :unit, value ->
    if value in ["kg", "lb", "oz", "g"] do
      []
    else
      [{:unit, "must be a known unit of weight"}]
    end
  end)
end

Example — after:

@weight_units ~w(kg lb oz g)

def validate_weight_unit(changeset, field) do
  validate_change(changeset, field, fn _, value ->
    if value in @weight_units do
      []
    else
      [{field, {"must be a known unit of weight, got: %{value}", [value: value]}}]
    end
  end)
end

def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_weight_unit(:unit)
end

When NOT to Use

Don't use this when:

A built-in validator already covers the rule (validate_inclusion would handle the weight unit example above)
The validation requires a DB query — use a constraint or accept that you need a repo call
You need to validate the stored value (not the change) — read from the struct instead

Over-application example:

# validate_change for something validate_format handles natively
validate_change(changeset, :email, fn :email, value ->
  if String.match?(value, ~r/@/), do: [], else: [{:email, "must contain @"}]
end)

Better alternative:

validate_format(changeset, :email, ~r/@/)

Why: Built-in validators produce consistent error message structures (with interpolation keys for translation), are well-documented, and are immediately recognizable to Ecto users. Reach for validate_change/3 only when you need logic the built-ins cannot express.

5. `add_error/4` — Manual Error Injection

Source: lib/ecto/changeset.ex#L2460

What it does: Directly adds an error tuple to changeset.errors and sets valid? to false. Takes (changeset, key, message, opts) where opts typically contains validation: metadata. Unlike validate_change/3, it always adds the error — there is no "only if the field changed" guard.

# Adding a cross-field validation error
def changeset(event, params) do
  event
  |> cast(params, [:start_date, :end_date])
  |> validate_required([:start_date, :end_date])
  |> validate_date_range()
end

defp validate_date_range(changeset) do
  start_date = get_field(changeset, :start_date)
  end_date = get_field(changeset, :end_date)

  if start_date && end_date && Date.compare(end_date, start_date) == :lt do
    add_error(changeset, :end_date, "must be on or after start date")
  else
    changeset
  end
end

Why: Some validation rules span multiple fields or depend on context that isn't available to validate_change/3's field-scoped callback. add_error/4 gives you an escape hatch to inject errors after you've already determined the problem, without needing to wrap your logic in a validate_change/3 call that only has access to one field's value.

Anti-pattern: Using add_error/4 unconditionally (forgetting the conditional):

# BAD — always adds the error, making the changeset permanently invalid
def changeset(user, params) do
  user
  |> cast(params, [:role])
  |> add_error(:role, "admin role requires approval")  # Fires even for non-admin roles!
end

When to Use

Triggers:

Validation logic spans two or more fields (start/end dates, password/confirmation, price/discount)
You've already computed whether an error applies (result of a service call, a complex business rule) and just need to record it
You're building a wrapper that receives errors from an external system and maps them onto changeset fields

Example — before:

def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_change(:seats, fn :seats, seats ->
    plan = subscription.plan  # Stale: the callback cannot see a :plan change in the changeset
    max = plan_seat_limit(plan)
    if seats > max, do: [{:seats, "exceeds limit for #{plan} plan"}], else: []
  end)
end

Example — after:

def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_required([:plan, :seats])
  |> validate_seat_limit()
end

defp validate_seat_limit(changeset) do
  plan = get_field(changeset, :plan)
  seats = get_field(changeset, :seats)

  if plan && seats && seats > plan_seat_limit(plan) do
    add_error(changeset, :seats, "exceeds limit for %{plan} plan", plan: plan)
  else
    changeset
  end
end

When NOT to Use

Don't use this when:

A built-in validator covers the rule — add_error/4 bypasses standard validation metadata
The error applies to only one field and doesn't depend on other fields (use validate_change/3)
You want the error to be skipped if the field hasn't changed (use validate_change/3)

Over-application example:

# Manually adding what validate_required already does
def changeset(user, params) do
  cs = cast(user, params, [:email])
  if get_change(cs, :email) == nil do
    add_error(cs, :email, "can't be blank")
  else
    cs
  end
end

Better alternative:

def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> validate_required([:email])
end

Why: Built-in validators attach validation: metadata to the error opts (for example validation: :required). Code that inspects those opts, such as an error translator or an API serializer that branches on the validation kind, finds no such key on an error added with a bare add_error/4 call unless you pass it yourself.
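To see the difference in the stored error tuples, here is a small sketch over hypothetical schemaless data. The interpolation helper passed to traverse_errors/2 follows the pattern shown in the Ecto documentation.

```elixir
Mix.install([{:ecto, "~> 3.10"}])
alias Ecto.Changeset

changeset =
  {%{email: nil, seats: nil}, %{email: :string, seats: :integer}}
  |> Changeset.cast(%{}, [:email, :seats])
  |> Changeset.validate_required([:email])
  |> Changeset.add_error(:seats, "exceeds limit for %{plan} plan", plan: "pro")

# The built-in validator tags its error; add_error/4 stores only what you pass.
email_error = Keyword.fetch!(changeset.errors, :email)
# email_error == {"can't be blank", [validation: :required]}
seats_error = Keyword.fetch!(changeset.errors, :seats)
# seats_error == {"exceeds limit for %{plan} plan", [plan: "pro"]}

# Fill the %{key} placeholders from each error's opts.
messages =
  Changeset.traverse_errors(changeset, fn {msg, opts} ->
    Regex.replace(~r"%{(\w+)}", msg, fn _, key ->
      opts |> Keyword.get(String.to_existing_atom(key), key) |> to_string()
    end)
  end)
# messages == %{email: ["can't be blank"], seats: ["exceeds limit for pro plan"]}
```

If downstream code relies on the validation key, it can be passed explicitly in the opts given to add_error/4.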

6. `put_change/3` vs `force_change/3` — Tracked vs Forced Changes

Source: lib/ecto/changeset.ex#L1959 (put_change), #L2248 (force_change)

What it does: put_change/3 compares the new value against the current struct field value; if they are equal, it does NOT add the change (the field stays absent from changeset.changes). force_change/3 always adds the field to changeset.changes regardless of whether the value differs.

user = %User{name: "Alice", role: :member}
cs = Ecto.Changeset.change(user)

# put_change: no-op if value matches current
cs1 = Ecto.Changeset.put_change(cs, :name, "Alice")
cs1.changes  #=> %{}  (no change recorded)

# force_change: always records the change
cs2 = Ecto.Changeset.force_change(cs, :name, "Alice")
cs2.changes  #=> %{name: "Alice"}  (change recorded, triggers DB write)

Why: Ecto uses changeset.changes to build the SQL UPDATE statement — only changed fields are included. If put_change/3 detects no actual change, the field is excluded from the UPDATE, saving a write and avoiding spurious updated_at bumps. force_change/3 marks the changeset dirty unconditionally, which triggers a DB write even if the row would be identical afterward. Default to put_change/3; reach for force_change/3 only when you specifically need the DB write to happen (e.g., triggering a DB trigger, bumping updated_at intentionally).

Anti-pattern: Using force_change/3 by default causes unnecessary DB writes on every update:

# BAD — forces every field into changes even when nothing actually changed
def normalize_user(changeset) do
  name = get_field(changeset, :name)
  force_change(changeset, :name, String.trim(name || ""))
end

When to Use `put_change/3`

Triggers:

Setting a field to a computed value that might or might not differ from the current value
Normalizing input (trim whitespace, downcase email) — the result may equal what's already stored
Any programmatic field update where skipping unnecessary DB writes is correct behavior

Example — before:

def normalize(changeset) do
  email = get_field(changeset, :email)
  # force_change even if email is already lowercase
  force_change(changeset, :email, String.downcase(email || ""))
end

Example — after:

def normalize(changeset) do
  email = get_field(changeset, :email)
  put_change(changeset, :email, String.downcase(email || ""))
end

When NOT to Use `put_change/3` (use `force_change/3`)

Don't use put_change/3 when:

You need to unconditionally bump updated_at even when no data changed
A DB trigger or audit log depends on the UPDATE statement firing regardless
You're implementing optimistic locking and need to force a version increment

Over-application example:

# Using force_change everywhere "just to be safe"
def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> then(fn cs ->
    cs.changes
    |> Enum.reduce(cs, fn {field, value}, acc ->
      force_change(acc, field, value)  # Redundant — changes are already in changeset.changes
    end)
  end)
end

Better alternative:

def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> validate_required([:name, :email])
end

Why: cast/4 already correctly populates changeset.changes with only the fields that differ. Calling force_change/3 on top of a cast changeset doubles the work and makes every update unconditionally dirty, preventing Ecto's change-detection optimization from functioning.

7. Constraints vs Validations — DB-level Safety

Source: lib/ecto/changeset.ex#L3916 (unique_constraint), #L3987 (foreign_key_constraint), #L3777 (check_constraint)

What it does: Constraint functions register a DB error name → changeset error mapping. They do NOT run a query themselves. When Repo.insert/2 or Repo.update/2 raises a DB constraint violation, Ecto catches it and converts it into a changeset error on the registered field, returning {:error, changeset} instead of raising.

# From lib/ecto/changeset.ex (unique_constraint/3):
def unique_constraint(changeset, [first_field | _] = fields, opts) do
  name = opts[:name] || unique_index_name(changeset, fields)
  message = constraint_message(opts, "has already been taken")
  match_type = Keyword.get(opts, :match, :exact)
  error_key = Keyword.get(opts, :error_key, first_field)
  add_constraint(changeset, :unique, name, match_type, error_key, message, :unique)
end

Why: A uniqueness check via validate_change/3 (selecting before inserting) has a race condition — two concurrent requests can both pass the select, then both attempt the insert, with one failing with an unhandled DB error. Constraints delegate to the DB's atomic enforcement. The constraint name in the changeset must exactly match the index name in the migration (Ecto generates table_field_index by default, or you can override with name:).
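The "registration only" behavior can be checked without a database. The sketch below assumes a hypothetical Demo.User schema and simply inspects what unique_constraint/3 stored on the changeset.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical schema; no Repo or database is involved in this sketch.
defmodule Demo.User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
  end
end

changeset =
  %Demo.User{}
  |> Ecto.Changeset.cast(%{"email" => "ada@example.com"}, [:email])
  |> Ecto.Changeset.unique_constraint(:email)

# Registering the constraint ran no query and added no error.
# changeset.valid? == true

# The registration is metadata keyed by the default index name.
[constraint] = changeset.constraints
# constraint.constraint == "users_email_index"
# constraint.type == :unique
# constraint.field == :email
```

The default name shown here is what the migration's unique index must be called for the mapping to apply; otherwise pass name: explicitly.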

Anti-pattern: Using unsafe_validate_unique/3 as a replacement for unique_constraint/3:

# RISKY — has a TOCTOU race condition under concurrent load
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> unsafe_validate_unique(:email, MyApp.Repo)
  # Two concurrent registrations with the same email can both pass this check
end

When to Use

Triggers:

Uniqueness must be guaranteed across concurrent inserts (always use unique_constraint, not unsafe_validate_unique alone)
A foreign key references a row that might not exist (use foreign_key_constraint)
The DB has a check constraint that corresponds to a business rule (use check_constraint)

Example — before:

def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  # No constraint registration — DB errors bubble up as Postgrex.Error exceptions
end

Example — after:

def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  |> unique_constraint(:email)
  |> unique_constraint(:username)
  |> foreign_key_constraint(:organization_id)
end

When NOT to Use

Don't use this when:

The DB index doesn't exist: the registration never fires, and duplicate rows are inserted without any error
You need immediate feedback before hitting the DB (use unsafe_validate_unique as a UX hint, then also add unique_constraint)
The rule is application-level only and has no corresponding DB constraint

Over-application example:

# Registering a unique_constraint for a field with no DB index
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
  # No index in the migration — this constraint registration does nothing
end

Better alternative:

# First, add the index in a migration:
# create unique_index(:orders, [:reference_number])

# Then register the constraint:
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
end

Why: If the DB index doesn't exist, the DB will never raise the expected constraint error and the unique_constraint/3 registration is silently a no-op. Always add the migration index first, then register the constraint in the changeset.

8. `prepare_changes/2` — Last-Mile DB-Aware Transforms

Source: lib/ecto/changeset.ex#L3706

What it does: Registers a callback to run inside the database transaction, just before the DB operation executes. At callback time, changeset.repo is populated with the Repo module. Used for side effects that must be atomic with the main insert/update (counter-caches, denormalized fields, audit log entries).

# From lib/ecto/changeset.ex lines 3689-3703:
def create_comment(comment, params) do
  comment
  |> cast(params, [:body, :post_id])
  |> prepare_changes(fn changeset ->
       if post_id = get_change(changeset, :post_id) do
         query = from Post, where: [id: ^post_id]
         changeset.repo.update_all(query, inc: [comment_count: 1])
       end
       changeset
     end)
end

Why: Counter-caches and denormalized aggregates must update atomically with the row that triggered them. If you increment comment_count outside the transaction, a crash between the insert and the increment leaves counts inconsistent. prepare_changes/2 guarantees both operations commit or both roll back. The callback receives the changeset as it is about to be written and must return a changeset, so it sees the actual values that will be persisted.
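Since the callback is deferred until the repo operation, calling prepare_changes/2 by itself has no side effects. A minimal sketch with hypothetical schemaless data, using a message to self() as a stand-in for the real side effect:

```elixir
Mix.install([{:ecto, "~> 3.10"}])
alias Ecto.Changeset

changeset =
  {%{body: nil}, %{body: :string}}
  |> Changeset.cast(%{"body" => "hello"}, [:body])
  |> Changeset.prepare_changes(fn cs ->
    # Would run inside the repo transaction; cs.repo is set at that point.
    send(self(), :prepare_ran)
    cs
  end)

# The callback is stored on the changeset, not executed.
registered = length(changeset.prepare)

ran_already? =
  receive do
    :prepare_ran -> true
  after
    0 -> false
  end

# registered == 1, ran_already? == false
```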

Anti-pattern: Performing side effects after Repo.insert/update outside a transaction:

# BAD — comment is saved but counter may not update if the second call crashes
def create_comment(attrs) do
  with {:ok, comment} <- Repo.insert(Comment.changeset(%Comment{}, attrs)) do
    Repo.update_all(
      from(p in Post, where: p.id == ^comment.post_id),
      inc: [comment_count: 1]
    )
    {:ok, comment}
  end
end

When to Use

Triggers:

You need to update a related row (counter-cache, denormalized field) atomically with the main operation
The side effect requires the Repo and must be inside the transaction
You're computing a value that depends on the final changeset state and needs a query inside the transaction (e.g., a slug that must not collide with existing rows)

Example — before:

def create_order_item(item, params) do
  case Repo.insert(OrderItem.changeset(item, params)) do
    {:ok, item} ->
      # Race condition: order total might be stale if this crashes or races
      Repo.update_all(
        from(o in Order, where: o.id == ^item.order_id),
        inc: [total_cents: item.price_cents]
      )
      {:ok, item}
    error -> error
  end
end

Example — after:

def changeset(item, params) do
  item
  |> cast(params, [:price_cents, :order_id, :product_id])
  |> validate_required([:price_cents, :order_id])
  |> prepare_changes(fn changeset ->
    if price = get_change(changeset, :price_cents) do
      order_id = get_field(changeset, :order_id)
      changeset.repo.update_all(
        from(o in Order, where: o.id == ^order_id),
        inc: [total_cents: price]
      )
    end
    changeset
  end)
end

When NOT to Use

Don't use this when:

The side effect doesn't need to be in the same transaction (e.g., sending an email: trigger it after Repo.insert/2 or the Ecto.Multi transaction has returned {:ok, _})
The computation doesn't need Repo access (use a regular validate_change/3 or map transformation before the changeset reaches Repo)
You need the side effect to run even when the changeset is invalid (callbacks only run when valid? is true)

Over-application example:

# prepare_changes for something that doesn't need the transaction
def changeset(user, params) do
  user
  |> cast(params, [:name])
  |> prepare_changes(fn changeset ->
    # Email sending in a DB transaction holds the connection open — bad
    Mailer.send_welcome_email(get_field(changeset, :email))
    changeset
  end)
end

Better alternative:

def create_user(attrs) do
  with {:ok, user} <- Repo.insert(User.changeset(%User{}, attrs)) do
    Mailer.send_welcome_email(user.email)
    {:ok, user}
  end
end

Why: prepare_changes/2 holds the DB connection and transaction open for its duration. Long-running operations (HTTP calls, email sending, file I/O) inside prepare_changes/2 exhaust the connection pool. Reserve it strictly for quick DB operations that must be atomic.

9. `apply_action/2` — Schemaless Validation

Source: lib/ecto/changeset.ex#L2332

What it does: Applies the changeset's changes to the underlying struct and returns {:ok, struct} if valid, or {:error, changeset} (with action set) if invalid. No DB operation is performed. Sets changeset.action to the given atom, which Phoenix form helpers use to decide whether to show error messages.

# From lib/ecto/changeset.ex lines 2332-2338:
def apply_action(%Changeset{} = changeset, action) when is_atom(action) do
  if changeset.valid? do
    {:ok, apply_changes(changeset)}
  else
    {:error, %{changeset | action: action}}
  end
end

Why: Not all changeset workflows end with a DB write. Form objects (search filters, multi-step wizards, command structs) need the full changeset pipeline (cast, validate, present errors) without ever touching a database. apply_action/2 is the endpoint for those workflows — it returns the same {:ok, value} | {:error, changeset} shape that Repo.insert/update returns, so callers and controllers work identically regardless of whether data is persisted.
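A schemaless search form is one way to sketch this workflow end to end. The field names and the validate function below are hypothetical.

```elixir
Mix.install([{:ecto, "~> 3.10"}])
alias Ecto.Changeset

# Hypothetical search form: defaults in the data map, types alongside.
data = %{query: nil, page: 1}
types = %{query: :string, page: :integer}

validate = fn params ->
  {data, types}
  |> Changeset.cast(params, Map.keys(types))
  |> Changeset.validate_required([:query])
  |> Changeset.validate_number(:page, greater_than: 0)
  |> Changeset.apply_action(:validate)
end

# Valid input: the changes are applied on top of the defaults.
{:ok, filters} = validate.(%{"query" => "ecto", "page" => "2"})
# filters == %{query: "ecto", page: 2}

# Invalid input: the changeset comes back with its action set.
{:error, invalid} = validate.(%{"page" => "0"})
# invalid.action == :validate
```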

Anti-pattern: Checking changeset.valid? manually and calling apply_changes/1 yourself:

# BAD — duplicates apply_action logic, misses setting changeset.action
def submit_search(params) do
  cs = SearchForm.changeset(%SearchForm{}, params)
  if cs.valid? do
    {:ok, Ecto.Changeset.apply_changes(cs)}
  else
    {:error, cs}  # action not set — Phoenix won't show errors in the template
  end
end

When to Use

Triggers:

You're using an embedded schema or schemaless changeset for a form that doesn't persist to the DB
You need the {:ok, struct} | {:error, changeset} return shape for compatibility with standard controller patterns
You want changeset.action to be set so Phoenix form helpers render inline errors correctly

Example — before:

def process_filter(params) do
  changeset = FilterForm.changeset(%FilterForm{}, params)
  if changeset.valid? do
    filters = Ecto.Changeset.apply_changes(changeset)
    {:ok, filters}
  else
    {:error, changeset}  # action not set — form errors won't render
  end
end

Example — after:

def process_filter(params) do
  %FilterForm{}
  |> FilterForm.changeset(params)
  |> Ecto.Changeset.apply_action(:validate)
end

When NOT to Use

Don't use this when:

You intend to persist the data — use Repo.insert/2 or Repo.update/2 directly
You need to inspect changeset.changes rather than the applied struct (call apply_changes/1 only if valid, otherwise work with the changeset)
The action atom matters to downstream code — :validate, :insert, :update have semantic meaning in some form libraries

Over-application example:

# apply_action before persisting — redundant, Repo.insert calls apply_changes internally
def create_user(attrs) do
  changeset = User.changeset(%User{}, attrs)
  with {:ok, user} <- Ecto.Changeset.apply_action(changeset, :insert) do
    Repo.insert(User.changeset(user, %{}))
  end
end

Better alternative:

def create_user(attrs) do
  %User{}
  |> User.changeset(attrs)
  |> Repo.insert()
end

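Why: Repo.insert/2 already checks changeset.valid?, returns {:error, changeset} when it is false, and applies the changes to the struct itself. Calling apply_action/2 first builds the struct early, forces a second User.changeset/2 call just to get back to a changeset, and typically hands Repo.insert/2 a changeset with no recorded changes.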

10. `cast_assoc/3` vs `put_assoc/4` — External vs Internal Association Changes

Source: lib/ecto/changeset.ex#L1213 (cast_assoc)

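What it does: cast_assoc/3 handles nested external data by invoking the associated schema's changeset function for each nested params map. put_assoc/4 replaces the association directly with values your code has already built and trusts: a struct or changeset, or a list of them. Both functions work on the association as a whole, so when the parent record was loaded from the database, the association must be preloaded before either is called.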

# cast_assoc — external data flows through the association's changeset function
def changeset(post, params) do
  post
  |> cast(params, [:title, :body])
  |> cast_assoc(:comments, with: &Comment.changeset/2)
end

# put_assoc: trusted internal data replaces the association wholesale
# (for a post loaded from the DB, post.tags must already be preloaded)
def assign_tags(post, tags) when is_list(tags) do
  post
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:tags, tags)
end

Why: cast_assoc/3 delegates to the association's own changeset function, which applies its own cast, validate_required, constraints, etc. This means all the safety guarantees of the association's changeset apply to nested user input. put_assoc/4 bypasses the association's changeset entirely — it trusts that you've already validated the data. Using put_assoc/4 with raw external params would skip all of the association's validation.

Anti-pattern: Using put_assoc/4 with external/untrusted nested params bypasses validation:

# BAD — building Comment structs from raw params skips Comment's changeset
def changeset(post, params) do
  comments = Enum.map(params["comments"] || [], fn p ->
    struct(Comment, %{body: p["body"], post_id: post.id})
  end)
  post
  |> cast(params, [:title])
  |> put_assoc(:comments, comments)
  # Comment's validate_required(:body), validate_length(:body), etc. never run
end

When to Use `cast_assoc/3`

Triggers:

Nested params arrive from a user-submitted form or API body (e.g., a post with embedded comments)
The association has its own changeset function with validations you want applied
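You need nested validation failures to mark the parent changeset invalid; the child errors live on the nested changesets (not in the parent's own :errors list) and can be collected with traverse_errors/2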

Example — before:

def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> put_assoc(:line_items, build_line_items(params["line_items"]))
end

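defp build_line_items(nil), do: []
# struct/2 discards keys the struct does not define, so string-keyed params are silently dropped here
defp build_line_items(items), do: Enum.map(items, &struct(LineItem, &1))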

Example — after:

def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> cast_assoc(:line_items, with: &LineItem.changeset/2)
end
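To see where nested errors end up after cast_assoc/3, here is a minimal sketch that runs without a database. It assumes Ecto 3.x via Mix.install; the LineItem fields (:description, :quantity) and the validations on them are hypothetical stand-ins, not taken from the original schemas.

```elixir
# Script-style sketch: no Repo is needed to build and inspect changesets.
Mix.install([{:ecto, "~> 3.10"}])

defmodule LineItem do
  use Ecto.Schema
  import Ecto.Changeset

  schema "line_items" do
    field :description, :string
    field :quantity, :integer
    # Plain foreign key field so has_many below can find :invoice_id.
    field :invoice_id, :integer
  end

  def changeset(line_item, params) do
    line_item
    |> cast(params, [:description, :quantity])
    |> validate_required([:description])
    |> validate_number(:quantity, greater_than: 0)
  end
end

defmodule Invoice do
  use Ecto.Schema
  import Ecto.Changeset

  schema "invoices" do
    field :number, :string
    has_many :line_items, LineItem
  end

  def changeset(invoice, params) do
    invoice
    |> cast(params, [:number])
    |> cast_assoc(:line_items, with: &LineItem.changeset/2)
  end
end

params = %{
  "number" => "INV-1",
  "line_items" => [%{"description" => "Widget", "quantity" => "0"}]
}

changeset = Invoice.changeset(%Invoice{}, params)

# The parent is invalid, yet its own error list is empty:
# the failure is recorded on the child changeset.
IO.inspect(changeset.valid?)
IO.inspect(changeset.errors)

# traverse_errors/2 walks parent and children and returns a nested map.
errors = Ecto.Changeset.traverse_errors(changeset, fn {msg, _opts} -> msg end)
IO.inspect(errors)
```

The design point is that each child keeps its own changeset, so a form or API layer can report errors per nested item instead of as one flat list.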

When to Use `put_assoc/4`

Triggers:

You're loading existing records from the DB and associating them (e.g., tagging a post with existing Tag records)
The associated structs were built by your application code, not by user input
You want to replace the entire association with a pre-validated list

Example:

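# Assumes `import Ecto.Query` is in scope for from/2.
# Replacing existing tags also requires an on_replace option other than
# the default :raise on the :tags association.
def update_post_tags(post, tag_ids) do
  tags = Repo.all(from t in Tag, where: t.id in ^tag_ids)

  post
  |> Repo.preload(:tags)  # put_assoc raises if the association is not loaded
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:tags, tags)
  |> Repo.update()
end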

When NOT to Use

Don't use cast_assoc when:

You have pre-built, validated structs — put_assoc is simpler and more explicit
The association doesn't have (or need) a changeset function
You're building a test fixture and want direct struct assignment

Don't use put_assoc when:

Any part of the association data came from user input — use cast_assoc so the association's validators run

Over-application example:

# put_assoc for external data — Profile.changeset validations never run
def changeset(user, params) do
  # Untrusted input is copied straight into the struct, bypassing Profile.changeset/2
  profile = %Profile{bio: get_in(params, ["profile", "bio"])}
  user
  |> cast(params, [:email])
  |> put_assoc(:profile, profile)
end

Better alternative:

def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> cast_assoc(:profile, with: &Profile.changeset/2)
end

Why: put_assoc/4 with externally-sourced structs is a silent validation bypass. If Profile.changeset/2 validates that :bio must be under 500 characters, put_assoc/4 with a user-supplied profile will store any length — the validation simply never ran.

Decision Tree

Use this tree when deciding which Ecto changeset function to reach for:

Data comes from a user, form, or API? → Use cast/4 (type coercion + allow-list)
Data comes from your application code? → Use change/2 (no coercion, no filtering)
Need to validate a value without hitting the DB? → Use validate_required, validate_format, validate_length, validate_number, validate_inclusion, validate_subset, or validate_change/3 for custom rules
Need DB-level uniqueness guarantee (race-safe)? → Use unique_constraint/2 (not unsafe_validate_unique/3 alone)
Setting a field to a computed value that might not have changed? → Use put_change/3 (skips write if unchanged); use force_change/3 only when the DB write must occur regardless
Need to run something inside the transaction? → Use prepare_changes/2
Validating without persisting (form objects, embedded schemas)? → Use apply_action/2
Nested association data comes from external params? → Use cast_assoc/3 (delegates to association's changeset function)
Nested association data is trusted internal data (loaded from DB)? → Use put_assoc/4
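The put_change/3 versus force_change/3 branch of the tree can be checked without a database. This is a minimal sketch, assuming Ecto 3.x via Mix.install; the :status field and its values are hypothetical.

```elixir
# Script-style sketch using a schemaless {data, types} changeset.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

data = %{status: "active"}
types = %{status: :string}

# put_change/3 drops the change when the new value equals the current one.
put = {data, types} |> change() |> put_change(:status, "active")

# force_change/3 records the change even though the value is identical.
forced = {data, types} |> change() |> force_change(:status, "active")

IO.inspect(put.changes)
IO.inspect(forced.changes)
```

An empty changes map means there is nothing to write for that field on update, which is why force_change/3 is reserved for cases where the write must happen regardless.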
