# Changeset Patterns in Ecto

Patterns extracted from Ecto's source code for building safe, composable data pipelines.

## Contents

1. [`cast/4` — The External/Internal Data Boundary](#1-cast4--the-externalinternal-data-boundary)
2. [`change/2` — Internal-Only Modifications](#2-change2--internal-only-modifications)
3. [Validation Pipeline — Composable Validators](#3-validation-pipeline--composable-validators)
4. [`validate_change/3` — Custom Validators](#4-validate_change3--custom-validators)
5. [`add_error/4` — Manual Error Injection](#5-add_error4--manual-error-injection)
6. [`put_change/3` vs `force_change/3` — Tracked vs Forced Changes](#6-put_change3-vs-force_change3--tracked-vs-forced-changes)
7. [Constraints vs Validations — DB-level Safety](#7-constraints-vs-validations--db-level-safety)
8. [`prepare_changes/2` — Last-Mile DB-Aware Transforms](#8-prepare_changes2--last-mile-db-aware-transforms)
9. [`apply_action/2` — Schemaless Validation](#9-apply_action2--schemaless-validation)
10. [`cast_assoc/3` vs `put_assoc/4` — External vs Internal Association Changes](#10-cast_assoc3-vs-put_assoc4--external-vs-internal-association-changes)

---

## 1. `cast/4` — The External/Internal Data Boundary

**Source:** [lib/ecto/changeset.ex#L729](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L729)

**What it does:** `cast/4` is the entry point for all externally-supplied data (user input, API params, form submissions). It converts string keys to atom keys, coerces string values to the schema's declared types, and records which fields were changed. The permitted-fields list is an explicit allow-list — unlisted fields are silently dropped.
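The coercion and allow-list behavior can be observed without a database by casting into a schemaless changeset. This is a minimal sketch: the `types` map, the param values, and the `admin` key are illustrative assumptions, and `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
# Run as a script, e.g. `elixir cast_demo.exs`; Mix.install fetches Ecto.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical field types standing in for a schema.
types = %{name: :string, age: :integer}

# String-keyed, string-valued params, as a web form would submit them.
params = %{"name" => "Ada", "age" => "36", "admin" => "true"}

cast_demo = cast({%{}, types}, params, [:name, :age])

# "36" is coerced to an integer; the unlisted "admin" key never reaches changes.
IO.inspect(cast_demo.changes)

# A value that cannot be coerced becomes a changeset error, not an exception.
bad_cast = cast({%{}, types}, %{"age" => "abc"}, [:name, :age])
IO.inspect(bad_cast.errors)
```

The same calls apply to a schema struct; the `{data, types}` tuple is used here only to keep the sketch self-contained.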
```elixir
# From the Ecto docs (lib/ecto/changeset.ex line 108-115):
def changeset(user, params \\ %{}) do
  user
  |> cast(params, [:name, :email, :age])
  |> validate_required([:name, :email])
  |> validate_format(:email, ~r/@/)
  |> validate_inclusion(:age, 18..100)
  |> unique_constraint(:email)
end
```

**Why:** External data is untrusted and untyped (form submissions arrive as strings). `cast/4` establishes a hard boundary: everything outside the permitted list is discarded, and everything inside is type-checked against the schema. Using `cast/4` for external data and `change/2` for internal data makes the trust boundary explicit and auditable — grep for `cast(` to find every place user data enters the system.

**Anti-pattern:** Using `change/2` for external data bypasses type coercion and the allow-list:

```elixir
# BAD — no type coercion, no field filtering, all params accepted
def changeset(user, params) do
  user
  |> change(params)
  |> validate_required([:name, :email])
end
```

### When to Use

**Triggers:**
- Data originates from a controller `params` map, an API request body, or any user-submitted form
- Field values arrive as strings that need type coercion (integers, booleans, dates)
- You need to whitelist which fields a user is allowed to set

**Example — before:**

```elixir
def update_profile(user, attrs) do
  # Directly merging external attrs skips coercion and allows mass assignment
  changeset = Ecto.Changeset.change(user, attrs)
  Repo.update(changeset)
end
```

**Example — after:**

```elixir
def update_profile(user, attrs) do
  user
  |> Ecto.Changeset.cast(attrs, [:name, :bio, :avatar_url])
  |> Ecto.Changeset.validate_required([:name])
  |> Repo.update()
end
```

### When NOT to Use

**Don't use this when:**
- The data originates from application code (system-generated timestamps, counters, foreign keys computed by your own logic)
- The values are already the correct Elixir types and have been validated elsewhere
- You're building a test fixture and want to set fields directly

**Over-application example:**

```elixir
# Overkill — casting internal data that's already correctly typed
def mark_confirmed(user) do
  user
  |> cast(%{confirmed_at: DateTime.utc_now()}, [:confirmed_at])
  |> Repo.update!()
end
```

**Better alternative:**

```elixir
def mark_confirmed(user) do
  user
  |> Ecto.Changeset.change(confirmed_at: DateTime.utc_now())
  |> Repo.update!()
end
```

**Why:** `cast/4` has a runtime cost (string key normalization, type coercion, allow-list filtering) that adds no value when the data is already correctly typed. It also signals to readers that external data is flowing in — misusing it for internal data erodes that signal.

---

## 2. `change/2` — Internal-Only Modifications

**Source:** [lib/ecto/changeset.ex#L491](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L491)

**What it does:** `change/2` wraps a struct in a changeset (or merges changes into an existing one) without any type coercion. It accepts a map or keyword list of already-typed values. No allow-list filtering occurs — all given fields are set directly.

```elixir
# Used for trusted, already-typed internal data
def activate_subscription(sub) do
  sub
  |> Ecto.Changeset.change(
    status: :active,
    activated_at: DateTime.utc_now(),
    trial_ends_at: nil
  )
  |> Repo.update!()
end
```

**Why:** When your application computes a value (a system timestamp, a foreign key returned from a previous query, a status atom), you already know its type. Running it through `cast/4` would add coercion work that can't succeed or fail in a meaningful way — the value either matches the schema type already or it's a bug in your code. `change/2` communicates "I trust this data" to readers.
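
To make the "no coercion" point concrete, here is a small sketch using a schemaless changeset. The field names and values are made up for illustration, and `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical types and current data standing in for a schema struct.
types = %{status: :string, login_count: :integer}
data = %{status: "trial", login_count: 3}

# Already-typed values are recorded exactly as given.
typed = change({data, types}, status: "active", login_count: 4)
IO.inspect(typed.changes)

# No coercion: the string "5" stays a string even though the type is :integer.
untyped = change({data, types}, login_count: "5")
IO.inspect(untyped.changes)

# String keys (the shape of controller params) are rejected outright.
string_keys_rejected? =
  try do
    change({data, types}, %{"status" => "active"})
    false
  rescue
    ArgumentError -> true
  end

IO.inspect(string_keys_rejected?)
```

The wrongly typed `"5"` is not flagged by the changeset itself, which is why `change/2` should only ever see values your own code produced.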

**Anti-pattern:** Using `cast/4` for programmatic data creates phantom external-data semantics:

```elixir
# BAD — cast signals "user data" but this comes from application logic
def set_password_reset_token(user, token) do
  user
  |> cast(
    %{reset_token: token, reset_token_expires_at: hours_from_now(2)},
    [:reset_token, :reset_token_expires_at]
  )
  |> Repo.update!()
end
```

### When to Use

**Triggers:**
- Values come from your own application code, not from user input
- Types are already correct (atom status values, DateTime structs, integer IDs from prior queries)
- You're updating fields that users are never allowed to set directly (internal audit fields, system state)

**Example — before:**

```elixir
def record_login(user) do
  user
  |> cast(
    %{last_login_at: DateTime.utc_now(), login_count: user.login_count + 1},
    [:last_login_at, :login_count]
  )
  |> Repo.update!()
end
```

**Example — after:**

```elixir
def record_login(user) do
  user
  |> Ecto.Changeset.change(
    last_login_at: DateTime.utc_now(),
    login_count: user.login_count + 1
  )
  |> Repo.update!()
end
```

### When NOT to Use

**Don't use this when:**
- Any part of the data originates from user input or an external API response
- You need type coercion (e.g., converting a string "true" to boolean `true`)
- You want validation errors to propagate via the changeset (use `cast/4` + validators)

**Over-application example:**

```elixir
# Using change/2 for external data bypasses all safety checks
def create_user(params) do
  %User{}
  |> Ecto.Changeset.change(params) # params from controller — untrusted strings!
  |> Repo.insert()
end
```

**Better alternative:**

```elixir
def create_user(params) do
  %User{}
  |> Ecto.Changeset.cast(params, [:name, :email, :password])
  |> Ecto.Changeset.validate_required([:name, :email, :password])
  |> Repo.insert()
end
```

**Why:** `change/2` skips type coercion entirely.
Passing untyped external params through `change/2` means a string `"true"` stays a string instead of becoming boolean `true`, and a string `"42"` stays a string instead of becoming integer `42`. String-keyed params (the usual controller shape) make `change/2` raise an `ArgumentError`, and atom-keyed values of the wrong type are only caught when the repo tries to write them, typically as an `Ecto.ChangeError`, rather than as a changeset error the user can see.

---

## 3. Validation Pipeline — Composable Validators

**Source:** [lib/ecto/changeset.ex#L2596](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2596) (`validate_required`), [#L2920](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2920) (`validate_format`), [#L3083](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3083) (`validate_length`), [#L3275](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3275) (`validate_number`), [#L3451](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3451) (`validate_inclusion`), [#L2987](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2987) (`validate_subset`)

**What it does:** Every built-in validator takes a changeset as its first argument and returns a changeset, enabling pipe composition. Validators accumulate errors in `changeset.errors` rather than raising. They do not short-circuit: once `valid?` is false, subsequent validators still run, so all errors are collected in one pass. `validate_required` conventionally comes first because validators built on `validate_change/3` skip fields that have no change, so a missing field would otherwise go unreported.
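
The error-accumulation behavior can be sketched with a schemaless changeset. The types and params below are invented for illustration, and `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical field types standing in for a schema.
types = %{name: :string, price: :integer}

pipeline_demo =
  {%{}, types}
  |> cast(%{"name" => "x", "price" => "-5"}, [:name, :price])
  |> validate_required([:name, :price])
  |> validate_length(:name, min: 2)
  |> validate_number(:price, greater_than: 0)

# Both failing validators report; the pipeline does not stop at the first error.
IO.inspect(pipeline_demo.errors)

failed_fields = pipeline_demo.errors |> Keyword.keys() |> Enum.sort()
IO.inspect(failed_fields)
```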
```elixir
def changeset(product, params) do
  product
  |> cast(params, [:name, :price, :category, :tags])
  |> validate_required([:name, :price])
  |> validate_length(:name, min: 2, max: 100)
  |> validate_number(:price, greater_than: 0)
  |> validate_inclusion(:category, ~w(electronics clothing food))
  |> validate_subset(:tags, ~w(sale featured new))
  |> validate_format(:name, ~r/\A[[:print:]]+\z/, message: "must be printable characters")
end
```

**Why:** The pipe model means you can add, remove, or reorder validators without changing surrounding code. Each validator is independent — it doesn't know about others. This lets you compose context-specific changeset functions (e.g., `admin_changeset/2` permitting extra fields, then calling the same validators as `user_changeset/2`). Validators run entirely in-process and BEFORE constraint checks, so users get fast feedback without a DB round-trip.

**Anti-pattern:** Manually checking fields and accumulating errors with ad-hoc conditionals:

```elixir
# BAD — imperative validation, hard to extend, misses Ecto's error format
def validate(params) do
  errors = []
  errors = if params[:name] == nil, do: [{:name, "can't be blank"} | errors], else: errors
  errors = if String.length(params[:name] || "") < 2, do: [{:name, "too short"} | errors], else: errors
  errors
end
```

### When to Use

**Triggers:**
- After `cast/4`, you need to verify format, range, presence, or membership constraints
- You want all validation errors reported in a single pass (not stop on first failure)
- You're building a changeset function that other modules or contexts will reuse

**Example — before:**

```elixir
def create_account(params) do
  changeset = cast(%Account{}, params, [:username, :age])

  if get_change(changeset, :username) == nil do
    add_error(changeset, :username, "can't be blank")
  else
    changeset
  end
end
```

**Example — after:**

```elixir
def changeset(account, params) do
  account
  |> cast(params, [:username, :age])
  |> validate_required([:username])
  |> validate_length(:username, min: 3, max: 20)
  |> validate_number(:age, greater_than_or_equal_to: 18)
end
```

### When NOT to Use

**Don't use this when:**
- Validation requires a DB query (use a constraint instead, or `validate_change/3` with a query and accept the race condition risk)
- The rule is so context-specific it only applies once (inline `add_error/4` may be cleaner than a full custom validator)
- You're validating data that will never be persisted (prefer pure functions returning `{:ok, _} | {:error, _}`)

**Over-application example:**

```elixir
# Validating internal system config that can never be wrong at runtime
def system_config_changeset(config, attrs) do
  config
  |> cast(attrs, [:timeout_ms, :pool_size])
  |> validate_number(:timeout_ms, greater_than: 0)
  |> validate_number(:pool_size, greater_than: 0)
  # Config comes from compile-time env vars checked at boot — this is noise
end
```

**Better alternative:**

```elixir
# Validate config at application boot with a clear error, not at runtime
def validate_config!(timeout_ms, pool_size) do
  unless is_integer(timeout_ms) and timeout_ms > 0 do
    raise ArgumentError, "timeout_ms must be a positive integer, got: #{inspect(timeout_ms)}"
  end

  # Check the second argument too, so neither setting slips through
  unless is_integer(pool_size) and pool_size > 0 do
    raise ArgumentError, "pool_size must be a positive integer, got: #{inspect(pool_size)}"
  end

  :ok
end
```

**Why:** Changeset validators are designed for user-facing feedback in multi-error scenarios. For internal configuration validated once at boot, a plain `raise` with a clear message is faster to write, easier to understand, and appropriate for code that should crash if misconfigured.

---

## 4. `validate_change/3` — Custom Validators

**Source:** [lib/ecto/changeset.ex#L2508](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2508)

**What it does:** The primitive that many of the built-in validators build on. Takes a `(field, value) -> [{field, message}]` function. If the field has no change (not in `changeset.changes`, or changed to `nil`), the validator is skipped entirely.
Returning `[]` means valid; returning a non-empty list adds errors and sets `valid?` to false.

```elixir
# From lib/ecto/changeset.ex lines 2508-2528:
def validate_change(%Changeset{} = changeset, field, validator) when is_atom(field) do
  %{changes: changes, types: types, errors: errors} = changeset
  ensure_field_exists!(changeset, types, field)

  value = Map.get(changes, field)
  new = if is_nil(value), do: [], else: validator.(field, value)

  case new do
    [] -> changeset
    [_ | _] -> %{changeset | errors: new ++ errors, valid?: false}
  end
end
```

**Why:** The "only called if the field has a change" behavior is load-bearing. On an update changeset where the user didn't touch a field, validators for that field don't run — this avoids re-validating unchanged data and prevents false errors when partial updates are allowed. If you need to validate the current value regardless of whether it changed, fetch from the struct directly.

**Anti-pattern:** Reimplementing the "has change?" check inside the validator:

```elixir
# BAD — redundant check; validate_change already skips if :sku is unchanged
def changeset(item, params) do
  changeset = cast(item, params, [:sku])

  if Map.has_key?(changeset.changes, :sku) do
    value = get_change(changeset, :sku)
    if valid_sku?(value), do: changeset, else: add_error(changeset, :sku, "invalid format")
  else
    changeset
  end
end
```

### When to Use

**Triggers:**
- None of the built-in validators (`validate_format`, `validate_length`, etc.)
fit your rule
- Your validation logic requires calling a helper function with domain knowledge
- You're building a reusable custom validator to share across multiple changeset functions

**Example — before:**

```elixir
def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_change(:unit, fn :unit, value ->
    if value in ["kg", "lb", "oz", "g"] do
      []
    else
      [{:unit, "must be a known unit of weight"}]
    end
  end)
end
```

**Example — after:**

```elixir
@weight_units ~w(kg lb oz g)

def validate_weight_unit(changeset, field) do
  validate_change(changeset, field, fn _, value ->
    if value in @weight_units do
      []
    else
      [{field, {"must be a known unit of weight, got: %{value}", [value: value]}}]
    end
  end)
end

def changeset(order, params) do
  order
  |> cast(params, [:quantity, :unit])
  |> validate_weight_unit(:unit)
end
```

### When NOT to Use

**Don't use this when:**
- A built-in validator already covers the rule (`validate_inclusion` would handle the weight unit example above)
- The validation requires a DB query — use a constraint or accept that you need a repo call
- You need to validate the stored value (not the change) — read from the struct instead

**Over-application example:**

```elixir
# validate_change for something validate_format handles natively
validate_change(changeset, :email, fn :email, value ->
  if String.match?(value, ~r/@/), do: [], else: [{:email, "must contain @"}]
end)
```

**Better alternative:**

```elixir
validate_format(changeset, :email, ~r/@/)
```

**Why:** Built-in validators produce consistent error message structures (with interpolation keys for translation), are well-documented, and are immediately recognizable to Ecto users. Reach for `validate_change/3` only when you need logic the built-ins cannot express.

---

## 5. `add_error/4` — Manual Error Injection

**Source:** [lib/ecto/changeset.ex#L2460](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2460)

**What it does:** Directly adds an error tuple to `changeset.errors` and sets `valid?` to false. Takes `(changeset, key, message, opts)` where `opts` typically contains `validation:` metadata. Unlike `validate_change/3`, it always adds the error — there is no "only if the field changed" guard.

```elixir
# Adding a cross-field validation error
def changeset(event, params) do
  event
  |> cast(params, [:start_date, :end_date])
  |> validate_required([:start_date, :end_date])
  |> validate_date_range()
end

defp validate_date_range(changeset) do
  start_date = get_field(changeset, :start_date)
  end_date = get_field(changeset, :end_date)

  if start_date && end_date && Date.compare(end_date, start_date) == :lt do
    add_error(changeset, :end_date, "must be on or after start date")
  else
    changeset
  end
end
```

**Why:** Some validation rules span multiple fields or depend on context that isn't available to `validate_change/3`'s field-scoped callback. `add_error/4` gives you an escape hatch to inject errors after you've already determined the problem, without needing to wrap your logic in a `validate_change/3` call that only has access to one field's value.

**Anti-pattern:** Using `add_error/4` unconditionally (forgetting the conditional):

```elixir
# BAD — always adds the error, making the changeset permanently invalid
def changeset(user, params) do
  user
  |> cast(params, [:role])
  |> add_error(:role, "admin role requires approval") # Fires even for non-admin roles!
end
```

### When to Use

**Triggers:**
- Validation logic spans two or more fields (start/end dates, password/confirmation, price/discount)
- You've already computed whether an error applies (result of a service call, a complex business rule) and just need to record it
- You're building a wrapper that receives errors from an external system and maps them onto changeset fields

**Example — before:**

```elixir
def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_change(:seats, fn :seats, seats ->
    plan = get_field(subscription, :plan) # Can't access changeset here
    max = plan_seat_limit(plan)
    if seats > max, do: [{:seats, "exceeds limit for #{plan} plan"}], else: []
  end)
end
```

**Example — after:**

```elixir
def changeset(subscription, params) do
  subscription
  |> cast(params, [:plan, :seats])
  |> validate_required([:plan, :seats])
  |> validate_seat_limit()
end

defp validate_seat_limit(changeset) do
  plan = get_field(changeset, :plan)
  seats = get_field(changeset, :seats)

  if plan && seats && seats > plan_seat_limit(plan) do
    add_error(changeset, :seats, "exceeds limit for %{plan} plan", plan: plan)
  else
    changeset
  end
end
```

### When NOT to Use

**Don't use this when:**
- A built-in validator covers the rule — `add_error/4` bypasses standard validation metadata
- The error applies to only one field and doesn't depend on other fields (use `validate_change/3`)
- You want the error to be skipped if the field hasn't changed (use `validate_change/3`)

**Over-application example:**

```elixir
# Manually adding what validate_required already does
def changeset(user, params) do
  cs = cast(user, params, [:email])

  if get_change(cs, :email) == nil do
    add_error(cs, :email, "can't be blank")
  else
    cs
  end
end
```

**Better alternative:**

```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> validate_required([:email])
end
```

**Why:** Built-in validators embed the `validation:` metadata key in the error opts, which Ecto uses
internally and Phoenix uses for error rendering. `add_error/4` without this metadata can cause unexpected behavior with form helpers that inspect error opts.

---

## 6. `put_change/3` vs `force_change/3` — Tracked vs Forced Changes

**Source:** [lib/ecto/changeset.ex#L1959](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L1959) (`put_change`), [#L2248](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2248) (`force_change`)

**What it does:** `put_change/3` compares the new value against the current struct field value; if they are equal, it does NOT add the change (the field stays absent from `changeset.changes`). `force_change/3` always adds the field to `changeset.changes` regardless of whether the value differs.

```elixir
user = %User{name: "Alice", role: :member}
cs = Ecto.Changeset.change(user)

# put_change: no-op if value matches current
cs1 = Ecto.Changeset.put_change(cs, :name, "Alice")
cs1.changes #=> %{} (no change recorded)

# force_change: always records the change
cs2 = Ecto.Changeset.force_change(cs, :name, "Alice")
cs2.changes #=> %{name: "Alice"} (change recorded, triggers DB write)
```

**Why:** Ecto uses `changeset.changes` to build the SQL `UPDATE` statement — only changed fields are included. If `put_change/3` detects no actual change, the field is excluded from the UPDATE, saving a write and avoiding spurious `updated_at` bumps. `force_change/3` marks the changeset dirty unconditionally, which triggers a DB write even if the row would be identical afterward. Default to `put_change/3`; reach for `force_change/3` only when you specifically need the DB write to happen (e.g., triggering a DB trigger, bumping `updated_at` intentionally).

**Anti-pattern:** Using `force_change/3` by default causes unnecessary DB writes on every update:

```elixir
# BAD — forces every field into changes even when nothing actually changed
def normalize_user(changeset) do
  name = get_field(changeset, :name)
  force_change(changeset, :name, String.trim(name || ""))
end
```

### When to Use `put_change/3`

**Triggers:**
- Setting a field to a computed value that might or might not differ from the current value
- Normalizing input (trim whitespace, downcase email) — the result may equal what's already stored
- Any programmatic field update where skipping unnecessary DB writes is correct behavior

**Example — before:**

```elixir
def normalize(changeset) do
  email = get_field(changeset, :email)
  # force_change even if email is already lowercase
  force_change(changeset, :email, String.downcase(email || ""))
end
```

**Example — after:**

```elixir
def normalize(changeset) do
  email = get_field(changeset, :email)
  put_change(changeset, :email, String.downcase(email || ""))
end
```

### When NOT to Use `put_change/3` (use `force_change/3`)

**Don't use `put_change/3` when:**
- You need to unconditionally bump `updated_at` even when no data changed
- A DB trigger or audit log depends on the UPDATE statement firing regardless
- You're implementing optimistic locking and need to force a version increment

**Over-application example:**

```elixir
# Using force_change everywhere "just to be safe"
def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> then(fn cs ->
    cs.changes
    |> Enum.reduce(cs, fn {field, value}, acc ->
      force_change(acc, field, value) # Redundant — changes are already in changeset.changes
    end)
  end)
end
```

**Better alternative:**

```elixir
def admin_update(user, params) do
  user
  |> cast(params, [:name, :email, :role])
  |> validate_required([:name, :email])
end
```

**Why:** `cast/4` already correctly populates `changeset.changes` with only the fields that differ.
Calling `force_change/3` on top of a cast changeset doubles the work and makes every update unconditionally dirty, preventing Ecto's change-detection optimization from functioning.

---

## 7. Constraints vs Validations — DB-level Safety

**Source:** [lib/ecto/changeset.ex#L3916](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3916) (`unique_constraint`), [#L3987](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3987) (`foreign_key_constraint`), [#L3777](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3777) (`check_constraint`)

**What it does:** Constraint functions register a DB error name → changeset error mapping. They do NOT run a query themselves. When the database reports a constraint violation during `Repo.insert/2` or `Repo.update/2`, Ecto matches it against the registered constraints and converts it into a changeset error on the registered field, returning `{:error, changeset}` instead of raising.

```elixir
# From lib/ecto/changeset.ex (unique_constraint/3):
def unique_constraint(changeset, [first_field | _] = fields, opts) do
  name = opts[:name] || unique_index_name(changeset, fields)
  message = constraint_message(opts, "has already been taken")
  match_type = Keyword.get(opts, :match, :exact)
  error_key = Keyword.get(opts, :error_key, first_field)
  add_constraint(changeset, :unique, name, match_type, error_key, message, :unique)
end
```

**Why:** A uniqueness check via `validate_change/3` (selecting before inserting) has a race condition: two concurrent requests can both pass the select, then both attempt the insert, and the second insert fails on the unique index with an exception unless a constraint is registered. Constraints delegate to the DB's atomic enforcement. The constraint name in the changeset must exactly match the index name in the migration (Ecto generates `table_field_index` by default, or you can override with `name:`).
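
That a constraint call only records metadata can be checked without a database. The sketch below uses a schemaless changeset and an assumed index name (`"users_email_index"`); because a schemaless changeset has no table to derive a name from, the name is passed explicitly. `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

constraint_demo =
  {%{}, %{email: :string}}
  |> cast(%{"email" => "a@example.com"}, [:email])
  # Hypothetical index name; it must match the real index in a migration.
  |> unique_constraint(:email, name: "users_email_index")

# No query ran and no error was added; the changeset is still valid.
IO.inspect(constraint_demo.valid?)
IO.inspect(constraint_demo.errors)

# The registration lives in the changeset until a repo operation consults it.
IO.inspect(constraint_demo.constraints)
```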

**Anti-pattern:** Using `unsafe_validate_unique/3` as a replacement for `unique_constraint/3`:

```elixir
# RISKY — has a TOCTOU race condition under concurrent load
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> unsafe_validate_unique(:email, MyApp.Repo)
  # Two concurrent registrations with the same email can both pass this check
end
```

### When to Use

**Triggers:**
- Uniqueness must be guaranteed across concurrent inserts (always use `unique_constraint`, not `unsafe_validate_unique` alone)
- A foreign key references a row that might not exist (use `foreign_key_constraint`)
- The DB has a check constraint that corresponds to a business rule (use `check_constraint`)

**Example — before:**

```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  # No constraint registration: DB constraint violations raise Ecto.ConstraintError
end
```

**Example — after:**

```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email, :username])
  |> validate_required([:email, :username])
  |> unique_constraint(:email)
  |> unique_constraint(:username)
  |> foreign_key_constraint(:organization_id)
end
```

### When NOT to Use

**Don't use this when:**
- The DB index doesn't exist: the constraint will never fire, so duplicate rows are accepted without any error
- You need immediate feedback before hitting the DB (use `unsafe_validate_unique` as a UX hint, then also add `unique_constraint`)
- The rule is application-level only and has no corresponding DB constraint

**Over-application example:**

```elixir
# Registering a unique_constraint for a field with no DB index
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
  # No index in the migration — this constraint registration does nothing
end
```

**Better alternative:**

```elixir
# First, add the index in a migration:
# create unique_index(:orders, [:reference_number])

# Then register the constraint:
def changeset(order, params) do
  order
  |> cast(params, [:reference_number])
  |> unique_constraint(:reference_number)
end
```

**Why:** If the DB index doesn't exist, the DB will never raise the expected constraint error and the `unique_constraint/3` registration is silently a no-op. Always add the migration index first, then register the constraint in the changeset.

---

## 8. `prepare_changes/2` — Last-Mile DB-Aware Transforms

**Source:** [lib/ecto/changeset.ex#L3706](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L3706)

**What it does:** Registers a callback to run inside the database transaction, just before the DB operation executes. At callback time, `changeset.repo` is populated with the Repo module. Used for side effects that must be atomic with the main insert/update (counter-caches, denormalized fields, audit log entries).

```elixir
# From lib/ecto/changeset.ex lines 3689-3703:
def create_comment(comment, params) do
  comment
  |> cast(params, [:body, :post_id])
  |> prepare_changes(fn changeset ->
    if post_id = get_change(changeset, :post_id) do
      query = from Post, where: [id: ^post_id]
      changeset.repo.update_all(query, inc: [comment_count: 1])
    end

    changeset
  end)
end
```

**Why:** Counter-caches and denormalized aggregates must update atomically with the row that triggered them. If you increment `comment_count` outside the transaction, a crash between the insert and the increment leaves counts inconsistent. `prepare_changes/2` guarantees both operations commit or both roll back. The callback receives the validated changeset inside that transaction, so it sees the changes that are about to be written.
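
The "registers, does not run" behavior can be sketched without a repo. The example below is an illustration under assumptions: the schemaless types and the `:prepare_ran` message are invented, and it inspects the `prepare` field of the changeset struct, where registered callbacks are stored. `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

prepared =
  {%{}, %{body: :string}}
  |> cast(%{"body" => "hello"}, [:body])
  |> prepare_changes(fn changeset ->
    # Would only run when a repo operation executes this changeset.
    send(self(), :prepare_ran)
    changeset
  end)

# One callback is queued on the changeset.
callback_count = length(prepared.prepare)
IO.inspect(callback_count)

# Nothing has been sent, because no repo operation has run the callback.
ran_early? =
  receive do
    :prepare_ran -> true
  after
    0 -> false
  end

IO.inspect(ran_early?)
```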

**Anti-pattern:** Performing side effects after `Repo.insert/update` outside a transaction:

```elixir
# BAD — comment is saved but counter may not update if the second call crashes
def create_comment(attrs) do
  with {:ok, comment} <- Repo.insert(Comment.changeset(%Comment{}, attrs)) do
    Repo.update_all(
      from(p in Post, where: p.id == ^comment.post_id),
      inc: [comment_count: 1]
    )

    {:ok, comment}
  end
end
```

### When to Use

**Triggers:**
- You need to update a related row (counter-cache, denormalized field) atomically with the main operation
- The side effect requires the Repo and must be inside the transaction
- You're computing a value (e.g., slug from a title) that depends on the final changeset state after all other callbacks have run

**Example — before:**

```elixir
def create_order_item(item, params) do
  case Repo.insert(OrderItem.changeset(item, params)) do
    {:ok, item} ->
      # Race condition: order total might be stale if this crashes or races
      Repo.update_all(
        from(o in Order, where: o.id == ^item.order_id),
        inc: [total_cents: item.price_cents]
      )

      {:ok, item}

    error ->
      error
  end
end
```

**Example — after:**

```elixir
def changeset(item, params) do
  item
  |> cast(params, [:price_cents, :order_id, :product_id])
  |> validate_required([:price_cents, :order_id])
  |> prepare_changes(fn changeset ->
    if price = get_change(changeset, :price_cents) do
      order_id = get_field(changeset, :order_id)

      changeset.repo.update_all(
        from(o in Order, where: o.id == ^order_id),
        inc: [total_cents: price]
      )
    end

    changeset
  end)
end
```

### When NOT to Use

**Don't use this when:**
- The side effect doesn't need to be in the same transaction (e.g., sending an email; do that after the transaction commits, for instance on the `{:ok, _}` result of `Repo.insert/2` or of an `Ecto.Multi`)
- The computation doesn't need Repo access (use a regular `validate_change/3` or map transformation before the changeset reaches Repo)
- You need the side effect to run even when the changeset is invalid (callbacks only run when `valid?` is true)

**Over-application example:**

```elixir
# prepare_changes for something that doesn't need the transaction
def changeset(user, params) do
  user
  |> cast(params, [:name])
  |> prepare_changes(fn changeset ->
    # Email sending in a DB transaction holds the connection open — bad
    Mailer.send_welcome_email(get_field(changeset, :email))
    changeset
  end)
end
```

**Better alternative:**

```elixir
def create_user(attrs) do
  with {:ok, user} <- Repo.insert(User.changeset(%User{}, attrs)) do
    Mailer.send_welcome_email(user.email)
    {:ok, user}
  end
end
```

**Why:** `prepare_changes/2` holds the DB connection and transaction open for its duration. Long-running operations (HTTP calls, email sending, file I/O) inside `prepare_changes/2` exhaust the connection pool. Reserve it strictly for quick DB operations that must be atomic.

---

## 9. `apply_action/2` — Schemaless Validation

**Source:** [lib/ecto/changeset.ex#L2332](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L2332)

**What it does:** Applies the changeset's changes to the underlying struct and returns `{:ok, struct}` if valid, or `{:error, changeset}` (with `action` set) if invalid. No DB operation is performed. On the error path it sets `changeset.action` to the given atom, which Phoenix form helpers use to decide whether to show error messages.

```elixir
# From lib/ecto/changeset.ex lines 2332-2338:
def apply_action(%Changeset{} = changeset, action) when is_atom(action) do
  if changeset.valid? do
    {:ok, apply_changes(changeset)}
  else
    {:error, %{changeset | action: action}}
  end
end
```

**Why:** Not all changeset workflows end with a DB write. Form objects (search filters, multi-step wizards, command structs) need the full changeset pipeline (cast, validate, present errors) without ever touching a database. `apply_action/2` is the endpoint for those workflows — it returns the same `{:ok, value} | {:error, changeset}` shape that `Repo.insert/update` returns, so callers and controllers work identically regardless of whether data is persisted.
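
A schemaless search form makes both return shapes easy to see. This is a sketch under assumptions: the `q` and `page` fields, their rules, and the `search` helper are invented for illustration, and `Mix.install/1` is only there so the snippet runs as a standalone script.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical filter fields; no schema module or database is involved.
types = %{q: :string, page: :integer}

search = fn params ->
  {%{}, types}
  |> cast(params, [:q, :page])
  |> validate_required([:q])
  |> validate_number(:page, greater_than: 0)
  |> apply_action(:validate)
end

# Valid input comes back as plain, typed data.
ok_result = search.(%{"q" => "ecto", "page" => "2"})
IO.inspect(ok_result)

# Invalid input comes back as a changeset whose action has been set.
error_result = search.(%{"page" => "0"})
IO.inspect(error_result)
```

Callers can pattern-match on the result exactly as they would on the return value of a repo call.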
**Anti-pattern:** Checking `changeset.valid?` manually and calling `apply_changes/1` yourself:

```elixir
# BAD — duplicates apply_action logic, misses setting changeset.action
def submit_search(params) do
  cs = SearchForm.changeset(%SearchForm{}, params)

  if cs.valid? do
    {:ok, Ecto.Changeset.apply_changes(cs)}
  else
    # action not set — Phoenix won't show errors in the template
    {:error, cs}
  end
end
```

### When to Use

**Triggers:**

- You're using an embedded schema or schemaless changeset for a form that doesn't persist to the DB
- You need the `{:ok, struct} | {:error, changeset}` return shape for compatibility with standard controller patterns
- You want `changeset.action` to be set so Phoenix form helpers render inline errors correctly

**Example — before:**

```elixir
def process_filter(params) do
  changeset = FilterForm.changeset(%FilterForm{}, params)

  if changeset.valid? do
    filters = Ecto.Changeset.apply_changes(changeset)
    {:ok, filters}
  else
    # action not set — form errors won't render
    {:error, changeset}
  end
end
```

**Example — after:**

```elixir
def process_filter(params) do
  %FilterForm{}
  |> FilterForm.changeset(params)
  |> Ecto.Changeset.apply_action(:validate)
end
```

### When NOT to Use

**Don't use this when:**

- You intend to persist the data — use `Repo.insert/2` or `Repo.update/2` directly
- You need the changeset itself rather than the applied struct, for example to inspect `changeset.changes` (on success `apply_action/2` returns only the struct)
- Downstream code depends on a specific action atom that you can't supply at this point; `:validate`, `:insert`, and `:update` have semantic meaning in some form libraries, so choose the atom deliberately

**Over-application example:**

```elixir
# apply_action before persisting — redundant, Repo.insert applies the changes itself
def create_user(attrs) do
  changeset = User.changeset(%User{}, attrs)

  with {:ok, user} <- Ecto.Changeset.apply_action(changeset, :insert) do
    Repo.insert(User.changeset(user, %{}))
  end
end
```

**Better alternative:**

```elixir
def create_user(attrs) do
  %User{}
  |> User.changeset(attrs)
  |> Repo.insert()
end
```

**Why:** `Repo.insert/2` already checks `changeset.valid?`, returns `{:error, changeset}` when it is false, and applies the changes itself. Calling `apply_action/2` before `Repo.insert/2` runs the changeset function twice and rebuilds the struct unnecessarily.

---

## 10. `cast_assoc/3` vs `put_assoc/4` — External vs Internal Association Changes

**Source:** [lib/ecto/changeset.ex#L1213](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex#L1213) (`cast_assoc`)

**What it does:** `cast_assoc/3` handles nested external data by invoking the associated schema's changeset function for each nested params map. `put_assoc/4` replaces the association directly with a struct or list of structs that your code has already built and trusts.

```elixir
# cast_assoc — external data flows through the association's changeset function
def changeset(post, params) do
  post
  |> cast(params, [:title, :body])
  |> cast_assoc(:comments, with: &Comment.changeset/2)
end

# put_assoc — trusted internal data replaces the association wholesale
def assign_tags(post, tags) when is_list(tags) do
  post
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:tags, tags)
end
```

**Why:** `cast_assoc/3` delegates to the association's own changeset function, which applies its own cast, validate_required, constraints, etc. This means all the safety guarantees of the association's changeset apply to nested user input. `put_assoc/4` bypasses the association's changeset entirely — it trusts that you've already validated the data. Using `put_assoc/4` with raw external params would skip all of the association's validation.
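The difference can be observed without a database. The sketch below defines two hypothetical schemas, `Post` and `Comment` (their fields and the Ecto version requirement are assumptions for illustration), feeds the same kind of invalid comment through both functions, and compares the resulting changesets.

```elixir
# Assumption: network access is available so Mix.install/1 can fetch Ecto.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comments" do
    field :body, :string
    field :post_id, :integer
  end

  def changeset(comment, params) do
    comment
    |> cast(params, [:body])
    |> validate_required([:body])
  end
end

defmodule Post do
  use Ecto.Schema
  import Ecto.Changeset

  schema "posts" do
    field :title, :string
    has_many :comments, Comment
  end

  def changeset(post, params) do
    post
    |> cast(params, [:title])
    |> cast_assoc(:comments, with: &Comment.changeset/2)
  end
end

params = %{"title" => "Hi", "comments" => [%{"body" => "ok"}, %{"body" => ""}]}

# cast_assoc: the blank comment fails Comment.changeset/2 and invalidates the parent.
casted = Post.changeset(%Post{}, params)
IO.inspect(casted.valid?)

# The nested errors sit on the child changesets; traverse_errors/2 collects them.
errors = Ecto.Changeset.traverse_errors(casted, fn {msg, _opts} -> msg end)
IO.inspect(errors)

# put_assoc: the same missing body goes through because no validation runs.
bypass =
  %Post{}
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:comments, [%Comment{body: nil}])

IO.inspect(bypass.valid?)
```

Note that the parent's own `errors` list stays empty in the `cast_assoc/3` case; the failure is carried by the child changeset stored under `changes`.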
**Anti-pattern:** Using `put_assoc/4` with external/untrusted nested params bypasses validation:

```elixir
# BAD — building Comment structs from raw params skips Comment's changeset
def changeset(post, params) do
  comments =
    Enum.map(params["comments"] || [], fn p ->
      struct(Comment, %{body: p["body"], post_id: post.id})
    end)

  post
  |> cast(params, [:title])
  |> put_assoc(:comments, comments)

  # Comment's validate_required(:body), validate_length(:body), etc. never run
end
```

### When to Use `cast_assoc/3`

**Triggers:**

- Nested params arrive from a user-submitted form or API body (e.g., a post with embedded comments)
- The association has its own changeset function with validations you want applied
- You need nested validation errors carried on the parent changeset (they live on the child changesets under `changes`, mark the parent invalid, and can be collected with `traverse_errors/2`)

**Example — before:**

```elixir
def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> put_assoc(:line_items, build_line_items(params["line_items"]))
end

defp build_line_items(nil), do: []
# No validation, and struct/2 discards keys that are not fields of the struct,
# which includes the string keys found in external params
defp build_line_items(items), do: Enum.map(items, &struct(LineItem, &1))
```

**Example — after:**

```elixir
def changeset(invoice, params) do
  invoice
  |> cast(params, [:number, :due_date])
  |> cast_assoc(:line_items, with: &LineItem.changeset/2)
end
```

### When to Use `put_assoc/4`

**Triggers:**

- You're loading existing records from the DB and associating them (e.g., tagging a post with existing Tag records)
- The associated structs were built by your application code, not by user input
- You want to replace the entire association with a pre-validated list

**Example:**

```elixir
def update_post_tags(post, tag_ids) do
  tags = Repo.all(from t in Tag, where: t.id in ^tag_ids)

  post
  # put_assoc/4 needs the current association loaded to compare old and new.
  # Dropping existing tags also needs an :on_replace option on the
  # association, because the default is :raise.
  |> Repo.preload(:tags)
  |> Ecto.Changeset.change()
  |> Ecto.Changeset.put_assoc(:tags, tags)
  |> Repo.update()
end
```

### When NOT to Use

**Don't use `cast_assoc` when:**

- You have pre-built, validated structs — `put_assoc` is simpler and more explicit
- The association doesn't have (or need) a changeset function
- You're building a test fixture and want direct struct assignment

**Don't use `put_assoc` when:**

- Any part of the association data came from user input — use `cast_assoc` so the association's validators run

**Over-application example:**

```elixir
# put_assoc for external data — Profile.changeset validations never run
def changeset(user, params) do
  profile = struct(Profile, params["profile"] || %{})

  user
  |> cast(params, [:email])
  |> put_assoc(:profile, profile)
end
```

**Better alternative:**

```elixir
def changeset(user, params) do
  user
  |> cast(params, [:email])
  |> cast_assoc(:profile, with: &Profile.changeset/2)
end
```

**Why:** `put_assoc/4` with externally-sourced structs is a silent validation bypass. If `Profile.changeset/2` validates that `:bio` must be under 500 characters, `put_assoc/4` with a user-supplied profile will store any length — the validation simply never ran.

---

## Decision Tree

Use this tree when deciding which Ecto changeset function to reach for:

- **Data comes from a user, form, or API?** → Use `cast/4` (type coercion + allow-list)
- **Data comes from your application code?** → Use `change/2` (no coercion, no filtering)
- **Need to validate a value without hitting the DB?** → Use `validate_required`, `validate_format`, `validate_length`, `validate_number`, `validate_inclusion`, `validate_subset`, or `validate_change/3` for custom rules
- **Need DB-level uniqueness guarantee (race-safe)?** → Use `unique_constraint/2` (not `unsafe_validate_unique/3` alone)
- **Setting a field to a computed value that might not have changed?** → Use `put_change/3` (skips write if unchanged); use `force_change/3` only when the DB write must occur regardless
- **Need to run something inside the transaction?** → Use `prepare_changes/2`
- **Validating without persisting (form objects, embedded schemas)?** → Use `apply_action/2`
- **Nested association data comes from external params?** → Use `cast_assoc/3` (delegates to association's changeset function)
- **Nested association data is trusted internal data (loaded from DB)?** → Use `put_assoc/4`