ecto-conventions/conventions.md
Rodin 703eb613fd docs: initial conventions from elixir-ecto/ecto
Key patterns: query composition, schemaless changesets, protocol-based
extensibility, zero TODOs discipline, version-gated cleanup
2026-04-30 11:45:32 -07:00


Patterns Extracted from elixir-ecto/ecto

Pattern: Schemaless Changesets for Validation Without Persistence

Source: lib/ecto/changeset.ex Category: changeset What: Use {data, types} tuples instead of schema structs to validate arbitrary data without a database-backed schema. Why: Decouples validation logic from persistence. Web forms, CLI inputs, API parameters, and configuration can all be validated using the same changeset pipeline without requiring a schema module or database table. Example:

data = %{}
types = %{name: :string, email: :string, age: :integer}
params = %{"name" => "Callum", "email" => "callum@example.com", "age" => "27"}

changeset =
  {data, types}
  |> Ecto.Changeset.cast(params, Map.keys(types))
  |> Ecto.Changeset.validate_required([:name, :email])
  |> Ecto.Changeset.validate_format(:email, ~r/@/)

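case Ecto.Changeset.apply_action(changeset, :validate) do
  {:ok, validated} ->
    # Use the validated data (a plain map here, because data was %{})
    validated

  {:error, changeset} ->
    # Handle errors, for example by inspecting changeset.errors
    changeset.errors
end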

When to use: Form validation, API parameter validation, configuration parsing, any data that needs casting + validation but won't go to a database. Also useful in tests. When NOT to use: When you have a real schema already — just use it. Don't create schemaless changesets as a workaround for schema design issues.


Pattern: Validation vs Constraint Boundary

Source: lib/ecto/changeset.ex Category: changeset What: Explicitly separate validations (checked in memory before touching the DB) from constraints (checked by the database during insert/update). Constraints are only checked if all validations pass. Why: Creates a two-phase error pipeline: errors that can be detected in memory without a database round trip (validations) and data-integrity errors that only the database can detect (constraints). This design avoids race conditions: uniqueness can only be truly guaranteed by the DB. Example:

def changeset(user, params) do
  user
  |> cast(params, [:name, :email, :age])
  # Phase 1: Validations (in-memory, immediate)
  |> validate_required([:name, :email])
  |> validate_format(:email, ~r/@/)
  |> validate_inclusion(:age, 18..100)
  # Phase 2: Constraints (database, deferred)
  |> unique_constraint(:email)
  |> check_constraint(:age, name: :age_must_be_positive)
end

When to use: Always. This is the pattern for input handling in Ecto. Understand the boundary: validations catch the obvious stuff fast; constraints catch the stuff only the DB knows. When NOT to use: Don't use unsafe_validate_unique/4 as a substitute for unique_constraint/3 — it's a UX optimization (show errors early), not a guarantee.


Pattern: Cast/Change Duality (External vs Internal Data)

Source: lib/ecto/changeset.ex Category: changeset What: cast/4 handles external (untrusted) data, typically a map with string keys, performing type coercion and filtering out fields that are not explicitly permitted. change/2 handles internal (trusted) data with atom keys, storing values directly without casting or validation. Why: This duality enforces a security boundary. External data (user input) goes through cast, which only permits explicitly listed fields. Internal data (programmatic changes) uses change, which trusts the caller. The invariant is that data with atom keys passed to change has already been parsed and validated. Example:

# External: user input from a web form (string keys, untrusted)
changeset = cast(user, %{"name" => "Alice", "is_admin" => "true"}, [:name])
# Only :name is permitted — is_admin is silently dropped

# Internal: programmatic change (atom keys, trusted)
changeset = change(user, %{verified_at: DateTime.utc_now()})
# No filtering, no casting — you know what you're doing

When to use: cast for anything from HTTP, CLI, or external systems. change for server-side logic (background jobs, system events, migrations). When NOT to use: Never pass user-controlled data to change/2. Never use cast/4 for internal programmatic updates (it's wasteful and may reject valid internal representations).
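The difference is easy to observe on a schemaless changeset. The following is a minimal sketch, assuming a standalone script with network access so that Mix.install can fetch Ecto (the version requirement is an assumption); the field names are hypothetical.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto on first run.
Mix.install([{:ecto, "~> 3.10"}])

import Ecto.Changeset

# Hypothetical schemaless data: a {data, types} tuple instead of a schema struct
user = {%{}, %{name: :string, is_admin: :boolean}}

# External input: only :name is permitted, so "is_admin" never reaches the changes
external = cast(user, %{"name" => "Alice", "is_admin" => "true"}, [:name])
IO.inspect(external.changes, label: "cast")

# Internal input: change/2 stores the value as given, without casting it.
# The string "true" stays a string, which is why user input must not go through change/2.
internal = change(user, %{is_admin: "true"})
IO.inspect(internal.changes, label: "change")
```

The cast result contains only the permitted, coerced field, while the change result carries whatever the caller supplied.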


Pattern: Composable Query Building with dynamic/2

Source: lib/ecto/query.ex Category: query What: Build query expressions incrementally using dynamic/2, composing conditions from runtime values, then interpolate the full expression at the root of where, order_by, group_by, or select. Why: Standard Ecto queries are compile-time expressions. dynamic bridges the gap for search/filter UIs where conditions are assembled based on user input. Each dynamic expression is self-contained with its own bindings and can reference different joins. Example:

def filter(params) do
  conditions = dynamic(true)

  conditions =
    if params["published"] do
      dynamic([p], p.is_published and ^conditions)
    else
      conditions
    end

  conditions =
    if min_age = params["min_age"] do
      dynamic([p, a], a.age >= ^min_age and ^conditions)
    else
      conditions
    end

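  # The [p, a] binding used above needs a second source, so join the author here
  from(p in Post, join: a in assoc(p, :author), where: ^conditions)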
end

When to use: Search forms, API filters, any scenario where the number and combination of conditions is unknown at compile time. Also works for dynamic order_by, group_by, select, and update. When NOT to use: Simple queries with fixed conditions — just use regular where. Don't use dynamic for every query; it obscures intent and loses type casting across dynamic boundaries.


Pattern: Named Bindings for Composable Joins

Source: lib/ecto/query.ex Category: query What: Use as: :name on from and join clauses to create stable references that survive query composition, then reference them by name in later refinements, regardless of their position. Why: Positional bindings break when queries are composed in different orders. Named bindings decouple the query structure from the position, making helper functions composable without knowing the full query shape. Example:

# Base query with named binding
def base_query do
  from(p in Post, as: :post,
    join: a in assoc(p, :author), as: :author)
end

# Filter function doesn't need to know position
def filter_by_author_age(query, min_age) do
  from([author: a] in query, where: a.age >= ^min_age)
end

# Late binding with parent_as in subqueries
def with_comment_count(query) do
  comment_count = from(c in Comment,
    where: c.post_id == parent_as(:post).id,
    select: count()
  )
  from([post: p] in query,
    select_merge: %{comment_count: subquery(comment_count)})
end

When to use: Library code, shared query helpers, any query that will be extended by callers. Essential for parent_as in lateral joins and correlated subqueries. When NOT to use: One-off queries in a single function where positional bindings are clear and the query won't be composed further.


Pattern: Ecto.Multi as Inspectable Transaction Blueprint

Source: lib/ecto/multi.ex Category: repo What: Build transactions as data (an Ecto.Multi struct) that can be inspected, tested, and composed before execution. Use to_list/1 to introspect operations without hitting the database. Why: Traditional Repo.transaction(fn -> ... end) is opaque — you can't test what it will do without running it. Multi makes the transaction plan a first-class value: you can unit test changesets, assert operation order, and compose Multis from different modules. Example:

defmodule PasswordManager do
  def reset(account, params) do
    Multi.new()
    |> Multi.update(:account, Account.password_reset_changeset(account, params))
    |> Multi.insert(:log, Log.password_reset_changeset(account, params))
    |> Multi.delete_all(:sessions, Ecto.assoc(account, :sessions))
  end
end

# Unit test without database
test "password reset multi structure" do
  multi = PasswordManager.reset(%Account{}, %{password: "new"})
  assert [{:account, {:update, changeset, []}},
          {:log, {:insert, _, []}},
          {:sessions, {:delete_all, _, []}}] = Ecto.Multi.to_list(multi)
  assert changeset.valid?
end

# Execute
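case Repo.transact(PasswordManager.reset(account, params)) do
  {:ok, %{account: account}} ->
    # Success: every operation was committed
    {:ok, account}

  {:error, :account, changeset, _changes_so_far} ->
    # The account update failed and the transaction was rolled back
    {:error, changeset}

  {:error, failed_operation, failed_value, _changes_so_far} ->
    # Any other operation (:log, :sessions) failed
    {:error, {failed_operation, failed_value}}
end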

When to use: When the set of operations is dynamic, when you need to test transaction logic without a database, or when multiple modules contribute operations to the same transaction. When NOT to use: Simple happy-path transactions. Per Ecto's own docs: "For most other use cases, using regular control flow within Repo.transact(fun) and returning {:ok, result} or {:error, reason} is more straightforward."


Pattern: Multi.merge for Dynamic Transaction Composition

Source: lib/ecto/multi.ex Category: repo What: Use Multi.merge/2 to dynamically compose transactions where later operations depend on earlier results. The merge function receives all changes so far and returns a new Multi. Why: Static Multi pipelines can't branch based on previous results. merge enables conditional logic (if X was inserted, also insert Y) while keeping the transaction atomic and the plan composable. Example:

Multi.new()
|> Multi.insert(:post, %Post{title: "first"})
|> Multi.merge(fn %{post: post} ->
  Multi.new()
  |> Multi.insert(:comment, Ecto.build_assoc(post, :comments, body: "auto"))
  |> Multi.update_all(:notify, from(s in Sub, where: s.post_id == ^post.id),
       set: [notified: true])
end)
|> Repo.transact()

When to use: When transaction operations depend on auto-generated IDs or computed values from earlier steps. When composing Multis from different bounded contexts. When NOT to use: When operations don't actually depend on each other — just use Multi.insert/update directly. Avoid deeply nested merges; flatten when possible.


Pattern: Multi with Non-Atom Names for Collection Operations

Source: lib/ecto/multi.ex Category: repo What: Multi operation names can be any term (not just atoms) — use tuples like {:account, id} to track individual items when processing collections within a transaction. Why: When updating N items in a transaction, you need N unique names. Using tuples gives semantic meaning and allows you to pattern-match on the error to find which specific item failed. Example:

accounts = [%Account{id: 1}, %Account{id: 2}, %Account{id: 3}]

multi =
  Enum.reduce(accounts, Multi.new(), fn account, multi ->
    Multi.update(multi, {:account, account.id},
      Account.password_reset_changeset(account, params))
  end)

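case Repo.transact(multi) do
  {:ok, results} ->
    # Look up individual results by name: results[{:account, 1}], results[{:account, 2}], etc.
    results

  {:error, {:account, failed_id}, changeset, _changes_so_far} ->
    # The operation name tells you exactly which account failed
    {:error, failed_id, changeset}
end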

When to use: Batch operations within a transaction where you need to identify which operation failed. When NOT to use: Single-item operations — just use descriptive atoms like :insert_user.


Pattern: The Queryable Protocol for Extensible Data Sources

Source: lib/ecto/queryable.ex Category: protocol What: Ecto.Queryable is a protocol that converts any data structure into an Ecto.Query. Implementations exist for atoms (schemas), bitstrings (table names), tuples ({"table", Schema}), Ecto.Query itself, and Ecto.SubQuery. Why: This is how Ecto achieves its composability. You can pass a schema, a string, a tuple, or a query to any function that accepts a queryable — they're all interchangeable. It's the extension point for custom data sources. Example:

# All of these work wherever a queryable is expected:
Repo.all(User)                           # Atom (schema)
Repo.all(from u in User, where: u.active) # Ecto.Query
Repo.all("users")                         # BitString (table name)
Repo.all({"legacy_users", User})          # Tuple (alternate source + schema)

# Custom queryable: a struct that knows how to become a query
defmodule ActiveUsers do
  defstruct [:min_age]
end

defimpl Ecto.Queryable, for: ActiveUsers do
  import Ecto.Query, only: [from: 2]

  def to_query(%ActiveUsers{min_age: age}) do
    from u in User, where: u.active and u.age >= ^age
  end
end

Repo.all(%ActiveUsers{min_age: 18})

When to use: When you want to create reusable, composable query objects that encapsulate filtering logic. When you have alternate table sources (views, partitioned tables) that share a schema. When NOT to use: For simple filtering — just compose queries normally. Don't over-abstract with custom Queryable implementations when a function returning a query would suffice.


Pattern: ParameterizedType for Field-Level Customization

Source: lib/ecto/parameterized_type.ex Category: protocol What: Ecto.ParameterizedType is a behaviour that extends the type system with compile-time options. Unlike basic Ecto.Type (one implementation per module), parameterized types receive field-specific options at compile time via init/1 and pass them to every callback. Why: Enables a single type module to behave differently per field. Ecto.Enum is the canonical example: one module handles all enum fields, but each field has its own valid values configured at the schema level. Example:

defmodule EncryptedString do
  use Ecto.ParameterizedType

  def type(_params), do: :binary

  def init(opts) do
    %{key: Keyword.fetch!(opts, :key), algorithm: Keyword.get(opts, :algorithm, :aes_256)}
  end

  def cast(value, _params) when is_binary(value), do: {:ok, value}
  def cast(_, _), do: :error

  # encrypt/3 and decrypt/3 stand in for application-defined helpers; they are not part of Ecto
  def dump(nil, _dumper, _params), do: {:ok, nil}
  def dump(value, _dumper, %{key: key, algorithm: alg}) do
    {:ok, encrypt(value, key, alg)}
  end

  def load(nil, _loader, _params), do: {:ok, nil}
  def load(value, _loader, %{key: key, algorithm: alg}) do
    {:ok, decrypt(value, key, alg)}
  end
end

# Usage in schema — different keys per field
schema "users" do
  field :ssn, EncryptedString, key: :ssn_key
  field :notes, EncryptedString, key: :notes_key, algorithm: :chacha20
end

When to use: When a type needs per-field configuration (encryption keys, enum values, precision settings, format options). When you want nil handling in load/dump (basic types skip nil). When NOT to use: When the type behaves identically regardless of field — use basic Ecto.Type instead. Don't reach for parameterized types just because you can.


Pattern: Ecto.Enum as Atom-Safe Persistence

Source: lib/ecto/enum.ex Category: schema What: Ecto.Enum maps atoms to strings (or integers) for database storage. It uses ParameterizedType to accept a values: option per field and builds lookup maps at compile time for O(1) casting/loading/dumping. Why: Atoms are great in Elixir (pattern matching, clarity) but dangerous to create from untrusted input. Ecto.Enum provides safe atom-to-string roundtripping: only atoms declared at compile time are ever created from DB values. Because the lookup maps are generated at compile time, casting, loading, and dumping each cost a single map lookup at runtime. Example:

schema "orders" do
  field :status, Ecto.Enum, values: [:pending, :processing, :shipped, :delivered]
  field :priority, Ecto.Enum, values: [low: 1, medium: 2, high: 3, critical: 4]
  field :roles, {:array, Ecto.Enum}, values: [:admin, :editor, :viewer]
end

# Compile-time init builds:
# on_cast: %{"pending" => :pending, "processing" => :processing, ...}
# on_dump: %{pending: "pending", processing: "processing", ...}
# on_load: %{"pending" => :pending, "processing" => :processing, ...}

When to use: Any field with a fixed set of allowed values. Prefer over raw strings for domain modeling (compile-time typo detection, pattern matching). When NOT to use: When the set of values is user-defined and can change without a code deploy — you'd need a different approach (lookup table, custom type).
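The atom-safety argument can be illustrated without Ecto. The sketch below is not Ecto.Enum's implementation; EnumSketch and its functions are hypothetical names that mimic the lookup-map idea described above: the maps are built once from the declared atoms, and casting is only ever a lookup.

```elixir
defmodule EnumSketch do
  # Build the lookup maps once from the declared atoms (hypothetical helper).
  def init(values) when is_list(values) do
    %{
      on_cast: Map.new(values, fn atom -> {Atom.to_string(atom), atom} end),
      on_dump: Map.new(values, fn atom -> {atom, Atom.to_string(atom)} end)
    }
  end

  # Casting a string is a map lookup, so an unknown string never becomes an atom.
  def cast(%{on_cast: on_cast}, value) when is_binary(value), do: Map.fetch(on_cast, value)

  # Casting an atom succeeds only if it is one of the declared values.
  def cast(%{on_dump: on_dump}, value) when is_atom(value) do
    if Map.has_key?(on_dump, value), do: {:ok, value}, else: :error
  end

  def cast(_params, _value), do: :error
end

params = EnumSketch.init([:pending, :shipped])

IO.inspect(EnumSketch.cast(params, "pending"))
IO.inspect(EnumSketch.cast(params, "cancelled"))
```

A declared value comes back as its atom; an undeclared string is rejected with :error and is never converted into an atom.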


Pattern: prepare_changes for Side Effects Deferred to Transaction Time

Source: lib/ecto/changeset.ex Category: changeset What: prepare_changes/2 registers a callback that runs inside the database transaction, after the changeset is valid but before it's committed. The callback receives the changeset with repo already set. Why: Some operations must happen atomically with the main insert/update but can't be expressed as changeset changes (counter caches, audit logs, cache invalidation). prepare_changes gives you access to the repo inside the transaction while keeping the changeset's validation pipeline clean. Example:

def create_comment(comment, params) do
  comment
  |> cast(params, [:body, :post_id])
  |> validate_required([:body, :post_id])
  |> prepare_changes(fn changeset ->
    if post_id = get_change(changeset, :post_id) do
      query = from(p in Post, where: [id: ^post_id])
      changeset.repo.update_all(query, inc: [comment_count: 1])
    end
    changeset
  end)
end

When to use: Counter caches, denormalization updates, audit trail inserts that must be atomic with the primary operation. When you need the repo but want to keep it out of the changeset function signature. When NOT to use: Complex multi-step operations — use Ecto.Multi instead. Don't use for non-DB side effects (sending emails) — those should happen after successful commit.


Pattern: Embedded Schemas as UI/Domain Boundary Objects

Source: lib/ecto/schema.ex, lib/ecto/embedded.ex Category: schema What: Use embedded_schema/1 for structs that represent application-layer concepts (form inputs, API requests, configuration) without a backing database table. Combine with cast_embed/3 for nested validation within a parent schema. Why: Decouples your domain's internal representation from what gets persisted. A SignUp embedded schema validates the form; then you split it into Account and Profile schemas for persistence. Embedded schemas also work as value objects inside other schemas (embeds_one/embeds_many). Example:

# As a boundary object (no persistence)
defmodule SearchForm do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :query, :string
    field :min_price, :decimal
    field :max_price, :decimal
    field :categories, {:array, :string}
  end

  def changeset(form, params) do
    form
    |> cast(params, [:query, :min_price, :max_price, :categories])
    |> validate_required([:query])
    |> validate_number(:min_price, greater_than_or_equal_to: 0)
  end
end

# As a nested value object in a persisted schema
schema "orders" do
  embeds_one :shipping_address, Address, on_replace: :update do
    field :street, :string
    field :city, :string
    field :zip, :string
  end
end

When to use: Form objects, command objects, API request shapes, inline value objects (addresses, coordinates, metadata). Anywhere you need validation without a table. When NOT to use: When the data needs its own lifecycle (CRUD, querying, associations) — use a proper schema with a table.


Pattern: The Adapter Behaviour Split (Layered Capabilities)

Source: lib/ecto/adapter.ex, lib/ecto/adapter/queryable.ex, lib/ecto/adapter/transaction.ex Category: protocol What: Ecto splits adapter responsibilities into multiple behaviours: Ecto.Adapter (base: init, loaders, dumpers), Ecto.Adapter.Queryable (prepare, execute, stream), Ecto.Adapter.Schema (insert, update, delete), and Ecto.Adapter.Transaction (transaction, rollback). An adapter implements only what it supports. Why: Not all data stores support all operations. A key-value store might implement Schema but not Queryable. A read-only store skips Schema entirely. Because each capability is a separate behaviour, the compiler can warn about missing callbacks for the behaviours an adapter declares, and a repo defined with read_only: true does not expose write operations at all. Example:

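# An adapter declares what it supports (outline only; remaining callbacks omitted):
defmodule MyAdapter do
  @behaviour Ecto.Adapter
  @behaviour Ecto.Adapter.Queryable
  # NOT implementing Ecto.Adapter.Transaction = no transaction support

  @impl Ecto.Adapter
  def init(_config) do
    # Placeholder process and empty metadata; a real adapter would start its connection pool here
    child_spec = Supervisor.child_spec({Agent, fn -> %{} end}, id: __MODULE__)
    {:ok, child_spec, %{}}
  end

  @impl Ecto.Adapter
  def loaders(:binary_id, type), do: [Ecto.UUID, type]
  def loaders(_primitive, type), do: [type]

  @impl Ecto.Adapter.Queryable
  def prepare(:all, query), do: {:cache, query}

  @impl Ecto.Adapter.Queryable
  def execute(_meta, _query_meta, _cache, _params, _opts) do
    # Return {row_count, rows}; this outline returns no rows
    {0, []}
  end
end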

When to use: When designing a system with pluggable backends that have different capability sets. The pattern: define a base behaviour for universal operations, then optional behaviours for extended capabilities. When NOT to use: When all implementations will always support all operations — a single behaviour is simpler. Don't over-split when there's no real variability.
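Outside of Ecto, the same layering can be sketched in plain Elixir. Everything below is hypothetical (Store, Store.Writable, Capabilities, and the two backends are illustrative names, not Ecto APIs): a base behaviour for universal operations, an optional behaviour for an extended capability, and a helper that reads a module's declared behaviours to decide what it supports.

```elixir
defmodule Store do
  @moduledoc "Base behaviour: every backend implements this (hypothetical)."
  @callback fetch(key :: term) :: {:ok, term} | :error
end

defmodule Store.Writable do
  @moduledoc "Optional capability: backends that accept writes (hypothetical)."
  @callback put(key :: term, value :: term) :: :ok
end

defmodule Capabilities do
  # Collects the behaviours a compiled module declared via @behaviour.
  def behaviours(module) do
    for {:behaviour, list} <- module.module_info(:attributes), b <- list, do: b
  end

  def supports?(module, behaviour), do: behaviour in behaviours(module)
end

defmodule ReadOnlyStore do
  @behaviour Store

  @impl Store
  def fetch(:answer), do: {:ok, 42}
  def fetch(_key), do: :error
end

defmodule MemoryStore do
  @behaviour Store
  @behaviour Store.Writable

  # Uses the process dictionary as a stand-in for real storage.
  @impl Store
  def fetch(key) do
    case Process.get({__MODULE__, key}) do
      nil -> :error
      value -> {:ok, value}
    end
  end

  @impl Store.Writable
  def put(key, value) do
    Process.put({__MODULE__, key}, value)
    :ok
  end
end

IO.inspect(Capabilities.supports?(ReadOnlyStore, Store.Writable))
IO.inspect(Capabilities.supports?(MemoryStore, Store.Writable))
```

Callers can branch on the capability check instead of calling a function that a backend may not define, which keeps the read-only backend free of stub implementations.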


Pattern: Loader/Dumper Pipeline for Type Coercion

Source: lib/ecto/adapter.ex, lib/ecto/type.ex Category: protocol What: Adapters return ordered lists of loaders and dumpers for each type — a pipeline of functions/types that values pass through when loading from or dumping to the database. The adapter controls the pipeline shape. Why: Different databases represent the same logical type differently (booleans as 0/1, UUIDs as binary vs string). The loader/dumper pipeline lets adapters inject coercion steps without the type system needing to know about adapter internals. Example:

# Adapter that stores booleans as integers:
def loaders(:boolean, type), do: [&bool_decode/1, type]
def dumpers(:boolean, type), do: [type, &bool_encode/1]

defp bool_decode(0), do: {:ok, false}
defp bool_decode(1), do: {:ok, true}
defp bool_encode(false), do: {:ok, 0}
defp bool_encode(true), do: {:ok, 1}

# The pipeline: DB value → adapter decode → Ecto type load → Elixir value
# And reverse: Elixir value → Ecto type dump → adapter encode → DB value

When to use: Custom adapters, custom types that need adapter-aware coercion. Understanding this pattern helps debug "value not loading correctly" issues. When NOT to use: Application code rarely needs to think about this directly. Use Ecto.Type or Ecto.ParameterizedType for custom type logic.
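To make the pipeline idea concrete, here is a minimal sketch in plain Elixir. It is not Ecto's implementation: PipelineSketch and its functions are hypothetical, and the steps are restricted to functions, whereas the lists described above can also contain types. Each step returns {:ok, value} or :error, and the first :error stops the pipeline.

```elixir
defmodule PipelineSketch do
  # Runs a value through each step in order; halts at the first :error.
  def run(steps, value) do
    Enum.reduce_while(steps, {:ok, value}, fn step, {:ok, acc} ->
      case step.(acc) do
        {:ok, next} -> {:cont, {:ok, next}}
        :error -> {:halt, :error}
      end
    end)
  end

  # Adapter-level step: the database stores booleans as 0/1.
  def bool_decode(0), do: {:ok, false}
  def bool_decode(1), do: {:ok, true}
  def bool_decode(_other), do: :error

  # Type-level step: accept only real booleans.
  def load_boolean(value) when is_boolean(value), do: {:ok, value}
  def load_boolean(_other), do: :error
end

loaders = [&PipelineSketch.bool_decode/1, &PipelineSketch.load_boolean/1]

IO.inspect(PipelineSketch.run(loaders, 1))
IO.inspect(PipelineSketch.run(loaders, 2))
```

Ordering matters: the adapter step runs first when loading, so the type-level step only ever sees values it understands.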


Pattern: on_replace Strategy for Association Lifecycle

Source: lib/ecto/changeset.ex, lib/ecto/embedded.ex Category: schema What: When a parent changeset replaces associated data (e.g., removes items from a has_many), :on_replace controls what happens to the orphaned records: :raise, :mark_as_invalid, :nilify, :update, :delete, or :delete_if_exists. Why: Default :raise forces explicit design decisions about data lifecycle. Accidentally deleting child records by omitting them from a form submission is a common security/data-loss bug. This pattern makes the lifecycle policy declarative at the schema level. Example:

schema "posts" do
  # Comments are owned — deleting from the list deletes from DB
  has_many :comments, Comment, on_replace: :delete

  # Tags are shared — removing from post just nilifies the FK
  has_many :taggings, Tagging, on_replace: :nilify

  # Profile is 1:1 — replacing updates in place
  has_one :profile, Profile, on_replace: :update

  # Draft is embedded — update content in place
  embeds_one :draft, Draft, on_replace: :update
end

When to use: Always explicitly set :on_replace on associations/embeds that will be modified through cast_assoc/cast_embed. The default :raise is intentional — it forces you to think about it. When NOT to use: N/A — this is a required decision for any association that participates in parent changeset operations.


Pattern: apply_action/2 for Non-Persistence Validation

Source: lib/ecto/changeset.ex Category: changeset What: Call apply_action(changeset, :action) to get {:ok, data} or {:error, changeset} without touching the database. It emulates Repo.insert/update return shapes for changesets that aren't persisted. Why: Phoenix forms and other UIs use changeset.action to decide whether to show errors. Without apply_action, you'd need to manually set the action. With it, you get the same {:ok, _} | {:error, _} pattern as Repo operations, enabling consistent error handling. Example:

# API controller validating params without a DB
def search(conn, params) do
  changeset = SearchForm.changeset(%SearchForm{}, params)

  case Ecto.Changeset.apply_action(changeset, :validate) do
    {:ok, search} ->
      results = SearchService.execute(search)
      json(conn, results)
    {:error, changeset} ->
      conn |> put_status(422) |> render("errors.json", changeset: changeset)
  end
end

When to use: Schemaless changesets, embedded schema validation, any place you want {:ok, _} | {:error, _} semantics without a database roundtrip. Essential for form validation in LiveView. When NOT to use: When you're going to persist anyway — just call Repo.insert/update which does this internally.


Pattern: Repo.transact vs Repo.transaction

Source: lib/ecto/repo/transaction.ex Category: repo What: Repo.transact/2 accepts either a function returning {:ok, _} or {:error, _}, or an Ecto.Multi. It auto-rolls back on {:error, _}. This replaces the older Repo.transaction(fn -> ... end) which requires manual Repo.rollback/1. Why: The older transaction/2 with manual rollback is error-prone — forget to rollback and you commit partial state. transact/2 enforces the {:ok, _} | {:error, _} contract, making it impossible to accidentally commit on error. Example:

# Old pattern (error-prone):
Repo.transaction(fn ->
  case do_thing() do
    {:ok, result} -> result
    {:error, reason} -> Repo.rollback(reason)  # Easy to forget!
  end
end)

# New pattern (safe by construction):
Repo.transact(fn repo ->
  with {:ok, user} <- repo.insert(User.changeset(%User{}, params)),
       {:ok, profile} <- repo.insert(Profile.changeset(%Profile{}, user)) do
    {:ok, %{user: user, profile: profile}}
  end
  # {:error, _} automatically triggers rollback
end)

When to use: All new transaction code. The function form is simpler than Multi when operations are sequential and the logic is straightforward. When NOT to use: When you need the introspection/testing benefits of Ecto.Multi. When the set of operations is dynamic.


Pattern: Query Prefix for Multi-Tenancy

Source: lib/ecto/query.ex Category: query What: Set a query prefix at the query level, from/join level, schema level (@schema_prefix), or repo operation level (prefix: "tenant_x") to route queries to different database schemas/databases. Why: Enables schema-based multi-tenancy (PostgreSQL schemas, MySQL databases) without changing application code. The prefix cascades: repo option → query prefix → schema prefix → from/join option, with later overriding earlier. Example:

# Option 1: At the repo call
Repo.all(User, prefix: "tenant_abc")

# Option 2: On the query
from(u in User) |> Ecto.Query.put_query_prefix("tenant_abc") |> Repo.all()

# Option 3: At the schema level
defmodule TenantABC.User do
  use Ecto.Schema
  @schema_prefix "tenant_abc"
  schema "users" do ... end
end

# Option 4: Per-join (useful for cross-tenant queries)
from u in User, prefix: "tenant_abc",
  join: g in GlobalSetting, prefix: "public", on: ...

When to use: Multi-tenant applications with schema-per-tenant architecture. Also useful for accessing shared/public schemas alongside tenant-specific ones. When NOT to use: Row-level multi-tenancy (where all tenants share tables with a tenant_id column) — just use where clauses or default scopes.