Merge pull request 'feat: add Ecto patterns extracted from elixir-ecto/ecto source' (#1) from feat/ecto-patterns into master

This commit was merged in pull request #1.
2026-05-02 05:48:16 +00:00
15 changed files with 4701 additions and 0 deletions
+23
@@ -2,6 +2,18 @@
How behaviours are designed, implemented, and used in Elixir core and Phoenix.
## Contents
1. [Behaviour Definition with `@callback`](#1-behaviour-definition-with-callback)
2. [`@optional_callbacks` for Extensibility](#2-optional_callbacks-for-extensibility)
3. [`@behaviour` Declaration in `__using__`](#3-behaviour-declaration-in-__using__)
4. [Default Implementations via `defoverridable`](#4-default-implementations-via-defoverridable)
5. [Phoenix Channel: Behaviour + Process + Protocol](#5-phoenix-channel-behaviour--process--protocol)
6. [Callback Documentation Pattern](#6-callback-documentation-pattern)
7. [Phoenix.Endpoint: Behaviour as Interface Contract](#7-phoenixendpoint-behaviour-as-interface-contract)
---
## 1. Behaviour Definition with `@callback`
**Source:** [lib/elixir/lib/gen_server.ex#L577](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/gen_server.ex#L577) (all callback definitions)
@@ -678,4 +690,15 @@ end
**Why:** The more code a `use` macro generates, the harder it is to debug. If users regularly need to read the generated code to understand failures, the abstraction is leaking. Reserve heavy `use` macros for well-established patterns (GenServer, Endpoint, Channel) where the community has internalized the mental model.
## Decision Tree
- If you need a contract that multiple modules will implement differently → define a behaviour with `@callback` (Pattern 1)
- If most implementors will use a default for some callbacks → mark those `@optional_callbacks` (Pattern 2)
- If your behaviour requires boilerplate setup (module attributes, compile hooks) → inject `@behaviour` inside `__using__` (Pattern 3)
- If 90% of implementors want the same default for a callback → provide a `defoverridable` implementation (Pattern 4)
- If the behaviour involves a running process with lifecycle configuration → combine behaviour + process + module attributes (Pattern 5)
- If callback semantics are non-obvious (multiple return shapes, triggering conditions) → write comprehensive `@doc` with examples on each `@callback` (Pattern 6)
- If the behaviour requires significant generated boilerplate (plugs, routing, supervision wiring) → use the `use` macro as the full interface contract (Pattern 7)
- If there is only one implementation and no plans for more → skip the behaviour, use a plain module
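The first two branches of the tree can be sketched together. This is a minimal illustration, not code from Elixir core or Phoenix: `Notifier`, `PlainNotifier`, and `LoudNotifier` are hypothetical names, and the fallback for the optional callback is one possible design.

```elixir
defmodule Notifier do
  # Hypothetical behaviour: deliver/1 is required, format/1 is optional.
  @callback deliver(message :: String.t()) :: :ok | {:error, term()}
  @callback format(message :: String.t()) :: String.t()
  @optional_callbacks format: 1

  # Calls the optional callback only when the implementation exports it.
  # function_exported?/3 does not load modules, so this assumes `impl`
  # is already loaded.
  def notify(impl, message) when is_atom(impl) and is_binary(message) do
    formatted =
      if function_exported?(impl, :format, 1) do
        impl.format(message)
      else
        message
      end

    impl.deliver(formatted)
  end
end

defmodule PlainNotifier do
  @behaviour Notifier

  # Only the required callback is implemented.
  @impl true
  def deliver(message) do
    send(self(), {:delivered, message})
    :ok
  end
end

defmodule LoudNotifier do
  @behaviour Notifier

  @impl true
  def deliver(message) do
    send(self(), {:delivered, message})
    :ok
  end

  # Optional callback: upcases the message before delivery.
  @impl true
  def format(message), do: String.upcase(message)
end
```

Calling `Notifier.notify(LoudNotifier, "hi")` delivers the formatted message, while `PlainNotifier` compiles without warnings even though it omits `format/1`.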
<!-- PATTERN_COMPLETE -->
File diff suppressed because it is too large
+28
@@ -2,6 +2,21 @@
Patterns extracted from Elixir's standard library source code.
## Contents
1. [List-Specialized Clause Before Protocol Dispatch](#1-list-specialized-clause-before-protocol-dispatch)
2. [Build-Then-Reverse (Cons-Cell Accumulation)](#2-build-then-reverse-cons-cell-accumulation)
3. [Pipeline for Linear Transformations, Bare Calls for Control Flow](#3-pipeline-for-linear-transformations-bare-calls-for-control-flow)
4. [Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)](#4-pipeline-ending-with--elem1-protocol-reduce-unwrap)
5. [Private Helper Decomposition: Recursive Workers with Guards](#5-private-helper-decomposition-recursive-workers-with-guards)
6. [Enum vs Stream Decision Pattern](#6-enum-vs-stream-decision-pattern)
7. [Map.update vs Map.put Decision Pattern](#7-mapupdate-vs-mapput-decision-pattern)
8. [Pattern Matching on Map Structure for Dispatch](#8-pattern-matching-on-map-structure-for-dispatch)
9. [Delegating to Erlang BIFs with `defdelegate`](#9-delegating-to-erlang-bifs-with-defdelegate)
10. [Reduce as the Universal Primitive](#10-reduce-as-the-universal-primitive)
11. [Keyword Multi-Clause Guard Dispatch (String.split pattern)](#11-keyword-multi-clause-guard-dispatch-stringsplit-pattern)
12. [Lazy Private Helpers with `defp parts_to_index`](#12-lazy-private-helpers-with-defp-parts_to_index)
---
## 1. List-Specialized Clause Before Protocol Dispatch
@@ -1010,4 +1025,17 @@ def log(msg) when is_atom(msg), do: IO.puts(Atom.to_string(msg))
**Why:** When a conversion is used exactly once and the calling function already dispatches on clauses, folding the conversion into the caller's clauses reduces indirection. Named helpers shine when reused or when they name a non-obvious transformation.
## Decision Tree
- If you accept "any enumerable" but lists are the common case → add a `when is_list` clause before protocol dispatch (Pattern 1)
- If you are building a result list element-by-element and order matters → prepend with `[x | acc]` then reverse at the end (Pattern 2)
- If data flows through 2+ sequential transformations → use the pipe operator (Pattern 3)
- If you call `Enumerable.reduce/3` directly and always want the accumulated value → unwrap with `|> elem(1)` (Pattern 4)
- If you need a recursive function with multiple termination conditions → decompose into public entry + private multi-clause worker (Pattern 5)
- If the collection is large/infinite or you chain 3+ transforms → use Stream; otherwise use Enum (Pattern 6)
- If the new value depends on the old value (increment, append) → use `Map.update/4`; if replacing unconditionally → use `Map.put/3` (Pattern 7)
- If you need to branch on whether a key exists and extract the value → pattern-match with `%{^key => value}` in a `case` (Pattern 8)
- If an Erlang function has identical semantics and argument order → use `defdelegate` (Pattern 9)
- If you are implementing a custom iterable data structure → implement `Enumerable.reduce/3` to get the full Enum API (Pattern 10)
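Two of these branches (Patterns 2 and 7) can be sketched in a few lines. The module and function names below are hypothetical and chosen only for illustration; the standard library's own implementations are more general.

```elixir
defmodule ListPatterns do
  # Pattern 2: prepend each result in constant time, reverse once at the end.
  def evens_doubled(list) when is_list(list), do: do_evens(list, [])

  defp do_evens([], acc), do: :lists.reverse(acc)

  defp do_evens([x | rest], acc) when rem(x, 2) == 0 do
    do_evens(rest, [x * 2 | acc])
  end

  defp do_evens([_ | rest], acc), do: do_evens(rest, acc)

  # Pattern 7: the new count depends on the old one, so Map.update/4 fits;
  # Map.put/3 would overwrite the previous count unconditionally.
  def count_words(words) do
    Enum.reduce(words, %{}, fn word, counts ->
      Map.update(counts, word, 1, &(&1 + 1))
    end)
  end
end
```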
<!-- PATTERN_COMPLETE -->
+27
@@ -2,6 +2,20 @@
Patterns extracted from the Elixir standard library source code.
## Contents
1. [@moduledoc with Structured Sections](#1-moduledoc-with-structured-sections)
2. [@doc with Sections and Examples](#2-doc-with-sections-and-examples)
3. [@doc since: Version Annotation](#3-doc-since-version-annotation)
4. [@doc guard: true Metadata](#4-doc-guard-true-metadata)
5. [@doc false — Hiding from Documentation](#5-doc-false--hiding-from-documentation)
6. [@moduledoc false — Hiding Modules](#6-moduledoc-false--hiding-modules)
7. [Mermaid Diagrams in Documentation](#7-mermaid-diagrams-in-documentation)
8. [Admonition Blocks in Documentation](#8-admonition-blocks-in-documentation)
9. [@doc deprecated: Soft Deprecation](#9-doc-deprecated-soft-deprecation)
10. [Callback Documentation Convention](#10-callback-documentation-convention)
11. [Documentation with Link References (c: and t: prefixes)](#11-documentation-with-link-references-c-and-t-prefixes)
---
## 1. @moduledoc with Structured Sections
@@ -1000,4 +1014,17 @@ Returns `true` if the calling process is the owner of this resource.
**Why:** Link references should aid navigation, not turn documentation into hypertext soup. Link types and callbacks that users might need to look up; don't link primitive types or universally known functions.
## Decision Tree
- If the module is a primary entry point with 4+ public functions → use structured `@moduledoc` with sections (Pattern 1)
- If a function has non-obvious behavior or edge cases → add `@doc` with sections and `## Examples` doctests (Pattern 2)
- If adding a new public function to a versioned library → annotate with `@doc since: "X.Y.Z"` (Pattern 3)
- If the function/macro is valid in guard clauses → add `@doc guard: true` metadata (Pattern 4)
- If a function must be public for technical reasons but is not user-facing → use `@doc false` (Pattern 5)
- If an entire module is purely internal implementation → use `@moduledoc false` (Pattern 6)
- If documenting multi-component architecture (client-server, pipelines) → embed a Mermaid diagram (Pattern 7)
- If critical information must stand out (security, breaking changes, `use` behavior) → use an admonition block (Pattern 8)
- If a function still works but a better alternative exists → use `@doc deprecated:` for soft deprecation (Pattern 9)
- If defining a behaviour callback with multiple return shapes → write comprehensive callback docs with trigger, params, returns, and example (Pattern 10)
<!-- PATTERN_COMPLETE -->
+32
@@ -2,6 +2,23 @@
Patterns extracted from Elixir's standard library source code.
## Contents
1. [The `with` Macro — Normalized Error Clauses](#1-the-with-macro--normalized-error-clauses)
2. [Real-World `with` — Multi-Step Fallible Operations](#2-real-world-with--multi-step-fallible-operations)
3. [Another `with` — Error Info Extraction](#3-another-with--error-info-extraction)
4. [`{:ok, value}` / `:error` Convention (Map.fetch)](#4-ok-value--error-convention-mapfetch)
5. [Bang Functions: Raise on Error (`fetch!` vs `fetch`)](#5-bang-functions-raise-on-error-fetch-vs-fetch)
6. [Exception Structure: `defexception` Fields](#6-exception-structure-defexception-fields)
7. [Custom `exception/1` Callback for Ergonomic Raising](#7-custom-exception1-callback-for-ergonomic-raising)
8. [`raise` Macro Internals: Compile-Time Type Resolution](#8-raise-macro-internals-compile-time-type-resolution)
9. [Error Normalization: Erlang → Elixir Exception Translation](#9-error-normalization-erlang--elixir-exception-translation)
10. [`blame/2` Callback: Enriching Exceptions After the Fact](#10-blame2-callback-enriching-exceptions-after-the-fact)
11. [Guards for Type Dispatch in Error Handling](#11-guards-for-type-dispatch-in-error-handling)
12. [The `:error` / `{:error, reason}` Convention Split](#12-the-error--error-reason-convention-split)
13. [`reduce_while` — Early Exit Without Exceptions](#13-reduce_while--early-exit-without-exceptions)
14. [Three-Tier Error Strategy in Map Operations](#14-three-tier-error-strategy-in-map-operations)
---
## 1. The `with` Macro — Normalized Error Clauses
@@ -1400,4 +1417,19 @@ end
**Why:** The three-tier pattern only makes sense when failure is a real possibility and different callers genuinely need different responses to that failure. Don't cargo-cult it onto functions that always succeed or have a single calling context.
## Decision Tree
- If you have 2+ sequential steps that each return a value to pattern-match → use `with` with normalized error shapes (Pattern 1)
- If the caller only cares success vs failure (not which step failed) → use `with` + `else _ -> :error` catch-all (Pattern 2)
- If extracting nested data from loosely-structured inputs (stacktraces, metadata) → chain pattern matching in `with` (Pattern 3)
- If a function has exactly one failure mode obvious from context → return bare `:error` (Pattern 4)
- If failure means a bug (preconditions guarantee success) → provide a bang variant that raises (Pattern 5)
- If callers need to programmatically inspect error context → use `defexception` with structured fields (Pattern 6)
- If the exception message is computed from multiple fields or requires validation → override `exception/1` (Pattern 7)
- If wrapping an Erlang library that returns raw error atoms/tuples → normalize to Elixir exceptions at the boundary (Pattern 9)
- If you can provide expensive but helpful context (did-you-mean suggestions) → implement `blame/2` (Pattern 10)
- If multiple distinct failure modes exist → use `{:error, reason}` tuples; if only one → use bare `:error` (Pattern 12)
- If you need early exit from iteration without exceptions → use `reduce_while` with `{:cont, acc}` / `{:halt, acc}` (Pattern 13)
- If designing a module with lookup operations for different caller needs → provide three tiers: get/fetch/fetch! (Pattern 14)
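The last branch (three tiers) combined with a custom `exception/1` can be sketched as follows. `Settings` and `Settings.MissingKeyError` are hypothetical names; the shape mirrors `Map.get/3`, `Map.fetch/2`, and `Map.fetch!/2` but is not their implementation.

```elixir
defmodule Settings do
  defmodule MissingKeyError do
    defexception [:key, :message]

    # Builds the message from the structured :key field (Pattern 7).
    @impl true
    def exception(opts) do
      key = Keyword.fetch!(opts, :key)
      %__MODULE__{key: key, message: "missing setting #{inspect(key)}"}
    end
  end

  # Tier 2: one obvious failure mode, so a bare :error is enough.
  def fetch(settings, key) when is_map(settings) do
    case settings do
      %{^key => value} -> {:ok, value}
      %{} -> :error
    end
  end

  # Tier 1: the caller supplies a default and never sees a failure.
  def get(settings, key, default \\ nil) do
    case fetch(settings, key) do
      {:ok, value} -> value
      :error -> default
    end
  end

  # Tier 3: a missing key is treated as a bug, so raise.
  def fetch!(settings, key) do
    case fetch(settings, key) do
      {:ok, value} -> value
      :error -> raise MissingKeyError, key: key
    end
  end
end
```

Both `get/3` and `fetch!/2` are thin wrappers over `fetch/2`, which keeps the lookup logic in one place.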
<!-- PATTERN_COMPLETE -->
+95
@@ -2,6 +2,21 @@
Analysis of `lib/elixir/lib/gen_server.ex`, `lib/elixir/lib/agent.ex`, and related modules.
## Contents
1. [Pattern 1: Client/Server API Separation](#pattern-1-clientserver-api-separation)
2. [Pattern 2: `@impl true` Annotations on All Callbacks](#pattern-2-impl-true-annotations-on-all-callbacks)
3. [Pattern 3: Guard-Protected `start_link`](#pattern-3-guard-protected-start_link)
4. [Pattern 4: `handle_continue` for Post-Init Work](#pattern-4-handle_continue-for-post-init-work)
5. [Pattern 5: Timeout-Based Idle Shutdown](#pattern-5-timeout-based-idle-shutdown)
6. [Pattern 6: Periodic Work via `Process.send_after`](#pattern-6-periodic-work-via-processsend_after)
7. [Pattern 7: Call vs Cast Decision (Synchronous vs Asynchronous)](#pattern-7-call-vs-cast-decision-synchronous-vs-asynchronous)
8. [Pattern 8: Default Callback Implementations with Clear Error Messages](#pattern-8-default-callback-implementations-with-clear-error-messages)
9. [Pattern 9: `child_spec/1` Generation and Customization via `use` Options](#pattern-9-child_spec1-generation-and-customization-via-use-options)
10. [Pattern 10: Agent as Minimal State Wrapper (GenServer Under the Hood)](#pattern-10-agent-as-minimal-state-wrapper-genserver-under-the-hood)
11. [Pattern 11: Name Registration via `:via` Tuple](#pattern-11-name-registration-via-via-tuple)
12. [Pattern 12: GenServer as Anti-Pattern — Don't Use Processes for Code Organization](#pattern-12-genserver-as-anti-pattern--dont-use-processes-for-code-organization)
---
## Pattern 1: Client/Server API Separation
@@ -189,6 +204,72 @@ end
**Why:** `@impl true` only makes sense in the context of a declared behaviour. Using it without one causes a compiler warning, not a benefit.
### Minimal Callback Annotation
**Source:** [lib/elixir/lib/module.ex#L72](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/module.ex#L72) (`@impl` documentation)
**What it does:** Once `@impl true` is on a callback, two things become redundant:
1. **Repeating `@impl true` on subsequent clauses** — the annotation applies to the function as a whole, not individual clauses. All clauses inherit it from the first.
2. **Adding `@spec`** — the behaviour's `@callback` already defines the type contract. Dialyzer uses the callback spec to check implementations, so a redundant `@spec` creates a second source of truth that can drift.
**Why:** The behaviour owns the contract. Adding `@spec init(term()) :: {:ok, map()}` on a GenServer callback just restates `@callback init(init_arg :: term) :: {:ok, state} | ...` with less information. Repeating `@impl true` on every clause is noise that misleads readers into thinking each clause is a separate function. The minimal annotation communicates "this is a callback, the behaviour defines the contract."
**Anti-pattern:**
```elixir
# BAD — redundant @spec and @impl on every clause
@spec handle_call(term(), GenServer.from(), map()) :: {:reply, term(), map()}
@impl true
def handle_call({:get, key}, _from, state) do
  {:reply, Map.get(state, key), state}
end

@impl true
def handle_call({:keys}, _from, state) do
  {:reply, Map.keys(state), state}
end

@impl true
def handle_call({:size}, _from, state) do
  {:reply, map_size(state), state}
end
```
**Example — after:**
```elixir
@impl true
def handle_call({:get, key}, _from, state) do
  {:reply, Map.get(state, key), state}
end

def handle_call({:keys}, _from, state) do
  {:reply, Map.keys(state), state}
end

def handle_call({:size}, _from, state) do
  {:reply, map_size(state), state}
end
```
**When NOT to apply this:**
- **Non-consecutive clauses:** If clauses of the same callback are separated by other functions, the compiler warns that clauses with the same name and arity should be grouped together. Keep multi-clause callbacks adjacent.
- **Intentionally narrower types:** When the implementation accepts a narrower type than the callback declares, a more specific `@spec` documents that constraint:
```elixir
# Justified — documents that this init/1 only accepts keyword lists,
# not arbitrary term() as the callback permits
@spec init(keyword()) :: {:ok, Config.t()}
@impl true
def init(opts) when is_list(opts) do
  {:ok, Config.new!(opts)}
end
```
If the `@spec` would be identical to (or wider than) the `@callback`, omit it. If it's meaningfully narrower, keep it.
---
## Pattern 3: Guard-Protected `start_link`
@@ -1109,4 +1190,18 @@ end
**Why:** The pattern cuts both ways. Over-using GenServer creates bottlenecks. Under-using it means reinventing state management poorly. The litmus test: does the state need to survive between function calls? Does access need serialization? If yes, you need a process.
## Decision Tree
- If other modules will interact with your GenServer → define a client API wrapping call/cast (Pattern 1)
- If implementing any behaviour callback → annotate with `@impl true` (Pattern 2)
- If `start_link` accepts arguments with a specific expected shape → add guards for fail-fast validation (Pattern 3)
- If `init/1` does expensive work (DB, network, cache warming) → split into fast init + `handle_continue` (Pattern 4)
- If the process is ephemeral (per-user, per-session) and should clean up when idle → use timeout-based idle shutdown (Pattern 5)
- If you need work at regular intervals regardless of message traffic → use `Process.send_after` self-scheduling loop (Pattern 6)
- If the caller needs confirmation or backpressure → use `call`; only use `cast` for genuine fire-and-forget (Pattern 7)
- If the process needs non-default restart/shutdown behavior → customize via `use GenServer` options (Pattern 9)
- If the process is purely about state (no custom messages, no timers) → use Agent instead of GenServer (Pattern 10)
- If spawning processes dynamically with unbounded names → use `{:via, Registry, ...}` to avoid atom leaks (Pattern 11)
- If the operation is stateless pure computation → don't use a GenServer at all, use a plain function (Pattern 12)
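A compact sketch combining Patterns 1 to 4 (client API, `@impl true`, guarded `start_link`, and `handle_continue`). `WarmCache` and its `:seed` option are hypothetical, and the "expensive" load is simulated by building a map.

```elixir
defmodule WarmCache do
  use GenServer

  # Client API (Pattern 1), with a guard for fail-fast validation (Pattern 3).
  def start_link(opts) when is_list(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  def get(server, key), do: GenServer.call(server, {:get, key})

  # Server callbacks (Pattern 2: @impl true on each callback).
  @impl true
  def init(opts) do
    # Return quickly and defer the load to handle_continue/2 (Pattern 4).
    {:ok, %{}, {:continue, {:load, Keyword.get(opts, :seed, [])}}}
  end

  @impl true
  def handle_continue({:load, seed}, _state) do
    # Stand-in for expensive work such as a database or network call.
    {:noreply, Map.new(seed)}
  end

  @impl true
  def handle_call({:get, key}, _from, state) do
    {:reply, Map.get(state, key), state}
  end
end
```

Because the continue instruction is handled before any message in the mailbox, a `get/2` call issued right after `start_link/1` already sees the loaded state.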
<!-- PATTERN_COMPLETE -->
+29
@@ -2,6 +2,21 @@
Patterns extracted from the Elixir standard library source code.
## Contents
1. [Context-Aware Macros (__CALLER__.context)](#1-context-aware-macros-__caller__context)
2. [defguard — Macro for Guard-Safe Expressions](#2-defguard--macro-for-guard-safe-expressions)
3. [quote + unquote for Code Generation](#3-quote--unquote-for-code-generation)
4. [var! for Breaking Hygiene](#4-var-for-breaking-hygiene)
5. [Macro Expanding with Macro.expand](#5-macro-expanding-with-macroexpand)
6. [assert_no_match_or_guard_scope Pattern](#6-assert_no_match_or_guard_scope-pattern)
7. [Protocol Definition as a Macro (defprotocol)](#7-protocol-definition-as-a-macro-defprotocol)
8. [@fallback_to_any in Protocols](#8-fallback_to_any-in-protocols)
9. [use/2 as Macro Injection Point](#9-use2-as-macro-injection-point)
10. [Sigil Macros (Pattern for DSL Literals)](#10-sigil-macros-pattern-for-dsl-literals)
11. [Pipe Operator as a Macro](#11-pipe-operator-as-a-macro)
12. [Macro.generate_unique_arguments for Hygiene](#12-macrogenerate_unique_arguments-for-hygiene)
---
## 1. Context-Aware Macros (__CALLER__.context)
@@ -1105,4 +1120,18 @@ end
**Why:** Variables created in `quote` are already hygienic by default — they can't clash with caller variables. `generate_unique_arguments` is needed when you're generating *multiple* variables dynamically (e.g., function parameters for a generated clause) where you need distinct names that also interoperate correctly.
## Decision Tree
- If a macro must behave differently in guards vs normal code → check `__CALLER__.context` (Pattern 1)
- If you need a reusable, compile-time-validated guard expression → use `defguard` (Pattern 2)
- If a macro argument might have side effects or be expensive → use `quote bind_quoted:` to evaluate once (Pattern 3)
- If a macro must reference a variable in the caller's scope → use `var!` sparingly (Pattern 4)
- If the macro receives input that could be an alias or module attribute → expand with `Macro.expand` before branching (Pattern 5)
- If your macro defines module-level constructs and should never appear in guards → assert context at the top (Pattern 6)
- If you need open-ended type dispatch that external code can extend → use `defprotocol` (Pattern 7)
- If a protocol should handle any value rather than raising on unknown types → use `@fallback_to_any true` (Pattern 8)
- If a module needs injected behaviours, attributes, or compile hooks → use the `use/2` + `__using__/1` pattern (Pattern 9)
- If you have compile-time-known literals that benefit from validation → define a sigil macro (Pattern 10)
- If you need a zero-cost syntactic transformation (argument rewriting) → implement as a macro like `|>` (Pattern 11)
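The `defguard` branch (Pattern 2) is the easiest to show in isolation. The guard and module names below are hypothetical; the sketch only assumes that the guard is defined in a module compiled before the one that imports it.

```elixir
defmodule Guards do
  # A reusable guard; the compiler rejects expressions that are not guard-safe.
  defguard is_percentage(value) when is_number(value) and value >= 0 and value <= 100
end

defmodule Discount do
  import Guards

  # The custom guard is usable in a `when` clause like any built-in guard.
  def apply_discount(price, pct) when is_number(price) and is_percentage(pct) do
    price * (100 - pct) / 100
  end

  def apply_discount(_price, _pct), do: {:error, :invalid_percentage}
end
```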
<!-- PATTERN_COMPLETE -->
+22
@@ -2,6 +2,18 @@
How modules are structured, named, and organized in Elixir core and Phoenix.
## Contents
1. [One Module per Concept, Nested for Sub-Concepts](#1-one-module-per-concept-nested-for-sub-concepts)
2. [Public API at the Top, Private Functions at the Bottom](#2-public-api-at-the-top-private-functions-at-the-bottom)
3. [`@moduledoc false` for Internal Modules](#3-moduledoc-false-for-internal-modules)
4. [Struct Definition Conventions](#4-struct-definition-conventions)
5. [Selective Imports in `__using__`](#5-selective-imports-in-__using__)
6. [Alias at Module Scope for Readability](#6-alias-at-module-scope-for-readability)
7. [Boolean-Suffixed Fields in Structs](#7-boolean-suffixed-fields-in-structs)
---
## 1. One Module per Concept, Nested for Sub-Concepts
**Source:** `lib/elixir/lib/` directory structure
@@ -590,4 +602,14 @@ defstruct [:user, :admin?, :count]
**Why:** The `?` suffix should only mark genuine booleans. Using it on non-boolean fields creates confusion about the field's type and breaks the convention's usefulness as a type signal.
## Decision Tree
- If your module has grown beyond 300 lines with distinct sub-responsibilities → [One Module per Concept, Nested for Sub-Concepts](#1-one-module-per-concept-nested-for-sub-concepts)
- If you need to decide function ordering within a module → [Public API at the Top, Private Functions at the Bottom](#2-public-api-at-the-top-private-functions-at-the-bottom)
- If a module exists purely for internal code organization and should not appear in docs → [`@moduledoc false` for Internal Modules](#3-moduledoc-false-for-internal-modules)
- If you need to define a struct and decide which fields are mandatory → [Struct Definition Conventions](#4-struct-definition-conventions)
- If your `use` macro needs to set up the caller's namespace with specific functions → [Selective Imports in `__using__`](#5-selective-imports-in-__using__)
- If multiple modules from the same parent namespace are used repeatedly → [Alias at Module Scope for Readability](#6-alias-at-module-scope-for-readability)
- If a struct field stores a boolean value and you want self-documenting naming → [Boolean-Suffixed Fields in Structs](#7-boolean-suffixed-fields-in-structs)
<!-- PATTERN_COMPLETE -->
+712
@@ -0,0 +1,712 @@
# Ecto.Multi Patterns
Patterns extracted from Ecto's `Ecto.Multi` source code.
## Contents
1. [`Multi.new() |> Multi.insert/update/delete` — Named Operation Pipeline](#1-multinew--multiinsertupdatedelete--named-operation-pipeline)
2. [`Multi.run/3` — Arbitrary Code in a Transaction](#2-multirun3--arbitrary-code-in-a-transaction)
3. [Dependent Operations with Function Variants](#3-dependent-operations-with-function-variants)
4. [`Multi.merge/2` — Dynamic Transaction Composition](#4-multimerge2--dynamic-transaction-composition)
5. [`Multi.append/2` / `Multi.prepend/2` — Static Multi Composition](#5-multiappend2--multiprepend2--static-multi-composition)
6. [Tuple Keys — Dynamic Collections of Operations](#6-tuple-keys--dynamic-collections-of-operations)
7. [`Multi.to_list/1` — Testing Without a Database](#7-multito_list1--testing-without-a-database)
---
## 1. `Multi.new() |> Multi.insert/update/delete` — Named Operation Pipeline
**Source:** [lib/ecto/multi.ex#L58](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L58)
```elixir
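defmodule PasswordManager do
  alias Ecto.Multi

  def reset(account, params) do
    Multi.new()
    |> Multi.update(:account, Account.password_reset_changeset(account, params))
    |> Multi.insert(:log, Log.password_reset_changeset(account, params))
    |> Multi.delete_all(:sessions, Ecto.assoc(account, :sessions))
  end
end

# Execute:
case Repo.transaction(PasswordManager.reset(account, params)) do
  # Success: each named step is a key in the results map
  {:ok, %{account: account, log: _log}} -> {:ok, account}
  # The :account step failed; the transaction was rolled back
  {:error, :account, changeset, _changes_so_far} -> {:error, changeset}
  # Any other named step failed
  {:error, _step, value, _changes_so_far} -> {:error, value}
end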
```
**Why:** Each operation is named. On success, `Repo.transaction` returns `{:ok, results_map}` where each key is the name given to that operation. On failure, it returns `{:error, failed_name, failed_value, changes_so_far}`, making it immediately clear which step aborted the transaction and why. This is more precise than a bare transaction function where you'd have to inspect the return value to guess which step failed.
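The two result shapes can be handled with plain function clauses. The sketch below does not call Ecto at all: `PasswordResetResult` is a hypothetical module, and the tuples passed to it are assumed to have the shapes described above.

```elixir
defmodule PasswordResetResult do
  # Success: pick the value of the named step the caller cares about.
  def describe({:ok, %{account: account}}), do: {:ok, account}

  # Failure: the second element names the step that aborted the transaction,
  # and the fourth holds the results of the steps that ran before it.
  def describe({:error, failed_step, failed_value, changes_so_far}) do
    {:error,
     %{
       step: failed_step,
       reason: failed_value,
       completed: Map.keys(changes_so_far)
     }}
  end
end
```

Matching on the step name in the second element is what lets a caller map, for example, a failed `:account` step to a form error and a failed `:log` step to an internal error.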
**Anti-pattern:** Using an anonymous function with bare `case` statements inside a transaction, where failure attribution is implicit:
```elixir
# BAD — no way to know which operation failed from the return value alone
Repo.transaction(fn ->
  case Repo.update(Account.password_reset_changeset(account, params)) do
    {:ok, account} ->
      case Repo.insert(Log.password_reset_changeset(account, params)) do
        # Return the bare value; Repo.transaction/1 wraps it in {:ok, _}
        {:ok, log} -> %{account: account, log: log}
        {:error, changeset} -> Repo.rollback(changeset)
      end

    {:error, changeset} ->
      Repo.rollback(changeset)
  end
end)
```
### When to Use
**Triggers:**
- You have 2+ database operations that must all succeed or all roll back
- The set of operations is known at compile time (not dynamically generated)
- The caller needs to know which specific operation failed
**Example — before:**
```elixir
def create_user_with_profile(params) do
  Repo.transaction(fn ->
    case Repo.insert(User.changeset(params)) do
      {:ok, user} ->
        case Repo.insert(Profile.changeset(user, params)) do
          # Return the bare value; Repo.transaction/1 wraps it in {:ok, _}
          {:ok, profile} -> {user, profile}
          {:error, cs} -> Repo.rollback(cs)
        end

      {:error, cs} ->
        Repo.rollback(cs)
    end
  end)
end
```
**Example — after:**
```elixir
def create_user_with_profile(params) do
  Multi.new()
  |> Multi.insert(:user, User.changeset(params))
  |> Multi.insert(:profile, fn %{user: user} ->
    Profile.changeset(user, params)
  end)
  |> Repo.transaction()
end

# Caller:
case create_user_with_profile(params) do
  {:ok, %{user: user, profile: profile}} -> {:ok, user}
  {:error, :user, changeset, _} -> {:error, changeset}
  {:error, :profile, changeset, _} -> {:error, changeset}
end
```
### When NOT to Use
**Don't use this when:**
- You have a single database operation (just call `Repo.insert/update/delete` directly)
- Operations are simple and sequential with no branching (a plain `Repo.transaction(fn -> ... end)` is more readable)
- The overhead of building a Multi struct is not justified by the number of operations
**Over-application example:**
```elixir
# Overkill for a single operation
Multi.new()
|> Multi.insert(:user, User.changeset(params))
|> Repo.transaction()
```
**Better alternative:**
```elixir
Repo.insert(User.changeset(params))
```
**Why:** `Ecto.Multi` introduces indirection. For simple cases, calling Repo functions directly or using `Repo.transaction(fn -> ... end)` is clearer. The named-pipeline form pays off when several operations are involved and failure attribution matters.
---
## 2. `Multi.run/3` — Arbitrary Code in a Transaction
**Source:** [lib/ecto/multi.ex#L39](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L39)
```elixir
Multi.new()
|> Multi.insert(:user, user_changeset)
|> Multi.run(:welcome_email, fn _repo, %{user: user} ->
  case Mailer.send_welcome(user) do
    :ok -> {:ok, :sent}
    {:error, reason} -> {:error, reason}
  end
end)
```
**Why:** `Multi.run` is the escape hatch for operations that don't fit the standard insert/update/delete API. The callback receives `(repo, changes_so_far)` — the repo argument means you can issue raw queries using the same transaction connection. Returning `{:error, value}` from the function aborts the whole transaction and surfaces the value in the `{:error, :welcome_email, value, _}` result. Returning `{:ok, value}` stores the value under `:welcome_email` in the success map.
**Anti-pattern:** Embedding operations in `Multi.run` that don't need transaction context or that don't return the required `{:ok, value}` / `{:error, value}` shape:
```elixir
# BAD — run used just to transform data, no transaction context needed
Multi.new()
|> Multi.insert(:user, user_changeset)
|> Multi.run(:formatted_name, fn _repo, %{user: user} ->
  # This has no side effects and doesn't need a transaction
  {:ok, String.upcase(user.name)}
end)
```
### When to Use
**Triggers:**
- You need to perform an operation that isn't a standard Ecto schema action (e.g., calling an external service, running a raw query, computing a value that might fail)
- The operation must participate in the transaction rollback on failure
- The result of the operation is needed by a subsequent step in the Multi
**Example — before:**
```elixir
def register_user(params) do
  Repo.transaction(fn ->
    # An invalid changeset raises MatchError here instead of returning an error
    {:ok, user} = Repo.insert(User.changeset(params))

    # Provisioning failure rolls the insert back, but the caller only sees
    # {:error, reason} with no indication of which step failed
    case ExternalService.provision(user.id) do
      {:ok, token} -> %{user: user, token: token}
      {:error, reason} -> Repo.rollback(reason)
    end
  end)
end
```
**Example — after:**
```elixir
def register_user(params) do
  Multi.new()
  |> Multi.insert(:user, User.changeset(params))
  |> Multi.run(:provision, fn _repo, %{user: user} ->
    ExternalService.provision(user.id)
  end)
  |> Repo.transaction()
end
```
### When NOT to Use
**Don't use this when:**
- The operation is a standard Ecto schema action — use `Multi.insert`, `Multi.update`, `Multi.delete`, or `Multi.delete_all` instead
- The callback only transforms data without any side effects or failure modes
- You need to insert/update using a changeset that depends on prior results — use the function variant of `Multi.insert/update` (see Pattern 3) instead
**Over-application example:**
```elixir
# Multi.run adds overhead when Multi.insert's function variant covers this
Multi.new()
|> Multi.insert(:post, post_changeset)
|> Multi.run(:comment, fn _repo, %{post: post} ->
  Repo.insert(Ecto.build_assoc(post, :comments, body: "first"))
end)
```
**Better alternative:**
```elixir
Multi.new()
|> Multi.insert(:post, post_changeset)
|> Multi.insert(:comment, fn %{post: post} ->
  Ecto.build_assoc(post, :comments, body: "first")
end)
```
**Why:** `Multi.run` requires you to return a tagged `{:ok, value}` / `{:error, value}` tuple. When the operation is a plain Ecto changeset, the function variants of `Multi.insert/update/delete` accept a function that returns a changeset and handle the wrapping internally. Reserve `Multi.run` for operations with their own failure semantics.
---
## 3. Dependent Operations with Function Variants
**Source:** [lib/ecto/multi.ex#L298](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L298)
```elixir
Multi.new()
|> Multi.insert(:post, %Post{title: "first"})
|> Multi.insert(:comment, fn %{post: post} ->
  Ecto.build_assoc(post, :comments, body: "first comment")
end)
```
**Why:** Each of the standard Multi operations (`insert`, `update`, `delete`, `delete_all`, `insert_or_update`) accepts either a static value (a changeset or struct, or a queryable for `delete_all`) or a one-argument function `fn changes -> value end` that returns one. When an operation depends on the result of a previous step, pass the function form. The function receives the accumulated changes map up to that point, giving you access to all prior results. This keeps operations in a single pipeline without introducing a separate `Multi.run` call.
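To make the "accumulated changes" idea concrete, here is a toy model written in plain Elixir. It is not Ecto's implementation and does not touch a database: `ToyMulti` is a hypothetical module in which each step is either a static `{:ok, value}` / `{:error, value}` tuple or a one-argument function returning such a tuple, whereas real Multi steps take changesets, structs, or queryables.

```elixir
defmodule ToyMulti do
  # Runs named steps in order, threading the results of earlier steps
  # into later ones and halting at the first error.
  def run(steps) when is_list(steps) do
    Enum.reduce_while(steps, {:ok, %{}}, fn {name, step}, {:ok, changes} ->
      # Function steps receive everything accumulated so far.
      result = if is_function(step, 1), do: step.(changes), else: step

      case result do
        {:ok, value} -> {:cont, {:ok, Map.put(changes, name, value)}}
        {:error, value} -> {:halt, {:error, name, value, changes}}
      end
    end)
  end
end
```

A dependent step such as `comment: fn %{post: post} -> {:ok, {:comment_for, post}} end` can read the `:post` result because it only runs after `:post` has been stored in the map, which is the same ordering guarantee the function variants rely on.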
**Anti-pattern:** Using `Multi.run` when a function variant of `Multi.insert/update` suffices:
```elixir
# BAD — Multi.run is more verbose and requires explicit {:ok, value} wrapping
Multi.new()
|> Multi.insert(:post, %Post{title: "first"})
|> Multi.run(:comment, fn _repo, %{post: post} ->
  Repo.insert(Ecto.build_assoc(post, :comments, body: "first comment"))
end)
```
### When to Use
**Triggers:**
- An operation's changeset or struct must be constructed using the result of a prior operation
- You need an association ID, a foreign key, or any field that isn't available until a prior step runs
- You want to keep the pipeline as `Multi.insert/update/delete` calls without dropping into `Multi.run`
**Example — before:**
```elixir
def create_order_with_items(cart, user) do
Multi.new()
|> Multi.insert(:order, Order.changeset(cart, user))
|> Multi.run(:items, fn _repo, %{order: order} ->
items = Enum.map(cart.items, &Repo.insert(OrderItem.changeset(&1, order)))
errors = Enum.filter(items, &match?({:error, _}, &1))
if errors == [], do: {:ok, items}, else: {:error, errors}
end)
end
```
**Example — after:**
```elixir
def create_order_with_items(cart, user) do
cart.items
|> Enum.with_index()
|> Enum.reduce(Multi.new() |> Multi.insert(:order, Order.changeset(cart, user)),
fn {item, idx}, multi ->
Multi.insert(multi, {:item, idx}, fn %{order: order} ->
OrderItem.changeset(item, order)
end)
end)
end
```
### When NOT to Use
**Don't use this when:**
- The changeset does not depend on any prior operation — pass it directly as a static value
- The dependent operation has complex failure logic that requires returning `{:ok, value}` / `{:error, value}` — use `Multi.run` in that case
- The function would need to perform multiple Repo calls — the function variant only accepts a changeset/struct return, not arbitrary DB operations
**Over-application example:**
```elixir
# Function variant used when the changeset doesn't depend on prior results
Multi.new()
|> Multi.insert(:user, fn _changes ->
User.changeset(%User{}, params) # No dependency on changes — pass directly
end)
```
**Better alternative:**
```elixir
Multi.new()
|> Multi.insert(:user, User.changeset(%User{}, params))
```
**Why:** The function variant defers changeset construction to transaction execution time, so `Multi.to_list/1` shows only an opaque function for that operation, never the changeset itself. For static changesets, passing the value directly keeps the Multi inspectable and testable without executing the transaction.
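
The difference in inspectability can be checked without a database. The following is a minimal sketch, assuming Ecto 3.x is pulled in through `Mix.install` (the version requirement is illustrative) and using a schemaless changeset with a hypothetical `email` field so no schema module is needed. The exact shape of the deferred entry is an internal detail, so the sketch only checks that it is not an `:insert` entry carrying a changeset.

```elixir
# Assumption: Ecto 3.x; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

alias Ecto.Multi

# Schemaless changeset with a hypothetical field, so no schema or Repo is needed.
changeset =
  Ecto.Changeset.cast({%{}, %{email: :string}}, %{"email" => "a@example.com"}, [:email])

static = Multi.new() |> Multi.insert(:user, changeset)
deferred = Multi.new() |> Multi.insert(:user, fn _changes -> changeset end)

# The static form exposes the changeset before anything runs.
[{:user, {:insert, static_cs, _opts}}] = Multi.to_list(static)
IO.inspect(static_cs.valid?, label: "static changeset valid?")

# The function form does not: its entry is not an {:insert, changeset, opts} tuple.
[{:user, deferred_op}] = Multi.to_list(deferred)
inspectable? = match?({:insert, _changeset, _opts}, deferred_op)
IO.inspect(inspectable?, label: "deferred changeset inspectable?")
```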
---
## 4. `Multi.merge/2` — Dynamic Transaction Composition
**Source:** [lib/ecto/multi.ex#L239](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L239)
```elixir
Multi.new()
|> Multi.insert(:post, post_changeset)
|> Multi.merge(fn %{post: post} ->
if post.requires_approval do
Multi.new()
|> Multi.insert(:approval_request, ApprovalRequest.changeset(post))
else
Multi.new()
end
end)
```
**Why:** `Multi.merge/2` accepts a function that receives the changes accumulated so far and must return another `Ecto.Multi`. The returned Multi's operations are appended to the pipeline at execution time. This is the correct tool when the *set of operations to add* — not just the changesets — depends on prior results. A function variant of `Multi.insert` can swap in a different changeset; `merge` can add or remove entire operations.
**Anti-pattern:** Using `merge` when `append` or a function variant of `insert/update` would suffice — `merge` defers the entire sub-pipeline to runtime, making it harder to inspect:
```elixir
# BAD — merge used just to pass a different changeset, not to change which operations run
Multi.new()
|> Multi.insert(:post, post_changeset)
|> Multi.merge(fn %{post: post} ->
Multi.new()
|> Multi.update(:post_meta, PostMeta.changeset(post))
end)
```
### When to Use
**Triggers:**
- Whether certain operations should be included at all depends on runtime data
- A prior step's result determines which of several alternative sub-pipelines to execute
- You have a reusable function that takes results and returns a configured `Ecto.Multi`
**Example — before:**
```elixir
def publish_article(article, user) do
  multi =
    Multi.new()
    |> Multi.update(:article, Article.publish_changeset(article))
  # Tacking on conditional operations awkwardly outside the pipeline
  multi =
    if article.notify_subscribers do
      Multi.run(multi, :notifications, fn _repo, %{article: article} ->
        send_notifications(article, user)
      end)
    else
      multi
    end
  multi
end
```
**Example — after:**
```elixir
def publish_article(article, user) do
Multi.new()
|> Multi.update(:article, Article.publish_changeset(article))
|> Multi.merge(fn %{article: published} ->
if published.notify_subscribers do
Multi.new()
|> Multi.run(:notifications, fn _repo, _changes ->
send_notifications(published, user)
end)
else
Multi.new()
end
end)
end
```
### When NOT to Use
**Don't use this when:**
- The set of operations is fixed — use `Multi.append/2` to combine two pre-built Multis instead
- Only the changeset values vary between operations — use the function variant of `Multi.insert/update`
- The merge function always returns the same Multi regardless of `changes` (just use `Multi.append`)
**Over-application example:**
```elixir
# merge used when append would do — the sub-multi never varies
Multi.new()
|> Multi.insert(:user, user_changeset)
|> Multi.merge(fn _changes ->
Multi.new() |> Multi.insert(:audit_log, AuditLog.create())
end)
```
**Better alternative:**
```elixir
audit_multi = Multi.new() |> Multi.insert(:audit_log, AuditLog.create())
Multi.new()
|> Multi.insert(:user, user_changeset)
|> Multi.append(audit_multi)
```
**Why:** `Multi.merge` with a constant-returning function is just `Multi.append` with extra indirection. Use `merge` only when the function body actually branches on the `changes` argument.
---
## 5. `Multi.append/2` / `Multi.prepend/2` — Static Multi Composition
**Source:** [lib/ecto/multi.ex#L183](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L183)
```elixir
def audit_multi(entity, user) do
Multi.new()
|> Multi.insert(:audit_log, AuditLog.changeset(entity, user))
end
def create_post(params, user) do
post_multi = Multi.new() |> Multi.insert(:post, Post.changeset(params))
Multi.append(post_multi, audit_multi(:post, user))
end
```
**Why:** `append` and `prepend` combine two fully-built `Ecto.Multi` structs into one. All operation names across both Multis must be unique — a conflict raises at composition time, not at execution time. This makes transaction fragments reusable: `audit_multi/2` can be appended to any business operation without duplicating the audit logic. `prepend` puts the second Multi's operations first; `append` adds them at the end.
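
The ordering and the composition-time name check can be observed without a database. This is a minimal sketch, assuming Ecto 3.x via `Mix.install` (illustrative version requirement); it uses `Multi.run` with a hypothetical no-op callback as a stand-in for real inserts, since `Multi.run` needs no schema. Nothing is executed: the sketch only reads operation names back with `Multi.to_list/1`.

```elixir
# Assumption: Ecto 3.x; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

alias Ecto.Multi

# Hypothetical no-op step; it is never executed in this sketch.
ok = fn _repo, _changes -> {:ok, nil} end

business = Multi.new() |> Multi.run(:post, ok)
audit = Multi.new() |> Multi.run(:audit_log, ok)

# Helper that lists operation names in execution order.
names = fn multi -> multi |> Multi.to_list() |> Enum.map(&elem(&1, 0)) end

appended = names.(Multi.append(business, audit))
prepended = names.(Multi.prepend(business, audit))
IO.inspect(appended, label: "append")
IO.inspect(prepended, label: "prepend")

# A shared operation name is rejected when the Multis are combined,
# before any transaction is attempted.
conflict =
  try do
    Multi.append(business, Multi.new() |> Multi.run(:post, ok))
    :no_error
  rescue
    error -> {:raised, error.__struct__}
  end

IO.inspect(conflict, label: "duplicate name")
```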
**Anti-pattern:** Re-defining the same operations in every Multi instead of extracting reusable fragments:
```elixir
# BAD — audit logic duplicated across every operation
def create_post(params, user) do
Multi.new()
|> Multi.insert(:post, Post.changeset(params))
|> Multi.insert(:audit_log, AuditLog.changeset(:post, user))
end
def update_post(post, params, user) do
Multi.new()
|> Multi.update(:post, Post.changeset(post, params))
|> Multi.insert(:audit_log, AuditLog.changeset(:post, user)) # copy-paste
end
```
### When to Use
**Triggers:**
- You have a recurring group of operations (auditing, notifications, cleanup) that should attach to multiple different transactions
- Two independently-defined Multis must run in the same transaction
- You want to compose transaction fragments from different modules without either module knowing about the other's internals
**Example — before:**
```elixir
def transfer_funds(from, to, amount) do
Multi.new()
|> Multi.update(:debit, Account.debit_changeset(from, amount))
|> Multi.update(:credit, Account.credit_changeset(to, amount))
|> Multi.insert(:ledger_entry, LedgerEntry.changeset(from, to, amount))
|> Multi.insert(:audit_log, AuditLog.changeset(:transfer, %{from: from, to: to}))
end
```
**Example — after:**
```elixir
def audit_multi(action, context) do
Multi.new()
|> Multi.insert(:audit_log, AuditLog.changeset(action, context))
end
def transfer_funds(from, to, amount) do
transfer =
Multi.new()
|> Multi.update(:debit, Account.debit_changeset(from, amount))
|> Multi.update(:credit, Account.credit_changeset(to, amount))
|> Multi.insert(:ledger_entry, LedgerEntry.changeset(from, to, amount))
Multi.append(transfer, audit_multi(:transfer, %{from: from, to: to}))
end
```
### When NOT to Use
**Don't use this when:**
- Which operations to add depends on the results of earlier operations — use `Multi.merge/2` instead
- The two Multis share operation names — this will raise at composition time and requires renaming
- The combination is only used once — just build a single Multi inline
**Over-application example:**
```elixir
# Splitting a single logical operation into two Multis for no benefit
user_part = Multi.new() |> Multi.insert(:user, user_cs)
profile_part = Multi.new() |> Multi.insert(:profile, profile_cs)
Multi.append(user_part, profile_part)
```
**Better alternative:**
```elixir
Multi.new()
|> Multi.insert(:user, user_cs)
|> Multi.insert(:profile, profile_cs)
```
**Why:** `append` and `prepend` are tools for *reuse*. When operations are only ever combined in one place, defining them as a single pipeline is simpler. Extract fragments only when the same group of operations genuinely recurs across different transactions.
---
## 6. Tuple Keys — Dynamic Collections of Operations
**Source:** [lib/ecto/multi.ex#L109](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L109)
```elixir
multi =
  Enum.reduce(accounts, Multi.new(), fn account, multi ->
    Multi.update(
      multi,
      {:account, account.id},
      Account.password_reset_changeset(account, params)
    )
  end)
# Error pattern-matching:
case Repo.transaction(multi) do
  {:ok, results} -> Map.keys(results) # [{:account, 1}, {:account, 2}, ...]
  {:error, {:account, id}, _changeset, _} -> "account #{id} failed"
end
```
**Why:** Multi operation names can be any term, not just atoms. Tuple keys like `{:account, account.id}` give each operation in a dynamically-generated collection a unique, structured name. On failure, the name in `{:error, name, value, _}` tells you exactly which item failed — without this, all accounts would compete for the same atom name, which would raise a duplicate name error.
**Anti-pattern:** Using a bare atom for all iterations, which raises at composition time:
```elixir
# BAD: every operation is named :account, so adding the second one raises
# (the name is already a member of the Multi)
Enum.reduce(accounts, Multi.new(), fn account, multi ->
  Multi.update(multi, :account, Account.password_reset_changeset(account, params))
end)
```
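
Both behaviours, unique tuple names and the failure on a repeated atom, can be checked without a database. The sketch below assumes Ecto 3.x via `Mix.install` (illustrative version requirement) and substitutes `Multi.run` with a hypothetical no-op callback for the `Multi.update` calls above, so no `Account` schema is required. The account ids are made up.

```elixir
# Assumption: Ecto 3.x; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

alias Ecto.Multi

# Hypothetical ids and a no-op step standing in for the real update.
account_ids = [1, 2, 3]
noop = fn _repo, _changes -> {:ok, :reset} end

# Tuple keys: one uniquely named operation per account.
multi =
  Enum.reduce(account_ids, Multi.new(), fn id, multi ->
    Multi.run(multi, {:account, id}, noop)
  end)

names = multi |> Multi.to_list() |> Enum.map(&elem(&1, 0))
IO.inspect(names, label: "operation names")

# Reusing one atom fails as soon as the second operation is added.
duplicate =
  try do
    Enum.reduce(account_ids, Multi.new(), fn _id, multi ->
      Multi.run(multi, :account, noop)
    end)

    :no_error
  rescue
    _error -> :raised
  end

IO.inspect(duplicate, label: "bare atom")
```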
### When to Use
**Triggers:**
- You're building a Multi by reducing over a collection and need each item to be a separate named operation
- You need to identify which specific item in a collection caused the transaction to fail
- The result map keys must be inspectable to determine which items succeeded
**Example — before:**
```elixir
# Forced to use Multi.run and handle errors manually
def reset_passwords(accounts, params) do
Multi.new()
|> Multi.run(:all_accounts, fn _repo, _changes ->
results = Enum.map(accounts, fn account ->
Repo.update(Account.password_reset_changeset(account, params))
end)
errors = Enum.filter(results, &match?({:error, _}, &1))
if errors == [], do: {:ok, results}, else: {:error, hd(errors)}
end)
end
```
**Example — after:**
```elixir
def reset_passwords(accounts, params) do
Enum.reduce(accounts, Multi.new(), fn account, multi ->
Multi.update(
multi,
{:account, account.id},
Account.password_reset_changeset(account, params)
)
end)
end
# Caller knows exactly which account failed:
case Repo.transaction(reset_passwords(accounts, params)) do
{:ok, _} -> :ok
{:error, {:account, id}, changeset, _} ->
Logger.error("Failed to reset account #{id}: #{inspect(changeset.errors)}")
end
```
### When NOT to Use
**Don't use this when:**
- The collection has a fixed, known size — just name each operation with a distinct atom instead
- You don't need per-item failure attribution — a `Multi.run` that processes all items may be simpler
- All items should be processed regardless of individual failures — use a `Multi.run` that collects partial results
**Over-application example:**
```elixir
# Tuple keys on a fixed two-item "collection"
[:primary, :secondary]
|> Enum.reduce(Multi.new(), fn role, multi ->
Multi.insert(multi, {:membership, role}, Membership.changeset(user, role))
end)
```
**Better alternative:**
```elixir
Multi.new()
|> Multi.insert(:primary_membership, Membership.changeset(user, :primary))
|> Multi.insert(:secondary_membership, Membership.changeset(user, :secondary))
```
**Why:** Tuple keys shine for truly dynamic collections where the size is not known at compile time. For small, fixed sets, distinct atom names are more readable and produce clearer error messages.
---
## 7. `Multi.to_list/1` — Testing Without a Database
**Source:** [lib/ecto/multi.ex#L88](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/multi.ex#L88)
```elixir
test "dry run password reset" do
account = %Account{password: "letmein"}
multi = PasswordManager.reset(account, params)
assert [
{:account, {:update, account_changeset, []}},
{:log, {:insert, log_changeset, []}},
{:sessions, {:delete_all, query, []}}
] = Ecto.Multi.to_list(multi)
assert account_changeset.valid?
end
```
**Why:** `Multi.to_list/1` returns the list of operations in the Multi as `{name, {operation_type, changeset_or_query, opts}}` tuples. This allows you to assert on changeset validity, query structure, and operation order without running the transaction against a database. Tests that previously required database setup and teardown can become pure unit tests — faster and fully isolated.
**Anti-pattern:** Always running the transaction in tests even when only changeset validity is being checked:
```elixir
# BAD — hits the database just to validate a changeset
test "password reset changeset is valid" do
account = %Account{password: "letmein"}
{:ok, %{account: updated}} =
PasswordManager.reset(account, valid_params)
|> Repo.transaction()
# The changeset validity was the whole point, not the DB state
assert updated.password != account.password
end
```
### When to Use
**Triggers:**
- You want to unit-test changeset validity or query construction without a database connection
- Your test environment cannot or should not touch the database for a given test
- You want to assert on operation names, order, or count in a Multi pipeline
**Example — before:**
```elixir
# Tests require database + ExUnit.DataCase + fixtures
@tag :integration
test "creates user with profile" do
{:ok, %{user: user, profile: profile}} =
UserRegistration.multi(valid_params())
|> Repo.transaction()
assert user.email == "test@example.com"
assert profile.user_id == user.id
end
```
**Example — after:**
```elixir
# Pure unit test — no database, no fixtures, no DataCase
test "multi contains user insert with valid changeset" do
multi = UserRegistration.multi(valid_params())
assert [
{:user, {:insert, user_changeset, []}},
{:profile, _}
] = Ecto.Multi.to_list(multi)
assert user_changeset.valid?
assert Ecto.Changeset.get_change(user_changeset, :email) == "test@example.com"
end
```
### When NOT to Use
**Don't use this when:**
- Operations use the function variant (e.g., `Multi.insert(:comment, fn %{post: post} -> ... end)`): `to_list` shows only an opaque function for them, and the changeset exists only while the transaction runs
- You need to test that the operations actually succeed against a real database (constraint checks, triggers, concurrent writes)
- The Multi uses `Multi.run` with side effects — `to_list` cannot execute those callbacks
**Over-application example:**
```elixir
# to_list can't help here: the changeset is hidden inside a function
multi =
  Multi.new()
  |> Multi.insert(:post, post_changeset)
  |> Multi.insert(:comment, fn %{post: post} ->
    Comment.changeset(post, params) # This fn is opaque to to_list
  end)
# This match fails: the :comment entry wraps a function and
# carries no changeset to bind to cs
[{:comment, {:insert, cs, _}}] = Enum.drop(Ecto.Multi.to_list(multi), 1)
assert cs.valid? # never reached
```
**Better alternative:**
```elixir
# Test the function-variant changesets in isolation
test "comment changeset is valid given a post" do
post = %Post{id: 1, title: "first"}
changeset = Comment.changeset(post, valid_comment_params())
assert changeset.valid?
end
```
**Why:** `Multi.to_list/1` reflects the state of the pipeline at build time. Deferred values (function variants, `Multi.run` callbacks, `Multi.merge` functions) are not evaluated until `Repo.transaction` runs them. Test those deferred pieces independently rather than trying to inspect them through `to_list`.
---
## Decision Tree
- If operations are static and ordered → `Multi.new() |> Multi.insert/update/delete`
- If an operation depends on a prior operation's result → use function variant `fn changes -> changeset end`
- If the set of operations to include depends on the results of earlier operations → `Multi.merge/2` with an anonymous function
- If you have reusable Multi fragments to combine → `Multi.append/2` or `Multi.prepend/2`
- If you're updating a dynamic collection → tuple keys `{:operation, id}`
- If you want to validate changesets without hitting the DB → `Multi.to_list/1` in tests
- If operations are simple and static (no dynamic branching) → consider `Repo.transaction(fn -> ... end)` instead
<!-- PATTERN_COMPLETE -->
+42
@@ -2,6 +2,27 @@
Analysis of `lib/elixir/lib/supervisor.ex`, `lib/elixir/lib/dynamic_supervisor.ex`, `lib/elixir/lib/task.ex`, `lib/elixir/lib/task/supervisor.ex`, `lib/elixir/lib/process.ex`, and `lib/elixir/lib/registry.ex`.
## Contents
1. [Pattern 1: Static vs Dynamic Supervision — Choose the Right Tool](#pattern-1-static-vs-dynamic-supervision--choose-the-right-tool)
2. [Pattern 2: PartitionSupervisor for Scalability](#pattern-2-partitionsupervisor-for-scalability)
3. [Pattern 3: Supervision Strategies — Choosing the Right Restart Behavior](#pattern-3-supervision-strategies--choosing-the-right-restart-behavior)
4. [Pattern 4: Restart Intensity (`max_restarts` / `max_seconds`)](#pattern-4-restart-intensity-max_restarts--max_seconds)
5. [Pattern 5: Restart Values — `:permanent` vs `:transient` vs `:temporary`](#pattern-5-restart-values--permanent-vs-transient-vs-temporary)
6. [Pattern 6: Automatic Shutdown for Pipeline Supervisors](#pattern-6-automatic-shutdown-for-pipeline-supervisors)
7. [Pattern 7: Task.async/await for Concurrent Value Computation](#pattern-7-taskasyncawait-for-concurrent-value-computation)
8. [Pattern 8: Task.Supervisor.async_nolink for Fault-Tolerant Task Execution](#pattern-8-tasksupervisorasync_nolink-for-fault-tolerant-task-execution)
9. [Pattern 9: Task Supervisor as DynamicSupervisor Specialization](#pattern-9-task-supervisor-as-dynamicsupervisor-specialization)
10. [Pattern 10: Registry for Dynamic Process Naming and PubSub](#pattern-10-registry-for-dynamic-process-naming-and-pubsub)
11. [Pattern 11: Shutdown Semantics — Graceful Termination](#pattern-11-shutdown-semantics--graceful-termination)
12. [Pattern 12: DynamicSupervisor Internal State — Struct with Restart Tracking](#pattern-12-dynamicsupervisor-internal-state--struct-with-restart-tracking)
13. [Pattern 13: Restart Logic with Exponential Backoff via `:try_again`](#pattern-13-restart-logic-with-exponential-backoff-via-try_again)
14. [Pattern 14: `$ancestors` and `$callers` — Process Lineage Tracking](#pattern-14-ancestors-and-callers--process-lineage-tracking)
15. [Pattern 15: GenServer.reply/2 for Deferred Responses](#pattern-15-genserverreply2-for-deferred-responses)
16. [Pattern 16: Process.alias for Safe Request/Response](#pattern-16-processalias-for-safe-requestresponse)
17. [Pattern 17: Registry Partitioning Strategies](#pattern-17-registry-partitioning-strategies)
18. [Pattern 18: `init/1` Return Values — The Full Spectrum](#pattern-18-init1-return-values--the-full-spectrum)
---
## Pattern 1: Static vs Dynamic Supervision — Choose the Right Tool
@@ -1911,4 +1932,25 @@ end
**Why:** `:ignore` means "this child intentionally should not run right now." `{:stop, reason}` means "this child tried to start and failed." Conflating the two hides real failures from your supervision tree.
## Decision Tree
- If you have children known at compile time with ordering dependencies → [Pattern 1: Static vs Dynamic Supervision](#pattern-1-static-vs-dynamic-supervision--choose-the-right-tool)
- If a single DynamicSupervisor or Task.Supervisor is a bottleneck under high spawn load → [Pattern 2: PartitionSupervisor for Scalability](#pattern-2-partitionsupervisor-for-scalability)
- If you need to decide how a supervisor reacts when children share state or have dependencies → [Pattern 3: Supervision Strategies](#pattern-3-supervision-strategies--choosing-the-right-restart-behavior)
- If you want to tune how many restarts are tolerated before escalation → [Pattern 4: Restart Intensity](#pattern-4-restart-intensity-max_restarts--max_seconds)
- If different processes have different lifecycle expectations (one-shot vs permanent) → [Pattern 5: Restart Values](#pattern-5-restart-values--permanent-vs-transient-vs-temporary)
- If a supervisor should self-terminate when its children finish their work → [Pattern 6: Automatic Shutdown](#pattern-6-automatic-shutdown-for-pipeline-supervisors)
- If you need to compute values concurrently and the caller should crash on failure → [Pattern 7: Task.async/await](#pattern-7-taskasyncawait-for-concurrent-value-computation)
- If a GenServer needs to spawn work that might fail without taking down the server → [Pattern 8: Task.Supervisor.async_nolink](#pattern-8-tasksupervisorasync_nolink-for-fault-tolerant-task-execution)
- If you need supervised tasks with caller tracking, async_nolink, and streaming → [Pattern 9: Task Supervisor](#pattern-9-task-supervisor-as-dynamicsupervisor-specialization)
- If you need to look up processes by a dynamic key without atom leaks → [Pattern 10: Registry](#pattern-10-registry-for-dynamic-process-naming-and-pubsub)
- If processes hold external resources that need cleanup on shutdown → [Pattern 11: Shutdown Semantics](#pattern-11-shutdown-semantics--graceful-termination)
- If you are building a custom supervisor-like process and need efficient child tracking → [Pattern 12: DynamicSupervisor Internal State](#pattern-12-dynamicsupervisor-internal-state--struct-with-restart-tracking)
- If a child fails to start due to transient conditions and you want non-blocking retry → [Pattern 13: Restart Logic with Backoff](#pattern-13-restart-logic-with-exponential-backoff-via-try_again)
- If you need to trace which process initiated spawned work for debugging → [Pattern 14: Process Lineage Tracking](#pattern-14-ancestors-and-callers--process-lineage-tracking)
- If a GenServer needs to do async work before replying to a caller → [Pattern 15: GenServer.reply/2](#pattern-15-genserverreply2-for-deferred-responses)
- If you build a custom request/response protocol with timeouts and need to prevent late replies → [Pattern 16: Process.alias](#pattern-16-processalias-for-safe-requestresponse)
- If your Registry dispatch is slow because of the wrong partitioning strategy → [Pattern 17: Registry Partitioning](#pattern-17-registry-partitioning-strategies)
- If you need to communicate "don't start this child" or split init into fast/slow phases → [Pattern 18: init/1 Return Values](#pattern-18-init1-return-values--the-full-spectrum)
<!-- PATTERN_COMPLETE -->
+922
@@ -0,0 +1,922 @@
# Ecto Query Patterns
Patterns extracted from Ecto's query layer source code.
## Contents
1. [Named Query Functions — Composable Query Building](#1-named-query-functions--composable-query-building)
2. [Query Piping — Schema to Query Pipeline](#2-query-piping--schema-to-query-pipeline)
3. [Named Bindings — Position-Independent Composition](#3-named-bindings--position-independent-composition)
4. [`dynamic/2` — Runtime-Constructed Predicates](#4-dynamic2--runtime-constructed-predicates)
5. [`subquery/1` — Correlated Subqueries](#5-subquery1--correlated-subqueries)
6. [`exclude/2` — Strip Clauses for Reuse](#6-exclude2--strip-clauses-for-reuse)
7. [Bindingless Queries — Data-Driven Clauses](#7-bindingless-queries--data-driven-clauses)
8. [`select_merge/3` — Augmenting Selects Dynamically](#8-select_merge3--augmenting-selects-dynamically)
9. [`fragment/1` and `type/2` — Escape Hatches for DB-Specific Expressions](#9-fragment1-and-type2--escape-hatches-for-db-specific-expressions)
---
## 1. Named Query Functions — Composable Query Building
**Source:** [lib/ecto/query.ex#L1112](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L1112)
**What it does:** Define named functions that accept a query and return a refined query. The query itself is the accumulator; each function layers one concern.
```elixir
# From lib/ecto/query.ex lines 1112-1134
def paginate(query, page, size) do
from query,
limit: ^size,
offset: ^((page-1) * size)
end
def published(query) do
from p in query, where: not(is_nil(p.published_at))
end
```
These functions compose naturally at the call site:
```elixir
User |> active() |> published() |> paginate(1, 20)
```
**Why:** Each function encodes exactly one policy decision. The composed result is a single query that the database executes once. Because `from query` builds on the query it is given instead of starting over (successive `where` clauses are combined with `AND`), the caller chooses which policies to apply and in what order, and no function needs to know about the others.
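
That composition can be observed on the query struct itself, with no database involved. The following is a minimal sketch, assuming Ecto 3.x via `Mix.install` (illustrative version requirement); the `PostQueries` module name is hypothetical, and the string source `"posts"` stands in for a schema so no schema module is needed. The query is only built, never executed.

```elixir
# Assumption: Ecto 3.x; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule PostQueries do
  import Ecto.Query

  # Each function takes a queryable and returns a refined query.
  def published(query), do: from(p in query, where: not is_nil(p.published_at))
  def by_author(query, author_id), do: from(p in query, where: p.author_id == ^author_id)

  def paginate(query, page, size) do
    from(query, limit: ^size, offset: ^((page - 1) * size))
  end
end

# Compose three independent policies onto one query.
query =
  "posts"
  |> PostQueries.published()
  |> PostQueries.by_author(42)
  |> PostQueries.paginate(2, 20)

# Nothing has been sent to a database; `query` is just an Ecto.Query struct.
IO.inspect(length(query.wheres), label: "where clauses")
IO.inspect(query.limit != nil, label: "has limit")
```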
**Anti-pattern:** One monolithic query that mixes pagination, filtering, and ordering:
```elixir
# Hard to reuse parts independently
def list_published_users(page, size) do
from u in User,
where: u.active == true and not is_nil(u.published_at),
order_by: [desc: u.inserted_at],
limit: ^size,
offset: ^((page - 1) * size)
end
```
### When to Use
**Triggers:**
- The same filter, ordering, or limit appears in multiple query contexts
- You need to mix and match clauses — some queries paginate, some don't
- A policy (e.g. "only active records") should be enforced consistently without copy-pasting conditions
**Example — before:**
```elixir
def list_recent_posts(page, size) do
from p in Post,
where: not is_nil(p.published_at),
order_by: [desc: p.published_at],
limit: ^size,
offset: ^((page - 1) * size)
end
def count_published_posts do
from p in Post,
where: not is_nil(p.published_at),
select: count()
end
```
**Example — after:**
```elixir
def published(query), do: from p in query, where: not is_nil(p.published_at)
def by_newest(query), do: from p in query, order_by: [desc: p.published_at]
def paginate(query, page, size) do
from query, limit: ^size, offset: ^((page - 1) * size)
end
def list_recent_posts(page, size) do
Post |> published() |> by_newest() |> paginate(page, size)
end
def count_published_posts do
Post |> published() |> select([p], count())
end
```
### When NOT to Use
**Don't use this when:**
- The query is used exactly once and decomposing it adds names with no reuse value
- The clauses are tightly coupled and meaningless in isolation (e.g. a join whose `on` condition references a specific sibling join)
**Over-application example:**
```elixir
# Not worth extracting — used once, no meaningful reuse
def with_user_and_org_and_permissions(query) do
from [u, o, p] in query,
where: u.org_id == o.id and p.user_id == u.id and p.role == "admin"
end
```
**Better alternative:**
```elixir
from u in User,
join: o in Org, on: u.org_id == o.id,
join: p in Permission, on: p.user_id == u.id,
where: p.role == "admin"
```
**Why:** Extraction is worth it when the function has a name that communicates intent reusably. When a query is one-off and the extracted name just paraphrases the code, keep it inline.
---
## 2. Query Piping — Schema to Query Pipeline
**Source:** [lib/ecto/query.ex#L310](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L310)
**What it does:** Ecto implements the `Ecto.Queryable` protocol for schemas, strings, and query structs, so any of them can be the starting point of a pipeline. The pipe operator chains named query functions:
```elixir
# From the macro API docs (query.ex line 319-324)
"users"
|> where([u], u.age > 18)
|> select([u], u.name)
```
Starting from a schema module name is idiomatic:
```elixir
User
|> where([u], u.active == true)
|> order_by([u], u.name)
|> limit(10)
```
**Why:** The pipe operator makes query construction read left-to-right, mirroring how SQL clauses are mentally composed. The `Ecto.Queryable` protocol means `from Schema` and `Schema |> where(...)` are equivalent, so the choice of `from`/`|>` is stylistic — but pipe form scales better when each step is a named function.
**Anti-pattern:** Building one monolithic keyword query instead of small composable pipes:
```elixir
# Cannot reuse paginate, active, or order_by separately
from u in User,
where: u.active == true,
order_by: [asc: u.name],
limit: ^limit,
offset: ^offset
```
### When to Use
**Triggers:**
- You have 3+ clauses that each correspond to an independently reusable policy
- The query is assembled conditionally based on runtime inputs
- You want the query construction steps to be readable as an English sentence
**Example — before:**
```elixir
from p in Post,
where: p.author_id == ^author_id and not is_nil(p.published_at),
order_by: [desc: p.published_at],
limit: 10
```
**Example — after:**
```elixir
Post
|> by_author(author_id)
|> published()
|> by_newest()
|> limit(10)
```
### When NOT to Use
**Don't use this when:**
- The query has two or fewer clauses and a single `from` is more concise
- You're assembling a complex join where positional bindings require a single `from` for clarity
**Over-application example:**
```elixir
# Excessive piping for a trivial lookup
User
|> where([u], u.id == ^id)
|> limit(1)
|> Repo.one()
```
**Better alternative:**
```elixir
Repo.get(User, id)
```
**Why:** `Repo.get/2` and `Repo.get_by/2` exist precisely for simple lookups. Piping adds ceremony without benefit when the standard API already expresses the intent.
---
## 3. Named Bindings — Position-Independent Composition
**Source:** [lib/ecto/query.ex#L211](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L211)
**What it does:** Assign stable names to `from` and `join` sources using `as:`. Reference those names in any function without knowing or caring about join order.
```elixir
# Name the join at definition time (query.ex line 218-219)
posts_with_comments =
from p in Post,
join: c in Comment, as: :comment, on: c.post_id == p.id
# Reference by name instead of position (line 223)
from [p, comment: c] in posts_with_comments, select: {p.title, c.body}
```
Generic sort function that works on any named binding (line 254-256):
```elixir
def sort(query, as, field) do
from [{^as, x}] in query, order_by: field(x, ^field)
end
```
The spread `...` syntax lets you reference first and last bindings without caring about the middle (line 206):
```elixir
from [p, ..., c] in posts_with_comments, select: {p.title, c.body}
```
**Why:** Positional bindings break when a new join is inserted earlier in the pipeline. Named bindings are stable: adding a join between `Post` and `Comment` does not change how `:comment` is referenced. This is essential for composable query libraries where join order is not known at authorship time.
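
A minimal sketch of that stability, assuming Ecto 3.x via `Mix.install` (illustrative version requirement). The `BindingDemo` module and the `"authors"` join are hypothetical, and string sources replace schemas so nothing beyond Ecto is needed. The same helper is applied to two queries in which the comment join sits at different positions; the queries are only built, never executed.

```elixir
# Assumption: Ecto 3.x; the version requirement is illustrative.
Mix.install([{:ecto, "~> 3.10"}])

defmodule BindingDemo do
  import Ecto.Query

  # Refers to the comment join by name, wherever it sits in the join list.
  def with_comment_body(query) do
    from([p, comment: c] in query, select: {p.title, c.body})
  end

  # Comment is the first join here...
  def comments_first do
    from(p in "posts",
      join: c in "comments", as: :comment, on: c.post_id == p.id
    )
  end

  # ...and the second join here, after a hypothetical authors join.
  def comments_second do
    from(p in "posts",
      join: a in "authors", on: a.id == p.author_id,
      join: c in "comments", as: :comment, on: c.post_id == p.id
    )
  end
end

q1 = BindingDemo.with_comment_body(BindingDemo.comments_first())
q2 = BindingDemo.with_comment_body(BindingDemo.comments_second())

# The helper builds a select in both cases; only the recorded position differs.
IO.inspect(q1.aliases, label: "aliases, comment joined first")
IO.inspect(q2.aliases, label: "aliases, comment joined second")
```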
**Anti-pattern:** Relying on position when queries are built across functions:
```elixir
# Breaks if anyone inserts a join before Comment
def with_comment_body(query) do
from [p, c] in query, select: {p.title, c.body}
end
```
### When to Use
**Triggers:**
- A join is referenced from a function that didn't define it
- You're writing generic helpers (sorting, filtering) that work on any named source
- Multiple joins make positional counting error-prone
**Example — before:**
```elixir
def filter_by_org(query) do
from [u, o] in query, where: o.active == true # Breaks if join order changes
end
```
**Example — after:**
```elixir
# Define the join with a name
from u in User,
join: o in Org, as: :org, on: u.org_id == o.id
# Reference by name — order-independent
def filter_by_org(query) do
from [org: o] in query, where: o.active == true
end
```
### When NOT to Use
**Don't use this when:**
- The query lives entirely in one function and is never extended
- You have a single join that will never be reordered — positional is fine and shorter
- You're using `...` spread and don't need to reference intermediate sources by name
**Over-application example:**
```elixir
# Naming everything in a one-off query adds noise
from u in User, as: :user,
join: p in Post, as: :post, on: p.user_id == u.id,
where: p.published == true,
select: u
```
**Better alternative:**
```elixir
from u in User,
join: p in Post, on: p.user_id == u.id,
where: p.published == true,
select: u
```
**Why:** Named bindings pay off at composition boundaries. In a self-contained query, they add verbosity without stability benefit.
---
## 4. `dynamic/2` — Runtime-Constructed Predicates
**Source:** [lib/ecto/query.ex#L770](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L770)
**What it does:** `dynamic/2` builds query expressions at runtime without executing a query. The resulting value can be composed with `and`/`or` and interpolated into `where`, `having`, `on`, `order_by`, and `select_merge`.
```elixir
# From the dynamic/2 docs (query.ex lines 568-585)
conditions = false

conditions =
  if params["is_public"] do
    dynamic([p], p.is_public or ^conditions)
  else
    conditions
  end

conditions =
  if params["allow_reviewers"] do
    dynamic([p, a], a.reviewer == true or ^conditions)
  else
    conditions
  end

from query, where: ^conditions
```
The canonical reduce pattern for multi-field search forms:
```elixir
def filter(params) do
  Enum.reduce(params, dynamic(true), fn
    {:name, name}, dynamic ->
      dynamic([p], ^dynamic and p.name == ^name)

    {:age, age}, dynamic ->
      dynamic([p], ^dynamic and p.age > ^age)

    _, dynamic ->
      dynamic
  end)
end

from p in Post, where: ^filter(params)
```
**Why:** Without `dynamic/2`, building conditional filters requires runtime `if` guards that build different query structs, or string interpolation (SQL injection risk). `dynamic/2` keeps filtering logic in Ecto's type-safe DSL while composing predicates conditionally. The resulting expression is validated and cast before the query runs.
**Anti-pattern:** Building filter strings with interpolation, or separate query branches per condition:
```elixir
# SQL injection risk
where_clause = "name = '#{params["name"]}'"
Repo.query("SELECT * FROM posts WHERE #{where_clause}")

# Brittle: duplicates the query structure N times
query =
  if params["name"] do
    from p in Post, where: p.name == ^params["name"]
  else
    from p in Post
  end
```
### When to Use
**Triggers:**
- You're building a search or filter form where 0..N conditions apply based on user input
- Conditions need to be composed with `and`/`or` across different code paths
- You want conditional filtering without forking the entire query
**Example — before:**
```elixir
def search(params) do
  query = from p in Post

  query =
    if params[:title] do
      from p in query, where: ilike(p.title, ^"%#{params[:title]}%")
    else
      query
    end

  query =
    if params[:category] do
      from p in query, where: p.category == ^params[:category]
    else
      query
    end

  query
end
```
**Example — after:**
```elixir
def search(params) do
  filters =
    Enum.reduce(params, dynamic(true), fn
      {:title, title}, d -> dynamic([p], ^d and ilike(p.title, ^"%#{title}%"))
      {:category, cat}, d -> dynamic([p], ^d and p.category == ^cat)
      _, d -> d
    end)

  from p in Post, where: ^filters
end
```
### When NOT to Use
**Don't use this when:**
- The conditions are always applied — static `where` clauses in a named function are simpler
- You only have one conditional — a simple `if` that builds two query variants is clearer
- The condition references a join binding that may not exist — use named bindings and verify first
**Over-application example:**
```elixir
# dynamic() for a condition that's always present
filters = dynamic([p], p.active == true)
from p in Post, where: ^filters
```
**Better alternative:**
```elixir
from p in Post, where: p.active == true
```
**Why:** `dynamic/2` introduces a layer of indirection. When the condition is unconditional, a plain `where` clause in the `from` expression communicates intent more directly.
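The caveat about join bindings that may not exist can be handled by pairing a named binding with a guard. A minimal sketch, assuming Ecto 3.x, schemaless sources, and a hypothetical `:author` binding with a `reviewer` column:

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule GuardedDynamicSketch do
  import Ecto.Query

  # Hypothetical filter: only touches the :author binding when it is actually joined.
  def reviewer_filter(query) do
    if has_named_binding?(query, :author) do
      only_reviewers = dynamic([author: a], a.reviewer == true)
      where(query, ^only_reviewers)
    else
      query
    end
  end
end
```

Whether a missing join should be skipped silently, as here, or treated as a programming error is a design choice; raising may be the better option when the filter is mandatory.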
---
## 5. `subquery/1` — Correlated Subqueries
**Source:** [lib/ecto/query.ex#L897](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L897)
**What it does:** `subquery/1` wraps an `Ecto.Query` for use as a source inside another query — in joins, `where` conditions, or directly in `select`. The canonical use case is batched `update_all` without loading rows into memory.
```elixir
# From subquery/1 docs (query.ex lines 869-878)
subset = from(p in Post,
  where: p.synced == false and
           (is_nil(p.sync_started_at) or p.sync_started_at < ^min_sync_started_at),
  limit: ^batch_size
)

Repo.update_all(
  from(p in Post, join: s in subquery(subset), on: s.id == p.id),
  set: [sync_started_at: NaiveDateTime.utc_now()]
)
```
Correlated subquery in `select` using `parent_as` (lines 894-895):
```elixir
comments_count = from(c in Comment, where: c.post_id == parent_as(:post).id, select: count())
from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_count)})
```
**Why:** Batched updates via a subquery join let the database enforce the limit at the SQL level — no rows are fetched into Elixir. `parent_as` correlates a subquery to the outer query's binding, computing aggregates per row without an explicit `GROUP BY` in the outer query.
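The batched update is typically driven by a loop that stops once no rows match. The sketch below is one way to write that loop, assuming Ecto 3.x; `BatchSyncSketch`, the `repo` argument (any module exposing `update_all/2`, normally the application's Repo), and the table and column names are illustrative assumptions rather than part of the Ecto docs.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule BatchSyncSketch do
  import Ecto.Query

  # `repo` is assumed to expose update_all/2, like an application's Repo module.
  def mark_all(repo, batch_size, now) do
    subset =
      from p in "posts",
        where: p.synced == false and is_nil(p.sync_started_at),
        limit: ^batch_size,
        select: %{id: p.id}

    batch = from p in "posts", join: s in subquery(subset), on: s.id == p.id

    # Updated rows no longer satisfy is_nil(sync_started_at), so the loop ends.
    case repo.update_all(batch, set: [sync_started_at: now]) do
      {0, _} -> :done
      {_updated, _} -> mark_all(repo, batch_size, now)
    end
  end
end
```

Passing the repo as an argument keeps the sketch testable with a stub; in application code the Repo module would usually be referenced directly.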
**Anti-pattern:** Loading rows into memory to get their IDs, then issuing a second query:
```elixir
# Fetches all IDs into memory before updating
ids =
  Post
  |> where([p], p.synced == false)
  |> limit(^batch_size)
  |> select([p], p.id)
  |> Repo.all()

Repo.update_all(from(p in Post, where: p.id in ^ids),
  set: [sync_started_at: NaiveDateTime.utc_now()])
```
### When to Use
**Triggers:**
- You need to batch-update records matching a subselect without loading them
- You need a per-row aggregate (count, sum) in a `select` without adding it as a join
- The subquery filter depends on the parent row's value (`parent_as`)
**Example — before:**
```elixir
# N+1 pattern: one query per post to count comments
posts = Repo.all(Post)

Enum.map(posts, fn post ->
  count = Repo.aggregate(from(c in Comment, where: c.post_id == ^post.id), :count)
  Map.put(post, :comment_count, count)
end)
```
**Example — after:**
```elixir
comments_count =
  from c in Comment,
    where: c.post_id == parent_as(:post).id,
    select: count()

Repo.all(from p in Post, as: :post,
  select: %{id: p.id, title: p.title, comment_count: subquery(comments_count)})
```
### When NOT to Use
**Don't use this when:**
- A join + `group_by` expresses the aggregation more clearly and performs comparably
- The subquery is not correlated — a preload or separate query may be more readable
- The query is simple enough that `Repo.all` + in-memory grouping is fast enough and clearer
**Over-application example:**
```elixir
# Subquery where a simple preload is idiomatic
comments_query = from(c in Comment, where: c.post_id == parent_as(:post).id)
from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_query)})
# A subquery in select must yield a single value per row, so it cannot
# return a post's comments; use a preload to load associations
```
**Better alternative:**
```elixir
Post |> Repo.all() |> Repo.preload(:comments)
```
**Why:** `subquery/1` is best suited to aggregates and batched writes. For loading associated structs, `preload` is idiomatic and returns properly typed structs that Ecto's association machinery can use.
---
## 6. `exclude/2` — Strip Clauses for Reuse
**Source:** [lib/ecto/query.ex#L989](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L989)
**What it does:** Removes one or more previously set clauses from a query. Enables deriving a variant of a base query — most commonly stripping `select`, `order_by`, and `preload` to build a count query.
```elixir
# From exclude/2 docs (query.ex lines 946-958)
Ecto.Query.exclude(query, :join)
Ecto.Query.exclude(query, :where)
Ecto.Query.exclude(query, :order_by)
Ecto.Query.exclude(query, :select)
Ecto.Query.exclude(query, :preload)
Ecto.Query.exclude(query, :limit)
Ecto.Query.exclude(query, :offset)
# Remove a list at once (line 964)
Ecto.Query.exclude(query, [:limit, :offset])
```
The count query pattern:
```elixir
def count_query(query) do
  query
  |> exclude(:select)
  |> exclude(:order_by)
  |> exclude(:preload)
  |> select([x], count(x.id))
end
```
**Why:** Without `exclude/2`, you must maintain two parallel query paths — one for data, one for counts — that can drift out of sync. Deriving the count query from the data query guarantees they share all `where` and `join` clauses: adding a filter in one place automatically applies to both.
**Anti-pattern:** Two independent query definitions that must be kept in sync manually:
```elixir
# Any filter added to data_query must also be added to count_query
def data_query(params) do
  from p in Post, where: p.active == true, order_by: [desc: p.inserted_at]
end

def count_query(params) do
  from p in Post, where: p.active == true, select: count() # easy to forget
end
```
### When to Use
**Triggers:**
- You need both a data query and a count query from the same base (pagination)
- A query includes `order_by` or `limit` that must be absent for counting or aggregation
- You need to reuse a query's `where` clauses in an update or delete without its `select`
**Example — before:**
```elixir
defmodule MyApp.Posts do
  def list_posts(filters) do
    base = build_base_query(filters)
    data = Repo.all(base)
    count = Repo.aggregate(build_count_query(filters), :count)
    {data, count}
  end

  defp build_base_query(filters), do: ...
  defp build_count_query(filters), do: ... # must track build_base_query manually
end
```
**Example — after:**
```elixir
defmodule MyApp.Posts do
  def list_posts(filters) do
    base = build_base_query(filters)
    data = Repo.all(base)
    count = base |> exclude(:select) |> exclude(:order_by) |> Repo.aggregate(:count)
    {data, count}
  end

  defp build_base_query(filters), do: ... # one source of truth
end
```
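The third trigger, reusing a listing query's filters for a bulk write, can be sketched the same way. This assumes Ecto 3.x and a schemaless `"posts"` source; `ExcludeForWritesSketch` and `as_bulk_target/1` are hypothetical names. The motivation is that `update_all` and `delete_all` accept only a restricted set of clauses, so display-oriented clauses have to be stripped first.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule ExcludeForWritesSketch do
  import Ecto.Query

  # Keeps the filters and joins of a listing query, drops display-only clauses.
  # :limit and :offset are deliberately left alone so that a paginated query
  # is not silently widened to every matching row.
  def as_bulk_target(query) do
    query
    |> exclude(:select)
    |> exclude(:order_by)
    |> exclude(:preload)
  end
end
```

The result can then be handed to the Repo's bulk functions, so the write and the listing share a single definition of which rows are in scope.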
### When NOT to Use
**Don't use this when:**
- The base query uses a `join` that is only needed for sorting, not filtering — excluding `order_by` still keeps the join, which may produce duplicates in the count
- The clauses to exclude would leave the query in an invalid state (e.g. excluding `:join` while a remaining `where` still references the joined binding)
- The count query differs structurally enough that a shared base would be forced
**Over-application example:**
```elixir
# Excluding :order_by and :select leaves joins in place, so the count
# may be inflated by join duplicates
def count(query) do
  query
  |> exclude(:order_by)
  |> exclude(:select)
  |> select([x], count(x.id))
  |> Repo.one()

  # If the query joins a one-to-many association, this overcounts
end
```
**Better alternative:**
```elixir
def count(query) do
  query
  |> exclude(:order_by)
  |> exclude(:select)
  |> select([x], count(x.id, :distinct)) # or exclude joins and recount
  |> Repo.one()
end
```
**Why:** `exclude/2` removes only the clauses you name. Excluding `:select` and `:order_by` leaves every join in place, so if the base query joins tables that multiply rows (one-to-many), counting without `DISTINCT` overstates results. Know your join cardinality before deriving count queries this way.
---
## 7. Bindingless Queries — Data-Driven Clauses
**Source:** [lib/ecto/query.ex#L264](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L264)
**What it does:** When a query has only one source and clauses use simple field equality or fixed expressions, bindings can be omitted entirely. Clauses accept keyword lists and atom field names.
```elixir
# From the bindingless docs (query.ex lines 265-268)
from Post,
where: [category: "fresh and new"],
order_by: [desc: :published_at],
select: [:id, :title, :body]
```
This is equivalent to the binding form:
```elixir
from p in Post,
where: p.category == "fresh and new",
order_by: [desc: p.published_at],
select: struct(p, [:id, :title, :body])
```
Bindingless syntax is fully dynamic (line 283-287):
```elixir
where = [category: "fresh and new"]
order_by = [desc: :published_at]
select = [:id, :title, :body]
from Post, where: ^where, order_by: ^order_by, select: ^select
```
**Why:** Bindings exist to name sources so they can be referenced in expressions. When you're only filtering by equality on fields of the single source, bindings add syntax without adding capability. The bindingless form is shorter, more data-driven, and maps cleanly to keyword lists built at runtime.
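As an illustration of the data-driven style, the sketch below builds the `where` keyword list from a map, keeping only whitelisted keys. It assumes Ecto 3.x, a schemaless `"posts"` source, and a params map with atom keys; `BindinglessSketch` and the field names are hypothetical. `nil` values are dropped because Ecto rejects comparisons with `nil` in keyword filters.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule BindinglessSketch do
  import Ecto.Query

  @allowed [:category, :status]

  # `params` is assumed to be a map with atom keys, e.g. from a validated form.
  def search(params) do
    filters =
      params
      |> Map.take(@allowed)
      |> Enum.reject(fn {_field, value} -> is_nil(value) end)

    base = from "posts", select: [:id, :title]

    # Skip the where clause entirely when no filter survives.
    case filters do
      [] -> base
      _ -> from base, where: ^filters
    end
  end
end
```

The whitelist matters: interpolating an unfiltered keyword list would let callers filter on any column of the source.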
**Anti-pattern:** Always using binding syntax even for simple equality filters:
```elixir
# More verbose than necessary for simple filters
from p in Post,
where: p.category == ^category and p.status == ^status,
select: [:id, :title]
```
### When to Use
**Triggers:**
- The query has exactly one source (no joins)
- All `where` conditions are field equality checks against interpolated values
- You're building query clauses dynamically from a map or keyword list (web search forms, CLIs)
**Example — before:**
```elixir
def search(filters) do
  from p in Post,
    where: p.category == ^filters[:category],
    where: p.status == ^filters[:status],
    select: [:id, :title, :body]
end
```
**Example — after:**
```elixir
def search(filters) do
  where = Keyword.take(filters, [:category, :status])
  from Post, where: ^where, select: [:id, :title, :body]
end
```
### When NOT to Use
**Don't use this when:**
- The query includes a `join` — bindings are required to reference joined sources
- A `where` condition uses operators other than equality (`>`, `<`, `like`, `fragment`)
- You need to pass the source binding to a function like `field/2` or `type/2`
**Over-application example:**
```elixir
# Bindingless can't express non-equality conditions
from Post,
where: [inserted_at: ^date] # Works only for exact equality — not a range
```
**Better alternative:**
```elixir
from p in Post, where: p.inserted_at >= ^start_date and p.inserted_at < ^end_date
```
**Why:** Bindingless keyword syntax maps to equality (`==`). Any non-equality comparison, function call, or multi-table reference requires a binding. Use bindingless for pure equality filters; reach for bindings the moment expressions get richer.
---
## 8. `select_merge/3` — Augmenting Selects Dynamically
**Source:** [lib/ecto/query.ex#L693](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L693)
**What it does:** Merges additional fields into an existing `select` without replacing it. Especially useful with `dynamic/2` to add computed columns conditionally.
```elixir
# From the dynamic docs (query.ex lines 693-695)
metric = dynamic([p], p.distance)
from query, select: [:period, :metric], select_merge: ^%{metric: metric}
```
With aliasing and dynamic ordering (lines 700-707):
```elixir
fields = %{
  period: dynamic([p], selected_as(p.month, :month)),
  metric: dynamic([p], p.distance)
}

order = dynamic(selected_as(:month))

from query, select: ^fields, order_by: ^order
```
**Why:** `select_merge` lets base queries define the fixed fields and separate concerns add computed or conditional fields. Without it, adding a field requires rewriting the entire `select` clause — or maintaining multiple `select` variants. Combined with `dynamic/2`, it enables data-driven projections where which columns appear depends on runtime configuration.
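A data-driven projection along these lines might look like the sketch below. It assumes a recent Ecto 3.x release in which maps of dynamics can be interpolated into `select_merge` (as in the docs example above), plus a schemaless `"posts"` source; `SelectMergeSketch`, `optional_columns/0`, and the column names are hypothetical.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule SelectMergeSketch do
  import Ecto.Query

  # Hypothetical catalogue of optional columns, keyed by the name callers ask for.
  defp optional_columns do
    %{
      distance: dynamic([p], p.distance),
      views: dynamic([p], p.views)
    }
  end

  def report(requested) do
    base = from p in "posts", select: %{period: p.period}
    extras = Map.take(optional_columns(), requested)

    # Unknown names are ignored; with nothing to add, the base query is returned.
    if map_size(extras) == 0 do
      base
    else
      from base, select_merge: ^extras
    end
  end
end
```

Keeping the catalogue in one place means callers choose columns by name and never supply query expressions themselves.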
**Anti-pattern:** Rewriting the entire `select` whenever a computed column is needed:
```elixir
# The base select must be duplicated in every variant
def with_distance(query) do
  from p in query, select: %{id: p.id, name: p.name, distance: p.distance}
end

def without_distance(query) do
  from p in query, select: %{id: p.id, name: p.name}
end
```
### When to Use
**Triggers:**
- A computed column should be added conditionally depending on caller context
- You're building a reporting query where which aggregates appear is configured at runtime
- A base query provides the structural select and feature-specific code augments it
**Example — before:**
```elixir
def list_metrics(include_distance?) do
  if include_distance? do
    from p in Post, select: %{period: p.period, metric: p.views, distance: p.distance}
  else
    from p in Post, select: %{period: p.period, metric: p.views}
  end
end
```
**Example — after:**
```elixir
def list_metrics(opts) do
  base = from p in Post, select: %{period: p.period, metric: p.views}

  if opts[:include_distance] do
    from p in base, select_merge: %{distance: p.distance}
  else
    base
  end
end
```
### When NOT to Use
**Don't use this when:**
- The `select` is a single value or tuple rather than a map or struct; `select_merge` needs a map or struct to merge into
- The added field requires a binding not present in the base query
- You're replacing, not augmenting — use a plain `select` or `exclude(:select)` first
**Over-application example:**
```elixir
# select_merge on a scalar select raises an error
from p in Post,
  select: p.name,
  select_merge: %{email: p.email} # error: base select is neither a map nor a struct
```
**Better alternative:**
```elixir
from p in Post, select: %{name: p.name, email: p.email}
```
**Why:** `select_merge` merges into an existing map or struct select (including the default selection of the source). If the base select returns a scalar or a tuple, there is nothing to merge into. Ensure the base `select` produces a map or struct before using `select_merge`.
---
## 9. `fragment/1` and `type/2` — Escape Hatches for DB-Specific Expressions
**Source:** [lib/ecto/query.ex#L291](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L291)
**What it does:** `fragment/1` passes a raw SQL expression to the database engine for functions or operators the Ecto DSL cannot express. `type/2` coerces an Elixir value to a specific Ecto type for comparison when schema type information is not available.
```elixir
# From the fragments docs (query.ex lines 301-303)
from p in Post,
where: is_nil(p.published_at) and
fragment("lower(?)", p.title) == ^title
```
`type/2` for schemaless queries where Ecto cannot infer the cast type:
```elixir
# Coerce to :integer when no schema field exists to infer from
from u in "users",
where: u.age > type(^age, :integer)
```
**Why:** Databases have functions and operators (full-text search, JSON operators, trigram similarity) that the Ecto DSL does not cover. `fragment/1` is the intentional escape hatch: arguments are substituted for the `?` placeholders, and values interpolated with `^` are still sent as query parameters, so they do not introduce SQL injection. `type/2` is necessary for schemaless queries where Ecto cannot cast a bound parameter to the correct DB type automatically.
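When the same fragment recurs, it can be wrapped in a macro so call sites read like ordinary query functions. The sketch below assumes Ecto 3.x and a schemaless `"users"` source; `QueryFunctionsSketch.lower/1` is a hypothetical helper, not something Ecto ships.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule QueryFunctionsSketch do
  # Hypothetical wrapper: expands to a fragment when used inside a query.
  defmacro lower(expr) do
    quote do
      fragment("lower(?)", unquote(expr))
    end
  end
end

defmodule FragmentSketch do
  import Ecto.Query
  import QueryFunctionsSketch

  # The email is lowercased in Elixir and sent as a bound parameter.
  def by_email(email) do
    from u in "users",
      where: lower(u.email) == ^String.downcase(email),
      select: u.id
  end
end
```

This keeps the raw SQL string in one module, which makes it easier to audit and to replace if the application later targets a different database.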
**Anti-pattern:** Using fragment for everything, bypassing Ecto's type safety and composability:
```elixir
# Loses all type inference; fragment output is opaque to Ecto
from u in User,
where: fragment("? = ? AND ? > ?", u.name, ^name, u.age, ^age)
```
Or using string interpolation inside fragments:
```elixir
# SQL injection risk: never interpolate into fragment strings (Ecto rejects this at compile time)
from u in User,
where: fragment("lower(email) = '#{email}'")
```
### When to Use
**Triggers:**
- The database function has no Ecto DSL equivalent (e.g. `lower()`, `similarity()`, `jsonb_array_elements()`)
- You're writing a schemaless query (`from u in "users"`) and need to cast a bound parameter
- A DB-specific operator or syntax is required for a performance-critical path
**Example — before:**
```elixir
# Can't express case-insensitive equality in pure Ecto DSL
from u in User, where: u.email == ^String.downcase(email)
# Compares raw stored value; doesn't work if DB stores mixed case
```
**Example — after:**
```elixir
from u in User,
where: fragment("lower(?)", u.email) == ^String.downcase(email)
```
### When NOT to Use
**Don't use this when:**
- The Ecto DSL (see `Ecto.Query.API`) already provides the operation
- You want type-cast values in a regular schema query — Ecto infers the type from the field
- You're tempted to fragment an entire `WHERE` clause — named functions with `dynamic/2` compose better
**Over-application example:**
```elixir
# Fragments for standard operations Ecto handles natively
from p in Post,
where: fragment("? = ?", p.status, ^:published),
order_by: fragment("? DESC", p.inserted_at)
```
**Better alternative:**
```elixir
from p in Post,
where: p.status == :published,
order_by: [desc: p.inserted_at]
```
**Why:** Fragment output is opaque to Ecto's type system: no cast validation, no composability with `dynamic/2` type inference, and no portability across adapters. Reserve `fragment/1` for genuine gaps in the DSL; prefer native Ecto expressions for everything the DSL can express.
---
## Decision Tree
- If you need to filter conditionally at runtime → `dynamic/2` (Pattern 4)
- If you need to join or sort across function composition boundaries → named bindings with `as:` (Pattern 3)
- If you need a count or aggregate from the same base as a data query → `exclude/2` (Pattern 6)
- If you need a DB-side correlated count or aggregate per row → `subquery/1` with `parent_as` (Pattern 5)
- If the query has one source and all filters are equality checks → bindingless keyword syntax (Pattern 7)
- If you need to add computed columns without rewriting the select → `select_merge/3` (Pattern 8)
- If the DB function has no Ecto DSL equivalent → `fragment/1` as last resort (Pattern 9)
- For all queries: define small named functions that take a query and return a query (Patterns 1 and 2)
<!-- PATTERN_COMPLETE -->
+893
View File
@@ -0,0 +1,893 @@
# Ecto Schema Patterns
Patterns extracted from `lib/ecto/schema.ex` in the Ecto source.
## Contents
1. [Base Schema Module — App-Wide Schema Defaults](#1-base-schema-module--app-wide-schema-defaults)
2. [`@primary_key false` — Composite or No Primary Key](#2-primary_key-false--composite-or-no-primary-key)
3. [Virtual Fields — In-Memory-Only Data](#3-virtual-fields--in-memory-only-data)
4. [`embedded_schema/1` — Schemaless Validation Structs](#4-embedded_schema1--schemaless-validation-structs)
5. [`@timestamps_opts` — Consistent Timestamp Types](#5-timestamps_opts--consistent-timestamp-types)
6. [Field `:source` Option — Column Name Mapping](#6-field-source-option--column-name-mapping)
7. [`redact: true` — Protecting Sensitive Fields](#7-redact-true--protecting-sensitive-fields)
8. [`__schema__/1` Reflection — Runtime Schema Introspection](#8-__schema__1-reflection--runtime-schema-introspection)
---
## 1. Base Schema Module — App-Wide Schema Defaults
**Source:** [lib/ecto/schema.ex#L194](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L194)
```elixir
defmodule MyApp.Schema do
  defmacro __using__(_) do
    quote do
      use Ecto.Schema
      @primary_key {:id, :binary_id, autogenerate: true}
      @foreign_key_type :binary_id
    end
  end
end

defmodule MyApp.Comment do
  use MyApp.Schema

  schema "comments" do
    belongs_to :post, MyApp.Post
  end
end
```
**Why:** `@primary_key` and `@foreign_key_type` are module attributes that must be set before each `schema/2` block. Without a shared base module, every schema in the application must repeat these two lines. One forgotten schema silently gets integer `:id` columns and integer foreign keys — a type mismatch that only surfaces at runtime when joins or associations break.
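Whether a schema actually picked up the shared defaults can be checked through reflection, for example in a test. The sketch below assumes Ecto 3.x; the `SketchApp` modules and `binary_id?/1` are hypothetical names used only for illustration.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule SketchApp.Schema do
  defmacro __using__(_opts) do
    quote do
      use Ecto.Schema
      @primary_key {:id, :binary_id, autogenerate: true}
      @foreign_key_type :binary_id
    end
  end
end

defmodule SketchApp.Post do
  use SketchApp.Schema

  schema "posts" do
    field :title, :string
  end
end

defmodule SketchApp.Comment do
  use SketchApp.Schema

  schema "comments" do
    belongs_to :post, SketchApp.Post
  end
end

defmodule SketchApp.SchemaCheck do
  # Hypothetical guard: true when the schema's primary key is a :binary_id.
  def binary_id?(schema) do
    schema.__schema__(:primary_key) == [:id] and
      schema.__schema__(:type, :id) == :binary_id
  end
end
```

Running such a check over every schema module in a test suite would catch a module that still says `use Ecto.Schema` directly.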
**Anti-pattern:** Setting `@primary_key` in every schema individually:
```elixir
# BAD: duplicated in every schema, easy to miss in a new module
defmodule MyApp.Comment do
  use Ecto.Schema
  @primary_key {:id, :binary_id, autogenerate: true}
  @foreign_key_type :binary_id

  schema "comments" do
    belongs_to :post, MyApp.Post
  end
end

defmodule MyApp.Post do
  use Ecto.Schema

  # Forgot to set @primary_key, so this schema uses integer :id
  schema "posts" do
    has_many :comments, MyApp.Comment # Association type mismatch
  end
end
```
### When to Use
**Triggers:**
- Application uses UUID (`:binary_id`) primary keys
- Application has more than one schema module
- You want all future schemas to inherit the same defaults without manual setup
**Example — before:**
```elixir
defmodule MyApp.User do
  use Ecto.Schema
  @primary_key {:id, :binary_id, autogenerate: true}
  @foreign_key_type :binary_id

  schema "users" do
    field :name, :string
  end
end

defmodule MyApp.Post do
  use Ecto.Schema
  @primary_key {:id, :binary_id, autogenerate: true}
  @foreign_key_type :binary_id

  schema "posts" do
    belongs_to :user, MyApp.User
  end
end
```
**Example — after:**
```elixir
# lib/my_app/schema.ex: define once
defmodule MyApp.Schema do
  defmacro __using__(_) do
    quote do
      use Ecto.Schema
      @primary_key {:id, :binary_id, autogenerate: true}
      @foreign_key_type :binary_id
    end
  end
end

# All schemas use the base module
defmodule MyApp.User do
  use MyApp.Schema

  schema "users" do
    field :name, :string
  end
end

defmodule MyApp.Post do
  use MyApp.Schema

  schema "posts" do
    belongs_to :user, MyApp.User
  end
end
```
### When NOT to Use
**Don't use this when:**
- The application uses integer primary keys (Ecto's default) — no base module needed
- Different schemas intentionally use different primary key types
- You only have one schema (the base module adds indirection for no gain)
**Over-application example:**
```elixir
# Unnecessary when using integer primary keys (the Ecto default)
defmodule MyApp.Schema do
  defmacro __using__(_) do
    quote do
      use Ecto.Schema
      @primary_key {:id, :id, autogenerate: true} # Same as the default (:id resolves to integer)
      @foreign_key_type :id
    end
  end
end
```
**Better alternative:**
```elixir
# Just use Ecto.Schema directly when the defaults are fine
defmodule MyApp.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
  end
end
```
**Why:** The base module pattern exists to override Ecto defaults, not to mirror them. When integer primary keys are acceptable, the base module adds a layer of indirection without adding any value.
---
## 2. `@primary_key false` — Composite or No Primary Key
**Source:** [lib/ecto/schema.ex#L1525](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L1525)
```elixir
defmodule PostTag do
  use Ecto.Schema
  @primary_key false

  schema "posts_tags" do
    belongs_to :post, Post
    belongs_to :tag, Tag
  end
end
```
**Why:** By default, Ecto injects an `:id` field as the first field in every schema. For join tables with composite primary keys, or tables that intentionally have no primary key, this auto-injected field causes problems: queries select an `id` column that does not exist, and Ecto's identity tracking operates on the wrong key. Setting `@primary_key false` disables the auto-injection entirely, giving the schema full control over which fields exist.
**Anti-pattern:** Defining a join table schema with the default `:id` field:
```elixir
# BAD: Ecto injects :id, which doesn't exist in posts_tags
defmodule PostTag do
  use Ecto.Schema

  schema "posts_tags" do
    belongs_to :post, Post
    belongs_to :tag, Tag
  end
end

# Queries select an :id column that posts_tags does not have, so loading rows fails.
# Struct-based operations would identify rows by a primary key that does not exist.
```
### When to Use
**Triggers:**
- The table is a join/association table with a composite primary key (e.g., `post_id + tag_id`)
- The table has no meaningful single-column primary key at all
- You're mapping a legacy table that lacks a primary key column
**Example — before:**
```elixir
# posts_tags table has only (post_id, tag_id); there is no separate :id column
defmodule PostTag do
  use Ecto.Schema

  schema "posts_tags" do
    belongs_to :post, Post # generates post_id
    belongs_to :tag, Tag   # generates tag_id
    timestamps()
  end
end

# Ecto injects :id, but the DB column doesn't exist, causing query errors.
```
**Example — after:**
```elixir
defmodule PostTag do
  use Ecto.Schema
  @primary_key false

  schema "posts_tags" do
    belongs_to :post, Post
    belongs_to :tag, Tag
    timestamps()
  end
end
```
### When NOT to Use
**Don't use this when:**
- The table has a legitimate single-column primary key (use the default or `@primary_key {:id, :binary_id, autogenerate: true}`)
- You want a composite primary key but still need `Repo.get/2`, which expects exactly one primary key field
- The schema is used with `Repo.update/2` or `Repo.delete/2`, which require the schema to declare a primary key
**Over-application example:**
```elixir
# Disabling primary key on a regular entity schema
defmodule MyApp.User do
  use Ecto.Schema
  @primary_key false # Users need a primary key for Repo.get/2

  schema "users" do
    field :email, :string
    field :name, :string
  end
end
```
**Better alternative:**
```elixir
defmodule MyApp.User do
  use Ecto.Schema

  schema "users" do # Default :id primary key is correct here
    field :email, :string
    field :name, :string
  end
end
```
**Why:** `@primary_key false` removes Ecto's ability to uniquely identify rows. `Repo.get/2`, `Repo.update/2`, and association loading all depend on a primary key. Use it only for tables that are genuinely identified by composite keys or have no identity concept.
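For a join table that is identified by both foreign keys, the columns can be flagged as primary key parts so the schema still declares an identity. A minimal sketch, assuming Ecto 3.x and the `:primary_key` option of `belongs_to`; the `CompositeSketch` modules are hypothetical.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule CompositeSketch.Post do
  use Ecto.Schema

  schema "posts" do
    field :title, :string
  end
end

defmodule CompositeSketch.Tag do
  use Ecto.Schema

  schema "tags" do
    field :name, :string
  end
end

defmodule CompositeSketch.PostTag do
  use Ecto.Schema

  # No auto-injected :id; the two foreign keys form the primary key instead.
  @primary_key false
  schema "posts_tags" do
    belongs_to :post, CompositeSketch.Post, primary_key: true
    belongs_to :tag, CompositeSketch.Tag, primary_key: true
  end
end
```

Reflection then reports both columns through `__schema__(:primary_key)`, while functions that expect a single key, such as `Repo.get`, remain unsuitable for this schema.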
---
## 3. Virtual Fields — In-Memory-Only Data
**Source:** [lib/ecto/schema.ex#L332](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L332)
```elixir
schema "users" do
field :password_hash, :string
field :password, :string, virtual: true
field :delete, :boolean, virtual: true
end
```
**Why:** Virtual fields live on the struct and participate in changesets: they can be cast, validated (for example with `validate_required`), and read in changeset functions, but Ecto never attempts to load or persist them. This is exactly right for data that only exists for the duration of an operation: a plaintext password for hashing, a confirmation field, a checkbox that signals intent to delete, or a computed display value assembled in application code.
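The split between struct fields and persisted fields is visible through reflection. The sketch below assumes Ecto 3.x; `VirtualFieldSketch.User` and the eight-character minimum are illustrative choices.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Ecto 3.x.
Mix.install([{:ecto, "~> 3.10"}])

defmodule VirtualFieldSketch.User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :password, :string, virtual: true
  end

  # The virtual field is cast and validated like any other field.
  def changeset(user, params) do
    user
    |> cast(params, [:email, :password])
    |> validate_required([:email, :password])
    |> validate_length(:password, min: 8)
  end
end
```

`__schema__(:fields)` lists only the persisted fields, while `__schema__(:virtual_fields)` lists the virtual ones, so the password can appear in the changeset's changes without ever being part of an insert.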
**Anti-pattern:** Storing transient data in a map or separate variable passed alongside the changeset:
```elixir
# BAD: passing raw password alongside changeset is error-prone
def register_user(params) do
  changeset = User.changeset(%User{}, params)
  password = Map.get(params, "password") # Not validated, not in changeset pipeline
  if changeset.valid?, do: hash_and_insert(changeset, password)
end
```
### When to Use
**Triggers:**
- A field is needed in changeset validation but must never be persisted (passwords, confirmation fields)
- A value is computed in application code and attached to the struct for rendering but has no DB column
- A form sends a signal (e.g., a "delete" checkbox) that controls behavior rather than being stored
**Example — before:**
```elixir
# Confirmation field handled outside the changeset pipeline
def create_account(params) do
  password = params["password"]
  confirm = params["password_confirmation"]

  if password != confirm do
    {:error, "passwords do not match"}
  else
    User.changeset(%User{}, params) |> Repo.insert()
  end
end
```
**Example — after:**
```elixir
defmodule User do
  use Ecto.Schema
  import Ecto.Changeset

  schema "users" do
    field :email, :string
    field :password_hash, :string
    field :password, :string, virtual: true
    field :password_confirmation, :string, virtual: true
  end

  def registration_changeset(user, params) do
    user
    |> cast(params, [:email, :password, :password_confirmation])
    |> validate_required([:email, :password])
    |> validate_confirmation(:password)
    |> hash_password()
  end

  defp hash_password(%{valid?: true, changes: %{password: pw}} = changeset) do
    put_change(changeset, :password_hash, Bcrypt.hash_pwd_salt(pw))
  end

  defp hash_password(changeset), do: changeset
end
```
### When NOT to Use
**Don't use this when:**
- The value should be persisted and loaded from the DB (use a regular field)
- The value needs to participate in Ecto queries (`where`, `order_by`) — virtual fields cannot be queried
- The field has a `:default` that should be the DB column default — virtual field defaults only affect the struct, not the database
**Over-application example:**
```elixir
# Using virtual for a value that should live in the DB
schema "articles" do
field :title, :string
field :slug, :string, virtual: true # Wrong — slug needs to be persisted and queried
field :body, :string
end
```
**Better alternative:**
```elixir
schema "articles" do
field :title, :string
field :slug, :string # Regular field — persisted and queryable
field :body, :string
end
```
**Why:** Virtual fields are intentionally invisible to the database layer. Any field that needs to survive a page reload, be searched, sorted, or returned in a list query must be a real column.
---
## 4. `embedded_schema/1` — Schemaless Validation Structs
**Source:** [lib/ecto/schema.ex#L110](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L110)
```elixir
defmodule Search do
use Ecto.Schema
import Ecto.Changeset
embedded_schema do
field :query, :string
field :page, :integer, default: 1
field :per_page, :integer, default: 20
end
def changeset(search, params) do
search
|> cast(params, [:query, :page, :per_page])
|> validate_required([:query])
|> validate_number(:page, greater_than: 0)
end
end
```
**Why:** `embedded_schema` defines a schema module with no backing DB table. The struct, changeset pipeline, and all validations work identically to a regular schema, but there is no `Repo` involved. This makes it the right tool for validating and casting data that doesn't map to a table: search/filter forms, API request bodies, command parameters, multi-step wizard state. A `Search.changeset/2` call returns a changeset with errors just like `User.changeset/2` does — the same controller and form helper code works for both.
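A short usage sketch of the module above, assuming the `:ecto` package is available. The `Search` module is restated so the block runs on its own; `apply_action/2` is one way to turn the changeset into an `{:ok, struct}` or `{:error, changeset}` tuple without a Repo.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule Search do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :query, :string
    field :page, :integer, default: 1
    field :per_page, :integer, default: 20
  end

  def changeset(search, params) do
    search
    |> cast(params, [:query, :page, :per_page])
    |> validate_required([:query])
    |> validate_number(:page, greater_than: 0)
  end
end

# String input is cast to the declared types; defaults fill the gaps.
{:ok, search} =
  %Search{}
  |> Search.changeset(%{"query" => "ecto", "page" => "2"})
  |> Ecto.Changeset.apply_action(:validate)

{search.query, search.page, search.per_page}
# => {"ecto", 2, 20}

# Invalid input accumulates every error instead of stopping at the first.
{:error, changeset} =
  %Search{}
  |> Search.changeset(%{"page" => "0"})
  |> Ecto.Changeset.apply_action(:validate)

changeset.errors |> Keyword.keys() |> Enum.sort()
# => [:page, :query]
```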
**Anti-pattern:** Validating input that has no backing table with raw `Map` operations and ad-hoc checks:
```elixir
# BAD — manual validation without changeset pipeline
def search(params) do
query = Map.get(params, "query")
page = Map.get(params, "page", "1") |> String.to_integer()
if is_nil(query) or query == "" do
{:error, "query is required"}
else
{:ok, %{query: query, page: page}}
end
# No type coercion, no error accumulation, no `valid?` flag
end
```
### When to Use
**Triggers:**
- A form or API endpoint needs validated, typed input but no persistence
- You want `Ecto.Changeset` error formatting and `valid?` semantics without a Repo
- Multi-step wizard where intermediate state needs validation before DB write
- An external API receives a complex payload that needs casting and validation
**Example — before:**
```elixir
# Validating filter params without Ecto — verbose and fragile
def filter_posts(params) do
status = Map.get(params, "status")
valid_statuses = ["draft", "published", "archived"]
cond do
status not in valid_statuses ->
{:error, "status must be one of: #{Enum.join(valid_statuses, ", ")}"}
true ->
Post |> where(status: ^status) |> Repo.all() |> then(&{:ok, &1})
end
end
```
**Example — after:**
```elixir
defmodule PostFilter do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :status, Ecto.Enum, values: [:draft, :published, :archived]
    field :page, :integer, default: 1
  end

  def changeset(filter \\ %PostFilter{}, params) do
    filter
    |> cast(params, [:status, :page])
    |> validate_required([:status])
    |> validate_number(:page, greater_than: 0)
  end
end

# In a context module that has `import Ecto.Query`
def filter_posts(params) do
  with %{valid?: true} = cs <- PostFilter.changeset(params) do
    # apply_changes/1 cannot fail, so bind it with `=` rather than `<-`
    filter = Ecto.Changeset.apply_changes(cs)
    Post |> where(status: ^filter.status) |> Repo.all() |> then(&{:ok, &1})
  else
    cs -> {:error, cs}
  end
end
```
### When NOT to Use
**Don't use this when:**
- The data needs to be persisted — use a regular `schema/2` with a migration
- The validation logic is trivial (one or two fields): a pattern match or a schemaless `{data, types}` changeset in a context module may be sufficient
- You need DB-level constraints or uniqueness checks — `embedded_schema` has no Repo access
**Over-application example:**
```elixir
# Overkill for a single required string field
defmodule SearchQuery do
  use Ecto.Schema
  import Ecto.Changeset

  embedded_schema do
    field :q, :string
  end

  def changeset(s, p), do: s |> cast(p, [:q]) |> validate_required([:q])
end
```
**Better alternative:**
```elixir
# Simple inline validation is clearer for trivial cases
def search(params) do
case Map.fetch(params, "q") do
{:ok, q} when q != "" -> {:ok, q}
_ -> {:error, :missing_query}
end
end
```
**Why:** `embedded_schema` earns its keep when you have multiple fields, type coercion, or multiple validators — the changeset pipeline pays for itself. For a single field with one constraint, the ceremony outweighs the benefit.
---
## 5. `@timestamps_opts` — Consistent Timestamp Types
**Source:** [lib/ecto/schema.ex#L170](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L170)
```elixir
defmodule MyApp.Schema do
defmacro __using__(_) do
quote do
use Ecto.Schema
@timestamps_opts [type: :utc_datetime_usec]
end
end
end
```
**Why:** The `timestamps()` macro inserts `inserted_at` and `updated_at` fields whose type is controlled by `@timestamps_opts`. The default type is `:naive_datetime`, which stores timestamps with no timezone information. Applications that store or compare timestamps across timezones need `:utc_datetime` or `:utc_datetime_usec`. Microsecond precision (`:utc_datetime_usec`) avoids silent truncation when high-resolution timestamps are generated in Elixir and stored in PostgreSQL (which supports microsecond precision). Setting this in the base module ensures every schema uses the same timestamp type.
**Anti-pattern:** Using the default `naive_datetime` timestamps when UTC compliance or precision is required:
```elixir
# BAD — naive_datetime drops timezone context
defmodule MyApp.Post do
use Ecto.Schema
schema "posts" do
field :title, :string
timestamps() # inserts inserted_at/updated_at as :naive_datetime
end
end
# Timestamps stored as "2024-01-15 10:30:00" — no UTC indicator.
# Comparisons and serialization can silently produce wrong results.
```
### When to Use
**Triggers:**
- Application stores or compares timestamps across timezones (almost always true for web apps)
- Timestamps need to be serialized to ISO 8601 / JSON with timezone information
- PostgreSQL or another DB with microsecond precision should not lose sub-second data
- Multiple schemas share the same timestamp behavior
**Example — before:**
```elixir
# Each schema repeats the option
defmodule MyApp.User do
use Ecto.Schema
schema "users" do
timestamps(type: :utc_datetime_usec)
end
end
defmodule MyApp.Post do
use Ecto.Schema
schema "posts" do
timestamps(type: :utc_datetime_usec)
end
end
```
**Example — after:**
```elixir
# Base module sets the default once
defmodule MyApp.Schema do
defmacro __using__(_) do
quote do
use Ecto.Schema
@primary_key {:id, :binary_id, autogenerate: true}
@foreign_key_type :binary_id
@timestamps_opts [type: :utc_datetime_usec]
end
end
end
defmodule MyApp.User do
use MyApp.Schema
schema "users" do
timestamps() # uses :utc_datetime_usec from base module
end
end
```
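The schema attribute only sets the Elixir-side type; the column itself is created by the migration. The following is a sketch of a matching migration, assuming `ecto_sql` with the PostgreSQL adapter. `MyApp.Repo.Migrations.CreateUsers` is a hypothetical module name. With that adapter, the non-`_usec` timestamp types appear to create second-precision columns, so the migration should request the `_usec` type as well.

```elixir
# Hypothetical migration; assumes ecto_sql with the PostgreSQL adapter.
defmodule MyApp.Repo.Migrations.CreateUsers do
  use Ecto.Migration

  def change do
    create table(:users, primary_key: false) do
      add :id, :binary_id, primary_key: true
      # Match the schema's @timestamps_opts so microseconds are kept.
      timestamps(type: :utc_datetime_usec)
    end
  end
end
```

To avoid repeating the option in every migration, the repo configuration also accepts a `migration_timestamps: [type: :utc_datetime_usec]` entry, which plays the same role for migrations that the base schema module plays for schemas.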
### When NOT to Use
**Don't use this when:**
- The application intentionally avoids timezones (internal tooling, batch jobs)
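- You want these fields to stay `NaiveDateTime` structs in Elixir (on PostgreSQL, `:utc_datetime` types are also stored in `timestamp without time zone` columns, so the column type alone does not force this choice)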
- Only one or two schemas use timestamps and the option is set inline
**Over-application example:**
```elixir
# Setting timestamps_opts for a schema that has no timestamps() call
defmodule PostTag do
use Ecto.Schema
@primary_key false
@timestamps_opts [type: :utc_datetime_usec] # Never used — no timestamps() call
schema "posts_tags" do
belongs_to :post, Post
belongs_to :tag, Tag
end
end
```
**Better alternative:**
```elixir
# No per-schema @timestamps_opts: without a timestamps() call it has no effect
defmodule PostTag do
  use MyApp.Schema # the base module's default is inherited and simply goes unused
  @primary_key false

  schema "posts_tags" do
    belongs_to :post, Post
    belongs_to :tag, Tag
  end
end
```
**Why:** `@timestamps_opts` only takes effect when `timestamps()` is called inside a `schema` block. Setting it on schemas that call `timestamps()` is correct; setting it on schemas that don't is dead code.
---
## 6. Field `:source` Option — Column Name Mapping
**Source:** [lib/ecto/schema.ex](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex)
```elixir
schema "legacy_users" do
field :email_address, :string, source: :emailaddress
field :created_at, :utc_datetime, source: :creation_timestamp
end
```
**Why:** The `:source` option on `field/3` maps an Elixir field name to a different database column name. This is the escape hatch for working with databases that don't follow Elixir naming conventions — legacy systems with camelCase or abbreviated column names, databases shared with other applications, or tables generated by external tools. The Elixir struct uses the field name (`:email_address`); SQL uses the source name (`emailaddress`). Ecto translates transparently in both directions — `select`, `where`, and `insert` all use the column name.
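The mapping can be inspected through schema reflection. This is a minimal sketch, assuming the `:ecto` package is available; `LegacyUser` is a hypothetical module that mirrors the example above.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule LegacyUser do
  use Ecto.Schema

  # Hypothetical legacy table, mirroring the example above.
  schema "legacy_users" do
    field :email_address, :string, source: :emailaddress
    field :created_at, :utc_datetime, source: :creation_timestamp
  end
end

# Struct fields use the Elixir names...
LegacyUser.__schema__(:fields)
# => [:id, :email_address, :created_at]

# ...while reflection reports the column each field maps to.
LegacyUser.__schema__(:field_source, :email_address)
# => :emailaddress

# Fields without a :source option map to a column of the same name.
LegacyUser.__schema__(:field_source, :id)
# => :id
```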
**Anti-pattern:** Mapping column names by wrapping all query fragments in raw SQL or using `fragment/1` everywhere:
```elixir
# BAD — forced to use raw column names in every query
from u in "legacy_users",
where: fragment("emailaddress = ?", ^email),
select: %{email: u.emailaddress, created_at: u.creation_timestamp}
# Loses type casting, association loading, and changeset integration
```
### When to Use
**Triggers:**
- The database column name cannot be changed (shared DB, legacy system, migration risk)
- The column name violates Elixir conventions (camelCase, abbreviations, reserved words)
- You want Ecto's type casting and query DSL to work with idiomatic Elixir field names
**Example — before:**
```elixir
# Schema mirrors the ugly DB column names directly
schema "legacy_users" do
field :emailaddress, :string
field :creation_timestamp, :utc_datetime
field :lastlogindt, :utc_datetime
end
# Every callsite uses the ugly names
from u in User, where: u.emailaddress == ^email
%{emailaddress: user.emailaddress, created: user.creation_timestamp}
```
**Example — after:**
```elixir
schema "legacy_users" do
field :email, :string, source: :emailaddress
field :inserted_at, :utc_datetime, source: :creation_timestamp
field :last_login_at, :utc_datetime, source: :lastlogindt
end
# Callsites use clean Elixir names
from u in User, where: u.email == ^email
%{email: user.email, created: user.inserted_at}
```
### When NOT to Use
**Don't use this when:**
- You control the database schema — use a migration to rename the column instead
- The column name follows Elixir conventions already — `:source` adds no value
- You're creating a new table — design the column names correctly from the start
**Over-application example:**
```elixir
# Renaming columns that are already idiomatic
schema "users" do
field :user_name, :string, source: :username # username is fine as-is
field :created_at_time, :utc_datetime, source: :inserted_at # just use inserted_at
end
```
**Better alternative:**
```elixir
schema "users" do
field :username, :string # Keep the DB name if it's already clear
timestamps() # inserted_at / updated_at are already conventional
end
```
**Why:** `:source` is a mapping layer between two names that creates a permanent translation cost every time someone reads the schema. When you control the DB, eliminate the mismatch at the migration level rather than carrying it forever in the schema.
---
## 7. `redact: true` — Protecting Sensitive Fields
**Source:** [lib/ecto/schema.ex#L128](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L128)
```elixir
schema "users" do
field :email, :string
field :password_hash, :string, redact: true
field :api_token, :string, redact: true
end
```
**Why:** When a schema has redacted fields, Ecto derives an `Inspect` implementation for the struct that leaves those fields out. Without `redact: true`, inspecting a struct or changeset in logs, IEx, or crash reports prints every field value in plain text. Marking a field with `redact: true` omits it from `inspect/2` output of the struct and shows its value as `**redacted**` when a changeset's `changes` are inspected. This is a passive, always-on protection that requires no additional code at each log site: once set on the field, the value is hidden wherever the struct or changeset is inspected. It does not affect direct access such as `user.password_hash`, so interpolating the raw value into a log message still leaks it.
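The effect on `inspect/2` can be checked directly. This is a sketch under the assumption that the `:ecto` package (a version that supports `:redact`) is available; `RedactDemo` is a hypothetical schema used only to show the output.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule RedactDemo do
  use Ecto.Schema

  # Hypothetical schema used only to show inspect output.
  schema "users" do
    field :email, :string
    field :api_token, :string, redact: true
  end
end

user = %RedactDemo{email: "alice@example.com", api_token: "s3cret-token"}

# The secret never reaches the inspected string.
String.contains?(inspect(user), "s3cret-token")
# => false

changeset =
  Ecto.Changeset.cast(%RedactDemo{}, %{"api_token" => "s3cret-token"}, [:api_token])

# Inspecting the changeset masks the pending change.
String.contains?(inspect(changeset), "**redacted**")
# => true

# The value itself is untouched and still readable by code.
Ecto.Changeset.get_change(changeset, :api_token)
# => "s3cret-token"
```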
**Anti-pattern:** Relying on callers to manually omit sensitive fields before logging:
```elixir
# BAD — easy to forget, must be repeated at every log site
def create_user(params) do
changeset = User.changeset(%User{}, params)
# Must remember to drop :password_hash before logging
safe = Map.drop(changeset.changes, [:password_hash, :api_token])
Logger.info("Creating user: #{inspect(safe)}")
Repo.insert(changeset)
end
```
### When to Use
**Triggers:**
- A field stores a secret, credential, or token (passwords, API keys, session tokens)
- A field stores PII that should not appear in logs or error reports (SSN, credit card data)
- A changeset for the schema might be logged, inspected in IEx, or appear in crash reports
- You use `Logger.info/debug` calls that could inadvertently print struct values
**Example — before:**
```elixir
# Sensitive values appear in plain text in logs and crash dumps
defmodule User do
use Ecto.Schema
schema "users" do
field :email, :string
field :password_hash, :string # Logged as "password_hash: \"$2b$12$...\""
field :reset_token, :string # Logged as "reset_token: \"abc123secret\""
end
end
```
**Example — after:**
```elixir
defmodule User do
  use Ecto.Schema

  schema "users" do
    field :email, :string
    field :password_hash, :string, redact: true
    field :reset_token, :string, redact: true
  end
end

# inspect(user) leaves out password_hash and reset_token entirely.
# inspect(changeset) shows changes to them as "**redacted**".
```
### When NOT to Use
**Don't use this when:**
- The field contains non-sensitive data: redaction hides values you will want to see in inspect output
- You need to see the field value during development debugging (temporarily remove `redact: true` locally, or use pattern matching to extract the value directly)
- The struct is used exclusively in contexts where it is never printed (pure computation)
**Over-application example:**
```elixir
# Redacting non-sensitive fields makes debugging needlessly difficult
schema "products" do
field :name, :string, redact: true # Why is a product name sensitive?
field :price, :decimal, redact: true # Price data isn't a secret
field :sku, :string, redact: true
end
```
**Better alternative:**
```elixir
schema "products" do
field :name, :string
field :price, :decimal
field :sku, :string
end
```
**Why:** `redact: true` hides field values from `inspect/2` output, which is what logs, IEx, and crash reports rely on. Overusing it on non-sensitive fields makes crash reports and debug output harder to read without adding any security benefit. Apply it precisely to fields whose values would constitute a security or privacy exposure if logged.
---
## 8. `__schema__/1` Reflection — Runtime Schema Introspection
**Source:** [lib/ecto/schema.ex#L450](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/schema.ex#L450)
```elixir
# Get all field names
User.__schema__(:fields) #=> [:id, :name, :email, :age]
User.__schema__(:associations) #=> [:posts, :comments]
User.__schema__(:type, :email) #=> :string
User.__schema__(:virtual_fields) #=> [:password]
```
**Why:** Every module that defines a schema with `use Ecto.Schema` gets `__schema__/1` and `__schema__/2` generated at compile time (the two-argument form serves lookups such as `__schema__(:type, :email)`). These reflection functions expose the schema's structure at runtime without requiring the caller to know the schema's field list at compile time. Generic code (CSV exporters, audit loggers, API serializers, test factories) can introspect any schema and work with all its fields automatically. When a new field is added to the schema, the generic code picks it up without modification.
**Anti-pattern:** Hardcoding field names in generic helpers:
```elixir
# BAD — must be updated every time any schema gains or loses a field
def audit_changes(changeset) do
fields = [:name, :email, :role] # Hardcoded — User-specific, not generic
Enum.each(fields, fn field ->
if Map.has_key?(changeset.changes, field) do
AuditLog.record(field, changeset.changes[field])
end
end)
end
```
### When to Use
**Triggers:**
- Writing a helper that must work across multiple schema modules without knowing their fields
- Building a generic CSV exporter, JSON serializer, or audit logger
- Generating test factories or seed data for any schema
- Writing a linter or validation tool that checks schema conventions
**Example — before:**
```elixir
# Must manually maintain field lists for each schema
defmodule CsvExporter do
def export_users(records) do
headers = ["id", "name", "email", "age"]
rows = Enum.map(records, fn u -> [u.id, u.name, u.email, u.age] end)
[headers | rows]
end
def export_posts(records) do
headers = ["id", "title", "body", "user_id"]
rows = Enum.map(records, fn p -> [p.id, p.title, p.body, p.user_id] end)
[headers | rows]
end
end
```
**Example — after:**
```elixir
defmodule CsvExporter do
def export(schema, records) do
fields = schema.__schema__(:fields)
headers = Enum.map(fields, &to_string/1)
rows = Enum.map(records, fn record ->
Enum.map(fields, &Map.get(record, &1))
end)
[headers | rows]
end
end
# Works for any schema without modification
CsvExporter.export(User, users)
CsvExporter.export(Post, posts)
```
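A generic exporter like this one will also emit fields marked `redact: true`, since reflection ignores that option. One possible guard, sketched below under the assumption that the `:ecto` package is available, subtracts `__schema__(:redact_fields)` from the field list. `Account` and `SafeExporter` are hypothetical names used for illustration.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule Account do
  use Ecto.Schema

  # Hypothetical schema with one virtual and one redacted field.
  schema "accounts" do
    field :name, :string
    field :password, :string, virtual: true
    field :token, :string, redact: true
  end
end

defmodule SafeExporter do
  # Persisted fields minus anything the schema marks as redacted.
  def exportable_fields(schema) do
    schema.__schema__(:fields) -- schema.__schema__(:redact_fields)
  end

  def export(schema, records) do
    fields = exportable_fields(schema)
    headers = Enum.map(fields, &Atom.to_string/1)
    rows = Enum.map(records, fn record -> Enum.map(fields, &Map.get(record, &1)) end)
    [headers | rows]
  end
end

# Virtual fields are already absent; the redacted one is still listed.
Account.__schema__(:fields)
# => [:id, :name, :token]

SafeExporter.export(Account, [%Account{id: 1, name: "Ann", token: "t0k"}])
# => [["id", "name"], [1, "Ann"]]
```

This only covers fields the schema author remembered to mark; for a public API an explicit allowlist remains the more conservative choice.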
### When NOT to Use
**Don't use this when:**
- You're writing code for one specific schema and the fields are known and stable — just use them directly
- You need to control exactly which fields are exposed (e.g., public API — use an explicit allowlist)
- The schema has sensitive fields that should not be included: `__schema__(:fields)` leaves out virtual fields but still returns fields marked `redact: true`
**Over-application example:**
```elixir
# Using reflection for a single known schema adds indirection with no benefit
def display_user(user) do
User.__schema__(:fields)
|> Enum.map(fn field -> "#{field}: #{Map.get(user, field)}" end)
|> Enum.join(", ")
# Dumps :password_hash, :reset_token, and every internal field
end
```
**Better alternative:**
```elixir
# For a known schema, use explicit fields — safer and clearer intent
def display_user(user) do
"#{user.name} <#{user.email}>"
end
```
**Why:** `__schema__(:fields)` returns all persisted fields including internal ones (`inserted_at`, `updated_at`, foreign keys). For display or serialization of a known schema, an explicit field list documents intent and prevents sensitive data from leaking. Use reflection for genuinely generic code where the alternative is a hardcoded list that diverges from the schema over time.
---
## Decision Tree
- If the app uses UUID primary keys across all schemas → create a base schema module with `@primary_key {:id, :binary_id, autogenerate: true}` and `@foreign_key_type :binary_id`
- If a table has no single primary key (join table, composite key) → `@primary_key false`
- If a field holds runtime-only data (password input, confirmation field, computed display value) → `virtual: true`
- If you need changeset validation without a DB table (form objects, API request bodies) → `embedded_schema`
- If timestamps need UTC compliance or microsecond precision → set `@timestamps_opts [type: :utc_datetime_usec]` in the base schema module
- If a DB column name doesn't match Elixir conventions and can't be migrated → `:source` option on `field/3`
- If a field holds sensitive data (passwords, tokens, PII) that must not appear in logs → `redact: true`
- If writing generic code that works across schema modules without knowing their fields → `__schema__/1` reflection
<!-- PATTERN_COMPLETE -->
+46
@@ -2,6 +2,29 @@
Patterns extracted from the Elixir standard library source code — how the core team writes and organizes tests.
## Contents
1. [Module-Level Async Declaration](#1-module-level-async-declaration)
2. [Parameterized Tests](#2-parameterized-tests)
3. [Setup with `start_supervised/2`](#3-setup-with-start_supervised2)
4. [Named Setup Functions (Composable Pipelines)](#4-named-setup-functions-composable-pipelines)
5. [`on_exit` for Reversing Global Side Effects](#5-on_exit-for-reversing-global-side-effects)
6. [Pattern Match Assertions](#6-pattern-match-assertions)
7. [`assert_receive` / `refute_receive` for Process Communication](#7-assert_receive--refute_receive-for-process-communication)
8. [Testing GenServers via Public API (No Internal State Inspection)](#8-testing-genservers-via-public-api-no-internal-state-inspection)
9. [`catch_exit` for Testing Process Failures](#9-catch_exit-for-testing-process-failures)
10. [`@tag capture_log: true` for Suppressing Expected Log Output](#10-tag-capture_log-true-for-suppressing-expected-log-output)
11. [`capture_log` / `capture_io` for Content Assertions](#11-capture_log--capture_io-for-content-assertions)
12. [`describe` Blocks for Logical Grouping](#12-describe-blocks-for-logical-grouping)
13. [`ExUnit.CaseTemplate` for Shared Test Infrastructure](#13-exunitcasetemplate-for-shared-test-infrastructure)
14. [`doctest` Integration](#14-doctest-integration)
15. [`Process.sleep(:infinity)` as a Process Parking Pattern](#15-processsleepinfinity-as-a-process-parking-pattern)
16. [Helper Functions for Test-Specific Behavior](#16-helper-functions-for-test-specific-behavior)
17. [`@tag :tmp_dir` for Filesystem Tests](#17-tag-tmp_dir-for-filesystem-tests)
18. [`assert_raise` with Message Matching](#18-assert_raise-with-message-matching)
19. [`@moduletag` / `@describetag` for Cross-Cutting Configuration](#19-moduletag--describetag-for-cross-cutting-configuration)
20. [Context Pattern Matching in Test Signatures](#20-context-pattern-matching-in-test-signatures)
---
## 1. Module-Level Async Declaration
@@ -1725,4 +1748,27 @@ end
**Why:** Context destructuring signals "this test depends on external setup." If the test is self-contained, the pattern match is misleading — readers will look for setup that doesn't exist or isn't needed.
## Decision Tree
- If you are creating a new test module and need to decide on concurrency → [Module-Level Async Declaration](#1-module-level-async-declaration)
- If the same logic must work across multiple configurations or backends → [Parameterized Tests](#2-parameterized-tests)
- If your test needs a running process with guaranteed cleanup → [Setup with `start_supervised/2`](#3-setup-with-start_supervised2)
- If setup has multiple independent steps that different describe blocks reuse → [Named Setup Functions](#4-named-setup-functions-composable-pipelines)
- If your test modifies global state that must be restored regardless of outcome → [`on_exit` for Reversing Global Side Effects](#5-on_exit-for-reversing-global-side-effects)
- If you care about the shape/structure of a result but not every field → [Pattern Match Assertions](#6-pattern-match-assertions)
- If you need to test asynchronous message delivery between processes → [`assert_receive` / `refute_receive`](#7-assert_receive--refute_receive-for-process-communication)
- If you are testing a GenServer and want tests that survive refactoring → [Testing GenServers via Public API](#8-testing-genservers-via-public-api-no-internal-state-inspection)
- If you need to assert on OTP exit signals (timeouts, noproc, shutdown) → [`catch_exit`](#9-catch_exit-for-testing-process-failures)
- If tests intentionally trigger error paths that produce noisy log output → [`@tag capture_log: true`](#10-tag-capture_log-true-for-suppressing-expected-log-output)
- If you need to verify specific log or IO content was emitted → [`capture_log` / `capture_io`](#11-capture_log--capture_io-for-content-assertions)
- If a module tests multiple public functions and needs logical organization → [`describe` Blocks](#12-describe-blocks-for-logical-grouping)
- If multiple test modules share the same setup/teardown infrastructure → [`ExUnit.CaseTemplate`](#13-exunitcasetemplate-for-shared-test-infrastructure)
- If your module has `iex>` examples that should be verified automatically → [`doctest` Integration](#14-doctest-integration)
- If you need an inert process that exists only to be observed or killed → [`Process.sleep(:infinity)`](#15-processsleepinfinity-as-a-process-parking-pattern)
- If the same 3-5 line test pattern repeats across multiple tests → [Helper Functions](#16-helper-functions-for-test-specific-behavior)
- If tests create or modify files and need filesystem isolation → [`@tag :tmp_dir`](#17-tag-tmp_dir-for-filesystem-tests)
- If you need to verify both the exception type and the user-facing message → [`assert_raise` with Message Matching](#18-assert_raise-with-message-matching)
- If tests only run on certain platforms or you want to filter subsets → [`@moduletag` / `@describetag`](#19-moduletag--describetag-for-cross-cutting-configuration)
- If you want to make test dependencies on setup context explicit → [Context Pattern Matching](#20-context-pattern-matching-in-test-signatures)
<!-- PATTERN_COMPLETE -->
+723
@@ -0,0 +1,723 @@
# Ecto Type Patterns
Patterns extracted from Ecto's type system source code.
## Contents
1. [`use Ecto.Type` — The Four-Callback Custom Type](#1-use-ectotype--the-four-callback-custom-type)
2. [`embed_as/1` — Controlling Embedded Serialization](#2-embed_as1--controlling-embedded-serialization)
3. [`equal?/2` — Custom Equality for Change Detection](#3-equal2--custom-equality-for-change-detection)
4. [`Ecto.Enum` — Constrained Atom Fields](#4-ectoenum--constrained-atom-fields)
5. [`Ecto.ParameterizedType` — Types with Options](#5-ectoparameterizedtype--types-with-options)
6. [Schemaless Types — `{data, types}` Changesets](#6-schemaless-types--data-types-changesets)
---
## 1. `use Ecto.Type` — The Four-Callback Custom Type
**Source:** [lib/ecto/type.ex#L57-L89](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/type.ex#L57)
The canonical custom type implements exactly four callbacks. Each has a distinct role in the data flow:
- `type/0` — declares the underlying database column type (`:string`, `:map`, `:integer`, etc.)
- `cast/1` — converts external or user-supplied values into your Elixir type; called by `Ecto.Changeset.cast/4`
- `load/1` — converts a raw database value into your Elixir type; called when reading rows
- `dump/1` — converts your Elixir type into a database-compatible value; called when writing
The state diagram is: external --[cast]--> internal --[dump]--> database --[load]--> internal
```elixir
defmodule EctoURI do
use Ecto.Type
def type, do: :map
def cast(uri) when is_binary(uri), do: {:ok, URI.parse(uri)}
def cast(%URI{} = uri), do: {:ok, uri}
def cast(_), do: :error
def load(data) when is_map(data) do
data = for {key, val} <- data, do: {String.to_existing_atom(key), val}
{:ok, struct!(URI, data)}
end
def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
def dump(_), do: :error
end
```
**Why:** Ecto calls each callback at a different point in the lifecycle. Without `cast/1`, user input passes through as raw strings. Without `load/1`, the value comes out of the DB as a plain map. Without `dump/1`, Ecto cannot serialize the struct for storage. Implementing all four closes every gap in the round-trip.
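The round trip can be exercised without a database through the `Ecto.Type` helper functions. This is a sketch, assuming the `:ecto` package is available; `EctoURI` is restated so the block runs on its own, and the string-keyed map stands in for what a JSON column would return.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule EctoURI do
  use Ecto.Type
  def type, do: :map

  def cast(uri) when is_binary(uri), do: {:ok, URI.parse(uri)}
  def cast(%URI{} = uri), do: {:ok, uri}
  def cast(_), do: :error

  def load(data) when is_map(data) do
    data = for {key, val} <- data, do: {String.to_existing_atom(key), val}
    {:ok, struct!(URI, data)}
  end

  def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
  def dump(_), do: :error
end

# external --[cast]--> internal
{:ok, uri} = Ecto.Type.cast(EctoURI, "https://example.com/docs")

# internal --[dump]--> database (a plain map with atom keys)
{:ok, dumped} = Ecto.Type.dump(EctoURI, uri)

# Simulate what a JSON column hands back: the same map with string keys.
stored = Map.new(dumped, fn {key, val} -> {Atom.to_string(key), val} end)

# database --[load]--> internal
{:ok, loaded} = Ecto.Type.load(EctoURI, stored)
loaded == uri
# => true

# Values the type does not understand are rejected at the cast step.
Ecto.Type.cast(EctoURI, 123)
# => :error
```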
**Anti-pattern:** Relying on the database to store your Elixir struct as-is, or using `:map` with manual pre/post processing scattered across contexts:
```elixir
# BAD — manual conversion at every call site instead of centralizing in a type
post = Repo.get!(Post, id)
fields = for {key, val} <- post.url, do: {String.to_existing_atom(key), val}
uri = struct!(URI, fields) # repeated everywhere
Repo.update(Post.changeset(post, %{url: Map.from_struct(new_uri)}))
```
### When to Use
**Triggers:**
- You want to store an Elixir struct (URI, Decimal, custom struct) in a single database column
- Cast/load/dump logic would otherwise be duplicated across changesets and contexts
- You want Ecto changesets to accept raw user input and produce the correct Elixir type automatically
**Example — before:**
```elixir
# Manual conversion in every changeset
def changeset(post, params) do
  post
  |> cast(params, [:url])
  |> validate_change(:url, fn :url, val ->
    case URI.parse(val) do
      %URI{host: nil} -> [url: "must be a valid URI"]
      _ -> []
    end
  end)
end

# Manual conversion when reading
def get_post(id) do
  post = Repo.get!(Post, id)
  # A JSON map column loads with string keys, so atomize them by hand
  fields = for {key, val} <- post.url, do: {String.to_existing_atom(key), val}
  %{post | url: struct!(URI, fields)}
end
```
**Example — after:**
```elixir
# schema declaration — one line
field :url, EctoURI
# changeset just casts; EctoURI.cast/1 validates shape
def changeset(post, params), do: cast(post, params, [:url])
# reading returns %URI{} automatically via EctoURI.load/1
def get_post(id), do: Repo.get!(Post, id)
```
### When NOT to Use
**Don't use this when:**
- The transformation is trivial (storing a plain string that needs no parsing)
- The field contains highly variable or polymorphic data better handled by embedded schemas
- You need association semantics (use Ecto associations, not custom types)
**Over-application example:**
```elixir
# Overkill — a custom type just to upcase strings
defmodule UppercaseString do
use Ecto.Type
def type, do: :string
def cast(v) when is_binary(v), do: {:ok, String.upcase(v)}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
end
```
**Better alternative:**
```elixir
# Changeset validation handles this cleanly without a new type module
def changeset(record, params) do
record
|> cast(params, [:code])
|> update_change(:code, &String.upcase/1)
end
```
**Why:** Custom types are for encapsulating a serialization contract that must be enforced consistently at every read and write. For one-off transformations or validations, changeset functions are simpler and more obvious to future readers.
---
## 2. `embed_as/1` — Controlling Embedded Serialization
**Source:** [lib/ecto/type.ex#L104-L109](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/type.ex#L104)
When a custom type is used inside an `embeds_one` or `embeds_many` field, Ecto calls `embed_as/1` to decide whether to pass the value through `dump/1` or treat it as its own serialized form. The callback receives the embed format (`:json` by default) and returns either `:self` or `:dump`.
- `:self`: the in-memory value is used as-is without calling `dump/1`; appropriate when the runtime representation is already JSON-compatible (scalars, plain maps)
- `:dump`: `dump/1` is called to serialize the value before encoding; needed when the runtime representation (e.g., a struct) is not directly JSON-serializable
`use Ecto.Type` provides a default implementation that returns `:self`. Override it to return `:dump` when your type holds an Elixir struct or other value that cannot be directly encoded to JSON.
```elixir
defmodule EctoURI do
  use Ecto.Type
  # type/0, cast/1, load/1, and dump/1 are unchanged from Pattern 1 and omitted here

  # Override: %URI{} is not JSON-serializable, so run dump/1 in embedded contexts
  def embed_as(_format), do: :dump
end
```
**Why:** When Ecto builds embedded documents for export (e.g., storing a JSON blob), it needs to know whether to trust the in-memory value or to re-serialize it. If your type holds state in an Elixir struct that cannot be stored directly (like `%URI{}`), the default `:self` passes the raw struct to the JSON encoder — which either raises or produces garbage like `%{__struct__: "Elixir.URI", host: ..., ...}`. Returning `:dump` ensures `dump/1` converts it to a clean map first.
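The difference between the two return values can be observed with `Ecto.Type.embed_as/2` and `Ecto.Type.embedded_dump/3`. This is a sketch, assuming the `:ecto` package is available and that `embedded_dump/3` consults `embed_as/1` as described above. `DemoURI` and `DemoURIDefault` are hypothetical types that differ only in the override.

```elixir
# Assumes the :ecto package is available (Mix project or Mix.install).
defmodule DemoURI do
  use Ecto.Type
  def type, do: :map
  def cast(%URI{} = uri), do: {:ok, uri}
  def cast(_), do: :error
  # Loading is not exercised in this sketch.
  def load(_), do: :error
  def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
  def dump(_), do: :error

  # Run dump/1 when the value sits inside an embed.
  def embed_as(_format), do: :dump
end

defmodule DemoURIDefault do
  use Ecto.Type
  def type, do: :map
  def cast(%URI{} = uri), do: {:ok, uri}
  def cast(_), do: :error
  def load(_), do: :error
  def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
  def dump(_), do: :error
  # No embed_as/1 override: the default from `use Ecto.Type` applies.
end

uri = URI.parse("https://example.com")

Ecto.Type.embed_as(DemoURIDefault, :json)
# => :self
Ecto.Type.embed_as(DemoURI, :json)
# => :dump

# With :self the struct is handed on untouched; with :dump it becomes a map.
{:ok, kept} = Ecto.Type.embedded_dump(DemoURIDefault, uri, :json)
{:ok, dumped} = Ecto.Type.embedded_dump(DemoURI, uri, :json)
{is_struct(kept, URI), is_struct(dumped)}
# => {true, false}
```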
**Anti-pattern:** Assuming the default `:self` is correct for a type whose in-memory and storable representations differ, then debugging mysterious embedded schema corruption:
```elixir
# BAD — %URI{} stored as a raw struct map because embed_as/1 was never considered
embeds_one :metadata, Metadata do
field :canonical_url, EctoURI # stored as %{__struct__: "Elixir.URI", host: ..., ...}
end
```
### When to Use
**Triggers:**
- Your custom type is used inside `embeds_one` or `embeds_many` fields
- The Elixir representation and the storable representation differ (struct vs plain map, etc.)
- You observe that embedded fields are persisted with unexpected shapes
**Example — before:**
```elixir
# embed_as/1 not considered; embedded URL stored as raw struct map
defmodule EctoURI do
use Ecto.Type
def type, do: :map
def cast(uri) when is_binary(uri), do: {:ok, URI.parse(uri)}
def cast(%URI{} = uri), do: {:ok, uri}
def cast(_), do: :error
def load(data) when is_map(data) do
data = for {key, val} <- data, do: {String.to_existing_atom(key), val}
{:ok, struct!(URI, data)}
end
def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
def dump(_), do: :error
end
```
**Example — after:**
```elixir
defmodule EctoURI do
use Ecto.Type
def type, do: :map
def cast(uri) when is_binary(uri), do: {:ok, URI.parse(uri)}
def cast(%URI{} = uri), do: {:ok, uri}
def cast(_), do: :error
def load(data) when is_map(data) do
data = for {key, val} <- data, do: {String.to_existing_atom(key), val}
{:ok, struct!(URI, data)}
end
def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}
def dump(_), do: :error
# Explicit: ensure dump/1 runs in embedded contexts to produce a clean map
def embed_as(_format), do: :dump
end
```
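The `dump/1` and `load/1` pair above has to survive a trip through JSON, where atom keys come back as strings. The sketch below uses only the standard library to check that round trip; `URIRoundTrip` and its `stringify_keys/1` helper are hypothetical stand-ins for the type module and for a real JSON encode/decode cycle.

```elixir
defmodule URIRoundTrip do
  # Mirrors dump/1 from the example: struct -> plain map with atom keys.
  def dump(%URI{} = uri), do: {:ok, Map.from_struct(uri)}

  # Mirrors load/1: string-keyed map (as decoded from JSON) -> struct.
  def load(data) when is_map(data) do
    fields = for {key, val} <- data, do: {String.to_existing_atom(key), val}
    {:ok, struct!(URI, fields)}
  end

  # Stand-in for JSON encoding and decoding: atom keys become strings.
  def stringify_keys(map) do
    Map.new(map, fn {key, val} -> {Atom.to_string(key), val} end)
  end
end

uri = URI.parse("https://example.com/docs?page=2")

{:ok, dumped} = URIRoundTrip.dump(uri)
{:ok, loaded} = dumped |> URIRoundTrip.stringify_keys() |> URIRoundTrip.load()

IO.inspect(Map.has_key?(dumped, :__struct__), label: "dumped map still tagged as struct")
IO.inspect(loaded == uri, label: "round trip preserves the URI")
```

`String.to_existing_atom/1` is safe here because every key produced by `dump/1` is a field of the `URI` struct, so the atoms already exist.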
### When NOT to Use
**Don't use this when:**
- Your type is never used in embedded schemas (the callback has no effect)
- The in-memory and storable representations are the same (plain maps, scalars)
- You intentionally want to skip `dump/1` in embedded contexts (the default `:self` already does this)
**Over-application example:**
```elixir
# Overriding embed_as/1 on a type that stores plain integers
defmodule PriorityLevel do
use Ecto.Type
def type, do: :integer
def cast(v) when is_integer(v) and v in 1..5, do: {:ok, v}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
def embed_as(_), do: :self # Pointless — integer is its own storable form
end
```
**Better alternative:**
```elixir
# Default embed_as/1 from `use Ecto.Type` is sufficient for scalar types
defmodule PriorityLevel do
use Ecto.Type
def type, do: :integer
def cast(v) when is_integer(v) and v in 1..5, do: {:ok, v}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
end
```
**Why:** `embed_as/1` is only meaningful when there is a gap between what Ecto holds in memory and what must be written to the database. For scalar types where the two representations are identical, the default suffices and adding the override is noise.
---
## 3. `equal?/2` — Custom Equality for Change Detection
**Source:** [lib/ecto/type.ex](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/type.ex)
Ecto calls `equal?/2` to decide whether a field's value has changed before including it in an `UPDATE` statement. The default delegates to `==`. Override when your type's notion of equality is more nuanced than structural term equality.
Common cases: `Decimal` values where `Decimal.equal?/2` differs from `==`, sets or maps where insertion order shouldn't matter, floats with tolerance, URIs where trailing slashes are equivalent.
```elixir
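defmodule EctoDecimal do
use Ecto.Type
def type, do: :string
# Decimal.cast/1 returns {:ok, decimal} or :error instead of raising
def cast(v), do: Decimal.cast(v)
def load(v), do: Decimal.cast(v)
def dump(%Decimal{} = d), do: {:ok, Decimal.to_string(d)}
def dump(_), do: :error
# Compare numerically: 1.0 and 1.00 are the same value
def equal?(%Decimal{} = a, %Decimal{} = b), do: Decimal.equal?(a, b)
def equal?(a, b), do: a == b
end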
```
**Why:** `Decimal.new("1.0") == Decimal.new("1.00")` is `false` in Elixir because the two structs carry different coefficient and exponent fields, even though they represent the same number. Without a custom `equal?/2`, a changeset that receives `"1.00"` for a field currently holding `1.0` records a change, and Ecto issues an `UPDATE` that modifies nothing meaningful. The custom implementation delegates to the type's own equality semantics, preventing unnecessary database writes.
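The gap between structural and numeric equality is easy to reproduce outside Ecto. A minimal sketch, assuming the `decimal` package (2.x) can be fetched with `Mix.install`:

```elixir
Mix.install([{:decimal, "~> 2.0"}])

a = Decimal.new("1.0")
b = Decimal.new("1.00")

# Structural comparison: the structs differ in coefficient and exponent.
IO.inspect(a == b, label: "==")

# Numeric comparison: both represent the number one.
IO.inspect(Decimal.equal?(a, b), label: "Decimal.equal?/2")
```

This is the same comparison the default `equal?/2` performs (the first line) versus the one a custom `equal?/2` can delegate to (the second).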
**Anti-pattern:** Allowing Ecto to generate spurious UPDATEs because `==` disagrees with logical equality:
```elixir
# BAD: default equal?/2 uses ==
# Decimal.new("1.0") != Decimal.new("1.00") structurally,
# so casting "1.00" over a stored 1.0 marks the field as changed
defmodule EctoDecimal do
use Ecto.Type
def type, do: :string
def cast(v), do: {:ok, Decimal.new(v)}
def load(v), do: {:ok, Decimal.new(v)}
def dump(v), do: {:ok, Decimal.to_string(v)}
# Missing equal?/2 — spurious UPDATEs in production
end
```
### When to Use
**Triggers:**
- Your type wraps a value with non-structural equality (Decimal, Set, custom structs with computed fields)
- You see unexpected `UPDATE` queries in your logs when no meaningful data changed
- The type's own library provides an equality function (e.g., `Decimal.equal?/2`)
**Example — before:**
```elixir
# Tags stored as a sorted list — ["a", "b"] and ["b", "a"] treated as different
defmodule TagList do
use Ecto.Type
def type, do: {:array, :string}
def cast(tags) when is_list(tags), do: {:ok, Enum.sort(tags)}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
# Missing equal?/2 — if DB returns unsorted list, it always appears changed
end
```
**Example — after:**
```elixir
defmodule TagList do
use Ecto.Type
def type, do: {:array, :string}
def cast(tags) when is_list(tags), do: {:ok, Enum.sort(tags)}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
def equal?(a, b) when is_list(a) and is_list(b) do
Enum.sort(a) == Enum.sort(b)
end
def equal?(a, b), do: a == b
end
```
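To see the effect on change detection, the same params can be cast against a type with and without an order-insensitive `equal?/2`. This is a sketch under stated assumptions: Ecto 3.x via `Mix.install`, a schemaless changeset in place of a real schema, and a hypothetical `UnorderedTags` type whose `cast/1` deliberately does not sort so that only `equal?/2` is responsible for the result.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

# Hypothetical type: order of tags is irrelevant for equality.
defmodule UnorderedTags do
  use Ecto.Type
  def type, do: {:array, :string}
  def cast(tags) when is_list(tags), do: {:ok, tags}
  def cast(_), do: :error
  def load(v), do: {:ok, v}
  def dump(v), do: {:ok, v}

  def equal?(a, b) when is_list(a) and is_list(b), do: Enum.sort(a) == Enum.sort(b)
  def equal?(a, b), do: a == b
end

current = %{tags: ["a", "b"]}
params = %{"tags" => ["b", "a"]}

# Custom equal?/2: the reordered list is treated as unchanged.
with_custom = Ecto.Changeset.cast({current, %{tags: UnorderedTags}}, params, [:tags])

# Built-in array type: element order matters, so a change is recorded.
with_builtin = Ecto.Changeset.cast({current, %{tags: {:array, :string}}}, params, [:tags])

IO.inspect(with_custom.changes, label: "custom equal?/2")
IO.inspect(with_builtin.changes, label: "built-in array")
```

An empty `changes` map means a later `Repo.update/1` would have nothing to write for that field.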
### When NOT to Use
**Don't use this when:**
- Structural equality (`==`) already matches your type's logical equality (most scalars, plain maps)
- You intentionally want every change in representation to trigger an update (audit fields, version counters)
- The equality logic would require database queries or I/O
**Over-application example:**
```elixir
# Overriding equal?/2 for a plain string type — == is already correct
defmodule TrimmedString do
use Ecto.Type
def type, do: :string
def cast(v) when is_binary(v), do: {:ok, String.trim(v)}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
def equal?(a, b), do: String.trim(a) == String.trim(b) # misleading
end
```
**Better alternative:**
```elixir
# cast/1 already normalizes; == on the normalized value is correct
defmodule TrimmedString do
use Ecto.Type
def type, do: :string
def cast(v) when is_binary(v), do: {:ok, String.trim(v)}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
# Default equal?/2 is correct — both sides are already trimmed
end
```
**Why:** Once `cast/1` normalizes values to a canonical form, `==` on that canonical form is correct. Overriding `equal?/2` to re-normalize creates two sources of truth for what "equal" means and can hide bugs where values escape normalization.
---
## 4. `Ecto.Enum` — Constrained Atom Fields
**Source:** [lib/ecto/enum.ex](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/enum.ex)
`Ecto.Enum` is Ecto's built-in parameterized type for fields that hold one of a fixed set of values. It stores atoms as strings in the database, validates membership automatically during `cast/4`, and provides a clean schema-level declaration of what values are legal.
```elixir
schema "orders" do
field :status, Ecto.Enum, values: [:pending, :processing, :shipped, :delivered, :cancelled]
end
# Changeset validation is automatic:
# cast will reject values outside the enum
# DB stores as "pending", "processing", etc.
# Can also map atoms to custom DB values:
field :status, Ecto.Enum,
values: [pending: "PENDING", shipped: "SHIPPED"]
```
**Why:** A fixed set of valid values is a domain constraint that belongs at the type level, not scattered across changeset validations. `Ecto.Enum` co-locates the constraint with the field declaration, making it impossible to add a new status without updating the schema. It also avoids the string-vs-atom impedance mismatch: your application code works with atoms, the DB stores strings.
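The casting and introspection behaviour described above can be tried without a database. A minimal sketch, assuming Ecto 3.x via `Mix.install`; `OrderSketch` is a hypothetical embedded schema used so that no Repo is needed.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule OrderSketch do
  use Ecto.Schema
  import Ecto.Changeset

  # Embedded schema so the sketch needs no database or Repo.
  @primary_key false
  embedded_schema do
    field :status, Ecto.Enum, values: [:pending, :shipped, :delivered]
  end

  def changeset(params), do: cast(%__MODULE__{}, params, [:status])
end

accepted = OrderSketch.changeset(%{"status" => "shipped"})
rejected = OrderSketch.changeset(%{"status" => "lost"})

IO.inspect(accepted.changes, label: "accepted changes")
IO.inspect(rejected.valid?, label: "rejected valid?")

# The declared values are introspectable at runtime.
IO.inspect(Ecto.Enum.values(OrderSketch, :status), label: "atoms")
IO.inspect(Ecto.Enum.dump_values(OrderSketch, :status), label: "stored as")
```

The string `"shipped"` arrives from params, is cast to the atom `:shipped` for application code, and would be dumped back to a string for storage.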
**Anti-pattern:** Using `validate_inclusion` with a hardcoded list when the field is a fixed enum. The schema declares only `:string`, so the valid values live solely in changeset code, every changeset for the schema has to repeat them, the copies drift over time, and nothing guarantees that the DB stores a normalized form:
```elixir
# BAD — validation is disconnected from the field type
schema "orders" do
field :status, :string
end
def changeset(order, params) do
order
|> cast(params, [:status])
|> validate_inclusion(:status, ["pending", "processing", "shipped"])
# Atoms vs strings already a smell; easy to add a value in one place but not the other
end
```
### When to Use
**Triggers:**
- A field can only hold one of a fixed, small set of named values (status, role, priority, state machine states)
- You want `cast/4` to reject invalid values without writing a custom validator
- You want the valid values to be introspectable at runtime via `Ecto.Enum.values/2`
**Example — before:**
```elixir
schema "tickets" do
field :priority, :string
end
@valid_priorities ~w(low medium high critical)
def changeset(ticket, params) do
ticket
|> cast(params, [:priority])
|> validate_required([:priority])
|> validate_inclusion(:priority, @valid_priorities)
end
```
**Example — after:**
```elixir
schema "tickets" do
field :priority, Ecto.Enum, values: [:low, :medium, :high, :critical]
end
def changeset(ticket, params) do
ticket
|> cast(params, [:priority])
|> validate_required([:priority])
# validate_inclusion is implicit — cast rejects invalid values
end
```
### When NOT to Use
**Don't use this when:**
- The set of valid values is dynamic (loaded from the database, configurable at runtime)
- You need rich metadata per value (labels, descriptions, ordering weights) — a separate table or config map is better
- The field is an open-ended string that happens to have common values (tags, categories that grow)
**Over-application example:**
```elixir
# Enum for country codes that may expand and need localization
field :country, Ecto.Enum, values: [:us, :gb, :de, :fr, :jp]
# Adding a new country requires a code change and a redeploy
# (plus a migration if the column uses a native database enum type)
```
**Better alternative:**
```elixir
# Plain string column that references a countries table
field :country_code, :string
# Validated against the countries table at the application layer, or by a foreign key constraint
```
**Why:** `Ecto.Enum` hard-codes valid values into the schema module. When the list is stable and small (status machines, role levels), that is exactly right. When the list is user-managed or requires non-code changes to extend, a reference table decouples the constraint from deployments.
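When the allowed values are only known at runtime, the check can move into the changeset function and take the current list as an argument. The sketch below assumes Ecto 3.x via `Mix.install`; `AddressParams` and the `allowed_codes` argument are hypothetical, and the caller is assumed to have loaded the codes from wherever they are managed (for example, a countries table).

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule AddressParams do
  import Ecto.Changeset

  @types %{country_code: :string}

  # `allowed_codes` is supplied by the caller at runtime, not baked into the schema.
  def changeset(params, allowed_codes) do
    {%{}, @types}
    |> cast(params, [:country_code])
    |> validate_required([:country_code])
    |> validate_inclusion(:country_code, allowed_codes)
  end
end

allowed = ["us", "de"]

known = AddressParams.changeset(%{"country_code" => "de"}, allowed)
unknown = AddressParams.changeset(%{"country_code" => "xx"}, allowed)

IO.inspect(known.valid?, label: "de")
IO.inspect(unknown.valid?, label: "xx")
```

Extending the list is then a data change rather than a deployment.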
---
## 5. `Ecto.ParameterizedType` — Types with Options
**Source:** [lib/ecto/parameterized_type.ex](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/parameterized_type.ex)
When a custom type needs configuration options set at field definition time (like `Ecto.Enum`'s `values:` option), implement `Ecto.ParameterizedType` instead of `Ecto.Type`. The `init/1` callback receives field options at compile/load time and returns a `params` term that is threaded through every other callback at runtime.
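The five required callbacks are `init/1`, `type/1`, `cast/2`, `load/3`, and `dump/3`. `equal?/3` and `embed_as/2` are optional and receive default implementations from `use Ecto.ParameterizedType`.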
```elixir
defmodule MyApp.StatusType do
use Ecto.ParameterizedType
def init(opts) do
validate = Keyword.fetch!(opts, :validate)
%{validate: validate}
end
def type(_params), do: :string
def cast(value, %{validate: validator}) do
if validator.(value), do: {:ok, value}, else: :error
end
def load(value, _loader, _params), do: {:ok, value}
def dump(value, _dumper, _params), do: {:ok, value}
end
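# Usage. MyApp.Status.valid?/1 is a hypothetical function; a remote capture is used
# because the params returned by init/1 are stored in the compiled schema,
# where an anonymous function such as &(&1 in ~w(a b c)) cannot be escaped.
field :status, MyApp.StatusType, validate: &MyApp.Status.valid?/1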
```
**Why:** `Ecto.Type` callbacks receive no configuration — the type module is a singleton with fixed behavior. `Ecto.ParameterizedType` solves the case where the same type module should behave differently per field (different valid values, different validation rules, different storage formats). `Ecto.Enum` itself is implemented as a parameterized type so the same module handles every `values:` list.
**Anti-pattern:** Defining a new type module for every variation of behavior that differs only in configuration:
```elixir
# BAD — combinatorial explosion of modules
defmodule StatusType.V1 do
use Ecto.Type
@valid ~w(draft published)
def cast(v) when v in @valid, do: {:ok, v}
def cast(_), do: :error
# ...
end
defmodule StatusType.V2 do
use Ecto.Type
@valid ~w(open in_progress closed)
# identical code, different @valid
end
```
### When to Use
**Triggers:**
- The same type logic should apply to multiple fields but with different configuration per field
- You are building a reusable library type that schema authors customize via options
- The configuration is known at schema definition time (compile-time or application startup)
**Example — before:**
```elixir
# A separate module per enum — doesn't scale
defmodule OrderStatus do
use Ecto.Type
def type, do: :string
def cast(v) when v in ~w(pending shipped), do: {:ok, v}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
end
defmodule TicketPriority do
use Ecto.Type
def type, do: :string
def cast(v) when v in ~w(low high), do: {:ok, v}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
end
```
**Example — after:**
```elixir
defmodule ConstrainedString do
use Ecto.ParameterizedType
def init(opts), do: %{values: Keyword.fetch!(opts, :values)}
def type(_params), do: :string
# Guards only allow `in` with a compile-time list or range, so check in the body
def cast(value, %{values: values}) do
if value in values, do: {:ok, value}, else: :error
end
def load(value, _, _), do: {:ok, value}
def dump(value, _, _), do: {:ok, value}
end
# In schemas:
field :status, ConstrainedString, values: ~w(pending shipped)
field :priority, ConstrainedString, values: ~w(low high)
```
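A parameterized type can also be exercised outside a schema by initializing it manually with `Ecto.ParameterizedType.init/2`, which is convenient for quick checks and schemaless changesets. A minimal sketch, assuming Ecto 3.x via `Mix.install`; the module repeats the `ConstrainedString` idea with the membership test in the function body.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule ConstrainedString do
  use Ecto.ParameterizedType

  # Runs once per field definition; the result is passed to every other callback.
  def init(opts), do: %{values: Keyword.fetch!(opts, :values)}

  def type(_params), do: :string

  def cast(value, %{values: values}) do
    if value in values, do: {:ok, value}, else: :error
  end

  def load(value, _loader, _params), do: {:ok, value}
  def dump(value, _dumper, _params), do: {:ok, value}
end

# Two differently configured instances of the same module.
status_type = Ecto.ParameterizedType.init(ConstrainedString, values: ~w(pending shipped))
priority_type = Ecto.ParameterizedType.init(ConstrainedString, values: ~w(low high))

IO.inspect(Ecto.Type.cast(status_type, "pending"), label: "status pending")
IO.inspect(Ecto.Type.cast(status_type, "low"), label: "status low")
IO.inspect(Ecto.Type.cast(priority_type, "low"), label: "priority low")
```

The same module accepts `"low"` for one field and rejects it for another, which is the per-field behaviour a plain `Ecto.Type` cannot express.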
### When NOT to Use
**Don't use this when:**
- The type has no configuration — use `Ecto.Type`, which has simpler callback signatures
- The configuration changes at runtime rather than at schema definition time (validate in the changeset function instead, where runtime state can be passed as an argument)
- You only need this for one field in one schema (inline the logic in a changeset validator)
**Over-application example:**
```elixir
# Parameterized type for a type that has no real options
defmodule MaybeString do
use Ecto.ParameterizedType
def init(_opts), do: %{} # no options used
def type(_), do: :string
def cast(v, _) when is_binary(v), do: {:ok, v}
def cast(_, _), do: :error
def load(v, _, _), do: {:ok, v}
def dump(v, _, _), do: {:ok, v}
end
```
**Better alternative:**
```elixir
# No configuration needed — plain Ecto.Type is simpler
defmodule MaybeString do
use Ecto.Type
def type, do: :string
def cast(v) when is_binary(v), do: {:ok, v}
def cast(_), do: :error
def load(v), do: {:ok, v}
def dump(v), do: {:ok, v}
end
```
**Why:** `Ecto.ParameterizedType` adds a `params` argument to every callback and requires an `init/1` implementation. This complexity is justified when you need per-field configuration. Without actual options, the extra argument is noise that obscures intent.
---
## 6. Schemaless Types — `{data, types}` Changesets
**Source:** [lib/ecto/changeset.ex](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/changeset.ex) (documentation for `cast/4`)
When you need to validate and cast data without defining a full schema module, pass a `{data, types}` tuple as the first argument to `Ecto.Changeset.cast/4`. The `data` map holds the current values (often `%{}`), and the `types` map specifies field names and their Ecto types. The full changeset API works normally — `validate_required`, `validate_number`, `apply_action`, etc.
```elixir
def validate_params(params) do
types = %{name: :string, age: :integer, role: :string}
{%{}, types}
|> Ecto.Changeset.cast(params, Map.keys(types))
|> Ecto.Changeset.validate_required([:name])
|> Ecto.Changeset.validate_number(:age, greater_than: 0)
|> Ecto.Changeset.apply_action(:insert)
end
```
**Why:** Defining a `use Ecto.Schema` module for a one-off params map is heavyweight: it introduces a new module, struct, and migration concern for data that never touches the database. The `{data, types}` tuple gives you Ecto's casting and validation pipeline — type coercion, error accumulation, `apply_action` — for ephemeral, transient, or API-boundary data structures.
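What the pipeline returns for valid and invalid input can be sketched as follows, assuming Ecto 3.x via `Mix.install`. `ParamsDemo` is a hypothetical module, and the error map keeps Ecto's raw message templates because no interpolation is applied.

```elixir
Mix.install([{:ecto, "~> 3.10"}])

defmodule ParamsDemo do
  import Ecto.Changeset

  @types %{name: :string, age: :integer}

  def validate(params) do
    {%{}, @types}
    |> cast(params, Map.keys(@types))
    |> validate_required([:name])
    |> validate_number(:age, greater_than: 0)
    |> apply_action(:validate)
  end

  # Collapses changeset errors into a map of field => [message template].
  def errors(changeset), do: traverse_errors(changeset, fn {msg, _opts} -> msg end)
end

# String input is coerced to the declared types.
{:ok, data} = ParamsDemo.validate(%{"name" => "Ada", "age" => "36"})
IO.inspect(data, label: "valid")

# Errors accumulate across validations instead of stopping at the first one.
{:error, changeset} = ParamsDemo.validate(%{"age" => "-1"})
IO.inspect(ParamsDemo.errors(changeset), label: "invalid")
```

On success the result is a plain map rather than a struct, since the `data` half of the tuple was `%{}`.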
**Anti-pattern:** Defining a full schema module solely to validate a one-off params map:
```elixir
# BAD — a schema with no table, used only for one validation function
defmodule SearchParams do
use Ecto.Schema
import Ecto.Changeset
@primary_key false
embedded_schema do
field :query, :string
field :page, :integer
field :per_page, :integer
end
def changeset(params) do
%SearchParams{}
|> cast(params, [:query, :page, :per_page])
|> validate_required([:query])
end
end
```
### When to Use
**Triggers:**
- Validating and casting controller params, webhook payloads, or CLI arguments that are never persisted
- Building a multi-step form or wizard where intermediate steps don't map to a database row
- Writing a context function that accepts a params map and needs to return normalized data or errors
**Example — before:**
```elixir
# Full schema module for transient data
defmodule ReportFilter do
use Ecto.Schema
import Ecto.Changeset
@primary_key false
embedded_schema do
field :start_date, :date
field :end_date, :date
field :user_id, :integer
end
def changeset(params) do
%ReportFilter{}
|> cast(params, [:start_date, :end_date, :user_id])
|> validate_required([:start_date, :end_date])
end
end
```
**Example — after:**
```elixir
def parse_report_filter(params) do
types = %{start_date: :date, end_date: :date, user_id: :integer}
{%{}, types}
|> Ecto.Changeset.cast(params, Map.keys(types))
|> Ecto.Changeset.validate_required([:start_date, :end_date])
|> Ecto.Changeset.apply_action(:validate)
end
```
### When NOT to Use
**Don't use this when:**
- The data structure is reused across many contexts (define a proper schema or embedded schema)
- You need associations, `autogenerate`, or `timestamps()` (requires a full schema)
- The validated data will eventually be persisted (start with a schema to avoid a rewrite)
**Over-application example:**
```elixir
# Schemaless changeset for data that has a natural schema home
def update_user_profile(user_id, params) do
types = %{name: :string, email: :string, bio: :string}
{%{}, types}
|> Ecto.Changeset.cast(params, Map.keys(types))
|> Ecto.Changeset.validate_required([:name, :email])
|> Ecto.Changeset.apply_action(:update)
# Then manually build an update query — loses Ecto.Repo integration
end
```
**Better alternative:**
```elixir
# User already has a schema — use it
def update_user_profile(user_id, params) do
Repo.get!(User, user_id)
|> User.profile_changeset(params)
|> Repo.update()
end
```
**Why:** The `{data, types}` approach trades schema infrastructure for simplicity. That trade is worth it for truly transient data (search filters, report parameters, one-time import validation). For persistent data, the schema module is not overhead — it is the source of truth for what the table contains.
---
## Decision Tree
- If storing a fixed set of values → `Ecto.Enum` with `values:`
- If wrapping an Elixir struct (URI, Decimal, etc.) in a field → `use Ecto.Type` with all four callbacks
- If your type needs configuration options at the field level → `use Ecto.ParameterizedType`
- If two values of your type may be logically equal without `==` → override `equal?/2`
- If your type is used inside `embeds_one`/`embeds_many` and has a non-trivial serialized form → verify `embed_as/1`
- If you need to validate params without a schema → schemaless `{data, types}` changeset
<!-- PATTERN_COMPLETE -->
@@ -2,6 +2,19 @@
Patterns extracted from the Elixir standard library source code.
## Contents
1. [Public Type with @typedoc](#1-public-type-with-typedoc)
2. [Private Types with @typep](#2-private-types-with-typep)
3. [@opaque Types (Protocol t())](#3-opaque-types-protocol-t)
4. [Union Types in @spec Return Values](#4-union-types-in-spec-return-values)
5. [`when` Constraints in Specs](#5-when-constraints-in-specs)
6. [Map Types with required/optional Keys](#6-map-types-with-requiredoptional-keys)
7. [Keyword List Types for Options](#7-keyword-list-types-for-options)
8. [Parameterized Types (t/1)](#8-parameterized-types-t1)
9. [Named Parameters in Specs (:: annotation)](#9-named-parameters-in-specs--annotation)
10. [@typedoc since: Annotation](#10-typedoc-since-annotation)
---
## 1. Public Type with @typedoc
@@ -798,4 +811,17 @@ end
**Why:** `since:` annotations are for library consumers checking compatibility across versions. Application code doesn't have "consumers" checking which version introduced a type — it's all deployed together.
## Decision Tree
- If you are defining a public `@type` that appears in any `@spec` or callback → [Public Type with @typedoc](#1-public-type-with-typedoc)
- If a type is used only internally for recursion or DRYing up repeated expressions → [Private Types with @typep](#2-private-types-with-typep)
- If you want to hide internal representation and force consumers to use accessor functions → [@opaque Types](#3-opaque-types-protocol-t)
- If a function can return multiple distinct shapes (tagged tuples, atoms) → [Union Types in @spec Return Values](#4-union-types-in-spec-return-values)
- If the return type depends on the input type (generic/polymorphic function) → [`when` Constraints in Specs](#5-when-constraints-in-specs)
- If you accept a map with a mix of mandatory and optional keys → [Map Types with required/optional Keys](#6-map-types-with-requiredoptional-keys)
- If a function accepts a keyword list of options and you want to document valid keys → [Keyword List Types for Options](#7-keyword-list-types-for-options)
- If you define a container type and want specs to express what element type is inside → [Parameterized Types (t/1)](#8-parameterized-types-t1)
- If a parameter's type alone does not convey its purpose → [Named Parameters in Specs](#9-named-parameters-in-specs--annotation)
- If you are adding a new public type to an existing library post-1.0 → [@typedoc since: Annotation](#10-typedoc-since-annotation)
<!-- PATTERN_COMPLETE -->