docs: add patterns extracted from ecto and oban

Ecto: 6 patterns (protocol dispatch, changeset separation, Multi pipelines)
Oban: 9 patterns (plugin behaviour, telemetry spans, engine abstraction)
Rodin
2026-04-30 09:03:17 -07:00
parent 9a94765ea2
commit 44c61840df
2 changed files with 901 additions and 0 deletions
+542
@@ -0,0 +1,542 @@
# Patterns Extracted from elixir-ecto/ecto
## Pattern: Schemaless Changesets for Validation Without Persistence
**Source:** `lib/ecto/changeset.ex`
**Category:** changeset
**What:** Use `{data, types}` tuples instead of schema structs to validate arbitrary data without a database-backed schema.
**Why:** Decouples validation logic from persistence. Web forms, CLI inputs, API parameters, and configuration can all be validated using the same changeset pipeline without requiring a schema module or database table.
**Example:**
```elixir
data = %{}
types = %{name: :string, email: :string, age: :integer}
params = %{"name" => "Callum", "email" => "callum@example.com", "age" => "27"}
changeset =
  {data, types}
  |> Ecto.Changeset.cast(params, Map.keys(types))
  |> Ecto.Changeset.validate_required([:name, :email])
  |> Ecto.Changeset.validate_format(:email, ~r/@/)
case Ecto.Changeset.apply_action(changeset, :validate) do
  # validated is a plain map with cast values, e.g. age: 27
  {:ok, validated} -> {:ok, validated}
  # changeset.errors holds the validation failures
  {:error, changeset} -> {:error, changeset.errors}
end
```
**When to use:** Form validation, API parameter validation, configuration parsing, any data that needs casting + validation but won't go to a database. Also useful in tests.
**When NOT to use:** When you have a real schema already — just use it. Don't create schemaless changesets as a workaround for schema design issues.
---
## Pattern: Validation vs Constraint Boundary
**Source:** `lib/ecto/changeset.ex`
**Category:** changeset
**What:** Explicitly separate validations (checked in-memory before DB) from constraints (checked by the database after insert/update). Constraints only run if all validations pass.
**Why:** Creates a two-phase error pipeline: fast client-side-checkable errors (validations) and data-integrity errors that require the database (constraints). This design avoids race conditions — uniqueness can only be truly guaranteed by the DB.
**Example:**
```elixir
def changeset(user, params) do
  user
  |> cast(params, [:name, :email, :age])
  # Phase 1: Validations (in-memory, immediate)
  |> validate_required([:name, :email])
  |> validate_format(:email, ~r/@/)
  |> validate_inclusion(:age, 18..100)
  # Phase 2: Constraints (database, deferred)
  |> unique_constraint(:email)
  |> check_constraint(:age, name: :age_must_be_positive)
end
```
**When to use:** Always. This is *the* pattern for input handling in Ecto. Understand the boundary: validations catch the obvious stuff fast; constraints catch the stuff only the DB knows.
**When NOT to use:** Don't use `unsafe_validate_unique/4` as a substitute for `unique_constraint/3` — it's a UX optimization (show errors early), not a guarantee.
---
## Pattern: Cast/Change Duality (External vs Internal Data)
**Source:** `lib/ecto/changeset.ex`
**Category:** changeset
**What:** `cast/4` handles external (untrusted) data with string keys, performing type coercion and filtering. `change/2` handles internal (trusted) data with atom keys, storing values directly without validation.
**Why:** This duality enforces a security boundary. External data (user input) goes through `cast`, which only permits explicitly listed fields. Internal data (programmatic changes) uses `change`, which trusts the caller. The working invariant: atom-keyed data passed to `change` was built by your own code, so it is treated as already parsed and validated.
**Example:**
```elixir
# External: user input from a web form (string keys, untrusted)
changeset = cast(user, %{"name" => "Alice", "is_admin" => "true"}, [:name])
# Only :name is permitted — is_admin is silently dropped
# Internal: programmatic change (atom keys, trusted)
changeset = change(user, %{verified_at: DateTime.utc_now()})
# No filtering, no casting — you know what you're doing
```
**When to use:** `cast` for anything from HTTP, CLI, or external systems. `change` for server-side logic (background jobs, system events, migrations).
**When NOT to use:** Never pass user-controlled data to `change/2`. Never use `cast/4` for internal programmatic updates (it's wasteful and may reject valid internal representations).
---
## Pattern: Composable Query Building with `dynamic/2`
**Source:** `lib/ecto/query.ex`
**Category:** query
**What:** Build query expressions incrementally using `dynamic/2`, composing conditions from runtime values, then interpolate the full expression at the root of `where`, `order_by`, `group_by`, or `select`.
**Why:** Standard Ecto queries are compile-time expressions. `dynamic` bridges the gap for search/filter UIs where conditions are assembled based on user input. Each dynamic expression is self-contained with its own bindings and can reference different joins.
**Example:**
```elixir
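def filter(params) do
  conditions = dynamic(true)
  conditions =
    if params["published"] do
      dynamic([p], p.is_published and ^conditions)
    else
      conditions
    end
  conditions =
    if min_age = params["min_age"] do
      # Refers to the :author named binding joined below
      dynamic([author: a], a.age >= ^min_age and ^conditions)
    else
      conditions
    end
  # The join must exist, otherwise the author binding above is undefined
  from(p in Post,
    join: a in assoc(p, :author),
    as: :author,
    where: ^conditions
  )
end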
```
**When to use:** Search forms, API filters, any scenario where the number and combination of conditions is unknown at compile time. Also works for dynamic `order_by`, `group_by`, `select`, and `update`.
**When NOT to use:** Simple queries with fixed conditions — just use regular `where`. Don't use dynamic for every query; it obscures intent and loses type casting across dynamic boundaries.
---
## Pattern: Named Bindings for Composable Joins
**Source:** `lib/ecto/query.ex`
**Category:** query
**What:** Use `as: :name` on `from` and `join` clauses to create stable references that survive query composition, then reference them by name (instead of by position) in later refinements.
**Why:** Positional bindings break when queries are composed in different orders. Named bindings decouple the query structure from the position, making helper functions composable without knowing the full query shape.
**Example:**
```elixir
# Base query with named binding
def base_query do
  from(p in Post,
    as: :post,
    join: a in assoc(p, :author),
    as: :author
  )
end
# Filter function doesn't need to know position
def filter_by_author_age(query, min_age) do
  from([author: a] in query, where: a.age >= ^min_age)
end
# Late binding with parent_as in subqueries
def with_comment_count(query) do
  comment_count =
    from(c in Comment,
      where: c.post_id == parent_as(:post).id,
      select: count()
    )
  from([post: p] in query,
    select_merge: %{comment_count: subquery(comment_count)}
  )
end
```
**When to use:** Library code, shared query helpers, any query that will be extended by callers. Essential for `parent_as` in lateral joins and correlated subqueries.
**When NOT to use:** One-off queries in a single function where positional bindings are clear and the query won't be composed further.
---
## Pattern: Ecto.Multi as Inspectable Transaction Blueprint
**Source:** `lib/ecto/multi.ex`
**Category:** repo
**What:** Build transactions as data (an `Ecto.Multi` struct) that can be inspected, tested, and composed *before* execution. Use `to_list/1` to introspect operations without hitting the database.
**Why:** Traditional `Repo.transaction(fn -> ... end)` is opaque — you can't test what it will do without running it. Multi makes the transaction plan a first-class value: you can unit test changesets, assert operation order, and compose Multis from different modules.
**Example:**
```elixir
defmodule PasswordManager do
  alias Ecto.Multi
  def reset(account, params) do
    Multi.new()
    |> Multi.update(:account, Account.password_reset_changeset(account, params))
    |> Multi.insert(:log, Log.password_reset_changeset(account, params))
    |> Multi.delete_all(:sessions, Ecto.assoc(account, :sessions))
  end
end
# Unit test without database
test "password reset multi structure" do
  multi = PasswordManager.reset(%Account{}, %{password: "new"})
  assert [
           {:account, {:update, changeset, []}},
           {:log, {:insert, _, []}},
           {:sessions, {:delete_all, _, []}}
         ] = Ecto.Multi.to_list(multi)
  assert changeset.valid?
end
# Execute
case Repo.transact(PasswordManager.reset(account, params)) do
  # All steps succeeded; results are keyed by operation name
  {:ok, %{account: account}} -> {:ok, account}
  # The :account step failed; the fourth element holds the changes so far
  {:error, :account, changeset, _changes_so_far} -> {:error, changeset}
end
```
**When to use:** When the set of operations is dynamic, when you need to test transaction logic without a database, or when multiple modules contribute operations to the same transaction.
**When NOT to use:** Simple happy-path transactions. Per Ecto's own docs: "For most other use cases, using regular control flow within `Repo.transact(fun)` and returning `{:ok, result}` or `{:error, reason}` is more straightforward."
---
## Pattern: Multi.merge for Dynamic Transaction Composition
**Source:** `lib/ecto/multi.ex`
**Category:** repo
**What:** Use `Multi.merge/2` to dynamically compose transactions where later operations depend on earlier results. The merge function receives all changes so far and returns a new Multi.
**Why:** Static Multi pipelines can't branch based on previous results. `merge` enables conditional logic (if X was inserted, also insert Y) while keeping the transaction atomic and the plan composable.
**Example:**
```elixir
Multi.new()
|> Multi.insert(:post, %Post{title: "first"})
|> Multi.merge(fn %{post: post} ->
  Multi.new()
  |> Multi.insert(:comment, Ecto.build_assoc(post, :comments, body: "auto"))
  |> Multi.update_all(
    :notify,
    from(s in Sub, where: s.post_id == ^post.id),
    set: [notified: true]
  )
end)
|> Repo.transact()
```
**When to use:** When transaction operations depend on auto-generated IDs or computed values from earlier steps. When composing Multis from different bounded contexts.
**When NOT to use:** When operations don't actually depend on each other — just use `Multi.insert/update` directly. Avoid deeply nested merges; flatten when possible.
---
## Pattern: Multi with Non-Atom Names for Collection Operations
**Source:** `lib/ecto/multi.ex`
**Category:** repo
**What:** Multi operation names can be any term (not just atoms) — use tuples like `{:account, id}` to track individual items when processing collections within a transaction.
**Why:** When updating N items in a transaction, you need N unique names. Using tuples gives semantic meaning and allows you to pattern-match on the error to find which specific item failed.
**Example:**
```elixir
accounts = [%Account{id: 1}, %Account{id: 2}, %Account{id: 3}]
multi =
  Enum.reduce(accounts, Multi.new(), fn account, multi ->
    Multi.update(
      multi,
      {:account, account.id},
      Account.password_reset_changeset(account, params)
    )
  end)
case Repo.transact(multi) do
  {:ok, results} ->
    # Look up each result by its tuple name, e.g. results[{:account, 1}]
    {:ok, Map.values(results)}
  {:error, {:account, failed_id}, changeset, _changes_so_far} ->
    # The tuple name identifies exactly which account failed
    {:error, failed_id, changeset}
end
```
**When to use:** Batch operations within a transaction where you need to identify which operation failed.
**When NOT to use:** Single-item operations — just use descriptive atoms like `:insert_user`.
---
## Pattern: The Queryable Protocol for Extensible Data Sources
**Source:** `lib/ecto/queryable.ex`
**Category:** protocol
**What:** `Ecto.Queryable` is a protocol that converts any data structure into an `Ecto.Query`. Implementations exist for atoms (schemas), bitstrings (table names), tuples (`{"table", Schema}`), `Ecto.Query` itself, and `Ecto.SubQuery`.
**Why:** This is how Ecto achieves its composability. You can pass a schema, a string, a tuple, or a query to any function that accepts a queryable — they're all interchangeable. It's the extension point for custom data sources.
**Example:**
```elixir
# All of these work wherever a queryable is expected:
Repo.all(User)                               # Atom (schema)
Repo.all(from u in User, where: u.active)    # Ecto.Query
Repo.all(from u in "users", select: u.id)    # BitString source (no schema, so a select is required)
Repo.all({"legacy_users", User})             # Tuple (alternate source + schema)
# Custom queryable: a struct that knows how to become a query
defimpl Ecto.Queryable, for: ActiveUsers do
  import Ecto.Query, only: [from: 2]
  def to_query(%ActiveUsers{min_age: age}) do
    from u in User, where: u.active and u.age >= ^age
  end
end
Repo.all(%ActiveUsers{min_age: 18})
```
**When to use:** When you want to create reusable, composable query objects that encapsulate filtering logic. When you have alternate table sources (views, partitioned tables) that share a schema.
**When NOT to use:** For simple filtering — just compose queries normally. Don't over-abstract with custom Queryable implementations when a function returning a query would suffice.
---
## Pattern: ParameterizedType for Field-Level Customization
**Source:** `lib/ecto/parameterized_type.ex`
**Category:** protocol
**What:** `Ecto.ParameterizedType` is a behaviour that extends the type system with compile-time options. Unlike basic `Ecto.Type` (one implementation per module), parameterized types receive field-specific options at compile time via `init/1` and pass them to every callback.
**Why:** Enables a single type module to behave differently per field. `Ecto.Enum` is the canonical example: one module handles all enum fields, but each field has its own valid values configured at the schema level.
**Example:**
```elixir
defmodule EncryptedString do
  use Ecto.ParameterizedType
  def type(_params), do: :binary
  def init(opts) do
    %{key: Keyword.fetch!(opts, :key), algorithm: Keyword.get(opts, :algorithm, :aes_256)}
  end
  # Parameterized types can be handed nil, so handle it explicitly
  def cast(nil, _params), do: {:ok, nil}
  def cast(value, _params) when is_binary(value), do: {:ok, value}
  def cast(_, _), do: :error
  def dump(nil, _dumper, _params), do: {:ok, nil}
  def dump(value, _dumper, %{key: key, algorithm: alg}) do
    {:ok, encrypt(value, key, alg)}
  end
  def load(nil, _loader, _params), do: {:ok, nil}
  def load(value, _loader, %{key: key, algorithm: alg}) do
    {:ok, decrypt(value, key, alg)}
  end
  # encrypt/3 and decrypt/3 are placeholders for your own crypto helpers
end
# Usage in schema: different keys per field
schema "users" do
  field :ssn, EncryptedString, key: :ssn_key
  field :notes, EncryptedString, key: :notes_key, algorithm: :chacha20
end
```
**When to use:** When a type needs per-field configuration (encryption keys, enum values, precision settings, format options). When you want `nil` handling in `load`/`dump` (basic types skip nil).
**When NOT to use:** When the type behaves identically regardless of field — use basic `Ecto.Type` instead. Don't reach for parameterized types just because you can.
---
## Pattern: Ecto.Enum as Atom-Safe Persistence
**Source:** `lib/ecto/enum.ex`
**Category:** schema
**What:** `Ecto.Enum` maps atoms to strings (or integers) for database storage. It uses `ParameterizedType` to accept a `values:` option per field and builds lookup maps at compile time for O(1) casting/loading/dumping.
**Why:** Atoms are great in Elixir (pattern matching, clarity) but dangerous to create from untrusted input. Ecto.Enum provides safe atom-to-string roundtripping: only atoms declared at compile time are ever created from DB values. Because the lookup maps are built at compile time, runtime work is limited to a map lookup per cast, load, or dump.
**Example:**
```elixir
schema "orders" do
field :status, Ecto.Enum, values: [:pending, :processing, :shipped, :delivered]
field :priority, Ecto.Enum, values: [low: 1, medium: 2, high: 3, critical: 4]
field :roles, {:array, Ecto.Enum}, values: [:admin, :editor, :viewer]
end
# Compile-time init builds:
# on_cast: %{"pending" => :pending, "processing" => :processing, ...}
# on_dump: %{pending: "pending", processing: "processing", ...}
# on_load: %{"pending" => :pending, "processing" => :processing, ...}
```
**When to use:** Any field with a fixed set of allowed values. Prefer over raw strings for domain modeling (compile-time typo detection, pattern matching).
**When NOT to use:** When the set of values is user-defined and can change without a code deploy — you'd need a different approach (lookup table, custom type).
---
## Pattern: prepare_changes for Side Effects Deferred to Transaction Time
**Source:** `lib/ecto/changeset.ex`
**Category:** changeset
**What:** `prepare_changes/2` registers a callback that runs inside the database transaction, *after* the changeset is valid but *before* it's committed. The callback receives the changeset with `repo` already set.
**Why:** Some operations must happen atomically with the main insert/update but can't be expressed as changeset changes (counter caches, audit logs, cache invalidation). `prepare_changes` gives you access to the repo inside the transaction while keeping the changeset's validation pipeline clean.
**Example:**
```elixir
def create_comment(comment, params) do
  comment
  |> cast(params, [:body, :post_id])
  |> validate_required([:body, :post_id])
  |> prepare_changes(fn changeset ->
    if post_id = get_change(changeset, :post_id) do
      query = from(p in Post, where: [id: ^post_id])
      changeset.repo.update_all(query, inc: [comment_count: 1])
    end
    changeset
  end)
end
```
**When to use:** Counter caches, denormalization updates, audit trail inserts that must be atomic with the primary operation. When you need the repo but want to keep it out of the changeset function signature.
**When NOT to use:** Complex multi-step operations — use `Ecto.Multi` instead. Don't use for non-DB side effects (sending emails) — those should happen after successful commit.
---
## Pattern: Embedded Schemas as UI/Domain Boundary Objects
**Source:** `lib/ecto/schema.ex`, `lib/ecto/embedded.ex`
**Category:** schema
**What:** Use `embedded_schema/1` for structs that represent application-layer concepts (form inputs, API requests, configuration) without a backing database table. Combine with `cast_embed/3` for nested validation within a parent schema.
**Why:** Decouples your domain's internal representation from what gets persisted. A `SignUp` embedded schema validates the form; then you split it into `Account` and `Profile` schemas for persistence. Embedded schemas also work as value objects inside other schemas (`embeds_one`/`embeds_many`).
**Example:**
```elixir
# As a boundary object (no persistence)
defmodule SearchForm do
  use Ecto.Schema
  import Ecto.Changeset
  embedded_schema do
    field :query, :string
    field :min_price, :decimal
    field :max_price, :decimal
    field :categories, {:array, :string}
  end
  def changeset(form, params) do
    form
    |> cast(params, [:query, :min_price, :max_price, :categories])
    |> validate_required([:query])
    |> validate_number(:min_price, greater_than_or_equal_to: 0)
  end
end
# As a nested value object in a persisted schema
schema "orders" do
  embeds_one :shipping_address, Address, on_replace: :update do
    field :street, :string
    field :city, :string
    field :zip, :string
  end
end
```
**When to use:** Form objects, command objects, API request shapes, inline value objects (addresses, coordinates, metadata). Anywhere you need validation without a table.
**When NOT to use:** When the data needs its own lifecycle (CRUD, querying, associations) — use a proper schema with a table.
---
## Pattern: The Adapter Behaviour Split (Layered Capabilities)
**Source:** `lib/ecto/adapter.ex`, `lib/ecto/adapter/queryable.ex`, `lib/ecto/adapter/transaction.ex`
**Category:** protocol
**What:** Ecto splits adapter responsibilities into multiple behaviours: `Ecto.Adapter` (base: init, loaders, dumpers), `Ecto.Adapter.Queryable` (prepare, execute, stream), `Ecto.Adapter.Schema` (insert, update, delete), `Ecto.Adapter.Transaction`. An adapter implements only what it supports.
**Why:** Not all data stores support all operations. A key-value store might implement Schema but not Queryable. A read-only store skips Schema entirely. Because each capability is its own behaviour, `use Ecto.Repo` can inspect which behaviours the adapter declares (and honor the `:read_only` option) and define only the matching repo functions at compile time.
**Example:**
```elixir
# An adapter declares what it supports. This is a sketch: other required
# callbacks are omitted and the helper functions are hypothetical.
defmodule MyAdapter do
  @behaviour Ecto.Adapter
  @behaviour Ecto.Adapter.Queryable
  # NOT implementing Ecto.Adapter.Transaction = no transaction support
  @impl Ecto.Adapter
  def init(config), do: {:ok, child_spec(config), adapter_meta(config)}
  @impl Ecto.Adapter
  def loaders(:binary_id, type), do: [Ecto.UUID, type]
  def loaders(_primitive, type), do: [type]
  @impl Ecto.Adapter.Queryable
  def prepare(:all, query), do: {:cache, normalize(query)}
  @impl Ecto.Adapter.Queryable
  # run/3 stands in for the actual execution and returns {count, rows}
  def execute(meta, _query_meta, cache, params, _opts), do: run(meta, cache, params)
end
```
**When to use:** When designing a system with pluggable backends that have different capability sets. The pattern: define a base behaviour for universal operations, then optional behaviours for extended capabilities.
**When NOT to use:** When all implementations will always support all operations — a single behaviour is simpler. Don't over-split when there's no real variability.
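
The general form of this split can be sketched in plain Elixir. Everything here (`Store`, `Store.Writable`, `ReadOnlyStore`, `Capabilities`) is a hypothetical name chosen for illustration, not part of Ecto, and the capability check runs at runtime for simplicity.

```elixir
defmodule Store do
  @moduledoc "Base behaviour: every backend implements this (hypothetical)."
  @callback get(key :: term) :: {:ok, term} | :error
end

defmodule Store.Writable do
  @moduledoc "Optional capability: only writable backends implement this."
  @callback put(key :: term, value :: term) :: :ok
end

defmodule ReadOnlyStore do
  # Declares only the base behaviour, so it advertises no write support.
  @behaviour Store

  @impl Store
  def get(:answer), do: {:ok, 42}
  def get(_key), do: :error
end

defmodule Capabilities do
  # Lists the behaviours a compiled module declares via @behaviour.
  def behaviours(module) do
    module.module_info(:attributes)
    |> Keyword.get_values(:behaviour)
    |> List.flatten()
  end

  def supports?(module, behaviour), do: behaviour in behaviours(module)
end

Capabilities.supports?(ReadOnlyStore, Store)
# => true
Capabilities.supports?(ReadOnlyStore, Store.Writable)
# => false
```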
---
## Pattern: Loader/Dumper Pipeline for Type Coercion
**Source:** `lib/ecto/adapter.ex`, `lib/ecto/type.ex`
**Category:** protocol
**What:** Adapters return ordered lists of `loaders` and `dumpers` for each type — a pipeline of functions/types that values pass through when loading from or dumping to the database. The adapter controls the pipeline shape.
**Why:** Different databases represent the same logical type differently (booleans as 0/1, UUIDs as binary vs string). The loader/dumper pipeline lets adapters inject coercion steps without the type system needing to know about adapter internals.
**Example:**
```elixir
# Adapter that stores booleans as integers:
def loaders(:boolean, type), do: [&bool_decode/1, type]
def dumpers(:boolean, type), do: [type, &bool_encode/1]
defp bool_decode(0), do: {:ok, false}
defp bool_decode(1), do: {:ok, true}
defp bool_encode(false), do: {:ok, 0}
defp bool_encode(true), do: {:ok, 1}
# The pipeline: DB value → adapter decode → Ecto type load → Elixir value
# And reverse: Elixir value → Ecto type dump → adapter encode → DB value
```
**When to use:** Custom adapters, custom types that need adapter-aware coercion. Understanding this pattern helps debug "value not loading correctly" issues.
**When NOT to use:** Application code rarely needs to think about this directly. Use `Ecto.Type` or `Ecto.ParameterizedType` for custom type logic.
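
To make the pipeline idea concrete, here is a minimal sketch of running a value through an ordered list of steps. It assumes every step is a one-argument function returning `{:ok, value}` or `:error`; `Pipeline`, `bool_decode`, and `type_load` are illustrative names, and real Ecto loader lists may also contain type modules rather than only functions.

```elixir
defmodule Pipeline do
  # Runs `value` through each step in order. The first :error stops the run.
  def run(value, steps) do
    Enum.reduce_while(steps, {:ok, value}, fn step, {:ok, acc} ->
      case step.(acc) do
        {:ok, _} = ok -> {:cont, ok}
        :error -> {:halt, :error}
      end
    end)
  end
end

# Adapter-level decode for a database that stores booleans as 0/1.
bool_decode = fn
  0 -> {:ok, false}
  1 -> {:ok, true}
  _ -> :error
end

# Stand-in for the type's own load step.
type_load = fn
  value when is_boolean(value) -> {:ok, value}
  _ -> :error
end

loaders = [bool_decode, type_load]

Pipeline.run(1, loaders)
# => {:ok, true}
Pipeline.run("yes", loaders)
# => :error
```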
---
## Pattern: `on_replace` Strategy for Association Lifecycle
**Source:** `lib/ecto/changeset.ex`, `lib/ecto/embedded.ex`
**Category:** schema
**What:** When a parent changeset replaces associated data (e.g., removes items from a `has_many`), `:on_replace` controls what happens to the orphaned records: `:raise`, `:mark_as_invalid`, `:nilify`, `:update`, `:delete`, or `:delete_if_exists`.
**Why:** Default `:raise` forces explicit design decisions about data lifecycle. Accidentally deleting child records by omitting them from a form submission is a common security/data-loss bug. This pattern makes the lifecycle policy declarative at the schema level.
**Example:**
```elixir
schema "posts" do
# Comments are owned — deleting from the list deletes from DB
has_many :comments, Comment, on_replace: :delete
# Tags are shared — removing from post just nilifies the FK
has_many :taggings, Tagging, on_replace: :nilify
# Profile is 1:1 — replacing updates in place
has_one :profile, Profile, on_replace: :update
# Draft is embedded — update content in place
embeds_one :draft, Draft, on_replace: :update
end
```
**When to use:** Always explicitly set `:on_replace` on associations/embeds that will be modified through `cast_assoc`/`cast_embed`. The default `:raise` is intentional — it forces you to think about it.
**When NOT to use:** N/A — this is a required decision for any association that participates in parent changeset operations.
---
## Pattern: `apply_action/2` for Non-Persistence Validation
**Source:** `lib/ecto/changeset.ex`
**Category:** changeset
**What:** Call `apply_action(changeset, :action)` to get `{:ok, data}` or `{:error, changeset}` without touching the database. It emulates `Repo.insert/update` return shapes for changesets that aren't persisted.
**Why:** Phoenix forms and other UIs use `changeset.action` to decide whether to show errors. Without `apply_action`, you'd need to manually set the action. With it, you get the same `{:ok, _} | {:error, _}` pattern as Repo operations, enabling consistent error handling.
**Example:**
```elixir
# API controller validating params without a DB
def search(conn, params) do
  changeset = SearchForm.changeset(%SearchForm{}, params)
  case Ecto.Changeset.apply_action(changeset, :validate) do
    {:ok, search} ->
      results = SearchService.execute(search)
      json(conn, results)
    {:error, changeset} ->
      conn |> put_status(422) |> render("errors.json", changeset: changeset)
  end
end
```
**When to use:** Schemaless changesets, embedded schema validation, any place you want `{:ok, _} | {:error, _}` semantics without a database roundtrip. Essential for form validation in LiveView.
**When NOT to use:** When you're going to persist anyway — just call `Repo.insert/update` which does this internally.
---
## Pattern: Repo.transact vs Repo.transaction
**Source:** `lib/ecto/repo/transaction.ex`
**Category:** repo
**What:** `Repo.transact/2` accepts either a function returning `{:ok, _}` or `{:error, _}`, or an `Ecto.Multi`. It auto-rolls back on `{:error, _}`. This replaces the older `Repo.transaction(fn -> ... end)` which requires manual `Repo.rollback/1`.
**Why:** The older `transaction/2` with manual rollback is error-prone — forget to rollback and you commit partial state. `transact/2` enforces the `{:ok, _} | {:error, _}` contract, making it impossible to accidentally commit on error.
**Example:**
```elixir
# Old pattern (error-prone):
Repo.transaction(fn ->
  case do_thing() do
    {:ok, result} -> result
    {:error, reason} -> Repo.rollback(reason) # Easy to forget!
  end
end)
# New pattern (safe by construction):
Repo.transact(fn repo ->
  with {:ok, user} <- repo.insert(User.changeset(%User{}, params)),
       {:ok, profile} <- repo.insert(Profile.changeset(%Profile{}, user)) do
    {:ok, %{user: user, profile: profile}}
  end
  # {:error, _} automatically triggers rollback
end)
```
**When to use:** All new transaction code. The function form is simpler than Multi when operations are sequential and the logic is straightforward.
**When NOT to use:** When you need the introspection/testing benefits of `Ecto.Multi`. When the set of operations is dynamic.
---
## Pattern: Query Prefix for Multi-Tenancy
**Source:** `lib/ecto/query.ex`
**Category:** query
**What:** Set a query prefix at the query level, `from`/`join` level, schema level (`@schema_prefix`), or repo operation level (`prefix: "tenant_x"`) to route queries to different database schemas/databases.
**Why:** Enables schema-based multi-tenancy (PostgreSQL schemas, MySQL databases) without changing application code. The prefix cascades: repo option → query prefix → schema prefix → `from`/`join` option, with later overriding earlier.
**Example:**
```elixir
# Option 1: At the repo call
Repo.all(User, prefix: "tenant_abc")
# Option 2: On the query
from(u in User) |> Ecto.Query.put_query_prefix("tenant_abc") |> Repo.all()
# Option 3: At the schema level
defmodule TenantABC.User do
use Ecto.Schema
@schema_prefix "tenant_abc"
schema "users" do ... end
end
# Option 4: Per-join (useful for cross-tenant queries)
from u in User, prefix: "tenant_abc",
join: g in GlobalSetting, prefix: "public", on: ...
```
**When to use:** Multi-tenant applications with schema-per-tenant architecture. Also useful for accessing shared/public schemas alongside tenant-specific ones.
**When NOT to use:** Row-level multi-tenancy (where all tenants share tables with a `tenant_id` column). There, use `where` clauses, optionally applied to every query through the repo's `prepare_query/3` callback.
<!-- PATTERN_COMPLETE -->
+359
@@ -0,0 +1,359 @@
# Patterns Extracted from oban-bg/oban
## Pattern: Plugin as Behaviour + GenServer
**Source:** `lib/oban/plugin.ex`
**Category:** plugin
**What:** Define a plugin interface as a behaviour with
`start_link/1` and `validate/1` callbacks. Plugins must be
OTP-compliant (GenServer/Agent). The host supervises them.
**Why:** Extensibility without coupling. Oban can start any
module that satisfies the behaviour — pruning, cron,
lifeline — without knowing implementation details. The
`validate/1` callback ensures misconfigured plugins fail at
startup, not at runtime.
**Example:**
```elixir
@callback start_link([option()]) :: GenServer.on_start()
@callback validate([option()]) :: :ok | {:error, String.t()}
@optional_callbacks [format_logger_output: 2]
```
**When to use:** When your application needs a plugin
system where third parties add behavior. The behaviour
gives compile-time warnings for missing callbacks;
supervision provides fault isolation.
**When NOT to use:** Internal modules that you control.
Behaviours add ceremony — if there is only one
implementation, use a module directly.
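
A minimal sketch of the whole loop (behaviour, one plugin,
and a host that validates before supervising) might look
like this. `MyHost`, `MyHost.Plugin`, and
`MyHost.Heartbeat` are hypothetical names; this is not
Oban's implementation.

```elixir
defmodule MyHost.Plugin do
  # Hypothetical behaviour modelled on the two callbacks above.
  @callback start_link(keyword) :: GenServer.on_start()
  @callback validate(keyword) :: :ok | {:error, String.t()}
end

defmodule MyHost.Heartbeat do
  @behaviour MyHost.Plugin
  use GenServer

  @impl MyHost.Plugin
  def validate(opts) do
    case Keyword.fetch(opts, :interval) do
      {:ok, ms} when is_integer(ms) and ms > 0 -> :ok
      {:ok, other} -> {:error, "expected a positive :interval, got: #{inspect(other)}"}
      :error -> {:error, "missing required option :interval"}
    end
  end

  @impl MyHost.Plugin
  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl GenServer
  def init(opts), do: {:ok, Map.new(opts)}
end

defmodule MyHost do
  # Validates every plugin before starting any of them, so bad
  # configuration fails at boot instead of at first use.
  def start_plugins(plugins) do
    Enum.each(plugins, fn {mod, opts} ->
      with {:error, reason} <- mod.validate(opts) do
        raise ArgumentError, "invalid config for #{inspect(mod)}: #{reason}"
      end
    end)

    children =
      Enum.map(plugins, fn {mod, opts} ->
        %{id: mod, start: {mod, :start_link, [opts]}}
      end)

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end

{:ok, _supervisor} = MyHost.start_plugins([{MyHost.Heartbeat, [interval: 1_000]}])
```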
---
## Pattern: Structured Telemetry Spans
**Source:** `lib/oban/telemetry.ex`
**Category:** telemetry
**What:** Emit telemetry events as spans with
start/stop/exception structure. Every operation (job
execution, engine calls, plugin work) follows the same
three-event pattern with consistent metadata shapes.
**Why:** Uniform observability. Any monitoring tool
(AppSignal, Datadog, custom logger) can hook into the same
event structure. The span pattern (start → stop|exception)
enables latency tracking, error rates, and resource usage
measurement without custom instrumentation per feature.
**Example:**
```elixir
# Event names follow [:oban, :component, :phase], or
# [:oban, :component, :action, :phase] when a component exposes several actions
[:oban, :job, :start]
[:oban, :job, :stop] # measurements: duration, memory
[:oban, :job, :exception] # + kind, reason, stacktrace
[:oban, :engine, :fetch_jobs, :start]
[:oban, :engine, :fetch_jobs, :stop]
[:oban, :engine, :fetch_jobs, :exception]
```
**When to use:** Any library or application that wants
observability without coupling to a specific monitoring
backend. The pattern works for database queries, HTTP
requests, background jobs, cache operations.
**When NOT to use:** Ultra-hot paths where telemetry
overhead matters (millions of events/second). Use sampling
or skip entirely.
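
The three-event shape can be sketched without any
dependency. In the sketch below, `SpanSketch` and the
`emit` function are stand-ins invented for illustration;
a real application would normally rely on the `:telemetry`
library (for example its `span/3` function) instead.

```elixir
defmodule SpanSketch do
  # Wraps `fun` in the start/stop/exception triple. `emit` stands in for a
  # telemetry backend: a 3-arity function (event, measurements, metadata).
  def span(prefix, meta, emit, fun) when is_list(prefix) and is_function(fun, 0) do
    started = System.monotonic_time()
    emit.(prefix ++ [:start], %{system_time: System.system_time()}, meta)

    try do
      fun.()
    catch
      kind, reason ->
        stacktrace = __STACKTRACE__
        failure = %{kind: kind, reason: reason, stacktrace: stacktrace}
        emit.(prefix ++ [:exception], %{duration: elapsed(started)}, Map.merge(meta, failure))
        # Re-raise so the caller still sees the original failure.
        :erlang.raise(kind, reason, stacktrace)
    else
      result ->
        emit.(prefix ++ [:stop], %{duration: elapsed(started)}, meta)
        result
    end
  end

  defp elapsed(started), do: System.monotonic_time() - started
end

# A toy handler that mails each event to the calling process.
emit = fn event, measurements, meta ->
  send(self(), {:event, event, measurements, meta})
end

SpanSketch.span([:my_app, :job], %{worker: "Mailer"}, emit, fn -> :done end)
# => :done
```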
---
## Pattern: Engine Abstraction for Backend Swap
**Source:** `lib/oban/engine.ex`
**Category:** engine
**What:** Define a behaviour (`Engine`) with callbacks for
all database operations (insert, fetch, complete, etc.).
Ship multiple implementations (Basic/Inline/Lite) that swap
at config time.
**Why:** Different environments need different backends:
Postgres for production, SQLite for development, inline
(in-memory) for testing. The engine abstraction lets you
swap without changing application code.
**Example:**
```elixir
@callback init(conf, opts) :: {:ok, meta} | {:error, term}
@callback insert_job(conf, changeset, opts) :: {:ok, Job.t()}
@callback fetch_jobs(conf, meta, opts) :: {:ok, {meta, [Job.t()]}}
@callback complete_job(conf, Job.t()) :: :ok
```
**When to use:** When your system needs to support multiple
storage backends, or when testing requires a fundamentally
different execution model (synchronous vs async).
**When NOT to use:** Single-backend applications. The
abstraction layer adds complexity that is only justified
when you actually swap implementations.
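
As a rough illustration of swapping engines by
configuration, the sketch below defines a deliberately
tiny behaviour with two implementations. All module names
(`JobQueue`, `JobQueue.Engine`, the two engines, `Doubler`)
are hypothetical and far simpler than Oban's real engines.

```elixir
defmodule JobQueue.Engine do
  # Hypothetical one-callback behaviour, for illustration only.
  @callback insert_job(worker :: module, args :: map) :: {:ok, term}
end

defmodule JobQueue.Engines.Inline do
  @behaviour JobQueue.Engine

  # Runs the job immediately in the caller's process.
  @impl true
  def insert_job(worker, args), do: {:ok, worker.perform(args)}
end

defmodule JobQueue.Engines.Memory do
  @behaviour JobQueue.Engine

  # Stores the job for later; here "storage" is the caller's mailbox.
  @impl true
  def insert_job(worker, args) do
    send(self(), {:enqueued, worker, args})
    {:ok, :enqueued}
  end
end

defmodule JobQueue do
  # The engine comes from options; callers never name an implementation.
  def insert(worker, args, opts \\ []) do
    engine = Keyword.get(opts, :engine, JobQueue.Engines.Memory)
    engine.insert_job(worker, args)
  end
end

defmodule Doubler do
  def perform(%{n: n}), do: n * 2
end

JobQueue.insert(Doubler, %{n: 21}, engine: JobQueue.Engines.Inline)
# => {:ok, 42}
JobQueue.insert(Doubler, %{n: 21})
# => {:ok, :enqueued}
```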
---
## Pattern: Keyword Validation with Reduce-While
**Source:** `lib/oban/validation.ex`
**Category:** config
**What:** Validate keyword options by iterating with
`Enum.reduce_while/3` and a validator function. Stop at
first error. Return `:ok` or `{:error, reason}`.
**Why:** Keyword lists are the standard Elixir config
format. Validating them procedurally (nested if/case) gets
messy. The reduce-while + validator pattern is composable:
each option validates independently, errors short-circuit,
and the validator function can be swapped or extended.
**Example:**
```elixir
def validate(opts, validator) when is_list(opts) do
  Enum.reduce_while(opts, :ok, fn opt, acc ->
    case validator.(opt) do
      :ok -> {:cont, acc}
      {:error, _} = error -> {:halt, error}
    end
  end)
end
```
**When to use:** Any public API that accepts keyword
options from users. Libraries, GenServer init, plugin
configs.
**When NOT to use:** Internal functions where the caller
is trusted. Also avoid for deeply nested configs — use
schema-based validation (NimbleOptions, Ecto embedded
schemas) instead.
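
For a sense of how the validator function plugs in, here
is a small usage sketch. The `OptsValidation` module name
and the `:queue` and `:limit` options are assumptions made
up for the example; the `validate/2` body repeats the
shape shown above so the snippet runs on its own.

```elixir
defmodule OptsValidation do
  def validate(opts, validator) when is_list(opts) do
    Enum.reduce_while(opts, :ok, fn opt, acc ->
      case validator.(opt) do
        :ok -> {:cont, acc}
        {:error, _} = error -> {:halt, error}
      end
    end)
  end
end

# One clause per known option; anything else is rejected.
validator = fn
  {:queue, queue} when is_atom(queue) -> :ok
  {:limit, limit} when is_integer(limit) and limit > 0 -> :ok
  {key, value} -> {:error, "invalid option #{inspect(key)}: #{inspect(value)}"}
end

OptsValidation.validate([queue: :default, limit: 10], validator)
# => :ok
OptsValidation.validate([queue: :default, limit: 0, bogus: 1], validator)
# => {:error, "invalid option :limit: 0"}
```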
---
## Pattern: Testing Mode Toggle
**Source:** `lib/oban/testing.ex`, `lib/oban/config.ex`
**Category:** testing
**What:** Support a `testing:` config option that switches
execution mode: `:disabled` (production), `:inline`
(execute immediately in caller process), `:manual` (enqueue
but don't execute — assert on DB state).
**Why:** Background job systems are inherently async, which
makes testing hard. The mode toggle gives you: (1) inline
for unit tests that need synchronous execution, (2) manual
for integration tests that verify enqueueing without
side effects.
**Example:**
```elixir
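# In config/test.exs:
config :my_app, Oban, testing: :manual
# In tests:
use Oban.Testing, repo: MyApp.Repo
# Unit-test a worker: perform_job runs it directly in the
# test process and does not enqueue anything.
perform_job(MyWorker, %{id: 1})
# Verify enqueueing: insert a job, then assert on DB state.
Oban.insert!(MyWorker.new(%{id: 1}))
assert_enqueued worker: MyWorker, args: %{id: 1}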
```
**When to use:** Any async system that needs deterministic
testing — job queues, event buses, notification systems.
The testing mode replaces "sleep and hope" with explicit
control.
**When NOT to use:** Synchronous systems that are already
deterministic. Also avoid if the mode toggle leaks into
production code paths (keep it config-only, not conditional
logic scattered through business code).
---
## Pattern: Stopper for Goroutine Lifecycle (CockroachDB)
**Source:** `pkg/util/stop/stopper.go` (cockroachdb)
**Category:** concurrency
**What:** A dedicated struct that manages the lifecycle of
all goroutines in a component: tracks active tasks, refuses
new work during shutdown (quiesce), waits for completion,
then runs closers.
**Why:** In distributed systems, clean shutdown is critical.
You need to: (1) stop accepting new work, (2) finish
in-flight work, (3) release resources in order. The Stopper
centralizes this instead of scattering shutdown logic across
every goroutine.
**Example:**
```go
type Stopper struct {
	quiescer chan struct{} // closed when quiescing
	stopped  chan struct{} // closed when fully stopped
	mu       struct {
		syncutil.RWMutex
		_numTasks           int32
		quiescing, stopping bool
		closers             []Closer
	}
}

// RunAsyncTask refuses new work during quiesce
func (s *Stopper) RunAsyncTask(ctx context.Context,
	taskName string, f func(context.Context)) error {
	if !s.addTask() {
		return ErrUnavailable
	}
	go func() {
		defer s.decTask()
		f(ctx)
	}()
	return nil
}
```
**When to use:** Any server or subsystem that spawns
goroutines and needs graceful shutdown. Especially in
long-running services where leaked goroutines cause
resource exhaustion.
**When NOT to use:** Simple programs with a single main
goroutine. Or when `errgroup` with context cancellation
suffices for the shutdown coordination.
---
## Pattern: Atomic File Operations with Suffix Convention
**Source:** `tsdb/db.go` (prometheus)
**Category:** storage
**What:** Use directory suffixes (`.tmp-for-creation`,
`.tmp-for-deletion`) to make multi-step file operations
crash-safe. On startup, clean up any dirs with these
suffixes (they represent incomplete operations).
**Why:** Database storage needs atomicity. If the process
crashes between creating a block and finalizing it, you
need to know the block is incomplete. The suffix convention
makes incomplete state visible at the filesystem level
without requiring a separate journal.
**Example:**
```go
const (
	tmpForDeletionBlockDirSuffix = ".tmp-for-deletion"
	tmpForCreationBlockDirSuffix = ".tmp-for-creation"
)
// On startup: remove any .tmp-* dirs (incomplete ops)
// On create: write to dir.tmp-for-creation, then rename
// On delete: rename to dir.tmp-for-deletion, then remove
```
**When to use:** Any system that manages files/directories
and needs crash consistency without a full WAL. Simpler
than a write-ahead log for coarse-grained operations.
**When NOT to use:** When you already have a WAL or
transaction log. Or for fine-grained operations where
rename semantics are insufficient.
---
## Pattern: Options as DefaultOptions() + Override
**Source:** `tsdb/db.go` (prometheus)
**Category:** configuration
**What:** Provide a `DefaultOptions()` function returning a
fully-populated config struct. Users copy and override only
what they need. No nil-means-default ambiguity.
**Why:** Large config structs (20+ fields) are unwieldy.
By providing sane defaults as a function (not a package-
level var), you avoid mutation bugs and make it clear what
"normal" looks like. Users only specify deviations.
**Example:**
```go
func DefaultOptions() *Options {
	return &Options{
		WALSegmentSize:    wlog.DefaultSegmentSize,
		RetentionDuration: int64(15 * 24 * time.Hour / ...),
		MinBlockDuration:  DefaultBlockDuration,
		MaxBlockDuration:  DefaultBlockDuration,
		SamplesPerChunk:   DefaultSamplesPerChunk,
		// ... 20 more fields with sane defaults
	}
}

// Usage:
opts := tsdb.DefaultOptions()
// RetentionDuration is an int64 in milliseconds, not a time.Duration.
opts.RetentionDuration = int64(30 * 24 * time.Hour / time.Millisecond)
db, err := tsdb.Open(dir, nil, nil, opts, nil)
```
**When to use:** Config structs with many fields where most
users want defaults. Especially when zero-value semantics
would be confusing (e.g., 0 retention = infinite? or off?).
**When NOT to use:** Small configs (3-4 fields) where
struct literal with zero-means-default is clear enough.
---
## Pattern: Scrape Loop with Aligned Timestamps
**Source:** `scrape/scrape.go` (prometheus)
**Category:** concurrency
**What:** Periodic scrape loops that align timestamps to
intervals with a small tolerance, enabling better storage
compression downstream.
**Why:** Time-series databases compress better when
timestamps are regular. A 2ms tolerance on alignment
means scraped data aligns to the expected grid while
accommodating real-world jitter.
**Example:**
```go
var ScrapeTimestampTolerance = 2 * time.Millisecond
var AlignScrapeTimestamps = true
// In scrape loop: if the scrape starts within tolerance
// of the expected timestamp, snap to the grid
```
**When to use:** Any periodic data collection where
downstream storage benefits from timestamp regularity.
Metrics, heartbeats, polling loops.
**When NOT to use:** Event-driven data where timestamps
must reflect actual occurrence time. Audit logs, user
actions, financial transactions.
<!-- PATTERN_COMPLETE -->