feat: add Ecto patterns extracted from elixir-ecto/ecto source

Adds 5 new pattern files covering the core Ecto APIs, each with real source citations at commit fd2ec52b, before/after examples, anti-patterns, and decision trees. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 19:22:39 -07:00
parent 4bc7f3b357
commit 7b38ac9b2a
5 changed files with 4277 additions and 0 deletions
@@ -0,0 +1,910 @@
+# Ecto Query Patterns
+
+Patterns extracted from Ecto's query layer source code.
+
+---
+
+## 1. Named Query Functions — Composable Query Building
+
+**Source:** [lib/ecto/query.ex#L1112](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L1112)
+
+**What it does:** Define named functions that accept a query and return a refined query. The query itself is the accumulator; each function layers one concern.
+
+```elixir
+# From lib/ecto/query.ex lines 1112-1134
+def paginate(query, page, size) do
+  from query,
+    limit: ^size,
+    offset: ^((page-1) * size)
+end
+
+def published(query) do
+  from p in query, where: not(is_nil(p.published_at))
+end
+```
+
+These functions compose naturally at the call site:
+
+```elixir
+User |> active() |> published() |> paginate(1, 20)
+```
+
+**Why:** Each function encodes exactly one policy decision. The composed result is a single query that the database executes once. Because `from query` appends rather than replaces, the caller chooses which policies to apply and in what order — without any one function needing to know about the others.
+
+**Anti-pattern:** One monolithic query that mixes pagination, filtering, and ordering:
+```elixir
+# Hard to reuse parts independently
+def list_published_users(page, size) do
+  from u in User,
+    where: u.active == true and not is_nil(u.published_at),
+    order_by: [desc: u.inserted_at],
+    limit: ^size,
+    offset: ^((page - 1) * size)
+end
+```
+
+### When to Use
+
+**Triggers:**
+- The same filter, ordering, or limit appears in multiple query contexts
+- You need to mix and match clauses — some queries paginate, some don't
+- A policy (e.g. "only active records") should be enforced consistently without copy-pasting conditions
+
+**Example — before:**
+```elixir
+def list_recent_posts(page, size) do
+  from p in Post,
+    where: not is_nil(p.published_at),
+    order_by: [desc: p.published_at],
+    limit: ^size,
+    offset: ^((page - 1) * size)
+end
+
+def count_published_posts do
+  from p in Post,
+    where: not is_nil(p.published_at),
+    select: count()
+end
+```
+
+**Example — after:**
+```elixir
+def published(query), do: from p in query, where: not is_nil(p.published_at)
+def by_newest(query), do: from p in query, order_by: [desc: p.published_at]
+def paginate(query, page, size) do
+  from query, limit: ^size, offset: ^((page - 1) * size)
+end
+
+def list_recent_posts(page, size) do
+  Post |> published() |> by_newest() |> paginate(page, size)
+end
+
+def count_published_posts do
+  Post |> published() |> select([p], count())
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The query is used exactly once and decomposing it adds names with no reuse value
+- The clauses are tightly coupled and meaningless in isolation (e.g. a join whose `on` condition references a specific sibling join)
+
+**Over-application example:**
+```elixir
+# Not worth extracting — used once, no meaningful reuse
+def with_user_and_org_and_permissions(query) do
+  from [u, o, p] in query,
+    where: u.org_id == o.id and p.user_id == u.id and p.role == "admin"
+end
+```
+
+**Better alternative:**
+```elixir
+from u in User,
+  join: o in Org, on: u.org_id == o.id,
+  join: p in Permission, on: p.user_id == u.id,
+  where: p.role == "admin"
+```
+
+**Why:** Extraction is worth it when the function has a name that communicates intent reusably. When a query is one-off and the extracted name just paraphrases the code, keep it inline.
+
+---
+
+## 2. Query Piping — Schema to Query Pipeline
+
+**Source:** [lib/ecto/query.ex#L310](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L310)
+
+**What it does:** Ecto implements the `Ecto.Queryable` protocol for schemas, strings, and query structs, so any of them can be the starting point of a pipeline. The pipe operator chains named query functions:
+
+```elixir
+# From the macro API docs (query.ex line 319-324)
+"users"
+|> where([u], u.age > 18)
+|> select([u], u.name)
+```
+
+Starting from a schema module name is idiomatic:
+
+```elixir
+User
+|> where([u], u.active == true)
+|> order_by([u], u.name)
+|> limit(10)
+```
+
+**Why:** The pipe operator makes query construction read left-to-right, mirroring how SQL clauses are mentally composed. The `Ecto.Queryable` protocol means `from Schema` and `Schema |> where(...)` are equivalent, so the choice of `from`/`|>` is stylistic — but pipe form scales better when each step is a named function.
+
+**Anti-pattern:** Building one monolithic keyword query instead of small composable pipes:
+```elixir
+# Cannot reuse paginate, active, or order_by separately
+from u in User,
+  where: u.active == true,
+  order_by: [asc: u.name],
+  limit: ^limit,
+  offset: ^offset
+```
+
+### When to Use
+
+**Triggers:**
+- You have 3+ clauses that each correspond to an independently reusable policy
+- The query is assembled conditionally based on runtime inputs
+- You want the query construction steps to be readable as an English sentence
+
+**Example — before:**
+```elixir
+from p in Post,
+  where: p.author_id == ^author_id and not is_nil(p.published_at),
+  order_by: [desc: p.published_at],
+  limit: 10
+```
+
+**Example — after:**
+```elixir
+Post
+|> by_author(author_id)
+|> published()
+|> by_newest()
+|> limit(10)
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The query has two or fewer clauses and a single `from` is more concise
+- You're assembling a complex join where positional bindings require a single `from` for clarity
+
+**Over-application example:**
+```elixir
+# Excessive piping for a trivial lookup
+User
+|> where([u], u.id == ^id)
+|> limit(1)
+|> Repo.one()
+```
+
+**Better alternative:**
+```elixir
+Repo.get(User, id)
+```
+
+**Why:** `Repo.get/2` and `Repo.get_by/2` exist precisely for simple lookups. Piping adds ceremony without benefit when the standard API already expresses the intent.
+
+---
+
+## 3. Named Bindings — Position-Independent Composition
+
+**Source:** [lib/ecto/query.ex#L211](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L211)
+
+**What it does:** Assign stable names to `from` and `join` sources using `as:`. Reference those names in any function without knowing or caring about join order.
+
+```elixir
+# Name the join at definition time (query.ex line 218-219)
+posts_with_comments =
+  from p in Post,
+    join: c in Comment, as: :comment, on: c.post_id == p.id
+
+# Reference by name instead of position (line 223)
+from [p, comment: c] in posts_with_comments, select: {p.title, c.body}
+```
+
+Generic sort function that works on any named binding (line 254-256):
+
+```elixir
+def sort(query, as, field) do
+  from [{^as, x}] in query, order_by: field(x, ^field)
+end
+```
+
+The spread `...` syntax lets you reference first and last bindings without caring about the middle (line 206):
+
+```elixir
+from [p, ..., c] in posts_with_comments, select: {p.title, c.body}
+```
+
+**Why:** Positional bindings break when a new join is inserted earlier in the pipeline. Named bindings are stable: adding a join between `Post` and `Comment` does not change how `:comment` is referenced. This is essential for composable query libraries where join order is not known at authorship time.
+
+**Anti-pattern:** Relying on position when queries are built across functions:
+```elixir
+# Breaks if anyone inserts a join before Comment
+def with_comment_body(query) do
+  from [p, c] in query, select: {p.title, c.body}
+end
+```
+
+### When to Use
+
+**Triggers:**
+- A join is referenced from a function that didn't define it
+- You're writing generic helpers (sorting, filtering) that work on any named source
+- Multiple joins make positional counting error-prone
+
+**Example — before:**
+```elixir
+def filter_by_org(query) do
+  from [u, o] in query, where: o.active == true  # Breaks if join order changes
+end
+```
+
+**Example — after:**
+```elixir
+# Define the join with a name
+from u in User,
+  join: o in Org, as: :org, on: u.org_id == o.id
+
+# Reference by name — order-independent
+def filter_by_org(query) do
+  from [org: o] in query, where: o.active == true
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The query lives entirely in one function and is never extended
+- You have a single join that will never be reordered — positional is fine and shorter
+- You're using `...` spread and don't need to reference intermediate sources by name
+
+**Over-application example:**
+```elixir
+# Naming everything in a one-off query adds noise
+from u in User, as: :user,
+  join: p in Post, as: :post, on: p.user_id == u.id,
+  where: p.published == true,
+  select: u
+```
+
+**Better alternative:**
+```elixir
+from u in User,
+  join: p in Post, on: p.user_id == u.id,
+  where: p.published == true,
+  select: u
+```
+
+**Why:** Named bindings pay off at composition boundaries. In a self-contained query, they add verbosity without stability benefit.
+
+---
+
+## 4. `dynamic/2` — Runtime-Constructed Predicates
+
+**Source:** [lib/ecto/query.ex#L770](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L770)
+
+**What it does:** `dynamic/2` builds query expressions at runtime without executing a query. The resulting value can be composed with `and`/`or` and interpolated into `where`, `having`, `on`, `order_by`, and `select_merge`.
+
+```elixir
+# From the dynamic/2 docs (query.ex lines 568-585)
+conditions = false
+
+conditions =
+  if params["is_public"] do
+    dynamic([p], p.is_public or ^conditions)
+  else
+    conditions
+  end
+
+conditions =
+  if params["allow_reviewers"] do
+    dynamic([p, a], a.reviewer == true or ^conditions)
+  else
+    conditions
+  end
+
+from query, where: ^conditions
+```
+
+The canonical reduce pattern for multi-field search forms:
+
+```elixir
+def filter(params) do
+  Enum.reduce(params, dynamic(true), fn
+    {:name, name}, dynamic ->
+      dynamic([p], ^dynamic and p.name == ^name)
+    {:age, age}, dynamic ->
+      dynamic([p], ^dynamic and p.age > ^age)
+    _, dynamic ->
+      dynamic
+  end)
+end
+
+from p in Post, where: ^filter(params)
+```
+
+**Why:** Without `dynamic/2`, building conditional filters requires runtime `if` guards that build different query structs, or string interpolation (SQL injection risk). `dynamic/2` keeps filtering logic in Ecto's type-safe DSL while composing predicates conditionally. The resulting expression is validated and cast before the query runs.
+
+**Anti-pattern:** Building filter strings with interpolation, or separate query branches per condition:
+```elixir
+# SQL injection risk
+where_clause = "name = '#{params["name"]}'"
+Repo.query("SELECT * FROM posts WHERE #{where_clause}")
+
+# Brittle — duplicates the query structure N times
+query =
+  if params["name"] do
+    from p in Post, where: p.name == ^params["name"]
+  else
+    from p in Post
+  end
+```
+
+### When to Use
+
+**Triggers:**
+- You're building a search or filter form where 0..N conditions apply based on user input
+- Conditions need to be composed with `and`/`or` across different code paths
+- You want conditional filtering without forking the entire query
+
+**Example — before:**
+```elixir
+def search(params) do
+  query = from p in Post
+
+  query =
+    if params[:title] do
+      from p in query, where: ilike(p.title, ^"%#{params[:title]}%")
+    else
+      query
+    end
+
+  query =
+    if params[:category] do
+      from p in query, where: p.category == ^params[:category]
+    else
+      query
+    end
+
+  query
+end
+```
+
+**Example — after:**
+```elixir
+def search(params) do
+  filters =
+    Enum.reduce(params, dynamic(true), fn
+      {:title, title}, d -> dynamic([p], ^d and ilike(p.title, ^"%#{title}%"))
+      {:category, cat}, d -> dynamic([p], ^d and p.category == ^cat)
+      _, d -> d
+    end)
+
+  from p in Post, where: ^filters
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The conditions are always applied — static `where` clauses in a named function are simpler
+- You only have one conditional — a simple `if` that builds two query variants is clearer
+- The condition references a join binding that may not exist — use named bindings and verify first
+
+**Over-application example:**
+```elixir
+# dynamic() for a condition that's always present
+filters = dynamic([p], p.active == true)
+from p in Post, where: ^filters
+```
+
+**Better alternative:**
+```elixir
+from p in Post, where: p.active == true
+```
+
+**Why:** `dynamic/2` introduces a layer of indirection. When the condition is unconditional, a plain `where` clause in the `from` expression communicates intent more directly.
+
+---
+
+## 5. `subquery/1` — Correlated Subqueries
+
+**Source:** [lib/ecto/query.ex#L897](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L897)
+
+**What it does:** `subquery/1` wraps an `Ecto.Query` for use as a source inside another query — in joins, `where` conditions, or directly in `select`. The canonical use case is batched `update_all` without loading rows into memory.
+
+```elixir
+# From subquery/1 docs (query.ex lines 869-878)
+subset = from(p in Post,
+  where: p.synced == false and
+           (is_nil(p.sync_started_at) or p.sync_started_at < ^min_sync_started_at),
+  limit: ^batch_size
+)
+
+Repo.update_all(
+  from(p in Post, join: s in subquery(subset), on: s.id == p.id),
+  set: [sync_started_at: NaiveDateTime.utc_now()]
+)
+```
+
+Correlated subquery in `select` using `parent_as` (lines 894-895):
+
+```elixir
+comments_count = from(c in Comment, where: c.post_id == parent_as(:post).id, select: count())
+from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_count)})
+```
+
+**Why:** Batched updates via a subquery join let the database enforce the limit at the SQL level — no rows are fetched into Elixir. `parent_as` correlates a subquery to the outer query's binding, computing aggregates per row without an explicit `GROUP BY` in the outer query.
+
+**Anti-pattern:** Loading rows into memory to get their IDs, then issuing a second query:
+```elixir
+# Fetches all IDs into memory before updating
+ids =
+  Post
+  |> where([p], p.synced == false)
+  |> limit(^batch_size)
+  |> select([p], p.id)
+  |> Repo.all()
+
+Repo.update_all(from(p in Post, where: p.id in ^ids),
+  set: [sync_started_at: NaiveDateTime.utc_now()])
+```
+
+### When to Use
+
+**Triggers:**
+- You need to batch-update records matching a subselect without loading them
+- You need a per-row aggregate (count, sum) in a `select` without adding it as a join
+- The subquery filter depends on the parent row's value (`parent_as`)
+
+**Example — before:**
+```elixir
+# N+1 pattern — one query per post to count comments
+posts = Repo.all(Post)
+Enum.map(posts, fn post ->
+  count = Repo.aggregate(from(c in Comment, where: c.post_id == ^post.id), :count)
+  Map.put(post, :comment_count, count)
+end)
+```
+
+**Example — after:**
+```elixir
+comments_count =
+  from c in Comment,
+    where: c.post_id == parent_as(:post).id,
+    select: count()
+
+Repo.all(from p in Post, as: :post,
+  select: %{id: p.id, title: p.title, comment_count: subquery(comments_count)})
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- A join + `group_by` expresses the aggregation more clearly and performs comparably
+- The subquery is not correlated — a preload or separate query may be more readable
+- The query is simple enough that `Repo.all` + in-memory grouping is fast enough and clearer
+
+**Over-application example:**
+```elixir
+# Subquery where a simple preload is idiomatic
+comments_query = from(c in Comment, where: c.post_id == parent_as(:post).id)
+from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_query)})
+# Returns raw maps, not structs — preloads are often better for associations
+```
+
+**Better alternative:**
+```elixir
+Post |> Repo.all() |> Repo.preload(:comments)
+```
+
+**Why:** `subquery/1` is best suited to aggregates and batched writes. For loading associated structs, `preload` is idiomatic and returns properly typed structs that Ecto's association machinery can use.
+
+---
+
+## 6. `exclude/2` — Strip Clauses for Reuse
+
+**Source:** [lib/ecto/query.ex#L989](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L989)
+
+**What it does:** Removes one or more previously set clauses from a query. Enables deriving a variant of a base query — most commonly stripping `select`, `order_by`, and `preload` to build a count query.
+
+```elixir
+# From exclude/2 docs (query.ex lines 946-958)
+Ecto.Query.exclude(query, :join)
+Ecto.Query.exclude(query, :where)
+Ecto.Query.exclude(query, :order_by)
+Ecto.Query.exclude(query, :select)
+Ecto.Query.exclude(query, :preload)
+Ecto.Query.exclude(query, :limit)
+Ecto.Query.exclude(query, :offset)
+
+# Remove a list at once (line 964)
+Ecto.Query.exclude(query, [:limit, :offset])
+```
+
+The count query pattern:
+
+```elixir
+def count_query(query) do
+  query
+  |> exclude(:select)
+  |> exclude(:order_by)
+  |> exclude(:preload)
+  |> select([x], count(x.id))
+end
+```
+
+**Why:** Without `exclude/2`, you must maintain two parallel query paths — one for data, one for counts — that can drift out of sync. Deriving the count query from the data query guarantees they share all `where` and `join` clauses: adding a filter in one place automatically applies to both.
+
+**Anti-pattern:** Two independent query definitions that must be kept in sync manually:
+```elixir
+# Any filter added to data_query must also be added to count_query
+def data_query(params) do
+  from p in Post, where: p.active == true, order_by: [desc: p.inserted_at]
+end
+
+def count_query(params) do
+  from p in Post, where: p.active == true, select: count()  # easy to forget
+end
+```
+
+### When to Use
+
+**Triggers:**
+- You need both a data query and a count query from the same base (pagination)
+- A query includes `order_by` or `limit` that must be absent for counting or aggregation
+- You need to reuse a query's `where` clauses in an update or delete without its `select`
+
+**Example — before:**
+```elixir
+defmodule MyApp.Posts do
+  def list_posts(filters) do
+    base = build_base_query(filters)
+    data = Repo.all(base)
+    count = Repo.aggregate(build_count_query(filters), :count)
+    {data, count}
+  end
+
+  defp build_base_query(filters), do: ...
+  defp build_count_query(filters), do: ...  # must track build_base_query manually
+end
+```
+
+**Example — after:**
+```elixir
+defmodule MyApp.Posts do
+  def list_posts(filters) do
+    base = build_base_query(filters)
+    data = Repo.all(base)
+    count = base |> exclude(:select) |> exclude(:order_by) |> Repo.aggregate(:count)
+    {data, count}
+  end
+
+  defp build_base_query(filters), do: ...  # one source of truth
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The base query uses a `join` that is only needed for sorting, not filtering — excluding `order_by` still keeps the join, which may produce duplicates in the count
+- The clauses to exclude would leave the query in an invalid state (e.g. excluding `select` from a query with `select_merge` built on it)
+- The count query differs structurally enough that a shared base would be forced
+
+**Over-application example:**
+```elixir
+# exclude doesn't remove joins — count may be inflated by join duplicates
+def count(query) do
+  query
+  |> exclude(:order_by)
+  |> exclude(:select)
+  |> select([x], count(x.id))
+  |> Repo.one()
+  # If query has a left_join, this overcounts
+end
+```
+
+**Better alternative:**
+```elixir
+def count(query) do
+  query
+  |> exclude(:order_by)
+  |> exclude(:select)
+  |> select([x], count(x.id, :distinct))  # or exclude joins and recount
+  |> Repo.one()
+end
+```
+
+**Why:** `exclude/2` removes clause expressions but not join sources. If the base query joins tables that multiply rows (one-to-many), counting without `DISTINCT` overstates results. Know your join cardinality before deriving count queries this way.
+
+---
+
+## 7. Bindingless Queries — Data-Driven Clauses
+
+**Source:** [lib/ecto/query.ex#L264](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L264)
+
+**What it does:** When a query has only one source and clauses use simple field equality or fixed expressions, bindings can be omitted entirely. Clauses accept keyword lists and atom field names.
+
+```elixir
+# From the bindingless docs (query.ex lines 265-268)
+from Post,
+  where: [category: "fresh and new"],
+  order_by: [desc: :published_at],
+  select: [:id, :title, :body]
+```
+
+This is equivalent to the binding form:
+
+```elixir
+from p in Post,
+  where: p.category == "fresh and new",
+  order_by: [desc: p.published_at],
+  select: struct(p, [:id, :title, :body])
+```
+
+Bindingless syntax is fully dynamic (line 283-287):
+
+```elixir
+where = [category: "fresh and new"]
+order_by = [desc: :published_at]
+select = [:id, :title, :body]
+from Post, where: ^where, order_by: ^order_by, select: ^select
+```
+
+**Why:** Bindings exist to name sources so they can be referenced in expressions. When you're only filtering by equality on fields of the single source, bindings add syntax without adding capability. The bindingless form is shorter, more data-driven, and maps cleanly to keyword lists built at runtime.
+
+**Anti-pattern:** Always using binding syntax even for simple equality filters:
+```elixir
+# More verbose than necessary for simple filters
+from p in Post,
+  where: p.category == ^category and p.status == ^status,
+  select: [:id, :title]
+```
+
+### When to Use
+
+**Triggers:**
+- The query has exactly one source (no joins)
+- All `where` conditions are field equality checks against interpolated values
+- You're building query clauses dynamically from a map or keyword list (web search forms, CLIs)
+
+**Example — before:**
+```elixir
+def search(filters) do
+  from p in Post,
+    where: p.category == ^filters[:category],
+    where: p.status == ^filters[:status],
+    select: [:id, :title, :body]
+end
+```
+
+**Example — after:**
+```elixir
+def search(filters) do
+  where = Keyword.take(filters, [:category, :status])
+  from Post, where: ^where, select: [:id, :title, :body]
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The query includes a `join` — bindings are required to reference joined sources
+- A `where` condition uses operators other than equality (`>`, `<`, `like`, `fragment`)
+- You need to pass the source binding to a function like `field/2` or `type/2`
+
+**Over-application example:**
+```elixir
+# Bindingless can't express non-equality conditions
+from Post,
+  where: [inserted_at: ^date]  # Works only for exact equality — not a range
+```
+
+**Better alternative:**
+```elixir
+from p in Post, where: p.inserted_at >= ^start_date and p.inserted_at < ^end_date
+```
+
+**Why:** Bindingless keyword syntax maps to equality (`==`). Any non-equality comparison, function call, or multi-table reference requires a named binding. Use bindingless for pure equality filters; reach for bindings the moment expressions get richer.
+
+---
+
+## 8. `select_merge/3` — Augmenting Selects Dynamically
+
+**Source:** [lib/ecto/query.ex#L693](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L693)
+
+**What it does:** Merges additional fields into an existing `select` without replacing it. Especially useful with `dynamic/2` to add computed columns conditionally.
+
+```elixir
+# From the dynamic docs (query.ex lines 693-695)
+metric = dynamic([p], p.distance)
+
+from query, select: [:period, :metric], select_merge: ^%{metric: metric}
+```
+
+With aliasing and dynamic ordering (lines 700-707):
+
+```elixir
+fields = %{
+  period: dynamic([p], selected_as(p.month, :month)),
+  metric: dynamic([p], p.distance)
+}
+
+order = dynamic(selected_as(:month))
+
+from query, select: ^fields, order_by: ^order
+```
+
+**Why:** `select_merge` lets base queries define the fixed fields and separate concerns add computed or conditional fields. Without it, adding a field requires rewriting the entire `select` clause — or maintaining multiple `select` variants. Combined with `dynamic/2`, it enables data-driven projections where which columns appear depends on runtime configuration.
+
+**Anti-pattern:** Rewriting the entire `select` whenever a computed column is needed:
+```elixir
+# The base select must be duplicated in every variant
+def with_distance(query) do
+  from p in query, select: %{id: p.id, name: p.name, distance: p.distance}
+end
+
+def without_distance(query) do
+  from p in query, select: %{id: p.id, name: p.name}
+end
+```
+
+### When to Use
+
+**Triggers:**
+- A computed column should be added conditionally depending on caller context
+- You're building a reporting query where which aggregates appear is configured at runtime
+- A base query provides the structural select and feature-specific code augments it
+
+**Example — before:**
+```elixir
+def list_metrics(include_distance?) do
+  if include_distance? do
+    from p in Post, select: %{period: p.period, metric: p.views, distance: p.distance}
+  else
+    from p in Post, select: %{period: p.period, metric: p.views}
+  end
+end
+```
+
+**Example — after:**
+```elixir
+def list_metrics(opts) do
+  base = from p in Post, select: %{period: p.period, metric: p.views}
+
+  if opts[:include_distance] do
+    from p in base, select_merge: %{distance: p.distance}
+  else
+    base
+  end
+end
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The `select` is a single value or tuple rather than a map — `select_merge` requires a map shape
+- The added field requires a binding not present in the base query
+- You're replacing, not augmenting — use a plain `select` or `exclude(:select)` first
+
+**Over-application example:**
+```elixir
+# select_merge on a non-map select causes a runtime error
+from p in Post,
+  select: p.name,
+  select_merge: %{email: p.email}  # error: base select is not a map
+```
+
+**Better alternative:**
+```elixir
+from p in Post, select: %{name: p.name, email: p.email}
+```
+
+**Why:** `select_merge` merges into the existing map-shaped select. If the base select returns a struct or a scalar, there is no map to merge into. Ensure the base `select` produces a map before using `select_merge`.
+
+---
+
+## 9. `fragment/1` and `type/2` — Escape Hatches for DB-Specific Expressions
+
+**Source:** [lib/ecto/query.ex#L291](https://github.com/elixir-ecto/ecto/blob/fd2ec52b5ae1f775747308f0fd9ffc160515514b/lib/ecto/query.ex#L291)
+
+**What it does:** `fragment/1` passes a raw SQL expression to the database engine for functions or operators the Ecto DSL cannot express. `type/2` coerces an Elixir value to a specific Ecto type for comparison when schema type information is not available.
+
+```elixir
+# From the fragments docs (query.ex lines 301-303)
+from p in Post,
+  where: is_nil(p.published_at) and
+         fragment("lower(?)", p.title) == ^title
+```
+
+`type/2` for schemaless queries where Ecto cannot infer the cast type:
+
+```elixir
+# Coerce to :integer when no schema field exists to infer from
+from u in "users",
+  where: u.age > type(^age, :integer)
+```
+
+**Why:** Databases have functions (full-text search, JSON operators, trigram similarity, window functions) that the Ecto DSL cannot enumerate. `fragment/1` is the intentional escape hatch: interpolations with `?` placeholders are still parameterized, so SQL injection is not a risk. `type/2` is necessary for schemaless queries where Ecto cannot cast a bound parameter to the correct DB type automatically.
+
+**Anti-pattern:** Using fragment for everything, bypassing Ecto's type safety and composability:
+```elixir
+# Loses all type inference; fragment output is opaque to Ecto
+from u in User,
+  where: fragment("? = ? AND ? > ?", u.name, ^name, u.age, ^age)
+```
+
+Or using string interpolation inside fragments:
+
+```elixir
+# SQL injection — never interpolate directly into fragment strings
+from u in User,
+  where: fragment("lower(email) = '#{email}'")
+```
+
+### When to Use
+
+**Triggers:**
+- The database function has no Ecto DSL equivalent (e.g. `lower()`, `similarity()`, `jsonb_array_elements()`)
+- You're writing a schemaless query (`from u in "users"`) and need to cast a bound parameter
+- A DB-specific operator or syntax is required for a performance-critical path
+
+**Example — before:**
+```elixir
+# Can't express case-insensitive equality in pure Ecto DSL
+from u in User, where: u.email == ^String.downcase(email)
+# Compares raw stored value; doesn't work if DB stores mixed case
+```
+
+**Example — after:**
+```elixir
+from u in User,
+  where: fragment("lower(?)", u.email) == ^String.downcase(email)
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The Ecto DSL or a library like `EctoCommons` already provides the operation
+- You want type-cast values in a regular schema query — Ecto infers the type from the field
+- You're tempted to fragment an entire `WHERE` clause — named functions with `dynamic/2` compose better
+
+**Over-application example:**
+```elixir
+# Fragments for standard operations Ecto handles natively
+from p in Post,
+  where: fragment("? = ?", p.status, ^:published),
+  order_by: fragment("? DESC", p.inserted_at)
+```
+
+**Better alternative:**
+```elixir
+from p in Post,
+  where: p.status == :published,
+  order_by: [desc: p.inserted_at]
+```
+
+**Why:** Fragment output is opaque to Ecto's type system: no cast validation, no composability with `dynamic/2` type inference, and no portability across adapters. Reserve `fragment/1` for genuine gaps in the DSL; prefer native Ecto expressions for everything the DSL can express.
+
+---
+
+## Decision Tree
+
+- If you need to filter conditionally at runtime → `dynamic/2` (Pattern 4)
+- If you need to join or sort across function composition boundaries → named bindings with `as:` (Pattern 3)
+- If you need a count or aggregate from the same base as a data query → `exclude/2` (Pattern 6)
+- If you need a DB-side correlated count or aggregate per row → `subquery/1` with `parent_as` (Pattern 5)
+- If the query has one source and all filters are equality checks → bindingless keyword syntax (Pattern 7)
+- If you need to add computed columns without rewriting the select → `select_merge/3` (Pattern 8)
+- If the DB function has no Ecto DSL equivalent → `fragment/1` as last resort (Pattern 9)
+- For all queries: define small named functions that take a query and return a query (Patterns 1 and 2)
+
+<!-- PATTERN_COMPLETE -->