30 KiB
Ecto Query Patterns
Patterns extracted from Ecto's query layer source code.
Contents
- Named Query Functions — Composable Query Building
- Query Piping — Schema to Query Pipeline
- Named Bindings — Position-Independent Composition
dynamic/2— Runtime-Constructed Predicatessubquery/1— Correlated Subqueriesexclude/2— Strip Clauses for Reuse- Bindingless Queries — Data-Driven Clauses
select_merge/3— Augmenting Selects Dynamicallyfragment/1andtype/2— Escape Hatches for DB-Specific Expressions
1. Named Query Functions — Composable Query Building
Source: lib/ecto/query.ex#L1112
What it does: Define named functions that accept a query and return a refined query. The query itself is the accumulator; each function layers one concern.
# From lib/ecto/query.ex lines 1112-1134
def paginate(query, page, size) do
from query,
limit: ^size,
offset: ^((page-1) * size)
end
def published(query) do
from p in query, where: not(is_nil(p.published_at))
end
These functions compose naturally at the call site:
User |> active() |> published() |> paginate(1, 20)
Why: Each function encodes exactly one policy decision. The composed result is a single query that the database executes once. Because from query appends rather than replaces, the caller chooses which policies to apply and in what order — without any one function needing to know about the others.
Anti-pattern: One monolithic query that mixes pagination, filtering, and ordering:
# Hard to reuse parts independently
def list_published_users(page, size) do
from u in User,
where: u.active == true and not is_nil(u.published_at),
order_by: [desc: u.inserted_at],
limit: ^size,
offset: ^((page - 1) * size)
end
When to Use
Triggers:
- The same filter, ordering, or limit appears in multiple query contexts
- You need to mix and match clauses — some queries paginate, some don't
- A policy (e.g. "only active records") should be enforced consistently without copy-pasting conditions
Example — before:
def list_recent_posts(page, size) do
from p in Post,
where: not is_nil(p.published_at),
order_by: [desc: p.published_at],
limit: ^size,
offset: ^((page - 1) * size)
end
def count_published_posts do
from p in Post,
where: not is_nil(p.published_at),
select: count()
end
Example — after:
def published(query), do: from p in query, where: not is_nil(p.published_at)
def by_newest(query), do: from p in query, order_by: [desc: p.published_at]
def paginate(query, page, size) do
from query, limit: ^size, offset: ^((page - 1) * size)
end
def list_recent_posts(page, size) do
Post |> published() |> by_newest() |> paginate(page, size)
end
def count_published_posts do
Post |> published() |> select([p], count())
end
When NOT to Use
Don't use this when:
- The query is used exactly once and decomposing it adds names with no reuse value
- The clauses are tightly coupled and meaningless in isolation (e.g. a join whose
oncondition references a specific sibling join)
Over-application example:
# Not worth extracting — used once, no meaningful reuse
def with_user_and_org_and_permissions(query) do
from [u, o, p] in query,
where: u.org_id == o.id and p.user_id == u.id and p.role == "admin"
end
Better alternative:
from u in User,
join: o in Org, on: u.org_id == o.id,
join: p in Permission, on: p.user_id == u.id,
where: p.role == "admin"
Why: Extraction is worth it when the function has a name that communicates intent reusably. When a query is one-off and the extracted name just paraphrases the code, keep it inline.
2. Query Piping — Schema to Query Pipeline
Source: lib/ecto/query.ex#L310
What it does: Ecto implements the Ecto.Queryable protocol for schemas, strings, and query structs, so any of them can be the starting point of a pipeline. The pipe operator chains named query functions:
# From the macro API docs (query.ex line 319-324)
"users"
|> where([u], u.age > 18)
|> select([u], u.name)
Starting from a schema module name is idiomatic:
User
|> where([u], u.active == true)
|> order_by([u], u.name)
|> limit(10)
Why: The pipe operator makes query construction read left-to-right, mirroring how SQL clauses are mentally composed. The Ecto.Queryable protocol means from Schema and Schema |> where(...) are equivalent, so the choice of from/|> is stylistic — but pipe form scales better when each step is a named function.
Anti-pattern: Building one monolithic keyword query instead of small composable pipes:
# Cannot reuse paginate, active, or order_by separately
from u in User,
where: u.active == true,
order_by: [asc: u.name],
limit: ^limit,
offset: ^offset
When to Use
Triggers:
- You have 3+ clauses that each correspond to an independently reusable policy
- The query is assembled conditionally based on runtime inputs
- You want the query construction steps to be readable as an English sentence
Example — before:
from p in Post,
where: p.author_id == ^author_id and not is_nil(p.published_at),
order_by: [desc: p.published_at],
limit: 10
Example — after:
Post
|> by_author(author_id)
|> published()
|> by_newest()
|> limit(10)
When NOT to Use
Don't use this when:
- The query has two or fewer clauses and a single
fromis more concise - You're assembling a complex join where positional bindings require a single
fromfor clarity
Over-application example:
# Excessive piping for a trivial lookup
User
|> where([u], u.id == ^id)
|> limit(1)
|> Repo.one()
Better alternative:
Repo.get(User, id)
Why: Repo.get/2 and Repo.get_by/2 exist precisely for simple lookups. Piping adds ceremony without benefit when the standard API already expresses the intent.
3. Named Bindings — Position-Independent Composition
Source: lib/ecto/query.ex#L211
What it does: Assign stable names to from and join sources using as:. Reference those names in any function without knowing or caring about join order.
# Name the join at definition time (query.ex line 218-219)
posts_with_comments =
from p in Post,
join: c in Comment, as: :comment, on: c.post_id == p.id
# Reference by name instead of position (line 223)
from [p, comment: c] in posts_with_comments, select: {p.title, c.body}
Generic sort function that works on any named binding (line 254-256):
def sort(query, as, field) do
from [{^as, x}] in query, order_by: field(x, ^field)
end
The spread ... syntax lets you reference first and last bindings without caring about the middle (line 206):
from [p, ..., c] in posts_with_comments, select: {p.title, c.body}
Why: Positional bindings break when a new join is inserted earlier in the pipeline. Named bindings are stable: adding a join between Post and Comment does not change how :comment is referenced. This is essential for composable query libraries where join order is not known at authorship time.
Anti-pattern: Relying on position when queries are built across functions:
# Breaks if anyone inserts a join before Comment
def with_comment_body(query) do
from [p, c] in query, select: {p.title, c.body}
end
When to Use
Triggers:
- A join is referenced from a function that didn't define it
- You're writing generic helpers (sorting, filtering) that work on any named source
- Multiple joins make positional counting error-prone
Example — before:
def filter_by_org(query) do
from [u, o] in query, where: o.active == true # Breaks if join order changes
end
Example — after:
# Define the join with a name
from u in User,
join: o in Org, as: :org, on: u.org_id == o.id
# Reference by name — order-independent
def filter_by_org(query) do
from [org: o] in query, where: o.active == true
end
When NOT to Use
Don't use this when:
- The query lives entirely in one function and is never extended
- You have a single join that will never be reordered — positional is fine and shorter
- You're using
...spread and don't need to reference intermediate sources by name
Over-application example:
# Naming everything in a one-off query adds noise
from u in User, as: :user,
join: p in Post, as: :post, on: p.user_id == u.id,
where: p.published == true,
select: u
Better alternative:
from u in User,
join: p in Post, on: p.user_id == u.id,
where: p.published == true,
select: u
Why: Named bindings pay off at composition boundaries. In a self-contained query, they add verbosity without stability benefit.
4. dynamic/2 — Runtime-Constructed Predicates
Source: lib/ecto/query.ex#L770
What it does: dynamic/2 builds query expressions at runtime without executing a query. The resulting value can be composed with and/or and interpolated into where, having, on, order_by, and select_merge.
# From the dynamic/2 docs (query.ex lines 568-585)
conditions = false
conditions =
if params["is_public"] do
dynamic([p], p.is_public or ^conditions)
else
conditions
end
conditions =
if params["allow_reviewers"] do
dynamic([p, a], a.reviewer == true or ^conditions)
else
conditions
end
from query, where: ^conditions
The canonical reduce pattern for multi-field search forms:
def filter(params) do
Enum.reduce(params, dynamic(true), fn
{:name, name}, dynamic ->
dynamic([p], ^dynamic and p.name == ^name)
{:age, age}, dynamic ->
dynamic([p], ^dynamic and p.age > ^age)
_, dynamic ->
dynamic
end)
end
from p in Post, where: ^filter(params)
Why: Without dynamic/2, building conditional filters requires runtime if guards that build different query structs, or string interpolation (SQL injection risk). dynamic/2 keeps filtering logic in Ecto's type-safe DSL while composing predicates conditionally. The resulting expression is validated and cast before the query runs.
Anti-pattern: Building filter strings with interpolation, or separate query branches per condition:
# SQL injection risk
where_clause = "name = '#{params["name"]}'"
Repo.query("SELECT * FROM posts WHERE #{where_clause}")
# Brittle — duplicates the query structure N times
query =
if params["name"] do
from p in Post, where: p.name == ^params["name"]
else
from p in Post
end
When to Use
Triggers:
- You're building a search or filter form where 0..N conditions apply based on user input
- Conditions need to be composed with
and/oracross different code paths - You want conditional filtering without forking the entire query
Example — before:
def search(params) do
query = from p in Post
query =
if params[:title] do
from p in query, where: ilike(p.title, ^"%#{params[:title]}%")
else
query
end
query =
if params[:category] do
from p in query, where: p.category == ^params[:category]
else
query
end
query
end
Example — after:
def search(params) do
filters =
Enum.reduce(params, dynamic(true), fn
{:title, title}, d -> dynamic([p], ^d and ilike(p.title, ^"%#{title}%"))
{:category, cat}, d -> dynamic([p], ^d and p.category == ^cat)
_, d -> d
end)
from p in Post, where: ^filters
end
When NOT to Use
Don't use this when:
- The conditions are always applied — static
whereclauses in a named function are simpler - You only have one conditional — a simple
ifthat builds two query variants is clearer - The condition references a join binding that may not exist — use named bindings and verify first
Over-application example:
# dynamic() for a condition that's always present
filters = dynamic([p], p.active == true)
from p in Post, where: ^filters
Better alternative:
from p in Post, where: p.active == true
Why: dynamic/2 introduces a layer of indirection. When the condition is unconditional, a plain where clause in the from expression communicates intent more directly.
5. subquery/1 — Correlated Subqueries
Source: lib/ecto/query.ex#L897
What it does: subquery/1 wraps an Ecto.Query for use as a source inside another query — in joins, where conditions, or directly in select. The canonical use case is batched update_all without loading rows into memory.
# From subquery/1 docs (query.ex lines 869-878)
subset = from(p in Post,
where: p.synced == false and
(is_nil(p.sync_started_at) or p.sync_started_at < ^min_sync_started_at),
limit: ^batch_size
)
Repo.update_all(
from(p in Post, join: s in subquery(subset), on: s.id == p.id),
set: [sync_started_at: NaiveDateTime.utc_now()]
)
Correlated subquery in select using parent_as (lines 894-895):
comments_count = from(c in Comment, where: c.post_id == parent_as(:post).id, select: count())
from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_count)})
Why: Batched updates via a subquery join let the database enforce the limit at the SQL level — no rows are fetched into Elixir. parent_as correlates a subquery to the outer query's binding, computing aggregates per row without an explicit GROUP BY in the outer query.
Anti-pattern: Loading rows into memory to get their IDs, then issuing a second query:
# Fetches all IDs into memory before updating
ids =
Post
|> where([p], p.synced == false)
|> limit(^batch_size)
|> select([p], p.id)
|> Repo.all()
Repo.update_all(from(p in Post, where: p.id in ^ids),
set: [sync_started_at: NaiveDateTime.utc_now()])
When to Use
Triggers:
- You need to batch-update records matching a subselect without loading them
- You need a per-row aggregate (count, sum) in a
selectwithout adding it as a join - The subquery filter depends on the parent row's value (
parent_as)
Example — before:
# N+1 pattern — one query per post to count comments
posts = Repo.all(Post)
Enum.map(posts, fn post ->
count = Repo.aggregate(from(c in Comment, where: c.post_id == ^post.id), :count)
Map.put(post, :comment_count, count)
end)
Example — after:
comments_count =
from c in Comment,
where: c.post_id == parent_as(:post).id,
select: count()
Repo.all(from p in Post, as: :post,
select: %{id: p.id, title: p.title, comment_count: subquery(comments_count)})
When NOT to Use
Don't use this when:
- A join +
group_byexpresses the aggregation more clearly and performs comparably - The subquery is not correlated — a preload or separate query may be more readable
- The query is simple enough that
Repo.all+ in-memory grouping is fast enough and clearer
Over-application example:
# Subquery where a simple preload is idiomatic
comments_query = from(c in Comment, where: c.post_id == parent_as(:post).id)
from(p in Post, as: :post, select: %{id: p.id, comments: subquery(comments_query)})
# Returns raw maps, not structs — preloads are often better for associations
Better alternative:
Post |> Repo.all() |> Repo.preload(:comments)
Why: subquery/1 is best suited to aggregates and batched writes. For loading associated structs, preload is idiomatic and returns properly typed structs that Ecto's association machinery can use.
6. exclude/2 — Strip Clauses for Reuse
Source: lib/ecto/query.ex#L989
What it does: Removes one or more previously set clauses from a query. Enables deriving a variant of a base query — most commonly stripping select, order_by, and preload to build a count query.
# From exclude/2 docs (query.ex lines 946-958)
Ecto.Query.exclude(query, :join)
Ecto.Query.exclude(query, :where)
Ecto.Query.exclude(query, :order_by)
Ecto.Query.exclude(query, :select)
Ecto.Query.exclude(query, :preload)
Ecto.Query.exclude(query, :limit)
Ecto.Query.exclude(query, :offset)
# Remove a list at once (line 964)
Ecto.Query.exclude(query, [:limit, :offset])
The count query pattern:
def count_query(query) do
query
|> exclude(:select)
|> exclude(:order_by)
|> exclude(:preload)
|> select([x], count(x.id))
end
Why: Without exclude/2, you must maintain two parallel query paths — one for data, one for counts — that can drift out of sync. Deriving the count query from the data query guarantees they share all where and join clauses: adding a filter in one place automatically applies to both.
Anti-pattern: Two independent query definitions that must be kept in sync manually:
# Any filter added to data_query must also be added to count_query
def data_query(params) do
from p in Post, where: p.active == true, order_by: [desc: p.inserted_at]
end
def count_query(params) do
from p in Post, where: p.active == true, select: count() # easy to forget
end
When to Use
Triggers:
- You need both a data query and a count query from the same base (pagination)
- A query includes
order_byorlimitthat must be absent for counting or aggregation - You need to reuse a query's
whereclauses in an update or delete without itsselect
Example — before:
defmodule MyApp.Posts do
def list_posts(filters) do
base = build_base_query(filters)
data = Repo.all(base)
count = Repo.aggregate(build_count_query(filters), :count)
{data, count}
end
defp build_base_query(filters), do: ...
defp build_count_query(filters), do: ... # must track build_base_query manually
end
Example — after:
defmodule MyApp.Posts do
def list_posts(filters) do
base = build_base_query(filters)
data = Repo.all(base)
count = base |> exclude(:select) |> exclude(:order_by) |> Repo.aggregate(:count)
{data, count}
end
defp build_base_query(filters), do: ... # one source of truth
end
When NOT to Use
Don't use this when:
- The base query uses a
jointhat is only needed for sorting, not filtering — excludingorder_bystill keeps the join, which may produce duplicates in the count - The clauses to exclude would leave the query in an invalid state (e.g. excluding
selectfrom a query withselect_mergebuilt on it) - The count query differs structurally enough that a shared base would be forced
Over-application example:
# exclude doesn't remove joins — count may be inflated by join duplicates
def count(query) do
query
|> exclude(:order_by)
|> exclude(:select)
|> select([x], count(x.id))
|> Repo.one()
# If query has a left_join, this overcounts
end
Better alternative:
def count(query) do
query
|> exclude(:order_by)
|> exclude(:select)
|> select([x], count(x.id, :distinct)) # or exclude joins and recount
|> Repo.one()
end
Why: exclude/2 removes clause expressions but not join sources. If the base query joins tables that multiply rows (one-to-many), counting without DISTINCT overstates results. Know your join cardinality before deriving count queries this way.
7. Bindingless Queries — Data-Driven Clauses
Source: lib/ecto/query.ex#L264
What it does: When a query has only one source and clauses use simple field equality or fixed expressions, bindings can be omitted entirely. Clauses accept keyword lists and atom field names.
# From the bindingless docs (query.ex lines 265-268)
from Post,
where: [category: "fresh and new"],
order_by: [desc: :published_at],
select: [:id, :title, :body]
This is equivalent to the binding form:
from p in Post,
where: p.category == "fresh and new",
order_by: [desc: p.published_at],
select: struct(p, [:id, :title, :body])
Bindingless syntax is fully dynamic (line 283-287):
where = [category: "fresh and new"]
order_by = [desc: :published_at]
select = [:id, :title, :body]
from Post, where: ^where, order_by: ^order_by, select: ^select
Why: Bindings exist to name sources so they can be referenced in expressions. When you're only filtering by equality on fields of the single source, bindings add syntax without adding capability. The bindingless form is shorter, more data-driven, and maps cleanly to keyword lists built at runtime.
Anti-pattern: Always using binding syntax even for simple equality filters:
# More verbose than necessary for simple filters
from p in Post,
where: p.category == ^category and p.status == ^status,
select: [:id, :title]
When to Use
Triggers:
- The query has exactly one source (no joins)
- All
whereconditions are field equality checks against interpolated values - You're building query clauses dynamically from a map or keyword list (web search forms, CLIs)
Example — before:
def search(filters) do
from p in Post,
where: p.category == ^filters[:category],
where: p.status == ^filters[:status],
select: [:id, :title, :body]
end
Example — after:
def search(filters) do
where = Keyword.take(filters, [:category, :status])
from Post, where: ^where, select: [:id, :title, :body]
end
When NOT to Use
Don't use this when:
- The query includes a
join— bindings are required to reference joined sources - A
wherecondition uses operators other than equality (>,<,like,fragment) - You need to pass the source binding to a function like
field/2ortype/2
Over-application example:
# Bindingless can't express non-equality conditions
from Post,
where: [inserted_at: ^date] # Works only for exact equality — not a range
Better alternative:
from p in Post, where: p.inserted_at >= ^start_date and p.inserted_at < ^end_date
Why: Bindingless keyword syntax maps to equality (==). Any non-equality comparison, function call, or multi-table reference requires a named binding. Use bindingless for pure equality filters; reach for bindings the moment expressions get richer.
8. select_merge/3 — Augmenting Selects Dynamically
Source: lib/ecto/query.ex#L693
What it does: Merges additional fields into an existing select without replacing it. Especially useful with dynamic/2 to add computed columns conditionally.
# From the dynamic docs (query.ex lines 693-695)
metric = dynamic([p], p.distance)
from query, select: [:period, :metric], select_merge: ^%{metric: metric}
With aliasing and dynamic ordering (lines 700-707):
fields = %{
period: dynamic([p], selected_as(p.month, :month)),
metric: dynamic([p], p.distance)
}
order = dynamic(selected_as(:month))
from query, select: ^fields, order_by: ^order
Why: select_merge lets base queries define the fixed fields and separate concerns add computed or conditional fields. Without it, adding a field requires rewriting the entire select clause — or maintaining multiple select variants. Combined with dynamic/2, it enables data-driven projections where which columns appear depends on runtime configuration.
Anti-pattern: Rewriting the entire select whenever a computed column is needed:
# The base select must be duplicated in every variant
def with_distance(query) do
from p in query, select: %{id: p.id, name: p.name, distance: p.distance}
end
def without_distance(query) do
from p in query, select: %{id: p.id, name: p.name}
end
When to Use
Triggers:
- A computed column should be added conditionally depending on caller context
- You're building a reporting query where which aggregates appear is configured at runtime
- A base query provides the structural select and feature-specific code augments it
Example — before:
def list_metrics(include_distance?) do
if include_distance? do
from p in Post, select: %{period: p.period, metric: p.views, distance: p.distance}
else
from p in Post, select: %{period: p.period, metric: p.views}
end
end
Example — after:
def list_metrics(opts) do
base = from p in Post, select: %{period: p.period, metric: p.views}
if opts[:include_distance] do
from p in base, select_merge: %{distance: p.distance}
else
base
end
end
When NOT to Use
Don't use this when:
- The
selectis a single value or tuple rather than a map —select_mergerequires a map shape - The added field requires a binding not present in the base query
- You're replacing, not augmenting — use a plain
selectorexclude(:select)first
Over-application example:
# select_merge on a non-map select causes a runtime error
from p in Post,
select: p.name,
select_merge: %{email: p.email} # error: base select is not a map
Better alternative:
from p in Post, select: %{name: p.name, email: p.email}
Why: select_merge merges into the existing map-shaped select. If the base select returns a struct or a scalar, there is no map to merge into. Ensure the base select produces a map before using select_merge.
9. fragment/1 and type/2 — Escape Hatches for DB-Specific Expressions
Source: lib/ecto/query.ex#L291
What it does: fragment/1 passes a raw SQL expression to the database engine for functions or operators the Ecto DSL cannot express. type/2 coerces an Elixir value to a specific Ecto type for comparison when schema type information is not available.
# From the fragments docs (query.ex lines 301-303)
from p in Post,
where: is_nil(p.published_at) and
fragment("lower(?)", p.title) == ^title
type/2 for schemaless queries where Ecto cannot infer the cast type:
# Coerce to :integer when no schema field exists to infer from
from u in "users",
where: u.age > type(^age, :integer)
Why: Databases have functions (full-text search, JSON operators, trigram similarity, window functions) that the Ecto DSL cannot enumerate. fragment/1 is the intentional escape hatch: interpolations with ? placeholders are still parameterized, so SQL injection is not a risk. type/2 is necessary for schemaless queries where Ecto cannot cast a bound parameter to the correct DB type automatically.
Anti-pattern: Using fragment for everything, bypassing Ecto's type safety and composability:
# Loses all type inference; fragment output is opaque to Ecto
from u in User,
where: fragment("? = ? AND ? > ?", u.name, ^name, u.age, ^age)
Or using string interpolation inside fragments:
# SQL injection — never interpolate directly into fragment strings
from u in User,
where: fragment("lower(email) = '#{email}'")
When to Use
Triggers:
- The database function has no Ecto DSL equivalent (e.g.
lower(),similarity(),jsonb_array_elements()) - You're writing a schemaless query (
from u in "users") and need to cast a bound parameter - A DB-specific operator or syntax is required for a performance-critical path
Example — before:
# Can't express case-insensitive equality in pure Ecto DSL
from u in User, where: u.email == ^String.downcase(email)
# Compares raw stored value; doesn't work if DB stores mixed case
Example — after:
from u in User,
where: fragment("lower(?)", u.email) == ^String.downcase(email)
When NOT to Use
Don't use this when:
- The Ecto DSL or a library like
EctoCommonsalready provides the operation - You want type-cast values in a regular schema query — Ecto infers the type from the field
- You're tempted to fragment an entire
WHEREclause — named functions withdynamic/2compose better
Over-application example:
# Fragments for standard operations Ecto handles natively
from p in Post,
where: fragment("? = ?", p.status, ^:published),
order_by: fragment("? DESC", p.inserted_at)
Better alternative:
from p in Post,
where: p.status == :published,
order_by: [desc: p.inserted_at]
Why: Fragment output is opaque to Ecto's type system: no cast validation, no composability with dynamic/2 type inference, and no portability across adapters. Reserve fragment/1 for genuine gaps in the DSL; prefer native Ecto expressions for everything the DSL can express.
Decision Tree
- If you need to filter conditionally at runtime →
dynamic/2(Pattern 4) - If you need to join or sort across function composition boundaries → named bindings with
as:(Pattern 3) - If you need a count or aggregate from the same base as a data query →
exclude/2(Pattern 6) - If you need a DB-side correlated count or aggregate per row →
subquery/1withparent_as(Pattern 5) - If the query has one source and all filters are equality checks → bindingless keyword syntax (Pattern 7)
- If you need to add computed columns without rewriting the select →
select_merge/3(Pattern 8) - If the DB function has no Ecto DSL equivalent →
fragment/1as last resort (Pattern 9) - For all queries: define small named functions that take a query and return a query (Patterns 1 and 2)