aweiker 10218813d3 docs: backfill TOC + decision trees, fix review findings
- Add ## Contents and ## Decision Tree to all 10 existing pattern files
- Fix embed_as/1 semantics inversion in types.md (:self → :dump)
- Fix fabricated __meta__.changes reference in changesets.md
- Fix default primary key type (:integer → :id) in schemas.md
- Combine @impl subsections into single "Minimal Callback Annotation"
2026-05-01 22:13:35 -07:00

# Data Transform & Pipeline Patterns in Elixir Core
Patterns extracted from Elixir's standard library source code.
## Contents
1. [List-Specialized Clause Before Protocol Dispatch](#1-list-specialized-clause-before-protocol-dispatch)
2. [Build-Then-Reverse (Cons-Cell Accumulation)](#2-build-then-reverse-cons-cell-accumulation)
3. [Pipeline for Linear Transformations, Bare Calls for Control Flow](#3-pipeline-for-linear-transformations-bare-calls-for-control-flow)
4. [Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)](#4-pipeline-ending-with--elem1-protocol-reduce-unwrap)
5. [Private Helper Decomposition: Recursive Workers with Guards](#5-private-helper-decomposition-recursive-workers-with-guards)
6. [Enum vs Stream Decision Pattern](#6-enum-vs-stream-decision-pattern)
7. [Map.update vs Map.put Decision Pattern](#7-mapupdate-vs-mapput-decision-pattern)
8. [Pattern Matching on Map Structure for Dispatch](#8-pattern-matching-on-map-structure-for-dispatch)
9. [Delegating to Erlang BIFs with `defdelegate`](#9-delegating-to-erlang-bifs-with-defdelegate)
10. [Reduce as the Universal Primitive](#10-reduce-as-the-universal-primitive)
11. [Keyword Multi-Clause Guard Dispatch (String.split pattern)](#11-keyword-multi-clause-guard-dispatch-stringsplit-pattern)
12. [Lazy Private Helpers with `defp parts_to_index`](#12-lazy-private-helpers-with-defp-parts_to_index)
---
## 1. List-Specialized Clause Before Protocol Dispatch
**Source:** [lib/elixir/lib/enum.ex#L1723](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1723)
**What it does:** Many public `Enum` functions define a `when is_list(enumerable)` clause first, then a generic fallback that uses the `Enumerable` protocol.
```elixir
def map(enumerable, fun) when is_list(enumerable) do
:lists.map(fun, enumerable)
end
def map(first..last//step, fun) do
map_range(first, last, step, fun)
end
def map(enumerable, fun) do
reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end
```
**Why:** Lists are by far the most common enumerable. Matching them first avoids protocol dispatch entirely and calls straight into Erlang's `:lists` module. The range clause is a further optimization for a common case. The generic clause handles all other enumerables through the protocol.
**Anti-pattern:** A single clause that always goes through protocol dispatch:
```elixir
# BAD — forces protocol overhead even for lists
def map(enumerable, fun) do
Enumerable.reduce(enumerable, {:cont, []}, fn x, acc ->
{:cont, [fun.(x) | acc]}
end) |> elem(1) |> :lists.reverse()
end
```
### When to Use
**Triggers:** You have a public function that accepts "any enumerable" but lists account for the majority of callers. Profiling shows protocol dispatch is a measurable cost. You can call an Erlang BIF or a direct recursive implementation for the list case.
**Example — before:**
```elixir
def sum(enumerable) do
Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
|> elem(1)
end
```
**Example — after:**
```elixir
def sum(enumerable) when is_list(enumerable) do
:lists.foldl(fn x, acc -> acc + x end, 0, enumerable)
end
def sum(enumerable) do
Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
|> elem(1)
end
```
### When NOT to Use
**Don't use this when:** The function is rarely called with lists, or the function body is complex enough that maintaining two implementations creates a bug risk. Also avoid when the protocol path is already fast enough (micro-optimization for non-hot paths).
**Over-application example:**
```elixir
# Pointless — this function is only ever called with streams
def expensive_transform(enumerable) when is_list(enumerable) do
# duplicate complex logic just in case a list shows up
enumerable |> do_phase_1() |> do_phase_2() |> do_phase_3()
end
def expensive_transform(enumerable) do
enumerable |> do_phase_1() |> do_phase_2() |> do_phase_3()
end
```
**Better alternative:** Keep one clause. Add the specialization only when profiling proves the protocol dispatch is a bottleneck for real workloads.
**Why:** Premature optimization. Two copies of the same logic means two places to fix bugs. The BEAM's protocol dispatch is already highly optimized — you need evidence before duplicating.
---
## 2. Build-Then-Reverse (Cons-Cell Accumulation)
**Source:** [lib/elixir/lib/enum.ex#L1124](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1124), 1733, 2697
**What it does:** Accumulates results by prepending to a list (`[x | acc]`), then reverses at the end.
```elixir
# filter (line 1124)
def filter(enumerable, fun) do
reduce(enumerable, [], R.filter(fun)) |> :lists.reverse()
end
# map (line 1733)
def map(enumerable, fun) do
reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end
# reject (line 2697)
def reject(enumerable, fun) do
reduce(enumerable, [], R.reject(fun)) |> :lists.reverse()
end
```
**Why:** Prepending to a linked list is O(1); appending is O(n). Building reversed then flipping once is O(n) total. Appending each element would be O(n²).
**Anti-pattern:** Appending to the end of a list in a loop:
```elixir
# BAD — O(n²)
def map(enumerable, fun) do
reduce(enumerable, [], fn x, acc -> acc ++ [fun.(x)] end)
end
```
### When to Use
**Triggers:** You're building a result list element-by-element through recursion or reduce, and the output order must match the input order. The collection can be any size.
**Example — before:**
```elixir
def keep_positives(list) do
Enum.reduce(list, [], fn x, acc ->
if x > 0, do: acc ++ [x], else: acc
end)
end
```
**Example — after:**
```elixir
def keep_positives(list) do
Enum.reduce(list, [], fn x, acc ->
if x > 0, do: [x | acc], else: acc
end)
|> :lists.reverse()
end
```
### When NOT to Use
**Don't use this when:** Order doesn't matter (e.g., building a set of unique items, collecting into a MapSet), or when you're only extracting a single value (sum, count, max). Also unnecessary if you're using `Enum.map/2` or `Enum.filter/2` directly — they already do this internally.
**Over-application example:**
```elixir
# Unnecessary — order doesn't matter for uniqueness
def unique_tags(items) do
Enum.reduce(items, [], fn item, acc ->
if item.tag in acc, do: acc, else: [item.tag | acc]
end)
|> :lists.reverse() # why reverse if you're just checking membership?
end
```
**Better alternative:** Use a MapSet or just don't reverse:
```elixir
def unique_tags(items) do
Enum.reduce(items, MapSet.new(), fn item, acc ->
MapSet.put(acc, item.tag)
end)
end
```
**Why:** The reverse adds O(n) work and a full list traversal. If you don't care about order, skip it. If you're collecting into a non-list structure, this pattern doesn't apply.
---
## 3. Pipeline for Linear Transformations, Bare Calls for Control Flow
**Source:** [lib/elixir/lib/enum.ex#L1684](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1684), 1551, vs 496-502
**What it does:** Elixir core uses `|>` when data flows linearly through 2+ transformations. It does NOT use `|>` for single-step operations or when the first argument is computed by a `case`/`if`/`with`.
```elixir
# Pipeline: data flows through multiple transforms (line 1684-1685)
def map_join(enumerable, joiner \\ "", mapper) do
enumerable
|> map(&entry_to_string(mapper.(&1)))
|> intersperse(joiner)
|> IO.iodata_to_binary()
end
# NO pipeline: single step or control flow (line 496-502)
def at(enumerable, index, default \\ nil) when is_integer(index) do
case slice_forward(enumerable, index, 1, 1) do
[value] -> value
[] -> default
end
end
```
**Why:** Pipelines communicate "data flows through transformations." Using them for a single function call or wrapping around control flow obscures intent rather than clarifying it.
**Anti-pattern:** Pipelines for single operations or wrapping control flow:
```elixir
# BAD — single step, no pipeline needed
list |> Enum.reverse()
# BAD — control flow awkwardly forced into a pipe
result
|> case do
{:ok, x} -> x
:error -> nil
end
```
### When to Use
**Triggers:** Data flows through 2 or more transformations in sequence, each taking the previous result as its first argument. The reader should see a "conveyor belt" of operations.
**Example — before:**
```elixir
def format_names(users) do
String.upcase(Enum.join(Enum.map(users, & &1.name), ", "))
end
```
**Example — after:**
```elixir
def format_names(users) do
users
|> Enum.map(& &1.name)
|> Enum.join(", ")
|> String.upcase()
end
```
### When NOT to Use
**Don't use this when:** There's only one transformation, the result needs to go into a pattern match (`case`/`with`), or the pipe would require anonymous function wrapping (`|> then(fn x -> ... end)`) to fit.
**Over-application example:**
```elixir
# Forced — then/1 wrapper defeats readability
params
|> Map.get(:user_id)
|> then(fn id ->
case Repo.get(User, id) do
nil -> {:error, :not_found}
user -> {:ok, user}
end
end)
```
**Better alternative:**
```elixir
case Repo.get(User, Map.get(params, :user_id)) do
nil -> {:error, :not_found}
user -> {:ok, user}
end
```
**Why:** Pipes are for linear data flow. When you need branching (case/cond/with), break out of the pipeline. Forcing control flow through `then/1` adds indirection without clarity.
---
## 4. Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)
**Source:** [lib/elixir/lib/enum.ex#L363](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L363), 403, 433, 468, 725, 1022, 2676
**What it does:** When calling `Enumerable.reduce/3` directly, the result is a tagged tuple: `{:done, acc}`, `{:halted, acc}`, or `{:suspended, acc, continuation}`. Core extracts the accumulator with `|> elem(1)`.
```elixir
# all?/1 (line 363)
def all?(enumerable) do
Enumerable.reduce(enumerable, {:cont, true}, fn entry, _ ->
if entry, do: {:cont, true}, else: {:halt, false}
end)
|> elem(1)
end
# reduce/3 (line 2676)
def reduce(enumerable, acc, fun) do
Enumerable.reduce(enumerable, {:cont, acc}, fun) |> elem(1)
end
```
**Why:** The protocol returns tagged tuples for the state machine (cont/halt/suspend). End users don't need the tag — only the accumulated value. `|> elem(1)` is the idiomatic unwrap.
**Anti-pattern:** Using `case` when you don't care about the tag:
```elixir
# BAD — unnecessary pattern match when you always want the value
case Enumerable.reduce(enumerable, {:cont, acc}, fun) do
{:done, result} -> result
{:halted, result} -> result
end
```
### When to Use
**Triggers:** You're calling `Enumerable.reduce/3` directly (implementing a custom Enum-like function) and you always want the accumulated value regardless of whether iteration completed or halted.
**Example — before:**
```elixir
def sum_until(enumerable, limit) do
result = Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
new = acc + x
if new >= limit, do: {:halt, new}, else: {:cont, new}
end)
case result do
{:done, val} -> val
{:halted, val} -> val
end
end
```
**Example — after:**
```elixir
def sum_until(enumerable, limit) do
Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
new = acc + x
if new >= limit, do: {:halt, new}, else: {:cont, new}
end)
|> elem(1)
end
```
### When NOT to Use
**Don't use this when:** You need to distinguish between `:done` and `:halted` to decide subsequent behavior (e.g., you want to know if iteration was interrupted). Also don't use in application code where you should be using `Enum.reduce/3` (which handles unwrapping for you).
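When the tag does matter, match on it instead of discarding it. The following is a minimal sketch, assuming a hypothetical `TagDemo.sum_within/2` helper (not part of Elixir core) that reports whether the running total went past a limit:

```elixir
defmodule TagDemo do
  # Hypothetical helper: sums elements, halting once the total exceeds `limit`.
  # Returns {:ok, total} if the whole enumerable was consumed,
  # or {:exceeded, total} if iteration halted early.
  def sum_within(enumerable, limit) do
    reducer = fn x, acc ->
      total = acc + x
      if total > limit, do: {:halt, total}, else: {:cont, total}
    end

    # Here the tag carries information, so `elem(1)` would throw it away.
    case Enumerable.reduce(enumerable, {:cont, 0}, reducer) do
      {:done, total} -> {:ok, total}
      {:halted, total} -> {:exceeded, total}
    end
  end
end
```

The `:suspended` case is not handled because the reducer in this sketch never returns `{:suspend, acc}`.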
**Over-application example:**
```elixir
# Pointless — Enum.reduce already unwraps
Enum.reduce([1, 2, 3], 0, &(&1 + &2)) |> elem(1)
# This crashes! Enum.reduce returns the value directly, not a tuple.
```
**Better alternative:** Use `Enum.reduce/3` in application code. Only use the `|> elem(1)` pattern when directly calling `Enumerable.reduce/3` in library code.
**Why:** This pattern is for protocol implementers, not application developers. Using it on already-unwrapped results causes crashes. It's an internal idiom that shouldn't leak into regular code.
---
## 5. Private Helper Decomposition: Recursive Workers with Guards
**Source:** [lib/elixir/lib/enum.ex#L4975](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L4975), 5025-5039
**What it does:** Complex operations are split into a public entry point (with validation guards) and a private recursive worker function. The worker uses pattern matching on structure (empty list, head|tail) and guards on counters.
```elixir
# Public entry: validates, delegates (line 890-904)
def drop(enumerable, amount)
when is_list(enumerable) and is_integer(amount) and amount >= 0 do
drop_list(enumerable, amount)
end
# Private worker (for the analogous split/2 entry point): pattern matches on structure (line 4975-4983)
defp split_list([head | tail], counter, acc) when counter > 0 do
split_list(tail, counter - 1, [head | acc])
end
defp split_list(list, 0, acc) do
{:lists.reverse(acc), list}
end
defp split_list([], _, acc) do
{:lists.reverse(acc), []}
end
```
**Why:** Separating validation from recursion keeps each clause focused. Patterns and guards in function heads let the compiler generate efficient clause selection, and the function body needs no runtime `if`/`cond`.
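To see the entry point and the worker side by side, here is a simplified, self-contained sketch. `SplitDemo` is a hypothetical module name, and unlike `Enum.split/2` this version assumes a list and a non-negative count:

```elixir
defmodule SplitDemo do
  # Public entry point: guards validate the arguments, then delegate.
  def split(list, count) when is_list(list) and is_integer(count) and count >= 0 do
    split_list(list, count, [])
  end

  # Private worker: each clause handles exactly one structural case.
  # Still counting down and elements remain: move the head to the accumulator.
  defp split_list([head | tail], counter, acc) when counter > 0 do
    split_list(tail, counter - 1, [head | acc])
  end

  # Counter exhausted: whatever is left is the second half.
  defp split_list(list, 0, acc), do: {:lists.reverse(acc), list}

  # List exhausted before the counter: the second half is empty.
  defp split_list([], _counter, acc), do: {:lists.reverse(acc), []}
end
```

Invalid arguments (a negative count, a non-list) never reach the worker; they fail at the public clause with a `FunctionClauseError`.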
**Anti-pattern:** Mixing validation, edge cases, and recursion in a single function with internal conditionals:
```elixir
# BAD — one big function with nested ifs
defp split_list(list, counter, acc) do
if counter > 0 and list != [] do
[head | tail] = list
split_list(tail, counter - 1, [head | acc])
else
{:lists.reverse(acc), list}
end
end
```
### When to Use
**Triggers:** You're writing a recursive function that processes a list element-by-element with termination conditions (counter hits zero, list becomes empty, accumulator reaches a threshold). Multiple base cases exist.
**Example — before:**
```elixir
defp take_while_impl(list, fun, acc) do
case list do
[] -> :lists.reverse(acc)
[head | tail] ->
if fun.(head) do
take_while_impl(tail, fun, [head | acc])
else
:lists.reverse(acc)
end
end
end
```
**Example — after:**
```elixir
defp take_while_impl([], _fun, acc) do
:lists.reverse(acc)
end
defp take_while_impl([head | tail], fun, acc) do
if fun.(head) do
take_while_impl(tail, fun, [head | acc])
else
:lists.reverse(acc)
end
end
```
### When NOT to Use
**Don't use this when:** The logic doesn't recurse (a simple one-shot transformation), or when `Enum` functions already express the operation clearly. Don't decompose for decomposition's sake.
**Over-application example:**
```elixir
# Over-engineered — this is just Enum.take/2
defp my_take_list([], _n, acc), do: :lists.reverse(acc)
defp my_take_list(_list, 0, acc), do: :lists.reverse(acc)
defp my_take_list([h | t], n, acc), do: my_take_list(t, n - 1, [h | acc])
def my_take(list, n), do: my_take_list(list, n, [])
```
**Better alternative:**
```elixir
def my_take(list, n), do: Enum.take(list, n)
```
**Why:** Elixir's standard library already provides optimized implementations of common list operations. Writing your own recursive versions adds maintenance burden and is unlikely to perform better (many of Enum's list clauses call directly into Erlang's `:lists` module or hand-tuned private helpers). Reserve this pattern for genuinely novel recursion.
---
## 6. Enum vs Stream Decision Pattern
**Source:** [lib/elixir/lib/stream.ex#L1](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/stream.ex#L1) (module docs), `lib/elixir/lib/enum.ex`
**What it does:** Enum functions are eager (materialize intermediate lists). Stream functions are lazy (build computation recipes). Core uses Stream for:
- Infinite sequences (`cycle`, `iterate`, `repeatedly`)
- Resource management (`resource/3`)
- Composing transformations to execute in a single pass
```elixir
# Stream: builds a recipe, zero computation until consumed (stream.ex ~line 490)
def map(enum, fun) when is_function(fun, 1) do
lazy(enum, fn f1 -> R.map(fun, f1) end)
end
# Enum: immediately materializes the result (enum.ex line 1723)
def map(enumerable, fun) when is_list(enumerable) do
:lists.map(fun, enumerable)
end
```
**Why:** From Stream docs (lines 37-41): "When chaining many operations with `Enum`, intermediate lists are created, while `Stream` creates a recipe of computations that are executed at a later moment."
Use Enum when:
- You need the full result now
- The collection is small/bounded
- You only chain 1-2 operations
Use Stream when:
- The collection is large or infinite
- You chain many transformations
- You need resource cleanup (file handles, network)
- You want single-pass processing
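The two checklists above can be illustrated with a small sketch. `LazyDemo` and its function names are hypothetical; the first function only terminates because the stream is lazy, while the second works on a bounded input where eager evaluation is fine:

```elixir
defmodule LazyDemo do
  # Stream: infinite source. Nothing is computed until Enum.take/2 pulls
  # values, and iteration stops as soon as `n` results have been produced.
  def first_even_squares(n) do
    1
    |> Stream.iterate(&(&1 + 1))
    |> Stream.map(&(&1 * &1))
    |> Stream.filter(&(rem(&1, 2) == 0))
    |> Enum.take(n)
  end

  # Enum: bounded input. Each step builds a full intermediate list,
  # which is acceptable for small collections.
  def even_squares(range) do
    range
    |> Enum.map(&(&1 * &1))
    |> Enum.filter(&(rem(&1, 2) == 0))
  end
end
```

Swapping the `Stream` calls in `first_even_squares/1` for their `Enum` counterparts would never return, because `Enum.map/2` would try to consume the infinite sequence in full.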
**Anti-pattern:** Using Stream for small bounded collections (overhead of the lazy machinery exceeds any benefit):
```elixir
# BAD — Stream overhead for trivial transform
[1, 2, 3] |> Stream.map(&(&1 * 2)) |> Enum.to_list()
# GOOD — just use Enum
[1, 2, 3] |> Enum.map(&(&1 * 2))
```
### When to Use
**Triggers:** You're chaining 3+ transformations on a large (or unbounded) collection. You're reading from a file or network source and want elements pulled on demand rather than loaded all at once. You need `Stream.resource/3` for cleanup guarantees.
**Example — before:**
```elixir
# Reads the whole 1M-line file into memory, then builds a new list at every step
File.read!("large.csv")
|> String.split("\n")
|> Enum.map(&String.trim/1)
|> Enum.filter(&(&1 != ""))
|> Enum.map(&parse_row/1)
|> Enum.take(100)
```
**Example — after:**
```elixir
# Single pass, constant memory, stops after 100
File.stream!("large.csv")
|> Stream.map(&String.trim/1)
|> Stream.reject(&(&1 == ""))
|> Stream.map(&parse_row/1)
|> Enum.take(100)
```
### When NOT to Use
**Don't use this when:** The collection is small and bounded (under ~1000 elements), you only apply 1-2 transformations, or you need random access to the full result. Stream's lazy machinery has overhead that exceeds the savings for small data.
**Over-application example:**
```elixir
# Stream overhead exceeds any benefit for 5 items
config = [:a, :b, :c, :d, :e]
config
|> Stream.map(&Atom.to_string/1)
|> Stream.map(&String.upcase/1)
|> Enum.to_list()
```
**Better alternative:**
```elixir
config
|> Enum.map(&(&1 |> Atom.to_string() |> String.upcase()))
```
**Why:** Stream wraps each step in a closure and creates a lazy struct. For small collections, the allocation and indirection cost more than just building the intermediate list. As a rough rule of thumb, laziness starts to pay off when collections reach hundreds of elements or more AND you chain 3+ operations; benchmark your own workload before deciding.
---
## 7. Map.update vs Map.put Decision Pattern
**Source:** [lib/elixir/lib/map.ex#L670](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L670)
**What it does:** `Map.update/4` transforms an existing value based on its current state. `Map.put/3` unconditionally sets a value regardless of current state.
```elixir
# Map.update/4 (line 682-693): transform based on current value
def update(map, key, default, fun) when is_function(fun, 1) do
case map do
%{^key => value} ->
%{map | key => fun.(value)}
%{} ->
put(map, key, default)
other ->
:erlang.error({:badmap, other}, [map, key, default, fun])
end
end
# Map.put/3 (line 636): unconditional set
def put(map, key, value) do
:maps.put(key, value, map)
end
```
**Why:** `update/4` is for when the new value depends on the old value (counters, appending to nested lists). `put/3` is for when you know the exact new value regardless of what was there.
**Anti-pattern:** Using `get` + `put` when `update` expresses intent:
```elixir
# BAD — two lookups, unclear intent
count = Map.get(map, :count, 0)
Map.put(map, :count, count + 1)
# GOOD — single lookup, clear intent
Map.update(map, :count, 1, &(&1 + 1))
```
### When to Use
**Triggers:** The new value is computed FROM the old value — incrementing counters, appending to lists, toggling booleans, merging nested maps. You also need a sensible default for the "key doesn't exist yet" case.
**Example — before:**
```elixir
def add_tag(state, tag) do
existing = Map.get(state, :tags, [])
Map.put(state, :tags, [tag | existing])
end
```
**Example — after:**
```elixir
def add_tag(state, tag) do
Map.update(state, :tags, [tag], fn tags -> [tag | tags] end)
end
```
### When NOT to Use
**Don't use this when:** The new value is independent of the old value (you're replacing, not transforming). Also avoid when you need to handle the "missing key" case differently from "present key" (use `Map.get_and_update/3` or explicit `case` instead).
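For the "missing key handled differently" case, a minimal sketch with `Map.get_and_update/3` might look like the following. `VisitDemo.bump/2` is a hypothetical helper; it returns the previous count alongside the updated map, so the caller can tell a first visit (`nil`) from a repeat:

```elixir
defmodule VisitDemo do
  # Hypothetical helper: increments a per-key counter and also returns
  # the previous value. The callback receives `nil` when the key is absent
  # and must return {value_to_return, new_value_to_store}.
  def bump(map, key) do
    Map.get_and_update(map, key, fn
      nil -> {nil, 1}
      count -> {count, count + 1}
    end)
  end
end
```

One caveat of this sketch: a key that is present with the value `nil` is indistinguishable from a missing key. If that distinction matters, an explicit `case` on `%{^key => value}` is the safer choice.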
**Over-application example:**
```elixir
# Awkward — the "update" function ignores the old value entirely
Map.update(user, :name, new_name, fn _old -> new_name end)
```
**Better alternative:**
```elixir
Map.put(user, :name, new_name)
```
**Why:** `Map.update/4` communicates "the new value depends on the old one." When you ignore the old value in the update function, you're lying to the reader. Use `put/3` for unconditional replacement — it's simpler and signals intent correctly.
---
## 8. Pattern Matching on Map Structure for Dispatch
**Source:** [lib/elixir/lib/map.ex#L398](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L398), 509, 586
**What it does:** Map functions use `case map do %{^key => value} -> ...` to dispatch on whether a key exists, rather than calling `has_key?` + conditional.
```elixir
# Map.get/3 (line 586-594)
def get(map, key, default \\ nil) do
case map do
%{^key => value} ->
value
%{} ->
default
other ->
:erlang.error({:badmap, other}, [map, key, default])
end
end
# Map.put_new/3 (line 398-407)
def put_new(map, key, value) do
case map do
%{^key => _value} ->
map
%{} ->
put(map, key, value)
other ->
:erlang.error({:badmap, other})
end
end
```
**Why:** Pattern matching with `%{^key => value}` does the lookup AND extraction in one step. The `%{}` pattern matches any map, not just an empty one, so placed second it catches every map where the key is NOT present. The `other` clause provides a clear error for non-maps. This is both more efficient and more readable than `if Map.has_key?(map, key)`.
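Because `%{}` matches every map, clause order carries the logic. A small sketch (the `MapMatchDemo` module and its return atoms are illustrative, not taken from Elixir core):

```elixir
defmodule MapMatchDemo do
  def lookup(map, key) do
    case map do
      # One step: checks that the key exists and binds its value.
      %{^key => value} -> {:found, value}
      # `%{}` matches ANY map, so this clause only means "key absent"
      # because the clause above was tried first.
      %{} -> :missing
      # Anything that is not a map ends up here.
      _other -> :not_a_map
    end
  end
end
```

If the `%{}` clause were listed first, it would shadow the key lookup and every map would be reported as `:missing`.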
**Anti-pattern:**
```elixir
# BAD — double lookup, less clear
def get(map, key, default) do
if Map.has_key?(map, key) do
:maps.get(key, map)
else
default
end
end
```
### When to Use
**Triggers:** You need to branch based on whether a key exists in a map, especially when you also want the value if it does exist. You want a single lookup that both checks existence and extracts the value.
**Example — before:**
```elixir
def fetch_config(config, key) do
if Map.has_key?(config, key) do
{:ok, Map.get(config, key)}
else
{:error, :missing}
end
end
```
**Example — after:**
```elixir
def fetch_config(config, key) do
case config do
%{^key => value} -> {:ok, value}
%{} -> {:error, :missing}
end
end
```
### When NOT to Use
**Don't use this when:** You're checking for multiple keys simultaneously (use a multi-key pattern match instead), or when `Map.get/3` with a default already expresses what you need. Don't use `case` dispatch for simple "get with fallback" scenarios.
**Over-application example:**
```elixir
# Over-engineered — Map.get/3 already does this
def get_name(user) do
case user do
%{:name => name} -> name
%{} -> "Anonymous"
end
end
```
**Better alternative:**
```elixir
def get_name(user) do
Map.get(user, :name, "Anonymous")
end
```
**Why:** `Map.get/3` already implements this exact pattern internally. Rewriting it as an explicit `case` adds visual noise without any semantic or performance benefit. Use the case pattern when you're doing something `Map.get` can't — like returning different tagged tuples or triggering side effects.
---
## 9. Delegating to Erlang BIFs with `defdelegate`
**Source:** [lib/elixir/lib/map.ex#L127](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L127), 143, 159, 173
**What it does:** When an Erlang function already does exactly what's needed, Elixir delegates directly rather than wrapping.
```elixir
@spec keys(map) :: [key]
defdelegate keys(map), to: :maps
@spec values(map) :: [value]
defdelegate values(map), to: :maps
@spec merge(map, map) :: map
defdelegate merge(map1, map2), to: :maps
```
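**Why:** Near-zero overhead. `defdelegate` generates a thin forwarding function, and for these core `Map` functions the Elixir compiler can also rewrite calls directly to their `:maps` equivalents. There is no point hand-writing a wrapper around an Erlang function when the semantics are identical. The `@compile {:inline, ...}` annotation on line 115 applies the same idea to `Map`'s own small functions.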
**Anti-pattern:** Wrapping without adding value:
```elixir
# BAD — pointless wrapper
def keys(map) do
:maps.keys(map)
end
```
### When to Use
**Triggers:** An Erlang module already exports a function with the exact semantics you need. The argument order matches. You want to expose it under an Elixir-idiomatic name or in your module's namespace for discoverability.
**Example — before:**
```elixir
defmodule MyQueue do
def new, do: :queue.new()
def push(q, item), do: :queue.in(item, q)
def pop(q), do: :queue.out(q)
end
```
**Example — after:**
```elixir
defmodule MyQueue do
defdelegate new(), to: :queue
# Can't delegate push — argument order differs, needs wrapper
def push(q, item), do: :queue.in(item, q)
defdelegate pop(q), to: :queue, as: :out
end
```
### When NOT to Use
**Don't use this when:** You need to validate inputs, transform arguments, change argument order, add defaults, or adapt the return value. Also avoid when the Erlang function has unclear semantics that benefit from a documenting wrapper.
**Over-application example:**
```elixir
# Broken — Erlang arg order is (key, map), Elixir convention is (map, key)
defdelegate get(map, key), to: :maps
# This compiles, but it forwards to :maps.get(map, key), which expects (key, map) and raises at runtime!
```
**Better alternative:**
```elixir
def get(map, key) do
:maps.get(key, map)
end
```
**Why:** `defdelegate` is a transparent pass-through. If argument order, defaults, validation, or error handling differ between your desired API and the Erlang function, you need a real wrapper. Delegating with a semantic mismatch creates subtle bugs.
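The mismatch is easy to reproduce. In this sketch (`DelegateDemo` is a hypothetical module), the delegated function forwards its arguments unchanged, so it only works when called in Erlang's `(key, map)` order, while the hand-written wrapper adapts the order:

```elixir
defmodule DelegateDemo do
  # Forwards arguments as-is: DelegateDemo.get(a, b) calls :maps.get(a, b).
  # The parameter names here are misleading, which is the point of the demo.
  defdelegate get(map, key), to: :maps

  # A real wrapper: swaps the arguments to match :maps.get(key, map).
  def get_wrapped(map, key), do: :maps.get(key, map)
end
```

Calling `DelegateDemo.get(%{a: 1}, :a)` raises a `BadMapError`, because `:maps.get/2` treats the second argument (`:a`) as the map.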
---
## 10. Reduce as the Universal Primitive
**Source:** [lib/elixir/lib/enum.ex#L19](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L19), 2660-2676
**What it does:** Nearly every Enum operation is built on top of `reduce`. The Enumerable protocol's core function is `reduce/3`. Everything else (`count`, `member?`, `slice`) is an optimization hint.
```elixir
# From the protocol docs (line 19-21):
def map(enumerable, fun) do
reducer = fn x, acc -> {:cont, [fun.(x) | acc]} end
Enumerable.reduce(enumerable, {:cont, []}, reducer) |> elem(1) |> :lists.reverse()
end
# The actual reduce/3 (line 2676):
def reduce(enumerable, acc, fun) do
Enumerable.reduce(enumerable, {:cont, acc}, fun) |> elem(1)
end
```
**Why:** Reduce is the most general iteration primitive. By building all operations on reduce, any data structure that implements `Enumerable.reduce/3` automatically gets the full `Enum` API. This is the protocol + reduce = universal composability pattern.
**Anti-pattern:** Implementing each Enum function independently for each data structure:
```elixir
# BAD — reimplementing map for each type
def map(%MyStruct{items: items}, fun), do: ...
def filter(%MyStruct{items: items}, fun), do: ...
# Instead: implement Enumerable.reduce/3 once and get everything
```
### When to Use
**Triggers:** You're implementing a custom data structure that should be iterable. You want the full `Enum` API without implementing each function. You're designing a protocol where one function provides maximum leverage.
**Example — before:**
```elixir
defmodule RingBuffer do
def map(%RingBuffer{} = rb, fun), do: ...
def filter(%RingBuffer{} = rb, fun), do: ...
def reduce(%RingBuffer{} = rb, acc, fun), do: ...
def count(%RingBuffer{} = rb), do: ...
# 70+ functions to implement...
end
```
**Example — after:**
```elixir
defimpl Enumerable, for: RingBuffer do
def reduce(%RingBuffer{data: data, head: h, size: s}, acc, fun) do
# One function — yields elements in order
do_reduce(data, h, s, acc, fun)
end
def count(%RingBuffer{size: s}), do: {:ok, s}
def member?(_, _), do: {:error, __MODULE__}
def slice(_), do: {:error, __MODULE__}
end
# Now Enum.map/filter/take/etc. all work automatically
```
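The `do_reduce/5` helper in the `RingBuffer` snippet is left undefined. For a complete, runnable picture of what a `reduce/3` implementation has to handle, here is a sketch for a simpler hypothetical struct, `Countdown`, which yields `from, from - 1, ..., 1`. The struct, its `new/1` constructor, and the private `do_reduce/3` are all assumptions made for illustration:

```elixir
defmodule Countdown do
  defstruct [:from]

  # Hypothetical constructor; assumes a non-negative integer.
  def new(from) when is_integer(from) and from >= 0, do: %__MODULE__{from: from}
end

defimpl Enumerable, for: Countdown do
  def reduce(%Countdown{from: n}, acc, fun), do: do_reduce(n, acc, fun)

  # The reducer asked to stop early.
  defp do_reduce(_n, {:halt, acc}, _fun), do: {:halted, acc}
  # The reducer asked to pause; hand back a continuation that resumes at `n`.
  defp do_reduce(n, {:suspend, acc}, fun), do: {:suspended, acc, &do_reduce(n, &1, fun)}
  # No elements left.
  defp do_reduce(0, {:cont, acc}, _fun), do: {:done, acc}
  # Emit `n`, then continue with `n - 1`.
  defp do_reduce(n, {:cont, acc}, fun) when n > 0, do: do_reduce(n - 1, fun.(n, acc), fun)

  # Fast path: the size is known without traversing.
  def count(%Countdown{from: n}), do: {:ok, n}
  # No fast path: Enum falls back to reduce/3 for these.
  def member?(_countdown, _value), do: {:error, __MODULE__}
  def slice(_countdown), do: {:error, __MODULE__}
end
```

With only this implementation in place, calls such as `Enum.map(Countdown.new(3), &(&1 * 2))` and `Enum.take(Countdown.new(3), 2)` work without any `Countdown`-specific code in `Enum`.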
### When NOT to Use
**Don't use this when:** Your data structure has specialized algorithms that are significantly faster than the generic reduce-based approach (e.g., binary search on a sorted structure for `member?`). In that case, implement the specific protocol callbacks.
**Over-application example:**
```elixir
# Wasteful — reduce traverses all elements to count a structure with O(1) size
defimpl Enumerable, for: SizedCollection do
def count(_), do: {:error, __MODULE__}
# This forces Enum.count to use reduce: O(n)
# when the size is stored in a field: O(1)
end
```
**Better alternative:**
```elixir
defimpl Enumerable, for: SizedCollection do
def count(%{size: s}), do: {:ok, s}
# Now Enum.count is O(1)
end
```
**Why:** The optimization callbacks (`count`, `member?`, `slice`) exist precisely because reduce is O(n) for operations that some structures can do faster. Use reduce as the universal fallback, but implement the fast paths when your structure supports them.
---
## 11. Keyword Multi-Clause Guard Dispatch (String.split pattern)
**Source:** [lib/elixir/lib/string.ex#L516](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/string.ex#L516)
**What it does:** Functions with many input shapes use multiple `def` clauses with guards to dispatch, handling each case distinctly rather than using internal `cond`/`case`.
```elixir
def split(string, %Regex{} = pattern, options) when is_binary(string) and is_list(options) do
Regex.split(pattern, string, options)
end
def split(string, "", options) when is_binary(string) and is_list(options) do
# special case: split by empty string (grapheme-by-grapheme)
...
end
def split(string, [], options) when is_binary(string) and is_list(options) do
# empty pattern list: no splitting
...
end
def split(string, pattern, options) when is_binary(string) and is_list(options) do
# general binary pattern case
...
end
```
**Why:** Each clause has a single responsibility. The BEAM compiler generates efficient dispatch for these patterns. Adding a new case is additive (new clause) rather than modifying existing logic.
**Anti-pattern:** One function with nested conditionals:
```elixir
# BAD — all cases mashed into one body
def split(string, pattern, options) do
cond do
is_struct(pattern, Regex) -> ...
pattern == "" -> ...
pattern == [] -> ...
true -> ...
end
end
```
### When to Use
**Triggers:** A function accepts multiple distinct input shapes (different types, specific sentinel values, structural patterns). Each shape requires substantially different handling. The shapes are distinguishable via guards or pattern matching.
**Example — before:**
```elixir
def parse(input, format) do
cond do
format == :json -> Jason.decode!(input)
format == :yaml -> YamlElixir.read_from_string!(input)
is_binary(format) -> custom_parse(input, format)
true -> raise "unknown format"
end
end
```
**Example — after:**
```elixir
def parse(input, :json) when is_binary(input), do: Jason.decode!(input)
def parse(input, :yaml) when is_binary(input), do: YamlElixir.read_from_string!(input)
def parse(input, format) when is_binary(input) and is_binary(format), do: custom_parse(input, format)
```
### When NOT to Use
**Don't use this when:** The differences between cases are minor (a single flag toggles a small behavior), or when you'd end up with 10+ nearly-identical clauses that differ by one line. Also avoid when the distinguishing condition can't be expressed in a guard (e.g., requires a database lookup).
**Over-application example:**
```elixir
# Absurd — 5 clauses that differ only in a multiplier
def convert(value, :mm), do: value * 1.0
def convert(value, :cm), do: value * 10.0
def convert(value, :m), do: value * 1000.0
def convert(value, :km), do: value * 1_000_000.0
def convert(value, :in), do: value * 25.4
```
**Better alternative:**
```elixir
@multipliers %{mm: 1.0, cm: 10.0, m: 1000.0, km: 1_000_000.0, in: 25.4}
def convert(value, unit) when is_map_key(@multipliers, unit) do
  value * @multipliers[unit]
end
```
**Why:** When clauses share identical structure and differ only by data, a lookup table is cleaner and more maintainable. Multi-clause dispatch shines when each case has genuinely different logic, not just different constants.
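For the other case named above, where the distinguishing condition cannot be expressed in a guard, one option is to compute the deciding value in the function body and then dispatch on that plain value in a private multi-clause helper. The sketch below assumes a hypothetical `AccessSketch` module, and `fetch_role/1` is a hypothetical in-memory stand-in for a database lookup:

```elixir
defmodule AccessSketch do
  # Hypothetical data: replaces a real database table for this sketch.
  @roles %{"ada" => :admin, "bob" => :member}

  def delete_post(user, post_id) when is_binary(user) and is_integer(post_id) do
    # User-defined functions cannot be called in guards, so the lookup runs here,
    # and the private helper dispatches on the value it returns.
    user
    |> fetch_role()
    |> do_delete(post_id)
  end

  # Stand-in for a database lookup.
  defp fetch_role(user), do: Map.get(@roles, user, :unknown)

  defp do_delete(:admin, post_id), do: {:ok, post_id}
  defp do_delete(:member, _post_id), do: {:error, :forbidden}
  defp do_delete(:unknown, _post_id), do: {:error, :not_found}
end
```

The public function keeps a single clause, and the clause-per-case structure moves to the helper, where every case can be matched on a literal.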
---
## 12. Lazy Private Helpers with `defp parts_to_index`
**Source:** [lib/elixir/lib/string.ex#L562](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/string.ex#L562)
**What it does:** Tiny private helpers that translate API-level values into implementation-level ones are written as single-line `defp` clauses, with guards where the input needs validating.
```elixir
defp parts_to_index(:infinity), do: 0
defp parts_to_index(n) when is_integer(n) and n > 0, do: n
```
**Why:** Clear, self-documenting dispatch. Each case is one line. No branching logic in the caller. The function name explains the conversion.
**Anti-pattern:** Inline conditional in the caller:
```elixir
# BAD — logic scattered in caller
index = if parts == :infinity, do: 0, else: parts
```
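To show how such a helper can serve a caller, here is a minimal sketch of a space splitter that limits the number of parts. It is not the standard library's implementation: `PartsSketch`, `split_spaces/2`, and `do_split/4` are hypothetical names. The sketch assumes the worker stops when its counter reaches `1`, so mapping `:infinity` to `0` yields a counter that only decreases and never reaches `1`.

```elixir
defmodule PartsSketch do
  # Hypothetical caller: splits on single spaces into at most `parts` pieces.
  def split_spaces(string, parts \\ :infinity) when is_binary(string) do
    do_split(string, parts_to_index(parts), "", [])
  end

  defp parts_to_index(:infinity), do: 0
  defp parts_to_index(n) when is_integer(n) and n > 0, do: n

  # Counter at 1: this is the last allowed part, so keep the rest verbatim.
  defp do_split(rest, 1, current, acc), do: Enum.reverse([current <> rest | acc])

  # Input exhausted: emit the part being built.
  defp do_split("", _index, current, acc), do: Enum.reverse([current | acc])

  # Separator: close the current part and count down.
  defp do_split(" " <> rest, index, current, acc) do
    do_split(rest, index - 1, "", [current | acc])
  end

  # Any other character: append it to the current part.
  defp do_split(<<char::utf8, rest::binary>>, index, current, acc) do
    do_split(rest, index, current <> <<char::utf8>>, acc)
  end
end
```

With this shape, `PartsSketch.split_spaces("a b c", 2)` returns `["a", "b c"]`, and an invalid limit such as `0` fails in `parts_to_index/1` with a `FunctionClauseError` before any work is done.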
### When to Use
**Triggers:** You have a small, well-defined mapping between API-level values and internal representations. The conversion appears in multiple places, or the mapping is non-obvious enough to deserve a name.
**Example — before:**
```elixir
def fetch(resource, timeout) do
  ms = if timeout == :infinity, do: :infinity, else: round(timeout * 1000)
  do_fetch(resource, ms)
end
```
**Example — after:**
```elixir
def fetch(resource, timeout) do
  do_fetch(resource, timeout_to_ms(timeout))
end
defp timeout_to_ms(:infinity), do: :infinity
defp timeout_to_ms(seconds) when is_number(seconds) and seconds >= 0, do: round(seconds * 1000)
```
### When NOT to Use
**Don't use this when:** The conversion is trivial and used only once (for something like `x + 1`, an inline expression is clearer than a named function), or the mapping has so many entries that a lookup map would serve it better.
**Over-application example:**
```elixir
# Over-engineered — named function for trivial identity-like conversion
defp ensure_string(s) when is_binary(s), do: s
defp ensure_string(a) when is_atom(a), do: Atom.to_string(a)
# Used exactly once:
def log(msg), do: IO.puts(ensure_string(msg))
```
**Better alternative:**
```elixir
def log(msg) when is_binary(msg), do: IO.puts(msg)
def log(msg) when is_atom(msg), do: IO.puts(Atom.to_string(msg))
```
**Why:** When a conversion is used exactly once and the calling function already dispatches on clauses, folding the conversion into the caller's clauses reduces indirection. Named helpers shine when reused or when they name a non-obvious transformation.
## Decision Tree
- If you accept "any enumerable" but lists are the common case → add a `when is_list` clause before protocol dispatch (Pattern 1)
- If you are building a result list element-by-element and order matters → prepend with `[x | acc]` then reverse at the end (Pattern 2)
- If data flows through 2+ sequential transformations → use the pipe operator (Pattern 3)
- If you call `Enumerable.reduce/3` directly and always want the accumulated value → unwrap with `|> elem(1)` (Pattern 4)
- If you need a recursive function with multiple termination conditions → decompose into public entry + private multi-clause worker (Pattern 5)
- If the collection is large/infinite or you chain 3+ transforms → use Stream; otherwise use Enum (Pattern 6)
- If the new value depends on the old value (increment, append) → use `Map.update/4`; if replacing unconditionally → use `Map.put/3` (Pattern 7)
- If you need to branch on whether a key exists and extract the value → pattern-match with `%{^key => value}` in a `case` (Pattern 8)
- If an Erlang function has identical semantics and argument order → use `defdelegate` (Pattern 9)
- If you are implementing a custom iterable data structure → implement `Enumerable.reduce/3` to get the full Enum API (Pattern 10)
- If a function accepts several distinct input shapes that each need substantially different handling → write one guarded clause per shape instead of a single `cond` (Pattern 11)
- If a small mapping from API-level values to internal values is reused or non-obvious → name it in a one-line guarded `defp` helper (Pattern 12)
<!-- PATTERN_COMPLETE -->