# Data Transform & Pipeline Patterns in Elixir Core

Patterns extracted from Elixir's standard library source code.

## Contents

1. [List-Specialized Clause Before Protocol Dispatch](#1-list-specialized-clause-before-protocol-dispatch)
2. [Build-Then-Reverse (Cons-Cell Accumulation)](#2-build-then-reverse-cons-cell-accumulation)
3. [Pipeline for Linear Transformations, Bare Calls for Control Flow](#3-pipeline-for-linear-transformations-bare-calls-for-control-flow)
4. [Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)](#4-pipeline-ending-with--elem1-protocol-reduce-unwrap)
5. [Private Helper Decomposition: Recursive Workers with Guards](#5-private-helper-decomposition-recursive-workers-with-guards)
6. [Enum vs Stream Decision Pattern](#6-enum-vs-stream-decision-pattern)
7. [Map.update vs Map.put Decision Pattern](#7-mapupdate-vs-mapput-decision-pattern)
8. [Pattern Matching on Map Structure for Dispatch](#8-pattern-matching-on-map-structure-for-dispatch)
9. [Delegating to Erlang BIFs with `defdelegate`](#9-delegating-to-erlang-bifs-with-defdelegate)
10. [Reduce as the Universal Primitive](#10-reduce-as-the-universal-primitive)
11. [Keyword Multi-Clause Guard Dispatch (String.split pattern)](#11-keyword-multi-clause-guard-dispatch-stringsplit-pattern)
12. [Lazy Private Helpers with `defp parts_to_index`](#12-lazy-private-helpers-with-defp-parts_to_index)

---

## 1. List-Specialized Clause Before Protocol Dispatch

**Source:** [lib/elixir/lib/enum.ex#L1723](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1723)

**What it does:** Many public `Enum` functions define a `when is_list(enumerable)` clause first, then a generic fallback that goes through the `Enumerable` protocol.

```elixir
def map(enumerable, fun) when is_list(enumerable) do
  :lists.map(fun, enumerable)
end

def map(first..last//step, fun) do
  map_range(first, last, step, fun)
end

def map(enumerable, fun) do
  reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end
```

**Why:** Lists are by far the most common enumerable. Matching them first skips protocol dispatch entirely and calls straight into Erlang's `:lists` module. The range clause is a further optimization for a common case. The generic clause handles all other enumerables through the protocol.

**Anti-pattern:** A single clause that always goes through protocol dispatch:

```elixir
# BAD: forces protocol overhead even for lists
def map(enumerable, fun) do
  Enumerable.reduce(enumerable, {:cont, []}, fn x, acc ->
    {:cont, [fun.(x) | acc]}
  end)
  |> elem(1)
  |> :lists.reverse()
end
```

### When to Use

**Triggers:** You have a public function that accepts "any enumerable" but lists account for the majority of callers. Profiling shows protocol dispatch is a measurable cost. You can call a function from Erlang's `:lists` module or a direct recursive implementation for the list case.

**Example (before):**

```elixir
def sum(enumerable) do
  Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
  |> elem(1)
end
```

**Example (after):**

```elixir
def sum(enumerable) when is_list(enumerable) do
  :lists.foldl(fn x, acc -> acc + x end, 0, enumerable)
end

def sum(enumerable) do
  Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
  |> elem(1)
end
```

### When NOT to Use

**Don't use this when:** The function is rarely called with lists, or the function body is complex enough that maintaining two implementations creates a bug risk. Also avoid it when the protocol path is already fast enough; specializing a path that is not hot is a micro-optimization.

**Over-application example:**

```elixir
# Pointless: this function is only ever called with streams
def expensive_transform(enumerable) when is_list(enumerable) do
  # duplicate complex logic just in case a list shows up
  enumerable |> do_phase_1() |> do_phase_2() |> do_phase_3()
end

def expensive_transform(enumerable) do
  enumerable |> do_phase_1() |> do_phase_2() |> do_phase_3()
end
```

**Better alternative:** Keep one clause. Add the specialization only when profiling proves the protocol dispatch is a bottleneck for real workloads.
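One way to gather that evidence is a quick timing comparison. The sketch below is a rough measurement, not a benchmark harness: `DispatchTiming` and its functions are hypothetical names, and `:timer.tc/1` times a single run, so the numbers will vary between machines and runs.

```elixir
defmodule DispatchTiming do
  # Hypothetical module for illustration; none of these names come from Elixir core.

  # Sums through the Enumerable protocol, whatever the input type.
  def protocol_sum(enumerable) do
    Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc -> {:cont, acc + x} end)
    |> elem(1)
  end

  # Sums a list directly, bypassing the protocol.
  def list_sum(list) when is_list(list) do
    :lists.foldl(fn x, acc -> acc + x end, 0, list)
  end

  # Runs `fun` once and returns the elapsed time in microseconds.
  def micros(fun) do
    {elapsed, _result} = :timer.tc(fun)
    elapsed
  end
end

list = Enum.to_list(1..100_000)

IO.inspect(DispatchTiming.micros(fn -> DispatchTiming.protocol_sum(list) end),
  label: "protocol path (microseconds)"
)

IO.inspect(DispatchTiming.micros(fn -> DispatchTiming.list_sum(list) end),
  label: "list clause (microseconds)"
)
```

If the two numbers are close for your real inputs, keep the single clause. For decisions that matter, a dedicated benchmarking tool gives more reliable numbers than one-off timings.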

**Why:** Premature optimization. Two copies of the same logic mean two places to fix bugs. With protocol consolidation, dispatch is already cheap, so you need evidence before duplicating.

---

## 2. Build-Then-Reverse (Cons-Cell Accumulation)

**Source:** [lib/elixir/lib/enum.ex#L1124](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1124), 1733, 2697

**What it does:** Accumulates results by prepending to a list (`[x | acc]`), then reverses at the end.

```elixir
# filter (line 1124)
def filter(enumerable, fun) do
  reduce(enumerable, [], R.filter(fun)) |> :lists.reverse()
end

# map (line 1733)
def map(enumerable, fun) do
  reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end

# reject (line 2697)
def reject(enumerable, fun) do
  reduce(enumerable, [], R.reject(fun)) |> :lists.reverse()
end
```

**Why:** Prepending to a linked list is O(1); appending is O(n). Building reversed then flipping once is O(n) total. Appending each element would be O(n²).

**Anti-pattern:** Appending to the end of a list in a loop:

```elixir
# BAD: O(n²)
def map(enumerable, fun) do
  reduce(enumerable, [], fn x, acc -> acc ++ [fun.(x)] end)
end
```

### When to Use

**Triggers:** You're building a result list element-by-element through recursion or reduce, and the output order must match the input order. The collection can be any size.

**Example (before):**

```elixir
def keep_positives(list) do
  Enum.reduce(list, [], fn x, acc ->
    if x > 0, do: acc ++ [x], else: acc
  end)
end
```

**Example (after):**

```elixir
def keep_positives(list) do
  Enum.reduce(list, [], fn x, acc ->
    if x > 0, do: [x | acc], else: acc
  end)
  |> :lists.reverse()
end
```

### When NOT to Use

**Don't use this when:** Order doesn't matter (e.g., building a set of unique items, collecting into a MapSet), or when you're only extracting a single value (sum, count, max). It is also unnecessary if you're using `Enum.map/2` or `Enum.filter/2` directly, since they already do this internally.

**Over-application example:**

```elixir
# Unnecessary: order doesn't matter for uniqueness
def unique_tags(items) do
  Enum.reduce(items, [], fn item, acc ->
    if item.tag in acc, do: acc, else: [item.tag | acc]
  end)
  |> :lists.reverse() # why reverse if you're just checking membership?
end
```

**Better alternative:** Use a MapSet or just don't reverse:

```elixir
def unique_tags(items) do
  Enum.reduce(items, MapSet.new(), fn item, acc ->
    MapSet.put(acc, item.tag)
  end)
end
```

**Why:** The reverse adds O(n) work and a full list traversal. If you don't care about order, skip it. If you're collecting into a non-list structure, this pattern doesn't apply.

---

## 3. Pipeline for Linear Transformations, Bare Calls for Control Flow

**Source:** [lib/elixir/lib/enum.ex#L1684](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L1684), 1551, vs 496–502

**What it does:** Elixir core uses `|>` when data flows linearly through 2+ transformations. It does NOT use `|>` for single-step operations or when the first argument is computed by a `case`/`if`/`with`.

```elixir
# Pipeline: data flows through multiple transforms (line 1684-1685)
def map_join(enumerable, joiner \\ "", mapper) do
  enumerable
  |> map(&entry_to_string(mapper.(&1)))
  |> intersperse(joiner)
  |> IO.iodata_to_binary()
end

# NO pipeline: single step or control flow (line 496-502)
def at(enumerable, index, default \\ nil) when is_integer(index) do
  case slice_forward(enumerable, index, 1, 1) do
    [value] -> value
    [] -> default
  end
end
```

**Why:** Pipelines communicate "data flows through transformations." Using them for a single function call or wrapping them around control flow obscures intent rather than clarifying it.

**Anti-pattern:** Pipelines for single operations or wrapping control flow:

```elixir
# BAD: single step, no pipeline needed
list |> Enum.reverse()

# BAD: control flow awkwardly forced into a pipe
result
|> case do
  {:ok, x} -> x
  :error -> nil
end
```

### When to Use

**Triggers:** Data flows through 2 or more transformations in sequence, each taking the previous result as its first argument. The reader should see a "conveyor belt" of operations.

**Example (before):**

```elixir
def format_names(users) do
  String.upcase(Enum.join(Enum.map(users, & &1.name), ", "))
end
```

**Example (after):**

```elixir
def format_names(users) do
  users
  |> Enum.map(& &1.name)
  |> Enum.join(", ")
  |> String.upcase()
end
```

### When NOT to Use

**Don't use this when:** There's only one transformation, the result needs to go into a pattern match (`case`/`with`), or the pipe would require anonymous function wrapping (`|> then(fn x -> ... end)`) to fit.

**Over-application example:**

```elixir
# Forced: the then/2 wrapper defeats readability
params
|> Map.get(:user_id)
|> then(fn id ->
  case Repo.get(User, id) do
    nil -> {:error, :not_found}
    user -> {:ok, user}
  end
end)
```

**Better alternative:**

```elixir
case Repo.get(User, Map.get(params, :user_id)) do
  nil -> {:error, :not_found}
  user -> {:ok, user}
end
```

**Why:** Pipes are for linear data flow. When you need branching (`case`/`cond`/`with`), break out of the pipeline. Forcing control flow through `then/2` adds indirection without clarity.

---

## 4. Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)

**Source:** [lib/elixir/lib/enum.ex#L363](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L363), 403, 433, 468, 725, 1022, 2676

**What it does:** When you call `Enumerable.reduce/3` directly, the result is a tagged tuple: `{:done, acc}` or `{:halted, acc}` (or `{:suspended, acc, continuation}` if the reducer suspends). Core extracts the accumulator with `|> elem(1)`.

```elixir
# all?/1 (line 363)
def all?(enumerable) do
  Enumerable.reduce(enumerable, {:cont, true}, fn entry, _ ->
    if entry, do: {:cont, true}, else: {:halt, false}
  end)
  |> elem(1)
end

# reduce/3 (line 2676): wraps the user's function so it returns a tagged accumulator
def reduce(enumerable, acc, fun) do
  Enumerable.reduce(enumerable, {:cont, acc}, fn x, acc -> {:cont, fun.(x, acc)} end) |> elem(1)
end
```

**Why:** The protocol takes tagged accumulators (`:cont`/`:halt`/`:suspend`) and returns a tagged result (`:done`/`:halted`/`:suspended`) so that iteration can stop early or pause. Callers of `Enum` usually need only the accumulated value, and `|> elem(1)` is the idiomatic unwrap.

**Anti-pattern:** Using `case` when you don't care about the tag:

```elixir
# BAD: unnecessary pattern match when you always want the value
case Enumerable.reduce(enumerable, {:cont, acc}, fun) do
  {:done, result} -> result
  {:halted, result} -> result
end
```

### When to Use

**Triggers:** You're calling `Enumerable.reduce/3` directly (implementing a custom Enum-like function) and you always want the accumulated value regardless of whether iteration completed or halted.

**Example (before):**

```elixir
def sum_until(enumerable, limit) do
  result =
    Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
      new = acc + x
      if new >= limit, do: {:halt, new}, else: {:cont, new}
    end)

  case result do
    {:done, val} -> val
    {:halted, val} -> val
  end
end
```

**Example (after):**

```elixir
def sum_until(enumerable, limit) do
  Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
    new = acc + x
    if new >= limit, do: {:halt, new}, else: {:cont, new}
  end)
  |> elem(1)
end
```

### When NOT to Use

**Don't use this when:** You need to distinguish between `:done` and `:halted` to decide subsequent behavior (e.g., you want to know if iteration was interrupted). Also don't use it in application code where you should be using `Enum.reduce/3`, which handles the unwrapping for you.
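When the tag does matter, match on it instead of discarding it. A minimal sketch, assuming a hypothetical `BudgetCheck.sum_within/2` (not part of Elixir core) that has to report whether it stopped early:

```elixir
defmodule BudgetCheck do
  # Hypothetical helper: sums elements until `limit` is reached and
  # reports whether iteration finished or was cut short.
  def sum_within(enumerable, limit) do
    result =
      Enumerable.reduce(enumerable, {:cont, 0}, fn x, acc ->
        new = acc + x
        if new >= limit, do: {:halt, new}, else: {:cont, new}
      end)

    # Here the tag carries information, so `elem(1)` would throw it away.
    case result do
      {:done, total} -> {:complete, total}
      {:halted, total} -> {:stopped_early, total}
    end
  end
end
```

`BudgetCheck.sum_within([1, 2, 3], 100)` returns `{:complete, 6}`, while a limit of `3` halts on the second element and returns `{:stopped_early, 3}`.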

**Over-application example:**

```elixir
# Pointless: Enum.reduce/3 already unwraps
Enum.reduce([1, 2, 3], 0, &(&1 + &2)) |> elem(1)
# This raises! Enum.reduce/3 returns the value directly, not a tuple.
```

**Better alternative:** Use `Enum.reduce/3` in application code. Only use the `|> elem(1)` pattern when directly calling `Enumerable.reduce/3` in library code.

**Why:** This pattern is for code that calls the protocol directly, typically library code, not for application code. Applying it to an already-unwrapped result raises an error. It's an internal idiom that shouldn't leak into regular code.

---

## 5. Private Helper Decomposition: Recursive Workers with Guards

**Source:** [lib/elixir/lib/enum.ex#L4975](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L4975), 5025–5039

**What it does:** Complex operations are split into a public entry point (with validation guards) and a private recursive worker function. The worker uses pattern matching on structure (empty list, `[head | tail]`) and guards on counters.

```elixir
# Public entry: validates, delegates (line 890-904)
def drop(enumerable, amount)
    when is_list(enumerable) and is_integer(amount) and amount >= 0 do
  drop_list(enumerable, amount)
end

# Private worker of the same shape (this one backs split/2): pattern matches on structure (line 4975-4983)
defp split_list([head | tail], counter, acc) when counter > 0 do
  split_list(tail, counter - 1, [head | acc])
end

defp split_list(list, 0, acc) do
  {:lists.reverse(acc), list}
end

defp split_list([], _, acc) do
  {:lists.reverse(acc), []}
end
```

**Why:** Separating validation from recursion keeps each clause focused. Putting the cases in function heads lets the compiler turn them into an efficient match on the arguments, and the branching reads as a list of cases rather than nested `if`/`cond`.
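Put together, an entry point and its worker might look like the sketch below. `Splitter` is a hypothetical module modeled on the clauses shown above, not a copy of the core source.

```elixir
defmodule Splitter do
  # Hypothetical module for illustration.

  # Public entry: validates the arguments once, then hands off to the worker.
  def split(list, count) when is_list(list) and is_integer(count) and count >= 0 do
    split_list(list, count, [])
  end

  # Worker: still elements to take and a non-empty list.
  defp split_list([head | tail], counter, acc) when counter > 0 do
    split_list(tail, counter - 1, [head | acc])
  end

  # Base case: counter exhausted, the rest of the list is the second half.
  defp split_list(list, 0, acc), do: {:lists.reverse(acc), list}

  # Base case: list exhausted before the counter reached zero.
  defp split_list([], _counter, acc), do: {:lists.reverse(acc), []}
end
```

`Splitter.split([1, 2, 3, 4], 2)` returns `{[1, 2], [3, 4]}`, and asking for more elements than the list holds, as in `Splitter.split([1, 2], 5)`, returns `{[1, 2], []}`. Invalid input such as a negative count never reaches the worker; it fails at the public guard with a `FunctionClauseError`.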

**Anti-pattern:** Mixing validation, edge cases, and recursion in a single function with internal conditionals:

```elixir
# BAD: one big function with nested ifs
defp split_list(list, counter, acc) do
  if counter > 0 and list != [] do
    [head | tail] = list
    split_list(tail, counter - 1, [head | acc])
  else
    {:lists.reverse(acc), list}
  end
end
```

### When to Use

**Triggers:** You're writing a recursive function that processes a list element-by-element with termination conditions (counter hits zero, list becomes empty, accumulator reaches a threshold). Multiple base cases exist.

**Example (before):**

```elixir
defp take_while_impl(list, fun, acc) do
  case list do
    [] -> :lists.reverse(acc)
    [head | tail] ->
      if fun.(head) do
        take_while_impl(tail, fun, [head | acc])
      else
        :lists.reverse(acc)
      end
  end
end
```

**Example (after):**

```elixir
defp take_while_impl([], _fun, acc) do
  :lists.reverse(acc)
end

defp take_while_impl([head | tail], fun, acc) do
  if fun.(head) do
    take_while_impl(tail, fun, [head | acc])
  else
    :lists.reverse(acc)
  end
end
```

### When NOT to Use

**Don't use this when:** The logic doesn't recurse (a simple one-shot transformation), or when `Enum` functions already express the operation clearly. Don't decompose for decomposition's sake.

**Over-application example:**

```elixir
# Over-engineered: this is just Enum.take/2
defp my_take_list([], _n, acc), do: :lists.reverse(acc)
defp my_take_list(_list, 0, acc), do: :lists.reverse(acc)
defp my_take_list([h | t], n, acc), do: my_take_list(t, n - 1, [h | acc])

def my_take(list, n), do: my_take_list(list, n, [])
```

**Better alternative:**

```elixir
def my_take(list, n), do: Enum.take(list, n)
```

**Why:** Elixir's standard library already provides optimized implementations of common list operations. Writing your own recursive versions adds maintenance burden and is unlikely to perform better, because `Enum` already has specialized list clauses. Reserve this pattern for genuinely novel recursion.

---

## 6. Enum vs Stream Decision Pattern

**Source:** [lib/elixir/lib/stream.ex#L1](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/stream.ex#L1) (module docs), `lib/elixir/lib/enum.ex`

**What it does:** Enum functions are eager (materialize intermediate lists). Stream functions are lazy (build computation recipes). Core uses Stream for:
- Infinite sequences (`cycle`, `iterate`, `repeatedly`)
- Resource management (`resource/3`)
- Composing transformations to execute in a single pass

```elixir
# Stream: builds a recipe, zero computation until consumed (stream.ex ~line 490)
def map(enum, fun) when is_function(fun, 1) do
  lazy(enum, fn f1 -> R.map(fun, f1) end)
end

# Enum: immediately materializes the result (enum.ex line 1723)
def map(enumerable, fun) when is_list(enumerable) do
  :lists.map(fun, enumerable)
end
```

**Why:** From Stream docs (lines 37–41): "When chaining many operations with `Enum`, intermediate lists are created, while `Stream` creates a recipe of computations that are executed at a later moment."

Use Enum when:
- You need the full result now
- The collection is small/bounded
- You only chain 1–2 operations

Use Stream when:
- The collection is large or infinite
- You chain many transformations
- You need resource cleanup (file handles, network)
- You want single-pass processing
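The resource-cleanup case in the list above can be sketched with `Stream.resource/3`. This is an illustration under stated assumptions: `LineSource` is a hypothetical module, and an in-memory `StringIO` device stands in for a file handle or socket so the example runs anywhere.

```elixir
defmodule LineSource do
  # Hypothetical helper: lazily yields lines from an in-memory device and
  # closes the device when the stream finishes or is halted early.
  def lines(contents) do
    Stream.resource(
      # start_fun: acquire the resource only when the stream is consumed
      fn ->
        {:ok, device} = StringIO.open(contents)
        device
      end,
      # next_fun: emit one line per step, or halt at end of input
      fn device ->
        case IO.read(device, :line) do
          :eof -> {:halt, device}
          line -> {[line], device}
        end
      end,
      # after_fun: runs on normal completion and on early halt
      fn device -> StringIO.close(device) end
    )
  end
end

"a\nb\nc\n"
|> LineSource.lines()
|> Stream.map(&String.trim/1)
|> Enum.take(2)
|> IO.inspect()
```

Only two lines are read here: `Enum.take/2` halts the stream after the second element, and the cleanup function still runs.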

**Anti-pattern:** Using Stream for small bounded collections (overhead of the lazy machinery exceeds any benefit):

```elixir
# BAD: Stream overhead for trivial transform
[1, 2, 3] |> Stream.map(&(&1 * 2)) |> Enum.to_list()

# GOOD: just use Enum
[1, 2, 3] |> Enum.map(&(&1 * 2))
```

### When to Use

**Triggers:** You're chaining 3+ transformations on a large (or unbounded) collection. You're reading from a file/network where you want backpressure. You need `Stream.resource/3` for cleanup guarantees.

**Example (before):**

```elixir
# Reads the whole 1M-line file, then builds a full list at every step
File.read!("large.csv")
|> String.split("\n")
|> Enum.map(&String.trim/1)
|> Enum.filter(&(&1 != ""))
|> Enum.map(&parse_row/1)
|> Enum.take(100)
```

**Example (after):**

```elixir
# Single pass, no intermediate lists, stops reading after 100 rows
File.stream!("large.csv")
|> Stream.map(&String.trim/1)
|> Stream.reject(&(&1 == ""))
|> Stream.map(&parse_row/1)
|> Enum.take(100)
```

### When NOT to Use

**Don't use this when:** The collection is small and bounded (as a rough guide, up to a few hundred elements), you only apply 1–2 transformations, or you need random access to the full result. Stream's lazy machinery has overhead that exceeds the savings for small data.

**Over-application example:**

```elixir
# Stream overhead exceeds any benefit for 5 items
config = [:a, :b, :c, :d, :e]

config
|> Stream.map(&Atom.to_string/1)
|> Stream.map(&String.upcase/1)
|> Enum.to_list()
```

**Better alternative:**

```elixir
config
|> Enum.map(&(&1 |> Atom.to_string() |> String.upcase()))
```

**Why:** Stream wraps each step in a closure and creates a lazy struct. For small collections, the allocation and indirection cost more than just building the intermediate list. As a rough rule of thumb (measure for your own workload), Stream starts to pay off once collections grow beyond a few hundred elements AND you chain 3+ operations.

---

## 7. Map.update vs Map.put Decision Pattern

**Source:** [lib/elixir/lib/map.ex#L670](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L670)

**What it does:** `Map.update/4` transforms an existing value based on its current state. `Map.put/3` unconditionally sets a value regardless of current state.

```elixir
# Map.update/4 (line 682-693): transform based on current value
def update(map, key, default, fun) when is_function(fun, 1) do
  case map do
    %{^key => value} ->
      %{map | key => fun.(value)}

    %{} ->
      put(map, key, default)

    other ->
      :erlang.error({:badmap, other}, [map, key, default, fun])
  end
end

# Map.put/3 (line 636): unconditional set
def put(map, key, value) do
  :maps.put(key, value, map)
end
```

**Why:** `update/4` is for when the new value depends on the old value (counters, appending to nested lists). `put/3` is for when you know the exact new value regardless of what was there.

**Anti-pattern:** Using `get` + `put` when `update` expresses intent:

```elixir
# BAD: two lookups, unclear intent
count = Map.get(map, :count, 0)
Map.put(map, :count, count + 1)

# GOOD: single lookup, clear intent
Map.update(map, :count, 1, &(&1 + 1))
```

### When to Use

**Triggers:** The new value is computed FROM the old value: incrementing counters, appending to lists, toggling booleans, merging nested maps. You also need a sensible default for the "key doesn't exist yet" case.

**Example (before):**

```elixir
def add_tag(state, tag) do
  existing = Map.get(state, :tags, [])
  Map.put(state, :tags, [tag | existing])
end
```

**Example (after):**

```elixir
def add_tag(state, tag) do
  Map.update(state, :tags, [tag], fn tags -> [tag | tags] end)
end
```

### When NOT to Use

**Don't use this when:** The new value is independent of the old value (you're replacing, not transforming). Also avoid it when you need to handle the "missing key" case differently from the "present key" case (use `Map.get_and_update/3` or an explicit `case` instead).
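As a sketch of that second case, `Map.get_and_update/3` lets one function both update the value and report which branch was taken. `VisitCounter` and its return tags are hypothetical names chosen for illustration.

```elixir
defmodule VisitCounter do
  # Hypothetical helper: bumps the counter for `page` and reports whether
  # this was the first visit. The callback receives the current value,
  # or nil when the key is missing, and returns {value_to_return, new_value}.
  def visit(counts, page) do
    Map.get_and_update(counts, page, fn
      nil -> {:first_visit, 1}
      n -> {:returning, n + 1}
    end)
  end
end

{status, counts} = VisitCounter.visit(%{}, "home")
IO.inspect({status, counts})
```

One caveat: the callback cannot tell a missing key from a key whose value is `nil`. If that difference matters, an explicit `case` on `%{^key => value}` distinguishes the two.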

**Over-application example:**

```elixir
# Awkward: the "update" function ignores the old value entirely
Map.update(user, :name, new_name, fn _old -> new_name end)
```

**Better alternative:**

```elixir
Map.put(user, :name, new_name)
```

**Why:** `Map.update/4` communicates "the new value depends on the old one." When you ignore the old value in the update function, you're lying to the reader. Use `put/3` for unconditional replacement: it's simpler and signals intent correctly.

---

## 8. Pattern Matching on Map Structure for Dispatch

**Source:** [lib/elixir/lib/map.ex#L398](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L398), 509, 586

**What it does:** Map functions use `case map do %{^key => value} -> ...` to dispatch on whether a key exists, rather than calling `has_key?` + conditional.

```elixir
# Map.get/3 (line 586-594)
def get(map, key, default \\ nil) do
  case map do
    %{^key => value} ->
      value

    %{} ->
      default

    other ->
      :erlang.error({:badmap, other}, [map, key, default])
  end
end

# Map.put_new/3 (line 398-407)
def put_new(map, key, value) do
  case map do
    %{^key => _value} ->
      map

    %{} ->
      put(map, key, value)

    other ->
      :erlang.error({:badmap, other})
  end
end
```

**Why:** Pattern matching with `%{^key => value}` does the lookup AND extraction in one step. The `%{}` pattern matches any map, not just an empty one; because it comes second, it only runs for maps where the key is NOT present. The `other` clause provides a clear error for non-maps. This is both more efficient and more readable than `if Map.has_key?(map, key)`.

**Anti-pattern:**

```elixir
# BAD: double lookup, less clear
def get(map, key, default) do
  if Map.has_key?(map, key) do
    :maps.get(key, map)
  else
    default
  end
end
```

### When to Use

**Triggers:** You need to branch based on whether a key exists in a map, especially when you also want the value if it does exist. You want a single lookup that both checks existence and extracts the value.

**Example (before):**

```elixir
def fetch_config(config, key) do
  if Map.has_key?(config, key) do
    {:ok, Map.get(config, key)}
  else
    {:error, :missing}
  end
end
```

**Example (after):**

```elixir
def fetch_config(config, key) do
  case config do
    %{^key => value} -> {:ok, value}
    %{} -> {:error, :missing}
  end
end
```

### When NOT to Use

**Don't use this when:** You're checking for multiple keys simultaneously (use a multi-key pattern match instead), or when `Map.get/3` with a default already expresses what you need. Don't use `case` dispatch for simple "get with fallback" scenarios.
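For the multi-key case, a single pattern can require several keys at once, usually in a function head. A minimal sketch, assuming a hypothetical `Endpoint` module and `:host`/`:port` keys chosen for illustration:

```elixir
defmodule Endpoint do
  # Hypothetical helper: clauses are tried top to bottom, so the most
  # specific pattern (both keys present) comes first.
  def address(%{host: host, port: port}), do: {:ok, "#{host}:#{port}"}
  def address(%{host: _host}), do: {:error, :missing_port}
  def address(%{}), do: {:error, :missing_host}
end

IO.inspect(Endpoint.address(%{host: "localhost", port: 4000}))
```

Extra keys in the map do not prevent a match, so `Endpoint.address(%{host: "localhost", port: 4000, scheme: "http"})` still hits the first clause.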

**Over-application example:**

```elixir
# Over-engineered: Map.get/3 already does this
def get_name(user) do
  case user do
    %{:name => name} -> name
    %{} -> "Anonymous"
  end
end
```

**Better alternative:**

```elixir
def get_name(user) do
  Map.get(user, :name, "Anonymous")
end
```

**Why:** `Map.get/3` already implements this exact pattern internally. Rewriting it as an explicit `case` adds visual noise without any semantic or performance benefit. Use the case pattern when you're doing something `Map.get` can't, like returning different tagged tuples or triggering side effects.

---

## 9. Delegating to Erlang BIFs with `defdelegate`

**Source:** [lib/elixir/lib/map.ex#L127](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/map.ex#L127), 143, 159, 173

**What it does:** When an Erlang function already does exactly what's needed, Elixir delegates directly rather than wrapping.

```elixir
@spec keys(map) :: [key]
defdelegate keys(map), to: :maps

@spec values(map) :: [value]
defdelegate values(map), to: :maps

@spec merge(map, map) :: map
defdelegate merge(map1, map2), to: :maps
```

**Why:** There is no point hand-writing an Elixir wrapper around an Erlang BIF when the semantics are identical. `defdelegate` generates the forwarding function for you, and its cost is negligible: at most one extra function call. The module also carries an `@compile {:inline, ...}` annotation on line 115.

**Anti-pattern:** Wrapping without adding value:

```elixir
# BAD: pointless wrapper
def keys(map) do
  :maps.keys(map)
end
```

### When to Use

**Triggers:** An Erlang module already exports a function with the exact semantics you need. The argument order matches. You want to expose it under an Elixir-idiomatic name or in your module's namespace for discoverability.

**Example (before):**

```elixir
defmodule MyQueue do
  def new, do: :queue.new()
  def push(q, item), do: :queue.in(item, q)
  def pop(q), do: :queue.out(q)
end
```

**Example (after):**

```elixir
defmodule MyQueue do
  defdelegate new(), to: :queue
  # Can't delegate push: argument order differs, needs wrapper
  def push(q, item), do: :queue.in(item, q)
  defdelegate pop(q), to: :queue, as: :out
end
```

### When NOT to Use

**Don't use this when:** You need to validate inputs, transform arguments, change argument order, add defaults, or adapt the return value. Also avoid it when the Erlang function has unclear semantics that benefit from a documenting wrapper.

**Over-application example:**

```elixir
# Broken: :maps.get/2 takes (key, map), but Elixir convention is (map, key)
defdelegate get(map, key), to: :maps
# This compiles, but the call reaches :maps.get(map, key) with the arguments swapped!
```

**Better alternative:**

```elixir
def get(map, key) do
  :maps.get(key, map)
end
```

**Why:** `defdelegate` is a transparent pass-through. If argument order, defaults, validation, or error handling differ between your desired API and the Erlang function, you need a real wrapper. Delegating with a semantic mismatch creates subtle bugs.

---

## 10. Reduce as the Universal Primitive
|
||
|
||
**Source:** [lib/elixir/lib/enum.ex#L19](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/enum.ex#L19), 2660–2676
|
||
|
||
**What it does:** Nearly every Enum operation is built on top of `reduce`. The Enumerable protocol's core function is `reduce/3`. Everything else (`count`, `member?`, `slice`) is an optimization hint.
|
||
|
||
```elixir
|
||
# From the protocol docs (line 19-21):
|
||
def map(enumerable, fun) do
|
||
reducer = fn x, acc -> {:cont, [fun.(x) | acc]} end
|
||
Enumerable.reduce(enumerable, {:cont, []}, reducer) |> elem(1) |> :lists.reverse()
|
||
end
|
||
|
||
# The actual reduce/3 (line 2676):
|
||
def reduce(enumerable, acc, fun) do
|
||
Enumerable.reduce(enumerable, {:cont, acc}, fun) |> elem(1)
|
||
end
|
||
```
|
||
|
||
**Why:** Reduce is the most general iteration primitive. By building all operations on reduce, any data structure that implements `Enumerable.reduce/3` automatically gets the full `Enum` API. This is the protocol + reduce = universal composability pattern.
|
||
|
||
**Anti-pattern:** Implementing each Enum function independently for each data structure:
|
||
```elixir
|
||
# BAD — reimplementing map for each type
|
||
def map(%MyStruct{items: items}, fun), do: ...
|
||
def filter(%MyStruct{items: items}, fun), do: ...
|
||
# Instead: implement Enumerable.reduce/3 once and get everything
|
||
```

### When to Use

**Triggers:** You're implementing a custom data structure that should be iterable. You want the full `Enum` API without implementing each function. You're designing a protocol where one function provides maximum leverage.

**Example — before:**
```elixir
defmodule RingBuffer do
  def map(%RingBuffer{} = rb, fun), do: ...
  def filter(%RingBuffer{} = rb, fun), do: ...
  def reduce(%RingBuffer{} = rb, acc, fun), do: ...
  def count(%RingBuffer{} = rb), do: ...
  # 70+ functions to implement...
end
```

**Example — after:**
```elixir
defimpl Enumerable, for: RingBuffer do
  def reduce(%RingBuffer{data: data, head: h, size: s}, acc, fun) do
    # One function — yields elements in order
    # (do_reduce/5 is a private recursive helper, not shown)
    do_reduce(data, h, s, acc, fun)
  end

  def count(%RingBuffer{size: s}), do: {:ok, s}
  def member?(_, _), do: {:error, __MODULE__}
  def slice(_), do: {:error, __MODULE__}
end
# Now Enum.map/filter/take/etc. all work automatically
```
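
For experimentation, here is one way the missing pieces might be filled in. The struct layout is an assumption of this sketch (`data` is a fixed-size tuple, `head` is the index of the oldest element, `size` is the number of live elements), and `do_reduce/5` is a hypothetical helper. The helper handles the three accumulator tags that `Enumerable.reduce/3` must support: `:cont`, `:halt`, and `:suspend`.

```elixir
defmodule RingBuffer do
  # Assumed layout: fixed-size tuple, index of oldest element, live count.
  defstruct data: {}, head: 0, size: 0
end

defimpl Enumerable, for: RingBuffer do
  def reduce(%RingBuffer{data: data, head: h, size: s}, acc, fun) do
    do_reduce(data, h, s, acc, fun)
  end

  def count(%RingBuffer{size: s}), do: {:ok, s}
  def member?(_rb, _value), do: {:error, __MODULE__}
  def slice(_rb), do: {:error, __MODULE__}

  # The caller asked to stop early (e.g. Enum.take/2).
  defp do_reduce(_data, _h, _s, {:halt, acc}, _fun), do: {:halted, acc}

  # The caller asked to pause; hand back a continuation.
  defp do_reduce(data, h, s, {:suspend, acc}, fun) do
    {:suspended, acc, &do_reduce(data, h, s, &1, fun)}
  end

  # No elements left.
  defp do_reduce(_data, _h, 0, {:cont, acc}, _fun), do: {:done, acc}

  # Emit the element at the head, then advance with wrap-around.
  defp do_reduce(data, h, s, {:cont, acc}, fun) do
    next = rem(h + 1, tuple_size(data))
    do_reduce(data, next, s - 1, fun.(elem(data, h), acc), fun)
  end
end

rb = %RingBuffer{data: {:c, :d, :a, :b}, head: 2, size: 4}
IO.inspect(Enum.to_list(rb))   # => [:a, :b, :c, :d]
IO.inspect(Enum.take(rb, 2))   # => [:a, :b]
```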

### When NOT to Use

**Don't use this when:** Your data structure has specialized algorithms that are significantly faster than the generic reduce-based approach (e.g., binary search on a sorted structure for `member?`). In that case, also implement the matching optimization callbacks alongside `reduce/3` rather than returning `{:error, __MODULE__}`.

**Over-application example:**
```elixir
# Wasteful — reduce traverses all elements to count a structure with O(1) size
defimpl Enumerable, for: SizedCollection do
  def count(_), do: {:error, __MODULE__}
  # This forces Enum.count to use reduce: O(n)
  # when the size is stored in a field: O(1)
end
```

**Better alternative:**
```elixir
defimpl Enumerable, for: SizedCollection do
  def count(%{size: s}), do: {:ok, s}
  # Now Enum.count is O(1)
end
```
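
The same reasoning applies to `member?/2`. The following is a minimal sketch, assuming a hypothetical `TagSet` struct that wraps a `MapSet` (rather than the sorted structure mentioned earlier), so that membership and size are answered without walking every element while `reduce/3` remains the generic fallback.

```elixir
defmodule TagSet do
  # Hypothetical struct for illustration: wraps a MapSet in `items`.
  defstruct [:items]

  def new(list), do: %TagSet{items: MapSet.new(list)}
end

defimpl Enumerable, for: TagSet do
  # Required: generic traversal, delegated to the wrapped MapSet.
  def reduce(%TagSet{items: items}, acc, fun), do: Enumerable.reduce(items, acc, fun)

  # Fast paths: answered directly by MapSet, no full traversal.
  def count(%TagSet{items: items}), do: {:ok, MapSet.size(items)}
  def member?(%TagSet{items: items}, value), do: {:ok, MapSet.member?(items, value)}

  # No efficient slicing for a set, so fall back to reduce.
  def slice(_set), do: {:error, __MODULE__}
end
```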

**Why:** The optimization callbacks (`count`, `member?`, `slice`) exist precisely because reduce is O(n) for operations that some structures can do faster. Use reduce as the universal fallback, but implement the fast paths when your structure supports them.

---

## 11. Keyword Multi-Clause Guard Dispatch (String.split pattern)

**Source:** [lib/elixir/lib/string.ex#L516](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/string.ex#L516)

**What it does:** Functions with many input shapes use multiple `def` clauses with guards to dispatch, handling each case distinctly rather than using internal `cond`/`case`.

```elixir
def split(string, %Regex{} = pattern, options) when is_binary(string) and is_list(options) do
  Regex.split(pattern, string, options)
end

def split(string, "", options) when is_binary(string) and is_list(options) do
  # special case: split by empty string (grapheme-by-grapheme)
  ...
end

def split(string, [], options) when is_binary(string) and is_list(options) do
  # empty pattern list: no splitting
  ...
end

def split(string, pattern, options) when is_binary(string) and is_list(options) do
  # general binary pattern case
  ...
end
```
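
From the caller's side, each clause corresponds to a distinct input shape. The calls below are a small illustration using the public `String.split/3` API; the input strings are arbitrary.

```elixir
# Regex clause
String.split("a1b2c", ~r/\d/)
# => ["a", "b", "c"]

# Empty-string clause: one piece per grapheme, with empty edges unless trimmed
String.split("abc", "")
# => ["", "a", "b", "c", ""]
String.split("abc", "", trim: true)
# => ["a", "b", "c"]

# Empty-list clause: nothing to split on
String.split("abc", [])
# => ["abc"]

# General binary pattern clause
String.split("a,b,c", ",")
# => ["a", "b", "c"]
String.split("a,b,c", ",", parts: 2)
# => ["a", "b,c"]
```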

**Why:** Each clause has a single responsibility. The BEAM compiler generates efficient dispatch for these patterns. Adding a new case is additive (new clause) rather than modifying existing logic.

**Anti-pattern:** One function with nested conditionals:
```elixir
# BAD — all cases mashed into one body
def split(string, pattern, options) do
  cond do
    is_struct(pattern, Regex) -> ...
    pattern == "" -> ...
    pattern == [] -> ...
    true -> ...
  end
end
```

### When to Use

**Triggers:** A function accepts multiple distinct input shapes (different types, specific sentinel values, structural patterns). Each shape requires substantially different handling. The shapes are distinguishable via guards or pattern matching.

**Example — before:**
```elixir
def parse(input, format) do
  cond do
    format == :json -> Jason.decode!(input)
    format == :yaml -> YamlElixir.read_from_string!(input)
    is_binary(format) -> custom_parse(input, format)
    true -> raise "unknown format"
  end
end
```

**Example — after:**
```elixir
def parse(input, :json) when is_binary(input), do: Jason.decode!(input)
def parse(input, :yaml) when is_binary(input), do: YamlElixir.read_from_string!(input)
def parse(input, format) when is_binary(input) and is_binary(format), do: custom_parse(input, format)
```
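
One behavioral difference is worth noting: without a final clause, an unrecognized format now raises `FunctionClauseError` instead of the custom `"unknown format"` message. If the descriptive error matters, a catch-all clause can restore it. The sketch below uses a hypothetical `Parser` module with made-up formats (`:integer`, `:float`, and a delimiter string) so that it runs without `Jason` or `YamlElixir`.

```elixir
defmodule Parser do
  # Hypothetical formats chosen so the sketch has no dependencies.
  def parse(input, :integer) when is_binary(input), do: String.to_integer(input)
  def parse(input, :float) when is_binary(input), do: String.to_float(input)

  # Assumption of this sketch: a binary format is treated as a delimiter.
  def parse(input, delimiter) when is_binary(input) and is_binary(delimiter) do
    String.split(input, delimiter)
  end

  # Explicit catch-all: a descriptive error instead of FunctionClauseError.
  def parse(input, format) do
    raise ArgumentError, "cannot parse #{inspect(input)} as #{inspect(format)}"
  end
end
```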

### When NOT to Use

**Don't use this when:** The differences between cases are minor (a single flag toggles a small behavior), or when you'd end up with 10+ nearly-identical clauses that differ by one line. Also avoid when the distinguishing condition can't be expressed in a guard (e.g., requires a database lookup).

**Over-application example:**
```elixir
# Absurd — 5 clauses that differ only in a multiplier
def convert(value, :mm), do: value * 1.0
def convert(value, :cm), do: value * 10.0
def convert(value, :m), do: value * 1000.0
def convert(value, :km), do: value * 1_000_000.0
def convert(value, :in), do: value * 25.4
```

**Better alternative:**
```elixir
@multipliers %{mm: 1.0, cm: 10.0, m: 1000.0, km: 1_000_000.0, in: 25.4}

def convert(value, unit) when is_map_key(@multipliers, unit) do
  value * @multipliers[unit]
end
```
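
A self-contained variant of the lookup-table approach, for experimentation. `Units` and `units/0` are hypothetical names, only the metric entries are included, and the `is_number/1` guard is an addition of this sketch. An unsupported unit still fails with `FunctionClauseError` because the guard rejects it.

```elixir
defmodule Units do
  # Multipliers to millimetres (metric subset, for illustration).
  @multipliers %{mm: 1.0, cm: 10.0, m: 1000.0, km: 1_000_000.0}

  def convert(value, unit) when is_number(value) and is_map_key(@multipliers, unit) do
    value * @multipliers[unit]
  end

  # The table also makes the supported units easy to list.
  def units, do: Map.keys(@multipliers)
end
```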

**Why:** When clauses share identical structure and differ only by data, a lookup table is cleaner and more maintainable. Multi-clause dispatch shines when each case has genuinely different logic, not just different constants.

---

## 12. Lazy Private Helpers with `defp parts_to_index`

**Source:** [lib/elixir/lib/string.ex#L562](https://github.com/elixir-lang/elixir/blob/f4e1b34617ef92052b65781f18eae5b88a490098/lib/elixir/lib/string.ex#L562)

**What it does:** Tiny private helpers that convert between API-level concepts and implementation-level values use single-line `defp` with guards.

```elixir
defp parts_to_index(:infinity), do: 0
defp parts_to_index(n) when is_integer(n) and n > 0, do: n
```

**Why:** Clear, self-documenting dispatch. Each case is one line. No branching logic in the caller. The function name explains the conversion.

**Anti-pattern:** Inline conditional in the caller:
```elixir
# BAD — logic scattered in caller
index = if parts == :infinity, do: 0, else: parts
```

### When to Use

**Triggers:** You have a small, well-defined mapping between API-level values and internal representations. The conversion appears in multiple places, or the mapping is non-obvious enough to deserve a name.

**Example — before:**
```elixir
def fetch(resource, timeout) do
  ms = if timeout == :infinity, do: :infinity, else: timeout * 1000
  do_fetch(resource, ms)
end
```

**Example — after:**
```elixir
def fetch(resource, timeout) do
  do_fetch(resource, timeout_to_ms(timeout))
end

defp timeout_to_ms(:infinity), do: :infinity
defp timeout_to_ms(seconds) when is_number(seconds) and seconds >= 0, do: round(seconds * 1000)
```
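
A runnable variant of the refactored version, to show what the guards buy you. `Fetcher` and the echoing `do_fetch/2` are hypothetical stand-ins. Because the helper has no clause for negative or non-numeric timeouts, such input fails fast with `FunctionClauseError` instead of reaching `do_fetch/2`.

```elixir
defmodule Fetcher do
  def fetch(resource, timeout) do
    do_fetch(resource, timeout_to_ms(timeout))
  end

  # API-level timeout (seconds or :infinity) to internal milliseconds.
  defp timeout_to_ms(:infinity), do: :infinity
  defp timeout_to_ms(seconds) when is_number(seconds) and seconds >= 0, do: round(seconds * 1000)

  # Stand-in for the real work: echoes what it received.
  defp do_fetch(resource, ms), do: {:ok, resource, ms}
end

IO.inspect(Fetcher.fetch(:page, 1.5))        # => {:ok, :page, 1500}
IO.inspect(Fetcher.fetch(:page, :infinity))  # => {:ok, :page, :infinity}
```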

### When NOT to Use

**Don't use this when:** The conversion is trivial and only used once (for something like `x + 1`, an inline expression is clearer than a named function), or when the mapping has many entries that would be better served by a lookup map.

**Over-application example:**
```elixir
# Over-engineered — named function for trivial identity-like conversion
defp ensure_string(s) when is_binary(s), do: s
defp ensure_string(a) when is_atom(a), do: Atom.to_string(a)

# Used exactly once:
def log(msg), do: IO.puts(ensure_string(msg))
```

**Better alternative:**
```elixir
def log(msg) when is_binary(msg), do: IO.puts(msg)
def log(msg) when is_atom(msg), do: IO.puts(Atom.to_string(msg))
```

**Why:** When a conversion is used exactly once and the calling function already dispatches on clauses, folding the conversion into the caller's clauses reduces indirection. Named helpers shine when reused or when they name a non-obvious transformation.

## Decision Tree

- If you accept "any enumerable" but lists are the common case → add a `when is_list` clause before protocol dispatch (Pattern 1)
- If you are building a result list element-by-element and order matters → prepend with `[x | acc]` then reverse at the end (Pattern 2)
- If data flows through 2+ sequential transformations → use the pipe operator (Pattern 3)
- If you call `Enumerable.reduce/3` directly and always want the accumulated value → unwrap with `|> elem(1)` (Pattern 4)
- If you need a recursive function with multiple termination conditions → decompose into public entry + private multi-clause worker (Pattern 5)
- If the collection is large/infinite or you chain 3+ transforms → use Stream; otherwise use Enum (Pattern 6)
- If the new value depends on the old value (increment, append) → use `Map.update/4`; if replacing unconditionally → use `Map.put/3` (Pattern 7)
- If you need to branch on whether a key exists and extract the value → pattern-match with `%{^key => value}` in a `case` (Pattern 8)
- If an Erlang function has identical semantics and argument order → use `defdelegate` (Pattern 9)
- If you are implementing a custom iterable data structure → implement `Enumerable.reduce/3` to get the full Enum API (Pattern 10)
- If a function accepts several distinct input shapes that each need different handling → dispatch with multiple guarded `def` clauses (Pattern 11)
- If a small mapping from API-level values to internal values is reused or non-obvious → name it with single-line `defp` clauses (Pattern 12)

<!-- PATTERN_COMPLETE -->