docs: idiomatic Elixir and Phoenix patterns with source citations

Extracted patterns, conventions, and code smells directly from the
Elixir and Phoenix source code with file path and line number citations.

Covers: GenServer, error handling, data transforms, process design,
testing, documentation, typespecs, macros, behaviours, module organization,
Phoenix-specific patterns, framework deviations, and anti-patterns.
Aaron Weiker
2026-04-29 22:50:12 -07:00
commit 4ea9a884aa
16 changed files with 4857 additions and 0 deletions
# Data Transform & Pipeline Patterns in Elixir Core
Patterns extracted from Elixir's standard library source code.
---
## 1. List-Specialized Clause Before Protocol Dispatch
**Source:** `lib/elixir/lib/enum.ex` lines 1723-1733
**What it does:** Many public Enum functions define a `when is_list(enumerable)` clause first, then a generic fallback that uses the Enumerable protocol.
```elixir
def map(enumerable, fun) when is_list(enumerable) do
  :lists.map(fun, enumerable)
end

def map(first..last//step, fun) do
  map_range(first, last, step, fun)
end

def map(enumerable, fun) do
  reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end
```
**Why:** Lists are by far the most common enumerable. Matching them first avoids protocol dispatch entirely: the call goes straight to Erlang's `:lists.map/2`. The range clause is a further optimization for a common case. The generic clause handles all other enumerables through the protocol.
**Anti-pattern:** A single clause that always goes through protocol dispatch:
```elixir
# BAD: forces protocol overhead even for lists
def map(enumerable, fun) do
  Enumerable.reduce(enumerable, {:cont, []}, fn x, acc ->
    {:cont, [fun.(x) | acc]}
  end) |> elem(1) |> :lists.reverse()
end
```
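The same layering can be applied in application code. The following is a minimal sketch, assuming a hypothetical `MyColl.sum_squares/1` helper that is not part of Elixir: the list clause recurses directly, and the fallback clause goes through `Enum.reduce/3` for every other enumerable.

```elixir
defmodule MyColl do
  # Hypothetical helper for illustration; not part of Elixir core.

  # Fast path: plain lists are walked directly, with no protocol dispatch.
  def sum_squares(list) when is_list(list), do: sum_squares_list(list, 0)

  # Fallback: any other enumerable (MapSet, Range, ...) goes through Enum.
  def sum_squares(enumerable) do
    Enum.reduce(enumerable, 0, fn x, acc -> acc + x * x end)
  end

  defp sum_squares_list([head | tail], acc), do: sum_squares_list(tail, acc + head * head)
  defp sum_squares_list([], acc), do: acc
end
```

Both `MyColl.sum_squares([1, 2, 3])` and `MyColl.sum_squares(MapSet.new([1, 2, 3]))` return `14`; only the second call touches the protocol.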
---
## 2. Build-Then-Reverse (Cons-Cell Accumulation)
**Source:** `lib/elixir/lib/enum.ex` lines 1124, 1733, 2697
**What it does:** Accumulates results by prepending to a list (`[x | acc]`), then reverses at the end.
```elixir
# filter (line 1124)
def filter(enumerable, fun) do
  reduce(enumerable, [], R.filter(fun)) |> :lists.reverse()
end

# map (line 1733)
def map(enumerable, fun) do
  reduce(enumerable, [], R.map(fun)) |> :lists.reverse()
end

# reject (line 2697)
def reject(enumerable, fun) do
  reduce(enumerable, [], R.reject(fun)) |> :lists.reverse()
end
```
**Why:** Prepending to a linked list is O(1); appending is O(n). Building reversed then flipping once is O(n) total. Appending each element would be O(n²).
**Anti-pattern:** Appending to the end of a list in a loop:
```elixir
# BAD: O(n²)
def map(enumerable, fun) do
  reduce(enumerable, [], fn x, acc -> acc ++ [fun.(x)] end)
end
```
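The same accumulate-and-reverse shape works in hand-written recursion. This is a sketch only, with a hypothetical `Acc.double_evens/1` that filters and maps in one pass:

```elixir
defmodule Acc do
  # Hypothetical module: keeps even integers and doubles them.
  def double_evens(list) when is_list(list), do: double_evens(list, [])

  # Prepending is O(1); the accumulator is built in reverse order.
  defp double_evens([head | tail], acc) when rem(head, 2) == 0 do
    double_evens(tail, [head * 2 | acc])
  end

  # Odd elements are skipped.
  defp double_evens([_head | tail], acc), do: double_evens(tail, acc)

  # A single O(n) reversal at the end restores the input order.
  defp double_evens([], acc), do: :lists.reverse(acc)
end
```

`Acc.double_evens([1, 2, 3, 4])` returns `[4, 8]`.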
---
## 3. Pipeline for Linear Transformations, Bare Calls for Control Flow
**Source:** `lib/elixir/lib/enum.ex` lines 1684-1685, 1551, vs 496-502
**What it does:** Elixir core uses `|>` when data flows linearly through two or more transformations. It generally avoids `|>` for a single-step operation on a plain variable, and it does not pipe a value into `case`/`if`/`with`.
```elixir
# Pipeline: data flows through multiple transforms (line 1684-1685)
def map_join(enumerable, joiner \\ "", mapper) do
  enumerable
  |> map(&entry_to_string(mapper.(&1)))
  |> intersperse(joiner)
  |> IO.iodata_to_binary()
end

# NO pipeline: single step or control flow (line 496-502)
def at(enumerable, index, default \\ nil) when is_integer(index) do
  case slice_forward(enumerable, index, 1, 1) do
    [value] -> value
    [] -> default
  end
end
```
**Why:** Pipelines communicate "data flows through transformations." Using them for a single function call or wrapping around control flow obscures intent rather than clarifying it.
**Anti-pattern:** Pipelines for single operations or wrapping control flow:
```elixir
# BAD: single step, no pipeline needed
list |> Enum.reverse()

# BAD: control flow awkwardly forced into a pipe
result
|> case do
  {:ok, x} -> x
  :error -> nil
end
```
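Applied to application code, the distinction might look like the sketch below. The `Report` module and both functions are hypothetical; every call used comes from the standard library.

```elixir
defmodule Report do
  # Hypothetical example module.

  # Linear flow: each step feeds the next, so a pipeline reads top to bottom.
  def top_words(text, n) do
    text
    |> String.downcase()
    |> String.split()
    |> Enum.frequencies()
    |> Enum.sort_by(fn {word, count} -> {-count, word} end)
    |> Enum.take(n)
  end

  # Control flow: call the function and branch with a bare `case`.
  def parse_port(string) do
    case Integer.parse(string) do
      {port, ""} when port in 1..65_535 -> {:ok, port}
      _ -> :error
    end
  end
end
```

`Report.top_words("a b A", 2)` returns `[{"a", 2}, {"b", 1}]`, and `Report.parse_port("40x")` returns `:error`.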
---
## 4. Pipeline Ending with `|> elem(1)` (Protocol Reduce Unwrap)
**Source:** `lib/elixir/lib/enum.ex` lines 363, 403, 433, 468, 725, 1022, 2676
**What it does:** `Enumerable.reduce/3` returns a tagged tuple: `{:done, acc}`, `{:halted, acc}`, or `{:suspended, acc, continuation}`. In every case the accumulator sits at index 1, so core extracts it with `|> elem(1)`.
```elixir
# all?/1 (line 363)
def all?(enumerable) do
  Enumerable.reduce(enumerable, {:cont, true}, fn entry, _ ->
    if entry, do: {:cont, true}, else: {:halt, false}
  end)
  |> elem(1)
end

# reduce/3 (line 2676)
def reduce(enumerable, acc, fun) do
  Enumerable.reduce(enumerable, {:cont, acc}, fun) |> elem(1)
end
```
**Why:** The protocol returns tagged tuples for the state machine (cont/halt/suspend). End users don't need the tag — only the accumulated value. `|> elem(1)` is the idiomatic unwrap.
**Anti-pattern:** Using `case` when you don't care about the tag:
```elixir
# BAD: unnecessary pattern match when you always want the value
case Enumerable.reduce(enumerable, {:cont, acc}, fun) do
  {:done, result} -> result
  {:halted, result} -> result
end
```
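A short sketch of the same unwrap with early termination, assuming a hypothetical `FirstMatch.find/3` (similar in spirit to `Enum.find/3`, but not its actual implementation):

```elixir
defmodule FirstMatch do
  # Hypothetical helper: returns the first element for which `fun` is truthy,
  # or `default` when nothing matches.
  def find(enumerable, default \\ nil, fun) do
    Enumerable.reduce(enumerable, {:cont, default}, fn entry, acc ->
      # {:halt, value} stops the traversal early; {:cont, acc} keeps going.
      if fun.(entry), do: {:halt, entry}, else: {:cont, acc}
    end)
    |> elem(1)
  end
end
```

Whether the traversal ends with `{:done, acc}` or `{:halted, acc}`, `elem(1)` yields the value. Because halting stops the traversal, the helper also terminates on an infinite stream such as `Stream.iterate(1, &(&1 + 1))`.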
---
## 5. Private Helper Decomposition: Recursive Workers with Guards
**Source:** `lib/elixir/lib/enum.ex` lines 4975-4995, 5025-5039
**What it does:** Complex operations are split into a public entry point (with validation guards) and a private recursive worker function. The worker uses pattern matching on structure (empty list, head|tail) and guards on counters.
```elixir
# Public entry: validates, delegates (line 890-904)
def drop(enumerable, amount)
    when is_list(enumerable) and is_integer(amount) and amount >= 0 do
  drop_list(enumerable, amount)
end

# Private worker (this one backs split/2, which has the same
# entry/worker shape): pattern matches on structure (line 4975-4983)
defp split_list([head | tail], counter, acc) when counter > 0 do
  split_list(tail, counter - 1, [head | acc])
end

defp split_list(list, 0, acc) do
  {:lists.reverse(acc), list}
end

defp split_list([], _, acc) do
  {:lists.reverse(acc), []}
end
```
**Why:** Separating validation from recursion keeps each clause focused. Patterns and guards in function heads let the compiler generate the clause selection, so the body needs no runtime `if`/`cond`.
**Anti-pattern:** Mixing validation, edge cases, and recursion in a single function with internal conditionals:
```elixir
# BAD: one big function with nested ifs
defp split_list(list, counter, acc) do
  if counter > 0 and list != [] do
    [head | tail] = list
    split_list(tail, counter - 1, [head | acc])
  else
    {:lists.reverse(acc), list}
  end
end
```
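Put together, an entry point and its worker might look like this sketch. `Trim.take_at_most/2` is a hypothetical function used only to show the shape:

```elixir
defmodule Trim do
  # Hypothetical module. Public entry: validates arguments, then delegates.
  def take_at_most(list, count)
      when is_list(list) and is_integer(count) and count >= 0 do
    take_list(list, count, [])
  end

  # Private worker: clauses match on the list shape and the counter.
  defp take_list([head | tail], counter, acc) when counter > 0 do
    take_list(tail, counter - 1, [head | acc])
  end

  # Counter exhausted or list empty: reverse the accumulator once.
  defp take_list(_rest, _counter, acc), do: :lists.reverse(acc)
end
```

Invalid input such as `Trim.take_at_most([1], -1)` fails at the public boundary with a `FunctionClauseError`, so the worker never has to re-check its arguments.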
---
## 6. Enum vs Stream Decision Pattern
**Source:** `lib/elixir/lib/stream.ex` lines 1-80 (module docs), `lib/elixir/lib/enum.ex`
**What it does:** Enum functions are eager (materialize intermediate lists). Stream functions are lazy (build computation recipes). Core uses Stream for:
- Infinite sequences (`cycle`, `iterate`, `repeatedly`)
- Resource management (`resource/3`)
- Composing transformations to execute in a single pass
```elixir
# Stream: builds a recipe, zero computation until consumed (stream.ex ~line 490)
def map(enum, fun) when is_function(fun, 1) do
  lazy(enum, fn f1 -> R.map(fun, f1) end)
end

# Enum: immediately materializes the result (enum.ex line 1723)
def map(enumerable, fun) when is_list(enumerable) do
  :lists.map(fun, enumerable)
end
```
**Why:** From Stream docs (lines 37-41): "When chaining many operations with `Enum`, intermediate lists are created, while `Stream` creates a recipe of computations that are executed at a later moment."
Use Enum when:
- You need the full result now
- The collection is small/bounded
- You only chain 1-2 operations
Use Stream when:
- The collection is large or infinite
- You chain many transformations
- You need resource cleanup (file handles, network)
- You want single-pass processing
**Anti-pattern:** Using Stream for small bounded collections (overhead of the lazy machinery exceeds any benefit):
```elixir
# BAD — Stream overhead for trivial transform
[1, 2, 3] |> Stream.map(&(&1 * 2)) |> Enum.to_list()
# GOOD — just use Enum
[1, 2, 3] |> Enum.map(&(&1 * 2))
```
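For contrast, here is a sketch of a case where laziness is required, because the source is infinite. The `LazyDemo` module is hypothetical; the `Stream` and `Enum` calls are standard library functions.

```elixir
defmodule LazyDemo do
  # Hypothetical example: squares of the first `count` multiples of three.
  def squares_of_multiples_of_three(count) do
    # Infinite source: 1, 2, 3, ...
    Stream.iterate(1, &(&1 + 1))
    # Each Stream step only extends the recipe; nothing is computed yet.
    |> Stream.filter(&(rem(&1, 3) == 0))
    |> Stream.map(&(&1 * &1))
    # Enum.take/2 drives the stream and stops after `count` elements.
    |> Enum.take(count)
  end
end
```

`LazyDemo.squares_of_multiples_of_three(5)` returns `[9, 36, 81, 144, 225]`. Replacing the `Stream` calls with their `Enum` counterparts would never return, since `Enum.filter/2` would try to consume the whole infinite sequence.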
---
## 7. Map.update vs Map.put Decision Pattern
**Source:** `lib/elixir/lib/map.ex` lines 670-700
**What it does:** `Map.update/4` transforms an existing value based on its current state. `Map.put/3` unconditionally sets a value regardless of current state.
```elixir
# Map.update/4 (line 682-693): transform based on current value
def update(map, key, default, fun) when is_function(fun, 1) do
  case map do
    %{^key => value} ->
      %{map | key => fun.(value)}

    %{} ->
      put(map, key, default)

    other ->
      :erlang.error({:badmap, other}, [map, key, default, fun])
  end
end

# Map.put/3 (line 636): unconditional set
def put(map, key, value) do
  :maps.put(key, value, map)
end
```
**Why:** `update/4` is for when the new value depends on the old value (counters, appending to nested lists). `put/3` is for when you know the exact new value regardless of what was there.
**Anti-pattern:** Using `get` + `put` when `update` expresses intent:
```elixir
# BAD — two lookups, unclear intent
count = Map.get(map, :count, 0)
Map.put(map, :count, count + 1)
# GOOD — single lookup, clear intent
Map.update(map, :count, 1, &(&1 + 1))
```
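The same function covers the "append to a nested list" case. A minimal sketch, with a hypothetical `Tally.by_initial/1` that groups words by first letter:

```elixir
defmodule Tally do
  # Hypothetical example: groups words by their first grapheme.
  def by_initial(words) do
    Enum.reduce(words, %{}, fn word, acc ->
      initial = String.first(word)
      # Key absent: store [word] as-is (the default is NOT passed through fun).
      # Key present: prepend the word to the existing list.
      Map.update(acc, initial, [word], &[word | &1])
    end)
  end
end
```

`Tally.by_initial(["apple", "banana", "avocado"])` returns `%{"a" => ["avocado", "apple"], "b" => ["banana"]}`. Each group is in reverse insertion order because new words are prepended.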
---
## 8. Pattern Matching on Map Structure for Dispatch
**Source:** `lib/elixir/lib/map.ex` lines 398, 509, 586
**What it does:** Map functions use `case map do %{^key => value} -> ...` to dispatch on whether a key exists, rather than calling `has_key?` + conditional.
```elixir
# Map.get/3 (line 586-594)
def get(map, key, default \\ nil) do
  case map do
    %{^key => value} ->
      value

    %{} ->
      default

    other ->
      :erlang.error({:badmap, other}, [map, key, default])
  end
end

# Map.put_new/3 (line 398-407)
def put_new(map, key, value) do
  case map do
    %{^key => _value} ->
      map

    %{} ->
      put(map, key, value)

    other ->
      :erlang.error({:badmap, other})
  end
end
```
**Why:** Pattern matching with `%{^key => value}` does the lookup and extraction in one step. The `%{}` pattern matches any map, not only the empty one; because it comes second, it is reached only when the key is absent. The `other` clause raises a clear error for non-maps. This is both more efficient and more readable than `if Map.has_key?(map, key)`.
**Anti-pattern:**
```elixir
# BAD: double lookup, less clear
def get(map, key, default) do
  if Map.has_key?(map, key) do
    :maps.get(key, map)
  else
    default
  end
end
```
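The idiom carries over to application code unchanged. The sketch below assumes a hypothetical `Cache.fetch_or_store/3` that computes a value only on a miss:

```elixir
defmodule Cache do
  # Hypothetical helper. Returns {value, cache}; computes and stores on a miss.
  def fetch_or_store(cache, key, compute) when is_function(compute, 0) do
    case cache do
      # Hit: lookup and extraction happen in a single match.
      %{^key => value} ->
        {value, cache}

      # Miss: %{} matches any map, so this runs only because the
      # clause above did not match.
      %{} ->
        value = compute.()
        {value, Map.put(cache, key, value)}
    end
  end
end
```

Because the match tests for key presence rather than a non-`nil` value, a stored `nil` counts as a hit: `Cache.fetch_or_store(%{a: nil}, :a, fn -> 2 end)` returns `{nil, %{a: nil}}`.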
---
## 9. Delegating to Erlang BIFs with `defdelegate`
**Source:** `lib/elixir/lib/map.ex` lines 127, 143, 159, 173
**What it does:** When an Erlang function already does exactly what's needed, Elixir delegates directly rather than wrapping.
```elixir
@spec keys(map) :: [key]
defdelegate keys(map), to: :maps
@spec values(map) :: [value]
defdelegate values(map), to: :maps
@spec merge(map, map) :: map
defdelegate merge(map1, map2), to: :maps
```
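**Why:** No wrapper cost: the documentation for these functions notes that they are inlined by the compiler, so a call such as `Map.keys/1` compiles to a direct `:maps.keys/1` call. There is no point hand-writing a wrapper around an Erlang BIF when the semantics are identical. The `@compile {:inline, ...}` annotation on line 115 is a related but separate mechanism: it inlines local calls inside the module.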
**Anti-pattern:** Wrapping without adding value:
```elixir
# BAD: pointless wrapper
def keys(map) do
  :maps.keys(map)
end
```
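In application code the same macro keeps a thin facade declarative. This sketch assumes a hypothetical `Tags` module whose API maps one-to-one onto `MapSet`; note that outside of core, `defdelegate` generates an ordinary forwarding function rather than a compiler-inlined call.

```elixir
defmodule Tags do
  # Hypothetical module whose functions forward directly to MapSet.
  defdelegate new(), to: MapSet
  defdelegate member?(tags, tag), to: MapSet

  # When the target function has a different name, use :as.
  defdelegate add(tags, tag), to: MapSet, as: :put
end
```

`Tags.new() |> Tags.add(:elixir) |> Tags.member?(:elixir)` returns `true`.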
---
## 10. Reduce as the Universal Primitive
**Source:** `lib/elixir/lib/enum.ex` lines 19-21, 2660-2676
**What it does:** Nearly every Enum operation is built on top of `reduce`. The Enumerable protocol's core function is `reduce/3`. Everything else (`count`, `member?`, `slice`) is an optimization hint.
```elixir
# From the protocol docs (line 19-21):
def map(enumerable, fun) do
  reducer = fn x, acc -> {:cont, [fun.(x) | acc]} end
  Enumerable.reduce(enumerable, {:cont, []}, reducer) |> elem(1) |> :lists.reverse()
end

# The actual reduce/3 (line 2676):
def reduce(enumerable, acc, fun) do
  Enumerable.reduce(enumerable, {:cont, acc}, fun) |> elem(1)
end
```
**Why:** Reduce is the most general iteration primitive. By building all operations on reduce, any data structure that implements `Enumerable.reduce/3` automatically gets the full `Enum` API. This is the protocol + reduce = universal composability pattern.
**Anti-pattern:** Implementing each Enum function independently for each data structure:
```elixir
# BAD — reimplementing map for each type
def map(%MyStruct{items: items}, fun), do: ...
def filter(%MyStruct{items: items}, fun), do: ...
# Instead: implement Enumerable.reduce/3 once and get everything
```
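To make the last comment concrete, here is a sketch of a protocol implementation for a hypothetical `Countdown` struct. Only `reduce/3` does real work; the other callbacks return `{:error, __MODULE__}`, which tells `Enum` to fall back to `reduce/3`.

```elixir
defmodule Countdown do
  # Hypothetical struct: enumerates from `from` down to 1.
  defstruct [:from]
end

defimpl Enumerable, for: Countdown do
  def reduce(%Countdown{from: n}, acc, fun), do: do_reduce(n, acc, fun)

  # The three accumulator tags drive the traversal state machine.
  defp do_reduce(_n, {:halt, acc}, _fun), do: {:halted, acc}
  defp do_reduce(n, {:suspend, acc}, fun), do: {:suspended, acc, &do_reduce(n, &1, fun)}
  defp do_reduce(n, {:cont, acc}, _fun) when n < 1, do: {:done, acc}
  defp do_reduce(n, {:cont, acc}, fun), do: do_reduce(n - 1, fun.(n, acc), fun)

  # No optimized implementations: Enum falls back to reduce/3 for these.
  def count(_countdown), do: {:error, __MODULE__}
  def member?(_countdown, _value), do: {:error, __MODULE__}
  def slice(_countdown), do: {:error, __MODULE__}
end
```

With this in place, `Enum.map(%Countdown{from: 3}, &(&1 * 2))` returns `[6, 4, 2]`, and `Enum.filter/2`, `Enum.take/2`, `Enum.count/1`, and the rest of the `Enum` API work without further code.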
---
## 11. Keyword Multi-Clause Guard Dispatch (String.split pattern)
**Source:** `lib/elixir/lib/string.ex` lines 516-563
**What it does:** Functions with many input shapes use multiple `def` clauses with guards to dispatch, handling each case distinctly rather than using internal `cond`/`case`.
```elixir
def split(string, %Regex{} = pattern, options) when is_binary(string) and is_list(options) do
  Regex.split(pattern, string, options)
end

def split(string, "", options) when is_binary(string) and is_list(options) do
  # special case: split by empty string (grapheme-by-grapheme)
  ...
end

def split(string, [], options) when is_binary(string) and is_list(options) do
  # empty pattern list: no splitting
  ...
end

def split(string, pattern, options) when is_binary(string) and is_list(options) do
  # general binary pattern case
  ...
end
```
**Why:** Each clause has a single responsibility. The BEAM compiler generates efficient dispatch for these patterns. Adding a new case is additive (new clause) rather than modifying existing logic.
**Anti-pattern:** One function with nested conditionals:
```elixir
# BAD: all cases mashed into one body
def split(string, pattern, options) do
  cond do
    is_struct(pattern, Regex) -> ...
    pattern == "" -> ...
    pattern == [] -> ...
    true -> ...
  end
end
```
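A runnable sketch of the same dispatch style, using a hypothetical `Size.to_bytes/1` that accepts several input shapes:

```elixir
defmodule Size do
  # Hypothetical helper: normalizes several input shapes to a byte count.
  # Each input shape gets its own clause; adding a unit means adding a clause.
  def to_bytes(n) when is_integer(n) and n >= 0, do: n
  def to_bytes({n, :kb}) when is_integer(n) and n >= 0, do: n * 1_024
  def to_bytes({n, :mb}) when is_integer(n) and n >= 0, do: n * 1_024 * 1_024

  # Strings are parsed, then routed back through the clauses above.
  def to_bytes(string) when is_binary(string) do
    case Integer.parse(string) do
      {n, ""} -> to_bytes(n)
      {n, "kb"} -> to_bytes({n, :kb})
      {n, "mb"} -> to_bytes({n, :mb})
      _ -> raise ArgumentError, "cannot parse size: #{inspect(string)}"
    end
  end
end
```

`Size.to_bytes({2, :kb})` returns `2048` and `Size.to_bytes("1mb")` returns `1048576`. Unsupported shapes fail with a `FunctionClauseError` instead of falling into a catch-all branch.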
---
## 12. Tiny Private Helpers with `defp parts_to_index`
**Source:** `lib/elixir/lib/string.ex` lines 562-563
**What it does:** Tiny private helpers that convert between API-level concepts and implementation-level values use single-line `defp` with guards.
```elixir
defp parts_to_index(:infinity), do: 0
defp parts_to_index(n) when is_integer(n) and n > 0, do: n
```
**Why:** Clear, self-documenting dispatch. Each case is one line. No branching logic in the caller. The function name explains the conversion.
**Anti-pattern:** Inline conditional in the caller:
```elixir
# BAD — logic scattered in caller
index = if parts == :infinity, do: 0, else: parts
```
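A sketch of the same idea in application code, assuming a hypothetical `Retry` module where `:infinity` is the public option value and `-1` is an internal sentinel chosen for this example:

```elixir
defmodule Retry do
  # Hypothetical module: converts a public option into an internal counter.
  def attempts_left(opts) do
    opts |> Keyword.get(:max_attempts, 3) |> attempts_to_counter()
  end

  # One line per case; the guard rejects anything that is not a positive integer.
  defp attempts_to_counter(:infinity), do: -1
  defp attempts_to_counter(n) when is_integer(n) and n > 0, do: n
end
```

Besides readability, the guarded helper validates its input: `Retry.attempts_left(max_attempts: 0)` raises a `FunctionClauseError`, whereas an inline `if` like the one above would pass the invalid value through silently.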