docs: add when/exceptions to smells

This commit is contained in:
Aaron Weiker
2026-04-30 05:26:03 -07:00
parent 1a934eb2e3
commit 29b91bead6
2 changed files with 1462 additions and 0 deletions
+711
View File
@@ -36,6 +36,48 @@ Process.exit(pid, :kill)
assert_receive {:DOWN, ^ref, _, _, _}
```
### When to Apply This Rule
**Triggers:**
- Any `Process.sleep/1` call with a numeric argument in test code
- Flaky tests that pass locally but fail on CI
- Tests with comments like "wait for process to finish"
**Example — the smell:**
```elixir
test "pubsub delivers messages" do
PubSub.subscribe(:topic)
PubSub.publish(:topic, "hello")
Process.sleep(100)
assert_received {"hello"}
end
```
**Example — fixed:**
```elixir
test "pubsub delivers messages" do
PubSub.subscribe(:topic)
PubSub.publish(:topic, "hello")
assert_receive {"hello"}, 1000
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Parking a process indefinitely with `Process.sleep(:infinity)` (it's not timing-based, it's a deliberate block)
- Testing actual time-dependent behavior (e.g., "this rate limiter allows 1 req/sec") where the delay IS the thing under test
**Example of acceptable use:**
```elixir
# Process parking — not synchronization, just "stay alive forever"
spawn(fn ->
Process.sleep(:infinity)
end)
```
**Why it's OK here:** The process isn't waiting for something to happen — it's deliberately kept alive as a fixture. There's no race condition because nothing depends on timing.
---
## 2. Mutable Global State in Tests
@@ -68,6 +110,48 @@ setup do
end
```
### When to Apply This Rule
**Triggers:**
- Tests that call `Application.put_env/3` or `System.put_env/2`
- Tests that modify Logger level, Mix env, or any module attribute at runtime
- Tests that fail when run in a different order or with `--seed 0`
**Example — the smell:**
```elixir
test "production mode disables debug" do
Application.put_env(:my_app, :env, :prod)
assert MyApp.debug_enabled?() == false
# Next test inherits :prod env!
end
```
**Example — fixed:**
```elixir
test "production mode disables debug" do
original = Application.get_env(:my_app, :env)
Application.put_env(:my_app, :env, :prod)
on_exit(fn -> Application.put_env(:my_app, :env, original) end)
assert MyApp.debug_enabled?() == false
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The global state change happens in `test_helper.exs` before any tests run (one-time setup for the entire suite)
- Using `ExUnit.Case, async: false` with a dedicated setup/teardown and the state is inherently global (e.g., database migrations)
**Example of acceptable use:**
```elixir
# In test_helper.exs — one-time global setup
Application.put_env(:my_app, :env, :test)
ExUnit.start()
```
**Why it's OK here:** This runs once before any tests execute. It's not mutating state between tests — it's establishing the test environment baseline.
---
## 3. try/rescue for Control Flow
@@ -100,6 +184,57 @@ end
The standard library uses `with` statements and tagged tuples (`{:ok, result}` / `{:error, reason}`) for all fallible operations.
### When to Apply This Rule
**Triggers:**
- `try/rescue` blocks that catch known, expected error types
- Functions that call a bang (`!`) variant and immediately rescue
- Error handling that converts exceptions back to tuples
**Example — the smell:**
```elixir
def parse_config(path) do
try do
content = File.read!(path)
Jason.decode!(content)
rescue
File.Error -> {:error, :file_not_found}
Jason.DecodeError -> {:error, :invalid_json}
end
end
```
**Example — fixed:**
```elixir
def parse_config(path) do
with {:ok, content} <- File.read(path),
{:ok, decoded} <- Jason.decode(content) do
{:ok, decoded}
end
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Interfacing with libraries that only provide bang functions (no tuple-returning variant)
- Catching truly unexpected errors at a supervision boundary (e.g., a GenServer `handle_call` that must not crash)
- Using `raise`/`rescue` for assertions in tests (`assert_raise`)
**Example of acceptable use:**
```elixir
# Third-party lib only provides a bang function
def safe_parse(input) do
try do
{:ok, ThirdPartyLib.parse!(input)}
rescue
ArgumentError -> {:error, :invalid_input}
end
end
```
**Why it's OK here:** You don't control the library API. Wrapping the bang call is the pragmatic choice until the library adds a non-bang variant.
---
## 4. God Modules (Unbounded Module Growth)
@@ -120,6 +255,59 @@ Each module has a single, clear responsibility.
**What they do instead:** Single-responsibility modules that compose. `use ExUnit.Case` imports from multiple focused modules.
### When to Apply This Rule
**Triggers:**
- A module exceeds ~300 lines
- You find yourself adding section comments like `# --- User helpers ---`
- Functions in the module don't share data or call each other
- The module name is generic (`Helpers`, `Utils`, `Common`)
**Example — the smell:**
```elixir
defmodule MyApp.Helpers do
def format_date(date), do: ...
def send_email(to, subject, body), do: ...
def validate_phone(number), do: ...
def geocode_address(addr), do: ...
def generate_pdf(data), do: ...
end
```
**Example — fixed:**
```elixir
defmodule MyApp.DateFormatter do
def format(date), do: ...
end
defmodule MyApp.Mailer do
def send(to, subject, body), do: ...
end
defmodule MyApp.PhoneValidator do
def validate(number), do: ...
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The module is a Facade that delegates to sub-modules (thin wrapper providing a unified API)
- The module is a protocol implementation that must implement all callbacks in one place
- Kernel-style modules that define language primitives (you're not writing Kernel)
**Example of acceptable use:**
```elixir
defmodule MyApp.Orders do
# Facade — delegates to focused modules
defdelegate create(params), to: MyApp.Orders.Creator
defdelegate cancel(order), to: MyApp.Orders.Canceller
defdelegate refund(order, reason), to: MyApp.Orders.Refunder
end
```
**Why it's OK here:** The module is a thin routing layer. The actual logic lives in focused sub-modules. The facade exists for API convenience, not because responsibilities are lumped together.
---
## 5. Stringly-Typed APIs
@@ -148,6 +336,50 @@ end
%User{name: "Alice", role: :admin}
```
### When to Apply This Rule
**Triggers:**
- Maps with string keys used internally (not from external JSON)
- Functions that accept string arguments for mode/type selection (`"asc"`, `"desc"`)
- Pattern matching on string values for dispatch
**Example — the smell:**
```elixir
def sort_users(users, direction) when direction in ["asc", "desc"] do
case direction do
"asc" -> Enum.sort(users)
"desc" -> Enum.sort(users, :desc)
end
end
```
**Example — fixed:**
```elixir
def sort_users(users, :asc), do: Enum.sort(users)
def sort_users(users, :desc), do: Enum.sort(users, :desc)
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Data comes from external sources (JSON APIs, CSV, database text columns) and hasn't been validated yet
- The set of values is truly unbounded (user names, free-text fields)
- You're building a string-keyed map specifically for JSON serialization output
**Example of acceptable use:**
```elixir
# External API response — strings are correct here
def handle_webhook(%{"event" => event_type, "data" => data}) do
case event_type do
"payment.completed" -> process_payment(data)
"subscription.cancelled" -> cancel_subscription(data)
_ -> {:error, :unknown_event}
end
end
```
**Why it's OK here:** The data comes from an external webhook as JSON. Converting to atoms would risk atom exhaustion from untrusted input. String matching at the boundary is correct — but convert to atoms/structs after validation for internal use.
---
## 6. Bare Maps Where Structs Belong
@@ -177,6 +409,69 @@ def await(%Task{ref: ref, owner: owner}) when owner == self() do
end
```
### When to Apply This Rule
**Triggers:**
- A map with the same keys appears in 3+ places
- You find yourself documenting "this map must have keys X, Y, Z"
- Functions validate map keys at runtime (`Map.has_key?` checks)
- Bugs from typos in map keys (`%{stauts: :active}` instead of `:status`)
**Example — the smell:**
```elixir
def create_order(user, items) do
%{
user_id: user.id,
items: items,
status: :pending,
total: calculate_total(items),
created_at: DateTime.utc_now()
}
end
def ship_order(order) do
%{order | stauts: :shipped} # Typo! No error, just a new key added
end
```
**Example — fixed:**
```elixir
defmodule Order do
@enforce_keys [:user_id, :items, :total]
defstruct [:user_id, :items, :total, status: :pending, created_at: nil]
end
def create_order(user, items) do
%Order{
user_id: user.id,
items: items,
total: calculate_total(items),
created_at: DateTime.utc_now()
}
end
def ship_order(%Order{} = order) do
%{order | status: :shipped} # Typo would raise KeyError!
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The shape is genuinely dynamic (user-defined fields, plugin metadata)
- It's a short-lived intermediate value in a pipeline
- You're working with ETS/Mnesia where struct overhead adds complexity
**Example of acceptable use:**
```elixir
# Dynamic key-value config — shape unknown at compile time
def merge_config(defaults, overrides) do
Map.merge(defaults, overrides)
end
```
**Why it's OK here:** The config keys are user-defined and unbounded. A struct would require knowing all possible keys upfront, which contradicts the purpose of dynamic configuration.
---
## 7. Deep Nesting
@@ -219,6 +514,73 @@ def process(input) do
end
```
### When to Apply This Rule
**Triggers:**
- More than 2 levels of `case`/`if`/`cond` nesting
- Each nested level only handles `{:ok, _}` and passes through `{:error, _}`
- The "else" branches are all identical error pass-through
**Example — the smell:**
```elixir
def register_user(params) do
case validate_email(params.email) do
:ok ->
case validate_password(params.password) do
:ok ->
case check_uniqueness(params.email) do
:ok ->
case create_user(params) do
{:ok, user} -> send_welcome_email(user)
error -> error
end
error -> error
end
error -> error
end
error -> error
end
end
```
**Example — fixed:**
```elixir
def register_user(params) do
with :ok <- validate_email(params.email),
:ok <- validate_password(params.password),
:ok <- check_uniqueness(params.email),
{:ok, user} <- create_user(params) do
send_welcome_email(user)
end
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Each nesting level handles different error cases with distinct logic (not just pass-through)
- The nesting represents genuinely different decision branches (a state machine)
- You need to bind variables from outer scopes in inner blocks
**Example of acceptable use:**
```elixir
def handle_response(response) do
case response.status do
200 ->
case Jason.decode(response.body) do
{:ok, %{"type" => "redirect"}} -> follow_redirect(response)
{:ok, data} -> {:ok, data}
{:error, _} -> {:error, :invalid_json}
end
401 -> refresh_token_and_retry()
429 -> schedule_retry(response)
_ -> {:error, {:http_error, response.status}}
end
end
```
**Why it's OK here:** Each branch has genuinely different handling logic. The inner `case` isn't just passing through errors — it's making a distinct decision based on the decoded content. A `with` would actually obscure the branching intent.
---
## 8. Shared Mutable State Between Tests
@@ -236,6 +598,56 @@ name = :"#{config.test}_#{partitions}_#{inspect(keys)}"
**What they do instead:** Every shared resource gets a unique name derived from `config.test` (the test name atom, guaranteed unique within a module).
### When to Apply This Rule
**Triggers:**
- Tests that use hardcoded registered names (`:my_server`, `MyApp.Cache`)
- ETS tables created with a fixed name across tests
- Tests that must run with `async: false` but you're not sure why
**Example — the smell:**
```elixir
test "cache stores values" do
:ets.new(:test_cache, [:set, :named_table, :public])
:ets.insert(:test_cache, {:key, "value"})
assert :ets.lookup(:test_cache, :key) == [{:key, "value"}]
end
# Second test crashes: :test_cache already exists!
```
**Example — fixed:**
```elixir
test "cache stores values", %{test: test_name} do
table = :ets.new(test_name, [:set, :public])
:ets.insert(table, {:key, "value"})
assert :ets.lookup(table, :key) == [{:key, "value"}]
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The test explicitly needs to verify interaction with a shared resource (e.g., testing a distributed lock)
- Using `async: false` with proper setup/teardown for integration tests that inherently need global state
**Example of acceptable use:**
```elixir
# Integration test that MUST test global behavior
describe "cluster-wide rate limiter" do
@tag :integration
setup do
RateLimiter.reset(:global_limiter)
on_exit(fn -> RateLimiter.reset(:global_limiter) end)
end
test "limits across processes" do
# ...
end
end
```
**Why it's OK here:** The test is explicitly verifying shared-state behavior. The global name is the thing under test, not an incidental implementation detail.
---
## 9. Testing Internal State Directly
@@ -248,6 +660,56 @@ name = :"#{config.test}_#{partitions}_#{inspect(keys)}"
**What they do instead:** Test the behavior (what goes in, what comes out) through the public API.
### When to Apply This Rule
**Triggers:**
- `:sys.get_state(pid)` in test code
- Accessing `__struct__` fields that aren't part of the public API
- Testing that internal ETS tables have specific entries
- Assertions on GenServer state shape rather than behavior
**Example — the smell:**
```elixir
test "adding item updates internal state" do
{:ok, pid} = ShoppingCart.start_link()
ShoppingCart.add_item(pid, "book")
state = :sys.get_state(pid)
assert state.items == ["book"]
assert state.count == 1
end
```
**Example — fixed:**
```elixir
test "adding item makes it appear in cart" do
{:ok, pid} = ShoppingCart.start_link()
ShoppingCart.add_item(pid, "book")
assert ShoppingCart.list_items(pid) == ["book"]
assert ShoppingCart.count(pid) == 1
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Testing the GenServer implementation itself (e.g., you're writing a GenServer library)
- Debugging a flaky test and temporarily inspecting state to understand what's happening
- The process IS a data store and `:sys.get_state` is effectively part of the contract (rare)
**Example of acceptable use:**
```elixir
# Testing a custom GenServer behaviour/library
test "init callback receives options" do
{:ok, pid} = MyCustomServer.start_link(initial_count: 5)
# We're testing the framework, not the app — state IS the public contract
assert :sys.get_state(pid) == %{count: 5}
end
```
**Why it's OK here:** You're testing the server framework itself, where the state management IS the feature. The state shape is the public contract of your library.
---
## 10. Overly Complex Test Setup
@@ -266,6 +728,79 @@ name = :"#{config.test}_#{partitions}_#{inspect(keys)}"
- Small helper functions for common patterns
- `start_supervised!` for process-heavy tests (one line)
### When to Apply This Rule
**Triggers:**
- `setup` block exceeds 15 lines
- Tests can't be understood without reading the setup first
- Setup creates objects the test doesn't use (over-fetching)
- Multiple describes share the same bloated setup
**Example — the smell:**
```elixir
setup do
user = insert(:user, role: :admin, verified: true, plan: :pro)
org = insert(:org, owner: user, plan: :enterprise)
team = insert(:team, org: org, name: "Engineering")
project = insert(:project, team: team, status: :active)
repo = insert(:repo, project: project)
branch = insert(:branch, repo: repo, name: "main")
commit = insert(:commit, branch: branch, author: user)
%{user: user, org: org, team: team, project: project, repo: repo, branch: branch, commit: commit}
end
test "user can see their name", %{user: user} do
assert user.name != nil # Only needed the user!
end
```
**Example — fixed:**
```elixir
test "user can see their name" do
user = insert(:user, name: "Alice")
assert user.name == "Alice"
end
# If multiple tests need a complex fixture, use a helper:
defp setup_full_project do
user = insert(:user, role: :admin)
org = insert(:org, owner: user)
# ... only when tests actually need all of this
%{user: user, org: org}
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Integration tests that genuinely need a complex world state (database with relationships)
- The setup IS the test subject (testing that complex initialization works)
- All tests in the describe block actually use all setup values
**Example of acceptable use:**
```elixir
# Every test in this describe needs the full graph
describe "organization billing" do
setup do
org = insert(:org, plan: :enterprise)
team = insert(:team, org: org)
members = insert_list(5, :user, team: team)
%{org: org, team: team, members: members}
end
test "charges per seat", %{org: org, members: members} do
assert Billing.calculate(org) == length(members) * org.per_seat_price
end
test "prorates mid-month additions", %{org: org, team: team} do
new_member = insert(:user, team: team, joined_at: mid_month())
assert Billing.calculate(org) |> includes_proration?(new_member)
end
end
```
**Why it's OK here:** Every test uses the org/team/members graph. The setup represents the minimum viable state for billing tests.
---
## 11. Unsupervised Processes
@@ -289,6 +824,56 @@ spawn(fn -> do_work() end)
Task.Supervisor.start_child(MyApp.TaskSupervisor, fn -> do_work() end)
```
### When to Apply This Rule
**Triggers:**
- `spawn/1` or `spawn_link/1` in `lib/` code (not test code)
- Processes that aren't in any supervision tree
- "Ghost" processes found in Observer that nobody owns
- Application shutdown hangs (orphan processes blocking)
**Example — the smell:**
```elixir
def handle_cast({:process_job, job}, state) do
spawn(fn ->
result = expensive_computation(job)
notify_completion(result)
end)
{:noreply, state}
end
```
**Example — fixed:**
```elixir
def handle_cast({:process_job, job}, state) do
Task.Supervisor.start_child(MyApp.JobSupervisor, fn ->
result = expensive_computation(job)
notify_completion(result)
end)
{:noreply, state}
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Test helpers that need a short-lived process fixture
- `spawn_link` in a supervised process that intentionally ties the child's fate to the parent
- One-shot scripts or Mix tasks (not long-running application code)
**Example of acceptable use:**
```elixir
# Test helper — short-lived, test process will clean up
test "monitors detect process death" do
pid = spawn(fn -> Process.sleep(:infinity) end)
ref = Process.monitor(pid)
Process.exit(pid, :kill)
assert_receive {:DOWN, ^ref, :process, ^pid, :killed}
end
```
**Why it's OK here:** It's test code. The spawned process is a fixture that will be cleaned up when the test process exits. Adding supervision would be over-engineering for a test helper.
---
## 12. Atom Creation from User Input
@@ -311,6 +896,61 @@ key = String.to_existing_atom(user_input)
config = Map.get(settings, user_input)
```
### When to Apply This Rule
**Triggers:**
- `String.to_atom/1` called on data from HTTP requests, WebSocket messages, or file input
- Atom conversion inside a loop or recursive function
- Phoenix controller that converts params to atoms
**Example — the smell:**
```elixir
def handle_event(event_name, payload, socket) do
# event_name comes from the client!
atom_event = String.to_atom(event_name)
apply(__MODULE__, atom_event, [payload, socket])
end
```
**Example — fixed:**
```elixir
@allowed_events ~w(click submit toggle)a
def handle_event(event_name, payload, socket) do
case String.to_existing_atom(event_name) do
event when event in @allowed_events ->
apply(__MODULE__, event, [payload, socket])
_ ->
{:noreply, socket}
end
rescue
ArgumentError -> {:noreply, socket}
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Compile-time code generation (macros) where the set of atoms is fixed
- Deserializing from a trusted source (e.g., `:erlang.binary_to_term` with `:safe` option from your own nodes)
- The input is already validated against a known allowlist before conversion
**Example of acceptable use:**
```elixir
# Compile-time macro — bounded, known set
defmacro define_events(events) do
for event <- events do
quote do
def handle_event(unquote(Atom.to_string(event)), payload, socket) do
unquote(event)(payload, socket)
end
end
end
end
```
**Why it's OK here:** The atoms are created at compile time from a developer-defined list. The set is bounded and trusted. No runtime user input is involved.
---
## 13. Callback-Heavy Abstractions (When Protocols Suffice)
@@ -324,3 +964,74 @@ config = Map.get(settings, user_input)
**The heuristic:**
- **Protocol:** "Different data types need different treatment" (e.g., printing, iterating)
- **Behaviour:** "Different modules implement the same process contract" (e.g., GenServer callbacks)
### When to Apply This Rule
**Triggers:**
- A behaviour with callbacks like `format/1`, `serialize/1`, `display/1` — data transformation, not process lifecycle
- You want third-party types to implement your interface without modifying their code
- The dispatch is based on the data type, not the module providing the logic
**Example — the smell:**
```elixir
defmodule Formatter do
@callback format(term()) :: String.t()
end
defmodule JSONFormatter do
@behaviour Formatter
def format(data), do: Jason.encode!(data)
end
defmodule XMLFormatter do
@behaviour Formatter
def format(data), do: XmlBuilder.generate(data)
end
# Usage requires knowing which module to call
JSONFormatter.format(data)
```
**Example — fixed:**
```elixir
defprotocol Formattable do
@doc "Format data for output"
def format(data)
end
defimpl Formattable, for: Map do
def format(data), do: Jason.encode!(data)
end
defimpl Formattable, for: List do
def format(data), do: Enum.join(data, ", ")
end
# Dispatch is automatic based on type
Formattable.format(%{name: "Alice"})
Formattable.format(["a", "b", "c"])
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The interface defines process lifecycle callbacks (init, handle_call, terminate)
- You need compile-time guarantees that a module implements all required functions
- The dispatch is by module (strategy pattern), not by data type
**Example of acceptable use:**
```elixir
defmodule Storage do
@callback store(key :: String.t(), value :: binary()) :: :ok | {:error, term()}
@callback fetch(key :: String.t()) :: {:ok, binary()} | {:error, :not_found}
@callback delete(key :: String.t()) :: :ok
end
# Used as a strategy — the MODULE is chosen, not the data type
defmodule MyApp.Upload do
@storage Application.compile_env(:my_app, :storage_backend)
def save(file), do: @storage.store(file.name, file.content)
end
```
**Why it's OK here:** The dispatch is by configured module (strategy pattern), not by data type. You want compile-time verification that the storage module implements all required operations. A protocol wouldn't help because the data going in is always the same type — it's the *implementation* that varies.
+751
View File
@@ -27,6 +27,47 @@ end
**Source:** Elixir's test suite has 39 `assert_receive`/`refute_receive` calls in `task_test.exs` alone, vs 0 `Process.sleep(N)` for synchronization.
### When to Apply This Rule
**Triggers:**
- `Process.sleep` followed by `assert_received` (note: `assert_received` checks mailbox NOW, doesn't wait)
- Tests with "magic number" sleeps (50, 100, 200ms)
- CI failures on tests that always pass locally
**Example — the smell:**
```elixir
test "worker processes job" do
Worker.submit(job)
Process.sleep(200)
assert_received {:job_complete, ^job}
end
```
**Example — fixed:**
```elixir
test "worker processes job" do
Worker.submit(job)
assert_receive {:job_complete, ^job}, 5000
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- You're testing that something does NOT happen within a time window (use `refute_receive` with explicit timeout instead)
- The test requires a minimum elapsed time as part of the assertion (e.g., testing a cooldown)
**Example of acceptable use:**
```elixir
test "debouncer doesn't fire before window" do
Debouncer.trigger(:action, 100)
refute_receive :action, 50 # Verify it HASN'T fired after 50ms
assert_receive :action, 200 # Then verify it DOES fire
end
```
**Why it's OK here:** The `refute_receive` with a timeout IS the assertion — we're verifying that the message hasn't arrived yet. The timing is the behavior under test.
---
## 2. Not Using `start_supervised` in Tests
@@ -52,6 +93,54 @@ end
**Source:** `lib/ex_unit/lib/ex_unit/callbacks.ex:277-340``start_supervised` is designed specifically for this: guaranteed shutdown in reverse order, no leaked processes, no race conditions.
### When to Apply This Rule
**Triggers:**
- `GenServer.start_link` or `Supervisor.start_link` in test setup blocks
- Manual `on_exit` cleanup for processes
- Test failures that leave zombie processes (ports exhausted, name conflicts)
**Example — the smell:**
```elixir
setup do
{:ok, cache} = MyApp.Cache.start_link(name: :test_cache)
{:ok, db} = MyApp.DB.start_link(pool_size: 1)
on_exit(fn ->
GenServer.stop(cache)
GenServer.stop(db)
end)
%{cache: cache, db: db}
end
```
**Example — fixed:**
```elixir
setup do
cache = start_supervised!({MyApp.Cache, name: :test_cache})
db = start_supervised!({MyApp.DB, pool_size: 1})
%{cache: cache, db: db}
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Testing supervision tree behavior itself (restart strategies, shutdown order)
- The process must be linked to the test process to test crash propagation
- You need a process started OUTSIDE ExUnit's supervisor (e.g., testing distributed node behavior)
**Example of acceptable use:**
```elixir
test "crash propagates to caller" do
Process.flag(:trap_exit, true)
{:ok, pid} = CrashableWorker.start_link(self())
send(pid, :crash)
assert_receive {:EXIT, ^pid, :boom}
end
```
**Why it's OK here:** The test is verifying link/exit behavior. Using `start_supervised!` would put the process under ExUnit's supervisor, which would intercept the crash and defeat the purpose of the test.
---
## 3. Asserting Exact Equality on Concurrent/Shared Output
@@ -75,6 +164,52 @@ end
**Source:** `lib/ex_unit/lib/ex_unit/capture_io.ex` docs explicitly warn: "use `=~` instead of `==` for assertions on `:stderr` if your tests are async"
### When to Apply This Rule
**Triggers:**
- `== "..."` assertions on `capture_io(:stderr, ...)` in async tests
- Tests that fail intermittently with "extra" output in the captured string
- Assertions on Logger output that include timestamps or metadata
**Example — the smell:**
```elixir
use ExUnit.Case, async: true
test "warns on deprecation" do
output = capture_io(:stderr, fn -> MyModule.deprecated_func() end)
assert output == "[warning] deprecated_func/0 is deprecated\n"
end
```
**Example — fixed:**
```elixir
use ExUnit.Case, async: true
test "warns on deprecation" do
output = capture_io(:stderr, fn -> MyModule.deprecated_func() end)
assert output =~ "deprecated_func/0 is deprecated"
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Capturing `:stdio` (group leader), which is per-process and safe for async tests
- Running with `async: false` where no concurrent output interference is possible
- Testing exact formatted output in a synchronous, isolated context
**Example of acceptable use:**
```elixir
use ExUnit.Case, async: true
test "prints exact format to stdout" do
# :stdio uses group leader — per-process, no interference
assert capture_io(fn -> IO.puts("hello") end) == "hello\n"
end
```
**Why it's OK here:** Capturing `:stdio` (the default) uses the process group leader, which is isolated per test process. No other test can pollute it, even in async mode.
---
## 4. Registering Global Names in Async Tests
@@ -103,6 +238,58 @@ end
**Source:** `lib/elixir/test/elixir/registry_test.exs:28``name = :"#{config.test}_#{partitions}_#{inspect(keys)}"` — always derives unique names from test context.
### When to Apply This Rule
**Triggers:**
- Hardcoded atom names in `GenServer.start_link` inside async test modules
- `{:error, {:already_started, _}}` errors in test output
- Tests that pass alone but fail when the full suite runs
**Example — the smell:**
```elixir
use ExUnit.Case, async: true
test "cache stores and retrieves" do
{:ok, _} = Cachex.start_link(name: :test_cache)
Cachex.put(:test_cache, :key, "value")
assert Cachex.get!(:test_cache, :key) == "value"
end
```
**Example — fixed:**
```elixir
use ExUnit.Case, async: true
test "cache stores and retrieves", %{test: test_name} do
{:ok, _} = Cachex.start_link(name: test_name)
Cachex.put(test_name, :key, "value")
assert Cachex.get!(test_name, :key) == "value"
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The test module uses `async: false` and properly cleans up the named process
- You're testing the name registration behavior itself
- Using `start_supervised!` which handles cleanup automatically (though names still can't collide across concurrent modules)
**Example of acceptable use:**
```elixir
use ExUnit.Case, async: false # Explicitly synchronous
setup do
pid = start_supervised!({MyServer, name: :singleton_server})
%{pid: pid}
end
test "singleton server responds" do
assert MyServer.ping(:singleton_server) == :pong
end
```
**Why it's OK here:** `async: false` guarantees no other test module runs concurrently. The named process won't collide, and `start_supervised!` ensures cleanup between tests.
---
## 5. Testing Private Functions Directly
@@ -125,6 +312,65 @@ end
**Source:** The Elixir test suite tests public APIs exclusively. `gen_server_test.exs` tests `GenServer.call/cast/stop` — never the internal `handle_*` callbacks directly.
### When to Apply This Rule
**Triggers:**
- Functions prefixed with `__` or marked `@doc false` being called in tests
- Test file imports or aliases internal modules not part of the public API
- Tests that break after internal refactoring even though external behavior is unchanged
**Example — the smell:**
```elixir
defmodule MyApp.Parser do
def parse(input), do: input |> tokenize() |> build_ast()
# Made public just for testing!
@doc false
def tokenize(input), do: ...
@doc false
def build_ast(tokens), do: ...
end
# Test:
test "tokenizer splits on commas" do
assert MyApp.Parser.tokenize("a,b,c") == ["a", "b", "c"]
end
```
**Example — fixed:**
```elixir
# Test through the public API
test "parser handles comma-separated input" do
assert MyApp.Parser.parse("a,b,c") == %AST{nodes: ["a", "b", "c"]}
end
# OR extract into its own module if tokenizing is genuinely reusable:
defmodule MyApp.Tokenizer do
def tokenize(input), do: ...
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The "private" function is actually a public utility that other modules depend on (it should be `@doc`'d)
- Property-based testing where you need to verify invariants on intermediate transformations
- Testing a complex algorithm step-by-step during development (remove these tests before merging)
**Example of acceptable use:**
```elixir
# Pure algorithmic module where each step has documented guarantees
defmodule MyApp.Compiler.Optimizer do
@doc "Remove dead code branches. Used by Compiler pipeline."
def eliminate_dead_code(ast), do: ...
@doc "Inline constant expressions. Used by Compiler pipeline."
def inline_constants(ast), do: ...
end
```
**Why it's OK here:** These aren't private functions exposed for testing — they're public steps in a compilation pipeline. Each has its own documented contract and is used by other modules.
---
## 6. Not Cleaning Up After Global State Changes
@@ -157,6 +403,53 @@ end
**Source:** `lib/logger/test/logger_test.exs:12-17` — Every Logger config change has a corresponding `on_exit` restoration. `lib/logger/test/test_helper.exs:57-62``capture_log` uses `after` to always restore level.
### When to Apply This Rule
**Triggers:**
- `Application.put_env`, `System.put_env`, `Logger.configure` in tests without corresponding restoration
- Tests that pass in isolation but fail in the full suite
- "Works on my machine" but fails in CI (different test ordering)
**Example — the smell:**
```elixir
test "respects timezone setting" do
System.put_env("TZ", "UTC")
assert MyApp.current_time().zone == "UTC"
# System env is now polluted for all remaining tests!
end
```
**Example — fixed:**
```elixir
test "respects timezone setting" do
original = System.get_env("TZ")
System.put_env("TZ", "UTC")
on_exit(fn ->
if original, do: System.put_env("TZ", original), else: System.delete_env("TZ")
end)
assert MyApp.current_time().zone == "UTC"
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The state change is idempotent and matches the test suite's baseline (setting it to what it already is)
- You're in an integration test with a completely isolated environment (Docker container, separate VM)
**Example of acceptable use:**
```elixir
# Test helper that always restores to known baseline
setup do
# This IS the baseline — safe to set without storing original
Logger.configure(level: :warning)
on_exit(fn -> Logger.configure(level: :warning) end)
end
```
**Why it's OK here:** The level being set IS the test suite's standard baseline. Even if cleanup "fails," the state is already correct. The `on_exit` is belt-and-suspenders.
---
## 7. Nested `describe` Blocks
@@ -183,6 +476,61 @@ end
**Source:** `lib/ex_unit/lib/ex_unit/callbacks.ex:423-425``no_describe!` check prevents nesting.
### When to Apply This Rule
**Triggers:**
- Compile error: "cannot call describe inside another describe"
- Desire to organize tests hierarchically (coming from RSpec/Jest habits)
- Test file with many describe blocks that feel like they should be nested
**Example — the smell:**
```elixir
describe "API" do
describe "v1" do # Won't compile!
describe "users" do # Won't compile!
test "list" do ... end
end
end
end
```
**Example — fixed:**
```elixir
describe "API v1 users - list" do
test "returns paginated results" do ... end
test "filters by role" do ... end
end
describe "API v1 users - create" do
test "validates email" do ... end
test "hashes password" do ... end
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- You're using a testing library that extends ExUnit with nested contexts (though this is rare and usually discouraged in Elixir culture)
**Example of acceptable use:**
```elixir
# Flat describes with shared setup via module attributes or helper functions
describe "admin users" do
setup [:create_admin]
test "can delete other users", %{admin: admin} do ... end
test "can promote users", %{admin: admin} do ... end
end
describe "regular users" do
setup [:create_user]
test "cannot delete others", %{user: user} do ... end
end
```
**Why it's OK here:** Flat describe blocks with named setup functions achieve the same organization as nesting, while staying within ExUnit's design philosophy.
---
## 8. Using `assert` Where `assert_receive` Belongs
@@ -207,6 +555,55 @@ end
**Source:** ExUnit provides specialized assertion macros for messages precisely because generic `assert` is inadequate for mailbox testing.
### When to Apply This Rule
**Triggers:**
- `Process.info(self(), :messages)` in test code
- `:erlang.process_info` for mailbox inspection
- Manual mailbox polling loops in tests
**Example — the smell:**
```elixir
test "broadcast reaches subscriber" do
PubSub.subscribe(:events)
PubSub.broadcast(:events, :hello)
Process.sleep(50)
{_, messages} = Process.info(self(), :messages)
assert Enum.any?(messages, fn msg -> msg == :hello end)
end
```
**Example — fixed:**
```elixir
test "broadcast reaches subscriber" do
PubSub.subscribe(:events)
PubSub.broadcast(:events, :hello)
assert_receive :hello
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Testing mailbox ordering (need to inspect all messages at once)
- Asserting on message count rather than content
- Building test utilities that need raw mailbox access
**Example of acceptable use:**
```elixir
test "exactly 3 events are received" do
trigger_events(3)
# Wait for all to arrive
assert_receive :event, 1000
assert_receive :event, 100
assert_receive :event, 100
# Verify no extras
refute_receive :event, 100
end
```
**Why it's OK here:** We need to assert both presence AND absence of messages, with ordering guarantees. The sequence of `assert_receive` + `refute_receive` is the idiomatic way to verify "exactly N messages."
---
## 9. Forgetting `Process.flag(:trap_exit, true)` When Testing Linked Processes
@@ -233,6 +630,65 @@ end
**Source:** `lib/elixir/test/elixir/task_test.exs:297,305,315,330` — Every test that expects a linked process to crash sets `:trap_exit` first.
### When to Apply This Rule
**Triggers:**
- Tests using `Task.async` where the task might crash
- Tests that verify process crash behavior
- Unexplained test crashes with no assertion failure message
- `** (EXIT from #PID<...>)` in test output
**Example — the smell:**
```elixir
test "supervisor restarts crashed child" do
{:ok, sup} = Supervisor.start_link([{Worker, []}], strategy: :one_for_one)
[{_, pid, _, _}] = Supervisor.which_children(sup)
Process.exit(pid, :kill)
# Test might crash here if linked!
Process.sleep(100)
[{_, new_pid, _, _}] = Supervisor.which_children(sup)
assert new_pid != pid
end
```
**Example — fixed:**
```elixir
test "supervisor restarts crashed child" do
Process.flag(:trap_exit, true)
{:ok, sup} = Supervisor.start_link([{Worker, []}], strategy: :one_for_one)
[{_, pid, _, _}] = Supervisor.which_children(sup)
ref = Process.monitor(pid)
Process.exit(pid, :kill)
assert_receive {:DOWN, ^ref, :process, ^pid, :killed}
[{_, new_pid, _, _}] = Supervisor.which_children(sup)
assert new_pid != pid
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Using `start_supervised!` (ExUnit's test supervisor handles the linking)
- Testing that a crash DOES propagate (the test crashing IS the assertion — use `catch_exit` at the call site)
- The process is started with `start_link` under ExUnit's supervisor
**Example of acceptable use:**
```elixir
test "unlinked task doesn't crash caller" do
# Task.async_stream with ordered: false doesn't link
results = Task.async_stream([1, 2, 3], fn
2 -> raise "boom"
n -> n * 2
end, on_timeout: :kill_task)
|> Enum.to_list()
assert {:exit, _} = Enum.find(results, &match?({:exit, _}, &1))
end
```
**Why it's OK here:** `Task.async_stream` handles the linking internally and converts crashes to tagged results. The test process isn't directly linked to the failing task.
---
## 10. Writing Flaky Tests with Timing Assumptions
@@ -266,6 +722,60 @@ end
**Source:** Elixir's own `ExUnit.configure` allows setting `assert_receive_timeout` globally (default 100ms, CI uses 300ms via env var `ELIXIR_ASSERT_TIMEOUT`).
### When to Apply This Rule
**Triggers:**
- `Process.sleep(N)` where N is close to the expected delay (sleep 110ms for a 100ms timer)
- Tests tagged `@tag :flaky` or skipped in CI
- Tests with "TODO: increase timeout" comments
**Example — the smell:**
```elixir
test "rate limiter allows after window" do
RateLimiter.hit(:endpoint)
RateLimiter.hit(:endpoint) # Should be rejected
assert RateLimiter.hit(:endpoint) == {:error, :rate_limited}
Process.sleep(1010) # Wait for 1s window to expire
assert RateLimiter.hit(:endpoint) == :ok
end
```
**Example — fixed:**
```elixir
test "rate limiter allows after window" do
RateLimiter.hit(:endpoint)
RateLimiter.hit(:endpoint)
assert RateLimiter.hit(:endpoint) == {:error, :rate_limited}
# Use the rate limiter's own notification mechanism
assert_receive {:window_reset, :endpoint}, 5000
assert RateLimiter.hit(:endpoint) == :ok
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Testing actual wall-clock behavior with very generous margins (10x the expected time)
- Performance benchmarks where timing IS the measurement
- The system under test has no notification mechanism and can't be modified
**Example of acceptable use:**
```elixir
test "connection timeout fires within expected range" do
{time_us, {:error, :timeout}} = :timer.tc(fn ->
HttpClient.get("http://10.255.255.1", timeout: 1000)
end)
# Generous bounds: should be 1000ms, accept 800-3000ms
assert time_us > 800_000
assert time_us < 3_000_000
end
```
**Why it's OK here:** The timing IS the assertion — we're verifying the timeout mechanism works correctly. The bounds are generous enough to handle system load without being so wide as to be meaningless.
---
## 11. Not Using `=~` for Regex/Partial Matching
@@ -292,6 +802,47 @@ end
**Source:** `lib/elixir/test/elixir/gen_server_test.exs:70-82` uses `~r"expected :name option to be one of the following:"` in `assert_raise` — testing the stable part, ignoring the dynamic rest.
### When to Apply This Rule
**Triggers:**
- `assert msg == "..."` where the string contains dynamic parts (PIDs, timestamps, paths)
- Tests that break after minor wording changes in error messages
- Assertions on formatted output from Logger or IO
**Example — the smell:**
```elixir
test "reports compile error" do
assert_raise CompileError, "nofile:1: undefined function foo/0 (there is no such import)", fn ->
Code.compile_string("foo()")
end
end
```
**Example — fixed:**
```elixir
test "reports compile error" do
assert_raise CompileError, ~r/undefined function foo\/0/, fn ->
Code.compile_string("foo()")
end
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The exact output IS the contract (e.g., a formatter that must produce deterministic output)
- Testing serialization where byte-exact output matters
- The string is fully under your control with no dynamic parts
**Example of acceptable use:**
```elixir
test "JSON encoder produces exact output" do
assert Jason.encode!(%{a: 1, b: 2}) == ~s({"a":1,"b":2})
end
```
**Why it's OK here:** The JSON encoder's exact output format IS the contract. Users depend on the precise serialization. A partial match would miss regressions in formatting.
---
## 12. Relying on Process.alive? Without Synchronization
@@ -317,6 +868,49 @@ end
**Source:** `lib/elixir/test/elixir/supervisor_test.exs:278-285``assert_kill` helper always uses monitor + assert_receive, never `Process.alive?` polling.
### When to Apply This Rule
**Triggers:**
- `Process.alive?/1` in test assertions
- `refute Process.alive?(pid)` after killing/stopping a process
- Polling loops checking if a process has died
**Example — the smell:**
```elixir
test "worker shuts down gracefully" do
pid = start_supervised!(Worker)
Worker.shutdown(pid)
Process.sleep(50)
refute Process.alive?(pid)
end
```
**Example — fixed:**
```elixir
test "worker shuts down gracefully" do
pid = start_supervised!(Worker)
ref = Process.monitor(pid)
Worker.shutdown(pid)
assert_receive {:DOWN, ^ref, :process, ^pid, :normal}, 5000
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Checking that a process IS alive (existence check, not death check) — no race condition there
- In production code (not tests) where you need a non-blocking liveness check and can handle the race
**Example of acceptable use:**
```elixir
test "start_link creates a running process" do
{:ok, pid} = MyServer.start_link([])
assert Process.alive?(pid) # Just started — guaranteed alive
end
```
**Why it's OK here:** We just created the process synchronously. It's guaranteed to be alive. There's no race condition when checking for liveness immediately after creation.
---
## 13. Using `== ""` for Empty Capture Assertions in Async Tests
@@ -341,6 +935,55 @@ end
**Source:** `lib/ex_unit/lib/ex_unit/capture_io.ex` docs: "avoid empty captures on `:stderr` with async tests"
### When to Apply This Rule
**Triggers:**
- `capture_io(:stderr, ...)` combined with `== ""` in async tests
- Flaky assertion failures showing unexpected stderr content
- Tests that pass alone but fail in the full async suite
**Example — the smell:**
```elixir
use ExUnit.Case, async: true
test "no warnings on valid input" do
assert capture_io(:stderr, fn -> MyApp.process("valid") end) == ""
end
```
**Example — fixed:**
```elixir
use ExUnit.Case, async: true
test "no warnings on valid input" do
# Option 1: Use capture_log instead (isolated per process)
assert capture_log(fn -> MyApp.process("valid") end) == ""
# Option 2: Use :stdio which is per-process
assert capture_io(fn -> MyApp.process("valid") end) == ""
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- The test uses `async: false` (no concurrent interference)
- Capturing `:stdio` (default) which uses the per-process group leader
- The function being tested explicitly writes to a named IO device you control
**Example of acceptable use:**
```elixir
use ExUnit.Case, async: false
test "no stderr output in production mode" do
Application.put_env(:my_app, :env, :prod)
on_exit(fn -> Application.put_env(:my_app, :env, :test) end)
assert capture_io(:stderr, fn -> MyApp.start() end) == ""
end
```
**Why it's OK here:** `async: false` means no other test is running concurrently. The stderr capture is isolated by test scheduling, not by the IO mechanism.
---
## 14. Overriding ExUnit Reserved Tags
@@ -359,6 +1002,56 @@ end
**Source:** `lib/ex_unit/lib/ex_unit/callbacks.ex` — ExUnit raises if you try to set reserved keys to different values in setup.
### When to Apply This Rule
**Triggers:**
- Runtime error: "ExUnit reserved field ... was set to a different value"
- Tags named `:test`, `:case`, `:file`, `:line`, `:async`, `:registered`, `:describe`
- Mysterious test failures after adding tags
**Example — the smell:**
```elixir
@tag async: false # Doesn't actually change async behavior!
@tag file: "custom.exs" # Breaks file reporting
@tag describe: "my group" # Conflicts with describe blocks
test "my test" do
# ...
end
```
**Example — fixed:**
```elixir
# Use the actual ExUnit mechanisms:
use ExUnit.Case, async: false # This is how you control async
# For custom categorization, use non-reserved names:
@tag category: :integration
@tag speed: :slow
@tag feature: "auth"
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- You're writing ExUnit extensions or custom formatters that intentionally read/write these fields (rare, advanced use)
**Example of acceptable use:**
```elixir
# Custom ExUnit formatter that adds metadata
defmodule MyFormatter do
use GenServer
def handle_cast({:test_finished, %{tags: tags}}, state) do
# Reading reserved tags in a formatter is fine
IO.puts("#{tags.file}:#{tags.line} - #{tags.test}")
{:noreply, state}
end
end
```
**Why it's OK here:** The formatter is reading reserved tags (their intended purpose), not overriding them. ExUnit populates these fields for formatters and reporters to consume.
---
## 15. Complex Conditional Logic in Tests
@@ -407,3 +1100,61 @@ end
```
**Source:** ExUnit case.ex docs: "If you use parameterized tests and then find yourself adding conditionals in your tests to deal with different parameters, then parameterized tests may be the wrong solution."
### When to Apply This Rule
**Triggers:**
- `if`/`case`/`cond` inside test bodies
- `for` loops with assertions inside (unless trivially mapping input→output)
- Tests that assert different things based on runtime values
**Example — the smell:**
```elixir
test "validates all field types" do
for {field, value, expected_error} <- [
{:email, "bad", "invalid email"},
{:age, -1, "must be positive"},
{:name, "", "can't be blank"}
] do
changeset = User.changeset(%User{}, %{field => value})
if expected_error do
assert errors_on(changeset)[field] == [expected_error]
else
assert changeset.valid?
end
end
end
```
**Example — fixed:**
```elixir
# Elixir 1.18+ parameterize
use ExUnit.Case, parameterize: [
%{field: :email, value: "bad", error: "invalid email"},
%{field: :age, value: -1, error: "must be positive"},
%{field: :name, value: "", error: "can't be blank"}
]
test "validates #{inspect(field)}", %{field: field, value: value, error: error} do
changeset = User.changeset(%User{}, %{field => value})
assert errors_on(changeset)[field] == [error]
end
```
### Exceptions (When This Rule Doesn't Apply)
**It's OK when:**
- Testing a pure function over many inputs where the mapping is trivial and the loop is just convenience
- Property-based tests (StreamData) that generate inputs — conditionals in generators are fine
- The loop tests identical behavior, not different behaviors
**Example of acceptable use:**
```elixir
test "all ASCII digits parse to integers" do
for char <- ?0..?9 do
assert Integer.parse(<<char>>) == {char - ?0, ""}
end
end
```
**Why it's OK here:** There's no conditional logic — every iteration tests the exact same behavior with a trivially predictable result. If one fails, the assertion message includes the specific value. The loop is purely for conciseness.