Rodin df9c856d96 patterns(testing): add async test filtering pattern (#21)
When testing telemetry, pub/sub, or any broadcast mechanism with
async: true, events from concurrent tests can leak into mailbox.

Fix: Pin-match on unique identifier to filter.

Two-part pattern:
1. Filter at source — only forward events matching test's identifier
2. Pin in assertion — ^variable rejects mismatches that slip through

Applies to telemetry handlers, PubSub, GenStage/Broadway consumers,
any shared message bus.

Triggered by PR #710 flaky test fix in gargoyle.
2026-05-09 18:27:05 -07:00

Testing Patterns in Elixir

Patterns extracted from the Elixir standard library source code — how the core team writes and organizes tests.

Contents

  1. Module-Level Async Declaration
  2. Parameterized Tests
  3. Setup with start_supervised/2
  4. Named Setup Functions (Composable Pipelines)
  5. on_exit for Reversing Global Side Effects
  6. Pattern Match Assertions
  7. assert_receive / refute_receive for Process Communication
  8. Testing GenServers via Public API (No Internal State Inspection)
  9. catch_exit for Testing Process Failures
  10. @tag capture_log: true for Suppressing Expected Log Output
  11. capture_log / capture_io for Content Assertions
  12. describe Blocks for Logical Grouping
  13. ExUnit.CaseTemplate for Shared Test Infrastructure
  14. doctest Integration
  15. Process.sleep(:infinity) as a Process Parking Pattern
  16. Helper Functions for Test-Specific Behavior
  17. @tag :tmp_dir for Filesystem Tests
  18. assert_raise with Message Matching
  19. @moduletag / @describetag for Cross-Cutting Configuration
  20. Context Pattern Matching in Test Signatures

1. Module-Level Async Declaration

Source: lib/elixir/test/elixir/gen_server_test.exs#L9, lib/elixir/test/elixir/enum_test.exs#L8, nearly all test files

What it does: Every test module declares async: true or async: false at the module level, making concurrency intent explicit.

Why: Tests that don't mutate global state run concurrently, dramatically speeding up the suite. The explicit opt-in forces developers to think about whether their test touches shared resources.

Pattern:

defmodule GenServerTest do
  use ExUnit.Case, async: true
  # ...
end

# When global state is modified (e.g. registered processes):
defmodule TaskTest do
  use ExUnit.Case  # async defaults to false
  # ...
end

Key insight: The vast majority of Elixir's own tests use async: true. Only tests that register global names, modify Logger config, or interact with the filesystem use synchronous mode.

When to Use

Triggers:

  • You're creating any new test module (always explicitly declare async intent)
  • Tests are pure — no global name registration, no shared ETS tables, no filesystem writes to the same path
  • You want faster test suite execution

Example — before:

defmodule UserTest do
  use ExUnit.Case
  # Defaults to async: false — runs serially for no reason
  test "parses name" do
    assert parse_name("Alice Bob") == {"Alice", "Bob"}
  end
end

Example — after:

defmodule UserTest do
  use ExUnit.Case, async: true
  # Pure tests run concurrently — suite finishes faster
  test "parses name" do
    assert parse_name("Alice Bob") == {"Alice", "Bob"}
  end
end

When NOT to Use

Don't use this when:

  • Tests register global process names (:global, Process.register/2)
  • Tests modify shared application config (Application.put_env)
  • Tests write to the same filesystem path without @tag :tmp_dir
  • Tests depend on Logger configuration or assert on captured :stderr output with exact equality (==)

Over-application example:

defmodule ConfigTest do
  use ExUnit.Case, async: true  # DANGEROUS

  test "updates global config" do
    Application.put_env(:my_app, :key, :value)
    assert Application.get_env(:my_app, :key) == :value
  end
  # Another async test might see or clobber this config!
end

Better alternative:

defmodule ConfigTest do
  use ExUnit.Case  # async: false — global state mutation

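  setup do
    original = Application.fetch_env(:my_app, :key)

    on_exit(fn ->
      # Restore the previous value, or delete the key if it was unset before
      case original do
        {:ok, value} -> Application.put_env(:my_app, :key, value)
        :error -> Application.delete_env(:my_app, :key)
      end
    end)
  end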

  test "updates global config" do
    Application.put_env(:my_app, :key, :value)
    assert Application.get_env(:my_app, :key) == :value
  end
end

Why: Async tests run in parallel. Global state mutations in parallel tests produce race conditions — intermittent failures that are nightmares to debug.


2. Parameterized Tests

Source: lib/elixir/test/elixir/registry_test.exs#L12

What it does: Runs the same test suite against multiple configurations using the :parameterize option (since v1.18).

Why: Avoids duplicating test modules for combinatorial configurations. The Registry needs testing with :unique/:duplicate keys and varying partition counts.

Pattern:

defmodule Registry.Test do
  use ExUnit.Case,
    async: true,
    parameterize:
      for(
        keys <- [:unique, :duplicate, {:duplicate, :pid}, {:duplicate, :key}],
        partitions <- [1, 8],
        do: %{keys: keys, partitions: partitions}
      )

  setup config do
    name = :"#{config.test}_#{config.partitions}_#{inspect(config.keys)}"
    opts = [keys: config.keys, name: name, partitions: config.partitions]
    {:ok, _} = start_supervised({Registry, opts})
    %{registry: name}
  end

  test "clean up registry on process crash", %{registry: registry, partitions: partitions} do
    # Test body uses parameters from context
  end
end

Warning from docs: "If you use parameterized tests and then find yourself adding conditionals in your tests to deal with different parameters, then parameterized tests may be the wrong solution."
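
A minimal, self-contained sketch of the same option outside the Registry suite, assuming Elixir 1.18 or later. The module name and the separator parameter are hypothetical, chosen only so the script runs without any application code:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule SeparatorRoundTripTest do
  # Hypothetical example module: one parameter map per run of the whole module.
  use ExUnit.Case,
    async: true,
    parameterize: [%{sep: ","}, %{sep: ";"}, %{sep: "|"}]

  test "join then split returns the original parts", %{sep: sep} do
    parts = ["a", "b", "c"]
    # Same assertion for every parameter; only the input differs.
    assert parts |> Enum.join(sep) |> String.split(sep) == parts
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```

The test body never branches on sep, which is the property the warning above asks for.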

When to Use

Triggers:

  • The same logic must work across multiple configurations (backends, modes, partition counts)
  • You'd otherwise copy-paste an entire test module with minor variations
  • The behavior under test is identical regardless of parameter — only setup differs

Example — before:

# Two nearly identical modules
defmodule CacheEtsTest do
  use ExUnit.Case, async: true
  setup do: %{cache: start_cache(:ets)}

  test "get/set", %{cache: c} do
    # ...
  end
end

defmodule CacheRedisTest do
  use ExUnit.Case, async: true
  setup do: %{cache: start_cache(:redis)}

  test "get/set", %{cache: c} do
    # ...  # Same test!
  end
end

Example — after:

defmodule CacheTest do
  use ExUnit.Case,
    async: true,
    parameterize: [%{backend: :ets}, %{backend: :redis}]

  setup %{backend: backend} do
    %{cache: start_cache(backend)}
  end

  test "get/set", %{cache: c} do
    # Runs once per backend — no duplication
  end
end

When NOT to Use

Don't use this when:

  • Different parameters need different assertions (sign you're testing different behavior)
  • You find yourself adding if config.backend == :redis inside tests
  • There are only 2 simple variations (just write two tests with descriptive names)

Over-application example:

defmodule ApiTest do
  use ExUnit.Case,
    parameterize: [%{method: :get}, %{method: :post}]

  test "request", %{method: method} do
    if method == :get do
      assert get("/users") == 200
    else
      assert post("/users", %{name: "x"}) == 201  # Different assertion!
    end
  end
end

Better alternative:

test "GET /users returns 200" do
  assert get("/users") == 200
end

test "POST /users creates a user" do
  assert post("/users", %{name: "x"}) == 201
end

Why: Parameterized tests assert the same behavior across configurations. If the assertions diverge per parameter, you're testing different things — write separate tests with clear names.


3. Setup with start_supervised/2

Source: lib/ex_unit/lib/ex_unit/callbacks.ex#L277, lib/elixir/test/elixir/registry_test.exs#L31

What it does: Starts processes under a test supervisor that guarantees cleanup before the next test.

Why: Eliminates manual cleanup. The test supervisor terminates children in reverse order before on_exit callbacks run. No leaked processes between tests.

Pattern:

setup config do
  {:ok, _} = start_supervised({Registry, keys: :unique, name: config.test})
  %{registry: config.test}
end

Contrast with anti-pattern:

# BAD: hard-coded global name, and the kill is asynchronous, so the process may still be alive when the next test starts
setup do
  {:ok, pid} = Registry.start_link(keys: :unique, name: :my_reg)
  on_exit(fn -> Process.exit(pid, :kill) end)
  %{registry: :my_reg}
end

When to Use

Triggers:

  • Your test needs a running process (GenServer, Registry, Supervisor, etc.)
  • You want guaranteed cleanup even if the test crashes
  • The process should have the same lifecycle as the test

Example — before:

setup do
  {:ok, pid} = MyServer.start_link(name: :test_server)
  on_exit(fn ->
    if Process.alive?(pid), do: GenServer.stop(pid)
  end)
  %{server: pid}
end

Example — after:

setup do
  pid = start_supervised!(MyServer)
  %{server: pid}
end

When NOT to Use

Don't use this when:

  • You're testing the startup/shutdown behavior itself (need raw start_link)
  • The process must outlive the test (rare — usually a test design smell)
  • You need to test what happens when start_link fails

Over-application example:

# Testing that start_link fails with bad args
test "rejects invalid config" do
  # The failure comes back wrapped by the test supervisor, so the assertion
  # can only match loosely and hides the actual error you want to test
  assert {:error, _} = start_supervised({MyServer, invalid: true})
end

Better alternative:

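test "rejects invalid config" do
  # start_link links the server to the test process; if init/1 stops with a
  # non-normal reason, trap exits so the test survives to assert on the tuple
  Process.flag(:trap_exit, true)
  assert {:error, {:bad_config, _}} = MyServer.start_link(invalid: true)
end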

Why: start_supervised is for processes your test needs running. When you're testing failure modes of the process itself, call start_link directly so you can assert on the error tuple.
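
As an illustration of the lifecycle described in this section, here is a small sketch that starts two children of the same module and stops one mid-test. The module name and the :hits / :misses ids are hypothetical; Agent stands in for whatever process your test needs:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule SupervisedAgentsTest do
  use ExUnit.Case, async: true

  setup do
    # Two children of the same module need distinct child-spec ids.
    hits = start_supervised!(Supervisor.child_spec({Agent, fn -> 0 end}, id: :hits))
    misses = start_supervised!(Supervisor.child_spec({Agent, fn -> 0 end}, id: :misses))
    %{hits: hits, misses: misses}
  end

  test "each test gets fresh processes", %{hits: hits, misses: misses} do
    Agent.update(hits, &(&1 + 1))
    assert Agent.get(hits, & &1) == 1
    assert Agent.get(misses, & &1) == 0
  end

  test "stop_supervised/1 shuts a child down mid-test", %{hits: hits} do
    assert :ok = stop_supervised(:hits)
    refute Process.alive?(hits)
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```

No on_exit callback appears anywhere: the test supervisor owns both Agents and tears them down when each test finishes.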


4. Named Setup Functions (Composable Pipelines)

Source: lib/ex_unit/lib/ex_unit/callbacks.ex#L100 (docs)

What it does: Defines setup as a list of named functions rather than anonymous blocks.

Why: Each step is independently testable, reusable, and the setup pipeline reads like a declaration of preconditions.

Pattern:

setup [:clean_up_tmp_directory, :start_server, :seed_data]

defp clean_up_tmp_directory(_context), do: [tmp_dir: "/tmp/test"]
defp start_server(context), do: {:ok, server: start_supervised!({MyServer, context.tmp_dir})}
defp seed_data(_context), do: :ok

When to Use

Triggers:

  • Setup has multiple independent steps that read better as named operations
  • Different describe blocks need different combinations of setup steps
  • You want to reuse individual setup steps across describe blocks

Example — before:

setup do
  # 30 lines of mixed concerns in one block
  {:ok, _} = start_supervised(Database)
  user = insert(:user)
  token = generate_token(user)
  {:ok, socket} = connect(token)
  %{user: user, socket: socket, token: token}
end

Example — after:

setup [:start_database, :create_user, :connect_socket]

defp start_database(_ctx), do: {:ok, db: start_supervised!(Database)}
defp create_user(_ctx), do: {:ok, user: insert(:user)}
defp connect_socket(%{user: user}), do: {:ok, socket: connect!(generate_token(user))}

When NOT to Use

Don't use this when:

  • Setup is 1-3 lines (named function adds indirection without clarity)
  • Steps are tightly coupled and can't be composed independently
  • Only one describe block uses the setup (inline is simpler)

Over-application example:

# Over-decomposing trivial setup
setup [:assign_name]

defp assign_name(_ctx), do: {:ok, name: "Alice"}

Better alternative:

setup do
  %{name: "Alice"}
end

Why: Named setup functions shine when they compose independently and describe preconditions. For trivial assignments, the indirection makes the code harder to follow, not easier.
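
One way the composition might look when two describe blocks need different combinations of steps. This is a sketch with hypothetical names (start_counter, seed_counter), using an Agent as a stand-in for real infrastructure:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule ComposedSetupTest do
  use ExUnit.Case, async: true

  describe "with a counter only" do
    setup [:start_counter]

    test "starts at zero", %{counter: counter} do
      assert Agent.get(counter, & &1) == 0
    end
  end

  describe "with a seeded counter" do
    # Steps run in order; seed_counter reads what start_counter added.
    setup [:start_counter, :seed_counter]

    test "sees the seeded value", %{counter: counter} do
      assert Agent.get(counter, & &1) == 10
    end
  end

  defp start_counter(_context) do
    %{counter: start_supervised!({Agent, fn -> 0 end})}
  end

  defp seed_counter(%{counter: counter}) do
    Agent.update(counter, fn _ -> 10 end)
    :ok
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```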


5. on_exit for Reversing Global Side Effects

Source: lib/elixir/test/elixir/task_test.exs#L1128, lib/logger/test/logger_test.exs#L12

What it does: Registers cleanup callbacks that always run, even if the test fails.

Why: Guarantees global state (Logger config, ETS tables, process registrations) is restored regardless of test outcome.

Pattern:

setup do
  translator = :logger.get_primary_config().filters[:logger_translator]
  assert :ok = :logger.remove_primary_filter(:logger_translator)
  on_exit(fn -> :logger.add_primary_filter(:logger_translator, translator) end)
end

Key design: on_exit runs in a separate process from the test, so it cannot interfere with test assertions.

When to Use

Triggers:

  • Your test modifies global state (application config, Logger, ETS, registered names)
  • State must be restored even if the test crashes or fails
  • You need cleanup that runs after the test process dies

Example — before:

test "custom log level" do
  Logger.configure(level: :error)
  # test logic...
  Logger.configure(level: :debug)  # Never runs if test fails!
end

Example — after:

test "custom log level" do
  original = Logger.level()
  on_exit(fn -> Logger.configure(level: original) end)
  Logger.configure(level: :error)
  # test logic — cleanup guaranteed
end

When NOT to Use

Don't use this when:

  • start_supervised handles the lifecycle (it cleans up automatically)
  • The state is test-local (process dictionary, local variables)
  • You're using it for process cleanup that start_supervised does better

Over-application example:

setup do
  {:ok, pid} = MyWorker.start_link()
  on_exit(fn -> Process.exit(pid, :kill) end)
  %{worker: pid}
end

Better alternative:

setup do
  pid = start_supervised!(MyWorker)
  %{worker: pid}
end

Why: on_exit is for global side effects that start_supervised can't handle. For process lifecycle, start_supervised is more robust — it handles ordering, linking, and shutdown properly.


6. Pattern Match Assertions

Source: lib/ex_unit/lib/ex_unit/assertions.ex#L145

What it does: Uses assert with = for structural pattern matching in assertions.

Why: Provides rich failure messages showing both sides. More expressive than assert x == y when you only care about shape.

Pattern:

# Assert structure, ignore specifics
assert {:ok, %{id: id}} = create_user("alice")
assert is_integer(id)

# Pin variables in patterns
x = 5
assert {:count, ^x} = get_counter()

# match? for complex guards
assert match?([%{id: id} | _] when is_integer(id), records)

When to Use

Triggers:

  • You care about the shape/structure but not every field
  • You want to bind a value from the result for further assertions
  • The response has dynamic fields (IDs, timestamps) you can't predict

Example — before:

result = create_user("alice")
assert elem(result, 0) == :ok
user = elem(result, 1)
assert Map.has_key?(user, :id)
assert is_integer(user.id)

Example — after:

assert {:ok, %{id: id}} = create_user("alice")
assert is_integer(id)

When NOT to Use

Don't use this when:

  • You need to assert exact equality (use ==)
  • The pattern is so loose it would match unintended values
  • You're asserting on a boolean or simple scalar

Over-application example:

# Pattern match that matches too broadly
assert {:ok, _} = dangerous_operation()
# This passes for {:ok, nil}, {:ok, :error_actually}, anything!

Better alternative:

assert {:ok, %User{active: true}} = dangerous_operation()
# Or when you need exact value:
assert create_user("alice") == {:ok, %User{name: "alice", active: true}}

Why: Pattern match assertions are for structural validation. If your pattern is so loose it can't distinguish success from unexpected values, you're not actually testing anything meaningful.


7. assert_receive / refute_receive for Process Communication

Source: lib/ex_unit/lib/ex_unit/assertions.ex#L466, lib/elixir/test/elixir/process_test.exs#L90

What it does: Waits for messages matching a pattern within a timeout (default 100ms).

Why: Tests asynchronous process communication without Process.sleep. The test either receives the expected message or fails with a helpful mailbox dump.

Pattern:

# Basic message assertion
test "send_after/3 sends messages once expired" do
  Process.send_after(self(), :hello, 10)
  assert_receive :hello
end

# Pattern matching with pins
test "monitor/2 with monitor options" do
  ref_and_alias = Process.monitor(pid, alias: :explicit_unalias)
  send(pid, {:ping, ref_and_alias})
  assert_receive :pong
  assert_receive {:DOWN, ^ref_and_alias, _, _, _}
end

# Negative assertion
test "exit(pid, :normal) does not cause the target process to exit" do
  Process.exit(pid, :normal)
  refute_receive {:EXIT, ^pid, :normal}, 100
end

When to Use

Triggers:

  • Testing async process communication (messages, monitors, links)
  • You need to verify a message was sent without blocking indefinitely
  • Testing pub/sub, GenServer casts, or event broadcasts

Example — before:

test "notifies subscriber" do
  subscribe(self(), :topic)
  publish(:topic, "hello")
  Process.sleep(100)  # Hope it arrived...
  assert_received {:topic, "hello"}  # assert_received checks mailbox NOW
end

Example — after:

test "notifies subscriber" do
  subscribe(self(), :topic)
  publish(:topic, "hello")
  assert_receive {:topic, "hello"}, 500  # Waits UP TO 500ms
end

When NOT to Use

Don't use this when:

  • The operation is synchronous (use regular assert with the return value)
  • You're testing a GenServer.call (it already returns the result synchronously)
  • The default 100ms timeout would make tests flaky — consider a longer timeout or redesigning

Over-application example:

# Using assert_receive for a synchronous operation
test "get user" do
  send(self(), {:user, get_user(1)})  # Why send to self?
  assert_receive {:user, %User{id: 1}}
end

Better alternative:

test "get user" do
  assert %User{id: 1} = get_user(1)
end

Why: assert_receive is for genuinely async communication — messages that arrive independently of the test's control flow. For synchronous return values, just assert on the return directly.
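
The pinned-pattern form shown earlier in this section is also a way to keep unrelated messages out of an assertion. The sketch below tags each message with a unique reference and pins it; the {:event, ref, payload} message shape is an assumption made for illustration:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule TaggedMessageTest do
  use ExUnit.Case, async: true

  test "a pinned ref ignores messages meant for someone else" do
    ref = make_ref()
    other_ref = make_ref()
    parent = self()

    spawn(fn ->
      # Noise with a different ref arrives first and must not match.
      send(parent, {:event, other_ref, :noise})
      send(parent, {:event, ref, :done})
    end)

    # ^ref restricts the match; payload is bound from the matching message.
    assert_receive {:event, ^ref, payload}
    assert payload == :done

    # Exactly one message carried our ref.
    refute_received {:event, ^ref, _}
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```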


8. Testing GenServers via Public API (No Internal State Inspection)

Source: lib/elixir/test/elixir/gen_server_test.exs#L87

What it does: Tests GenServer behavior exclusively through GenServer.call/cast/stop — never peeks at internal state.

Why: Tests the contract, not the implementation. Internal state changes don't break tests.

Pattern:

test "start_link/2, call/2 and cast/2" do
  {:ok, pid} = GenServer.start_link(Stack, [:hello])

  assert GenServer.call(pid, :pop) == :hello
  assert GenServer.cast(pid, {:push, :world}) == :ok
  assert GenServer.call(pid, :pop) == :world
  assert GenServer.stop(pid) == :ok
end

When to Use

Triggers:

  • Testing any GenServer, Agent, or stateful process
  • You want tests that survive internal refactoring
  • The process has a well-defined public API

Example — before:

test "push adds to stack" do
  {:ok, pid} = Stack.start_link([:hello])
  GenServer.cast(pid, {:push, :world})
  # Peeking at internal state — couples test to implementation
  assert :sys.get_state(pid) == [:world, :hello]
end

Example — after:

test "push adds to stack" do
  {:ok, pid} = Stack.start_link([:hello])
  GenServer.cast(pid, {:push, :world})
  assert GenServer.call(pid, :pop) == :world
  assert GenServer.call(pid, :pop) == :hello
end

When NOT to Use

Don't use this when:

  • You're debugging a specific state corruption bug (temporarily peek, then fix)
  • The process has no public query API and adding one just for tests would bloat the interface
  • You're testing internal state transitions explicitly (state machine verification)

Over-application example:

# Avoiding state inspection when there's no query API
test "handles concurrent updates" do
  cast(pid, {:increment, 5})
  cast(pid, {:increment, 3})
  # No way to observe the result without :sys.get_state or adding a query call
  Process.sleep(100)
  # ...can't assert anything useful
end

Better alternative:

# Add a minimal query function to the public API
test "handles concurrent updates" do
  cast(pid, {:increment, 5})
  cast(pid, {:increment, 3})
  assert call(pid, :get_count) == 8
end

Why: The principle is "test through the API," not "never observe state." If your process lacks observability, add a read function — that's a better public API, not just a test convenience.
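
A minimal sketch of such a read function, assuming a hypothetical Counter server written only for this illustration. Because messages from one sender to one receiver are delivered in order, the synchronous call is handled after both casts, so no sleep is needed:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule Counter do
  # Hypothetical server used only for this illustration.
  use GenServer

  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid, by), do: GenServer.cast(pid, {:increment, by})
  def count(pid), do: GenServer.call(pid, :count)

  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast({:increment, by}, count), do: {:noreply, count + by}

  @impl true
  def handle_call(:count, _from, count), do: {:reply, count, count}
end

defmodule CounterTest do
  use ExUnit.Case, async: true

  test "casts are observable through the query call" do
    pid = start_supervised!({Counter, 0})
    Counter.increment(pid, 5)
    Counter.increment(pid, 3)
    # The call queues behind both casts from this process.
    assert Counter.count(pid) == 8
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```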


9. catch_exit for Testing Process Failures

Source: lib/ex_unit/lib/ex_unit/assertions.ex#L950, lib/elixir/test/elixir/gen_server_test.exs#L118

What it does: Asserts on exits raised in the calling process (for example by a timed-out GenServer.call) with catch_exit, or observes exits of linked processes with Process.flag(:trap_exit, true) + assert_receive {:EXIT, ...}.

Why: Testing error conditions in OTP requires intercepting exit signals. The two approaches serve different needs.

Pattern:

# catch_exit for synchronous exit testing
test "call/3 exit messages" do
  assert catch_exit(GenServer.call(pid, :noreply, 1)) ==
           {:timeout, {GenServer, :call, [pid, :noreply, 1]}}
end

# trap_exit for linked process exits
test "exits on task error" do
  Process.flag(:trap_exit, true)
  task = Task.async(fn -> raise "oops" end)
  assert {{%RuntimeError{}, _}, {Task, :await, [^task, 5000]}} = catch_exit(Task.await(task))
end

When to Use

Triggers:

  • Testing that a GenServer call times out or exits with a specific reason
  • Testing that linked processes propagate exit signals correctly
  • Verifying OTP shutdown behavior and exit reasons

Example — before:

test "times out" do
  # Just assert it raises... but exit != raise
  assert_raise RuntimeError, fn ->
    GenServer.call(pid, :slow, 1)
  end
  # This fails! Exits aren't raises.
end

Example — after:

test "times out" do
  assert catch_exit(GenServer.call(pid, :slow, 1)) ==
    {:timeout, {GenServer, :call, [pid, :slow, 1]}}
end

When NOT to Use

Don't use this when:

  • The function raises an exception (use assert_raise instead)
  • You can test the error through the public API's return value {:error, reason}
  • The exit is a side effect you don't care about (just test the observable behavior)

Over-application example:

# Using catch_exit when the API already returns error tuples
test "handles missing key" do
  assert catch_exit(GenServer.call(pid, {:get, :missing}))
  # The server actually returns {:error, :not_found} — no exit!
end

Better alternative:

test "handles missing key" do
  assert GenServer.call(pid, {:get, :missing}) == {:error, :not_found}
end

Why: catch_exit is specifically for exits raised in the calling process, with reasons such as :timeout, :noproc, :normal, or {:shutdown, reason}. If the function returns error tuples rather than exiting, assert on the return value directly.
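
To make the exit/raise distinction concrete, here is a small sketch that needs no custom server. It uses an unlinked Agent that is stopped before the call; the :ping request is arbitrary, since the call never reaches a live process:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule ExitVsRaiseTest do
  use ExUnit.Case, async: true

  test "calling a stopped server exits with :noproc" do
    # Agent.start/1 (not start_link) keeps the process unlinked from the test.
    {:ok, pid} = Agent.start(fn -> 0 end)
    :ok = Agent.stop(pid)

    assert {:noproc, {GenServer, :call, _args}} =
             catch_exit(GenServer.call(pid, :ping))
  end

  test "a raise is an exception, not an exit" do
    assert_raise ArgumentError, fn -> String.to_integer("not a number") end
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```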


10. @tag capture_log: true for Suppressing Expected Log Output

Source: lib/elixir/test/elixir/gen_server_test.exs#L114, lib/elixir/test/elixir/task_test.exs#L10

What it does: Captures log output during the test, only printing it if the test fails.

Why: Tests that intentionally trigger error conditions produce noisy log output. Capturing keeps the test output clean while preserving diagnostics on failure.

Pattern:

# Per-test tag
@tag capture_log: true
test "call/3 exit messages" do
  # This test triggers error logs — they're captured
end

# Module-level for all tests
@moduletag :capture_log

When to Use

Triggers:

  • Tests intentionally trigger error paths that log warnings/errors
  • Test output is noisy with expected error messages
  • You want clean mix test output but need logs preserved for debugging failures

Example — before:

test "handles crash" do
  # Passes, but test output shows:
  # [error] GenServer #PID<0.123.0> terminating
  # [error] ** (RuntimeError) intentional crash
  crash_worker(pid)
  assert_receive {:restarted, _}
end

Example — after:

@tag capture_log: true
test "handles crash" do
  crash_worker(pid)
  assert_receive {:restarted, _}
  # Logs captured — only shown if this test FAILS
end

When NOT to Use

Don't use this when:

  • You want to assert on log content (use capture_log/2 function instead)
  • The logs indicate a real problem you should fix, not expected behavior
  • You're hiding logs to mask flaky tests

Over-application example:

@moduletag :capture_log  # Blanket capture on entire module

test "creates user" do
  # This test logs "[warning] duplicate email check" — is that expected?
  # By capturing everything, you might miss real warnings
  assert {:ok, _} = create_user(attrs)
end

Better alternative:

# Only capture on tests that INTENTIONALLY trigger errors
test "creates user" do
  assert {:ok, _} = create_user(attrs)
  # If there's a warning, investigate it — don't hide it
end

@tag capture_log: true
test "rejects duplicate email" do
  create_user(attrs)
  assert {:error, :duplicate} = create_user(attrs)
  # Expected warning about duplicate — captured
end

Why: Blanket log capture hides signals. Apply capture_log surgically to tests where you expect error output. Unexpected logs in other tests might reveal bugs.


11. capture_log / capture_io for Content Assertions

Source: lib/ex_unit/lib/ex_unit/capture_log.ex#L1, lib/elixir/test/elixir/task_test.exs#L1138

What it does: Captures log/IO output and returns it as a string for assertion.

Why: Tests that the right messages are logged/printed without relying on side effects.

Pattern:

# capture_log for asserting log content
test "logs a terminated task" do
  assert ExUnit.CaptureLog.capture_log(fn ->
           ref = Process.monitor(pid)
           send(pid, :go)
           receive do: ({:DOWN, ^ref, _, _, _} -> :ok)
         end) =~ ~r/Task .* terminating/
end

# with_io returns both result and output (since v1.13)
{result, output} = with_io(fn ->
  IO.puts("a")
  2 + 2
end)
assert result == 4
assert output == "a\n"

Important for async tests: Use =~ instead of == for :stderr captures because output from other tests may interleave.
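
For standard IO, capture_io can also feed input to the function under test. A minimal sketch, assuming a hypothetical greet/0 prompt defined inside the test module:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule GreeterTest do
  use ExUnit.Case, async: true
  import ExUnit.CaptureIO

  # greet/0 is a hypothetical CLI prompt used only for this sketch.
  defp greet do
    name = IO.gets("Name: ") |> String.trim()
    IO.puts("Hello, #{name}!")
  end

  test "prompts for a name and greets" do
    # The first argument is supplied as standard input to the function.
    output = capture_io("Ada\n", fn -> greet() end)

    assert output =~ "Name: "
    assert output =~ "Hello, Ada!"
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```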

When to Use

Triggers:

  • You need to verify specific log messages are emitted
  • Testing CLI output or formatted display
  • Verifying error messages are user-friendly

Example — before:

test "logs warning on retry" do
  # Just call it and... hope the log is right?
  retry_request(url)
  # No assertion on the log content
end

Example — after:

test "logs warning on retry" do
  log = capture_log(fn -> retry_request(url) end)
  assert log =~ "retrying request"
  assert log =~ url
end

When NOT to Use

Don't use this when:

  • You just want to suppress noisy logs (use @tag capture_log: true)
  • The log message is an implementation detail that shouldn't be part of the contract
  • Testing the behavior is sufficient — the log is incidental

Over-application example:

test "creates user" do
  log = capture_log(fn ->
    assert {:ok, user} = create_user(attrs)
  end)
  assert log =~ "INSERT INTO users"  # Testing SQL logs?! Fragile.
end

Better alternative:

test "creates user" do
  assert {:ok, user} = create_user(attrs)
  assert user.name == "Alice"
  # SQL logs are an implementation detail — don't assert on them
end

Why: Assert on logs that are part of the contract (user-facing warnings, audit trail). Don't assert on incidental logs that change with implementation — they make tests brittle.


12. describe Blocks for Logical Grouping

Source: lib/elixir/test/elixir/task_test.exs:218,272,365, lib/elixir/test/elixir/process_test.exs#L146

What it does: Groups related tests under a named describe block. Setup inside describe only applies to that group.

Why: Organizes tests by function/feature. Makes test output readable. Allows scoped @describetag and scoped setup.

Pattern:

describe "await/2" do
  test "exits on timeout" do
    task = %Task{ref: make_ref(), owner: self(), pid: nil, mfa: {__MODULE__, :test, 1}}
    assert catch_exit(Task.await(task, 0)) == {:timeout, {Task, :await, [task, 0]}}
  end

  test "exits on normal exit" do
    task = Task.async(fn -> exit(:normal) end)
    assert catch_exit(Task.await(task)) == {:normal, {Task, :await, [task, 5000]}}
  end
end

Constraint: Describe blocks cannot be nested. setup_all cannot appear inside describe.
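
The scoping rules can be checked directly from the test context. In this sketch the feature tag and the stack key are hypothetical names; the point is that neither is visible to a test outside the describe block:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule DescribeScopeTest do
  use ExUnit.Case, async: true

  describe "with a stack" do
    @describetag feature: :stack

    setup do
      %{stack: [1, 2, 3]}
    end

    test "setup and tag apply inside the block", context do
      assert context.stack == [1, 2, 3]
      assert context.feature == :stack
      assert context.describe == "with a stack"
    end
  end

  test "neither applies outside the block", context do
    refute Map.has_key?(context, :stack)
    refute Map.has_key?(context, :feature)
    assert context.describe == nil
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```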

When to Use

Triggers:

  • A module tests multiple public functions
  • Tests share setup that's specific to one function/feature
  • You want test output organized by feature (e.g., "describe await/2 — exits on timeout")

Example — before:

test "push adds element" do ... end
test "push returns :ok" do ... end
test "pop returns top element" do ... end
test "pop from empty raises" do ... end
# Flat list — hard to see which tests cover which function

Example — after:

describe "push/2" do
  test "adds element to the top" do ... end
  test "returns :ok" do ... end
end

describe "pop/1" do
  test "returns top element" do ... end
  test "raises on empty stack" do ... end
end

When NOT to Use

Don't use this when:

  • The module tests one function (describe adds an unnecessary nesting level)
  • You'd have a describe block with a single test in it
  • You need nested grouping (describe can't nest — use separate modules)

Over-application example:

describe "parse/1" do
  test "parses input" do
    assert parse("hello") == {:ok, "hello"}
  end
end
# One test in a describe — just use a standalone test

Better alternative:

test "parse/1 parses input" do
  assert parse("hello") == {:ok, "hello"}
end

Why: describe provides structure when there are multiple tests per feature. A single test in a describe is over-organization — the describe name adds visual noise without grouping benefit.


13. ExUnit.CaseTemplate for Shared Test Infrastructure

Source: lib/mix/test/test_helper.exs#L79, lib/logger/test/test_helper.exs#L24

What it does: Defines reusable test case templates with shared setup, helpers, and imports.

Why: Eliminates duplication across test modules. Provides domain-specific test DSLs.

Pattern:

# In test_helper.exs
defmodule Logger.Case do
  use ExUnit.CaseTemplate

  using _ do
    quote do
      import Logger.Case
    end
  end

  setup do
    on_exit(fn ->
      # Shared cleanup for all tests using this template
    end)
    :ok
  end

  def capture_log(level \\ :debug, fun) do
    Logger.configure(level: level)
    capture_io(:user, fn ->
      fun.()
      Logger.flush()
    end)
  after
    Logger.configure(level: :debug)
  end
end

# In test file:
defmodule LoggerTest do
  use Logger.Case
  # Gets all imports and setup from the template
end

When to Use

Triggers:

  • Multiple test modules share the same setup/teardown logic
  • You're building a test DSL (e.g., DataCase for database tests in Phoenix)
  • Shared helpers need to be imported into every test module of a certain type

Example — before:

# Repeated in every test module that uses the database
defmodule UsersTest do
  use ExUnit.Case
  setup do
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(Repo)
    Ecto.Adapters.SQL.Sandbox.mode(Repo, {:shared, self()})
  end
end

defmodule PostsTest do
  use ExUnit.Case
  setup do
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(Repo)  # Same thing again
    Ecto.Adapters.SQL.Sandbox.mode(Repo, {:shared, self()})
  end
end

Example — after:

defmodule MyApp.DataCase do
  use ExUnit.CaseTemplate

  setup do
    :ok = Ecto.Adapters.SQL.Sandbox.checkout(Repo)
    Ecto.Adapters.SQL.Sandbox.mode(Repo, {:shared, self()})
  end
end

defmodule UsersTest do
  use MyApp.DataCase  # All DB setup handled
end

When NOT to Use

Don't use this when:

  • Only one or two test modules share the setup (just use named setup functions)
  • The template would have many options making it hard to understand what's actually set up
  • You're hiding important test context behind abstraction

Over-application example:

defmodule MyApp.SuperCase do
  use ExUnit.CaseTemplate
  using do
    quote do
      import MyApp.Factory
      import MyApp.Assertions
      import MyApp.Helpers
      alias MyApp.{Repo, User, Post, Comment, Tag}
      # 20 more aliases...
    end
  end
  # Every test gets everything whether it needs it or not
end

Better alternative:

# Focused templates for specific test types
defmodule MyApp.DataCase do ... end    # Database tests
defmodule MyApp.ConnCase do ... end    # HTTP tests
defmodule MyApp.ChannelCase do ... end # WebSocket tests

Why: A "god template" that imports everything creates implicit dependencies and makes it impossible to know what a test actually needs. Use focused templates that match distinct test categories.
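
A focused template can stay very small. The sketch below assumes a hypothetical CounterCase that contributes one setup step and one imported helper, and nothing else:

```elixir
# Standalone script scaffolding; in a Mix project test_helper.exs starts ExUnit.
ExUnit.start(autorun: false)

defmodule CounterCase do
  # Hypothetical template: one setup step, one helper.
  use ExUnit.CaseTemplate

  using do
    quote do
      import CounterCase, only: [bump: 1]
    end
  end

  setup do
    %{counter: start_supervised!({Agent, fn -> 0 end})}
  end

  # Increments the counter and returns the new value.
  def bump(counter), do: Agent.get_and_update(counter, &{&1 + 1, &1 + 1})
end

defmodule UsesTemplateTest do
  use CounterCase, async: true

  test "gets setup and helper from the template", %{counter: counter} do
    assert bump(counter) == 1
    assert bump(counter) == 2
  end
end

# Run immediately so the script is self-contained (mix test does this for you).
result = ExUnit.run()
```

Everything a test receives from this template is visible in about a dozen lines, which is the property a "god template" loses.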


14. doctest Integration

Source: lib/ex_unit/lib/ex_unit/doc_test.ex#L1, lib/elixir/test/elixir/agent_test.exs#L9

What it does: Generates tests from the iex> examples found in @doc and @moduledoc.

Why: Documentation examples are always verified. Prevents docs from rotting.

Pattern:

defmodule AgentTest do
  use ExUnit.Case, async: true
  doctest Agent
end

# Selective doctesting:
doctest Kernel, except: [===: 2, !==: 2, and: 2, or: 2]

When to Use

Triggers:

  • Your module has iex> examples in @doc or @moduledoc
  • You want to ensure documentation examples stay correct across refactors
  • The function has simple, deterministic input/output that's easy to show in docs

Example — before:

@doc """
Doubles a number.

    iex> double(5)
    10
"""
def double(n), do: n * 2
# No doctest — this example might rot if you rename the function

Example — after:

# In test file:
defmodule MathTest do
  use ExUnit.Case, async: true
  doctest MyApp.Math
  # Now the iex> example is verified on every test run
end

When NOT to Use

Don't use this when:

  • Examples require complex setup (database, processes, external state)
  • Output is non-deterministic (timestamps, random values, PIDs)
  • The function's behavior is better tested with dedicated unit tests

Over-application example:

@doc """
Creates a user.

    iex> {:ok, user} = create_user(%{name: "Alice"})
    iex> user.id
    1
"""
# User ID is auto-incremented — this breaks on second run!

Better alternative:

@doc """
Creates a user.

    {:ok, user} = MyApp.create_user(%{name: "Alice"})
    user.name
    #=> "Alice"

"""
# Don't use iex> prefix — this is illustrative, not a doctest
# Test the actual behavior in a proper test with DB setup

Why: Doctests are for deterministic, setup-free examples. If the example needs a database or a running process, or produces non-deterministic output, write a proper test: a doctest has no way to receive values from setup through the test context.


15. Process.sleep(:infinity) as a Process Parking Pattern

Source: lib/elixir/test/elixir/task_test.exs#L417, lib/elixir/test/elixir/registry_test.exs#L71

What it does: Spawns processes that block forever, used as test subjects that need to exist until explicitly killed.

Why: Creates stable process references for testing supervision, monitoring, and registry behavior. The process stays alive until the test kills it explicitly or, if it was started with start_supervised, until the test supervisor shuts it down.

Pattern:

# Process exists solely to be registered/monitored
parent = self()

{:ok, task} =
  Task.start(fn ->
    send(parent, Registry.register(registry, key, value))
    Process.sleep(:infinity)
  end)

# Monitor it, then kill it to test cleanup:
ref = Process.monitor(task)
Process.exit(task, :kill)
assert_receive {:DOWN, ^ref, _, _, _}

Important distinction: This is NOT Process.sleep(100) for timing — it's an intentional "park this process" pattern where the process is always explicitly terminated by the test.

When to Use

Triggers:

  • You need a process that exists as a test subject (for monitoring, registry, supervision tests)
  • The process doesn't need to do anything — just exist and be killable
  • Testing cleanup/shutdown behavior when a process dies

Example — before:

test "monitors processes" do
  pid = spawn(fn ->
    receive do: (:stop -> :ok)  # Must send :stop to clean up
  end)
  ref = Process.monitor(pid)
  send(pid, :stop)
  assert_receive {:DOWN, ^ref, _, _, :normal}
end

Example — after:

test "monitors processes" do
  pid = spawn(fn -> Process.sleep(:infinity) end)
  ref = Process.monitor(pid)
  Process.exit(pid, :kill)
  assert_receive {:DOWN, ^ref, _, _, :killed}
end

When NOT to Use

Don't use this when:

  • You need the process to actually do something (send messages, handle calls)
  • You're using it as a timing mechanism (Process.sleep(100) for "wait a bit")
  • The process should be managed by start_supervised (which handles cleanup)

Over-application example:

test "worker processes requests" do
  pid = spawn(fn ->
    Process.sleep(:infinity)  # But the test needs it to handle messages!
  end)
  send(pid, {:process, data})
  assert_receive {:result, _}  # Never arrives — process is sleeping!
end

Better alternative:

test "worker processes requests" do
  # data and transform/1 are assumed to come from the surrounding test module
  parent = self()

  pid = spawn(fn ->
    receive do
      {:process, data} -> send(parent, {:result, transform(data)})
    end
  end)
  send(pid, {:process, data})
  assert_receive {:result, _}
end

Why: Process.sleep(:infinity) is for inert test subjects — processes that exist to be observed, not to perform work. If the test needs the process to respond, it needs a receive loop, not a sleep.
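
When the parked process should be cleaned up automatically, it can be handed to the test supervisor instead of being spawned by hand. The following is a minimal sketch, assuming the default Task child spec (whose id is Task); the module name ParkedProcessTest is hypothetical.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule ParkedProcessTest do
  use ExUnit.Case, async: true

  test "a parked task lives until its supervisor stops it" do
    # Task.child_spec/1 lets the test supervisor own the parked process.
    pid = start_supervised!({Task, fn -> Process.sleep(:infinity) end})
    ref = Process.monitor(pid)
    assert Process.alive?(pid)

    # Stopping the child terminates the process; no manual kill is needed.
    :ok = stop_supervised(Task)
    assert_receive {:DOWN, ^ref, :process, ^pid, _reason}
  end
end
```

Even without the explicit stop_supervised call, the test supervisor would terminate the task when the test finishes.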


16. Helper Functions for Test-Specific Behavior

Source: lib/elixir/test/elixir/task_test.exs#L12, lib/elixir/test/elixir/supervisor_test.exs#L278

What it does: Defines small helper functions (usually private) within the test module for common test operations.

Why: Keeps tests DRY without over-abstracting. Helpers like wait_until_down, assert_kill, create_dummy_task encapsulate recurring patterns.

Pattern:

defmodule TaskTest do
  use ExUnit.Case

  # Helper to create a known-state task for testing edge cases
  defp create_dummy_task(reason) do
    {pid, ref} = spawn_monitor(Kernel, :exit, [reason])
    receive do
      {:DOWN, ^ref, _, _, _} ->
        %Task{ref: ref, pid: pid, owner: self(), mfa: {__MODULE__, :create_dummy_task, 1}}
    end
  end

  # Helper that properly waits for process termination
  def wait_until_down(task) do
    ref = Process.monitor(task.pid)
    assert_receive {:DOWN, ^ref, _, _, _}
  end

  # Helper for asserting process kill
  defp assert_kill(pid, reason) do
    ref = Process.monitor(pid)
    Process.exit(pid, reason)
    assert_receive {:DOWN, ^ref, _, _, _}
  end
end

When to Use

Triggers:

  • The same 3-5 line pattern repeats across multiple tests in the module
  • The helper name makes tests more readable by expressing intent
  • Setup logic is complex enough to obscure the test's actual assertion

Example — before:

test "cleans up on crash 1" do
  ref = Process.monitor(pid1)
  Process.exit(pid1, :kill)
  assert_receive {:DOWN, ^ref, _, _, _}
  assert Registry.lookup(reg, :key) == []
end

test "cleans up on crash 2" do
  ref = Process.monitor(pid2)
  Process.exit(pid2, :kill)
  assert_receive {:DOWN, ^ref, _, _, _}  # Same 3 lines again
  assert Registry.lookup(reg, :key) == []
end

Example — after:

defp kill_and_wait(pid) do
  ref = Process.monitor(pid)
  Process.exit(pid, :kill)
  assert_receive {:DOWN, ^ref, _, _, _}
end

test "cleans up on crash 1" do
  kill_and_wait(pid1)
  assert Registry.lookup(reg, :key) == []
end

When NOT to Use

Don't use this when:

  • The "helper" is used only once (inline it)
  • The helper hides important test details that readers need to see
  • Over-abstraction makes tests harder to understand in isolation

Over-application example:

defp setup_and_assert(input, expected) do
  {:ok, pid} = start_supervised({Worker, input})
  result = Worker.process(pid)
  assert result == expected
end

test "processes integers", do: setup_and_assert(42, {:ok, 42})
test "processes strings", do: setup_and_assert("hi", {:ok, "hi"})
# Tests are now opaque — can't see what's actually being tested

Better alternative:

test "processes integers" do
  pid = start_supervised!({Worker, 42})
  assert Worker.process(pid) == {:ok, 42}
end

test "processes strings" do
  pid = start_supervised!({Worker, "hi"})
  assert Worker.process(pid) == {:ok, "hi"}
end

Why: Test helpers should extract mechanics (kill-and-wait, setup-server), not logic (the actual behavior under test). If a helper contains assertions, readers can't understand the test without reading the helper.
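
One way to keep assertions visible is to have the helper return a value and let the test assert on it. The sketch below assumes a hypothetical helper named kill_and_await_reason/2; it performs only the mechanics and hands the exit reason back to the caller.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule KillAndWaitTest do
  use ExUnit.Case, async: true

  # Mechanics only: kill the process and return its exit reason.
  defp kill_and_await_reason(pid, timeout \\ 1_000) do
    ref = Process.monitor(pid)
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, reason} -> reason
    after
      timeout -> flunk("process #{inspect(pid)} did not exit within #{timeout}ms")
    end
  end

  test "a parked process reports :killed" do
    pid = spawn(fn -> Process.sleep(:infinity) end)
    # The behavior under test stays in the test body.
    assert kill_and_await_reason(pid) == :killed
  end
end
```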


17. @tag :tmp_dir for Filesystem Tests

Source: lib/ex_unit/lib/ex_unit/case.ex#L281, lib/elixir/test/elixir/path_test.exs#L12

What it does: ExUnit automatically creates a unique temporary directory and passes its path via the test context.

Why: Filesystem tests need isolation. Each test gets its own directory; if a directory with that path already exists, it is removed and recreated so the test starts from a clean slate.

Pattern:

@tag :tmp_dir
test "writes files", %{tmp_dir: tmp_dir} do
  path = Path.join(tmp_dir, "test.txt")
  File.write!(path, "hello")
  assert File.read!(path) == "hello"
end

When to Use

Triggers:

  • Tests create, read, or modify files
  • Multiple tests would conflict if they used the same paths
  • You want filesystem tests to run with async: true

Example — before:

test "writes config file" do
  path = "/tmp/test_config.json"
  File.write!(path, "{}")
  # Another async test writes to the same path — race condition!
  assert File.read!(path) == "{}"
end

Example — after:

@tag :tmp_dir
test "writes config file", %{tmp_dir: dir} do
  path = Path.join(dir, "config.json")
  File.write!(path, "{}")
  assert File.read!(path) == "{}"
  # Unique directory per test — safe for async
end

When NOT to Use

Don't use this when:

  • Tests only read files (no isolation needed for read-only access)
  • You're testing in-memory operations that happen to involve path strings
  • The filesystem interaction is mocked/stubbed

Over-application example:

@tag :tmp_dir
test "parses path components", %{tmp_dir: _dir} do
  # Doesn't actually touch the filesystem!
  assert Path.basename("/foo/bar/baz.txt") == "baz.txt"
end

Better alternative:

test "parses path components" do
  assert Path.basename("/foo/bar/baz.txt") == "baz.txt"
end

Why: @tag :tmp_dir creates actual directories on disk — it's overhead. Only use it when the test genuinely needs filesystem isolation. Pure path manipulation doesn't touch the filesystem.
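
When every test in a module touches the filesystem, the tag can be set once for the whole module. This is a small sketch with a hypothetical module name, ConfigWriterTest; the second test illustrates that directories are not shared between tests.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule ConfigWriterTest do
  use ExUnit.Case, async: true

  # Every test in this module receives its own :tmp_dir in the context.
  @moduletag :tmp_dir

  test "writes then reads a file", %{tmp_dir: dir} do
    path = Path.join(dir, "config.json")
    File.write!(path, "{}")
    assert File.read!(path) == "{}"
  end

  test "starts with an empty directory", %{tmp_dir: dir} do
    # Files written by the other test live in a different directory.
    assert File.ls!(dir) == []
  end
end
```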


18. assert_raise with Message Matching

Source: lib/ex_unit/lib/ex_unit/assertions.ex#L815

What it does: Asserts both the exception type AND the message content (string or regex).

Why: Verifying the exception type alone is insufficient — the message tells users what went wrong. Testing it ensures error UX.

Pattern:

# Regex match on the stable prefix of the message
assert_raise ArgumentError, ~r"expected :name option to be one of the following:", fn ->
  GenServer.start_link(Stack, [:hello], name: "my_gen_server_name")
end

# Regex for dynamic content
assert_raise RuntimeError, ~r/^today's lucky number is 0\.\d+!$/, fn ->
  raise "today's lucky number is #{:rand.uniform()}!"
end

When to Use

Triggers:

  • Testing error messages that users will see
  • Validating that errors provide actionable information
  • The error type alone isn't specific enough (many places raise ArgumentError)

Example — before:

test "rejects bad input" do
  assert_raise ArgumentError, fn ->
    parse("not_a_number")
  end
  # Passes even if the message is wrong or unhelpful
end

Example — after:

test "rejects bad input" do
  assert_raise ArgumentError, ~r/cannot parse .* as integer/, fn ->
    parse("not_a_number")
  end
  # Verifies the error message is helpful
end

When NOT to Use

Don't use this when:

  • The exact message text is an implementation detail that changes often
  • You're testing a third-party library's error messages (they might change)
  • The exception type alone is sufficient to distinguish the error case

Over-application example:

test "file not found" do
  assert_raise File.Error, "could not read file \"/no/such/file\": no such file or directory", fn ->
    File.read!("/no/such/file")
  end
  # Exact message match breaks across OS versions / locales
end

Better alternative:

test "file not found" do
  assert_raise File.Error, fn ->
    File.read!("/no/such/file")
  end
  # Or with regex for the stable part:
  assert_raise File.Error, ~r/could not read file/, fn ->
    File.read!("/no/such/file")
  end
end

Why: Match the stable, meaningful part of error messages. OS-specific paths, locale-dependent strings, and implementation details make exact matches brittle across environments.
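
The sketch below shows both styles against a hypothetical IntParser.parse!/1 (not part of any library): a regex on the stable wording, and an assertion on the exception that assert_raise returns.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule IntParser do
  # Hypothetical parser used only to have something that raises.
  def parse!(string) do
    case Integer.parse(string) do
      {int, ""} -> int
      _ -> raise ArgumentError, "cannot parse #{inspect(string)} as integer"
    end
  end
end

defmodule IntParserTest do
  use ExUnit.Case, async: true

  test "matches the stable part of the message" do
    assert_raise ArgumentError, ~r/cannot parse .* as integer/, fn ->
      IntParser.parse!("not_a_number")
    end
  end

  test "inspects the returned exception" do
    # assert_raise returns the rescued exception for further checks.
    error = assert_raise ArgumentError, fn -> IntParser.parse!("12abc") end
    assert error.message =~ "12abc"
  end
end
```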


19. @moduletag / @describetag for Cross-Cutting Configuration

Source: lib/elixir/test/elixir/system_test.exs#L104 and #L163, lib/elixir/test/elixir/task_test.exs#L10

What it does: Sets tags that apply to all tests in a module or describe block, used for filtering and configuration.

Why: Enables running subsets of tests (mix test --include unix) and applying configuration (like :capture_log) without repeating it on every test.

Pattern:

defmodule SystemTest do
  use ExUnit.Case, async: true

  describe "Windows" do
    @describetag :windows
    # All tests here tagged :windows
  end

  describe "Unix" do
    @describetag :unix
    # All tests here tagged :unix
  end
end

When to Use

Triggers:

  • Tests only run on certain platforms (OS, architecture)
  • You want to run a subset of tests via --include / --exclude
  • A whole module or describe block shares configuration (capture_log, tmp_dir)

Example — before:

# Manually guarding each test body
test "symlinks" do
  if match?({:unix, _}, :os.type()) do
    # ...
  end
  # On other platforms the test silently passes without checking anything
end

test "permissions" do
  if match?({:unix, _}, :os.type()) do
    # ...
  end
end

Example — after:

describe "Unix filesystem" do
  @describetag :unix

  test "symlinks" do ... end
  test "permissions" do ... end
end

# In test_helper.exs:
ExUnit.configure(exclude: [:unix], include: [])
# Run with: mix test --include unix

When NOT to Use

Don't use this when:

  • The tag applies to a single test (use @tag instead)
  • You're using tags as a substitute for proper test organization
  • The tag doesn't enable filtering or configuration — it's just metadata no one reads

Over-application example:

@moduletag :unit
@moduletag :fast
@moduletag :users
@moduletag :important
# Tags no one filters on — just noise in the module header

Better alternative:

# Only tag what you actually filter on
@moduletag :capture_log  # Used by ExUnit
# If you never run `--include unit`, don't tag it

Why: Tags have cost — they clutter the module header and create the illusion of organization. Only add tags that drive behavior (ExUnit configuration) or that you actually filter on in CI/development.
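
Platform tags become useful once test_helper.exs excludes the ones that cannot run on the current machine. A minimal sketch, assuming the suite uses only the :unix and :windows tags from the pattern above:

```elixir
# test_helper.exs (sketch): exclude the tag for the platform we are NOT on.
exclude =
  case :os.type() do
    {:unix, _} -> [:windows]
    {:win32, _} -> [:unix]
  end

ExUnit.start(exclude: exclude)
```

With this in place no --include flag is needed; tests tagged for the other platform are reported as excluded.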


20. Context Pattern Matching in Test Signatures

Source: lib/ex_unit/lib/ex_unit/case.ex#L57, lib/elixir/test/elixir/gen_server_test.exs#L166

What it does: Destructures the test context directly in the test function signature.

Why: Makes dependencies explicit. You see exactly what each test needs from setup.

Pattern:

test "abcast/3", %{test: name} do
  {:ok, _} = GenServer.start_link(Stack, [], name: name)
  assert GenServer.abcast(name, {:push, :hello}) == :abcast
end

# Using test name for unique naming — prevents collision in async tests

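The %{test: name} pattern is ubiquitous: the test name is unique within a module, which makes it a convenient name for registered processes in async tests. Two tests with the same name in different modules share the same atom, so combine it with the module name when a name must be unique across the whole suite.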

When to Use

Triggers:

  • Your test needs values from setup (server pids, registry names, tmp dirs)
  • You want to make test dependencies visible in the test signature
  • Using %{test: name} for unique process registration in async tests

Example — before:

setup do
  {:ok, pid} = start_supervised(MyServer)
  %{server: pid, name: "test_user"}
end

test "queries server", context do
  # Which keys does this test rely on? You must read the body (and setup) to find out.
  assert MyServer.get(context.server, context.name) == :ok
end

Example — after:

setup do
  {:ok, pid} = start_supervised(MyServer)
  %{server: pid, name: "test_user"}
end

test "queries server", %{server: pid, name: name} do
  assert MyServer.get(pid, name) == :ok
end

When NOT to Use

Don't use this when:

  • The test doesn't use any setup context
  • You're destructuring the entire context when you only need one field
  • The test is standalone and self-contained

Over-application example:

test "basic math", %{test: _test} do
  # Destructuring context for a test that doesn't need it
  assert 2 + 2 == 4
end

Better alternative:

test "basic math" do
  assert 2 + 2 == 4
end

Why: Context destructuring signals "this test depends on external setup." If the test is self-contained, the pattern match is misleading — readers will look for setup that doesn't exist or isn't needed.
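
Context matching also works in setup itself. The sketch below uses a hypothetical NamedAgentTest module and derives a registered name from the :module and :test context keys, so each test gets its own named Agent.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule NamedAgentTest do
  use ExUnit.Case, async: true

  setup %{module: module, test: test} do
    # Module plus test name avoids clashes with same-named tests elsewhere.
    name = Module.concat(module, test)
    spec = %{id: name, start: {Agent, :start_link, [fn -> 0 end, [name: name]]}}
    start_supervised!(spec)
    %{agent: name}
  end

  test "increments", %{agent: agent} do
    :ok = Agent.update(agent, &(&1 + 1))
    assert Agent.get(agent, & &1) == 1
  end

  test "starts at zero", %{agent: agent} do
    # A separate Agent: the increment above is not visible here.
    assert Agent.get(agent, & &1) == 0
  end
end
```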

Decision Tree

21. Filtering Events in Async Tests

Source: Gargoyle PR #710 (flaky telemetry test fix)

What it does: Pin-matches on a unique identifier to filter events from concurrent tests.

Why: When testing telemetry, pub/sub, or any broadcast mechanism with async: true, events from other tests can leak into your mailbox. Without filtering, tests pass in isolation but fail randomly when run in parallel.

Pattern:

Wrong — receives events from other tests:

test "emits telemetry on create", %{user: user} do
  :telemetry.attach("test", [:user, :created], &send_to_test/4, self())
  create_post(user)
  assert_receive {:telemetry, %{user_id: uid}}
  assert uid == user.id  # Might match event from another test!
end

Right — pin filters to only your test's events:

test "emits telemetry on create", %{user: user} do
  test_pid = self()
  expected_uid = user.id

  :telemetry.attach("test-#{inspect(test_pid)}", [:user, :created], fn _, _, meta, pid ->
    if meta.user_id == expected_uid, do: send(pid, {:telemetry, meta})
  end, test_pid)

  create_post(user)
  assert_receive {:telemetry, %{user_id: ^expected_uid}}
end

Key Insight

The fix happens in two places:

  1. Filter at the source — only send messages that match your test's unique identifier
  2. Pin in the assertion — use ^variable to reject mismatches that slip through
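
Both parts can be tried without a telemetry dependency. In the sketch below, the hypothetical forward_if_mine/3 plays the role of the handler body, and plain send/2 calls simulate a shared bus delivering events to the test process.

```elixir
# Only needed when running this file directly; test_helper.exs normally does it.
ExUnit.start()

defmodule FilteredEventTest do
  use ExUnit.Case, async: true

  # Part 1: forward an event only when it carries this test's identifier.
  defp forward_if_mine(event, expected_id, pid) do
    if event.request_id == expected_id, do: send(pid, {:event, event})
    :ok
  end

  test "source filter drops foreign events" do
    my_id = make_ref()

    forward_if_mine(%{request_id: make_ref()}, my_id, self())
    refute_received {:event, _}

    forward_if_mine(%{request_id: my_id}, my_id, self())
    assert_received {:event, %{request_id: ^my_id}}
  end

  test "pin skips a foreign event that reached the mailbox" do
    my_id = make_ref()
    other_id = make_ref()
    send(self(), {:event, %{request_id: other_id}})
    send(self(), {:event, %{request_id: my_id}})

    # Part 2: the pin matches only our event, even though it arrived second.
    assert_receive {:event, %{request_id: ^my_id}}
    # The foreign event was skipped, not consumed.
    assert_received {:event, %{request_id: ^other_id}}
  end
end
```

With a real telemetry handler it is also worth detaching it when the test ends (for example by calling :telemetry.detach/1 with the same handler id inside on_exit), so handlers do not accumulate across tests.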

When to Use

Triggers:

  • Telemetry handlers (filter by user_id, request_id, or test pid)
  • Phoenix.PubSub subscriptions (use unique topic per test)
  • GenStage/Broadway consumers (tag events with test pid)
  • Any shared message bus in async tests
  • Tests pass alone but fail intermittently in the full suite, under particular --seed values, or in CI

When NOT to Use

Don't use this when:

  • Tests run with async: false (no concurrent tests to leak events)
  • You control the event source and can make it test-aware by design
  • The event contains a natural unique key you already have (just pin on it)

Over-application example:

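# A source-side filter is overkill when the message already carries a unique key
test "user update", %{user: user} do
  # user.id is already unique, so pinning it in the assertion is enough.
  # The pin operator only accepts a variable, so bind the id first.
  user_id = user.id
  assert_receive {:updated, %{id: ^user_id}}
end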

Decision Tree Addition