elixir-patterns/sources/elixir-lang-analysis.md
Rodin 74101b513c chore: merge elixir-conventions and oban-conventions into sources/
Absorbed content from rodin/elixir-conventions and rodin/oban-conventions
into a sources/ directory. These are reference material — descriptive,
not prescriptive. Patterns that prove broadly applicable get promoted
into patterns/.

Part of taxonomy cleanup (issue #4):
- Pattern = prescriptive, follow these
- Convention/Source = reference, study for ideas

The original repos can now be archived.
2026-05-07 18:01:42 -07:00


Elixir Language Source: Architectural Conventions

How do José Valim and the Elixir core team build Elixir itself? What does the language source reveal about conventions that aren't documented anywhere else?

Repo: elixir-lang/elixir


1. Repo Shape

| Metric | Value |
|---|---|
| Size | 92M |
| Source files | 567 .ex/.exs |
| Erlang bootstrap | 33 .erl files |
| Commits | 22,032 |
| Contributors | 1,578 |
| Test files | 208 |
| Production files | 248 |
| Test ratio | 1:1.2 |
| TODOs (non-test) | 127 (all version-gated) |

Organizational Philosophy

lib/
├── elixir/    # The language core (compiler + stdlib)
│   ├── src/   # 33 Erlang files (bootstrap)
│   └── lib/   # Elixir stdlib + compiler
├── eex/       # Templating (independent OTP app)
├── ex_unit/   # Testing framework (independent OTP app)
├── iex/       # Interactive shell (independent OTP app)
├── logger/    # Logging (independent OTP app)
└── mix/       # Build tool (independent OTP app)

Each component is a separate OTP application and could, in principle, be released independently. This is Elixir eating its own dog food: the umbrella layout that Phoenix projects can opt into (mix phx.new --umbrella) mirrors how the language itself is organized.


2. What the Codebase Values

By size (what gets the most lines)

| Module | Lines | Role |
|---|---|---|
| Kernel | 7,102 | The implicit language surface |
| Module.Types.Descr | 6,301 | Set-theoretic type descriptions |
| Enum | 5,242 | Collection operations |
| String | 3,263 | First-class string concept |
| Macro | 3,102 | Metaprogramming foundation |
| Exception | 2,720 | Error taxonomy |
| Code.Formatter | 2,605 | Code formatting as library |

The surprise: The type system (types/descr.ex at 6,301 lines) is nearly as large as Kernel (7,102 lines). It's the newest and fastest-growing module — 504 commits, 96% written by José Valim. This is where the investment is going.

By authorship (who shapes the language)

Type system: 396 of 504 commits (about 79%) come from José, 32 from Eric Meadows-Jönsson, and 31 from Guillaume Duboc. This is auteur-driven development: one person holds the architectural vision for the most complex subsystem.


3. The Bootstrap Problem

How does Elixir compile itself?

The answer is 33 Erlang files in lib/elixir/src/, including:

elixir_bootstrap.erl    — minimal Kernel for self-compilation
elixir_compiler.erl     — the compiler entry point
elixir_tokenizer.erl    — lexer (in Erlang for speed)
elixir_expand.erl       — macro expansion
elixir_erl.erl          — Elixir AST → Erlang AST
elixir_erl_pass.erl     — code generation pass
elixir_env.erl          — compilation environment
elixir_clauses.erl      — pattern matching compilation

Convention: The tokenizer and core compiler remain in Erlang permanently. This isn't technical debt; it's a deliberate choice. The tokenizer is performance-sensitive and, like the rest of the compiler front end, it needs to exist before any Elixir code can be compiled.
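The front end described above can be observed from ordinary Elixir code through the public Code module. This is a minimal sketch, assuming that Code.string_to_quoted/1 is the public entry point sitting in front of the Erlang tokenizer and parser; it does not call the elixir_* modules directly.

```elixir
# Tokenize and parse a source string into Elixir AST (quoted form).
{:ok, ast} = Code.string_to_quoted("1 + 2")

# The AST is plain Elixir data: {operator, metadata, arguments}.
{:+, _meta, [1, 2]} = ast

# Macro.to_string/1 renders the AST back into source text.
source = Macro.to_string(ast)
IO.puts(source)
```

The later stages (expansion and translation to Erlang forms) are internal and have no stable public equivalent, so they are left out here.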

Origin: The bootstrap file dates to Nov 22, 2013 (commit 260be7c8e: "Start porting elixir_macros to pure elixir"). Before this, more of the compiler was in Erlang. The trajectory is clear: minimize Erlang over time, but keep it where it provides genuine value.


4. TODO Culture: Version-Gated Deadlines

# TODO: Remove me on v2.0                    — 16 occurrences
# TODO: Deprecate me on Elixir v1.23         — 6 occurrences
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
# TODO: Make an error on Elixir v2.0         — 3 occurrences
# TODO: Deprecate on Elixir v1.22            — 3 occurrences

Convention: Every TODO has a version target. No "someday" TODOs exist. When a version ships, grep for that version's TODOs and resolve them all.

127 total TODOs across 567 files. Contrast with Go's 3,428 TODOs across 11K files — the Elixir team treats TODOs as time-bombs, not documentation.
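The "grep for that version's TODOs" step can be sketched as a small script. This is an illustration only: the TodoAudit module and its by_version/1 function are hypothetical names, and the regex assumes the "vMAJOR.MINOR" spelling seen in the examples above.

```elixir
defmodule TodoAudit do
  # Matches "# TODO: ... vMAJOR.MINOR" and captures the version number.
  defp pattern, do: ~r/#\s*TODO:.*?\bv(\d+\.\d+)/

  @doc "Groups TODO lines by target version; TODOs without one land under :unversioned."
  def by_version(lines) do
    lines
    |> Enum.filter(&String.contains?(&1, "TODO:"))
    |> Enum.group_by(fn line ->
      case Regex.run(pattern(), line) do
        [_full, version] -> version
        nil -> :unversioned
      end
    end)
  end
end

lines = [
  "# TODO: Remove me on v2.0",
  "# TODO: Deprecate me on Elixir v1.23",
  "# TODO: clean this up someday",
  "x = 1"
]

grouped = TodoAudit.by_version(lines)
IO.inspect(grouped)
```

A non-empty :unversioned bucket is the signal that a TODO has slipped past the convention; a release checklist would look up the bucket for the version being shipped.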


5. Unique Patterns

5.1 Protocol Consolidation

Protocols dispatch dynamically at runtime by default (checking each struct's implementation). Protocol consolidation compiles all known implementations into a single dispatch module at build time.

From lib/elixir/lib/protocol.ex:

"Consolidation directly links the protocol to its implementations. Invoking a consolidated protocol is equivalent to invoking two remote functions."

Convention: Mix consolidates protocols by default when it compiles a project. Protocol.consolidated?/1, backed by the generated __protocol__(:consolidated?) function, lets code check at runtime whether fast-path dispatch is active.

When NOT to use: Tests often disable consolidation (consolidate_protocols: false) so new protocol implementations added during tests are discoverable without recompilation.
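A small script makes the dispatch and the runtime check concrete. The Describable protocol below is a hypothetical example, not something from the Elixir source; the result of Protocol.consolidated?/1 depends on how the code was compiled, so no output is assumed for it.

```elixir
defprotocol Describable do
  @doc "Returns a short human-readable description of the value."
  def describe(value)
end

defimpl Describable, for: Integer do
  def describe(n), do: "integer #{n}"
end

defimpl Describable, for: BitString do
  def describe(s), do: "string of #{byte_size(s)} bytes"
end

IO.puts(Describable.describe(42))
IO.puts(Describable.describe("hi"))

# True only when the protocol went through consolidation (for example in a Mix build).
IO.inspect(Protocol.consolidated?(Describable), label: "consolidated?")
```

In a Mix project the switch lives in the project/0 keyword list of mix.exs; a commonly seen setting is consolidate_protocols: Mix.env() != :test.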

5.2 Parallel Type Checker

Module.ParallelChecker (introduced July 2019, PR #9203 by Eric Meadows-Jönsson as "Add ExCk chunk") enables concurrent type checking across modules.

The type system itself (13,034 lines across 7 files in lib/elixir/lib/module/types/) is set-theoretic — types are sets, and operations are set operations (union, intersection, difference).

Key files:

  • descr.ex (6,301 lines) — type descriptions and set operations
  • apply.ex — function application typing
  • expr.ex — expression typing
  • pattern.ex — pattern match typing
  • of.ex — type inference
  • helpers.ex — shared utilities
  • traverse.ex — AST traversal
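The "types are sets" idea can be illustrated with a toy model. This is not how descr.ex represents types; ToyTypes is a hypothetical module that treats a type as a MapSet of base tags, only to show that union, intersection, and difference are enough to express subtyping.

```elixir
defmodule ToyTypes do
  # A type is modelled as the set of base tags it admits.
  def type(tags), do: MapSet.new(tags)

  def union(a, b), do: MapSet.union(a, b)
  def intersection(a, b), do: MapSet.intersection(a, b)
  def difference(a, b), do: MapSet.difference(a, b)

  def empty?(t), do: MapSet.size(t) == 0

  # a is a subtype of b exactly when nothing in a falls outside b.
  def subtype?(a, b), do: empty?(difference(a, b))
end

integer = ToyTypes.type([:integer])
atom = ToyTypes.type([:atom])
int_or_atom = ToyTypes.union(integer, atom)

IO.inspect(ToyTypes.subtype?(integer, int_or_atom))               # true
IO.inspect(ToyTypes.subtype?(int_or_atom, integer))               # false
IO.inspect(ToyTypes.empty?(ToyTypes.intersection(integer, atom))) # true
```

The design point carried over from the real system is that subtyping needs no separate rule table: it reduces to checking whether a set difference is empty.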

5.3 Code.Formatter as Library Function

The code formatter (2,605 lines) is a library function, not a CLI tool. You can call Code.format_string!/2 from any Elixir code.
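Calling the formatter as a library function looks roughly like this. The input string is an arbitrary example; Code.format_string!/2 returns iodata, so it is converted to a binary before printing.

```elixir
messy = "x   =   [1,2,3]"

formatted =
  messy
  |> Code.format_string!()
  |> IO.iodata_to_binary()

IO.puts(formatted)

# Formatting already-formatted code should leave it unchanged.
again = formatted |> Code.format_string!() |> IO.iodata_to_binary()
IO.inspect(again == formatted, label: "idempotent?")
```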

Introduced: Oct 7, 2017 (PR #6639 by José Valim). Zero review comments. Merged in 1 hour. José opened and merged his own formatter with no external review. This is the BDFL model — the language author ships foundational infrastructure by authority.

Convention: The formatter uses Inspect.Algebra (Wadler-Lindig pretty-printing) for layout decisions. It declares the operator groups that need special layout handling as module attributes:

@pipeline_operators [:|>, :~>>, :<<~, :~>, :<~, :<~>, :"<|>"]
@right_new_line_before_binary_operators [:|, :when]
@required_parens_logical_binary_operands [:|||, :||, :or, :&&&, :&&, :and]
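To give a feel for the layout engine the formatter builds on, here is a small Inspect.Algebra sketch. It is independent of Code.Formatter itself: the document below is hand-built for illustration, and the widths 80 and 5 are arbitrary.

```elixir
import Inspect.Algebra

# Breaks render as their string when the group fits on one line,
# and as newlines when it does not.
items = concat([break(""), "1,", break(" "), "2,", break(" "), "3"])
doc = group(concat(["[", nest(items, 2), break(""), "]"]))

wide = doc |> format(80) |> IO.iodata_to_binary()
narrow = doc |> format(5) |> IO.iodata_to_binary()

IO.puts(wide)
IO.puts(narrow)
```

The same document yields different layouts depending only on the available width, which is the property that lets a formatter describe structure once and defer line-breaking decisions.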

5.4 Mix Tasks as Single-File Modules

55 Mix tasks, each in its own file. Convention:

  • One task = one file
  • Module name determines task name: Mix.Tasks.Deps.Clean → deps.clean
  • @shortdoc for brief help, @moduledoc for full docs
  • @recursive true for umbrella traversal
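The conventions in this list can be sketched with a made-up task. Mix.Tasks.Hello.Greet is a hypothetical module used only for illustration; in a real project it would live in its own file under lib/mix/tasks/.

```elixir
defmodule Mix.Tasks.Hello.Greet do
  use Mix.Task

  @shortdoc "Prints a greeting"

  @moduledoc """
  Prints a greeting for the given name.

      $ mix hello.greet World
  """

  @impl Mix.Task
  def run(args) do
    name = List.first(args) || "world"
    IO.puts("Hello, #{name}!")
  end
end

# The task name is derived from the module name.
IO.puts(Mix.Task.task_name(Mix.Tasks.Hello.Greet))
```

A task meant to run in every child of an umbrella project would additionally set @recursive true.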

5.5 ExUnit CaseTemplate (Extension Pattern)

The ExUnit.CaseTemplate is how Elixir's test framework supports extension — you define a module that uses CaseTemplate, and test modules use YourModule to inherit setup callbacks and helpers.

This is the same pattern Phoenix uses for ConnCase and DataCase. It originates from ExUnit itself — the framework demonstrates its own extension point.
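A minimal sketch of the extension pattern, written as a standalone script. MyApp.Case, its fixture/1 helper, and MyApp.SampleTest are hypothetical names; ExUnit is started with autorun disabled so the script controls when the suite runs.

```elixir
ExUnit.start(autorun: false)

defmodule MyApp.Case do
  use ExUnit.CaseTemplate

  # Code injected into every module that does `use MyApp.Case`.
  using do
    quote do
      import MyApp.Case, only: [fixture: 1]
    end
  end

  # Setup shared by every test module built on this template.
  setup do
    {:ok, started_at: System.monotonic_time()}
  end

  def fixture(name), do: %{name: name}
end

defmodule MyApp.SampleTest do
  use MyApp.Case, async: true

  test "inherits setup and helpers", %{started_at: started_at} do
    assert is_integer(started_at)
    assert fixture(:alice) == %{name: :alice}
  end
end

result = ExUnit.run()
IO.inspect(result.failures, label: "failures")
```

The test module never mentions ExUnit.Case directly; the template supplies it along with the shared setup and imports.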

5.6 Logger: Erlang Integration Done Right

PR #9333 (Sep 2019, merged Nov 2019): "Use Erlang's logger as main logging implementation." The Elixir Logger was rewritten to sit on top of Erlang's :logger module rather than reimplementing log dispatch.

Convention: When OTP provides infrastructure, wrap it rather than replace it. The compatibility layer translates Erlang log messages to Elixir format, but dispatch/filtering/handlers are OTP's.
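The layering can be seen from a script that logs through both APIs. This is a sketch, assuming a recent Elixir where Logger.warning/1 is available; no output format is assumed because it depends on the handler configuration.

```elixir
require Logger

# Make sure Elixir's Logger application is running when used outside Mix.
{:ok, _apps} = Application.ensure_all_started(:logger)

Logger.configure(level: :warning)

# Both calls end up in OTP's logging pipeline.
Logger.warning("emitted through Elixir's Logger")
:logger.warning("emitted through Erlang's :logger")

IO.inspect(Logger.level(), label: "Elixir level")
IO.inspect(:logger.get_primary_config(), label: "OTP primary config")
```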


6. PR Discussion Patterns

JSON.Encoder (PR #14021, Dec 2024)

38 review comments, 13 days to merge. Key debate:

sabiwara asked: "What is the reason we went with a different API than Jason?" — questioning why the stdlib JSON module doesn't mirror the dominant community library.

michalmuskala (Jason author): "Once 1.18 is released with the new JSON module, I plan to make a new release of Jason with some small fixes and then effectively deprecate it."

Lesson: When stdlib absorbs community library functionality, the community library author participates in the review. Jason's author blessed the replacement and planned deprecation. This is how healthy ecosystem evolution works.

Duration (PR #13385, Mar-Apr 2024)

75 comments + 116 review comments. One of the most debated PRs in recent Elixir history.

Pattern: Community contributor (@tfiedlerdejanze) opened a PR adding Date.shift/2. José redirected to a broader Duration type. The contributor iterated through multiple designs.

José's key intervention: "I would rather prefer to pass a Duration to Calendar.ISO.shift_date, if we ever have such a type, rather than a keyword list." — refusing a simpler PR because it would lock in a suboptimal API before the full design was clear.

Lesson: The BDFL model means one person can say "this is the wrong abstraction" and redirect weeks of work. The PR took 33 days and several complete rewrites. The result was better because someone held the line on "solve the whole problem, not just the immediate pain."
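The API shape that debate was about can be illustrated with the functions as they exist in released Elixir. This sketch assumes Elixir 1.17 or later, where Date.shift/2 accepts either keyword units or a Duration struct; the dates are arbitrary.

```elixir
# Shifting with keyword units...
with_keywords = Date.shift(~D[2016-01-03], month: 2)

# ...or with an explicit Duration value that can be built once and reused.
duration = Duration.new!(month: 2)
with_duration = Date.shift(~D[2016-01-03], duration)

IO.inspect({with_keywords, with_duration})
```

Having a first-class Duration means the same value can be passed to other shift functions instead of repeating a keyword list at each call site.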

Formatter (PR #6639, Oct 2017)

Zero comments. Merged in 1 hour. 2,605 lines of new code.

Lesson: BDFL-driven projects can ship massive foundational changes with no review. José was both the author and the authority. This is the opposite of CockroachDB's Handle PR (2.5 months, extensive debate). Neither model is wrong — it depends on team structure and trust level.


7. Cross-Ecosystem Comparisons

| Aspect | Elixir | Go |
|---|---|---|
| TODOs | 127, all version-gated | 3,428, all owner-attributed |
| Formatter origin | BDFL ships in 1hr, no review | gofmt shipped with language |
| Bootstrap | Erlang (33 files, permanent) | Assembly + Go (self-hosting since 1.5) |
| Extension | 6 protocols + CaseTemplate | internal/ packages (61 of them) |
| Type system | Set-theoretic, 13K lines, growing | Static, mature, compile-time only |
| Test ratio | 1:1.2 (file per file) | 1:3.3 (package-level tests) |
| Governance | BDFL (José) | Committee (Russ Cox + team) |

8. What This Teaches

  1. BDFL projects can move faster on foundational infrastructure — the formatter, type system, and JSON module all shipped because one person had authority. But Duration took 33 days because community contribution required iteration with the BDFL's vision.

  2. Version-gated TODOs are a superior cleanup strategy for projects with regular release cycles. You never have to decide "is this worth fixing?" — the version bump forces the question.

  3. Keep the minimum viable bootstrap in the host language. 33 Erlang files is the floor, not a ceiling. The trajectory is always toward more Elixir, less Erlang, but the tokenizer stays in Erlang because it sits on the bootstrap path and is performance-sensitive.

  4. The type system's growth rate predicts the language's future. 504 commits, 396 of them from José, and nearly as large as Kernel. Elixir's next 5 years will be defined by gradual typing.

  5. Community library authors should bless stdlib absorption. The Jason → JSON.Encoder transition worked because michalmuskala participated in the review and planned deprecation.

  6. Each OTP app is an independent unit — this convention flows directly into how Phoenix projects are organized. The language teaches its own architectural pattern by example.