Absorbed content from rodin/elixir-conventions and rodin/oban-conventions into a sources/ directory. These are reference material: descriptive, not prescriptive. Patterns that prove broadly applicable get promoted into patterns/.

Part of taxonomy cleanup (issue #4):
- Pattern = prescriptive, follow these
- Convention/Source = reference, study for ideas

The original repos can now be archived.
Elixir Language Source: Architectural Conventions
How do José Valim and the Elixir core team build Elixir itself? What does the language source reveal about conventions that aren't documented anywhere else?
Repo: elixir-lang/elixir
1. Repo Shape
| Metric | Value |
|---|---|
| Size | 92M |
| Source files | 567 .ex/.exs |
| Erlang bootstrap | 33 .erl files |
| Commits | 22,032 |
| Contributors | 1,578 |
| Test files | 208 |
| Production files | 248 |
| Test ratio | 1:1.2 |
| TODOs (non-test) | 127 (all version-gated) |
Organizational Philosophy
lib/
├── elixir/ # The language core (compiler + stdlib)
│ ├── src/ # 33 Erlang files (bootstrap)
│ └── lib/ # Elixir stdlib + compiler
├── eex/ # Templating (independent OTP app)
├── ex_unit/ # Testing framework (independent OTP app)
├── iex/ # Interactive shell (independent OTP app)
├── logger/ # Logging (independent OTP app)
└── mix/ # Build tool (independent OTP app)
Each component is a separate OTP application. They could theoretically be released independently. This is Elixir eating its own dog food — the umbrella project convention that Phoenix apps use comes directly from how the language itself is organized.
2. What the Codebase Values
By size (what gets the most lines)
| Module | Lines | Role |
|---|---|---|
| Kernel | 7,102 | The implicit language surface |
| Module.Types.Descr | 6,301 | Set-theoretic type descriptions |
| Enum | 5,242 | Collection operations |
| String | 3,263 | First-class string concept |
| Macro | 3,102 | Metaprogramming foundation |
| Exception | 2,720 | Error taxonomy |
| Code.Formatter | 2,605 | Code formatting as library |
The surprise: The type system (types/descr.ex at 6,301 lines) is
nearly as large as Kernel (7,102 lines). It's the newest and
fastest-growing module — 504 commits, 96% written by José Valim. This
is where the investment is going.
By authorship (who shapes the language)
Type system: 396/504 commits from José, 32 from Eric Meadows-Jönsson, 31 from Guillaume Duboc. This is auteur-driven development — one person holds the architectural vision for the most complex subsystem.
3. The Bootstrap Problem
How does Elixir compile itself?
The answer is 33 Erlang files in lib/elixir/src/:
elixir_bootstrap.erl — minimal Kernel for self-compilation
elixir_compiler.erl — the compiler entry point
elixir_tokenizer.erl — lexer (in Erlang for speed)
elixir_expand.erl — macro expansion
elixir_erl.erl — Elixir AST → Erlang AST
elixir_erl_pass.erl — code generation pass
elixir_env.erl — compilation environment
elixir_clauses.erl — pattern matching compilation
Convention: The tokenizer and core compiler remain in Erlang permanently. This isn't technical debt — it's a deliberate choice. The tokenizer benefits from Erlang's binary pattern matching performance. The compiler needs to exist before Elixir does.
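The front half of this pipeline (source text to Elixir AST) can be observed through public APIs, without touching the internal Erlang modules. The following is a minimal sketch; PipelineSketch is a hypothetical module name, and it only calls Code.string_to_quoted!/1 and Macro.to_string/1, not the elixir_*.erl modules listed above.

```elixir
defmodule PipelineSketch do
  # Parse source text into Elixir AST. Internally this is the job of the
  # tokenizer and parser; here we only use the public entry point.
  def parse(source), do: Code.string_to_quoted!(source)

  # Render AST back to source text, which shows that the AST is plain data
  # (tuples, lists, atoms, literals) that later compiler stages transform.
  def render(ast), do: Macro.to_string(ast)
end
```

Calling PipelineSketch.parse("1 + 2") returns a three-element tuple with the operator, metadata, and arguments. It is this shape that the later stages translate into Erlang's abstract format.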
Origin: The bootstrap file dates to Nov 22, 2013 (commit
260be7c8e: "Start porting elixir_macros to pure elixir"). Before
this, MORE of the compiler was in Erlang. The trajectory is clear:
minimize Erlang over time, but keep it where it provides genuine value.
4. TODO Culture: Version-Gated Deadlines
# TODO: Remove me on v2.0 — 16 occurrences
# TODO: Deprecate me on Elixir v1.23 — 6 occurrences
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
# TODO: Make an error on Elixir v2.0 — 3 occurrences
# TODO: Deprecate on Elixir v1.22 — 3 occurrences
Convention: Every TODO has a version target. No "someday" TODOs exist. When a version ships, grep for that version's TODOs and resolve them all.
127 total TODOs across 567 files. Contrast with Go's 3,428 TODOs across 11K files — the Elixir team treats TODOs as time-bombs, not documentation.
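The "grep for that version's TODOs" step could be sketched as follows. This is an illustration only, assuming the comment shapes quoted above; TodoGate and its functions are hypothetical names, not tooling from the Elixir repository.

```elixir
defmodule TodoGate do
  # Matches comments such as "# TODO: Remove me on v2.0" or
  # "# TODO: Deprecate on Elixir v1.22" and captures the version number.
  defp todo_regex, do: ~r/#\s*TODO:.*?\bv(\d+\.\d+)/

  # Returns {line_number, version, trimmed_line} for each version-gated TODO.
  # TODOs without a version target are ignored.
  def scan(source) do
    source
    |> String.split("\n")
    |> Enum.with_index(1)
    |> Enum.flat_map(fn {line, number} ->
      case Regex.run(todo_regex(), line) do
        [_, version] -> [{number, version, String.trim(line)}]
        nil -> []
      end
    end)
  end

  # Keeps only the TODOs that come due in the given release.
  def due(source, version) do
    Enum.filter(scan(source), fn {_, v, _} -> v == version end)
  end
end
```

Running TodoGate.due/2 over each source file with the version being released gives the cleanup list for that release. A TODO with no version never shows up in it, which is one reason the convention forbids them.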
5. Unique Patterns
5.1 Protocol Consolidation
Protocols dispatch dynamically at runtime by default (checking each struct's implementation). Protocol consolidation compiles all known implementations into a single dispatch module at build time.
From lib/elixir/lib/protocol.ex:
"Consolidation directly links the protocol to its implementations. Invoking a consolidated protocol is equivalent to invoking two remote functions."
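Convention: Mix consolidates protocols by default (the consolidate_protocols project option defaults to true). The
@callback __protocol__(:consolidated?) exists so code can check at
runtime whether fast-path dispatch is active; Protocol.consolidated?/1 is the public wrapper around it.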
When NOT to use: Tests often disable consolidation (consolidate_protocols: false) so new protocol implementations added during tests
are discoverable without recompilation.
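A small sketch of the runtime check described above. Describable and ConsolidationCheck are hypothetical names used for illustration. Protocol.consolidated?/1 is the public API, and its result depends on how the code was built (a protocol defined in a script or in IEx is typically not consolidated).

```elixir
defprotocol Describable do
  @doc "Returns a short human-readable description."
  def describe(value)
end

defimpl Describable, for: Integer do
  def describe(n), do: "integer #{n}"
end

defimpl Describable, for: BitString do
  def describe(s), do: "string of #{byte_size(s)} bytes"
end

defmodule ConsolidationCheck do
  # Reports whether calls to Describable go through a consolidated
  # dispatch module or through dynamic lookup.
  def fast_path?, do: Protocol.consolidated?(Describable)
end
```

Dispatch gives the same results either way; consolidation only changes how the implementation is found.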
5.2 Parallel Type Checker
Module.ParallelChecker (introduced July 2019, PR #9203 by Eric
Meadows-Jönsson as "Add ExCk chunk") enables concurrent type checking
across modules.
The type system itself (13,034 lines across 7 files in
lib/elixir/lib/module/types/) is set-theoretic — types are sets, and
operations are set operations (union, intersection, difference).
Key files:
- descr.ex (6,301 lines): type descriptions and set operations
- apply.ex: function application typing
- expr.ex: expression typing
- pattern.ex: pattern match typing
- of.ex: type inference
- helpers.ex: shared utilities
- traverse.ex: AST traversal
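To make "types are sets" concrete, here is a toy model built on MapSet. It is not how descr.ex represents types, and SetTypesSketch is a hypothetical module. It only mirrors the set algebra (union, intersection, difference) and the standard observation that subtyping reduces to an emptiness check on a difference.

```elixir
defmodule SetTypesSketch do
  # A "type" in this toy model is the set of tags it admits.
  def type(tags), do: MapSet.new(tags)

  def union(a, b), do: MapSet.union(a, b)
  def intersection(a, b), do: MapSet.intersection(a, b)
  def difference(a, b), do: MapSet.difference(a, b)

  # A type with no inhabitants.
  def empty?(a), do: MapSet.size(a) == 0

  # a is a subtype of b when nothing in a falls outside b.
  def subtype?(a, b), do: empty?(difference(a, b))
end
```

With number = type([:integer, :float]) and int = type([:integer]), subtype?(int, number) holds, and difference(number, int) leaves only :float.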
5.3 Code.Formatter as Library Function
The code formatter (2,605 lines) is a library function, not a CLI tool.
You can call Code.format_string!/2 from any Elixir code.
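A minimal sketch of calling the formatter as a library. FormatSketch is a hypothetical wrapper; the only real API used is Code.format_string!/2, which returns iodata rather than a binary.

```elixir
defmodule FormatSketch do
  # Formats a string of Elixir source and returns it as a binary.
  # opts are passed through to Code.format_string!/2 unchanged.
  def format(source, opts \\ []) do
    source
    |> Code.format_string!(opts)
    |> IO.iodata_to_binary()
  end
end
```

For example, FormatSketch.format("1+2") returns "1 + 2".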
Introduced: Oct 7, 2017 (PR #6639 by José Valim). Zero review comments. Merged in 1 hour. José opened and merged his own formatter with no external review. This is the BDFL model — the language author ships foundational infrastructure by authority.
Convention: The formatter uses Inspect.Algebra (Wadler-Lindig
pretty-printing) for layout decisions. It defines all operators and
their associativity as module attributes:
@pipeline_operators [:|>, :~>>, :<<~, :~>, :<~, :<~>, :"<|>"]
@right_new_line_before_binary_operators [:|, :when]
@required_parens_logical_binary_operands [:|||, :||, :or, :&&&, :&&, :and]
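The layout decisions that Inspect.Algebra makes can be seen in miniature below. This sketch follows the example in the Inspect.Algebra documentation and is far simpler than what the formatter builds. AlgebraSketch is a hypothetical module name.

```elixir
defmodule AlgebraSketch do
  import Inspect.Algebra

  # Two documents joined by a break. Inside a group, the break renders as a
  # space when the whole group fits in the width and as a newline when it
  # does not.
  def pair(left, right), do: group(glue(left, " ", right))

  # Lays the document out for the given width and returns a binary.
  def render(doc, width) do
    doc |> format(width) |> IO.iodata_to_binary()
  end
end
```

Rendering pair("hello", "world") at width 30 gives "hello world"; at width 10 the same document renders as "hello\nworld". The formatter describes the possible layouts once and lets the algebra choose one for the available width.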
5.4 Mix Tasks as Single-File Modules
55 Mix tasks, each in its own file. Convention:
- One task = one file
- Module name determines task name: Mix.Tasks.Deps.Clean → deps.clean
- @shortdoc for brief help, @moduledoc for full docs
- @recursive true for umbrella traversal
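Put together, a task following these conventions might look like the sketch below. Mix.Tasks.Hello.Greet is a hypothetical task invented for illustration; in a real project it would live in lib/mix/tasks/hello.greet.ex.

```elixir
defmodule Mix.Tasks.Hello.Greet do
  use Mix.Task

  # Shown next to the task name in `mix help`.
  @shortdoc "Prints a greeting"

  @moduledoc """
  Prints a greeting for the given name.

      mix hello.greet NAME
  """

  # Run the task in every child app when invoked from an umbrella root.
  @recursive true

  @impl Mix.Task
  def run(args) do
    Mix.shell().info(greeting(args))
  end

  # Pure helper so the message can be checked without a Mix shell.
  def greeting([name | _]), do: "Hello, #{name}!"
  def greeting([]), do: "Hello, world!"
end
```

The module name alone makes this available as mix hello.greet. Mix.Task.task_name/1 performs the same module-to-name mapping.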
5.5 ExUnit CaseTemplate (Extension Pattern)
The ExUnit.CaseTemplate is how Elixir's test framework supports
extension — you define a module that uses CaseTemplate, and test
modules use YourModule to inherit setup callbacks and helpers.
This is the same pattern Phoenix uses for ConnCase and DataCase.
It originates from ExUnit itself — the framework demonstrates its own
extension point.
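A minimal sketch of the extension pattern. SketchCase, SketchCase.Helpers, and GreetingTest are hypothetical names. The ExUnit.start(autorun: false) line is an assumption made so the sketch can run as a standalone script; in a Mix project, test_helper.exs would call ExUnit.start/0 instead.

```elixir
# Start ExUnit without scheduling an automatic run at exit.
ExUnit.start(autorun: false)

defmodule SketchCase.Helpers do
  def shout(text), do: String.upcase(text)
end

defmodule SketchCase do
  use ExUnit.CaseTemplate

  # Injected into every module that calls `use SketchCase`.
  using do
    quote do
      import SketchCase.Helpers
    end
  end

  # Runs before each test in those modules and extends the test context.
  setup do
    {:ok, greeting: "hello"}
  end
end

defmodule GreetingTest do
  # Options such as async: true are passed through to ExUnit.Case.
  use SketchCase, async: true

  test "inherits setup context and helpers", %{greeting: greeting} do
    assert shout(greeting) == "HELLO"
  end
end
```

GreetingTest never mentions ExUnit.Case, the helper module, or the setup callback; it gets all three through the template. ConnCase and DataCase are built the same way.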
5.6 Logger: Erlang Integration Done Right
PR #9333 (Sep 2019, merged Nov 2019): "Use Erlang's logger as main
logging implementation." The Elixir Logger was rewritten to sit on top
of Erlang's :logger module rather than reimplementing log dispatch.
Convention: When OTP provides infrastructure, wrap it rather than replace it. The compatibility layer translates Erlang log messages to Elixir format, but dispatch/filtering/handlers are OTP's.
6. PR Discussion Patterns
JSON.Encoder (PR #14021, Dec 2024)
38 review comments, 13 days to merge. Key debate:
sabiwara asked: "What is the reason we went with a different API than Jason?" — questioning why the stdlib JSON module doesn't mirror the dominant community library.
michalmuskala (Jason author): "Once 1.18 is released with the new JSON module, I plan to make a new release of Jason with some small fixes and then effectively deprecate it."
Lesson: When stdlib absorbs community library functionality, the community library author participates in the review. Jason's author blessed the replacement and planned deprecation. This is how healthy ecosystem evolution works.
Duration (PR #13385, Mar-Apr 2024)
75 comments + 116 review comments. The most debated PR in recent Elixir history.
Pattern: Community contributor (@tfiedlerdejanze) opened a PR
adding Date.shift/2. José redirected to a broader Duration type.
The contributor iterated through multiple designs.
José's key intervention: "I would rather prefer to pass a Duration to Calendar.ISO.shift_date, if we ever have such a type, rather than a keyword list." — refusing a simpler PR because it would lock in a suboptimal API before the full design was clear.
Lesson: The BDFL model means one person can say "this is the wrong abstraction" and redirect months of work. The PR took 33 days and several complete rewrites. The result was better because someone held the line on "solve the whole problem, not just the immediate pain."
Formatter (PR #6639, Oct 2017)
Zero comments. Merged in 1 hour. 2,605 lines of new code.
Lesson: BDFL-driven projects can ship massive foundational changes with no review. José was both the author and the authority. This is the opposite of CockroachDB's Handle PR (2.5 months, extensive debate). Neither model is wrong — it depends on team structure and trust level.
7. Cross-Ecosystem Comparisons
| Aspect | Elixir | Go |
|---|---|---|
| TODOs | 127, all version-gated | 3,428, all owner-attributed |
| Formatter origin | BDFL ships in 1hr, no review | gofmt shipped with language |
| Bootstrap | Erlang (33 files, permanent) | Assembly + Go (self-hosting since 1.5) |
| Extension | 6 protocols + CaseTemplate | internal/ packages (61 of them) |
| Type system | Set-theoretic, 13K lines, growing | Static, mature, compile-time only |
| Test ratio | 1:1.2 (file per file) | 1:3.3 (package-level tests) |
| Governance | BDFL (José) | Committee (Russ Cox + team) |
8. What This Teaches
- BDFL projects can move faster on foundational infrastructure: the formatter, type system, and JSON module all shipped because one person had authority. But Duration took 33 days because community contribution required iteration with the BDFL's vision.
- Version-gated TODOs are a superior cleanup strategy for projects with regular release cycles. You never have to decide "is this worth fixing?" because the version bump forces the question.
- Keep the minimum viable bootstrap in the host language. 33 Erlang files is the floor, not a ceiling. The trajectory is always toward more Elixir, less Erlang, but the tokenizer stays in Erlang because binary matching is genuinely faster there.
- The type system's growth rate predicts the language's future. 504 commits, 96% from José, nearly as large as Kernel. Elixir's next 5 years will be defined by gradual typing.
- Community library authors should bless stdlib absorption. The Jason → JSON.Encoder transition worked because michalmuskala participated in the review and planned deprecation.
- Each OTP app is an independent unit, and this convention flows directly into how Phoenix projects are organized. The language teaches its own architectural pattern by example.