docs: language source comparison — Go vs Elixir source conventions

How do the language teams build their own languages?

Key findings:
- Go: 3,428 owned TODOs (permanent, documenting limitations)
- Elixir: 127 version-gated TODOs (time-bombs with deadlines)
- Go: unsafe is 5th most-imported in its own source (1,304 times)
- Elixir: type system is 13K lines; descr.ex alone (6.3K) is nearly as large as Kernel
- Go: 61 internal/ packages (shared-but-not-public pattern)
- Elixir: 1:1.2 test ratio vs Go's 1:3.3
- Both accept generated code, different delivery (checked-in vs compile-time)
Rodin
2026-04-30 12:19:32 -07:00
parent 91b62be330
commit 324c1936f4
@@ -0,0 +1,281 @@
# Language Source Analysis: Go vs Elixir
How do the language teams build their own languages? What
does the source reveal about conventions that users should
follow?
---
## Shape Comparison
| Metric | Go (golang/go) | Elixir (elixir-lang/elixir) |
|--------|---------------|---------------------------|
| Size | 632M | 92M |
| Source files | 11,245 .go | 567 .ex/.exs |
| Commits | 66,142 | 22,032 |
| Contributors | 2,842 | 1,578 |
| Test files | 1,811 | 208 |
| Non-test files | 6,065 | 248 |
| Test ratio | 1:3.3 | 1:1.2 |
| TODOs (non-test) | 3,428 | 127 |
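**Key observation:** Elixir has almost a 1:1 test-to-production
file ratio. Go has roughly 1 test file per 3.3 production files.
Go has 27x more TODOs in absolute terms, but per non-test file
the two are close (about 0.57 for Go vs 0.51 for Elixir). The
difference is in how TODOs are written and retired, not in how
densely they appear.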
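
The derived figures can be recomputed from the raw counts in the table. A minimal sketch, assuming the test-file, non-test-file, and TODO counts above are the inputs; the `repoStats` type and helper names are hypothetical and exist only for this illustration:

```go
package main

import "fmt"

// repoStats holds counts copied from the Shape Comparison table.
type repoStats struct {
	name         string
	testFiles    int
	nonTestFiles int
	todos        int
}

// prodPerTest returns production files per test file (the N in "1:N").
func prodPerTest(s repoStats) float64 {
	return float64(s.nonTestFiles) / float64(s.testFiles)
}

// todosPerFile returns TODOs per non-test file.
func todosPerFile(s repoStats) float64 {
	return float64(s.todos) / float64(s.nonTestFiles)
}

func main() {
	repos := []repoStats{
		{"Go", 1811, 6065, 3428},
		{"Elixir", 208, 248, 127},
	}
	for _, r := range repos {
		fmt.Printf("%-7s test ratio 1:%.1f, TODOs per non-test file %.2f\n",
			r.name, prodPerTest(r), todosPerFile(r))
	}
}
```

Dividing by non-test files is a choice; dividing by all source files gives different absolute densities, so the denominator should be stated with any per-file claim.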
---
## Organizational Philosophy
### Go: Deep internal/ + flat stdlib
```
src/
├── cmd/ # toolchain (compile, go, gofmt, etc.)
├── internal/ # 61 hidden packages (NOT user-visible)
├── io/ # flat stdlib packages
├── fmt/
├── net/
├── runtime/
└── ...
```
**What this reveals:**
- The Go team uses `internal/` extensively (61 packages) to
hide implementation details from users. This is Go's answer
to "how do you share code without committing to an API."
- Stdlib packages are flat — no `pkg/` wrapper, no nesting
beyond one level (with exceptions like `net/http`).
- The compiler alone is 562,727 lines. The largest files are
generated (the SSA opcode table, `opGen.go`, is 97K lines).
### Elixir: Nested libraries as independent apps
```
lib/
├── elixir/ # the language itself
├── eex/ # templating
├── ex_unit/ # testing framework
├── iex/ # interactive shell
├── logger/ # logging
└── mix/ # build tool
```
**What this reveals:**
- Each component is a separate OTP application — they could
theoretically be released independently.
- 55 Mix tasks, each in its own file — one task = one file
is a hard convention.
- The type system (`lib/elixir/lib/module/types/`) is 13,034
lines, the newest and fastest-growing module.
---
## TODO Culture
### Go: Owned TODOs as a permanent layer
```go
// TODO(gri) — 320 occurrences
// TODO(mdempsky) — 198 occurrences
// TODO(adonovan) — 170 occurrences
// TODO(mknyszek) — 98 occurrences
// TODO(rsc) — 96 occurrences
```
**3,428 TODOs** in non-test code. Every TODO has an owner.
The top TODO authors are core team members. These aren't
aspirational — they're load-bearing markers for known
limitations that specific people are expected to address.
**Convention:** `// TODO(username): description`
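
Per-owner counts like the ones above can be produced by matching the `TODO(username)` form. A minimal sketch, assuming a plain regex scan over source text is good enough; the sample input and the `countOwners` helper are hypothetical, not the method used for this analysis:

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// ownedTODO matches the `// TODO(username)` form and captures the username.
var ownedTODO = regexp.MustCompile(`//\s*TODO\(([^)]+)\)`)

// countOwners tallies TODO owners found in src.
func countOwners(src string) map[string]int {
	counts := make(map[string]int)
	for _, m := range ownedTODO.FindAllStringSubmatch(src, -1) {
		counts[m[1]]++
	}
	return counts
}

func main() {
	// Hypothetical sample input, not taken from the Go repository.
	src := `
// TODO(gri): handle the remaining case
// TODO(rsc): revisit after the next release
// TODO(gri): simplify
// TODO: unowned, so not counted
`
	counts := countOwners(src)
	owners := make([]string, 0, len(counts))
	for o := range counts {
		owners = append(owners, o)
	}
	// Sort by descending count, then by name for stable output.
	sort.Slice(owners, func(i, j int) bool {
		if counts[owners[i]] != counts[owners[j]] {
			return counts[owners[i]] > counts[owners[j]]
		}
		return owners[i] < owners[j]
	})
	for _, o := range owners {
		fmt.Printf("TODO(%s): %d\n", o, counts[o])
	}
}
```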
### Elixir: Version-gated TODOs as deprecation roadmap
```elixir
# TODO: Remove me on v2.0 — 16 occurrences
# TODO: Deprecate me on Elixir v1.23 — 6 occurrences
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
```
**127 TODOs** total. Almost all are version-gated — they
explicitly state WHEN the TODO should be resolved. This is
systematic cleanup culture: bump the version, grep for that
version's TODOs, resolve them all.
**Convention:** `# TODO: Action on version X.Y`
**The lesson:** Go accepts permanent TODOs as documentation of
known limitations. Elixir treats TODOs as time-bombs with
deadlines. Both are disciplined — just different philosophies.
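
The "bump the version, grep for that version's TODOs" workflow can be sketched as a scan that groups TODO comments by the version they name. This is an illustration written in Go for consistency with the other examples; the `todosByVersion` helper and the regex are assumptions, not tooling from the Elixir repository:

```go
package main

import (
	"fmt"
	"regexp"
)

// versionTODO captures the first vX.Y version named in a `# TODO:` comment.
// `.` does not match newlines, so each match stays within one line.
var versionTODO = regexp.MustCompile(`#\s*TODO:.*?\bv(\d+\.\d+)`)

// todosByVersion counts version-gated TODOs in src, keyed by version.
func todosByVersion(src string) map[string]int {
	counts := make(map[string]int)
	for _, m := range versionTODO.FindAllStringSubmatch(src, -1) {
		counts[m[1]]++
	}
	return counts
}

func main() {
	// Sample comments in the style quoted above.
	src := `
# TODO: Remove me on v2.0
# TODO: Deprecate me on Elixir v1.23
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
# TODO: no version here, so not counted
`
	counts := todosByVersion(src)
	fmt.Println("v2.0:", counts["2.0"], "v1.23:", counts["1.23"])
}
```

A TODO with no version falls out of the map entirely, which is the point of the convention: anything that cannot be scheduled stands out.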
---
## What Each Language Values (Import Hierarchy)
### Go's foundation
| Package | Imports | Role |
|---------|---------|------|
| `fmt` | 2,031 | Formatting everywhere |
| `testing` | 1,658 | Tests are first-class |
| `strings` | 1,454 | String manipulation |
| `os` | 1,306 | System interaction |
| `unsafe` | 1,304 | Low-level access (surprising frequency) |
| `runtime` | 970 | Runtime introspection |
| `io` | 924 | Stream abstraction |
**Surprising:** `unsafe` is the 5th most-imported package in
Go's own source. The language that preaches safety uses unsafe
extensively in its own implementation. This is the same pattern
as Prometheus's global vars — the authors know the rules and
know where to break them safely.
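
One way to produce import counts like these is to parse only the import section of each file and tally the paths. A minimal sketch using `go/parser` with `parser.ImportsOnly`; the `countImports` helper and the inline sources are hypothetical stand-ins for files read from disk, and this is not necessarily how the figures above were gathered:

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// countImports parses each source string (imports only) and tallies import paths.
func countImports(sources ...string) (map[string]int, error) {
	counts := make(map[string]int)
	fset := token.NewFileSet()
	for _, src := range sources {
		f, err := parser.ParseFile(fset, "", src, parser.ImportsOnly)
		if err != nil {
			return nil, err
		}
		for _, imp := range f.Imports {
			// Path.Value is the quoted literal, e.g. "\"unsafe\"".
			path, err := strconv.Unquote(imp.Path.Value)
			if err != nil {
				return nil, err
			}
			counts[path]++
		}
	}
	return counts, nil
}

func main() {
	// Hypothetical file contents.
	a := "package a\nimport (\n\t\"fmt\"\n\t\"unsafe\"\n)\n"
	b := "package b\nimport \"unsafe\"\n"
	counts, err := countImports(a, b)
	if err != nil {
		panic(err)
	}
	fmt.Println("unsafe:", counts["unsafe"], "fmt:", counts["fmt"])
}
```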
### Elixir's foundation
The Elixir source doesn't use `alias`/`import` heavily: any
compiled module can be called by its full name without an
import, so import counts say little. The key modules by size
tell the story:
| Module | Lines | Role |
|--------|-------|------|
| `Kernel` | 7,102 | The implicit language |
| `types/descr.ex` | 6,301 | Type descriptions (set-theoretic) |
| `Enum` | 5,242 | Collection operations |
| `String` | 3,263 | String as first-class concept |
| `Macro` | 3,102 | Metaprogramming foundation |
| `Exception` | 2,720 | Error taxonomy |
**Surprising:** The type system's description module (6,301
lines) is nearly as large as Kernel itself (7,102 lines). This
is the newest addition and already the second-largest entry in
the table, which shows where the team's investment is going.
---
## Interface/Protocol Philosophy
### Go: Composition of single-method interfaces
```go
type Reader interface { Read(p []byte) (n int, err error) }
type Writer interface { Write(p []byte) (n int, err error) }
type Closer interface { Close() error }
type ReadWriter interface { Reader; Writer }
type ReadCloser interface { Reader; Closer }
type ReadWriteCloser interface { Reader; Writer; Closer }
```
15 interfaces in `io/io.go` alone, with the combined ones
composed from 4 primitives (`Reader`, `Writer`, `Closer`,
`Seeker`). This is Go's philosophy: small interfaces composed
into larger ones. The source practices what the documentation
preaches.
### Elixir: 6 protocols + 24 behaviours
Core protocols (the extensibility points):
- `Enumerable` — collection contract
- `Collectable` — inverse of Enumerable
- `Inspect` — debug representation
- `String.Chars` — string conversion
- `List.Chars` — charlist conversion
- `JSON.Encoder` — JSON serialization (newest)
**Only 6 protocols** in the stdlib. Elixir is conservative
about adding extension points. Compare to Go's dozens of
interfaces — Elixir prefers fewer, more powerful abstractions.
24 files define `@callback` — these are behaviours (Go's
interface equivalent for OTP patterns). Used for GenServer,
Supervisor, Application, etc.
---
## Unique Infrastructure
### Go: internal/ as API firewall
61 `internal/` packages implement things users need but
shouldn't depend on:
- `internal/singleflight` — dedup concurrent calls (too
specialized for stdlib, too useful to not have)
- `internal/godebug` — runtime feature flags via `$GODEBUG`
- `internal/goexperiment` — compile-time experiment flags
- `internal/poll` — OS-level I/O polling (used by net, os)
- `internal/cpu` — CPU feature detection
**The pattern:** Code that's shared between stdlib packages but
isn't a public API lives in `internal/`. This is Go's answer to
the "shared utility" problem that other languages solve with
package-private visibility.
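
The rule the `go` command enforces is that a path containing an `internal` element may only be imported by code inside the tree rooted at the parent of that `internal` directory. A simplified sketch of that check on import paths; `canImport` is a hypothetical helper, the `example.com` paths are made up, and treating "no dot in the first path element" as "standard library" is an assumption of this sketch, not how the toolchain decides:

```go
package main

import (
	"fmt"
	"strings"
)

// canImport reports whether package path importer may import target under a
// simplified reading of the internal/ rule.
func canImport(importer, target string) bool {
	elems := strings.Split(target, "/")
	last := -1
	for i, e := range elems {
		if e == "internal" {
			last = i
		}
	}
	if last == -1 {
		return true // no internal element: unrestricted
	}
	if last == 0 {
		// Top-level internal/ (as in the standard library). Assumption:
		// only paths without a dot in the first element count as stdlib.
		first, _, _ := strings.Cut(importer, "/")
		return !strings.Contains(first, ".")
	}
	// The importer must be the parent of internal/ or live beneath it.
	root := strings.Join(elems[:last], "/")
	return importer == root || strings.HasPrefix(importer, root+"/")
}

func main() {
	fmt.Println(canImport("net", "internal/poll"))
	fmt.Println(canImport("example.com/app", "internal/poll"))
	fmt.Println(canImport("example.com/lib/sub", "example.com/lib/internal/cache"))
	fmt.Println(canImport("example.com/other", "example.com/lib/internal/cache"))
}
```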
### Elixir: The compiler as a library
The Elixir compiler is structured as a library you could
theoretically call:
- `Module.Types` — the type checker
- `Module.ParallelChecker` — concurrent type checking
- `Code.Formatter` — the formatter is a library function
Mix tasks as single-file modules (55 of them) enforce one
responsibility per task. The build tool's extension point is
"write a module that uses `Mix.Task`."
---
## Generated Code
### Go: Heavy generation, clearly marked
The compiler's SSA opcode table and rewrite rules are generated:
- `opGen.go` — 97,135 lines
- `rewriteAMD64.go` — 79,703 lines
- `rewritegeneric.go` — 38,337 lines
Convention: Generated files carry a header comment of the form
`// Code generated ... DO NOT EDIT.` near the top of the file.
Go's tooling (`go generate`) is designed around this pattern,
so the source is explicit about what's human-written vs
machine-written.
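
The marker is machine-checkable, which is what lets tools skip or flag generated files. A minimal sketch using the line pattern documented for `go generate`; the `isGenerated` helper is hypothetical and simplifies the rule by stopping at the `package` clause, and the sample headers are made up:

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedHeader is the line pattern documented for generated Go files.
var generatedHeader = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src has a generated-code marker line
// before the package clause (a simplification of the real rule).
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedHeader.MatchString(line) {
			return true
		}
		if strings.HasPrefix(line, "package ") {
			return false
		}
	}
	return false
}

func main() {
	// Hypothetical file headers.
	gen := "// Code generated by example-gen. DO NOT EDIT.\n\npackage ssa\n"
	hand := "// Package ssa is written by hand.\npackage ssa\n"
	fmt.Println(isGenerated(gen), isGenerated(hand))
}
```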
### Elixir: Minimal generation
No significant checked-in generated code. The Elixir source is
almost entirely human-written. The compilation model (AST
macros) means code generation happens at compile time rather
than as checked-in artifacts.
---
## Lessons for Convention Extraction
### What the language source teaches that stdlib docs don't:
1. **Go's `unsafe` usage in its own source** — the safety rules
are for users, not for the runtime team. 1,304 unsafe imports
in stdlib code. Know the rules so you know where they don't
apply.
2. **Elixir's TODO discipline is version-gated** — not "clean
up someday" but "remove on v2.0." This is why
`elixir-patterns` documents zero-TODO culture as achievable.
3. **Go accepts permanent TODOs** — 3,428 of them, owned by
specific people. This isn't sloppiness; it's documentation
of known limitations. The Go team would rather have an honest
TODO than a half-baked fix.
4. **Elixir's test ratio (1:1.2) vs Go's (1:3.3)** — Elixir's
smaller, more focused files mean each one has a direct test
counterpart. Go's larger files and package-level tests mean
more production code per test file.
5. **Both use generated code** — but Go checks it in (97K line
files) while Elixir generates at compile time. Neither is
wrong; it reflects the language's compilation model.
6. **`internal/` is Go's most distinctive structural pattern**:
61 packages that solve "shared but not public." The import
restriction is enforced by the `go` tool itself rather than by
naming convention, which few other ecosystems do at the
directory level.
<!-- PATTERN_COMPLETE -->