docs: language source comparison — Go vs Elixir source conventions
How do the language teams build their own languages?

Key findings:
- Go: 3,428 owned TODOs (permanent, documenting limitations)
- Elixir: 127 version-gated TODOs (time-bombs with deadlines)
- Go: unsafe is 5th most-imported in its own source (1,304 times)
- Elixir: type system is 13K lines; its descr.ex alone (6.3K) is nearly as large as Kernel
- Go: 61 internal/ packages (shared-but-not-public pattern)
- Elixir: 1:1.2 test ratio vs Go's 1:3.3
- Both accept generated code, different delivery (checked-in vs compile-time)
@@ -0,0 +1,281 @@
# Language Source Analysis: Go vs Elixir

How do the language teams build their own languages? What
does the source reveal about conventions that users should
follow?

---

## Shape Comparison

| Metric | Go (golang/go) | Elixir (elixir-lang/elixir) |
|--------|---------------|---------------------------|
| Size | 632M | 92M |
| Source files | 11,245 .go | 567 .ex/.exs |
| Commits | 66,142 | 22,032 |
| Contributors | 2,842 | 1,578 |
| Test files | 1,811 | 208 |
| Non-test files | 6,065 | 248 |
| Test ratio | 1:3.3 | 1:1.2 |
| TODOs (non-test) | 3,428 | 127 |
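
**Key observation:** Elixir has almost a 1:1 test-to-production
file ratio. Go has roughly 1 test file per 3.3 production files.
Go also has 27x as many TODOs in absolute terms (3,428 vs 127),
but by the table's own counts the density per non-test file is
similar: about 0.57 for Go and 0.51 for Elixir. The difference
lies in what kind of TODO each team keeps, not in how many.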
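
The ratios in this section can be recomputed from the table. The sketch below does only that arithmetic; the `repoStats` type and its field names are illustrative and not part of either repository.

```go
package main

import "fmt"

// repoStats holds the counts reported in the Shape Comparison table.
// The type and field names are hypothetical, chosen for this sketch.
type repoStats struct {
	name      string
	testFiles int
	prodFiles int // non-test files
	todos     int // TODOs in non-test code
}

// prodPerTest returns how many non-test files exist per test file.
func prodPerTest(s repoStats) float64 {
	return float64(s.prodFiles) / float64(s.testFiles)
}

// todosPerProdFile returns the TODO density across non-test files.
func todosPerProdFile(s repoStats) float64 {
	return float64(s.todos) / float64(s.prodFiles)
}

func main() {
	repos := []repoStats{
		{"Go", 1811, 6065, 3428},
		{"Elixir", 208, 248, 127},
	}
	for _, s := range repos {
		fmt.Printf("%-6s 1:%.1f test ratio, %.2f TODOs per non-test file\n",
			s.name, prodPerTest(s), todosPerProdFile(s))
	}
}
```

Running it reproduces the 1:3.3 and 1:1.2 test ratios from the table and gives the per-file TODO density for each repository.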

---

## Organizational Philosophy

### Go: Deep internal/ + flat stdlib

```
src/
├── cmd/           # toolchain (compile, go, gofmt, etc.)
├── internal/      # 61 hidden packages (NOT user-visible)
├── io/            # flat stdlib packages
├── fmt/
├── net/
├── runtime/
└── ...
```

**What this reveals:**
- The Go team uses `internal/` extensively (61 packages) to
  hide implementation details from users. This is Go's answer
  to "how do you share code without committing to an API."
- Stdlib packages are flat: no `pkg/` wrapper, no nesting
  beyond one level (with exceptions like `net/http`).
- The compiler alone is 562,727 lines. Its largest files are
  generated SSA code; the biggest is 97K lines.

### Elixir: Nested libraries as independent apps

```
lib/
├── elixir/        # the language itself
├── eex/           # templating
├── ex_unit/       # testing framework
├── iex/           # interactive shell
├── logger/        # logging
└── mix/           # build tool
```

**What this reveals:**
- Each component is a separate OTP application, so they could
  theoretically be released independently.
- 55 Mix tasks, each in its own file. One task per file is a
  hard convention.
- The type system (`lib/elixir/lib/module/types/`) is 13,034
  lines, the newest and fastest-growing part of the codebase.

---

## TODO Culture

### Go: Owned TODOs as a permanent layer

```go
// Most frequent owner tags, with occurrence counts:
// TODO(gri)       320
// TODO(mdempsky)  198
// TODO(adonovan)  170
// TODO(mknyszek)   98
// TODO(rsc)        96
```

**3,428 TODOs** in non-test code. By convention each TODO names
an owner, and the top TODO authors are core team members: the
five most frequent owners account for 882 of them. These aren't
aspirational; they're load-bearing markers for known
limitations that specific people are expected to address.

**Convention:** `// TODO(username): description`

### Elixir: Version-gated TODOs as deprecation roadmap

```elixir
# Most frequent forms, with occurrence counts where measured:
# TODO: Remove me on v2.0                (16 occurrences)
# TODO: Deprecate me on Elixir v1.23     (6 occurrences)
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
```

**127 TODOs** total. Almost all are version-gated: they
explicitly state when the TODO should be resolved. This is a
systematic cleanup culture: bump the version, grep for that
version's TODOs, resolve them all.

**Convention:** `# TODO: Action on version X.Y`

**The lesson:** Go accepts permanent TODOs as documentation of
known limitations. Elixir treats TODOs as time-bombs with
deadlines. Both are disciplined; they just follow different
philosophies.
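
To make the two conventions concrete, here is a minimal sketch of how a script might tell them apart. The regular expressions and the `classify` helper are assumptions made for illustration; they are not the tooling used to produce the counts above, and the sample comment text is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// ownedTODO matches the Go-style convention: TODO(username).
var ownedTODO = regexp.MustCompile(`TODO\(([A-Za-z0-9_-]+)\)`)

// gatedTODO matches the Elixir-style convention: a TODO that names a
// target version such as v2.0 or v1.23.
var gatedTODO = regexp.MustCompile(`TODO:.*\bv(\d+\.\d+)`)

// classify reports the style of a comment line and the owner or
// version it names. Lines matching neither pattern are "other".
func classify(line string) (kind, detail string) {
	if m := ownedTODO.FindStringSubmatch(line); m != nil {
		return "owned", m[1]
	}
	if m := gatedTODO.FindStringSubmatch(line); m != nil {
		return "version-gated", m[1]
	}
	return "other", ""
}

func main() {
	lines := []string{
		"// TODO(gri): hypothetical description",
		"# TODO: Remove me on v2.0",
		"# TODO: Deprecate me on Elixir v1.23",
		"// TODO: revisit someday",
	}
	for _, l := range lines {
		kind, detail := classify(l)
		fmt.Printf("%-14s %-5s %s\n", kind, detail, l)
	}
}
```

Grouping the "version-gated" results by version is what would make the "bump the version, grep, resolve" workflow mechanical.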

---

## What Each Language Values (Import Hierarchy)

### Go's foundation

| Package | Imports | Role |
|---------|---------|------|
| `fmt` | 2,031 | Formatting everywhere |
| `testing` | 1,658 | Tests are first-class |
| `strings` | 1,454 | String manipulation |
| `os` | 1,306 | System interaction |
| `unsafe` | 1,304 | Low-level access (surprising frequency) |
| `runtime` | 970 | Runtime introspection |
| `io` | 924 | Stream abstraction |

**Surprising:** `unsafe` is the 5th most-imported package in
Go's own source. A language that emphasizes memory safety
relies on `unsafe` extensively in its own implementation. This
is the same pattern as Prometheus's global vars: the authors
know the rules and know where to break them safely.
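
An import tally like the one in the table can be produced with the standard `go/parser` package. The sketch below is one possible approach, not the script used for this analysis; the `countImports` helper and the two in-memory source files are hypothetical stand-ins for walking a real tree.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// countImports parses each source text (imports only) and tallies how
// many files import each package path.
func countImports(sources map[string]string) (map[string]int, error) {
	counts := make(map[string]int)
	fset := token.NewFileSet()
	for name, src := range sources {
		f, err := parser.ParseFile(fset, name, src, parser.ImportsOnly)
		if err != nil {
			return nil, err
		}
		for _, spec := range f.Imports {
			// Path.Value is the quoted literal, e.g. "\"fmt\"".
			path, err := strconv.Unquote(spec.Path.Value)
			if err != nil {
				return nil, err
			}
			counts[path]++
		}
	}
	return counts, nil
}

func main() {
	// Hypothetical inputs standing in for files read from disk.
	sources := map[string]string{
		"a.go": "package a\n\nimport (\n\t\"fmt\"\n\t\"unsafe\"\n)\n",
		"b.go": "package b\n\nimport \"fmt\"\n",
	}
	counts, err := countImports(sources)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	paths := make([]string, 0, len(counts))
	for p := range counts {
		paths = append(paths, p)
	}
	// Most-imported first, ties broken alphabetically.
	sort.Slice(paths, func(i, j int) bool {
		if counts[paths[i]] != counts[paths[j]] {
			return counts[paths[i]] > counts[paths[j]]
		}
		return paths[i] < paths[j]
	})
	for _, p := range paths {
		fmt.Printf("%-8s %d\n", p, counts[p])
	}
}
```

Parsing with `parser.ImportsOnly` stops after the import declarations, which keeps a scan over thousands of files cheap.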

### Elixir's foundation

The Elixir source doesn't use `alias`/`import` heavily. Modules
are called by their full names without any import, and `Kernel`
is imported into every module automatically. The key modules by
size tell the story:

| Module | Lines | Role |
|--------|-------|------|
| `Kernel` | 7,102 | The implicit language |
| `types/descr.ex` | 6,301 | Type descriptions (set-theoretic) |
| `Enum` | 5,242 | Collection operations |
| `String` | 3,263 | String as first-class concept |
| `Macro` | 3,102 | Metaprogramming foundation |
| `Exception` | 2,720 | Error taxonomy |

**Surprising:** The type system's description module (6,301
lines) is nearly as large as Kernel itself (7,102 lines). It is
the newest addition and already the second-largest file in the
table, which shows where the team's investment is going.

---

## Interface/Protocol Philosophy

### Go: Composition of single-method interfaces

```go
// Declarations as they appear in package io (doc comments omitted).
type Reader interface { Read(p []byte) (n int, err error) }
type Writer interface { Write(p []byte) (n int, err error) }
type Closer interface { Close() error }
type ReadWriter interface { Reader; Writer }
type ReadCloser interface { Reader; Closer }
type ReadWriteCloser interface { Reader; Writer; Closer }
```

15 interfaces in `io/io.go` alone, with the combined ones
(`ReadWriter`, `ReadCloser`, and so on) composed from 4
primitives: `Reader`, `Writer`, `Closer`, and `Seeker`. This IS
Go's philosophy: small interfaces composed into larger ones.
The source practices what the documentation preaches.

### Elixir: 6 protocols + 24 behaviours

Core protocols (the extensibility points):
- `Enumerable`: collection contract
- `Collectable`: inverse of Enumerable
- `Inspect`: debug representation
- `String.Chars`: string conversion
- `List.Chars`: charlist conversion
- `JSON.Encoder`: JSON serialization (newest)

**Only 6 protocols** in the stdlib. Elixir is conservative
about adding extension points. Compared with Go's dozens of
interfaces, Elixir prefers fewer, more powerful abstractions.

24 files define `@callback`. These are behaviours, the closest
equivalent to Go interfaces for OTP patterns. They are used for
GenServer, Supervisor, Application, etc.

---

## Unique Infrastructure

### Go: internal/ as API firewall

61 `internal/` packages implement functionality that is widely
needed but that users shouldn't depend on:
- `internal/singleflight`: dedup concurrent calls (too
  specialized for stdlib, too useful to not have)
- `internal/godebug`: runtime feature flags via `$GODEBUG`
- `internal/goexperiment`: compile-time experiment flags
- `internal/poll`: OS-level I/O polling (used by net, os)
- `internal/cpu`: CPU feature detection

**The pattern:** Code that's shared between stdlib packages but
isn't a public API lives in `internal/`. This is Go's answer to
the "shared utility" problem that other languages solve with
package-private visibility.
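
The rule the `go` command enforces is short: a package under an `internal` directory can be imported only by code inside the tree rooted at that directory's parent. The sketch below models that check on directory paths. `canImport` and the example paths are hypothetical, and the real toolchain handles details (such as vendoring) that this version leaves out.

```go
package main

import (
	"fmt"
	"strings"
)

// canImport reports whether code in importerDir may import the package
// in importedDir. A path containing an "internal" element is importable
// only from the tree rooted at the parent of that "internal" directory.
func canImport(importerDir, importedDir string) bool {
	elems := strings.Split(importedDir, "/")

	// Use the deepest "internal" element: its parent is the narrowest
	// root, so passing this check implies passing any outer one.
	last := -1
	for i, e := range elems {
		if e == "internal" {
			last = i
		}
	}
	if last < 0 {
		return true // no internal element: a public package
	}

	root := strings.Join(elems[:last], "/")
	return importerDir == root || strings.HasPrefix(importerDir, root+"/")
}

func main() {
	// Hypothetical directory layout for illustration.
	checks := [][2]string{
		{"/go/src/net", "/go/src/internal/poll"},
		{"/home/u/app", "/go/src/internal/poll"},
		{"/go/src/net", "/go/src/cmd/internal/obj"},
		{"/home/u/app", "/go/src/fmt"},
	}
	for _, c := range checks {
		fmt.Printf("%-12s -> %-26s allowed=%v\n", c[0], c[1], canImport(c[0], c[1]))
	}
}
```

The same rule applies inside user modules, which is why the pattern is available to any project and not only to the standard library.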

### Elixir: The compiler as a library

The Elixir compiler is structured as a library you could
theoretically call:
- `Module.Types`: the type checker
- `Module.ParallelChecker`: concurrent type checking
- `Code.Formatter`: the formatter is a library function

Mix tasks as single-file modules (55 of them) enforce one
responsibility per task. The build tool's extension point is
"write a module that uses `Mix.Task`."

---

## Generated Code

### Go: Heavy generation, clearly marked

The compiler's largest SSA files, including the rewrite rules,
are generated:
- `opGen.go`: 97,135 lines
- `rewriteAMD64.go`: 79,703 lines
- `rewritegeneric.go`: 38,337 lines

Convention: Generated files contain a `// Code generated ...
DO NOT EDIT.` header comment. Go's tooling (`go generate`) is
designed around this pattern, so the source is explicit about
what's human-written vs machine-written.
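
Because the header follows a fixed shape, tools can detect generated files mechanically. The sketch below applies the pattern from the `go generate` documentation to the leading comment lines of a file. `isGenerated` is a hypothetical helper, the sample header text is invented, and the sketch ignores `/* */` block comments for brevity.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedRE is the header pattern documented for generated Go files.
var generatedRE = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the generated-code header
// before its first line of code. Only // line comments and blank lines
// are skipped; block comments are treated as code in this sketch.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedRE.MatchString(line) {
			return true
		}
		trimmed := strings.TrimSpace(line)
		if trimmed != "" && !strings.HasPrefix(trimmed, "//") {
			return false // reached code without seeing the header
		}
	}
	return false
}

func main() {
	generated := "// Code generated by example-gen. DO NOT EDIT.\n\npackage p\n"
	handWritten := "// Package p is written by hand.\npackage p\n"

	fmt.Println(isGenerated(generated))
	fmt.Println(isGenerated(handWritten))
}
```

A check like this is how line counts can be split into human-written and machine-written code when measuring a repository.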

### Elixir: Minimal generation

No significant checked-in generated code. The Elixir source is
almost entirely human-written. The compilation model (AST
macros) means code generation happens at compile time rather
than as checked-in artifacts.

---

## Lessons for Convention Extraction

### What the language source teaches that stdlib docs don't:

1. **Go's `unsafe` usage in its own source**: the safety rules
   are for users, not for the runtime team. There are 1,304
   `unsafe` imports in Go's own source. Know the rules so you
   know where they don't apply.

2. **Elixir's TODO discipline is version-gated**: not "clean
   up someday" but "remove on v2.0." This is why
   `elixir-patterns` documents zero-TODO culture as achievable.

3. **Go accepts permanent TODOs**: 3,428 of them, owned by
   specific people. This isn't sloppiness; it's documentation
   of known limitations. The Go team would rather have an honest
   TODO than a half-baked fix.

4. **Elixir's test ratio (1:1.2) vs Go's (1:3.3)**: Elixir's
   smaller, more focused files mean each one has a direct test
   counterpart. Go's larger files and package-level tests mean
   more production code per test file.

5. **Both use generated code**, but Go checks it in (97K line
   files) while Elixir generates at compile time. Neither is
   wrong; it reflects the language's compilation model.

6. **`internal/` is Go's most distinctive structural pattern**:
   61 packages that solve "shared but not public." Few other
   languages build this into the toolchain itself: the `go`
   command rejects imports of an `internal/` package from
   outside its parent tree.

<!-- PATTERN_COMPLETE -->