# Language Source Analysis: Go vs Elixir

How do the language teams build their own languages? What
does the source reveal about conventions that users should
follow?

---

## Shape Comparison

| Metric | Go (golang/go) | Elixir (elixir-lang/elixir) |
|--------|---------------|---------------------------|
| Size | 632M | 92M |
| Source files | 11,245 .go | 567 .ex/.exs |
| Commits | 66,142 | 22,032 |
| Contributors | 2,842 | 1,578 |
| Test files | 1,811 | 208 |
| Non-test files | 6,065 | 248 |
| Test ratio (test : non-test files) | 1:3.3 | 1:1.2 |
| TODOs (non-test) | 3,428 | 127 |

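The ratio row in the table can be recomputed from the raw counts. The sketch below does that arithmetic and also derives a TODO density per non-test file. The `repoStats` type and both helper names are made up for this illustration, and using non-test files as the denominator is an assumption about what "per file" should mean.

```go
package main

import "fmt"

// repoStats holds the counts reported in the comparison table.
// The type and its field names are hypothetical, for illustration only.
type repoStats struct {
	name         string
	testFiles    int
	nonTestFiles int
	todos        int
}

// filesPerTest returns the number of non-test files per test file.
func filesPerTest(s repoStats) float64 {
	return float64(s.nonTestFiles) / float64(s.testFiles)
}

// todosPerFile returns the number of TODOs per non-test file.
func todosPerFile(s repoStats) float64 {
	return float64(s.todos) / float64(s.nonTestFiles)
}

func main() {
	repos := []repoStats{
		{"Go", 1811, 6065, 3428},
		{"Elixir", 208, 248, 127},
	}
	for _, s := range repos {
		fmt.Printf("%-6s test ratio 1:%.1f, %.2f TODOs per non-test file\n",
			s.name, filesPerTest(s), todosPerFile(s))
	}
}
```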
**Key observation:** Elixir has almost a 1:1 test-to-production
file ratio. Go has roughly 1 test file per 3.3 production files.
Go also carries about 27 times as many TODOs in absolute terms
(3,428 vs 127), although per non-test file the two are close
(about 0.57 vs 0.51). What differs is how the TODOs are
managed, not how densely they occur.

---

## Organizational Philosophy

### Go: Deep internal/ + flat stdlib

```
src/
├── cmd/            # toolchain (compile, go, gofmt, etc.)
├── internal/       # 61 hidden packages (NOT user-visible)
├── io/             # flat stdlib packages
├── fmt/
├── net/
├── runtime/
└── ...
```

**What this reveals:**
- The Go team uses `internal/` extensively (61 packages) to
  hide implementation details from users. This is Go's answer
  to "how do you share code without committing to an API."
- Stdlib packages are flat: no `pkg/` wrapper and no nesting
  beyond one level (with exceptions like `net/http`).
- The compiler alone is 562,727 lines. Its largest files are
  generated SSA sources, the biggest of which runs to 97K
  lines.

### Elixir: Nested libraries as independent apps

```
lib/
├── elixir/         # the language itself
├── eex/            # templating
├── ex_unit/        # testing framework
├── iex/            # interactive shell
├── logger/         # logging
└── mix/            # build tool
```

**What this reveals:**
- Each component is a separate OTP application, so they could
  theoretically be released independently.
- There are 55 Mix tasks, each in its own file. One task per
  file is a consistently followed convention.
- The type system (`lib/elixir/lib/module/types/`) is 13,034
  lines and is the newest and fastest-growing part of the
  codebase.

---

## TODO Culture

### Go: Owned TODOs as a permanent layer

```go
// TODO(gri)        320 occurrences
// TODO(mdempsky)   198 occurrences
// TODO(adonovan)   170 occurrences
// TODO(mknyszek)    98 occurrences
// TODO(rsc)         96 occurrences
```

**3,428 TODOs** in non-test code. By convention a TODO names
its owner, and the five most frequent owners listed above, all
core team members, account for 882 of them. These aren't
aspirational: they're load-bearing markers for known
limitations that specific people are expected to address.

**Convention:** `// TODO(username): description`

### Elixir: Version-gated TODOs as deprecation roadmap

```elixir
# TODO: Remove me on v2.0                   (16 occurrences)
# TODO: Deprecate me on Elixir v1.23        (6 occurrences)
# TODO: Remove this clause on Elixir v2.0 once single-quoted charlists are removed
```

**127 TODOs** total. Almost all are version-gated: they
explicitly state when the TODO should be resolved. This
supports a systematic cleanup routine: bump the version, grep
for that version's TODOs, and resolve them all.

**Convention:** `# TODO: Action on version X.Y`

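The two conventions shown above (owner-tagged and version-gated) are regular enough to tell apart mechanically. The following is a minimal sketch of such a classifier, not the method used to produce the counts in this document. The regular expressions, the `classify` function, and the sample comment text are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// ownedTODO matches the Go-style form: TODO(username)
	ownedTODO = regexp.MustCompile(`\bTODO\(([A-Za-z0-9_.-]+)\)`)
	// gatedTODO matches the Elixir-style form: TODO: ... vX.Y
	gatedTODO = regexp.MustCompile(`\bTODO:.*\bv(\d+\.\d+)`)
	// anyTODO matches a TODO marker with no recognised structure.
	anyTODO = regexp.MustCompile(`\bTODO\b`)
)

// classify reports the kind of TODO on a comment line ("owned", "gated",
// "plain", or "" if there is none) plus the owner or target version.
func classify(line string) (kind, detail string) {
	if m := ownedTODO.FindStringSubmatch(line); m != nil {
		return "owned", m[1]
	}
	if m := gatedTODO.FindStringSubmatch(line); m != nil {
		return "gated", m[1]
	}
	if anyTODO.MatchString(line) {
		return "plain", ""
	}
	return "", ""
}

func main() {
	lines := []string{
		"// TODO(gri): hypothetical note about a known limitation",
		"# TODO: Remove me on v2.0",
		"# TODO: Deprecate me on Elixir v1.23",
		"// TODO: revisit this",
		"// ordinary comment",
	}
	for _, l := range lines {
		kind, detail := classify(l)
		fmt.Printf("%q %q <- %s\n", kind, detail, l)
	}
}
```

Grouping the "gated" results by version would give the per-release cleanup list that the version-gated convention is meant to enable.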
**The lesson:** Go accepts permanent TODOs as documentation of
known limitations. Elixir treats TODOs as time bombs with
deadlines. Both are disciplined; they just follow different
philosophies.

---

## What Each Language Values (Import Hierarchy)

### Go's foundation

| Package | Imports | Role |
|---------|---------|------|
| `fmt` | 2,031 | Formatting everywhere |
| `testing` | 1,658 | Tests are first-class |
| `strings` | 1,454 | String manipulation |
| `os` | 1,306 | System interaction |
| `unsafe` | 1,304 | Low-level access (surprising frequency) |
| `runtime` | 970 | Runtime introspection |
| `io` | 924 | Stream abstraction |

**Surprising:** `unsafe` is the 5th most-imported package in
Go's own source. The language that preaches safety uses `unsafe`
extensively in its own implementation. This is the same pattern
as Prometheus's global vars: the authors know the rules and
know where to break them safely.

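To make "breaking the rules safely" concrete, here is a small sketch of a typical `unsafe` use: reinterpreting a byte slice as a string without copying. A cast of this shape has been used inside the standard library's `strings.Builder`. The function name `bytesToString` is made up for this example, and the safety argument rests on the stated assumption that the caller never modifies the slice afterwards.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying the bytes.
// Assumption: the caller does not modify b after this call, because
// strings are expected to be immutable.
func bytesToString(b []byte) string {
	// A slice header starts with the same (pointer, length) pair that
	// makes up a string header, so the cast reads a valid string.
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte("hello")
	s := bytesToString(buf)
	fmt.Println(s, len(s))
}
```

The type system cannot check the immutability assumption, which is why code like this tends to live behind a small, well-reviewed API rather than in application code.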
### Elixir's foundation

The Elixir source doesn't use `alias`/`import` heavily. Modules
can be called by their full name without any import, so import
counts say little. The key modules by size tell the story:

| Module | Lines | Role |
|--------|-------|------|
| `Kernel` | 7,102 | The implicit language |
| `types/descr.ex` | 6,301 | Type descriptions (set-theoretic) |
| `Enum` | 5,242 | Collection operations |
| `String` | 3,263 | String as first-class concept |
| `Macro` | 3,102 | Metaprogramming foundation |
| `Exception` | 2,720 | Error taxonomy |

**Surprising:** The type system's description module (6,301
lines) is nearly as large as `Kernel` itself (7,102 lines). It
is the newest addition and already the second-largest module
in the table, which shows where the team's investment is going.

---

## Interface/Protocol Philosophy

### Go: Composition of single-method interfaces

```go
type Reader interface { Read(p []byte) (n int, err error) }
type Writer interface { Write(p []byte) (n int, err error) }
type Closer interface { Close() error }
type ReadWriter interface { Reader; Writer }
type ReadCloser interface { Reader; Closer }
type ReadWriteCloser interface { Reader; Writer; Closer }
```

15 interfaces in `io/io.go` alone. The combined ones are built
by embedding 4 primitives: `Reader`, `Writer`, `Closer`, and
`Seeker` (the last is not shown above). This is Go's
philosophy: small interfaces composed into larger ones. The
source practices what the documentation preaches.

### Elixir: 6 protocols + 24 behaviours

Core protocols (the extensibility points):
- `Enumerable`: collection contract
- `Collectable`: inverse of Enumerable
- `Inspect`: debug representation
- `String.Chars`: string conversion
- `List.Chars`: charlist conversion
- `JSON.Encoder`: JSON serialization (newest)

**Only 6 protocols** in the stdlib. Elixir is conservative
about adding extension points. Compare this to Go's dozens of
interfaces: Elixir prefers fewer, more powerful abstractions.

24 files define `@callback`. These are behaviours, the closest
Elixir equivalent of a Go interface for OTP patterns, used
for GenServer, Supervisor, Application, etc.

---

## Unique Infrastructure

### Go: internal/ as API firewall

61 `internal/` packages implement things users need but
shouldn't depend on:
- `internal/singleflight`: dedup concurrent calls (too
  specialized for stdlib, too useful to not have)
- `internal/godebug`: runtime feature flags via `$GODEBUG`
- `internal/goexperiment`: compile-time experiment flags
- `internal/poll`: OS-level I/O polling (used by net, os)
- `internal/cpu`: CPU feature detection

**The pattern:** Code that's shared between stdlib packages but
isn't a public API lives in `internal/`. This is Go's answer to
the "shared utility" problem that other languages solve with
package-private visibility.

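The visibility rule behind this pattern is simple enough to state in code: a package under a directory named `internal` may be imported only from the tree rooted at that directory's parent. The sketch below is a simplified model of that rule, not the `go` command's actual implementation. The function name `canImport` and the `example.com/...` paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// canImport reports whether the package at import path importer may
// import the package at path imported, under a simplified version of
// the internal-directory rule.
func canImport(importer, imported string) bool {
	// Locate the last path element named "internal".
	elems := strings.Split(imported, "/")
	last := -1
	for i, e := range elems {
		if e == "internal" {
			last = i
		}
	}
	if last < 0 {
		return true // no internal element: an ordinary public package
	}
	parent := strings.Join(elems[:last], "/")
	if parent == "" {
		// Top-level internal/ directory. Assumption: both paths belong
		// to the same source tree, so the whole tree may import it.
		return true
	}
	// The importer must be the parent itself or live below it.
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	target := "example.com/mod/internal/cache"
	for _, from := range []string{
		"example.com/mod/api",
		"example.com/mod",
		"example.com/other/api",
	} {
		fmt.Printf("%s -> %s: %v\n", from, target, canImport(from, target))
	}
}
```

The check is purely path-based, which is why the pattern needs no visibility keywords: moving a package under `internal/` is enough to withdraw it from outside callers.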
### Elixir: The compiler as a library

The Elixir compiler is structured as a library you could
theoretically call:
- `Module.Types`: the type checker
- `Module.ParallelChecker`: concurrent type checking
- `Code.Formatter`: the formatter is a library function

Mix tasks as single-file modules (55 of them) enforce one
responsibility per task. The build tool's extension point is
"write a module that uses `Mix.Task`."

---

## Generated Code

### Go: Heavy generation, clearly marked

Much of the compiler's SSA backend is generated, including its
opcode tables and rewrite rules:
- `opGen.go`: 97,135 lines
- `rewriteAMD64.go`: 79,703 lines
- `rewritegeneric.go`: 38,337 lines

Convention: Generated files contain a `// Code generated`
header comment. Go's tooling (`go generate`) is designed around
this pattern, so the source is explicit about what's
human-written vs machine-written.

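Because the marker is a fixed-format comment, tools can detect generated files with a line scan. The following sketch checks for the conventional `// Code generated ... DO NOT EDIT.` line ahead of the package clause. It is a simplified approximation: the function name `isGenerated`, the sample sources, and the "stop at the package clause" shortcut are assumptions of this example.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// generatedHeader matches the conventional marker line for generated Go code.
var generatedHeader = regexp.MustCompile(`^// Code generated .* DO NOT EDIT\.$`)

// isGenerated reports whether src carries the marker before its package
// clause. Simplification: block comments and build tags are not parsed.
func isGenerated(src string) bool {
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if generatedHeader.MatchString(line) {
			return true
		}
		if strings.HasPrefix(line, "package ") {
			return false // the marker must come before the code
		}
	}
	return false
}

func main() {
	generated := "// Code generated by a hypothetical tool; DO NOT EDIT.\n\npackage ssa\n"
	handWritten := "// Package ssa is written by hand.\npackage ssa\n"
	fmt.Println(isGenerated(generated), isGenerated(handWritten))
}
```

A check like this is how line counts can be split into human-written and machine-written code when measuring a repository.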
### Elixir: Minimal generation

No significant generated code is checked in. The Elixir source
is almost entirely human-written. The compilation model (AST
macros) means code generation happens at compile time rather
than as checked-in artifacts.

---

## Lessons for Convention Extraction

### What the language source teaches that stdlib docs don't:

1. **Go's `unsafe` usage in its own source.** The safety rules
   are for users, not for the runtime team: there are 1,304
   `unsafe` imports in Go's own source. Know the rules so you
   know where they don't apply.

2. **Elixir's TODO discipline is version-gated.** Not "clean
   up someday" but "remove on v2.0." This is why
   `elixir-patterns` documents zero-TODO culture as achievable.

3. **Go accepts permanent TODOs.** There are 3,428 of them,
   conventionally owned by specific people. This isn't
   sloppiness; it's documentation of known limitations. The Go
   team would rather have an honest TODO than a half-baked fix.

4. **Elixir's test ratio (1:1.2) vs Go's (1:3.3).** Elixir's
   smaller, more focused files mean most have a direct test
   counterpart. Go's larger files and package-level tests mean
   more production code per test file.

5. **Both use generated code**, but Go checks it in (97K-line
   files) while Elixir generates at compile time. Neither is
   wrong; each reflects the language's compilation model.

6. **`internal/` is Go's most distinctive structural pattern.**
   61 packages solve "shared but not public." The rule is
   enforced by the `go` tool itself rather than by visibility
   modifiers, which is unusual among mainstream languages.

<!-- PATTERN_COMPLETE -->