Files
patterns-vs-guidelines/analysis/purity-evolution-analysis.md
T
Rodin 9961f939c5 docs: pattern purity and codebase evolution analysis
How pure are these codebases? When do they break their own
conventions and why? Key finding: purity inversely correlates
with size. Ecto has 0 TODOs; CockroachDB has 1,048.
Neither is wrong — they're different kinds of systems.
2026-04-30 10:56:15 -07:00

9.4 KiB

Pattern Purity & Evolution: How Codebases Live With Themselves

No codebase is pure. The interesting question is: HOW do they manage impurity? When do they break their own conventions, and what does that reveal about real engineering versus ideal engineering?


CockroachDB: 1,048 TODOs and Counting

Convention Adherence (measured)

Stopper usage (goroutine lifecycle):

  • 91 calls via RunAsyncTask/RunTask (old API)
  • 295 via Handle API (new API, still tracked)
  • 3 truly untracked go func() calls

Purity: 99.6% — and one of those 3 has a literal // TODO(irfansharif): Use a stopper here instead.

Error wrapping:

  • 194 uses of errors.Wrap (cockroachdb/errors library)
  • 13 uses of fmt.Errorf("%w") (stdlib pattern)
  • 15 uses of fmt.Errorf without wrapping

The 13 stdlib-style uses are all in asimview (a CLI tool). The core KV layer uses cockroachdb/errors exclusively. Pattern breaks happen at the periphery, not the core.

How New Patterns Are Introduced

The Handle pattern replaced RunAsyncTask:

  • RunAsyncTask: 63 remaining usages
  • Handle (newer): 295 usages

The newer API coexists with the old. There's no big-bang migration. New code uses Handle; old code stays on RunAsyncTask until someone touches it. The convention evolved gradually.

Lesson: CockroachDB never does a codebase-wide rewrite. They introduce the new pattern, new code uses it, old code migrates opportunistically when touched for other reasons.

Stale TODOs (Technical Debt That Survived)

// TODO(irfansharif): Remove this in 21.2
// TODO(ajwerner): Remove in 22.2.
// TODO(nvanbenschoten): remove this entirely in v22.1

Six TODOs referencing releases from 3+ years ago. These are promises that were never kept. The code still works — the TODO became dead documentation.

Why they survive: Nobody is breaking anything. The old code path works. The TODO is aspirational, not critical. In a 2M-line codebase, someone would have to care enough to touch this file for another reason.

Lesson: Purity is not binary. CockroachDB is 99.6% pure on goroutine tracking but carries years-old dead TODOs. They invest enforcement energy where it matters (leak detection fails tests) and accept impurity where it doesn't (cosmetic cleanup).


Prometheus: The Completed Migration

Convention Adherence (measured)

Logging migration (go-kit/log → slog/promslog):

  • promslog/slog usages: 373
  • go-kit/log usages: 0

They completed a full logging framework migration. Zero remnants of the old pattern in production code. Only 3 references remain in comments explaining the hack for klog compatibility.

This is rare. Most codebases abandon migrations halfway. Prometheus finished it.

How Patterns Coexist During Migration

AppenderV2 alongside V1:

type scrapeLoopAppendAdapter interface {
    append(b []byte, contentType string, ts time.Time) (...)
}

// Old path
type scrapeLoopAppender struct { storage.Appender }

// New path  
type scrapeLoopAppenderV2 struct { storage.AppenderV2 }

An interface (scrapeLoopAppendAdapter) abstracts over both. The caller doesn't know which version it's using. The selection happens once at construction:

if sl.appendableV2 != nil {
    return &scrapeLoopAppenderV2{...}
}
return &scrapeLoopAppender{...}

Pattern: When evolving an interface, introduce an adapter layer that unifies old and new. The caller changes once. The implementations coexist until the old one can be removed.

Where They Break Conventions

Global vars in scrape:

var ScrapeTimestampTolerance = 2 * time.Millisecond
var AlignScrapeTimestamps = true
var UserAgent = version.PrometheusUserAgent()

These are configurable behavior via package-level vars. This breaks the "config struct" convention used elsewhere. They survive because:

  • They're set once at startup
  • Tests can override them directly
  • Moving them to a config struct would break the API

Functional options in leaf packages:

type WriterOption func(*writerOptions)
func WithUncachedIO(enabled bool) WriterOption { ... }

The main tsdb config is a struct. But chunk writer (added later) uses functional options. The newer code adopted a different convention than the older code.

Why: Chunk writer has 2 options. A config struct for 2 fields is ceremony. Functional options are simpler at that scale. The convention adapts to the context.


Ecto: Zero TODOs, Maximum Discipline

Convention Adherence (measured)

  • TODOs in lib/: 0
  • FIXMEs: 0
  • HACKs: 0

This is remarkable for a 12-year, 6000-commit codebase. Either they never leave TODOs (discipline) or they clean them up before merging (process).

Adapter abstraction purity:

  • References to specific adapters (Postgrex, MyXQL) in lib/: 1 (in a documentation example)
  • All DB interactions go through the adapter interface

How Ecto Introduces New Patterns

Schemaless changesets (later addition):

  • Same module (Ecto.Changeset)
  • Same API (cast/3, validate_required/3)
  • Just works without a schema by passing a types map

They didn't create Ecto.SchemalessChangeset. They made the existing API more general. The new pattern is invisible — it's the same function with a broader input.

Dynamic repos (later addition):

  • put_dynamic_repo/1 / get_dynamic_repo/0
  • Process dictionary for repo resolution
  • Existing code doesn't need to change

Again: new capability added without breaking or splitting the existing pattern. The extension point was designed in from the start (@default_dynamic_repo).

When Ecto Breaks Its Own Rules

Ecto's Queryable protocol says "anything can become a query." But internally, the planner directly pattern- matches on %Ecto.Query{} structs:

%Ecto.Query{from: %Ecto.Query.FromExpr{source: ...}}

This is pragmatic. The protocol gives consumers extensibility. But internally, Ecto knows the concrete types and uses them directly for performance.

Lesson: Abstractions are for consumers. Internals can cheat when performance or simplicity demands it.


Oban: Structural Consistency Across Engines

Convention Adherence (measured)

  • TODOs: 0
  • All Repo access: through Oban.Repo (never bypasses)
  • Engine implementations all follow the same callback structure

Engine size comparison:

  • Basic: 553 lines
  • Dolphin: 426 lines
  • Lite: 402 lines
  • Inline: 140 lines (testing, deliberately minimal)

The three production engines are within 30% of each other in size. They implement the same callbacks in roughly the same way. This is structural consistency.

Where The Pattern Stretches

The Dolphin engine (MySQL) has different query patterns than Basic (Postgres) due to MySQL limitations:

  • No FOR UPDATE SKIP LOCKED
  • No LISTEN/NOTIFY
  • Different JSON handling

Same interface, different implementation shape. The engine behaviour forces structural consistency but can't force implementational consistency.


Cross-Cutting: When and Why Patterns Break

Pattern Breaks Have Three Causes:

1. Age (gradual drift)

  • CockroachDB: RunAsyncTask vs Handle
  • Prometheus: Config structs vs functional options in newer code

Old code establishes a pattern. New code (sometimes by different authors) uses a different approach. Neither is wrong — the codebase evolved.

2. Context (different scale/needs)

  • Prometheus: global vars for 2-3 scrape settings vs full config structs for complex subsystems
  • CockroachDB: fmt.Errorf in CLI tools vs errors.Wrap in core

The pattern that's right for the core isn't always right for peripheral code. The convention adapts to context.

3. Migration (deliberate coexistence)

  • Prometheus: AppenderV1 + V2 with adapter interface
  • CockroachDB: RunAsyncTask + Handle coexisting

New pattern introduced but old one isn't removed yet. This is intentional — forcing migration on all existing code would be high risk for no immediate benefit.

The "Purity Spectrum"

Codebase TODOs Convention breaks Migration discipline
Ecto 0 Minimal (internal) Extremely high
Oban 0 Structural only High
Prometheus Low Completed migration High (once decided)
CockroachDB 1,048 Gradual drift Opportunistic

Insight: Purity correlates with codebase size inversely. Ecto (3.8M, 1 maintainer with very high standards) is pure. CockroachDB (845M, 700 contributors) has 1,048 TODOs. This isn't failure — it's the natural state of a large living system.

What This Means for Code Review

  1. Don't enforce purity mechanically. A pattern break might be intentional (different context, migration in progress, peripheral code).

  2. Ask WHY it breaks. Is this:

    • Gradual drift (worth fixing opportunistically)?
    • Context-appropriate deviation (leave it)?
    • Migration in progress (track completion)?
  3. Enforce at the boundary. CockroachDB's leak tests are 99.6% effective because they enforce at test-time, not code-review-time. Ecto's zero TODOs come from merge-time discipline, not review comments.

  4. New patterns need an adoption path. Prometheus's adapter interface and CockroachDB's gradual migration both work. Big-bang rewrites don't.

  5. Peripheral code doesn't need core conventions. CLI tools can use fmt.Errorf. Test helpers can break rules. Enforce conventions where they matter (the core that carries the system's invariants).