chore: move cross-ecosystem analysis to patterns-vs-guidelines

2026-04-30 10:50:37 -07:00
parent 7bca84f906
commit 46fe9c23c9
4 changed files with 0 additions and 1309 deletions
@@ -1,340 +0,0 @@
-# Architectural Patterns from Top Repos
-
-## CockroachDB: How to Organize 20,000 Files
-
-### The 116-Package Principle
-
-CockroachDB has 116 packages under `pkg/util/` averaging
-**4 files each**. This is deliberate:
-
-**Force:** A 2M-line codebase where developers work on
-different subsystems simultaneously. If `pkg/util` were
-5 big packages, every PR would conflict.
-
-**Pattern:** One concept = one package. `circuit/` is 3
-files (breaker, options, signal). `quotapool/` is 5 files.
-`stop/` is 2 files. The package boundary IS the API
-boundary — no internal debates about what is exported.
-
-**Naming:** Single-concept nouns. No `helpers`, no
-`common`, no `shared`. Every package name tells you what
-it does: `cancelchecker`, `ctxgroup`, `syncutil`.
-
-### Dependency Layering
-
-```
-sql → kv → storage → util
- ↓     ↓       ↓
- ↓     ↓    roachpb (protobuf types)
- ↓     ↓       ↓
- ↓     keys ← util
- ↓
- settings, config
-```
-
-**Critical insight:** `kv` depends on `sql` AND `sql`
-depends on `kv`. Go rejects import cycles, so the cycle
-stays logical: interfaces + callback registration, with
-the `internal/` package as the bridge.
-
-`storage` needs `kv` (for transaction types) and `kv`
-also needs `storage`. Again, interface boundaries keep
-the import graph acyclic at compile time.
-
-**Lesson:** Perfect layering is impossible in distributed
-databases. The real skill is knowing where to put the
-interface that breaks the cycle.
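
The interface-plus-registration idea above can be sketched in a single file. This is a minimal illustration, not CockroachDB code: `SchemaResolver`, `RegisterSchemaResolver`, `DescribeKey`, and `catalog` are hypothetical names, and the two "layers" would live in separate packages in a real tree.

```go
package main

import "fmt"

// Lower layer (think "kv"). All names in this sketch are hypothetical.

// SchemaResolver is what the lower layer needs from the upper layer.
// Declaring it here means the lower layer never imports the upper one.
type SchemaResolver interface {
	TableName(id int) (string, bool)
}

// resolver is filled in by callback registration at startup.
var resolver SchemaResolver

// RegisterSchemaResolver is the hook the upper layer calls to plug itself in.
func RegisterSchemaResolver(r SchemaResolver) { resolver = r }

// DescribeKey is lower-layer code that benefits from upper-layer knowledge
// but still works when nothing has been registered.
func DescribeKey(tableID int, key string) string {
	if resolver != nil {
		if name, ok := resolver.TableName(tableID); ok {
			return name + "/" + key
		}
	}
	return fmt.Sprintf("table-%d/%s", tableID, key)
}

// Upper layer (think "sql"). It depends on the lower layer, never the reverse.

type catalog map[int]string

func (c catalog) TableName(id int) (string, bool) {
	name, ok := c[id]
	return name, ok
}

func main() {
	RegisterSchemaResolver(catalog{52: "users"})
	fmt.Println(DescribeKey(52, "k1")) // resolved through the registered callback
	fmt.Println(DescribeKey(99, "k2")) // falls back to the raw table ID
}
```

The design choice is where the interface lives: next to the code that calls it, so the import arrow points only one way.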
-
-### Error Handling at Scale
-
-They use `github.com/cockroachdb/errors` — their own
-library that extends stdlib `errors` with:
-
-- **Error marks:** Tag errors with metadata without
-  changing the error chain
-- **Wrapping with causes:** `errors.Wrap(err, "context")`
-- **Safe printing:** `redact.Sprint` for log-safe errors
-- **Network encoding:** Errors serialize across RPC
-  boundaries
-
-**Pattern:** Errors are first-class data that flows through
-the entire system, surviving serialization across nodes.
-Not just strings — structured, typed, matchable.
-
-### Circuit Breaker (not stdlib)
-
-```go
-type Breaker struct {
-    mu struct {
-        syncutil.RWMutex
-        errAndCh *errAndCh  // stable Signal() results
-        probing  bool
-    }
-}
-```
-
-**Key design:** `Signal()` returns a channel + error getter
-(like `context.Done()` + `context.Err()`). The channel is
-stable — closing it doesn't affect callers who already have
-a reference. New callers get a new channel after reset.
-
-**Force:** In a distributed DB, a broken replica should
-fail-fast all pending requests, then probe for recovery.
-Context cancellation isn't enough because you need to
-distinguish "gave up waiting" from "system is broken."
-
-### QuotaPool: Abstract Resource Allocation
-
-```go
-type Resource interface{}
-type Request interface {
-    Acquire(ctx context.Context, r Resource) (
-        fulfilled bool, tryAgainAfter time.Duration)
-    ShouldWait() bool
-}
-```
-
-**Pattern:** The pool is generic over any resource type.
-Concrete implementations include:
-- `IntPool` — weighted semaphore with FIFO ordering
-- Rate limiters (via `tryAgainAfter`)
-- Token buckets
-
-**Force:** Different subsystems need different quota types
-but the same queueing/fairness semantics. Abstract once,
-instantiate many.
-
----
-
-## Prometheus: Interface-Driven Storage Architecture
-
-### The Contract Layer
-
-`storage/interface.go` defines **15+ interfaces** that
-form the entire query/storage contract:
-
-```
-Storage (top level)
-├── Appendable → Appender (write path)
-├── Queryable → Querier (read path)
-├── ChunkQueryable → ChunkQuerier (bulk read)
-├── ExemplarStorage (exemplars)
-└── Searcher (experimental)
-```
-
-**Force:** Prometheus must support:
-- Local TSDB (the main implementation)
-- Remote read/write (remote storage integrations)
-- Recording rules (virtual series)
-- Testing (mock implementations)
-
-All through the same interface. The contract layer is
-the single point of truth for "what does storage mean."
-
-### Compile-Time Interface Verification
-
-```go
-var _ storage.GetRef = &headAppender{}
-var _ storage.Searcher = &blockBaseQuerier{}
-```
-
-Prometheus uses this pattern **8 times** in tsdb/ alone.
-Every concrete type that claims to satisfy a storage
-interface proves it at compile time.
-
-**Why this matters at scale:** Storage interfaces evolve.
-When `Searcher` was added, every type that should
-implement it needed updating. The `var _` pattern makes
-the compiler tell you what you missed.
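
The `var _` idiom can be tried in isolation. In this minimal sketch `Appender` and `memAppender` are hypothetical stand-ins, not the Prometheus types; the assertion line itself is plain Go.

```go
package main

import "fmt"

// Appender is a hypothetical stand-in for a storage interface.
type Appender interface {
	Append(v float64) error
	Commit() error
}

// memAppender is a hypothetical in-memory implementation.
type memAppender struct {
	pending   []float64
	committed []float64
}

// Compile-time assertion: the build fails here if *memAppender ever stops
// satisfying Appender. The typed nil pointer avoids allocating a value.
var _ Appender = (*memAppender)(nil)

func (a *memAppender) Append(v float64) error {
	a.pending = append(a.pending, v)
	return nil
}

func (a *memAppender) Commit() error {
	a.committed = append(a.committed, a.pending...)
	a.pending = nil
	return nil
}

func main() {
	var app Appender = &memAppender{}
	_ = app.Append(1.5)
	_ = app.Commit()
	fmt.Println("memAppender satisfies Appender")
}
```

Adding a method to `Appender` without updating `memAppender` turns into a compile error at the `var _` line, which points at the type that fell behind rather than at some distant call site.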
-
-### Plugin Discovery via Channel
-
-```go
-type Discoverer interface {
-    Run(ctx context.Context, up chan<- []*targetgroup.Group)
-}
-```
-
-**Brilliance:** The entire service discovery system is one
-interface with one method. Consul, DNS, Kubernetes, AWS —
-all implement `Run`. They push target groups through a
-channel. The manager multiplexes.
-
-**Force:** Prometheus supports 20+ discovery mechanisms.
-Adding one should require zero changes to the core. The
-channel-based push model means the manager never polls.
-
-### Atomic File Operations
-
-Block lifecycle uses filesystem conventions:
-- `.tmp-for-creation` — incomplete write
-- `.tmp-for-deletion` — incomplete delete
-
-On startup, scan and clean up. No WAL needed for
-block-level operations because rename is atomic on POSIX.
-
-**Force:** TSDB blocks are large (hours of data). A WAL
-for block operations would be overkill. The suffix
-convention gives crash consistency with zero overhead.
-
----
-
-## Ecto: Composability Through Data
-
-### Query as Accumulating Struct
-
-```elixir
-defstruct prefix: nil, sources: nil, from: nil,
-          joins: [], wheres: [], select: nil,
-          order_bys: [], limit: nil, offset: nil,
-          group_bys: [], updates: [], havings: [],
-          preloads: [], distinct: nil, lock: nil,
-          windows: [], with_ctes: nil
-```
-
-**Every query operation appends to a list or sets a
-field.** Nothing is executed. The struct accumulates intent
-until `Repo.all/Repo.one` triggers planning + execution.
-
-**Force:** Queries must be composable (build in one
-module, filter in another, paginate in a third). If
-operations executed immediately, composition would require
-the entire DB context at every step.
-
-### Macro → Builder → Planner Pipeline
-
-```
-User writes: from(u in User, where: u.age > 18)
-                     ↓
-Macro expands: Builder.Filter.build(query, expr, env)
-                     ↓
-Builder produces: %Ecto.Query.BooleanExpr{...}
-                     ↓
-Planner resolves: types, bindings, params
-                     ↓
-Adapter generates: SQL string
-```
-
-Each builder module handles one clause type. There are
-**15 builder modules** (from, join, filter, select, etc.).
-The planner doesn't know about SQL — it resolves the
-query struct into a normalized form that any adapter can
-consume.
-
-**Force:** Support multiple databases (Postgres, MySQL,
-SQLite) with the same query language. The adapter is the
-only part that knows SQL dialect.
-
-### Protocol for Extensibility
-
-`Ecto.Queryable` protocol lets you pass:
-- A module atom (`User`) → resolved to schema query
-- A string (`"users"`) → raw table
-- A tuple (`{"filtered_users", User}`) → view + schema
-- An `Ecto.Query` struct → identity
-
-**Force:** `Repo.all(X)` should work with any "queryable
-thing." New queryable types can be added without touching
-Repo code.
-
----
-
-## Oban: Architecture for Testability
-
-### Engine Swap by Config
-
-```elixir
-def get_engine(%{engine: engine, testing: :disabled}), do: engine
-def get_engine(%{testing: :inline}), do: Oban.Engines.Inline
-def get_engine(%{testing: :manual}), do: engine
-```
-
-Three modes:
-- **disabled** (production) — real engine
-- **inline** (unit test) — execute in caller process
-- **manual** (integration) — enqueue but don't execute
-
-**Force:** Background jobs are inherently untestable
-without process control. Rather than making tests async
-(flaky), make the engine deterministic.
-
-### Flat Supervision with Named Registry
-
-```elixir
-children = [
-  {Notifier, conf: conf, name: Registry.via(name, Notifier)},
-  {Nursery, conf: conf, name: Registry.via(name, Nursery)},
-  {Peer, conf: conf, name: Registry.via(name, Peer)},
-  {Sonar, conf: conf, name: Registry.via(name, Sonar)},
-  {Harbor, conf: conf, name: Registry.via(name, Harbor)}
-]
-```
-
-Every child gets its config via `conf:` and its identity
-via `Registry.via`. This means:
-- Multiple Oban instances can run in the same VM
-- Tests can start isolated Oban supervisors
-- No global state — everything is namespaced
-
-**Force:** Libraries can't own global names. Enterprise
-apps run multiple Oban instances (different repos,
-different queues). The Registry pattern makes this
-possible without process naming conflicts.
-
-### Behaviour as Plugin Contract
-
-```elixir
-# Plugin must be a GenServer AND implement these:
-@callback start_link([option()]) :: GenServer.on_start()
-@callback validate([option()]) :: :ok | {:error, String.t()}
-```
-
-**Force:** Plugins need lifecycle management (start, stop,
-crash recovery) AND configuration validation. By requiring
-both a behaviour AND OTP compliance, Oban gets:
-- Fault isolation (supervisor restarts crashed plugins)
-- Config validation at startup (fail fast)
-- No coupling (any GenServer works)
-
----
-
-## Cross-Cutting Insights
-
-### 1. Interfaces at Boundaries, Structs Internally
-
-All four codebases define interfaces at system boundaries
-(storage, engine, discovery) but use concrete types
-internally. The interface is the published contract; the
-struct is the implementation detail.
-
-### 2. Config as Validated Struct, Not Map
-
-Every system validates config at startup and stores it as
-a typed struct. Never a raw map floating around.
-
-### 3. Testing is an Architecture Decision
-
-Oban's engine swap, CockroachDB's stopper tracking,
-Prometheus's mock interfaces — testability isn't bolted on,
-it's designed in from day one.
-
-### 4. Composition via Data, Not Inheritance
-
-Ecto queries accumulate as data. Prometheus discoverers
-push through channels. CockroachDB quota requests are
-data objects. Nobody uses class hierarchies.
-
-### 5. The Cycle Problem is Solved with Interfaces
-
-CockroachDB has mutual dependencies between sql↔kv↔
-storage. Because Go forbids import cycles, they are broken
-with interfaces that both sides depend on.
-
-### 6. Small Packages > Large Packages
-
-CockroachDB: 4 files average per package.
-Oban: focused modules (engine, worker, plugin).
-Ecto: one builder per clause type.
-The package boundary forces you to define the API.
-
-<!-- PATTERN_COMPLETE -->
@@ -1,301 +0,0 @@
-# Cross-Cutting Concerns: How Mature Codebases Handle the Hard Parts
-
-Cross-cutting concerns are the things that touch everything
-but belong nowhere. How a codebase handles logging,
-telemetry, config, retry, and lifecycle management reveals
-its architectural philosophy more than any feature code.
-
----
-
-## 1. Logging: From Strings to Semantic Channels
-
-### CockroachDB: Channel-Based Log Routing
-
-CockroachDB doesn't just log at severity levels — it
-routes logs to **semantic channels**:
-
-```go
-const DEV = logpb.Channel_DEV       // development noise
-const OPS = logpb.Channel_OPS       // operator actions
-const HEALTH = logpb.Channel_HEALTH // background health
-const STORAGE = logpb.Channel_STORAGE
-const SESSIONS = logpb.Channel_SESSIONS
-const SQL_SCHEMA = logpb.Channel_SQL_SCHEMA
-const USER_ADMIN = logpb.Channel_USER_ADMIN
-```
-
-Each channel can be routed to different sinks (file,
-network, etc.) independently. Production deploys typically
-disable DEV entirely and route HEALTH to monitoring.
-
-**Force:** In a multi-tenant distributed database, "who
-cares about this log?" is a different question than "how
-bad is it?" An INFO-level schema change matters to DBAs
-but not to SREs monitoring node health.
-
-**Ecosystem insight:** The channel IS the audience. When
-you write `log.Health.Warningf(...)`, you're declaring
-"the person watching cluster health needs to see this."
-Severity is orthogonal to audience.
-
-### Prometheus: Self-Instrumentation
-
-Prometheus instruments itself with its own metrics:
-
-```go
-type scrapeMetrics struct {
-    targetScrapeSampleLimit        prometheus.Counter
-    targetScrapeSampleOutOfOrder   prometheus.Counter
-    targetIntervalLengthHistogram  *prometheus.HistogramVec
-    // ... 20+ metrics
-}
-```
-
-Metrics are collected in a struct, constructed once via
-`newScrapeMetrics(reg)`, and passed to subsystems. No
-global registration — the registerer is injected.
-
-**Force:** Prometheus IS the metrics system. If it used
-a different metrics library to instrument itself, that
-would be a design smell. Dogfooding proves the API works.
-
-### Ecto + Oban: Telemetry as Standard
-
-Both use Erlang's `:telemetry` library with predictable
-naming:
-
-```elixir
-# Oban
-:telemetry.execute([:oban, :job, :start], measurements, meta)
-:telemetry.execute([:oban, :job, :stop], measurements, meta)
-:telemetry.execute([:oban, :job, :exception], measurements, meta)
-
-# Ecto (adapter-emitted)
-[:my_app, :repo, :query]
-```
-
-**Force:** The BEAM ecosystem standardized on `:telemetry`
-for observability. Libraries don't own their monitoring —
-they emit events; consumers attach handlers. This inverts
-the logging relationship: the library doesn't decide what
-to do with the information.
-
----
-
-## 2. Config Propagation: Three Models
-
-### CockroachDB: Cluster Settings (Distributed Config)
-
-```go
-settings.RegisterDurationSetting(
-    settings.ApplicationLevel,
-    "bulkio.ingest.flush_delay",
-    "amount of time to wait before sending a file...",
-    0,  // default
-)
-```
-
-Settings are:
-- **Typed** (Duration, Bool, Int, String)
-- **Leveled** (ApplicationLevel vs SystemVisible)
-- **Validated** (NonNegativeInt, etc.)
-- **Distributed** (propagated across all nodes)
-- **Version-gated** (new settings require cluster version)
-
-Usage: `settings.Version.IsActive(ctx, clusterversion.V26_2)`
-
-**Force:** In a distributed database, config isn't a file
-— it's consensus. Every node must agree on every setting,
-and settings can only be enabled once all nodes support
-them. The version gate is the safety mechanism.
-
-### Prometheus: ApplyConfig (Hot Reload)
-
-```go
-func (m *Manager) ApplyConfig(cfg *config.Config) error {
-    m.mtxScrape.Lock()
-    defer m.mtxScrape.Unlock()
-    // rebuild scrape pools from new config
-    // close old loggers, open new ones
-    return nil
-}
-```
-
-Config is a struct loaded from YAML. On SIGHUP (or API
-call), the entire config is re-parsed and `ApplyConfig`
-is called on each subsystem. Each subsystem holds a mutex
-and swaps atomically.
-
-**Force:** Prometheus runs as a single binary. Config
-reload must be atomic per-subsystem but doesn't need
-distributed consensus. The mutex-per-subsystem pattern
-gives independent reload without global coordination.
-
-### Ecto + Oban: Config at Init, Validated Once
-
-```elixir
-# Oban validates exhaustively at startup
-Validation.validate_schema(opts,
-    engine: {:behaviour, Oban.Engine},
-    queues: {:custom, &validate_queues/1},
-    repo: {:module, [config: 0]},
-    ...
-)
-```
-
-Config is validated once at startup and stored as an
-immutable struct. No hot reload. If config is wrong,
-you know immediately (fail fast).
-
-**Force:** Elixir/OTP applications restart processes to
-apply new config. Hot reload is handled by supervisor
-restarts, not config mutation. The "config as immutable
-struct" pattern means no runtime config bugs — it either
-passes validation at startup or the app doesn't start.
-
----
-
-## 3. Retry and Resilience
-
-### CockroachDB: Iterator-Based Retry
-
-```go
-opts := retry.Options{
-    InitialBackoff: 100 * time.Millisecond,
-    MaxBackoff:     2 * time.Second,
-    Multiplier:     2,
-    MaxRetries:     5,
-}
-for r := retry.StartWithCtx(ctx, opts); r.Next(); {
-    // attempt is a placeholder for the operation being retried
-    if err := attempt(ctx); err == nil { break }
-}
-```
-
-Retry is a **for-loop iterator**. `r.Next()` handles
-backoff timing and returns false when exhausted. This
-means retry logic reads like normal code — no callbacks,
-no framework.
-
-**Force:** CockroachDB has hundreds of retry sites. A
-callback-based retry would create deeply nested code.
-The iterator pattern keeps retry at the same indentation
-level as the operation.
-
-### Oban: Repo Dispatch with Built-In Retry
-
-```elixir
-defp dynamic_dispatch(conf, name, args, attempt) do
-    with_dynamic_repo(conf, fn repo ->
-        apply(repo, name, args)
-    end)
-rescue
-    error in UndefinedFunctionError ->
-        if attempt < @retry_opts[:retry] do
-            jittery_sleep(attempt * @retry_opts[:delay])
-            dynamic_dispatch(conf, name, args, attempt + 1)
-        else
-            reraise error, __STACKTRACE__
-        end
-end
-```
-
-Every Ecto operation dispatched through Oban's repo
-wrapper gets automatic retry for transient failures.
-The consumer never sees the retry — it's invisible
-infrastructure.
-
-**Key insight:** Oban retries `UndefinedFunctionError`
-on the repo module itself — absorbing the window during
-hot code reload when the module doesn't exist. This is
-an ecosystem-level concern (BEAM hot code loading) handled
-transparently.
-
----
-
-## 4. Resource Lifecycle: The Stopper Pattern
-
-### CockroachDB: Stopper as Universal Lifecycle
-
-```go
-type Stopper struct { ... }
-
-// RunTask runs a synchronous task
-func (s *Stopper) RunTask(ctx context.Context, taskName string, f func(context.Context)) error
-
-// RunAsyncTask runs a goroutine tracked by the stopper
-func (s *Stopper) RunAsyncTask(ctx context.Context, taskName string, f func(context.Context)) error
-
-// ShouldQuiesce returns a channel closed when shutdown begins
-func (s *Stopper) ShouldQuiesce() <-chan struct{}
-
-// Stop initiates graceful shutdown
-func (s *Stopper) Stop(ctx context.Context)
-```
-
-Every goroutine in CockroachDB is launched through a
-Stopper. This gives:
-- **Tracking**: know exactly which goroutines are running
-- **Graceful shutdown**: quiesce signal before hard stop
-- **Leak detection**: `PrintLeakedStoppers` in tests
-- **Throttling**: semaphore limits async tasks
-
-```go
-func init() {
-    leaktest.PrintLeakedStoppers = PrintLeakedStoppers
-}
-```
-
-**Force:** A database cannot afford goroutine leaks —
-they hold locks, connections, and file handles. The
-Stopper is the universal answer: every background task
-is accounted for, every shutdown is graceful, every leak
-is detected in tests.
-
-### Oban: Registry-Based Lifecycle
-
-```elixir
-children = [
-    {Notifier, conf: conf, name: Registry.via(name, Notifier)},
-    {Nursery, conf: conf, name: Registry.via(name, Nursery)},
-    ...
-]
-```
-
-OTP already provides lifecycle management via supervisors.
-Oban's addition is the Registry — namespacing processes
-so multiple instances can coexist. Lifecycle is delegated
-to the platform; naming is the library's concern.
-
----
-
-## 5. What These Patterns Teach for Code Review
-
-### Questions to Ask About Cross-Cutting Concerns:
-
-1. **Logging:** Who is the audience for this log? Is there
-   a routing mechanism, or does everything go to stdout?
-   Does the log help the *operator*, not just the developer?
-
-2. **Config:** How does config reach this code? Is it
-   validated at startup or silently wrong at runtime? Can
-   it be changed without restart? Should it be?
-
-3. **Retry:** Is retry happening at the right layer? Is it
-   invisible to the caller? Does it have backoff + jitter?
-   Does it respect context cancellation?
-
-4. **Lifecycle:** Are background tasks tracked? Will they
-   shut down gracefully? Can you detect leaks in tests?
-
-5. **Telemetry:** Are events emitted or is logging the only
-   observability? Can consumers attach their own handlers?
-
-### Red Flags:
-
-- `log.Info("something happened")` with no channel/audience
-- Config read from environment at point-of-use (not validated)
-- Retry logic duplicated in 5 places with different backoff
-- Goroutines launched with `go func()` and no tracking
-- No telemetry events — only log lines for observability
-
-<!-- PATTERN_COMPLETE -->
@@ -1,371 +0,0 @@
-# Ecosystem-Level Patterns: How Codebases Present to Consumers
-
-## The Three Questions
-
-For each codebase, ask:
-1. How do consumers **extend** it? (What interfaces/behaviours
-   do they implement?)
-2. How do consumers **compose** with it? (What does day-to-day
-   usage look like?)
-3. What does it deliberately **NOT do**? (What forces shaped
-   those refusals?)
-
----
-
-## CockroachDB: Errors as First-Class Distributed Data
-
-### Extension Points
-
-CockroachDB is not a library — it is a system. Consumers
-extend it through:
-- **SQL builtins** (function registration)
-- **Storage engines** (via pebble interface)
-- **Service discovery** (not user-extensible — closed)
-
-The interesting pattern is how errors flow from storage
-through KV through SQL to the client.
-
-### Error Architecture (ecosystem-level idiom)
-
-```
-Storage error → encoded via cockroachdb/errors →
-  KV wraps with context → serialized across gRPC →
-  SQL decodes → maps to pgcode → wire protocol to client
-```
-
-**Key design decisions:**
-
-1. **Errors have priority.** `ErrPriority()` ranks errors so
-   the system knows which to surface when multiple things
-   fail simultaneously. Transaction abort > restart >
-   unambiguous error > non-retriable.
-
-2. **Errors survive serialization.** `EncodeError` /
-   `DecodeError` serialize errors across RPC boundaries.
-   The error that originated on node 3 arrives at node 1
-   with its full cause chain intact.
-
-3. **Errors map to pg codes.** Every internal error maps to
-   a Postgres error code that clients understand. This is
-   the *ecosystem contract* — clients write
-   `if pgcode == '40001' { retry }`.
-
-**What this teaches:** In a distributed system, an error
-isn't a string — it's a data object with identity,
-priority, serializability, and a consumer-facing code.
-Design your error types for the *consumer*, not the
-*producer*.
-
-### Deliberate Absences
-
-- **No dependency injection framework.** Config structs
-  passed explicitly. 1178-line `StoreConfig` struct, but
-  it's all data — no framework magic.
-- **No context.Background() on hot paths.** 144 uses in
-  kvserver, but auditable — each justified in comments.
-- **No functional options.** CockroachDB uses config
-  structs universally. The Option interface in stopper is
-  the exception, not the rule.
-
-### Test Architecture
-
-- **TestMain in every package.** Sets up security certs,
-  random seeds, and test server factories.
-- **Goroutine leak detection.** `leaktest.AfterTest(t)()`
-  at the start of every test. Detects leaked goroutines
-  by diffing goroutine stacks before/after.
-- **Stopper leak detection.** Every Stopper is tracked
-  globally; `PrintLeakedStoppers(t)` in TestMain catches
-  forgot-to-stop bugs.
-- **`//go:generate` for test setup.** Codegen tool
-  (`add-leaktest.sh`) auto-adds leak checks to every
-  test file.
-
-**What this teaches:** At scale, the most important test
-infrastructure isn't assertions — it's resource leak
-detection. Every goroutine, every connection, every
-Stopper is tracked and verified to be cleaned up.
-
----
-
-## Prometheus: The One-Method Interface Contract
-
-### Extension Points
-
-Prometheus is extended through:
-- **Service discovery** (30 implementations, 1 interface)
-- **Storage** (remote read/write adapters)
-- **Exporters** (client_golang metrics)
-
-### The Discoverer Pattern (ecosystem-level idiom)
-
-```go
-type Discoverer interface {
-    Run(ctx context.Context, up chan<- []*targetgroup.Group)
-}
-```
-
-This is **one method**. Thirty implementations. The
-channel-based push model means:
-- The discoverer controls timing (not polled)
-- The manager multiplexes without knowing implementations
-- Adding a new discovery source = implement Run, register
-
-**Registration via init():**
-```go
-func init() {
-    discovery.RegisterConfig(&SDConfig{})
-}
-```
-
-This is the classic Go plugin pattern. Import the package
-→ init registers it → the system discovers it at startup.
-
-**What this teaches:** The smallest possible interface
-creates the largest possible ecosystem. One method + one
-channel = 30 implementations without coordination.
-
-### Storage Contract (15 interfaces, 1 file)
-
-All of Prometheus's storage contract lives in
-`storage/interface.go`. This is the:
-- Read path: `Queryable → Querier → SeriesSet → Series`
-- Write path: `Appendable → Appender`
-- Extension: `ExemplarAppender`, `MetadataUpdater`
-
-**Key:** Every implementation proves satisfaction at
-compile time with `var _ storage.Searcher = &type{}`.
-When the contract evolves, the compiler finds every
-broken implementation.
-
-### Deliberate Absences
-
-- **No generics in storage interfaces.** Despite Go 1.18+
-  support. The interfaces predate generics and adding them
-  would break all existing implementations.
-- **No dependency injection.** Direct struct construction
-  everywhere. Testability through interface satisfaction,
-  not framework wiring.
-- **Almost no functional options.** Only in leaf packages
-  (chunk writer, parser). Core APIs use config structs.
-- **No tolerated goroutine leaks.** `goleak` in
-  tests, `TolerantVerifyLeak` with explicit allowlist for
-  known third-party leaks.
-
-### Test Architecture
-
-- **`TolerantVerifyLeak`** — goroutine leak detection with
-  allowlist for known third-party leaks (opencensus, klog)
-- **Mock implementations of every interface** — defined
-  right in `storage/interface.go` next to the real ones
-- **Golden file tests** in PromQL evaluation
-
----
-
-## Ecto: Composability as Architectural Principle
-
-### Extension Points
-
-Consumers extend Ecto through:
-- **Custom types** (7 callbacks: cast, load, dump, equal?,
-  embed_as, autogenerate, type)
-- **Adapters** (Queryable, Schema, Transaction, Storage —
-  4 behaviour modules)
-- **Protocols** (`Ecto.Queryable` — anything can become a
-  query)
-
-### The NotLoaded Sentinel (ecosystem-level idiom)
-
-```elixir
-defmodule Ecto.Association.NotLoaded do
-  defstruct [:__field__, :__owner__, :__cardinality__]
-end
-```
-
-Ecto **refuses to lazy-load associations**. If you access
-`user.posts` without preloading, you get a `NotLoaded`
-struct — not nil, not an empty list, not a database query.
-
-**Why this is an ecosystem decision:**
-- Forces consumers to be explicit about data needs
-- Prevents accidental N+1 queries: an association is
-  never fetched implicitly on access
-- Makes the data boundary visible in code
-
-This is a *consumer-hostile* decision that makes
-*systems built on Ecto* dramatically better. The library
-optimizes for the 1000th user, not the first-day
-experience.
-
-### Query Composition (ecosystem-level idiom)
-
-Every query clause appends to a list in the Query struct.
-Nothing executes. The Query is pure data that accumulates
-intent.
-
-**Consumer impact:** You can build queries across module
-boundaries:
-
-```elixir
-# Module A builds the base
-def active_users, do: from(u in User, where: u.active)
-
-# Module B adds pagination
-def paginate(query, page, size) do
-  query
-  |> limit(^size)
-  |> offset(^((page - 1) * size))
-end
-
-# Module C adds authorization
-def visible_to(query, role) do
-  where(query, [u], u.role in ^roles_for(role))
-end
-```
-
-Each module is independent. They compose because queries
-are data, not effects.
-
-### Adapter Architecture
-
-```
-Ecto.Repo.all(query)
-  → Planner resolves types, bindings
-  → Adapter.prepare/2 produces {cache, prepared}
-  → Adapter.execute/5 runs against DB
-  → Adapter.loaders/2 converts back to Elixir types
-```
-
-The adapter is the ONLY part that knows SQL. Ecto core
-is database-agnostic. This is why the same code works on
-Postgres, MySQL, SQLite, and custom stores.
-
-### Deliberate Absences
-
-- **No lazy loading.** `NotLoaded` struct instead.
-- **No global state.** Per-repo config, per-repo process.
-- **No query caching at library level.** The adapter
-  caches prepared statements; Ecto doesn't.
-- **No connection to schema naming.** `schema "legacy_tbl"`
-  is independent of `defmodule NewUser`.
-
----
-
-## Oban: Designing for Testability First
-
-### Extension Points
-
-Consumers extend Oban through:
-- **Workers** (`perform/1` — the job logic)
-- **Plugins** (GenServer + validate callback)
-- **Engines** (entire backend swap)
-- **Notifiers** (pub/sub mechanism)
-- **Peers** (leader election)
-
-### The Worker Result Type (ecosystem-level idiom)
-
-```elixir
-@type result ::
-  :ok
-  | {:ok, ignored :: term()}
-  | {:error, reason :: term()}
-  | {:cancel, reason :: term()}
-  | {:snooze, period :: Period.t()}
-```
-
-Five possible outcomes, each with distinct semantics:
-- `:ok` → success, remove from queue
-- `{:ok, value}` → success; the value is ignored
-- `{:error, reason}` → retry (respects max_attempts)
-- `{:cancel, reason}` → permanent failure, don't retry
-- `{:snooze, period}` → reschedule for later
-
-**Ecosystem impact:** Every worker author makes an
-explicit decision about failure semantics. "What should
-happen when this fails?" is answered in the type system,
-not in configuration.
-
-### Contextual Backoff (ecosystem-level idiom)
-
-```elixir
-def backoff(%Job{attempt: attempt, unsaved_error: err}) do
-  case err.reason do
-    %RateLimitError{retry_after: ms} -> ms
-    _ -> trunc(:math.pow(attempt, 4) + jitter())
-  end
-end
-```
-
-The error that caused the failure is available to the
-backoff calculation. Different errors → different retry
-strategies. This is impossible in systems where backoff
-is configured globally.
-
-### Testing Design (ecosystem-level idiom)
-
-Three testing modes via config:
-- **`:inline`** — execute jobs synchronously in tests
-- **`:manual`** — enqueue but don't execute
-- **`:disabled`** — production behavior
-
-Plus `use Oban.Testing` which provides:
-- `assert_enqueued/1` — verify job was queued
-- `refute_enqueued/1` — verify job was NOT queued
-- `perform_job/2` — execute a job manually in tests
-- `all_enqueued/1` — list all matching jobs
-
-**Ecosystem impact:** Every Oban consumer gets
-deterministic, fast, isolated tests for free. No sleep,
-no polling, no flaky async assertions.
-
-### Deliberate Absences
-
-- **No global process names.** Registry.via everywhere —
-  multiple Oban instances can coexist.
-- **No direct DB coupling in workers.** Workers receive a
-  Job struct; they don't import Repo.
-- **No implicit retries.** max_attempts is explicit per
-  worker. No "retry forever" default.
-- **No built-in rate limiting in OSS.** That is a Pro
-  feature — deliberate business boundary.
-
----
-
-## Cross-Cutting: What "Idiomatic" Means at Ecosystem Level
-
-### 1. The Consumer Contract is the API
-
-Not the functions you export — the *experience* of
-building on your system:
-- CockroachDB: "Your errors will be pg-codes, always"
-- Prometheus: "Implement Run(), get discovery for free"
-- Ecto: "Queries are data; loading is always explicit"
-- Oban: "Return a result type; testing is built in"
-
-### 2. Deliberate Absences Define Character
-
-What a system refuses to do is as important as what it
-does:
-- Ecto refuses lazy loading → forces explicit data needs
-- Oban refuses global names → enables multi-instance
-- Prometheus refuses DI frameworks → keeps simplicity
-- CockroachDB refuses context.Background on hot paths →
-  forces timeout discipline
-
-### 3. Testability is Never Retrofitted
-
-Every system that tests well designed testing in from the
-start:
-- CockroachDB: leak detection, stopper tracking
-- Prometheus: goroutine leak verification, mock interfaces
-- Ecto: adapter abstraction, embedded schemas for testing
-- Oban: engine swap, testing modes, assertion helpers
-
-### 4. Extension Points Define the Ecosystem Size
-
-- Prometheus: 1 interface, 30 discoverers
-- Ecto: 7 type callbacks, hundreds of custom types
-- Oban: Worker behaviour + 5 engine callbacks
-
-**Smaller interface → larger ecosystem.** The less you
-demand from implementors, the more you get.
-
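As a rough illustration of why a one-method extension point is cheap to implement, here is a minimal Go sketch. The `TargetGroup`, `StaticDiscoverer`, and `collectOnce` names are hypothetical and exist only for this example; the `Run(ctx, up)` shape is modeled loosely on the "Implement Run()" contract mentioned above, not copied from any real package.

```go
package main

import (
	"context"
	"fmt"
)

// TargetGroup is a hypothetical stand-in for a discovered set of targets.
type TargetGroup struct {
	Source  string
	Targets []string
}

// Discoverer is the whole extension point: one method.
type Discoverer interface {
	Run(ctx context.Context, up chan<- []TargetGroup)
}

// StaticDiscoverer is a hypothetical implementation that publishes a
// fixed list once and then returns.
type StaticDiscoverer struct {
	Targets []string
}

// Run sends a single update, or gives up if the context is cancelled first.
func (d StaticDiscoverer) Run(ctx context.Context, up chan<- []TargetGroup) {
	select {
	case up <- []TargetGroup{{Source: "static", Targets: d.Targets}}:
	case <-ctx.Done():
	}
}

// collectOnce runs any Discoverer and returns its first update.
func collectOnce(d Discoverer) []TargetGroup {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	up := make(chan []TargetGroup)
	go d.Run(ctx, up)
	return <-up
}

func main() {
	groups := collectOnce(StaticDiscoverer{Targets: []string{"10.0.0.1:9100"}})
	fmt.Println(groups[0].Source, groups[0].Targets)
}
```

The host code (`collectOnce` here) never needs to know which implementation it is driving, which is what keeps the cost of adding another one low.
-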
-<!-- PATTERN_COMPLETE -->
@@ -1,297 +0,0 @@
-# Testing Philosophy & API Evolution
-
-How codebases prove correctness and manage change over
-time reveals their deepest architectural commitments.
-
---
-
-## Testing Philosophy: Four Models of Proof
-
-### CockroachDB: Defense in Depth
-
-**Levels of proof:**
-1. **Unit tests** — co-located in same package
-2. **Echotest/golden files** — snapshot expected output (209
-   testdata directories, auto-rewrite with -rewrite flag)
-3. **Data-driven tests** — declarative test specs in txt files
-4. **KVNemesis** — chaos/fuzzing that generates random KV
-   operations and checks linearizability
-5. **Leak detection** — goroutines, stoppers tracked globally
-
-**The echotest pattern:**
-```go
-echotest.Require(t, output, filepath.Join("testdata", name+".txt"))
-```
-
-Golden file says:
-```
-echo
----
-result is ambiguous: boom with a secret
-result is ambiguous: boom with a ‹secret›
-```
-
-The test produces output, compares against the golden file.
-Run with `-rewrite` to update. This means:
-- Tests are **self-documenting** (the golden file IS the spec)
-- Regressions are **visible in diffs** (the golden file changes)
-- No manual expected-value maintenance
-
-**KVNemesis (chaos testing at ecosystem level):**
-Generates random sequences of KV operations (puts, gets,
-splits, merges, transfers) against a real cluster, then
-validates that results satisfy serializable isolation.
-
-This isn't unit testing. This is proving the *system* is
-correct, not individual functions.
-
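To make the idea concrete at toy scale, the following Go sketch applies random operations to a store and to a reference model, then reports the first divergence. It is a heavily simplified, single-threaded illustration: `store`, `runNemesis`, and the key set are hypothetical, and the real KVNemesis runs concurrent operations against a cluster and validates much stronger properties.

```go
package main

import (
	"fmt"
	"math/rand"
)

// store is a hypothetical system under test: a trivial in-memory KV.
type store struct{ data map[string]int }

func (s *store) put(k string, v int) { s.data[k] = v }

func (s *store) get(k string) (int, bool) {
	v, ok := s.data[k]
	return v, ok
}

// runNemesis applies n random operations to the store and to a reference
// model, returning an error at the first read that disagrees with the model.
func runNemesis(seed int64, n int) error {
	rng := rand.New(rand.NewSource(seed))
	s := &store{data: map[string]int{}}
	model := map[string]int{}
	keys := []string{"a", "b", "c"}
	for i := 0; i < n; i++ {
		k := keys[rng.Intn(len(keys))]
		if rng.Intn(2) == 0 {
			v := rng.Intn(100)
			s.put(k, v)
			model[k] = v
			continue
		}
		got, ok := s.get(k)
		want, wantOK := model[k]
		if ok != wantOK || got != want {
			return fmt.Errorf("seed %d, op %d: get(%q) = %d,%t; model has %d,%t",
				seed, i, k, got, ok, want, wantOK)
		}
	}
	return nil
}

func main() {
	for seed := int64(0); seed < 5; seed++ {
		if err := runNemesis(seed, 1000); err != nil {
			fmt.Println("invariant violated:", err)
			return
		}
	}
	fmt.Println("no violations found")
}
```

Reporting the seed with each failure matters: it turns a random failure into a reproducible one.
-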
-**Resource leak detection as CI gate:**
-```go
-// At the top of every test function
-defer leaktest.AfterTest(t)()
-
-// Once per package, in an init function
-func init() {
-    leaktest.PrintLeakedStoppers = PrintLeakedStoppers
-}
-```
-
-If a test leaks a goroutine or Stopper, it **fails**. Not
-a warning. A failure. This means resource correctness is
-as enforceable as logic correctness.
-
-### Prometheus: Golden Files + Goroutine Verification
-
-**Testing DSL for PromQL:**
-```
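-load 5m
-  http_requests{job="api-server", instance="0", group="production"} 0+10x10
-  http_requests{job="api-server", instance="1", group="production"} 0+20x10
-  http_requests{job="api-server", instance="0", group="canary"} 0+30x10
-  http_requests{job="api-server", instance="1", group="canary"} 0+40x10
-
-eval instant at 50m SUM BY (group) (http_requests)
-  {group="canary"} 700
-  {group="production"} 300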
-```
-
-This is a custom test language. Load data, evaluate
-expressions, assert results. **205 test config files**
-in `config/testdata/` alone.
-
-**Force:** PromQL has too many edge cases to cover with
-hand-written Go test functions alone. The DSL lets you
-write hundreds of test cases concisely, covering edge
-cases that would otherwise need dozens of Go functions.
-
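The series shorthand in the `load` block can be unpacked with a little arithmetic. The Go sketch below assumes that `start+stepxN` means the initial value followed by N increments, one sample per load interval; `expand` and `valueAt` are hypothetical helpers written for this example, not part of the Prometheus test framework.

```go
package main

import "fmt"

// expand returns the samples for a series written as start+stepxN,
// assuming the notation means N increments after the initial value.
func expand(start, step float64, n int) []float64 {
	out := make([]float64, 0, n+1)
	for i := 0; i <= n; i++ {
		out = append(out, start+step*float64(i))
	}
	return out
}

// valueAt returns the sample at time tMin (minutes) for a series loaded at
// intervalMin (minutes). It assumes tMin falls exactly on a sample.
func valueAt(samples []float64, intervalMin, tMin int) float64 {
	return samples[tMin/intervalMin]
}

func main() {
	s := expand(0, 10, 10) // the series written as 0+10x10
	fmt.Println(len(s), valueAt(s, 5, 50)) // prints "11 100"
}
```

Under that reading, `0+10x10` loaded every 5m holds the value 100 at the 50m mark, which is the number an `eval instant at 50m` expression would aggregate.
-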
-**Goroutine leak detection:**
-```go
-func TolerantVerifyLeak(m *testing.M) {
-    goleak.VerifyTestMain(m,
-        goleak.IgnoreTopFunction("go.opencensus.io/..."),
-        goleak.IgnoreTopFunction("k8s.io/klog/..."),
-    )
-}
-```
-
-Explicit allowlist for known third-party leaks. Everything
-else is a test failure. Zero-tolerance with escape hatches
-for unfixable external dependencies.
-
-### Ecto: Fake Adapter + Process Mailbox Assertions
-
-```elixir
-defmodule Ecto.TestAdapter do
-  @behaviour Ecto.Adapter
-  @behaviour Ecto.Adapter.Queryable
-  @behaviour Ecto.Adapter.Schema
-  @behaviour Ecto.Adapter.Transaction
-
-  def execute(_, _, {:nocache, {:all, query}}, _, _) do
-    send(self(), {:all, query})
-    Process.get(:test_repo_all_results) || results_for_all_query(query)
-  end
-end
-```
-
-**Ecto tests the entire query pipeline without a database.**
-The fake adapter:
-- Sends messages to `self()` on every operation
-- Tests assert on `receive {:insert, meta}` etc.
-- No network, no state, pure message-passing verification
-
-**48 test files, 43 with `async: true`.** The test suite
-runs in parallel because there's no shared state — every
-test talks to its own process mailbox.
-
-**Force:** Ecto is a *library*, not a service. It can't
-require Postgres in CI for every contributor. The fake
-adapter makes the entire query compilation + planning
-pipeline testable without external dependencies.
-
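The same fake-adapter idea carries over to other languages. Here is a minimal Go sketch, using a buffered channel where Ecto uses the test process mailbox. The `Adapter` interface, `FakeAdapter`, and `ListUsers` are all hypothetical names invented for this illustration.

```go
package main

import "fmt"

// Adapter is a hypothetical storage interface that a query layer depends on.
type Adapter interface {
	All(query string) []string
	Insert(table string, row map[string]string) error
}

// FakeAdapter records every operation on a channel instead of touching a
// database, and returns canned results.
type FakeAdapter struct {
	Ops     chan string
	Results []string
}

// NewFakeAdapter returns a fake with room for 16 recorded operations.
func NewFakeAdapter() *FakeAdapter {
	return &FakeAdapter{Ops: make(chan string, 16)}
}

func (f *FakeAdapter) All(query string) []string {
	f.Ops <- "all:" + query
	return f.Results
}

func (f *FakeAdapter) Insert(table string, row map[string]string) error {
	f.Ops <- "insert:" + table
	return nil
}

// ListUsers is hypothetical code under test; it only knows the interface.
func ListUsers(a Adapter) []string {
	return a.All("SELECT name FROM users")
}

func main() {
	fake := NewFakeAdapter()
	fake.Results = []string{"ada"}
	fmt.Println(ListUsers(fake), <-fake.Ops)
}
```

Each test owns its own fake, so nothing is shared between tests and they can run in parallel, which is the property the `async: true` count above reflects.
-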
-### Oban: Testing Modes as First-Class Feature
-
-```elixir
-# In test config
-config :my_app, Oban, testing: :inline
-
-# In test
-use Oban.Testing, repo: MyApp.Repo
-
-test "job was enqueued" do
-  assert_enqueued worker: MyWorker, args: %{id: 1}
-end
-
-test "job executes correctly" do
-  assert :ok = perform_job(MyWorker, %{id: 1})
-end
-```
-
-Three modes:
-- **`:inline`**: jobs execute synchronously in the test
-  process. No GenServers, no queues, no async.
-- **`:manual`**: jobs are enqueued but not executed.
-  Use `assert_enqueued` to verify they were created.
-- **`:disabled`**: production behavior in tests.
-
-**Force:** Background jobs are a common source of test
-flakiness. Oban removes that source by making the
-execution model configurable, so tests never poll, never
-sleep, and never race against a queue.
-
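A configurable execution model is not specific to Elixir. The Go sketch below shows the core of it under simplifying assumptions: a hypothetical `Queue` with only two modes, `Inline` and `Manual`, and no real background execution. None of these names come from Oban.

```go
package main

import "fmt"

// Mode selects how inserted jobs are handled.
type Mode int

const (
	Inline Mode = iota // run the job immediately in the caller
	Manual             // record the job, run nothing
)

// Job is a hypothetical unit of work addressed to a named worker.
type Job struct {
	Worker string
	Args   map[string]int
}

// Queue is a hypothetical job queue whose execution model is configurable.
type Queue struct {
	mode     Mode
	handlers map[string]func(Job) error
	Enqueued []Job
}

func NewQueue(mode Mode) *Queue {
	return &Queue{mode: mode, handlers: map[string]func(Job) error{}}
}

// Register associates a worker name with the function that performs its jobs.
func (q *Queue) Register(worker string, h func(Job) error) {
	q.handlers[worker] = h
}

// Insert runs the job synchronously in Inline mode and records it otherwise.
func (q *Queue) Insert(j Job) error {
	if q.mode == Inline {
		h, ok := q.handlers[j.Worker]
		if !ok {
			return fmt.Errorf("no handler for %q", j.Worker)
		}
		return h(j)
	}
	q.Enqueued = append(q.Enqueued, j)
	return nil
}

func main() {
	q := NewQueue(Manual)
	q.Register("mailer", func(Job) error { return nil })
	err := q.Insert(Job{Worker: "mailer", Args: map[string]int{"id": 1}})
	fmt.Println(err, len(q.Enqueued))
}
```

A test picks the mode that matches its question: `Manual` to assert that a job was created, `Inline` to assert what the job does.
-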
---
-
-## API Evolution: Three Strategies
-
-### CockroachDB: Version Gates (Distributed Migration)
-
-```go
-const (
-    V26_2_AddStatementStatisticsComputedColumns Key = iota
-    V26_2_ChangefeedsStopReadingSpanLevelCheckpoints
-    V26_2_ChangefeedsStopWritingSpanLevelCheckpoints
-)
-
-// In code:
-if settings.Version.IsActive(ctx, clusterversion.V26_2) {
-    // use new behavior
-}
-```
-
-**The pattern:** Every change to observable behavior gets
-a version constant. The feature is only enabled when ALL
-nodes in the cluster have been upgraded past that version.
-
-**Phased deprecation for distributed changes:**
-```
-V26_2_ChangefeedsStopReadingSpanLevelCheckpoints
-V26_2_ChangefeedsStopWritingSpanLevelCheckpoints
-V26_2_ChangefeedsNoLongerHaveSpanLevelCheckpoints
-```
-
-Three versions for one removal:
-1. Stop reading (new code doesn't depend on old format)
-2. Stop writing (old format no longer produced)
-3. Clean up (safe to remove the old code)
-
-**Force:** In a distributed database, you can't change
-behavior atomically. Some nodes will be old, some new.
-The version gate ensures new behavior only activates
-when it's safe — when all nodes understand it.
-
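The gating rule can be sketched in Go as follows. This is a deliberate simplification: versions are plain integers, and the gate is computed directly from a list of node versions, whereas CockroachDB tracks an active cluster version through its own settings machinery. `Version`, `Cluster`, and the three constants are hypothetical names for this example.

```go
package main

import "fmt"

// Version is a hypothetical, simplified version: a single integer.
type Version int

// Three gates for one removal, mirroring the stop-reading,
// stop-writing, clean-up sequence.
const (
	StopReading Version = iota + 1
	StopWriting
	RemoveOldFormat
)

// Cluster tracks the version each node is running.
type Cluster struct{ nodeVersions []Version }

// IsActive reports whether every node is at or past the gate. A single
// node that is behind keeps the new behavior switched off.
func (c Cluster) IsActive(gate Version) bool {
	if len(c.nodeVersions) == 0 {
		return false
	}
	for _, v := range c.nodeVersions {
		if v < gate {
			return false
		}
	}
	return true
}

func main() {
	mixed := Cluster{nodeVersions: []Version{StopWriting, StopReading}}
	upgraded := Cluster{nodeVersions: []Version{StopWriting, StopWriting}}
	fmt.Println(mixed.IsActive(StopWriting), upgraded.IsActive(StopWriting)) // prints "false true"
}
```

The ordering of the constants encodes the rollout order, so a later gate can never be active while an earlier one is not.
-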
-**Pruning:** Once MinSupported advances past a version
-constant, the constant is deleted. Every supported node
-already has the gate active, so the `IsActive` check is
-dead code and is removed with it. Regular pruning keeps
-the codebase from accumulating gates.
-
-### Oban: Numbered Migrations (Schema Evolution)
-
-```
-lib/oban/migrations/postgres/
-├── v01.ex  # Initial schema (job table, state enum)
-├── v02.ex  # Add columns
-├── v03.ex  # Index optimization
-...
-├── v14.ex  # Latest
-```
-
-Each migration is:
-- **Idempotent** (safe to run twice)
-- **Prefix-aware** (multi-tenant schemas)
-- **Bidirectional** (up + down)
-- **Database-specific** (postgres/, sqlite/, myxql/)
-
-**Consumer usage:**
-```elixir
-defmodule MyApp.Repo.Migrations.AddOban do
-  use Ecto.Migration
-  def up, do: Oban.Migrations.up(version: 14)
-  def down, do: Oban.Migrations.down(version: 1)
-end
-```
-
-**Force:** Oban owns a database table but lives inside
-the consumer's migration system. Numbered versions let
-consumers upgrade incrementally without knowing Oban
-internals.
-
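The mechanics of numbered, idempotent migrations can be sketched in Go. This is an illustration only: the "database" is a map, and `Migration`, `DB`, `Up`, and `Down` are hypothetical names. It assumes that `Up(target)` applies every version above the current one, and that `Down(target)` reverts from the current version down to and including the target.

```go
package main

import "fmt"

// Migration is one numbered step with a forward and a reverse action.
type Migration struct {
	Version int
	Up      func(schema map[string]bool)
	Down    func(schema map[string]bool)
}

// DB is a hypothetical database: a recorded version plus a set of objects.
type DB struct {
	Current int
	Schema  map[string]bool
}

var migrations = []Migration{
	{
		Version: 1,
		Up:      func(s map[string]bool) { s["jobs"] = true },
		Down:    func(s map[string]bool) { delete(s, "jobs") },
	},
	{
		Version: 2,
		Up:      func(s map[string]bool) { s["jobs.priority"] = true },
		Down:    func(s map[string]bool) { delete(s, "jobs.priority") },
	},
}

// Up applies each migration above the current version, up to target.
// Calling it again with the same target does nothing.
func Up(db *DB, target int) {
	for _, m := range migrations {
		if m.Version > db.Current && m.Version <= target {
			m.Up(db.Schema)
			db.Current = m.Version
		}
	}
}

// Down reverts migrations from the current version down to and
// including target, newest first.
func Down(db *DB, target int) {
	for i := len(migrations) - 1; i >= 0; i-- {
		m := migrations[i]
		if m.Version <= db.Current && m.Version >= target {
			m.Down(db.Schema)
			db.Current = m.Version - 1
		}
	}
}

func main() {
	db := &DB{Schema: map[string]bool{}}
	Up(db, 2)
	Up(db, 2) // second call is a no-op
	fmt.Println(db.Current, len(db.Schema))
}
```

Storing the current version next to the schema is what makes the second `Up` call safe: the library, not the consumer, decides which steps still need to run.
-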
-### Ecto: Compile-Time Deprecation + Semver
-
-```elixir
-# In changeset.ex
-IO.warn(
-  "passing a list of binaries to cast/3 is deprecated..."
-)
-```
-
-Ecto deprecates at **compile time**. When you compile
-code that uses a deprecated API, you get a warning.
-At runtime, everything still works.
-
-**CHANGELOG as contract:**
-```
-## v3.14.0-dev
-### Enhancements
-### Bug fixes
-
-## v3.13.5 (2025-11-09)
-### Enhancements
-```
-
-The changelog is the API evolution document. Breaking
-changes require a major version bump (hasn't happened
-in years because the adapter pattern provides
-extensibility without breakage).
-
---
-
-## What This Teaches for Code Review
-
-### Testing Questions:
-1. Is this testable **without standing up the system**?
-   (Ecto's fake adapter, Oban's inline engine)
-2. Are resources **tracked and leak-detected**?
-   (CockroachDB's stopper/goroutine tracking)
-3. Are test assertions **deterministic**? No sleep, no
-   poll, no "eventually consistent" in unit tests.
-4. Could this be a **golden file test**? If the output
-   is deterministic, snapshot it. Regression = visible diff.
-5. Is there **chaos/property testing** for invariants?
-   (KVNemesis for linearizability)
-
-### Evolution Questions:
-1. Can this change be deployed **gradually**? Or does it
-   require all consumers to upgrade atomically?
-2. Is there a **phased** path? (Stop reading → stop
-   writing → remove)
-3. Is the deprecation **visible at compile time**? Or
-   will consumers only discover it at runtime?
-4. Is the migration **idempotent**? Can it be run twice
-   safely?
-
-### Red Flags:
-- Tests that require a running database for unit-level logic
-- No resource leak detection in concurrent code
-- `time.Sleep` / `Process.sleep` in tests instead of
-  deterministic signals
-- Breaking changes without version gates or migration path
-- Deprecation that only appears in docs, not in tooling
-
-<!-- PATTERN_COMPLETE -->