Extension points, deliberate absences, test architecture, and consumer contracts across CockroachDB, Prometheus, Ecto, Oban. Key insight: smaller interface → larger ecosystem.
Ecosystem-Level Patterns: How Codebases Present to Consumers
The Three Questions
For each codebase, ask:
- How do consumers extend it? (What interfaces/behaviours do they implement?)
- How do consumers compose with it? (What does day-to-day usage look like?)
- What does it deliberately NOT do? (What forces shaped those refusals?)
CockroachDB: Errors as First-Class Distributed Data
Extension Points
CockroachDB is not a library — it is a system. Consumers extend it through:
- SQL builtins (function registration)
- Storage engines (via pebble interface)
- Service discovery (not user-extensible — closed)
The interesting pattern is how errors flow from storage through KV through SQL to the client.
Error Architecture (ecosystem-level idiom)
Storage error → encoded via cockroachdb/errors →
KV wraps with context → serialized across gRPC →
SQL decodes → maps to pgcode → wire protocol to client
Key design decisions:
- Errors have priority. ErrPriority() ranks errors so the system knows which to surface when multiple things fail simultaneously. Transaction abort > restart > unambiguous error > non-retriable.
- Errors survive serialization. EncodeError/DecodeError serialize errors across RPC boundaries. The error that originated on node 3 arrives at node 1 with its full cause chain intact.
- Errors map to pg codes. Every internal error maps to a Postgres error code that clients understand. This is the ecosystem contract: clients write if pgcode == '40001' { retry }.
What this teaches: In a distributed system, an error isn't a string — it's a data object with identity, priority, serializability, and a consumer-facing code. Design your error types for the consumer, not the producer.
Deliberate Absences
- No dependency injection framework. Config structs are passed explicitly. StoreConfig is a 1178-line struct, but it's all data, with no framework magic.
- No context.Background() on hot paths. 144 uses in kvserver, but auditable: each is justified in comments.
- No functional options. CockroachDB uses config structs universally. The Option interface in stopper is the exception, not the rule.
Test Architecture
- TestMain in every package. Sets up security certs, random seeds, and test server factories.
- Goroutine leak detection. defer leaktest.AfterTest(t)() at the start of every test. Detects leaked goroutines by diffing goroutine stacks before/after.
- Stopper leak detection. Every Stopper is tracked globally; PrintLeakedStoppers(t) in TestMain catches forgot-to-stop bugs.
- //go:generate for test setup. Codegen tool (add-leaktest.sh) auto-adds leak checks to every test file.
What this teaches: At scale, the most important test infrastructure isn't assertions — it's resource leak detection. Every goroutine, every connection, every Stopper is tracked and verified to be cleaned up.
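A minimal sketch of the tracking idea, assuming a simple name-keyed registry: every resource registers on creation and deregisters on release, and a final check reports whatever is still open. The tracker type and its Open and Leaked methods are hypothetical and much simpler than CockroachDB's Stopper tracking.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// tracker is a hypothetical registry of live resources. It is a
// simplified stand-in for the global Stopper tracking described above,
// not CockroachDB's implementation.
type tracker struct {
	mu   sync.Mutex
	open map[string]struct{}
}

func newTracker() *tracker {
	return &tracker{open: make(map[string]struct{})}
}

// Open records a resource and returns the function that releases it.
func (t *tracker) Open(name string) (release func()) {
	t.mu.Lock()
	t.open[name] = struct{}{}
	t.mu.Unlock()
	return func() {
		t.mu.Lock()
		delete(t.open, name)
		t.mu.Unlock()
	}
}

// Leaked returns the names of resources that were opened but never
// released, sorted for stable output.
func (t *tracker) Leaked() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	names := make([]string, 0, len(t.open))
	for name := range t.open {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	tr := newTracker()

	closeConn := tr.Open("conn")
	tr.Open("stopper") // release function dropped: this one leaks
	closeConn()

	// A TestMain could run this check after m.Run() and fail the package.
	if leaked := tr.Leaked(); len(leaked) > 0 {
		fmt.Println("leaked resources:", leaked)
	}
}
```

Running the check once per package, rather than per assertion, is what lets it catch the bug nobody thought to test for.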
Prometheus: The One-Method Interface Contract
Extension Points
Prometheus is extended through:
- Service discovery (30 implementations, 1 interface)
- Storage (remote read/write adapters)
- Exporters (client_golang metrics)
The Discoverer Pattern (ecosystem-level idiom)
type Discoverer interface {
    Run(ctx context.Context, up chan<- []*targetgroup.Group)
}
This is one method. Thirty implementations. The channel-based push model means:
- The discoverer controls timing (not polled)
- The manager multiplexes without knowing implementations
- Adding a new discovery source = implement Run, register
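A sketch of what one implementation might look like. To stay self-contained it uses a local Group type as a stand-in for targetgroup.Group; staticDiscoverer and firstUpdate are hypothetical names, not part of Prometheus.

```go
package main

import (
	"context"
	"fmt"
)

// Group is a local stand-in for targetgroup.Group so the sketch
// compiles without importing Prometheus.
type Group struct {
	Source  string
	Targets []string
}

// Discoverer mirrors the shape of the one-method interface above.
type Discoverer interface {
	Run(ctx context.Context, up chan<- []*Group)
}

// staticDiscoverer is a hypothetical implementation that announces a
// fixed target list once, then idles until cancelled.
type staticDiscoverer struct {
	targets []string
}

func (d staticDiscoverer) Run(ctx context.Context, up chan<- []*Group) {
	groups := []*Group{{Source: "static", Targets: d.targets}}
	select {
	case up <- groups: // the discoverer decides when to push
	case <-ctx.Done():
		return
	}
	<-ctx.Done() // nothing more to report; wait for shutdown
}

// firstUpdate starts a discoverer and returns its first update. The
// caller only sees the interface, never the concrete type.
func firstUpdate(d Discoverer) []*Group {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	up := make(chan []*Group)
	go d.Run(ctx, up)
	return <-up
}

func main() {
	groups := firstUpdate(staticDiscoverer{targets: []string{"10.0.0.1:9100"}})
	fmt.Println(groups[0].Source, groups[0].Targets)
}
```

A discoverer backed by a cloud API would differ only inside Run: poll or watch on its own schedule and push whenever the target set changes.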
Registration via init():
func init() {
    discovery.RegisterConfig(&SDConfig{})
}
This is the classic Go plugin pattern. Import the package → init registers it → the system discovers it at startup.
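The registration mechanics can be sketched in a single file. The Config interface, the registry map, and the two config types are hypothetical; in a real program each init would live in its own package and run when that package is imported.

```go
package main

import (
	"fmt"
	"sort"
)

// Config is a hypothetical stand-in for a discovery config type.
type Config interface {
	Name() string
}

// registry maps config names to implementations. Package-level
// variables are initialized before any init function runs.
var registry = map[string]Config{}

// RegisterConfig adds a config to the registry. Panicking on a
// duplicate name makes a clash fail at startup rather than at use.
func RegisterConfig(c Config) {
	if _, dup := registry[c.Name()]; dup {
		panic("duplicate config: " + c.Name())
	}
	registry[c.Name()] = c
}

// registered lists the known config names in sorted order.
func registered() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

type dnsConfig struct{}

func (dnsConfig) Name() string { return "dns" }

type fileConfig struct{}

func (fileConfig) Name() string { return "file" }

// In a multi-package layout, each of these calls would sit in the
// init function of the package that defines the config.
func init() {
	RegisterConfig(dnsConfig{})
	RegisterConfig(fileConfig{})
}

func main() {
	fmt.Println(registered())
}
```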
What this teaches: The smallest possible interface creates the largest possible ecosystem. One method + one channel = 30 implementations without coordination.
Storage Contract (15 interfaces, 1 file)
All of Prometheus's storage contract lives in storage/interface.go. It covers:
- Read path: Queryable → Querier → SeriesSet → Series
- Write path: Appendable → Appender
- Extension: ExemplarAppender, MetadataUpdater
Key: Every implementation proves satisfaction at compile time with var _ storage.Searcher = &type{}. When the contract evolves, the compiler finds every broken implementation.
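A small sketch of the compile-time assertion, using a simplified local Appender interface rather than the real Prometheus one. The memAppender type and its fields are hypothetical.

```go
package main

import "fmt"

// Appender is a local stand-in for a storage write-path interface; the
// method set is simplified for illustration.
type Appender interface {
	Append(series string, value float64) error
	Commit() error
}

// memAppender is a hypothetical in-memory implementation.
type memAppender struct {
	pending   map[string]float64
	committed map[string]float64
}

// Compile-time proof that *memAppender satisfies Appender. If a method
// is added to the interface, this line stops compiling.
var _ Appender = (*memAppender)(nil)

func newMemAppender() *memAppender {
	return &memAppender{
		pending:   map[string]float64{},
		committed: map[string]float64{},
	}
}

// Append stages a sample; nothing is visible until Commit.
func (a *memAppender) Append(series string, value float64) error {
	a.pending[series] = value
	return nil
}

// Commit publishes all staged samples and clears the staging area.
func (a *memAppender) Commit() error {
	for series, value := range a.pending {
		a.committed[series] = value
	}
	a.pending = map[string]float64{}
	return nil
}

func main() {
	var app Appender = newMemAppender()
	_ = app.Append("up", 1)
	_ = app.Commit()
	fmt.Println("committed")
}
```

The (*memAppender)(nil) form costs nothing at runtime: it converts a nil pointer rather than allocating a value, and the blank identifier discards it.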
Deliberate Absences
- No generics in storage interfaces. Despite Go supporting generics since 1.18. The interfaces predate generics and adding them would break all existing implementations.
- No dependency injection. Direct struct construction everywhere. Testability through interface satisfaction, not framework wiring.
- Almost no functional options. Only in leaf packages (chunk writer, parser). Core APIs use config structs.
- No goroutine leaks in production code. goleak runs in tests; TolerantVerifyLeak uses an explicit allowlist for known third-party leaks.
Test Architecture
- TolerantVerifyLeak: goroutine leak detection with allowlist for known third-party leaks (opencensus, klog)
- Mock implementations of every interface, defined right in storage/interface.go next to the real ones
- Golden file tests in PromQL evaluation
Ecto: Composability as Architectural Principle
Extension Points
Consumers extend Ecto through:
- Custom types (7 callbacks: cast, load, dump, equal?, embed_as, autogenerate, type)
- Adapters (Queryable, Schema, Transaction, Storage — 4 behaviour modules)
- Protocols (Ecto.Queryable: anything can become a query)
The NotLoaded Sentinel (ecosystem-level idiom)
defmodule Ecto.Association.NotLoaded do
  defstruct [:__field__, :__owner__, :__cardinality__]
end
Ecto refuses to lazy-load associations. If you access
user.posts without preloading, you get a NotLoaded
struct — not nil, not an empty list, not a database query.
Why this is an ecosystem decision:
- Forces consumers to be explicit about data needs
- Prevents accidental N+1 queries: touching an association never triggers a hidden per-row query
- Makes the data boundary visible in code
This is a consumer-hostile decision that makes systems built on Ecto dramatically better. The library optimizes for the 1000th user, not the first-day experience.
Query Composition (ecosystem-level idiom)
Every query clause appends to a list in the Query struct. Nothing executes. The Query is pure data that accumulates intent.
Consumer impact: You can build queries across module boundaries:
# Module A builds the base
def active_users, do: from(u in User, where: u.active)

# Module B adds pagination
def paginate(query, page, size) do
  query
  |> limit(^size)
  |> offset(^((page - 1) * size))
end

# Module C adds authorization
def visible_to(query, role) do
  where(query, [u], u.role in ^roles_for(role))
end
Each module is independent. They compose because queries are data, not effects.
Adapter Architecture
Ecto.Repo.all(query)
→ Planner resolves types, bindings
→ Adapter.prepare/2 produces {cache, prepared}
→ Adapter.execute/5 runs against DB
→ Adapter.loaders/2 converts back to Elixir types
The adapter is the ONLY part that knows SQL. Ecto core is database-agnostic. This is why the same code works on Postgres, MySQL, SQLite, and custom stores.
Deliberate Absences
- No lazy loading. NotLoaded struct instead.
- No global state. Per-repo config, per-repo process.
- No query caching at library level. The adapter caches prepared statements; Ecto doesn't.
- No connection to schema naming. schema "legacy_tbl" is independent of defmodule NewUser.
Oban: Designing for Testability First
Extension Points
Consumers extend Oban through:
- Workers (perform/1: the job logic)
- Plugins (GenServer + validate callback)
- Engines (entire backend swap)
- Notifiers (pub/sub mechanism)
- Peers (leader election)
The Worker Result Type (ecosystem-level idiom)
@type result ::
        :ok
        | {:ok, ignored :: term()}
        | {:error, reason :: term()}
        | {:cancel, reason :: term()}
        | {:snooze, period :: Period.t()}
Five possible outcomes, each with distinct semantics:
- :ok → success, remove from queue
- {:ok, value} → success, the value is ignored
- {:error, reason} → retry (respects max_attempts)
- {:cancel, reason} → permanent failure, don't retry
- {:snooze, period} → reschedule for later
Ecosystem impact: Every worker author makes an explicit decision about failure semantics. "What should happen when this fails?" is answered in the type system, not in configuration.
Contextual Backoff (ecosystem-level idiom)
def backoff(%Job{attempt: attempt, unsaved_error: err}) do
  case err.reason do
    %RateLimitError{retry_after: ms} -> ms
    _ -> trunc(:math.pow(attempt, 4) + jitter())
  end
end
The error that caused the failure is available to the backoff calculation. Different errors → different retry strategies. This is impossible in systems where backoff is configured globally.
Testing Design (ecosystem-level idiom)
Three testing modes via config:
- :inline: execute jobs synchronously in tests
- :manual: enqueue but don't execute
- :disabled: production behavior
Plus use Oban.Testing, which provides:
- assert_enqueued/1: verify job was queued
- refute_enqueued/1: verify job was NOT queued
- perform_job/2: execute a job manually in tests
- all_enqueued/1: list all matching jobs
Ecosystem impact: Every Oban consumer gets deterministic, fast, isolated tests for free. No sleep, no polling, no flaky async assertions.
Deliberate Absences
- No global process names. Registry.via everywhere — multiple Oban instances can coexist.
- No direct DB coupling in workers. Workers receive a Job struct; they don't import Repo.
- No unbounded retries. max_attempts caps every job and can be set per worker. No "retry forever" default.
- No built-in rate limiting in OSS. That is a Pro feature — deliberate business boundary.
Cross-Cutting: What "Idiomatic" Means at Ecosystem Level
1. The Consumer Contract is the API
Not the functions you export — the experience of building on your system:
- CockroachDB: "Your errors will be pg-codes, always"
- Prometheus: "Implement Run(), get discovery for free"
- Ecto: "Queries are data; loading is always explicit"
- Oban: "Return a result type; testing is built in"
2. Deliberate Absences Define Character
What a system refuses to do is as important as what it does:
- Ecto refuses lazy loading → forces explicit data needs
- Oban refuses global names → enables multi-instance
- Prometheus refuses DI frameworks → keeps simplicity
- CockroachDB refuses context.Background on hot paths → forces timeout discipline
3. Testability is Never Retrofitted
Every system that tests well designed testing in from the start:
- CockroachDB: leak detection, stopper tracking
- Prometheus: goroutine leak verification, mock interfaces
- Ecto: adapter abstraction, embedded schemas for testing
- Oban: engine swap, testing modes, assertion helpers
4. Extension Points Define the Ecosystem Size
- Prometheus: 1 interface, 30 discoverers
- Ecto: 7 type callbacks, hundreds of custom types
- Oban: Worker behaviour + 5 engine callbacks
Smaller interface → larger ecosystem. The less you demand from implementors, the more you get.