From 758ae5dae4809ce826b436eb3b91e61258bc29d8 Mon Sep 17 00:00:00 2001
From: Rodin
Date: Thu, 30 Apr 2026 09:04:11 -0700
Subject: [PATCH] docs: add patterns extracted from cockroachdb and prometheus

CockroachDB: 4 patterns (Stopper lifecycle, leak detection, two-phase shutdown, CloserFn adapter)
Prometheus: 5 patterns (atomic file ops, DefaultOptions, aligned timestamps, sentinel errors, compile-time interface checks)
---
 sources/cockroachdb-patterns.md | 179 +++++++++++++++++++++++++++++++
 sources/prometheus-patterns.md  | 182 ++++++++++++++++++++++++++++++++
 2 files changed, 361 insertions(+)
 create mode 100644 sources/cockroachdb-patterns.md
 create mode 100644 sources/prometheus-patterns.md

diff --git a/sources/cockroachdb-patterns.md b/sources/cockroachdb-patterns.md
new file mode 100644
index 0000000..6d822df
--- /dev/null
+++ b/sources/cockroachdb-patterns.md
+# Patterns Extracted from cockroachdb/cockroach
+
+## Pattern: Stopper for Goroutine Lifecycle
+
+**Source:** `pkg/util/stop/stopper.go`
+**Category:** concurrency
+
+**What:** A dedicated struct that manages the lifecycle of
+all goroutines in a component: tracks active tasks, refuses
+new work during shutdown (quiesce), waits for completion,
+then runs closers.
+
+**Why:** In distributed systems, clean shutdown is critical.
+You need to: (1) stop accepting new work, (2) finish
+in-flight work, (3) release resources in order. The Stopper
+centralizes this instead of scattering shutdown logic across
+every goroutine.
+
+**Example:**
+
+```go
+type Stopper struct {
+	quiescer chan struct{} // closed when quiescing
+	stopped  chan struct{} // closed when fully stopped
+	mu       struct {
+		syncutil.RWMutex
+		_numTasks           int32
+		quiescing, stopping bool
+		closers             []Closer
+	}
+}
+
+// RunAsyncTask (simplified) refuses new work once quiescing has begun.
+func (s *Stopper) RunAsyncTask(ctx context.Context,
+	taskName string, f func(context.Context)) error {
+	if !s.addTask() {
+		return ErrUnavailable
+	}
+	go func() {
+		defer s.decTask()
+		f(ctx)
+	}()
+	return nil
+}
+```
+
+**When to use:** Any server or subsystem that spawns
+goroutines and needs graceful shutdown, especially
+long-running services where leaked goroutines cause
+resource exhaustion.
+
+**When NOT to use:** Simple programs with a single main
+goroutine, or cases where `errgroup` with context cancellation
+is enough to coordinate shutdown.
+
+---
+
+## Pattern: Tracked Lifecycle with Leak Detection
+
+**Source:** `pkg/util/stop/stopper.go`
+**Category:** testing
+
+**What:** Register every Stopper instance in a global
+tracker. In tests, call `PrintLeakedStoppers(t)` to detect
+any Stopper that was created but never stopped, which
+indicates a resource leak.
+
+**Why:** Distributed systems have complex lifecycle graphs.
+A Stopper that is never stopped silently leaks goroutines and
+connections. The tracker makes leaks fail loudly in tests
+instead of relying on careful manual cleanup.
+
+**Example:**
+
+```go
+var trackedStoppers struct {
+	syncutil.Mutex
+	stoppers []stopperWithStack
+}
+
+func register(s *Stopper) {
+	trackedStoppers.Lock()
+	trackedStoppers.stoppers = append(...)
+	trackedStoppers.Unlock()
+}
+
+func PrintLeakedStoppers(t testing.TB) {
+	for _, tracked := range trackedStoppers.stoppers {
+		t.Errorf("leaked stopper, created at:\n%s",
+			tracked.createdAt)
+	}
+}
+```
+
+**When to use:** Any resource that must be explicitly
+closed/stopped and where forgetting to do so causes silent
+degradation.
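The tracking idea can be tried outside CockroachDB with a small, self-contained sketch. Everything in it (`Resource`, `NewResource`, `Leaked`, the package-level `tracked` variable) is a hypothetical name chosen for illustration, not the API in `stopper.go`; it only assumes that recording the creation stack at construction time is enough to locate a leak later.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"sync"
)

// Resource stands in for anything that must be closed explicitly.
// Hypothetical type; not CockroachDB's Stopper.
type Resource struct{ name string }

// tracked maps each live resource to the stack captured at creation,
// kept so a failing test can report where the leaked resource was made.
var tracked = struct {
	mu   sync.Mutex
	live map[*Resource]string
}{live: make(map[*Resource]string)}

// NewResource creates a resource and registers it with the tracker.
func NewResource(name string) *Resource {
	r := &Resource{name: name}
	tracked.mu.Lock()
	tracked.live[r] = string(debug.Stack())
	tracked.mu.Unlock()
	return r
}

// Close unregisters the resource; whatever stays registered is a leak.
func (r *Resource) Close() {
	tracked.mu.Lock()
	delete(tracked.live, r)
	tracked.mu.Unlock()
}

// Leaked returns the names of resources created but never closed.
func Leaked() []string {
	tracked.mu.Lock()
	defer tracked.mu.Unlock()
	var names []string
	for r := range tracked.live {
		names = append(names, r.name)
	}
	return names
}

func main() {
	a := NewResource("a")
	_ = NewResource("b") // never closed
	a.Close()
	fmt.Println(Leaked()) // [b]
}
```

In a test suite, a check like `Leaked()` would typically run once at the end of the package's tests and fail if the result is non-empty.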
+
+**When NOT to use:** Resources with finalizers or GC-safe
+cleanup. The tracker adds global state, so use it only in tests.
+
+---
+
+## Pattern: Quiesce Then Stop (Two-Phase Shutdown)
+
+**Source:** `pkg/util/stop/stopper.go`
+**Category:** concurrency
+
+**What:** Shutdown has two explicit phases: (1) Quiesce:
+refuse new work, wait for in-flight work to finish; (2) Stop:
+run closers, signal done. Components watch the
+`ShouldQuiesce` channel alongside the context.
+
+**Why:** One-phase shutdown (just cancel context) loses
+in-flight work. Two-phase gives running tasks time to
+complete while preventing new work from starting. The
+explicit channel (vs just context) lets components
+distinguish "winding down" from "dead."
+
+**Example:**
+
+```go
+func worker(ctx context.Context, s *Stopper) {
+	for {
+		select {
+		case <-s.ShouldQuiesce():
+			return // graceful: nothing in flight, exit
+		case <-ctx.Done():
+			return // hard cancel
+		case work := <-workChan:
+			process(work)
+		}
+	}
+}
+```
+
+**When to use:** Servers handling requests where you want
+zero-downtime deploys (drain then stop). Load balancers,
+RPC servers, queue consumers.
+
+**When NOT to use:** Batch jobs or CLIs where immediate
+exit is fine.
+
+---
+
+## Pattern: CloserFn Adapter
+
+**Source:** `pkg/util/stop/stopper.go`
+**Category:** concurrency
+
+**What:** Define a `Closer` interface with one method
+(`Close()`), plus a `CloserFn` type that adapts any
+`func()` into a `Closer`.
+
+**Why:** The adapter pattern (like `http.HandlerFunc`)
+avoids forcing users to define a struct just to implement
+a one-method interface. Cleanup functions can be registered
+directly.
+
+**Example:**
+
+```go
+type Closer interface { Close() }
+type CloserFn func()
+func (f CloserFn) Close() { f() }
+
+// Usage:
+stopper.AddCloser(stop.CloserFn(func() {
+	conn.Close()
+}))
+```
+
+**When to use:** Any one-method interface where callers
+often have a simple function they want to register.
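A runnable version of the adapter may make the mechanics easier to see. The `Closer` and `CloserFn` definitions follow the excerpt; the `registry` type and its `CloseAll` method are hypothetical stand-ins for the Stopper's closer list, and running closers in registration order is an assumption of this sketch rather than a statement about CockroachDB's behavior.

```go
package main

import "fmt"

// Closer is the one-method interface from the excerpt.
type Closer interface{ Close() }

// CloserFn adapts a plain func() to Closer.
type CloserFn func()

// Close calls the wrapped function.
func (f CloserFn) Close() { f() }

// registry is a hypothetical stand-in for the Stopper's closer list.
type registry struct{ closers []Closer }

// AddCloser records a closer to run at shutdown.
func (r *registry) AddCloser(c Closer) { r.closers = append(r.closers, c) }

// CloseAll runs the closers in registration order (an assumption here).
func (r *registry) CloseAll() {
	for _, c := range r.closers {
		c.Close()
	}
}

func main() {
	var r registry
	var closed []string
	// Closures are registered directly; no named struct is needed.
	r.AddCloser(CloserFn(func() { closed = append(closed, "conn") }))
	r.AddCloser(CloserFn(func() { closed = append(closed, "file") }))
	r.CloseAll()
	fmt.Println(closed) // [conn file]
}
```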
+
+**When NOT to use:** Interfaces with >1 method, or when
+the implementation needs state beyond a closure.
+
+
diff --git a/sources/prometheus-patterns.md b/sources/prometheus-patterns.md
new file mode 100644
index 0000000..6a63606
--- /dev/null
+++ b/sources/prometheus-patterns.md
+# Patterns Extracted from prometheus/prometheus
+
+## Pattern: Atomic File Operations with Suffix Convention
+
+**Source:** `tsdb/db.go`
+**Category:** storage
+
+**What:** Use directory suffixes (`.tmp-for-creation`,
+`.tmp-for-deletion`) to make multi-step file operations
+crash-safe. On startup, clean up any dirs with these
+suffixes (they represent incomplete operations).
+
+**Why:** Database storage needs atomicity. If the process
+crashes between creating a block and finalizing it, you
+need to know the block is incomplete. The suffix convention
+makes incomplete state visible at the filesystem level
+without requiring a separate journal.
+
+**Example:**
+
+```go
+const (
+	tmpForDeletionBlockDirSuffix = ".tmp-for-deletion"
+	tmpForCreationBlockDirSuffix = ".tmp-for-creation"
+)
+
+// On startup: remove any .tmp-* dirs (incomplete ops)
+// On create: write to dir.tmp-for-creation, then rename
+// On delete: rename to dir.tmp-for-deletion, then remove
+```
+
+**When to use:** Any system that manages files/directories
+and needs crash consistency without a full WAL. Simpler
+than a write-ahead log for coarse-grained operations.
+
+**When NOT to use:** When you already have a WAL or
+transaction log. Or for fine-grained operations where
+rename semantics are insufficient.
+
+---
+
+## Pattern: DefaultOptions() Function
+
+**Source:** `tsdb/db.go`
+**Category:** configuration
+
+**What:** Provide a `DefaultOptions()` function returning a
+fully-populated config struct. Users copy and override only
+what they need. No nil-means-default ambiguity.
+
+**Why:** Large config structs (20+ fields) are unwieldy.
+By providing sane defaults as a function (not a
+package-level var), you avoid mutation bugs and make it
+clear what "normal" looks like. Users only specify
+deviations.
+
+**Example:**
+
+```go
+func DefaultOptions() *Options {
+	return &Options{
+		WALSegmentSize:    wlog.DefaultSegmentSize,
+		RetentionDuration: int64(15*24*time.Hour / ...),
+		MinBlockDuration:  DefaultBlockDuration,
+		MaxBlockDuration:  DefaultBlockDuration,
+		SamplesPerChunk:   DefaultSamplesPerChunk,
+		// ... 20 more fields with sane defaults
+	}
+}
+
+// Usage:
+opts := tsdb.DefaultOptions()
+opts.RetentionDuration = int64(30 * 24 * time.Hour / time.Millisecond) // int64 milliseconds, not a time.Duration
+db, err := tsdb.Open(dir, nil, nil, opts, nil)
+```
+
+**When to use:** Config structs with many fields where most
+users want defaults, especially when zero-value semantics
+would be confusing (e.g., 0 retention = infinite? or off?).
+
+**When NOT to use:** Small configs (3-4 fields) where a
+struct literal with zero-means-default is clear enough.
+
+---
+
+## Pattern: Scrape Loop with Aligned Timestamps
+
+**Source:** `scrape/scrape.go`
+**Category:** concurrency
+
+**What:** Periodic scrape loops that align timestamps to
+intervals with a small tolerance, enabling better storage
+compression downstream.
+
+**Why:** Time-series databases compress better when
+timestamps are regular. A 2ms tolerance on alignment
+snaps scraped samples to the expected grid while
+absorbing small scheduling jitter.
+
+**Example:**
+
+```go
+var ScrapeTimestampTolerance = 2 * time.Millisecond
+var AlignScrapeTimestamps = true
+
+// In scrape loop: if the scrape starts within tolerance of
+// its scheduled timestamp, record the scheduled timestamp
+```
+
+**When to use:** Any periodic data collection where
+downstream storage benefits from timestamp regularity.
+Metrics, heartbeats, polling loops.
+
+**When NOT to use:** Event-driven data where timestamps
+must reflect actual occurrence time. Audit logs, user
+actions, financial transactions.
+
+---
+
+## Pattern: Sentinel Errors with Interface Check
+
+**Source:** `tsdb/db.go`
+**Category:** error-handling
+
+**What:** Define package-level sentinel errors with
+`errors.New()` so callers can match them by identity. The compile-time
+assertions for storage interfaces are covered by the next pattern.
+
+**Why:** `ErrNotReady` as a sentinel lets callers use
+`errors.Is` for retry logic. Callers compare against the exported
+value, not the message text, so rewording the message breaks nothing.
+
+**Example:**
+
+```go
+var ErrNotReady = errors.New("TSDB not ready")
+
+// Callers can reliably detect this:
+if errors.Is(err, tsdb.ErrNotReady) {
+	// Retry later: DB is still initializing
+}
+```
+
+**When to use:** Any error that callers need to handle
+programmatically (retry, fallback, special UI). Make it a
+named sentinel, not a string comparison.
+
+**When NOT to use:** Errors that are always terminal or
+always logged-and-discarded. Not every error needs a name.
+
+---
+
+## Pattern: Compile-Time Interface Satisfaction
+
+**Source:** `scrape/scrape.go`
+**Category:** organization
+
+**What:** Use `var _ Interface = (*Type)(nil)` to verify at
+compile time that a type satisfies an interface, even if
+the type is only used dynamically.
+
+**Why:** Without this, you discover missing methods only
+when the type is actually used, which might be in a
+rarely-exercised code path or only in production. The
+compile-time check catches it immediately.
+
+**Example:**
+
+```go
+var _ FailureLogger = (*logging.JSONFileLogger)(nil)
+// Fails at compile time if JSONFileLogger doesn't
+// implement FailureLogger
+```
+
+**When to use:** Any type that implements an interface
+consumed dynamically (registered in a map, stored as
+interface value, passed to framework code).
+
+**When NOT to use:** Types whose interface satisfaction is
+already enforced by direct usage in the same package.
+
+
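A self-contained program shows the compile-time assertion next to the type it guards. `Notifier` and `EmailNotifier` are hypothetical names used only for illustration; the one thing carried over from the pattern is the `var _ Interface = (*Type)(nil)` line.

```go
package main

import "fmt"

// Notifier and EmailNotifier are hypothetical names for this sketch.
type Notifier interface {
	Notify(msg string) error
}

// EmailNotifier records messages instead of sending them.
type EmailNotifier struct{ sent []string }

// Notify implements Notifier on the pointer receiver.
func (e *EmailNotifier) Notify(msg string) error {
	e.sent = append(e.sent, msg)
	return nil
}

// Compile-time check: the build fails on this line if *EmailNotifier
// stops implementing Notifier. Assigning to the blank identifier keeps
// the error next to the type definition and leaves no usable variable.
var _ Notifier = (*EmailNotifier)(nil)

func main() {
	// The value is stored behind `any`, so only a runtime type assertion
	// would otherwise reveal a missing method.
	plugins := map[string]any{"email": &EmailNotifier{}}

	n, ok := plugins["email"].(Notifier)
	fmt.Println(ok)                // true
	fmt.Println(n.Notify("hello")) // <nil>
}
```

Removing the `Notify` method would turn the `var _` line into a build error, whereas the type assertion in `main` would merely report `ok == false` at runtime.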