docs: add patterns extracted from cockroachdb and prometheus

CockroachDB: 4 patterns (Stopper lifecycle, leak detection, two-phase shutdown, CloserFn adapter)
Prometheus: 5 patterns (atomic file ops, DefaultOptions, aligned timestamps, sentinel errors, compile-time interface checks)
Rodin
2026-04-30 09:04:11 -07:00
parent 1ef2a4a189
commit 758ae5dae4
2 changed files with 361 additions and 0 deletions
@@ -0,0 +1,179 @@
# Patterns Extracted from cockroachdb/cockroach
## Pattern: Stopper for Goroutine Lifecycle
**Source:** `pkg/util/stop/stopper.go`
**Category:** concurrency
**What:** A dedicated struct that manages the lifecycle of
all goroutines in a component: tracks active tasks, refuses
new work during shutdown (quiesce), waits for completion,
then runs closers.
**Why:** In distributed systems, clean shutdown is critical.
You need to: (1) stop accepting new work, (2) finish
in-flight work, (3) release resources in order. The Stopper
centralizes this instead of scattering shutdown logic across
every goroutine.
**Example:**
```go
type Stopper struct {
	quiescer chan struct{} // closed when quiescing
	stopped  chan struct{} // closed when fully stopped
	mu       struct {
		syncutil.RWMutex
		_numTasks           int32
		quiescing, stopping bool
		closers             []Closer
	}
}

// RunAsyncTask refuses new work during quiesce
func (s *Stopper) RunAsyncTask(ctx context.Context,
	taskName string, f func(context.Context)) error {
	if !s.addTask() {
		return ErrUnavailable
	}
	go func() {
		defer s.decTask()
		f(ctx)
	}()
	return nil
}
```
**When to use:** Any server or subsystem that spawns
goroutines and needs graceful shutdown. Especially in
long-running services where leaked goroutines cause
resource exhaustion.
**When NOT to use:** Simple programs with a single main
goroutine, or cases where `errgroup` with context
cancellation is enough to coordinate shutdown.
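A minimal, self-contained sketch of the same idea follows. It is not the CockroachDB implementation: `MiniStopper` and its methods are hypothetical names, and a `sync.WaitGroup` stands in for the task counter.
```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrUnavailable is returned once shutdown has begun (name borrowed
// from the example above; this value is local to the sketch).
var ErrUnavailable = errors.New("stopper is quiescing; task refused")

// MiniStopper is a hypothetical, simplified stopper.
type MiniStopper struct {
	mu        sync.Mutex
	quiescing bool
	tasks     sync.WaitGroup
	closers   []func()
}

// RunAsyncTask starts f in a goroutine unless shutdown has begun.
func (s *MiniStopper) RunAsyncTask(f func()) error {
	s.mu.Lock()
	if s.quiescing {
		s.mu.Unlock()
		return ErrUnavailable
	}
	s.tasks.Add(1) // counted under the lock, so Stop cannot miss it
	s.mu.Unlock()

	go func() {
		defer s.tasks.Done()
		f()
	}()
	return nil
}

// AddCloser registers a cleanup function to run during Stop.
func (s *MiniStopper) AddCloser(c func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.closers = append(s.closers, c)
}

// Stop refuses new tasks, waits for in-flight ones, then runs closers.
func (s *MiniStopper) Stop() {
	s.mu.Lock()
	s.quiescing = true
	closers := s.closers
	s.closers = nil
	s.mu.Unlock()

	s.tasks.Wait()
	for _, c := range closers {
		c()
	}
}

func main() {
	s := &MiniStopper{}
	done := false
	_ = s.RunAsyncTask(func() { done = true })
	s.AddCloser(func() { fmt.Println("closer ran; task finished:", done) })
	s.Stop()
	fmt.Println("after stop:", s.RunAsyncTask(func() {}))
}
```
Taking the lock around both the `quiescing` check and `tasks.Add(1)` is the design choice that matters here: it rules out a task being admitted after `Stop` has started waiting.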
---
## Pattern: Tracked Lifecycle with Leak Detection
**Source:** `pkg/util/stop/stopper.go`
**Category:** testing
**What:** Register every Stopper instance in a global
tracker. In tests, call `PrintLeakedStoppers(t)` to detect
any Stopper that was created but never stopped — indicating
a resource leak.
**Why:** Distributed systems have complex lifecycle graphs.
A missing `Stop` call silently leaks goroutines and
connections. The tracker makes such leaks fail loudly in
tests instead of relying on manual review of cleanup code.
**Example:**
```go
var trackedStoppers struct {
	syncutil.Mutex
	stoppers []stopperWithStack
}

func register(s *Stopper) {
	trackedStoppers.Lock()
	trackedStoppers.stoppers = append(...)
	trackedStoppers.Unlock()
}

func PrintLeakedStoppers(t testing.TB) {
	// Hold the lock while reading the shared slice.
	trackedStoppers.Lock()
	defer trackedStoppers.Unlock()
	for _, tracked := range trackedStoppers.stoppers {
		t.Errorf("leaked stopper, created at:\n%s",
			tracked.createdAt)
	}
}
```
**When to use:** Any resource that must be explicitly
closed/stopped and where forgetting to do so causes silent
degradation.
**When NOT to use:** Resources with finalizers or GC-safe
cleanup. The tracker adds global state, so use it only in
tests.
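The register/unregister cycle can be sketched in a self-contained form as below. The `resource` type, `newResource`, and `leakedResources` are hypothetical names for illustration; in a real test, each reported leak would be passed to `t.Errorf`.
```go
package main

import (
	"fmt"
	"runtime/debug"
	"sync"
)

// resource is a hypothetical type that must be closed explicitly.
type resource struct{ name string }

// leak describes one resource that was created but never closed.
type leak struct {
	name  string
	stack string // stack trace captured at creation time
}

// tracked maps every live resource to the stack that created it.
var tracked = struct {
	sync.Mutex
	live map[*resource]string
}{live: map[*resource]string{}}

// newResource creates a resource and registers it with the tracker.
func newResource(name string) *resource {
	r := &resource{name: name}
	tracked.Lock()
	tracked.live[r] = string(debug.Stack())
	tracked.Unlock()
	return r
}

// Close unregisters the resource; a closed resource is not a leak.
func (r *resource) Close() {
	tracked.Lock()
	delete(tracked.live, r)
	tracked.Unlock()
}

// leakedResources reports every resource still registered.
func leakedResources() []leak {
	tracked.Lock()
	defer tracked.Unlock()
	leaks := make([]leak, 0, len(tracked.live))
	for r, stack := range tracked.live {
		leaks = append(leaks, leak{name: r.name, stack: stack})
	}
	return leaks
}

func main() {
	a := newResource("a")
	b := newResource("b")
	a.Close()
	for _, l := range leakedResources() {
		fmt.Printf("leaked %q, created at:\n%s\n", l.name, l.stack)
	}
	b.Close() // leave the global tracker empty
}
```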
---
## Pattern: Quiesce Then Stop (Two-Phase Shutdown)
**Source:** `pkg/util/stop/stopper.go`
**Category:** concurrency
**What:** Shutdown has two explicit phases: (1) Quiesce —
refuse new work, wait for in-flight to finish; (2) Stop —
run closers, signal done. Components observe
`ShouldQuiesce` channel alongside context.
**Why:** One-phase shutdown (just cancel the context) can
abort in-flight work. Two-phase shutdown gives running tasks
time to complete while preventing new work from starting.
The explicit channel (rather than context alone) lets
components distinguish "winding down" from "dead."
**Example:**
```go
// workChan and process are placeholders for the
// component's own work queue and handler.
func worker(ctx context.Context, s *Stopper) {
	for {
		select {
		case <-s.ShouldQuiesce():
			return // graceful: take no new work, exit
		case <-ctx.Done():
			return // hard cancel
		case work := <-workChan:
			process(work)
		}
	}
}
```
**When to use:** Servers handling requests where you want
zero-downtime deploys (drain then stop). Load balancers,
RPC servers, queue consumers.
**When NOT to use:** Batch jobs or CLIs where immediate
exit is fine.
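The ordering of the two phases can be sketched as follows. The `drainer` type and its methods are hypothetical and much simpler than the real Stopper; the sketch only shows that resources are released after in-flight work has drained.
```go
package main

import (
	"fmt"
	"sync"
)

// drainer is a hypothetical two-phase shutdown helper.
type drainer struct {
	quiesce chan struct{} // closed at the start of phase 1
	stopped chan struct{} // closed at the end of phase 2
	workers sync.WaitGroup
}

func newDrainer() *drainer {
	return &drainer{
		quiesce: make(chan struct{}),
		stopped: make(chan struct{}),
	}
}

// startWorker processes items from work until quiescing begins.
// It must be called before shutdown.
func (d *drainer) startWorker(work <-chan int, process func(int)) {
	d.workers.Add(1)
	go func() {
		defer d.workers.Done()
		for {
			select {
			case <-d.quiesce:
				return // winding down: take no further items
			case item := <-work:
				process(item) // an item already taken always completes
			}
		}
	}()
}

// shutdown runs phase 1 (quiesce and drain), then phase 2 (release).
func (d *drainer) shutdown(closeResources func()) {
	close(d.quiesce) // phase 1: stop taking new work
	d.workers.Wait() // wait for in-flight items to finish
	closeResources() // phase 2: safe, no worker is running
	close(d.stopped) // signal that shutdown is complete
}

func main() {
	d := newDrainer()
	work := make(chan int) // unbuffered: a send returns once a worker has the item
	var processed []int
	d.startWorker(work, func(i int) { processed = append(processed, i) })

	work <- 1
	d.shutdown(func() { fmt.Println("closing after items:", processed) })
	<-d.stopped
}
```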
---
## Pattern: CloserFn Adapter
**Source:** `pkg/util/stop/stopper.go`
**Category:** concurrency
**What:** Define a `Closer` interface with one method
(`Close()`), plus a `CloserFn` type that adapts any
function into a Closer.
**Why:** The adapter pattern (like `http.HandlerFunc`)
avoids forcing users to define a struct just to implement
a one-method interface. Cleanup functions can be registered
directly.
**Example:**
```go
type Closer interface{ Close() }

type CloserFn func()

func (f CloserFn) Close() { f() }

// Usage:
stopper.AddCloser(stop.CloserFn(func() {
	conn.Close()
}))
```
**When to use:** Any one-method interface where callers
often have a simple function they want to register.
**When NOT to use:** Interfaces with >1 method, or when
the implementation needs state beyond a closure.
<!-- PATTERN_COMPLETE -->
@@ -0,0 +1,182 @@
# Patterns Extracted from prometheus/prometheus
## Pattern: Atomic File Operations with Suffix Convention
**Source:** `tsdb/db.go`
**Category:** storage
**What:** Use directory suffixes (`.tmp-for-creation`,
`.tmp-for-deletion`) to make multi-step file operations
crash-safe. On startup, clean up any dirs with these
suffixes (they represent incomplete operations).
**Why:** Database storage needs atomicity. If the process
crashes between creating a block and finalizing it, you
need to know the block is incomplete. The suffix convention
makes incomplete state visible at the filesystem level
without requiring a separate journal.
**Example:**
```go
const (
	tmpForDeletionBlockDirSuffix = ".tmp-for-deletion"
	tmpForCreationBlockDirSuffix = ".tmp-for-creation"
)

// On startup: remove any .tmp-* dirs (incomplete ops)
// On create: write to dir.tmp-for-creation, then rename
// On delete: rename to dir.tmp-for-deletion, then remove
```
**When to use:** Any system that manages files/directories
and needs crash consistency without a full WAL. Simpler
than a write-ahead log for coarse-grained operations.
**When NOT to use:** When you already have a WAL or
transaction log, or for fine-grained operations where
rename semantics are insufficient.
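The three steps listed in the example comments can be sketched with the standard library alone. This is an illustration, not the Prometheus code: `createBlock`, `deleteBlock`, `cleanupIncomplete`, and `runDemo` are hypothetical helpers, and the sketch omits the fsync calls a durable implementation would also need.
```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

const (
	tmpForCreationSuffix = ".tmp-for-creation"
	tmpForDeletionSuffix = ".tmp-for-deletion"
)

// createBlock writes into a suffixed temp dir, then renames it into place.
func createBlock(root, name string, data []byte) error {
	final := filepath.Join(root, name)
	tmp := final + tmpForCreationSuffix
	if err := os.MkdirAll(tmp, 0o777); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(tmp, "data"), data, 0o666); err != nil {
		return err
	}
	return os.Rename(tmp, final) // the rename is the commit point
}

// deleteBlock renames first, so an interrupted removal leaves a marked dir.
func deleteBlock(root, name string) error {
	tmp := filepath.Join(root, name) + tmpForDeletionSuffix
	if err := os.Rename(filepath.Join(root, name), tmp); err != nil {
		return err
	}
	return os.RemoveAll(tmp)
}

// cleanupIncomplete removes dirs left behind by interrupted operations.
func cleanupIncomplete(root string) error {
	entries, err := os.ReadDir(root)
	if err != nil {
		return err
	}
	for _, e := range entries {
		n := e.Name()
		if strings.HasSuffix(n, tmpForCreationSuffix) || strings.HasSuffix(n, tmpForDeletionSuffix) {
			if err := os.RemoveAll(filepath.Join(root, n)); err != nil {
				return err
			}
		}
	}
	return nil
}

// runDemo simulates a crash leftover, cleans up, and lists what remains.
func runDemo() ([]string, error) {
	root, err := os.MkdirTemp("", "blocks")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	steps := []func() error{
		// A half-created block, as if the process had crashed mid-write.
		func() error { return os.Mkdir(filepath.Join(root, "b0"+tmpForCreationSuffix), 0o777) },
		func() error { return cleanupIncomplete(root) },
		func() error { return createBlock(root, "b1", []byte("x")) },
		func() error { return createBlock(root, "b2", []byte("y")) },
		func() error { return deleteBlock(root, "b2") },
	}
	for _, step := range steps {
		if err := step(); err != nil {
			return nil, err
		}
	}
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	names, err := runDemo()
	fmt.Println(names, err)
}
```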
---
## Pattern: DefaultOptions() Function
**Source:** `tsdb/db.go`
**Category:** configuration
**What:** Provide a `DefaultOptions()` function returning a
fully-populated config struct. Users copy and override only
what they need. No nil-means-default ambiguity.
**Why:** Large config structs (20+ fields) are unwieldy.
By providing sane defaults as a function (not a
package-level var), you avoid mutation bugs and make it
clear what "normal" looks like. Users only specify
deviations.
**Example:**
```go
func DefaultOptions() *Options {
	return &Options{
		WALSegmentSize:    wlog.DefaultSegmentSize,
		RetentionDuration: int64(15*24*time.Hour / ...),
		MinBlockDuration:  DefaultBlockDuration,
		MaxBlockDuration:  DefaultBlockDuration,
		SamplesPerChunk:   DefaultSamplesPerChunk,
		// ... 20 more fields with sane defaults
	}
}

// Usage:
opts := tsdb.DefaultOptions()
// RetentionDuration is an int64, not a time.Duration, so
// convert explicitly (milliseconds assumed here).
opts.RetentionDuration = int64(30 * 24 * time.Hour / time.Millisecond)
db, err := tsdb.Open(dir, nil, nil, opts, nil)
```
**When to use:** Config structs with many fields where most
users want defaults. Especially when zero-value semantics
would be confusing (e.g., 0 retention = infinite? or off?).
**When NOT to use:** Small configs (3-4 fields) where
struct literal with zero-means-default is clear enough.
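The mutation-safety argument is easiest to see in a small runnable sketch. The `Options` struct and its fields below are hypothetical, not the tsdb ones; the point is that each call returns a fresh value, so one caller's overrides cannot leak into another's defaults.
```go
package main

import (
	"fmt"
	"time"
)

// Options is a hypothetical config struct for illustration.
type Options struct {
	Retention    time.Duration
	MaxOpenFiles int
	Compress     bool
}

// DefaultOptions returns a new, fully populated Options on every call.
// A package-level variable would instead be shared and mutable.
func DefaultOptions() *Options {
	return &Options{
		Retention:    15 * 24 * time.Hour,
		MaxOpenFiles: 256,
		Compress:     true,
	}
}

func main() {
	opts := DefaultOptions()
	opts.Retention = 30 * 24 * time.Hour // override only what differs

	// A later caller still sees the untouched defaults.
	fmt.Println(opts.Retention, DefaultOptions().Retention)
}
```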
---
## Pattern: Scrape Loop with Aligned Timestamps
**Source:** `scrape/scrape.go`
**Category:** concurrency
**What:** Periodic scrape loops that align timestamps to
intervals with a small tolerance, enabling better storage
compression downstream.
**Why:** Time-series databases compress better when
timestamps are regular. A 2ms tolerance on alignment
means scraped data aligns to the expected grid while
accommodating real-world jitter.
**Example:**
```go
var ScrapeTimestampTolerance = 2 * time.Millisecond
var AlignScrapeTimestamps = true
// In scrape loop: if scrape finishes within tolerance
// of expected timestamp, snap to the grid
```
**When to use:** Any periodic data collection where
downstream storage benefits from timestamp regularity.
Metrics, heartbeats, polling loops.
**When NOT to use:** Event-driven data where timestamps
must reflect actual occurrence time. Audit logs, user
actions, financial transactions.
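A rough sketch of the snapping step follows, assuming timestamps are integer milliseconds and the grid is `start + k*interval`. The function `alignTimestamp` is a hypothetical helper written for this note; it does not reproduce the exact logic in `scrape/scrape.go`.
```go
package main

import "fmt"

// alignTimestamp snaps actualMs to the nearest grid point
// (startMs + k*intervalMs) when it lies within toleranceMs of it.
// Otherwise the measured timestamp is returned unchanged.
// Assumes actualMs >= startMs and intervalMs > 0.
func alignTimestamp(startMs, actualMs, intervalMs, toleranceMs int64) int64 {
	elapsed := actualMs - startMs
	k := (elapsed + intervalMs/2) / intervalMs // index of the nearest tick
	expected := startMs + k*intervalMs

	diff := actualMs - expected
	if diff < 0 {
		diff = -diff
	}
	if diff <= toleranceMs {
		return expected // small jitter: store the regular timestamp
	}
	return actualMs // large delay: keep the real time
}

func main() {
	const interval, tolerance = 15000, 2 // 15s interval, 2ms tolerance
	for _, actual := range []int64{15001, 29999, 15010} {
		fmt.Println(actual, "->", alignTimestamp(0, actual, interval, tolerance))
	}
}
```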
---
## Pattern: Sentinel Errors with Interface Check
**Source:** `tsdb/db.go`
**Category:** error-handling
**What:** Define package-level sentinel errors with
`errors.New()` and use compile-time interface assertions
to verify implementations satisfy storage interfaces.
**Why:** `ErrNotReady` as a sentinel lets callers use
`errors.Is` for retry logic, even when the error has been
wrapped with extra context. Error identity stays stable
across versions because it is not string-matched.
**Example:**
```go
var ErrNotReady = errors.New("TSDB not ready")

// Callers can reliably detect this:
if errors.Is(err, tsdb.ErrNotReady) {
	// Retry later — DB is still initializing
}
```
**When to use:** Any error that callers need to handle
programmatically (retry, fallback, special UI). Make it a
named sentinel, not a string comparison.
**When NOT to use:** Errors that are always terminal or
always logged-and-discarded. Not every error needs a name.
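A self-contained sketch of retry logic built on a sentinel is shown below. The `store` type, `getWithRetry`, and `isNotReady` are hypothetical; only `errors.New`, `errors.Is`, and `fmt.Errorf` with `%w` come from the standard library.
```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotReady is a package-level sentinel (local to this sketch).
var ErrNotReady = errors.New("store not ready")

// store is a hypothetical component that needs time to initialize.
type store struct{ ready bool }

// Get wraps the sentinel with context; errors.Is still matches it.
func (s *store) Get(key string) (string, error) {
	if !s.ready {
		return "", fmt.Errorf("get %q: %w", key, ErrNotReady)
	}
	return "value-of-" + key, nil
}

// isNotReady reports whether err is, or wraps, ErrNotReady.
func isNotReady(err error) bool { return errors.Is(err, ErrNotReady) }

// getWithRetry retries only on ErrNotReady; other errors are terminal.
// beforeRetry stands in for a backoff or wait step.
func getWithRetry(s *store, key string, attempts int, beforeRetry func()) (string, error) {
	var err error
	for i := 0; i < attempts; i++ {
		var v string
		v, err = s.Get(key)
		if err == nil {
			return v, nil
		}
		if !isNotReady(err) {
			return "", err
		}
		beforeRetry()
	}
	return "", err
}

func main() {
	s := &store{}
	v, err := getWithRetry(s, "k", 3, func() { s.ready = true })
	fmt.Println(v, err)
}
```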
---
## Pattern: Compile-Time Interface Satisfaction
**Source:** `scrape/scrape.go`
**Category:** organization
**What:** Use `var _ Interface = (*Type)(nil)` to verify at
compile time that a type satisfies an interface, even if
the type is only used dynamically.
**Why:** Without this, you discover missing methods only
when the type is actually used — which might be in a
rarely-exercised code path or only in production. The
compile-time check catches it immediately.
**Example:**
```go
var _ FailureLogger = (*logging.JSONFileLogger)(nil)
// Fails at compile time if JSONFileLogger doesn't
// implement FailureLogger
```
**When to use:** Any type that implements an interface
consumed dynamically (registered in a map, stored as
interface value, passed to framework code).
**When NOT to use:** Types whose interface satisfaction is
already enforced by direct usage in the same package.
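A small runnable sketch of the dynamic-use case: the `Notifier` interface, `logNotifier` type, and `registry` map are hypothetical names. Removing or renaming the `Notify` method would break the build at the `var _` line rather than at the registration site.
```go
package main

import "fmt"

// Notifier is a hypothetical one-method interface.
type Notifier interface {
	Notify(msg string) error
}

// logNotifier records messages instead of sending them.
type logNotifier struct{ sent []string }

func (n *logNotifier) Notify(msg string) error {
	n.sent = append(n.sent, msg)
	return nil
}

// Compile-time check: the build fails here if *logNotifier
// ever stops satisfying Notifier. No value is allocated.
var _ Notifier = (*logNotifier)(nil)

// registry holds notifiers that are only looked up dynamically.
var registry = map[string]Notifier{}

func main() {
	n := &logNotifier{}
	registry["log"] = n
	if err := registry["log"].Notify("hello"); err != nil {
		fmt.Println("notify failed:", err)
	}
	fmt.Println(n.sent)
}
```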
<!-- PATTERN_COMPLETE -->