27 KiB
Anti-Patterns: What Kubernetes Avoids (and Why)
1. Never Mutate Shared Cache Objects
What they avoid: Modifying objects returned by Listers/Informers without deep-copying first.
Why: The informer cache is shared across all controllers. Mutating a cached object corrupts state for every other consumer.
The pattern K8s enforces:
// WRONG — mutates shared cache
deployment, _ := dc.dLister.Deployments(ns).Get(name)
deployment.Spec.Replicas = ptr.To[int32](3) // CORRUPTION!
// RIGHT — deep copy before mutating
deployment, _ := dc.dLister.Deployments(ns).Get(name)
deploymentCopy := deployment.DeepCopy()
deploymentCopy.Spec.Replicas = ptr.To[int32](3)
When to Apply This Rule
Triggers:
- You're reading from any shared data structure (cache, registry, concurrent map)
- Your function modifies a struct that was returned by a Lister, cache, or lookup
- You see code like
obj.Field = newValuewhereobjcame from a shared source
Example — detecting the smell:
// This line is the red flag: modifying something obtained from a shared cache
deployment, _ := deploymentLister.Get(name)
deployment.Spec.Replicas = ptr.To[int32](5) // SMELL: mutating cached object
client.Update(ctx, deployment)
Example — fixed:
deployment, _ := deploymentLister.Get(name)
copy := deployment.DeepCopy() // isolate your mutation
copy.Spec.Replicas = ptr.To[int32](5)
client.Update(ctx, copy)
Exceptions
- Read-only access: If you only read fields and never modify the object, deep copy is unnecessary overhead
- Single-owner data: If the struct was created locally and isn't shared (e.g., you just built it with
&Deployment{}), copying your own data is wasteful - Immutable value types: Primitive fields (string, int, bool) are copied by value in Go — you only need DeepCopy for slices, maps, and pointer fields
Evidence: The runtime.Object interface mandates DeepCopyObject(). Every API type has generated deep copy methods. The entire architecture assumes immutable reads.
2. Never Process the Same Key Concurrently
What they avoid: Multiple goroutines syncing the same object simultaneously.
Why: Two goroutines reading the same Deployment, each computing a different desired state, then both writing → conflict errors and potential state corruption.
The pattern K8s enforces: The workqueue's processing set ensures a key is only handed to one worker at a time. If an item is added while being processed, it's re-queued after Done() is called:
// From queue.go — the processing set blocks concurrent access
func (q *Typed[T]) Get() (item T, shutdown bool) {
// ...
q.processing.Insert(item) // Mark as being worked on
q.dirty.Delete(item)
return item, false
}
When to Apply This Rule
Triggers:
- Multiple goroutines/workers process items from a shared queue or event stream
- Processing involves read-modify-write cycles on external state (API server, database)
- You observe optimistic concurrency conflicts (409 Conflict) or duplicate operations in logs
Example — detecting the smell:
// Multiple workers, no key-level serialization
for i := 0; i < 10; i++ {
go func() {
for key := range eventChannel { // same key can go to any worker
obj, _ := client.Get(key)
obj.Status.Count++
client.Update(obj) // CONFLICT: another worker updated first
}
}()
}
Example — fixed:
// Workqueue guarantees: one worker per key at a time
queue := workqueue.NewTypedRateLimitingQueue[string](limiter)
for i := 0; i < 10; i++ {
go func() {
for {
key, _ := queue.Get() // key is exclusively ours until Done()
reconcile(key)
queue.Done(key)
}
}()
}
Exceptions
- Read-only operations: If workers only read (metrics collection, logging), concurrent access to the same key is safe
- Truly idempotent writes: If the write is unconditional (e.g., PUT with a fixed value, not read-modify-write), concurrent processing produces the same result
- Sharded ownership: If each key is deterministically routed to exactly one worker (consistent hashing), the queue's built-in serialization is unnecessary
3. Never Use Edge-Triggered Logic
What they avoid: Controllers that react to what changed rather than what the current state is.
Why: Events can be missed (watch disconnects), delivered out of order, or duplicated. If your controller says "a pod was deleted, so decrement counter" rather than "count current pods and compare to desired", you'll drift.
The pattern K8s enforces: Level-triggered reconciliation. The syncHandler reads current state from the cache, computes desired state from the spec, and makes the world match:
When to Apply This Rule
Triggers:
- Your handler says "X happened, so do Y" instead of "the world should look like Z, make it so"
- You're incrementing/decrementing counters based on events instead of counting actual state
- A missed event (network blip, restart) would cause permanent drift
Example — detecting the smell:
func onPodDeleted(pod Pod) {
deployment.Status.Replicas-- // edge-triggered: if we miss a delete, count is wrong forever
}
Example — fixed:
func reconcile(deployment Deployment) {
actualPods := listPodsForDeployment(deployment)
deployment.Status.Replicas = int32(len(actualPods)) // level-triggered: always correct
}
Exceptions
- Audit logging / event streams: Recording "what happened" is inherently edge-triggered — you want to log each event, not reconstruct history from state
- Notifications: "User X just logged in" is an event that triggers a one-time notification — level-triggered makes no sense here
- Ordering-sensitive operations: If the sequence of events matters (transaction log, command queue), level-triggered reconciliation would lose ordering information
// The sync function always reads current state, never relies on "what happened"
func (dc *DeploymentController) syncDeployment(ctx context.Context, key string) error {
deployment, err := dc.dLister.Deployments(namespace).Get(name)
// Compute desired state from deployment.Spec
// Read actual state from replicaset lister
// Reconcile difference
}
4. Never Forget to Call queue.Forget() on Success
What they avoid: Letting the rate limiter track items that succeeded.
Why: The rate limiter is per-item. If you never call Forget(), the next time the same key needs processing (even for a new event), it will be rate-limited as if it previously failed.
Source: The rate_limiting_queue.go comments explicitly warn:
// NewTypedRateLimitingQueue constructs a new workqueue with rateLimited queuing ability
// Remember to call Forget! If you don't, you may end up tracking failures forever.
The pattern K8s enforces:
func (dc *DeploymentController) handleErr(ctx context.Context, err error, key string) {
if err == nil {
dc.queue.Forget(key) // CRITICAL: clear the failure counter
return
}
if dc.queue.NumRequeues(key) < maxRetries {
dc.queue.AddRateLimited(key)
return
}
dc.queue.Forget(key) // Also forget when giving up
}
When to Apply This Rule
Triggers:
- You're using a rate-limited workqueue and items can both succeed and fail
- You notice items processing slower over time for no apparent reason (accumulated backoff)
- Your error handling only calls
AddRateLimitedon failure but has noForgeton success
Example — detecting the smell:
func processItem(queue workqueue.RateLimitingInterface, key string) {
defer queue.Done(key)
err := reconcile(key)
if err != nil {
queue.AddRateLimited(key) // retry with backoff
return
}
// BUG: no queue.Forget(key)!
// Next time this key is processed (even for a new event),
// it gets rate-limited delay from the old failure counter
}
Example — fixed:
func processItem(queue workqueue.RateLimitingInterface, key string) {
defer queue.Done(key)
err := reconcile(key)
if err != nil {
queue.AddRateLimited(key)
return
}
queue.Forget(key) // clear backoff counter on success
}
Exceptions
- You're using a plain (non-rate-limited) queue:
Forget()only matters onRateLimitingInterface. PlainInterfacehas no rate limiter to clear. - Permanent failures where the key should never be fast-tracked: If a key represents a permanently broken resource, you might intentionally not
Forgetso that future events for it are naturally throttled (rare — usually you'd remove it from the queue entirely).
5. Never Hit the API Server in a Tight Loop
What they avoid: Direct API calls for reads. List/Get calls in hot paths.
Why: The API server is a shared resource. If 100 controllers each make 10 API calls per reconciliation at 1 sync/second, that's 1000 req/s to the API server per controller manager instance.
The pattern K8s enforces: Read from Listers (local cache), write to API server:
// READ from cache (free, local, fast)
deployment, err := dc.dLister.Deployments(namespace).Get(name)
// WRITE to API server (expensive, remote, rate-limited)
_, err = dc.client.AppsV1().Deployments(namespace).Update(ctx, deployment, ...)
When to Apply This Rule
Triggers:
- Your reconcile function makes API calls (Get/List) inside a loop
- You observe API server throttling (429 responses) or high request latency
- Multiple controllers in the same process each independently fetch the same resources
Example — detecting the smell:
func (c *Controller) reconcile(ctx context.Context, deployment *apps.Deployment) error {
for i := 0; i < int(*deployment.Spec.Replicas); i++ {
// API call inside a loop — O(replicas) calls per sync
pod, err := c.client.CoreV1().Pods(ns).Get(ctx, podName(i), metav1.GetOptions{})
if errors.IsNotFound(err) {
c.client.CoreV1().Pods(ns).Create(ctx, newPod(i), metav1.CreateOptions{})
}
}
return nil
}
Example — fixed:
func (c *Controller) reconcile(ctx context.Context, deployment *apps.Deployment) error {
// Single cache read (local, free) — gets ALL pods at once
existingPods, _ := c.podLister.Pods(ns).List(selectorForDeployment(deployment))
existingSet := make(map[string]bool)
for _, p := range existingPods { existingSet[p.Name] = true }
for i := 0; i < int(*deployment.Spec.Replicas); i++ {
if !existingSet[podName(i)] {
c.client.CoreV1().Pods(ns).Create(ctx, newPod(i), metav1.CreateOptions{})
}
}
return nil
}
Exceptions
- Writes are unavoidable: You must hit the API server for creates/updates/deletes — the rule is about reads, not writes
- Cache staleness is unacceptable: Rare cases where you need a strongly consistent read (e.g., before an irreversible action) justify a direct Get
- One-shot tools: CLI commands or migration scripts don't benefit from informer caches (they exit after one operation)
6. Never Sync Before Caches Are Warm
What they avoid: Processing items before the informer has done its initial List.
Why: With an empty cache, a controller might think "no pods exist, must create all of them" — causing a thundering herd of duplicate creates.
The pattern K8s enforces:
// pkg/controller/deployment/deployment_controller.go:189
if !cache.WaitForNamedCacheSyncWithContext(ctx,
dc.dListerSynced, dc.rsListerSynced, dc.podListerSynced) {
return // Don't start workers until all caches are populated
}
When to Apply This Rule
Triggers:
- Your controller uses informer caches (Listers) to determine what actions to take
- You observe a burst of spurious creates/deletes immediately after startup
- Your reconcile logic compares "desired count vs actual count" using cached data
Example — detecting the smell:
func (c *Controller) Run(ctx context.Context) {
// BUG: starts workers immediately — cache might be empty
for i := 0; i < workers; i++ {
go c.worker(ctx)
}
// Informers haven't finished initial List yet
// Worker reads cache → sees 0 pods → creates all replicas again
}
Example — fixed:
func (c *Controller) Run(ctx context.Context) {
// Gate: wait until all caches have completed initial List
if !cache.WaitForCacheSync(ctx.Done(), c.podsSynced, c.rsSynced) {
runtime.HandleError(fmt.Errorf("caches never synced"))
return
}
// Now safe to start workers — cache reflects reality
for i := 0; i < workers; i++ {
go c.worker(ctx)
}
}
Exceptions
- Read-only controllers: If your controller only observes and reports (metrics, logging), processing with a partial cache produces incomplete but not harmful results
- Controllers that check existence before acting: If your reconcile logic does a direct API Get (not cache read) before creating, the cache warmth gate is less critical (but still good practice)
- Fast-starting controllers with explicit "not found" handling: If your reconcile function handles "object not in cache" gracefully (e.g., requeues the key), it may tolerate starting before full sync
7. Never Ignore Tombstones in Delete Handlers
What they avoid: Assuming delete handlers always receive the concrete type.
Why: If a watch disconnects and reconnects, missed deletes arrive as DeletedFinalStateUnknown (tombstones). Ignoring them means your controller never learns about those deletions.
The pattern K8s enforces:
// Every delete handler must check for tombstones
func (dc *DeploymentController) deleteDeployment(logger klog.Logger, obj interface{}) {
d, ok := obj.(*apps.Deployment)
if !ok {
tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
if !ok {
utilruntime.HandleError(...)
return
}
d, ok = tombstone.Obj.(*apps.Deployment)
// ...
}
}
When to Apply This Rule
Triggers:
- You're implementing a
DeleteFuncevent handler for an informer - Your delete handler type-asserts directly to the concrete type without a fallback
- You observe "leaked" resources that should have been cleaned up after their parent was deleted
Example — detecting the smell:
informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
DeleteFunc: func(obj interface{}) {
pod := obj.(*v1.Pod) // PANIC during watch reconnection
cleanupPod(pod)
},
})
Example — fixed:
informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
DeleteFunc: func(obj interface{}) {
pod, ok := obj.(*v1.Pod)
if !ok {
tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
if !ok {
runtime.HandleError(fmt.Errorf("unexpected type %T", obj))
return
}
pod, ok = tombstone.Obj.(*v1.Pod)
if !ok {
runtime.HandleError(fmt.Errorf("tombstone contained %T", tombstone.Obj))
return
}
}
cleanupPod(pod)
},
})
Exceptions
- Delete handler only extracts the key: If your handler only needs
namespace/nameto enqueue work, you can usecache.MetaNamespaceKeyFunc(obj)which handles tombstones internally - Non-informer event sources: If your events don't come from client-go informers (e.g., custom message queues), tombstones don't apply
- Level-triggered reconcilers that don't use delete handlers: If your reconcile loop discovers deletions via "not found" from the lister, the delete handler is optional anyway
8. Never Use ResourceVersion for Equality
What they avoid: Comparing ResourceVersion to check if an object changed.
Why: ResourceVersion is opaque (currently etcd's mod_revision, but this is an implementation detail). The only valid operation is "!=" to detect change.
The pattern K8s uses:
// pkg/controller/deployment/deployment_controller.go:284-288
func (dc *DeploymentController) updateReplicaSet(logger klog.Logger, old, cur interface{}) {
curRS := cur.(*apps.ReplicaSet)
oldRS := old.(*apps.ReplicaSet)
if curRS.ResourceVersion == oldRS.ResourceVersion {
return // Periodic resync, nothing actually changed
}
// ... process the real update
}
When to Apply This Rule
Triggers:
- You're comparing ResourceVersions with
<,>, or numeric parsing - You're storing ResourceVersion as an integer for ordering or pagination
- You're using ResourceVersion to determine "which version is newer"
Example — detecting the smell:
// Treating ResourceVersion as a number — WRONG
rv1, _ := strconv.Atoi(obj1.ResourceVersion)
rv2, _ := strconv.Atoi(obj2.ResourceVersion)
if rv2 > rv1 {
// "obj2 is newer" — this assumption may break in future implementations
}
Example — fixed:
// Only valid comparison: equality (to detect resync vs real update)
if cur.ResourceVersion == old.ResourceVersion {
return // no change — this is a periodic resync event
}
// Don't compare ordering — just process the update
processUpdate(cur)
Exceptions
- The
==check for resync detection: Comparing ResourceVersion for equality (==) to detect "no change" is valid and explicitly how Kubernetes uses it - Watch continuity: Passing ResourceVersion to Watch/List for "resume from" is the intended API — you're not comparing, you're providing a cursor
- Internal etcd tooling: If you're operating directly on etcd (not through the Kubernetes API), mod_revision semantics are documented there
9. Never Panic in Production Goroutines (Without Recovery)
What they avoid: Unhandled panics killing the entire controller manager.
Why: A single nil pointer in one controller's sync loop would crash all 30+ controllers running in the same process.
The pattern K8s enforces:
// Every goroutine gets crash protection
func (dc *DeploymentController) Run(ctx context.Context, workers int) {
defer utilruntime.HandleCrash() // Top-level recovery
// ...
}
// And in every polling loop:
func BackoffUntilWithContext(ctx context.Context, f func(ctx context.Context), ...) {
func() {
defer runtime.HandleCrashWithContext(ctx) // Per-iteration recovery
f(ctx)
}()
}
When to Apply This Rule
Triggers:
- You're launching goroutines without
defer recover()ordefer HandleCrash() - Your process hosts multiple independent subsystems (any one panicking kills all)
- You observe random process deaths in production with nil pointer or index-out-of-range panics
Example — detecting the smell:
// Naked goroutine — one panic kills the whole process
func (c *Controller) Run(ctx context.Context) {
for i := 0; i < 5; i++ {
go c.worker(ctx) // no recovery — nil pointer in worker kills everything
}
}
Example — fixed:
func (c *Controller) Run(ctx context.Context) {
for i := 0; i < 5; i++ {
go func() {
defer utilruntime.HandleCrash()
c.worker(ctx)
}()
}
}
Exceptions
- Main goroutine of a CLI tool: Let it panic — the stack trace IS the error report. Recovery here just hides bugs.
- Panics that indicate programming errors you WANT to crash on:
panic("unreachable")or assertion failures during development should crash to catch bugs early. - Test code: Tests should panic/fail loudly.
HandleCrashin tests would swallow test failures.
10. Never Block Workers Indefinitely
What they avoid: Unbounded blocking in a sync handler (e.g., waiting for a condition that may never occur).
Why: Workers are a finite pool. If one blocks forever, that's one fewer worker processing the queue. At scale, this cascades.
The pattern K8s enforces: All API calls take context (with timeouts), all waits are bounded:
// Timeouts on API calls
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
defer cancel()
_, err := client.CoreV1().Pods(ns).Create(ctx, pod, metav1.CreateOptions{})
When to Apply This Rule
Triggers:
- Your sync handler calls external services without a timeout
- Workers are a fixed pool and you observe queue depth growing while workers appear idle
- Your code has
select {}, unbounded channel reads, or mutex waits without deadline
Example — detecting the smell:
func (c *Controller) syncItem(ctx context.Context, key string) error {
// Calls external service with no timeout — blocks indefinitely if service is down
resp, err := http.Get("http://external-service/api/" + key)
if err != nil { return err }
// Waits on a channel that might never close
result := <-c.resultChan // blocks forever if producer crashes
return c.updateStatus(ctx, key, result)
}
Example — fixed:
func (c *Controller) syncItem(ctx context.Context, key string) error {
// Bounded timeout on external call
reqCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
req, _ := http.NewRequestWithContext(reqCtx, "GET", "http://external-service/api/"+key, nil)
resp, err := http.DefaultClient.Do(req)
if err != nil { return err } // timeout returns error, worker is freed
// Bounded wait on channel
select {
case result := <-c.resultChan:
return c.updateStatus(ctx, key, result)
case <-ctx.Done():
return ctx.Err() // worker freed, key requeued
}
}
Exceptions
- The worker loop itself: The top-level
queue.Get()blocks intentionally when the queue is empty — this is by design (workers sleep until there's work) - Graceful shutdown drains: During shutdown, waiting for in-flight work to complete is acceptable (with a bounded overall shutdown timeout)
- Leader election acquire: Blocking until leadership is acquired is intentional — non-leaders should idle
11. Never Use sync.Mutex Where sync.Once Suffices
What they avoid: Full mutual exclusion for one-shot operations.
Why: sync.Once is semantically clearer and avoids the bug where you forget to check a "done" flag under the mutex.
The pattern K8s uses:
// pkg/controller/controller_ref_manager.go:43-49
type BaseControllerRefManager struct {
canAdoptErr error
canAdoptOnce sync.Once // One-shot lazy evaluation
CanAdoptFunc func(ctx context.Context) error
}
func (m *BaseControllerRefManager) CanAdopt(ctx context.Context) error {
m.canAdoptOnce.Do(func() {
if m.CanAdoptFunc != nil {
m.canAdoptErr = m.CanAdoptFunc(ctx)
}
})
return m.canAdoptErr
}
When to Apply This Rule
Triggers:
- You see a
sync.Mutexprotecting aboolflag that tracks "has this been initialized?" - The protected code runs exactly once and caches the result
- You see patterns like
mu.Lock(); if !done { doThing(); done = true }; mu.Unlock()
Example — detecting the smell:
type Manager struct {
mu sync.Mutex
initialized bool
config *Config
}
func (m *Manager) GetConfig() *Config {
m.mu.Lock()
defer m.mu.Unlock()
if !m.initialized {
m.config = loadConfig()
m.initialized = true
}
return m.config
}
// Bug-prone: what if someone adds code that sets initialized=false?
Example — fixed:
type Manager struct {
initOnce sync.Once
config *Config
}
func (m *Manager) GetConfig() *Config {
m.initOnce.Do(func() {
m.config = loadConfig()
})
return m.config
// Impossible to accidentally re-initialize
}
Exceptions
- Resettable state: If the "one-time" operation might need to run again (e.g., reconnection after disconnect),
sync.Oncedoesn't support reset — use a mutex with a flag - Multiple initialization phases: If initialization has multiple stages that can partially fail and retry,
sync.Onceis too coarse (it only fires once, even on error) - Pre-Go 1.21 error handling: Before Go 1.21's
sync.OnceValue/sync.OnceFunc, handling errors fromDorequired workarounds. Nowsync.OnceValuecleanly supports fallible initialization.
12. Never Expose Mutable State Through Interfaces
What they avoid: Returning pointers to internal state through public interfaces.
Why: Callers can accidentally mutate internal state, creating subtle bugs that only manifest under concurrency.
The pattern K8s enforces: Listers return objects from the read-only cache. The DeepCopy() pattern ensures mutation safety is the caller's responsibility, not the cache's.
When to Apply This Rule
Triggers:
- Your getter/accessor returns a pointer to a field that's part of your struct's internal state
- Multiple goroutines call the accessor, and at least one caller might modify the result
- You observe race conditions in
go test -racepointing to internal fields accessed from outside
Example — detecting the smell:
type Registry struct {
mu sync.RWMutex
items map[string]*Item
}
func (r *Registry) Get(key string) *Item {
r.mu.RLock()
defer r.mu.RUnlock()
return r.items[key] // returns internal pointer — caller can mutate!
}
// Caller does: item := registry.Get("foo"); item.Name = "bar"
// Now every other caller sees the mutated name — data race
Example — fixed:
func (r *Registry) Get(key string) *Item {
r.mu.RLock()
defer r.mu.RUnlock()
if item, ok := r.items[key]; ok {
return item.DeepCopy() // caller gets their own copy
}
return nil
}
// Caller mutates their copy — registry is unaffected
Exceptions
- Single-threaded access: If your struct is only accessed from one goroutine (no concurrency), returning pointers to internals is safe and avoids copy overhead
- Intentionally shared mutable state: Some designs explicitly want callers to see mutations (e.g., shared counters, observable state). Document this clearly.
- Performance-critical read paths: If deep-copying on every access is too expensive and callers are trusted to not mutate, return a pointer with clear documentation:
// WARNING: do not mutate returned value
Summary: The Philosophy
Kubernetes avoids these anti-patterns because of one fundamental truth: in a distributed system, every assumption you make about state being consistent is wrong.
The patterns exist because:
- Events are unreliable → level-triggered reconciliation
- Reads are stale → always compare desired vs actual
- Concurrent access is inevitable → deep copy, queue serialization
- Failures are normal → retry with backoff, graceful degradation
- Resources are shared → cache reads, rate-limit writes
- Systems outlive their authors → code generation, type registries, feature gates