docs: Kubernetes production patterns with source citations

2026-04-30 12:10:12 +00:00
commit eee4200c04
6 changed files with 1613 additions and 0 deletions
@@ -0,0 +1,522 @@
+# Kubernetes-Specific Patterns
+
+## 1. Controller / Reconciler Pattern
+
+**Source:** `pkg/controller/deployment/deployment_controller.go` (lines 65–530)
+
+### What it does
+The controller pattern is the central design pattern of Kubernetes. Every controller watches a set of resources, maintains a work queue, and reconciles desired state with actual state through a sync loop.
+
+### Why
+Distributed systems can't guarantee that a single API call will bring the world to desired state. The controller pattern provides eventual consistency by continuously reconciling — it handles missed events, partial failures, and concurrent modifications.
+
+### Structure
+
+```go
+// pkg/controller/deployment/deployment_controller.go:65-95
+type DeploymentController struct {
+    rsControl controller.RSControlInterface
+    client    clientset.Interface
+
+    // Testability: sync handler is injectable
+    syncHandler func(ctx context.Context, dKey string) error
+    enqueueDeployment func(deployment *apps.Deployment)
+
+    // Listers: read from local cache, not API server
+    dLister  appslisters.DeploymentLister
+    rsLister appslisters.ReplicaSetLister
+    podLister corelisters.PodLister
+
+    // Synced funcs: gate processing until caches are warm
+    dListerSynced  cache.InformerSynced
+    rsListerSynced cache.InformerSynced
+
+    // Work queue: rate-limited, deduplicating
+    queue workqueue.TypedRateLimitingInterface[string]
+}
+```
+
+### The Canonical Worker Loop
+
+```go
+// pkg/controller/deployment/deployment_controller.go:481-515
+func (dc *DeploymentController) worker(ctx context.Context) {
+    for dc.processNextWorkItem(ctx) {
+    }
+}
+
+func (dc *DeploymentController) processNextWorkItem(ctx context.Context) bool {
+    key, quit := dc.queue.Get()
+    if quit {
+        return false
+    }
+    defer dc.queue.Done(key)
+
+    err := dc.syncHandler(ctx, key)
+    dc.handleErr(ctx, err, key)
+    return true
+}
+
+func (dc *DeploymentController) handleErr(ctx context.Context, err error, key string) {
+    if err == nil || errors.HasStatusCause(err, v1.NamespaceTerminatingCause) {
+        dc.queue.Forget(key)  // Success: clear rate limiter
+        return
+    }
+    if dc.queue.NumRequeues(key) < maxRetries {
+        dc.queue.AddRateLimited(key)  // Retry with backoff
+        return
+    }
+    utilruntime.HandleError(err)
+    dc.queue.Forget(key)  // Give up after maxRetries
+}
+```
+
+### When to Use
+
+**Triggers:**
+- You're building a system that must maintain desired state over time (not just react to events once)
+- External state can change outside your control (user edits, crashes, network partitions)
+- You need automatic recovery from partial failures without human intervention
+
+**Example — before:**
+```go
+// Event-driven: reacts once and hopes nothing changes
+func handlePodCreated(pod Pod) {
+    assignToNode(pod)
+    // What if the node dies 5 seconds later? Nobody re-assigns.
+}
+```
+
+**Example — after:**
+```go
+// Controller pattern: continuously reconciles desired vs actual
+func (c *Scheduler) Reconcile(ctx context.Context, key string) error {
+    pod, err := c.podLister.Get(key)
+    if err != nil { return err }
+    
+    if pod.Spec.NodeName == "" {
+        node := c.selectBestNode(pod)
+        return c.assignPodToNode(ctx, pod, node)
+    }
+    // Already assigned — verify node is still healthy
+    if !c.nodeIsReady(pod.Spec.NodeName) {
+        return c.reassignPod(ctx, pod)
+    }
+    return nil // desired state matches actual state
+}
+```
+
+### Key Properties
+1. **Level-triggered, not edge-triggered** — the sync loop reads current state, not diffs
+2. **Idempotent** — running sync twice produces the same result
+3. **Key-based deduplication** — the workqueue coalesces multiple events for the same object
+4. **Bounded retries** — exponential backoff with a max retry count (15 retries = ~82s max delay)
+
+---
+
+## 2. Informer + Cache + Workqueue Combo
+
+**Source:** `staging/src/k8s.io/client-go/tools/cache/shared_informer.go` (lines 144–283), `staging/src/k8s.io/client-go/informers/factory.go` (lines 57–250)
+
+### What it does
+The Informer provides a local read cache of API server state, backed by a List+Watch connection. The SharedInformerFactory ensures only one informer per resource type exists per process, preventing duplicate watches.
+
+### Why
+- **Reduces API server load**: controllers read from local cache (Listers) instead of hitting the API
+- **Reduces latency**: events are delivered via callbacks, no polling
+- **Memory efficiency**: shared informers prevent N controllers from opening N watches
+
+### Architecture
+
+```
+API Server
+    │
+    ├── List (initial sync)
+    │
+    └── Watch (streaming updates)
+            │
+            ▼
+    SharedIndexInformer
+        ├── local Store (thread-safe cache)
+        ├── Indexer (secondary indexes)
+        └── Event Handlers → [Controller1, Controller2, ...]
+                                    │
+                                    ▼
+                              WorkQueue
+                                    │
+                                    ▼
+                              worker goroutines
+```
+
+### SharedInformerFactory Pattern
+
+```go
+// staging/src/k8s.io/client-go/informers/factory.go:57-77
+type sharedInformerFactory struct {
+    client           kubernetes.Interface
+    lock             sync.Mutex
+    informers        map[reflect.Type]cache.SharedIndexInformer
+    startedInformers map[reflect.Type]bool
+    wg               sync.WaitGroup
+    shuttingDown     bool
+}
+```
+
+### Registration via Event Handlers
+
+```go
+// pkg/controller/deployment/deployment_controller.go:117-146
+dInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
+    AddFunc:    func(obj interface{}) { dc.addDeployment(logger, obj) },
+    UpdateFunc: func(oldObj, newObj interface{}) { dc.updateDeployment(logger, oldObj, newObj) },
+    DeleteFunc: func(obj interface{}) { dc.deleteDeployment(logger, obj) },
+})
+```
+
+### Cache Sync Gate
+
+```go
+// pkg/controller/deployment/deployment_controller.go:189
+if !cache.WaitForNamedCacheSyncWithContext(ctx, dc.dListerSynced, dc.rsListerSynced, dc.podListerSynced) {
+    return
+}
+```
+
+---
+
+## 3. Workqueue: Typed Rate-Limiting Queue
+
+**Source:** `staging/src/k8s.io/client-go/util/workqueue/queue.go` (lines 33–370), `rate_limiting_queue.go`
+
+### What it does
+A concurrent-safe work queue with three critical properties:
+1. **Deduplication** (dirty set) — same item added twice results in one processing
+2. **Re-entrancy** (processing set) — if an item is added while being processed, it's re-queued after Done()
+3. **Rate limiting** — exponential backoff on failures
+
+### Why
+In a controller, multiple events may fire for the same object in rapid succession. Without deduplication, you'd process stale intermediate states. The dirty/processing set design ensures you always process the latest state while never losing notifications.
+
+### When to Use
+
+**Triggers:**
+- Multiple event sources (informer callbacks) trigger work on the same object rapidly
+- You need to deduplicate: 5 events for the same pod should result in 1 sync, not 5
+- Failed processing should retry with exponential backoff, not flood the system
+
+**Example — before:**
+```go
+// Raw channel: no deduplication, no backoff
+events := make(chan string, 100)
+
+// Producer fires rapid updates:
+events <- "pod-abc" // event 1
+events <- "pod-abc" // event 2 (duplicate!)
+events <- "pod-abc" // event 3 (duplicate!)
+
+// Consumer processes all 3 — wasteful
+for key := range events {
+    reconcile(key) // called 3 times for the same stale state
+}
+```
+
+**Example — after:**
+```go
+queue := workqueue.NewTypedRateLimitingQueue[string](
+    workqueue.DefaultTypedControllerRateLimiter[string](),
+)
+
+// Producer fires rapid updates — queue deduplicates:
+queue.Add("pod-abc") // queued
+queue.Add("pod-abc") // already dirty — no-op
+queue.Add("pod-abc") // already dirty — no-op
+
+// Consumer processes once with latest state:
+key, _ := queue.Get()
+err := reconcile(key) // called once
+if err != nil {
+    queue.AddRateLimited(key) // retry with backoff
+} else {
+    queue.Forget(key) // clear backoff counter
+}
+queue.Done(key)
+```
+
+### The Dirty/Processing Dance
+
+```go
+// staging/src/k8s.io/client-go/util/workqueue/queue.go:227-252
+func (q *Typed[T]) Add(item T) {
+    q.cond.L.Lock()
+    defer q.cond.L.Unlock()
+    if q.shuttingDown { return }
+    if q.dirty.Has(item) {
+        if !q.processing.Has(item) {
+            q.queue.Touch(item)  // Allow priority changes
+        }
+        return  // Already marked for processing
+    }
+    q.dirty.Insert(item)
+    if q.processing.Has(item) {
+        return  // Being processed, will re-queue on Done()
+    }
+    q.queue.Push(item)
+    q.cond.Signal()
+}
+
+func (q *Typed[T]) Done(item T) {
+    q.cond.L.Lock()
+    defer q.cond.L.Unlock()
+    q.processing.Delete(item)
+    if q.dirty.Has(item) {
+        q.queue.Push(item)  // Was modified during processing
+        q.cond.Signal()
+    }
+}
+```
+
+### Rate-Limited Requeue
+
+```go
+// staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue.go:120-122
+func (q *rateLimitingType[T]) AddRateLimited(item T) {
+    q.TypedDelayingInterface.AddAfter(item, q.rateLimiter.When(item))
+}
+```
+
+---
+
+## 4. Tombstone Pattern (DeletedFinalStateUnknown)
+
+**Source:** `staging/src/k8s.io/client-go/tools/cache/delta_fifo.go` (lines 797–801)
+
+### What it does
+When a watch disconnects and reconnects, some delete events may be missed. The DeltaFIFO synthesizes a `DeletedFinalStateUnknown` ("tombstone") containing the last known state of the object.
+
+### Why
+Without this, controllers would never learn about deletions that happened during disconnects.
+
+```go
+// staging/src/k8s.io/client-go/tools/cache/delta_fifo.go:797-801
+type DeletedFinalStateUnknown struct {
+    Key string
+    Obj interface{}
+}
+```
+
+### How controllers handle it
+
+```go
+// pkg/controller/deployment/deployment_controller.go:210-224
+func (dc *DeploymentController) deleteDeployment(logger klog.Logger, obj interface{}) {
+    d, ok := obj.(*apps.Deployment)
+    if !ok {
+        tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
+        if !ok {
+            utilruntime.HandleError(fmt.Errorf("couldn't get object from tombstone %#v", obj))
+            return
+        }
+        d, ok = tombstone.Obj.(*apps.Deployment)
+        if !ok {
+            utilruntime.HandleError(fmt.Errorf("tombstone contained object that is not a Deployment %#v", obj))
+            return
+        }
+    }
+    dc.enqueueDeployment(d)
+}
+```
+
+---
+
+## 5. Controller Expectations Pattern
+
+**Source:** `pkg/controller/controller_utils.go` (lines 130–315)
+
+### What it does
+Expectations track pending creates/deletes to prevent controllers from taking action on stale cache state. A controller won't sync until its expectations are satisfied or expired.
+
+### Why
+Between a controller issuing a create and the informer cache reflecting that new object, there's a window where the controller might create duplicates. Expectations close this gap.
+
+```go
+// pkg/controller/controller_utils.go:153-173
+type ControllerExpectationsInterface interface {
+    GetExpectations(controllerKey string) (*ControlleeExpectations, bool, error)
+    SatisfiedExpectations(logger klog.Logger, controllerKey string) bool
+    DeleteExpectations(logger klog.Logger, controllerKey string)
+    SetExpectations(logger klog.Logger, controllerKey string, add, del int) error
+    ExpectCreations(logger klog.Logger, controllerKey string, adds int) error
+    ExpectDeletions(logger klog.Logger, controllerKey string, dels int) error
+    CreationObserved(logger klog.Logger, controllerKey string)
+    DeletionObserved(logger klog.Logger, controllerKey string)
+}
+```
+
+---
+
+## 6. OwnerReference / Controller Ref Manager Pattern
+
+**Source:** `pkg/controller/controller_ref_manager.go` (lines 37–80)
+
+### What it does
+Implements garbage collection ownership through OwnerReferences. The ControllerRefManager handles adopting orphaned resources and releasing resources that no longer match.
+
+### Why
+Multiple controllers may create children. The ownership model ensures exactly one controller owns each child, enabling garbage collection and preventing conflicts.
+
+```go
+// pkg/controller/controller_ref_manager.go:37-50
+type BaseControllerRefManager struct {
+    Controller   metav1.Object
+    Selector     labels.Selector
+    canAdoptErr  error
+    canAdoptOnce sync.Once      // Lazy, one-shot adoption check
+    CanAdoptFunc func(ctx context.Context) error
+}
+
+// The claim logic: adopt if matching and orphaned, release if owned but not matching
+func (m *BaseControllerRefManager) ClaimObject(ctx context.Context, obj metav1.Object,
+    match func(metav1.Object) bool,
+    adopt, release func(context.Context, metav1.Object) error) (bool, error) {
+    controllerRef := metav1.GetControllerOfNoCopy(obj)
+    if controllerRef != nil {
+        if controllerRef.UID != m.Controller.GetUID() {
+            return false, nil  // Owned by someone else
+        }
+        if match(obj) {
+            return true, nil   // Already ours and matches
+        }
+        // Ours but no longer matches → release
+    }
+    // Orphan → adopt if matches
+}
+```
+
+---
+
+## 7. Leader Election Pattern
+
+**Source:** `staging/src/k8s.io/client-go/tools/leaderelection/leaderelection.go` (lines 116–230)
+
+### What it does
+Provides distributed mutex semantics using a Kubernetes resource (Lease) as the lock. Only one instance of a controller runs actively; others are hot standbys.
+
+### When to Use
+
+**Triggers:**
+- You're running multiple replicas of a controller for high availability
+- Only ONE instance should actively reconcile at a time (to avoid conflicts)
+- You need automatic failover: if the leader dies, another replica takes over within seconds
+
+**Example — before:**
+```go
+// All replicas reconcile simultaneously → write conflicts, duplicate work
+func main() {
+    ctrl := NewController()
+    ctrl.Run(ctx) // every replica does this — chaos
+}
+```
+
+**Example — after:**
+```go
+func main() {
+    ctrl := NewController()
+    leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
+        Lock:          resourceLock,
+        LeaseDuration: 15 * time.Second,
+        RenewDeadline: 10 * time.Second,
+        RetryPeriod:   2 * time.Second,
+        Callbacks: leaderelection.LeaderCallbacks{
+            OnStartedLeading: func(ctx context.Context) {
+                ctrl.Run(ctx) // only the leader reconciles
+            },
+            OnStoppedLeading: func() {
+                log.Fatal("lost leadership") // restart to re-enter election
+            },
+        },
+    })
+}
+```
+
+### Why
+Controller-manager runs multiple replicas for HA. Only one should reconcile to avoid conflicts.
+
+```go
+// staging/src/k8s.io/client-go/tools/leaderelection/leaderelection.go:116-163
+type LeaderElectionConfig struct {
+    Lock           rl.Interface
+    LeaseDuration  time.Duration  // Default 15s — how long a lease is valid
+    RenewDeadline  time.Duration  // Default 10s — how long leader retries renewal
+    RetryPeriod    time.Duration  // Default 2s — how often candidates check
+    Callbacks      LeaderCallbacks
+    ReleaseOnCancel bool
+}
+
+type LeaderCallbacks struct {
+    OnStartedLeading func(context.Context)
+    OnStoppedLeading func()
+    OnNewLeader      func(identity string)
+}
+
+// staging/src/k8s.io/client-go/tools/leaderelection/leaderelection.go:211-226
+func (le *LeaderElector) Run(ctx context.Context) {
+    defer runtime.HandleCrashWithContext(ctx)
+    defer le.config.Callbacks.OnStoppedLeading()
+    if !le.acquire(ctx) {
+        return
+    }
+    ctx, cancel := context.WithCancel(ctx)
+    defer cancel()
+    go le.config.Callbacks.OnStartedLeading(ctx)
+    le.renew(ctx)
+}
+```
+
+---
+
+## 8. Feature Gates Pattern
+
+**Source:** `pkg/features/kube_features.go` (lines 34–2811), `staging/src/k8s.io/client-go/features/features.go` (lines 34–80)
+
+### What it does
+A global registry of boolean flags that control feature rollout. Features progress through Alpha → Beta → GA → Deprecated lifecycle stages.
+
+### Why
+Kubernetes serves thousands of clusters. Features must be safe to enable/disable at runtime across versions. Feature gates provide:
+- Progressive rollout (alpha off by default, beta on, GA locked)
+- Per-version semantics (a feature may become beta in v1.28)
+- Testing isolation
+
+```go
+// staging/src/k8s.io/client-go/features/features.go:50-70
+type Feature string
+
+type FeatureSpec struct {
+    Default    bool
+    LockToDefault bool
+    PreRelease prerelease
+    Version    *version.Version
+}
+
+type Gates interface {
+    Enabled(key Feature) bool
+}
+
+// pkg/features/kube_features.go:50-58 (example feature definition)
+const (
+    AllowDNSOnlyNodeCSR featuregate.Feature = "AllowDNSOnlyNodeCSR"
+    // ... 2700+ lines of feature definitions
+)
+```
+
+### Registration at init()
+
+```go
+// pkg/features/kube_features.go:2798-2811
+func init() {
+    ca := &clientAdapter{utilfeature.DefaultMutableFeatureGate}
+    runtime.Must(clientfeatures.AddVersionedFeaturesToExistingFeatureGates(ca))
+    clientfeatures.ReplaceFeatureGates(ca)
+    runtime.Must(utilfeature.DefaultMutableFeatureGate.AddVersioned(defaultVersionedKubernetesFeatureGates))
+}
+```
@@ -0,0 +1,441 @@
+# Production Go Patterns (from Kubernetes)
+
+Patterns for building large-scale Go codebases that go beyond what stdlib teaches you.
+
+## 1. Code Generation Pattern
+
+**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/zz_generated.deepcopy.go`, `staging/src/k8s.io/client-go/informers/apps/v1/deployment.go`
+
+### What it does
+Kubernetes generates massive amounts of boilerplate code from annotations on types:
+- `deepcopy-gen` → DeepCopy/DeepCopyInto methods
+- `informer-gen` → typed informers (List/Watch/Lister per resource)
+- `client-gen` → typed client sets
+- `lister-gen` → typed lister interfaces
+- `conversion-gen` → version conversion functions
+- `defaulter-gen` → defaulting functions
+
+### Why
+At Kubernetes scale (~50 resource types × multiple versions), hand-writing deep copy, client wrappers, and conversion code is:
+1. Error-prone (forgetting to copy a new field breaks everything)
+2. Unmaintainable (thousands of nearly-identical files)
+3. Not verifiable by human review
+
+### How it works
+
+Annotations drive generation:
+```go
+// +k8s:deepcopy-gen=true
+// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
+type RawExtension struct { ... }
+```
+
+Generated output uses `zz_generated.` prefix (convention for "don't edit"):
+```go
+// staging/src/k8s.io/apimachinery/pkg/runtime/zz_generated.deepcopy.go:22
+// Code generated by deepcopy-gen. DO NOT EDIT.
+package runtime
+
+func (in *RawExtension) DeepCopyInto(out *RawExtension) {
+    *out = *in
+    if in.Raw != nil {
+        in, out := &in.Raw, &out.Raw
+        *out = make([]byte, len(*in))
+        copy(*out, *in)
+    }
+}
+```
+
+Generated informers (note the header comment):
+```go
+// staging/src/k8s.io/client-go/informers/apps/v1/deployment.go:20
+// Code generated by informer-gen. DO NOT EDIT.
+```
+
+### When to Use
+
+**Triggers:**
+- You have 10+ types that need identical boilerplate methods (DeepCopy, Validate, Marshal)
+- Hand-writing the code is error-prone (forgetting to copy a new field causes silent bugs)
+- The generated output is mechanical and reviewable, not creative
+
+**Example — before:**
+```go
+// Hand-written deep copy for every type — 50 types × 30 lines each = 1500 lines of bugs
+func (in *Deployment) DeepCopy() *Deployment {
+    out := new(Deployment)
+    out.Name = in.Name
+    out.Labels = make(map[string]string)
+    for k, v := range in.Labels { out.Labels[k] = v }
+    // Did you remember Annotations? Finalizers? Every nested struct?
+}
+```
+
+**Example — after:**
+```go
+// +k8s:deepcopy-gen=true
+type Deployment struct {
+    Name        string
+    Labels      map[string]string
+    Annotations map[string]string
+}
+// Generated: zz_generated.deepcopy.go handles ALL fields correctly, always.
+// Adding a new field? Re-run generator. Zero chance of forgetting.
+```
+
+### Key Insight
+**Stdlib has no code generation culture.** stdlib keeps things small enough that hand-writing works. Kubernetes proves that once you cross ~20 types with shared behavior, code gen is the only sane path.
+
+---
+
+## 2. The Scheme / Type Registry Pattern
+
+**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go` (lines 38–100), `scheme_builder.go`
+
+### What it does
+The Scheme is a runtime type registry that maps:
+- `GroupVersionKind` → Go type (`reflect.Type`)
+- Go type → `[]GroupVersionKind`
+- Provides serialization, defaulting, conversion, and validation dispatch
+
+### Why
+Kubernetes has 50+ resource types across 15+ API groups, each with multiple versions. The Scheme provides:
+- **Dynamic dispatch**: serialize any Object without knowing its concrete type
+- **Version conversion**: convert between v1 and v1beta1 transparently
+- **Pluggability**: third-party resources register into the same system
+
+### Structure
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go:38-98
+type Scheme struct {
+    gvkToType       map[schema.GroupVersionKind]reflect.Type
+    typeToGVK       map[reflect.Type][]schema.GroupVersionKind
+    unversionedTypes map[reflect.Type]schema.GroupVersionKind
+    defaulterFuncs  map[reflect.Type]func(interface{})
+    validationFuncs map[reflect.Type]func(ctx, op, obj, oldObj) field.ErrorList
+    converter       *conversion.Converter
+    versionPriority map[string][]string
+}
+```
+
+### SchemeBuilder Pattern
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/runtime/scheme_builder.go:23-48
+type SchemeBuilder []func(*Scheme) error
+
+func (sb *SchemeBuilder) AddToScheme(s *Scheme) error {
+    for _, f := range *sb {
+        if err := f(s); err != nil {
+            return err
+        }
+    }
+    return nil
+}
+
+func (sb *SchemeBuilder) Register(funcs ...func(*Scheme) error) {
+    *sb = append(*sb, f)
+}
+```
+
+### How Registration Works
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go:151-160
+func (s *Scheme) AddKnownTypes(gv schema.GroupVersion, types ...Object) {
+    for _, obj := range types {
+        t := reflect.TypeOf(obj)
+        if t.Kind() != reflect.Pointer {
+            panic("All types must be pointers to structs.")
+        }
+        t = t.Elem()
+        s.AddKnownTypeWithName(gv.WithKind(t.Name()), obj)
+    }
+}
+```
+
+### Key Insight
+This is Java's ServiceLoader / dependency injection adapted for Go's type system. Stdlib uses interfaces; Kubernetes needs a **runtime type system on top of Go's static type system** because API objects must be dynamically dispatched across version boundaries.
+
+---
+
+## 3. The runtime.Object Interface
+
+**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/interfaces.go` (lines 333–342)
+
+### What it does
+Every Kubernetes API object must implement this two-method interface:
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/runtime/interfaces.go:337-341
+type Object interface {
+    GetObjectKind() schema.ObjectKind
+    DeepCopyObject() Object
+}
+```
+
+### Why
+- `GetObjectKind()` — allows the serialization layer to determine what type an object is without reflection
+- `DeepCopyObject()` — enables safe concurrent access (informer cache is shared; mutations must happen on copies)
+
+### Key Insight
+**This is the foundation of Kubernetes' extensibility.** Any Go struct that satisfies these two methods can participate in the entire API machinery — serialization, storage, admission, informers, etc. CRDs generate code that implements this interface.
+
+---
+
+## 4. Deep Copy Everywhere
+
+**Source:** Generated code in `zz_generated.deepcopy.go` files throughout the tree
+
+### What it does
+Every API type has generated `DeepCopy()` and `DeepCopyInto()` methods that create true deep copies including nested slices, maps, and pointer fields.
+
+### Why
+The informer cache is shared across all controllers in a process. If controller A gets an object from the cache and mutates it, controller B would see corrupted data. Deep copy provides the isolation guarantee.
+
+```go
+// Usage pattern in controllers:
+deployment := deploymentFromCache.DeepCopy()
+deployment.Spec.Replicas = ptr.To[int32](3)
+_, err := client.AppsV1().Deployments(ns).Update(ctx, deployment, metav1.UpdateOptions{})
+```
+
+### Key Insight
+Stdlib rarely needs deep copy because stdlib objects are typically owned by one goroutine. Kubernetes has a **shared read cache** (the informer store) that necessitates copy-on-write semantics at the application level.
+
+---
+
+## 5. Graceful Shutdown with Priority Classes
+
+**Source:** `pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go` (lines 23–100)
+
+### What it does
+When a node is shutting down, pods are terminated in priority order. Critical pods (system-node-critical) get more grace time than regular pods.
+
+### Why
+A hard kill of all pods simultaneously would lose important work. Priority-based graceful shutdown preserves the most important workloads longest.
+
+```go
+// pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go:66-90
+type managerImpl struct {
+    logger             klog.Logger
+    recorder           record.EventRecorder
+    getPods            eviction.ActivePodsFunc
+    syncNodeStatus     func(context.Context)
+    dbusCon            dbusInhibiter
+    inhibitLock        systemd.InhibitLock
+    nodeShuttingDownMutex sync.Mutex
+    nodeShuttingDownNow   bool
+    podManager         *podManager
+}
+```
+
+---
+
+## 6. Context as Logger Carrier
+
+**Source:** `pkg/controller/deployment/deployment_controller.go` (lines 106, 179, 500)
+
+### What it does
+Kubernetes passes structured loggers through context:
+
+```go
+// pkg/controller/deployment/deployment_controller.go:179
+logger := klog.FromContext(ctx)
+logger.Info("Starting controller", "controller", "deployment")
+```
+
+### Why
+At scale, you need structured logging with:
+- Consistent key-value pairs (controller name, object reference)
+- Verbosity levels (`logger.V(4).Info(...)`)
+- No global state (context carries the logger configured by the caller)
+
+### Key Insight
+Stdlib's `log` package is global. Kubernetes uses context-based structured logging (`klog.FromContext`) to allow each call chain to carry its own logger configuration. This enables filtering by controller, verbosity tuning per-component, and correlation.
+
+---
+
+## 7. Functional Options for Configuration
+
+**Source:** `staging/src/k8s.io/client-go/informers/factory.go` (lines 83–127)
+
+### What it does
+The SharedInformerFactory uses functional options for configuration:
+
+```go
+// staging/src/k8s.io/client-go/informers/factory.go:57
+type SharedInformerOption func(*sharedInformerFactory) *sharedInformerFactory
+
+func WithNamespace(namespace string) SharedInformerOption {
+    return func(factory *sharedInformerFactory) *sharedInformerFactory {
+        factory.namespace = namespace
+        return factory
+    }
+}
+
+func WithTransform(transform cache.TransformFunc) SharedInformerOption {
+    return func(factory *sharedInformerFactory) *sharedInformerFactory {
+        factory.transform = transform
+        return factory
+    }
+}
+
+func NewSharedInformerFactoryWithOptions(client kubernetes.Interface, defaultResync time.Duration, options ...SharedInformerOption) SharedInformerFactory {
+    factory := &sharedInformerFactory{...}
+    for _, opt := range options {
+        factory = opt(factory)
+    }
+    return factory
+}
+```
+
+### Why
+APIs evolve. Adding a new configuration option shouldn't break callers. Functional options provide:
+- Backward compatibility (new options don't change existing signatures)
+- Self-documenting (each option is a named function)
+- Composability (options can be collected and applied conditionally)
+
+---
+
+## 8. Type-Safe Generics in Critical Paths
+
+**Source:** `staging/src/k8s.io/client-go/util/workqueue/queue.go` (lines 33–200), `staging/src/k8s.io/client-go/gentype/type.go` (lines 33–120)
+
+### What it does
+Both workqueue and gentype use Go generics (1.18+) to provide type-safe interfaces while maintaining backward compatibility via type aliases:
+
+```go
+// Workqueue: type-safe queue
+type TypedInterface[T comparable] interface {
+    Add(item T)
+    Get() (item T, shutdown bool)
+    Done(item T)
+}
+
+// Type alias for backward compat
+type Type = Typed[any]
+
+// Gentype: type-safe client
+type Client[T objectWithMeta] struct {
+    resource       string
+    client         rest.Interface
+    namespace      string
+    newObject      func() T
+}
+```
+
+### Why
+Before generics, Kubernetes used `interface{}` everywhere, requiring type assertions at every boundary. Generics eliminate entire classes of runtime panics and make the code self-documenting.
+
+### Key Insight
+This is a migration pattern: introduce the generic version alongside the deprecated `interface{}` version using type aliases. Callers migrate at their own pace.
+
+---
+
+## 9. HandleCrash — Structured Panic Recovery
+
+**Source:** `staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go` (lines 30–120)
+
+### What it does
+A standardized `defer HandleCrash()` pattern that:
+1. Catches panics
+2. Logs them with proper stack attribution
+3. Invokes registered panic handlers
+4. Optionally re-panics (controlled by `ReallyCrash` flag)
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:78-82
+func HandleCrashWithContext(ctx context.Context, additionalHandlers ...func(context.Context, interface{})) {
+    if r := recover(); r != nil {
+        handleCrash(ctx, r, additionalHandlers...)
+    }
+}
+```
+
+### Why
+In a production system with hundreds of goroutines, an unrecovered panic in one kills the entire process. HandleCrash provides a standardized recovery point that:
+- Logs the panic with caller attribution
+- Allows cleanup handlers (shutdown gracefully)
+- In tests, can be configured to not actually crash
+
+### When to Use
+
+**Triggers:**
+- You're running multiple independent subsystems in one process (multiple controllers, background workers)
+- A panic in one subsystem shouldn't kill the entire process
+- You need structured logging of panic stack traces before potential recovery
+
+**Example — before:**
+```go
+// One bad nil pointer in workerB kills workerA, workerC, and the whole server
+func main() {
+    go workerA(ctx)
+    go workerB(ctx) // panics → entire process dies
+    go workerC(ctx)
+    select {}
+}
+```
+
+**Example — after:**
+```go
+func safeGo(ctx context.Context, name string, f func(ctx context.Context)) {
+    go func() {
+        defer func() {
+            if r := recover(); r != nil {
+                log.Printf("panic in %s: %v
+%s", name, r, debug.Stack())
+                // Log, alert, increment metric — but don't kill siblings
+            }
+        }()
+        f(ctx)
+    }()
+}
+
+func main() {
+    safeGo(ctx, "worker-a", workerA)
+    safeGo(ctx, "worker-b", workerB) // panics → logged, other workers continue
+    safeGo(ctx, "worker-c", workerC)
+    select {}
+}
+```
+
+### Key Insight
+Stdlib's approach is "let it crash." Kubernetes' approach is "catch it, log it, let the controller retry on the next sync." This is only safe because the controller pattern is idempotent.
+
+---
+
+## 10. ContextForChannel — Bridge Pattern
+
+**Source:** `staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go` (lines 120–145)
+
+### What it does
+Bridges the older `<-chan struct{}` stop pattern to the modern `context.Context` pattern:
+
+```go
+// staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:120-142
+func ContextForChannel(parentCh <-chan struct{}) context.Context {
+    return channelContext{stopCh: parentCh}
+}
+
+type channelContext struct {
+    stopCh <-chan struct{}
+}
+
+func (c channelContext) Done() <-chan struct{} { return c.stopCh }
+func (c channelContext) Err() error {
+    select {
+    case <-c.stopCh:
+        return context.Canceled
+    default:
+        return nil
+    }
+}
+```
+
+### Why
+Kubernetes predates `context.Context` (which arrived in Go 1.7). Millions of lines of code use `stopCh <-chan struct{}`. Rather than a big-bang rewrite, this adapter allows gradual migration.
+
+### Key Insight
+**Large codebases can't do breaking API changes atomically.** This bridge pattern is how you evolve from one idiom to another over years without breaking everything at once.