docs: Kubernetes production patterns with source citations
# Production Go Patterns (from Kubernetes)

Patterns for building large-scale Go codebases that go beyond what stdlib teaches you.

## 1. Code Generation Pattern

**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/zz_generated.deepcopy.go`, `staging/src/k8s.io/client-go/informers/apps/v1/deployment.go`

### What it does
Kubernetes generates massive amounts of boilerplate code from annotations on types:
- `deepcopy-gen` → DeepCopy/DeepCopyInto methods
- `informer-gen` → typed informers (List/Watch/Lister per resource)
- `client-gen` → typed client sets
- `lister-gen` → typed lister interfaces
- `conversion-gen` → version conversion functions
- `defaulter-gen` → defaulting functions

### Why
At Kubernetes scale (~50 resource types × multiple versions), hand-writing deep copy, client wrappers, and conversion code is:
1. Error-prone (forgetting to copy a new field leaves copies that silently share or drop data)
2. Unmaintainable (thousands of nearly identical files)
3. Impractical to verify by human review

### How it works

Annotations drive generation:
```go
// +k8s:deepcopy-gen=true
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
type RawExtension struct { ... }
```

Generated files use the `zz_generated.` prefix, a convention that marks them as not to be edited by hand:
```go
// staging/src/k8s.io/apimachinery/pkg/runtime/zz_generated.deepcopy.go:22
// Code generated by deepcopy-gen. DO NOT EDIT.
package runtime

func (in *RawExtension) DeepCopyInto(out *RawExtension) {
	*out = *in
	if in.Raw != nil {
		in, out := &in.Raw, &out.Raw
		*out = make([]byte, len(*in))
		copy(*out, *in)
	}
}
```

Generated informers (note the header comment):
```go
// staging/src/k8s.io/client-go/informers/apps/v1/deployment.go:20
// Code generated by informer-gen. DO NOT EDIT.
```

### When to Use

**Triggers:**
- You have 10+ types that need identical boilerplate methods (DeepCopy, Validate, Marshal)
- Hand-writing the code is error-prone (forgetting to copy a new field causes silent bugs)
- The generated output is mechanical and reviewable, not creative

**Example (before):**
```go
// Hand-written deep copy for every type: 50 types × 30 lines each = 1500 lines of bugs
func (in *Deployment) DeepCopy() *Deployment {
	out := new(Deployment)
	out.Name = in.Name
	out.Labels = make(map[string]string)
	for k, v := range in.Labels {
		out.Labels[k] = v
	}
	// Did you remember Annotations? Finalizers? Every nested struct?
	return out
}
```

**Example (after):**
```go
// +k8s:deepcopy-gen=true
type Deployment struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// Generated: zz_generated.deepcopy.go covers every field of the type.
// Adding a new field? Re-run the generator and the new field is copied too.
```

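The failure mode that generated deep copies guard against can be sketched with the standard library alone. `Widget` and its fields are hypothetical, and the `DeepCopy` method below is hand-written only to mirror the general shape of generated output; it is not what `deepcopy-gen` emits verbatim.

```go
package main

import "fmt"

// Widget is a hypothetical type used only for illustration.
type Widget struct {
	Name   string
	Labels map[string]string
}

// DeepCopy copies scalar fields by assignment, then allocates fresh
// storage for the reference-typed field so the copy shares nothing.
func (in *Widget) DeepCopy() *Widget {
	if in == nil {
		return nil
	}
	out := new(Widget)
	*out = *in
	if in.Labels != nil {
		out.Labels = make(map[string]string, len(in.Labels))
		for k, v := range in.Labels {
			out.Labels[k] = v
		}
	}
	return out
}

func main() {
	orig := &Widget{Name: "a", Labels: map[string]string{"tier": "web"}}

	// A plain struct assignment copies the map header, not its contents.
	shallow := *orig
	shallow.Labels["tier"] = "db"
	fmt.Println(orig.Labels["tier"]) // prints "db": the original changed too

	// A deep copy owns its own map, so the original is left alone.
	deep := orig.DeepCopy()
	deep.Labels["tier"] = "cache"
	fmt.Println(orig.Labels["tier"]) // still prints "db"
}
```

The shallow copy compiles and looks correct in review, which is why this class of bug tends to go unnoticed when copies are written by hand.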
### Key Insight
**Stdlib rarely leans on code generation for API boilerplate.** Stdlib keeps things small enough that hand-writing works. Kubernetes shows that once you cross roughly 20 types with shared behavior, code generation becomes the practical path.

---

## 2. The Scheme / Type Registry Pattern

**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go` (lines 38–100), `scheme_builder.go`

### What it does
The Scheme is a runtime type registry. It maps:
- `GroupVersionKind` → Go type (`reflect.Type`)
- Go type → `[]GroupVersionKind`

It also dispatches serialization, defaulting, conversion, and validation for the registered types.

### Why
Kubernetes has 50+ resource types across 15+ API groups, each with multiple versions. The Scheme provides:
- **Dynamic dispatch**: serialize any Object without knowing its concrete type
- **Version conversion**: convert between v1 and v1beta1 transparently
- **Pluggability**: third-party resources register into the same system

### Structure

```go
// staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go:38-98
// Abridged excerpt: some fields are omitted.
type Scheme struct {
	gvkToType        map[schema.GroupVersionKind]reflect.Type
	typeToGVK        map[reflect.Type][]schema.GroupVersionKind
	unversionedTypes map[reflect.Type]schema.GroupVersionKind
	defaulterFuncs   map[reflect.Type]func(interface{})
	// Parameter types are elided in this excerpt.
	validationFuncs  map[reflect.Type]func(ctx, op, obj, oldObj) field.ErrorList
	converter        *conversion.Converter
	versionPriority  map[string][]string
}
```

### SchemeBuilder Pattern

```go
// staging/src/k8s.io/apimachinery/pkg/runtime/scheme_builder.go:23-48
type SchemeBuilder []func(*Scheme) error

// AddToScheme applies every registered function to the scheme, stopping at the first error.
func (sb *SchemeBuilder) AddToScheme(s *Scheme) error {
	for _, f := range *sb {
		if err := f(s); err != nil {
			return err
		}
	}
	return nil
}

// Register appends one or more scheme-mutating functions to the builder.
func (sb *SchemeBuilder) Register(funcs ...func(*Scheme) error) {
	for _, f := range funcs {
		*sb = append(*sb, f)
	}
}
```

### How Registration Works

```go
// staging/src/k8s.io/apimachinery/pkg/runtime/scheme.go:151-160
func (s *Scheme) AddKnownTypes(gv schema.GroupVersion, types ...Object) {
	for _, obj := range types {
		t := reflect.TypeOf(obj)
		if t.Kind() != reflect.Pointer {
			panic("All types must be pointers to structs.")
		}
		t = t.Elem()
		s.AddKnownTypeWithName(gv.WithKind(t.Name()), obj)
	}
}
```

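The core of the registration idea, a map from a kind name to a `reflect.Type` that can later be instantiated, fits in a few lines of stdlib Go. The `registry` type, its methods, and the `Pod` struct below are hypothetical stand-ins that assume nothing about the real `runtime.Scheme` beyond what the excerpt above shows; versions, groups, and conversion are left out.

```go
package main

import (
	"fmt"
	"reflect"
)

// registry is a hypothetical, much-reduced stand-in for a type registry.
type registry struct {
	kindToType map[string]reflect.Type
}

func newRegistry() *registry {
	return &registry{kindToType: map[string]reflect.Type{}}
}

// addKnownTypes records each pointer-to-struct under its struct name.
func (r *registry) addKnownTypes(objs ...any) error {
	for _, obj := range objs {
		t := reflect.TypeOf(obj)
		if t == nil || t.Kind() != reflect.Pointer || t.Elem().Kind() != reflect.Struct {
			return fmt.Errorf("%T is not a pointer to a struct", obj)
		}
		r.kindToType[t.Elem().Name()] = t.Elem()
	}
	return nil
}

// newObject allocates a zero value of the type registered for kind.
func (r *registry) newObject(kind string) (any, error) {
	t, ok := r.kindToType[kind]
	if !ok {
		return nil, fmt.Errorf("no type registered for kind %q", kind)
	}
	return reflect.New(t).Interface(), nil
}

// Pod is a placeholder type, not the Kubernetes Pod.
type Pod struct{ Name string }

func main() {
	r := newRegistry()
	if err := r.addKnownTypes(&Pod{}); err != nil {
		panic(err)
	}
	obj, err := r.newObject("Pod")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T\n", obj) // prints "*main.Pod"
}
```

A decoder that reads a kind string from the wire can use such a lookup to allocate the right Go type before unmarshalling into it, which is the dynamic-dispatch property described above.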
### Key Insight
This resembles Java's `ServiceLoader` or a dependency-injection registry, adapted to Go's type system. Stdlib uses interfaces; Kubernetes needs a **runtime type system on top of Go's static type system** because API objects must be dynamically dispatched across version boundaries.

---

## 3. The runtime.Object Interface

**Source:** `staging/src/k8s.io/apimachinery/pkg/runtime/interfaces.go` (lines 333–342)

### What it does
Every Kubernetes API object must implement this two-method interface:

```go
// staging/src/k8s.io/apimachinery/pkg/runtime/interfaces.go:337-341
type Object interface {
	GetObjectKind() schema.ObjectKind
	DeepCopyObject() Object
}
```

### Why
- `GetObjectKind()`: allows the serialization layer to determine what type an object is without reflection
- `DeepCopyObject()`: enables safe concurrent access (informer cache is shared; mutations must happen on copies)

### Key Insight
**This is the foundation of Kubernetes' extensibility.** Any Go struct that satisfies these two methods can participate in the entire API machinery: serialization, storage, admission, informers, etc. Go types written for custom resources (CRDs) implement the same interface, typically through generated `DeepCopyObject` methods.

---

## 4. Deep Copy Everywhere

**Source:** Generated code in `zz_generated.deepcopy.go` files throughout the tree

### What it does
Every API type has generated `DeepCopy()` and `DeepCopyInto()` methods that create true deep copies including nested slices, maps, and pointer fields.

### Why
The informer cache is shared across all controllers in a process. If controller A gets an object from the cache and mutates it, controller B would see corrupted data. Deep copy provides the isolation guarantee.

```go
// Usage pattern in controllers:
deployment := deploymentFromCache.DeepCopy()
deployment.Spec.Replicas = ptr.To[int32](3)
_, err := client.AppsV1().Deployments(ns).Update(ctx, deployment, metav1.UpdateOptions{})
```

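A stdlib-only sketch of the same discipline follows. The `store` type stands in for a shared read cache that hands out pointers to its own objects, and the simplified `Deployment` struct with a pointer field is hypothetical; neither reflects the real informer or API types.

```go
package main

import (
	"fmt"
	"sync"
)

// Deployment is a simplified, hypothetical stand-in for an API object.
type Deployment struct {
	Name     string
	Replicas *int32
}

// DeepCopy duplicates the pointed-to value so the copy owns its own int32.
func (in *Deployment) DeepCopy() *Deployment {
	if in == nil {
		return nil
	}
	out := new(Deployment)
	*out = *in
	if in.Replicas != nil {
		r := *in.Replicas
		out.Replicas = &r
	}
	return out
}

// store is a hypothetical read cache that returns shared pointers.
type store struct {
	mu    sync.RWMutex
	items map[string]*Deployment
}

func (s *store) get(name string) *Deployment {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.items[name]
}

func main() {
	one := int32(1)
	s := &store{items: map[string]*Deployment{
		"web": {Name: "web", Replicas: &one},
	}}

	// Copy before mutating: the cached object stays untouched.
	desired := s.get("web").DeepCopy()
	*desired.Replicas = 3

	fmt.Println(*s.get("web").Replicas, *desired.Replicas) // prints "1 3"
}
```

Note that the read lock only protects the map lookup. Once the pointer leaves `get`, nothing stops a caller from writing through it, which is why the copy has to happen on the caller's side.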
### Key Insight
Stdlib rarely needs deep copy because stdlib objects are typically owned by one goroutine. Kubernetes has a **shared read cache** (the informer store) that necessitates copy-on-write semantics at the application level.

---

## 5. Graceful Shutdown with Priority Classes

**Source:** `pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go` (lines 23–100)

### What it does
When a node is shutting down, pods are terminated in priority order. Regular pods are terminated first; critical pods (for example, those in the `system-node-critical` priority class) are terminated last, within a grace period reserved for them.

### Why
A hard kill of all pods simultaneously would lose important work. Priority-based graceful shutdown preserves the most important workloads longest.

```go
// pkg/kubelet/nodeshutdown/nodeshutdown_manager_linux.go:66-90
type managerImpl struct {
	logger                klog.Logger
	recorder              record.EventRecorder
	getPods               eviction.ActivePodsFunc
	syncNodeStatus        func(context.Context)
	dbusCon               dbusInhibiter
	inhibitLock           systemd.InhibitLock
	nodeShuttingDownMutex sync.Mutex
	nodeShuttingDownNow   bool
	podManager            *podManager
}
```

---

## 6. Context as Logger Carrier

**Source:** `pkg/controller/deployment/deployment_controller.go` (lines 106, 179, 500)

### What it does
Kubernetes passes structured loggers through context:

```go
// pkg/controller/deployment/deployment_controller.go:179
logger := klog.FromContext(ctx)
logger.Info("Starting controller", "controller", "deployment")
```

### Why
At scale, you need structured logging with:
- Consistent key-value pairs (controller name, object reference)
- Verbosity levels (`logger.V(4).Info(...)`)
- No global state (context carries the logger configured by the caller)

### Key Insight
Stdlib's `log` package defaults to a single global logger. Kubernetes uses context-based structured logging (`klog.FromContext`) so that each call chain carries its own logger configuration. This enables filtering by controller, per-component verbosity tuning, and log correlation.

---

## 7. Functional Options for Configuration

**Source:** `staging/src/k8s.io/client-go/informers/factory.go` (lines 83–127)

### What it does
The SharedInformerFactory uses functional options for configuration:

```go
// staging/src/k8s.io/client-go/informers/factory.go:57
type SharedInformerOption func(*sharedInformerFactory) *sharedInformerFactory

func WithNamespace(namespace string) SharedInformerOption {
	return func(factory *sharedInformerFactory) *sharedInformerFactory {
		factory.namespace = namespace
		return factory
	}
}

func WithTransform(transform cache.TransformFunc) SharedInformerOption {
	return func(factory *sharedInformerFactory) *sharedInformerFactory {
		factory.transform = transform
		return factory
	}
}

func NewSharedInformerFactoryWithOptions(client kubernetes.Interface, defaultResync time.Duration, options ...SharedInformerOption) SharedInformerFactory {
	factory := &sharedInformerFactory{...}
	for _, opt := range options {
		factory = opt(factory)
	}
	return factory
}
```

### Why
APIs evolve. Adding a new configuration option shouldn't break callers. Functional options provide:
- Backward compatibility (new options don't change existing signatures)
- Self-documenting call sites (each option is a named function)
- Composability (options can be collected and applied conditionally)

---

## 8. Type-Safe Generics in Critical Paths

**Source:** `staging/src/k8s.io/client-go/util/workqueue/queue.go` (lines 33–200), `staging/src/k8s.io/client-go/gentype/type.go` (lines 33–120)

### What it does
Both workqueue and gentype use Go generics (1.18+) to provide type-safe interfaces while maintaining backward compatibility via type aliases:

```go
// Workqueue: type-safe queue
type TypedInterface[T comparable] interface {
	Add(item T)
	Get() (item T, shutdown bool)
	Done(item T)
}

// Type alias for backward compat
type Type = Typed[any]

// Gentype: type-safe client
type Client[T objectWithMeta] struct {
	resource  string
	client    rest.Interface
	namespace string
	newObject func() T
}
```

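The alias-based migration can be tried out on a toy queue. `typedQueue`, `newTypedQueue`, and `untypedQueue` are hypothetical names; the queue is a non-concurrent FIFO with deduplication and is far simpler than the real workqueue. Instantiating a `comparable` type parameter with `any` needs Go 1.20 or newer.

```go
package main

import "fmt"

// typedQueue is a hypothetical, non-concurrent FIFO that drops duplicates.
type typedQueue[T comparable] struct {
	items []T
	seen  map[T]struct{}
}

func newTypedQueue[T comparable]() *typedQueue[T] {
	return &typedQueue[T]{seen: map[T]struct{}{}}
}

// Add enqueues item unless an equal item is already waiting.
func (q *typedQueue[T]) Add(item T) {
	if _, dup := q.seen[item]; dup {
		return
	}
	q.seen[item] = struct{}{}
	q.items = append(q.items, item)
}

// Get removes and returns the oldest item; ok is false when the queue is empty.
func (q *typedQueue[T]) Get() (item T, ok bool) {
	if len(q.items) == 0 {
		return item, false
	}
	item, q.items = q.items[0], q.items[1:]
	delete(q.seen, item)
	return item, true
}

// untypedQueue keeps the pre-generics name compiling for existing callers.
type untypedQueue = typedQueue[any]

func main() {
	// New callers get compile-time checking and no type assertions.
	q := newTypedQueue[string]()
	q.Add("default/web")
	q.Add("default/web") // duplicate, ignored
	key, _ := q.Get()
	fmt.Println(key, len(q.items)) // prints "default/web 0"

	// Old callers keep working, type assertions included.
	var legacy *untypedQueue = newTypedQueue[any]()
	legacy.Add(42)
	v, _ := legacy.Get()
	fmt.Println(v.(int) + 1) // prints "43"
}
```

Since the alias is the same type as the generic instantiation, values flow between old and new call sites without conversion, which is what lets callers migrate one at a time.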
### Why
Before generics, Kubernetes used `interface{}` everywhere, requiring type assertions at every boundary. Generics remove a whole class of runtime type-assertion panics and make the code self-documenting.

### Key Insight
This is a migration pattern: introduce the generic version alongside the deprecated `interface{}` version using type aliases. Callers migrate at their own pace.

---

## 9. HandleCrash: Structured Panic Recovery

**Source:** `staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go` (lines 30–120)

### What it does
A standardized `defer HandleCrash()` pattern that:
1. Catches panics
2. Logs them with proper stack attribution
3. Invokes registered panic handlers
4. Re-panics afterwards unless the `ReallyCrash` variable is set to `false` (it defaults to `true`)

```go
// staging/src/k8s.io/apimachinery/pkg/util/runtime/runtime.go:78-82
func HandleCrashWithContext(ctx context.Context, additionalHandlers ...func(context.Context, interface{})) {
	if r := recover(); r != nil {
		handleCrash(ctx, r, additionalHandlers...)
	}
}
```

### Why
In a production system with hundreds of goroutines, an unrecovered panic in one kills the entire process. HandleCrash provides a standardized recovery point that:
- Logs the panic with caller attribution
- Allows cleanup handlers (shutdown gracefully)
- In tests, can be configured to not actually crash

### When to Use

**Triggers:**
- You're running multiple independent subsystems in one process (multiple controllers, background workers)
- A panic in one subsystem shouldn't kill the entire process
- You need structured logging of panic stack traces before potential recovery

**Example (before):**
```go
// One bad nil pointer in workerB kills workerA, workerC, and the whole server
func main() {
	ctx := context.Background()
	go workerA(ctx)
	go workerB(ctx) // panics → entire process dies
	go workerC(ctx)
	select {}
}
```

**Example (after):**
```go
import (
	"context"
	"log"
	"runtime/debug"
)

// safeGo runs f in its own goroutine and recovers any panic it raises.
func safeGo(ctx context.Context, name string, f func(ctx context.Context)) {
	go func() {
		defer func() {
			if r := recover(); r != nil {
				// Log, alert, increment a metric, but don't kill siblings.
				log.Printf("panic in %s: %v\n%s", name, r, debug.Stack())
			}
		}()
		f(ctx)
	}()
}

func main() {
	ctx := context.Background()
	safeGo(ctx, "worker-a", workerA)
	safeGo(ctx, "worker-b", workerB) // panics → logged, other workers continue
	safeGo(ctx, "worker-c", workerC)
	select {}
}
```

### Key Insight
Stdlib's default is "let it crash": an unrecovered panic ends the process. Kubernetes' approach is "catch it, log it, let the controller retry on the next sync." This is only safe because the controller pattern is idempotent.

---

## 10. ContextForChannel: Bridge Pattern

**Source:** `staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go` (lines 120–145)

### What it does
Bridges the older `<-chan struct{}` stop pattern to the modern `context.Context` pattern:

```go
// staging/src/k8s.io/apimachinery/pkg/util/wait/wait.go:120-142
func ContextForChannel(parentCh <-chan struct{}) context.Context {
	return channelContext{stopCh: parentCh}
}

type channelContext struct {
	stopCh <-chan struct{}
}

func (c channelContext) Done() <-chan struct{} { return c.stopCh }
func (c channelContext) Err() error {
	select {
	case <-c.stopCh:
		return context.Canceled
	default:
		return nil
	}
}
```

### Why
Kubernetes predates `context.Context` in the standard library (added in Go 1.7). A large body of its code still uses `stopCh <-chan struct{}`. Rather than a big-bang rewrite, this adapter allows gradual migration.

### Key Insight
**Large codebases can't do breaking API changes atomically.** This bridge pattern is how you evolve from one idiom to another over years without breaking everything at once.