diff --git a/patterns/ownership.md b/patterns/ownership.md
new file mode 100644
index 0000000..cf5eb76
--- /dev/null
+++ b/patterns/ownership.md
@@ -0,0 +1,677 @@
+# Rust Ownership & Lifetime Patterns
+
+Patterns for ownership, borrowing, and lifetimes in Rust, extracted
+from the standard library source.
+
+**Source:** [rust-lang/rust](https://github.com/rust-lang/rust) at commit
+[`f53b654`](https://github.com/rust-lang/rust/tree/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6)
+
+**Stats:** 7,909 &mut references, 798 Rc/Arc, 697 Box<>, 410 Cell/RefCell,
+197 Cow<>, 254 Drop impls, 401 PhantomData, 112 mem::take/replace.
+
+---
+
+## 1. Borrowing Over Owning in Parameters
+
+### Source:
+
+[library/alloc/src/string.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/alloc/src/string.rs) (String/str relationship)
+
+```rust
+// The stdlib accepts &str not String for read-only access:
+impl str {
+    pub fn contains<P: Pattern>(&self, pat: P) -> bool { ... }
+    pub fn starts_with<P: Pattern>(&self, pat: P) -> bool { ... }
+}
+```
+
+### Why
+
+Taking `&T` instead of `T` means the caller keeps ownership. This
+is the most fundamental Rust pattern: borrow when you only need to
+read, take `&mut T` when you need to modify in place, and own when
+you need to keep or consume the value.
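As a minimal sketch of that rule, the block below contrasts the two cases. All names are hypothetical and not taken from the standard library: `word_count` only reads its input, so it borrows, while `Greeter::new` stores the name, so it takes ownership.

```rust
// Hypothetical types and functions, for illustration only.

/// Only reads the text, so it borrows.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Keeps the name beyond the call, so the constructor takes ownership.
struct Greeter {
    name: String,
}

impl Greeter {
    fn new(name: String) -> Self {
        Greeter { name }
    }

    /// Hands out a borrow of the stored value; no clone needed.
    fn name(&self) -> &str {
        &self.name
    }

    fn greeting(&self) -> String {
        format!("Hello, {}!", self.name)
    }
}
```

A caller can pass `&text` to `word_count` and keep using `text` afterwards, whereas the `String` handed to `Greeter::new` is moved into the struct.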
+
+### When to Use
+
+**Triggers:**
+- You only need to read the value (don't need to store it)
+- You want to accept both owned and borrowed values (String and &str)
+- You don't want to force callers to clone
+
+**Example — before:**
+```rust
+// Takes ownership unnecessarily — forces caller to clone
+fn greet(name: String) {
+    println!("Hello, {name}!");
+}
+
+let name = String::from("Alice");
+greet(name); // name is moved — can't use it anymore
+```
+
+**Example — after:**
+```rust
+// Borrows — caller keeps ownership
+fn greet(name: &str) {
+    println!("Hello, {name}!");
+}
+
+let name = String::from("Alice");
+greet(&name); // works with &String (deref coercion)
+greet("Bob"); // works with &str directly
+// name is still usable here
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- You need to store the value in a struct (take ownership)
+- You need to send it to another thread (`thread::spawn` requires
+  `'static` data, so pass an owned value)
+- The value must outlive the call (for example, a returned Future
+  that must be `'static`)
+
+---
+
+## 2. Clone vs Copy Semantics
+
+### Source:
+
+[library/core/src/clone.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/clone.rs), [library/core/src/marker.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/marker.rs) (Copy)
+
+Top derives: Clone (880), Copy (537).
+
+```rust
+// Copy = implicit bitwise copy (stack only, no heap)
+#[derive(Clone, Copy)]
+pub struct Duration { secs: u64, nanos: Nanoseconds }
+
+// Clone only = explicit .clone() needed (may allocate)
+#[derive(Clone)]
+pub struct String { vec: Vec<u8> } // Can't be Copy — heap allocated
+```
+
+### Why
+
+Copy types are implicitly duplicated on assignment (like integers).
+Clone types require explicit `.clone()`. The distinction prevents
+accidental expensive copies. If a type is small plain data and will
+stay that way, derive Copy.
+
+### When to Use
+
+**Triggers:**
+- **Copy:** Type is small, stack-only, bitwise-copyable (no heap,
+  no Drop impl, no `&mut` fields)
+- **Clone only:** Type allocates, has Drop, or copying is expensive
+
+**Example — before:**
+```rust
+// Missing Copy on a small stack type
+struct Point { x: f64, y: f64 }
+
+let p = Point { x: 1.0, y: 2.0 };
+let q = p; // MOVED — p is now invalid
+// println!("{}", p.x); // ERROR: use after move
+```
+
+**Example — after:**
+```rust
+#[derive(Clone, Copy)]
+struct Point { x: f64, y: f64 }
+
+let p = Point { x: 1.0, y: 2.0 };
+let q = p; // COPIED — p is still valid
+println!("{}", p.x); // Works fine
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- Type has heap allocation (Vec, String, Box) — can't be Copy
+- Type has a Drop impl — can't be Copy
+- Implicit copying would be expensive or surprising
+- Adding Copy now would be a breaking change to remove later
+
+### Anti-pattern
+
+```rust
+// DON'T: Derive Copy on something that might grow
+#[derive(Clone, Copy)]
+struct Config {
+    port: u16,
+    // Later you add: name: String — BREAKS because String isn't Copy
+}
+
+// DO: Only derive Copy on types that are permanently small and stack-only
+```
+
+---
+
+## 3. Cow<'a, B> (Clone on Write)
+
+### Source:
+
+[library/alloc/src/borrow.rs#L169](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/alloc/src/borrow.rs#L169)
+
+```rust
+pub enum Cow<'a, B: ?Sized + 'a> where B: ToOwned {
+    Borrowed(&'a B),
+    Owned(<B as ToOwned>::Owned),
+}
+```
+
+197 usages in the stdlib.
+
+### Why
+
+Cow delays cloning until mutation is needed. If you only read, you
+pay zero allocation cost (borrowed). If you need to modify, it
+clones on first write (via `to_mut()`). This is the "avoid
+allocation in the common case" pattern.
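The clone-on-first-write behavior can be sketched with `Cow::to_mut`, which is the real API that performs the lazy clone. The helper names `ensure_trailing_newline` and `is_borrowed` are hypothetical and exist only for this illustration.

```rust
use std::borrow::Cow;

/// Hypothetical helper: appends a newline only if one is missing.
/// The borrowed input is cloned only on the path that mutates it.
fn ensure_trailing_newline(text: &mut Cow<'_, str>) {
    if !text.ends_with('\n') {
        // to_mut() converts Borrowed into Owned (cloning once) and
        // returns a mutable reference to the owned String.
        text.to_mut().push('\n');
    }
}

/// Reports whether the Cow still holds borrowed data.
fn is_borrowed(c: &Cow<'_, str>) -> bool {
    matches!(c, Cow::Borrowed(_))
}
```

Input that already ends in a newline stays `Cow::Borrowed` and never allocates; only input that needs the change becomes `Cow::Owned`.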
+
+### When to Use
+
+**Triggers:**
+- A function usually returns borrowed data but sometimes needs to
+  allocate (e.g., string escaping — most strings need no change)
+- You want to accept either owned or borrowed without forcing a clone
+- Performance matters and most inputs don't need modification
+
+**Example — before:**
+```rust
+// Always allocates even when input needs no change
+fn escape(s: &str) -> String {
+    if s.contains('<') {
+        s.replace('<', "&lt;") // allocates
+    } else {
+        s.to_string() // ALSO allocates — unnecessary!
+    }
+}
+```
+
+**Example — after:**
+```rust
+use std::borrow::Cow;
+
+fn escape(s: &str) -> Cow<'_, str> {
+    if s.contains('<') {
+        Cow::Owned(s.replace('<', "&lt;")) // allocates only when needed
+    } else {
+        Cow::Borrowed(s) // zero-cost — just returns a reference
+    }
+}
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- You always modify the value (just return the owned type)
+- The lifetime makes the API confusing for callers
+- The performance difference is negligible
+
+---
+
+## 4. mem::take / mem::replace (Move Out of &mut)
+
+### Source:
+
+[library/core/src/mem/mod.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/mem/mod.rs)
+
+112 usages of `mem::take`/`mem::replace` in the stdlib.
+
+```rust
+pub fn take<T: Default>(dest: &mut T) -> T {
+    replace(dest, T::default())
+}
+
+pub fn replace<T>(dest: &mut T, src: T) -> T {
+    // Moves src into dest, returns old dest
+    unsafe { ... }
+}
+```
+
+### Why
+
+You can't move out of a `&mut T` directly (that would leave the
+borrowed location uninitialized). `mem::take` solves this by
+replacing the value with its Default and giving you the old value.
+This is the idiomatic way to "take ownership through a mutable
+reference."
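When a type has no `Default`, `mem::replace` with an explicit placeholder does the same job. The sketch below assumes a hypothetical `Connection` state machine (none of these names come from the stdlib) and moves a `String` from one state to the next without cloning it.

```rust
use std::mem;

/// Hypothetical connection state; the String payloads make it non-Copy,
/// and it deliberately has no Default impl.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    Connecting { host: String },
    Connected { host: String },
}

struct Connection {
    state: State,
}

impl Connection {
    /// Advances Connecting -> Connected, moving `host` across without a clone.
    fn finish_connect(&mut self) {
        // Swap in a placeholder so the old state is owned by value.
        let old = mem::replace(&mut self.state, State::Idle);
        self.state = match old {
            State::Connecting { host } => State::Connected { host },
            other => other, // any other state is put back unchanged
        };
    }
}
```

The placeholder is only visible between the two assignments, so callers never observe it unless the code in between panics.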
+
+### When to Use
+
+**Triggers:**
+- You have `&mut self` and need to move a field out
+- You're implementing state machines (swap states)
+- You need to "reset" a field while extracting its value
+
+**Example — before:**
+```rust
+struct Parser {
+    buffer: Vec<u8>,
+}
+
+impl Parser {
+    fn flush(&mut self) -> Vec<u8> {
+        let result = self.buffer.clone(); // EXPENSIVE clone
+        self.buffer.clear();
+        result
+    }
+}
+```
+
+**Example — after:**
+```rust
+use std::mem;
+
+impl Parser {
+    fn flush(&mut self) -> Vec<u8> {
+        mem::take(&mut self.buffer) // moves buffer out, replaces with empty Vec
+        // Zero allocations, zero copies
+    }
+}
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The type doesn't implement Default (use `mem::replace` with an
+  explicit replacement value)
+- You need to keep the original value (use clone)
+- You're in a consuming method (just take self)
+
+---
+
+## 5. Box<T> for Heap Allocation
+
+### Source:
+
+[library/alloc/src/boxed.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/alloc/src/boxed.rs)
+
+697 Box<> usages in library/.
+
+### Why
+
+Box is the simplest smart pointer: single-owner heap allocation.
+Use it when you need a value on the heap with known, fixed ownership.
+Recursive types REQUIRE boxing (otherwise infinite size).
+
+### When to Use
+
+**Triggers:**
+- Recursive data structures (tree nodes, linked lists)
+- Trait objects (`Box<dyn Trait>`, such as `Box<dyn Error>`)
+- Large values you want on the heap to avoid stack overflow
+- Transferring ownership without copying large structs
+
+**Example — before:**
+```rust
+// COMPILE ERROR: recursive type has infinite size
+enum Tree {
+    Leaf(i32),
+    Node(Tree, Tree), // How big is Tree? Depends on Tree... infinite
+}
+```
+
+**Example — after:**
+```rust
+enum Tree {
+    Leaf(i32),
+    Node(Box<Tree>, Box<Tree>), // Box is pointer-sized — finite!
+}
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The value is small and stack allocation is fine
+- You need shared ownership (use Rc/Arc)
+- You need interior mutability (use RefCell or Mutex)
+- You're boxing just to avoid lifetime annotations (fix the lifetimes)
+
+---
+
+## 6. Arc<T> for Shared Ownership Across Threads
+
+### Source:
+
+[library/alloc/src/sync.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/alloc/src/sync.rs)
+
+798 Rc/Arc usages in library/.
+
+### Why
+
+Arc (Atomic Reference Counted) enables multiple owners across threads.
+The data lives on the heap; cloning an Arc increments the reference
+count (not the data). Data is freed when the last Arc is dropped.
+
+### When to Use
+
+**Triggers:**
+- Multiple threads need read access to the same data
+- Ownership is shared (no single owner)
+- Combined with Mutex for shared mutable state: `Arc<Mutex<T>>`
+
+**Example — before:**
+```rust
+// Can't send &data to another thread — lifetime issue
+fn spawn_with_data(data: &Config) {
+    std::thread::spawn(|| {
+        // ERROR: borrowed data may not outlive the scope
+        println!("{}", data.name);
+    });
+}
+```
+
+**Example — after:**
+```rust
+use std::sync::Arc;
+
+fn spawn_with_data(data: Arc<Config>) {
+    let data = Arc::clone(&data); // cheap: just increments counter
+    std::thread::spawn(move || {
+        println!("{}", data.name); // owns a reference, lives as long as needed
+    });
+}
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- Single-threaded (use Rc — no atomic overhead)
+- Single owner is sufficient (use Box)
+- You can restructure to avoid shared ownership (prefer this)
+- You're using it because you can't figure out lifetimes (usually
+  a design smell)
+
+### Anti-pattern
+
+```rust
+// DON'T: Arc everything because lifetimes are hard
+struct App {
+    db: Arc<Database>,
+    cache: Arc<Cache>,
+    config: Arc<Config>,
+    logger: Arc<Logger>,
+}
+// If App owns all of these, just use owned fields!
+// Arc is for SHARING, not for avoiding lifetime annotations.
+```
+
+---
+
+## 7. Drop for Cleanup (RAII)
+
+### Source:
+
+[library/core/src/ops/drop.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/ops/drop.rs)
+
+254 Drop implementations in the stdlib.
+
+```rust
+pub trait Drop {
+    fn drop(&mut self);
+}
+```
+
+### Why
+
+Drop runs automatically when a value goes out of scope. This is RAII:
+resources are tied to ownership. File handles close, locks release,
+memory frees — all automatically, without try/finally or defer.
+
+### When to Use
+
+**Triggers:**
+- Your type holds an external resource (file, socket, lock)
+- You need cleanup code to run even on panic
+- You're implementing a smart pointer or wrapper
+
+**Example — before (in another language):**
+```python
+# Must remember to close — easy to forget on error paths
+f = open("data.txt")
+try:
+    process(f)
+finally:
+    f.close() # What if we forget this?
+```
+
+**Example — in Rust:**
+```rust
+use std::fs::File;
+
+// Drop handles cleanup automatically
+{
+    let f = File::open("data.txt")?;
+    process(&f)?;
+} // `f` is dropped here — file closed automatically
+  // Even if process() panics, Drop still runs during unwinding
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- Type doesn't hold external resources (just let memory free)
+- You need Copy (can't impl both Drop and Copy)
+- Cleanup order matters precisely (locals drop in reverse declaration
+  order and struct fields in declaration order, which isn't always
+  what you want — use explicit close methods)
+
+---
+
+## 8. Lifetime Elision (Let the Compiler Infer)
+
+### Source:
+
+The Rust compiler applies 3 elision rules automatically. The stdlib
+exploits this — most function signatures DON'T write lifetimes:
+
+```rust
+// Written in source (elided):
+fn first_word(s: &str) -> &str { ... }
+
+// What the compiler actually sees:
+fn first_word<'a>(s: &'a str) -> &'a str { ...
+}
+```
+
+1,206 explicit lifetime annotations exist in library/ — only
+where elision rules don't apply.
+
+### Why
+
+Lifetimes are inferred in the common case. Only annotate when:
+- Multiple references in input (compiler can't guess which one
+  the output borrows from)
+- Struct fields that borrow
+- Static or complex relationships
+
+### When to Use
+
+**Triggers:**
+- Function has one reference parameter and returns a reference
+  (rule 2: output gets the single input's lifetime)
+- Method has `&self` and returns a reference
+  (rule 3: output gets self's lifetime)
+
+**Example — before:**
+```rust
+// Unnecessary explicit lifetime: one input reference, so elision applies
+fn trim<'a>(s: &'a str) -> &'a str { s.trim() }
+```
+
+**Example — after:**
+```rust
+// Let elision work when it can:
+fn trim(s: &str) -> &str { s.trim() }
+// No annotations needed — one input ref, one output ref → same lifetime
+
+// Only annotate when the compiler ASKS you to.
+// This NEEDS annotations because there are TWO input references:
+fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
+    if x.len() > y.len() { x } else { y }
+}
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- Multiple input references exist (compiler can't infer which)
+- Struct borrows from multiple sources
+- The relationship is non-obvious (annotate for clarity even
+  if the compiler could infer it)
+
+---
+
+## 9. AsRef/Into for Flexible Function Parameters
+
+### Source:
+
+[library/core/src/convert/mod.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/convert/mod.rs)
+
+79 AsRef implementations in library/.
+
+```rust
+// std::fs::read accepts anything that can become a Path:
+pub fn read<P: AsRef<Path>>(path: P) -> io::Result<Vec<u8>> { ... }
+```
+
+### Why
+
+`AsRef<Path>` accepts `&str`, `String`, `PathBuf`, `&Path` — any
+type that cheaply references a Path. One function, many input types,
+zero allocation.
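One way to keep an `AsRef`-generic function cheap to compile is to make the generic part a thin shell around a non-generic inner function, so the real logic is compiled once instead of once per caller type. A minimal sketch, with `has_extension` as a hypothetical helper rather than a stdlib API:

```rust
use std::path::Path;

/// Hypothetical helper: true if `path` has the given extension.
/// The generic shell only converts; the logic lives in `inner`.
fn has_extension<P: AsRef<Path>>(path: P, ext: &str) -> bool {
    // Non-generic: compiled once, no matter how many P types are used.
    fn inner(path: &Path, ext: &str) -> bool {
        path.extension().and_then(|e| e.to_str()) == Some(ext)
    }
    inner(path.as_ref(), ext)
}
```

Calls such as `has_extension("notes.md", "md")` and `has_extension(PathBuf::from("/tmp/notes.md"), "md")` both go through the same `inner` body after a cheap `as_ref()` conversion.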
+
+### When to Use
+
+**Triggers:**
+- You want to accept multiple types that can cheaply become &T
+- Ergonomics: let callers pass String or &str interchangeably
+- The conversion is cheap (reference, not allocation)
+
+**Example — before:**
+```rust
+// Accepts only &Path — callers must convert manually
+fn file_exists(path: &Path) -> bool {
+    path.exists()
+}
+
+file_exists(Path::new("/tmp/foo")); // works
+// file_exists("/tmp/foo"); // ERROR — &str isn't &Path
+```
+
+**Example — after:**
+```rust
+fn file_exists(path: impl AsRef<Path>) -> bool {
+    path.as_ref().exists()
+}
+
+file_exists("/tmp/foo");            // &str works
+file_exists(String::from("/tmp"));  // String works
+file_exists(PathBuf::from("/tmp")); // PathBuf works
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- The function is private/internal (just accept the concrete type)
+- The generic bound adds confusion for no real benefit
+- You need ownership, not a reference (use `Into<T>` instead)
+
+---
+
+## 10. PhantomData<T> for Unused Type Parameters
+
+### Source:
+
+[library/core/src/marker.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/marker.rs) (PhantomData)
+
+401 PhantomData usages in the stdlib.
+
+```rust
+pub struct PhantomData<T: ?Sized>;
+```
+
+### Why
+
+Sometimes you need a type parameter for lifetime or type safety but
+don't actually store the value. PhantomData tells the compiler "I'm
+logically related to T" without physically containing T. Used for
+variance, drop checking, and marker relationships.
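The lifetime side of this can be sketched with a raw-pointer wrapper. `SliceView` is a hypothetical type made up for this example: it stores only a pointer and a length, and `PhantomData<&'a T>` is what ties it to the borrow of the original slice. Without the marker field the struct would not compile, because `'a` would be unused.

```rust
use std::marker::PhantomData;

/// Hypothetical read-only view over a slice, stored as a raw pointer.
/// PhantomData<&'a T> makes the view behave like `&'a [T]` for borrow
/// checking and variance, even though no reference is stored.
struct SliceView<'a, T> {
    ptr: *const T,
    len: usize,
    _marker: PhantomData<&'a T>,
}

impl<'a, T> SliceView<'a, T> {
    fn new(slice: &'a [T]) -> Self {
        SliceView {
            ptr: slice.as_ptr(),
            len: slice.len(),
            _marker: PhantomData,
        }
    }

    fn get(&self, index: usize) -> Option<&'a T> {
        if index < self.len {
            // SAFETY: `ptr` came from a slice of length `len` that is
            // borrowed for 'a, and `index` was just checked to be in bounds.
            Some(unsafe { &*self.ptr.add(index) })
        } else {
            None
        }
    }
}
```

Because of the marker, the borrow checker rejects any use of a `SliceView` after the slice it was built from has been dropped or mutably borrowed.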
+
+### When to Use
+
+**Triggers:**
+- Your struct has a type parameter it doesn't store (for type safety)
+- You need correct variance (covariant, contravariant, invariant)
+- You need the compiler to know your type "owns" a T for drop checking
+- Implementing raw pointer wrappers that should act like &T
+
+**Example — before:**
+```rust
+// COMPILE ERROR: parameter T is never used
+struct TypedId<T> {
+    id: u64,
+}
+```
+
+**Example — after:**
+```rust
+use std::marker::PhantomData;
+
+struct TypedId<T> {
+    id: u64,
+    _phantom: PhantomData<T>, // tells compiler this is related to T
+}
+
+// Now TypedId<User> and TypedId<Product> are different types!
+let user_id: TypedId<User> = TypedId { id: 42, _phantom: PhantomData };
+let product_id: TypedId<Product> = TypedId { id: 42, _phantom: PhantomData };
+// user_id == product_id // COMPILE ERROR: different types
+```
+
+### When NOT to Use
+
+**Don't use this when:**
+- You can just store the T directly
+- The type parameter doesn't add safety (just remove it)
+- You're cargo-culting PhantomData without understanding variance
+
+---
+
+## Summary: Ownership Decision Tree
+
+```
+Who needs this value?
+├── One owner, stack → plain value (no wrapper)
+├── One owner, heap → Box<T>
+├── Multiple owners, single thread → Rc<T>
+├── Multiple owners, multi thread → Arc<T>
+└── Need mutation with shared refs?
+    ├── Single thread → RefCell<T> (runtime borrow check)
+    └── Multi thread → Mutex<T> or RwLock<T>
+
+How to pass to a function?
+├── Read only → &T (borrow)
+├── Need to modify → &mut T (mutable borrow)
+├── Need to store/send → T (ownership transfer)
+├── Accept multiple types → impl AsRef<T> or impl Into<T>
+└── Might need to clone → Cow<'a, T>
+
+Need to move out of &mut?
+└── mem::take (replaces with Default, returns old value)
+```
+
+| Pattern | Use when |
+|---|---|
+| `&T` parameter | Read-only access, no ownership needed |
+| `T` parameter | Must store, send, or consume the value |
+| `Clone`/`Copy` | Small stack types (Copy), expensive heap types (Clone) |
+| `Cow<'a, B>` | Usually borrowed, sometimes needs to own |
+| `mem::take` | Move a field out of &mut self |
+| `Box<T>` | Heap allocation with single owner |
+| `Arc<T>` | Shared ownership across threads |
+| `Drop` | RAII cleanup (files, locks, connections) |
+| Lifetime elision | Let compiler infer when rules apply |
+| `AsRef<T>` | Accept multiple types cheaply |
+| `PhantomData<T>` | Unused type parameter for type safety |
+
+See also:
+- [traits.md](traits.md) — Clone, Copy, Drop trait design
+- [concurrency.md](concurrency.md) — Arc/Mutex patterns
+- [error-handling.md](error-handling.md) — Ownership in error propagation
+