# Anti-Patterns: What Rust's Stdlib Avoids (and Why) Patterns the Rust standard library team actively avoids, extracted from studying what they DON'T do in their source code. **Source:** [rust-lang/rust](https://github.com/rust-lang/rust) at commit [`f53b654`](https://github.com/rust-lang/rust/tree/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6) --- ## 1. unwrap() in Library Code **What they avoid:** Using `.unwrap()` in non-test, non-example code without an invariant proof. **Source evidence:** 1,790 unwrap/expect calls in library/ (non-test). Concentrated in: (1) proven invariants with expect() messages, (2) compile-time constants, (3) doc examples. Zero bare `.unwrap()` calls where the value could legitimately be None/Err at runtime. **Why it's bad:** unwrap() panics. In library code, a panic kills the caller's thread without giving them a chance to handle the error. Libraries should never make that decision for their users. **What to do instead:** ```rust // BAD — library code that panics on user input pub fn parse_config(input: &str) -> Config { let value: serde_json::Value = serde_json::from_str(input).unwrap(); Config { name: value["name"].as_str().unwrap().to_string() } } // GOOD — returns Result, caller decides how to handle pub fn parse_config(input: &str) -> Result { let value: serde_json::Value = serde_json::from_str(input)?; let name = value["name"].as_str() .ok_or(ConfigError::MissingField("name"))?; Ok(Config { name: name.to_string() }) } ``` ### When to Apply This Rule **Triggers:** - `.unwrap()` in any `pub fn` that takes external input - `.unwrap()` without a comment explaining why it can't fail - `.unwrap()` in a library (as opposed to a binary/application) ### Exceptions **It's OK when:** - Proven invariant: `self.items.last().expect("items is never empty after init")` - Compile-time constant: `Regex::new(r"^\d+$").unwrap()` (pattern is always valid) - Test code: tests should panic on unexpected errors - Application code where you WANT to crash on bad state --- ## 2. String as Error Type **What they avoid:** Using `String` or `&str` as the error type in `Result`. **Source evidence:** 0 public functions in library/ return `Result`. Every error is a named struct or enum that implements the Error trait. **Why it's bad:** Can't match on it programmatically. Callers must string-compare to determine what went wrong. No type safety. No structured data. Breaks the entire Error trait ecosystem. **What to do instead:** ```rust // BAD — stringly-typed error fn connect(addr: &str) -> Result { if !valid(addr) { return Err(format!("invalid address: {addr}")); } if timeout() { return Err("connection timed out".to_string()); } Ok(conn) } // Caller: if err.contains("timeout") { ... } — fragile! // GOOD — typed error enum #[derive(Debug)] enum ConnectError { InvalidAddress(String), Timeout(Duration), Refused, } impl fmt::Display for ConnectError { ... } impl std::error::Error for ConnectError {} fn connect(addr: &str) -> Result { ... } // Caller: match err { ConnectError::Timeout(d) => retry(d), ... } ``` ### When to Apply This Rule **Triggers:** - `Result` in any function signature - Error handling that does string matching - Multiple different failure modes lumped into one String ### Exceptions **It's OK when:** - Quick scripts/prototypes that will be thrown away - Internal one-off tools where the only consumer is a human reading logs --- ## 3. Unsafe Without SAFETY Comment **What they avoid:** Any `unsafe { }` block without an accompanying `// SAFETY:` comment explaining the soundness proof. **Source evidence:** 2,463 `// SAFETY:` comments for 31,244 unsafe blocks. The density is lower than 1:1 because many unsafe blocks are one-liners inside already-documented unsafe functions. But every PUBLIC-facing unsafe block has a comment. **Why it's bad:** Without the comment, no one (including future-you) can audit whether the unsafe code is actually sound. It might be correct today but break after a refactor that invalidates an assumption no one documented. **What to do instead:** ```rust // BAD — unsafe with no justification unsafe { ptr::copy_nonoverlapping(src, dst, len); } // GOOD — proves soundness at the call site // SAFETY: src points to self.buf[0..self.len] and dst points to // out[0..self.len]. They're from different allocations so they // can't overlap. len is bounded by self.len which we checked above. unsafe { ptr::copy_nonoverlapping(src, dst, len); } ``` ### When to Apply This Rule **Triggers:** - Every `unsafe { }` block (no exceptions) - The comment should explain WHY, not just WHAT ### Exceptions **None.** This rule has no exceptions in the stdlib. --- ## 4. Clone to Satisfy the Borrow Checker **What they avoid:** Sprinkling `.clone()` calls to make the borrow checker happy instead of restructuring the code. **Source evidence:** The stdlib uses clone deliberately for explicit copies, never as a borrow-checker escape hatch. Clone calls are concentrated in: (1) actual need for independent copies, (2) Cow clone-on-write, (3) test setup. **Why it's bad:** Hides the real problem (ownership structure is wrong). Adds unnecessary allocations. Makes performance unpredictable. The borrow checker error is telling you your design has a flaw. **What to do instead:** ```rust // BAD — cloning because the borrow checker complains fn process(data: &mut Vec) { for item in data.clone().iter() { // CLONE to avoid borrow conflict if item.is_empty() { data.retain(|s| !s.is_empty()); } } } // GOOD — restructure to avoid the conflict fn process(data: &mut Vec) { data.retain(|s| !s.is_empty()); // just do what you actually want } ``` ### When to Apply This Rule **Triggers:** - `.clone()` adjacent to a borrow checker error comment - Clone inside a loop (O(n) allocations) - Cloning a large struct just to use one field ### Exceptions **It's OK when:** - You genuinely need two independent copies (fork a value) - The type is cheap to clone (small Copy types, Arc) - Prototyping to get the logic right before optimizing --- ## 5. Deref as Inheritance **What they avoid:** Using the `Deref` trait to simulate inheritance from OOP languages. **Source evidence:** Every Deref impl in the stdlib follows the "smart pointer" pattern: Box→T, Vec→[T], String→str, Arc→T. Zero cases of "struct A derefs to struct B" where B isn't the logical inner content. **Why it's bad:** Deref coercion is implicit — the compiler inserts it silently. If Dog derefs to Animal, users can accidentally call Animal methods on Dog without realizing they're crossing an abstraction boundary. Method resolution becomes unpredictable. **What to do instead:** ```rust // BAD — using Deref as inheritance struct Animal { name: String } struct Dog { animal: Animal, tricks: Vec } impl Deref for Dog { type Target = Animal; fn deref(&self) -> &Animal { &self.animal } } // dog.name works — but it's confusing, not actually inheritance // GOOD — explicit delegation struct Dog { name: String, tricks: Vec } impl Dog { fn name(&self) -> &str { &self.name } } ``` ### When to Apply This Rule **Triggers:** - Deref where the target type isn't "contained" content - Using Deref to avoid writing delegation methods - Deref between two types at the same abstraction level ### Exceptions **It's OK when:** - Your type IS a smart pointer (wraps and owns one inner value) - The target is clearly "the thing inside" (Vec→[T], Box→T) - Deref target is at a lower abstraction level (concrete → abstract) --- ## 6. Panic for Recoverable Errors **What they avoid:** Using `panic!()` for conditions that callers could reasonably handle. **Source evidence:** 514 panic! calls in library/ (non-test). ALL are for: (1) invariant violations (bugs), (2) index out of bounds where the contract says "index must be valid", (3) unimplemented/unreachable. Zero panics for "file not found" or "parse failed" type errors. **Why it's bad:** Panic unwinds the stack. The caller can't recover gracefully. In servers, it kills the request. In libraries, it's a contract violation — you're deciding for the user that this error is fatal. **What to do instead:** ```rust // BAD — panicking on user-facing errors pub fn load_config(path: &str) -> Config { let text = std::fs::read_to_string(path) .expect("failed to read config"); // KILLS caller's thread toml::from_str(&text).expect("invalid config") } // GOOD — let the caller decide pub fn load_config(path: &str) -> Result { let text = std::fs::read_to_string(path)?; let config = toml::from_str(&text)?; Ok(config) } ``` ### When to Apply This Rule **Triggers:** - panic!/expect/unwrap on external input (files, network, user data) - panic in library code (not application code) - "Failed to X" in expect messages where X is an I/O operation ### Exceptions **It's OK when:** - Programmer bug: index out of bounds on internal array (invariant violated) - Impossible state: match arm that can't be reached by construction - Application main(): crashing on startup config failure is reasonable --- ## 7. God Struct (All State in One Place) **What they avoid:** A single struct that holds the entire application state with dozens of fields. **Source evidence:** The stdlib separates concerns into distinct types. `io::Error` has 2 fields. `Vec` has 3 fields. The largest public struct is OsString at 1 field (the inner platform string). No struct in the stdlib has more than 6-7 fields. **Why it's bad:** Can't borrow parts independently (borrowing one field borrows the whole struct). Hard to test in isolation. Methods grow unboundedly. Violates single responsibility. **What to do instead:** ```rust // BAD — god struct struct App { db: Database, cache: Cache, config: Config, logger: Logger, metrics: Metrics, users: HashMap, sessions: HashMap, pending_jobs: VecDeque, } // Every method has access to everything. Can't test cache without db. // GOOD — separated concerns struct App { db: Database, cache: CacheLayer, jobs: JobQueue } struct CacheLayer { inner: Cache, config: CacheConfig } struct JobQueue { pending: VecDeque, metrics: Metrics } ``` ### When to Apply This Rule **Triggers:** - Struct with 8+ fields - Methods that only use 2-3 of the struct's fields - Can't write a unit test without constructing the entire struct - Borrow checker fights when trying to use two parts simultaneously ### Exceptions **It's OK when:** - Configuration structs (many fields, all read-only, constructed once) - DTOs for serialization (matching an external schema) - Builder intermediate state (temporarily holds many optional fields) --- ## 8. Box Instead of Generics **What they avoid:** Using `Box` with downcasting to erase types instead of using proper generics. **Source evidence:** `Any` is used in the stdlib only for: panic payloads (`Box`) and the reflection system. Zero uses of `dyn Any` as a general-purpose container or map value. **Why it's bad:** Loses all type information. Downcasting can fail at runtime. You've recreated dynamic typing inside a statically-typed language — the compiler can't help you anymore. **What to do instead:** ```rust // BAD — type erasure via Any (Java's Object approach) struct Registry { items: HashMap>, } impl Registry { fn get(&self, key: &str) -> Option<&T> { self.items.get(key)?.downcast_ref() // runtime failure if wrong type! } } // GOOD — use generics or typed enums enum ConfigValue { String(String), Int(i64), Bool(bool) } struct Config { values: HashMap } // Or: use generics to make the container type-safe struct TypedMap(HashMap>); // at least TypeId keyed ``` ### When to Apply This Rule **Triggers:** - `Box` in a non-error, non-panic context - `.downcast_ref::()` calls scattered through the code - "I don't know what type this will be" — you probably do ### Exceptions **It's OK when:** - Panic handling (that's what `catch_unwind` returns) - Plugin systems where types genuinely aren't known at compile time - Thin FFI wrappers bridging to dynamic languages --- ## 9. Shared Mutable State Without Synchronization **What they avoid:** Any `static mut` or mutable global state without proper synchronization primitives. **Source evidence:** Zero `static mut` in public library APIs. All shared mutable state uses: `AtomicUsize` (783 usages), `Mutex` (135 usages), `OnceLock` (254 usages), or `thread_local!`. The compiler prevents data races at compile time via Send/Sync. **Why it's bad:** `static mut` is always unsafe and the unsoundest construct in Rust. Multiple threads accessing it simultaneously is undefined behavior. Even single-threaded access requires unsafe. **What to do instead:** ```rust // BAD — global mutable state (UB waiting to happen) static mut COUNTER: u64 = 0; fn increment() { unsafe { COUNTER += 1; } // DATA RACE if called from multiple threads } // GOOD — atomic for simple counters static COUNTER: AtomicU64 = AtomicU64::new(0); fn increment() { COUNTER.fetch_add(1, Ordering::Relaxed); // thread-safe, no unsafe } // GOOD — OnceLock for lazy initialization static CONFIG: OnceLock = OnceLock::new(); fn get_config() -> &'static Config { CONFIG.get_or_init(|| load_config()) } ``` ### When to Apply This Rule **Triggers:** - `static mut` anywhere - `unsafe` blocks that access global state - Global state without atomic/mutex/rwlock protection ### Exceptions **Literally none for `static mut`.** There is no safe use case that can't be expressed with atomics, OnceLock, or Mutex. The Rust team has discussed deprecating `static mut` entirely. --- ## 10. Stringly-Typed APIs (Magic Constants) **What they avoid:** Using string constants where enums would provide compile-time checking. **Source evidence:** The stdlib uses enums extensively: `ErrorKind` (30+ variants), `Ordering` (3 variants), `SeekFrom` (3 variants). String parameters are only for genuinely dynamic data (file paths, user input, format strings). **Why it's bad:** Typos compile fine. `"relaexd"` won't be caught. No autocomplete. No exhaustive matching. Refactoring is grep-based instead of compiler-based. **What to do instead:** ```rust // BAD — string constants (invisible typos) fn set_log_level(level: &str) { ... } set_log_level("wraning"); // typo compiles fine — bug at runtime // GOOD — enum (typos are compile errors) enum LogLevel { Error, Warning, Info, Debug, Trace } fn set_log_level(level: LogLevel) { ... } set_log_level(LogLevel::Wraning); // COMPILE ERROR — caught immediately ``` ### When to Apply This Rule **Triggers:** - String parameters with a fixed set of valid values - Match/if-else chains comparing strings - Documentation that says "valid values are: X, Y, Z" ### Exceptions **It's OK when:** - Values are genuinely dynamic (user input, file paths, URLs) - Interop with external systems that use strings (HTTP headers, env vars) - The set of values changes frequently (but consider #[non_exhaustive] enum)