# Common Mistakes: Language Transition Smells in Rust Patterns people bring from other languages (C++, Java, Python, Go) that don't work well in Rust. These aren't necessarily wrong everywhere — they're smells that suggest the author hasn't yet internalized Rust's ownership model. --- ## 1. Arc> Everywhere (Borrow Checker Avoidance) **What it looks like:** Every shared reference wrapped in `Arc>` even when single-threaded or single-owner. **Where it comes from:** Java/C# where everything is garbage-collected and shared references are free. Developers use Arc as "Rust's version of a reference." **Why it's a smell:** Arc has atomic overhead on every clone/drop. Mutex has lock overhead on every access. If you're doing this single-threaded, you're paying for concurrency you don't use. It usually means the ownership structure is wrong. **What to do instead:** ```rust // SMELL — Arc as a crutch struct App { config: Arc>, db: Arc>, cache: Arc>, } // Every access: self.config.lock().unwrap().timeout // FIX — own what you own, borrow what you borrow struct App { config: Config, // App owns config (no sharing needed) db: Database, // App owns database cache: Cache, // App owns cache } // Access: self.config.timeout ``` ### When to Apply This Rule **Triggers:** - `Arc>` in single-threaded code - Every field in a struct wrapped in Arc - More than 2 Arc fields (usually wrong) - Using `.lock().unwrap()` dozens of times per function ### Exceptions **It's OK when:** - The value IS shared across threads (that's what Arc is for) - You've restructured and genuinely need shared ownership - Async code with multiple tasks accessing shared state --- ## 2. Collecting Iterators Unnecessarily **What it looks like:** `.collect::>()` followed by another iteration over the collected vec. **Where it comes from:** Python lists and Java streams where intermediate collections are normal. **Why it's a smell:** Allocates a Vec you immediately discard. Iterators are lazy and compose without allocation. Two passes through the data when one would suffice. **What to do instead:** ```rust // SMELL — collect then iterate again let names: Vec = users.iter() .map(|u| u.name.clone()) .collect(); let long_names: Vec<&String> = names.iter() .filter(|n| n.len() > 10) .collect(); // FIX — chain without intermediate collection let long_names: Vec = users.iter() .map(|u| u.name.clone()) .filter(|n| n.len() > 10) .collect(); // one allocation, one pass ``` ### When to Apply This Rule **Triggers:** - `.collect()` followed by `.iter()` on the result - Intermediate `Vec` that's only iterated once then dropped - Multiple passes when a single chain would work ### Exceptions **It's OK when:** - You need the collection for multiple uses (sort, binary search) - You need to know the length before processing - Ownership requires it (can't chain because of lifetime issues) --- ## 3. Index Loops Instead of Iterators **What it looks like:** `for i in 0..vec.len() { vec[i] ... }` instead of `for item in &vec { ... }`. **Where it comes from:** C, C++, Java for-loops with index variables. **Why it's a smell:** Bounds checking on every access (runtime cost). Can't be auto-vectorized as well. More error-prone (off-by-one). Doesn't compose with iterator adapters. Not idiomatic. **What to do instead:** ```rust // SMELL — C-style index loop for i in 0..items.len() { process(&items[i]); // bounds check on every access } // FIX — iterator (no bounds checks, composable) for item in &items { process(item); } // Need the index too? for (i, item) in items.iter().enumerate() { println!("{i}: {item}"); } ``` ### When to Apply This Rule **Triggers:** - `for i in 0..x.len()` followed by `x[i]` - Index variable only used for indexing (not for math) - Parallel iteration of two collections by index ### Exceptions **It's OK when:** - You need to modify the collection during iteration (but consider retain/drain) - You need random access patterns (skip ahead, look behind) - The algorithm genuinely needs indices (matrix operations) --- ## 4. .clone() as First Resort **What it looks like:** Sprinkling `.clone()` to silence every borrow checker error without thinking about why it's complaining. **Where it comes from:** GC languages where copying references is free. The borrow checker error feels like a "bug to fix" rather than a "design signal." **Why it's a smell:** Hidden allocation costs. The borrow checker is telling you the ownership structure is wrong — clone hides the message without fixing the problem. **What to do instead:** ```rust // SMELL — clone to silence borrow checker fn process(data: &HashMap>) -> Vec { let keys: Vec = data.keys().cloned().collect(); let mut result = Vec::new(); for key in &keys { for value in data.get(key).unwrap() { result.push(format!("{key}: {value}")); } } result } // FIX — restructure to avoid the need fn process(data: &HashMap>) -> Vec { data.iter() .flat_map(|(key, values)| { values.iter().map(move |v| format!("{key}: {v}")) }) .collect() } ``` ### When to Apply This Rule **Triggers:** - `.clone()` added right after a borrow checker error - Clone inside a loop (O(n) allocations) - Clone of a large struct to use one field - More than 3 clone calls in one function ### Exceptions **It's OK when:** - You genuinely need independent ownership (forking a value) - Cloning is cheap (Arc::clone is just a counter increment) - Prototyping — clone now, optimize later (but DO optimize later) --- ## 5. try/catch Mentality (panic + catch_unwind) **What it looks like:** Using `panic!()` as "throw" and `catch_unwind()` as "catch" to simulate exception handling. **Where it comes from:** Java/Python/C# exception handling where throwing and catching is the normal error flow. **Why it's a smell:** Panics are for BUGS, not expected errors. `catch_unwind` doesn't catch all panics (abort mode). It's not equivalent to try/catch — it's a last-resort safety net. Using it for control flow means you've reimplemented exceptions badly. **What to do instead:** ```rust // SMELL — panic as exception, catch_unwind as try/catch fn validate(input: &str) { if input.is_empty() { panic!("empty input"); } if input.len() > 100 { panic!("too long"); } } fn handle(input: &str) -> String { match std::panic::catch_unwind(|| validate(input)) { Ok(()) => "valid".into(), Err(e) => format!("error: {:?}", e), } } // FIX — use Result for expected errors fn validate(input: &str) -> Result<(), ValidationError> { if input.is_empty() { return Err(ValidationError::Empty); } if input.len() > 100 { return Err(ValidationError::TooLong(input.len())); } Ok(()) } fn handle(input: &str) -> String { match validate(input) { Ok(()) => "valid".into(), Err(e) => format!("error: {e}"), } } ``` ### When to Apply This Rule **Triggers:** - `catch_unwind` in application code (not test harnesses) - `panic!()` for validatable conditions - Pattern of panic → catch → extract message ### Exceptions **It's OK when:** - Thread pool implementations (catch panics from user-provided closures) - FFI boundaries (panics must not cross extern "C" boundaries) - Test frameworks (catch panics to report failures) --- ## 6. Null Sentinel Values Instead of Option **What it looks like:** Using magic values (-1, 0, "", `u64::MAX`) to represent "no value" instead of `Option`. **Where it comes from:** C (NULL/0/-1 as sentinel), Go (zero values), database design (nullable columns with default values). **Why it's a smell:** Compiler can't enforce checking. -1 might be a valid value in some context. Empty string might be legitimate input. Option forces the caller to handle the None case. **What to do instead:** ```rust // SMELL — sentinel values struct User { age: i32, // -1 means "not provided" nickname: String, // "" means "none" } fn get_age(user: &User) -> i32 { if user.age == -1 { panic!("no age!"); } // easy to forget this check user.age } // FIX — Option makes absence explicit struct User { age: Option, // None means "not provided" nickname: Option, // None means "none" } fn get_age(user: &User) -> Option { user.age // caller MUST handle None — compiler enforces it } ``` ### When to Apply This Rule **Triggers:** - Numeric fields with documented "special" values - String fields where empty means "absent" - Comments like "// -1 means not set" - Checks like `if x == -1` or `if s.is_empty()` before use ### Exceptions **It's OK when:** - FFI with C APIs that use sentinel values (wrap in Option at the boundary) - Performance-critical packed structs where Option's niche doesn't fit - Matching an external protocol/format that uses sentinels --- ## 7. OOP Inheritance Thinking (Deep Trait Object Hierarchies) **What it looks like:** Complex trait object hierarchies trying to replicate class inheritance from Java/C++. **Where it comes from:** OOP languages where "is-a" relationships and abstract base classes are the primary abstraction tool. **Why it's a smell:** Rust doesn't have inheritance. Trait objects have dynamic dispatch overhead. Complex hierarchies lead to Box type soup. Composition is almost always better. **What to do instead:** ```rust // SMELL — trying to build class hierarchy trait Shape { fn area(&self) -> f64; } trait Drawable: Shape { fn draw(&self); } trait Resizable: Drawable { fn resize(&mut self, factor: f64); } // Three levels of "inheritance" — complex, fragile // FIX — flat traits + composition trait HasArea { fn area(&self) -> f64; } trait Drawable { fn draw(&self); } trait Resizable { fn resize(&mut self, factor: f64); } struct Circle { radius: f64 } impl HasArea for Circle { ... } impl Drawable for Circle { ... } impl Resizable for Circle { ... } // Each trait is independent — implement what you need ``` ### When to Apply This Rule **Triggers:** - Trait with supertraits 3+ levels deep - `Box` where A has 5+ methods - Trying to "inherit" behavior from a base struct ### Exceptions **It's OK when:** - Supertrait bounds are genuinely required (Error: Debug + Display) - The hierarchy matches a real protocol (Iterator → DoubleEndedIterator) - Framework design where trait objects are the extension point --- ## 8. Global Mutable State (static mut, lazy_static Mutex) **What it looks like:** Global variables with interior mutability as the primary state management strategy. **Where it comes from:** C global variables, Python module-level state, Singleton pattern from Java/C#. **Why it's a smell:** Makes testing hard (tests share state). Makes reasoning hard (who modified it? when?). Makes concurrency hard (locks everywhere). Creates hidden dependencies between functions. **What to do instead:** ```rust // SMELL — global mutable state static STATE: LazyLock> = LazyLock::new(|| { Mutex::new(AppState::new()) }); fn process_request(req: Request) -> Response { let mut state = STATE.lock().unwrap(); state.handle(req) // hidden dependency on global } // FIX — pass state explicitly struct App { state: AppState } impl App { fn process_request(&mut self, req: Request) -> Response { self.state.handle(req) // dependency is visible and testable } } ``` ### When to Apply This Rule **Triggers:** - `static` + `Mutex` or `RwLock` for application state - Functions that "magically" access state without parameters - Hard-to-test code because of hidden global dependencies - `lazy_static!` or `LazyLock` for mutable state ### Exceptions **It's OK when:** - True global configuration (read-only after init — use OnceLock) - Logging (global logger is an accepted pattern) - Metrics/counters (AtomicU64 for statistics) - Process-level singletons (allocator, thread pool) --- ## 9. Returning impl Trait When Box Is Needed **What it looks like:** Confusion between `impl Trait` (static dispatch, one concrete type) and `Box` (dynamic dispatch, multiple types). **Where it comes from:** Not understanding that `impl Trait` in return position means "one specific type the compiler knows" — not "any type implementing this trait." **Why it's a smell:** Using `impl Trait` when you need to return DIFFERENT types based on runtime conditions causes confusing errors. Using `Box` when you always return the same type adds unnecessary heap allocation. **What to do instead:** ```rust // SMELL — impl Trait can't return different types fn get_writer(use_file: bool) -> impl Write { if use_file { File::create("out.txt").unwrap() } else { io::stdout() // ERROR: different types! } } // FIX — Box for runtime polymorphism fn get_writer(use_file: bool) -> Box { if use_file { Box::new(File::create("out.txt").unwrap()) } else { Box::new(io::stdout()) } } // FIX — enum for known set of types (no heap allocation) enum Writer { File(File), Stdout(io::Stdout) } impl Write for Writer { ... } ``` ### When to Apply This Rule **Triggers:** - Compiler error "expected X, found Y" in if/match arms with impl Trait - Box when only one type is ever returned (unnecessary) - Confusion about when to use which ### Exceptions **It's OK when:** - `impl Iterator` in return position (common, always one type) - `impl Future` for async functions (always one type) --- ## 10. Ignoring #[must_use] Warnings **What it looks like:** Suppressing `Result` unused warnings with `let _ = ...` without considering whether the error matters. **Where it comes from:** Languages where exceptions propagate automatically. In Python/Java, if you don't catch it, it propagates. In Rust, if you don't check it, it's silently lost. **Why it's a smell:** `let _ = file.write(data);` silently discards a potential I/O error. The data might not have been written. The user never finds out. **What to do instead:** ```rust // SMELL — silencing the warning without thinking let _ = file.write_all(data); // error swallowed — data might be lost let _ = channel.send(msg); // receiver might be gone — msg dropped // FIX — handle or propagate file.write_all(data)?; // propagate to caller // Or if you genuinely don't care: if let Err(e) = channel.send(msg) { tracing::warn!("channel closed: {e}"); // at least log it } ``` ### When to Apply This Rule **Triggers:** - `let _ =` on a Result without any comment - Multiple `let _ =` in sequence (each one hiding a potential failure) - The function being called is I/O, network, or state-changing ### Exceptions **It's OK when:** - The error genuinely doesn't matter (e.g., best-effort logging) - The operation is idempotent and will be retried anyway - You've documented WHY: `let _ = tx.send(msg); // receiver might be gone, that's fine`