Files

T

Rodin f270fc68c1 docs: add smell files (anti-patterns + common mistakes)

anti-patterns.md: 10 entries, 490 lines — what the stdlib avoids
common-mistakes.md: 10 entries, 508 lines — language transition smells

Covers: unwrap in library code, String errors, unsafe without SAFETY,
clone as borrow-checker escape, Deref as inheritance, panic for
recoverable errors, god structs, Box<dyn Any>, static mut, stringly-typed APIs,
Arc<Mutex> everywhere, collect then iterate, index loops, try/catch
mentality, null sentinels, OOP hierarchies, global state, impl vs dyn.

2026-04-30 16:01:52 -07:00

15 KiB

Raw Permalink Blame History

Common Mistakes: Language Transition Smells in Rust

Patterns people bring from other languages (C++, Java, Python, Go) that don't work well in Rust. These aren't necessarily wrong everywhere — they're smells that suggest the author hasn't yet internalized Rust's ownership model.

1. Arc<Mutex> Everywhere (Borrow Checker Avoidance)

What it looks like: Every shared reference wrapped in Arc<Mutex<T>> even when single-threaded or single-owner.

Where it comes from: Java/C# where everything is garbage-collected and shared references are free. Developers use Arc as "Rust's version of a reference."

Why it's a smell: Arc has atomic overhead on every clone/drop. Mutex has lock overhead on every access. If you're doing this single-threaded, you're paying for concurrency you don't use. It usually means the ownership structure is wrong.

What to do instead:

// SMELL — Arc<Mutex> as a crutch
struct App {
    config: Arc<Mutex<Config>>,
    db: Arc<Mutex<Database>>,
    cache: Arc<Mutex<Cache>>,
}
// Every access: self.config.lock().unwrap().timeout

// FIX — own what you own, borrow what you borrow
struct App {
    config: Config,        // App owns config (no sharing needed)
    db: Database,          // App owns database
    cache: Cache,          // App owns cache
}
// Access: self.config.timeout

When to Apply This Rule

Triggers:

Arc<Mutex<T>> in single-threaded code
Every field in a struct wrapped in Arc
More than 2 Arc fields (usually wrong)
Using .lock().unwrap() dozens of times per function

Exceptions

It's OK when:

The value IS shared across threads (that's what Arc is for)
You've restructured and genuinely need shared ownership
Async code with multiple tasks accessing shared state

2. Collecting Iterators Unnecessarily

What it looks like: .collect::<Vec<_>>() followed by another iteration over the collected vec.

Where it comes from: Python lists and Java streams where intermediate collections are normal.

Why it's a smell: Allocates a Vec you immediately discard. Iterators are lazy and compose without allocation. Two passes through the data when one would suffice.

What to do instead:

// SMELL — collect then iterate again
let names: Vec<String> = users.iter()
    .map(|u| u.name.clone())
    .collect();
let long_names: Vec<&String> = names.iter()
    .filter(|n| n.len() > 10)
    .collect();

// FIX — chain without intermediate collection
let long_names: Vec<String> = users.iter()
    .map(|u| u.name.clone())
    .filter(|n| n.len() > 10)
    .collect();  // one allocation, one pass

When to Apply This Rule

Triggers:

.collect() followed by .iter() on the result
Intermediate Vec that's only iterated once then dropped
Multiple passes when a single chain would work

Exceptions

It's OK when:

You need the collection for multiple uses (sort, binary search)
You need to know the length before processing
Ownership requires it (can't chain because of lifetime issues)

3. Index Loops Instead of Iterators

What it looks like: for i in 0..vec.len() { vec[i] ... } instead of for item in &vec { ... }.

Where it comes from: C, C++, Java for-loops with index variables.

Why it's a smell: Bounds checking on every access (runtime cost). Can't be auto-vectorized as well. More error-prone (off-by-one). Doesn't compose with iterator adapters. Not idiomatic.

What to do instead:

// SMELL — C-style index loop
for i in 0..items.len() {
    process(&items[i]);  // bounds check on every access
}

// FIX — iterator (no bounds checks, composable)
for item in &items {
    process(item);
}

// Need the index too?
for (i, item) in items.iter().enumerate() {
    println!("{i}: {item}");
}

When to Apply This Rule

Triggers:

for i in 0..x.len() followed by x[i]
Index variable only used for indexing (not for math)
Parallel iteration of two collections by index

Exceptions

It's OK when:

You need to modify the collection during iteration (but consider retain/drain)
You need random access patterns (skip ahead, look behind)
The algorithm genuinely needs indices (matrix operations)

4. .clone() as First Resort

What it looks like: Sprinkling .clone() to silence every borrow checker error without thinking about why it's complaining.

Where it comes from: GC languages where copying references is free. The borrow checker error feels like a "bug to fix" rather than a "design signal."

Why it's a smell: Hidden allocation costs. The borrow checker is telling you the ownership structure is wrong — clone hides the message without fixing the problem.

What to do instead:

// SMELL — clone to silence borrow checker
fn process(data: &HashMap<String, Vec<String>>) -> Vec<String> {
    let keys: Vec<String> = data.keys().cloned().collect();
    let mut result = Vec::new();
    for key in &keys {
        for value in data.get(key).unwrap() {
            result.push(format!("{key}: {value}"));
        }
    }
    result
}

// FIX — restructure to avoid the need
fn process(data: &HashMap<String, Vec<String>>) -> Vec<String> {
    data.iter()
        .flat_map(|(key, values)| {
            values.iter().map(move |v| format!("{key}: {v}"))
        })
        .collect()
}

When to Apply This Rule

Triggers:

.clone() added right after a borrow checker error
Clone inside a loop (O(n) allocations)
Clone of a large struct to use one field
More than 3 clone calls in one function

Exceptions

It's OK when:

You genuinely need independent ownership (forking a value)
Cloning is cheap (Arc::clone is just a counter increment)
Prototyping — clone now, optimize later (but DO optimize later)

5. try/catch Mentality (panic + catch_unwind)

What it looks like: Using panic!() as "throw" and catch_unwind() as "catch" to simulate exception handling.

Where it comes from: Java/Python/C# exception handling where throwing and catching is the normal error flow.

Why it's a smell: Panics are for BUGS, not expected errors. catch_unwind doesn't catch all panics (abort mode). It's not equivalent to try/catch — it's a last-resort safety net. Using it for control flow means you've reimplemented exceptions badly.

What to do instead:

// SMELL — panic as exception, catch_unwind as try/catch
fn validate(input: &str) {
    if input.is_empty() { panic!("empty input"); }
    if input.len() > 100 { panic!("too long"); }
}
fn handle(input: &str) -> String {
    match std::panic::catch_unwind(|| validate(input)) {
        Ok(()) => "valid".into(),
        Err(e) => format!("error: {:?}", e),
    }
}

// FIX — use Result for expected errors
fn validate(input: &str) -> Result<(), ValidationError> {
    if input.is_empty() { return Err(ValidationError::Empty); }
    if input.len() > 100 { return Err(ValidationError::TooLong(input.len())); }
    Ok(())
}
fn handle(input: &str) -> String {
    match validate(input) {
        Ok(()) => "valid".into(),
        Err(e) => format!("error: {e}"),
    }
}

When to Apply This Rule

Triggers:

catch_unwind in application code (not test harnesses)
panic!() for validatable conditions
Pattern of panic → catch → extract message

Exceptions

It's OK when:

Thread pool implementations (catch panics from user-provided closures)
FFI boundaries (panics must not cross extern "C" boundaries)
Test frameworks (catch panics to report failures)

6. Null Sentinel Values Instead of Option

What it looks like: Using magic values (-1, 0, "", u64::MAX) to represent "no value" instead of Option<T>.

Where it comes from: C (NULL/0/-1 as sentinel), Go (zero values), database design (nullable columns with default values).

Why it's a smell: Compiler can't enforce checking. -1 might be a valid value in some context. Empty string might be legitimate input. Option forces the caller to handle the None case.

What to do instead:

// SMELL — sentinel values
struct User {
    age: i32,       // -1 means "not provided"
    nickname: String, // "" means "none"
}
fn get_age(user: &User) -> i32 {
    if user.age == -1 { panic!("no age!"); }  // easy to forget this check
    user.age
}

// FIX — Option makes absence explicit
struct User {
    age: Option<u32>,         // None means "not provided"
    nickname: Option<String>, // None means "none"
}
fn get_age(user: &User) -> Option<u32> {
    user.age  // caller MUST handle None — compiler enforces it
}

When to Apply This Rule

Triggers:

Numeric fields with documented "special" values
String fields where empty means "absent"
Comments like "// -1 means not set"
Checks like if x == -1 or if s.is_empty() before use

Exceptions

It's OK when:

FFI with C APIs that use sentinel values (wrap in Option at the boundary)
Performance-critical packed structs where Option's niche doesn't fit
Matching an external protocol/format that uses sentinels

7. OOP Inheritance Thinking (Deep Trait Object Hierarchies)

What it looks like: Complex trait object hierarchies trying to replicate class inheritance from Java/C++.

Where it comes from: OOP languages where "is-a" relationships and abstract base classes are the primary abstraction tool.

Why it's a smell: Rust doesn't have inheritance. Trait objects have dynamic dispatch overhead. Complex hierarchies lead to Box<dyn A + B + C> type soup. Composition is almost always better.

What to do instead:

// SMELL — trying to build class hierarchy
trait Shape { fn area(&self) -> f64; }
trait Drawable: Shape { fn draw(&self); }
trait Resizable: Drawable { fn resize(&mut self, factor: f64); }
// Three levels of "inheritance" — complex, fragile

// FIX — flat traits + composition
trait HasArea { fn area(&self) -> f64; }
trait Drawable { fn draw(&self); }
trait Resizable { fn resize(&mut self, factor: f64); }

struct Circle { radius: f64 }
impl HasArea for Circle { ... }
impl Drawable for Circle { ... }
impl Resizable for Circle { ... }
// Each trait is independent — implement what you need

When to Apply This Rule

Triggers:

Trait with supertraits 3+ levels deep
Box<dyn A> where A has 5+ methods
Trying to "inherit" behavior from a base struct

Exceptions

It's OK when:

Supertrait bounds are genuinely required (Error: Debug + Display)
The hierarchy matches a real protocol (Iterator → DoubleEndedIterator)
Framework design where trait objects are the extension point

8. Global Mutable State (static mut, lazy_static Mutex)

What it looks like: Global variables with interior mutability as the primary state management strategy.

Where it comes from: C global variables, Python module-level state, Singleton pattern from Java/C#.

Why it's a smell: Makes testing hard (tests share state). Makes reasoning hard (who modified it? when?). Makes concurrency hard (locks everywhere). Creates hidden dependencies between functions.

What to do instead:

// SMELL — global mutable state
static STATE: LazyLock<Mutex<AppState>> = LazyLock::new(|| {
    Mutex::new(AppState::new())
});

fn process_request(req: Request) -> Response {
    let mut state = STATE.lock().unwrap();
    state.handle(req)  // hidden dependency on global
}

// FIX — pass state explicitly
struct App { state: AppState }

impl App {
    fn process_request(&mut self, req: Request) -> Response {
        self.state.handle(req)  // dependency is visible and testable
    }
}

When to Apply This Rule

Triggers:

static + Mutex or RwLock for application state
Functions that "magically" access state without parameters
Hard-to-test code because of hidden global dependencies
lazy_static! or LazyLock for mutable state

Exceptions

It's OK when:

True global configuration (read-only after init — use OnceLock)
Logging (global logger is an accepted pattern)
Metrics/counters (AtomicU64 for statistics)
Process-level singletons (allocator, thread pool)

9. Returning impl Trait When Box Is Needed

What it looks like: Confusion between impl Trait (static dispatch, one concrete type) and Box<dyn Trait> (dynamic dispatch, multiple types).

Where it comes from: Not understanding that impl Trait in return position means "one specific type the compiler knows" — not "any type implementing this trait."

Why it's a smell: Using impl Trait when you need to return DIFFERENT types based on runtime conditions causes confusing errors. Using Box<dyn Trait> when you always return the same type adds unnecessary heap allocation.

What to do instead:

// SMELL — impl Trait can't return different types
fn get_writer(use_file: bool) -> impl Write {
    if use_file {
        File::create("out.txt").unwrap()
    } else {
        io::stdout()  // ERROR: different types!
    }
}

// FIX — Box<dyn Trait> for runtime polymorphism
fn get_writer(use_file: bool) -> Box<dyn Write> {
    if use_file {
        Box::new(File::create("out.txt").unwrap())
    } else {
        Box::new(io::stdout())
    }
}

// FIX — enum for known set of types (no heap allocation)
enum Writer { File(File), Stdout(io::Stdout) }
impl Write for Writer { ... }

When to Apply This Rule

Triggers:

Compiler error "expected X, found Y" in if/match arms with impl Trait
Box when only one type is ever returned (unnecessary)
Confusion about when to use which

Exceptions

It's OK when:

impl Iterator in return position (common, always one type)
impl Future for async functions (always one type)

10. Ignoring #[must_use] Warnings

What it looks like: Suppressing Result unused warnings with let _ = ... without considering whether the error matters.

Where it comes from: Languages where exceptions propagate automatically. In Python/Java, if you don't catch it, it propagates. In Rust, if you don't check it, it's silently lost.

Why it's a smell: let _ = file.write(data); silently discards a potential I/O error. The data might not have been written. The user never finds out.

What to do instead:

// SMELL — silencing the warning without thinking
let _ = file.write_all(data);  // error swallowed — data might be lost
let _ = channel.send(msg);     // receiver might be gone — msg dropped

// FIX — handle or propagate
file.write_all(data)?;  // propagate to caller
// Or if you genuinely don't care:
if let Err(e) = channel.send(msg) {
    tracing::warn!("channel closed: {e}");  // at least log it
}

When to Apply This Rule

Triggers:

let _ = on a Result without any comment
Multiple let _ = in sequence (each one hiding a potential failure)
The function being called is I/O, network, or state-changing

Exceptions

It's OK when:

The error genuinely doesn't matter (e.g., best-effort logging)
The operation is idempotent and will be retried anyway
You've documented WHY: let _ = tx.send(msg); // receiver might be gone, that's fine

15 KiB Raw Permalink Blame History

Common Mistakes: Language Transition Smells in Rust

1. Arc<Mutex> Everywhere (Borrow Checker Avoidance)

When to Apply This Rule

Exceptions

2. Collecting Iterators Unnecessarily

When to Apply This Rule

Exceptions

3. Index Loops Instead of Iterators

When to Apply This Rule

Exceptions

4. .clone() as First Resort

When to Apply This Rule

Exceptions

5. try/catch Mentality (panic + catch_unwind)

When to Apply This Rule

Exceptions

6. Null Sentinel Values Instead of Option

When to Apply This Rule

Exceptions

7. OOP Inheritance Thinking (Deep Trait Object Hierarchies)

When to Apply This Rule

Exceptions

8. Global Mutable State (static mut, lazy_static Mutex)

When to Apply This Rule

Exceptions

9. Returning impl Trait When Box Is Needed

When to Apply This Rule

Exceptions

10. Ignoring #[must_use] Warnings

When to Apply This Rule

Exceptions

15 KiB

Raw Permalink Blame History