docs: add smell files (anti-patterns + common mistakes)
anti-patterns.md: 10 entries, 490 lines — what the stdlib avoids common-mistakes.md: 10 entries, 508 lines — language transition smells Covers: unwrap in library code, String errors, unsafe without SAFETY, clone as borrow-checker escape, Deref as inheritance, panic for recoverable errors, god structs, Box<dyn Any>, static mut, stringly-typed APIs, Arc<Mutex> everywhere, collect then iterate, index loops, try/catch mentality, null sentinels, OOP hierarchies, global state, impl vs dyn.
This commit is contained in:
@@ -0,0 +1,508 @@
|
||||
# Common Mistakes: Language Transition Smells in Rust
|
||||
|
||||
Patterns people bring from other languages (C++, Java, Python, Go)
|
||||
that don't work well in Rust. These aren't necessarily wrong
|
||||
everywhere — they're smells that suggest the author hasn't yet
|
||||
internalized Rust's ownership model.
|
||||
|
||||
---
|
||||
|
||||
## 1. Arc<Mutex<T>> Everywhere (Borrow Checker Avoidance)
|
||||
|
||||
**What it looks like:** Every shared reference wrapped in
|
||||
`Arc<Mutex<T>>` even when single-threaded or single-owner.
|
||||
|
||||
**Where it comes from:** Java/C# where everything is garbage-collected
|
||||
and shared references are free. Developers use Arc<Mutex> as "Rust's
|
||||
version of a reference."
|
||||
|
||||
**Why it's a smell:** Arc has atomic overhead on every clone/drop.
|
||||
Mutex has lock overhead on every access. If you're doing this
|
||||
single-threaded, you're paying for concurrency you don't use.
|
||||
It usually means the ownership structure is wrong.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — Arc<Mutex> as a crutch
|
||||
struct App {
|
||||
config: Arc<Mutex<Config>>,
|
||||
db: Arc<Mutex<Database>>,
|
||||
cache: Arc<Mutex<Cache>>,
|
||||
}
|
||||
// Every access: self.config.lock().unwrap().timeout
|
||||
|
||||
// FIX — own what you own, borrow what you borrow
|
||||
struct App {
|
||||
config: Config, // App owns config (no sharing needed)
|
||||
db: Database, // App owns database
|
||||
cache: Cache, // App owns cache
|
||||
}
|
||||
// Access: self.config.timeout
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `Arc<Mutex<T>>` in single-threaded code
|
||||
- Every field in a struct wrapped in Arc
|
||||
- More than 2 Arc<Mutex> fields (usually wrong)
|
||||
- Using `.lock().unwrap()` dozens of times per function
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- The value IS shared across threads (that's what Arc is for)
|
||||
- You've restructured and genuinely need shared ownership
|
||||
- Async code with multiple tasks accessing shared state
|
||||
|
||||
---
|
||||
|
||||
## 2. Collecting Iterators Unnecessarily
|
||||
|
||||
**What it looks like:** `.collect::<Vec<_>>()` followed by another
|
||||
iteration over the collected vec.
|
||||
|
||||
**Where it comes from:** Python lists and Java streams where
|
||||
intermediate collections are normal.
|
||||
|
||||
**Why it's a smell:** Allocates a Vec you immediately discard.
|
||||
Iterators are lazy and compose without allocation. Two passes
|
||||
through the data when one would suffice.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — collect then iterate again
|
||||
let names: Vec<String> = users.iter()
|
||||
.map(|u| u.name.clone())
|
||||
.collect();
|
||||
let long_names: Vec<&String> = names.iter()
|
||||
.filter(|n| n.len() > 10)
|
||||
.collect();
|
||||
|
||||
// FIX — chain without intermediate collection
|
||||
let long_names: Vec<String> = users.iter()
|
||||
.map(|u| u.name.clone())
|
||||
.filter(|n| n.len() > 10)
|
||||
.collect(); // one allocation, one pass
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `.collect()` followed by `.iter()` on the result
|
||||
- Intermediate `Vec` that's only iterated once then dropped
|
||||
- Multiple passes when a single chain would work
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- You need the collection for multiple uses (sort, binary search)
|
||||
- You need to know the length before processing
|
||||
- Ownership requires it (can't chain because of lifetime issues)
|
||||
|
||||
---
|
||||
|
||||
## 3. Index Loops Instead of Iterators
|
||||
|
||||
**What it looks like:** `for i in 0..vec.len() { vec[i] ... }`
|
||||
instead of `for item in &vec { ... }`.
|
||||
|
||||
**Where it comes from:** C, C++, Java for-loops with index variables.
|
||||
|
||||
**Why it's a smell:** Bounds checking on every access (runtime cost).
|
||||
Can't be auto-vectorized as well. More error-prone (off-by-one).
|
||||
Doesn't compose with iterator adapters. Not idiomatic.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — C-style index loop
|
||||
for i in 0..items.len() {
|
||||
process(&items[i]); // bounds check on every access
|
||||
}
|
||||
|
||||
// FIX — iterator (no bounds checks, composable)
|
||||
for item in &items {
|
||||
process(item);
|
||||
}
|
||||
|
||||
// Need the index too?
|
||||
for (i, item) in items.iter().enumerate() {
|
||||
println!("{i}: {item}");
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `for i in 0..x.len()` followed by `x[i]`
|
||||
- Index variable only used for indexing (not for math)
|
||||
- Parallel iteration of two collections by index
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- You need to modify the collection during iteration (but consider retain/drain)
|
||||
- You need random access patterns (skip ahead, look behind)
|
||||
- The algorithm genuinely needs indices (matrix operations)
|
||||
|
||||
---
|
||||
|
||||
## 4. .clone() as First Resort
|
||||
|
||||
**What it looks like:** Sprinkling `.clone()` to silence every borrow
|
||||
checker error without thinking about why it's complaining.
|
||||
|
||||
**Where it comes from:** GC languages where copying references is free.
|
||||
The borrow checker error feels like a "bug to fix" rather than a
|
||||
"design signal."
|
||||
|
||||
**Why it's a smell:** Hidden allocation costs. The borrow checker is
|
||||
telling you the ownership structure is wrong — clone hides the
|
||||
message without fixing the problem.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — clone to silence borrow checker
|
||||
fn process(data: &HashMap<String, Vec<String>>) -> Vec<String> {
|
||||
let keys: Vec<String> = data.keys().cloned().collect();
|
||||
let mut result = Vec::new();
|
||||
for key in &keys {
|
||||
for value in data.get(key).unwrap() {
|
||||
result.push(format!("{key}: {value}"));
|
||||
}
|
||||
}
|
||||
result
|
||||
}
|
||||
|
||||
// FIX — restructure to avoid the need
|
||||
fn process(data: &HashMap<String, Vec<String>>) -> Vec<String> {
|
||||
data.iter()
|
||||
.flat_map(|(key, values)| {
|
||||
values.iter().map(move |v| format!("{key}: {v}"))
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `.clone()` added right after a borrow checker error
|
||||
- Clone inside a loop (O(n) allocations)
|
||||
- Clone of a large struct to use one field
|
||||
- More than 3 clone calls in one function
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- You genuinely need independent ownership (forking a value)
|
||||
- Cloning is cheap (Arc::clone is just a counter increment)
|
||||
- Prototyping — clone now, optimize later (but DO optimize later)
|
||||
|
||||
---
|
||||
|
||||
## 5. try/catch Mentality (panic + catch_unwind)
|
||||
|
||||
**What it looks like:** Using `panic!()` as "throw" and
|
||||
`catch_unwind()` as "catch" to simulate exception handling.
|
||||
|
||||
**Where it comes from:** Java/Python/C# exception handling where
|
||||
throwing and catching is the normal error flow.
|
||||
|
||||
**Why it's a smell:** Panics are for BUGS, not expected errors.
|
||||
`catch_unwind` doesn't catch all panics (abort mode). It's not
|
||||
equivalent to try/catch — it's a last-resort safety net. Using it
|
||||
for control flow means you've reimplemented exceptions badly.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — panic as exception, catch_unwind as try/catch
|
||||
fn validate(input: &str) {
|
||||
if input.is_empty() { panic!("empty input"); }
|
||||
if input.len() > 100 { panic!("too long"); }
|
||||
}
|
||||
fn handle(input: &str) -> String {
|
||||
match std::panic::catch_unwind(|| validate(input)) {
|
||||
Ok(()) => "valid".into(),
|
||||
Err(e) => format!("error: {:?}", e),
|
||||
}
|
||||
}
|
||||
|
||||
// FIX — use Result for expected errors
|
||||
fn validate(input: &str) -> Result<(), ValidationError> {
|
||||
if input.is_empty() { return Err(ValidationError::Empty); }
|
||||
if input.len() > 100 { return Err(ValidationError::TooLong(input.len())); }
|
||||
Ok(())
|
||||
}
|
||||
fn handle(input: &str) -> String {
|
||||
match validate(input) {
|
||||
Ok(()) => "valid".into(),
|
||||
Err(e) => format!("error: {e}"),
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `catch_unwind` in application code (not test harnesses)
|
||||
- `panic!()` for validatable conditions
|
||||
- Pattern of panic → catch → extract message
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- Thread pool implementations (catch panics from user-provided closures)
|
||||
- FFI boundaries (panics must not cross extern "C" boundaries)
|
||||
- Test frameworks (catch panics to report failures)
|
||||
|
||||
---
|
||||
|
||||
## 6. Null Sentinel Values Instead of Option
|
||||
|
||||
**What it looks like:** Using magic values (-1, 0, "", `u64::MAX`)
|
||||
to represent "no value" instead of `Option<T>`.
|
||||
|
||||
**Where it comes from:** C (NULL/0/-1 as sentinel), Go (zero values),
|
||||
database design (nullable columns with default values).
|
||||
|
||||
**Why it's a smell:** Compiler can't enforce checking. -1 might be a
|
||||
valid value in some context. Empty string might be legitimate input.
|
||||
Option forces the caller to handle the None case.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — sentinel values
|
||||
struct User {
|
||||
age: i32, // -1 means "not provided"
|
||||
nickname: String, // "" means "none"
|
||||
}
|
||||
fn get_age(user: &User) -> i32 {
|
||||
if user.age == -1 { panic!("no age!"); } // easy to forget this check
|
||||
user.age
|
||||
}
|
||||
|
||||
// FIX — Option makes absence explicit
|
||||
struct User {
|
||||
age: Option<u32>, // None means "not provided"
|
||||
nickname: Option<String>, // None means "none"
|
||||
}
|
||||
fn get_age(user: &User) -> Option<u32> {
|
||||
user.age // caller MUST handle None — compiler enforces it
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- Numeric fields with documented "special" values
|
||||
- String fields where empty means "absent"
|
||||
- Comments like "// -1 means not set"
|
||||
- Checks like `if x == -1` or `if s.is_empty()` before use
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- FFI with C APIs that use sentinel values (wrap in Option at the boundary)
|
||||
- Performance-critical packed structs where Option's niche doesn't fit
|
||||
- Matching an external protocol/format that uses sentinels
|
||||
|
||||
---
|
||||
|
||||
## 7. OOP Inheritance Thinking (Deep Trait Object Hierarchies)
|
||||
|
||||
**What it looks like:** Complex trait object hierarchies trying to
|
||||
replicate class inheritance from Java/C++.
|
||||
|
||||
**Where it comes from:** OOP languages where "is-a" relationships
|
||||
and abstract base classes are the primary abstraction tool.
|
||||
|
||||
**Why it's a smell:** Rust doesn't have inheritance. Trait objects
|
||||
have dynamic dispatch overhead. Complex hierarchies lead to
|
||||
Box<dyn A + B + C> type soup. Composition is almost always better.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — trying to build class hierarchy
|
||||
trait Shape { fn area(&self) -> f64; }
|
||||
trait Drawable: Shape { fn draw(&self); }
|
||||
trait Resizable: Drawable { fn resize(&mut self, factor: f64); }
|
||||
// Three levels of "inheritance" — complex, fragile
|
||||
|
||||
// FIX — flat traits + composition
|
||||
trait HasArea { fn area(&self) -> f64; }
|
||||
trait Drawable { fn draw(&self); }
|
||||
trait Resizable { fn resize(&mut self, factor: f64); }
|
||||
|
||||
struct Circle { radius: f64 }
|
||||
impl HasArea for Circle { ... }
|
||||
impl Drawable for Circle { ... }
|
||||
impl Resizable for Circle { ... }
|
||||
// Each trait is independent — implement what you need
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- Trait with supertraits 3+ levels deep
|
||||
- `Box<dyn A>` where A has 5+ methods
|
||||
- Trying to "inherit" behavior from a base struct
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- Supertrait bounds are genuinely required (Error: Debug + Display)
|
||||
- The hierarchy matches a real protocol (Iterator → DoubleEndedIterator)
|
||||
- Framework design where trait objects are the extension point
|
||||
|
||||
---
|
||||
|
||||
## 8. Global Mutable State (static mut, lazy_static Mutex)
|
||||
|
||||
**What it looks like:** Global variables with interior mutability as
|
||||
the primary state management strategy.
|
||||
|
||||
**Where it comes from:** C global variables, Python module-level state,
|
||||
Singleton pattern from Java/C#.
|
||||
|
||||
**Why it's a smell:** Makes testing hard (tests share state). Makes
|
||||
reasoning hard (who modified it? when?). Makes concurrency hard
|
||||
(locks everywhere). Creates hidden dependencies between functions.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — global mutable state
|
||||
static STATE: LazyLock<Mutex<AppState>> = LazyLock::new(|| {
|
||||
Mutex::new(AppState::new())
|
||||
});
|
||||
|
||||
fn process_request(req: Request) -> Response {
|
||||
let mut state = STATE.lock().unwrap();
|
||||
state.handle(req) // hidden dependency on global
|
||||
}
|
||||
|
||||
// FIX — pass state explicitly
|
||||
struct App { state: AppState }
|
||||
|
||||
impl App {
|
||||
fn process_request(&mut self, req: Request) -> Response {
|
||||
self.state.handle(req) // dependency is visible and testable
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `static` + `Mutex` or `RwLock` for application state
|
||||
- Functions that "magically" access state without parameters
|
||||
- Hard-to-test code because of hidden global dependencies
|
||||
- `lazy_static!` or `LazyLock` for mutable state
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- True global configuration (read-only after init — use OnceLock)
|
||||
- Logging (global logger is an accepted pattern)
|
||||
- Metrics/counters (AtomicU64 for statistics)
|
||||
- Process-level singletons (allocator, thread pool)
|
||||
|
||||
---
|
||||
|
||||
## 9. Returning impl Trait When Box<dyn Trait> Is Needed
|
||||
|
||||
**What it looks like:** Confusion between `impl Trait` (static dispatch,
|
||||
one concrete type) and `Box<dyn Trait>` (dynamic dispatch, multiple
|
||||
types).
|
||||
|
||||
**Where it comes from:** Not understanding that `impl Trait` in return
|
||||
position means "one specific type the compiler knows" — not "any type
|
||||
implementing this trait."
|
||||
|
||||
**Why it's a smell:** Using `impl Trait` when you need to return
|
||||
DIFFERENT types based on runtime conditions causes confusing errors.
|
||||
Using `Box<dyn Trait>` when you always return the same type adds
|
||||
unnecessary heap allocation.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — impl Trait can't return different types
|
||||
fn get_writer(use_file: bool) -> impl Write {
|
||||
if use_file {
|
||||
File::create("out.txt").unwrap()
|
||||
} else {
|
||||
io::stdout() // ERROR: different types!
|
||||
}
|
||||
}
|
||||
|
||||
// FIX — Box<dyn Trait> for runtime polymorphism
|
||||
fn get_writer(use_file: bool) -> Box<dyn Write> {
|
||||
if use_file {
|
||||
Box::new(File::create("out.txt").unwrap())
|
||||
} else {
|
||||
Box::new(io::stdout())
|
||||
}
|
||||
}
|
||||
|
||||
// FIX — enum for known set of types (no heap allocation)
|
||||
enum Writer { File(File), Stdout(io::Stdout) }
|
||||
impl Write for Writer { ... }
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- Compiler error "expected X, found Y" in if/match arms with impl Trait
|
||||
- Box<dyn Trait> when only one type is ever returned (unnecessary)
|
||||
- Confusion about when to use which
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- `impl Iterator` in return position (common, always one type)
|
||||
- `impl Future` for async functions (always one type)
|
||||
|
||||
---
|
||||
|
||||
## 10. Ignoring #[must_use] Warnings
|
||||
|
||||
**What it looks like:** Suppressing `Result` unused warnings with
|
||||
`let _ = ...` without considering whether the error matters.
|
||||
|
||||
**Where it comes from:** Languages where exceptions propagate
|
||||
automatically. In Python/Java, if you don't catch it, it propagates.
|
||||
In Rust, if you don't check it, it's silently lost.
|
||||
|
||||
**Why it's a smell:** `let _ = file.write(data);` silently discards
|
||||
a potential I/O error. The data might not have been written. The
|
||||
user never finds out.
|
||||
|
||||
**What to do instead:**
|
||||
```rust
|
||||
// SMELL — silencing the warning without thinking
|
||||
let _ = file.write_all(data); // error swallowed — data might be lost
|
||||
let _ = channel.send(msg); // receiver might be gone — msg dropped
|
||||
|
||||
// FIX — handle or propagate
|
||||
file.write_all(data)?; // propagate to caller
|
||||
// Or if you genuinely don't care:
|
||||
if let Err(e) = channel.send(msg) {
|
||||
tracing::warn!("channel closed: {e}"); // at least log it
|
||||
}
|
||||
```
|
||||
|
||||
### When to Apply This Rule
|
||||
|
||||
**Triggers:**
|
||||
- `let _ =` on a Result without any comment
|
||||
- Multiple `let _ =` in sequence (each one hiding a potential failure)
|
||||
- The function being called is I/O, network, or state-changing
|
||||
|
||||
### Exceptions
|
||||
|
||||
**It's OK when:**
|
||||
- The error genuinely doesn't matter (e.g., best-effort logging)
|
||||
- The operation is idempotent and will be retried anyway
|
||||
- You've documented WHY: `let _ = tx.send(msg); // receiver might be gone, that's fine`
|
||||
|
||||
<!-- PATTERN_COMPLETE -->
|
||||
Reference in New Issue
Block a user