docs: unsafe patterns from rust-lang/rust

10 patterns, 624 lines. Full spec compliance.
Patterns: // SAFETY: comments, unsafe fn contracts, safe wrappers,
MaybeUninit, transmute, raw pointers, unsafe impl Send/Sync,
NonNull/PhantomData, extern "C" FFI, type-encoded invariants.
Rodin
2026-04-30 15:08:25 -07:00
parent c32dd9b843
commit 9cd0a33ff9
+624
@@ -0,0 +1,624 @@
# Rust Unsafe Patterns
Patterns for using unsafe code correctly in Rust, extracted from
the standard library source.
**Source:** [rust-lang/rust](https://github.com/rust-lang/rust) at commit
[`f53b654`](https://github.com/rust-lang/rust/tree/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6)
**Stats:** 31,244 unsafe blocks, 7,091 unsafe fn declarations,
2,463 `// SAFETY:` comments, 9,061 transmute usages, 928 MaybeUninit
usages, 710 ptr::read/write/copy calls, 489 extern "C" blocks.
---
## 1. // SAFETY: Comment on Every Unsafe Block
### Source:
[library/core/src/slice/mod.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/slice/mod.rs)
2,463 `// SAFETY:` comments in library/.
```rust
// library/core/src/slice/mod.rs
pub fn split_at(&self, mid: usize) -> (&[T], &[T]) {
    assert!(mid <= self.len());
    // SAFETY: `[ptr; mid]` and `[mid; len]` are inside `self`, which
    // fulfills the requirements of `split_at_unchecked`.
    unsafe { self.split_at_unchecked(mid) }
}
```
### Why
Every unsafe block must prove soundness at the point of use. The
comment is a proof obligation: "I assert that the following
invariants hold HERE because..." This is how unsafe code gets
audited — reviewers check the comment against the requirements.
### When to Use
**Triggers:**
- Every `unsafe { }` block (no exceptions)
- Explain WHY it's safe, not WHAT the code does
- Reference specific invariants from the unsafe fn's `# Safety` docs
**Example — before:**
```rust
unsafe {
    ptr::copy_nonoverlapping(src, dst, len);
}
// No comment — reviewer has no idea if this is actually safe
```
**Example — after:**
```rust
// SAFETY: `src` and `dst` are both derived from `self.buf`, which
// is a single contiguous allocation. `src` points to index `self.head`
// and `dst` points to index 0. The ranges `[0, len)` and
// `[head, head + len)` don't overlap because `len <= head` (checked by
// the if-guard above). `len` is also bounded by capacity minus head,
// so neither the read nor the write goes past the allocation.
unsafe {
    ptr::copy_nonoverlapping(src, dst, len);
}
```
### When NOT to Use
**This pattern is ALWAYS required.** There is no "when not to use."
If you have an unsafe block without a SAFETY comment, it's
incomplete.
---
## 2. unsafe fn for Precondition Contracts
### Source:
[library/core/src/slice/mod.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/slice/mod.rs) (get_unchecked)
7,091 unsafe fn declarations in library/.
```rust
/// Returns a reference to an element, without doing bounds checking.
///
/// # Safety
///
/// Calling this method with an out-of-bounds index is
/// *[undefined behavior]* even if the resulting reference is not used.
pub unsafe fn get_unchecked<I>(&self, index: I) -> &I::Output
where
    I: SliceIndex<Self>,
{ ... }
```
### Why
`unsafe fn` shifts the proof obligation to the CALLER. The function
says "I'm correct IF you uphold these preconditions." The `# Safety`
doc section is the contract. Without `unsafe`, Rust guarantees
safety; with it, YOU guarantee safety.
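From the caller's side, discharging that contract might look like the following minimal sketch. `sum_prefix` is a hypothetical helper (not a standard library function); it checks the bound once and then relies on that check inside the loop.

```rust
/// Sums the first `n` elements of `data` (hypothetical helper for illustration).
pub fn sum_prefix(data: &[u32], n: usize) -> u32 {
    // One up-front check replaces a bounds check on every iteration.
    assert!(n <= data.len(), "n = {} exceeds len = {}", n, data.len());
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: `i < n` from the loop range and `n <= data.len()` from the
        // assertion above, so `i` is in bounds as `get_unchecked` requires.
        total = total.wrapping_add(unsafe { *data.get_unchecked(i) });
    }
    total
}
```

The safe form `data[..n].iter()` often optimizes equally well, so it is worth measuring before reaching for `get_unchecked`.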
### When to Use
**Triggers:**
- The function has preconditions that can't be checked at runtime
(or checking would be too expensive)
- Performance-critical inner loops where bounds checking matters
- The function wraps raw pointer operations
**Example — before:**
```rust
// Safe version — always checks (correct but slower in hot paths)
pub fn get(&self, index: usize) -> Option<&T> {
    if index < self.len() {
        // SAFETY: `index < self.len()` was checked on the line above.
        Some(unsafe { &*self.ptr.add(index) })
    } else {
        None
    }
}
```
**Example — after:**
```rust
// Unsafe version — skips the check (caller's responsibility)
/// # Safety
///
/// `index` must be less than `self.len()`.
pub unsafe fn get_unchecked(&self, index: usize) -> &T {
    // SAFETY: caller guarantees index < len
    unsafe { &*self.ptr.add(index) }
}
```
### When NOT to Use
**Don't use this when:**
- You can validate inputs cheaply (just check and panic/return Err)
- The function is the main entry point most users will call (give them a safe default and keep the `unsafe fn` as an opt-in)
- Performance isn't critical (safe version is always preferred)
---
## 3. Safe Wrapper Around Unsafe Core
### Source:
This is THE fundamental pattern of Rust's stdlib. Almost every
safe public API is a thin wrapper that validates inputs then calls
unsafe internals.
```rust
// The pattern: safe API → validate → unsafe impl
pub fn split_at(&self, mid: usize) -> (&[T], &[T]) {
    assert!(mid <= self.len());              // ← validation
    // SAFETY: assertion above guarantees mid is in bounds
    unsafe { self.split_at_unchecked(mid) }  // ← unsafe core
}
```
### Why
This is how Rust achieves both safety AND performance. The safe
wrapper provides the guarantee. The unsafe core provides the speed.
Users get safety by default; experts opt into `_unchecked` when they
can prove the preconditions themselves.
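A self-contained version of the same shape is sketched below. `Samples` is a hypothetical type invented for illustration; the point is only that the safe method validates and then delegates to the `_unchecked` core.

```rust
/// Hypothetical wrapper over a boxed slice, used only to illustrate the pattern.
pub struct Samples {
    data: Box<[f32]>,
}

impl Samples {
    pub fn new(data: Vec<f32>) -> Self {
        Self { data: data.into_boxed_slice() }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Safe default: validates the index, then delegates to the unsafe core.
    pub fn get(&self, i: usize) -> Option<f32> {
        if i < self.len() {
            // SAFETY: `i < self.len()` was checked on the line above.
            Some(unsafe { self.get_unchecked(i) })
        } else {
            None
        }
    }

    /// # Safety
    ///
    /// `i` must be less than `self.len()`.
    pub unsafe fn get_unchecked(&self, i: usize) -> f32 {
        // SAFETY: the caller guarantees `i < self.len()`, so the read stays
        // inside the slice's allocation and hits an initialized element.
        unsafe { *self.data.as_ptr().add(i) }
    }
}
```

Callers that cannot prove the bound use `get`; only code that has already established `i < len()` needs the `unsafe` call.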
### When to Use
**Triggers:**
- You have an operation that's unsafe in general but can be made
safe with runtime checks
- You want to offer both safe and unsafe versions
- The safe version is the default; unsafe is the opt-in optimization
**Example — before:**
```rust
// Only unsafe — forces ALL callers to use unsafe
pub unsafe fn index(&self, i: usize) -> &T {
    &*self.ptr.add(i)
}
```
**Example — after:**
```rust
// Safe default (what most users call):
pub fn index(&self, i: usize) -> &T {
    assert!(i < self.len(), "index {i} out of bounds (len {})", self.len());
    // SAFETY: we just verified i < len
    unsafe { self.index_unchecked(i) }
}
// Unsafe escape hatch (for performance-critical code):
/// # Safety
/// `i` must be less than `self.len()`.
pub unsafe fn index_unchecked(&self, i: usize) -> &T {
    // SAFETY: the caller guarantees `i < self.len()`, so `ptr.add(i)`
    // stays inside the allocation and points to an initialized `T`.
    unsafe { &*self.ptr.add(i) }
}
```
### When NOT to Use
**Don't use this when:**
- The safe version has no overhead worth avoiding (just be safe)
- The precondition can't be expressed as a simple check
- Only internal code will ever call the unsafe version
---
## 4. MaybeUninit for Uninitialized Memory
### Source:
[library/core/src/mem/maybe_uninit.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/mem/maybe_uninit.rs)
928 MaybeUninit usages in the stdlib.
```rust
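use std::mem::MaybeUninit;
// `MaybeUninit<u8>` is `Copy`, so the repeat expression works on stable Rust.
let mut buf = [MaybeUninit::<u8>::uninit(); 1024];
// `read_into` stands for any function that initializes a prefix of `buf`.
let len = read_into(&mut buf)?;
// SAFETY: read_into guarantees buf[..len] is initialized, and
// `MaybeUninit<u8>` has the same layout as `u8`.
let initialized: &[u8] =
    unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u8>(), len) };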
```
### Why
Rust requires all values to be initialized. `MaybeUninit<T>` opts out
of this requirement for performance (avoiding zeroing large buffers).
It tells the compiler "this might not be initialized yet — don't
assume anything."
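One sound way to fill a `Vec` without zeroing it first is to write through `spare_capacity_mut()` and only then call `set_len`. The sketch below assumes a hypothetical producer, `fill_prefix`, that reports how many slots it initialized; both function names are invented for illustration.

```rust
use std::mem::MaybeUninit;

/// Hypothetical producer: copies as much of `src` as fits into `out`
/// and returns how many slots it initialized.
fn fill_prefix(out: &mut [MaybeUninit<u8>], src: &[u8]) -> usize {
    let n = out.len().min(src.len());
    for (slot, &byte) in out[..n].iter_mut().zip(src) {
        slot.write(byte); // initializes the slot without reading the old contents
    }
    n
}

/// Collects at most `capacity` bytes of `src` without zeroing the buffer first.
pub fn collect_bytes(src: &[u8], capacity: usize) -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::with_capacity(capacity);
    // `with_capacity` may over-allocate, so limit the window to `capacity`.
    let spare = &mut buf.spare_capacity_mut()[..capacity];
    let n = fill_prefix(spare, src);
    // SAFETY: `fill_prefix` initialized the first `n` spare slots, and
    // `n <= capacity <= buf.capacity()`, so elements `0..n` are initialized.
    unsafe { buf.set_len(n) };
    buf
}
```

The key property is that no `&mut [u8]` or `u8` value ever refers to uninitialized memory: the slots stay typed as `MaybeUninit<u8>` until they have been written.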
### When to Use
**Triggers:**
- Buffer allocation without initialization overhead
- FFI where C code fills in the data
- Building arrays element-by-element without Default requirement
- Performance-critical allocation hot paths
**Example — before:**
```rust
// Zeroing 1MB for no reason — the OS will fill it immediately
let mut buf = vec![0u8; 1_000_000];
file.read(&mut buf)?; // overwrites all zeros anyway
```
**Example — after:**
```rust
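use std::io::Read;
let mut buf: Vec<u8> = Vec::with_capacity(1_000_000);
// `Read::read` takes `&mut [u8]`, so it cannot be given the
// `&mut [MaybeUninit<u8>]` from `spare_capacity_mut()`, and exposing
// uninitialized bytes as `u8` would be undefined behavior.
// `read_to_end` can append into the spare capacity without zeroing it first.
let n = file.by_ref().take(1_000_000).read_to_end(&mut buf)?;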
```
### When NOT to Use
**Don't use this when:**
- Default/zeroed memory is fine (clarity > micro-optimization)
- You're not sure how many bytes will be initialized
- The type has drop glue (forgetting to call `assume_init` leaks)
---
## 5. transmute for Type Reinterpretation
### Source:
[library/core/src/mem/mod.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/mem/mod.rs) (transmute)
9,061 transmute usages (many in generated code/architecture intrinsics).
```rust
// SAFETY: u8 and i8 have the same size and any bit pattern is valid
let signed: i8 = unsafe { std::mem::transmute::<u8, i8>(byte) };
```
### Why
`transmute` reinterprets the bits of one type as another type. It is
among the most dangerous unsafe operations: apart from a compile-time
check that both types have the same size, it bypasses all type checking.
The stdlib uses it for zero-cost conversions between types with
identical bit representations.
### When to Use
**Triggers:**
- Converting between types with identical memory layout
- Enum discriminant inspection
- FFI type conversions
### When NOT to Use
**Don't use this when:**
- `From`/`Into` can do the conversion safely
- `as` casting works (numeric conversions)
- The types have different sizes (`transmute` refuses to compile)
- There are invalid bit patterns for the target type
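For the common numeric cases, a safe conversion already exists. The sketch below puts a `transmute` next to its safe equivalent for comparison; the function names are made up for this example.

```rust
/// Safe: the standard library exposes the bit pattern directly.
pub fn float_bits(x: f32) -> u32 {
    x.to_bits()
}

/// Equivalent transmute, shown only for comparison.
pub fn float_bits_transmute(x: f32) -> u32 {
    // SAFETY: `f32` and `u32` have the same size, and every 32-bit
    // pattern is a valid `u32`.
    unsafe { std::mem::transmute::<f32, u32>(x) }
}

/// Safe: `as` between same-width integers reinterprets the bits.
pub fn reinterpret_byte(byte: u8) -> i8 {
    byte as i8
}
```

Both float functions return the same value, so the `unsafe` version buys nothing here.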
### Anti-pattern
```rust
// DON'T: transmute between types with different validity
let x: u8 = 255;
let b: bool = unsafe { std::mem::transmute(x) };
// UB! bool can only be 0 or 1
// DO: use safe conversion
let b: bool = x != 0;
```
---
## 6. Raw Pointers (ptr::read, ptr::write, ptr::copy)
### Source:
710 ptr operations (read/write/copy/drop_in_place) in library/.
```rust
use std::ptr;
// SAFETY: src is valid for reads of `count` elements of T, aligned,
// and those elements are initialized.
// dst is valid for writes of `count` elements of T and aligned.
// The two regions don't overlap.
unsafe {
    ptr::copy_nonoverlapping(src, dst, count);
}
```
### Why
Raw pointers bypass the borrow checker. They're needed for:
implementing data structures, FFI, and performance-critical code.
The `ptr` module provides well-documented (but still `unsafe`) building blocks for common operations.
### When to Use
**Triggers:**
- Implementing custom collections (Vec, LinkedList)
- Moving values without running Drop
- FFI (C gives you raw pointers)
- Pointer arithmetic for buffer management
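As a small illustration of moving values without running `Drop`, here is a sketch of a swap built from `ptr::read`, `ptr::copy_nonoverlapping`, and `ptr::write`. `swap_values` is a hypothetical name; real code should simply call `std::mem::swap`.

```rust
use std::ptr;

/// Swaps two values bitwise, without cloning and without dropping either one.
pub fn swap_values<T>(a: &mut T, b: &mut T) {
    // SAFETY: `a` and `b` come from `&mut T`, so both are valid, aligned,
    // initialized, and cannot alias each other. Every value is read once and
    // written once, so nothing is duplicated or dropped twice, and no code
    // that could panic runs between the read and the final write.
    unsafe {
        let tmp = ptr::read(a);            // bitwise move out of `a`
        ptr::copy_nonoverlapping(b, a, 1); // move `b` into `a`
        ptr::write(b, tmp);                // move the old `a` into `b`
    }
}
```

Between the `read` and the `write`, the same value briefly exists in two places; the sequence is only sound because nothing observes or drops it in that window.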
### When NOT to Use
**Don't use this when:**
- References (&T, &mut T) work (almost always)
- You can use safe abstractions (Vec, Box, slice methods)
- You're using raw pointers to "work around" the borrow checker
(fix the design instead)
---
## 7. unsafe impl Send/Sync
### Source:
[library/core/src/marker.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/marker.rs)
274 unsafe impl Send/Sync in the stdlib.
```rust
// library/alloc/src/sync.rs
unsafe impl<T: ?Sized + Sync + Send> Send for Arc<T> {}
unsafe impl<T: ?Sized + Sync + Send> Sync for Arc<T> {}
```
### Why
Types with raw pointers are !Send and !Sync by default (safe).
If you've built a type that IS safe to share across threads (e.g.,
using atomic operations internally), you must explicitly opt in
with `unsafe impl`.
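A concrete case where the opt-in is justified might look like this sketch. `CounterHandle` and `count_in_parallel` are hypothetical names; the handle owns a heap-allocated atomic through a raw pointer and only ever touches it with atomic operations.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical counter that owns a heap-allocated atomic through a raw pointer.
pub struct CounterHandle {
    ptr: *mut AtomicU64, // raw pointer: auto traits make this !Send and !Sync
}

// SAFETY: `ptr` is uniquely owned by this handle (allocated in `new`, freed
// only in `drop`), and the pointee is only accessed through atomic
// operations, so moving or sharing the handle cannot cause a data race.
unsafe impl Send for CounterHandle {}
unsafe impl Sync for CounterHandle {}

impl CounterHandle {
    pub fn new() -> Self {
        Self { ptr: Box::into_raw(Box::new(AtomicU64::new(0))) }
    }

    fn atomic(&self) -> &AtomicU64 {
        // SAFETY: `ptr` stays valid until `drop`, which requires `&mut self`
        // and therefore cannot run while this shared borrow is alive.
        unsafe { &*self.ptr }
    }

    pub fn increment(&self) {
        self.atomic().fetch_add(1, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.atomic().load(Ordering::Relaxed)
    }
}

impl Drop for CounterHandle {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.ptr)) };
    }
}

/// Shares one handle across `threads` scoped threads.
pub fn count_in_parallel(threads: usize, per_thread: u64) -> u64 {
    let counter = CounterHandle::new();
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    counter.increment();
                }
            });
        }
    });
    counter.get()
}
```

Without the `unsafe impl Sync`, the closure passed to `s.spawn` would not compile, because it borrows `counter` from another thread.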
### When to Use
**Triggers:**
- Your type contains raw pointers but IS thread-safe
- You use atomic operations for all shared access
- The type wraps a C library that's documented as thread-safe
**Example — before:**
```rust
struct SharedData {
    ptr: *mut u8,  // raw pointer → auto !Send, !Sync
}
// Can't use in thread::spawn — even if it's actually safe
```
**Example — after:**
```rust
struct SharedData {
    ptr: *mut u8,
    // internally uses atomic operations for all access
}
// SAFETY: SharedData uses atomic operations for all mutations
// and the underlying data is never accessed without synchronization.
unsafe impl Send for SharedData {}
unsafe impl Sync for SharedData {}
```
### When NOT to Use
**Don't use this when:**
- You're not 100% certain the type is thread-safe
- The type uses non-atomic interior mutability (`Cell`, `RefCell`), which rules out `Sync`
- You haven't proven that no data races are possible
---
## 8. NonNull and PhantomData for Safe Abstractions
### Source:
[library/core/src/ptr/non_null.rs](https://github.com/rust-lang/rust/blob/f53b654a8882fd5fc036c4ca7a4ff41ce32497a6/library/core/src/ptr/non_null.rs)
```rust
// NonNull is used instead of *mut T to encode the "never null" invariant.
// Simplified sketch: the real Vec keeps its pointer and capacity inside RawVec.
pub struct Vec<T> {
    ptr: NonNull<T>,         // never null, so the niche optimization applies
    len: usize,
    cap: usize,
    _marker: PhantomData<T>, // tells compiler Vec "owns" T values
}
```
### Why
`NonNull<T>` wraps a raw pointer with a "not null" invariant.
This enables the compiler to use the null bit pattern for
`Option<NonNull<T>>` optimization (same size as a raw pointer).
`PhantomData<T>` tells the compiler about ownership/variance
without storing T.
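The two pieces can be seen together in a minimal owning pointer. `MyBox` below is a hypothetical type written for this example, not how `Box` is implemented.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical owning pointer, for illustration only.
pub struct MyBox<T> {
    ptr: NonNull<T>,       // never null by construction
    _owns: PhantomData<T>, // signals that dropping MyBox<T> drops a T
}

impl<T> MyBox<T> {
    pub fn new(value: T) -> Self {
        // `Box::leak` returns a reference, which is never null.
        let ptr = NonNull::from(Box::leak(Box::new(value)));
        Self { ptr, _owns: PhantomData }
    }

    pub fn get(&self) -> &T {
        // SAFETY: `ptr` points to a live allocation owned by `self`; it is
        // freed only in `drop`, which cannot run while `&self` is borrowed.
        unsafe { self.ptr.as_ref() }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced from a `Box` in `new` and is freed once.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
    }
}

/// `Option<NonNull<T>>` uses the null pattern for `None`, so it stays pointer-sized.
pub fn option_nonnull_is_pointer_sized() -> bool {
    size_of::<Option<NonNull<u8>>>() == size_of::<*mut u8>()
}
```

In practice the same niche also carries over to `Option<MyBox<T>>`, though only the `Option<NonNull<T>>` layout is a documented guarantee.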
### When to Use
**Triggers:**
- You have a raw pointer that's never null by construction
- You want `Option<YourType>` to be pointer-sized
- You need correct drop checking behavior (PhantomData)
### When NOT to Use
**Don't use this when:**
- The pointer CAN be null (use `Option<NonNull<T>>` or `*mut T`)
- You don't need the niche optimization
- A reference (&T, &mut T) would work
---
## 9. extern "C" for FFI
### Source:
489 `extern "C"` blocks in the stdlib.
```rust
use std::ffi::c_char;
// Edition 2024 requires writing `unsafe extern "C"` here.
extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn memcpy(dst: *mut u8, src: *const u8, n: usize) -> *mut u8;
}
```
### Why
`extern "C"` declares functions using the C calling convention.
This is how Rust calls into C libraries. Functions declared in an
`extern` block are `unsafe` to call by default because Rust can't
verify C's behavior or that the declared signature matches the real one.
### When to Use
**Triggers:**
- Calling C/C++ libraries from Rust
- Providing Rust functions callable from C
- OS system calls
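Both directions can appear in one call: the sketch below hands a Rust comparator with the C ABI to the C library's `qsort`. It assumes a platform where `size_t` matches `usize` and the C library is linked by default, and it uses the `unsafe extern` spelling accepted since Rust 1.82 (older toolchains write plain `extern "C"`). `compare_i32` and `sort_i32` are hypothetical names.

```rust
use std::ffi::{c_int, c_void};

unsafe extern "C" {
    // Declaration of C's qsort; assumes `size_t` is the same width as `usize`.
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

/// Rust function exported with the C calling convention so C can call it.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // SAFETY: `qsort` only passes pointers to elements of the `i32` array
    // that `sort_i32` handed to it.
    let (a, b) = unsafe { (*a.cast::<i32>(), *b.cast::<i32>()) };
    // `Ordering` casts to -1, 0, or 1, which is what qsort expects.
    a.cmp(&b) as c_int
}

/// Safe wrapper: sorts a slice of `i32` using the C library.
pub fn sort_i32(values: &mut [i32]) {
    if values.len() < 2 {
        return; // nothing to sort; also avoids passing a dangling pointer to C
    }
    // SAFETY: `base`, `nmemb`, and `size` describe exactly the `values`
    // slice, which is exclusively borrowed for the duration of the call.
    // The comparator is a total order on `i32` and never unwinds.
    unsafe {
        qsort(
            values.as_mut_ptr().cast::<c_void>(),
            values.len(),
            std::mem::size_of::<i32>(),
            compare_i32,
        );
    }
}
```

For real code `values.sort()` is the better choice; the example only shows the shape of a callback crossing the FFI boundary.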
**Example — before:**
```rust
// Re-implementing in Rust what already exists in C:
fn my_strlen(s: &[u8]) -> usize {
    s.iter().position(|&b| b == 0).unwrap_or(s.len())
}
```
**Example — after:**
```rust
use std::ffi::{CStr, c_char};
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}
fn safe_strlen(s: &CStr) -> usize {
    // SAFETY: CStr is null-terminated, which strlen requires
    unsafe { strlen(s.as_ptr()) }
}
```
### When NOT to Use
**Don't use this when:**
- A safe Rust equivalent exists (prefer pure Rust)
- The C library isn't well-documented (you can't prove safety)
- You only need it on one platform (consider cfg + fallback)
---
## 10. Invariant Encoding in Types (Making Invalid States Unrepresentable)
### Source:
The stdlib encodes invariants in the type system, reducing the
surface area where unsafe is needed:
```rust
// NonZero<T>: can never be zero. `NonZero::new` checks at runtime;
// `NonZero::new_unchecked` is unsafe.
pub struct NonZero<T>(T); // simplified; T is a primitive integer
// str: ALWAYS valid UTF-8 (unsafe to construct from arbitrary bytes)
// &str methods never need to re-validate
// Pin<Ptr>: the pointed-to value will not be moved again (unless it is `Unpin`)
pub struct Pin<Ptr> { pointer: Ptr }
```
### Why
The safest unsafe code is code that doesn't exist. By encoding
invariants in types, you push the unsafe boundary to construction
and then never need unsafe again. `str` is ALWAYS valid UTF-8 —
every `&str` method can assume this without checking.
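The same idea applies to user-defined types. In the sketch below, `AsciiText` is a hypothetical newtype: the constructor validates once, the field is private, and later methods rely on the invariant instead of re-checking it.

```rust
/// Hypothetical newtype whose bytes are guaranteed to be ASCII.
pub struct AsciiText(Vec<u8>); // private field: only `new` can construct it

impl AsciiText {
    /// Validates once at the boundary.
    pub fn new(bytes: Vec<u8>) -> Option<Self> {
        if bytes.is_ascii() {
            Some(Self(bytes))
        } else {
            None
        }
    }

    /// No re-validation needed: the type carries the proof.
    pub fn as_str(&self) -> &str {
        // SAFETY: `new` is the only constructor and it checked that every
        // byte is ASCII; ASCII is always valid UTF-8, and no method mutates
        // the bytes afterwards.
        unsafe { std::str::from_utf8_unchecked(&self.0) }
    }
}
```

The soundness argument depends on the field staying private to the module; making it `pub` would let any caller break the invariant from safe code.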
### When to Use
**Triggers:**
- You have an invariant that many functions depend on
- Validating the invariant is expensive (do it once at construction)
- The invariant can be expressed as a type distinction
**Example — before:**
```rust
// Every function must check the invariant
fn process(data: &[u8]) -> Result<Output, Error> {
    if !is_valid_utf8(data) {
        return Err(Error::InvalidUtf8);
    }
    // ... 10 more functions all repeat this check
}
```
**Example — after:**
```rust
// Validate once at the boundary, then it's always true
struct ValidatedInput(String); // String is always valid UTF-8
impl ValidatedInput {
    pub fn new(data: &[u8]) -> Result<Self, Error> {
        let s = std::str::from_utf8(data)?;
        Ok(Self(s.to_owned()))
    }
}
fn process(input: &ValidatedInput) -> Output {
    // No validation needed: the type guarantees it
}
```
### When NOT to Use
**Don't use this when:**
- The invariant is trivial to check (just check it)
- The type would make the API confusing
- Creating the validated type requires unsafe (might not be worth it)
---
## Summary: Unsafe Decision Tree
```
Do you need unsafe?
├── Can you use a safe API? → NO UNSAFE (always prefer this)
├── Performance-critical inner loop → Safe wrapper + unsafe core
├── FFI (calling C) → extern "C" + safe wrapper
├── Custom data structure → raw pointers + NonNull + PhantomData
└── Thread safety assertion → unsafe impl Send/Sync
Writing an unsafe block?
├── Add // SAFETY: comment (MANDATORY)
├── What invariants does the unsafe op require?
├── How are those invariants guaranteed HERE?
└── Would a reviewer agree with your proof?
Designing an unsafe fn?
├── Document # Safety section (contract with caller)
├── What must the caller guarantee?
├── Can you offer a safe alternative? (almost always yes)
└── Name it with _unchecked suffix
```
| Pattern | Use when |
|---|---|
| `// SAFETY:` comment | Every `unsafe {}` block |
| `unsafe fn` | Preconditions callers must guarantee |
| Safe wrapper + unsafe core | Public API with bounds/validity checks |
| `MaybeUninit` | Avoiding unnecessary initialization |
| `transmute` | Zero-cost type reinterpretation |
| `ptr::read`/`write`/`copy` | Custom data structure internals |
| `unsafe impl Send/Sync` | Asserting thread safety for raw-pointer types |
| `NonNull` + `PhantomData` | Encoding invariants in pointer wrappers |
| `extern "C"` | FFI (calling C libraries) |
| Type-encoded invariants | Make invalid states unrepresentable |
See also:
- [documentation.md](documentation.md) — # Safety doc sections
- [concurrency.md](concurrency.md) — Send/Sync auto traits
- [ownership.md](ownership.md) — Raw pointers vs references
- [traits.md](traits.md) — Marker traits and sealed traits
<!-- PATTERN_COMPLETE -->