What's the difference between references and pointers in Rust?

Monday, February 6, 2023

I've been working on writing a Rust training course, and one of the things I struggled with explaining in there was the difference between references and pointers.

Ultimately, the underlying representation is the same: both hold an address for some memory. The difference between them is ultimately in semantics.

References have some rules enforced by the compiler. Specifically, they cannot outlive what they refer to (the "referent"), and mutable references cannot be aliased. Other than that, references behave a lot like the variables they point to. They have a type, and you can interact with that type to read it or (with mutable references) modify it.

On the other hand, pointers are semantically more about the address. This means that when we interact with them, we'll be modifying the address (things like add will do pointer offsets instead of adding to the underlying value). When we print them, we don't print the underlying value—in fact, we cannot get to the underlying value at all without the unsafe keyword. Instead, we print out the address.

We can see this with a simple program.

fn main() {
    let x: u32 = 10;
    let ref_x: &u32 = &x;
    let pointer_x: *const u32 = &x;

    println!("x: {x}");
    println!("ref_x: {}", ref_x);
    println!("pointer_x: {:?}", pointer_x);

First, we create an unsigned 32-bit integer and give it a value. Then we create a reference to the same value, and we'll also create a pointer to it. And then we try to print this out.

When we execute this, we get this output:

x: 10
ref_x: 10
pointer_x: 0x7ffd046a6444

When we interact with the variable directly or the reference, we get the underlying value. But with the pointer, we get the address!

You can still access the underlying values with pointers, but you have to use unsafe to do so. To see why, we can just try to dereference a raw pointer without unsafe and get an error message:

error[E0133]: dereference of raw pointer is unsafe and requires unsafe function or block
  --> src/main.rs:10:32
10 |     println!("*pointer_x: {}", *pointer_x);
   |                                ^^^^^^^^^^ dereference of raw pointer
   = note: raw pointers may be null, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior
   = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

The important bit is in the note:

note: raw pointers may be null, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior

And indeed, if we wrap it in unsafe, it will work:

println!("*pointer_x: {}", unsafe { *pointer_x } );

Using references is safe. The compiler will check that you don't alias the same mutable variable multiple times, ensuring you don't have data races. It will ensure that any references do not outlive the memory they refer to. You have to verify all those things yourself with raw pointers, so it's unsafe.

So that's the difference between references and pointers in Rust: they have the same underlying data, but different constraints and semantics with the compiler.

If this post was enjoyable or useful for you, please share it! If you have comments, questions, or feedback, please email my public inbox or my personal email. To get new posts, please use my RSS feed.