
Understanding Memory Management: From C/C++ Chaos to Rust’s Revolution

I wanted to understand why Rust has gained such passionate adoption in systems programming.

The answer lies in one fundamental concept: memory management.

Let’s first get a primer on the foundational concepts.

[Foundation] Compile Time vs Runtime

A program’s lifecycle:
source code -> machine code -> execution

Compile Time: The Planning Phase

Compile time is when the source code gets transformed into executable machine code. During this phase, the compiler:

  • Analyzes code structure - checking syntax, types, function signatures
  • Optimizes performance - eliminating dead code, inlining functions, reordering operations
  • Resolves static references - linking function calls, determining memory layouts for structs
  • Catches certain errors - type mismatches, undefined variables, syntax errors

Think of compile time as an architect reviewing building plans: they can catch structural issues and optimize the design before construction begins.

// Compile time: compiler knows this struct layout
struct Point {
    int x;  // 4 bytes
    int y;  // 4 bytes
};         // Total: 8 bytes (plus potential padding)

// Compile time: compiler can optimize this
int calculate_area(int width, int height) {
    return width * height;  // Simple multiplication
}

Runtime: The Execution Phase

Runtime is when your compiled program actually executes. During this phase:

  • Memory gets allocated and freed - programs request and release RAM as needed
  • User input is processed - file reads, network requests, keyboard input
  • Dynamic decisions are made - following if/else branches, handling errors
  • Resource management happens - opening files, network connections, database queries

Runtime is like the actual construction of the building, where we deal with real materials and unexpected conditions that the architect couldn’t fully predict.

// Runtime: actual memory allocation happens here
char* buffer = malloc(user_input_size);  // Size determined at runtime

// Runtime: branching based on actual data
if (user_age >= 18) {
    grant_access();
} else {
    deny_access();
}

Why This Distinction Matters

The fundamental challenge in systems programming is this tension:

  • Compile-time checks are safe but limited - the compiler can’t predict everything that will happen
  • Runtime decisions are flexible but dangerous - programs have full control but can make catastrophic mistakes

This is where memory management becomes critical, because memory errors typically happen at runtime but often can’t be caught by traditional compilers.


[Foundation] Memory Layout: Stack and Heap

When your program runs, the operating system allocates a chunk of RAM and organizes it into distinct regions. The two most critical regions for understanding memory management are the stack and the heap.

The Stack: Structured and Predictable

The stack is a highly organized memory region that follows strict rules:

Structure: Last In, First Out (LIFO)

│     │  ← Top of stack
├─────┤
│ bar │  ← Most recent function call
├─────┤
│ foo │  ← Previous function call  
├─────┤
│main │  ← Bottom of stack
└─────┘

What Lives on the Stack

The stack stores data that is:

  • Fixed-size and known at compile time - int x = 5, char buffer[100]
  • Function-scoped - local variables, parameters, return addresses
  • Short-lived - automatically cleaned up when functions return
  • Small - typically a few KB per function call

void example_function(int param) {  // param goes on stack
    int local_var = 42;             // local_var goes on stack
    char small_buffer[64];          // small_buffer goes on stack
    
    // When function returns, ALL of this disappears automatically
}

Stack Frames

Each function call creates a “stack frame” containing all its local data:

int main(void) {
    int a = 1;
    foo(a);
}

void foo(int param) {
    int b = 2;
    bar(param + b);
}

void bar(int param) {
    int c = 3;
    // Do something
}

Memory layout during execution:

│ bar: c=3, param=3 │  ← Current function
├───────────────────┤
│ foo: b=2, param=1 │  ← Previous function
├───────────────────┤
│ main: a=1         │  ← Original function
└───────────────────┘

When bar() returns, its entire frame disappears instantly. When foo() returns, its frame disappears. This automatic cleanup is why stack memory is both fast and safe.

Stack Advantages

  • Lightning fast - allocation/deallocation is just moving a pointer
  • Automatic memory management - no manual cleanup required
  • Memory safety - impossible to access deallocated stack memory
  • Cache-friendly - sequential memory access patterns

Stack Limitations

  • Size restrictions - typically 1-8MB total per thread
  • Scope limitations - data disappears when function returns
  • Fixed-size only - can’t allocate based on runtime values

void problematic() {
    int size = get_user_input();    // Runtime value
    int array[size];                // VLA: legal in C99 but easy to overflow; not allowed in C++
    
    // Also problematic - too large for stack
    int huge_array[1000000];        // Likely stack overflow
}

The Heap: Flexible but Dangerous

The heap is a large, unstructured memory region designed for dynamic allocation:

What Lives on the Heap

The heap stores data that is:

  • Dynamic-size - size determined at runtime
  • Long-lived - can outlast the function that created it
  • Large - megabytes, gigabytes, limited only by available RAM
  • Shared - can be accessed from multiple functions

// Runtime-determined sizes
int size = get_user_input();
char* buffer = malloc(size);        // Dynamic allocation

// Large data structures
struct Database* db = malloc(sizeof(struct Database));
db->records = malloc(sizeof(Record) * 1000000);

// Long-lived data
char* global_config = load_config_file();  // Outlasts the function

How Heap Allocation Works

Unlike stack allocation (which is just pointer arithmetic), heap allocation involves:

  1. Searching for free space - the allocator maintains a data structure of available blocks
  2. Marking memory as used - updating internal bookkeeping
  3. Returning a pointer - giving you the address of your new memory block

char* ptr = malloc(100);  // Allocator finds 100 bytes and returns address

// Memory state:
// Stack: ptr = 0x7f8b4c000100 (just an address)
// Heap:  [100 bytes starting at 0x7f8b4c000100]

[Critical] What “Freeing” Actually Means

This is where many developers’ intuition breaks down. When you call:

free(ptr);

What freeing does NOT do:

  • Erase or zero out the memory
  • Change your pointer variable - it still holds the old address
  • Physically prevent further access to that memory

What freeing ACTUALLY does:

  • Marks the memory block as “available for reuse”
  • Updates the allocator’s internal data structures
  • Allows future malloc() calls to reuse that space

The memory contents remain completely unchanged:

int* ptr = malloc(sizeof(int));
*ptr = 42;

printf("%d\n", *ptr);  // Prints: 42

free(ptr);

printf("%d\n", *ptr);  // Still prints: 42 (but this is undefined behavior!)

This fundamental misunderstanding about what free() does is the root cause of most heap-related bugs.

Heap Advantages

  • Dynamic sizing - allocate based on runtime needs
  • Large capacity - limited only by available RAM
  • Persistent across function calls - data survives scope changes
  • Shareable - multiple pointers can reference the same data

Heap Disadvantages

  • Slower allocation - requires searching and bookkeeping
  • Manual memory management - you must remember to free everything
  • Fragmentation - repeated allocation/deallocation can waste space
  • No automatic safety - easy to create bugs

The C/C++ Approach: Power Without Safety

C and C++ give direct, unrestricted access to both stack and heap memory.

This approach has powered decades of system software, but it comes with a fundamental flaw: the compiler cannot prevent memory safety bugs.

Why C/C++ Cannot Guarantee Memory Safety

The core issue is that C/C++ compilers make minimal assumptions about the code:

// The compiler has NO idea:
// - Who owns this memory?
// - When should it be freed?
// - Who else might be using it?
char* create_buffer() {
    char* ptr = malloc(100);
    return ptr;  // Ownership transferred, but to whom?
}

void some_function() {
    char* buffer = create_buffer();
    // Should I free this? When? What if someone else is using it?
    free(buffer);  // Maybe? Maybe not?
}

The compiler sees this code as perfectly valid, even though it’s impossible to determine from the code alone whether the memory management is correct.

Types of Memory Issues

Let’s look at the major categories of memory bugs that plague C/C++ programs:

1. Memory Leaks

Allocating memory but never freeing it.

void process_file(const char* filename) {
    char* buffer = malloc(LARGE_SIZE);
    FILE* file = fopen(filename, "r");
    
    if (!file) {
        return;  // Leaked buffer!
    }
    
    // Process file...
    
    fclose(file);
    free(buffer);  // Only freed on success path
}

Why it’s insidious:

  • Programs continue working normally… for a while
  • Memory usage slowly grows over time
  • Eventually the system runs out of RAM
  • Critical for long-running services (web servers, databases)

Real-world impact:

  • Web servers that crash after handling millions of requests
  • Mobile apps that get slower and slower
  • System services that eventually bring down entire machines

2. Use-After-Free

Accessing memory after it has been freed.

char* create_and_process() {
    char* data = malloc(100);
    strcpy(data, "important data");
    
    free(data);
    
    // Some time later...
    return data;  // Returning freed memory!
}

void caller() {
    char* result = create_and_process();
    printf("%s\n", result);  // Reading freed memory!
}

The timeline of disaster:

  1. Initial state: Memory contains “important data”
  2. After free(): Memory still contains “important data” (not erased!)
  3. Program continues: Everything seems fine
  4. Later malloc(): Same memory gets reused for something else
  5. Original access: Now reads/writes completely unrelated data

Why it’s terrifying:

  • May work correctly for months in testing
  • Fails nondeterministically in production
  • Can corrupt unrelated data silently
  • Often exploitable by attackers

3. Double Free

Calling free() on the same pointer twice.

void cleanup(char* ptr1, char* ptr2) {
    free(ptr1);
    free(ptr2);
    
    // Oops, ptr1 and ptr2 were pointing to the same memory!
    // We just called free() twice on the same address
}

What goes wrong:

  • The allocator maintains internal data structures
  • First free() updates these structures correctly
  • Second free() corrupts the allocator’s bookkeeping
  • Future malloc()/free() calls may crash or behave unpredictably

Example of allocator corruption:

char* ptr = malloc(100);
free(ptr);
free(ptr);  // Double free

// Later, completely unrelated code:
char* other = malloc(50);  // May crash due to corrupted allocator

4. Dangling Pointers

References to Nowhere

Stack dangling pointers:

int* get_local_address() {
    int x = 42;
    return &x;  // Returning address of local variable
}

void caller() {
    int* ptr = get_local_address();
    printf("%d\n", *ptr);  // x no longer exists!
}

Heap dangling pointers:

int* ptr1 = malloc(sizeof(int));
int* ptr2 = ptr1;  // Both point to same memory

free(ptr1);
// ptr2 is now dangling

*ptr2 = 10;  // Writing to freed memory

Why dangling pointers are particularly nasty:

  • The pointer value looks completely valid
  • No automatic detection mechanism
  • May accidentally access other valid data
  • Can cause data corruption in unrelated parts of the program

The Fundamental Problem: Missing Information

The root issue is that C/C++ code lacks critical information:

// Looking at this function signature, can you tell:
// 1. Who owns the returned pointer?
// 2. When should it be freed?
// 3. Can it be NULL?
// 4. Is it pointing to one item or an array?
// 5. How large is the allocated memory?
char* process_data(char* input);

The compiler has the same problem: it can’t reason about memory safety without this information.


Rust: Same Hardware, New Rules

Here’s the key insight that many miss:

Rust uses exactly the same memory model as C/C++.

The stack works identically. The heap works identically. Memory allocation and deallocation happen the same way at the hardware level.

What Rust changes is not how memory works, but who is allowed to access it when.

The Concept of “Ownership”

Rust introduces a compile-time system called “ownership” that tracks:

  • Who owns each piece of memory
  • When that memory can be accessed
  • When that memory will be cleaned up

Let’s see how this works:

Basic Ownership

fn main() {
    let s = String::from("hello");  // s owns the heap memory
    
    // Memory layout:
    // Stack: s { ptr: 0x..., len: 5, capacity: 5 }
    // Heap:  "hello"
    
}  // s goes out of scope, memory is automatically freed

What just happened:

  1. String::from("hello") allocates memory on the heap
  2. Variable s becomes the owner of that memory
  3. When s goes out of scope, Rust automatically calls drop(s)
  4. The heap memory is freed automatically

No malloc(). No free(). No memory leaks.

Ownership Transfer (Moving)

fn main() {
    let s1 = String::from("hello");
    let s2 = s1;  // Ownership moves from s1 to s2
    
    // println!("{}", s1);  // Compile error! s1 no longer owns the data
    println!("{}", s2);     // s2 is the owner
}

This prevents double-free bugs at compile time. Only one owner can exist, so only one automatic cleanup will happen.

Borrowing: Access Without Ownership

fn main() {
    let s = String::from("hello");
    
    let len = calculate_length(&s);  // Borrow s, don't take ownership
    
    println!("Length of '{}' is {}", s, len);  // s still valid
}

fn calculate_length(s: &String) -> usize {  // s is a reference, not owner
    s.len()
}  // s goes out of scope, but it doesn't own the data, so nothing happens

The Borrow Checker: Compile-Time Safety

Rust’s compiler includes a “borrow checker” that enforces these rules:

Rule 1: References cannot outlive their data

let r;
{
    let x = 5;
    r = &x;  // Compile error: x doesn't live long enough
}
// println!("{}", r);  // This would be a dangling pointer

Rule 2: No mutable aliasing

let mut s = String::from("hello");

let r1 = &s;     // Immutable borrow
let r2 = &s;     // Multiple immutable borrows are OK
let r3 = &mut s; // Compile error: cannot borrow as mutable while immutable borrows exist
println!("{}, {}", r1, r2);  // The immutable borrows are still in use here

This prevents data races and ensures memory safety.

Comparing the Same Bug Across Languages

Let’s see how the same logical error is handled differently:

C version (compiles, crashes at runtime):

char* get_data() {
    char local[] = "hello";
    return local;  // Returns pointer to stack memory
}

int main() {
    char* data = get_data();
    printf("%s\n", data);  // Undefined behavior - probably garbage
}

Rust version (rejected at compile time):

fn get_data() -> &String {  // Compile error: missing lifetime specifier
    let local = String::from("hello");
    &local  // Even with a lifetime, this would be rejected: it returns a reference to data owned by the current function
}

Rust catches this entire class of bugs before your code even runs.

The Third Option: Garbage Collection

Before we continue with Rust, it’s worth understanding another approach to memory management that many languages use: garbage collection.

How Garbage Collectors Work

Languages like Java, Python, C#, and JavaScript solve memory safety differently:

  • Automatic allocation - objects are created on the heap as needed
  • No manual deallocation - you never call free() or delete
  • Runtime cleanup - a garbage collector periodically finds and frees unused memory

// Java example
String data = new String("hello");  // Allocated on heap
// No need to free - GC will handle it eventually

The Garbage Collection Process

A garbage collector works by:

  1. Pausing the program (stop-the-world)
  2. Tracing all reachable objects starting from “roots” (global variables, stack variables)
  3. Marking unreachable objects as garbage
  4. Freeing the garbage memory and updating references
  5. Resuming the program

Before GC:          After GC:
┌─────┐             ┌─────┐
│ A   │────────────→│ A   │
└─────┘             └─────┘
   │                   │
   ▼                   ▼
┌─────┐             ┌─────┐
│ B   │             │ B   │
└─────┘             └─────┘

┌─────┐             (freed)
│ C   │ orphaned    
└─────┘             

Garbage Collection Trade-offs

Advantages:

  • Memory safety - eliminates use-after-free, double-free, memory leaks
  • Programmer convenience - no manual memory management required
  • Reduced bugs - entire classes of memory errors become impossible

Disadvantages:

  • Unpredictable pauses - GC can stop your program at any time
  • Memory overhead - GC needs extra memory to track object relationships
  • Performance cost - scanning and cleanup takes CPU time
  • No real-time guarantees - unsuitable for hard real-time systems

Why Not Garbage Collection for Systems Programming?

Garbage collectors work well for application development, but have limitations in systems programming:

// In an operating system kernel:
void handle_interrupt() {
    // This must complete in microseconds
    // A GC pause here could miss critical hardware events
}

// In a game engine:
void render_frame() {
    // Must complete in 16ms for 60 FPS
    // A 50ms GC pause would cause visible stuttering
}

For systems where predictable performance and minimal overhead are critical, garbage collection introduces unacceptable uncertainty.

The Performance Question

A common concern:

Doesn’t all this safety checking slow down my program?

The answer is no, because:

  1. All ownership checking happens at compile time - zero runtime cost
  2. No garbage collector - no stop-the-world pauses
  3. Zero-cost abstractions - Rust’s safety features typically compile down to machine code comparable to hand-written C
  4. Predictable performance - no hidden allocations or cleanup

Conclusion

Understanding memory management is fundamental to systems programming, and the evolution from manual memory management to ownership systems represents a significant leap forward in software engineering.

We’ve seen how:

  • Stack and heap serve different purposes with different trade-offs
  • C/C++’s manual approach provides control but allows dangerous bugs
  • Garbage collection offers safety but sacrifices performance predictability
  • Rust’s ownership system delivers both safety and performance through compile-time guarantees

The key insight is that Rust doesn’t change how memory works at the hardware level; instead, it changes who is allowed to access memory and when, enforcing these rules at compile time rather than hoping for correct behavior at runtime.

While Rust started in systems programming, its safety guarantees have made it increasingly popular for web development, CLI tools, and even machine learning, proving that safe, fast code is valuable across all domains.