Scoped Stack Allocation in GBA EWRAM: A Rust Smart Pointer Pattern

The Game Boy Advance has an unusual memory layout compared to modern systems. How can Rust use each memory area efficiently while keeping the code safe?

The Game Boy Advance has two main general-purpose memory areas:

IWRAM (Internal Work RAM): 32 KB, fast (32-bit bus, no wait states), typically used for the stack and hot code.
EWRAM (External Work RAM): 256 KB, slower (16-bit bus with wait states), but much larger. Used for general-purpose memory and application data.

When writing Rust for the GBA, this creates tension with Rust’s default assumptions:

Rust wants to put most temporaries on the stack.
The stack usually lives in IWRAM.
Large stack frames are bad (or outright impossible) due to the limited size of IWRAM.

Heap allocation can solve this, but it adds runtime cost and is therefore often avoided on bare-metal targets where every CPU cycle matters. So, can we use EWRAM as a second stack and allocate from it the way we allocate from the normal one?

This post describes a pattern I implemented to solve this problem: a scoped, stack-like allocator backed by EWRAM, exposed through a Rust trait that behaves like a smart pointer but is enforced entirely by lifetimes.

The result is an API that looks like this:

let cx = &mut Ewram;
let value: &mut i32 = cx.init(|| 0);
*value += 3;

No heap. No global state leaks. No unsafe at the call site.

Design Goals

Avoid heap allocation
Avoid growing the IWRAM stack
Use EWRAM as a stack (LIFO, scoped)
Statically prevent use-after-free
Allow destructors (Drop) to run correctly
Work in no_std and on real hardware

Rust already has all the pieces we need—but not in a way that maps cleanly onto the GBA memory model.

The Core Idea: `Rs` (“RAM Stack”)

At the heart of this design is a single trait:

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

This trait represents a slot of memory that can be initialized exactly once and then borrowed for the entire lifetime of the slot.

Key properties:

self is mutably borrowed for 'a
The returned &'a mut T cannot outlive the slot
The slot cannot be reused after initialization

Lifetime as Ownership

Consider this example:

let slot = &mut Ewram;
let a = slot.init(|| 0);

As soon as init is called:

slot is mutably borrowed for its entire lifetime
You cannot call init again
You cannot move or drop slot

All of the following are compile-time errors:

Double initialization
Moving the slot after initialization
Dropping the slot while the value is still borrowed
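
To make these rules concrete, here is a minimal host-side sketch. `Slot` is a hypothetical stand-in, not the real `Ewram` type: it stores the value inline and only reproduces the `&'a mut self` signature that locks the slot. The commented-out lines are the ones the borrow checker refuses.

```rust
use core::marker::PhantomData;

/// Hypothetical stand-in for a slot (not the post's actual type).
pub struct Slot<'a, T> {
    value: Option<T>,
    _borrowed_for: PhantomData<&'a ()>,
}

impl<'a, T> Slot<'a, T> {
    pub fn new() -> Self {
        Slot { value: None, _borrowed_for: PhantomData }
    }

    /// Borrowing `self` for the full `'a` is what locks the slot.
    pub fn init(&'a mut self, f: impl FnOnce() -> T) -> &'a mut T {
        self.value.insert(f())
    }
}

pub fn locked_slot_demo() -> i32 {
    let mut slot = Slot::new();
    let a = slot.init(|| 0);
    *a += 3;

    // Each of the following lines is rejected at compile time, because
    // `slot` stays mutably borrowed for the rest of its lifetime:
    //
    // let b = slot.init(|| 1); // second initialization
    // let moved = slot;        // moving the slot
    // drop(slot);              // dropping the slot explicitly

    *a
}
```

Letting `slot` fall out of scope at the end of the function is still fine; only explicit moves and drops while the borrow is alive are rejected.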

This is how Rust enforces stack discipline without runtime checks.

`Ewram`: A Scoped EWRAM Stack

The EWRAM slot is an enum with two states; the `Ewram` written at the call site is its empty variant:

enum ExternalRam<'a, T> {
    Ewram,
    Initialized((EwramNonNull<T>, PhantomData<&'a ()>)),
}

Ewram is the empty slot
Initialized owns a pointer into EWRAM

The lifetime 'a is not used at runtime—it exists purely to teach the borrow checker that:

“Once initialized, this slot is borrowed forever.”

This prevents all invalid usage statically.
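
The following is a rough sketch of how such a slot could be implemented, written so that it runs on a desktop host. Everything beyond the enum shape and the `Rs` trait is an assumption: the static `ARENA` buffer stands in for EWRAM, `TOP` is a hypothetical bump pointer, and the fields of `EwramNonNull` are invented for illustration. It also assumes single-threaded use and that slots are initialized in the order they are declared.

```rust
use core::cell::UnsafeCell;
use core::marker::PhantomData;
use core::mem::{align_of, size_of};
use core::ptr::NonNull;
use core::sync::atomic::{AtomicUsize, Ordering};

// Host-side stand-in for the EWRAM region (hypothetical size).
const ARENA_SIZE: usize = 1024;

#[repr(align(16))]
struct Arena(UnsafeCell<[u8; ARENA_SIZE]>);

// Assumption: single-threaded use, as on the GBA.
unsafe impl Sync for Arena {}

static ARENA: Arena = Arena(UnsafeCell::new([0; ARENA_SIZE]));
/// Offset of the first free byte: the "stack pointer" of the arena.
static TOP: AtomicUsize = AtomicUsize::new(0);

pub fn arena_used() -> usize {
    TOP.load(Ordering::Relaxed)
}

/// Owning pointer into the arena (fields are assumptions of this sketch).
pub struct EwramNonNull<T> {
    ptr: NonNull<T>,
    prev_top: usize,
}

impl<T> Drop for EwramNonNull<T> {
    fn drop(&mut self) {
        // Run the value's destructor, then pop the allocation.
        unsafe { core::ptr::drop_in_place(self.ptr.as_ptr()) };
        TOP.store(self.prev_top, Ordering::Relaxed);
    }
}

pub enum ExternalRam<'a, T> {
    Ewram,
    Initialized((EwramNonNull<T>, PhantomData<&'a ()>)),
}

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

impl<'a, T> Rs<'a, T> for ExternalRam<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        let align = align_of::<T>();
        assert!(align <= 16, "alignment exceeds the arena's alignment");
        let prev_top = TOP.load(Ordering::Relaxed);
        let start = (prev_top + align - 1) & !(align - 1);
        let end = start + size_of::<T>();
        assert!(end <= ARENA_SIZE, "simulated EWRAM stack overflow");
        TOP.store(end, Ordering::Relaxed);

        // Reserve first, then construct: `f` runs with the memory claimed.
        let raw = unsafe { (ARENA.0.get() as *mut u8).add(start) as *mut T };
        unsafe { raw.write(f()) };
        let ptr = NonNull::new(raw).expect("arena pointer is never null");
        *self = ExternalRam::Initialized((EwramNonNull { ptr, prev_top }, PhantomData));
        unsafe { &mut *raw }
    }
}

pub fn ewram_demo() -> (i32, usize, usize) {
    let result;
    let used_inside;
    {
        let mut slot: ExternalRam<'_, i32> = ExternalRam::Ewram;
        let value = slot.init(|| 0);
        *value += 3;
        result = *value;
        used_inside = arena_used();
    } // `slot` is dropped here: the value is destroyed and the top restored.
    (result, used_inside, arena_used())
}
```

In this sketch the `Drop` impl sits on `EwramNonNull<T>` rather than on the enum. A `Drop` impl on a type that carries `'a` would require `'a` to still be valid when the slot is dropped, which conflicts with the `&'a mut self` borrow.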

Why `FnOnce() -> T`?

init takes a closure instead of a value:

cx.init(|| big_struct())

This avoids a subtle but critical problem:

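If T were passed by value, Rust would first construct it on the normal stack.
That defeats the entire purpose and adds an extra copy from the stack into EWRAM.

By deferring construction until after allocation, the closure's result can be written directly into EWRAM. Rust does not guarantee in-place construction, but once the closure is inlined the optimizer can usually avoid staging the value in IWRAM first.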
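
A small host-side sketch of the idea, assuming a hypothetical helper `init_in_place` and using a `MaybeUninit` local as a stand-in for the reserved EWRAM slot. The destination exists before the closure runs, so the result can be stored straight into it. Whether the value is really built in place still depends on inlining and optimization.

```rust
use core::mem::MaybeUninit;

/// Example payload that would be unpleasant to keep on a 32 KB stack.
pub struct BigStruct {
    pub data: [u8; 4096],
}

/// Hypothetical helper: the destination is reserved by the caller,
/// and `f` is only invoked once that destination exists.
pub fn init_in_place<T>(dst: &mut MaybeUninit<T>, f: impl FnOnce() -> T) -> &mut T {
    dst.write(f())
}

pub fn in_place_demo() -> (u8, usize) {
    // Stand-in for a reserved EWRAM slot; on the host this is a plain local.
    let mut storage = MaybeUninit::<BigStruct>::uninit();
    let big = init_in_place(&mut storage, || BigStruct { data: [7; 4096] });
    (big.data[0], big.data.len())
}
```

Passing `BigStruct { .. }` by value instead would force the caller to materialize the whole struct before the callee has had a chance to reserve a destination for it.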

Drop Safety and Order Guarantees

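Allocation and deallocation on the EWRAM stack are strictly LIFO, so values must be dropped in the reverse order of their initialization. No runtime check is needed for this: slots live in local scopes, Rust drops locals in reverse declaration order, and the borrow checker rejects any attempt to move or drop an initialized slot early.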
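
The ordering can be observed with a small host-side sketch. It assumes the `Option`-shaped stack slot that this post uses for the normal stack; the `StackSlot` alias, the `Noisy` type and the `Vec`-based log are hypothetical helpers that exist only for this demonstration (the log needs an allocator, so this part is host-only).

```rust
use core::cell::RefCell;
use core::marker::PhantomData;

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

/// Stack-backed slot, following the `Option` shape described in this post.
pub type StackSlot<'a, T> = Option<(T, PhantomData<&'a ()>)>;

impl<'a, T> Rs<'a, T> for StackSlot<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        // Store the value in the slot and hand out a borrow tied to `'a`.
        &mut self.insert((f(), PhantomData)).0
    }
}

/// Records its name in a shared log when dropped.
pub struct Noisy<'l> {
    name: &'static str,
    log: &'l RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

pub fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let mut first: StackSlot<'_, Noisy<'_>> = None;
        let mut second: StackSlot<'_, Noisy<'_>> = None;
        first.init(|| Noisy { name: "first", log: &log });
        second.init(|| Noisy { name: "second", log: &log });
    } // `second` is dropped before `first`
    log.into_inner()
}
```

Calling `drop_order()` returns the names in the order the destructors ran, with the most recently declared slot first.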

Choosing the Stack at the Call Site

The real power of Rs is that it’s not tied to EWRAM.

The same trait is implemented for multiple storage backends, including:

Ewram: EWRAM-backed stack allocation
None: normal Rust stack allocation

This means the caller decides where memory lives, without changing the API.

Using `None` as a Stack Slot

On the normal stack, Rs is implemented for:

Option<(T, PhantomData<&'a ()>)>

The PhantomData<&'a ()> does not exist at runtime. Its sole purpose is to tie the stored value to the borrow lifetime of the slot, ensuring that the value cannot outlive the slot.

From the caller’s perspective, this is completely invisible:

let cx = &mut None;        // normal stack (IWRAM on GBA)
let value: &mut i32 = cx.init(|| 0);
*value += 3;

The same generic function works unchanged:

fn init_default<'a, T: Default>(cx: &'a mut impl Rs<'a, T>) -> &'a mut T {
    cx.init(|| Default::default())
}

And the call site controls placement:

let slot_a = &mut None;                  // slots are bound first so they outlive the calls
let slot_b = &mut Ewram;
let a: &mut i32 = init_default(slot_a);  // normal stack
let b: &mut i32 = init_default(slot_b);  // EWRAM stack

No code duplication. No cfg flags. No runtime cost. No heap allocation. The call site decides where memory lives.
