The Game Boy Advance has a very unusual memory layout compared to modern systems. How can you use Rust to work with each memory area in a way that is both fast and safe?
The Game Boy Advance has two main general-purpose memory areas:
- IWRAM (Internal Work RAM): 32 KiB, fast, typically used for the stack and hot code.
- EWRAM (External Work RAM): 256 KiB, slower, but much larger. Used for general-purpose memory and application data.
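To make the two regions concrete, here is a small sketch that classifies an address by region. The base addresses come from the GBA memory map; the constant names, the `Region` enum and `region_of` are illustrative and not part of the code described in this post.

```rust
// Work-RAM ranges from the GBA memory map (mirrors are ignored here).
pub const EWRAM_START: usize = 0x0200_0000;
pub const EWRAM_SIZE: usize = 256 * 1024;
pub const IWRAM_START: usize = 0x0300_0000;
pub const IWRAM_SIZE: usize = 32 * 1024;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Region {
    Ewram,
    Iwram,
    Other,
}

/// Returns the work-RAM region an address falls into, if any.
pub fn region_of(addr: usize) -> Region {
    if (EWRAM_START..EWRAM_START + EWRAM_SIZE).contains(&addr) {
        Region::Ewram
    } else if (IWRAM_START..IWRAM_START + IWRAM_SIZE).contains(&addr) {
        Region::Iwram
    } else {
        Region::Other
    }
}
```

A helper like this can be handy in debug assertions, for example to check that a reference really points into the region you expect.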
When writing Rust for the GBA, this creates tension with Rust’s default assumptions:
- Rust wants to put most temporaries on the stack.
- The stack usually lives in IWRAM.
- Large stack frames are bad (or outright impossible) due to the limited size of IWRAM.
Heap allocation can solve this, but it comes with additional runtime cost and is therefore often avoided on bare-metal targets where every CPU cycle matters. So, can we allocate in EWRAM the same way we allocate on the normal stack?
This post describes a pattern I implemented to solve this problem: a scoped, stack-based allocator backed by EWRAM, exposed through a Rust trait that behaves like a smart pointer but is enforced entirely by lifetimes.
The result is an API that looks like this:
let cx = &mut Ewram;
let value: &mut i32 = cx.init(|| 0);
*value += 3;
No heap. No global state leaks. No unsafe at the call site.
Design Goals
- Avoid heap allocation
- Avoid growing the IWRAM stack
- Use EWRAM as a stack (LIFO, scoped)
- Statically prevent use-after-free
- Allow destructors (Drop) to run correctly
- Work in no_std and on real hardware
Rust already has all the pieces we need—but not in a way that maps cleanly onto the GBA memory model.
The Core Idea: Rs (“RAM Stack”)
At the heart of this design is a single trait:
pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}
This trait represents a slot of memory that can be initialized exactly once and then borrowed for the entire lifetime of the slot.
Key properties:
- self is mutably borrowed for 'a
- The returned &'a mut T cannot outlive the slot
- The slot cannot be reused after initialization
Lifetime as Ownership
Consider this example:
let slot = &mut Ewram;
let a = slot.init(|| 0);
As soon as init is called:
- slot is mutably borrowed for its entire lifetime
- You cannot call init again
- You cannot move or drop slot
All of the following are compile-time errors:
- Double initialization
- Moving the slot after initialization
- Dropping the slot while the value is still borrowed
This is how Rust enforces stack discipline without runtime checks.
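The following host-runnable sketch shows the effect. It uses a plain Option on the ordinary stack as a stand-in backend (introduced further down in this post) rather than the EWRAM implementation, and `demo` is a hypothetical name. The commented-out lines are the ones the borrow checker rejects.

```rust
use core::marker::PhantomData;

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

// Stand-in backend on the ordinary stack, not the EWRAM implementation.
impl<'a, T> Rs<'a, T> for Option<(T, PhantomData<&'a ()>)> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        // Store the value in the slot and hand out a borrow tied to 'a.
        let (value, _marker) = self.insert((f(), PhantomData));
        value
    }
}

pub fn demo() -> i32 {
    let slot = &mut None;
    let a: &mut i32 = slot.init(|| 0);
    *a += 3;

    // Uncommenting either line fails to compile, because `slot` stays
    // mutably borrowed for as long as the slot itself exists:
    // let b = slot.init(|| 1); // second initialization
    // let moved = slot;        // moving the slot away

    *a
}
```

The key is that the lifetime 'a appears both in the borrow of the slot and inside the slot's own type, so the compiler cannot shorten the borrow.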
Ewram: A Scoped EWRAM Stack
Ewram is the empty variant of the ExternalRam enum, which has two states:
enum ExternalRam<'a, T> {
    Ewram,
    Initialized((EwramNonNull<T>, PhantomData<&'a ()>)),
}
- Ewram is the empty slot
- Initialized owns a pointer into EWRAM
The lifetime 'a is not used at runtime—it exists purely to teach the borrow checker that:
“Once initialized, this slot is borrowed forever.”
This prevents all invalid usage statically.
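This post does not show how the pointer stored in Initialized is obtained. One plausible approach is a bump pointer over the EWRAM region. The sketch below is an assumption, not the actual implementation: `BumpStack` and its methods are hypothetical names, and it hands out offsets instead of raw pointers so that it stays safe and runs on a desktop.

```rust
/// Hypothetical bump stack over a fixed-size region. It tracks offsets only;
/// a real backend would add them to the region's base address.
pub struct BumpStack {
    top: usize,
    capacity: usize,
}

impl BumpStack {
    pub const fn new(capacity: usize) -> Self {
        Self { top: 0, capacity }
    }

    /// Reserves `size` bytes aligned to `align` (a power of two).
    /// Returns the start offset, or None if the region is full.
    pub fn push(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round the current top up to the requested alignment.
        let start = self.top.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.capacity {
            return None;
        }
        self.top = end;
        Some(start)
    }

    /// Current top of the stack, saved before a push and restored on release.
    pub fn mark(&self) -> usize {
        self.top
    }

    /// Releases everything allocated above `mark` (LIFO order only).
    pub fn release_to(&mut self, mark: usize) {
        assert!(mark <= self.top, "release out of LIFO order");
        self.top = mark;
    }
}
```

In a design like this, initializing a slot would push, and dropping the slot would release back to the saved mark, which is why slots have to be dropped in reverse order.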
Why FnOnce() -> T?
init takes a closure instead of a value:
cx.init(|| big_struct())
This avoids a subtle but critical problem:
- If T were passed by value, Rust would first place it on the normal stack
- That defeats the entire purpose and adds extra memcpy overhead
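By deferring construction until after allocation, the destination already exists when the value is produced, so the compiler can write it directly into EWRAM instead of staging it in the caller's stack frame. Eliding the intermediate copy is an optimization and is not guaranteed by the language.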
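The ordering can be sketched on the host with `core::mem::MaybeUninit`: the storage is reserved first, and only then is the closure run. `init_in` and `demo` are hypothetical names, and a local `MaybeUninit` stands in for memory reserved in EWRAM.

```rust
use core::mem::MaybeUninit;

/// Runs `f` only after the destination exists, then stores the result there.
pub fn init_in<T>(slot: &mut MaybeUninit<T>, f: impl FnOnce() -> T) -> &mut T {
    // MaybeUninit::write moves the value into the slot and returns a
    // reference to it. No old value is dropped, since none existed.
    slot.write(f())
}

pub fn demo() -> (u8, usize) {
    // Stand-in for storage that was reserved elsewhere (EWRAM on the GBA).
    let mut storage = MaybeUninit::<[u8; 1024]>::uninit();
    let buf = init_in(&mut storage, || [7u8; 1024]);
    (buf[0], buf.len())
}
```

With a by-value parameter, the caller would have to materialize the whole array before the call. With a closure, the callee decides when and where the value comes into existence.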
Drop Safety and Order Guarantees
Allocation and deallocation are (EWRAM) stack-based and drops must happen in strict reverse order. This is ensured by the compiler and the borrow checker.
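The reverse drop order can be observed with the ordinary-stack backend described later in this post. In this sketch, `Noisy` and `drop_order` are hypothetical helpers that log each drop. The slots are temporaries bound by `let`, so they are dropped at the end of the block in reverse order of creation.

```rust
use core::cell::RefCell;
use core::marker::PhantomData;

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

// Ordinary-stack backend, used here as a stand-in for the EWRAM one.
impl<'a, T> Rs<'a, T> for Option<(T, PhantomData<&'a ()>)> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        let (value, _marker) = self.insert((f(), PhantomData));
        value
    }
}

/// Records its name in a shared log when dropped.
pub struct Noisy<'l> {
    name: &'static str,
    log: &'l RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

pub fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let first = &mut None;
        let second = &mut None;
        let a: &mut Noisy = first.init(|| Noisy { name: "first", log: &log });
        let b: &mut Noisy = second.init(|| Noisy { name: "second", log: &log });
        let _ = (a.name, b.name); // both values are alive here
    } // the slots leave scope: `second` is dropped before `first`
    log.into_inner()
}
```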
Choosing the Stack at the Call Site
The real power of Rs is that it’s not tied to EWRAM.
The same trait is implemented for multiple storage backends, including:
- Ewram: EWRAM-backed stack allocation
- None: normal Rust stack allocation
This means the caller decides where memory lives, without changing the API.
Using None as a Stack Slot
On the normal stack, Rs is implemented for:
Option<(T, PhantomData<&'a ()>)>
The PhantomData<&'a ()> does not exist at runtime. Its sole purpose is to tie the stored value to the borrow lifetime of the slot, ensuring that the value cannot outlive the slot.
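That the marker costs nothing can be checked with `size_of`. This is a quick host-side check; `marker_is_free` is a hypothetical helper name.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

/// True if the lifetime marker adds no bytes to the stored value.
pub fn marker_is_free() -> bool {
    // PhantomData is a zero-sized type...
    size_of::<PhantomData<&'static ()>>() == 0
        // ...so pairing it with a value does not change the value's size.
        && size_of::<(i32, PhantomData<&'static ()>)>() == size_of::<i32>()
}
```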
From the caller’s perspective, this is completely invisible:
let cx = &mut None; // normal stack (IWRAM on GBA)
let value: &mut i32 = cx.init(|| 0);
*value += 3;
The same generic function works unchanged:
fn init_default<'a, T: Default>(cx: &'a mut impl Rs<'a, T>) -> &'a mut T {
    cx.init(|| Default::default())
}
And the call site controls placement:
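let stack_slot = &mut None; // bind the slot first so it outlives the reference
let a: &mut i32 = init_default(stack_slot); // normal stack
let ewram_slot = &mut Ewram;
let b: &mut i32 = init_default(ewram_slot); // EWRAM stack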
No code duplication. No cfg flags. No runtime cost. No heap allocation. The call site decides where memory lives.
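Here is a host-runnable sketch of the same idea with two backends. A desktop has no EWRAM, so the second backend, `HostRam`, is a hypothetical stand-in that keeps the value behind a Box purely to have a second place for it to live. It mirrors the shape of the enum above but is not the EWRAM implementation.

```rust
use core::marker::PhantomData;

pub trait Rs<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T;
}

// Backend 1: the ordinary stack.
impl<'a, T> Rs<'a, T> for Option<(T, PhantomData<&'a ()>)> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        let (value, _marker) = self.insert((f(), PhantomData));
        value
    }
}

// Backend 2: hypothetical host stand-in for a second memory region.
pub enum HostRam<'a, T> {
    Empty,
    Initialized(Box<T>, PhantomData<&'a ()>),
}

impl<'a, T> Rs<'a, T> for HostRam<'a, T> {
    fn init(self: &'a mut Self, f: impl FnOnce() -> T) -> &'a mut T {
        *self = HostRam::Initialized(Box::new(f()), PhantomData);
        match self {
            HostRam::Initialized(value, _) => &mut **value,
            HostRam::Empty => unreachable!("slot was just initialized"),
        }
    }
}

fn init_default<'a, T: Default>(cx: &'a mut impl Rs<'a, T>) -> &'a mut T {
    cx.init(|| Default::default())
}

pub fn demo() -> (i32, i32) {
    // The slots are bound with `let` so they outlive the references.
    let stack_slot = &mut None;
    let host_slot = &mut HostRam::Empty;
    let a: &mut i32 = init_default(stack_slot);
    let b: &mut i32 = init_default(host_slot);
    *a += 1;
    *b += 2;
    (*a, *b)
}
```

`init_default` never learns which backend it was given. Swapping the slot expression at the call site is the only change needed.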