Intro
At the very beginning of my career I found it hard to grasp the notion of memory layout and stack. Even when I read about it several times, it didn’t feel solid. That’s why I drew drew and drew. Because my mind prefers pictures, analogies, metaphors and examples! So, I’ve made that. May be someone finds it useful.
Picture
Metaphor and Analogy
I was thinking about a good metaphor, but that was the extent of my imagination… Let’s say some weirdo (let’s call him Gargaralian) is piling up all the boring books he has π under the table because the appartment is too small and the luxury of having a bookcase is not an option available for him. Besides, the table is a little loose, so that pile of books could be like the fifth leg. Something like that:
He is piling them up to reach the table top. He has limited space from the floor to the table top, around 4294967295
of hydroget atoms (he’s a weirdo, thus he counts table heights in atoms of hydrogen π€ ). For normal people, it’s about 43 centimetres (around 17 inches). With each book π the distance to the table top is decreased β¬οΈ . So, when he puts the first book, the distance becomes smaller - 39 cm. I think, you get the main idea. The same is with memory for each program: each one of them is given a limited space and it should decide wisely how to use it. Some programs might fail that task and have an overflow (when there are more books than the table height lets us pile), some underflow (when there are on the contrary not enough books and the table gets loose), but that’s whole another story.
We will go from the top to the bottom, from the highest address (0xffffffff
) to the lowest (0x00000000
). As an example, I’ll use ping
program. So, we have 6 books…
Memory Segmentation
Cmd args and env vars
This is our first π, right on the floor. We are executing a program with some command line argument which now has the highest address in memory (around 0xffffffff
):
ping ya.ru
This goes on the very bottom of the pile. There is pretty much of space left!
Stack
Consider the following code:
void func π plough(){
# Plough the soil to plant the seed
}
void func π± seed(int seed){
# Plant the seed
plough()
}
void func π₯ grow_rose(int seed){
# grow and get a rose
seed(seed)
}
grow_rose(4)
To grow a rose π₯, one must first plant and nurture a seedling π±. But before planting the seed, one must first plough π the soil. So, here are the steps: π₯ get_rose
<- π± seed
<- π plough
.
In other words:
- π
plough
- π±
seed
- π₯
grow_rose
Depending on the system, the stack grows up (towards the highest address) or down (towards the lowest address). Let’s assume we are dealing with a system where stack grows down. There are three important registers that hold the value that’s different for each function: ESP
pointing to the top of the stack (aka stack pointer), EBP
pointing to the start of a frame of a function (aka base pointer) and EIP
pointing to the code currently being executed (aka function pointer).
Each function’s belongings on the stack are called a stack frame. Each function gets one. I don’t know about you, but I’ve watched my fair share of detective movies and series, and when they show someone getting arrested, they come to jail, register and leave all their belongings. They get them back when they are released. You might relate this to a stack: please save this and that until I return.
The EBP
value is needed in order to “delete” the bunch of stuff stored on the stack for this particular function. They are not actually deleted, they will eventually get overwritten, but since EBP
is no logner pointing there, and ESP
is restored to the value of the EBP, they
are not available for the code.
Here is a little picture of a stack, that shows how EBP
can be moved up and down. Here is how the stack is built:
So, here is how our stack will look like when when the π function is executed:
On the other hand, ESP
is moving up and down WITHIN the given frame, with each addition or removal from the stack. EBP
stays the same WITHIN the frame and changes only when the frame is changed. Since all local variables are added at the beginning of the function in assembly (regardless of where those were added in the source code), and CPU knows the number of those local variables, it can use EBP as a base and calcilate the address of each variable by adding multiples of 32-bits (8 bytes) to the EBP. It’s pretty much the same thing as with relative virtual addresses and image base described here.
Below is a picture of one frame to visualise the difference between EBP
and ESP
:
In the pictures above, fingers pointing to the right ππΎ are EBP
pointers, palm-front βπΎ is ESP
and ππΎ is a saved EIP
(return address). Each function has a finger of different colour, making tracking the changes in the pictures easier. The π₯ grow_rose
’s base pointer is of the darker colour ππΎ, and its EIP
(function pointer) is of the same colour but pointing to the left ππΎ. The π±seed
function’s EBP
π and EIP π are yellow. And last but not least, the π plough
function’s EBP
ππΌ and EIP
ππΌ are both beige.
Now, when the π plough
function returns, the CPU retrieves the π±seed
function’s EBP
π from the π plough
function’s stack frame so that now EBP
points to the base of this stack frame, which is now π±seed
function’s frame. It also returns the EIP
value from the stack, knowing which line of code to execute next.
Just in case you are wondering, return values is stored in the
EAX
register, not on the stack.
Something similar happens to the EIP
register.
void func π plough(){
ππΌ # Plough the soil to plant the seed
}
void func π± seed(int seed){
# Plant the seed
π plough()
}
void func π₯ grow_rose(int seed){
# grow and get a rose
ππΎ seed(seed)
}
grow_rose(4)
When the callee function returns, thanks to the saved EIP
value on the stack, the CPU knows where to resume the code. Think of the EBP
as a bookmark for stack and EIP
- as a bookmark for code.
When a function is called, it prepares a nest πͺΊ for itself, a cosy one (like a pregnant woman during the nesting period). It’s called a function prologue. After all the fun is over, the function “cleans the mess”, which we call the function epilogue.
push ebp <- function prologue for the main function
mov ebp,esp
sub esp,0x8
and esp, 0xfffffff0
mov eax,0x0
sub esp,eax
push 20
push 10
call Addme
...
push ebp # <- function prologue for the AddMe function
mov ebp,esp
sub esp,4
...
mov esp,ebp # <- function epilogue for the AddMe function, stack is cleaned, i.e. stackpointer is moved up
pop ebp # EBP restored
ret # EIP restored
A little note before we part, I remember being very confused about what were the following lines:
sub esp,0x8
and esp, 0xfffffff0
Let’s start with the first line, sub esp,0x8
. Remember that the stack is growing towards the LOWEST address? So, the ESP
pointer gets smaller each time the stack grows. When we ADD something to the stack, we need to SUBTRACT from the stack pointer. When we want to DELETE something, we ADD. A little confusing, I know. Remember, though, on some systems stack grows downwards!
If you see the source code for the assembly above, or if you prefer the hardcore way, you may notice that there are two local variables, integers 20
and 10
. Integers on a 32-bit system are 4 bytes in size. So, two integers is 8 bytes. By substracting 8 from the ESP we are moving the stack pointer down towards the lowest address and thus preparing the nestπͺΊ for the two local variables.
The second line, and esp, 0xfffffff0
is for the alignment purposes as all addresses need to be aligned to 16-byte (in case of 32-bit systems at least) for some operations. The 0xfffffff0
mask used with AND operand clears the last bytes of the ESP. Here is what this mask looks like in binary:
As you can see, the last 4 bits are equal to 0
. When 1
is AND
ed with 0
, you get 0
. When something get’s AND
ed with 1
, it remains as it was. So, for the first 28 bits all 1
s will remain 1
s and all 0
s will remain 0
s. For the last 4 bits, regardless of the original value, all the 4 bits will be set to 0
. Pretty much as if we were multiplying by 1
or 0
, same effect here.
The following line mov eax,0x0
, I assume, is for making sure that the EAX
register is cleaned before used (it’s always used for the return value on Windows and some other systems). However, I don’t yet understand what that’s for: sub esp,eax
.
π BTFM: To see the stack frame in
gdb
:x/16xw $esp
A little about local variables
int main() {
int stack_var = 4;
}
void function() {
int stack_var = 5;
}
Both main
and function
have a stack_var
. They are not the same. They only share the name, but in fact these only exist while the function’s context exists. These are local variables.
Yet another example
Say, our custom ping program has a function dns_request
. Let’s have a closer look:
main(int argc, char *argv[]) {
static int dns_request;
//make request
dns_request("ya.ru");
}
function bool dns_request (char domain_name[]) {
bool result;
struct dns_request;
dns_request.domain = domain_name;
...
for (int i = 1; i < 10; i++) {
somefunc(dns_request);
}
...
return result;
}
There is an interesting article about primary function.
What goes right on the stack? In general, everything that dies after function’s finished executing and before it returns. Specifically (from the snippet above):
- function arguments (
char domain_name[]
) - Local variables (
bool result
,int i
)
Depending on archetecture and calling conventions, the code may differ a little, but I would like to show a snippet of C and the corresponding assembly below:
function void _func1(){
//assembly equivalent
//push 3
//push 2
//call _func2
_func2(2,3);
return 0;
}
function int _func2(int arg1, int arg2){
//function prologue (behind the curtain --> no equivalent in C)
//push ebp
//mov ebp, esp
//mov eax, [ebp + 8]
//mov edx, [ebp + 12]
//add eax, edx
int sum = arg1 + arg2;
//pop ebp
//ret
return sum;
}
What goes on the stack (in assembly)? Everything that was push
ed (_func1
: 3
and 2
; _func2
: ebp
). So, in assembly, it’s obvious what goes on the stack.
Here is an interesting article about the different layers of memory of a C-program. The author, among other things, states that we cannot use local variables to return values because the stack is freed after program execution. But the thing (is far as I know) is that the return values are put into the eax register before returning, so they are not technically on the stack any more. Of course, if you use not the value from the returned function but an address of the local variable (which would be utterly weird), then yes, there is no guarantee it wasn’t overwritten.
What I’d also like to mention in the context of stack is that even though they say it’s freed, it’s not actually released per se. There is an esp register, which is very interesting.
Is underflow specific to stack?
Heap
Stack vs Heap article.
Doesn’t have a fixed size. The growth of the heap moves downward toward higher memory addresses. So, heap grows toward stack and stack - toward heap. One day when they meet (if they meet), you’ll deal with stack or heap overflow.
Variable need to be allocated first with a malloc()
function. Pointers are usually used to reference heap variables.
int main() {
heap_var = (int *) malloc(4);
}
If it can’t allocate memory, it will return a NULL
pointer (0x0
). There once was a vulnerability in BootROM on iOS devices regarding this (Alloc8
exploit).
# Good code
void *pointer = malloc(size);
if (pointer == NULL) {
// handle error
} else {
// pointer is valid, continue
}
# Bad code in alloc8 vulnerable BootROM
void *pointer = malloc(size);
if (pointer == 0x8) {
// handle error
} else {
// pointer is valid, continue
}
For some reason, the return value was compared to 0x8
, not NULL
or even 0x0
. Typo?
After the memory is no longer needed, you must call free()
on this memory location. However, it will still be reused once the new malloc is called, even after freeing the memory space. It depends on the system, how the allocation functions are implemented, and the block size to allocate.
.BSS
int global_var;
int main() {
static int uninit_var;
// some code
}
This memory segment holds uninitialised global and static variables. Have a fixed size. Writable.
.Data
You can read about static variables, functions, classes and class members here.
This memory segment holds initialised global and static variables. Have a fixed size. Writable.
int global_var = 4;
int main() {
static int uninit_var = 5;
// some code
}
.Text
It contains the actual code of the program. It’s usually (and better be) read-only to prevent changes to the code. Also, different instances of the same process can share this memory portion.