Logo
RSS Feed

Overview


Created: 27.09.2020

The idea of a buffer overflow attack is quite simple, though the implementation might initially be difficult to digest. I am exploring this notion and visualising it in this article.

Mechanics

Imagine you have a job you don’t really like (I think most of us have had such an experience at least once in our life 😒). And also, imagine that you are being highly underpaid. Your job is simple and ridiculous: place rabbits πŸ‡ into one set of boxes and foxes 🦊 - into another. You happen to have 6 boxes πŸ“¦πŸ“¦πŸ“¦πŸ“¦πŸ“¦πŸ“¦ for foxes 🦊 and 4 πŸ“¦πŸ“¦πŸ“¦πŸ“¦ for rabbits πŸ‡. However, you were given 4 rabbits πŸ‡πŸ‡πŸ‡πŸ‡ and 7 foxes 🦊🦊🦊🦊🦊🦊🦊. So, one fox 🦊 doesn’t have its fox box πŸ“¦. However, there are also rabbit πŸ‡ boxes πŸ“¦ right nearby. So, even though you were explicitly told not to place foxes into rabbit boxes and vice versa since you don’t give a shit, you put a fox into a rabbit fox. What happens? One can only guess… πŸ€”. Long story short: the fox eats the rabbit, and now you have 7 foxes and only 3 rabbits πŸ‡. Alas! If only you’d followed the manual πŸ“–…. Most likely, one would get fired after such a mistake, but we can’t fire the compiler, so that would happen if the developer was not using a memory-safe language or wasn’t careful enough. Now, to the technicalities.

Below is a short example of such a vulnerability in code (C language). Let’s read it line by line and understand the mechanics.

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
  //init two arrays (buffers), 4 bytes for rabbits and 6 bytes for foxes
  char rabbit_boxes[4], foxes_boxes[6];
  
  //put 4 rabbits into rabbit boxes and 6 foxes into foxes boxes
  strcpy(rabbit_boxes, "rrrr");
  strcpy(foxes_boxes, "ffffff");
  
  //print the contents of these two arrays in memory before buffer overflow occurs
  printf("[BEFORE] Foxes boxes are at %p and contain %s", foxes_boxes, foxes_boxes);
  printf("[BEFORE] Rabbit boxes are at %p and contain %s", rabbit_boxes, rabbit_boxes);
  
  //if the length of the argv[1] exceeds the size of the foxes boxes array, you'll have your overflow. Some amount of rabbits in rabbit boxes will be "overflown", i.e. eaten
  printf("[STRCPY] Putting %d foxes (bytes) into foxes boxes", strlen(argv[1]));
  
  //print the contents of these two arrays in memory after and if buffer overflow occurs
  printf("[AFTER] Foxes boxes are at %p and contain %s", foxes_boxes, foxes_boxes);
  printf("[AFTER] Rabbit boxes are at %p and contain %s", rabbit_boxes, rabbit_boxes);
}

The above example is just a technical representation of the story at the beginning of this article.

A stack is a data structure and an area in memory that stores local variables, arguments, return addresses, return values, and frame stack pointers. It’s also used to save data temporarily. When a function calls another function, the caller’s data (registry data, EIP and EBP) is temporarily stored in the stack until the callee returns.

Let’s assume we have a cruel kill_rabbit πŸ‡ function:

#include <stdio.h>
#include <string.h>

int kill_rabbit(char* name) {
  char rabbit_names[10]; 
  strcpy(rabbit_names, name); 
  return 0;
}

int main(int argc, char *argv[]) {
  int result = kill_rabbit("VeryLongRabbitNameThatWillOverflowBuffer");
  if (result == 0){
    printf("The rabbit is alive and will eat all your carrots πŸ₯•\n");
  }
  else {
    printf("You've killed the rabbit! The πŸ₯•s are saved. Was it worth it?\n");
  }
  return 0;
}

At the moment of the function call, the stack is arranged in the following way:

img But what if the fox 🦊 still wants to kill Roger πŸ‡? How to help him? If the fox types a long name of a rabbit, like, IHateThisRabbitIHateThisRabbitIHateThisRabbitIHateThisRabbitIHateThisRabbitIHateThisRabbitIHateThisRabbitDieRabbitDieDieRabbitDieDieRabbitDieDieRabbitDieDieRabbitDieDieRabbitDieDieRabbitDie, the strcpy function will still copy this name to the buffer. And here are two problems from the πŸ‡ perspective:

  1. strcpy function does what it says and trusts the user entirely. It does not check the length of the provided input before copying it. It will continue copying character by character until it sees an end of the string symbol, usually a null byte /0x00.

img 4. The second problem is that the buffer is on the stack alongside other data, including our return address. At some point, the strcpy will overwrite something valuable. The trick is to find out the exact length of the string needed to

Exploitation Steps

  • Identify a buffer to overflow.
  • Find a value you want to overflow.
  • Β Calculate the distance in bytes between the buffer and the value:
    • msf-pattern create -l 1400 (to generate an excessively long string of random characters). For example, we have developed a string, and some portion of it contained [...]lksdjfk[...].
    • Look at the value of the registry or other buffer that is the target of BO. Translate the hex to ASCII and search this sequence in the above-generated string. Let’s say we see the EIP registry set to 0x73646a66, which in ASCII is sdjf.
    • Find the exact offset of the substring above within the string: msf-pattern_offset -q 73646a66.

πŸ“• RTFM

  x/s buffer # represents the value at buffer as a string
  x/x value # represents the value as a hex integer 4 bytes long
  
  print &value - &buffer (or &buffer - &value) # for arithmetic operation
  
  x/16xw buffer # print 16 words starting from the beginning of the buffer

Mitigation

How does one stop vulnerabilities of this type? How do we save the rabbit πŸ‡?

Check the boundaries

The first problem was that strcpy did not check if the data fit into the buffer. The solution is simple - check! In other words, use the function that does or implements that check on the interpreter’s or compiler’s level.

Stack canaries πŸ₯

Stack canaries are an integrity check mechanism.

There are several types of canaries:

  1. XOR random
  2. random
  3. null
  4. terminator 🦾
  5. 64-bit
  6. custom

Most canaries start with a null byte. That prevents some buffer overflow attacks, since it’s a tricky business to write several null bytes to a location, the string ends with the first one. Other null bytes will be discarded. However, some functions can still write several null bytes to memory (read, for example).

It’s still a practised technique, but in the past, miners would use a canary to determine if a mine was safe to enter. They would release a canary into the mine and observe if it would return. It was indeed a cruel method, and I acknowledge that. However, this is the historical origin of the term.

It means that we put some random value on the stack and keep a copy of it in some safe place. Before the function returns and passes the torch πŸ—½ to the calling function. The CPU then can check that the data is not corrupted by comparing the value on the stack and someplace safe. Look at the following picture:

img First, look at the picture on the left. This is what the stack looks like when all is in order, nothing is overflown, and the rabbits πŸ‡ are safe. The canary value goes between the local variables that are potentially dangerous and the data we are trying to protect. Now, look at the picture on the right. The buffer was overflown, corrupting the EBP and EIP saved on the stack, and also the canary πŸ₯. When the CPU checks this canary value, it doesn’t match the one it remembers, and an exception can be raised.

img

Since the value of the canary πŸ₯ is random, you won’t be able to overwrite the buffer so that the canary remains intact. How to circumvent it, then?

Bypass Techniques

  1. Recompile the executable without the canary flag set.
  2. See if there is a way to guess/calculate the canary’s value.
  3. Use another vulnerability (memory leak) to see the canary value, for example, format strings.
  4. Brute force. Less feasible for 64-bit systems but for 32-bit systems. One byte (8 bits) is reserved for the null byte. So, we have only 24 bits to guess. In the worst-case scenario, it would take 2^24 guesses to get the correct value (some compilers use the same canary during execution.
  5. Overwrite the values, overwrite the SEH AND trigger an exception before the canary is checked. # Binary Exploit Development - SEH Based Overflow SEH Exploit

DEP

DEP stands for Data Execution Prevention and is a flag that marks some regions as non-executable. For example, stack and heap are not supposed to contain code, only data. Something is only executed from the stack or heap when something malicious occurs. It only makes sense that the data is locked for execution.

Bypass Techniques

  1. ROP - return-oriented programming.
  2. WriteProcessMemory
  3. VirtualAlloc https://www.youtube.com/watch?v=phVz8CqEng8

ASLR & PIE

ASLR stands for Address Space Layout Randomisation (ASLR). PIE stands for Position Independent Executables.

If you haven’t already, I suggest reading about memory addressing here. In short, when a program is loaded into memory, it has specific preferred base addresses for different sections of the program (code, stack, heap). Without ASLR, these base addresses are usually the same and therefore predictable. This is why carrying out buffer overflow attacks can be easier when you know where things are. ASLR removes this advantage from the attacker. However, it doesn’t mean that it completely eliminates buffer overflow attacks; it just makes them harder to execute.

PIE is the feature of an executable that makes it work when ASLR is enabled. Without PIE a program would not be able to run and would likely crush or would not even be able to launch.

Bypass Techniques

  1. Brute-force. Obviously, We have limited number of addresses and we still need to maintain the general memory layout. So, the chances are at some point you’ll guess. Would you try?
  2. Information leaks. Format string vulnerabilties can be leveraged to leak the addresses and help calculate the offsets and base.
  3. ROP. More about it later.
  4. NOP sled or JOP (jump-oriented programming).

References

Expand…

Hacking: the Art of Exploitation, J. Ericson https://www.sans.org/blog/stack-canaries-gingerly-sidestepping-the-cage/ https://book.hacktricks.xyz/reversing-and-exploiting/linux-exploiting-basic-esp/bypassing-canary-and-pie

https://www.youtube.com/watch?v=8kYTDK9oKV8 - bypass DEP with WriteProcessMemory https://www.youtube.com/watch?v=phVz8CqEng8 - bypass DEP with VirtualAlloc