8

I'm learning C in a tutorial and have reached the point where the term "buffer" is being mentioned regularly.

It has also mentioned how certain bad programming practices involving memory can be "vulnerable to buffer overflow". It defines the buffer as:

A small amount of memory reserved for the input source

and though I've heard about buffer overflow attacks in relation to malware, I have never understood what one is or how it actually works, particularly from a programming perspective.

Ideally, could this be explained in layman's terms, as I have very little computer theory knowledge?

Confuseduser
    You should read "Smashing the Stack for Fun and Profit": http://insecure.org/stf/smashstack.html –  Jun 25 '13 at 21:01

5 Answers

9

A buffer is a pre-allocated area of memory where you store your data while you're processing it. Basically, it just means that the range from a certain memory address up to that address + x bytes is reserved for holding data. In C, a buffer is usually declared as an array.

A buffer overflow happens when you write more data than fits into the buffer and thereby overwrite the memory beyond address + x. You might have done this by accident before and noticed that your program crashed. The problem is that, for a buffer on the stack, somewhere beyond the buffer lies the return address (it points to the instruction that will be executed once the current function returns), and if you overwrite it with random data your program will crash. However, if you manage to load machine code (compiled instructions that the CPU can execute directly, often called shellcode) into memory and make the return address point to it, then you can execute your own code on that machine.

Now you might think that this is not really an issue if you are running the program locally, but imagine programs like SSH or FTP servers which run on the internet, or a restricted environment where certain programs run with elevated privileges. If you were able to execute code within the context and privileges of the other program, you might be able to break out of your restrictions or take over a remote server.

If you want to know more about assembly, buffer overflows and shellcode, I suggest buying The Shellcoder's Handbook. It's THE book to learn this stuff.

Lucas Kauffman
0

To better understand buffer overflows, I can suggest this site. It contains an extremely clear explanation, but it doesn't explain how to take advantage of the strange situation that occurs after the overflow. The following site covers that part better.

Before you read the guides, please keep these few points in mind: memory is divided into segments, and every segment is divided into items, each located at a certain offset inside that segment. Every item is accessed at its starting position (offset) and occupies as many bytes as its type needs. So you should be aware of what "Segmentation fault" means, and NEVER go out of bounds when programming.

The second guide uses two operating systems, Linux and Windows. This pairing is important, because you'll see that the address loaded into the Instruction Pointer (0x7c9d30d7) is written in reverse byte order in the exploit code (buff = '\x90'*230+'\xd7\x30\x9d\x7c'+'\x43'*366). The target is an FTP server that is vulnerable to a stack overflow, but the concept behind the scenes is the same as when writing and running a small C program with keyboard input, as you proposed. At the end of the guide, the attacker listens on TCP port 443 on the Linux host and receives a shell prompt (the exploit opens a reverse shell) from the Windows host (the vulnerable FTP server).

With these guides, you should understand the phrase I wrote: "Make the software do something other than what it was originally programmed to do, using input data". As advice, I suggest you start with the first guide, then move on to the second, and then return to the first to clear up any obscure passages.


From a hacker's perspective, XSS, RFI or even SQL injection are all of little interest. The SQL language, for example, is used to retrieve (or modify) data in a database using boolean logic.

When you need information about tables, columns and records, you pass conditions to the SQL engine as parameters; these conditions are combined with the logical operators AND, OR and NOT.

Imagine adding (injecting) a new condition (the syntax must be correct) to a predefined SQL statement, so that the resulting condition is always true, unlike the original restricted one. In this way, you can bypass the restriction. That's it: extremely simple, with no hacking techniques or skills needed.

By the way, an exploit such as the NSA's EternalBlue (and its derivatives) is highly valued, because it overwrites part of the authentication process (challenge-response), connects to resources without a specified password and injects new code using a hidden interprocess communication channel on unpatched Windows machines running SMBv1.

vakus
0

As you wrote, a buffer is a small amount of memory (e.g. 16 bytes). When I write more bytes into the buffer (e.g. 20 bytes) than its capacity allows, this is called a buffer overflow.

If the data in the buffer comes from the outside, this is a security flaw, as the extra bytes are written into a memory area that is used for other purposes.

When the other purpose of the overwritten bytes was to contain program code, you can imagine what happens then.

Uwe Plonus
0

In C, buffer overflows most commonly happen when the data copied into an array (buffer) exceeds its defined size. This explains it perfectly. Just read the C code there if you're not familiar with assembly.

Nitaai
-1

Imagine you have allocated memory from the byte at position A to the byte at position B.

Also imagine you have some other information at position C, which is near position B or position A but does not overlap the memory allocated from A to B.

Now, every input/output operation from one memory location to another (dynamic or static memory) needs to be fetched, verified, executed and included in a list of sequential operations called "a program" (the Instruction Pointer (IP) gives the position in memory of the next operation in that list to execute).

Imagine writing outside the bounds of the pre-allocated memory, using one or more assembly instructions (or by supplying more input data at runtime than expected), and overwriting the portion of memory where the saved value for the IP resides so that it points to a new mini-program (another short list of sequential operations) inside the main program. In short, you have taken control of the main program.

This is the concept behind hacking or cracking: using input data, make the software do something other than what it was originally programmed to do.

guntbert
  • To make your statements better understandable you might want to add a graphic (about A,B,C and the IP). – guntbert Jan 07 '19 at 17:00