Exploiting a stack buffer overflow caused by the memcpy function


If you need the environment and tools used in this experiment, ask in the comments. I won’t introduce them one by one: there are a lot of them, probably few people would read the list, and I don’t want to type it all out.

To study this vulnerability, we first need the theory behind it.

When a program needs to call a function, it generally executes call followed by the function’s address. The call instruction pushes the function’s return address (the address of the instruction to execute after the function finishes) onto the stack. The stack itself grows from high addresses toward low addresses.

This can be hard to picture, so here is an example. Suppose that when I execute the call instruction, the return address is pushed into stack slot 10000. The local variables of the called function also need space on the stack. If I declare an array of 10 elements, the 10 slots from 9990 to 9999, just below the return address, are reserved for it (this simplified picture ignores the saved EBP that normally sits between the two). When data is written into the array, however, it is written from low addresses toward high addresses: the first element goes into 9990, the second into 9991, and so on. If we reserved space for 10 elements but actually copy 20, the extra elements keep being written up the stack and overwrite the stored return address of the function.


After memcpy has finished and the overflow function is about to return, the stack is balanced: the function’s local variables are released, the saved value just below the return address is popped into EBP, and RET then pops the return address into EIP. EBP is the base pointer of the current stack frame, and EIP holds the address of the next instruction to execute.

Let’s think about it. If we get malicious code onto the stack as the copied data, and then make the function’s return jump back into the stack to execute it, haven’t we exploited the stack overflow?

So the questions are: 1. If the data we store overflows the stack, at which byte does it start to overwrite the return address? 2. How do we make EIP, which should return to the main program, go back into the stack and execute the malicious code we stored?

First, here is the experimental program that contains the overflow vulnerability:


As you can see, the main function calls the overflow function and passes in the shellcode as its argument. The overflow function calls memcpy, which copies the data into an array. The array is declared with a length of 10, but the shellcode we pass in is larger than the array.

Experimental idea: first find out which part of the shellcode overwrites the return address of the overflow function on the stack while the program runs. Then change that part of the shellcode to the address of a JMP ESP instruction (which jumps back to the stack), and replace the shellcode bytes that follow it with the bytecode (the compiled machine code) of the malicious code we want to execute. That completes the overflow exploit. This may sound abstract, but it becomes clear when traced in the debugger, so let’s record the debugging process.

I debug with VC++ 6.0, because it shows how the stack changes as well as the disassembly and the bytecode. I used to find it very difficult; now I find it very powerful, haha.

Set a breakpoint: place the cursor on the line of code where you want to break and click the little hand icon. A red dot appears, which marks the breakpoint:


Then click the button marked by the box to start debugging; execution stops at the breakpoint. If the program has not been compiled yet, the IDE will prompt you to compile it; just do so.


Once execution has stopped at the breakpoint, click the three icons I circled. From left to right, they display the registers, the memory (where you can later look up the contents of the stack), and the disassembly.



Then step forward. The circled buttons are step over and step into. I generally use step over. If you end up somewhere you don’t understand, look it up on Baidu.


In the disassembly you can see call memcpy, then pop ebp, which restores the saved EBP from the stack, and then ret, which pops the return address and resumes execution there.


This is the content of the stack once the instruction just before pop ebp has executed. pop ebp pops the value that ESP points to into EBP, and ret then pops the next value as the return address. The bytes in that slot are 46 47 48 49, which the machine reads as 49484746 (this is a matter of big-endian versus little-endian storage; I won’t explain it here, look it up on Baidu).


Continuing from here raises an error: ret loads 49484746 into EIP, and the machine tries to execute the next instruction at that address, but there is no code there, so it reports an error.


So we have solved the first problem: the shellcode overwrites the return address starting at byte 46. Now for the second problem, making the return land back in the stack. For this we can use a JMP ESP instruction that already exists in a dynamic link library, and a tool can find one for us. Double-click findjmpesp.exe and it lists all of them. Let’s choose the first one.


Then write this address into the shellcode in place of 46 47 48 49, so that it is copied onto the stack instead. Careful readers will remember that those bytes were read off the stack as 49484746, so the address must be written into the shellcode in reverse byte order: the address 77e424da is written as da 24 e4 77.

Recompile and debug again to check that the instruction executed after the return is now JMP ESP.


The next step is to see which code executes after jumping back to the stack. The debugger disassembles it as push and pop instructions, but these are really just the shellcode bytes stored on the stack: the bytes after 49, namely 50, 51, and so on.




So the third step is to write the bytecode of the malicious code we want to execute at that position, and the program will go on to perform the operations we want. Producing bytecode is the compiler’s job, so let’s first write the malicious code as a C program. Here I use the one given by the teacher.

Here is the program. Its body is assembly code: it calls a function that pops up a message box and then calls a function that exits the program. I did not write this program myself. We need to find the absolute memory addresses of the two functions and fill them in:


How to find the absolute memory address of a function:

MessageBoxA is in user32.dll, where its relative address is 3d8de. The base address of user32.dll is 77e10000. When the program runs, the address of the function is the base address plus the relative address, so we add the two numbers and get 77e4d8de.



The other function is left for you to calculate yourself:


Then we compile and run createshellcode and see what the program does.


Then we take the bytecode of the ASM part of that program (pop up the window and exit the program) and write it into the shellcode in overflow.c, right after the JMP ESP address. The disassembly view shows the bytecode. All of the bytes of the ASM part have to be copied; I know it’s a lot of work. There may be scripts for this on the Internet, haha, but I copied them by hand.


Then go back to overflow.c, compile it, and run it.


This is the result of the execution. Careful readers may notice that the text in my two pop-up windows is not the same. That is because I changed the text of the earlier pop-up window, while this one uses the characters from the original bytecode. The stack overflow was carried out successfully~~~



That’s all for this experiment!