C++ disassembly: function calling conventions

Time:2022-5-6

Functions exist in every high-level language. Organizing a program into functions makes it more readable and is the essence of modular design. Today we will explore how functions are implemented: how the compiler implements a function call, and how parameter passing and calling conventions look when simulated in assembly language.

Any discussion of functions has to mention calling conventions, which depend on the stack. The stack is a special region of memory that follows the first-in, last-out principle; the push and pop instructions store data on it and remove data from it. The stack occupies a contiguous block of memory and grows toward lower addresses. On 32-bit x86, ESP points to the top of the stack and EBP usually holds the base of the current stack frame, and each push or pop moves ESP by four bytes.
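The push and pop behaviour described above can be sketched as a toy model in C++. This is only an illustration, not how a real CPU is implemented: the `ToyStack` type, the `stack_is_lifo` helper and the starting address `0x1000` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kInitialEsp = 0x1000;  // hypothetical starting ESP

// Toy model of a 32-bit x86 stack: it grows toward lower addresses,
// and every push or pop moves ESP by 4 bytes.
struct ToyStack {
    std::uint32_t esp = kInitialEsp;
    std::vector<std::uint32_t> slots;  // slots.back() is the top of the stack

    void push(std::uint32_t value) {   // push: ESP -= 4, then store
        esp -= 4;
        slots.push_back(value);
    }

    std::uint32_t pop() {              // pop: load, then ESP += 4
        assert(!slots.empty() && "pop on an empty stack");
        std::uint32_t value = slots.back();
        slots.pop_back();
        esp += 4;
        return value;
    }
};

// Pushes 1, 2, 3 and pops them again. Returns true if the values come
// back in reverse order and ESP ends where it started.
bool stack_is_lifo() {
    ToyStack s;
    s.push(1);
    s.push(2);
    s.push(3);
    bool moved = (s.esp == kInitialEsp - 12);  // three pushes, 4 bytes each
    bool order = (s.pop() == 3) && (s.pop() == 2) && (s.pop() == 1);
    return moved && order && s.esp == kInitialEsp;
}
```

Calling `stack_is_lifo()` exercises both properties from the paragraph: last in, first out, and four bytes of ESP movement per operation.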

Compilers generally implement the following calling conventions:

  • Cdecl: the default calling convention of C/C++. The caller cleans the stack. Variadic functions are supported, and parameters are passed on the stack
  • Stdcall: the callee cleans the stack. Variadic functions are not supported. All parameters are passed on the stack by default
  • Fastcall32: the callee cleans the stack. Variadic functions are not supported. The first two parameters are placed in ECX and EDX, and the remaining parameters are passed on the stack
  • Fastcall64: the Microsoft x64 convention. The caller reserves and releases the stack space for arguments, and variadic functions are supported. The first four parameters are placed in RCX, RDX, R8 and R9, and the remaining parameters are passed on the stack
  • System V: the default convention on Linux-like x86-64 systems. The first six integer or pointer parameters are placed in RDI, RSI, RDX, RCX, R8 and R9, and the remaining parameters are passed on the stack
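As a rough illustration of how these conventions are selected in source code, the sketch below uses the MSVC keywords `__cdecl`, `__stdcall` and `__fastcall`. The `CC_*` macro names and the `add_*` functions are hypothetical. The macros expand to nothing outside 32-bit MSVC builds, so the code still compiles on other toolchains, where the platform's default convention is used instead.

```cpp
// Calling-convention keywords are compiler- and target-specific.
// These hypothetical macros expand to the MSVC keywords on 32-bit x86
// and to nothing elsewhere.
#if defined(_MSC_VER) && defined(_M_IX86)
#  define CC_CDECL    __cdecl
#  define CC_STDCALL  __stdcall
#  define CC_FASTCALL __fastcall
#else
#  define CC_CDECL
#  define CC_STDCALL
#  define CC_FASTCALL
#endif

// Same body, three conventions: only the generated call sequence and
// the stack cleanup differ, not the result.
int CC_CDECL    add_cdecl(int a, int b)    { return a + b; }
int CC_STDCALL  add_stdcall(int a, int b)  { return a + b; }
int CC_FASTCALL add_fastcall(int a, int b) { return a + b; }

// Calls each variant once and adds up the results.
int call_all() {
    return add_cdecl(1, 2) + add_stdcall(3, 4) + add_fastcall(5, 6);
}
```

Compiling this for 32-bit x86 and comparing the disassembly of the three call sites is a convenient way to see the differences listed above.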

A stack frame is the region between the stack-top pointer ESP and the frame base pointer EBP, with ESP at the lower address. The data addressable through a stack frame includes local variables, the function's return address, and the function's parameters. Every function call forms its own stack frame. When one function calls another, stack space is opened for the callee to form that function's own frame; when the call ends, the space it used is released and the frame is closed. Keeping ESP consistent across this process is commonly called stack balance. If the stack space is not fully released after use, or too much is released, the stack overflows or underflows and the program fails with a fatal error.
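The opening and closing of a frame can be modelled in a few lines of C++. This is a sketch under simplifying assumptions: the `Machine` type, the register values and `frame_is_balanced` are invented for illustration, and the comments show the x86 instruction each step stands for.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy 32-bit machine state. Addresses and values are illustrative only.
struct Machine {
    std::uint32_t esp = 0x1000;
    std::uint32_t ebp = 0x2000;
    std::map<std::uint32_t, std::uint32_t> memory;  // address -> 4-byte value

    void push(std::uint32_t v) { esp -= 4; memory[esp] = v; }
    std::uint32_t pop() { std::uint32_t v = memory[esp]; esp += 4; return v; }
};

// Models the usual frame setup and teardown and returns true if ESP and
// EBP are restored afterwards, i.e. the stack is balanced.
bool frame_is_balanced(std::uint32_t local_bytes) {
    assert(local_bytes % 4 == 0 && "locals are reserved in 4-byte slots");
    Machine m;
    const std::uint32_t esp_before = m.esp;
    const std::uint32_t ebp_before = m.ebp;

    m.push(m.ebp);         // push ebp       (save the caller's frame base)
    m.ebp = m.esp;         // mov  ebp, esp  (open the new frame)
    m.esp -= local_bytes;  // sub  esp, N    (reserve local variables)

    // While the frame is live, ESP is at or below EBP.
    bool frame_open = m.esp <= m.ebp;

    m.esp = m.ebp;         // mov  esp, ebp  (release the locals)
    m.ebp = m.pop();       // pop  ebp       (restore the caller's frame base)

    return frame_open && m.esp == esp_before && m.ebp == ebp_before;
}
```

Leaving out either teardown step in this model leaves ESP or EBP different from its value before the call, which is the imbalance the paragraph warns about.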

Cdecl, caller cleans the stack: Cdecl is the default calling convention of C/C++. The called function does nothing to balance its parameters. Instead, after the function returns, the caller adds the size of the arguments to ESP (4 bytes for each 32-bit argument, for example add esp, 4 for a single argument), which restores stack balance.

This convention lets the compiler apply copy propagation optimization and merge the parameter cleanup of several calls: after the calls have finished, the stack-top pointer ESP is balanced once for all of them. The convention can also be used for variadic functions.
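A variadic function shows why caller cleanup matters: only the caller knows how many arguments it pushed. The sketch below uses the standard `<cstdarg>` facilities; `sum_ints` is a hypothetical example function. On 32-bit x86 with cdecl, a call such as `sum_ints(3, 1, 2, 3)` typically pushes four 4-byte values and is followed by `add esp, 16` in the caller.

```cpp
#include <cassert>
#include <cstdarg>

// Sums `count` int arguments passed after `count`.
// The callee cannot know the argument count at compile time, so it
// cannot release the arguments itself with a fixed "ret N".
int sum_ints(int count, ...) {
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += va_arg(args, int);  // read the next int argument
    }
    va_end(args);
    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` returns 6, and `sum_ints(0)` returns 0.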

Stdcall, callee cleans the stack: Stdcall differs from cdecl only in how the parameters are balanced: the called function releases its own arguments when it returns (on x86, with ret N). Everything else is the same, but the convention cannot be used for variadic functions.

When functions are called several times in the same scope, cdecl can be more efficient than stdcall, because the compiler can use copy propagation to merge the stack cleanup of those calls, whereas stdcall balances the stack inside each called function and leaves nothing to merge.
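The difference in cleanup can be counted with a small model. This sketch tracks only ESP and the number of cleanup operations. It ignores the return address that call and ret push and pop, and it assumes the compiler merges all cdecl cleanups into one adjustment. `CleanupTrace`, `run_cdecl` and `run_stdcall` are hypothetical names.

```cpp
#include <cstdint>

constexpr std::uint32_t kStartEsp = 0x1000;  // hypothetical initial ESP

struct CleanupTrace {
    std::uint32_t esp;       // ESP after all calls and cleanup
    int caller_adjustments;  // "add esp, N" instructions in the caller
    int callee_adjustments;  // "ret N" cleanups performed by callees
};

// cdecl with merged cleanup: every call pushes its arguments, the callee
// returns with a plain "ret", and the caller releases everything once.
CleanupTrace run_cdecl(std::uint32_t calls, std::uint32_t args_per_call) {
    CleanupTrace t{kStartEsp, 0, 0};
    std::uint32_t pending = 0;  // bytes still to be released
    for (std::uint32_t i = 0; i < calls; ++i) {
        t.esp -= 4 * args_per_call;    // push the arguments
        pending += 4 * args_per_call;  // plain ret: nothing released yet
    }
    if (pending != 0) {
        t.esp += pending;              // one merged "add esp, pending"
        ++t.caller_adjustments;
    }
    return t;
}

// stdcall: each callee releases its own arguments with "ret N".
CleanupTrace run_stdcall(std::uint32_t calls, std::uint32_t args_per_call) {
    CleanupTrace t{kStartEsp, 0, 0};
    for (std::uint32_t i = 0; i < calls; ++i) {
        t.esp -= 4 * args_per_call;    // push the arguments
        t.esp += 4 * args_per_call;    // "ret N" inside the callee
        ++t.callee_adjustments;
    }
    return t;
}
```

In this model three calls with two arguments each need one caller adjustment under cdecl and three callee adjustments under stdcall. Both leave ESP where it started.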

Fastcall, callee cleans the stack: Fastcall is generally the most efficient of the three, because it passes parameters in registers. The first two parameters (32-bit) or the first four (64-bit) are passed in registers, and the remaining parameters are passed on the stack. The 32-bit form of this convention does not support variadic functions.

For 32-bit code, ECX and EDX pass the first two parameters, and the remaining parameters are passed on the stack.
For 64-bit code, RCX, RDX, R8 and R9 pass the first four parameters, and the remaining parameters are passed on the stack.
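The register assignment above can be written down as a small lookup. This sketch covers only integer-sized parameters and ignores floating-point values, structures and other special cases. The function and constant names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Register lists taken from the text above.
const std::vector<std::string> kFastcall32Regs = {"ECX", "EDX"};
const std::vector<std::string> kFastcall64Regs = {"RCX", "RDX", "R8", "R9"};

// Returns where each of `count` integer-sized parameters is passed, in
// left-to-right order: a register name, or "stack" once the registers
// of the convention are used up.
std::vector<std::string> param_locations(const std::vector<std::string>& regs,
                                         std::size_t count) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(i < regs.size() ? regs[i] : "stack");
    }
    return out;
}

std::vector<std::string> fastcall32(std::size_t count) {
    return param_locations(kFastcall32Regs, count);
}

std::vector<std::string> fastcall64(std::size_t count) {
    return param_locations(kFastcall64Regs, count);
}
```

For a function with three parameters, `fastcall32(3)` yields ECX, EDX, stack, while `fastcall64(3)` keeps all three in registers.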

Using ESP addressing: With the O2 optimization option, to improve execution efficiency, the compiler no longer needs EBP as a frame pointer as long as the top of the stack is stable. It accesses local variables directly through ESP instead, which frees one register.

In the disassembly of a function compiled this way, variables are located relative to the current value of ESP.

In the original listing, ESP + 18 is the first parameter passed in, so the offset was already calculated by the compiler at compile time.

With ESP addressing, the function no longer has to set up EBP at the bottom of the stack on every entry. This saves the instructions that save and restore EBP, so it can effectively improve execution efficiency.

However, the offset has to be computed for every access. If ESP changes during the execution of the function, the offset must be recalculated the next time the variable is accessed.
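The recalculation can be seen in a small model that tracks one parameter at a fixed address while ESP moves. The addresses are made-up values chosen for readability, and `EspTracker` is a hypothetical helper, not part of any toolchain.

```cpp
#include <cassert>
#include <cstdint>

// Tracks the ESP-relative offset of one stack slot whose absolute
// address stays fixed while ESP moves.
struct EspTracker {
    std::uint32_t esp;            // current stack-top pointer
    std::uint32_t param_address;  // fixed address of the parameter

    void push() { esp -= 4; }     // any 4-byte push inside the function
    void pop()  { esp += 4; }     // the matching pop

    // Offset the compiler must encode in [esp + offset] at this point.
    std::uint32_t offset() const {
        assert(param_address >= esp && "slot must lie above the stack top");
        return param_address - esp;
    }
};

// Returns true if one push changes the offset from 8 to 12 and the
// matching pop brings it back to 8.
bool offset_follows_esp() {
    EspTracker t{0x0FF8, 0x1000};     // parameter sits 8 bytes above ESP
    bool before = (t.offset() == 8);  // [esp + 8]
    t.push();
    bool during = (t.offset() == 12); // same slot is now [esp + 12]
    t.pop();
    bool after = (t.offset() == 8);   // back to [esp + 8]
    return before && during && after;
}
```

The slot itself never moves. Only the distance from ESP changes, and the compiler has to track that distance at every point where the function touches the stack.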

Reference: Uncover the Secrets of C++ Disassembly and Reverse Analysis Technology