Address space randomization for app vulnerability scanning

Time: 2020-9-30

Preface

In the previous article, "App vulnerability scanner: local denial of service detection in detail", we learned that the Aliju Security vulnerability scanner combines static analysis with dynamic fuzz testing, and we described in detail how it detects local denial of service.

The Aliju Security vulnerability scanner also has a detection item called "address space randomization not used". It analyzes the ELF files contained in an app to determine whether they use this technique. If an app has this weakness, the bar for buffer overflow attacks is lowered.

This article introduces the principle of the technique and the scanner's detection method. Because the implementation details of PIE are complex, only the general principle is covered here. Readers who want the details can refer to Pan Aimin's book "Programmer's Self-Cultivation".

What is PIE

PIE (position-independent executable) is a technique for generating position-independent executable programs. If the compiler uses PIE when generating an executable, the address at which the executable is loaded into memory is unpredictable.

PIE has a twin, PIC (position-independent code). It serves the same purpose: the compiled program can be loaded at a random memory address. The difference is that PIC is used when generating dynamically linked shared libraries (.so files).

The role of PIE

Security

PIE raises the bar for buffer overflow attacks. It is one part of ASLR (address space layout randomization). ASLR requires every part of an executable program to be placed at a random location when it is loaded into memory, including the stack, heap, libs and mmap, executable, linker, and VDSO. PIE provides the randomization of the executable's own memory.
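The regions listed above can be observed from inside a process. The following is a minimal sketch, assuming a Linux system; the names struct layout, sample_layout and print_layout are hypothetical helpers introduced only for this illustration. It records one sample address from the stack, the heap, a library function, the executable's own data and an anonymous mmap region.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* One sample address from each region that ASLR can randomize. */
struct layout {
    uintptr_t stack, heap, lib, exe, map;
};

static int marker; /* lives in the executable's own data segment */

/* Collect the sample addresses; returns 0 on success, -1 on failure. */
int sample_layout(struct layout *out)
{
    int on_stack = 0;
    void *on_heap = malloc(16);
    void *mapped = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (on_heap == NULL || mapped == MAP_FAILED) {
        free(on_heap);
        if (mapped != MAP_FAILED)
            munmap(mapped, 4096);
        return -1;
    }
    out->stack = (uintptr_t)&on_stack;
    out->heap = (uintptr_t)on_heap;
    /* Usually inside libc; in a non-PIE build this may be a stub
     * inside the executable instead. */
    out->lib = (uintptr_t)&printf;
    out->exe = (uintptr_t)&marker;
    out->map = (uintptr_t)mapped;
    munmap(mapped, 4096);
    free(on_heap);
    return 0;
}

/* Print the sampled addresses on one line. */
void print_layout(const struct layout *l)
{
    printf("stack=%#lx heap=%#lx lib=%#lx exe=%#lx mmap=%#lx\n",
           (unsigned long)l->stack, (unsigned long)l->heap,
           (unsigned long)l->lib, (unsigned long)l->exe,
           (unsigned long)l->map);
}
```

Calling sample_layout and then print_layout from main and running the program several times shows which regions move between runs. With PIE the exe value should change as well; without PIE it is expected to stay fixed.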

Saving memory space

In addition to security, position-independent code also plays an important role in memory efficiency.

A shared library can be loaded by multiple processes at the same time. If its code is not position-independent (the code segment contains absolute address references), each process must adjust the library's code to the address at which the library is loaded in that process. As a result, the whole library has to be copied into each process. If 100 processes in the system use this library, there will be 100 copies of it in memory, which is a great waste of space.

Conversely, if the shared library is position-independent code and 100 processes use it, its code needs to be loaded into memory only once. This is because position-independent code moves the content of the code segment that would have to change into the data segment, so that the code segment does not depend on the load address. When multiple processes use the shared library, each process needs its own copy of only the library's data segment, while the code segment is shared.

A brief introduction to how PIE works

We start with a practical example to observe how PIE and non-PIE executables differ, and then explore how position-independent code is implemented.

Example 1

The C code is as follows:

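#include <stdio.h>
#include <stdint.h>

int global;

int main(void)
{
    /* %x expects an unsigned int, so print the low 32 bits of the address. */
    printf("global address = %x\n", (unsigned int)(uintptr_t)&global);
    return 0;
}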

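The program defines a global variable named global and prints its address. Let's first compile the program in the normal way. (On toolchains that produce PIE by default, add -no-pie to get the non-PIE build shown here.)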

gcc -o sample1 sample1.c

Running the program shows that global is loaded at the same address every time.

$ ./sample1
global address = 6008a8
$ ./sample1
global address = 6008a8
$ ./sample1
global address = 6008a8

Then compile sample1.c in PIE mode:

gcc -o sample1_pie sample1.c -fpie -pie

Run the program and observe the address of global:

$ ./sample1_pie
global address = 1ce72b38
$ ./sample1_pie
global address = 4c0b38
$ ./sample1_pie
global address = 766dcb38

The address changes on every run, which shows that a PIE executable is loaded at a random address.

Example 2

Declare an external variable named global in the code, but leave the definition of this variable out of the files being compiled.

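#include <stdio.h>
#include <stdint.h>

extern int global; /* declared here, defined in another file */

int main(void)
{
    /* %x expects an unsigned int, so print the low 32 bits of the address. */
    printf("extern global address = %x\n", (unsigned int)(uintptr_t)&global);
    return 0;
}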

First, compile extern_var.c in the normal way. The source file that defines global is intentionally left out of the compile command.

gcc -o extern_var extern_var.c

The build fails, and GCC reports:

/tmp/ccJYN5Ql.o: In function `main':
extern_var.c:(.text+0xa): undefined reference to `global'
collect2: ld returned 1 exit status

The link phase includes an important step called symbol resolution and relocation. The linker merges the data, code and symbols of all intermediate (object) files and calculates the virtual base addresses after linking. For example, the ".text" section starts at 0x1000 and the ".data" section starts at 0x2000. The linker then calculates the virtual address of each symbol (such as global) relative to the base address.

When the linker cannot find the address of global in the symbol table, it reports undefined reference to `global'. In static linking, all symbols must be resolved during the compile-and-link phase.

What happens if extern_var.c is compiled into a shared library as position-independent code?

gcc -o extern_var.so extern_var.c -shared -fPIC

The build succeeds and generates extern_var.so. However, an error is reported at run time, because the address of the global symbol cannot be found at load time. This shows that the -fPIC option generates position-independent code and postpones the resolution of symbols that were not found during static linking to the load phase.

How does the linker handle a reference in the code segment whose target address is missing at link time?

The linker uses an intermediate table, the GOT (global offset table), to solve the problem of referenced symbols whose target address is missing. If the linker finds during (static) linking that the target address of a symbol cannot be determined, it adds the symbol to the GOT and replaces all references to the symbol with references to its GOT entry. At load time, the dynamic linker fills in the actual target address of each symbol in the GOT.

When the program executes code that refers to the symbol, it first looks up the symbol's entry in the GOT, and then reads the actual target address from that entry.
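The two-step lookup through the GOT can be imitated in plain C. This is only a toy model under stated assumptions: struct got_entry, got_fill and got_read_int are hypothetical names, and a real GOT is laid out by the linker and filled by ld.so rather than by a string search like this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature GOT: one slot per external symbol. */
struct got_entry {
    const char *name; /* symbol name recorded at link time */
    void *addr;       /* filled in by the "dynamic linker" at load time */
};

/* "Load time": look each symbol up among the available definitions and
 * store its address in the table. Returns the number left unresolved. */
size_t got_fill(struct got_entry *got, size_t n,
                const struct got_entry *defs, size_t ndefs)
{
    size_t missing = 0;

    for (size_t i = 0; i < n; i++) {
        got[i].addr = NULL;
        for (size_t j = 0; j < ndefs; j++) {
            if (strcmp(got[i].name, defs[j].name) == 0) {
                got[i].addr = defs[j].addr;
                break;
            }
        }
        if (got[i].addr == NULL)
            missing++;
    }
    return missing;
}

/* "Run time": the code never holds the address of the variable itself.
 * It reads the slot first, then dereferences the address stored there. */
int got_read_int(const struct got_entry *slot)
{
    return *(const int *)slot->addr;
}
```

In this model, the run-time error from Example 2 corresponds to got_fill returning a nonzero count: the slot for global exists, but no loaded object supplies its definition.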

Generating position-independent code

Position-independent code must run correctly no matter which address it is loaded at. Therefore, references to variables or functions in the program must be relative and cannot contain absolute addresses.

For example, in pseudo assembly code:

PIE: the code can run at address 100 or 1000

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL CURRENT+10
...
111: NOP

Non-PIE: Code can only run at address 100

100: COMPARE REG1, REG2
101: JUMP_IF_EQUAL 111
...
111: NOP

The code segment of an executable program is readable and executable but not writable, while the data segment is readable and writable. To implement position-independent code, the absolute values that have to change must therefore be moved out of the code segment and into the data segment. When the program is loaded, the code segment stays unchanged, and position independence is achieved by changing the contents of the data segment.
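The difference between a relative and an absolute reference can also be shown with ordinary data. The sketch below is an analogy rather than real loader behavior; struct image, image_init and image_value_rel are hypothetical names. The structure stores both an absolute pointer to one of its fields and the field's offset from the start of the structure.

```c
#include <stddef.h>

/* A tiny "image" that refers to its own value field in two ways. */
struct image {
    ptrdiff_t value_offset; /* relative: distance from the image start */
    int *value_abs;         /* absolute: valid only at one address */
    int value;
};

/* Set the value and record both kinds of reference to it. */
void image_init(struct image *img, int v)
{
    img->value = v;
    img->value_abs = &img->value;
    img->value_offset = (char *)&img->value - (char *)img;
}

/* Resolve the relative reference against wherever the image is now. */
int *image_value_rel(struct image *img)
{
    return (int *)((char *)img + img->value_offset);
}
```

If an initialized image is copied to another variable, which stands in for loading at a different address, image_value_rel on the copy finds the copy's own value, while value_abs in the copy still points into the original. That stale pointer is what a loader would have to patch in non-position-independent code.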

Memory mapping of PIE and non-PIE programs

A non-PIE program is loaded at the same location in memory every time.

The executable is loaded starting at a fixed address. The system's dynamic linker, ld.so, is loaded first. ld.so then goes through the entries of type DT_NEEDED in the .dynamic section to find the other shared libraries that need to be loaded, and loads them into memory in turn. Note: in non-PIE mode, the shared libraries are loaded in the same order and at the same locations every time.

For an executable generated as PIE, the load address is different on every run, because the program contains no absolute address references.

Not only are the load addresses of the shared libraries not fixed, but the load address of the executable also differs on every run. This requires ld.so, once it has been loaded, to relocate not only the other shared libraries but also the executable itself.
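On Linux, the result of this loading process can be inspected through /proc/self/maps, which lists every mapping of the current process. The following sketch assumes that file is available; print_code_mappings is a hypothetical helper. It prints the file-backed executable mappings, that is, the code segments of the program, ld.so and the shared libraries.

```c
#include <stdio.h>
#include <string.h>

/* Print the file-backed executable (r-xp) mappings of this process.
 * Returns the number of lines printed, or -1 if the file is unreadable. */
int print_code_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    int count = 0;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* The second field is the permission string, e.g. "r-xp";
         * a '/' means the mapping is backed by a file path. */
        if (strstr(line, " r-xp ") != NULL && strchr(line, '/') != NULL) {
            fputs(line, stdout);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Comparing the output of several runs of the same binary shows whether the start address on the executable's own line changes, which is the behavior described above for PIE.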

PIE and compiler options

The main GCC options for generating position-independent code are -fPIC, -fPIE and -pie.

-fPIC and -fPIE are compile-time options, used for shared libraries and executables respectively. They make the object code produced in the compilation phase position-independent. However, this does not mean that the final executable is PIE. You also need to pass -pie at link time to tell the linker to produce a position-independent executable.

A standard PIE program is compiled as follows:

gcc -o sample_pie sample.c -fPIE -pie
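Whether a source file was compiled with these options can be checked from inside the code, because GCC predefines the macro __PIE__ under -fpie/-fPIE and __PIC__ under -fpic/-fPIC (and under the PIE options as well). The sketch below uses hypothetical names (enum codegen, codegen_mode, codegen_name). It reflects only the compile-time option, not whether -pie was passed at link time.

```c
#include <stddef.h>

/* How this translation unit was compiled. */
enum codegen { CODEGEN_ABSOLUTE, CODEGEN_PIC, CODEGEN_PIE };

/* Decide from the compiler's predefined macros. __PIE__ is tested
 * first because the PIE options define __PIC__ too. */
enum codegen codegen_mode(void)
{
#if defined(__PIE__)
    return CODEGEN_PIE;
#elif defined(__PIC__)
    return CODEGEN_PIC;
#else
    return CODEGEN_ABSOLUTE;
#endif
}

/* Human-readable name for a mode; NULL for an unknown value. */
const char *codegen_name(enum codegen m)
{
    switch (m) {
    case CODEGEN_PIE:      return "PIE (-fpie/-fPIE)";
    case CODEGEN_PIC:      return "PIC (-fpic/-fPIC)";
    case CODEGEN_ABSOLUTE: return "not position-independent";
    }
    return NULL;
}
```

Printing codegen_name(codegen_mode()) from a program built with and without -fPIE shows the difference. An object compiled with -fPIE but linked without -pie still reports PIE here, which matches the point that the compile option alone does not make the final executable PIE.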

The correspondence between the GCC compile options used and whether a PIE executable is generated is as follows:

[Figure: GCC compile options and whether the resulting executable is PIE]

Programs of type DYN support PIE, while programs of type EXEC do not. The relationship between DYN, EXEC and PIE is explained later in this article.

Application of ASLR in Android

PIE is one part of ASLR. ASLR covers randomization of the stack, heap, libs and mmap, executable, linker and VDSO.

Supporting PIE only means that ASLR is implemented for the executable. As Android has developed, its ASLR support has gradually been strengthened.

ASLR in Android 2.x

Android's support for ASLR began with Android 2.x, which only supports randomization of the stack.

ASLR in Android 4.0

In 4.0, the first version said to support ASLR, randomization was only added for some shared libraries such as libc; the heap, executable and linker were still static.

Heap randomization can be turned on with:

echo 2 > /proc/sys/kernel/randomize_va_space
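The current setting can also be read back programmatically. This is a small sketch assuming a Linux kernel that exposes the file above; read_randomize_va_space is a hypothetical helper name.

```c
#include <stdio.h>

/* Read kernel.randomize_va_space.
 * Returns 0, 1 or 2 as reported by the kernel, or -1 if the file
 * cannot be opened or parsed. */
int read_randomize_va_space(void)
{
    FILE *f = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A value of 0 means randomization is off, 1 randomizes the stack, mmap base and VDSO, and 2 additionally randomizes the heap, which is why the command above writes 2.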

As for the executable, most binaries were not built with GCC's -pie -fPIE options, so the compiled executable is not a DYN shared object file and therefore not a PIE (position-independent executable), which means it cannot be randomized.

Similarly, the linker did not get ASLR.

ASLR in Android 4.1

Finally, in 4.1 Jelly Bean, Android supports ASLR for all memory regions. In Android 4.1, essentially all binaries are compiled and linked as PIE (you can check the type with readelf). Compared with 4.0, 4.1 therefore adds ASLR support for the heap, executable and linker.

ASLR in Android 5.0

Android 5.0 drops support for non-PIE executables, so all processes use ASLR. A program that is not built as PIE reports an error and is forced to exit at run time.

How PIE programs run on each version of Android

Executables built as PIE can only run on version 4.1 and later; on earlier versions they crash. Conversely, non-PIE programs run normally on versions before 5.0, but crash on 5.0.
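The compatibility rule can be written down as a small predicate. The sketch below encodes only what is stated above, using the Android API levels 16 (Android 4.1) and 21 (Android 5.0); the function name can_run and the enum constants are hypothetical.

```c
#include <stdbool.h>

/* API levels of the two versions where the behavior changes. */
enum { API_ANDROID_4_1 = 16, API_ANDROID_5_0 = 21 };

/* Whether a native executable is expected to start on a given
 * API level, according to the rule described in the text. */
bool can_run(int api_level, bool is_pie)
{
    if (is_pie)
        return api_level >= API_ANDROID_4_1; /* PIE needs 4.1 or later */
    return api_level < API_ANDROID_5_0;      /* non-PIE is rejected from 5.0 */
}
```

Both kinds of executable run on API levels 16 through 20, so only devices below 16 need a separate non-PIE build.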

How to detect whether PIE is enabled

When viewed with readelf, an executable built without PIE shows the file type EXEC (executable file), while an executable built with PIE shows the file type DYN (shared object file). In addition, the virtual addresses of the code segment in a PIE file always start from 0.
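The same check that readelf performs can be sketched in a few lines of C by reading the e_type field of the ELF header. This is a minimal illustration under stated assumptions, not the scanner's actual implementation: it assumes a system that provides <elf.h>, and elf_is_dyn is a hypothetical helper name.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file is an ELF object of type ET_DYN, 0 if it is an
 * ELF object of another type (such as ET_EXEC), and -1 if it cannot be
 * read or is not an ELF file. */
int elf_is_dyn(const char *path)
{
    unsigned char hdr[EI_NIDENT + 2];
    FILE *f = fopen(path, "rb");
    size_t got;
    unsigned type;

    if (f == NULL)
        return -1;
    got = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    if (got != sizeof hdr || memcmp(hdr, ELFMAG, SELFMAG) != 0)
        return -1;
    /* e_type is the 16-bit field right after e_ident in both ELF32 and
     * ELF64; EI_DATA tells which byte order it is stored in. */
    if (hdr[EI_DATA] == ELFDATA2MSB)
        type = ((unsigned)hdr[EI_NIDENT] << 8) | hdr[EI_NIDENT + 1];
    else
        type = hdr[EI_NIDENT] | ((unsigned)hdr[EI_NIDENT + 1] << 8);
    return type == ET_DYN;
}
```

An ordinary shared library (.so) is also of type ET_DYN, so this check alone is meaningful only when applied to files that are intended to be run as executables.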

Why can PIE support be judged by checking for DYN?

DYN is the type of the file, that is, a shared object file. Is PIE enabled for all shared object files? The answer can be found in the source code; see glibc/glibc-2.16.0/elf/dl-load.c.

According to the source code, if the type of the file being loaded is not ET_DYN, the MAP_FIXED flag is passed to mmap when the file is loaded, which maps the program at a fixed address.
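The effect of MAP_FIXED can be tried out with an anonymous mapping. The following sketch illustrates the idea only and is not the glibc code; map_region is a hypothetical helper, and a POSIX system with MAP_ANONYMOUS is assumed.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes either wherever the kernel chooses (want == NULL) or
 * exactly at want, the way a loader honors a fixed load address.
 * Returns MAP_FAILED on error, like mmap. */
void *map_region(void *want, size_t len)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;

    if (want != NULL)
        flags |= MAP_FIXED; /* replaces anything already mapped there */
    return mmap(want, len, PROT_READ | PROT_WRITE, flags, -1, 0);
}
```

With want set to NULL the kernel is free to pick the address, which is what allows randomization. With a non-NULL want the mapping lands at that exact address or fails, so a file loaded this way ends up at the same location on every run.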

Aliju Security's suggestions for secure development

  1. For systems from Android 2.x up to, but not including, 4.1, do not use the PIE options when compiling.

  2. From Android 4.1 on, native programs must be built as PIE to raise the cost of attack for attackers.

  3. Before a version goes online, use the Aliju Security vulnerability scanning system to scan for security risks.

Author: Daihu @ Aliju Security. For more Android and iOS technical articles, please visit the Aliju Security blog.