iOS underlying principles + reverse engineering article summary
The main purpose of this article is to understand how global variables and constants are stored at the assembly level, and how to restore the assembly code of if, while, and other constructs back to high-level code
Global variables
Before that, you need to understand how memory is partitioned. If this is not entirely clear, I suggest you read the article "iOS underlying principle 24: five memory areas". The following is a brief summary
- Code area: stores code, readable and executable
- Stack area: stores parameters, local variables and temporary data, readable and writable
- Heap area: dynamically allocated by the developer, variable size, readable and writable
- Global variable area: readable and writable
- Constant area: read only
Case analysis
In main.m, define a function and a global variable
#include <stdio.h>

int g = 12;

int func(int a, int b){
    printf("haha");
    int c = a + g;
    return c;
}

int main(int argc, char * argv[]) {
    func(1, 2);
}
- Set a breakpoint at func and run. The following is the assembly code of the main function
- View the assembly code of func and analyze it as follows
- Check whether x0 holds "haha". Debugging verifies that x0 is the address of "haha"
- View the memory at that address: x 0x000000010098bf9f. It belongs to the string constant area (in the memory dump, the bytes on the left are the ASCII codes of the string shown on the right)
- Focus on the two instructions adrp x0, 1 and add x0, x0, #0xf9f
- The adrp instruction (Address Page, i.e. addressing by page):
  - Shift the value 1 left by 12 bits, which gives 0x1000
  - Add the value of the PC register (after first clearing the lower 12 bits of PC)
<!-- (addressing by page) -->
<!--adrp-->
0x10098a824 <+20>: adrp x0, 1
- 1) shift 1 left by 12 bits: 0x1000
- 2) clear the lower 12 bits of the PC register: 0x10098a824 becomes 0x10098a000
- 3) add the two values: 0x10098a000 + 0x1000 = 0x10098b000
===> The address now in x0 is the starting position (i.e. the first address) of a page of data
<!--add-->
0x10098a828 <+24>: add x0, x0, #0xf9f ; =0xf9f
- Page address from adrp plus the offset: 0x10098b000 + 0xf9f = 0x10098bf9f
===> x0 now holds an address inside that page, namely the address of the "haha" string constant
This result matches the x0 address observed in the debugger above
Why? The size of a page is 4096, and 0xFFF is 4095, so adding 1 gives 0x1000 (i.e. 4096). Therefore, shifting 1 left by 12 bits and adding it to the page-aligned PC gives the first address of a page. (Note: the PageSize of macOS is 4K (0x1000) and the PageSize of iPhone is 16K (0x4000), but 16K is still a multiple of 4K, so the 4K-aligned address that adrp produces is valid on both Mac and iPhone.)
- Continue analyzing the assembly code that follows bl printf
  - ldur w8, [x29, #-0x4]: load the value saved on the stack, i.e. 1 (the parameter a)
  - adrp + add + ldr: adrp and add place the address 0x10098ce98 in x9, and ldr loads the value stored at that address into w10. This is how the global variable g is read
Disassembly analysis
The example code is as follows
int g = 12;
int func(int a, int b){
    printf("haha");
    int c = a + g + b;
    return c;
}
int main(int argc, char * argv[]) {
    func(10, 20);
}
Use Hopper for disassembly analysis
- First compile the project: CMD+B
- Open the package contents of the app
- Drag the executable file from the app package into Hopper for analysis
- Search for func in Hopper
- Copy the assembly code of func and restore it to high-level language code by hand
<!-- 1. Restore the assembly to high-level language code -->
int gl = 12;
int func2(int a, int b){
/*
//The beginning of a function
0000000100006808 sub sp, sp, #0x20
000000010000680c stp x29, x30, [sp, #0x10]
0000000100006810 add x29, sp, #0x10
*/
/*
//Call BL printf
0000000100006814 stur w0, [x29, #-0x4]
0000000100006818 str w1, [sp, #0x8]
//===>The address obtained here, 0x100007f9f, is the value without the ASLR offset
000000010000681c adrp x0, #0x100007000
0000000100006820 add x0, x0, #0xf9f ; "haha"
0000000100006824 bl imp___stubs__printf
*/
printf("haha");
/*
0000000100006828 ldur w8, [x29, #-0x4]
*/
int w8 = a;
/*
//===>x9 now holds the address 0x100008e98
000000010000682c adrp x9, #0x100008000
0000000100006830 add x9, x9, #0xe98 ; _g
*/
// int gl = 12;// (need to write outside)
/*
0000000100006834 ldr w10, x9
*/
int w10 = gl;
/*
0000000100006838 add w8, w8, w10
*/
w8 += w10;
/*
000000010000683c ldr w10, [sp, #0x8]
*/
w10 = b;
/*
0000000100006840 add w8, w8, w10
*/
w8 += w10;
/*
0000000100006844 str w8, [sp, #0x4]
0000000100006848 ldr w8, [sp, #0x4]
000000010000684c mov x0, x8
*/
return w8;
/*
//End of a function
0000000100006850 ldp x29, x30, [sp, #0x10]
0000000100006854 add sp, sp, #0x20
0000000100006858 ret
*/
}
<!-- 2. Remove assembly -->
int gl = 12;
int func2(int a, int b){
printf("haha");
int w8 = a;
int w10 = gl;
w8 += w10;
w10 = b;
w8 += w10;
return w8;
}
<!-- 3. Simplify code -->
int gl = 12;
int func2(int a, int b){
printf("haha");
return a + b + gl;
}
The simplification process is shown in the figure below. Note: restore from bottom to top, not from top to bottom (the business logic itself executes from top to bottom):


Of these, the string constant is addressed as follows
//===>At this time, the data of 0x100007f9f address obtained is the value without ASLR
000000010000681c adrp x0, #0x100007000
0000000100006820 add x0, x0, #0xf9f
- In Hopper, press G and jump to 0x100007f9f to see the corresponding data
The global variable g is obtained by the same principle
//===>Get the data of 0x100008e98 at this time
000000010000682c adrp x9, #0x100008000
0000000100006830 add x9, x9, #0xe98 ; _g
0000000100006834 ldr w10, x9

Summary
- Obtaining a global variable or constant shows up as two instructions, adrp and add, that together produce an address
- ADRP (Address Page)
  - adrp x0, 1
    - Clear the low 12 bits of the PC register
    - Shift the value 1 left by 12 bits, which is 0x1000 in hexadecimal
    - Add the two results and store the sum in the x0 register
- The ADD instruction then adds the offset of the data within that page
Conditions
Take the following code and check its assembly
int g = 12;
void func(int a, int b){
    if (a > b) {
        g = a;
    } else {
        g = b;
    }
}
int main(int argc, char * argv[]) {
    func(1, 2);
}
Check its assembly in Hopper. The code is as follows
_func:
==>Stretch stack space
0000000100006828 sub sp, sp, #0x10 ; CODE XREF=_main+32
==>Store w0 and w1 on the stack
000000010000682c str w0, [sp, #0xc]
0000000100006830 str w1, [sp, #0x8]
==>Read data from stack to W8 and w9
0000000100006834 ldr w8, [sp, #0xc]
0000000100006838 ldr w9, [sp, #0x8]
==>Compare w8 and w9, i.e. compare the original w0 and w1 (cmp performs a subtraction but does not modify w8 or w9; it only updates the condition flags according to the result)
000000010000683c cmp w8, w9
//If less than or equal, jump to loc_100006858. If greater than, fall through to the next instruction
0000000100006840 b.le loc_100006858
0000000100006844 ldr w8, [sp, #0xc]
0000000100006848 adrp x9, #0x100008000
000000010000684c add x9, x9, #0xe90 ; _g
0000000100006850 str w8, x9
//Unconditional jump: skip the less-than-or-equal code and go to loc_100006868
0000000100006854 b loc_100006868
loc_100006858:
0000000100006858 ldr w8, [sp, #0x8] ; CODE XREF=_func+24
000000010000685c adrp x9, #0x100008000
0000000100006860 add x9, x9, #0xe90 ; _g
0000000100006864 str w8, x9
loc_100006868:
0000000100006868 add sp, sp, #0x10 ; CODE XREF=_func+44
000000010000686c ret
This is a typical if-else. The assembly code as shown in Hopper is as follows

Restore the above assembly code
<!-- 1. Restore -->
int cc = 12;
void func2(int a, int b){
    //==>Stretch stack space
    //0000000100006828 sub sp, sp, #0x10
    //==>Store w0 and w1 on the stack
    //000000010000682c str w0, [sp, #0xc]
    //0000000100006830 str w1, [sp, #0x8]
    //==>Read data from stack to w8 and w9
    //0000000100006834 ldr w8, [sp, #0xc]
    //0000000100006838 ldr w9, [sp, #0x8]
    int w8 = a;
    int w9 = b;
    //==>Compare w8 and w9, i.e. compare the original w0 and w1 (cmp performs a subtraction but does not modify w8 or w9; it only updates the condition flags)
    //000000010000683c cmp w8, w9
    //If less than or equal, jump to loc_100006858. If greater than, fall through
    //0000000100006840 b.le loc_100006858
    if (w8 > w9) { // greater than
        //0000000100006844 ldr w8, [sp, #0xc]
        //0000000100006848 adrp x9, #0x100008000
        //000000010000684c add x9, x9, #0xe90 ; _g
        //0000000100006850 str w8, x9
        cc = w8; // w8 at this point is a
        //Unconditional jump: skip the less-than-or-equal code and go to loc_100006868
        //0000000100006854 b loc_100006868
    } else { // less than or equal to
        // loc_100006858:
        //0000000100006858 ldr w8, [sp, #0x8]
        w8 = b;
        //000000010000685c adrp x9, #0x100008000
        //0000000100006860 add x9, x9, #0xe90 ; _g
        //0000000100006864 str w8, x9
        cc = w8; // w8 at this point is b
    }
    // loc_100006868:
    //0000000100006868 add sp, sp, #0x10
    //000000010000686c ret
}
<!-- 2. Simplify -->
int cc = 12;
void func2(int a, int b){
    if (a > b) { // greater than
        cc = a;
    } else { // less than or equal to
        cc = b;
    }
}
CMP (compare) instruction
- CMP compares the contents of one register with the contents of another register or with an immediate value. The result is not stored; only the condition flags are updated
(in the example above, CMP is followed by b.le, which jumps to the else branch)
- CMP is generally followed by a jump that depends on the comparison, usually one of the B instructions
  - bl label: jump to the label (saving the return address in x30)
  - b.lt label: if the comparison result is less than, jump to the label, otherwise do not jump
  - b.le label: if the comparison result is less than or equal to, jump to the label, otherwise do not jump
  - b.gt label: if the comparison result is greater than, jump to the label, otherwise do not jump
  - b.ge label: if the comparison result is greater than or equal to, jump to the label, otherwise do not jump
  - b.eq label: if the comparison result is equal, jump to the label, otherwise do not jump
  - b.ne label: if the comparison result is not equal, jump to the label, otherwise do not jump
  - b.hi label: if the comparison result is unsigned greater than, jump to the label, otherwise do not jump
  - b.hs label: if the comparison result is unsigned greater than or equal to, jump to the label, otherwise do not jump
Loops
The commonly used loop constructs are for, while, and do-while. Let's analyze them one by one
Do while analysis
Analyze the following do while code
int main(int argc, char * argv[]) {
int sum = 0;
int i = 0;
do{
sum += 1;
i++;
}while (i<100);
}
- View its assembly through Hopper
- The resulting assembly is as follows
Conclusion: in a do-while loop, the condition check comes after the loop body
While loop analysis
int main(int argc, char * argv[]) {
int sum = 0;
int i = 0;
while (i<100){
sum += 1;
i++;
}
}
The assembly is shown in the figure

Conclusion: in a while loop, the condition is checked inside the loop, before the body. If it is not satisfied, execution jumps out of the loop
For loop analysis
int main(int argc, char * argv[]) {
int sum = 0;
for (int i = 0; i < 100; i++) {
sum += 1;
}
}
Its assembly is the same as that of the while loop

Conclusion: the for loop is very similar to the while loop. The condition check is inside the loop, and if it is not satisfied, execution jumps out
Summary
Globals and constants
- Obtaining a global variable or constant shows up as two instructions, adrp and add, that together produce an address
- ADRP (Address Page)
  - adrp x0, 1
    - Clear the low 12 bits of the PC register
    - Shift the value 1 left by 12 bits
    - Add the two results and store the sum in the x0 register
- The ADD instruction then adds the offset of the data within that page
Conditional judgment
- CMP compares the contents of one register with the contents of another register or with an immediate value. The result is not stored; only the condition flags are updated
(in the if-else example, CMP is followed by b.le, which jumps to the else branch)
- CMP is generally followed by a jump that depends on the comparison, usually one of the B instructions
  - bl label: jump to the label (saving the return address in x30)
  - b.lt label: if the comparison result is less than, jump to the label, otherwise do not jump
  - b.le label: if the comparison result is less than or equal to, jump to the label, otherwise do not jump
  - b.gt label: if the comparison result is greater than, jump to the label, otherwise do not jump
  - b.ge label: if the comparison result is greater than or equal to, jump to the label, otherwise do not jump
  - b.eq label: if the comparison result is equal, jump to the label, otherwise do not jump
  - b.ne label: if the comparison result is not equal, jump to the label, otherwise do not jump
  - b.hi label: if the comparison result is unsigned greater than, jump to the label, otherwise do not jump
  - b.hs label: if the comparison result is unsigned greater than or equal to, jump to the label, otherwise do not jump
Loops
- do-while loop: the condition check comes after the loop body. If the condition is met, execution jumps back to the start of the body
- for loop and while loop are very similar: the condition check is inside the loop, before the body. If it is not satisfied, execution jumps out