Blog: Draveness
Follow the repository for updates: iOS-Source-Code-Analyze
Follow: Draveness · Github
Because the objc runtime can only be compiled on Mac OS, all code in this article runs on Mac OS, that is, on x86_64. Any code that runs on arm64 is pointed out explicitly.
The previous article on isa, Learn about isa from the initialization of NSObject, mentioned that when an instance method is called, the runtime follows the isa pointer held by the object to find the corresponding class, and then looks up the method in that class's class_data_bits_t.
In this article, we will introduce how methods are stored in objc.
This article first analyzes how methods are stored in memory, based on the objc source code, and then verifies the analysis step by step in the lldb debugger.
Method’s location in memory
Let’s first understand the structure diagram of classes in objc:
- isa: a pointer to the metaclass. If you are not familiar with metaclasses, see Classes and Metaclasses.
- super_class: points to the parent class of the current class.
- cache: caches pointers and the vtable to speed up method calls.
- bits: stores the methods, properties, protocols and other information of the class.
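As a rough illustration of this layout, the following C++ sketch models the four fields with fixed-width integers so that the offsets match a 64-bit target. The type names objc_class_model and cache_model and the helper bits_offset are hypothetical, and the 16-byte shape of the cache is an assumption chosen to be consistent with the 32-byte offset used later in the lldb session. It is not the runtime's actual declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified model of the class layout on x86_64.
// Field types are stand-ins (assumptions), not the runtime's real types.
struct cache_model {
    uint64_t buckets;   // stands in for a pointer to the cache buckets
    uint32_t mask;
    uint32_t occupied;
};

struct objc_class_model {
    uint64_t isa;         // pointer to the metaclass
    uint64_t superclass;  // pointer to the parent class
    cache_model cache;    // method cache
    uint64_t bits;        // methods, properties, protocols and flags
};

// Byte offset of `bits` from the start of the class structure.
constexpr std::size_t bits_offset() {
    return offsetof(objc_class_model, bits);
}
```

With this model, bits_offset() evaluates to 32, so a class located at 0x100001168 would have its bits field at 0x100001188.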
The class_data_bits_t structure
This section analyzes class_data_bits_t bits.
The following is the class_data_bits_t structure in objc. It contains only a single 64-bit field, bits, which stores class-related information:
In the objc_class structure, the comment on class_data_bits_t explains that it is equivalent to a class_rw_t pointer plus rr/alloc flags.
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
It provides a convenient method that returns the class_rw_t * pointer:
class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
A bitwise AND of bits with FAST_DATA_MASK keeps only bits 3 through 46 of the value (the half-open range [3, 47), counting from bit 0). The result is cast to class_rw_t * and returned.
On the x86_64 architecture, Mac OS uses only 47 bits of a pointer for object addresses. In addition, because addresses are aligned to 8 bytes in memory, their lowest three bits are always 0, which is why the last three bits of the mask are 0.
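To make the bit range concrete, here is a small C++ sketch that inspects the mask value 0x00007ffffffffff8 used by the runtime. The helper names lowest_set_bit, highest_set_bit and count_set_bits are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Mask value as defined in the runtime headers quoted in this article.
constexpr uint64_t FAST_DATA_MASK = 0x00007ffffffffff8ULL;

// Index of the lowest set bit, counting from bit 0. Requires v != 0.
int lowest_set_bit(uint64_t v) {
    assert(v != 0);
    int i = 0;
    while (((v >> i) & 1) == 0) ++i;
    return i;
}

// Index of the highest set bit, counting from bit 0. Requires v != 0.
int highest_set_bit(uint64_t v) {
    assert(v != 0);
    int i = 63;
    while (((v >> i) & 1) == 0) --i;
    return i;
}

// Number of set bits in v.
int count_set_bits(uint64_t v) {
    int n = 0;
    for (; v != 0; v >>= 1) n += static_cast<int>(v & 1);
    return n;
}
```

Applied to FAST_DATA_MASK, the lowest set bit is 3 and the highest is 46, for 44 bits in total. The three bits below the mask are the ones freed up by 8-byte alignment.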
Because the class_rw_t * pointer occupies only bits [3, 47) of the value, the lowest three bits can be used to store other information about the current class:
#define FAST_IS_SWIFT (1UL<<0)
#define FAST_HAS_DEFAULT_RR (1UL<<1)
#define FAST_REQUIRES_RAW_ISA (1UL<<2)
#define FAST_DATA_MASK 0x00007ffffffffff8UL
- isSwift()
    - FAST_IS_SWIFT: identifies a Swift class.
- hasDefaultRR()
    - FAST_HAS_DEFAULT_RR: the current class or a parent class has the default retain/release/autorelease/retainCount/_tryRetain/_isDeallocating/retainWeakReference/allowsWeakReference methods.
- requiresRawIsa()
    - FAST_REQUIRES_RAW_ISA: instances of the current class require a raw isa.
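The following C++ sketch shows how a pointer and the three flags can share one 64-bit value. The constants are the ones quoted above. The struct data_bits_model and the helper pack are hypothetical stand-ins for illustration, and the accessors here return plain integers and booleans rather than the runtime's real types.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t FAST_IS_SWIFT         = 1ULL << 0;
constexpr uint64_t FAST_HAS_DEFAULT_RR   = 1ULL << 1;
constexpr uint64_t FAST_REQUIRES_RAW_ISA = 1ULL << 2;
constexpr uint64_t FAST_DATA_MASK        = 0x00007ffffffffff8ULL;

// Hypothetical stand-in for class_data_bits_t.
struct data_bits_model {
    uint64_t bits;

    // Address part: everything selected by the mask.
    uint64_t data() const { return bits & FAST_DATA_MASK; }
    // Flag part: the three low bits.
    bool isSwift() const { return (bits & FAST_IS_SWIFT) != 0; }
    bool hasDefaultRR() const { return (bits & FAST_HAS_DEFAULT_RR) != 0; }
    bool requiresRawIsa() const { return (bits & FAST_REQUIRES_RAW_ISA) != 0; }
};

// Packs an 8-byte-aligned address together with flag bits.
data_bits_model pack(uint64_t address, uint64_t flags) {
    assert((address & ~FAST_DATA_MASK) == 0);  // aligned, fits in the mask
    assert((flags & ~0x7ULL) == 0);            // only the three low bits
    return data_bits_model{address | flags};
}
```

For example, pack(0x100701f30, FAST_HAS_DEFAULT_RR) yields a value whose data() is still 0x100701f30 while hasDefaultRR() is true and the other two flags are false.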
Calling data() on the class_data_bits_t structure and calling data() on objc_class return the same class_rw_t * pointer, because the method in objc_class is just a wrapper around the corresponding method in class_data_bits_t.
// data() method in objc_class
class_data_bits_t bits;
class_rw_t *data() {
    return bits.data();
}
// data() method in class_data_bits_t
uintptr_t bits;
class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
class_rw_t and class_ro_t
The properties, methods, protocols and other information of an objc class are stored in class_rw_t:
struct class_rw_t {
    uint32_t flags;
    uint32_t version;
    const class_ro_t *ro;
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    Class firstSubclass;
    Class nextSiblingClass;
};
It also contains a pointer to the constant ro, which stores the properties, methods and protocols of the current class that were already determined at compile time.
struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
    uint32_t reserved;
    const uint8_t * ivarLayout;
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
};
At compile time, the class_data_bits_t *data in the class structure points to a class_ro_t * pointer:
Then, while the objc runtime is loading, the realizeClass method does the following:
- Calls the data method on class_data_bits_t and casts the result from class_rw_t to a class_ro_t pointer
- Initializes a class_rw_t structure
- Sets the ro and flags of that structure
- Finally, sets the correct data
const class_ro_t *ro = (const class_ro_t *)cls->data();
class_rw_t *rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
rw->ro = ro;
rw->flags = RW_REALIZED|RW_REALIZING;
cls->setData(rw);
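The same four steps can be sketched in standalone C++ with simplified stand-in types. Everything here is hypothetical scaffolding for illustration: ro_model, rw_model, class_model and realize are invented names, the data field is modeled as a plain integer, and the two flag values are assumptions rather than values quoted from the runtime headers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Assumed flag values for this sketch.
constexpr uint32_t RW_REALIZED  = 1u << 31;
constexpr uint32_t RW_REALIZING = 1u << 19;

struct ro_model { const char *name; };                    // stand-in for class_ro_t
struct rw_model { uint32_t flags; const ro_model *ro; };  // stand-in for class_rw_t

struct class_model {
    // Holds the address of a ro_model before realization
    // and the address of a rw_model afterwards.
    uintptr_t data;
};

// Mirrors the four steps above. The caller owns the returned allocation.
rw_model *realize(class_model *cls) {
    // 1. Before realization, data really points at the read-only structure.
    const ro_model *ro = reinterpret_cast<const ro_model *>(cls->data);
    // 2. Allocate a zeroed read/write structure.
    rw_model *rw = static_cast<rw_model *>(std::calloc(1, sizeof(rw_model)));
    if (rw == nullptr) return nullptr;
    // 3. Point it at the read-only data and set its flags.
    rw->ro = ro;
    rw->flags = RW_REALIZED | RW_REALIZING;
    // 4. Replace the class's data with the read/write structure.
    cls->data = reinterpret_cast<uintptr_t>(rw);
    return rw;
}
```

After realize returns, the class's data field refers to the new read/write structure, and the original read-only structure is still reachable through its ro member.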
The following figure shows the memory layout of the class after realizeClass has run. Compare it with the layout before the call, shown above, to see what has changed:
However, after this code runs, the method, property and protocol lists in class_rw_t are all still empty. At this point realizeClass needs to call methodizeClass to load the methods (including those from categories), properties and protocols implemented by the class into the methods, properties and protocols lists.
XXObject
Next, we analyze how a class, XXObject, changes in memory during runtime initialization. Here are the interface and implementation of XXObject:
// XXObject.h
#import <Foundation/Foundation.h>
@interface XXObject : NSObject
- (void)hello;
@end
// XXObject.m
#import "XXObject.h"
@implementation XXObject
- (void)hello {
    NSLog(@"Hello");
}
@end
This code runs on Mac OS X 10.11.3 (x86_64), not on the iPhone simulator or a real device. Results may differ on the simulator or on a real device.
This is the code of the main program:
#import <Foundation/Foundation.h>
#import "XXObject.h"
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Class cls = [XXObject class];
        NSLog(@"%p", cls);
    }
    return 0;
}
Structure of classes in memory after compilation
Because the location of a class in memory is determined at compile time, first run the code once to obtain the address of XXObject in memory.
0x100001168
Next, add a breakpoint in the _objc_init method, that is, before the entire objc runtime has been initialized:
Then enter the following commands in lldb:
(lldb) p (objc_class *)0x100001168
(objc_class *) $0 = 0x0000000100001168
(lldb) p (class_data_bits_t *)0x100001188
(class_data_bits_t *) $1 = 0x0000000100001188
(lldb) p $1->data()
warning: could not load any Objective-C class information. This will significantly reduce the quality of type information available.
(class_rw_t *) $2 = 0x00000001000010e8
(lldb) p (class_ro_t *)$2 // cast class_rw_t to class_ro_t
(class_ro_t *) $3 = 0x00000001000010e8
(lldb) p *$3
(class_ro_t) $4 = {
flags = 128
instanceStart = 8
instanceSize = 8
reserved = 0
ivarLayout = 0x0000000000000000 <no value available>
name = 0x0000000100000f7a "XXObject"
baseMethodList = 0x00000001000010c8
baseProtocols = 0x0000000000000000
ivars = 0x0000000000000000
weakIvarLayout = 0x0000000000000000 <no value available>
baseProperties = 0x0000000000000000
}
We now have the read-only data of the class, class_ro_t, as produced by the compiler:
(class_ro_t) $4 = {
flags = 128
instanceStart = 8
instanceSize = 8
reserved = 0
ivarLayout = 0x0000000000000000 <no value available>
name = 0x0000000100000f7a "XXObject"
baseMethodList = 0x00000001000010c8
baseProtocols = 0x0000000000000000
ivars = 0x0000000000000000
weakIvarLayout = 0x0000000000000000 <no value available>
baseProperties = 0x0000000000000000
}
You can see that only baseMethodList and name have values. The others, ivarLayout, baseProtocols, ivars, weakIvarLayout and baseProperties, are all null pointers, because the class has no instance variables, protocols or properties. So the structure matches our expectations.
Use the following commands to view the contents of baseMethodList:
(lldb) p $4.baseMethodList
(method_list_t *) $5 = 0x00000001000010c8
(lldb) p $5->get(0)
(method_t) $6 = {
name = "hello"
types = 0x0000000100000fa4 "v16@0:8"
imp = 0x0000000100000e90 (method`-[XXObject hello] at XXObject.m:13)
}
(lldb) p $5->get(1)
Assertion failed: (i < count), function get, file /Users/apple/Desktop/objc-runtime/runtime/objc-runtime-new.h, line 110.
error: Execution was interrupted, reason: signal SIGABRT.
The process has been returned to the state before expression evaluation.
(lldb)
$5->get(0) successfully returns the method_t structure of the -[XXObject hello] method. Trying to get the next method triggers an assertion, which shows that the current class has only one method.
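The bounds check behind that assertion can be sketched in C++ as follows. The types method_model and method_list_model are simplified, hypothetical stand-ins, and unlike the runtime, which asserts i < count, this sketch throws an exception so that an out-of-range index can be observed without aborting the process.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

struct method_model {        // simplified stand-in for method_t
    const char *name;
    const char *types;
};

struct method_list_model {   // simplified stand-in for method_list_t
    uint32_t count;
    const method_model *first;

    // Returns the i-th method, rejecting any index that is not below count.
    const method_model &get(uint32_t i) const {
        if (i >= count) throw std::out_of_range("method index out of range");
        return first[i];
    }
};
```

With a list that holds a single method, get(0) returns that method and get(1) is rejected, which matches what the lldb session shows for XXObject.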
realizeClass
This article does not analyze realizeClass in detail. Its main job is to initialize a class the first time it is used, which includes:
- Allocating read/write data space
- Returning the real class structure
static Class realizeClass(Class cls)
The above is the signature of this method. We need to set a conditional breakpoint inside it that checks whether the current class is XXObject:
Here the two pointers are compared directly instead of using [NSStringFromClass(cls) isEqualToString:@"XXObject"], because such methods cannot be called at this point and do not exist in objc itself. The only way to confirm that the current class is XXObject is to check whether the class pointers are equal.
Comparing pointers directly works because the position of the class in memory is determined at compile time. As long as the code does not change, the position of the class in memory stays the same (as has been said several times already).
The breakpoint is placed here because XXObject is a normal class, so execution takes the else branch, which allocates the writable class data.
While the code runs, the condition checks on every call whether the current class pointer points to XXObject, so it takes a while before the breakpoint is hit.
At this point, printing data in the class structure shows that the layout is still like this:
After running this code:
Let's print the class structure again:
(lldb) p (objc_class *)cls // print the class pointer
(objc_class *) $262 = 0x0000000100001168
(lldb) p (class_data_bits_t *)0x0000000100001188 // add an offset of 32 to the class pointer to get the class_data_bits_t pointer
(class_data_bits_t *) $263 = 0x0000000100001188
(lldb) p *$263 // dereference the class_data_bits_t pointer
(class_data_bits_t) $264 = (bits = 4302315312)
(lldb) p $264.data() // get the class_rw_t
(class_rw_t *) $265 = 0x0000000100701f30
(lldb) p *$265 // dereference the class_rw_t pointer; its ro has been set
(class_rw_t) $266 = {
flags = 2148007936
version = 0
ro = 0x00000001000010e8
methods = {
list_array_tt<method_t, method_list_t> = {
= {
list = 0x0000000000000000
arrayAndFlag = 0
}
}
}
properties = {
list_array_tt<property_t, property_list_t> = {
= {
list = 0x0000000000000000
arrayAndFlag = 0
}
}
}
protocols = {
list_array_tt<unsigned long, protocol_list_t> = {
= {
list = 0x0000000000000000
arrayAndFlag = 0
}
}
}
firstSubclass = nil
nextSiblingClass = nil
demangledName = 0x0000000000000000 <no value available>
}
(lldb) p $266.ro // get the class_ro_t pointer
(const class_ro_t *) $267 = 0x00000001000010e8
(lldb) p *$267 // dereference the class_ro_t pointer
(const class_ro_t) $268 = {
flags = 128
instanceStart = 8
instanceSize = 8
reserved = 0
ivarLayout = 0x0000000000000000 <no value available>
name = 0x0000000100000f7a "XXObject"
baseMethodList = 0x00000001000010c8
baseProtocols = 0x0000000000000000
ivars = 0x0000000000000000
weakIvarLayout = 0x0000000000000000 <no value available>
baseProperties = 0x0000000000000000
}
(lldb) p $268.baseMethodList // get the base method list
(method_list_t *const) $269 = 0x00000001000010c8
(lldb) p $269->get(0) // access the first method
(method_t) $270 = {
name = "hello"
types = 0x0000000100000fa4 "v16@0:8"
imp = 0x0000000100000e90 (method`-[XXObject hello] at XXObject.m:13)
}
(lldb) p $269->get(1) // try to access the second method; out of bounds
error: Execution was interrupted, reason: signal SIGABRT.
The process has been returned to the state before expression evaluation.
Assertion failed: (i < count), function get, file /Users/apple/Desktop/objc-runtime/runtime/objc-runtime-new.h, line 110.
(lldb)
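The flags value printed for class_rw_t in the session above, 2148007936, can be checked with a little arithmetic. The sketch below assumes that RW_REALIZED is bit 31 and RW_REALIZING is bit 19. These positions are my reading of objc-runtime-new.h and are not quoted in this article, so verify them against the runtime version in use. The helper is_freshly_realized is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bit positions; verify against objc-runtime-new.h.
constexpr uint32_t RW_REALIZED  = 1u << 31;  // 2147483648
constexpr uint32_t RW_REALIZING = 1u << 19;  // 524288

// True when flags holds exactly the two bits set during realization.
constexpr bool is_freshly_realized(uint32_t flags) {
    return flags == (RW_REALIZED | RW_REALIZING);
}

// 2147483648 + 524288 = 2148007936, the value shown by lldb.
static_assert(is_freshly_realized(2148007936u),
              "flags from the lldb session decode to the two assumed bits");
```

Under that assumption, the printed value contains exactly these two flags and nothing else, which is consistent with a class_rw_t that has just been allocated and marked.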
The last operation could not be captured.
const class_ro_t *ro = (const class_ro_t *)cls->data();
class_rw_t *rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
rw->ro = ro;
rw->flags = RW_REALIZED|RW_REALIZING;
cls->setData(rw);
After the code above runs, both the read-only pointer class_ro_t and the read-write pointer class_rw_t of the class are set correctly. However, the methods and other members of class_rw_t are still null pointers at this point. They are set in methodizeClass:
Here the attachLists method of method_array_t is called to add the methods from baseMethods to the methods array. After that, accessing methods returns the instance methods of the current class.
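As a minimal sketch of this step, the following C++ models the two structures with vectors of method names. The types ro_model and rw_model and the function methodize are hypothetical, and this attachLists simply appends. The real method's signature and the ordering it uses are not modeled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ro_model {                          // stand-in for class_ro_t
    std::vector<std::string> baseMethods;  // fixed at compile time
};

struct rw_model {                          // stand-in for class_rw_t
    const ro_model *ro = nullptr;
    std::vector<std::string> methods;      // empty right after realization

    // Hypothetical simplification of attachLists: append the given names.
    void attachLists(const std::vector<std::string> &added) {
        methods.insert(methods.end(), added.begin(), added.end());
    }
};

// Copies the compile-time methods into the read/write list.
void methodize(rw_model &rw) {
    rw.attachLists(rw.ro->baseMethods);
}
```

After methodize, the read/write list contains hello. Attaching further methods later grows only rw.methods, while ro.baseMethods stays at its compile-time contents.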
Structure of method
With all of that covered, we can now take a brief look at the structure of a method. Like classes and objects, a method is also a structure in memory.
struct method_t {
    SEL name;
    const char *types;
    IMP imp;
};
It contains the name of the method, its type, and the implementation pointer IMP:
The structure of the -[XXObject hello] method above is as follows:
name = "hello"
types = 0x0000000100000fa4 "v16@0:8"
imp = 0x0000000100000e90 (method`-[XXObject hello] at XXObject.m:13)
There is not much to say about the name of the method. The type of the method is a rather strange-looking string, "v16@0:8", which objc calls a type encoding. You can read the official documentation to learn more about type encodings.
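To see how such a string breaks down, here is a small C++ sketch that splits an encoding into pairs of a type code and the number that follows it. For a method like -[XXObject hello], which takes no arguments and returns void, the encoding on x86_64 is typically v16@0:8. The names encoded_item and split_encoding are invented for this example, and only single-character type codes are handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

struct encoded_item {
    char type;   // single-character type code, e.g. 'v', '@' or ':'
    int number;  // the decimal number that follows the code
};

// Splits an encoding such as "v16@0:8" into (type code, number) pairs.
// Multi-character codes (structs, pointers, qualifiers) are not supported.
std::vector<encoded_item> split_encoding(const std::string &enc) {
    std::vector<encoded_item> items;
    std::size_t i = 0;
    while (i < enc.size()) {
        encoded_item item{enc[i++], 0};
        while (i < enc.size() &&
               std::isdigit(static_cast<unsigned char>(enc[i]))) {
            item.number = item.number * 10 + (enc[i++] - '0');
        }
        items.push_back(item);
    }
    return items;
}
```

For "v16@0:8" this yields three pairs: v with 16, @ with 0, and : with 8. In the type encoding scheme, v is void, @ is an object and : is a selector, so the string is commonly read as a void return value with 16 bytes of arguments, self at offset 0 and _cmd at offset 8.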
For the implementation of the method, lldb shows where the method is located in the source file.
Summary
While analyzing where methods are located in memory, I initially kept trying to find the place where baseMethods in the read-only structure class_ro_t is first set, in order to understand how the methods of a class are loaded. I traced upward from methodizeClass all the way to _objc_init and still found nothing that sets baseMethods in the read-only area.
In addition, after runtime initialization but before realizeClass, the class_rw_t obtained from class_data_bits_t was always wrong. This puzzled me at first, until I found in realizeClass that at this point the data is not a class_rw_t structure but a class_ro_t, which explained the error.
Later, it suddenly occurred to me that some of the methods, properties and protocols of a class are determined at compile time (members such as baseMethods, as well as the location of the class in memory, are fixed at compile time).
- The location of the class in memory is determined at compile time. Modifying the code later will not change the location in memory.
- The methods, properties and protocols of the class are stored in the "wrong" location at compile time. Only after realizeClass has run are they placed in the read-only area class_ro_t that class_rw_t points to, so that methods can be added to class_rw_t at run time without affecting the read-only structure of the class.
- The members of class_ro_t cannot be changed at run time. Adding a method modifies the methods list in class_rw_t, not baseMethods in class_ro_t. Adding methods will be analyzed in a later article.