Do you really know the load method?

Time:2020-11-29

Watch the repository to get timely updates: iOS-Source-Code-Analyze
Follow: Draveness · GitHub

Because the objc runtime can only be compiled on macOS, all the code in this article runs on macOS, that is, on the x86_64 architecture. Anything that behaves differently on arm64 will be pointed out explicitly.

Foreword

The title of this article is not so much a question for readers as a question for the author himself: do I really understand +load?

+load is quite different from other methods in Objective-C. It is a hook method that the objc runtime calls when the whole file has just been loaded into the runtime, before the main function is called. The keywords are:

  • The file has just been loaded

  • Before the main function

  • Hook method

Before I read the objc source code, I believed I understood the +load method very well. Only after reading the implementation in the source did I realize that what I thought I knew was just my own assumption.

This article assumes that you know:

  • Have used the +load method

  • Know the order in which +load methods are called (this article will introduce it briefly)

This article will not spend much space on what the +load method is for (it does not have that many uses anyway). It focuses on the following two questions:

  • How is the +load method called?

  • Why are +load methods called in this order?

Call stack of load method

First, let's use the call stack of a load method to analyze how it is called.

Here is the full code of the program:

// main.m
#import <Foundation/Foundation.h>

@interface XXObject : NSObject @end

@implementation XXObject

+ (void)load {
    NSLog(@"XXObject load");
}

@end

int main(int argc, const char * argv[]) {
    @autoreleasepool { }
    return 0;
}

The code implements nothing but the +load method of XXObject, and the main function is empty.

Although main calls no methods, running the program still prints the string XXObject load, which means the +load method was called.
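As a loose analogy only (this is not how the runtime implements +load), plain C built with Clang or GCC can also run code before main, through the constructor function attribute. The names early_hook, hook_ran and hook_ran_before_main below are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag recording whether the hook has run. */
int hook_ran = 0;

/* Clang and GCC run functions marked with the constructor attribute
 * before main() is entered, in the same spirit as +load. */
__attribute__((constructor))
void early_hook(void) {
    hook_ran = 1;
    puts("early_hook: running before main");
}

/* Returns 1 if the hook had already run at the time of the call. */
int hook_ran_before_main(void) {
    return hook_ran;
}
```

Calling hook_ran_before_main() as the first statement of main should report that the hook has already run, just as XXObject load is printed even though main does nothing.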

Use symbolic breakpoints

Use Xcode to add a symbolic breakpoint on +[XXObject load].

Note that there is no space between + and [.

Why add a symbolic breakpoint? Because it looks more advanced.

Rerun the program. This time execution stops on the line NSLog(@"XXObject load");.

The call stack on the left side of Xcode shows clearly which methods have been called:

0  +[XXObject load]
1  call_class_loads()
2  call_load_methods
3  load_images
4  dyld::notifySingle(dyld_image_states, ImageLoader const*)
11 _dyld_start

dyld is short for the dynamic link editor; it is Apple's dynamic linker.

After the kernel has prepared the program, dyld takes care of the rest of the work. This article does not cover it in detail.

Whenever a new image is loaded, frame 3, load_images, is executed. This callback is registered in _objc_init when the runtime is initialized (a later article will describe it):

dyld_register_image_state_change_handler(dyld_image_state_dependents_initialized, 0/*not batch*/, &load_images);

When a new image is loaded into the runtime, load_images is called and receives infoList, a list with the information of the newly loaded images:

const char *
load_images(enum dyld_image_states state, uint32_t infoCount,
            const struct dyld_image_info infoList[])
{
    bool found;

    found = false;
    for (uint32_t i = 0; i < infoCount; i++) {
        if (hasLoadMethods((const headerType *)infoList[i].imageLoadAddress)) {
            found = true;
            break;
        }
    }
    if (!found) return nil;

    recursive_mutex_locker_t lock(loadMethodLock);

    {
        rwlock_writer_t lock2(runtimeLock);
        found = load_images_nolock(state, infoCount, infoList);
    }

    if (found) {
        call_load_methods();
    }

    return nil;
}
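The control flow of load_images can be reduced to a few lines of plain C. The following is only a simplified model, not the runtime's code: ToyImage, any_image_has_load, toy_load_images, prepared_images and call_rounds are hypothetical names, the locks are left out, and the preparing and calling steps are replaced by counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for dyld_image_info: it only records whether
 * the image contains any +load method. */
typedef struct {
    const char *path;
    bool has_load_methods;
} ToyImage;

int prepared_images = 0;   /* images handed to the "prepare" step */
int call_rounds = 0;       /* times the "call" step was triggered */

/* Models the cheap pre-check: stop at the first image with +load. */
bool any_image_has_load(const ToyImage *images, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (images[i].has_load_methods) return true;
    }
    return false;
}

/* Models load_images: return early, otherwise prepare, then call. */
void toy_load_images(const ToyImage *images, size_t count) {
    if (!any_image_has_load(images, count)) return;  /* nothing to do */

    /* The real function takes loadMethodLock and runtimeLock here. */
    bool found = false;
    for (size_t i = count; i-- > 0; ) {
        if (!images[i].has_load_methods) continue;
        prepared_images++;                /* stands for prepare_load_methods */
        found = true;
    }
    if (found) call_rounds++;             /* stands for call_load_methods */
}
```

In this model, a batch of images without any +load method returns before either counter changes, which mirrors the early `return nil` in the original function.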

What is an image?

This raises a question: what is an image? Let's set a breakpoint and print all the loaded images:

The console output looks roughly like this. We can see that an image is not an Objective-C source file; it appears to be the compiled product of a target.

...
(const dyld_image_info) $52 = {
  imageLoadAddress = 0x00007fff8a144000
  imageFilePath = 0x00007fff8a144168 "/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices"
  imageFileModDate = 1452737802
}
(const dyld_image_info) $53 = {
  imageLoadAddress = 0x00007fff946d9000
  imageFilePath = 0x00007fff946d9480 "/usr/lib/liblangid.dylib"
  imageFileModDate = 1452737618
}
(const dyld_image_info) $54 = {
  imageLoadAddress = 0x00007fff88016000
  imageFilePath = 0x00007fff88016d40 "/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation"
  imageFileModDate = 1452737917
}
(const dyld_image_info) $55 = {
  imageLoadAddress = 0x0000000100000000
  imageFilePath = 0x00007fff5fbff8f0 "/Users/apple/Library/Developer/Xcode/DerivedData/objc-dibgivkseuawonexgbqssmdszazo/Build/Products/Debug/debug-objc"
  imageFileModDate = 0
}

Many of them are dynamic libraries, along with frameworks provided by Apple such as Foundation and CoreServices. All of them pass through load_images, and each imageFilePath is the path of the corresponding binary file.

The last path in the output, however, points to an executable file, and running it directly gives the same result as running it from Xcode.

Preparing the +load methods

Back to load_images. If a +load symbol is found while scanning the images:

for (uint32_t i = 0; i < infoCount; i++) {
    if (hasLoadMethods((const headerType *)infoList[i].imageLoadAddress)) {
        found = true;
        break;
    }
}

it enters load_images_nolock to look for the load methods:

bool load_images_nolock(enum dyld_image_states state,uint32_t infoCount,
                   const struct dyld_image_info infoList[])
{
    bool found = NO;
    uint32_t i;

    i = infoCount;
    while (i--) {
        const headerType *mhdr = (headerType*)infoList[i].imageLoadAddress;
        if (!hasLoadMethods(mhdr)) continue;

        prepare_load_methods(mhdr);
        found = YES;
    }

    return found;
}
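One detail of load_images_nolock is easy to miss: the loop `i = infoCount; while (i--)` walks infoList from its last entry to its first. The small helper below, whose name reverse_visit_order is made up for this sketch, only records the indices that such a loop visits.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the indices visited by `i = count; while (i--)` into `out`
 * and returns how many were visited. `out` must hold `count` slots. */
uint32_t reverse_visit_order(uint32_t count, uint32_t *out) {
    uint32_t visited = 0;
    uint32_t i = count;
    while (i--) {              /* tests i, then decrements: count-1 ... 0 */
        out[visited++] = i;
    }
    return visited;
}
```

For a count of 3 the recorded order is 2, 1, 0, and for a count of 0 the loop body never runs.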

It calls prepare_load_methods to prepare the load methods, that is, to add every load method that needs to be called to a list (described in a later section):

void prepare_load_methods(const headerType *mhdr)
{
    size_t count, i;

    runtimeLock.assertWriting();

    classref_t *classlist = 
        _getObjc2NonlazyClassList(mhdr, &count);
    for (i = 0; i < count; i++) {
        schedule_class_load(remapClass(classlist[i]));
    }

    category_t **categorylist = _getObjc2NonlazyCategoryList(mhdr, &count);
    for (i = 0; i < count; i++) {
        category_t *cat = categorylist[i];
        Class cls = remapClass(cat->cls);
        if (!cls) continue;  // category for ignored weak-linked class
        realizeClass(cls);
        assert(cls->ISA()->isRealized());
        add_category_to_loadable_list(cat);
    }
}

After _getObjc2NonlazyClassList returns the image's list of non-lazy classes, remapClass resolves the pointer for each class, and schedule_class_load then recursively adds the current class, together with any superclass whose +load has not been scheduled yet, to the list.

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    assert(cls->isRealized());

    if (cls->data()->flags & RW_LOADED) return;

    schedule_class_load(cls->superclass);

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}

Before add_class_to_loadable_list(cls) adds the current class to the list, the parent class is added first, which guarantees that the parent's load method is called before the subclass's.
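The superclass-first recursion can be tried out in isolation. This is a minimal sketch in plain C, not the runtime's implementation: ToyClass, its scheduled flag (standing in for RW_LOADED), the fixed-size schedule array and toy_schedule_class_load are all hypothetical, and every class is assumed to implement +load.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a class: a name, a superclass pointer
 * and a flag playing the role of RW_LOADED. */
typedef struct ToyClass {
    const char *name;
    struct ToyClass *superclass;
    bool scheduled;
} ToyClass;

#define MAX_SCHEDULED 16
const ToyClass *schedule[MAX_SCHEDULED];
size_t schedule_used = 0;

/* Mirrors the shape of schedule_class_load: recurse into the
 * superclass first, then append the class itself exactly once. */
void toy_schedule_class_load(ToyClass *cls) {
    if (!cls) return;               /* walked past the root class */
    if (cls->scheduled) return;     /* already in the list */

    toy_schedule_class_load(cls->superclass);

    assert(schedule_used < MAX_SCHEDULED);
    schedule[schedule_used++] = cls;
    cls->scheduled = true;
}
```

Scheduling a leaf class of a three-level hierarchy fills the list in the order root, middle, leaf, and scheduling the middle class again afterwards adds nothing, because its flag is already set.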

Calling the +load methods

Once the image has been loaded into the runtime and the load methods are prepared, call_load_methods runs and starts calling them:

void call_load_methods(void)
{
    ...

    do {
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        more_categories = call_category_loads();

    } while (loadable_classes_used > 0  ||  more_categories);

    ...
}

Within this process, call_class_loads takes each class from loadable_classes, the list of classes waiting to be loaded, finds its implementation of @selector(load), and calls it:

static void call_class_loads(void)
{
    int i;
    
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;
    
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue;

        (*load_method)(cls, SEL_load);
    }
    
    if (classes) free(classes);
}

The line (*load_method)(cls, SEL_load) is what calls +[XXObject load].
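call_class_loads detaches the global list before iterating over it, presumably so that classes added while a load method is running end up in a fresh list instead of the one being walked. The sketch below models that behavior in plain C under this assumption; ToyLoadable, pending, toy_add, toy_call_class_loads, first_load and late_load are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*toy_load_fn)(void);

/* Hypothetical miniature of struct loadable_class. */
typedef struct {
    const char *name;
    toy_load_fn method;
} ToyLoadable;

ToyLoadable *pending = NULL;   /* plays the role of loadable_classes */
int pending_used = 0;
int pending_allocated = 0;
int calls_made = 0;

/* Appends an entry, growing the array when it is full. */
void toy_add(const char *name, toy_load_fn method) {
    if (pending_used == pending_allocated) {
        int grown = pending_allocated * 2 + 16;
        ToyLoadable *bigger = realloc(pending, (size_t)grown * sizeof(ToyLoadable));
        if (!bigger) abort();
        pending = bigger;
        pending_allocated = grown;
    }
    pending[pending_used].name = name;
    pending[pending_used].method = method;
    pending_used++;
}

/* Detach the list first, so entries added while the load methods
 * run land in a new list rather than the one being iterated. */
void toy_call_class_loads(void) {
    ToyLoadable *batch = pending;
    int used = pending_used;
    pending = NULL;
    pending_allocated = 0;
    pending_used = 0;

    for (int i = 0; i < used; i++) {
        calls_made++;
        batch[i].method();
    }
    free(batch);
}

void late_load(void) { }

/* A load method that schedules another class while it runs. */
void first_load(void) { toy_add("Late", late_load); }
```

After one pass over a list that contains only first_load, the detached batch has been consumed and the global list holds the single entry that was added during the pass, ready for the next pass.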

How the loadable_classes list is managed is described below.

So far, we have answered the first question:

Q: How is the load method called?

A: When the Objective-C runtime is initialized, it uses dyld_register_image_state_change_handler to register a callback that fires every time a new image is added to the runtime. That callback, load_images, adds every class that has a load method to the loadable_classes list, and then looks up each load implementation in the list and calls it.

Loading management

objc manages loading mainly through two lists: loadable_classes and loadable_categories.

The call process is likewise split into two parts: preparing the load methods and calling them. To me, these two parts look very much like a producer and a consumer.

add_class_to_loadable_list is responsible for adding classes to the loadable_classes collection, and call_class_loads is responsible for consuming the elements in it.

Categories follow a similar model, except that they use another list, loadable_categories.

“Producing” loadable_class

The call chain load_images -> load_images_nolock -> prepare_load_methods -> schedule_class_load -> add_class_to_loadable_list adds every class that has not been loaded yet to the loadable_classes array:

void add_class_to_loadable_list(Class cls)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = cls->getLoadMethod();
    if (!method) return;

    if (loadable_classes_used == loadable_classes_allocated) {
        loadable_classes_allocated = loadable_classes_allocated*2 + 16;
        loadable_classes = (struct loadable_class *)
            realloc(loadable_classes,
                              loadable_classes_allocated *
                              sizeof(struct loadable_class));
    }
    
    loadable_classes[loadable_classes_used].cls = cls;
    loadable_classes[loadable_classes_used].method = method;
    loadable_classes_used++;
}

When the method is called, it:

  1. Gets the load method from the class with method = cls->getLoadMethod(); and returns immediately if there is none

  2. Checks whether the loadable_classes array is already full: loadable_classes_used == loadable_classes_allocated

  3. If it is full, grows the array based on its current size with realloc

  4. Appends the class that was passed in, together with the implementation of its load method, to the list
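The growth rule in step 3 is allocated*2 + 16, so starting from an empty array the capacities are 16, 48, 112, 240 and so on. A short sketch makes the arithmetic concrete; next_capacity and growth_steps_for are hypothetical helper names, not functions of the runtime.

```c
#include <assert.h>

/* Capacity after one growth step, using the rule from
 * add_class_to_loadable_list. */
int next_capacity(int allocated) {
    return allocated * 2 + 16;
}

/* Number of growth steps needed, starting from an empty array,
 * before `count` entries fit. */
int growth_steps_for(int count) {
    int allocated = 0;
    int steps = 0;
    while (allocated < count) {
        allocated = next_capacity(allocated);
        steps++;
    }
    return steps;
}
```

Under this rule, 100 classes with a +load method need three reallocations (16, 48, then 112 slots), so the cost of growing the list stays small even for large images.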

The other list, loadable_categories, holds the categories and has a similar function, add_category_to_loadable_list:

void add_category_to_loadable_list(Category cat)
{
    IMP method;

    loadMethodLock.assertLocked();

    method = _category_getLoadMethod(cat);

    if (!method) return;
    
    if (loadable_categories_used == loadable_categories_allocated) {
        loadable_categories_allocated = loadable_categories_allocated*2 + 16;
        loadable_categories = (struct loadable_category *)
            realloc(loadable_categories,
                              loadable_categories_allocated *
                              sizeof(struct loadable_category));
    }

    loadable_categories[loadable_categories_used].cat = cat;
    loadable_categories[loadable_categories_used].method = method;
    loadable_categories_used++;
}

Its implementation is almost identical to that of add_class_to_loadable_list.

That completes the supply side of loadable_classes and loadable_categories. Next, the elements of these lists get consumed.

“Consuming” loadable_class

Calling the load methods is the process of consuming loadable_classes: along the path load_images -> call_load_methods -> call_class_loads, each class and its method are taken out of loadable_classes and load is called:

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        while (loadable_classes_used > 0) {
            call_class_loads();
        }

        more_categories = call_category_loads();

    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

The function above calls the load method of every class in loadable_classes and every category in loadable_categories.

do {
    while (loadable_classes_used > 0) {
        call_class_loads();
    }

    more_categories = call_category_loads();

} while (loadable_classes_used > 0  ||  more_categories);

The calls happen in this order:

  1. Keep calling the classes' +load methods until loadable_classes is empty

  2. Call call_category_loads once to load the categories

  3. If loadable_classes has entries again or there are more categories, go back and keep calling load methods
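The three steps can be traced with a toy version of the loop. This is a sketch under simplifying assumptions, not the runtime's code: the lists are reduced to counters, late_classes simulates a category +load that causes more classes to be queued, and the names drain_class_loads, drain_category_loads, drain_load_methods and trace are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

int classes_pending = 0;      /* plays the role of loadable_classes_used */
int categories_pending = 0;   /* categories still waiting for +load */
int late_classes = 0;         /* classes that a category +load will queue */
char trace[64];               /* 'C' per class batch, 'K' per category pass */

/* Consumes every pending class in one batch. */
void drain_class_loads(void) {
    classes_pending = 0;
    strcat(trace, "C");
}

/* Consumes the pending categories. Returns true if new categories
 * were queued meanwhile; this model never queues any, but it may
 * queue late classes once. */
bool drain_category_loads(void) {
    strcat(trace, "K");
    categories_pending = 0;
    classes_pending += late_classes;
    late_classes = 0;
    return false;
}

/* Same loop shape as call_load_methods. */
void drain_load_methods(void) {
    bool more_categories;
    do {
        while (classes_pending > 0) {
            drain_class_loads();
        }
        more_categories = drain_category_loads();
    } while (classes_pending > 0 || more_categories);
}
```

Starting with two pending classes, one pending category and one late class, the trace reads CKCK: classes, categories, then the late class, and finally one more category pass that finds nothing new and ends the loop.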

Compared with calling the load methods of classes, calling the load methods of categories is a little more complicated:

static bool call_category_loads(void)
{
    int i, shift;
    bool new_categories_added = NO;
    //1. Get the list of categories that can be loaded currently
    struct loadable_category *cats = loadable_categories;
    int used = loadable_categories_used;
    int allocated = loadable_categories_allocated;
    loadable_categories = nil;
    loadable_categories_allocated = 0;
    loadable_categories_used = 0;

    for (i = 0; i < used; i++) {
        Category cat = cats[i].cat;
        load_method_t load_method = (load_method_t)cats[i].method;
        Class cls;
        if (!cat) continue;

        cls = _category_getClass(cat);
        if (cls  &&  cls->isLoadable()) {
            //2. If the class is loadable (cls && cls->isLoadable()), call the category's load method
            (*load_method)(cls, SEL_load);
            cats[i].cat = nil;
        }
    }

    //3. Remove all categories that have been loaded from the loadable_categories list
    shift = 0;
    for (i = 0; i < used; i++) {
        if (cats[i].cat) {
            cats[i-shift] = cats[i];
        } else {
            shift++;
        }
    }
    used -= shift;

    //4. Reallocate memory for loadable_categories and reset its value
    new_categories_added = (loadable_categories_used > 0);
    for (i = 0; i < loadable_categories_used; i++) {
        if (used == allocated) {
            allocated = allocated*2 + 16;
            cats = (struct loadable_category *)
                realloc(cats, allocated *
                                  sizeof(struct loadable_category));
        }
        cats[used++] = loadable_categories[i];
    }

    if (loadable_categories) free(loadable_categories);

    if (used) {
        loadable_categories = cats;
        loadable_categories_used = used;
        loadable_categories_allocated = allocated;
    } else {
        if (cats) free(cats);
        loadable_categories = nil;
        loadable_categories_used = 0;
        loadable_categories_allocated = 0;
    }

    return new_categories_added;
}

This function is rather long, so let's go through it step by step:

  1. Get the list of categories that can currently be loaded

  2. If the class of a category is loadable (cls && cls->isLoadable()), call the category's load method

  3. Remove every category that has been loaded from the loadable_categories list

  4. Reallocate memory for loadable_categories and reset its value
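Step 3 is an in-place compaction: entries whose cat was set to nil are dropped, and the surviving entries slide left while keeping their order. The following sketch isolates that loop; ToyCategoryEntry and compact_entries are hypothetical names, and a string pointer stands in for the Category value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of struct loadable_category; a NULL `cat`
 * marks an entry whose +load has already been called. */
typedef struct {
    const char *cat;
} ToyCategoryEntry;

/* Same loop as step 3 of call_category_loads: slide surviving
 * entries left over the cleared ones and return the new count. */
int compact_entries(ToyCategoryEntry *cats, int used) {
    int shift = 0;
    for (int i = 0; i < used; i++) {
        if (cats[i].cat) {
            cats[i - shift] = cats[i];
        } else {
            shift++;
        }
    }
    return used - shift;
}
```

Given five entries of which the first and third are cleared, the function returns 3 and the remaining entries occupy the first three slots in their original order.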

Order of calls

You have probably heard before that two rules govern the order of load calls:

  1. A parent class is called before its subclasses

  2. A class is called before its categories

This behavior matches our intuition. Let's look at why it happens.

The first rule follows from the implementation of schedule_class_load:

static void schedule_class_load(Class cls)
{
    if (!cls) return;
    assert(cls->isRealized());

    if (cls->data()->flags & RW_LOADED) return;

    schedule_class_load(cls->superclass);

    add_class_to_loadable_list(cls);
    cls->setInfo(RW_LOADED); 
}

The line schedule_class_load(cls->superclass) guarantees that a parent class whose load method has not been called yet is always added to the loadable_classes array before the child class, which keeps the call order correct.

The order between the load methods of classes and categories is implemented in call_load_methods:

do {
    while (loadable_classes_used > 0) {
        call_class_loads();
    }

    more_categories = call_category_loads();

} while (loadable_classes_used > 0  ||  more_categories);

The do-while loop above ensures, to some extent, that the load methods of classes are called before those of categories. On its own, however, it cannot fully guarantee the correct order.

If the image that contains a category is loaded into the runtime before the image that contains its class, the loop above cannot guarantee the right order, so call_category_loads still has to check whether the class has been loaded (that is, whether its load method has been called):

if (cls  &&  cls->isLoadable()) {
    (*load_method)(cls, SEL_load);
    cats[i].cat = nil;
}

Here we check that the class exists and that it is loadable. Only if both hold is the category's load method called.
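The effect of this check, as the article describes it, is that a category whose class is not ready yet stays in the list and is retried on a later pass. Here is a toy model of that behavior in plain C; ToyOwner, ToyCat, the loaded flag (standing in for the result of cls->isLoadable()) and run_category_pass are hypothetical and do not appear in the runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniatures; `loaded` stands in for cls->isLoadable(). */
typedef struct { const char *name; bool loaded; } ToyOwner;
typedef struct { const char *name; ToyOwner *owner; bool called; } ToyCat;

/* One pass over the pending categories: call +load only for those
 * whose class is ready, and report how many are still waiting. */
int run_category_pass(ToyCat *cats, int count) {
    int waiting = 0;
    for (int i = 0; i < count; i++) {
        if (cats[i].called) continue;
        if (cats[i].owner && cats[i].owner->loaded) {
            cats[i].called = true;   /* the category's +load runs here */
        } else {
            waiting++;               /* class not ready yet: try later */
        }
    }
    return waiting;
}
```

A first pass calls only the category whose class is already loaded and reports one category waiting; once the other class becomes ready, a second pass calls the remaining category and nothing is left over.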

Uses of load

load is arguably the earliest-running method we can reach in everyday development: it is called before the main function runs.

Its call is not lazy, it is called only once during a run of the program, and, most importantly, if both a class and its categories implement load, all of them are called, unlike other methods, where the implementation in a category overrides the one in the class. This makes the load method an excellent place for method swizzling.

However, because load runs so early, the environment may not be ideal: some classes may need to be loaded before others, and we cannot guarantee that. On the other hand, all frameworks have already been loaded into the runtime at this point, so calling methods from the frameworks is safe.
