Swoole coroutine journey

Time:2020-2-18

Author: Han Tianfeng

What is a coroutine?

In fact, the concept has been around for a long time. According to Donald Knuth, the term coroutine was coined by Melvin Conway in 1958, after he applied it to the construction of an assembly program. The first published explanation of the coroutine appeared later, in 1963. It has a longer history than the C language. Conceptually, a coroutine is a kind of subroutine that can transfer control of the program through yield. The relationship between coroutines is not that of caller and callee; they are symmetric and equal. Because a coroutine is fully controlled by user code, it is also called a user-level thread. Coroutines are scheduled by the user in a non-preemptive manner, not by the operating system. Because of this, there is no overhead from context switches in the system scheduler, and coroutines are lightweight, efficient, and fast. (Most implementations are non-preemptive, but golang, for example, added preemptive scheduling in 1.4: if one coroutine is stuck in an infinite loop, it is made to give up the CPU when necessary so that other coroutines do not starve.)

Thanks to the popularity and rapid development of golang in China, coroutines have become popular in recent years. Many languages now support coroutines, such as golang, Lua, Python, C#, and JavaScript. You can also build a coroutine model in C/C++ with very little code. PHP has its own coroutine implementation as well, the generator, which we will not discuss here.
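To give a feel for how little C is needed, here is a minimal sketch built on the POSIX ucontext functions (getcontext, makecontext, swapcontext). It is an illustration only, not how Swoole is implemented, and the names coro_yield, coro_resume, worker, and run_demo are made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, co_ctx;  /* saved contexts: scheduler and coroutine */
static char co_stack[STACK_SIZE];    /* private C stack for the coroutine */
static char trace[64];               /* records the order of execution */
static int co_done;

/* Give control back to whoever resumed the coroutine. */
static void coro_yield(void) { swapcontext(&co_ctx, &main_ctx); }

/* Continue the coroutine from the point where it last yielded. */
static void coro_resume(void) { swapcontext(&main_ctx, &co_ctx); }

/* Coroutine body: runs in three slices separated by yields. */
static void worker(void)
{
    strcat(trace, "A");
    coro_yield();
    strcat(trace, "B");
    coro_yield();
    strcat(trace, "C");
    co_done = 1;
    /* returning here switches to uc_link, i.e. main_ctx */
}

/* Drive the coroutine to completion and return the execution trace. */
const char *run_demo(void)
{
    trace[0] = '\0';
    co_done = 0;

    int rc = getcontext(&co_ctx);
    assert(rc == 0);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, worker, 0);

    while (!co_done) {
        coro_resume();       /* run the coroutine until its next yield */
        strcat(trace, "-");  /* the "scheduler" runs between slices */
    }
    return trace;
}
```

Calling run_demo() returns "A-B-C-": the worker and the caller take turns, and each side continues exactly where it left off, with no caller/callee relationship between them.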

Swoole 1.x

Swoole first came to our attention as a high-performance network communication engine. Swoole 1.x code is mainly based on asynchronous callbacks. Although performance is very good, many developers find that as project complexity grows, writing business code with asynchronous callbacks runs against normal human thinking. Especially when callbacks are nested several layers deep, development and maintenance costs rise exponentially and the chance of errors increases dramatically. The ideal coding style is synchronous-looking code that achieves asynchronous, non-blocking behavior. So Swoole began exploring coroutines very early.

The original version of coroutines was implemented on top of PHP generators and yield. See the introduction to coroutines on Nikita's early blog. For combining PHP with Swoole's event-driven model, see Tencent's open source TSF framework. We used this framework in many production projects, and it really lets you feel the pleasure of writing asynchronous code in a synchronous style. However, reality is always cruel, and this approach has several fatal disadvantages:

  • All logic that actively cedes control requires the yield keyword. This gives programmers many chances to make mistakes, and shifts their attention to understanding the principles of generator syntax.
  • Because the syntax is incompatible with old projects, transforming an old project is too complex and expensive.

As a result, it is not easy to use in either new or old projects.

Swoole 2.x

From 2.x on, all coroutines are native coroutines implemented in the kernel, with no yield keyword. Version 2.0 is a very important milestone: it implements management of the PHP stack, going deep into the Zend kernel to manipulate the PHP stack when a coroutine is created, switched, and ended.

2.x mainly uses setjmp/longjmp to implement coroutines. Many C projects use this pair to implement try-catch-finally; the Zend kernel's own usage is a good reference. The first call to setjmp returns 0. When longjmp jumps, setjmp returns the value passed to longjmp (or 1 if that value is 0). setjmp/longjmp can only redirect control flow: they restore the PC and the stack pointer, but they cannot restore a stack frame, so many problems arise. For example, if the function that called setjmp has already returned by the time longjmp is called, the stack frame of that moment has been destroyed and the behavior is undefined. Suppose there is a call chain:

func0() -> func1() -> ... -> funcN()

The behavior of the program is well defined only when setjmp is called in func{i}() and longjmp is called in func{i+k}(), that is, deeper in the same call chain.
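The safe case can be sketched in a few lines of C. This is an illustrative example, not Swoole or Zend code: func0, func1, and func2 mirror the call chain above, and the variable thrown is a made-up helper that carries the error code, in the spirit of a try/catch emulation.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;   /* jump target recorded by setjmp in func0 */
static int thrown;    /* value carried across the jump ("exception code") */

/* Deepest function in the chain: "throws" when code is non-zero. */
static void func2(int code)
{
    if (code != 0) {
        thrown = code;
        /* func0's frame is still alive here, so this jump is well defined */
        longjmp(env, 1);
    }
}

static void func1(int code) { func2(code); }

/* Returns 0 when the chain returns normally, otherwise the thrown code. */
int func0(int code)
{
    if (setjmp(env) == 0) {
        /* first return from setjmp: the "try" part */
        func1(code);
        return 0;
    }
    /* second return from setjmp, reached through longjmp: the "catch" part */
    assert(thrown != 0);
    return thrown;
}
```

Here func0(0) returns 0 and func0(42) returns 42. The jump only unwinds toward a frame that still exists; jumping into func0 after it had returned would be undefined behavior, which is why this mechanism alone cannot resume a suspended coroutine.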

Swoole 3.x

3.x had a very short life cycle. It mainly drew on the fiber-ext project and used the VM interrupt mechanism of PHP 7. This mechanism sets a flag in the VM; the flag is checked when certain instructions execute (such as jumps and function calls), and on a hit the corresponding hook function runs and switches the VM stack, which implements coroutines.

Swoole 4.x

Starting from 4.x, Swoole implements a coroutine kernel in dual-stack mode. All IO and system operations are encapsulated in the lower layer, making the kernel thoroughly coroutine-based. In addition, it provides a new runtime hook module that can turn existing synchronous PHP code into coroutine mode.

Swoole4 coroutine analysis

Start with the simplest coroutine example:

<?php
go(function(){
    echo "coro 1 start\n";
    co::sleep(1);
    echo "coro 1 exit\n";
});
echo "main flag\n";
go(function(){
    echo "coro 2 start\n";
    co::sleep(1);
    echo "coro 2 exit\n";
});
echo "main end\n";
//The output content is
coro 1 start
main flag
coro 2 start
main end
coro 1 exit
coro 2 exit

Notice that a native coroutine jumps in the middle of a function: control flow jumps from line 4 to line 7, the second go() call starts at line 8, control then jumps from line 10 to line 13, then back to line 5, and finally to line 11. Why can Swoole's coroutines execute this way? We will analyze it step by step.

PHP is an interpreted language, so a script must be compiled into intermediate bytecode before it can run. First, the script is compiled into an opcode array through lexical and syntax analysis, and then the VM engine executes it. Here we focus only on the VM execution part, where several important data structures deserve attention.

Opcodes

struct _zend_op {
    const void *handler;      // C handler function corresponding to each opcode
    znode_op op1;             // operand 1
    znode_op op2;             // operand 2
    znode_op result;          // return value
    uint32_t extended_value;
    uint32_t lineno;
    zend_uchar opcode;        // opcode instruction
    zend_uchar op1_type;      // operand 1 type
    zend_uchar op2_type;      // operand 2 type
    zend_uchar result_type;   // return value type
};

It is easy to see that opcodes are essentially three-address code: opcode is the instruction type, with two input operands and one output operand. Each instruction may use all or only some of these operands. For example, binary operations such as addition, subtraction, multiplication, and division use all three operands; the ! operation involves only op1 and result; a function call involves whether there is a return value, and so on.
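The three-address idea can be shown with a toy interpreter. This is a simplified sketch, not Zend code: mini_op, its handlers, and mini_run are hypothetical names, and operands are plain slot indexes rather than Zend's znode_op.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mini_op mini_op;
typedef void (*mini_handler)(const mini_op *op, long *slots);

/* A toy instruction: a handler plus two inputs and one output. */
struct mini_op {
    mini_handler handler; /* C function that executes this instruction */
    int op1;              /* slot index of input operand 1 */
    int op2;              /* slot index of input operand 2 (-1 if unused) */
    int result;           /* slot index of the output operand */
};

/* Binary operations use all three operands. */
static void h_add(const mini_op *op, long *s) { s[op->result] = s[op->op1] + s[op->op2]; }
static void h_mul(const mini_op *op, long *s) { s[op->result] = s[op->op1] * s[op->op2]; }

/* Logical NOT uses only op1 and result. */
static void h_not(const mini_op *op, long *s) { s[op->result] = !s[op->op1]; }

/* Execute n instructions in order, like a VM walking an opcode array. */
void mini_run(const mini_op *ops, size_t n, long *slots)
{
    for (size_t i = 0; i < n; i++) {
        assert(ops[i].handler != NULL);
        ops[i].handler(&ops[i], slots);
    }
}

/* Computes (2 + 3) * 4 and the logical NOT of that product. */
long mini_demo(long *not_result)
{
    long slots[6] = {2, 3, 4, 0, 0, 0};
    const mini_op program[] = {
        { h_add, 0, 1, 3 },   /* slots[3] = slots[0] + slots[1] */
        { h_mul, 3, 2, 4 },   /* slots[4] = slots[3] * slots[2] */
        { h_not, 4, -1, 5 },  /* slots[5] = !slots[4]           */
    };
    mini_run(program, sizeof(program) / sizeof(program[0]), slots);
    *not_result = slots[5];
    return slots[4];
}
```

mini_demo() returns 20 and stores 0 in *not_result. The handler pointer in each instruction plays the same role as the handler field of _zend_op: the VM loop itself knows nothing about individual instructions.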

Op Array

zend_op_array: the main PHP script generates one zend_op_array. Each function, eval, or even an assert expression generates a new op_array.

struct _zend_op_array {
    /* Common zend_function header here */
    /* ... */
    uint32_t last;          // number of opcodes in the array
    zend_op *opcodes;       // opcode instruction array
    int last_var;           // number of CVs
    uint32_t T;             // number of IS_TMP_VAR and IS_VAR
    zend_string **vars;     // array of variable names
    /* ... */
    int last_literal;       // number of literals
    zval *literals;         // literal array; read it through zend_op_array->literals + offset
    /* ... */
};

We are already familiar with the fact that PHP functions have their own scopes. This is because each zend_op_array contains all the stack information for its scope. Calls between functions are likewise based on switching between zend_op_arrays.

PHP stack frame

All the state PHP needs for execution is stored in VM stacks linked together as a linked list. Each stack is initialized to 256K by default, and Swoole can set its own stack size (the default for a coroutine is 8K). When a stack runs out of capacity, it is expanded automatically, with the stacks still linked together as a list. On each function call, a new stack frame is allocated in the VM stack space to hold the execution of the current scope. The memory layout of a stack frame is as follows:

+----------------------------------------+
| zend_execute_data                      |
+----------------------------------------+
| VAR[0]                =         ARG[1] | arguments
| ...                                    |
| VAR[num_args-1]       =         ARG[N] |
| VAR[num_args]         =   CV[num_args] | remaining CVs
| ...                                    |
| VAR[last_var-1]       = CV[last_var-1] |
| VAR[last_var]         =         TMP[0] | TMP/VARs
| ...                                    |
| VAR[last_var+T-1]     =         TMP[T] |
| ARG[N+1] (extra_args)                  | extra arguments
| ...                                    |
+----------------------------------------+

The last structure to be introduced is the most important one.

struct _zend_execute_data {
    const zend_op       *opline;           // currently executing opcode; after initialization it points to the start of zend_op_array
    zend_execute_data   *call;
    zval                *return_value;     // return value
    zend_function       *func;             // currently executing function (empty when not in a function call)
    zval                 This;             /* this + call_info + num_args    */
    zend_class_entry    *called_scope;     // class of the current call
    zend_execute_data   *prev_execute_data;
    zend_array          *symbol_table;     // global variable symbol table
    void               **run_time_cache;   /* cache op_array->run_time_cache */
    zval                *literals;         /* cache op_array->literals       */
};

prev_execute_data points to the previous stack frame. When the current frame finishes executing, the current execution pointer (similar to a PC) is pointed back at that frame. PHP's execution flow loads zend_op_arrays onto stack frames and executes them one after another. The process can be divided into the following steps:

  • 1: Allocate a stack frame from the VM stack for the op_array to be executed, with the structure shown above. Initialize the global variable symbol table, point the global pointer EG(current_execute_data) at the newly allocated zend_execute_data frame, and point EX(opline) at the start of the op_array.
  • 2: Starting from EX(opline), call the C handler of each opcode in turn (that is, _zend_op.handler). After each opcode executes, EX(opline)++ moves on to the next opcode, until all opcodes have been executed. When a call to a function or class method is encountered:

    • Look up the function's zend_op_array by function_name in EG(function_table), then repeat step 1: assign EG(current_execute_data) to the new frame's prev_execute_data, point EG(current_execute_data) at the new zend_execute_data frame, and start executing the new frame from zend_execute_data.opline. When the function finishes, point EG(current_execute_data) back at EX(prev_execute_data), release the allocated stack frame, and continue at the opline that follows the function call.
  • 3: After all opcodes have been executed, the stack frame allocated in step 1 is released and the execution phase ends.
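The frame bookkeeping in these steps can be mimicked with a small linked list. This is a deliberately simplified model, not the Zend implementation: the frame type and the functions frame_push, frame_pop, frame_current, and frame_depth are hypothetical, and frames are allocated with malloc instead of from a VM stack.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for zend_execute_data. */
typedef struct frame {
    const char *func_name;  /* function whose op_array runs in this frame */
    int opline;             /* index of the next instruction in this frame */
    struct frame *prev;     /* caller's frame, like prev_execute_data */
} frame;

/* Plays the role of EG(current_execute_data). */
static frame *current;

/* Steps 1 and 2: allocate a frame, link it to the caller, make it current. */
frame *frame_push(const char *func_name)
{
    frame *f = malloc(sizeof(*f));
    assert(f != NULL);
    f->func_name = func_name;
    f->opline = 0;          /* start at the beginning of the op_array */
    f->prev = current;
    current = f;
    return f;
}

/* Function end: switch back to the caller's frame and release this one. */
void frame_pop(void)
{
    frame *f = current;
    assert(f != NULL);
    current = f->prev;      /* the caller resumes at its own saved opline */
    free(f);
}

frame *frame_current(void) { return current; }

/* Number of frames on the chain, i.e. the current call depth. */
int frame_depth(void)
{
    int depth = 0;
    for (frame *f = current; f != NULL; f = f->prev) {
        depth++;
    }
    return depth;
}
```

In this model a frame can only be left when its function ends. The coroutine support described next removes exactly that restriction by saving the current pointer and installing a different frame chain in the middle of a function.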

With these details of PHP execution in mind, go back to the original example. What a coroutine has to do is change PHP's original mode of operation: instead of switching stack frames only when a function finishes, it must be able to switch to another stack frame at any point while the function's op_array is executing (inside Swoole, the trigger is waiting on IO). Next, we combine the Zend VM and Swoole to analyze how a coroutine stack is created, how it is switched out on IO, how the stack is restored after IO completes, and how the stack frame is destroyed when the coroutine exits. First, the main structure of the PHP part of a coroutine:

  • PHP coroutine task
struct php_coro_task
{
    /* only key fields are listed */
    /*...*/
    zval *vm_stack_top;               // stack top
    zval *vm_stack_end;               // stack bottom
    zend_vm_stack vm_stack;           // current coroutine stack pointer
    /*...*/
    zend_execute_data *execute_data;  // current coroutine stack frame
    /*...*/
    php_coro_task *origin_task;       // previous coroutine stack frame, similar to prev_execute_data
};

A coroutine switch mainly saves and restores context when execution of the current stack is interrupted. Combined with the VM execution flow above, the purpose of these fields becomes clear:

  • The execute_data stack frame pointer obviously needs to be saved and restored.
  • What are the vm_stack* fields for? PHP is a dynamic language and, as analyzed above, every function entry and exit creates and releases a stack frame on the global stack. The corresponding global stack pointers therefore have to be saved and restored correctly, so that every stack frame can be released without leaking memory. (When PHP is compiled in debug mode, each release checks whether the global stack is valid.)
  • origin_task is the previous stack frame, which is resumed automatically after the current coroutine finishes executing.

The main operations involved are:

  • Create: allocate the coroutine's stack frame on the global stack.

    • Creating a coroutine means creating a closure and passing it (think of it as the op_array to be executed) as an argument to Swoole's built-in function go();
  • Yield: on encountering IO, save the context of the current stack frame.
  • Resume: when IO completes, restore the context of the coroutine to be executed to its state before the yield.
  • Exit: when the coroutine's op_array has finished executing, release the stack frame and the related Swoole coroutine data.
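The save and restore halves of yield and resume can be sketched against a mock of the executor globals. Everything here is an assumption made for illustration: mock_globals stands in for the fields Zend keeps in EG(...), mock_task for php_coro_task, and the functions mock_resume and mock_yield are not Swoole's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the executor globals that Zend keeps in EG(...). */
struct mock_globals {
    void *vm_stack_top;
    void *vm_stack_end;
    void *vm_stack;
    void *current_execute_data;
};
static struct mock_globals eg;

/* Stand-in for php_coro_task: one saved context per coroutine. */
struct mock_task {
    void *vm_stack_top;
    void *vm_stack_end;
    void *vm_stack;
    void *execute_data;
    struct mock_task *origin_task;  /* who to return to on yield */
};

/* Copy the live global state into the task. */
static void mock_save(struct mock_task *t)
{
    t->vm_stack_top = eg.vm_stack_top;
    t->vm_stack_end = eg.vm_stack_end;
    t->vm_stack = eg.vm_stack;
    t->execute_data = eg.current_execute_data;
}

/* Install the task's saved state as the live global state. */
static void mock_restore(const struct mock_task *t)
{
    eg.vm_stack_top = t->vm_stack_top;
    eg.vm_stack_end = t->vm_stack_end;
    eg.vm_stack = t->vm_stack;
    eg.current_execute_data = t->execute_data;
}

/* Resume: park the running task `from`, then continue in `to`. */
void mock_resume(struct mock_task *from, struct mock_task *to)
{
    mock_save(from);
    to->origin_task = from;
    mock_restore(to);
}

/* Yield: park the running task `cur`, then continue in its origin. */
void mock_yield(struct mock_task *cur)
{
    assert(cur->origin_task != NULL);
    mock_save(cur);
    mock_restore(cur->origin_task);
}
```

The point of the sketch is the symmetry: each switch is one save plus one restore, and origin_task records where control goes back to, much as prev_execute_data does for an ordinary function call.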

After this introduction, you should have a general understanding of how a Swoole coroutine can jump in the middle of a function. Going back to the initial example with the PHP execution details above, we know the example generates three op_arrays: the main script, coroutine 1, and coroutine 2. We can print the opcodes with a tool to observe them directly. Two tools are commonly used:

//Opcache, version >= PHP 7.1
php -d opcache.opt_debug_level=0x10000 test.php

//VLD, third party extension
php -d vld.active=1 test.php

We use opcache to observe the opcodes before optimization, where the details of the three op_arrays are clearly visible.

php -dopcache.enable_cli=1 -d opcache.opt_debug_level=0x10000 test.php
$_main: ; (lines=11, args=0, vars=0, tmps=4)
    ; (before optimizer)
    ; /path-to/test.php:2-6
L0 (2):     INIT_FCALL 1 96 string("go")
L1 (2):     T0 = DECLARE_LAMBDA_FUNCTION string("")
L2 (6):     SEND_VAL T0 1
L3 (6):     DO_ICALL
L4 (7):     ECHO string("main flag
")
L5 (8):     INIT_FCALL 1 96 string("go")
L6 (8):     T2 = DECLARE_LAMBDA_FUNCTION string("")
L7 (12):    SEND_VAL T2 1
L8 (12):    DO_ICALL
L9 (13):    ECHO string("main end
")
L10 (14):   RETURN int(1)

{closure}: ; (lines=6, args=0, vars=0, tmps=1)
    ; (before optimizer)
    ; /path-to/test.php:2-6
L0 (9):     ECHO string("coro 2 start
")
L1 (10):    INIT_STATIC_METHOD_CALL 1 string("co") string("sleep")
L2 (10):    SEND_VAL_EX int(1) 1
L3 (10):    DO_FCALL // yield from current op_array [coro 1]; resume
L4 (11):    ECHO string("coro 2 exit
")
L5 (12):    RETURN null

{closure}: ; (lines=6, args=0, vars=0, tmps=1)
    ; (before optimizer)
    ; /path-to/test.php:2-6
L0 (3):     ECHO string("coro 1 start
")
L1 (4):     INIT_STATIC_METHOD_CALL 1 string("co") string("sleep")
L2 (4):     SEND_VAL_EX int(1) 1
L3 (4):     DO_FCALL // yield from current op_array [coro 2]; resume
L4 (5):     ECHO string("coro 1 exit
")
L5 (6):     RETURN null
coro 1 start
main flag
coro 2 start
main end
coro 1 exit
coro 2 exit

When co::sleep() executes, Swoole gives up control and jumps to the next op_array. Combined with the annotations above, this means that at DO_FCALL the coroutine's execution stack is yielded and later restored, which produces the control flow jumps of a native coroutine.

Let us analyze how the INIT_FCALL and DO_FCALL instructions are executed in the kernel, to better understand how the function call stack is switched.

The VM specializes each instruction into a C function according to its operands, return value, and so on. In this example, the correspondence is:

INIT_FCALL => ZEND_INIT_FCALL_SPEC_CONST_HANDLER

DO_FCALL => ZEND_DO_FCALL_SPEC_RETVAL_UNUSED_HANDLER

ZEND_INIT_FCALL_SPEC_CONST_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE

    zval *fname = EX_CONSTANT(opline->op2);
    zval *func;
    zend_function *fbc;
    zend_execute_data *call;

    fbc = CACHED_PTR(Z_CACHE_SLOT_P(fname));
    if (UNEXPECTED(fbc == NULL)) {
        func = zend_hash_find(EG(function_table), Z_STR_P(fname));
        if (UNEXPECTED(func == NULL)) {
            SAVE_OPLINE();
            zend_throw_error(NULL, "Call to undefined function %s()", Z_STRVAL_P(fname));
            HANDLE_EXCEPTION();
        }
        fbc = Z_FUNC_P(func);
        CACHE_PTR(Z_CACHE_SLOT_P(fname), fbc);
        if (EXPECTED(fbc->type == ZEND_USER_FUNCTION) && UNEXPECTED(!fbc->op_array.run_time_cache)) {
            init_func_run_time_cache(&fbc->op_array);
        }
    }

    call = zend_vm_stack_push_call_frame_ex(
        opline->op1.num, ZEND_CALL_NESTED_FUNCTION,
        fbc, opline->extended_value, NULL, NULL); // request the execution stack of the current function from the global stack
    call->prev_execute_data = EX(call); // assign the stack being executed to prev_execute_data of the stack to be executed; execution recovers here after the function finishes
    EX(call) = call; // assign the function stack to the global execution stack: the function stack to be executed
    ZEND_VM_NEXT_OPCODE();
}
ZEND_DO_FCALL_SPEC_RETVAL_UNUSED_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    USE_OPLINE
    zend_execute_data *call = EX(call); // get the execution stack
    zend_function *fbc = call->func;    // current function
    zend_object *object;
    zval *ret;

    SAVE_OPLINE(); // when there is a global register: ((execute_data)->opline) = opline
    EX(call) = call->prev_execute_data; // execute_data->call = EX(call)->prev_execute_data, so that execution recovers to the calling function after this function finishes
    /*...*/
    LOAD_OPLINE();

    if (EXPECTED(fbc->type == ZEND_USER_FUNCTION)) {
        ret = NULL;
        if (0) {
            ret = EX_VAR(opline->result.var);
            ZVAL_NULL(ret);
        }

        call->prev_execute_data = execute_data;
        i_init_func_execute_data(call, &fbc->op_array, ret);

        if (EXPECTED(zend_execute_ex == execute_ex)) {
            ZEND_VM_ENTER();
        } else {
            ZEND_ADD_CALL_FLAG(call, ZEND_CALL_TOP);
            zend_execute_ex(call);
        }
    } else if (EXPECTED(fbc->type < ZEND_USER_FUNCTION)) {
        zval retval;

        call->prev_execute_data = execute_data;
        EG(current_execute_data) = call;
        /*...*/
        ret = 0 ? EX_VAR(opline->result.var) : &retval;
        ZVAL_NULL(ret);

        if (!zend_execute_internal) {
            /* saves one function call if zend_execute_internal is not used */
            fbc->internal_function.handler(call, ret);
        } else {
            zend_execute_internal(call, ret);
        }

        EG(current_execute_data) = execute_data;
        zend_vm_stack_free_args(call); // release local variables

        if (!0) {
            zval_ptr_dtor(ret);
        }

    } else { /* ZEND_OVERLOADED_FUNCTION */
        /*...*/
    }

fcall_end:
    /*...*/
    zend_vm_stack_free_call_frame(call); // release the stack
    if (UNEXPECTED(EG(exception) != NULL)) {
        zend_rethrow_exception(execute_data);
        HANDLE_EXCEPTION();
    }
    ZEND_VM_SET_OPCODE(opline + 1);
    ZEND_VM_CONTINUE();
}

Swoole coroutines can be switched at the PHP layer in the way described above. Waiting on IO during execution requires additional techniques to drive it. In later articles, we will introduce the driving technique of each version, together with Swoole's original event model, and describe how Swoole coroutines evolved into what they are today.

Swoole4 coroutine dual stack

Because our system has two parts, the C stack and the PHP stack, we agree on the following names:

  • C coroutine: the C stack management part,
  • PHP coroutine: the PHP stack management part.

Adding the C stack is the most important and critical part of 4.x coroutines. Earlier versions could not support PHP syntax perfectly because they did not save C stack information. Next, we analyze how C stack switching is supported. At first we used Tencent's libco, but stress testing revealed memory read and write errors, and its open source community was so inactive that problems could not be handled in time. We therefore split out the assembly part of the C++ Boost library, and the current C stack switching is built on it.

System architecture

[Figure: Swoole system architecture]

Swoole's role is to glue together the system API and the PHP ZendVM, giving PHP developers a deep interface for writing high-performance code; beyond that, it also supports C/C++ developers. For details, see the document on how C++ developers use Swoole. The C part of the code is divided into several parts:

  • ASM: the assembly driver
  • Context: context encapsulation
  • Socket: coroutine socket encapsulation
  • PHP stream: system encapsulation that lets PHP's related functions become coroutine-based seamlessly
  • ZendVM binding layer

Swoole's lower layers are now more clearly structured, and Socket is the cornerstone that drives all networking. In the original version, every client had to maintain its context through asynchronous callbacks. Compared with earlier versions, 4.x is therefore a qualitative leap in both project complexity and system stability. The code directory layout:

$ tree swoole-src/src/coroutine/
swoole-src/src/coroutine/
├── base.cc     // C coroutine API, which can call back into the PHP coroutine API
├── channel.cc  // channel
├── context.cc  // coroutine implementation based on ASM make_fcontext / jump_fcontext
├── hook.cc     // hook
└── socket.cc   // coroutine encapsulation of network operations
swoole-src/swoole_coroutine.cc // ZendVM-related encapsulation, PHP coroutine API

From the user level down to the system level, we have the PHP coroutine API, the C coroutine API, and the ASM coroutine API. The Socket layer is a network encapsulation compatible with the system API. We work from the bottom up. The x86-64 architecture has 16 64-bit general-purpose registers; the relevant registers and their uses are as follows:

  • %rax usually stores the return value of a function call, and is also used by the multiplication and division instructions. In the imul instruction, multiplying two 64-bit values can produce a result of up to 128 bits, which %rax and %rdx store together. In the div instruction, the dividend is 128 bits and is likewise stored in %rax and %rdx together.
  • %rsp is the stack pointer register and usually points to the top of the stack. Stack pop and push operations work by changing the value of %rsp, that is, by moving the stack pointer.
  • %rbp is the stack frame pointer, identifying the start of the current stack frame.
  • The six registers %rdi, %rsi, %rdx, %rcx, %r8, %r9 hold the first six arguments of a function call.
  • %rbx, %r12, %r13, %r14, %r15 are used for data storage and follow the callee-saved rule.
  • %r10, %r11 are used for data storage and follow the caller-saved rule.

That is, on entry to an assembly function, the first argument is already in the %rdi register, the second argument is in the %rsi register, and the stack pointer %rsp points to the top of the stack, where the return address into the parent function is stored. On x86-64, Swoole uses swoole-src/thirdparty/boost/asm/make_x86_64_sysv_elf_gas.S:

// Create a context on top of the given stack to execute the third argument, the function fn, and return the initialized execution context
fcontext_t make_fcontext(void *sp, size_t size, void (*fn)(intptr_t));
make_fcontext:
    /* first arg of make_fcontext() == top of context-stack */
    movq  %rdi, %rax

    /* shift address in RAX to lower 16 byte boundary */
    andq  $-16, %rax

    /* reserve space for context-data on context-stack */
    /* size for fc_mxcsr .. RIP + return-address for context-function */
    /* on context-function entry: (RSP -0x8) % 16 == 0 */
    leaq  -0x48(%rax), %rax

    /* third arg of make_fcontext() == address of context-function */
    movq  %rdx, 0x38(%rax)

    /* save MMX control- and status-word */
    stmxcsr  (%rax)
    /* save x87 control-word */
    fnstcw   0x4(%rax)

    /* compute abs address of label finish */
    leaq  finish(%rip), %rcx
    /* save address of finish as return-address for context-function */
    /* will be entered after context-function returns */
    movq  %rcx, 0x40(%rax)

    ret /* return pointer to context-data: the pointer held in %rax is returned as the context */

// Save the current context (including stack pointer, PC program counter, and registers) to *ofc, then restore the context from nfc and start executing it.
intptr_t jump_fcontext(fcontext_t *ofc, fcontext_t nfc, intptr_t vp, bool preserve_fpu = false);

jump_fcontext:
//Save current register, stack
    pushq  %rbp  /* save RBP */
    pushq  %rbx  /* save RBX */
    pushq  %r15  /* save R15 */
    pushq  %r14  /* save R14 */
    pushq  %r13  /* save R13 */
    pushq  %r12  /* save R12 */

    /* prepare stack for FPU */
    leaq  -0x8(%rsp), %rsp

    /* test for flag preserve_fpu */
    cmp  $0, %rcx
    je  1f

    /* save MMX control- and status-word */
    stmxcsr  (%rsp)
    /* save x87 control-word */
    fnstcw   0x4(%rsp)

1:
    /* store RSP (pointing to context-data) in RDI: save the current stack top pointer to the address held in the first argument %rdi (ofc) */
    movq  %rsp, (%rdi)

    /* restore RSP (pointing to context-data) from RSI: switch the stack top to the address of the new coroutine; %rsi is the second argument (nfc) */
    movq  %rsi, %rsp

    /* test for flag preserve_fpu */
    cmp  $0, %rcx
    je  2f

    /* restore MMX control- and status-word */
    ldmxcsr  (%rsp)
    /* restore x87 control-word */
    fldcw  0x4(%rsp)

2:
    /* prepare stack for FPU */
    leaq  0x8(%rsp), %rsp
// Restore registers
    popq  %r12  /* restore R12 */
    popq  %r13  /* restore R13 */
    popq  %r14  /* restore R14 */
    popq  %r15  /* restore R15 */
    popq  %rbx  /* restore RBX */
    popq  %rbp  /* restore RBP */

    /* restore return-address: pop the return address into the R8 register */
    popq  %r8

    /* use third arg as return-value after jump*/
    movq  %rdx, %rax
    /* use third arg as first arg in context function */
    movq  %rdx, %rdi

    /* indirect jump to context */
    jmp  *%r8

Context management lives in context.cc, which encapsulates the ASM routines and provides two APIs:

bool Context::SwapIn()
bool Context::SwapOut()

The final coroutine API is located in base.cc, and the main APIs are:

// Create a C stack coroutine: provide an entry function and enter the execution context of that function
// For example, the PHP stack's entry function: Coroutine::create(PHPCoroutine::create_func, (void*) &php_coro_args);
long Coroutine::create(coroutine_func_t fn, void* args = nullptr);
// Switch out of the current context and call hook functions, such as the PHP stack switch function void PHPCoroutine::on_yield(void *arg)
void Coroutine::yield()
// Switch into the context and call hook functions, such as the PHP stack switch function void PHPCoroutine::on_resume(void *arg)
void Coroutine::resume()
// The C coroutine has finished executing; call hook functions, such as the PHP stack cleanup function void PHPCoroutine::on_close(void *arg)
void Coroutine::close()

Next, the ZendVM glue layer is located in swoole-src/swoole_coroutine.cc:

// PHPCoroutine: used by the C coroutine or by lower-level interfaces
// Coroutine creation entry point, taking a PHP function as the parameter
static long create(zend_fcall_info_cache *fci_cache, uint32_t argc, zval *argv);
// C coroutine creation API
static void create_func(void *arg);
// Hook functions for the C coroutine: the C coroutine in base.cc is tied to the following three hook functions
static void on_yield(void *arg);
static void on_resume(void *arg);
static void on_close(void *arg);
//PHP stack management
static inline void vm_stack_init(void);
static inline void vm_stack_destroy(void);
static inline void save_vm_stack(php_coro_task *task);
static inline void restore_vm_stack(php_coro_task *task);
//Output cache management related
static inline void save_og(php_coro_task *task);
static inline void restore_og(php_coro_task *task);

With the infrastructure above, and with PHP stack management implemented against the PHP kernel, the C coroutine can drive the PHP coroutine, giving a dual-stack native coroutine: C stack plus PHP stack.

Author information

Han Tianfeng, founder of the Swoole open source project and chief architect of XRS Online School.