Detailed explanation of JS execution process

Time:2021-4-30

Original blog address:https://finget.github.io/view…

JS code execution is mainly divided into two stages: the compilation stage and the execution stage.
This article is based on the V8 engine.

Preface

V8 engine

[Figure: working principle of the V8 engine]

V8 is composed of many submodules, of which the following four are the most important:

  • Parser: responsible for converting JavaScript source code into an abstract syntax tree (AST);

    • If a function is not called, it is not converted to an AST
  • Ignition: the interpreter, responsible for converting the AST to bytecode and then interpreting and executing that bytecode. At the same time, it collects the information TurboFan needs for optimizing compilation, such as the types of function parameters observed at run time;

    • If a function is called only once, Ignition simply interprets its bytecode
    • Besides generating bytecode, the interpreter also interprets and executes it

There are usually two types of interpreters: stack-based and register-based. A stack-based interpreter uses a stack to hold function parameters, intermediate results, variables, and so on. A register-based virtual machine provides instructions that operate on registers and uses registers to hold parameters and intermediate results. Stack-based virtual machines generally also define a small number of registers, and register-based virtual machines also have stacks; the difference lies in the instruction set they provide. Most interpreters are stack-based, such as the Java virtual machine, the .NET virtual machine, and the early V8 virtual machine. Stack-based virtual machines are simple and clear when handling function calls, recursion, and context switching. Today's V8 virtual machine uses a register-based design, which keeps some intermediate data in registers.
[Figure: register-based interpreter architecture]

  • TurboFan: the compiler, which converts bytecode into optimized machine code using the type information collected by Ignition;

    • If a function is called many times, it is marked as a hot function and is then converted into optimized machine code by TurboFan to improve execution performance;
    • However, optimized machine code can also be reverted to bytecode. If the types change during later executions (for example, a sum function that was always called with numbers is later called with strings), the previously optimized machine code can no longer handle the operation correctly, and execution falls back to bytecode;
  • Orinoco: the garbage collector, responsible for reclaiming memory that the program no longer needs;
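
To make the stack-based versus register-based distinction above more concrete, here is a minimal sketch of two toy interpreters that both compute 1 + 2. The opcode names (PUSH, ADD, LOAD, ADD_REG) are invented for this illustration and are not real V8 bytecodes; the only point is where the operands live.

```javascript
// Toy stack-based interpreter: operands live on an operand stack.
function runStackMachine(program) {
  const stack = [];
  for (const [op, arg] of program) {
    if (op === 'PUSH') {
      stack.push(arg);
    } else if (op === 'ADD') {
      const right = stack.pop();
      const left = stack.pop();
      stack.push(left + right);
    } else {
      throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return stack.pop();
}

// Toy register-based interpreter: each instruction names its registers.
function runRegisterMachine(program, registerCount) {
  const registers = new Array(registerCount).fill(0);
  for (const [op, dest, a, b] of program) {
    if (op === 'LOAD') {
      registers[dest] = a; // here `a` is an immediate value
    } else if (op === 'ADD_REG') {
      registers[dest] = registers[a] + registers[b];
    } else {
      throw new Error(`Unknown opcode: ${op}`);
    }
  }
  return registers[0]; // by convention in this sketch, r0 holds the result
}

const stackResult = runStackMachine([['PUSH', 1], ['PUSH', 2], ['ADD']]);
const registerResult = runRegisterMachine(
  [['LOAD', 1, 1], ['LOAD', 2, 2], ['ADD_REG', 0, 1, 2]],
  3
);
console.log(stackResult, registerResult); // 3 3
```

The stack machine's ADD takes no operands because they are implicit (the top two stack slots), while the register machine's ADD_REG spells out where its inputs and output live. That difference in the instruction set is the one the paragraph above describes.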

Terminology

Stack

The stack is characterized by LIFO (last in, first out). Data can only be pushed one item at a time onto the top, and it must also be popped one item at a time from the top.


Heap

The heap is characterized by unordered key-value storage. Access to the heap does not depend on order and is not restricted to a single entrance or exit.


Queue

The queue is characterized by FIFO (first in, first out).
Data is inserted at the tail of the queue and taken out from the head.

The difference between a stack and a queue: a stack stores and retrieves data through a single opening at the top, while a queue has two openings, one entrance and one exit.
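
The difference can be seen with a plain JavaScript array, used here as a stand-in for both structures (a sketch only; the variable names are made up for the example):

```javascript
// Stack: push and pop both happen at the same end (the top).
const stack = [];
stack.push('a');
stack.push('b');
stack.push('c');
const poppedFromStack = stack.pop(); // the last item pushed comes out first

// Queue: items enter at the tail and leave from the head.
const queue = [];
queue.push('a');
queue.push('b');
queue.push('c');
const takenFromQueue = queue.shift(); // the first item pushed comes out first

console.log(poppedFromStack, takenFromQueue); // c a
```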


Compilation phase

Lexical analysis (scanner)

A string of characters is broken into meaningful chunks called tokens. For example, var name = 'finget'; is split into the following tokens:


[
    {
        "type": "Keyword",
        "value": "var"
    },
    {
        "type": "Identifier",
        "value": "name"
    },
    {
        "type": "Punctuator",
        "value": "="
    },
    {
        "type": "String",
        "value": "'finget'"
    },
    {
        "type": "Punctuator",
        "value": ";"
    }
]
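
As a rough illustration of how such a token list can be produced, here is a minimal tokenizer sketch. It only understands the tiny subset needed for the statement above (identifiers, a few keywords, quoted strings, and the = and ; punctuators). The KEYWORDS set and the tokenize function are hypothetical helpers written for this example; a real scanner such as V8's or Esprima's handles far more.

```javascript
const KEYWORDS = new Set(['var', 'let', 'const', 'function', 'return']);

function tokenize(source) {
  const tokens = [];
  // One alternative per token kind; whitespace is matched and skipped.
  // The sticky flag (y) makes every match start exactly at lastIndex.
  const pattern = /\s+|([A-Za-z_$][\w$]*)|('[^']*'|"[^"]*")|([=;])/y;
  while (pattern.lastIndex < source.length) {
    const start = pattern.lastIndex;
    const match = pattern.exec(source);
    if (match === null) {
      throw new SyntaxError(`Unexpected character at index ${start}`);
    }
    const [, word, string, punctuator] = match;
    if (word !== undefined) {
      const type = KEYWORDS.has(word) ? 'Keyword' : 'Identifier';
      tokens.push({ type, value: word });
    } else if (string !== undefined) {
      tokens.push({ type: 'String', value: string });
    } else if (punctuator !== undefined) {
      tokens.push({ type: 'Punctuator', value: punctuator });
    }
  }
  return tokens;
}

const tokens = tokenize("var name = 'finget';");
console.log(tokens.map((t) => `${t.type}:${t.value}`).join(' '));
// Keyword:var Identifier:name Punctuator:= String:'finget' Punctuator:;
```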

Syntax analysis (parser)

This process transforms the token stream (an array) into a tree of nested elements that represents the syntactic structure of the program. This tree is called the abstract syntax tree (AST).

{
  "type": "Program",
  "body": [
    {
      "type": "VariableDeclaration",
      "declarations": [
        {
          "type": "VariableDeclarator",
          "id": {
            "type": "Identifier",
            "name": "name"
          },
          "init": {
            "type": "Literal",
            "value": "finget",
            "raw": "'finget'"
          }
        }
      ],
      "kind": "var"
    }
  ],
  "sourceType": "script"
}

During this process, if the source code does not conform to the syntax rules, parsing terminates and a SyntaxError is thrown.
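
A small sketch of this behavior: the Function constructor parses its body without running it, so it can be used to check whether a piece of source parses. The checkSyntax helper is a name made up for this example.

```javascript
// Returns 'ok' if the source parses, otherwise the name of the error thrown.
function checkSyntax(source) {
  try {
    // The Function constructor parses the body but does not execute it.
    new Function(source);
    return 'ok';
  } catch (err) {
    return err.name;
  }
}

const valid = checkSyntax("var name = 'finget';");
const invalid = checkSyntax("var = 'finget';"); // the identifier is missing

console.log(valid, invalid); // ok SyntaxError
```

Because the error is raised while parsing, none of the invalid source runs, not even the statements before the mistake.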


Esprima is a tool that generates the syntax tree in real time; you can try it out.

Bytecode generation

You can use node --print-bytecode to view the bytecode:

// test.js
function getMyname() {
    var myname = 'finget';
    console.log(myname);
}
getMyname();

node --print-bytecode test.js

...
[generated bytecode for function: getMyname (0x10ca700104e9 <SharedFunctionInfo getMyname>)]
Parameter count 1
Register count 3
Frame size 24
   18 E> 0x10ca70010e86 @    0 : a7                StackCheck 
   37 S> 0x10ca70010e87 @    1 : 12 00             LdaConstant [0]
         0x10ca70010e89 @    3 : 26 fb             Star r0
   48 S> 0x10ca70010e8b @    5 : 13 01 00          LdaGlobal [1], [0]
         0x10ca70010e8e @    8 : 26 f9             Star r2
   56 E> 0x10ca70010e90 @   10 : 28 f9 02 02       LdaNamedProperty r2, [2], [2]
         0x10ca70010e94 @   14 : 26 fa             Star r1
   56 E> 0x10ca70010e96 @   16 : 59 fa f9 fb 04    CallProperty1 r1, r2, r0, [4]
         0x10ca70010e9b @   21 : 0d                LdaUndefined 
   69 S> 0x10ca70010e9c @   22 : ab                Return 
Constant pool (size = 3)
Handler Table (size = 0)
...

This involves a very important concept: JIT (just-in-time) compilation, in which code is compiled while it is being executed.

How does it work

  1. Monitor: a monitor (also called a profiler) is added to the JavaScript engine. It watches the code as it runs and records information such as how many times a piece of code has run and with which types. If the same code has run a few times, it is marked "warm"; if it has run many times, it is marked "hot";

  2. Baseline compiler: if a piece of code becomes "warm", the JIT sends it to the baseline compiler and stores the compiled result. For example, if the monitor sees that a line runs the same code with the same variable types, the compiled version replaces the interpretation of that line;

  3. Optimizing compiler: if a piece of code becomes "hot", the monitor sends it to the optimizing compiler, which generates a faster, more efficient version and stores it. For example, when object properties are added up in a loop, the compiler assumes they are of int type and checks for int first;

  4. Deoptimization: JavaScript can never guarantee this, however. The first 99 objects may keep the property as an int, but the 100th may not have the property at all. In that case the JIT concludes that it made a wrong assumption and throws away the optimized code. Execution returns to the interpreter or the baseline compiler. This process is called deoptimization.
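
The kind of code these steps are about looks like the sketch below. Whether and when an engine actually optimizes or deoptimizes sum depends on the engine version and its internal heuristics, so treat this as an illustration of the pattern rather than a guaranteed trace; the program's output is the same either way. In Node.js, running a script with the V8 flags --trace-opt and --trace-deopt is one way to watch what the engine decides.

```javascript
function sum(a, b) {
  return a + b;
}

// Many calls with numbers only: a candidate for becoming "hot" and being
// optimized under the assumption that both arguments are numbers.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = sum(total, 1);
}

// A later call with strings breaks the "numbers only" assumption, which is
// the situation in which an engine may have to deoptimize.
const joined = sum('fin', 'get');

console.log(total, joined); // 100000 finget
```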

Scope

Scope is a set of rules that govern how the engine looks up variables. Before ES6, JS had only global scope and function scope. ES6 introduced block-level scope. Note, however, that block-level scope does not come from the braces {} alone: it applies to bindings declared with the let and const keywords.

var name = 'FinGet';

function fn() {
  var age = 18;
  console.log(name);
}

The scope is determined at parse time.

In short, a scope is a box that defines where variables and functions are accessible and how long they live.
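
A short sketch of the point about block-level scope: inside the same pair of braces, a var declaration belongs to the enclosing function while a let declaration belongs to the block. The function and variable names are made up for the example.

```javascript
function scopes() {
  const seen = [];
  {
    var functionScoped = 'var';
    let blockScoped = 'let';
    seen.push(typeof blockScoped); // visible inside the block
  }
  seen.push(typeof functionScoped); // var ignores the block boundary
  seen.push(typeof blockScoped); // let does not exist outside the block
  return seen;
}

console.log(scopes().join(', ')); // string, string, undefined
```

typeof is used for the last check because reading an undeclared identifier directly would throw a ReferenceError.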

Lexical scope

Lexical scope means that the scope is determined by the position of the function declaration in the code, so lexical scope is a static scope, through which we can predict how to find the identifier in the process of code execution.

function fn() {
    console.log(myName) // looked up where fn is defined, not where it is called
}
function fn1() {
    var myName = "FinGet"
    fn()
}
var myName = "global_finget"
fn1() // prints global_finget

The code above prints global_finget. This is because the scope is already determined in the compilation phase: fn is defined in the global scope, so when myName cannot be found inside fn itself, the lookup continues in the global scope, not in fn1.

Execution phase

Execution context

When a function is executed, an execution context is created. Execution context is an abstract concept of the environment in which the current JavaScript code is parsed and executed.

There are three types of execution context in JavaScript:

  • Global execution context (only one)
  • Function execution context
  • Eval execution context

The life cycle of an execution context is divided into two phases: 1. the creation phase; 2. the execution phase.

Creation phase

Before any JavaScript code is executed, its execution context goes through the creation phase. Three things happen in the creation phase:

  • The value of this is determined, which is also called this binding.
  • The LexicalEnvironment component is created.
  • The VariableEnvironment component is created.

ExecutionContext = {
  ThisBinding: <this value>,      // determine this
  LexicalEnvironment: { ... },    // lexical environment
  VariableEnvironment: { ... },   // variable environment
}

This Binding

In the global execution context, this refers to the global object. In the browser, the value of this is the window object.
In a function execution context, the value of this depends on how the function is called. If it is called through an object reference, this is set to that object; otherwise this is set to the global object or to undefined (in strict mode).
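
A minimal sketch of both cases. The whoIsThis function and the user object are made up for the example, and strict mode is enabled inside the function so that the plain call yields undefined rather than the global object.

```javascript
function whoIsThis() {
  'use strict';
  return this;
}

const user = { name: 'FinGet', whoIsThis };

// Called through an object reference: this is that object.
const calledOnObject = user.whoIsThis();

// Called as a plain function in strict mode: this is undefined.
const calledBare = whoIsThis();

console.log(calledOnObject === user, calledBare); // true undefined
```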

Lexical Environment

A lexical environment is a structure that holds identifier-to-variable mappings. (Here an identifier is the name of a variable or function, and the variable is a reference to the actual object (including function objects) or to a primitive value.)
A lexical environment has two parts:
(1) The environment record
(2) A reference to the outer environment

  • The environment record is the actual place where variables and function declarations are stored.
  • The reference to the outer environment means that the environment can access its outer lexical environment. (This is an important part of how the scope chain is implemented.)

There are two types of lexical environment:

  • The global environment (in the global execution context) is a lexical environment with no outer environment; its outer environment reference is null. It holds the global object (the window object) with its associated methods and properties (such as Array methods), as well as any user-defined global variables. The value of this points to this global object.
  • The function environment stores the variables the user defines inside the function in its environment record. Its outer environment reference can be the global environment or the environment of an outer function that contains the inner function.

Note: for function environments, the environment record also contains an arguments object, which holds the mapping between indexes and the arguments passed to the function, as well as the length (number) of those arguments.
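
The arguments object mentioned in the note can be inspected directly from inside an ordinary (non-arrow) function. The describeArguments function below is a hypothetical helper for this illustration.

```javascript
function describeArguments(e, f) {
  // arguments maps indexes to the values passed in and records their count.
  return {
    first: arguments[0],
    second: arguments[1],
    length: arguments.length,
  };
}

const described = describeArguments(20, 30);
console.log(described.first, described.second, described.length); // 20 30 2
```

Arrow functions do not get their own arguments object, which is why the sketch uses a function declaration.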

Variable environment

It is also a lexical environment, and its EnvironmentRecord holds the bindings created by VariableStatements within this execution context.

As mentioned above, the variable environment is also a lexical environment, so it has all the properties of the lexical environment defined above.

Example code:

let a = 20;  
const b = 30;  
var c;

function multiply(e, f) {  
 var g = 20;  
 return e * f * g;  
}

c = multiply(20, 30);

Execution context:

GlobalExecutionContext = {

  ThisBinding: <Global Object>,

  LexicalEnvironment: {  
    EnvironmentRecord: {  
      Type: "Object",  
      //The identifier is bound here  
      a: < uninitialized >,  
      b: < uninitialized >,  
      multiply: < func >  
    }  
    outer: <null>  
  },

  VariableEnvironment: {  
    EnvironmentRecord: {  
      Type: "Object",  
      //The identifier is bound here  
      c: undefined,  
    }  
    outer: <null>  
  }  
}

FunctionExecutionContext = {
   
  ThisBinding: <Global Object>,

  LexicalEnvironment: {  
    EnvironmentRecord: {  
      Type: "Declarative",  
      //The identifier is bound here  
      Arguments: {0: 20, 1: 30, length: 2},  
    },  
    outer: <GlobalLexicalEnvironment> // points to the global environment
  },

  VariableEnvironment: {  
    EnvironmentRecord: {  
      Type: "Declarative",  
      //The identifier is bound here  
      g: undefined  
    },  
    outer: <GlobalLexicalEnvironment>  
  }  
}

Take a closer look at the above: a: <uninitialized>, c: undefined. This is why calling console.log(a) before the let a declaration throws Uncaught ReferenceError: Cannot access 'a' before initialization.
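
The difference between <uninitialized> and undefined can be observed with the sketch below. Both functions read a variable before the line that declares it; the function names are made up for the example.

```javascript
function readLetTooEarly() {
  try {
    return a; // a is registered but still <uninitialized>
  } catch (err) {
    return err.name;
  }
  let a = 20; // never reached, but the declaration still creates the binding
}

function readVarTooEarly() {
  return c; // c is registered and already initialized to undefined
  var c = 30; // never reached, but the declaration is still hoisted
}

console.log(readLetTooEarly(), readVarTooEarly()); // ReferenceError undefined
```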

Why two lexical environments

The VariableEnvironment component registers var and function declarations; the LexicalEnvironment component registers let, const, class, and similar declarations.

Before ES6 there was no block-level scope; since ES6, let and const can be used to declare block-scoped variables. The two lexical environments exist so that block-level scope can be implemented without affecting var declarations and function declarations. It works as follows:

  1. First, in a running execution context, the lexical environment consists of LexicalEnvironment and VariableEnvironment, which together register all variable declarations.
  2. When execution reaches block-level code, the current LexicalEnvironment is recorded as oldEnv.
  3. A new LexicalEnvironment is created (its outer points to oldEnv) and recorded as newEnv, and newEnv is set as the LexicalEnvironment of the running execution context.
  4. let and const declarations in the block are registered in newEnv, but var declarations and function declarations are still registered in the original VariableEnvironment.
  5. When the block finishes executing, oldEnv is restored as the LexicalEnvironment of the running execution context.

function foo(){
    var a = 1
    let b = 2
    {
      let b = 3       // registered in newEnv; shadows the outer b
      var c = 4       // registered in the VariableEnvironment of foo
      let d = 5       // registered in newEnv
      console.log(a)  // 1
      console.log(b)  // 3
    }
    console.log(b)    // 2
    console.log(c)    // 4
    console.log(d)    // ReferenceError: d is not defined
}
foo()


When execution enters a block inside a function, the variables declared with let in that block are stored in a separate area of the lexical environment, and the variables in that area do not affect variables outside the block. For example, a variable b is declared outside the block and another variable b is declared inside it; while the block is executing, both exist independently.

In fact, the lexical environment maintains a small stack internally. The bottom of the stack holds the outermost variables of the function. On entering a block, that block's variables are pushed onto the top of the stack; when the block finishes executing, its entry is popped off the top of the stack. This is the structure of the lexical environment. Note that the variables discussed here are those declared with let or const.

Next, when console.log(a) inside the block is executed, the value of a has to be looked up in the lexical environment and the variable environment. The lookup proceeds downward from the top of the lexical environment's stack. If the variable is found in some block of the lexical environment, it is returned to the JavaScript engine directly; if not, the search continues in the variable environment.
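
The lookup order just described can be modeled with a small sketch. ToyContext and its methods are hypothetical names for this illustration; this is a simplified model of the description above, not how V8 represents environments internally. It replays the declarations from foo().

```javascript
// Toy model of one function's execution context.
class ToyContext {
  constructor() {
    this.variableEnvironment = new Map(); // var declarations
    this.lexicalStack = [new Map()]; // let/const, one Map per block
  }

  enterBlock() {
    this.lexicalStack.push(new Map());
  }

  exitBlock() {
    this.lexicalStack.pop();
  }

  declareVar(name, value) {
    this.variableEnvironment.set(name, value);
  }

  declareLet(name, value) {
    this.lexicalStack[this.lexicalStack.length - 1].set(name, value);
  }

  lookup(name) {
    // Search from the top of the lexical stack downward...
    for (let i = this.lexicalStack.length - 1; i >= 0; i--) {
      if (this.lexicalStack[i].has(name)) {
        return this.lexicalStack[i].get(name);
      }
    }
    // ...then fall back to the variable environment.
    if (this.variableEnvironment.has(name)) {
      return this.variableEnvironment.get(name);
    }
    throw new ReferenceError(`${name} is not defined`);
  }
}

const ctx = new ToyContext();
ctx.declareVar('a', 1);
ctx.declareLet('b', 2);

ctx.enterBlock();
ctx.declareLet('b', 3);
ctx.declareVar('c', 4);
ctx.declareLet('d', 5);
const insideBlock = [ctx.lookup('a'), ctx.lookup('b')]; // a from var env, inner b
ctx.exitBlock();

const afterBlock = [ctx.lookup('b'), ctx.lookup('c')]; // outer b, c survives
let lookupOfD = null;
try {
  ctx.lookup('d');
} catch (err) {
  lookupOfD = err.name; // d was popped together with its block
}

console.log(insideBlock, afterBlock, lookupOfD);
```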

Execution context stack

Every function call has its own execution context, and the execution contexts are managed with a stack (the call stack).

function a () {
  console.log('In fn a')
  function b () {
    console.log('In fn b')
    function c () {
      console.log('In fn c')
    }
    c()
  }
  b()
}
a()
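
To see the pushes and pops without a visual tool, the sketch below keeps its own array that mirrors the call stack for the same a, b, c nesting. The tracked wrapper and the snapshots array are hypothetical helpers for this illustration; the engine's real call stack is not accessible as an array.

```javascript
const callStack = [];
const snapshots = [];

// Wraps a function so that entering and leaving it is recorded.
function tracked(name, body) {
  return function () {
    callStack.push(name); // a new execution context is pushed
    snapshots.push(callStack.join(' > '));
    try {
      return body();
    } finally {
      callStack.pop(); // the context is popped when the function returns
    }
  };
}

const c = tracked('c', () => {});
const b = tracked('b', () => c());
const a = tracked('a', () => b());

a();
console.log(snapshots.join(' | ')); // a | a > b | a > b > c
console.log(callStack.length); // 0
```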


You can try the JavaScript visualizer tool to observe pushes onto and pops off the stack more intuitively.

The tool's diagram also shows the scope chain very intuitively. The scope chain is determined in the creation phase of the execution context. Only once the execution environment exists can it be determined which environments form its scope chain.

V8 garbage collection

Memory allocation

Stack

The stack is temporary storage that mainly holds local variables and function calls. It is small, contiguous, and easy to operate on, and it is generally allocated and reclaimed automatically by the system, so the garbage collection discussed in this article concerns heap memory.

Primitive values (number, boolean, string, null, undefined, symbol, bigint) are stored in stack memory. Reference-type data is stored in heap memory; a variable of a reference type holds a reference to the actual object in the heap, and that reference is stored on the stack.

Why are primitive types stored on the stack and reference types in the heap?

The JavaScript engine uses the stack to maintain the state of execution contexts while the program runs. If all data were stored on the stack, the stack would grow large, which would hurt the efficiency of context switching and, in turn, the execution efficiency of the whole program.
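
From inside a program, the visible consequence of this model is the difference between copying a value and copying a reference. The sketch below shows only that observable behavior; where an engine physically places each value is an implementation detail that JavaScript code cannot inspect.

```javascript
// Primitive: assignment copies the value itself.
let count = 1;
let countCopy = count;
countCopy += 1;

// Object: assignment copies the reference, so both names share one object.
const person = { name: 'FinGet' };
const samePerson = person;
samePerson.name = 'finget';

console.log(count, countCopy); // 1 2
console.log(person.name); // finget
console.log(person === samePerson); // true
```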

Heap

The heap stores objects and dynamic data. It is the largest area of memory and is where GC (garbage collection) does its work. However, not all heap memory is managed by the GC; only the new generation and the old generation are. The heap can be further subdivided as follows:

  • New generation space: where newly created data lives; most of this data is short-lived. The space is split into two halves and managed by the Scavenger (minor GC), described later. You can control the size of the new generation space with V8 flags such as --max_semi_space_size or --min_semi_space_size
  • Old generation space: data that has survived at least two rounds of minor GC in the new generation space. This space is managed by the major GC (Mark-Sweep & Mark-Compact), described later. You can control its size with --initial_old_space_size or --max_old_space_size.

    • Old pointer space: surviving objects that contain pointers to other objects
    • Old data space: surviving objects that contain only data

  • Large object space: holds objects that exceed the size limits of the other spaces. Large objects are never moved by the GC.
  • Code space: where code compiled by the JIT is stored. Apart from code allocated in large object space, this is the only space with executable memory.
  • Map space: stores Cells and Maps; each area holds elements of the same size, which keeps its structure simple.
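
In Node.js, the built-in v8 module can list the heap spaces of the running process, which is a convenient way to compare the list above with a real engine. This sketch assumes a Node.js environment; the set of space names it prints varies between V8 versions and may not match the list above exactly.

```javascript
const v8 = require('v8');

// One entry per heap space, with sizes reported in bytes.
const spaces = v8.getHeapSpaceStatistics();

for (const space of spaces) {
  const usedKb = Math.round(space.space_used_size / 1024);
  const sizeKb = Math.round(space.space_size / 1024);
  console.log(`${space.space_name}: ${usedKb} KB used of ${sizeKb} KB`);
}
```

Running the same script with a size flag, for example node --max-old-space-size=512 script.js (script.js being whatever file holds the snippet), is one way to experiment with the limits mentioned above.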

Generational hypothesis

The generational hypothesis makes two observations:

  • The first is that most objects live in memory only briefly. In short, many objects become unreachable soon after they are allocated;
  • The second is that objects that do survive tend to live much longer.

In V8, the heap is divided into two regions, the new generation and the old generation. Short-lived objects are stored in the new generation, and long-lived objects are stored in the old generation.

The new generation usually has a capacity of only 1-8 MB, while the old generation is much larger. V8 uses two different garbage collectors for these two regions so that garbage collection is more efficient.

  • The minor garbage collector is mainly responsible for collecting the new generation.
  • The major garbage collector is mainly responsible for collecting the old generation.

The new generation is collected with the Scavenge algorithm, which divides the new generation space into two halves: the object area and the free area.

New generation collection

Newly created objects are stored in the object area. When the object area is almost full, a garbage collection needs to be performed.

  1. First, the garbage in the object area is marked, then the live objects are copied from the object area to the free area and arranged in order;

  2. After the copy is complete, the roles of the object area and the free area are flipped: the original object area becomes the free area, and the original free area becomes the object area.

Because the Scavenge algorithm copies the surviving objects from the object area to the free area on every collection, and copying takes time, each collection would take too long if the new generation were too large. For the sake of efficiency, the new generation is therefore kept relatively small.

Because the new generation is small, live objects can easily fill it up. To solve this problem, the JavaScript engine uses an object promotion strategy: objects that are still alive after two garbage collections are moved to the old generation.
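
The copy, flip, and promote steps can be sketched as a toy simulation. Everything here is a simplification made for illustration: objects are plain records, the reachable flag stands in for the result of tracing from the roots, and the names (scavenge, objectArea, freeArea, oldGeneration) are hypothetical rather than V8's own.

```javascript
function scavenge(heap) {
  for (const obj of heap.objectArea) {
    if (!obj.reachable) continue; // garbage is simply not copied
    obj.survivedCollections += 1;
    if (obj.survivedCollections >= 2) {
      heap.oldGeneration.push(obj); // promotion after two collections
    } else {
      heap.freeArea.push(obj); // copied into the free area, in order
    }
  }
  // Flip the roles: the filled free area becomes the object area, and the
  // old object area is emptied and becomes the new free area.
  const emptied = heap.objectArea;
  emptied.length = 0;
  heap.objectArea = heap.freeArea;
  heap.freeArea = emptied;
}

const heap = {
  objectArea: [
    { id: 'temp', reachable: false, survivedCollections: 0 },
    { id: 'cache', reachable: true, survivedCollections: 0 },
  ],
  freeArea: [],
  oldGeneration: [],
};

scavenge(heap);
const afterFirst = heap.objectArea.map((o) => o.id).join(','); // only 'cache'

scavenge(heap);
const promoted = heap.oldGeneration.map((o) => o.id).join(','); // 'cache'

console.log(afterFirst, promoted, heap.objectArea.length); // cache cache 0
```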

Old generation collection

Mark-Sweep

Mark-Sweep works in two stages, a marking stage and a sweeping stage. It looks similar to Scavenge; the difference is that Scavenge copies the live objects. Because live objects are the majority in the old generation, Mark-Sweep instead marks live and dead objects and then directly clears the dead ones.

  • Marking stage: scan the old generation for the first time and mark the live objects
  • Sweeping stage: scan the old generation a second time and clear the unmarked objects, that is, clean up the dead objects
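
The two stages can be sketched over a small object graph. In this simplified model each object lists the objects it references in a refs array, and the heap is a plain array; the mark and sweep helpers are hypothetical names for this illustration.

```javascript
// Marking stage: walk the graph from the roots and mark everything reachable.
function mark(roots) {
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue; // already visited, also handles cycles
    marked.add(obj);
    pending.push(...obj.refs);
  }
  return marked;
}

// Sweeping stage: scan the whole heap and keep only the marked objects.
function sweep(heap, marked) {
  return heap.filter((obj) => marked.has(obj));
}

const root = { id: 'root', refs: [] };
const child = { id: 'child', refs: [] };
const orphan = { id: 'orphan', refs: [] };
root.refs.push(child);
child.refs.push(root); // a cycle between live objects is fine
orphan.refs.push(child); // pointing at a live object does not keep orphan alive

const oldGeneration = [root, child, orphan];
const liveObjects = sweep(oldGeneration, mark([root]));

console.log(liveObjects.map((o) => o.id).join(', ')); // root, child
```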


Mark-Compact

After Mark-Sweep completes, the old generation's memory is left with many fragments. If these fragments are not cleaned up, then when a large object needs to be allocated, none of the fragmented free space may be large enough for it, and a garbage collection is triggered early even though it would otherwise be unnecessary.

Mark-Compact was proposed to solve this memory fragmentation problem. It builds on Mark-Sweep and adds a compaction stage for live objects: all live objects are moved to one end, and once they have been moved, the memory beyond the boundary is freed directly.
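
The compaction step can be sketched on an array of memory slots, where a slot holds either a live object's id or null (a hole left behind by sweeping). This is a toy model with made-up names; it only shows the sliding of live objects toward one end and the resulting boundary.

```javascript
// Slides live entries toward the start and returns the boundary index.
// Everything at or beyond the boundary is free after compaction.
function compact(memory) {
  let nextFree = 0;
  for (let i = 0; i < memory.length; i++) {
    if (memory[i] !== null) {
      memory[nextFree] = memory[i]; // move the live object toward the start
      if (nextFree !== i) memory[i] = null; // the old slot becomes free
      nextFree += 1;
    }
  }
  return nextFree;
}

// Fragmented: the largest contiguous free run is only two slots.
const memory = ['a', null, 'b', null, null, 'c'];
const boundary = compact(memory);

console.log(JSON.stringify(memory), boundary);
// ["a","b","c",null,null,null] 3
```

After compaction, the three free slots are contiguous, so an allocation that needs three slots can succeed without triggering another collection.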


Stop the world

Garbage collection takes time, and while it runs, JS execution on the main thread is paused until the collection is complete. This behavior is called stop-the-world.


Incremental marking

To reduce the pauses caused by old generation garbage collection, V8 splits the marking process into many small marking steps and alternates between garbage collection marking and JavaScript application logic until the marking stage is complete. This is called the incremental marking algorithm.
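
The alternation between marking steps and application code can be sketched with a generator that marks a few objects and then yields. This is a toy model with hypothetical names (incrementalMark, budget); it also ignores the case where the application changes object references between steps, which a real collector must handle.

```javascript
// Marks up to `budget` objects per step, then yields back to the caller.
function* incrementalMark(roots, budget) {
  const marked = new Set();
  const pending = [...roots];
  let workInStep = 0;
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    pending.push(...obj.refs);
    workInStep += 1;
    if (workInStep === budget) {
      workInStep = 0;
      yield marked.size; // pause here: the application may run now
    }
  }
  return marked; // marking is complete
}

// A chain of five objects: 0 -> 1 -> 2 -> 3 -> 4.
const nodes = Array.from({ length: 5 }, (_, i) => ({ id: i, refs: [] }));
for (let i = 0; i < 4; i++) nodes[i].refs.push(nodes[i + 1]);

const timeline = [];
const marker = incrementalMark([nodes[0]], 2);
let step = marker.next();
while (!step.done) {
  timeline.push(`marked ${step.value}`);
  timeline.push('run JavaScript'); // stand-in for application work
  step = marker.next();
}
const markedCount = step.value.size;

console.log(timeline.join(' | '), '|', markedCount);
// marked 2 | run JavaScript | marked 4 | run JavaScript | 5
```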


Lazy sweeping

Incremental marking only marks live and dead objects; lazy sweeping is what actually cleans up and releases memory. Once incremental marking is complete, if the available memory is enough to keep executing code quickly, there is no need to clean up memory immediately. The sweeping can be delayed so that the JavaScript logic runs first, and there is no need to free all the dead objects at once. The garbage collector sweeps page by page on demand until all pages have been cleaned up.

Concurrent collection

Concurrent GC lets garbage collection run without suspending the main thread, so the two proceed at the same time. Only in a few cases does the main thread need to stop briefly so that the garbage collector can perform some special operations. This approach faces the same problem as incremental collection, however: because JavaScript code is running during garbage collection, the reference relationships between objects in the heap can change at any time, so write barrier operations are needed as well.


Parallel collection

Parallel GC lets the main thread and helper threads perform the same GC work at the same time. The helper threads share the main thread's GC work, so the garbage collection time is roughly the total time divided by the number of participating threads (plus some synchronization overhead).


Standing on the shoulders of giants

My respects to those who came before. I consulted a lot of material while writing this article; if I have left anything out, please forgive me, and if there are any mistakes in the article, please point them out. Thank you!