Webpack principle series 9: tree shaking implementation principle

Time:2022-1-20

1、 What is tree shaking

Tree shaking is a dead code elimination technique based on the ES module (ESM) specification. At build time, it statically analyzes the imports and exports between modules, determines which exported values of an ESM module are not used by any other module, and deletes them to shrink the bundled output.

Tree shaking was first implemented by Rich Harris in Rollup, and webpack has supported it since version 2.0. It has since become a widely used performance optimization.

1.1 enabling tree shaking in webpack

In webpack, three conditions must be met to enable tree shaking:

  • Write module code using the ESM specification
  • Set optimization.usedExports to true to enable the marking function
  • Enable code optimization in one of the following ways:

    • Set mode = production
    • Set optimization.minimize = true
    • Provide a custom optimization.minimizer array (it only takes effect while optimization.minimize is enabled)

For example:

// webpack.config.js
module.exports = {
  entry: "./src/index",
  mode: "production",
  devtool: false,
  optimization: {
    usedExports: true,
  },
};
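As a variation on the config above, the sketch below shows the second way of enabling code optimization: it keeps development mode but turns the minimizer on explicitly. The entry path is carried over from the example above; whether this combination suits a real project is an assumption, not a recommendation.

```javascript
// webpack.config.js
module.exports = {
  entry: "./src/index",
  mode: "development",
  devtool: false,
  optimization: {
    usedExports: true, // mark which exports are unused
    minimize: true, // run the default minimizer so the marked code can be removed
  },
};
```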

1.2 theoretical basis

In older JavaScript module schemes such as CommonJS, AMD and CMD, import and export behavior is highly dynamic and hard to predict, for example:

if(process.env.NODE_ENV === 'development'){
  require('./bar');
  exports.foo = 'foo';
}

ESM rules out this behavior at the specification level. It requires that all import and export statements appear only at the top level of a module, and that the imported or exported module specifier be a string literal, which means the following code is illegal under ESM:

if(process.env.NODE_ENV === 'development'){
  import bar from 'bar';
  export const foo = 'foo';
}

Therefore, the dependency relationships between modules under ESM are fully determined and independent of runtime state. A compiler only needs to statically analyze the ESM modules to infer, from the source text alone, which exported values are not used by other modules. This is a necessary condition for implementing tree shaking.
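To make "static analysis" concrete, here is a minimal sketch that extracts import and export names from source text without executing it. It is a regex-based toy, not how webpack parses modules (webpack uses a real JavaScript parser); the scanEsm helper and the handled syntax forms are assumptions made for illustration.

```javascript
// Toy static scan: reads source text only, never runs the module.
// Hypothetical helper; handles just `import { a } from "x"` and `export const a`.
function scanEsm(source) {
  const imports = [];
  const exports = [];
  const importRe = /import\s*\{([^}]*)\}\s*from\s*['"]([^'"]+)['"]/g;
  const exportRe = /export\s+(?:const|let|var|function|class)\s+([A-Za-z_$][\w$]*)/g;
  let match;
  while ((match = importRe.exec(source)) !== null) {
    const names = match[1].split(",").map((s) => s.trim()).filter(Boolean);
    imports.push({ from: match[2], names });
  }
  while ((match = exportRe.exec(source)) !== null) {
    exports.push(match[1]);
  }
  return { imports, exports };
}

const indexSource = "import {bar} from './bar';\nconsole.log(bar);";
const barSource = "export const bar = 'bar';\nexport const foo = 'foo';";

console.log(JSON.stringify(scanEsm(indexSource)));
// {"imports":[{"from":"./bar","names":["bar"]}],"exports":[]}
console.log(JSON.stringify(scanEsm(barSource)));
// {"imports":[],"exports":["bar","foo"]}
```

Because the names are available from the text alone, a tool can compare the two results and see that foo is exported but never imported.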

1.3 example

For the following code:

// index.js
import {bar} from './bar';
console.log(bar);

// bar.js
export const bar = 'bar';
export const foo = 'foo';

In the example, the bar.js module exports bar and foo, but only the exported value bar is used by another module. After tree shaking, the foo variable is deleted as dead code.

2、 Implementation principle

In webpack, tree shaking is implemented in two parts: first, mark which of each module's exported values are never used; second, use Terser to delete these unused export statements. The marking process can be roughly divided into three steps:

  • In the make phase, collect each module's exported variables and record them in the ModuleGraph, the module dependency graph object
  • In the seal phase, traverse the ModuleGraph and mark whether each module's exported variables are used
  • When generating the output, drop the corresponding export statement if the variable is not used by other modules

The marking function is enabled by setting optimization.usedExports = true.

In other words, the effect of marking is to remove export statements that are not used by other modules, such as:

[Figure: source of index.js and bar.js, and the build output with the marking function enabled and disabled]

In the example, the bar.js module (second from the left) exports two variables, bar and foo. foo is not used by any other module, so after marking, the export statement for foo is removed from the build output (first on the right). For comparison, if the marking function is not enabled (optimization.usedExports = false), the export statement is retained whether or not the variable is used, as shown in the output code second from the right in the figure above.

Note that at this point the code for the foo variable, const foo='foo', is still fully retained. This is because marking only affects a module's export statements; the actual “shaking” is done by the Terser plugin. In the example above, once foo has been marked it is dead code (code that can never be executed), and its definition can only be deleted by the DCE feature that Terser provides, which completes the tree shaking effect.

Next, I will walk through the source code of the marking process and explain in detail how tree shaking is implemented in webpack 5. Readers who are not interested in the source code can skip directly to the next chapter.

2.1 collecting module exports

First, webpack needs to work out which values each module exports. This happens in the make phase, and the general process is as follows:

For more information on the make phase, see the earlier article "[ten thousand words summary] understand the core principles of webpack".

  1. Convert all ESM export statements of the module into Dependency objects and record them in the dependencies collection of the module object. Conversion rules:
  • A named export is converted to a HarmonyExportSpecifierDependency object
  • A default export is converted to a HarmonyExportExpressionDependency object

For example, for the following module:

export const bar = 'bar';
export const foo = 'foo';

export default 'foo-bar'

The corresponding dependencies value is:

[Figure: the dependencies array of the module above, containing the Dependency objects created for its export statements]

  2. After all modules have been compiled, the compilation.hooks.finishModules hook is triggered and the FlagDependencyExportsPlugin callback starts executing
  3. FlagDependencyExportsPlugin reads the module information stored in the ModuleGraph, starting from the entry, and traverses all module objects
  4. It traverses the dependencies array of each module object, finds all dependencies of the HarmonyExportXXXDependency types, converts them into ExportInfo objects and records them in the ModuleGraph

After FlagDependencyExportsPlugin has run, all ESM-style export statements are recorded in the ModuleGraph, and later steps can read a module's exported values directly from it.
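The collection step can be pictured with a toy model. The sketch below uses plain objects in place of webpack's Module, Dependency and ExportInfo classes; the data shapes, the collectExports helper and the ConstDependency entry are assumptions for illustration, and only the two HarmonyExport dependency names come from the text above.

```javascript
// Toy model only: plain objects stand in for webpack's internal classes.
const modules = [
  {
    id: "./src/bar.js",
    dependencies: [
      { type: "HarmonyExportSpecifierDependency", name: "bar" },
      { type: "HarmonyExportSpecifierDependency", name: "foo" },
      { type: "HarmonyExportExpressionDependency", name: "default" },
      { type: "ConstDependency" }, // not an export dependency, ignored below
    ],
  },
];

// Hypothetical helper: find export dependencies and record one info object each.
function collectExports(moduleList) {
  const moduleGraph = new Map(); // module id -> Map(export name -> export info)
  for (const mod of moduleList) {
    const exportsInfo = new Map();
    for (const dep of mod.dependencies) {
      if (dep.type.startsWith("HarmonyExport")) {
        // Usage is unknown at this stage; it is filled in by the marking step.
        exportsInfo.set(dep.name, { name: dep.name, used: undefined });
      }
    }
    moduleGraph.set(mod.id, exportsInfo);
  }
  return moduleGraph;
}

const graph = collectExports(modules);
console.log([...graph.get("./src/bar.js").keys()]); // [ 'bar', 'foo', 'default' ]
```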

Reference material:

  1. [ten thousand words summary] understand the core principles of webpack
  2. A little difficult webpack knowledge: dependency graph deep parsing

2.2 marking module exports

After the export information has been collected, webpack needs to mark, within each module's export list, which exported values are used by other modules and which are not. This happens in the seal phase, and the main process is as follows:

  1. The compilation.hooks.optimizeDependencies hook is triggered and the FlagDependencyUsagePlugin logic starts executing
  2. In FlagDependencyUsagePlugin, starting from the entry, traverse all module objects stored in the ModuleGraph step by step
  3. Traverse the exportInfo array corresponding to each module object
  4. For each exportInfo object, call the compilation.getDependencyReferencedExports method to determine whether the corresponding dependency object is used by other modules
  5. For every exported value used by any module, call the exportInfo.setUsedConditionally method to mark it as used
  6. Internally, exportInfo.setUsedConditionally modifies the exportInfo._usedInRuntime property to record how the export is used
  7. End

The description above is heavily simplified; there is a lot of branching logic and complex set operations in between. The key point is that the marking of module exports is concentrated in FlagDependencyUsagePlugin, and the result is ultimately recorded in the exportInfo._usedInRuntime dictionary corresponding to each module export statement.
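The marking walk can be sketched in the same toy style. Everything here is an assumption for illustration: the two lookup tables, the markUsedExports helper and the boolean used flag are simplified stand-ins, whereas webpack records usage per runtime in exportInfo._usedInRuntime.

```javascript
// Toy model only: these tables are not webpack's real ModuleGraph.
const exportsByModule = {
  "./src/bar.js": { bar: { used: false }, foo: { used: false } },
  "./src/index.js": {},
};

// Which exports each module references, as static analysis would report.
const referencesByModule = {
  "./src/index.js": [{ from: "./src/bar.js", name: "bar" }],
  "./src/bar.js": [],
};

// Hypothetical helper: walk from the entry and flag every referenced export.
function markUsedExports(entry) {
  const visited = new Set();
  const queue = [entry];
  while (queue.length > 0) {
    const id = queue.shift();
    if (visited.has(id)) continue;
    visited.add(id);
    for (const ref of referencesByModule[id] || []) {
      const info = exportsByModule[ref.from][ref.name];
      if (info) info.used = true; // plays the role of setUsedConditionally
      queue.push(ref.from);
    }
  }
}

markUsedExports("./src/index.js");
console.log(exportsByModule["./src/bar.js"]);
// { bar: { used: true }, foo: { used: false } }
```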

2.3 code generation

After the collection and marking steps, webpack has recorded in the ModuleGraph which values each module exports and which modules use each exported value. Next, webpack generates different code depending on how each exported value is used, for example:

[Figure: build output for bar.js, showing the export generated for bar and the absence of an export for foo]

Focus on the bar.js file. Of its two exported values, bar is used by the index.js module, so the corresponding __webpack_require__.d call is generated with "bar": ()=>(/* binding */ bar). By contrast, for foo only the definition statement is kept and no export is generated in the chunk.

For the contents of webpack's output and the meaning of the __webpack_require__.d method, see the article "Webpack principles Series 6: a thorough understanding of the webpack runtime".
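The shape of the generated code described above can be approximated by hand. The sketch below is not real webpack output: the __webpack_require__.d stand-in is simplified, and the barModule wrapper and its single-argument signature are assumptions chosen to keep the example short.

```javascript
// Simplified stand-in for webpack's runtime helper; the real one has more checks.
const __webpack_require__ = {};
__webpack_require__.d = (exports, definition) => {
  for (const key of Object.keys(definition)) {
    // Each export becomes a getter, so importers read the live binding.
    Object.defineProperty(exports, key, { enumerable: true, get: definition[key] });
  }
};

// Hand-written approximation of the wrapper generated for bar.js.
function barModule(__webpack_exports__) {
  __webpack_require__.d(__webpack_exports__, {
    bar: () => (/* binding */ bar),
  });
  const bar = "bar";
  const foo = "foo"; // still defined, but nothing exports or reads it
}

const barExports = {};
barModule(barExports);
console.log(barExports.bar); // bar
console.log("foo" in barExports); // false
```

Since foo is no longer reachable through the exports object, a minifier is free to treat its definition as dead code.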

This generation logic is implemented by the corresponding HarmonyExportXXXDependency classes. The general process is:

  1. In the packaging phase, call the HarmonyExportXXXDependency.Template.apply method to generate code
  2. In the apply method, read the exportsInfo stored in the ModuleGraph to determine which exported values are used and which are not
  3. Create HarmonyExportInitFragment objects for the used and the unused exported values respectively, and save them to the initFragments array
  4. Traverse the initFragments array to generate the final result

In short, this step uses the previously collected exportsInfo object to generate export statements for each of the module's exported values individually.
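As a rough illustration of "generate different code depending on usage", the sketch below turns a table of usage flags into export code. The renderExports helper and its input shape are hypothetical, and the emitted text only imitates the general look of webpack's output; it is not the logic of HarmonyExportInitFragment.

```javascript
// Hypothetical helper: emit a getter only for exports flagged as used.
function renderExports(exportsInfo) {
  const names = Object.keys(exportsInfo);
  const used = names.filter((name) => exportsInfo[name].used);
  const unused = names.filter((name) => !exportsInfo[name].used);
  const lines = [];
  if (unused.length > 0) {
    // Unused exports get a comment instead of a definition.
    const plural = unused.length > 1 ? "s" : "";
    lines.push(`/* unused harmony export${plural} ${unused.join(", ")} */`);
  }
  if (used.length > 0) {
    const getters = used.map((name) => `  "${name}": () => (${name})`).join(",\n");
    lines.push(`__webpack_require__.d(__webpack_exports__, {\n${getters}\n});`);
  }
  return lines.join("\n");
}

const rendered = renderExports({ bar: { used: true }, foo: { used: false } });
console.log(rendered);
// /* unused harmony export foo */
// __webpack_require__.d(__webpack_exports__, {
//   "bar": () => (bar)
// });
```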

2.4 delete dead code

After the previous steps, the unused values in a module's export list are no longer defined on the __webpack_exports__ object, which leaves them as unreachable dead code, such as the foo variable in the example above:

[Figure: build output in which foo is still defined but no longer exported]

After that, DCE tools such as Terser and UglifyJS “shake” off this dead code, completing the tree shaking operation.

2.5 summary

To sum up, the implementation of tree shaking in webpack is divided into the following steps:

  • In the FlagDependencyExportsPlugin plugin, collect module export values from each module's dependencies list and record them in the exportsInfo of the ModuleGraph
  • In the FlagDependencyUsagePlugin plugin, collect how each module's exported values are used and record it in the exportInfo._usedInRuntime collection
  • In the HarmonyExportXXXDependency.Template.apply method, generate different export statements depending on how the exported values are used
  • Use a DCE tool to delete the dead code and achieve the complete tree shaking effect

These implementation details assume a fair amount of background knowledge. Readers are advised to read them alongside the following articles:

  1. [ten thousand words summary] understand the core principles of webpack
  2. A little difficult webpack knowledge: dependency graph deep parsing
  3. Webpack principles Series 6: a thorough understanding of the webpack runtime

3、 Best practices

Although webpack has natively supported tree shaking since 2.x, the dynamic nature of JS and the complexity of modules mean that many problems caused by code side effects remain unsolved even in the latest 5.0 release, so the optimization is not as thorough as tree shaking originally promised. Developers therefore need to consciously optimize their code structure, or use some supplementary techniques, to help webpack detect dead code more accurately and complete the tree shaking operation.

3.1 avoid meaningless assignment

When using webpack, consciously avoid unnecessary assignment operations. Consider the following example code:

[Figure: index.js imports foo from bar.js and assigns it to a variable f that is never used]

In the example, the index.js module imports foo from the bar.js module and assigns it to the variable f, but neither foo nor f is used afterwards. In this scenario the foo value exported by bar.js is not actually used and should be deleted, yet webpack's tree shaking does not take effect and the foo export remains in the output:

[Figure: build output in which the foo export is retained]

The superficial reason is that webpack's tree shaking logic stays at the level of static code analysis and only makes simple checks:

  • Whether the module's exported variable is referenced by other modules
  • Whether the variable appears in the body code of the referencing module

It does not analyze further whether the exported value is actually put to effective use.

The deeper reason is that JavaScript assignment statements are not pure. Depending on the scenario, they may produce unexpected side effects, such as:

import { bar, foo } from "./bar";

let count = 0;

const mock = {}

Object.defineProperty(mock, 'f', {
    set(v) {
        mock._f = v;
        count += 1;
    }
})

mock.f = foo;

console.log(count);

In the example, Object.defineProperty is applied to the mock object, so the assignment statement mock.f = foo has a side effect on the count variable. In such a scenario, even sophisticated dynamic semantic analysis would struggle to shake off every useless branch and leaf while still preserving the side effects correctly.
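The side effect is easy to observe in a single file. The sketch below repeats the same pattern with one assumption: a local foo constant stands in for the value imported from ./bar so that the snippet runs on its own.

```javascript
const foo = "foo"; // stands in for the value imported from ./bar

let count = 0;
const mock = {};

// The setter runs on every assignment to mock.f and mutates outer state.
Object.defineProperty(mock, "f", {
  set(v) {
    mock._f = v;
    count += 1;
  },
});

const f = foo; // plain assignment: no observable effect
mock.f = foo; // looks identical in shape, but triggers the setter

console.log(count); // 1
```

Both assignments look the same to a simple static check, which is why a tool that cannot prove the absence of a setter has to keep the reference to foo.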

Therefore, when using webpack, developers need to consciously avoid such meaningless assignment operations.

3.2 use #pure to label pure function calls

Like assignment statements, function calls in JavaScript may also have side effects, so by default webpack does not tree shake function calls. However, developers can add a /*#__PURE__*/ annotation before a call to tell webpack explicitly that this call has no side effects on its context, for example:

[Figure: two calls to foo, one without and one with the /*#__PURE__*/ annotation, and the resulting build output]

In the example, the call foo('be retained') carries no /*#__PURE__*/ annotation, so its code is retained. By contrast, foo('be removed') carries the pure declaration and is removed by tree shaking.
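A minimal sketch of where the annotation goes. The foo helper below is a hypothetical pure function written for this illustration, and the variable names are assumptions; the two call arguments follow the example described above.

```javascript
// Hypothetical pure helper: the result depends only on the argument.
function foo(label) {
  return label.toUpperCase();
}

// No annotation: tools must assume the call may have side effects and keep it.
const kept = foo("be retained");

// Annotated: the author promises the call is side-effect free, so a minifier
// such as Terser may drop it when the result (`dropped`) is never used.
const dropped = /*#__PURE__*/ foo("be removed");

console.log(kept); // BE RETAINED
```

Run unminified, both calls still execute; the annotation only matters to the optimizer. It is a promise made by the author, so annotating a call that really does have side effects can change program behavior after minification.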

3.3 prevent Babel from transpiling module import and export statements

Babel is a very popular JavaScript transpiler that converts newer JS code into older, more compatible code, so that front-end developers can use the latest language features while still supporting older browsers.

However, some of Babel's features cause tree shaking to fail. For example, Babel can transpile ESM-style import/export statements into CommonJS-style module statements, but this prevents webpack from statically analyzing the imports and exports of the transpiled modules, for example:

[Figure: build output with Babel's modules option set to 'commonjs' compared with modules set to false]

The example uses babel-loader to process *.js files and sets the Babel option modules = 'commonjs', so the module scheme is transpiled from ESM to CommonJS. As a result, in the transpiled code (the upper one on the right of the figure) the unused exported value foo is not marked correctly. For comparison, the second one on the right shows the build output with modules = false, where the foo variable is correctly marked as dead code.

Therefore, when using babel-loader in webpack, it is recommended to set the modules option of babel-preset-env to false, which turns off the transpilation of module import and export statements.
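A sketch of what such a setup might look like. It assumes Babel 7, where the preset is published as @babel/preset-env; the entry path and the rule's test pattern are placeholders rather than values taken from the original example.

```javascript
// webpack.config.js
module.exports = {
  entry: "./src/index",
  mode: "production",
  module: {
    rules: [
      {
        test: /\.js$/,
        use: {
          loader: "babel-loader",
          options: {
            presets: [
              // modules: false keeps import/export intact for webpack to analyze
              ["@babel/preset-env", { modules: false }],
            ],
          },
        },
      },
    ],
  },
  optimization: {
    usedExports: true,
  },
};
```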

3.4 optimizing the granularity of exported values

Tree shaking logic operates on ESM export statements, so for an export such as the following:

export default {
    bar: 'bar',
    foo: 'foo'
}

Even if only one property of the default export is used, the entire default object is retained in full. In practice, therefore, keep exported values as granular and atomic as possible. An optimized version of the example above:

const bar = 'bar'
const foo = 'foo'

export {
    bar,
    foo
}

3.5 using a package that supports tree shaking

If possible, try to use npm packages that support tree shaking, for example:

  • Use lodash-es instead of lodash, or use babel-plugin-lodash to achieve a similar effect

However, not every npm package leaves room for tree shaking. Frameworks such as React and Vue 2 already optimize their production builds heavily, and business code usually needs the full functionality of the whole package, so tree shaking brings little benefit there.