Webpack Series Part 3: A deep analysis of the dependency graph

Time: 2022-6-6

This article is about 2,500 words long and takes about 30 minutes to read. If you find it useful, likes and follows are welcome, since writing it was not easy. Reprinting it in any form without the author's consent is prohibited.

Background

The concept of the dependency graph comes from the official documentation page "Dependency Graph | webpack", which explains it as follows:

Any time one file depends on another, webpack treats this as a dependency. This allows webpack to take non-code assets, such as images or web fonts, and also provide them as dependencies for your application.

When webpack processes your application, it starts from a list of modules defined on the command line or in its configuration file. Starting from these entry points, webpack recursively builds a dependency graph that includes every module your application needs, then bundles all of those modules into a small number of bundles – often, just one – to be loaded by the browser.

In other words, when webpack processes application code, it starts from the entries provided by the developer, recursively builds a dependency graph that contains all modules, and then packages these modules into bundles.

However, the reality is far less simple than the official description suggests. The dependency graph runs through webpack's entire run cycle: module parsing in the make phase, chunk generation in the seal phase, and the tree shaking feature all rely heavily on it. It is a core data structure in webpack's resource build process.

This article focuses on the dependency graph implementation in webpack@v5.x and discusses it from three angles:

  • What data structure represents the dependency graph in the webpack implementation
  • How webpack collects dependencies between modules at run time and builds the dependency graph
  • How the dependency graph is consumed after it has been built

After reading this article, you will have a deeper understanding of how webpack handles module parsing. Combined with the previous article, "[10,000-word summary] Thoroughly understand the core principles of webpack", it should give you a firmer grasp of webpack's core mechanisms.


Dependency Graph

This section digs into the webpack source code to explain the internal data structure of the dependency graph and how dependencies are collected. Before we begin, it is worth reviewing several important webpack concepts:

  • Module: the object that represents a resource inside webpack. It holds the resource's path, context, dependencies, content, and other information.
  • Dependency: a reference from one module to another, such as an import "a.js" statement. Webpack first expresses the reference as a Dependency subclass and associates it with the module object. Once the current module's content has been parsed, the next cycle begins and converts each Dependency object into an appropriate Module subclass.
  • Chunk: the object used to organize the output structure. After webpack has analyzed the content of all module resources and built a complete dependency graph, it creates one or more Chunk instances according to the user configuration and the content of the dependency graph. Each chunk roughly corresponds to one final output file.

Data structure

In webpack 4.x, the dependency graph implementation is relatively simple: references are mainly recorded through a series of properties built into Dependency and Module.

Since webpack 5.0, a more elaborate class structure records the dependencies between modules, decoupling the dependency-related logic from Dependency and Module into a set of independent types. The main types are:

  • ModuleGraph: a container that records dependency graph information. On the one hand, it stores all the module and dependency objects involved in the build, along with the references between them. On the other hand, it provides utility methods that let users quickly read additional information about a module or dependency.
  • ModuleGraphConnection: a data structure that records a reference relationship between modules. Internally, the originModule property records the parent (referencing) module and the module property records the child (referenced) module. It also provides a set of helper functions for checking whether the reference is valid.
  • ModuleGraphModule: supplementary information for a Module object within the dependency graph system. It includes incomingConnections, the set of ModuleGraphConnection objects that point to the module itself (that is, who references this module), and outgoingConnections, the module's outward dependencies (that is, which other modules this module references).

The relationship between classes is roughly as follows:

Two points about the class diagram above deserve extra attention:

  • The ModuleGraph object uses its _dependencyMap property to record the mapping between Dependency objects and ModuleGraphConnection objects. Based on this mapping, later processing can quickly find the referencing module and the referenced module for a given Dependency instance.
  • The ModuleGraph object uses its _moduleMap property to attach ModuleGraphModule information to each module. The most important job of ModuleGraphModule is to record which modules a module references and which modules reference it, so later processing can find all dependencies and dependents of a module instance through this property.
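The two lookups described above can be sketched with plain JavaScript objects. This is a simplified illustration rather than the webpack source: the class names mirror the real ones, but the fields are reduced to the ones discussed here, and the module and dependency objects ({ request: ... }) are hypothetical stand-ins.

```javascript
// Simplified stand-ins; the real classes in webpack/lib carry many more fields.
class ModuleGraphConnection {
  constructor(originModule, dependency, module) {
    this.originModule = originModule; // the referencing (parent) module
    this.dependency = dependency; // the dependency that produced this edge
    this.module = module; // the referenced (child) module
  }
}

class ModuleGraphModule {
  constructor() {
    this.incomingConnections = new Set(); // who references this module
    this.outgoingConnections = undefined; // created on the first outgoing edge
  }
}

// Hypothetical module and dependency objects for illustration only.
const index = { request: "./src/index.js" };
const a = { request: "./src/a.js" };
const dep = { request: "./src/a.js" };

const dependencyMap = new Map(); // Dependency -> ModuleGraphConnection
const moduleMap = new Map([
  [index, new ModuleGraphModule()],
  [a, new ModuleGraphModule()],
]); // Module -> ModuleGraphModule

// Wire one edge by hand: index.js references a.js through `dep`.
const connection = new ModuleGraphConnection(index, dep, a);
dependencyMap.set(dep, connection);
moduleMap.get(a).incomingConnections.add(connection);
moduleMap.get(index).outgoingConnections = new Set([connection]);

// Lookup 1: from a dependency to both ends of the reference.
const found = dependencyMap.get(dep);
console.log(found.originModule.request, "->", found.module.request);

// Lookup 2: from a module to everyone who references it.
const referrers = [...moduleMap.get(a).incomingConnections].map(
  (c) => c.originModule.request
);
console.log(referrers);
```

Keeping one map keyed by dependency and another keyed by module means either kind of question can be answered without scanning the whole graph.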

Dependency collection process

ModuleGraph, ModuleGraphConnection, and ModuleGraphModule work together to gradually collect the dependencies between modules during webpack's build process (the make phase). Recall the build flow chart from the previous article, "[10,000-word summary] Thoroughly understand the core principles of webpack":

The build process itself is very complicated, so readers are advised to read this alongside "[10,000-word summary] Thoroughly understand the core principles of webpack" for a deeper understanding. Dependency collection mainly happens at two points:

  • addDependency: after webpack parses a reference relationship out of the module content, it creates an appropriate Dependency subclass and calls this method to record it on the module instance
  • handleModuleCreation: after a module has been parsed, webpack traverses the parent module's dependency collection and calls this method to create the child module object for each Dependency, then calls the compilation.moduleGraph.setResolvedModule method to record the parent-child reference on the moduleGraph object
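The two steps above can be sketched as a small recursive loop. This is a heavily simplified model, not webpack's implementation: MiniCompilation, buildModule, and the fakeProject table are hypothetical names, parsing is replaced by a lookup table, and the real process is asynchronous and queue-based.

```javascript
// Hypothetical in-memory "project": module request -> requests it imports.
const fakeProject = {
  "./src/index.js": ["./src/a.js", "./src/b.js"],
  "./src/a.js": [],
  "./src/b.js": [],
};

class MiniCompilation {
  constructor(project) {
    this.project = project;
    this.modules = new Map(); // request -> module object
    this.connections = []; // collected { originModule, dependency, module } edges
  }

  // Step 1: "parse" a module and record one dependency object per import
  // (this stands in for the addDependency calls made during parsing).
  buildModule(module) {
    for (const request of this.project[module.request]) {
      module.dependencies.push({ request });
    }
  }

  // Step 2: turn a dependency into a module, record the edge, then recurse.
  handleModuleCreation(originModule, dependency) {
    let module = this.modules.get(dependency.request);
    const isNew = module === undefined;
    if (isNew) {
      module = { request: dependency.request, dependencies: [] };
      this.modules.set(dependency.request, module);
    }
    // Stands in for compilation.moduleGraph.setResolvedModule(...).
    this.connections.push({ originModule, dependency, module });
    if (!isNew) return; // already built; also stops circular imports from looping
    this.buildModule(module);
    for (const dep of module.dependencies) {
      this.handleModuleCreation(module, dep);
    }
  }
}

const compilation = new MiniCompilation(fakeProject);
// The entry has no referencing module, so originModule is null.
compilation.handleModuleCreation(null, { request: "./src/index.js" });

const edges = compilation.connections.map(
  (c) => `${c.originModule ? c.originModule.request : null} -> ${c.module.request}`
);
console.log(edges);
```

Each call records exactly one edge, so the number of collected connections equals the number of dependencies processed, including the entry dependency.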

The logic of the setResolvedModule method is as follows:

class ModuleGraph {
    constructor() {
        /** @type {Map<Dependency, ModuleGraphConnection>} */
        this._dependencyMap = new Map();
        /** @type {Map<Module, ModuleGraphModule>} */
        this._moduleMap = new Map();
    }

    /**
     * @param {Module} originModule the referencing module
     * @param {Dependency} dependency the referencing dependency
     * @param {Module} module the referenced module
     * @returns {void}
     */
    setResolvedModule(originModule, dependency, module) {
        // Create the edge: originModule (parent) references module (child)
        const connection = new ModuleGraphConnection(
            originModule,
            dependency,
            module,
            undefined,
            dependency.weak,
            dependency.getCondition(this)
        );
        // Index the edge by the dependency that produced it
        this._dependencyMap.set(dependency, connection);
        // Record the edge as incoming on the referenced module
        const connections = this._getModuleGraphModule(module).incomingConnections;
        connections.add(connection);
        // Record the edge as outgoing on the referencing module
        const mgm = this._getModuleGraphModule(originModule);
        if (mgm.outgoingConnections === undefined) {
            mgm.outgoingConnections = new Set();
        }
        mgm.outgoingConnections.add(connection);
    }
}

The code above mainly updates _dependencyMap and the incomingConnections and outgoingConnections properties of ModuleGraphModule, thereby recording the upstream and downstream dependencies of the current module.

Example walkthrough

Take a simple example in which the entry ./src/index.js depends on ./src/a.js and ./src/b.js:

After webpack starts, it recursively calls the compilation.handleModuleCreation function during the build phase to fill in the dependency graph step by step, and may end up with data like the following:

ModuleGraph: {
    _dependencyMap: Map(3){
        { 
            EntryDependency{request: "./src/index.js"} => ModuleGraphConnection{
                module: NormalModule{request: "./src/index.js"}, 
                // The entry module has no referrer, so originModule is null
                originModule: null
            } 
        },
        { 
            HarmonyImportSideEffectDependency{request: "./src/a.js"} => ModuleGraphConnection{
                module: NormalModule{request: "./src/a.js"}, 
                originModule: NormalModule{request: "./src/index.js"}
            } 
        },
        { 
            HarmonyImportSideEffectDependency{request: "./src/b.js"} => ModuleGraphConnection{
                module: NormalModule{request: "./src/b.js"}, 
                originModule: NormalModule{request: "./src/index.js"}
            } 
        }
    },

    _moduleMap: Map(3){
        NormalModule{request: "./src/index.js"} => ModuleGraphModule{
            incomingConnections: Set(1) [
                // Entry module, so the corresponding originModule is null
                ModuleGraphConnection{ module: NormalModule{request: "./src/index.js"}, originModule:null }
            ],
            outgoingConnections: Set(2) [
                // From index to module a
                ModuleGraphConnection{ module: NormalModule{request: "./src/a.js"}, originModule: NormalModule{request: "./src/index.js"} },
                // From index to module b
                ModuleGraphConnection{ module: NormalModule{request: "./src/b.js"}, originModule: NormalModule{request: "./src/index.js"} }
            ]
        },
        NormalModule{request: "./src/a.js"} => ModuleGraphModule{
            incomingConnections: Set(1) [
                ModuleGraphConnection{ module: NormalModule{request: "./src/a.js"}, originModule: NormalModule{request: "./src/index.js"} }
            ],
            // Module a has no dependencies of its own, so outgoingConnections is undefined
            outgoingConnections: undefined
        },
        NormalModule{request: "./src/b.js"} => ModuleGraphModule{
            incomingConnections: Set(1) [
                ModuleGraphConnection{ module: NormalModule{request: "./src/b.js"}, originModule: NormalModule{request: "./src/index.js"} }
            ],
            // Module b has no dependencies of its own, so outgoingConnections is undefined
            outgoingConnections: undefined
        }
    }
}

As the dependency graph above shows, ModuleGraph._moduleMap essentially forms a directed graph (acyclic in this example, although circular imports can introduce cycles in real projects). The keys of the _moduleMap dictionary are the nodes of the graph, and the outgoingConnections property of each corresponding ModuleGraphModule value holds the edges. In the example above, starting from index.js and following the outgoingConnections edges forward, every vertex of the graph can be reached.
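The forward traversal described above can be sketched as a breadth-first walk. The snippet below rebuilds the example data with plain objects (the edge helper and collectReachable function are hypothetical names for illustration, not webpack APIs) and then follows outgoingConnections from the entry.

```javascript
// Helper that builds a minimal connection object (illustrative only).
function edge(originModule, module) {
  return { originModule, module };
}

const index = { request: "./src/index.js" };
const a = { request: "./src/a.js" };
const b = { request: "./src/b.js" };

// Plain-object stand-in for ModuleGraph._moduleMap from the example above.
const moduleMap = new Map([
  [
    index,
    {
      incomingConnections: new Set([edge(null, index)]),
      outgoingConnections: new Set([edge(index, a), edge(index, b)]),
    },
  ],
  [a, { incomingConnections: new Set([edge(index, a)]), outgoingConnections: undefined }],
  [b, { incomingConnections: new Set([edge(index, b)]), outgoingConnections: undefined }],
]);

// Breadth-first walk: nodes are map keys, edges are outgoingConnections.
function collectReachable(graph, entry) {
  const visited = new Set([entry]);
  const queue = [entry];
  while (queue.length > 0) {
    const current = queue.shift();
    const mgm = graph.get(current);
    // outgoingConnections is undefined for modules without dependencies.
    for (const connection of (mgm && mgm.outgoingConnections) || []) {
      if (!visited.has(connection.module)) {
        visited.add(connection.module);
        queue.push(connection.module);
      }
    }
  }
  return [...visited].map((m) => m.request);
}

const reachable = collectReachable(moduleMap, index);
console.log(reachable);
```

The visited set ensures each module is processed once, which also keeps the walk finite if two modules import each other.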

Role

Taking webpack@v5.16.0 as an example, the keyword moduleGraph appears 1277 times and touches almost every file under the webpack/lib folder, which shows how important it is. Despite this frequency, its uses fall into two main categories: indexing information, and being transformed into a ChunkGraph to determine the output structure.

Information indexing

The ModuleGraph type provides many utility functions for querying module and dependency information, such as:

  • getModule(dep: Dependency): finds the module instance that corresponds to dep
  • getOutgoingConnections(module: Module): finds all dependencies of the module instance
  • getIssuer(module: Module): finds where the module is referenced from (for more about the issuer mechanism, see my other article, "Ten-minute webpack: a detailed explanation of the module.issuer attribute")

And so on.

In webpack@v5.x, many plug-ins, Dependency subclasses, and Module subclasses rely on these utility functions to look up information about specific modules and dependencies. For example:

  • While optimizing chunks, SplitChunksPlugin uses moduleGraph.getExportsInfo to query each module's exportsInfo (the set of information about a module's exports, which is closely tied to tree shaking and will be covered in a separate article) to decide how to split chunks
  • In the compilation.seal function, webpack traverses the dep objects that correspond to the entries and calls moduleGraph.getModule to get the complete module definition

So when you write plug-ins, consider consulting webpack/lib/ModuleGraph.js to find out which functions can give you the information you need.
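As a rough illustration of using these query functions from a plug-in, here is a minimal sketch that logs each module's issuer and the modules it references. ModuleGraphReportPlugin is a hypothetical name, and tapping the finishModules hook is an assumption made for this example rather than something prescribed by the article. The calls getIssuer, getOutgoingConnections, and module.identifier() are the webpack 5 APIs, but check them against the ModuleGraph.js of your installed version.

```javascript
// Sketch of a plug-in that reports module relationships via moduleGraph.
class ModuleGraphReportPlugin {
  // `log` is injectable so the output can be captured; defaults to console.log.
  constructor(log = console.log) {
    this.log = log;
  }

  apply(compiler) {
    const name = "ModuleGraphReportPlugin";
    compiler.hooks.compilation.tap(name, (compilation) => {
      // Assumption: finishModules runs once all modules have been built,
      // so the graph collected during the make phase is complete here.
      compilation.hooks.finishModules.tap(name, (modules) => {
        const { moduleGraph } = compilation;
        for (const module of modules) {
          const issuer = moduleGraph.getIssuer(module);
          const targets = [];
          for (const connection of moduleGraph.getOutgoingConnections(module)) {
            // connection.module can be empty for unresolved references.
            if (connection.module) targets.push(connection.module.identifier());
          }
          this.log(
            `${module.identifier()} (issuer: ${
              issuer ? issuer.identifier() : "none"
            }) -> [${targets.join(", ")}]`
          );
        }
      });
    });
  }
}
```

To try it, add new ModuleGraphReportPlugin() to the plugins array in webpack.config.js. Note that the same target can appear more than once in the output, because each dependency produces its own connection.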

Building ChunkGraph

In webpack's main flow, once the make (build) phase finishes, webpack enters the seal phase and starts working out how to organize the output. In webpack@v4.x, the seal phase mainly revolves around the Chunk and ChunkGroup types. Since 5.0, similar to the dependency graph, a new ChunkGraph graph structure has been introduced to implement the resource generation algorithm.

In the compilation.seal function, webpack first organizes each entry into a chunk according to the default rules, then calls the buildChunkGraph method defined in webpack/lib/buildChunkGraph.js, which traverses the moduleGraph object generated in the make phase and converts the module dependencies into a chunkGraph object.

This logic is also very complex and will not be expanded here. A separate article will later explain the chunk, chunkGroup, chunkGraph, and related objects.

Summary

The dependency graph discussed in this article is used widely inside webpack, so understanding it helps a great deal when reading the webpack source code or learning to write plug-ins and loaders. The analysis also uncovered several new blind spots:

  • What is the complete mechanism behind Chunk?
  • How is the complete Dependency system implemented, and what is its role?
  • How is a module's exportsInfo collected, and how is it used in the tree shaking process?

If you are also interested in these questions, feel free to like this article. More practical articles about webpack will follow.
