What you need to know about Node.js modularization

Time:2021-6-11

1. Preface

As we know, Node.js manages modules according to the CommonJS specification. Modularization is indispensable in complex business scenarios. You probably use it every day without ever having studied it systematically. So today, let’s go through what you need to know about Node.js modularization and see how it really works.

2. Main text

In Node.js, two built-in objects drive module management, and both are very familiar names: require and module. Built-in means that we can use them in any module without importing them first, unlike other modules.

There is no need to call require('require') or require('module').

Referencing a module in Node.js is very simple:

const config = require('/path/to/file')

But in fact, this one line goes through a total of five steps: resolving, loading, wrapping, evaluating, and caching.

Understanding these five steps will help us understand the basic principles of Node.js modularization, and also help us spot some pitfalls. Let’s briefly summarize what each of the five steps does:

  • Resolving: Find the target module to be referenced and generate its absolute path.
  • Loading: Determine the type of the module content, which may be a .json file, a .js file, or a .node file.
  • Wrapping: As the name suggests, wrap the referenced module in a function. Wrapping gives the module a private scope.
  • Evaluating: Actually compile and execute the loaded module code.
  • Caching: Cache the module, so that requiring the same module again does not repeat the steps above.
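
The five steps above can be sketched in a few lines of JavaScript. This is only a simplified illustration under stated assumptions, not the actual Node.js loader: miniRequire and miniCache are hypothetical names, only file paths ending in .js or .json are handled, and error handling is omitted.

```javascript
// A toy version of require(); NOT how Node.js is really implemented.
const fs = require('fs');
const os = require('os');
const path = require('path');
const vm = require('vm');

const miniCache = Object.create(null); // hypothetical cache, keyed by absolute path

function miniRequire(request) {
  // 1. Resolving: turn the request into an absolute file path.
  const filename = path.resolve(request);

  // 5. Caching (checked first): reuse a module that was already loaded.
  if (miniCache[filename]) return miniCache[filename].exports;

  const mod = { id: filename, filename, exports: {}, loaded: false };
  miniCache[filename] = mod;

  // 2. Loading: choose a strategy based on the file extension.
  const source = fs.readFileSync(filename, 'utf8');
  if (path.extname(filename) === '.json') {
    mod.exports = JSON.parse(source);
  } else {
    // 3. Wrapping: put the source inside a function to give it a private scope.
    const wrapped =
      `(function (exports, require, module, __filename, __dirname) {${source}\n})`;
    // 4. Evaluating: compile the wrapper, then call it with the module's objects.
    const fn = vm.runInThisContext(wrapped, { filename });
    fn.call(mod.exports, mod.exports, miniRequire, mod, filename, path.dirname(filename));
  }

  mod.loaded = true;
  return mod.exports;
}

// Usage: write a throwaway module into a temp directory and load it twice.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'mini-require-'));
const file = path.join(dir, 'greet.js');
fs.writeFileSync(file, "module.exports = { hello: 'world' };");

const first = miniRequire(file);
const second = miniRequire(file);
console.log(first.hello, first === second); // world true
```

The second call returns the very same object as the first, because the cache lookup short-circuits the remaining steps.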

Some readers may already know these five steps and the principles behind them well. Others may still have questions. Either way, the rest of this article analyzes each step in detail, in the hope of answering those questions, consolidating what you know, and filling in any gaps.

By the way, you can follow along as I do: create an experiment directory and try out the demos yourself.

2.1 What is a module

If you want to understand modularity, you need to have an intuitive look at what a module is.

We know that in Node.js, files are modules. As just mentioned, modules can be .js, .json, or .node files. By requiring them, you can get utility functions, variables, configuration, and so on. But what does a module actually look like? Start the Node.js REPL and evaluate module to see the structure of the module object:

~/learn-node $ node
> module
Module {
  id: '<repl>',
  exports: {},
  parent: undefined,
  filename: null,
  loaded: false,
  children: [],
  paths: [ ... ] }

You can see that a module is an ordinary object, but it has several special attributes that we need to understand one by one. Some of them, such as id, parent, filename, and children, hardly need explanation; their names say what they are.

The following content will help you understand the meaning and function of these fields.

2.2 Resolving

Now that we have a general idea of what a module is, we start with the first step, resolving, to understand how modularization works: how Node.js finds the target module and generates its absolute path.

So why did we print the module object and look at its structure first? Because two of its fields, id and paths, are closely related to the resolving step. Let’s look at each.

  • First, the id attribute

Each module has an id attribute, which is usually the full path of the module file. Node.js uses it to identify and locate the module. Here, though, there is no specific module file: we just printed the module structure in the REPL, so id has the default value <repl> (REPL stands for read-eval-print loop, the interactive interpreter).

  • Second, the paths attribute

What does the paths attribute do? Node.js lets us reference modules in several ways: by relative path, by absolute path, or by a bare name that is looked up in preset paths (explained shortly). Suppose we need to reference a module called find-me. How does require find it?

require('find-me')

Let’s first print out what’s in paths:

~/learn-node $ node
> module.paths
[ '/Users/samer/learn-node/repl/node_modules',
  '/Users/samer/learn-node/node_modules',
  '/Users/samer/node_modules',
  '/Users/node_modules',
  '/node_modules',
  '/Users/samer/.node_modules',
  '/Users/samer/.node_libraries',
  '/usr/local/Cellar/node/7.7.1/lib/node' ]

As you can see, it is simply a list of absolute system paths. These are all the locations where a target module may live, and they are ordered: Node.js searches the paths in the order listed. Once the module is found, its absolute path is produced for subsequent use.
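
The node_modules part of such a list can be reproduced by walking from a starting directory up to the file system root. The helper below is a hypothetical sketch (nodeModulesPaths is not a Node.js API); it uses path.posix so the result is the same on every platform, and it leaves out the extra global folders that appear at the end of the REPL output.

```javascript
const path = require('path');

// Hypothetical helper: list the node_modules directories that would be
// searched, starting at `startDir` and moving up one level at a time.
function nodeModulesPaths(startDir) {
  const result = [];
  let dir = path.posix.resolve(startDir);
  while (true) {
    // Avoid producing ".../node_modules/node_modules".
    if (path.posix.basename(dir) !== 'node_modules') {
      result.push(path.posix.join(dir, 'node_modules'));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root
    dir = parent;
  }
  return result;
}

const lookup = nodeModulesPaths('/Users/samer/learn-node');
console.log(lookup);
// [ '/Users/samer/learn-node/node_modules',
//   '/Users/samer/node_modules',
//   '/Users/node_modules',
//   '/node_modules' ]
```

The nearest directory comes first, which is why a module in the project’s own node_modules wins over one installed higher up.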

Now we know that Node.js looks for modules in this list of directories. Try executing require('find-me') to find the find-me module. Since we have not placed a find-me module in any of these directories, Node.js cannot find it after traversing all of them, and reports the error Cannot find module 'find-me':

~/learn-node $ node
> require('find-me')
Error: Cannot find module 'find-me'
    at Function.Module._resolveFilename (module.js:470:15)
    at Function.Module._load (module.js:418:25)
    at Module.require (module.js:498:17)
    at require (internal/module.js:20:19)
    at repl:1:1
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:336:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:533:10)

Now try putting the find-me module in any of the directories above. Here we create a node_modules directory and a find-me.js file in it so that Node.js can find the module:

~/learn-node $ mkdir node_modules
 
~/learn-node $ echo "console.log('I am not lost');" > node_modules/find-me.js
 
~/learn-node $ node
> require('find-me');
I am not lost
{}
>

After we manually created the find-me.js file, Node.js found the target module. Note that once Node.js finds the find-me module in the local node_modules directory, it does not continue searching the subsequent directories.

Readers with Node.js development experience will know that a reference does not have to name an exact file. You can also reference a module by its directory, for example:

~/learn-node $ mkdir -p node_modules/find-me
 
~/learn-node $ echo "console.log('Found again.');" > node_modules/find-me/index.js
 
~/learn-node $ node
> require('find-me');
Found again.
{}
>

The index.js file in the find-me directory is loaded automatically.

Of course, there are rules. Node.js finds the index.js file in the find-me directory because, by default, it looks for index.js when no specific file name is given. We can change this rule through the main field of package.json, for example switching the entry file from index.js to main.js:

~/learn-node $ echo "console.log('I rule');" > node_modules/find-me/main.js
 
~/learn-node $ echo '{ "name": "find-me-folder", "main": "main.js" }' > node_modules/find-me/package.json
 
~/learn-node $ node
> require('find-me');
I rule
{}
>
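
The directory rule can be sketched as a small function. This is a simplified illustration, assuming only the two cases shown above (a main field in package.json, then index.js); resolveDirectory is a hypothetical name, and the real resolver handles more cases, such as trying file extensions.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: decide which file a directory reference points to.
function resolveDirectory(dir) {
  const pkgPath = path.join(dir, 'package.json');
  if (fs.existsSync(pkgPath)) {
    const { main } = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
    if (main) {
      const candidate = path.join(dir, main);
      if (fs.existsSync(candidate)) return candidate; // "main" takes priority
    }
  }
  const index = path.join(dir, 'index.js'); // default entry file
  return fs.existsSync(index) ? index : null;
}

// Usage: a throwaway directory with index.js first, then main.js + package.json.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'find-me-'));
fs.writeFileSync(path.join(dir, 'index.js'), "console.log('Found again.');");
const before = path.basename(resolveDirectory(dir));

fs.writeFileSync(path.join(dir, 'main.js'), "console.log('I rule');");
fs.writeFileSync(
  path.join(dir, 'package.json'),
  JSON.stringify({ name: 'find-me-folder', main: 'main.js' })
);
const after = path.basename(resolveDirectory(dir));

console.log(before, after); // index.js main.js
```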

2.3 require.resolve

If you only want to resolve a module’s path without loading or executing it, you can use the require.resolve method. It finds the module the same way require does, but it does not load the file; it only returns the resolved path:

> require.resolve('find-me');
'/Users/samer/learn-node/node_modules/find-me/main.js'
> require.resolve('not-there');
Error: Cannot find module 'not-there'
    at Function.Module._resolveFilename (module.js:470:15)
    at Function.resolve (internal/module.js:27:19)
    at repl:1:9
    at ContextifyScript.Script.runInThisContext (vm.js:23:33)
    at REPLServer.defaultEval (repl.js:336:29)
    at bound (domain.js:280:14)
    at REPLServer.runBound [as eval] (domain.js:293:12)
    at REPLServer.onLine (repl.js:533:10)
    at emitOne (events.js:101:20)
    at REPLServer.emit (events.js:191:7)
>

You can see that if the module is found, node.js will print the full path of the module. If it is not found, an error will be reported.
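
Because require.resolve throws when nothing is found, it is often wrapped in a try/catch to test whether a module is available. The sketch below assumes a hypothetical helper named tryResolve and a module name that is presumed not to be installed.

```javascript
// Hypothetical helper: return the resolved path, or null if the module is missing.
function tryResolve(request) {
  try {
    return require.resolve(request);
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return null;
    throw err; // anything else is unexpected, so re-throw it
  }
}

// Assumed not to exist anywhere on the lookup paths.
const missing = tryResolve('find-me-module-that-is-not-installed');
// Built-in modules resolve to their own name.
const builtin = tryResolve('path');

console.log(missing, builtin); // null path
```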

After learning how node.js looks for modules, let’s see how node.js loads modules.

2.4 Parent-child dependency between modules

We express the reference relationship between modules as parent-child dependency.

Create a lib/util.js file containing a single console.log statement that identifies it as the referenced child module.

~/learn-node $ mkdir lib
~/learn-node $ echo "console.log('In util');" > lib/util.js

In index.js, add a console.log statement that identifies it as the parent module, and require the newly created lib/util.js as a child module.

~/learn-node $ echo "require('./lib/util'); console.log('In index', module);" > index.js

Execute index.js to see the dependencies between them

~/learn-node $ node index.js
In util
In index <ref *1> Module {
  id: '.',
  path: '/Users/samer/',
  exports: {},
  parent: null,
  filename: '/Users/samer/index.js',
  loaded: false,
  children: [
    Module {
      id: '/Users/samer/lib/util.js',
      path: '/Users/samer/lib',
      exports: {},
      parent: [Circular *1],
      filename: '/Users/samer/lib/util.js',
      loaded: true,
      children: [],
      paths: [Array]
    }
  ],
  paths: [...]
}

Here we focus on two properties related to dependency: children and parent.

In the printed result, the children field contains the required util.js module, which indicates that util.js is a child module that index.js depends on.

However, if we look closely at the parent attribute of the util.js module, we find the value [Circular *1]. This is because the module objects reference each other: the parent module lists the child in its children, and the child points back to the parent. Printing them in full would never terminate, so Node.js simply marks the repeated reference as circular.

Why do we need to understand the parent-child relationship? Because it is related to how Node.js handles circular dependencies, which is described in detail later.

Before looking at how circular dependencies are handled, we need to understand two key concepts: exports and module.exports.

2.5 exports, module.exports

  • exports:

exports is a special object that can be used directly in any module without declaration, as if it were a global variable. It is actually a reference to module.exports, so adding properties to exports modifies module.exports.

exports is also one of the fields in the module structure printed earlier. So far it has always been an empty object, because we never touched it in our files. Now let’s assign some values to it:

// Add a new line at the beginning of lib/util.js
exports.id = 'lib/util';
 
// Add a new line at the beginning of index.js
exports.id = 'index';

Execute index.js:

~/learn-node $ node index.js
In index Module {
  id: '.',
  exports: { id: 'index' },
  loaded: false,
  ... }
In util Module {
  id: '/Users/samer/learn-node/lib/util.js',
  exports: { id: 'lib/util' },
  parent:
   Module {
     id: '.',
     exports: { id: 'index' },
     loaded: false,
     ... },
  loaded: false,
  ... }

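You can see that the two id attributes were successfully added to the exports objects. We can add any other attribute in the same way, just like operating on an ordinary object. We can also replace the exported object entirely, for example with a function. Note that this has to be done through module.exports, because reassigning the exports variable does not change what is exported:

module.exports = function() {}

  • module.exports: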

The module.exports object is what require actually returns. In other words, when others require a module we wrote, they get the value of its module.exports. For example, continuing with lib/util from above:

const util = require('./lib/util');
 
console.log('UTIL:', util);
 
//Output results
 
UTIL: { id: 'lib/util' }

Since we assigned id: 'lib/util' to module.exports through the exports object, the result of require changes accordingly.

Now we have a general understanding of exports and module.exports. There is one more detail to note: module loading in Node.js is a synchronous process.

Let’s go back to the loaded attribute in the module structure. This attribute indicates whether the module has finished loading, and we can use it to check that module loading is synchronous.

Once a module has finished loading, its loaded value should be true. But so far, every time we printed a module, loaded was false. That is because loading is synchronous: our console.log runs while the module is still being loaded, before Node.js has marked it as loaded, so the printed value is the default loaded: false.

We can use setImmediate to verify this:

// In index.js
setImmediate(() => {
  console.log('The index.js module object is now loaded!', module)
});

The output is as follows:

The index.js module object is now loaded! Module {
  id: '.',
  exports: [Function],
  parent: null,
  filename: '/Users/samer/learn-node/index.js',
  loaded: true,
  children:
   [ Module {
       id: '/Users/samer/learn-node/lib/util.js',
       exports: [Object],
       parent: [Circular],
       filename: '/Users/samer/learn-node/lib/util.js',
       loaded: true,
       children: [],
       paths: [Object] } ],
  paths:
   [ '/Users/samer/learn-node/node_modules',
     '/Users/samer/node_modules',
     '/Users/node_modules',
     '/node_modules' ] }

Because the console.log call is deferred until after loading (and marking) has finished, the status is now loaded: true. This is consistent with Node.js module loading being a synchronous process.
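
The same observation can be made without setImmediate by recording the flag while the module body runs and reading it again after require has returned. This is a minimal sketch; the file name probe.js and the temp directory are assumptions made for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module that records module.loaded during its own execution
// and exposes a function to read the flag again later.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'loaded-flag-'));
const file = path.join(dir, 'probe.js');
fs.writeFileSync(
  file,
  'exports.duringLoad = module.loaded;\nexports.now = () => module.loaded;\n'
);

const probe = require(file); // returns only after the whole file has run

console.log(probe.duringLoad); // false: the module body was still executing
console.log(probe.now());      // true: require() has returned, loading is done
```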

Now that we understand exports, module.exports, and the synchronous nature of module loading, let’s see how Node.js handles circular dependencies between modules.

2.6 Circular module dependencies

In the above content, we learned that there is a parent-child dependency relationship between modules. If there is a circular dependency relationship between modules, what will node.js do? Suppose there are two modules, module1.js and module2.js, and they refer to each other, as follows:

// lib/module1.js
 
exports.a = 1;
 
require('./module2'); // module2 is required here
 
exports.b = 2;
exports.c = 3;
 
// lib/module2.js
 
const Module1 = require('./module1');
console.log('Module1 is partially loaded here', Module1); //  Reference module1 and print it

Try running module1.js, and you can see the output:

~/learn-node $ node lib/module1.js
Module1 is partially loaded here { a: 1 }

Only { a: 1 } is printed; b: 2 and c: 3 are missing. Looking at module1.js carefully, we see that module2.js is required in the middle of module1.js, before exports.b = 2 and exports.c = 3 are executed. If we call this the point where the circular dependency occurs, then what module2.js gets is only the attributes exported before that point. This follows from the fact, verified above, that module loading in Node.js is synchronous.

Node.js handles circular dependencies in a simple way. While a module is loading, its exports object is built up step by step. If we require the module before it has finished loading, we get only the properties assigned to exports so far.
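
The partial exports object can also be captured programmatically. The sketch below recreates the two files in a temporary directory; the seenByModule2 and snapshot properties are additions made for this example so that the result can be inspected after loading.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'circular-'));

// module1 requires module2 halfway through its own execution.
fs.writeFileSync(
  path.join(dir, 'module1.js'),
  [
    'exports.a = 1;',
    "exports.seenByModule2 = require('./module2').snapshot;",
    'exports.b = 2;',
    'exports.c = 3;',
  ].join('\n')
);

// module2 requires module1 back and records which properties exist right now.
fs.writeFileSync(
  path.join(dir, 'module2.js'),
  "const module1 = require('./module1');\nexports.snapshot = Object.keys(module1);\n"
);

const module1 = require(path.join(dir, 'module1.js'));

console.log(module1.seenByModule2); // [ 'a' ]
console.log(Object.keys(module1));  // [ 'a', 'seenByModule2', 'b', 'c' ]
```

module2 saw only the property assigned before the circular require, while the finished module carries all of them.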

2.7 .json and .node

In Node.js, require can load not only JavaScript files but also JSON files and C++ addons (.json and .node files). We don’t even need to spell out the file extension.

You can also see the file types supported by require on the command line:

~ % node
> require.extensions
[Object: null prototype] {
  '.js': [Function (anonymous)],
  '.json': [Function (anonymous)],
  '.node': [Function (anonymous)]
}

When we require a module without an extension, Node.js first looks for a .js file. If none is found, it tries a .json file, and finally a .node file. To keep the intent clear, though, it is common to spell out the extension when requiring .json or .node files and to omit it only for .js files (or simply to always include the extension).
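
The lookup order is easy to check with two files that share a base name. This is a minimal sketch using a temporary directory; the file names config.js and config.json are assumptions made for the example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'extensions-'));
fs.writeFileSync(path.join(dir, 'config.js'), "module.exports = { from: 'js' };");
fs.writeFileSync(path.join(dir, 'config.json'), JSON.stringify({ from: 'json' }));

// Without an extension, the .js file is tried first.
const implicit = require(path.join(dir, 'config'));
// With an explicit extension there is no ambiguity.
const explicit = require(path.join(dir, 'config.json'));

console.log(implicit.from, explicit.from); // js json
```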

  • .json file:

Requiring .json files is very common, for example for static configuration. Storing configuration in a .json file makes it easier to manage, for example:

{
  "host": "localhost",
  "port": 8080
}

It’s easy to refer to it or use it:

const { host, port } = require('./config');
console.log(`Server will run at http://${host}:${port}`)

The output is as follows:

Server will run at http://localhost:8080

  • .node file:

A .node file is compiled from C++ source. The official documentation provides a simple C++ hello addon, which exposes a hello() method that returns the string world. If necessary, you can consult the documentation to learn more and experiment.

We can compile a .cc file into a .node file with node-gyp. The process is simple: we only need to configure a binding.gyp file. We won’t go into detail here. It is enough to know that once the .node file is generated, we can require it and use its methods as usual.

For example, after compiling the hello() addon into an addon.node file, require and use it:

const addon = require('./addon');
console.log(addon.hello());

2.8 Wrapping

In fact, the content above covered the first two steps of requiring a module in Node.js, resolving and loading, which deal with finding the module path and loading its content respectively. Next, let’s see what wrapping does.

Wrapping means exactly that: all the code we write in a module gets wrapped. In other words, whenever we require a module, its code passes through a layer of “transparent” wrapping.

To understand the wrapping process, we first need to understand the difference between exports and module.exports.

exports is a reference to module.exports. We can use exports to add exported properties to a module, but we cannot replace it directly. For example:

exports.id = 42; // OK: exports still points to module.exports, so this modifies module.exports
exports = { id: 42 }; // Has no effect: exports now points to a new { id: 42 } object, and module.exports is unchanged
module.exports = { id: 42 }; // OK: replaces module.exports directly

You may wonder how exports can look like a global object, available in every module, and yet keep track of which module each exported value belongs to.

Before we get to know the wrapping process, let’s take a look at a small example:

// In a.js
var value = 'global'
 
// In b.js
console.log(value) // output: global
 
// In c.js
console.log(value) // output: global
 
// In index.html
...
<script src="a.js"></script>
<script src="b.js"></script>
<script src="c.js"></script>

When we define value in the a.js script, it is globally visible and can be accessed by b.js and c.js. This is not the case for Node.js modules. Variables defined in one module have a private scope and cannot be accessed directly from other modules. How is this private scope created?

The answer is simple: before compiling a module, Node.js wraps its content in a function, and the function scope provides the private scope.

The wrapper can be printed through the wrapper property of require('module'):

~ $ node
> require('module').wrapper
[ '(function (exports, require, module, __filename, __dirname) { ',
  '\n});' ]
>

Node.js never executes the code in a file directly. It executes it through the wrapper function, which gives each module a private scope so that modules do not affect each other.
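
The effect of the wrapper on scope can be imitated by hand. The sketch below builds the same kind of wrapper string and runs two pieces of source through it; runWrapped is a hypothetical helper, and the file name and directory passed to the wrapper are placeholder values.

```javascript
const vm = require('vm');

// Hypothetical helper: wrap `source` the way a module file is wrapped, run it,
// and return its exports.
function runWrapped(source) {
  const module = { exports: {} };
  const wrapper = vm.runInThisContext(
    `(function (exports, require, module, __filename, __dirname) {${source}\n})`
  );
  wrapper(module.exports, require, module, 'example.js', '.');
  return module.exports;
}

// The first "file" declares a variable with var, as a.js does in the browser example.
const a = runWrapped("var value = 'from a'; exports.value = value;");
// The second "file" checks whether that variable leaked into its scope.
const b = runWrapped('exports.sawValue = typeof value;');

console.log(a.value, b.sawValue); // from a undefined
```

Because each piece of source runs inside its own function, the var declaration stays local and the second piece never sees it.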

This wrapper function has five parameters: exports, require, module, __filename, and __dirname. We can access and print these parameters directly through the arguments object:

~/learn-node $ echo "console.log(arguments)" > index.js
 
~/learn-node $ node index.js
{ '0': {},
  '1':
   { [Function: require]
     resolve: [Function: resolve],
     main:
      Module {
        id: '.',
        exports: {},
        parent: null,
        filename: '/Users/samer/index.js',
        loaded: false,
        children: [],
        paths: [Object] },
     extensions: { ... },
     cache: { '/Users/samer/index.js': [Object] } },
  '2':
   Module {
     id: '.',
     exports: {},
     parent: null,
     filename: '/Users/samer/index.js',
     loaded: false,
     children: [],
     paths: [ ... ] },
  '3': '/Users/samer/index.js',
  '4': '/Users/samer' }

Let’s take a brief look at these parameters. The first parameter, exports, is initially an empty object (nothing has been assigned yet). The second and third parameters, require and module, are instances tied to the file being executed; they are not global. The fourth and fifth parameters, __filename and __dirname, are the file path and the directory path respectively.

What the whole packaged function does is approximately equal to:

function (require, module, __filename, __dirname) {
  let exports = module.exports;
   
  // Your Code...
   
  return module.exports;
}

In short, wrapping gives our module a private scope and exposes its variables or methods through module.exports, which serves as the return value.

2.9 Cache

Caching is easy to understand. Let’s take a look at a case

echo 'console.log(`log something.`)' > index.js
// In node repl
> require('./index.js')
log something.
{}
> require('./index.js')
{}
>

As you can see, the same module is referenced twice, and the information is printed only once. This is because the cache is used for the second reference, and there is no need to reload the module.

Print require.cache to see the current cache information

> require.cache
[Object: null prototype] {
  '/Users/samer/index.js': Module {
    id: '/Users/samer/index.js',
    path: '/Users/samer/',
    exports: {},
    parent: Module {
      id: '<repl>',
      path: '.',
      exports: {},
      parent: undefined,
      filename: null,
      loaded: false,
      children: [Array],
      paths: [Array]
    },
    filename: '/Users/samer/index.js',
    loaded: true,
    children: [],
    paths: [
      '/Users/samer/learn-node/repl/node_modules',
      '/Users/samer/learn-node/node_modules',
      '/Users/samer/node_modules',
      '/Users/node_modules',
      '/node_modules',
      '/Users/samer/.node_modules',
      '/Users/samer/.node_libraries',
      '/usr/local/Cellar/node/7.7.1/lib/node'
    ]
  }
}

You can see that the index.js file just referenced is in the cache, so the module is not reloaded. We can also delete an entry from require.cache to force the module to be reloaded on the next require.
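
Clearing a single cache entry might look like the following. This is a minimal sketch using a throwaway file; fresh.js is an assumed name, and require.resolve is used to obtain the exact key under which the module is cached.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-'));
const file = path.join(dir, 'fresh.js');
fs.writeFileSync(file, 'module.exports = {};'); // a new object on every load

const first = require(file);
const cached = require(file); // served from require.cache

// Remove the entry; the key is the resolved absolute path of the module.
delete require.cache[require.resolve(file)];
const reloaded = require(file); // the file is executed again

console.log(first === cached, first === reloaded); // true false
```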

3. Summary

This article outlines some basic principles and common knowledge of Node.js modularization, in the hope of giving you a clearer understanding of it. More in-depth details are not covered here, such as the internal processing logic of the wrapper function, the synchronous loading of CommonJS, the differences from ES modules, and so on. You can explore these further on your own.

Author: vivo Wei Xing