Import mechanism of Python entry Foundation

Time:2021-10-14

1、 Foreword

This article is based on open source projects:

github.com/pwwang/pyth…

Supplement and expand the explanation, hoping to make readers understand one articlePythonofimport mechanism

1.1 what is the import mechanism?

Generally speaking, to execute a piece of Python code that references the code in another module, you need to use Python’s import mechanism. The import statement is the most commonly used means to trigger the import mechanism, but it is not the only means.

importlib.import_ Module and__ import__ Function can also be used to introduce code from other modules.

1.2 how is import executed?

The import statement performs two steps:

  • Search for modules to be introduced
  • Bind the name of the module as a variable to a local variable

The search step is actually through__ import__ The function is completed, and its return value is bound to the local variable as a variable. We’ll talk about it in detail below__ import__ If the function works.

2、 Overview of import mechanism

The following figure is an overview of the import mechanism. It is not difficult to see that when the import mechanism is triggered, python will first go to sys.modules to find out whether the module has been introduced. If the module has been introduced, call it directly, otherwise go to the next step. Here, sys. Modules can be regarded as a cache container. It is worth noting that if the corresponding value in sys.modules is none, a modulenotfounderror exception will be thrown. Here is a simple experiment:

In [1]: import sys

In [2]: sys.modules['os'] = None

In [3]: import os
---------------------------------------------------------------------------
ModuleNotFoundError   Traceback (most recent call last)
<ipython-input-3-543d7f3a58ae> in <module>
----> 1 import os

ModuleNotFoundError: import of os halted; None in sys.modules

If the corresponding module is found in sys.modules and the import is triggered by the import statement, the next step is to bind the corresponding variable to the local variable.

If no cache is found, the system will perform a new import process. In this process, python will traverse sys.meta_ Path to find whether there is a meta path finder that meets the conditions. sys.meta_ Path is a list of meta path finders. It has three default finders:

  • Built in module finder
  • Frozen module finder
  • Path based module finder.
In [1]: import sys

In [2]: sys.meta_path
Out[2]: 
[_frozen_importlib.BuiltinImporter,
 _frozen_importlib.FrozenImporter,
 _frozen_importlib_external.PathFinder]

Find of finder_ The spec method determines whether the finder can handle the module to be imported and returns a modeulespec object, which contains the relevant information used to load the module. If no suitable modulespec object is returned, the system will view sys.meta_ The next Metapath finder for path. If you traverse sys.meta_ If the path does not find a suitable meta path finder, the modulenotfounderror will be thrown. This happens when you introduce a module that does not exist, because sys.meta_ None of the finders in path can handle this situation:

In [1]: import nosuchmodule
---------------------------------------------------------------------------
ModuleNotFoundError      Traceback (most recent call last)
<ipython-input-1-40c387f4d718> in <module>
----> 1 import nosuchmodule

ModuleNotFoundError: No module named 'nosuchmodule'

However, if you manually add a finder that can handle this module, it can also be introduced:

In [1]: import sys
 ...: 
 ...: from importlib.abc import MetaPathFinder
 ...: from importlib.machinery import ModuleSpec
 ...: 
 ...: class NoSuchModuleFinder(MetaPathFinder):
 ...:  def find_spec(self, fullname, path, target=None):
 ...:   return ModuleSpec('nosuchmodule', None)
 ...: 
 ...: # don't do this in your script
 ...: sys.meta_path = [NoSuchModuleFinder()]
 ...: 
 ...: import nosuchmodule
---------------------------------------------------------------------------
ImportError        Traceback (most recent call last)
<ipython-input-6-b7cbf7e60adc> in <module>
  11 sys.meta_path = [NoSuchModuleFinder()]
  12 
---> 13 import nosuchmodule

ImportError: missing loader

As you can see, when we tell the system how to find_ The modulenotfound exception will not be thrown during spec. However, to successfully load a module, you also need a loader.

Loader is a property of modulespec object, which determines how to load and execute a module. If the modulespec object is “master leads in”, then the loader is “cultivating in the individual”. In the loader, you can decide how to load and execute a module. The decision here is not only to load and execute the module itself, but also to modify a module:

In [1]: import sys
 ...: from types import ModuleType
 ...: from importlib.machinery import ModuleSpec
 ...: from importlib.abc import MetaPathFinder, Loader
 ...: 
 ...: class Module(ModuleType):
 ...:  def __init__(self, name):
 ...:   self.x = 1
 ...:   self.name = name
 ...: 
 ...: class ExampleLoader(Loader):
 ...:  def create_module(self, spec):
 ...:   return Module(spec.name)
 ...: 
 ...:  def exec_module(self, module):
 ...:   module.y = 2
 ...: 
 ...: class ExampleFinder(MetaPathFinder):
 ...:  def find_spec(self, fullname, path, target=None):
 ...:   return ModuleSpec('module', ExampleLoader())
 ...: 
 ...: sys.meta_path = [ExampleFinder()]

In [2]: import module

In [3]: module
Out[3]: <module 'module' (<__main__.ExampleLoader object at 0x7f7f0d07f890>)>

In [4]: module.x
Out[4]: 1

In [5]: module.y
Out[5]: 2

As can be seen from the above example, a loader usually has two important methods create_ Module and Exec_ Module needs to be implemented. If exec is implemented_ Module method, then create_ Module is required. If the import mechanism is initiated by the import statement, create_ The variable corresponding to the module object returned by the module method will be bound to the current local variable. If a module is successfully loaded as a result, it will be cached in sys. Modules. If the module is loaded again, the cache of sys. Modules will be directly referenced.

Import mechanism of Python entry Foundation

3、 Import hooks

In order to simplify, we did not mention the hook of the import mechanism in the above flow chart. In fact, you can add a tick to change sys. Meta_ Path or sys. Path to change the behavior of the import mechanism. In the above example, we directly modified sys.meta_ path。 In fact, you can also use the hook to achieve:

In [1]: import sys
 ...: from types import ModuleType
 ...: from importlib.machinery import ModuleSpec
 ...: from importlib.abc import MetaPathFinder, Loader
 ...: 
 ...: class Module(ModuleType):
 ...:  def __init__(self, name):
 ...:   self.x = 1
 ...:   self.name = name
 ...: 
 ...: class ExampleLoader(Loader):
 ...:  def create_module(self, spec):
 ...:   return Module(spec.name)
 ...: 
 ...:  def exec_module(self, module):
 ...:   module.y = 2
 ...: 
 ...: class ExampleFinder(MetaPathFinder):
 ...:  def find_spec(self, fullname, path, target=None):
 ...:   return ModuleSpec('module', ExampleLoader())
 ...: 
 ...: def example_hook(path):
 ...:  # some conditions here
 ...:  return ExampleFinder()
 ...: 
 ...: sys.path_hooks = [example_hook]
 ...: # force to use the hook
 ...: sys.path_importer_cache.clear()
 ...: 
 ...: import module
 ...: module
Out[1]: <module 'module' (<__main__.ExampleLoader object at 0x7fdb08f74b90>)>

4、 Meta path finder

The job of the Metapath finder is to see if the module can be found. These finders are stored in sys.meta_ Path for Python to traverse (of course, they can also be returned through the import hook, see the above example). Each finder must implement find_ Spec method. If a finder knows how to handle the incoming module, find_ Spec will return a modulespec object (see the next section), otherwise it will return none.
As mentioned earlier, sys.meta_ Path contains three kinds of finders:

  • Built in module finder
  • Freeze module finder
  • Path based finder

Here, we’d like to focus on the path based finder. It is used to search a series of import paths, and each path is used to find whether a corresponding module can be loaded. The default path finder implements the function of finding modules in all special files of the file system, including Python source files (. Py files) , python compiled code file (. PyC file) and shared library file (. So file). If the python standard library contains zipimport, the relevant files can also be used to find the modules that can be imported.

The path finder is not limited to files in the file system. It can also query the URL database, or any other address that can be represented by a string.

You can use the tick provided in the previous section to find the module of the same type of address. For example, if you want to import a module through a URL, you can write an import hook to parse the URL and return a path finder.

Note that the path finder is different from the meta path finder. The latter is in sys.meta_ Path is used to be traversed by python, while the former specifically refers to a path based finder.

5、 Modulespec object

Each Metapath finder must implement find_ Spec method. If the finder knows how to process the module to be introduced, this method will return a modulespec object. This object has two properties worth mentioning, one is the name of the module, and the other is the finder. If the finder of a modulespec object is none, an exception similar to importerror: missing loader will be thrown. The finder will be used to create and execute a module (see the next section).

You can use < module >__ spec__ To find the modulespec object of the module:

In [1]: import sys

In [2]: sys.__spec__
Out[2]: ModuleSpec(name='sys', loader=<class '_frozen_importlib.BuiltinImporter'>)

6、 Loader

Loader through create_ Module to create modules and exec_ Module to execute the module. Generally, if a module is a python module (non built-in module or dynamic extension), the code of the module needs to be in the module__ dict__ Spatial execution. If the code of the module cannot be executed, an importerror exception will be thrown, or other exceptions during execution will also be thrown.

In most cases, the finder and loader are the same thing. In this case, the finder’s find_ The loader property of the modulespec object returned by the spec method will point to itself.

We can use create_ Module to dynamically create a module. If it returns none, python will automatically create a module.

7、 Summary

Python’s import mechanism is flexible and powerful. Most of the above descriptions are based on official documents and the newer Python version 3.6 +. Due to space, there are many details not included, such as sub module loading, module code caching mechanism and so on. Mistakes are inevitable in the article. If you have any questions, you are welcome to github.com/pwwang/pyth… Issue for questions and discussion.

This is the end of this article on the import mechanism, which is the basis of getting started with Python. For more information about Python import mechanism, please search the previous articles of developeppaper or continue to browse the relevant articles below. I hope you will support developeppaper in the future!