Tomcat Architecture Analysis: From Principles to Architecture Design

Time:2021-11-14

1. Learning purpose

1.1. Master Tomcat architecture design and principle to improve internal skills

Macroscopically

As an HTTP server plus Servlet container, Tomcat shields us from application layer protocol and network communication details and hands us standard Request and Response objects. The specific business logic is the point of change, and it is left to us to implement. When we use a framework such as Spring MVC, we never have to think about TCP connections or about parsing HTTP data and producing responses, because Tomcat has already done this for us. We only need to focus on the business logic of each request.

From a microscopic point of view

Tomcat also separates change points from invariant points internally and uses a component-based design. The goal is a highly customizable "Russian doll" structure (the composite pattern). The common parts of each component's life cycle management are extracted into interfaces and abstract classes, and concrete subclasses implement the change points. This is the template method design pattern.

Today's popular microservices follow the same idea: a monolithic application is split into "microservices" by function. During the split, commonalities are extracted, and these become core basic services or general-purpose libraries. The "middle platform" idea works the same way.

Design patterns are often a sharp tool for encapsulating change. Used sensibly, they make our code and system design elegant and tidy.

This is the "internal skill" you gain from studying excellent open source software, and it never goes out of date. The design ideas and philosophy are what really matter. By learning from this design experience, using design patterns sensibly to encapsulate what changes and what does not, and reading the source code, we can improve our own system design ability.

1.2. Macroscopically understand how a request relates to spring

At work, we are already familiar with Java syntax. We have even "memorized" some design patterns and used many web frameworks, but we rarely get the chance to apply the patterns in real projects. We seem unable to design a system independently and instead just implement requirements one at a time. I did not have a panorama of Java web development in my mind. For example, I did not know how a browser's request is connected to the code in Spring.

To break through this bottleneck, why not stand on the shoulders of giants and study excellent open source systems to see how the experts think about these problems?

While learning how Tomcat works, I found that Servlet technology is the origin of Java web development. Almost all Java web frameworks (such as Spring) are built on Servlets. A Spring application itself is a Servlet (DispatcherServlet), and web containers such as Tomcat and Jetty are responsible for loading and running Servlets. As shown in the figure:

1.3. Improve your system design ability

While learning Tomcat, I also found that it uses many advanced Java technologies, such as multithreaded concurrent programming, socket network programming and reflection. Previously I only knew these techniques by name and had memorized some questions for interviews, but I always felt there was a gap between "knowing" and "being able to use". By studying the Tomcat source code, I learned in which scenarios these technologies are used.

It also teaches system design skills, such as interface-oriented programming, the component composition pattern, skeleton abstract classes, one-click start and stop, object pooling and various design patterns, including the template method, observer and chain of responsibility patterns. I then began to imitate them and apply these design ideas in my own work.

2. Overall architecture design

Today, we will analyze the design idea of Tomcat step by step. On the one hand, we can learn the overall architecture of tomcat, how to design a complex system, how to design top-level modules, and the relationship between modules; On the other hand, it also lays a foundation for us to deeply study the working principle of Tomcat.

Tomcat startup process:

startup.sh -> catalina.sh start -> java ... org.apache.catalina.startup.Bootstrap start (that is, Bootstrap.main())

Tomcat implements two core functions:

  • Handle Socket connections and convert between network byte streams and Request and Response objects.
  • Load and manage Servlets and process specific Request requests.

So Tomcat designed two core components, the connector and the container. The connector is responsible for external communication, and the container is responsible for internal processing.

To support multiple I/O models and application layer protocols, one Tomcat container may be connected to multiple connectors, just like a room with multiple doors.

  • Server corresponds to a Tomcat instance.
  • There is only one service by default, that is, a Tomcat instance defaults to one service.
  • Connector: a service may have multiple connectors and accept different connection protocols.
  • Container: multiple connectors correspond to a container. The top-level container is actually an engine.

Each component has its own life cycle and needs to be started, and it must also start its internal sub-components. For example, a Tomcat instance contains a Service, and a Service contains multiple connectors and a container. A container contains multiple Hosts, a Host may contain multiple Context containers, and a Context contains multiple Servlets. Tomcat therefore uses the composite pattern to manage the components and treats each one uniformly as a single component. On the whole, the design of the components is like a "Russian doll".

2.1 Connector

Before talking about connectors, let me first introduce the I/O models and application layer protocols that Tomcat supports.

The I/O models supported by Tomcat are:

  • NIO: non-blocking I/O, implemented with the Java NIO class library.
  • NIO2: asynchronous I/O, implemented with the NIO2 class library introduced in JDK 7.
  • APR: implemented with the Apache Portable Runtime, a native library written in C/C++.

The application layer protocols supported by Tomcat are:

  • HTTP/1.1: This is the access protocol adopted by most web applications.
  • AJP: used for integration with web servers (such as Apache HTTP Server).
  • HTTP/2: HTTP/2 greatly improves web performance.

Therefore, one container may be connected to multiple connectors. The connector shields the Servlet container from differences in network protocol and I/O model: whether the request arrives over HTTP or AJP, what the container receives is a standard ServletRequest object.

The detailed functional requirements of the connector are:

  • Listen to the network port.
  • Accept network connection requests.
  • Read request network byte stream.
  • Parse the byte stream according to the specific application layer protocol (HTTP/AJP) and generate a unified Tomcat Request object.
  • Convert the Tomcat Request object into a standard ServletRequest.
  • Call the Servlet container to get a ServletResponse.
  • Convert the ServletResponse into a Tomcat Response object.
  • Convert the Tomcat Response into a network byte stream and write the response byte stream back to the browser.

After the requirements are listed clearly, the next question we need to consider is, which sub modules should the connector have? Excellent modular design should consider high cohesion and low coupling.

  • High cohesion means that functions with high correlation should be concentrated as much as possible rather than scattered.
  • Low coupling means that the dependent parts and degree of two related modules should be reduced as much as possible, and the two modules should not have strong dependence.

We found that the connector needs to complete three highly cohesive functions:

  • Network communication.
  • Application layer protocol parsing.
  • Conversion between Tomcat Request/Response and ServletRequest/ServletResponse.

Therefore, the designers of Tomcat created three components to implement these three functions: Endpoint, Processor and Adapter.

The I/O model of network communication varies, and so does the application layer protocol, but the overall processing logic stays the same: the Endpoint provides the byte stream to the Processor, the Processor provides the Tomcat Request object to the Adapter, and the Adapter provides the ServletRequest object to the container.
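The three-stage hand-off described above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: `SketchEndpoint`, `SketchProcessor`, `SketchAdapter`, `RawRequest` and `SimpleServletRequest` are hypothetical names invented here, not Tomcat classes, and the parser only looks at the HTTP request line.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for Tomcat's internal request object.
final class RawRequest {
    final String method;
    final String uri;
    RawRequest(String method, String uri) { this.method = method; this.uri = uri; }
}

// Hypothetical stand-in for the standard ServletRequest seen by the container.
final class SimpleServletRequest {
    private final RawRequest raw;
    SimpleServletRequest(RawRequest raw) { this.raw = raw; }
    String getMethod() { return raw.method; }
    String getRequestURI() { return raw.uri; }
}

interface SketchEndpoint { byte[] read(); }                               // network communication
interface SketchProcessor { RawRequest parse(byte[] bytes); }             // protocol parsing
interface SketchAdapter { SimpleServletRequest adapt(RawRequest raw); }   // object conversion

final class ConnectorSketch {
    // The invariant flow: bytes -> internal request -> servlet-style request.
    static SimpleServletRequest handle(SketchEndpoint e, SketchProcessor p, SketchAdapter a) {
        return a.adapt(p.parse(e.read()));
    }

    // Parses only the request line "METHOD URI VERSION"; headers and body are ignored.
    static RawRequest parseRequestLine(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.US_ASCII);
        String requestLine = text.split("\r\n", 2)[0];
        String[] parts = requestLine.split(" ");
        return new RawRequest(parts[0], parts[1]);
    }

    public static void main(String[] args) {
        byte[] wire = "GET /order/buy HTTP/1.1\r\nHost: user.shopping.com\r\n\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        SimpleServletRequest req =
                handle(() -> wire, ConnectorSketch::parseRequestLine, SimpleServletRequest::new);
        System.out.println(req.getMethod() + " " + req.getRequestURI());
    }
}
```

Because each stage sits behind its own interface, a different I/O model or protocol parser could be swapped in without touching `handle`, which is the point of separating the three responsibilities.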

2.2 Encapsulating what changes and what stays the same

Tomcat therefore designed a series of abstract base classes to encapsulate these stable parts. The abstract base class AbstractProtocol implements the ProtocolHandler interface. Each application layer protocol has its own abstract base class, such as AbstractAjpProtocol and AbstractHttp11Protocol, and the implementation class for a specific protocol extends the abstract base class of its protocol layer.

This is an application of the template method design pattern.

To sum up, the connector's three core components, Endpoint, Processor and Adapter, each do one of these three things. Endpoint and Processor are grouped together and abstracted into the ProtocolHandler component. Their relationship is shown in the figure below.

ProtocolHandler component:

It mainly handles network connections and the application layer protocol and includes two important components, Endpoint and Processor. Together these two components form the ProtocolHandler. Their working principles are described in detail below.

EndPoint:

EndPoint is the communication endpoint, that is, the interface that listens for communication. It is the concrete socket receiving and sending handler and an abstraction of the transport layer. EndPoint is used to implement the TCP/IP protocol; reading and writing data ultimately calls the socket interface of the operating system.

EndPoint is an interface, and the corresponding abstract implementation class is AbstractEndpoint. The concrete subclasses of AbstractEndpoint, such as NioEndpoint and Nio2Endpoint, have two important subcomponents: Acceptor and SocketProcessor.

The Acceptor listens for socket connection requests. SocketProcessor handles the socket requests received by the Acceptor. It implements the Runnable interface and, in its run method, calls the application layer protocol component Processor. To improve throughput, SocketProcessor tasks are submitted to a thread pool for execution.

We know that the use of Java multiplexers is nothing more than two steps:

  • Create a Selector, register the events of interest on it, then call the select method and wait for something of interest to happen.
  • When an event of interest occurs, for example a channel becomes readable, a new thread is created to read data from the channel.
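The two steps above can be sketched with the standard `java.nio` API. This is a minimal, single-threaded illustration rather than Tomcat's code: the class name `SelectorSketch` and its methods are made up for this example, and the read is done inline instead of being handed to another thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class SelectorSketch {
    // Accepts one connection, reads one small message from it and returns the text.
    static String serveOnce(ServerSocketChannel server) throws IOException {
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            // Step 1: register the event we are interested in, then wait in select().
            server.register(selector, SelectionKey.OP_ACCEPT);
            while (true) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client == null) continue;
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        // Step 2: the channel is readable, so read from it.
                        // A real server would hand this work to another thread.
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(256);
                        int n = client.read(buf);
                        client.close();
                        if (n < 0) return "";
                        buf.flip();
                        return StandardCharsets.UTF_8.decode(buf).toString();
                    }
                }
            }
        }
    }

    // Starts a server on a free loopback port, sends it one message and returns what it read.
    static String demo(String message) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                return serveOnce(server);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("ping"));
    }
}
```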

In Tomcat, NioEndpoint is a concrete implementation of AbstractEndpoint. Although it has many components, the processing logic is still the two steps above. It contains five components: LimitLatch, Acceptor, Poller, SocketProcessor and Executor. They each do their own job and cooperate to handle the whole TCP/IP protocol.

LimitLatch is the connection controller, responsible for limiting the maximum number of connections. In NIO mode the default is 10000. Once this threshold is reached, no further connections are accepted for processing until an existing connection is released.
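The idea of capping concurrent connections can be illustrated with a counting semaphore. Note that this swaps in `java.util.concurrent.Semaphore` as a stand-in; it is not how LimitLatch itself is written, and `ConnectionLimiterSketch` and its method names are hypothetical.

```java
import java.util.concurrent.Semaphore;

final class ConnectionLimiterSketch {
    private final Semaphore slots;

    ConnectionLimiterSketch(int maxConnections) {
        this.slots = new Semaphore(maxConnections);
    }

    // Blocks the calling (acceptor) thread until a connection slot is free.
    void awaitSlot() throws InterruptedException {
        slots.acquire();
    }

    // Non-blocking variant, used in the demo so that it never hangs.
    boolean tryTakeSlot() {
        return slots.tryAcquire();
    }

    // Called when a connection is closed, freeing its slot.
    void releaseSlot() {
        slots.release();
    }

    public static void main(String[] args) {
        ConnectionLimiterSketch limiter = new ConnectionLimiterSketch(2);
        System.out.println(limiter.tryTakeSlot()); // true
        System.out.println(limiter.tryTakeSlot()); // true
        System.out.println(limiter.tryTakeSlot()); // false: limit reached
        limiter.releaseSlot();
        System.out.println(limiter.tryTakeSlot()); // true again after a release
    }
}
```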

Acceptor runs in its own thread and calls the accept method in an endless loop to receive new connections. When a new connection request arrives, accept returns a Channel object, which is then handed to the Poller.

Poller is essentially a Selector and also runs in its own thread. It maintains an internal array of Channels and, in an endless loop, keeps checking whether their data is ready. Once a Channel is readable, it creates a SocketProcessor task object and hands it to the Executor.

SocketProcessor implements the Runnable interface. In it, a call such as getHandler().process(socketWrapper, SocketEvent.CONNECT_FAIL); obtains the handler and hands it the socket wrapper, and the handler in turn finds the appropriate application layer protocol processor for that socket, that is, it calls the Http11Processor component to process the request. The Http11Processor reads data from the channel to build a Tomcat Request object, but it does not read the channel directly. This is because Tomcat supports both the synchronous non-blocking I/O model and the asynchronous I/O model, and the corresponding channel classes in the Java API differ, for example AsynchronousSocketChannel and SocketChannel. To hide these differences from Http11Processor, Tomcat provides a wrapper class called SocketWrapper, and Http11Processor only calls SocketWrapper methods to read and write data.

Executor is the thread pool that runs SocketProcessor tasks. The run method of SocketProcessor calls Http11Processor to read and parse the request data. As we know, Http11Processor encapsulates the application layer protocol; it calls the container to obtain the response and then writes the response through the Channel.

The workflow is as follows:

Processor:

The Processor implements the application layer protocol, such as HTTP. It receives the socket from the Endpoint, reads the byte stream, parses it into Tomcat Request and Response objects, and passes them to the container through the Adapter. The Processor is an abstraction of the application layer protocol.

As the figure shows, after receiving a socket connection, the Endpoint creates a SocketProcessor task and submits it to the thread pool. The run method of SocketProcessor calls the Http11Processor component to parse the application layer protocol. After the Processor has built the Request object by parsing, it calls the service method of the Adapter, which passes the request to the container through the following code.


// Calling the container
connector.getService().getContainer().getPipeline().getFirst().invoke(request, response);

Adapter component:

Because the protocols differ, Tomcat defines its own Request class to store request information, which reflects object-oriented thinking. But this Request is not a standard ServletRequest, so the Tomcat Request cannot be passed directly to the container as a parameter.

The solution chosen by the Tomcat designers is to introduce CoyoteAdapter, a classic application of the adapter pattern. The connector calls the service method of CoyoteAdapter and passes in a Tomcat Request object. CoyoteAdapter is responsible for converting the Tomcat Request into a ServletRequest and then calling the container's service method.

2.3 Container

The connector is responsible for external communication and the container is responsible for internal processing. Specifically, the connector handles socket communication and application layer protocol parsing to obtain a ServletRequest, and the container is responsible for handling the ServletRequest.

Container: as the name suggests, a container holds things, so the Tomcat container is what holds Servlets.

Tomcat has designed four kinds of containers: Engine, Host, Context and Wrapper. Server represents a Tomcat instance.

Note that these four containers are not parallel, but parent-child, as shown in the following figure:

You may ask why so many levels of containers are needed, since they add complexity. The reasoning is that a layered architecture makes the Servlet container very flexible. One Host can have multiple Contexts, one Context can contain multiple Servlets, and every component needs unified life cycle management, so these containers are designed with the composite pattern.

Wrapper represents a Servlet. Context represents a web application, and a web application may have multiple Servlets. Host represents a virtual host, or a site. One Tomcat can be configured with multiple sites (Hosts), and one site (Host) can deploy multiple web applications. Engine represents the engine, which manages multiple sites (Hosts). A Service can have only one Engine.

You can deepen the understanding of its hierarchical relationship through Tomcat configuration file.

<!-- Top-level component representing a Tomcat instance; it can contain multiple Services -->
<Server port="8005" shutdown="SHUTDOWN">

  <!-- A Service contains one Engine and multiple Connectors -->
  <Service name="Catalina">
    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="20000"
               redirectPort="8443" />

    <!-- Define an AJP 1.3 Connector on port 8009 -->
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

    <!-- Container component: an Engine handles all requests of a Service and contains multiple Hosts -->
    <Engine name="Catalina" defaultHost="localhost">
      <!-- Container component: handles client requests for the specified Host and can contain multiple Contexts -->
      <Host name="localhost" appBase="webapps"
            unpackWARs="true" autoDeploy="true">
        <!-- Container component: handles all client requests for a specific web application (Context) -->
        <Context></Context>
      </Host>
    </Engine>
  </Service>
</Server>

How are these containers managed? There is a parent-child relationship between the containers, which forms a tree structure. This should remind you of the composite pattern.

Tomcat manages these containers with the composite pattern. Concretely, all container components implement the Container interface, so the composite pattern lets callers treat single container objects and composite container objects in the same way. Here, a single container object is the lowest-level Wrapper, and a composite container object is a Context, Host or Engine above it. The Container interface is defined as follows:


public interface Container extends Lifecycle {
    public void setName(String name);
    public Container getParent();
    public void setParent(Container container);
    public void addChild(Container child);
    public void removeChild(Container child);
    public Container findChild(String name);
}

We can see methods such as getParent, setParent, addChild and removeChild, which confirms the composite pattern described above. We can also see that the Container interface extends Lifecycle; Tomcat manages the life cycle of all container components through Lifecycle. All containers are managed through the composite pattern, and extending Lifecycle gives each component life cycle management. The main Lifecycle methods are init(), start(), stop() and destroy().
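To make the parent-child tree concrete, here is a small self-contained sketch of the composite idea. `SketchContainer` is a hypothetical class written for this article, not Tomcat's Container, and the `size()` and `path()` helpers are added only to show that a leaf and a subtree can be treated the same way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical composite node: the same class models Engine, Host, Context and Wrapper.
final class SketchContainer {
    private final String name;
    private SketchContainer parent;
    private final Map<String, SketchContainer> children = new LinkedHashMap<>();

    SketchContainer(String name) { this.name = name; }

    String getName() { return name; }
    SketchContainer getParent() { return parent; }

    void addChild(SketchContainer child) {
        child.parent = this;
        children.put(child.getName(), child);
    }

    void removeChild(SketchContainer child) {
        if (children.remove(child.getName(), child)) {
            child.parent = null;
        }
    }

    SketchContainer findChild(String childName) { return children.get(childName); }

    // Uniform operation: counts this node plus all descendants.
    int size() {
        int n = 1;
        for (SketchContainer c : children.values()) n += c.size();
        return n;
    }

    // Walks up the parent links to build a path from the root.
    String path() {
        return parent == null ? name : parent.path() + "/" + name;
    }

    public static void main(String[] args) {
        SketchContainer engine = new SketchContainer("Catalina");
        SketchContainer host = new SketchContainer("localhost");
        SketchContainer context = new SketchContainer("order");
        SketchContainer wrapper = new SketchContainer("BuyServlet");
        engine.addChild(host);
        host.addChild(context);
        context.addChild(wrapper);
        System.out.println(wrapper.path()); // Catalina/localhost/order/BuyServlet
        System.out.println(engine.size());  // 4
    }
}
```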

2.4. Process of requesting to locate the servlet

How is a request routed to the Servlet of a particular Wrapper? The answer is that Tomcat uses the Mapper component for this task.

The Mapper component's job is to route the URL requested by the user to a Servlet. It works as follows: the Mapper stores the configuration information of the web applications, which is really the mapping between container components and access paths, such as the domain name configured in the Host container, the web application path in the Context container and the Servlet mapping path in the Wrapper container. You can think of this configuration information as a multi-level Map.

When a request arrives, the Mapper component parses the domain name and path in the request URL and looks them up in its saved Map to locate a Servlet. Note that a request URL locates exactly one Wrapper container, that is, one Servlet.

Suppose a user accesses a URL such as the one in the figure, http://user.shopping.com:8080/order/buy. How does Tomcat route this URL to a Servlet?

1. First, determine the service and engine according to the protocol and port number. Tomcat’s default HTTP connector listens to port 8080 and the default AJP connector listens to port 8009. The URL in the above example accesses port 8080, so the request will be received by the HTTP connector, and a connector belongs to a service component, so the service component is determined. We also know that in addition to multiple connectors, a service component also has a container component, specifically an engine container. Therefore, when the service is determined, it means that the engine is also determined.

2. Select the host according to the domain name. After the service and engine are determined, the mapper component finds the corresponding host container through the domain name in the URL. For example, the domain name accessed by the URL in the example is user.shopping.com, so mapper will find the container host2.

3. Find the context component according to the URL path. After the host is determined, mapper matches the path of the corresponding web application according to the path of the URL. For example, the example accesses /order, so the context container context4 is found.

4. Find the wrapper (servlet) according to the URL path. After the context is determined, mapper finds the specific wrapper and servlet according to the servlet mapping path configured in web.xml.
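The lookup steps above can be sketched as a nested map keyed by host name, context path and servlet mapping path. This is a simplified illustration: `MapperSketch` is a hypothetical class, servlet paths are matched exactly (real servlet mappings also allow prefix and extension patterns), and the registered names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class MapperSketch {
    // host name -> context path -> servlet mapping path -> wrapper (servlet) name
    private final Map<String, Map<String, Map<String, String>>> hosts = new HashMap<>();

    void register(String host, String contextPath, String servletPath, String wrapper) {
        hosts.computeIfAbsent(host, h -> new HashMap<>())
             .computeIfAbsent(contextPath, c -> new HashMap<>())
             .put(servletPath, wrapper);
    }

    // Returns the wrapper name for the given host and URL path, or null if nothing matches.
    String map(String host, String uriPath) {
        Map<String, Map<String, String>> contexts = hosts.get(host);
        if (contexts == null) return null;

        // Pick the longest context path that is a prefix of the URL path.
        String best = null;
        for (String ctx : contexts.keySet()) {
            boolean matches = uriPath.equals(ctx) || uriPath.startsWith(ctx + "/");
            if (matches && (best == null || ctx.length() > best.length())) best = ctx;
        }
        if (best == null) return null;

        // The remainder of the path is looked up in that context's servlet mappings.
        String servletPath = uriPath.substring(best.length());
        return contexts.get(best).get(servletPath);
    }

    public static void main(String[] args) {
        MapperSketch mapper = new MapperSketch();
        mapper.register("user.shopping.com", "/order", "/buy", "BuyServlet");
        mapper.register("user.shopping.com", "/user", "/login", "LoginServlet");
        System.out.println(mapper.map("user.shopping.com", "/order/buy")); // BuyServlet
        System.out.println(mapper.map("other.example.com", "/order/buy")); // null
    }
}
```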

The Adapter in the connector calls the service method of the container to execute the Servlet. The first to receive the request is the Engine container. After the Engine has processed the request, it passes it to its child container, the Host, for further processing, and so on. Finally the request reaches the Wrapper container, and the Wrapper calls the final Servlet. So how is this calling process implemented? The answer is the Pipeline-Valve mechanism.

Pipeline-Valve is an application of the chain of responsibility pattern. In this pattern, several handlers process a request in turn; each handler does its own part of the work and then calls the next handler to continue. A Valve represents one processing point (a processing valve), so its invoke method is where the request is processed.


public interface Valve {
  public Valve getNext();
  public void setNext(Valve valve);
  public void invoke(Request request, Response response);
}

Continue to look at the pipeline interface


public interface Pipeline {
  public void addValve(Valve valve);
  public Valve getBasic();
  public void setBasic(Valve valve);
  public Valve getFirst();
}

Pipeline has an addValve method. The Pipeline maintains a linked list of Valves, and a Valve can be inserted into the Pipeline to do some processing on the request. Notice that Pipeline has no invoke method, because the call chain is driven by the Valves themselves: after a Valve finishes its own processing, it calls getNext().invoke() to trigger the next Valve.

In fact, each container has a Pipeline object. As long as the first Valve of the Pipeline is triggered, all the Valves in that container's Pipeline will be called. But how are the Pipelines of different containers chained together? For example, the Pipeline in the Engine needs to call the Pipeline in its child container, the Host.

This is what the getBasic method of Pipeline is for. The BasicValve sits at the end of the Valve linked list. It is an essential Valve of the Pipeline and is responsible for calling the first Valve in the Pipeline of the child container.
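The chaining just described can be sketched as follows. The types here (`SketchValve`, `SketchPipeline`, `LoggingValve`, `BasicValve`) are simplified, hypothetical versions written for this article; instead of Request and Response objects, each valve simply appends its label to a trace list so the call order is visible.

```java
import java.util.ArrayList;
import java.util.List;

interface SketchValve {
    SketchValve getNext();
    void setNext(SketchValve next);
    void invoke(List<String> trace);
}

abstract class SketchValveBase implements SketchValve {
    private SketchValve next;
    public SketchValve getNext() { return next; }
    public void setNext(SketchValve next) { this.next = next; }
}

// An ordinary valve: does its own work, then triggers the next valve in the same pipeline.
final class LoggingValve extends SketchValveBase {
    private final String label;
    LoggingValve(String label) { this.label = label; }
    public void invoke(List<String> trace) {
        trace.add(label);
        getNext().invoke(trace);
    }
}

// The basic valve: last in its pipeline, it calls the first valve of the child pipeline.
final class BasicValve extends SketchValveBase {
    private final String label;
    private final SketchPipeline child; // null for the lowest container
    BasicValve(String label, SketchPipeline child) { this.label = label; this.child = child; }
    public void invoke(List<String> trace) {
        trace.add(label);
        if (child != null) child.getFirst().invoke(trace);
    }
}

final class SketchPipeline {
    private final List<SketchValve> valves = new ArrayList<>();
    private SketchValve basic;

    void addValve(SketchValve valve) { valves.add(valve); relink(); }
    void setBasic(SketchValve valve) { basic = valve; relink(); }
    SketchValve getFirst() { return valves.isEmpty() ? basic : valves.get(0); }

    // Keeps every valve pointing at its successor, with the basic valve always last.
    private void relink() {
        for (int i = 0; i < valves.size(); i++) {
            valves.get(i).setNext(i + 1 < valves.size() ? valves.get(i + 1) : basic);
        }
    }
}

final class PipelineDemo {
    static List<String> run() {
        SketchPipeline wrapperPipeline = new SketchPipeline();
        wrapperPipeline.setBasic(new BasicValve("wrapper-basic", null));

        SketchPipeline enginePipeline = new SketchPipeline();
        enginePipeline.addValve(new LoggingValve("engine-log"));
        enginePipeline.setBasic(new BasicValve("engine-basic", wrapperPipeline));

        List<String> trace = new ArrayList<>();
        enginePipeline.getFirst().invoke(trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [engine-log, engine-basic, wrapper-basic]
    }
}
```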

The whole call chain is triggered by the CoyoteAdapter in the connector, which calls the first Valve of the Engine:

@Override
public void service(org.apache.coyote.Request req, org.apache.coyote.Response res) {
    //Omit other codes
    // Calling the container
    connector.getService().getContainer().getPipeline().getFirst().invoke(
        request, response);
    ...
}

The last Valve of the Wrapper container creates a Filter chain and calls its doFilter() method, which eventually calls the service method of the Servlet.

Didn't we mention Filter earlier? It seems to serve a similar purpose. So what is the difference between a Valve and a Filter? The differences are:

  • Valve is a Tomcat-private mechanism and is tightly coupled to Tomcat's infrastructure and API. Filter belongs to the Servlet API, which is a public standard; all web containers, including Jetty, support the Filter mechanism.
  • Another important difference is that a Valve works at the web container level and can intercept requests for all applications, whereas a Servlet Filter works at the application level and can only intercept requests for one web application. If you want an interceptor for the whole web container, you must use a Valve.

Lifecycle

We saw earlier that the Container interface extends Lifecycle. For a system to provide services, its components must be created, assembled and started; when the service stops, resources must be released and the components destroyed. This is a dynamic process, which means Tomcat needs to manage the life cycle of these components dynamically.

How to uniformly manage the creation, initialization, start, stop and destruction of components? How to make the code logic clear? How to easily add or remove components? How to start and stop components without omission and repetition?

One-click start and stop: the Lifecycle interface

Design is about finding the system's change points and invariant points. The invariant here is that every component goes through creation, initialization and startup; these states and the transitions between them do not change. The change point is that each concrete component's initialization and startup methods are different.

Therefore, Tomcat abstracts the invariant part into an interface related to the life cycle, called Lifecycle. The Lifecycle interface defines several methods: init(), start(), stop() and destroy(). Each specific component (i.e., container) implements these methods.

In the parent component's init() method, the child components are created and their init() methods are called. Similarly, the parent component's start() method calls the start() method of each child component. The caller can therefore call init() and start() on any component in the same way, which is the composite pattern at work. As long as init() and start() are called on the top-level component, the Server, the whole of Tomcat is started. In other words, Tomcat manages containers with the composite pattern, and the containers extend the Lifecycle interface, so the life cycle of every container can be managed with one call, just like a single object, and the whole of Tomcat can be started.
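A minimal sketch of this one-call start is shown below. `SketchComponent` and the component names are hypothetical, and the order in which siblings are initialized here is simply insertion order; it is not meant to reproduce Tomcat's actual startup sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component: every node starts itself and then its children.
final class SketchComponent {
    private final String name;
    private final List<String> log;
    private final List<SketchComponent> children = new ArrayList<>();

    SketchComponent(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    SketchComponent add(SketchComponent child) {
        children.add(child);
        return this;
    }

    void init() {
        log.add("init " + name);
        for (SketchComponent c : children) c.init();
    }

    void start() {
        log.add("start " + name);
        for (SketchComponent c : children) c.start();
    }

    // Stops children first, in reverse order, then the component itself.
    void stop() {
        for (int i = children.size() - 1; i >= 0; i--) children.get(i).stop();
        log.add("stop " + name);
    }

    // Builds Server -> Service -> (Connector, Engine -> Host) and starts it from the top.
    static List<String> startAll() {
        List<String> log = new ArrayList<>();
        SketchComponent host = new SketchComponent("Host", log);
        SketchComponent engine = new SketchComponent("Engine", log).add(host);
        SketchComponent connector = new SketchComponent("Connector", log);
        SketchComponent service = new SketchComponent("Service", log).add(connector).add(engine);
        SketchComponent server = new SketchComponent("Server", log).add(service);
        server.init();   // one call initializes the whole tree
        server.start();  // one call starts the whole tree
        return log;
    }

    public static void main(String[] args) {
        startAll().forEach(System.out::println);
    }
}
```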

Scalability: lifecycle events

Let's consider another problem: the extensibility of the system. The concrete implementations of each component's init() and start() methods are complex and changeable. For example, the startup method of the Host container needs to scan the web applications under the webapps directory and create the corresponding Context containers. If new logic is needed in the future, should we modify the start() method directly? That would violate the open-closed principle. The open-closed principle says that to extend the functions of a system, you should not modify existing classes directly, but you may define new classes.

A component's init() and start() calls are triggered by the state change of its parent component. The initialization of the parent component triggers the initialization of its children, and the startup of the parent triggers the startup of its children. We can therefore define the component life cycle as a set of states and treat each state transition as an event. Events have listeners; logic can be implemented in a listener, and listeners can easily be added and removed. This is the classic observer pattern.
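The idea of treating a state transition as an event can be sketched like this. `SketchListener` and `ObservableComponent` are hypothetical names, and the event strings are illustrative only. The point is that new behavior is attached by adding a listener, without editing `start()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical observer: notified whenever a component changes state.
interface SketchListener {
    void onEvent(String componentName, String eventType);
}

final class ObservableComponent {
    private final String name;
    private final List<SketchListener> listeners = new CopyOnWriteArrayList<>();

    ObservableComponent(String name) { this.name = name; }

    void addListener(SketchListener l) { listeners.add(l); }
    void removeListener(SketchListener l) { listeners.remove(l); }

    // The start logic stays closed for modification; extensions hook in as listeners.
    void start() {
        fire("before_start");
        // component-specific startup work would go here
        fire("after_start");
    }

    private void fire(String eventType) {
        for (SketchListener l : listeners) l.onEvent(name, eventType);
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        ObservableComponent host = new ObservableComponent("localhost");
        // Example extension: react to the start event, e.g. to deploy web applications.
        host.addListener((component, type) -> seen.add(component + ":" + type));
        host.start();
        System.out.println(seen); // [localhost:before_start, localhost:after_start]
    }
}
```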

Here is the definition of the Lifecycle interface:

Reusability: the LifecycleBase abstract base class

Here we meet the template method design pattern again.

Once we have an interface, we need classes to implement it. Generally there is more than one implementation class, and different classes often share some logic when implementing the interface. If every subclass implemented that logic itself, the code would be duplicated. How can subclasses reuse it? The answer is to define a base class that implements the common logic and let each subclass inherit from it.

Tomcat defines a base class, LifecycleBase, that implements the Lifecycle interface and holds the shared logic, such as life cycle state transitions and maintenance, the firing of life cycle events and the adding and removing of listeners. Subclasses are responsible for implementing their own initialization, start and stop methods.

public abstract class LifecycleBase implements Lifecycle{
    //Hold all observers
    private final List<LifecycleListener> lifecycleListeners = new CopyOnWriteArrayList<>();
    /**
     *Publish event
     *
     * @param type  Event type
     * @param data  Data associated with event.
     */
    protected void fireLifecycleEvent(String type, Object data) {
        LifecycleEvent event = new LifecycleEvent(this, type, data);
        for (LifecycleListener listener : lifecycleListeners) {
            listener.lifecycleEvent(event);
        }
    }
    //Template method: defines the overall initialization flow; subclasses supply initInternal()
    @Override
    public final synchronized void init() throws LifecycleException {
        //1. State check: init() is only valid from the NEW state
        if (!state.equals(LifecycleState.NEW)) {
            invalidTransition(Lifecycle.BEFORE_INIT_EVENT);
        }

        try {
            //2. Move to INITIALIZING, which notifies listeners of the corresponding event
            setStateInternal(LifecycleState.INITIALIZING, null, false);
            //3. Call the initialization method of the specific subclass
            initInternal();
            //4. Move to INITIALIZED, which notifies listeners of the corresponding event
            setStateInternal(LifecycleState.INITIALIZED, null, false);
        } catch (Throwable t) {
            ExceptionUtils.handleThrowable(t);
            setStateInternal(LifecycleState.FAILED, null, false);
            throw new LifecycleException(
                    sm.getString("lifecycleBase.initFail",toString()), t);
        }
    }
}

To achieve one-click start and stop and elegant life cycle management while keeping the design extensible and reusable, Tomcat pushes object-oriented thinking and design patterns to the extreme. The Container interface maintains the parent-child relationship between containers, and the Lifecycle interface, combined with the composite pattern, maintains the life cycle of components. Each component's life cycle has parts that change and parts that do not, which is handled with the template method pattern. In short, the composite pattern, the observer pattern, skeleton abstract classes and the template method pattern are all used.

If you need to maintain a group of entities with parent-child relationships, consider using the composite pattern.

The observer pattern sounds sophisticated, but it simply means that when an event occurs, a series of update operations are performed. It provides a loosely coupled, non-invasive notification and update mechanism.

Container extends Lifecycle. StandardEngine, StandardHost, StandardContext and StandardWrapper are the concrete implementation classes of the corresponding container components. Because they are all containers, they inherit the abstract base class ContainerBase. ContainerBase implements the Container interface and inherits from LifecycleBase. Their life cycle management interface and their functional interface are kept separate, which is in line with the interface segregation principle.

3. Why did Tomcat break the parent delegation mechanism

3.1 Parent delegation

We know that the JVM class loader follows the parent delegation mechanism when loading a class: it first hands the load to its parent loader. If the parent loader is null, it checks whether the bootstrap class loader has loaded the class. Only when the parent cannot load the class does the loader try to load it itself. The JDK provides an abstract class, ClassLoader, which defines three key methods. loadClass(String name) is for external callers, and loadClass(String name, boolean resolve) is the one a subclass overrides to break parent delegation.

public Class<?> loadClass(String name) throws ClassNotFoundException {
    return loadClass(name, false);
}
protected Class<?> loadClass(String name, boolean resolve)
    throws ClassNotFoundException
{
    synchronized (getClassLoadingLock(name)) {
        //Find out whether the class has been loaded
        Class<?> c = findLoadedClass(name);
        //If not loaded
        if (c == null) {
            //Delegate to the parent loader to load and call recursively
            if (parent != null) {
                c = parent.loadClass(name, false);
            } else {
                //If the parent loader is empty, find out whether the bootstrap has been loaded
                c = findBootstrapClassOrNull(name);
            }
            //If it still cannot be loaded, call its own findclass to load it
            if (c == null) {
                c = findClass(name);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}
protected Class<?> findClass(String name){
    //1. Using the class name passed in, locate the .class file in a specific directory and read it into memory
    ...

    //2. Call defineClass to convert the byte array into a Class object
    return defineClass(buf, off, len);
}

//Parse the bytecode array into a class object and implement it with the native method
protected final Class<?> defineClass(byte[] b, int off, int len){
    ...
}

There are three class loaders in JDK. In addition, you can customize class loaders. Their relationship is shown in the figure below.

  • BootstrapClassLoader is the bootstrap class loader. It is implemented in native code (C/C++) and loads the core classes the JVM needs at startup, such as rt.jar and resources.jar.
  • ExtClassLoader is the extension class loader. It loads the jar packages under the \jre\lib\ext directory.
  • AppClassLoader is the system class loader. It loads the classes on the classpath, and applications use it to load classes by default.
  • A custom class loader loads classes from a custom path.

These class loaders all work the same way. They differ only in where they look, that is, the findClass method searches different paths. The parent delegation mechanism ensures that a Java class is unique within the JVM. If you accidentally write a class with the same name as a JRE core class, such as Object, parent delegation guarantees that the Object class from the JRE is loaded rather than the one you wrote. When AppClassLoader is asked to load your Object class, it delegates to ExtClassLoader, which in turn delegates to BootstrapClassLoader. BootstrapClassLoader finds that it has already loaded Object and returns it directly, so your Object class is never loaded. Note that the topmost loader we can obtain a reference to from Java code is ExtClassLoader; the bootstrap loader is represented as null.
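The delegation chain can be inspected from ordinary Java code. The following is a minimal sketch; the class name LoaderChain and the "bootstrap" label are made up for the example, and the concrete loader class names printed depend on the JDK version (newer JDKs replace the extension loader with a platform loader).

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {

    /** Walks from the given loader up to the top and returns the loader class names. */
    public static List<String> chainOf(ClassLoader loader) {
        List<String> chain = new ArrayList<>();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader has no Java object; it shows up as null.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        // The chain that loaded this application class.
        System.out.println(chainOf(ClassLoader.getSystemClassLoader()));
        // Core classes are loaded by the bootstrap loader, so this prints null.
        System.out.println(Object.class.getClassLoader());
    }
}
```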

3.2. Tomcat hot loading

The essence of Tomcat hot loading is a background thread that runs periodic tasks: it regularly checks whether class files have changed and reloads the classes if they have. Let’s see how ContainerBackgroundProcessor achieves this.

protected class ContainerBackgroundProcessor implements Runnable {

    @Override
    public void run() {
        //Note that the parameter passed in here is an instance of the "host class"
        processChildren(ContainerBase.this);
    }

    protected void processChildren(Container container) {
        try {
            //1. Call the backgroundprocess method of the current container.
            container.backgroundProcess();

            //2. Traverse all child containers and call processChildren recursively,
            //so that every descendant of the current container is processed
            Container[] children = container.findChildren();
            for (int i = 0; i < children.length; i++) {
                //Note: ContainerBase has a field called backgroundProcessorDelay.
                //If it is greater than 0, the child container has its own background
                //thread and the parent does not need to call processChildren for it.
                if (children[i].getBackgroundProcessorDelay() <= 0) {
                    processChildren(children[i]);
                }
            }
        } catch (Throwable t) { ... }
    }
}

Tomcat hot loading is implemented in the Context container, mainly by calling the Context container’s reload method. Leaving the details aside, it completes the following tasks at a high level:

  • Stop and destroy the Context container and all its child containers. The child containers are the Wrappers, so the Servlet instances inside the Wrappers are destroyed as well.
  • Stop and destroy the Listeners and Filters associated with the Context container.
  • Stop and destroy the Pipeline and the Valves under the Context.
  • Stop and destroy the Context’s class loader and the class file resources it loaded.
  • Start the Context container. This recreates the resources destroyed in the previous four steps.

Class loaders play a key role in this process. One Context container corresponds to one class loader. When the class loader is destroyed, all the classes it loaded are discarded with it. When the Context container starts again, a new class loader is created to load the new class files.
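The check-then-reload idea can be sketched as follows. This is a simplified illustration, not Tomcat's implementation: ReloadableContext, its fields and the timestamp supplier are hypothetical, the "class loader" is just a placeholder object, and the periodic call that a background thread would make is replaced by direct calls so the behaviour is deterministic.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class HotReloadSketch {

    /** Hypothetical stand-in for a Context that owns one class loader. */
    static class ReloadableContext {
        private final LongSupplier lastModified; // e.g. newest timestamp of the class files
        private long seenStamp;
        private Object classLoader = new Object(); // placeholder for the real loader
        private int reloadCount;

        ReloadableContext(LongSupplier lastModified) {
            this.lastModified = lastModified;
            this.seenStamp = lastModified.getAsLong();
        }

        /** Called periodically by the background thread; returns true if a reload happened. */
        boolean backgroundProcess() {
            long current = lastModified.getAsLong();
            if (current == seenStamp) {
                return false; // nothing changed
            }
            seenStamp = current;
            reload();
            return true;
        }

        // Reload: drop the old loader (and with it all classes it loaded), create a new one.
        private void reload() {
            classLoader = new Object();
            reloadCount++;
        }

        int getReloadCount() {
            return reloadCount;
        }
    }

    /** Returns the reload count after: no change, one change, no further change. */
    public static int[] run() {
        AtomicLong stamp = new AtomicLong(1L);
        ReloadableContext context = new ReloadableContext(stamp::get);
        int[] counts = new int[3];
        context.backgroundProcess();
        counts[0] = context.getReloadCount();
        stamp.set(2L); // simulate a recompiled class file
        context.backgroundProcess();
        counts[1] = context.getReloadCount();
        context.backgroundProcess();
        counts[2] = context.getReloadCount();
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

In a real server the backgroundProcess call would be driven by a scheduled thread, as ContainerBackgroundProcessor does above.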

3.3 Tomcat class loader

Tomcat’s custom class loader WebAppClassLoader breaks the parent delegation mechanism: it first tries to load a class by itself and delegates to the parent class loader only if it cannot find the class. The purpose is to give priority to the classes defined by the web application. Concretely, it overrides two methods of ClassLoader: findClass and loadClass.

findClass method

org.apache.catalina.loader.WebappClassLoaderBase#findClass

To make it easier to read, I removed some details:

public Class<?> findClass(String name) throws ClassNotFoundException {
    ...

    Class<?> clazz = null;
    try {
        //1. First look for the class in the web application directory
        clazz = findClassInternal(name);
    } catch (RuntimeException e) {
        throw e;
    }

    if (clazz == null) {
        try {
            //2. If it is not found in the local directory, hand the lookup to the parent via super.findClass
            clazz = super.findClass(name);
        } catch (RuntimeException e) {
            throw e;
        }
    }

    //3. If it is still not found, throw ClassNotFoundException
    if (clazz == null) {
        throw new ClassNotFoundException(name);
    }

    return clazz;
}

1. First look for the class to be loaded in the local directory of the web application.

2. If it is not found, hand the lookup to the parent loader, which is the system class loader AppClassLoader mentioned above. (Strictly speaking, super.findClass invokes the implementation inherited from the superclass rather than calling the parent loader object.)

3. If the parent loader does not find the class either, throw ClassNotFoundException.

loadClass method

Now let’s look at the loadClass method of Tomcat’s class loader. Again, I removed some details:

public Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {

    synchronized (getClassLoadingLock(name)) {

        Class<?> clazz = null;

        //1. First check whether the class has been loaded in the local cache
        clazz = findLoadedClass0(name);
        if (clazz != null) {
            if (resolve)
                resolveClass(clazz);
            return clazz;
        }

        //2. Check whether it has been loaded from the cache of the system class loader
        clazz = findLoadedClass(name);
        if (clazz != null) {
            if (resolve)
                resolveClass(clazz);
            return clazz;
        }

        //3. Try loading with ExtClassLoader. Why?
        ClassLoader javaseLoader = getJavaseClassLoader();
        try {
            clazz = javaseLoader.loadClass(name);
            if (clazz != null) {
                if (resolve)
                    resolveClass(clazz);
                return clazz;
            }
        } catch (ClassNotFoundException e) {
            // Ignore
        }

        //4. Try to search the local directory for class and load it
        try {
            clazz = findClass(name);
            if (clazz != null) {
                if (resolve)
                    resolveClass(clazz);
                return clazz;
            }
        } catch (ClassNotFoundException e) {
            // Ignore
        }

        //5. Try loading with the system class loader (i.e. AppClassLoader)
        try {
            clazz = Class.forName(name, false, parent);
            if (clazz != null) {
                if (resolve)
                    resolveClass(clazz);
                return clazz;
            }
        } catch (ClassNotFoundException e) {
            // Ignore
        }
    }

    //6. All the above steps failed to load the class, so throw an exception
    throw new ClassNotFoundException(name);
}

There are six main steps:

1. First check the local cache to see whether the class has already been loaded, that is, whether Tomcat’s class loader has loaded it.

2. If Tomcat’s class loader has not loaded the class, check whether the system class loader has loaded it.

3. If neither has, let ExtClassLoader load it. This is the key step that prevents a web application’s own classes from overriding the JRE core classes. Tomcat needs to break parent delegation, so if a web application defined a class called Object and that class were loaded first, it would override the Object class in the JRE. That is why Tomcat’s class loader tries ExtClassLoader first: ExtClassLoader delegates to BootstrapClassLoader, which finds that it has already loaded Object and returns it directly to Tomcat’s class loader. Tomcat’s class loader therefore never loads the Object class from the web application, and the JRE core classes cannot be overridden.

4. If ExtClassLoader fails to load the class, meaning the class is not a JRE core class, search the local web application directory and load it from there.

5. If the class is not in the local directory either, it is not defined by the web application itself, so it is handed to the system class loader. Note that the web application does this through Class.forName, passing the parent loader explicitly as the third argument, so the class is loaded by the system class loader.

6. If all of the above fail, throw ClassNotFoundException.
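The lookup order in these steps can be reproduced with a small custom loader. This is a sketch under assumptions, not Tomcat's code: ChildFirstLoader is a made-up name, its findClass only records the names it was asked for instead of searching a web application directory, and the Java SE loader is found by walking up from the system class loader until a loader has no parent.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildFirstSketch {

    static class ChildFirstLoader extends ClassLoader {
        /** Names that reached the local lookup (step 4). */
        final List<String> localLookups = new ArrayList<>();
        private final ClassLoader javaseLoader;

        ChildFirstLoader(ClassLoader parent) {
            super(parent);
            // Assumption: the topmost loader with a Java object (ext or platform loader).
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            while (loader.getParent() != null) {
                loader = loader.getParent();
            }
            this.javaseLoader = loader;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                // Steps 1-2: already loaded by this loader?
                Class<?> clazz = findLoadedClass(name);
                // Step 3: let the Java SE loader try first so core classes cannot be overridden.
                if (clazz == null) {
                    try {
                        clazz = javaseLoader.loadClass(name);
                    } catch (ClassNotFoundException ignored) {
                        // not a core class
                    }
                }
                // Step 4: local lookup before asking the parent (this breaks parent delegation).
                if (clazz == null) {
                    try {
                        clazz = findClass(name);
                    } catch (ClassNotFoundException ignored) {
                        // not available locally
                    }
                }
                // Steps 5-6: fall back to the parent; throws if the parent fails as well.
                if (clazz == null) {
                    clazz = Class.forName(name, false, getParent());
                }
                if (resolve) {
                    resolveClass(clazz);
                }
                return clazz;
            }
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            localLookups.add(name); // a real loader would search its own directories here
            throw new ClassNotFoundException(name);
        }
    }

    /** Loads a core class and an application class, returns the names looked up locally. */
    public static List<String> run() {
        ChildFirstLoader loader = new ChildFirstLoader(ChildFirstSketch.class.getClassLoader());
        try {
            Class<?> core = loader.loadClass("java.lang.String");
            Class<?> app = loader.loadClass(ChildFirstSketch.class.getName());
            if (core != String.class || app != ChildFirstSketch.class) {
                throw new IllegalStateException("unexpected class identity");
            }
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.localLookups;
    }

    public static void main(String[] args) {
        System.out.println("looked up locally: " + run());
    }
}
```

The core class never reaches the local lookup, while the application class is tried locally before the parent is asked.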

3.4. Tomcat class loader hierarchy

As a Servlet container, Tomcat is responsible for loading our Servlet classes as well as the jar packages they depend on. Tomcat is also a Java program itself, so it needs to load its own classes and dependent jar packages too. First, consider these questions:

1. Suppose we run two web applications in Tomcat and both contain a Servlet with the same name but different functionality. Tomcat has to load and manage both same-named Servlet classes at once without conflict, so classes must be isolated between web applications.

2. Suppose two web applications depend on the same third-party jar package, such as Spring. Once the Spring jar has been loaded into memory, Tomcat should let both web applications share it, so that the Spring jar is loaded only once. Otherwise JVM memory usage grows as the number of third-party jar packages increases.

3. Just as the JVM separates its core classes from application classes, we need to isolate Tomcat’s own classes from the classes of web applications.

1. WebAppClassLoader

Tomcat’s solution is a custom class loader, WebAppClassLoader, with one class loader instance created per web application. The Context container component corresponds to a web application, so each Context container is responsible for creating and maintaining its own WebAppClassLoader instance. The principle behind this is that classes loaded by different loader instances are treated as different classes even if their names are identical. This effectively creates isolated Java class spaces inside the JVM: each web application has its own class space, and web applications are isolated from one another by their class loaders.
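The claim that the same class name loaded by two loader instances yields two distinct classes can be checked directly. The sketch below is an illustration only: AppLoader and SameNameServlet are hypothetical names, and the loader simply re-reads the compiled bytes of SameNameServlet from the classpath, which assumes the class files are available as resources.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class IsolationSketch {

    /** Stand-in for a Servlet class that two web applications both ship. */
    public static class SameNameServlet { }

    /** Hypothetical per-application loader: it defines the target class itself. */
    static class AppLoader extends ClassLoader {
        private final String target;

        AppLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // everything else: normal delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> clazz = findLoadedClass(name);
                if (clazz == null) {
                    byte[] bytes = readClassBytes(name);
                    clazz = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(clazz);
                }
                return clazz;
            }
        }

        // Reads the compiled class file through the parent loader's resources.
        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[4096];
                for (int n; (n = in.read(buffer)) != -1; ) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads the same class name through two independent loaders. */
    public static Class<?>[] run() {
        String name = SameNameServlet.class.getName();
        ClassLoader parent = IsolationSketch.class.getClassLoader();
        try {
            Class<?> first = new AppLoader(name, parent).loadClass(name);
            Class<?> second = new AppLoader(name, parent).loadClass(name);
            return new Class<?>[] { first, second };
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?>[] classes = run();
        System.out.println("same name:  " + classes[0].getName().equals(classes[1].getName()));
        System.out.println("same class: " + (classes[0] == classes[1]));
    }
}
```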

2. SharedClassLoader

The underlying requirement is to share library classes between two web applications without loading the same classes twice. Under parent delegation, every child loader can load classes through its parent loader, so it is enough to put the shared classes on the parent loader’s loading path.

The designers of Tomcat therefore added a class loader, SharedClassLoader, as the parent loader of WebAppClassLoader, dedicated to loading classes shared between web applications. If WebAppClassLoader cannot load a class itself, it delegates to its parent SharedClassLoader, which loads the shared class from the specified directory and returns it to WebAppClassLoader. This solves the sharing problem.

3. CatalinaClassLoader

How do we isolate Tomcat’s own classes from those of web applications?

Sharing works through a parent-child relationship, while isolation requires a sibling relationship. Siblings are two class loaders at the same level, which may have the same parent loader. On this basis Tomcat adds another class loader, CatalinaClassLoader, specifically for loading Tomcat’s own classes.

This design raises a question: what if Tomcat and the web applications need to share some classes?

The same trick applies: add one more loader, CommonClassLoader, as the parent of both CatalinaClassLoader and SharedClassLoader. Every class that CommonClassLoader can load is usable by both CatalinaClassLoader and SharedClassLoader.
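The resulting hierarchy is just a tree of loaders wired together through their parent references. The sketch below shows only the wiring, using plain URLClassLoader instances with empty search paths as stand-ins; the Loaders holder class and its field names are invented for the example, and real Tomcat builds its loaders from configured directories.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class HierarchySketch {

    /** Holder for the stand-in loaders; names follow the roles described in the text. */
    public static class Loaders {
        public final ClassLoader common;
        public final ClassLoader catalina;
        public final ClassLoader shared;
        public final ClassLoader webappA;
        public final ClassLoader webappB;

        Loaders() {
            ClassLoader system = ClassLoader.getSystemClassLoader();
            // Shared by Tomcat itself and by all web applications.
            common = new URLClassLoader(new URL[0], system);
            // Siblings under common: isolated from each other.
            catalina = new URLClassLoader(new URL[0], common);
            shared = new URLClassLoader(new URL[0], common);
            // One loader per web application, siblings under shared.
            webappA = new URLClassLoader(new URL[0], shared);
            webappB = new URLClassLoader(new URL[0], shared);
        }
    }

    public static Loaders build() {
        return new Loaders();
    }

    public static void main(String[] args) {
        Loaders loaders = build();
        System.out.println("webappA and webappB share a parent: "
                + (loaders.webappA.getParent() == loaders.webappB.getParent()));
        System.out.println("catalina and shared share a parent: "
                + (loaders.catalina.getParent() == loaders.shared.getParent()));
    }
}
```

Sharing follows the parent links upward, while loaders that are only siblings cannot see each other's classes.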

4、 Summary and takeaways of the overall architecture design

By studying the overall architecture of Tomcat, we now know which core components Tomcat has, how they relate to each other, and how Tomcat handles an HTTP request. Let’s review it with a simplified class diagram. The diagram shows the hierarchy of the components, and the dotted line represents the path a request takes through Tomcat.

4.1 Connector

The overall architecture of Tomcat has two core components: the connector and the container. The connector handles external communication and the container handles internal processing. The connector uses the ProtocolHandler interface to encapsulate the differences between communication protocols and I/O models. Internally, ProtocolHandler is split into the EndPoint and Processor modules: EndPoint is responsible for the underlying Socket communication and Processor is responsible for parsing the application layer protocol. The connector calls the container through the Adapter.

Studying the overall architecture of Tomcat gives us some basic ideas for designing complex systems. First analyze the requirements and determine the sub-modules according to the principle of high cohesion and low coupling. Then identify the varying and invariant points in each sub-module, encapsulate the invariant points with interfaces and abstract base classes, define template methods in the abstract base class, and let subclasses implement the abstract methods, so that the concrete subclasses realize the varying points.

4.2 Container

The containers are managed with the composite pattern, and startup events are published through the observer pattern, which achieves decoupling and follows the open-closed principle. Skeleton abstract classes and template methods separate what varies from what stays the same: the varying parts are left to subclasses, which gives both code reuse and flexible extension. Requests are processed with a chain of responsibility, for example for logging.
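A chain of responsibility for request processing can be sketched with valves that each hold a reference to the next one. This is a simplified illustration, not Tomcat's Pipeline and Valve API: ValveBase, AccessLogValve, BasicValve and the string-based request are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {

    /** Each valve does its own work and then decides whether to pass the request on. */
    abstract static class ValveBase {
        private ValveBase next;

        ValveBase setNext(ValveBase next) {
            this.next = next;
            return next;
        }

        protected void invokeNext(String request, List<String> trace) {
            if (next != null) {
                next.invoke(request, trace);
            }
        }

        abstract void invoke(String request, List<String> trace);
    }

    /** Cross-cutting concern: record the request, then continue down the chain. */
    static class AccessLogValve extends ValveBase {
        @Override
        void invoke(String request, List<String> trace) {
            trace.add("access-log " + request);
            invokeNext(request, trace);
        }
    }

    /** Last valve in the chain: hands the request to the actual handler. */
    static class BasicValve extends ValveBase {
        @Override
        void invoke(String request, List<String> trace) {
            trace.add("servlet handles " + request);
        }
    }

    /** Sends one request through a two-valve chain and returns what each valve recorded. */
    public static List<String> run(String request) {
        ValveBase first = new AccessLogValve();
        first.setNext(new BasicValve());
        List<String> trace = new ArrayList<>();
        first.invoke(request, trace);
        return trace;
    }

    public static void main(String[] args) {
        run("/hello").forEach(System.out::println);
    }
}
```

Adding another concern means inserting one more valve into the chain; the existing valves stay untouched.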

4.3 Class loader

To isolate web applications, Tomcat’s custom class loader WebAppClassLoader breaks the parent delegation mechanism. It first tries to load a class by itself and delegates to the parent class loader only if the class cannot be found, so that classes defined by the web application take priority. To prevent a web application’s own classes from overriding the JRE core classes, it tries ExtClassLoader first. This breaks parent delegation while still loading classes safely.

5、 Application in real scenarios

This article briefly analyzed the overall architecture of Tomcat, from [connector] to [container], and described the design ideas and design patterns behind some of the components. The next step is to apply what we have learned and bring elegant design into day-to-day development. Learning starts with imitation.

5.1 Chain of responsibility pattern

At work there was a requirement: the user enters some information and chooses to check one or more modules of an enterprise, such as [industrial and commercial information], [judicial information] and [registration status], as shown below, and the modules have some common logic that every module needs to reuse.

This resembles a request that is processed by multiple modules. Each query module can therefore be abstracted as a processing valve, with a List holding the valves. Adding a new module then only requires adding one valve, which follows the open-closed principle. At the same time, the pile of check code is decoupled into separate concrete valves, and an abstract class extracts the “invariant” functionality.

The example code is as follows:

First, abstract the processing valve. NetCheckDTO is the request information.

/**
 *Chain of responsibility mode: handle each module valve
 */
public interface Valve {
    /**
     *Call
     * @param netCheckDTO
     */
    void invoke(NetCheckDTO netCheckDTO);
}

Define abstract base classes and reuse code.

public abstract class AbstractCheckValve implements Valve {
    public final AnalysisReportLogDO getLatestHistoryData(NetCheckDTO netCheckDTO, NetCheckDataTypeEnum checkDataTypeEnum){
        //Get the history and omit the code logic
    }

    //Get and verify data source configuration
    public final String getModuleSource(String querySource, ModuleEnum moduleEnum){
       //Omit code logic
    }
}

Define the specific business logic processed by each module, such as the corresponding processing of Baidu negative news


@Slf4j
@Service
public class BaiduNegativeValve extends AbstractCheckValve {
    @Override
    public void invoke(NetCheckDTO netCheckDTO) {

    }
}

Finally, manage the modules the user selected for checking. We keep them in a List, which is then used to trigger the required check modules.

@Slf4j
@Service
public class NetCheckService {
    //Fill all valves
    @Autowired
    private Map<String, Valve> valveMap;

    /**
     *Send verification request
     *
     * @param netCheckDTO
     */
    @Async("asyncExecutor")
    public void sendCheckRequest(NetCheckDTO netCheckDTO) {
        //Module valve for saving customer selection processing
        List<Valve> valves = new ArrayList<>();

        CheckModuleConfigDTO checkModuleConfig = netCheckDTO.getCheckModuleConfig();
        //Add the module selected by the user for inspection to the valve chain
        if (checkModuleConfig.getBaiduNegative()) {
            valves.add(valveMap.get("baiduNegativeValve"));
        }
        //Omit part of code
        if (CollectionUtils.isEmpty(valves)) {
            log.info("the network check module is empty and there are no tasks to check");
            return;
        }
        //Trigger processing
        valves.forEach(valve -> valve.invoke(netCheckDTO));
    }
}

5.2 Template method pattern

The requirement is to run a financial report analysis based on either the financial report Excel data or the enterprise name entered by the customer.

For an unlisted enterprise: parse the Excel file -> verify that the data is valid -> perform the calculation.

For a listed enterprise: check whether the name exists, and if it does not, send an email and abort the calculation -> pull the financial report data from the database, initialize the check log, generate a report record and trigger the calculation -> update the task status according to success or failure.

The important “varying” and “invariant” parts are:

  • Invariant: the overall flow. Initialize the check log, initialize a report, validate the data up front (if a listed company fails validation, the email data also has to be built and sent), pull the financial report data from its source and adapt it to the common format, then trigger the calculation. The status has to be updated on both task failure and success.
  • Varying: the validation rules differ between listed and unlisted enterprises, the financial report data is obtained in different ways, and the data from both sources has to be adapted.

The overall algorithm is a fixed template, but the implementation of some steps inside it has to be deferred to different subclasses. This is the ideal scenario for the template method pattern.

public abstract class AbstractAnalysisTemplate {
    /**
     * Template method for submitting a financial report analysis; defines the skeleton process
     * @param reportAnalysisRequest
     * @return
     */
    public final FinancialAnalysisResultDTO doProcess(FinancialReportAnalysisRequest reportAnalysisRequest) {
        FinancialAnalysisResultDTO analysisDTO = new FinancialAnalysisResultDTO();
        //Abstract method: validate the submitted request
        boolean prepareValidate = prepareValidate(reportAnalysisRequest, analysisDTO);
        log.info("prepareValidate verification result = {}", prepareValidate);
        if (!prepareValidate) {
            //Abstract method: build the data required for the notification mail
            buildEmailData(analysisDTO);
            log.info("build mail information, data = {}", JSON.toJSONString(analysisDTO));
            return analysisDTO;
        }
        String reportNo = FINANCIAL_REPORT_NO_PREFIX + reportAnalysisRequest.getUserId() + SerialNumGenerator.getFixLenthSerialNumber();
        //Generate analysis log
        initFinancialAnalysisLog(reportAnalysisRequest, reportNo);
        //Generate analysis record
        initAnalysisReport(reportAnalysisRequest, reportNo);

        try {
            //Abstract method: pull financial report data, implemented differently in each subclass
            FinancialDataDTO financialData = pullFinancialData(reportAnalysisRequest);
            log.info("pull the financial report data and prepare to execute the calculation");
            //Calculate the indicators
            financialCalcContext.calc(reportAnalysisRequest, financialData, reportNo);
            //Mark the analysis log as successful
            successCalc(reportNo);
        } catch (Exception e) {
            log.error("abnormal financial report calculation subtask", e);
            //Mark the analysis log as failed
            failCalc(reportNo);
            throw e;
        }
        return analysisDTO;
    }

    //Varying points, implemented by the listed and unlisted subclasses
    protected abstract boolean prepareValidate(FinancialReportAnalysisRequest reportAnalysisRequest, FinancialAnalysisResultDTO analysisDTO);

    protected abstract void buildEmailData(FinancialAnalysisResultDTO analysisDTO);

    protected abstract FinancialDataDTO pullFinancialData(FinancialReportAnalysisRequest reportAnalysisRequest);
}

Finally, create two subclasses that extend the template and implement the abstract methods. This decouples the processing logic for listed and unlisted enterprises while reusing the shared code.
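What the two subclasses might look like can be sketched with a stripped-down, self-contained version of the template. Everything here is hypothetical and heavily simplified: the class names, the string-based input and the validation rules are placeholders, not the project code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TemplateSketch {

    /** Simplified template: the flow is fixed, validation and data loading vary. */
    abstract static class AnalysisTemplate {
        final String process(String input) {
            if (!validate(input)) {
                return "rejected: " + input;
            }
            return "calculated on " + pullData(input);
        }

        protected abstract boolean validate(String input);

        protected abstract String pullData(String input);
    }

    /** Listed enterprise: the name must be known, data comes from the database. */
    static class ListedAnalysis extends AnalysisTemplate {
        private final Set<String> knownNames = new HashSet<>(Arrays.asList("ACME"));

        @Override
        protected boolean validate(String input) {
            return knownNames.contains(input);
        }

        @Override
        protected String pullData(String input) {
            return "database rows for " + input;
        }
    }

    /** Unlisted enterprise: the input is an uploaded Excel file that gets parsed. */
    static class UnlistedAnalysis extends AnalysisTemplate {
        @Override
        protected boolean validate(String input) {
            return input.endsWith(".xlsx");
        }

        @Override
        protected String pullData(String input) {
            return "parsed sheet " + input;
        }
    }

    public static String runListed(String name) {
        return new ListedAnalysis().process(name);
    }

    public static String runUnlisted(String fileName) {
        return new UnlistedAnalysis().process(fileName);
    }

    public static void main(String[] args) {
        System.out.println(runListed("ACME"));
        System.out.println(runListed("Unknown Corp"));
        System.out.println(runUnlisted("report.xlsx"));
    }
}
```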

5.3 Strategy pattern

The requirement is as follows. We need a general-purpose interface for recognizing bank flow (transaction statement) Excel files. Assume that a standard bank flow contains fields such as [transaction time, income, expenditure, transaction balance, payer account number, payer name, payee name, payee account number]. We have to work out the column index in the Excel header for each required field. However, there are several cases:

1. The file contains all standard fields.

2. Income and expenditure share the same column and are distinguished by positive and negative values.

3. Income and expenditure share the same column and are distinguished by a transaction type field.

4. Special handling for particular banks.

In other words, we need to select the processing logic that matches the column layout. If we wrote everything in one method, we might end up with too many if else branches, with the whole bank flow processing coupled together. Whenever a new bank flow type appears, we would have to keep changing the old code, and the result could be code that is “smelly, long and hard to maintain”.

Here we can use the strategy pattern: different processors handle the bank flows of different templates, and the matching strategy is selected according to the template. If another type is added later, we only need to add a new processor, which keeps cohesion high, coupling low and the design extensible.

Define the processor interface and implement the processing logic in different processors. All processors are injected into the data_processor_map of BankFlowDataHandler, and the processor that fits the scenario is picked to process the bank flow (the simplified code below uses BankFlowDataContext with a list of processors instead).

public interface DataProcessor {
    /**
     * Process bank flow data
     * @param bankFlowTemplateDO column index data of the bank flow
     * @param row
     * @return
     */
    BankTransactionFlowDO doProcess(BankFlowTemplateDO bankFlowTemplateDO, List<String> row);

    /**
     * Whether this processor supports the template. Each bank flow strategy decides from the template data whether it can parse it
     * @return
     */
    boolean isSupport(BankFlowTemplateDO bankFlowTemplateDO);
}

//Processor context
@Service
@Slf4j
public class BankFlowDataContext {
    //Inject all processors into the list
    @Autowired
    private List<DataProcessor> processors;

    //Find the processor that supports the template and let it process the row
    public void process(BankFlowTemplateDO bankFlowTemplateDO, List<String> row) {
        for (DataProcessor processor : processors) {
            if (processor.isSupport(bankFlowTemplateDO)) {
                //row is one row of bank flow data
                processor.doProcess(bankFlowTemplateDO, row);
                break;
            }
        }
    }
}

Define the default processor, which handles the standard template. To support a new template, you only need to add a new processor that implements DataProcessor.

/**
 * Default processor: handles the standard bank flow template
 *
 */
@Component("defaultDataProcessor")
@Slf4j
public class DefaultDataProcessor implements DataProcessor {

    @Override
    public BankTransactionFlowDO doProcess(BankFlowTemplateDO bankFlowTemplateDO, List<String> row) {
        //Omit processing logic details
        return bankTransactionFlowDO;
    }

    @Override
    public boolean isSupport(BankFlowTemplateDO bankFlowTemplateDO) {
        //Omit the check of whether this bank flow template is supported
        boolean isDefault = true;

        return isDefault;
    }
}

Through the strategy pattern, we assign different processing logic to different processing classes, which decouples them completely and makes the code easy to extend.
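For readers who want to run the idea without Spring, here is a minimal self-contained sketch covering two of the cases described above (separate income and expenditure columns, and a single signed amount column). The interface RowProcessor, the header names and the long[] result are assumptions made for the example and do not come from the project code.

```java
import java.util.Arrays;
import java.util.List;

public class StrategySketch {

    /** Strategy interface: decide whether a header layout is supported, then parse a row. */
    interface RowProcessor {
        boolean supports(List<String> header);

        /** Returns {income, expenditure} for one row. */
        long[] process(List<String> header, List<String> row);
    }

    /** Case 1: income and expenditure have their own columns. */
    static class SeparateColumns implements RowProcessor {
        @Override
        public boolean supports(List<String> header) {
            return header.contains("income") && header.contains("expenditure");
        }

        @Override
        public long[] process(List<String> header, List<String> row) {
            long income = Long.parseLong(row.get(header.indexOf("income")));
            long expenditure = Long.parseLong(row.get(header.indexOf("expenditure")));
            return new long[] { income, expenditure };
        }
    }

    /** Case 2: one column, the sign distinguishes income from expenditure. */
    static class SignedAmount implements RowProcessor {
        @Override
        public boolean supports(List<String> header) {
            return header.contains("amount");
        }

        @Override
        public long[] process(List<String> header, List<String> row) {
            long value = Long.parseLong(row.get(header.indexOf("amount")));
            return value >= 0 ? new long[] { value, 0 } : new long[] { 0, -value };
        }
    }

    private static final List<RowProcessor> PROCESSORS =
            Arrays.asList(new SeparateColumns(), new SignedAmount());

    /** Picks the first processor that supports the header and lets it parse the row. */
    public static long[] run(List<String> header, List<String> row) {
        for (RowProcessor processor : PROCESSORS) {
            if (processor.supports(header)) {
                return processor.process(header, row);
            }
        }
        throw new IllegalArgumentException("no processor for header " + header);
    }

    public static void main(String[] args) {
        long[] result = run(Arrays.asList("time", "amount"), Arrays.asList("2021-11-14", "-250"));
        System.out.println(Arrays.toString(result));
    }
}
```

Supporting a third layout only requires one more RowProcessor in the list; the selection loop does not change.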

Source code for debugging with embedded Tomcat is on GitHub: https://github.com/UniqueDong/tomcat-embedded
