Build a high availability architecture for microservices

Time: 2021-01-18

With the rapid development of microservices and cloud computing in recent years, machines have gradually shifted from physical servers to virtual machines, and applications have evolved from large monoliths into clusters composed of many small services. Update iterations have become far more frequent, and the traditional deployment model can no longer keep up with day-to-day development, so a management architecture suited to microservices is needed.

Technology stack and documentation

Resource scheduling framework: Mesos

Application orchestration platform: Marathon

Nginx dynamic upstream module: dyups

Nginx dynamic upstream module: upsync

Machine resource management with Mesos

First, machine resource management. In a microservice architecture, the original monolithic service is split into many independent applications. These services are small and can run on machines with modest specifications. For fault isolation, we try to deploy them on different virtual machines, so the number of machines multiplies. For operations, deploying a new service means checking whether the remaining resources of the existing machines can accommodate it; an inaccurate estimate can lead to repeated scale-ups and migrations, or to wasted resources.

In the beginning, our architecture might look like this

[Figure: the initial architecture]

To solve the problems above, we can use Mesos, a distributed resource management framework that lets us treat the whole data center as one computer (a single resource pool).
A Mesos deployment has two roles, master and agent; both can of course be started on the same machine.

ZooKeeper needs to be installed before Mesos. Mesos uses ZK for high availability and leader election: one master acts as leader while several backup masters stand by, to avoid downtime.

The Mesos master is responsible for managing frameworks and agents, and for allocating agent resources to frameworks.
The Mesos agent is responsible for managing the Mesos tasks on its node and for allocating resources to each executor (in older versions the agent was called mesos-slave).

$ cat > /tmp/bintray-mesos-el.repo <<EOF

# bintray-mesos-el - packages by mesos from Bintray

[bintray-mesos-el]
name=bintray-mesos-el
baseurl=https://dl.bintray.com/apache…
gpgcheck=0
repo_gpgcheck=0
enabled=1
EOF
 
$ sudo mv /tmp/bintray-mesos-el.repo /etc/yum.repos.d/bintray-mesos-el.repo
 
$ sudo yum update
 
$ sudo yum install mesos
 
$ tree /etc/mesos-master
/etc/mesos-master/
|-- hostname
|-- ip
|-- log_dir
|-- quorum    # quorum > (number of masters)/2
`-- work_dir
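With the Mesosphere packages, each file under /etc/mesos-master simply contains the value of the startup flag of the same name. A minimal sketch for our three-master setup (the IP is the first master's address used throughout this article):

$ echo "192.168.100.9" | sudo tee /etc/mesos-master/ip
$ echo "192.168.100.9" | sudo tee /etc/mesos-master/hostname
$ echo "2" | sudo tee /etc/mesos-master/quorum    # 3 masters => quorum of 2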
 
$ tree /etc/mesos-agent
/etc/mesos-agent/
|-- containerizers   # container type; default is "mesos", "docker" can be added, e.g.: mesos,docker
|-- hostname
|-- ip
|-- log_dir
|-- master           # master address: host:port, zk://host1:port1,host2:port2,.../path, or file:///path/to/file
|-- resources        # total resource size; set it lower to reserve machine resources
`-- work_dir
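The agent files work the same way. For example, to enable the Docker containerizer alongside the default one and cap the advertised resources (the values here are illustrative):

$ echo "mesos,docker" | sudo tee /etc/mesos-agent/containerizers
$ echo "cpus:4;mem:8192;disk:40960" | sudo tee /etc/mesos-agent/resources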
 
$ cat /etc/mesos/zk    # where Mesos stores its state in ZK
zk://192.168.100.9:2181,192.168.100.110:2181,192.168.100.234:2181/mesos
 
$ systemctl start mesos-master
$ systemctl start mesos-slave

Once the Mesos services are started, each agent reports its machine resources (CPU, memory, disk, etc.) to the master. When we want to publish a service, we only need to declare the CPU, memory, and disk it requires, and the Mesos master automatically selects a machine with enough resources to run it, as shown in the figure below.

[Figure: Mesos allocating a service to a machine with sufficient resources]
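Before adding any framework, you can sanity-check scheduling with the mesos-execute utility that ships with Mesos; a minimal sketch (the task name and resource sizes are arbitrary):

$ mesos-execute --master=192.168.100.9:5050 \
      --name=resource-test \
      --command="sleep 60" \
      --resources="cpus:0.5;mem:128;disk:64"
# the master offers resources from an agent with at least 0.5 CPU,
# 128 MB of memory, and 64 MB of disk free, and the task runs there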

We hand the startup of microservices over to Mesos, so we only need to pay attention to overall resources. Mesos provides a web UI: browse to port 5050 of the Mesos master to see cluster resource usage, both overall and per agent node.

[Figure: Mesos web UI, overall cluster usage]

[Figure: Mesos web UI, per-agent usage]
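The same numbers are also available over HTTP, which is handy for monitoring scripts; for example:

$ curl -s http://192.168.100.9:5050/state-summary     # cluster and per-agent resource summary
$ curl -s http://192.168.100.9:5050/metrics/snapshot  # master metrics (master/cpus_used, master/mem_used, ...)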

After that, our architecture looks like this

[Figure: the architecture with Mesos managing machine resources]

Microservice management with Marathon

Marathon is a private PaaS platform built on Mesos. It automatically handles hardware and software failures to ensure that every application is "always on". We use Marathon to manage microservices because it has the following advantages:

1. It supports both containerized and non-containerized applications, regardless of startup type or operating system version.
2. It has a clean and powerful UI for quick, convenient application configuration.
3. It supports constraints, for example allowing a Mesos agent node to run only one instance of an application.
4. It supports health checks over HTTP, HTTPS, and TCP, as well as command-based checks.
5. It has a complete REST API that is easy to integrate and script against. This is crucial for the later integration; sample calls are shown after the installation steps below.

# Add the repository

$ sudo rpm -Uvh http://repos.mesosphere.com/e…
 

# Install packages

$ sudo yum -y install mesos marathon
 

# Marathon and Mesos ZK paths

$ cat /etc/default/marathon
MARATHON_MESOS_USER="root"
MARATHON_MASTER="zk://192.168.100.9:2181,192.168.100.110:2181,192.168.100.234:2181/mesos"
MARATHON_ZK="zk://192.168.100.9:2181,192.168.100.110:2181,192.168.100.234:2181/marathon"
 
$ systemctl start marathon

After startup, you can access Marathon directly on port 8080 and see its clean, powerful UI.

[Figure: Marathon UI]
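Everything the UI does is also available through the REST API; a few sample calls, assuming an application id of /springboot-demo (the app created in the example that follows):

$ curl -s http://192.168.100.9:8080/v2/apps                      # list all applications
$ curl -s http://192.168.100.9:8080/v2/apps/springboot-demo      # inspect one application
$ curl -s -X PUT -H "Content-Type: application/json" \
       -d '{"instances": 4}' \
       http://192.168.100.9:8080/v2/apps/springboot-demo         # scale to 4 instances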

Taking a Spring Boot application as an example, we create an application on Marathon:

[Figures: creating the Spring Boot application in the Marathon UI]
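For reference, the same application could be created by POSTing an app definition to Marathon; a minimal sketch (the app id, jar path, health-check path, and resource sizes are illustrative, not taken from the screenshots; $PORT0 is the service port Marathon assigns):

$ cat > springboot-demo.json <<'EOF'
{
  "id": "/springboot-demo",
  "cmd": "java -jar /opt/app/springboot-demo.jar --server.port=$PORT0",
  "cpus": 0.5,
  "mem": 512,
  "disk": 256,
  "instances": 2,
  "constraints": [["hostname", "UNIQUE"]],
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/health",
      "gracePeriodSeconds": 30,
      "intervalSeconds": 10,
      "maxConsecutiveFailures": 3
    }
  ]
}
EOF
$ curl -X POST -H "Content-Type: application/json" \
       -d @springboot-demo.json http://192.168.100.9:8080/v2/apps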

When we update an application, Marathon starts a new set with the same number of instances and replaces the old nodes only after the new instances pass their health checks, so we no longer worry about an outage caused by stopping the old service before the new one is up. At this point we can create, upgrade, scale up, and scale down applications on Marathon for daily work. When a health check fails or a machine goes down, Marathon automatically restarts the application on other nodes, which greatly improves availability.
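This rolling-replacement behaviour can be tuned per application via Marathon's upgradeStrategy; a minimal sketch (values are illustrative, app id from the earlier example):

# keep at least 50% of instances healthy during a deployment,
# and allow up to 100% extra capacity for the new version
$ curl -X PUT -H "Content-Type: application/json" \
       -d '{"upgradeStrategy": {"minimumHealthCapacity": 0.5, "maximumOverCapacity": 1.0}}' \
       http://192.168.100.9:8080/v2/apps/springboot-demo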


Using the nginx upsync / dyups modules for smooth changes

When our microservices can be scheduled onto any machine, a new headache appears: nginx knows nothing about back-end node changes, and manually editing the upstream nodes and reloading nginx for every change would cost far too much. Our solution is to integrate with the microservice registry: when a service registers or deregisters, the registry is updated and nginx follows along. Using the nginx upsync / dyups modules, we can modify upstream nodes dynamically and make changes smoothly. If the registry is Consul, the upsync module is recommended: no extra development is needed, and a simple nginx configuration achieves the desired effect. upsync supports the Consul KV, consul_services, and consul_health interfaces, and also supports etcd; the consul_health interface is recommended. Note that upsync only watches the registry; each service still has to register itself (or be registered by a deployment hook), for example via Consul's agent API as sketched below. The upsync module is not built into nginx, so nginx must be recompiled with it.
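A minimal registration sketch using Consul's agent API; the service name matches the test upstream used in the configuration below, while the address, port, and health-check path are illustrative:

$ curl -X PUT -d '{
    "Name": "test",
    "Address": "192.168.100.110",
    "Port": 8081,
    "Check": {
      "HTTP": "http://192.168.100.110:8081/health",
      "Interval": "10s"
    }
  }' http://127.0.0.1:8500/v1/agent/service/register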

wget http://nginx.org/download/ngi…
tar -xzvf nginx-1.8.0.tar.gz
cd nginx-1.8.0/
 
 
./configure --add-module=/path/to/nginx-upsync-module
make
make install

Sample configuration file

http {
    upstream test {
        upsync 127.0.0.1:8500/v1/health/service/test upsync_timeout=6m upsync_interval=500ms upsync_type=consul_health strong_dependency=off;
        upsync_dump_path /usr/local/nginx/conf/servers/servers_test.conf;
 
        include /usr/local/nginx/conf/servers/servers_test.conf;
    }
 
    upstream bar {
        server 127.0.0.1:8090 weight=1 fail_timeout=10 max_fails=3;
    }
 
    server {
        listen 8080;
 
        location = /proxy_test {
            proxy_pass http://test;
        }
 
        location = /bar {
            proxy_pass http://bar;
        }
 
        location = /upstream_show {
            upstream_show;
        }
 
    }
}
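upsync_dump_path persists the node list pulled from Consul, so nginx can still start with the last known servers when Consul is briefly unreachable (together with strong_dependency=off). The dumped file is plain upstream syntax, for example (addresses illustrative):

# /usr/local/nginx/conf/servers/servers_test.conf, maintained by upsync
server 192.168.100.110:8081 weight=1 max_fails=2 fail_timeout=10s;
server 192.168.100.234:8081 weight=1 max_fails=2 fail_timeout=10s;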

When upsync cannot meet our needs, or the registry is neither Consul nor etcd, we can consider the nginx dyups module. dyups only provides an interface for adding, deleting, querying, and modifying upstreams; the work of diffing against the registry and applying changes has to be done by our own script (sample interface calls are shown after the configuration below). Although this approach takes more work, it is highly customizable, and with HTTP, C, and Lua APIs it can cover most scenarios.

The dyups module also needs to be added at nginx compile time

$ git clone git://github.com/yzprofile/ngx_http_dyups_module.git
 

# Compile as a static module

$ ./configure --add-module=./ngx_http_dyups_module
 

# Compile as a dynamic module

$ ./configure --add-dynamic-module=./ngx_http_dyups_module

Sample configuration

http {
 
    include conf/upstream.conf;
 
    server {
        listen   8080;
 
        location / {
            # The upstream here must be an nginx variable
            proxy_pass http://$dyups_host;
        }
    }
 
    server {
        listen 8088;
        location / {
            return 200 "8088";
        }
    }
 
    server {
        listen 8089;
        location / {
            return 200 "8089";
        }
    }
 
    server {
        listen 8081;
        location / {
            dyups_interface;
        }
    }
}

One thing to remember in particular: when using dyups, the upstream in proxy_pass must be an nginx variable, otherwise dynamic changes will not take effect.
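With the configuration above, the sync script drives the dyups interface on port 8081; a minimal sketch (the upstream name dyhost is illustrative):

# add or update an upstream named "dyhost" pointing at the two test servers
$ curl -d "server 127.0.0.1:8088;server 127.0.0.1:8089;" http://127.0.0.1:8081/upstream/dyhost
# list all dynamic upstreams and their servers
$ curl http://127.0.0.1:8081/detail
# remove the upstream
$ curl -i -X DELETE http://127.0.0.1:8081/upstream/dyhost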


Overall review

After the above adjustments, we gain the following optimizations

  1. Server resources are allocated automatically and used rationally
  2. Microservices are more highly available
  3. Operations labor costs drop, and management and maintenance become easier