Containerization has become a mainstream trend. It addresses many operations pain points around efficiency, cost, and stability, but adopting containers brings its own problems and inconveniences. While containerizing our systems at Youzan, we ran into issues with container technology itself, with adapting the operations and maintenance system, and with changes to users' habits. This article describes the problems Youzan encountered during containerization and the solutions we adopted.
Why Youzan adopted containers
Youzan runs many projects and routine changes in parallel at any given time, and contention for environments seriously hurt the efficiency of development, testing, and release. We needed to give each project its own joint-debugging environment (daily) and test environment (QA), created and destroyed along with the life cycle of the project or routine change. Our earliest containerization requirement was therefore fast environment delivery.
The above describes our R&D process. In the standard process we have four stable environments: the daily environment, the QA environment, the pre-release environment, and the test environment. Development, testing, and joint debugging are not carried out directly in a stable environment. Instead, an independent project environment is created for the work. Once the code has passed development, testing, and pre-release acceptance and has been released to production, it is synchronized back to the daily/QA stable environments.
Our environment delivery scheme aims to support the maximum number of parallel projects with the minimum resource investment. On top of the daily/QA stable environments we isolate N project environments. A project environment only needs compute resources for the applications that the project actually touches. Calls to any service that is missing from the project environment are served by the stable environment. The project environments make heavy use of container technology.
Later, we built a continuous delivery pipeline on top of the fast delivery solution for project environments. There are currently more than 600 project and continuous delivery environments which, together with the daily/QA stable environments, involve four to five thousand compute instances. CPU and memory utilization on these instances is very low. Containerization solves the environment delivery efficiency problem well, and it also raises resource utilization to save cost.
Youzan's container solution
Our containerization solution is based on Kubernetes (1.7.10) and Docker (1.12.6 and 1.13.1). The following describes the problems we encountered in each area and how we solved them.
Youzan's back end consists mainly of Java applications built on a customized Dubbo service framework. An entire unit cannot be containerized all at once during migration, so network-level interworking with the existing cluster is a hard requirement. Because we could not solve the interworking problem between an overlay network on the public cloud and the public cloud's own network, we initially abandoned the overlay approach and adopted a macvlan solution on a managed network. This solved both network interworking and network performance, but it meant we could not benefit from the elastic resources of the public cloud. As our multi-cloud architecture has evolved and more cloud vendors have come to support interworking between container overlay networks and VPC networks, the elastic resource problem has eased.
Container isolation relies mainly on the kernel's namespace and cgroup mechanisms. These work well for isolating and limiting processes, CPU, memory, I/O, and similar resources, but in other respects they fall short of virtual machines. The problem we hit most often is that the CPU count and memory size visible inside a container are inaccurate: the /proc file system is not isolated, so a process in the container "sees" the CPU count and memory size of the physical host.
Our Java applications choose their JVM parameters based on the memory size of the server they run on, so an inaccurate memory size leads to a wrong JVM configuration. We use lxcfs to avoid this problem.
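The dependency on memory size can be illustrated with a minimal sketch. It assumes the startup logic reads MemTotal from /proc/meminfo (the file whose contents lxcfs replaces with a container-scoped view) and gives a fixed share of it to the JVM heap. The 50% ratio and both function names are hypothetical, not values taken from Youzan's tooling.

```python
import re
from typing import List


def mem_total_bytes(meminfo_text: str) -> int:
    """Extract MemTotal (reported in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal not found")
    return int(match.group(1)) * 1024


def jvm_heap_flags(total_bytes: int, heap_ratio: float = 0.5) -> List[str]:
    """Derive -Xms/-Xmx flags from total memory (heap_ratio is an assumed policy)."""
    heap_mb = int(total_bytes * heap_ratio) // (1024 * 1024)
    return [f"-Xms{heap_mb}m", f"-Xmx{heap_mb}m"]


if __name__ == "__main__":
    # Inside a container with lxcfs mounted, this file shows the container's limit.
    with open("/proc/meminfo") as f:
        print(jvm_heap_flags(mem_total_bytes(f.read())))
```

Without a container-scoped /proc/meminfo, the same logic would size the heap from the host's memory and could exceed the container's limit.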
The CPU count problem
We need to overcommit CPU, and by default Kubernetes enforces CPU allocation through CPU shares. As a result, the CPU count is still inaccurate even with lxcfs. The JVM and many Java SDKs decide how many threads to create based on the number of CPUs in the system, so Java applications in containers used far more threads and memory than on virtual machines, which seriously affected their operation. Other types of applications have similar problems.
We set an environment variable, NUM_CPUS, according to the container's specification. A Node.js application, for example, creates its worker processes based on this variable. For Java applications, we simply override the JVM_ActiveProcessorCount function through LD_PRELOAD so that it directly returns the value of NUM_CPUS.
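The environment-variable approach can be sketched as follows. This is an illustration only: the variable is written NUM_CPUS here, and the rule of rounding a millicore limit up to a whole CPU, the fallback of one worker, and both function names are assumptions rather than details from the platform described above.

```python
import math
import os
from typing import Mapping


def num_cpus_from_limit(cpu_limit_millicores: int) -> int:
    """Round a CPU limit in millicores up to a whole CPU count (minimum 1)."""
    return max(1, math.ceil(cpu_limit_millicores / 1000))


def worker_count(environ: Mapping[str, str] = os.environ, default: int = 1) -> int:
    """Read NUM_CPUS from the environment, falling back to `default` if unusable."""
    raw = environ.get("NUM_CPUS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


if __name__ == "__main__":
    # The platform side would inject the variable; the application side reads it.
    env = {"NUM_CPUS": str(num_cpus_from_limit(2500))}
    print(worker_count(env))
```

Reading an explicit variable keeps the application independent of what /proc reports, which is the point of the workaround.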
Before containerization, all Youzan applications were already onboarded to the release system, which had standardized how applications are packaged and released. The cost of onboarding an application to containers is therefore relatively small, and business teams do not need to provide a Dockerfile.
- Node.js, Python, PHP-SOA, and other applications managed by supervisor only need to provide an app.yaml file in their Git repository that defines the required runtime and the startup command.
- Java applications that already use the standardized startup need no changes from the business team.
- Java applications that are not standardized must be standardized first.
Container images are divided into three layers: the stack layer (OS), the runtime layer (language environment), and the application layer (business code and some auxiliary agents). The application and the auxiliary agents are started by runit. Because our configuration is not yet fully separated from the code, the application layer is currently packaged independently for each environment. Besides the business code, the image includes auxiliary agents chosen according to the application's language. At first we wanted to split the agents into separate images and run multiple containers in each pod. However, we could not solve the container startup ordering problem within a pod (the services depend on each other at startup), so we put all the services into a single container.
Our image build process is also scheduled through Kubernetes (onto designated packaging nodes). When a release task is started, the control system creates a packaging pod in the cluster. The packaging program compiles the code and installs dependencies according to the application type and other parameters, generates a Dockerfile, and then uses Docker-in-Docker inside the pod to build the container image and push it to the registry.
To speed up packaging, we use PVCs to cache files such as Python virtualenvs, Node.js node_modules, and Java Maven packages. In addition, in earlier Docker versions the Dockerfile ADD instruction could not specify the owner and group of the added files. Setting the file owner (our applications run under the app account) therefore required an extra RUN chown, which added one more layer of data to the image. For this reason our packaging nodes run a newer official Docker release, because newer versions support ADD --chown.
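As a rough illustration of the Dockerfile generation step, the sketch below renders an application-layer Dockerfile on top of a runtime image and sets ownership with ADD --chown instead of a separate RUN chown. The image name, directory paths, and function name are hypothetical; the source only states that the application runs under the app account.

```python
def render_dockerfile(runtime_image: str, src_dir: str, dest_dir: str,
                      owner: str = "app") -> str:
    """Render a minimal application-layer Dockerfile.

    ADD --chown sets owner and group while copying, so no extra
    `RUN chown` layer is needed.
    """
    lines = [
        f"FROM {runtime_image}",
        f"ADD --chown={owner}:{owner} {src_dir} {dest_dir}",
        f"WORKDIR {dest_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical runtime image and paths, for illustration only.
    print(render_dockerfile("registry.example.com/runtime/python:3",
                            "build/", "/opt/app/"))
```

With an older Docker, the same result would need an ADD followed by RUN chown, and the chown step stores a second copy of the files in its own layer.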
Load balancing (Ingress)
Youzan already has mature service governance and service mesh solutions for internal calls between applications, so in-cluster access needs little extra consideration. Load balancing only has to handle HTTP traffic from users and from systems accessing the cluster. Before containerization we had already developed a unified access system. For container load balancing we therefore did not implement a full controller following the Ingress mechanism. Instead, the Ingress resource configuration lives in the unified access system, and each upstream in that configuration is associated with a Service in Kubernetes. We only wrote a sync program that watches the Kubernetes API for Service changes and updates the server list of the corresponding upstream in the unified access system in real time.
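The core of such a sync program can be sketched as a function that folds watch events into an upstream table. The event shape here (a dict with "type", "service", and "backends") is a simplified, hypothetical stand-in; a real program would receive events from a Kubernetes client's watch call and push the result to the unified access system's own API, neither of which is shown.

```python
from typing import Dict, Iterable, List

Upstreams = Dict[str, List[str]]


def apply_event(upstreams: Upstreams, event: dict) -> Upstreams:
    """Return a new upstream table with one watch event applied."""
    updated = dict(upstreams)
    name = event["service"]
    if event["type"] == "DELETED":
        updated.pop(name, None)
    elif event["type"] in ("ADDED", "MODIFIED"):
        # Sort so that the server list is stable across updates.
        updated[name] = sorted(event["backends"])
    return updated


def sync(upstreams: Upstreams, events: Iterable[dict]) -> Upstreams:
    """Apply a stream of events in order and return the final table."""
    for event in events:
        upstreams = apply_event(upstreams, event)
    return upstreams


if __name__ == "__main__":
    events = [
        {"type": "ADDED", "service": "shop-web", "backends": ["10.0.0.2:8080"]},
        {"type": "MODIFIED", "service": "shop-web",
         "backends": ["10.0.0.3:8080", "10.0.0.2:8080"]},
    ]
    print(sync({}, events))
```

Keeping the event handling as a pure function makes it easy to test separately from the watch loop and from the access system client.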
Container login and debugging
During container onboarding, developers reported that the web console was hard to use. We optimized it several times, but the experience still fell short of a terminal such as iTerm2. In the end we opened up SSH login for the project and continuous delivery environments, where frequent login and debugging are needed.
Another serious problem is that when an application fails its health check at startup, the pod is rescheduled over and over. During development we want to inspect the failure scene, so we provide a debug release mode in which the container skips the health check.
Youzan has a dedicated log system, known internally as Skynet. Most logs and business monitoring data are sent directly to Skynet through an SDK, so the container's standard output log serves only as an aid for troubleshooting. We collect container logs with Fluentd. After Fluentd processes them, they are written to Kafka in the log format agreed with Skynet, and Skynet finally processes them and stores them in Elasticsearch.
Gray release involves three main kinds of traffic:
- HTTP access traffic for clients
- HTTP calls between applications
- Dubbo calls between applications
First, the unified access layer at the entry point tags each request with all the dimensions that gray release needs (such as user and store). We then modified the unified access layer, the HTTP client, and the Dubbo client so that these tags are passed along the entire call chain. For a gray release of a container, we publish a gray Deployment and then configure the gray rules in the unified access layer and the gray configuration center. Every caller on the link reads these rules, which is how the gray release takes effect.
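The tag propagation and rule matching described above can be sketched as follows. The header prefix, the rule format (a set of matching values per dimension), and the function names are all hypothetical. The source does not describe how tags are encoded or how rules are expressed.

```python
from typing import Dict, Mapping, Set

GRAY_HEADER_PREFIX = "x-gray-"  # hypothetical header prefix for gray tags


def labels_to_headers(labels: Mapping[str, str]) -> Dict[str, str]:
    """Encode gray tags as HTTP headers so they travel with the request."""
    return {GRAY_HEADER_PREFIX + key: value for key, value in labels.items()}


def headers_to_labels(headers: Mapping[str, str]) -> Dict[str, str]:
    """Recover gray tags from incoming headers, ignoring unrelated ones."""
    labels = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(GRAY_HEADER_PREFIX):
            labels[lowered[len(GRAY_HEADER_PREFIX):]] = value
    return labels


def pick_deployment(labels: Mapping[str, str],
                    rules: Mapping[str, Set[str]]) -> str:
    """Route to the gray deployment if any tag value matches a gray rule."""
    for dimension, gray_values in rules.items():
        if labels.get(dimension) in gray_values:
            return "gray"
    return "stable"


if __name__ == "__main__":
    rules = {"shop_id": {"1001"}}
    incoming = labels_to_headers({"user_id": "42", "shop_id": "1001"})
    # A downstream caller re-reads the tags and applies the same rules.
    print(pick_deployment(headers_to_labels(incoming), rules))
```

Because every hop decodes the same tags and evaluates the same rules, a request that enters the gray path at the entry point stays on it for downstream HTTP calls. A Dubbo client would carry the tags in its own request metadata in the same way.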
Containerizing the standard environments
Motivation for containerizing the standard environments
- As in the project environments, more than half of the servers in the standard stable environments (daily, QA, pre, and prod) run at low utilization, which wastes a lot of resources.
- For cost reasons, daily, QA, and pre each run applications on a single virtual machine, so whenever a stable environment has to be released, both the standard stable environment and the project environments become temporarily unavailable.
- Virtual machines are slow to deliver, and gray release on virtual machines is complicated.
- Virtual machines often live for several years or longer, which makes it very troublesome to converge operating system and base software versions.
Rolling out containerization in the standard environments
After the earlier rollout and iteration of the project and continuous delivery environments, most applications already meet the conditions for containerization. Going to production, however, requires the whole operations system to adapt to containers, including monitoring, release, and logging. We have basically completed the preparations for containerizing the production environment. Some front-end Node.js applications already run in containers on the production network, and other applications are being migrated in turn. We hope to share more production containerization experience in the future.
The above covers how Youzan applies containerization, along with some of the problems we met and the solutions we chose. Containerization of our production environment is still at an early stage, and we expect to run into many more problems. We hope to learn from others and to share more of our experience later.