Performance comparison between Apache APISIX and Envoy

Time: 2021-10-27

I first heard of Envoy at a technology sharing meeting organized by CNCF. The guest speaker went on at length and I did not retain much of it, but one particularly novel concept stuck with me: "communication bus". Afterwards I looked up what it means and found this description on the official website:

“Envoy is an L7 proxy and communication bus designed for large modern SOA (Service Oriented Architecture) architectures”

In other words, Envoy is L7 proxy software created for the service mesh field. I found a picture on the Internet, and as I understand it, Envoy is typically deployed in roughly the following architecture. (If I am wrong, corrections from the experts are welcome.)

[Figure: Envoy deployment architecture]

Since it is L7 proxy software, and I am a veteran who has been hanging around the OpenResty community for years, I can't help wanting to compare it with something.

The comparison target we selected is Apache APISIX, which recently graduated from the Apache Incubator. It is an API gateway based on OpenResty. (In essence, it is an L7 proxy plus features such as routing, authentication, rate limiting, and dynamic upstreams.)

Why did I choose it? During community sharing sessions I had heard that this product's routing implementation was very good, and our current business routing system happened to be in a mess. I pulled the APISIX source code and found that it really was excellent, well ahead of the similar products I had seen. I was so impressed that the choice was settled!

Here is a picture borrowed from the official APISIX website. A picture really is worth a thousand words: you can see how it works at a glance.

[Figure: Apache APISIX architecture, from the official website]

Let's get started. First, we go to the official websites to find the latest versions of the two products:

Apache APISIX 1.5 and Envoy 1.14

Build environment preparation

  • Stress test client: wrk
  • Main test indicators: gateway latency, QPS, and linear scalability;
  • Test environment: Microsoft Azure Linux VM (Ubuntu 18.04), Standard D13 v2 (8 vCPUs, 56 GiB memory);
  • Test method 1: side-by-side comparison on a single core (both gateways are based on the epoll I/O model, so a single-core stress test shows their raw processing capacity);
  • Test method 2: side-by-side comparison on multiple cores, to verify whether the overall processing capacity of each grows linearly as more processes (threads) are added;

Test scenario

Here, we build an upstream server with nginx, configure two workers, and have it respond directly with about 4 KB of content when it receives a request. The reference configuration is as follows:

server {
  listen 1980;
  
  access_log off;
  location = /hello {
    echo_duplicate 400 "1234567890";
  }
}
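
As a quick check of the "4K" figure, the response body produced by the echo_duplicate line above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic; it assumes the directive simply repeats the given string 400 times, with nothing else added to the body.

```python
# Body produced by: echo_duplicate 400 "1234567890"
# Assumption: the body is exactly the string repeated 400 times.
chunk = "1234567890"
repeat = 400

body = chunk * repeat
size_bytes = len(body.encode("ascii"))

print(size_bytes)         # 4000
print(size_bytes / 1024)  # 3.90625
```

So each response carries 4000 bytes of payload, which is what "4K" refers to here.
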
  • The network architecture is shown in the diagram below. (Green means normal load, not saturated. Red means high-pressure load, where the process's resources, mainly CPU, should be fully used.)

[Figure: network architecture of the stress test]

Routing configuration

First, following the APISIX getting started guide, we add a route for /hello. The configuration is as follows:

curl http://127.0.0.1:9080/apisix/admin/routes/1 -X PUT -d '{
    "uri": "/hello",
    "upstream": {
        "type": "roundrobin",
        "nodes": {
            "127.0.0.1:1980": 1
        }
    }}'

Note that the proxy_cache and proxy_mirror plug-ins are not enabled here, because Envoy has no similar functions.
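
For readers who script their test setup, the same route could also be registered from Python. The following is a minimal sketch using only the standard library; the URL and route body are copied from the curl command above, while the helper name build_route_request is made up for illustration. It only builds the request; sending it requires a running APISIX instance.

```python
import json
import urllib.request

# Values copied from the curl command above.
ADMIN_URL = "http://127.0.0.1:9080/apisix/admin/routes/1"
ROUTE = {
    "uri": "/hello",
    "upstream": {
        "type": "roundrobin",
        "nodes": {"127.0.0.1:1980": 1},
    },
}


def build_route_request(url=ADMIN_URL, route=ROUTE):
    """Build (but do not send) the PUT request that registers the route.

    Hypothetical helper, not part of APISIX.
    """
    body = json.dumps(route).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


request = build_route_request()
print(request.get_method(), request.full_url)

# To send it against a running APISIX instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before touching the gateway.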

Then, following the official Envoy benchmarking guidance, we add a route for Envoy:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: "0.0.0.0", port_value: 10000 }

    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          generate_request_id: false
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/hello" }
                route: { cluster: service_test }
          http_filters:
          - name: envoy.router
            config:
              dynamic_stats: false
  clusters:
  - name: service_test
    connect_timeout: 0.25s
    type: LOGICAL_DNS
    dns_lookup_family: V4_ONLY
    lb_policy: ROUND_ROBIN
    hosts: [{ socket_address: { address: "127.0.0.1", port_value: 1980 }}]
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1000000000
          max_pending_requests: 1000000000
          max_requests: 1000000000
          max_retries: 1000000000
        - priority: HIGH
          max_connections: 1000000000
          max_pending_requests: 1000000000
          max_requests: 1000000000
          max_retries: 1000000000

The generate_request_id, dynamic_stats, and circuit_breakers settings above are enabled by default in Envoy, but they are not used in this stress test, so they need to be turned off or given very large thresholds to improve performance. (Can anyone explain to me why this configuration is so complicated? -_-!)

Stress test results

The test uses a single route without any plug-ins, and we run a full-load stress test with different numbers of CPUs. Note: nginx calls this setting the number of workers, and Envoy calls it concurrency. For consistency, it is called the number of workers below.

Number of processes   APISIX QPS   APISIX Latency   Envoy QPS    Envoy Latency
1 worker              18608.4      0.96             15625.56     1.02
2 workers             34975.8      1.01             29058.135    1.09
3 workers             52334.8      1.02             42561.125    1.12

Note: the raw data is published in a gist.

QPS: the number of requests completed per second. The larger the number, the better, because more requests are completed per unit of time. From the QPS results, APISIX delivers about 120% of Envoy's throughput, and the gap grows as more cores are used.

Latency: the delay per request, that is, how long it takes for a request to receive a response after it is sent. The smaller the value, the better. In a reverse proxy scenario, a smaller value means the proxy has less impact on the request. From the results, Envoy's latency per request is 6-10% higher than that of APISIX, and the gap grows as more cores are used.
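
The percentages quoted above can be recomputed from the results table. The sketch below uses only the numbers from that table; the helper name compare is made up for illustration.

```python
# Results copied from the table above: workers -> (QPS, latency).
apisix = {1: (18608.4, 0.96), 2: (34975.8, 1.01), 3: (52334.8, 1.02)}
envoy = {1: (15625.56, 1.02), 2: (29058.135, 1.09), 3: (42561.125, 1.12)}


def compare(apisix_results, envoy_results):
    """Return workers -> (APISIX QPS as % of Envoy, Envoy extra latency in %)."""
    rows = {}
    for workers in sorted(apisix_results):
        a_qps, a_latency = apisix_results[workers]
        e_qps, e_latency = envoy_results[workers]
        qps_percent = a_qps / e_qps * 100
        extra_latency_percent = (e_latency / a_latency - 1) * 100
        rows[workers] = (qps_percent, extra_latency_percent)
    return rows


for workers, (qps_percent, extra_latency) in compare(apisix, envoy).items():
    print(f"{workers} worker(s): APISIX QPS = {qps_percent:.1f}% of Envoy, "
          f"Envoy latency +{extra_latency:.1f}%")
```

With these inputs the QPS ratio comes out at roughly 119% to 123%, and Envoy's extra latency at roughly 6% to 10%, both increasing with the number of workers.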

It can be seen that the gap in QPS and latency is small in single-worker mode, but it gradually widens as the number of worker processes increases. I see two possible reasons. One is that nginx may have an advantage in how its multiple workers interact with the system I/O model under high concurrency. The other is that nginx itself is "stingy" in its use of memory and CPU, and these small savings accumulate into a performance advantage. This will be evaluated in detail in the future.
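
One way to quantify how close each gateway comes to linear scaling is to divide the measured QPS at n workers by n times the single-worker QPS. This metric is not part of the original test; it is a sketch of one reasonable definition, and the function name scaling_efficiency is made up for illustration.

```python
# QPS values copied from the results table above: workers -> QPS.
qps = {
    "APISIX": {1: 18608.4, 2: 34975.8, 3: 52334.8},
    "Envoy": {1: 15625.56, 2: 29058.135, 3: 42561.125},
}


def scaling_efficiency(series):
    """Measured QPS divided by ideal linear QPS (workers * single-worker QPS)."""
    base = series[1]
    return {workers: value / (workers * base)
            for workers, value in sorted(series.items())}


efficiency = {name: scaling_efficiency(series) for name, series in qps.items()}
for name, per_worker in efficiency.items():
    for workers, value in per_worker.items():
        print(f"{name}, {workers} worker(s): {value:.1%} of linear")
```

By this measure APISIX holds at about 94% of linear at both 2 and 3 workers, while Envoy goes from about 93% at 2 workers to about 91% at 3, which matches the widening gap described above.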

Summary

In general, APISIX is slightly better than Envoy in terms of response latency and QPS. Because nginx's multi-worker model has an advantage in high-concurrency scenarios, APISIX's performance improves more noticeably than Envoy's as more worker processes are enabled. However, the two do not conflict. Envoy's bus design gives it unique advantages in handling east-west traffic, while APISIX's throughput and latency give it massive capacity for handling north-south traffic. The right approach is to choose the components and plug-ins that fit your own business scenario when building your services.