The transition to microservices presents a series of challenges. If the architecture, design, and development of microservices are complex, their deployment and management are just as complex.
Developers need to ensure that cross-service communication is secure. They also need to implement distributed tracing to measure how long each call takes. Resilience patterns such as retries and circuit breakers make services more robust. Microservices are usually polyglot, built with different libraries and SDKs. Writing general-purpose, reusable software to manage inter-service communication across protocols such as HTTP, gRPC, and GraphQL is complex, expensive, and time-consuming.
After a microservices-based application is deployed, day-two operations fall to the DevOps team. They need to monitor service health, latency, logs, events, and traces. The DevOps team is also expected to implement policy-based routing to configure blue/green deployments, canary releases, and rolling upgrades. Finally, metrics, events, logs, and alerts from multiple microservices need to be aggregated and integrated with existing observability and monitoring stacks.
The service mesh is a new phenomenon in the cloud native and microservices world that tries to solve these problems for developers and operators. After container orchestration, if there is one technology that has captured the attention of developers and operators, it is the service mesh. Cloud native advocates recommend using a service mesh when running microservices in production.
A service mesh enables developers to manage inter-service communication without building language-specific SDKs and tools. For operators, a service mesh provides ready-to-use traffic policies, observability, and insight out of the box.
The best thing about a service mesh is that it is "zero intrusion" software: it does not force code or configuration changes. Using the sidecar pattern, the service mesh injects a proxy alongside each service to act on behalf of the host service. Because the proxy intercepts every inbound and outbound call, it gains unparalleled visibility into the call stack. Each proxy associated with a service sends the telemetry collected from the call stack to a centralized component, which also acts as the control plane. When operators configure a traffic policy, they submit it to the control plane, which pushes the policy down to the proxies to shape traffic. Site Reliability Engineers (SREs) use the observability of the service mesh to gain insight into their applications.
The service mesh integrates with an existing API gateway or Kubernetes ingress controller. While the API gateway and ingress handle north-south traffic, the service mesh is responsible for east-west traffic.
In a nutshell, a service mesh is an infrastructure layer for secure service-to-service communication. It relies on lightweight network proxies deployed alongside each microservice, with a centralized control plane coordinating the proxies to manage traffic policies, security, and observability.
Although the service mesh is mostly used with microservices packaged as containers, it can also integrate with VMs and even physical servers. By effectively using the traffic policies of a service mesh, applications running across multiple environments can be seamlessly integrated. This makes the service mesh one of the key drivers of hybrid cloud and multi-cloud.
There are many service meshes for enterprises to choose from. This article attempts to compare and contrast some of the mainstream service mesh platforms available in the cloud native ecosystem.
AWS App Mesh
AWS App Mesh was launched at AWS re:Invent 2018 to bring the advantages of a service mesh to Amazon Web Services' compute and container services. It can easily be configured with Amazon EC2, Amazon ECS, AWS Fargate, Amazon EKS, and even AWS Outposts.
Since App Mesh can serve as a service mesh for both VMs and containers, Amazon created an abstraction layer based on virtual services, virtual nodes, virtual routers, and virtual routes.
A virtual service represents an actual service deployed in a VM or container. Each version of a virtual service is mapped to a virtual node, so there is a one-to-many relationship between virtual services and virtual nodes. After deploying a new version of a microservice, you only need to configure it as a virtual node. Similar to a network router, a virtual router acts as the endpoint for virtual nodes. A virtual router has one or more virtual routes that carry the traffic policy and retry policy. A mesh object acts as the logical boundary for all the related entities and services.
A proxy is associated with each service participating in the mesh and handles all the traffic flowing through the mesh.
Suppose we run two services in AWS: servicea.apps.local and serviceb.apps.local.
We can easily enable the mesh for these services without modifying code.
Notice that serviceb.apps.local has a virtual service, a virtual node per version, and a virtual router with two virtual routes. The virtual routes determine the percentage of traffic sent to v1 and v2 of the microservice.
Like most service mesh platforms, AWS App Mesh relies on the open source Envoy proxy for its data plane. The App Mesh control plane is built with AWS compute services in mind, and Amazon has customized the Envoy proxy to support it.
When using AWS App Mesh with Amazon EKS, you get the benefit of automatic sidecar injection and the ability to define App Mesh entities in YAML. Amazon has built CRDs for EKS that simplify configuring App Mesh with standard Kubernetes objects.
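As a sketch of how the v1/v2 split described above could look with those CRDs (assuming the App Mesh controller's `appmesh.k8s.aws/v1beta2` API; the names, namespace, port, and weights are illustrative):

```yaml
# Hypothetical virtual router splitting traffic between two virtual nodes,
# one per version of the serviceb microservice.
apiVersion: appmesh.k8s.aws/v1beta2
kind: VirtualRouter
metadata:
  name: serviceb-router
  namespace: apps
spec:
  listeners:
    - portMapping:
        port: 8080
        protocol: http
  routes:
    - name: serviceb-route
      httpRoute:
        match:
          prefix: /
        action:
          weightedTargets:
            - virtualNodeRef:
                name: serviceb-v1   # existing version receives most traffic
              weight: 90
            - virtualNodeRef:
                name: serviceb-v2   # new version receives a small share
              weight: 10
```

Shifting traffic to the new version then becomes a matter of adjusting the two weights.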
Telemetry generated by AWS App Mesh can be integrated with Amazon CloudWatch. Metrics can also be exported to third-party services such as Splunk, Prometheus, and Grafana, and to open tracing solutions such as Zipkin and LightStep.
AWS App Mesh comes at no additional charge for customers using AWS compute services.
HashiCorp Consul
HashiCorp's Consul started as a service discovery platform with a built-in key/value store. It acts as an efficient, lightweight load balancer for services running on the same host or in a distributed environment. Consul exposes a DNS query interface for discovering registered services and performs health checks on all of them.
Consul was created before containers and Kubernetes became mainstream. However, the rise of microservices and the service mesh prompted HashiCorp to extend Consul into a fully functional service mesh platform. The service mesh extension, Consul Connect, uses mutual Transport Layer Security (mTLS) to provide service-to-service connection authorization and encryption.
For a detailed description and a step-by-step implementation guide, refer to my Consul service discovery and Consul Connect tutorials.
Since the sidecar pattern is the preferred approach for service meshes, Consul Connect ships with its own proxy to handle inbound and outbound service connections. Thanks to its pluggable architecture, Envoy can be used as an alternative proxy for Consul Connect.
Consul Connect adds two essential capabilities to Consul: security and observability.
By default, Consul adds TLS certificates to service endpoints to implement mutual TLS (mTLS), ensuring that communication between services is always encrypted. Security policy is implemented through intentions, which define access control for services and govern which services may establish connections. Intentions can deny or allow traffic from specific services. For example, a database service can deny inbound traffic coming directly from a web service while allowing requests made through a business logic service.
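On recent versions of Consul on Kubernetes, that example policy could be sketched with the `ServiceIntentions` custom resource (the service names are illustrative, and the exact CRD shape depends on the consul-k8s version in use):

```yaml
# Hypothetical intentions for a "database" service:
# deny direct calls from "web", allow calls from "business-logic".
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: database
spec:
  destination:
    name: database
  sources:
    - name: web
      action: deny      # web may not talk to the database directly
    - name: business-logic
      action: allow     # requests via the business logic tier are permitted
```

The same rules can also be managed with the `consul intention` CLI outside Kubernetes.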
When Envoy is used as the proxy for Consul Connect, it brings L7 observability. Envoy integrated with Consul Connect can be configured to send telemetry to a variety of sinks, including StatsD, DogStatsD, and Prometheus.
Consul can act as a client (agent) or a server depending on the context, and supports sidecar injection when integrated with orchestrators such as Nomad and Kubernetes. There is a Helm chart for deploying Consul Connect on Kubernetes, where Consul Connect configuration and metadata are added as annotations to the pod specification submitted to Kubernetes. It also integrates with Ambassador, the ingress controller from Datawire, to handle north-south traffic.
Consul lacks advanced traffic routing and splitting capabilities for implementing blue/green deployments or canary releases. Compared with other service mesh choices, its traffic policy support is not very flexible. Some advanced routing policies can be configured through the Envoy integration, but Consul Connect does not expose an interface for this.
Overall, Consul and Consul Connect form a robust service discovery and mesh platform that is easy to manage.
Istio
Istio is one of the most popular open source service mesh platforms, backed by Google, IBM, and Red Hat.
Istio was also one of the first service mesh technologies to use Envoy as its proxy. It follows the standard approach of a centralized control plane and a distributed data plane attached to the microservices.
Although Istio can be used with virtual machines, it is primarily integrated with Kubernetes. Pods deployed in a namespace configured for automatic sidecar injection get the data plane component attached by Istio.
Istio provides four main capabilities for microservice developers and operators:
- Traffic management: Istio simplifies the configuration of service-level properties such as circuit breakers, timeouts, and retries, and makes it easy to set up percentage-based traffic splitting for A/B testing, canary deployments, and staged rollouts. It also provides out-of-the-box failure recovery features that make applications more resilient to failures of dependent services or the network. Istio has its own ingress to handle north-south traffic.
- Extensibility: WebAssembly is a sandboxing technology that can be used to extend the capabilities of the Istio proxy. The Proxy-Wasm sandbox API replaces Mixer as Istio's primary extension mechanism, and Istio 1.6 will provide a unified configuration API for Proxy-Wasm plug-ins.
- Security: Istio provides out-of-the-box security for inter-service communication. It establishes a secure communication channel and manages authentication, authorization, and encryption of service traffic at scale. With Istio, service communication is secured by default, letting developers and operators enforce policies consistently across protocols and runtimes without changing code or configuration.
- Observability: because Istio's data plane intercepts inbound and outbound traffic, it has full visibility into the current state of a deployment. Istio provides robust tracing, monitoring, and logging capabilities for insight into the service mesh, and ships with integrated, preconfigured Prometheus and Grafana dashboards.
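The percentage-based traffic splitting mentioned under traffic management is expressed with Istio's `DestinationRule` and `VirtualService` resources. This is a minimal sketch using the `networking.istio.io/v1beta1` API; the `reviews` service and its version labels are illustrative:

```yaml
# Define the two versions of the service as named subsets.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
# Route 90% of traffic to v1 and 10% to the v2 canary.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 90
        - destination:
            host: reviews
            subset: v2
          weight: 10
```

Gradually raising the v2 weight turns this into a controlled canary rollout.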
Google and IBM offer managed Istio as part of their managed Kubernetes platforms. Google built Knative, a serverless computing environment, on top of Istio, and Istio has become a core foundation for Google services such as Anthos and Cloud Run. Compared with other offerings, Istio is considered a complex and heavyweight service mesh platform, but its extensibility and rich feature set make it the preferred platform for enterprises.
Kuma
Launched in September 2019, Kuma is one of the newest entrants in the service mesh ecosystem. It is developed and maintained by Kong, Inc., the API gateway company behind the open source and commercial products of the same name.
Kuma is a logical extension of the Kong API gateway: while Kong handles north-south traffic, Kuma handles east-west traffic.
Like most service mesh platforms, Kuma ships with separate data plane and control plane components. The control plane is the heart of the service mesh: it holds all the service configuration and can scale to manage thousands of services across an organization. Kuma combines a fast data plane with an advanced control plane that lets users set permissions, expose metrics, and configure routing policies through Custom Resource Definitions (CRDs) in Kubernetes or through a REST API.
Kuma's data plane is tightly integrated with the Envoy proxy, allowing it to run in virtual machines or in containers deployed in Kubernetes. Kuma has two deployment modes: Kubernetes and universal. When running in Kubernetes mode, Kuma uses the API server and the etcd database to store its configuration. In universal mode, it requires an external PostgreSQL database as its data store.
The control plane component, kuma-cp, manages one or more data plane components, kuma-dp. Every microservice registered with the mesh runs its own copy of kuma-dp. In Kubernetes, kuma-cp runs in the kuma-system namespace and is configured through CRDs; namespaces annotated for Kuma get the data plane injected into each pod.
Kuma comes with a GUI that provides an overview of the deployment, including the status of each data plane registered with the control plane. The same interface can be used to view health checks, traffic policies, routes, and traces from the proxies attached to the microservices.
The Kuma service mesh has a built-in CA used to encrypt traffic with mTLS. Traffic permissions can be configured based on the tags associated with microservices. Tracing can be integrated with Zipkin, and metrics can be scraped by Prometheus.
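As a sketch against a recent Kuma release (the exact fields and tag names vary across Kuma versions, and the `web`/`backend` service names are illustrative), enabling the built-in CA and a tag-based traffic permission could look like this:

```yaml
# Enable mTLS on the default mesh using Kuma's built-in CA.
apiVersion: kuma.io/v1alpha1
kind: Mesh
metadata:
  name: default
spec:
  mtls:
    enabledBackend: ca-1
    backends:
      - name: ca-1
        type: builtin
---
# Allow the "web" service to call the "backend" service,
# matched on the kuma.io/service tag.
apiVersion: kuma.io/v1alpha1
kind: TrafficPermission
mesh: default
metadata:
  name: web-to-backend
spec:
  sources:
    - match:
        kuma.io/service: web
  destinations:
    - match:
        kuma.io/service: backend
```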
Kuma lacks some advanced resilience features, such as circuit breaking, retries, fault injection, and delay injection.
Kuma is a well-designed, clean service mesh implementation. Its integration with the Kong gateway may drive adoption among Kong's existing users and customers.
Linkerd
Linkerd 2.x is an open source service mesh built for Kubernetes by Buoyant. It is licensed under Apache 2.0 and is an incubating project of the Cloud Native Computing Foundation.
Linkerd is an ultra-lightweight, easy-to-install service mesh platform with three components: 1) a CLI and UI, 2) a control plane, and 3) a data plane.
Once the CLI is installed on a machine that can communicate with the Kubernetes cluster, the control plane can be installed with a single command. All control plane components are deployed as Kubernetes deployments in the linkerd namespace. The web and CLI tools consume the controller's API server. The destination component tells the proxies running in the data plane where to route requests. The proxy injector is a Kubernetes admission controller that receives a webhook request every time a pod is created; it injects the proxy as a sidecar into each pod started in an enabled namespace. The identity component manages the certificates that are critical to establishing mTLS connections between proxies. The tap component receives requests from the CLI and web UI to watch requests and responses in real time.
Linkerd comes with preconfigured Prometheus and Grafana components that provide dashboards out of the box.
The data plane has a lightweight proxy attached to each service as a sidecar. A Kubernetes init container configures iptables to intercept traffic and connects the proxy to the control plane.
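Opting a workload into this injection is done with Linkerd's `linkerd.io/inject` annotation, which can be set on a namespace or on individual pods (the `demo` namespace name here is illustrative):

```yaml
# All pods created in this namespace get the Linkerd proxy
# injected as a sidecar by the admission webhook.
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  annotations:
    linkerd.io/inject: enabled
```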
Linkerd covers all the pillars of a service mesh: traffic routing and splitting, security, and observability.
Interestingly, Linkerd does not use Envoy as its proxy. Instead, it relies on a purpose-built, lightweight proxy written in the Rust programming language. Linkerd has no built-in ingress in its stack, but it can be used with an ingress controller.
After Istio, Linkerd is one of the most popular service mesh platforms. Its lightweight footprint and ease of use attract the attention of developers and operators.
Maesh
Maesh comes from Containous, the company behind the popular Traefik ingress. Similar to Kong, Inc., Containous built Maesh to complement Traefik: Maesh handles east-west traffic between microservices, while Traefik drives north-south traffic. Like Kuma, Maesh can also be used with other ingress controllers.
Compared with other service mesh platforms, Maesh takes a different approach: it does not use the sidecar pattern to manipulate traffic. Instead, it deploys a pod on each Kubernetes node to provide well-defined service endpoints. Microservices continue to work even with Maesh deployed; only when they use the alternative endpoints exposed by Maesh do they take advantage of the service mesh capabilities.
Maesh aims to provide a non-invasive, opt-in infrastructure that gives developers a choice. But this also means the platform lacks some key features, such as transparent TLS encryption.
Maesh supports the essential features of a service mesh, including routing and observability, with the notable exception of security. It supports the latest specifications defined by the Service Mesh Interface (SMI) project.
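SMI defines mesh-agnostic resources such as `TrafficSplit` for weighted routing. A minimal sketch (assuming the `split.smi-spec.io/v1alpha2` version of the spec; the `reviews` services are illustrative):

```yaml
# Split traffic for the "reviews" root service between two
# version-specific backend services.
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: reviews-split
spec:
  service: reviews        # the root service clients call
  backends:
    - service: reviews-v1
      weight: 90
    - service: reviews-v2
      weight: 10
```

Because the resource is part of a vendor-neutral spec, the same manifest works on any SMI-conformant mesh.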
Of all the service mesh technologies I have deployed on Kubernetes, Maesh is the simplest and fastest platform to set up.