Thoroughly understand the etcd series (3): etcd cluster deployment and operations

Date: 2020-8-3

0 Series overview

etcd is an important foundational component in cloud-native architecture, incubated and hosted by the CNCF. It serves not only for service registration and discovery, but also as a key-value storage middleware in microservice and Kubernetes clusters.

The "Thoroughly understand etcd" series introduces etcd from several angles: basic usage in practice, the API, implementation principles, source code analysis, and lessons learned from running etcd in production. Around 20 articles are planned, and the author will continue to publish weekly.

1 etcd cluster deployment

In production, etcd is normally deployed as a cluster for high availability, avoiding a single point of failure. This section describes how to deploy an etcd cluster. There are three mechanisms for bootstrapping one:

  • Static configuration
  • etcd dynamic discovery
  • DNS discovery

Static startup requires that every member knows every other member of the cluster in advance. In many cases, the IPs of the cluster members are not known ahead of time; in those cases, the etcd cluster can be bootstrapped with the help of a discovery service.

We will introduce these methods in turn.

2 Starting the etcd cluster in static mode

Single-machine installation

If you want to practice building an etcd cluster on a single machine, you can use the goreman tool.

Goreman is a multi-process management tool written in Go. It is a rewrite of foreman, which is widely used in the Ruby world (foreman's original author also implemented a Go version, forego, but it is not as easy to use as goreman).

First make sure Go is installed, then run:

go get github.com/mattn/goreman

The compiled binary is placed in $GOPATH/bin. Since $GOPATH/bin has already been added to the system $PATH, we can run the goreman command directly. Next we write the Procfile script. We will start three etcd instances, as follows:

Hostname | IP | Client interaction port | Peer communication port
:-: | :-: | :-: | :-:
infra1 | 127.0.0.1 | 12379 | 12380
infra2 | 127.0.0.1 | 22379 | 22380
infra3 | 127.0.0.1 | 32379 | 32380

The procfile script is as follows:

etcd1: etcd --name infra1 --listen-client-urls http://127.0.0.1:12379 --advertise-client-urls http://127.0.0.1:12379 --listen-peer-urls http://127.0.0.1:12380 --initial-advertise-peer-urls http://127.0.0.1:12380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof --logger=zap --log-outputs=stderr
etcd2: etcd --name infra2 --listen-client-urls http://127.0.0.1:22379 --advertise-client-urls http://127.0.0.1:22379 --listen-peer-urls http://127.0.0.1:22380 --initial-advertise-peer-urls http://127.0.0.1:22380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof --logger=zap --log-outputs=stderr
etcd3: etcd --name infra3 --listen-client-urls http://127.0.0.1:32379 --advertise-client-urls http://127.0.0.1:32379 --listen-peer-urls http://127.0.0.1:32380 --initial-advertise-peer-urls http://127.0.0.1:32380 --initial-cluster-token etcd-cluster-1 --initial-cluster 'infra1=http://127.0.0.1:12380,infra2=http://127.0.0.1:22380,infra3=http://127.0.0.1:32380' --initial-cluster-state new --enable-pprof --logger=zap --log-outputs=stderr

Configuration Item Description:

  • --name: the node's name within the etcd cluster; it must be unique.
  • --listen-peer-urls: the URLs the node listens on for peer traffic; multiple URLs may be listened on. Cluster data exchange (elections, data replication, and so on) happens over these URLs.
  • --initial-advertise-peer-urls: the peer URLs advertised to the rest of the cluster; other nodes use these addresses to communicate with this node.
  • --listen-client-urls: the URLs the node listens on for client traffic; multiple URLs may also be listened on.
  • --advertise-client-urls: the client URLs advertised to clients, used by etcd proxies or etcd members to communicate with this node.
  • --initial-cluster-token: etcd-cluster-1, the cluster's token. With this set, the cluster generates a unique ID for itself and a unique ID for each node. Clusters started from the same configuration file will not interfere with each other as long as their tokens differ.
  • --initial-cluster: the set of all initial-advertise-peer-urls in the cluster.
  • --initial-cluster-state: new, marking this as a new cluster.

Note that in the script above, the etcd command path must match your actual local installation. Let's start the etcd cluster:

goreman -f /opt/procfile start

The above command starts the etcd cluster. After startup completes, check the cluster's members:

$ etcdctl --endpoints=http://localhost:22379 member list

8211f1d0f64f3269, started, infra1, http://127.0.0.1:12380, http://127.0.0.1:12379, false
91bc3c398fb3c146, started, infra2, http://127.0.0.1:22380, http://127.0.0.1:22379, false
fd422379fda50e48, started, infra3, http://127.0.0.1:32380, http://127.0.0.1:32379, false
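As a quick smoke test (an illustrative addition, assuming an etcdctl with the v3 API on the PATH), we can write a key through one member and read it back through another; the value replicates across the cluster:

$ etcdctl --endpoints=http://localhost:12379 put hello world
OK
$ etcdctl --endpoints=http://localhost:32379 get hello
hello
world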

We have successfully set up a pseudo-cluster on a single machine. Note that when the cluster started, we specified its members statically. In a real environment, the IP addresses of the cluster members may not be known in advance; in that case we need a dynamic discovery mechanism.

Starting the cluster with Docker

etcd uses gcr.io/etcd-development/etcd as its primary container registry and quay.io/coreos/etcd as a secondary one. Unfortunately, these two registries may be unreachable from some networks. If the images cannot be downloaded, use the image the author provides instead:

docker pull bitnami/etcd:3.4.7

Then, retag the pulled image:

docker image tag bitnami/etcd:3.4.7 quay.io/coreos/etcd:3.4.7
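To confirm the retag took effect, you can list the local image (a quick check; the exact image ID and size will differ on your machine):

docker image ls quay.io/coreos/etcd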

With the image in place, we start a three-node etcd cluster. The script is as follows:

REGISTRY=quay.io/coreos/etcd

# For each machine
ETCD_VERSION=3.4.7
TOKEN=my-etcd-token
CLUSTER_STATE=new
NAME_1=etcd-node-0
NAME_2=etcd-node-1
NAME_3=etcd-node-2
HOST_1=192.168.202.128
HOST_2=192.168.202.129
HOST_3=192.168.202.130
CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
DATA_DIR=/var/lib/etcd

# For node 1
THIS_NAME=${NAME_1}
THIS_IP=${HOST_1}
docker run \
  -p 2379:2379 \
  -p 2380:2380 \
  --volume=${DATA_DIR}:/etcd-data \
  --name etcd ${REGISTRY}:${ETCD_VERSION} \
  /usr/local/bin/etcd \
  --data-dir=/etcd-data --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://0.0.0.0:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://0.0.0.0:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}

# For node 2
THIS_NAME=${NAME_2}
THIS_IP=${HOST_2}
docker run \
  -p 2379:2379 \
  -p 2380:2380 \
  --volume=${DATA_DIR}:/etcd-data \
  --name etcd ${REGISTRY}:${ETCD_VERSION} \
  /usr/local/bin/etcd \
  --data-dir=/etcd-data --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://0.0.0.0:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://0.0.0.0:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}

# For node 3
THIS_NAME=${NAME_3}
THIS_IP=${HOST_3}
docker run \
  -p 2379:2379 \
  -p 2380:2380 \
  --volume=${DATA_DIR}:/etcd-data \
  --name etcd ${REGISTRY}:${ETCD_VERSION} \
  /usr/local/bin/etcd \
  --data-dir=/etcd-data --name ${THIS_NAME} \
  --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://0.0.0.0:2380 \
  --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://0.0.0.0:2379 \
  --initial-cluster ${CLUSTER} \
  --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN}

Note that this script is meant to be split across three machines, with each machine executing its own node's section. At run time, you can specify the etcdctl API version:

docker exec etcd /bin/sh -c "export ETCDCTL_API=3 && /usr/local/bin/etcdctl put foo bar"
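For example, a cluster-wide health check can be run the same way; this sketch assumes the three host IPs from the script above are reachable from inside the container:

docker exec etcd /bin/sh -c "export ETCDCTL_API=3 && /usr/local/bin/etcdctl --endpoints=http://192.168.202.128:2379,http://192.168.202.129:2379,http://192.168.202.130:2379 endpoint health"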

The Docker installation method is relatively simple, and readers can customize the configuration according to their needs.

3 Starting the etcd cluster with dynamic discovery

As mentioned earlier, in a real environment the IPs of cluster members may not be known in advance. In that case, the etcd cluster is bootstrapped with auto-discovery rather than a static configuration, a process called discovery. We again start three etcd instances, as follows:

Hostname | IP | Client interaction port | Peer communication port
:-: | :-: | :-: | :-:
etcd1 | 192.168.202.128 | 2379 | 2380
etcd2 | 192.168.202.129 | 2379 | 2380
etcd3 | 192.168.202.130 | 2379 | 2380

How the discovery protocol works

The discovery service protocol helps new etcd members discover all the other members during the cluster bootstrap phase, using a shared discovery URL.

The protocol uses a new discovery token to bootstrap one unique etcd cluster: a discovery token represents exactly one cluster. Once the discovery protocol has been started with a token, that token cannot be reused to bootstrap another cluster, even if the first attempt failed partway.

Tip: the discovery service protocol is only used during the cluster bootstrap phase; it cannot be used for runtime reconfiguration or cluster monitoring.

The discovery protocol uses an internal etcd cluster to coordinate the bootstrap of the new cluster. First, all new members interact with the discovery service to help generate the expected member list. Then each new member bootstraps its server using this list, which performs the same function as the --initial-cluster flag: setting the membership information of the whole cluster.

Obtain a discovery token

Generate a unique token that identifies the new cluster. In the following steps it serves as the unique prefix in the discovery key space. A simple way is to use uuidgen:

UUID=$(uuidgen)

Specify the cluster size

The cluster size must be registered when obtaining a token. The discovery service uses the size to know when all the members that will initially make up the cluster have been discovered.

curl -X PUT http://10.0.10.10:2379/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83/_config/size -d value=3
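If you generated your own token with uuidgen above, the equivalent registration looks like this (a sketch; 10.0.10.10:2379 stands for the existing helper etcd cluster):

UUID=$(uuidgen)
curl -X PUT "http://10.0.10.10:2379/v2/keys/discovery/${UUID}/_config/size" -d value=3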

We then pass the URL http://10.0.10.10:2379/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83 as the --discovery parameter when starting etcd.

Nodes automatically use the http://10.0.10.10:2379/v2/keys/discovery/6c007a14875d53d9bf0ef5a6fc0257c817f0fb83 directory for etcd registration and discovery.

Public discovery service

When no etcd cluster is available locally, the etcd project provides a publicly accessible discovery service. We can obtain a discovery URL with the following command and use it as the --discovery parameter.

The public discovery service at discovery.etcd.io works the same way, but adds a layer of sugar: it hides the ugly URLs, generates the UUID automatically, and guards against excessive requests. Underneath, it still uses an etcd cluster as its data store.

$ curl http://discovery.etcd.io/new?size=3
http://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de

Start cluster in dynamic discovery mode

In etcd discovery mode, the command to start etcd is as follows:

# etcd1 startup
$ /opt/etcd/bin/etcd --name etcd1 --initial-advertise-peer-urls http://192.168.202.128:2380 \
  --listen-peer-urls http://192.168.202.128:2380 \
  --data-dir /opt/etcd/data \
  --listen-client-urls http://192.168.202.128:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://192.168.202.128:2379 \
  --discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de

# etcd2 startup
$ /opt/etcd/bin/etcd --name etcd2 --initial-advertise-peer-urls http://192.168.202.129:2380 \
  --listen-peer-urls http://192.168.202.129:2380 \
  --data-dir /opt/etcd/data \
  --listen-client-urls http://192.168.202.129:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://192.168.202.129:2379 \
  --discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de

# etcd3 startup
$ /opt/etcd/bin/etcd --name etcd3 --initial-advertise-peer-urls http://192.168.202.130:2380 \
  --listen-peer-urls http://192.168.202.130:2380 \
  --data-dir /opt/etcd/data \
  --listen-client-urls http://192.168.202.130:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://192.168.202.130:2379 \
  --discovery https://discovery.etcd.io/3e86b59982e49066c5d813af1c2e2579cbf573de

Note that once cluster initialization is complete, this discovery information no longer has any effect. When you need to add nodes later, use etcdctl. For safety, register with a fresh discovery token each time you start a new etcd cluster. Also, if more than the specified number of nodes join during bootstrap, the extra nodes are automatically converted to proxy-mode etcd.
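As a sketch of such a runtime change (the node name etcd4 and the IP 192.168.202.131 are hypothetical), a new member is first registered with the running cluster and then started with the existing-cluster state instead of a discovery URL:

# Register the new member with the running cluster (etcd4 / 192.168.202.131 are hypothetical).
$ etcdctl --endpoints=http://192.168.202.128:2379 member add etcd4 --peer-urls=http://192.168.202.131:2380
# Then start etcd4 with --initial-cluster-state existing rather than --discovery.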

Result verification

After the cluster has started, verify it by listing the cluster members (again with member list):

The results are as follows:

40e2ac06ca1674a7, started, etcd3, http://192.168.202.130:2380, http://192.168.202.130:2379, false
c532c5cedfe84d3c, started, etcd1, http://192.168.202.128:2380, http://192.168.202.128:2379, false
db75d3022049742a, started, etcd2, http://192.168.202.129:2380, http://192.168.202.129:2379, false

The results match expectations. Next, check the health of the nodes:

$ /opt/etcd/bin/etcdctl --endpoints="http://192.168.202.128:2379,http://192.168.202.129:2379,http://192.168.202.130:2379" endpoint health

http://192.168.202.128:2379 is healthy: successfully committed proposal: took = 3.157068ms
http://192.168.202.130:2379 is healthy: successfully committed proposal: took = 3.300984ms
http://192.168.202.129:2379 is healthy: successfully committed proposal: took = 3.263923ms


As you can see, all three nodes in the cluster are healthy and normal. The cluster was started successfully in dynamic discovery mode.

 

4 DNS discovery mode

etcd also supports bootstrapping with DNS SRV records. A DNS SRV record is a resource record type in the DNS database that maps a service to the hosts that provide it.
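For reference, the SRV records created later in this article have the following shape when queried with dig; the four values after SRV are priority, weight, port, and target host:

_etcd-server._tcp.blueskykong.com. 0 IN SRV 0 100 2380 etcd1.blueskykong.com.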

Dnsmasq installation

We create the DNS service using dnsmasq. Dnsmasq provides DNS caching, a DHCP service, and a TFTP service. As a DNS server it caches DNS requests, speeding up lookups for previously visited sites. Dnsmasq is lightweight and easy to configure, suitable for individual users or networks of fewer than 50 hosts; it also ships with a PXE server.

When dnsmasq receives a DNS request, it first looks up /etc/hosts and then falls back to the external DNS servers defined in /etc/resolv.conf. By configuring dnsmasq as a caching DNS server and adding the intranet hostnames to its /etc/hosts file, every intranet machine's lookup is answered from that single hosts file. This effectively shares one /etc/hosts with all intranet machines, letting them resolve one another without editing hosts files machine by machine or adding BIND DNS records; only one hosts file needs to be maintained.

The author's host runs CentOS 7; first install dnsmasq:

yum install dnsmasq

After installation, configure it. All configuration lives in a single file, /etc/dnsmasq.conf. Alternatively, configuration files of any name can be placed under /etc/dnsmasq.d/.

Configure upstream server address

The resolv-file option points dnsmasq at additional upstream DNS servers. If it is not set, dnsmasq uses the nameservers in the Linux host's default /etc/resolv.conf.

$ vim /etc/dnsmasq.conf
# Add the following:
resolv-file=/etc/resolv.dnsmasq.conf
srv-host=_etcd-server._tcp.blueskykong.com,etcd1.blueskykong.com,2380,0,100
srv-host=_etcd-server._tcp.blueskykong.com,etcd2.blueskykong.com,2380,0,100
srv-host=_etcd-server._tcp.blueskykong.com,etcd3.blueskykong.com,2380,0,100

The dnsmasq.conf configuration now contains the SRV records for the three servers involved, corresponding to etcd1, etcd2, and etcd3.

Add the forwarding DNS addresses in the specified file

$ vim /etc/resolv.dnsmasq.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

These two free public DNS services should be familiar. Readers can adjust them according to the local network.

Enable dnsmasq resolution locally

$ vim /etc/resolv.conf
nameserver 127.0.0.1

This makes the local host itself resolve names through dnsmasq.

Add resolution record

Configure an A record for each domain name, pointing at the IP of the corresponding etcd node. There are three ways to add resolution records: the system default hosts file, a custom hosts file, or a custom conf file. Here we use the simplest, the first method.

$ vim /etc/hosts
# Add the following entries:
192.168.202.128 etcd1.blueskykong.com
192.168.202.129 etcd2.blueskykong.com
192.168.202.130 etcd3.blueskykong.com

Start service

service dnsmasq start

After the startup, we verify:

  • Query the SRV records on the DNS server; the results are as follows:

    $ dig @192.168.202.128 +noall +answer SRV _etcd-server._tcp.blueskykong.com
_etcd-server._tcp.blueskykong.com. 0 IN SRV     0 100 2380 etcd2.blueskykong.com.
_etcd-server._tcp.blueskykong.com. 0 IN SRV     0 100 2380 etcd1.blueskykong.com.
_etcd-server._tcp.blueskykong.com. 0 IN SRV     0 100 2380 etcd3.blueskykong.com.
  • Query the domain name (A record) resolution results:

  $ dig @192.168.202.128 +noall +answer etcd1.blueskykong.com etcd2.blueskykong.com etcd3.blueskykong.com

etcd1.blueskykong.com.  0       IN      A       192.168.202.128
etcd2.blueskykong.com.  0       IN      A       192.168.202.129
etcd3.blueskykong.com.  0       IN      A       192.168.202.130

At this point dnsmasq is installed and working. Next, let's start the etcd cluster based on DNS discovery.

Start cluster

After completing the two DNS configuration steps above, you can start the etcd cluster via DNS. Remove the ETCD_INITIAL_CLUSTER setting (used for static discovery) and specify the DNS SRV domain instead (ETCD_DISCOVERY_SRV); the corresponding command-line flag is --discovery-srv. The startup command for the etcd1 node is as follows:

$ /opt/etcd/bin/etcd --name etcd1 \
  --discovery-srv blueskykong.com \
  --initial-advertise-peer-urls http://etcd1.blueskykong.com:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --data-dir /opt/etcd/data \
  --initial-cluster-state new \
  --advertise-client-urls http://etcd1.blueskykong.com:2379 \
  --listen-client-urls http://0.0.0.0:2379 \
  --listen-peer-urls http://0.0.0.0:2380
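etcd2 and etcd3 are started analogously, changing only the node name and the advertised domain (shown here for etcd2 as a sketch):

$ /opt/etcd/bin/etcd --name etcd2 \
  --discovery-srv blueskykong.com \
  --initial-advertise-peer-urls http://etcd2.blueskykong.com:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --data-dir /opt/etcd/data \
  --initial-cluster-state new \
  --advertise-client-urls http://etcd2.blueskykong.com:2379 \
  --listen-client-urls http://0.0.0.0:2379 \
  --listen-peer-urls http://0.0.0.0:2380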

etcd cluster members can be advertised using either domain names or IP addresses; the started process resolves the DNS records. The address resolved from --initial-advertise-peer-urls must match one of the addresses resolved from the SRV targets; an etcd member reads the resolved addresses to determine whether it belongs to the cluster defined in the SRV records.

We verify that the DNS-based cluster started correctly by viewing the list of cluster members:

The results are as follows:

40e2ac06ca1674a7, started, etcd3, http://192.168.202.130:2380, http://etcd3.blueskykong.com:2379, false
c532c5cedfe84d3c, started, etcd1, http://192.168.202.128:2380, http://etcd1.blueskykong.com:2379, false
db75d3022049742a, started, etcd2, http://192.168.202.129:2380, http://etcd2.blueskykong.com:2379, false

As you can see, the cluster has three members, as expected. Next, we verify the status of the cluster nodes by IP address.

$ /opt/etcd/bin/etcdctl --endpoints="http://192.168.202.128:2379,http://192.168.202.129:2379,http://192.168.202.130:2379" endpoint health

http://192.168.202.129:2379 is healthy: successfully committed proposal: took = 2.933555ms
http://192.168.202.128:2379 is healthy: successfully committed proposal: took = 7.252799ms
http://192.168.202.130:2379 is healthy: successfully committed proposal: took = 7.415843ms

Readers can try out further etcd cluster operations on their own; we will not walk through them one by one here.

5 Summary

This article builds on the previous one by covering etcd cluster deployment. It introduced the ways to start an etcd cluster: static single-machine, static with Docker, dynamic discovery, and DNS discovery. All of these installation approaches serve practical use. The next article will cover the use of etcdctl in detail.

To subscribe to the latest articles, welcome to follow my official account.

Recommended reading

  1. Comparison between etcd and other key-value components such as ZooKeeper and Consul
  2. Thoroughly understand the etcd series (1): first acquaintance with etcd
  3. Thoroughly understand the etcd series (2): the various ways to install etcd
