A situation in which a Kudu table cannot be accessed normally

Time:2020-6-9

1 General

This article describes a situation in which a Kudu table cannot be accessed normally, how to confirm that this is the situation, and how to deal with it. The solution follows reference [1].

2 Problem description and solution

2.1 Problem description

When we access a Kudu table through the Kudu Java API or by other means (for example, Hue), the access fails. For example, a client using the Kudu Java API logs the following.
With kudu-client-1.5.0.jar, the output may look like this:

2020-01-18 19:18:35 INFO AsyncKuduClient.invalidateTabletCache:1432 Removing server 50864b767cb542098bd25299f43dccb8 from this tablet's cache f5de27bb309e48d7ad95c031c37bec7a
2020-01-18 19:18:35 ERROR Connection.exceptionCaught:418 [peer 50864b767cb542098bd25299f43dccb8] unexpected exception from downstream on [id: 0x07e116d0]
java.net.ConnectException: Connection refused: no further information: node-2/192.168.3.131:7050

Using kudu-client-1.11.1.jar may output the following:

2020-01-18 19:16:47 INFO AsyncKuduClient.invalidateTabletCache:2157 Invalidating location 50864b767cb542098bd25299f43dccb8(node-2:7050) for tablet f5de27bb309e48d7ad95c031c37bec7a: java.net.ConnectException: Connection refused: no further information: node-2/192.168.3.131:7050

2.2 Cause of the problem

In Kudu, a table consists of tablets, and each tablet has several replicas. With 3 replicas, if the tablet servers hosting 2 of them are unavailable (TS unavailable) and one of those 2 was the leader, the tablet, and therefore the table, cannot be accessed normally.
Reason (personal understanding): Kudu elects tablet leaders with the Raft consensus algorithm [2][3]. With n = 3 members, a candidate needs a majority of n / 2 + 1 = 2 votes (integer division) to become leader. In the election among the tablet replicas above, the one remaining replica cannot become leader: the tablet servers of the other two replicas are unavailable, so it can collect only its own vote instead of two. Without a leader, consensus cannot be reached, and the tablet cannot serve external reads and writes.
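
The majority rule described above can be checked with a few lines of Python. This is only an illustrative sketch of the arithmetic, not Kudu code; the function names `majority` and `can_elect_leader` are made up for this example.

```python
def majority(n_voters: int) -> int:
    """Votes a candidate needs to win an election among n_voters replicas."""
    return n_voters // 2 + 1


def can_elect_leader(n_voters: int, n_reachable: int) -> bool:
    """True if the reachable replicas alone can form a majority."""
    return n_reachable >= majority(n_voters)


# (voters in the configuration, replicas that are still reachable)
for n_voters, n_reachable in [(3, 3), (3, 2), (3, 1)]:
    print(
        f"voters={n_voters} reachable={n_reachable} "
        f"needed={majority(n_voters)} "
        f"leader possible={can_elect_leader(n_voters, n_reachable)}"
    )
```

With 3 voters, losing one replica still leaves a majority of 2, while losing two leaves a single replica that can never reach 2 votes.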

2.3 Problem diagnosis

To diagnose this, check the replica status of the table's tablets. Kudu provides the command-line tool ksck for this purpose [4][5].

Example of ksck usage: `kudu cluster ksck node-1:7051,node-2:7051,node-3:7051 -tables=impala::qianyi2.cat`

The output is as follows.

# kudu cluster ksck node-1:7051,node-2:7051,node-3:7051 -tables=impala::qianyi2.cat
...
Tablet f5de27bb309e48d7ad95c031c37bec7a of table 'impala::qianyi2.cat' is unavailable: 2 replica(s) not RUNNING
  30df8cd3a6064d02a502118575138b76 (node-4:7050): RUNNING
  50864b767cb542098bd25299f43dccb8 (node-2:7050): TS unavailable [LEADER]
  60469325c50d4c388f93048c0c909a66 (node-1:7050): TS unavailable

Table impala::qianyi2.cat has 1 unavailable tablet(s)

Table Summary
        Name         |   Status    | Total Tablets | Healthy | Under-replicated | Unavailable
---------------------+-------------+---------------+---------+------------------+-------------
 impala::qianyi2.cat | UNAVAILABLE | 1             | 0       | 0                | 1
==================
Errors:
==================
error fetching info from tablet servers: Network error: Not all Tablet Servers are reachable
table consistency check error: Corruption: 1 out of 1 table(s) are bad

The ksck output shows that, for the tablet with id f5de27bb309e48d7ad95c031c37bec7a, the tablet servers hosting 2 of its 3 replicas (node-2:7050 and node-1:7050) are unavailable, and one of them (node-2:7050) hosted the leader.
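
Reading this diagnosis out of the ksck text can also be done by a small script. The following is a minimal Python sketch. The regular expressions are assumptions fitted to the sample output above (ksck formatting differs between Kudu versions, so they may need adjusting), and the names `Replica`, `Tablet`, and `parse_ksck` are illustrative only.

```python
import re
from dataclasses import dataclass, field

# Patterns fitted to the sample ksck output in this article (an assumption).
TABLET_RE = re.compile(r"^Tablet ([0-9a-f]{32}) of table '([^']+)' is ([\w-]+):")
REPLICA_RE = re.compile(r"^\s+([0-9a-f]{32}) \(([^)]+)\): (.+?)(\s+\[LEADER\])?\s*$")

SAMPLE = """\
Tablet f5de27bb309e48d7ad95c031c37bec7a of table 'impala::qianyi2.cat' is unavailable: 2 replica(s) not RUNNING
  30df8cd3a6064d02a502118575138b76 (node-4:7050): RUNNING
  50864b767cb542098bd25299f43dccb8 (node-2:7050): TS unavailable [LEADER]
  60469325c50d4c388f93048c0c909a66 (node-1:7050): TS unavailable

Table impala::qianyi2.cat has 1 unavailable tablet(s)
"""


@dataclass
class Replica:
    uuid: str
    address: str
    state: str
    is_leader: bool


@dataclass
class Tablet:
    tablet_id: str
    table: str
    status: str
    replicas: list = field(default_factory=list)


def parse_ksck(text: str) -> list:
    """Collect the tablets that ksck reports, with their replica lines."""
    tablets = []
    current = None
    for line in text.splitlines():
        header = TABLET_RE.match(line)
        if header:
            current = Tablet(*header.groups())
            tablets.append(current)
            continue
        replica = REPLICA_RE.match(line)
        if replica is None:
            current = None  # any other line ends the current tablet block
        elif current is not None:
            uuid, address, state, leader = replica.groups()
            current.replicas.append(Replica(uuid, address, state, leader is not None))
    return tablets


for t in parse_ksck(SAMPLE):
    running = [r for r in t.replicas if r.state == "RUNNING"]
    needed = len(t.replicas) // 2 + 1
    print(f"{t.tablet_id} ({t.table}): {len(running)}/{len(t.replicas)} "
          f"replicas RUNNING, {needed} needed for a majority")
```

A tablet whose count of RUNNING replicas is below the majority is in the state this article describes.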

2.4 Problem solving

How should this situation be handled? The first priority is to restore the tablet servers. Check whether the tablet server service on the affected nodes has been shut down. If so, start it, then run ksck again to check the availability of the tablet replicas.
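
One quick way to see whether a tablet server process is listening at all is to try a TCP connection to its RPC port (7050 in the logs above). The sketch below uses only the Python standard library; `is_port_open` is a hypothetical helper, and the host names are the ones from this article. An open port does not prove the tablet server is healthy, so ksck remains the authoritative check.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Tablet servers reported as "TS unavailable" in the ksck output.
    for host in ("node-1", "node-2"):
        state = "listening" if is_port_open(host, 7050) else "not reachable"
        print(f"{host}:7050 is {state}")
```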
If the tablet servers cannot be recovered, consider the following method. It may lose the most recent modifications to the table, so it should not be the first choice. See reference [6] for details.
The following is an excerpt from that reference.

On each tablet server with a healthy replica, alter the consensus configuration to remove unhealthy replicas. In the typical case of 1 out of 3 surviving replicas, there will be only one healthy replica, so the consensus configuration will be rewritten to include only the healthy replica.
Once the healthy replicas’ consensus configurations have been forced to exclude the unhealthy replicas, the healthy replicas will be able to elect a leader. The tablet will become available for writes, though it will still be under-replicated. Shortly after the tablet becomes available, the leader master will notice that it is under-replicated, and will cause the tablet to re-replicate until the proper replication factor is restored. The unhealthy replicas will be tombstoned by the master, causing their remaining data to be deleted. [6]

In other words: on each tablet server that holds a healthy replica, change the consensus configuration to remove the unhealthy replicas. In the typical case where 1 of 3 replicas survives, the configuration is rewritten to contain only that healthy replica.
Once the unhealthy replicas have been forced out of the configuration, the healthy replica can elect a leader. The tablet becomes writable again even though it has fewer than 3 replicas. Soon afterwards the leader master notices the under-replication and re-replicates the tablet until the replication factor is restored. The unhealthy replicas are tombstoned by the master, and their remaining data is deleted.

Why a leader can be elected after this operation (personal understanding): in the new consensus configuration of the healthy replica, the number of members is n = 1 (the two unhealthy replicas are excluded), so the majority is n / 2 + 1 = 1 vote. The healthy replica votes for itself, obtains that one vote, and becomes leader. A leader is thus elected normally, consensus is reached, and the tablet of the table can be accessed again.
Use the command-line tool provided by Kudu, kudu remote_replica unsafe_change_config [7]:

kudu remote_replica unsafe_change_config <tserver_address> <tablet_id> <peer uuids>… 

In this case, the healthy replica gives tserver_address = node-4:7050, tablet_id = f5de27bb309e48d7ad95c031c37bec7a, and peer uuids = 30df8cd3a6064d02a502118575138b76 (the UUID of the healthy tablet server). The command to execute is as follows.

sudo -u kudu kudu remote_replica unsafe_change_config node-4:7050 f5de27bb309e48d7ad95c031c37bec7a 30df8cd3a6064d02a502118575138b76

3 Others

3.1 Batch processing

If multiple tables (and therefore multiple tablets) cannot be accessed, a program can parse the ksck output, generate the commands that modify the consensus configuration, and then run those commands in batch.
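
A minimal Python sketch of such a program is given here. It only prints the commands and does not execute them, so each one can be reviewed first, because unsafe_change_config may lose recent writes. The regular expressions are assumptions fitted to the ksck sample in section 2.3, the function names are illustrative, and the `sudo -u kudu` prefix is copied from the command in section 2.4.

```python
import re

# Patterns fitted to the ksck sample in this article (an assumption).
TABLET_RE = re.compile(r"^Tablet ([0-9a-f]{32}) of table '[^']+' is unavailable:")
REPLICA_RE = re.compile(r"^\s+([0-9a-f]{32}) \(([^)]+)\): (.*)$")

SAMPLE = """\
Tablet f5de27bb309e48d7ad95c031c37bec7a of table 'impala::qianyi2.cat' is unavailable: 2 replica(s) not RUNNING
  30df8cd3a6064d02a502118575138b76 (node-4:7050): RUNNING
  50864b767cb542098bd25299f43dccb8 (node-2:7050): TS unavailable [LEADER]
  60469325c50d4c388f93048c0c909a66 (node-1:7050): TS unavailable

Table impala::qianyi2.cat has 1 unavailable tablet(s)
"""


def healthy_replicas_by_tablet(ksck_text: str) -> dict:
    """Map each unavailable tablet id to its RUNNING (uuid, address) pairs."""
    result = {}
    current = None
    for line in ksck_text.splitlines():
        header = TABLET_RE.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            continue
        replica = REPLICA_RE.match(line)
        if replica is None:
            current = None  # any other line ends the current tablet block
        elif current is not None and replica.group(3).startswith("RUNNING"):
            result[current].append((replica.group(1), replica.group(2)))
    return result


def generate_commands(ksck_text: str) -> list:
    """Build one unsafe_change_config command per healthy replica."""
    commands = []
    for tablet_id, healthy in healthy_replicas_by_tablet(ksck_text).items():
        if not healthy:
            continue  # no surviving replica: this method cannot help
        uuids = " ".join(uuid for uuid, _ in healthy)
        for _, address in healthy:
            commands.append(
                "sudo -u kudu kudu remote_replica unsafe_change_config "
                f"{address} {tablet_id} {uuids}"
            )
    return commands


for command in generate_commands(SAMPLE):
    print(command)
```

For the sample output, the sketch produces the same single command as the one shown in section 2.4.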

3.2 Tablet movement when a tablet server is shut down

This part is a supplement to the content of this article.

Tablet replicas are not tied to a UUID. Kudu doesn’t do tablet re-balancing at runtime, so new tablet server will get tablets the next time a node dies or if you create new tables. [8]
When a tablet server has been shut down for longer than a certain period (5 minutes by default), the tablets on it are re-replicated to other tablet servers [9]. If a tablet server is shut down for a long time and then started again, it may hold few tablets while the other tablet servers hold more, so the tablet distribution is unbalanced.

3.3 Other notes

Passages that reflect my personal understanding are marked as such. If any of the views are unreasonable, corrections are welcome.

4 References

  1. https://www.cnblogs.com/barne… ([original] kudu of big data base (2) remove dead tsever Mr. craftsman blog Park)
  2. https://raft.github.io/ (Raft Consensus Algorithm)
  3. https://www.jianshu.com/p/4c8… (consensus algorithm – raft protocol – short form)
  4. https://kudu.apache.org/relea…
  5. https://docs.cloudera.com/run…
  6. https://kudu.apache.org/docs/…
  7. https://kudu.apache.org/relea…
  8. https://mail-archives.apache…. (Re: How to reuse tablet server UUID, or removing old one)
  9. https://docs.cloudera.com/run… (Decommissioning or permanently removing a tablet server from a cluster)