Usage analysis of KBucket in libp2p-rs

Time: 2021-09-10

The Kad protocol has now been implemented on the main branch of the project, and the KBucket, as the component that stores nodes, is worth analyzing.

Brief introduction to KBucket

In Kad, each time a peer obtains information about a node, it stores that node in its own KBucket. Each PeerId is derived from the public key via a SHA2-256 hash and is 32 bytes long. By XOR-ing its own key with another node's, a peer obtains the common prefix length, i.e., the number of consecutive zero bits counted from the first bit. The more leading zeros, the closer the two nodes are. There can be at most 32 × 8 = 256 leading zeros, so a KBucket table holds at most 256 buckets.
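
As a concrete illustration, here is a minimal, self-contained sketch of that computation, assuming 32-byte SHA2-256 keys; common_leading_zeros is an illustrative name of ours, not a libp2p-rs API:

// XOR two 32-byte keys and count the leading zero bits of the result.
fn common_leading_zeros(a: &[u8; 32], b: &[u8; 32]) -> u32 {
    let mut zeros = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        match x ^ y {
            0 => zeros += 8,                        // bytes identical, keep counting
            d => return zeros + d.leading_zeros(),  // u8::leading_zeros counts within 8 bits
        }
    }
    zeros // 256 only when the two keys are identical, i.e. the same peer
}

fn main() {
    let local = [0u8; 32];
    let mut remote = [0u8; 32];
    remote[1] = 0b0001_0000; // the first differing bit is bit 11
    assert_eq!(common_leading_zeros(&local, &remote), 11);
    println!("common prefix length = {}", common_leading_zeros(&local, &remote));
}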

KBucket structure

In rust-libp2p, each KBucket internally maintains an ArrayVec of Node entries with a capacity of 20, where the Node struct uses its key to store the PeerId and its value to store address information. first_connected_pos serves as the bucket's connection marker, recording the indices of nodes that may be evicted; the bucket also provides a pending attribute, applied via apply_pending, that holds a node waiting to be inserted.
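
For reference, below is a simplified sketch of that layout. The field names follow rust-libp2p's kbucket module, but the types are abridged for illustration, and it assumes the arrayvec 0.5 crate that rust-libp2p uses:

use arrayvec::ArrayVec; // assumes arrayvec 0.5, as used by rust-libp2p
use std::time::Duration;

/// One routing-table entry: a key (derived from the PeerId) plus a value
/// (in rust-libp2p, the address information).
struct Node<TKey, TVal> {
    key: TKey,
    value: TVal,
}

/// Abridged bucket layout; upstream wraps `pending` in a richer PendingNode.
struct KBucket<TKey, TVal> {
    /// Fixed-capacity storage for at most K = 20 nodes.
    nodes: ArrayVec<[Node<TKey, TVal>; 20]>,
    /// Position of the first connected node; the disconnected nodes in
    /// front of it are the candidates for eviction.
    first_connected_pos: Option<usize>,
    /// A node waiting to be inserted once a slot frees up,
    /// applied via `apply_pending`.
    pending: Option<Node<TKey, TVal>>,
    /// How long a pending node waits before it may be applied.
    pending_timeout: Duration,
}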

Unlike rust-libp2p, libp2p-rs takes a new approach. Because a PeerStore already exists, there is no need to store a peer's address information in the KBucket. Instead, connection-related information about the peer, such as the time of the last contact, can be stored in the node as a new value type. Another advantage of this design is that the pending and first_connected_pos attributes are no longer needed: each peer maintains its own connection-state information, and when a KBucket needs to be cleaned up, a filter can be applied directly. Based on the above, we designed a struct, PeerInfo, as the new value type.

PeerInfo records three fields:

/// The information of a peer in Kad routing table.
#[derive(Clone, Debug)]
pub struct PeerInfo {
    /// The time instant at which we last talked to the remote peer.
    /// Set to `Some` if the peer is deemed to be alive; otherwise,
    /// it is set to `None`.
    aliveness: Option<Instant>,

    /// The time this peer was added to the routing table.
    added_at: Instant,

    /// reserved for future use?
    replaceable: bool,
}

aliveness records the time of the last communication, added_at records when the node was added to the routing table, and replaceable marks whether the entry may be replaced (not enabled at present). With these fields, it is easier to judge a peer's status when adding to or cleaning up a KBucket.
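
To make the code below easier to follow, here is a plausible sketch of the constructor and accessors that try_add_peer() relies on; the actual libp2p-rs implementation may differ in detail:

use std::time::Instant;

impl PeerInfo {
    /// Creates a PeerInfo; a peer that just answered one of our queries
    /// is considered alive right away (an assumption of this sketch).
    pub fn new(queried: bool) -> Self {
        let now = Instant::now();
        PeerInfo {
            aliveness: if queried { Some(now) } else { None },
            added_at: now,
            replaceable: false, // reserved, not used yet
        }
    }

    /// Updates the last time we heard from this peer.
    pub fn set_aliveness(&mut self, aliveness: Option<Instant>) {
        self.aliveness = aliveness;
    }

    /// Returns the last time we heard from this peer, if any.
    pub fn get_aliveness(&self) -> Option<Instant> {
        self.aliveness
    }
}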

Code analysis

The following analyzes try_add_peer(), the method that adds a peer to the KBucketsTable:

  1. First, check whether the peer already exists in the table. If it does, and the method was invoked during an iterative query, update the peer's last communication time.
  2. If the node does not exist yet, there are two cases:

    1. If calling insert succeeds, mark the node as not GC-able in the PeerStore, so that its address information is not recycled by GC, which would otherwise force repeated iterative queries.
    2. If the insertion fails, the KBucket is full and needs to be cleaned up. First, filter and min_by are used to find the node that has been silent the longest; it is evicted from the KBucket and unpinned in the PeerStore so that it becomes GC-able again. Then the new node is inserted into the KBucket and pinned in the PeerStore so that GC skips it.

fn try_add_peer(&mut self, peer: PeerId, queried: bool) {
    let timeout = self.check_kad_peer_interval;
    let now = Instant::now();
    let key = kbucket::Key::new(peer.clone());

    log::debug!(
        "trying to add a peer: {:?} bucket-index={:?}, query={}",
        peer,
        self.kbuckets.bucket_index(&key),
        queried
    );

    match self.kbuckets.entry(&key) {
        kbucket::Entry::Present(mut entry) => {
            // already in the routing table; refresh the node's aliveness if queried is true
            if queried {
                entry.value().set_aliveness(Some(Instant::now()));
                log::debug!("{:?} updated: {:?}", peer, entry.value());
            }
        }
        kbucket::Entry::Absent(mut entry) => {
            let info = PeerInfo::new(queried);
            if entry.insert(info.clone()) {
                log::debug!("Peer added to routing table: {} {:?}", peer, info);
                // pin this peer in PeerStore to prevent GC from recycling its multiaddrs
                if let Some(s) = self.swarm.as_ref() {
                    s.pin(&peer)
                }
            } else {
                log::debug!("Bucket full, trying to replace an old node for {}", peer);
                // try replacing an 'old' peer: consider only peers that have been silent
                // longer than the timeout (or were never confirmed alive), and among
                // those pick the one with the oldest aliveness
                let bucket = entry.bucket();
                let candidate = bucket
                    .iter()
                    .filter(|n| n.value.get_aliveness().map_or(true, |a| now.duration_since(a) > timeout))
                    .min_by(|x, y| x.value.get_aliveness().cmp(&y.value.get_aliveness()));

                if let Some(candidate) = candidate {
                    let key = candidate.key.clone();
                    let evicted = bucket.remove(&key);
                    log::debug!("Bucket full. Peer node added, {} replacing {:?}", peer, evicted);
                    // unpin the evicted peer so that PeerStore GC may reclaim its addresses
                    if let Some(s) = self.swarm.as_ref() {
                        s.unpin(key.preimage())
                    }
                    // now try to insert the value again
                    let _ = entry.insert(info);
                    // pin this peer in PeerStore to prevent GC from recycling its multiaddrs
                    if let Some(s) = self.swarm.as_ref() {
                        s.pin(&peer)
                    }
                } else {
                    log::debug!("Bucket full, but can't find a replaceable node, giving up on {}", peer);
                }
            }
        }
        _ => {}
    }
}
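
One subtlety in the eviction choice above is the ordering of Option<Instant>: None compares less than any Some, so min_by prefers a peer that was never confirmed alive and otherwise picks the one silent the longest. The standalone example below, built on hypothetical sample data rather than library code, demonstrates this:

use std::time::{Duration, Instant};

fn main() {
    let start = Instant::now();
    let now = start + Duration::from_secs(1200); // pretend 20 minutes have passed
    let timeout = Duration::from_secs(600);

    // (peer name, last aliveness) -- hypothetical sample data
    let peers: Vec<(&str, Option<Instant>)> = vec![
        ("A", Some(start)),                             // last heard 1200 s "ago": stale
        ("B", Some(start + Duration::from_secs(1140))), // heard 60 s "ago": kept by the filter
        ("C", None),                                    // never confirmed alive
    ];

    let candidate = peers
        .iter()
        .filter(|(_, a)| a.map_or(true, |t| now.duration_since(t) > timeout))
        .min_by(|x, y| x.1.cmp(&y.1));

    // Prints "C": `None` sorts before `Some`, so C is evicted first.
    println!("evict: {:?}", candidate.map(|(name, _)| name));
}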

Netwarps is made up of a senior cloud computing and distributed technology development team from China, with extensive hands-on experience in the finance, power, telecommunications, and Internet industries. Netwarps currently has R&D centers in Shenzhen and Beijing, with a team of 30+, most of whom are engineers with more than 10 years of development experience, drawn from professional fields such as the Internet, finance, cloud computing, blockchain, and research institutes.
Netwarps focuses on the R&D and application of secure storage technology products. Its main products are the decentralized file system (DFS) and the decentralized computing platform (DCP). It is committed to providing distributed storage and distributed computing platforms based on decentralized network technology, featuring high availability, low power consumption, and low network overhead, making it suitable for scenarios such as the Internet of Things and the Industrial Internet.
Official account: Netwarps