Hand-rolling gRPC service discovery + load balancing, with a look at gRPC's load-balancing source code

Time: 2021-03-03

components

server

The gRPC server needs three components:

  • Service: serves client requests;
  • Register: registers the server's address with the registry each time it starts;
  • Unregister: removes the registration information from the registry when the server terminates.

register

func Register(client *etcd3.Client, target, key, val string, stopSignal chan os.Signal) error {
    go func() {
        ticker := time.NewTicker(1 * time.Second)
        defer ticker.Stop()
        for {
            select {
            case <-stopSignal:
                return
            case <-ticker.C:
                //Grant a short-lived lease so the key expires on its own if this instance dies without unregistering
                ttl := 10
                resp, err := client.Grant(context.Background(), int64(ttl))
                if err != nil {
                    log.Println("grant lease error ", err)
                    continue //no usable lease; retry on the next tick
                }
                _, err = client.Get(context.Background(), key)
                if err != nil {
                    if err == rpctypes.ErrKeyNotFound {
                        if _, err = client.Put(context.Background(), key, val, etcd3.WithLease(resp.ID)); err != nil {
                            log.Printf("put %+v in etcd error:%+v", val, err)
                        }
                    } else {
                        log.Printf("get from etcd error:%+v", err)
                    }
                } else {
                    //Key already exists: re-put it under the fresh lease to keep it alive
                    if _, err = client.Put(context.Background(), key, val, etcd3.WithLease(resp.ID)); err != nil {
                        log.Printf("put %+v in etcd error:%+v", val, err)
                    }
                }
            }
        }
    }()
    return nil
}

Main logic:
On every tick, the etcd client first tries to read the instance's key from etcd. If the key is not there, it writes the key/value pair: the key is "service name/address of this instance" and the value is the instance's address. For example, with lb.PREFIX = "/grpclb" and SERVICE_NAME = "greeter" (values assumed for illustration), an instance on port 50051 would be stored under /grpclb/greeter/127.0.0.1:50051 with value 127.0.0.1:50051. If the key is already there, the put simply renews it under a fresh lease, acting as a keep-alive. Note that Register runs an infinite loop in a goroutine that constantly re-asserts the current instance's registration; because the key is bound to a short TTL lease, the registration expires on its own if the instance dies without cleaning up after itself, which keeps clients from sending requests to a dead instance.

unregister

func Unregister(client *etcd3.Client, key string) error {
    //Delete the registration key so clients stop routing to this instance
    _, err := client.Delete(context.Background(), key)
    if err != nil {
        log.Printf("grpclb: unregister '%s' failed: %s", key, err.Error())
    } else {
        log.Printf("grpclb: unregister '%s' ok.", key)
    }
    return err
}

Main logic: Unregister is called when the service exits normally; it removes the registration entry from etcd so that clients stop sending requests to this instance afterwards.

service

func RunGRPCServer(grpcPort int16) {

    svcKey := fmt.Sprintf("%+v/%+v/127.0.0.1:%+v", lb.PREFIX, SERVICE_NAME, grpcPort)
    svcVal := fmt.Sprintf("127.0.0.1:%+v", grpcPort)

    //Start a grpc server
    grpcServer := grpc.NewServer()

    client, err := etcd3.New(etcd3.Config{
        Endpoints: strings.Split(ETCD_ADDR, ","),
    })
    if err != nil {
        log.Fatalf("create etcd3 client failed:%+v", err)
    }
    gs := &GreeterService{client}

    //Note: SIGKILL cannot be caught by a process, so it is not listed here
    ch := make(chan os.Signal, 1)
    signal.Notify(ch, syscall.SIGTERM, syscall.SIGINT, syscall.SIGHUP, syscall.SIGQUIT)
    go func() {
        s := <-ch
        log.Printf("receive stop signal:%+v", s)
        lb.Unregister(gs.etcdCli, svcKey)
        os.Exit(1)
    }()

    lb.Register(gs.etcdCli, ETCD_ADDR, svcKey, svcVal, ch)
    //GreeterService can only be registered with the grpc server once it implements every method of the Greeter service
    api.RegisterGreeterServer(grpcServer, gs)

    //Listen on the service port
    listen, e := net.Listen("tcp", fmt.Sprintf(":%+v", grpcPort))
    if e != nil {
        log.Fatal(e)
    }

    log.Printf("serve gRPC server: 127.0.0.1:%+v", grpcPort)
    if err := grpcServer.Serve(listen); err != nil {
        log.Printf("failed to serve: %v", err)
        return
    }
}

Main logic: start the instance, register it with the registry, and spin up a goroutine that waits for termination signals; on receiving one, it calls Unregister to remove this instance's registration before exiting.
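For completeness, a minimal entry point wiring this up might look like the following sketch; the flag name and default port are assumptions for illustration, not shown in the original post:

var port = flag.Int("port", 50051, "port for the gRPC server to listen on") //hypothetical flag

func main() {
    flag.Parse()
    RunGRPCServer(int16(*port))
}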

client

func main() {

    r := lb.NewResolver(GREETER_SERVICE)
    //r implements the naming.Resolver interface required by the balancer
    b := grpc.RoundRobin(r)

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    conn, err := grpc.DialContext(ctx, ETCD_ADDR, grpc.WithInsecure(), grpc.WithBalancer(b))
    if err != nil {
        log.Fatalf("did not connect: %v", err)
    }
    defer conn.Close()

    //Give the balancer a moment to receive the initial address list
    time.Sleep(5 * time.Second)

    c := api.NewGreeterClient(conn)

    name := "world"
    if len(os.Args) > 1 {
        name = os.Args[1]
    }
    ticker := time.NewTicker(1 * time.Second)
    for range ticker.C {
        log.Println("start to request...")
        reply, err := c.SayHello(context.Background(), &api.HelloRequest{Name: name})
        if err != nil {
            log.Fatalf("call say hello fail: %v", err)
        }
        log.Println(reply.Reply)
    }
}

To get load balancing on the client side, all it takes is adding the grpc.WithBalancer(b) dial option. Two components are involved:
r := lb.NewResolver(GREETER_SERVICE) and b := grpc.RoundRobin(r). The resolver takes the name of the service to be accessed and knows where to look up the addresses registered under that name; it is then passed in as the parameter that initializes the balancer.

resolver

type resolver struct {
    serviceName string
}

func NewResolver(serviceName string) *resolver {
    return &resolver{
        serviceName: serviceName,
    }
}

func (r *resolver) Resolve(target string) (naming.Watcher, error) {
    if r.serviceName == "" {
        log.Println("no service name provided")
        return nil, fmt.Errorf("no service name provided")
    }

    client, err := etcd3.New(etcd3.Config{
        Endpoints: strings.Split(target, ","),
    })
    if err != nil {
        return nil, fmt.Errorf("grpclb: create etcd3 client failed: %s", err.Error())
    }
    return &watcher{client, false, r.serviceName}, nil
}
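The watcher type returned here is not shown in the original post; inferring from the fields it is constructed with and how they are used below, it would look roughly like this sketch:

//Sketch of the watcher type, inferred from its usage
type watcher struct {
    cli         *etcd3.Client //etcd client used for Get and Watch
    init        bool          //whether the initial full address list has been returned
    serviceName string        //name of the service being resolved
}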

Our resolver implements the naming.Resolver interface, which has a single method, Resolve:

type Resolver interface {
    // Resolve creates a Watcher for target.
    Resolve(target string) (Watcher, error)
}

Resolve returns a watcher for the target given by the client. The watcher in turn implements the naming.Watcher interface, which has two methods, Next and Close:

type Watcher interface {
    // Next blocks until an update or error happens. It may return one or more
    // updates. The first call should get the full set of the results. It should
    // return an error if and only if Watcher cannot recover.
    Next() ([]*Update, error)
    // Close closes the Watcher.
    Close()
}

The watcher's Next method returns a slice of naming.Update values, each describing a change the watcher has observed since the previous call:

type Update struct {
    // Op indicates the operation of the update.
    Op Operation
    // Addr is the updated address. It is empty string if there is no address update.
    Addr string
    // Metadata is the updated metadata. It is nil if there is no metadata update.
    // Metadata is not required for a custom naming implementation.
    Metadata interface{}
}

The balancer uses these updates to keep the set of addresses the client can pick from correct.
How this interface is driven can be seen in the round-robin balancer in the grpc source code:

func (rr *roundRobin) Start(target string, config BalancerConfig) error {
    rr.mu.Lock()
    defer rr.mu.Unlock()
    if rr.done {
        return ErrClientConnClosing
    }
    if rr.r == nil {
        // If there is no name resolver installed, it is not needed to
        // do name resolution. In this case, target is added into rr.addrs
        // as the only address available and rr.addrCh stays nil.
        rr.addrs = append(rr.addrs, &addrInfo{addr: Address{Addr: target}})
        return nil
    }
    w, err := rr.r.Resolve(target)
    if err != nil {
        return err
    }
    rr.w = w
    rr.addrCh = make(chan []Address, 1)
    go func() {
        for {
            if err := rr.watchAddrUpdates(); err != nil {
                return
            }
        }
    }()
    return nil
}

As the source shows, the resolver's Resolve method is called first and returns a watcher; then the watcher-driven watchAddrUpdates method is called repeatedly in an infinite loop to listen for changes.

This reveals the intent behind the gRPC interface design: the resolver's target is handed over to the watcher, and the watcher uses that target to look up the concrete service addresses.

Let's look at the implementation of watchAddrUpdates in the source code:

func (rr *roundRobin) watchAddrUpdates() error {
    updates, err := rr.w.Next()
    if err != nil {
        grpclog.Warningf("grpc: the naming watcher stops working due to %v.", err)
        return err
    }
    rr.mu.Lock()
    defer rr.mu.Unlock()
    for _, update := range updates {
        addr := Address{
            Addr:     update.Addr,
            Metadata: update.Metadata,
        }
        switch update.Op {
        case naming.Add:
            var exist bool
            for _, v := range rr.addrs {
                if addr == v.addr {
                    exist = true
                    grpclog.Infoln("grpc: The name resolver wanted to add an existing address: ", addr)
                    break
                }
            }
            if exist {
                continue
            }
            rr.addrs = append(rr.addrs, &addrInfo{addr: addr})
        case naming.Delete:
            for i, v := range rr.addrs {
                if addr == v.addr {
                    copy(rr.addrs[i:], rr.addrs[i+1:])
                    rr.addrs = rr.addrs[:len(rr.addrs)-1]
                    break
                }
            }
        default:
            grpclog.Errorln("Unknown update.Op ", update.Op)
        }
    }
    // Make a copy of rr.addrs and write it onto rr.addrCh so that gRPC internals gets notified.
    open := make([]Address, len(rr.addrs))
    for i, v := range rr.addrs {
        open[i] = v.addr
    }
    if rr.done {
        return ErrClientConnClosing
    }
    select {
    case <-rr.addrCh:
    default:
    }
    rr.addrCh <- open
    return nil
}

From the source we can see that each call to the watcher's Next method returns a slice of updates, which is then traversed item by item. For an Add event, the address is compared against the balancer's cached address list; if it is already present nothing happens, otherwise it is appended to the cache. For a Delete event, the matching address is removed from the cached list. Finally, a copy of the address list is written to the addrCh channel so that the gRPC internals are notified; the select that drains addrCh first ensures the buffered channel always holds only the most recent address list.

Therefore, what we have to implement is the watcher's Next method, returning the slice of update events.

watcher

func (w *watcher) Next() ([]*naming.Update, error) {
    key := fmt.Sprintf("%s/%s", PREFIX, w.serviceName)
    baseCtx := context.TODO()
    log.Printf("%+v, call next method", w)
    //On the first call, fetch the full set of addresses under the service-name prefix and return them all as Add updates
    if !w.init {
        ctx, cancel := context.WithTimeout(baseCtx, 3*time.Second)
        defer cancel()
        resp, err := w.cli.Get(ctx, key, etcd3.WithPrefix())
        if err != nil {
            log.Println("get from etcd error ", err)
        } else {
            w.init = true
            addrs := getAddrFromResp(resp)
            if len(addrs) != 0 {
                updates := make([]*naming.Update, 0, len(addrs))
                for _, addr := range addrs {
                    updates = append(updates, &naming.Update{
                        Op:   naming.Add,
                        Addr: addr,
                    })
                }
                return updates, nil
            }
        }
    }
    //On subsequent calls, block on an etcd watch of the prefix and translate the first batch of events into naming.Updates.
    //WithPrevKV is needed because a DELETE event carries an empty Value; the previous key-value holds the address being removed.
    rch := w.cli.Watch(context.Background(), key, etcd3.WithPrefix(), etcd3.WithPrevKV())
    for wresp := range rch {
        events := wresp.Events
        log.Printf("get etcd events:%+v", events)
        updates := make([]*naming.Update, 0, len(events))
        for _, ev := range events {
            switch ev.Type {
            case mvccpb.PUT:
                updates = append(updates, &naming.Update{
                    Op:   naming.Add,
                    Addr: string(ev.Kv.Value),
                })
            case mvccpb.DELETE:
                addr := ""
                if ev.PrevKv != nil {
                    addr = string(ev.PrevKv.Value)
                }
                updates = append(updates, &naming.Update{
                    Op:   naming.Delete,
                    Addr: addr,
                })
            }
        }
        return updates, nil
    }
    return nil, nil
}

When the watcher is called for the first time, it fetches all addresses under the service-name prefix directly from etcd, wraps each of them in an update, and returns.
After that, it watches the same prefix key and reads change events from the watch channel: a PUT event indicates a newly registered service instance and is added to the update slice with type Add; a DELETE event is likewise added with type Delete.
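naming.Watcher also requires a Close method, which the post does not show. A minimal sketch that releases the etcd client held by the watcher:

//Close releases the etcd client owned by the watcher
func (w *watcher) Close() {
    w.cli.Close()
}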

Extracting the addresses from the etcd response is implemented as follows:

func getAddrFromResp(resp *etcd3.GetResponse) []string {
    results := resp.Kvs
    addrs := make([]string, 0, len(results))
    for _, r := range results {
        if string(r.Value) != "" {
            addrs = append(addrs, string(r.Value))
        }
    }
    return addrs
}

demonstration

So far, we have implemented all the interfaces gRPC's load balancing requires. After starting a few instances, we can observe that requests are evenly distributed across them, and that a newly started server begins receiving requests quickly.

summary

deficiencies:

  1. The gRPC version used in this experiment is v1.26.0; the grpc.RoundRobin interface has been deprecated in newer versions of gRPC:
// RoundRobin returns a Balancer that selects addresses round-robin. It uses r to watch
// the name resolution updates and updates the addresses available correspondingly.
//
// Deprecated: please use package balancer/roundrobin. May be removed in a future 1.x release.
func RoundRobin(r naming.Resolver) Balancer {
    return &roundRobin{r: r}
}

When there is time, this could be reimplemented on top of the newer balancer/roundrobin package (see the first sketch after this list).

  2. Once Register has written the key under the service-name prefix, it could call etcd v3's KeepAlive API to renew the lease directly instead of re-putting the key on every tick (see the second sketch after this list).
  3. The watcher returns as soon as it reads one batch from the etcd watch channel, and opens a new watch on every call to Next; if the channel is closed in between, events can be lost. A background goroutine could own a single long-lived watch and push updates into an internal channel that Next reads from, avoiding the repeated opening and closing of the etcd watch stream.
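On point 1, a sketch of what the client side could look like with the newer API. This assumes a resolver.Builder has been registered for a hypothetical "etcd" scheme; it is illustrative, not the post's implementation:

import (
    "google.golang.org/grpc"
    _ "google.golang.org/grpc/balancer/roundrobin" //registers the "round_robin" policy
)

//dial connects through the custom resolver and the built-in round_robin balancer
func dial() (*grpc.ClientConn, error) {
    return grpc.Dial(
        "etcd:///greeter", //scheme:///service-name, resolved by the registered resolver.Builder
        grpc.WithInsecure(),
        grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy":"round_robin"}`),
    )
}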
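On point 2, a sketch of the lease-plus-KeepAlive approach, reusing the same etcd3 client alias as the code above (illustrative, not the post's code):

//RegisterWithKeepAlive puts the key once under a lease and lets etcd's
//KeepAlive renew the lease in the background, instead of re-putting on a ticker
func RegisterWithKeepAlive(client *etcd3.Client, key, val string) error {
    resp, err := client.Grant(context.Background(), 10)
    if err != nil {
        return err
    }
    if _, err := client.Put(context.Background(), key, val, etcd3.WithLease(resp.ID)); err != nil {
        return err
    }
    ch, err := client.KeepAlive(context.Background(), resp.ID)
    if err != nil {
        return err
    }
    go func() {
        for range ch { //each response confirms a successful renewal
        }
    }()
    return nil
}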

Reference article

https://segmentfault.com/a/11…