Client resiliency patterns in Spring Cloud with Netflix Hystrix


I. Why client resiliency patterns are needed

Every system encounters failures, and a distributed system is more likely to suffer the failure of an individual component. Building applications that cope with failure is a key part of every software developer's work. However, when building a system, most engineers only plan for the complete failure of infrastructure or a key service, using techniques such as clustering key servers, load balancing between services, and deploying to multiple locations. These techniques do handle the complete failure of a component, but they address only a small part of the problem of building resilient systems. When a service crashes, the failure is easy to detect and the application can route around it. When a service is merely running slowly, detecting the degradation and routing around it is much harder, for the following reasons:

  • Service degradation can start as intermittent failures and then build momentum. At first only a small number of calls slow down, until suddenly the application container runs out of threads (all of them are waiting for calls to complete) and crashes completely.
  • Applications are usually designed to handle the complete failure of a remote resource, not partial degradation. As long as the service has not died completely, the application usually keeps calling it until it exhausts its own resources and crashes.

Poorly performing remote services can cause many problems. They are not only hard to detect but can also trigger a chain reaction that affects the entire application ecosystem. Without proper protection, a single slow service can quickly bring down the whole application. Cloud-based, microservice-based applications are particularly vulnerable to this kind of outage, because they are composed of a large number of fine-grained distributed services, and completing a single user transaction involves different pieces of infrastructure.

II. What are client resiliency patterns

Client resiliency patterns protect the client of a remote resource (another microservice call or a database query) from crashing when the remote resource fails or performs poorly. The goal is to let the client “fail fast”, so that it does not tie up resources such as database connections and thread pools, and to keep the remote service's problem from spreading to the client's own consumers and causing an “avalanche” effect. Four client resiliency patterns are mainly used with Spring Cloud:

Client-side load balancing pattern

Circuit breaker pattern

This pattern is modeled on an electrical circuit breaker. When a remote service is called, the software circuit breaker monitors the call. If the call takes too long, the circuit breaker intervenes and interrupts it. In addition, if the number of failed calls to a remote resource reaches a threshold, the circuit breaker fails fast and prevents further calls to the failing resource.
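The threshold behaviour described above can be illustrated without any library. The following is only a rough sketch, not how Hystrix is implemented: the class name SimpleCircuitBreaker, the consecutive-failure counter, and the fixed cool-down period are all assumptions made for illustration.

```java
import java.util.function.Supplier;

/** Minimal circuit-breaker sketch (hypothetical; not Hystrix's implementation). */
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long the circuit stays open
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean isOpen() {
        if (openedAt < 0) {
            return false;
        }
        if (System.currentTimeMillis() - openedAt >= openMillis) {
            // Cool-down elapsed: close the circuit and let calls through again.
            openedAt = -1;
            consecutiveFailures = 0;
            return false;
        }
        return true;
    }

    <T> T call(Supplier<T> remoteCall) {
        if (isOpen()) {
            // Fail fast without touching the remote resource.
            throw new IllegalStateException("circuit open, call rejected");
        }
        try {
            T result = remoteCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            throw e;
        }
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = System.currentTimeMillis();
        }
    }

    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 5_000);
        for (int i = 1; i <= 4; i++) {
            try {
                breaker.call(() -> { throw new RuntimeException("remote failure"); });
            } catch (RuntimeException e) {
                System.out.println("call " + i + ": " + e.getMessage());
            }
        }
    }
}
```

In the demo, the first two calls reach the failing "remote" code, and the later ones are rejected immediately because the circuit is open.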

Fallback pattern

When a remote call fails, the client executes an alternative code path and tries to carry out the operation another way instead of raising an exception. In other words, the remote operation gets a contingency plan rather than simply throwing an exception.
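Stripped of any framework, the idea is just "try the primary path, and on failure run the alternative". A small sketch, assuming a hypothetical helper named callWithFallback (this is not a Hystrix API):

```java
import java.util.function.Supplier;

/** Fallback sketch: names are hypothetical and used only for illustration. */
final class FallbackSketch {
    /** Runs the primary call; if it throws, returns the fallback's result instead. */
    static <T> T callWithFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The failure is absorbed here and the alternative path is taken.
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        String name = callWithFallback(
                () -> { throw new IllegalStateException("organization service unavailable"); },
                () -> "default organization");
        System.out.println(name);
    }
}
```

The caller receives a usable value either way, which is the whole point of the pattern.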

Bulkhead pattern

The bulkhead pattern is borrowed from shipbuilding. A ship is divided into watertight compartments (bulkheads), so that even if a few of them are breached, the ship as a whole does not sink. The same idea applies to remote calls. If all calls are handled by the same thread pool, a single slow remote call can drag down the entire application. With the bulkhead pattern, each remote resource is isolated and given its own thread pool, so that the resources do not affect each other.
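The isolation can be sketched with two plain thread pools. This is an illustration only, with made-up names and pool sizes: one pool is completely blocked by a "slow" resource, and a call to a different resource still completes because it has its own threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Bulkhead sketch: one thread pool per remote resource (hypothetical names). */
final class BulkheadSketch {
    /** Returns the result of a fast call made while another resource's pool is fully blocked. */
    static String callFastServiceWhileSlowServiceHangs() {
        ExecutorService slowPool = Executors.newFixedThreadPool(2); // bulkhead for the slow resource
        ExecutorService fastPool = Executors.newFixedThreadPool(2); // bulkhead for a healthy resource
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Saturate the slow pool: both workers block until the latch is released.
            for (int i = 0; i < 2; i++) {
                slowPool.submit(() -> { release.await(); return "slow result"; });
            }
            // The healthy resource has its own threads, so this call still completes.
            Future<String> fast = fastPool.submit(() -> "fast result");
            return fast.get(2, TimeUnit.SECONDS);
        } catch (Exception e) {
            return "failed: " + e;
        } finally {
            release.countDown();
            slowPool.shutdown();
            fastPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callFastServiceWhileSlowServiceHangs());
    }
}
```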

The following figure shows how these patterns are applied to microservices:

III. Use in Spring Cloud

The patterns above are implemented with Netflix's Hystrix library. We continue with the project from the previous section and add the resiliency patterns to the licensingservice service.

1. Code modification

Adding dependencies

First, modify the POM file and add the following two dependencies:

<!-- This dependency is not strictly necessary, because spring-cloud-starter-hystrix already brings it in. However, the Camden.SR5 release uses version 1.5.6, which has an inconsistency: when no fallback is defined, java.lang.reflect.UndeclaredThrowableException is thrown instead of com.netflix.hystrix.exception.HystrixRuntimeException.
This issue is fixed in later releases. -->

Then add @EnableCircuitBreaker to the startup class to enable Hystrix.

2. Implementing the circuit breaker

First, modify OrganizationController in the organizationservice project to simulate a delay: the thread sleeps for 2 seconds on every second call.

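@RestController
public class OrganizationController {
    // Request counter used to delay every second call.
    private static int count = 1;
    @GetMapping(value = "/organization/{orgId}")
    public Object getOrganizationInfo(@PathVariable("orgId") String orgId) throws Exception {
        // Simulate a slow service: sleep for 2 seconds on every second call.
        if (count++ % 2 == 0) {
            Thread.sleep(2000);
        }
        Map<String, String> data = new HashMap<>(2);
        data.put("id", orgId);
        data.put("name", orgId + "company");
        return data;
    }
}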

To get a timeout-based circuit breaker, just add @HystrixCommand to the method. When Spring sees a class that uses this annotation, it dynamically generates a proxy that wraps the method and manages all calls to it through a thread pool dedicated to handling remote calls.
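Conceptually, the generated proxy hands the call to a worker thread and waits for the result only up to a limit. The following plain-Java sketch shows that idea; it is not the code Hystrix generates, and the names TimeoutSketch and callWithTimeout are invented for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Timeout sketch: run a call on a dedicated pool and give up after a deadline. */
final class TimeoutSketch {
    // Dedicated pool for remote calls; daemon threads so the JVM can exit.
    private static final ExecutorService REMOTE_CALL_POOL =
            Executors.newFixedThreadPool(10, runnable -> {
                Thread thread = new Thread(runnable, "remote-call");
                thread.setDaemon(true);
                return thread;
            });

    static <T> T callWithTimeout(Callable<T> remoteCall, long timeoutMillis) {
        Future<T> future = REMOTE_CALL_POOL.submit(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it stops waiting
            throw new IllegalStateException("timed out after " + timeoutMillis + " ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("remote call failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(callWithTimeout(() -> "fast answer", 1000));
        try {
            // Mirrors the demo service: a 2-second call against a 1-second limit.
            callWithTimeout(() -> { Thread.sleep(2000); return "slow answer"; }, 1000);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The caller's thread is released as soon as the deadline passes, even though the remote side has not answered yet.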

Modify OrganizationByRibbonService and OrganizationFeignClient in the licensingservice service and annotate their methods with @HystrixCommand.

Then access the endpoints localhost:10011/licensingByRibbon/11313 and localhost:10011/licensingByFeign/11313. After several requests, a com.netflix.hystrix.exception.HystrixRuntimeException is thrown, which shows that the circuit breaker is working. The default timeout is 1 second.

{
  "timestamp": 1543823192424,
  "status": 500,
  "error": "Internal Server Error",
  "exception": "com.netflix.hystrix.exception.HystrixRuntimeException",
  "message": "OrganizationFeignClient#getOrganization(String) timed-out and no fallback available.",
  "path": "/licensingByFeign/11313/"
}

The timeout can be changed through the annotation's parameters. If it is set to more than 2 seconds, the timeout error no longer occurs. (I don't know why this setting did not take effect with Feign; it works normally with Ribbon.) In practice, this configuration is usually written in the configuration file.

@HystrixCommand(commandProperties = {
    @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "20000")
})

3. Fallback processing

Because a “middleman” sits between the consumer of a remote resource and the resource itself, developers can intercept a service failure and choose an alternative course of action. Implementing a fallback in Hystrix is very easy.

1. Implementation in Ribbon

Just add the attribute fallbackMethod = "methodName" to the @HystrixCommand annotation, and the fallback method is executed when the call fails. Note that the fallback method must be in the same class as the protected method and must have the same method signature. Modify the OrganizationByRibbonService class in the service package of licensingservice to read as follows:

@Component // assumed: the class must be a Spring bean for the Hystrix proxy to apply
public class OrganizationByRibbonService {
    private RestTemplate restTemplate;

    public OrganizationByRibbonService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    // Time out after 1 second; on failure or timeout, call the fallback method.
    @HystrixCommand(commandProperties = {
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
    }, fallbackMethod = "getOrganizationWithRibbonBackup")
    public Organization getOrganizationWithRibbon(String id) throws Exception {
        ResponseEntity<Organization> responseEntity = restTemplate.exchange("http://organizationservice/organization/{id}",
                HttpMethod.GET, null, Organization.class, id);
        return responseEntity.getBody();
    }

    // Fallback: same class and same signature as the protected method.
    public Organization getOrganizationWithRibbonBackup(String id) throws Exception {
        Organization organization = new Organization();
        organization.setName("organization service call failed");
        return organization;
    }
}

Start the application and request localhost:10011/licensingByRibbon/11313/ several times. When the call fails, the fallback method is invoked.

2. Implementation in Feign

To implement the fallback pattern with Feign, write a class that implements the Feign interface and then reference that class from the interface. Taking licensingservice as the example, first add an OrganizationFeignClientImpl class to the client package. The code is as follows:

@Component // the fallback class must be registered as a Spring bean
public class OrganizationFeignClientImpl implements OrganizationFeignClient {
    @Override
    public Organization getOrganization(String orgId) {
        Organization organization = new Organization();
        organization.setName("data returned by fallback mode");
        return organization;
    }
}

Then modify the annotation on the OrganizationFeignClient interface, changing @FeignClient("organizationservice") to @FeignClient(name = "organizationservice", fallback = OrganizationFeignClientImpl.class).

Restart the project and request localhost:10011/licensingByFeign/11313/ several times. You can see that the fallback is working.

When deciding whether to use a fallback, keep the following two points in mind:

  • A fallback is a mechanism for providing a course of action when a resource call times out or fails. If the fallback would only catch the exception and log it, a plain try..catch around the service call that catches HystrixRuntimeException is enough.
  • Be careful about what the fallback method does. If the fallback calls another distributed service, the fallback method may itself need to be wrapped with @HystrixCommand.

4. Implementing the bulkhead pattern

A microservice-based application usually needs to call several microservices to complete a given task. Without the bulkhead pattern, these calls are executed by default on the same threads, which are the threads reserved for handling requests for the entire Java container. Under heavy load, a performance problem in one service can therefore tie up every thread in the Java container, block new requests, and eventually crash the container completely.
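The starvation effect is easy to reproduce on a small scale. In this sketch (a toy model with invented names, not container code), slow calls occupy every thread of one shared pool, so even a trivially fast call submitted afterwards cannot run before its caller gives up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Shared-pool sketch: without bulkheads, slow calls starve unrelated fast calls. */
final class SharedPoolSketch {
    /** Returns true if a fast call timed out because slow calls held every shared thread. */
    static boolean fastCallStarves(int poolSize, long waitMillis) {
        ExecutorService shared = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Slow calls take every thread in the shared pool.
            for (int i = 0; i < poolSize; i++) {
                shared.submit(() -> { release.await(); return "slow result"; });
            }
            // The fast call is queued behind them and cannot start.
            Future<String> fast = shared.submit(() -> "fast result");
            try {
                fast.get(waitMillis, TimeUnit.MILLISECONDS);
                return false;
            } catch (TimeoutException e) {
                return true;
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            release.countDown();
            shared.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast call starved: " + fastCallStarves(3, 300));
    }
}
```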

Hystrix delegates all calls to remote services to a thread pool. By default, this pool has 10 worker threads, so a single slow service can easily occupy all of them. Hystrix therefore provides an easy-to-use mechanism for creating “bulkheads” between calls to different remote resources, isolating the calls to each service in a separate thread pool so that they do not affect each other.

To set up an isolated thread pool, add the thread pool attributes to @HystrixCommand. Taking Ribbon as the example (Feign is similar), modify the OrganizationByRibbonService class in the service package of licensingservice and change the annotation on the getOrganizationWithRibbon method to the following:

@HystrixCommand(commandProperties = {
    @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "1000")
}, fallbackMethod = "getOrganizationWithRibbonBackup",
    threadPoolKey = "licenseByOrgThreadPool",
    threadPoolProperties = {
        @HystrixProperty(name = "coreSize", value = "30"),
        @HystrixProperty(name = "maxQueueSize", value = "10")
    })

If maxQueueSize is set to -1, a SynchronousQueue is used to hold incoming requests, which means the number of requests in process can never exceed the size of the thread pool. A positive value uses a LinkedBlockingQueue of that capacity instead.
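The difference between the two queue types can be observed with a plain ThreadPoolExecutor. This is a rough sketch rather than Hystrix's own code: the class and method names are invented, and treating any non-positive size as "use a SynchronousQueue" is an assumption made for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Queue-choice sketch: how the work queue affects the number of rejected calls. */
final class QueueChoiceSketch {
    /** Submits blocking tasks to a pool of coreSize threads and returns how many were rejected. */
    static int countRejected(int coreSize, int maxQueueSize, int tasks) {
        BlockingQueue<Runnable> queue;
        if (maxQueueSize <= 0) {
            queue = new SynchronousQueue<>();               // no buffering at all
        } else {
            queue = new LinkedBlockingQueue<>(maxQueueSize); // bounded buffer
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 0L, TimeUnit.MILLISECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    // Each task blocks until released, simulating a slow remote call.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("SynchronousQueue rejections: " + countRejected(2, -1, 5));
        System.out.println("LinkedBlockingQueue(2) rejections: " + countRejected(2, 2, 5));
    }
}
```

With two threads and five slow calls, the unbuffered pool accepts only as many calls as it has threads, while the bounded queue lets two more wait before further calls are rejected.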

Note: in the example code, the property values are hard-coded in the Hystrix annotations. In a real environment, these settings are usually kept in Spring Cloud Config so that they can be managed centrally.

