Microservice architecture brings many benefits, but it also increases system complexity. A traditional monolithic application is split into distributed microservices along different dimensions, and different microservices may even be written in different languages. In addition, deployment is usually distributed: there may be thousands of servers spread across data centers in multiple cities. The figure below shows a typical microservice architecture; the number of nodes in the figure is still relatively small. In Alipay, completing one payment on the trade-payment link involves hundreds of nodes.
Microservices introduce the following typical problems:
- Fault localization is difficult. One request often involves multiple services, and troubleshooting may require multiple teams
- It is difficult to piece together the complete call chain and analyze the call relationships between nodes
- Performance analysis is difficult; it is hard to find the bottleneck nodes
The above are essentially application observability problems:
This article focuses on the trace aspect, that is, distributed tracing. In 2010, Google published the Dapper paper and shared its solution, which is regarded as the industry's earliest distributed tracing system. Afterwards, major Internet companies launched their own tracing systems based on Dapper's ideas, including Twitter's Zipkin, Alibaba's EagleEye, Pinpoint, Apache HTrace, and Uber's Jaeger; and, of course, the protagonist of this article: SOFATracer. The variety of distributed tracing implementations in turn gave rise to a distributed tracing specification: OpenTracing. In 2019, OpenTracing and OpenCensus merged into OpenTelemetry.
Before diving into SOFATracer, here is a brief introduction to OpenTracing, since SOFATracer is built on the OpenTracing specification (based on OpenTracing 0.22.0; the API of the newer specification differs). A trace consists of the spans generated by the service calls in a request and the references between them. A span is a time span; each service call creates a new span, divided into a calling (client) span and a called (server) span. Each span includes:
- TraceId and SpanId
- Operation name
- Elapsed time
- Service call result
There are usually multiple service calls in a trace, so there are also multiple spans. The relationships between spans are declared with references, which point from the caller to the service provider. OpenTracing specifies two reference types:
- ChildOf: a synchronous service call; the client needs the result returned by the server for subsequent processing
- FollowsFrom: an asynchronous service call; the client does not wait for the server's result
A trace is a directed acyclic graph. The topology of one call can be shown as follows:
The SpanContext in the figure is data shared across a request, hence the name span context. Data a service node puts into the context is visible to all subsequent nodes, so it can be used to pass information along the chain.
The traceId ties together all service nodes of a request. Its generation rules must avoid collisions between different traceIds while keeping the overhead low; after all, generating the trace is extra overhead on top of the business logic. The traceId generation rule in SOFATracer is: server IP + generation timestamp + auto-incrementing sequence + current process ID, for example:
The first 8 characters, 0ad1348f, are the IP address of the machine that generated the traceId. This is a hexadecimal number in which every two characters encode one segment of the IP address; converting it back to decimal gives the familiar form 10.209.52.143. With this rule you can also find the first server a request passed through. The next 13 digits, 1403169275002, are the time the traceId was generated. The following 4 digits, 1003, are an auto-incrementing sequence that climbs from 1000 to 9000 and wraps back to 1000 after reaching 9000. The last 5 digits, 56696, are the current process ID; it is appended to the traceId to avoid collisions between multiple processes on the same machine.
The pseudo code is as follows:
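A minimal Java sketch of the rule described above (the class and method names are illustrative, not SOFATracer's real API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative traceId generator: hex IP (8 chars) + millis timestamp (13 digits)
// + auto-incrementing sequence (1000..9000) + current process ID.
public class TraceIdGenerator {
    private static final AtomicInteger SEQ = new AtomicInteger(1000);

    // Encode each IPv4 segment as two hex characters: 10.209.52.143 -> "0ad1348f"
    static String hexIp(String ip) {
        StringBuilder sb = new StringBuilder();
        for (String seg : ip.split("\\.")) {
            sb.append(String.format("%02x", Integer.parseInt(seg)));
        }
        return sb.toString();
    }

    // Sequence climbs from 1000 to 9000, then wraps back to 1000.
    static int nextSeq() {
        return SEQ.getAndUpdate(n -> n >= 9000 ? 1000 : n + 1);
    }

    static String generate(String ip, long timestampMillis, long pid) {
        return hexIp(ip) + timestampMillis + nextSeq() + pid;
    }

    public static void main(String[] args) {
        System.out.println(generate("10.209.52.143",
                System.currentTimeMillis(),
                ProcessHandle.current().pid()));
    }
}
```

Note that uniqueness comes from combining all four parts: two traceIds collide only if the same process on the same machine draws the same sequence number within the same millisecond.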
The spanId records the service call topology. In SOFATracer:
- The dots represent call depth
- The numbers represent call order
- The spanId is created by the client
The traceId and spanId generation rules in SOFATracer follow Alibaba's EagleEye component.
By merging the calling span with the called span, and combining traceId with spanId, we can build the complete service invocation topology:
Trace instrumentation
So how is trace data generated and obtained? This requires a trace collector (instrumentation framework), which is responsible for:
- Generating, propagating and reporting trace data
- Parsing and injecting the trace context
A trace collector should be automatic, low-intrusion and low-overhead. A typical trace collector works as follows, wrapped around the business logic:
1. Server receive (SR): create a new parent span, or extract it from the context
2. Call the business code
3. The business code initiates a remote service call
4. Client send (CS): create a child span; pass along the traceId, spanId and transparent (baggage) data
5. Client receive (CR): finish the current child span; log/report the span
6. Server send (SS): finish the parent span; log/report the span
Steps 3-5 may be absent or may be repeated many times.
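The six steps can be sketched in a framework-independent way as follows (all names, including the header keys, are illustrative; real collectors hook these points into filters, aspects or agents):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the SR -> business -> (CS -> CR)* -> SS flow.
public class CollectorSketch {
    static class Span {
        final String traceId, spanId;
        long startMs, endMs;
        Span(String traceId, String spanId) { this.traceId = traceId; this.spanId = spanId; }
    }

    // Step 1, server receive (SR): extract the context from carrier headers, or start a new trace.
    static Span serverReceive(Map<String, String> headers) {
        Span span = new Span(
                headers.getOrDefault("trace-id", "new-trace-id"),  // hypothetical header names
                headers.getOrDefault("span-id", "0"));
        span.startMs = System.currentTimeMillis();
        return span;
    }

    // Step 4, client send (CS): create a child span and inject traceId/spanId into the outgoing call.
    static Span clientSend(Span parent, int callIndex, Map<String, String> outHeaders) {
        Span child = new Span(parent.traceId, parent.spanId + "." + callIndex);
        child.startMs = System.currentTimeMillis();
        outHeaders.put("trace-id", child.traceId);
        outHeaders.put("span-id", child.spanId);
        return child;
    }

    // Steps 5 and 6, client receive (CR) / server send (SS): finish the span and log/report it.
    static void finish(Span span) {
        span.endMs = System.currentTimeMillis();
        // report(span);  // logging/reporting may be sampled; span creation is not
    }

    public static void main(String[] args) {
        Span server = serverReceive(new HashMap<>());   // SR
        Map<String, String> out = new HashMap<>();
        Span client = clientSend(server, 1, out);       // CS, from inside the business logic
        finish(client);                                 // CR
        finish(server);                                 // SS
        System.out.println(client.traceId + " " + client.spanId);
    }
}
```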
There are various implementations of the instrumentation logic; the current mainstream approaches are:
- Filters: request filters (Dubbo, SOFARPC, Spring MVC)
- AOP aspects (DataSource, Redis, MongoDB)
- Hook mechanisms (Spring Messaging, RocketMQ)
In Java, both SkyWalking and Pinpoint use a javaagent to achieve automatic, non-intrusive instrumentation. As a typical example, SOFATracer's instrumentation for Spring MVC looks like this:
SOFATracer creates 100% of spans, but logging/reporting supports sampling. Comparatively, logging/reporting carries the higher overhead and more easily becomes a performance bottleneck under heavy traffic/load. In some other trace systems spans themselves are generated by sampling, but to keep 100% of traces for failed calls, they adopt a reverse sampling strategy.
SOFATracer prints trace information to log files by default:
- Client digest: calling spans
- Server digest: called spans
- Client stat: per-minute aggregation of calling spans
- Server stat: per-minute aggregation of called spans
The default log format is JSON, but it can be customized.
A typical trace system includes, in addition to trace collection and reporting, collectors, storage and presentation (API & UI). Together this forms application performance management, or APM for short, as shown in the following figure:
General requirements for trace data reporting include real-time delivery and consistency. SOFATracer supports Zipkin reporting by default. Before storage, stream computing is involved to combine calling spans with called spans, typically using Alibaba JStorm or Apache Flink; the processed data is then stored in Apache HBase. Because trace data is only useful for a short time, an automatic expiry mechanism is generally applied, with data expiring after about 7-10 days. For the presentation layer, querying and analysis over HBase need to support:
- Graphical display of directed acyclic graph
- Query by traceid
- Query by caller
- Query by callee
- Query by IP
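The stream-computing join that pairs a calling span with its called span can be illustrated with a toy in-memory version keyed on (traceId, spanId) (a sketch only; real jobs run in JStorm/Flink with windowing and state backends, and all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy join of calling (client) and called (server) spans keyed by traceId + spanId.
public class SpanJoinSketch {
    record SpanRecord(String traceId, String spanId, boolean client, long elapsedMs) {}
    record CallRecord(SpanRecord clientSide, SpanRecord serverSide) {}

    private final Map<String, SpanRecord> pending = new HashMap<>();

    // Buffers the first half of a call; emits a complete CallRecord once both halves arrive.
    Optional<CallRecord> accept(SpanRecord span) {
        String key = span.traceId() + "/" + span.spanId();
        SpanRecord other = pending.remove(key);
        if (other == null) {
            pending.put(key, span);
            return Optional.empty();
        }
        return Optional.of(span.client()
                ? new CallRecord(span, other)
                : new CallRecord(other, span));
    }

    public static void main(String[] args) {
        SpanJoinSketch join = new SpanJoinSketch();
        join.accept(new SpanRecord("t1", "0.1", true, 12));   // calling span arrives first
        Optional<CallRecord> call =
                join.accept(new SpanRecord("t1", "0.1", false, 9)); // called span completes it
        System.out.println(call.isPresent()); // true
    }
}
```

A production job would also need a time window to evict half-pairs whose counterpart never arrives, which is one reason stateful stream engines are used here.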
Within Ant Group, we do not report spans directly; instead, spans are printed to logs and then collected as needed. The architecture is as follows:
(Relic and Antique are not the real system names.)
A DaemonSet agent collects the trace logs: digest logs for troubleshooting and stat logs for business monitoring are the log contents to be collected. After collection, the log data is processed by the Relic system, which cleans and aggregates per-machine log data; the Antique system then integrates it further, aggregating trace data by application and service dimension with Spark. Finally, the processed trace data is saved to the time-series database CeresDB and served to a web console for query and analysis. Monitoring and alerting can also be configured on this system to give early warning of application anomalies. Currently, the monitoring and alerting above is near-real-time, with a delay of about one minute.
As full-link tracing has developed, it has been continuously improved and its features continuously enriched. At this stage, application performance management includes not only the complete capabilities of full-link tracing, but also:
- Storage & analysis, with rich terminal features
- Full-link stress testing
- Performance profiling
- Monitoring & alerting: CPU, memory, JVM information, etc.
Within Ant Group, we have a dedicated stress-testing platform. When the platform initiates stress-test traffic, it attaches an artificially constructed traceId, spanId and transparent (baggage) data carrying a stress-test marker, so that the resulting logs are printed separately. You are welcome to use SOFATracer as your full-link tracing tool; the quick-start guide is linked below.
SOFATracer's future development plan is as follows. You are welcome to participate and contribute! The project's GitHub link is below.
SOFATracer quick start: https://www.sofastack.tech/projects/sofa-Tracer/component-access/
SOFATracer GitHub project: https://github.com/sofastack/sofa-Tracer
Recommended reading this week
- KCL: a declarative cloud-native configuration policy language
- Ant Group's road to building a highly available 10,000-node Kubernetes etcd cluster
- We built a distributed registry
- Still worried about multi-cluster management? OCM is here!