Tars service information reporting full service monitoring

Time:2021-1-16

By Eaton

Introduction | Once a service goes into production, exceptions are inevitable. The usual way to investigate is to dig through the service logs, but this is sometimes inefficient, especially when there are so many logs that you do not know where to start. Would it not be better to let the service report errors itself? This article introduces the information reporting methods in Tars.

Contents

  • Brief introduction
  • Status statistics reporting
  • Exception reporting
  • Attribute statistics reporting
  • Summary

Brief introduction

Service information reporting is built into the Tars framework. It covers three reporting methods: service status statistics reporting, exception reporting and attribute statistics reporting, which together monitor service health from several angles. They are implemented through the stat, notify and property nodes respectively, as shown in the figure below.

[Figure: a service reporting to the stat, notify and property nodes]

By reporting information of different dimensions to these three nodes, a service makes its status observable. Let us look at the three reporting methods in turn.

Status statistics reporting

Status statistics reporting means that, in the Tars framework, a service reports status information such as call latency, timeout rate and exception rate to the stat node, where the statistics are compiled.

After the service calls the reporting interface, the data is first held in memory; it is formally reported to the stat service only when a reporting time point is reached (by default, once a minute). The time between two reporting time points is called a statistical interval, within which values for the same key are accumulated and compared.
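
The in-memory accumulation described above can be sketched in plain C++. This is only an illustration of the idea, not the Tars implementation: the names CallStats, IntervalAccumulator, record and flush are hypothetical, and the choice of counters is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-key counters; the fields are illustrative and are not
// the actual Tars stat structures.
struct CallStats {
    std::int64_t count = 0;         // total calls in the interval
    std::int64_t timeoutCount = 0;  // calls that timed out
    std::int64_t totalCostMs = 0;   // accumulated latency in milliseconds
};

class IntervalAccumulator {
public:
    // Record one call in memory; nothing is sent yet.
    void record(const std::string& key, std::int64_t costMs, bool timedOut) {
        assert(costMs >= 0);
        CallStats& s = pending_[key];  // same key: values are accumulated
        s.count += 1;
        s.totalCostMs += costMs;
        if (timedOut) s.timeoutCount += 1;
    }

    // Called once per statistical interval: hand back everything that was
    // accumulated and start the next interval with empty counters.
    std::map<std::string, CallStats> flush() {
        std::map<std::string, CallStats> out;
        out.swap(pending_);
        return out;
    }

private:
    std::map<std::string, CallStats> pending_;
};
```

A caller would invoke record on every request and flush from a timer once per interval, sending the returned snapshot to the statistics backend.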

Status statistics reporting generally does not require additional development by users. After the service is correctly configured and deployed in the tars framework, it can be automatically reported.

Open the service management page of TarsWeb and click Service monitoring. The page shows the status information of the corresponding service, including traffic, average latency, timeout rate and so on, as shown in the figure below.

[Figure: service monitoring page in TarsWeb]

Exception reporting

Status statistics reporting gives an intuitive view of service status and health. In real-world use, however, call statistics alone are not enough. For better monitoring, the Tars framework lets a service report exceptions directly to notify; the reports can be viewed on the TarsWeb management page, or forwarded to users through other alerting software or platforms. Each language version of Tars has its own exception reporting method. This section covers TarsCpp and TarsGo; other language versions are similar.

TarsCpp

TarsCpp provides one exception reporting method, RemoteNotify::report. It is used as follows

RemoteNotify::getInstance()->report(info);

The parameter info is the exception information to report, of type string. The string is reported to notify as-is and can then be seen on the page. For example, we create a TarsCpp service named Demo.DemoServer.DemoObj with the following command

/usr/local/tars/cpp/script/cmake_tars_server.sh TestDemo DemoServer Demo

The project directory structure is as follows

DemoServer
├── build                    # Build directory
├── CMakeLists.txt           # CMake build file
└── src                      # Source file directory
    ├── CMakeLists.txt
    ├── Demo.h               # File generated from Demo.tars
    ├── DemoImp.cpp          # Interface implementation file
    ├── DemoImp.h            # Interface implementation header file
    ├── DemoServer.cpp       # Service implementation file
    ├── DemoServer.h         # Service implementation header file
    └── Demo.tars            # Tars interface definition file

Then, in DemoServer.cpp, we add a RemoteNotify::report call to the service initialization function DemoServer::initialize, so that the service reports the message DemoServer Start when it starts, as follows

void
DemoServer::initialize()
{
    addServant<DemoImp>(ServerConfig::Application + "." + ServerConfig::ServerName + ".DemoObj");
    //Reporting information
    RemoteNotify::getInstance()->report("DemoServer Start");
}

After the service is compiled and deployed, the information reported by the service can be seen in the real-time state of the service on tarsweb, as shown in the figure below

[Figure: the reported message in the real-time state of the service on TarsWeb]

TarsGo

Similar to TarsCpp, TarsGo provides the following function for reporting exception information

func ReportNotifyInfo(level int32, info string)

level is the exception level, one of NOTIFY_NORMAL, NOTIFY_WARN and NOTIFY_ERROR; info is the information to report.

We create a TarsGo service named Demo.NotifyDemo.DemoObj with the following command

$GOPATH/src/github.com/TarsCloud/TarsGo/tars/tools/cmake_tars_server.sh Demo NotifyDemo Demo github.com/ETZhangSX/NotifyDemo

The project directory structure is as follows

NotifyDemo
├── build
├── client
│   ├── client.go
│   └── CMakeLists.txt
├── CMakeLists.txt
├── config.conf
├── debugtool
│   └── dumpstack.go
├── demo_imp.go
├── Demo.tars
├── go.mod
├── go.sum
├── main.go
├── start.sh
└── tars-protocol

Similar to TarsCpp, we add exception reporting to the Init function in demo_imp.go

func (imp *DemoImp) Init() (error) {
        tars.ReportNotifyInfo(tars.NOTIFY_ERROR, "ssart")
        return nil
}

After the service is built and deployed, it can also be seen in the real-time state of the service

[Figure: the reported message in the real-time state of the service on TarsWeb]

As you can see, exception reporting is an active process. Developers can use it to report service errors themselves, for example by reporting an error when it is caught in a try...catch block.
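
The try...catch pattern can be sketched as follows. To keep the example runnable without the Tars framework, reportToNotify is a hypothetical stand-in that only records messages; in a TarsCpp service its body would be the RemoteNotify::getInstance()->report(info) call shown earlier. parsePositive and parseOrReport are likewise made-up names for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the notify node: it only stores messages.
std::vector<std::string>& reportedMessages() {
    static std::vector<std::string> messages;
    return messages;
}

// In a TarsCpp service this would call
// RemoteNotify::getInstance()->report(info) instead.
void reportToNotify(const std::string& info) {
    assert(!info.empty());
    reportedMessages().push_back(info);
}

// Hypothetical business function that fails on invalid input.
int parsePositive(const std::string& text) {
    int value = std::stoi(text);  // throws std::invalid_argument on bad input
    if (value <= 0) throw std::out_of_range("value must be positive: " + text);
    return value;
}

// Wrap the call: on failure, report the error and return a fallback value.
int parseOrReport(const std::string& text, int fallback) {
    try {
        return parsePositive(text);
    } catch (const std::exception& e) {
        reportToNotify(std::string("parsePositive failed: ") + e.what());
        return fallback;
    }
}
```

Reporting inside the catch block keeps the normal path free of monitoring code while still surfacing every failure to whoever watches the notify messages.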

Attribute statistics reporting

In addition to status statistics reporting and exception reporting, tars also provides the function of attribute statistics. Developers can report business related attributes and make statistics. In order to facilitate business use, tars has the following statistics types:

  • Sum (sum)
  • Average (avg)
  • Distribution (distr)
  • Maximum (max)
  • Minimum (min)
  • Count (count)
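
To make the six types concrete, here is a small sketch that computes all of them over the values reported in one interval. It is plain C++ and independent of Tars; PropertySummary and summarize are hypothetical names, and the bucket boundary rule used for the distribution is an assumption of this sketch rather than the documented Tars behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Result of aggregating one statistical interval (illustrative fields).
struct PropertySummary {
    long sum = 0;
    long count = 0;
    double avg = 0.0;
    int max = 0;
    int min = 0;
    std::vector<long> distr;  // distr[i] = number of values in bucket i
};

// bounds must be sorted ascending. Bucket 0 holds v < bounds[0], bucket i
// holds bounds[i-1] <= v < bounds[i], and the last bucket holds
// v >= bounds.back(). This boundary rule is an assumption of the sketch.
PropertySummary summarize(const std::vector<int>& values,
                          const std::vector<int>& bounds) {
    assert(std::is_sorted(bounds.begin(), bounds.end()));
    PropertySummary s;
    s.distr.assign(bounds.size() + 1, 0);
    if (values.empty()) return s;

    s.max = values.front();
    s.min = values.front();
    for (int v : values) {
        s.sum += v;
        s.count += 1;
        s.max = std::max(s.max, v);
        s.min = std::min(s.min, v);
        // Index of the first bound strictly greater than v.
        std::size_t bucket = static_cast<std::size_t>(
            std::upper_bound(bounds.begin(), bounds.end(), v) - bounds.begin());
        s.distr[bucket] += 1;
    }
    s.avg = static_cast<double>(s.sum) / static_cast<double>(s.count);
    return s;
}
```

For example, summarizing the values 5, 10, 25 and 100 with the bounds 10, 30, 50, 80 and 100 puts one value in the first bucket, two in the second and one in the last.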

In TarsCpp, you can call createPropertyReport() to create and configure a property reporting object, then call the object's report method to report attribute values. For example, to monitor the maximum size of an array, we create a property named array_size and set its reporting method to max, that is, the maximum value, as follows.

PropertyReportPtr reportPtr = Application::getCommunicator()
                    ->getStatReport()
                    ->createPropertyReport("array_size", PropertyReport::max());
// vector<int> a;
reportPtr->report(a.size());

Next, we take a simple queue service implemented in C++ as an example. It provides two queue operation interfaces

  • pop: removes the number at the front of the queue
  • push: adds a number to the queue

The size of the queue in the service is reported through attribute statistics.

First, we create a new service named Demo.PropertyDemo.TestObj and add a new file Queue.h. The project structure is as follows

PropertyDemo
├── build
├── CMakeLists.txt
└── src
    ├── CMakeLists.txt
    ├── PropertyDemo.cpp
    ├── PropertyDemo.h
    ├── Queue.h
    ├── Test.h
    ├── TestImp.cpp
    ├── TestImp.h
    └── Test.tars

In Queue.h, we implement a simple thread-safe queue class as follows

#ifndef _QUEUE_H_
#define _QUEUE_H_

#include <queue>
#include "util/tc_singleton.h"
#include "util/tc_thread_rwlock.h"

class Queue : public tars::TC_Singleton<Queue>
{
public:
    // Append a value under the write lock
    void Push(int value)
    {
        tars::TC_ThreadWLock wlock(rw_locker_);
        q_.push(value);
    }
    // Remove the front value, if any, under the write lock
    void Pop()
    {
        tars::TC_ThreadWLock wlock(rw_locker_);
        if (!q_.empty()) q_.pop();
    }
    // Read the current size under the read lock
    int GetSize()
    {
        tars::TC_ThreadRLock rlock(rw_locker_);
        return static_cast<int>(q_.size());
    }
private:
    std::queue<int> q_;
    tars::TC_ThreadRWLocker rw_locker_; // read-write lock
};
#endif

You can see that Queue inherits from TC_Singleton. TC_Singleton is a singleton component provided by TarsCpp; by inheriting from it, Queue becomes a singleton class.
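
For readers unfamiliar with this inheritance style, the following sketch shows the general idea in standard C++ only. Singleton and SimpleQueue are hypothetical, simplified stand-ins written for this illustration; they are not the TarsCpp classes, and the real TC_Singleton is more configurable than this.

```cpp
#include <cassert>
#include <mutex>
#include <queue>

// Simplified stand-in for a singleton base class: deriving from
// Singleton<T> gives T a getInstance() that always returns the same object.
template <typename T>
class Singleton {
public:
    static T* getInstance() {
        static T instance;  // constructed once; thread-safe since C++11
        return &instance;
    }
    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

protected:
    Singleton() = default;
    ~Singleton() = default;
};

class SimpleQueue : public Singleton<SimpleQueue> {
public:
    void Push(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        q_.push(value);
    }
    int GetSize() {
        std::lock_guard<std::mutex> lock(mutex_);
        return static_cast<int>(q_.size());
    }

private:
    friend class Singleton<SimpleQueue>;  // lets the base construct us
    SimpleQueue() = default;              // no instances outside getInstance()
    std::queue<int> q_;
    std::mutex mutex_;
};
```

Because every caller goes through getInstance(), the interface implementation and the reporting thread can share one queue without passing a pointer around.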

Modify Test.tars to add two new interfaces, pop and push, for operating the queue in the service, as follows

module Demo
{

interface Test
{
    int test();
    int push(int value);
    int pop();
};

};

Then add the declarations of the two interfaces to TestImp.h

virtual int push(int value, tars::TarsCurrentPtr current);
virtual int pop(tars::TarsCurrentPtr current);

And implement the two interfaces in TestImp.cpp as follows

#include "Queue.h"

    ...

int TestImp::push(int value, tars::TarsCurrentPtr current)
{
    Queue::getInstance()->Push(value);
    return 0;
}

int TestImp::pop(tars::TarsCurrentPtr current)
{
    Queue::getInstance()->Pop();
    return 0;
}

Finally, add the queue size reporting to PropertyDemo.cpp as follows

#include "Queue.h"

    ...

void *reportFunc(void *pArg)
{
    static PropertyReportPtr reportPtr = NULL;

    //Initialize distribution data range
    vector<int> v;
    v.push_back(10);
    v.push_back(30);
    v.push_back(50);
    v.push_back(80);
    v.push_back(100);

    //Create the queuelength attribute with all six statistics methods. Note that distr takes the initialized range vector v
    reportPtr = Application::getCommunicator()
                    ->getStatReport()
                    ->createPropertyReport("queuelength",
                        PropertyReport::sum(),
                        PropertyReport::avg(),
                        PropertyReport::count(),
                        PropertyReport::max(),
                        PropertyReport::min(),
                        PropertyReport::distr(v));
    //Regular reporting
    while (1)
    {
        //Only integers are supported for reported properties
        reportPtr->report(Queue::getInstance()->GetSize());
        sleep(1);
    }
    return NULL;
}

int
main(int argc, char* argv[])
{
    try
    {
        pthread_t hThread;

        g_app.main(argc, argv);
        //Create a thread to run reportFunc and report attribute information
        pthread_create(&hThread, NULL, reportFunc, NULL);

        g_app.waitForShutdown();
    }

    ...
}

In reportFunc, we create the property queuelength and assign it to reportPtr, attach the six statistical strategies above and report the queue size periodically; main then creates a thread to run reportFunc.

After building and deploying the service, we can see the statistical values of the attribute on the feature monitoring page of the service in TarsWeb, as shown in the figure below

[Figure: statistics of the queuelength attribute in TarsWeb]

If you cannot see the statistics yet, wait a few minutes: monitoring information is synchronized at 5-minute intervals.

From the figure above, you can see the values of the six statistical strategies: the sum, minimum, maximum, distribution, count and average of the size of Queue. Calling the service's pop and push interfaces removes elements from or adds elements to the queue, and as the queue size changes these values change accordingly.

We can choose strategies according to our own business needs, such as using summation in traffic statistics.

Summary

This article introduced the three information reporting methods of Tars and how to use them. With these three methods, developers can monitor services in multiple dimensions and keep track of real-time health status, exception information and business-related attributes, which helps them manage services better.

Tars helps developers and enterprises quickly build stable and reliable distributed applications as microservices, so that developers can focus on business logic and improve operational efficiency. Multi-language support, agile R&D, high availability and efficient operation make Tars an enterprise-grade product.

Tars microservice helps you with digital transformation. Welcome to:

Tars website: https://TarsCloud.org

Tars source code: https://github.com/TarsCloud

Linux Foundation official microservice free course: https://www.edx.org/course/bu…

Access to the Tars official training e-book: https://wj.qq.com/s2/6570357/…
