Java log API management best practices

Time: 2019-12-10

Summary

For today’s applications, the importance of logging is self-evident. It is hard to imagine an application running in a production environment without any logging capability. Logs can serve a variety of purposes, including recording error information, status information, debugging information, and execution times. In a production environment, logs are an important basis for tracing a problem to its source. All kinds of information generated at runtime should be recorded through a log API.

Many developers are accustomed to using System.out.println, System.err.println, and the printStackTrace method of exception objects to output relevant information. Although these methods are simple to use, the information they produce provides little effective help when problems occur. These habits should be replaced with a log API. Using a logging API does not add much complexity, but the benefits are significant.

Although logging is an indispensable part of application development, a logging API and implementation were not included in the initial versions of the JDK. The related API (the java.util.logging package, commonly abbreviated JUL) and its implementation were not added until JDK 1.4. As a result, the community has contributed many open-source implementations in this area; among them, log4j and its successor logback are the most popular. In addition to actual logging implementations, there is also a class of encapsulation APIs for logging, such as Apache Commons Logging and slf4j.

Libraries of this kind provide an encapsulation layer on top of concrete logging implementations, giving users of the logging API a unified interface so that different logging implementations can be swapped freely, for example switching from the JDK's default logging implementation to log4j. Such encapsulation libraries are often used by frameworks, because a framework must accommodate the differing needs of its users. In actual project development they are used less often, because few projects switch logging implementations mid-development. This article introduces both kinds of libraries in detail.

Producing log records is only the first step in making full use of logs. What matters more is how the logs generated at runtime are processed and analyzed. Typical scenarios include triggering a notification mechanism, such as email or SMS, when logs contain records that meet specific conditions, and quickly locating potential problem sources when a program fails. This processing and analysis capability is particularly important for maintaining real systems: when a running system contains many components, logs are essential for fault diagnosis.

This article first introduces the basics of the Java log API.

Java log API

Functionally, the requirements on a log API itself are very simple: it only needs to be able to record a piece of text. When API users need to record something, they construct the corresponding text from the current context and call the API to complete the record. Generally speaking, a log API consists of the following parts (a minimal end-to-end example follows the list):

  • Logger: the user of the log API issues a logging request through a logger and provides the content of the log. When logging, the severity level of the record must be specified.
  • Formatter: formats the text provided to the logger and adds additional metadata.
  • Handler: outputs the formatted log records to different destinations. Common output targets include the console, files, databases, and so on.
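
The following is a minimal, self-contained sketch (not from the original article) showing how these three parts cooperate in JUL: a logger issues the request, and JUL's default console handler and formatter take care of output and formatting.

import java.util.logging.Level;
import java.util.logging.Logger;

public class LoggingParts {
    // by convention, the logger is named after the class that uses it
    private static final Logger LOGGER = Logger.getLogger(LoggingParts.class.getName());

    public static void main(String[] args) {
        // the logger receives the request at a given severity level; the default
        // ConsoleHandler outputs it using the default SimpleFormatter
        LOGGER.log(Level.INFO, "Application started.");
        LOGGER.warning("Disk space is running low.");
    }
}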

Logger

When you need to log in a program, you first need to obtain a logger object. Logging APIs generally provide factory methods to create logger objects. Each logger object has a name. The usual practice is to use the name of the current Java class or its package as the name of the logger object. Logger names are usually hierarchical, corresponding to the hierarchy of Java packages. For example, the logger used in the Java class com.myapp.web.IndexController is generally named “com.myapp.web.IndexController” or “com.myapp.web”. Besides class or package names, you can also choose names based on the function the logging serves.

For example, use “security” as the name of all security-related loggers. This naming approach is practical for crosscutting concerns. Developers usually use the current class name as the logger name so that they can quickly locate the Java class that produced a log record, but other meaningful names are also a good choice in many cases.

When logging through a logger object, the severity level of the record must be specified. Depending on each logger object's configuration, log messages below a certain level may not be recorded. The level is decided by the user of the log API based on the information the record contains. Different logging APIs define different sets of levels, and the logging encapsulation APIs define their own levels and map them to the corresponding levels of the underlying implementation.

For example, the JDK's standard log API uses the levels OFF, SEVERE, WARNING, INFO, CONFIG, FINE, FINER, FINEST, and ALL, while log4j uses OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, and ALL. In general, FATAL, ERROR, WARN, INFO, DEBUG, and TRACE are the most commonly used levels. The six levels differ as follows:

  • FATAL: a fatal error that causes the program to end prematurely.
  • ERROR: a runtime exception or unexpected error.
  • WARN: an unexpected runtime condition that is not necessarily an error.
  • INFO: a noteworthy event occurring at runtime.
  • DEBUG: detailed information about the program's flow at runtime.
  • TRACE: even more fine-grained details.

Of these six levels, ERROR, WARN, INFO, and DEBUG are the most commonly used.

The consumer of the logging API records messages through the logger. A log message is ultimately saved as text. However, some implementations, such as log4j, allow any Java object to be passed when logging; objects that are not of type String are converted to strings. Because logging is often used when an exception occurs, loggers can also record the exception (an object of class Throwable) together with the message.

Each logger object has an effective severity level at runtime. This level can be set by configuration file or by code. If the severity level is not explicitly specified, it is looked up along the hierarchy of the logger name until a name with a severity level set is found. For example, if the severity level of the logger named “com.myapp.web.IndexController” is not explicitly specified, the levels configured for the names “com.myapp.web”, “com.myapp”, and “com” are consulted in turn. If none is found, the value configured for the root logger is used. A configuration sketch follows.
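
The following JUL-style logging.properties sketch (the package names are hypothetical) illustrates this lookup: a logger named com.myapp.web.IndexController uses the FINE level set for com.myapp.web, other loggers under com.myapp fall back to INFO, and everything else uses the root logger's WARNING.

# hypothetical logging.properties fragment illustrating level inheritance
com.myapp.web.level = FINE
com.myapp.level = INFO
.level = WARNING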

When logging through a logger object, only a logging request is issued. Whether the request is fulfilled depends on the severity levels of the request and of the logger object: messages below the logger object's severity level are not recorded, and such requests are ignored. Besides severity-based filtering, logging frameworks support other custom filtering mechanisms. For example, JUL can filter by implementing the java.util.logging.Filter interface, and log4j can filter by extending the org.apache.log4j.spi.Filter class.
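
As a minimal sketch (not from the original article), a custom JUL filter might keep only the records produced by loggers under a hypothetical "payment" name hierarchy:

import java.util.logging.Filter;
import java.util.logging.LogRecord;

public class PaymentOnlyFilter implements Filter {
    public boolean isLoggable(LogRecord record) {
        // keep only records whose logger name starts with the
        // hypothetical "payment" prefix; all others are dropped
        String name = record.getLoggerName();
        return name != null && name.startsWith("payment");
    }
}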

Formatter

In addition to the message provided through the logger object, the actual log record also contains metadata provided by the logging framework. Common items include the logger name, a timestamp, and the thread name. The formatter determines how all this information is presented in the log record. Different logging implementations provide their own default formats and customization support.

In JUL, a format is defined by extending the java.util.logging.Formatter class, and two standard implementations are provided: SimpleFormatter and XMLFormatter. Listing 1 shows how to implement a custom formatter in JUL: simply extend the Formatter class and implement the format method. The LogRecord parameter contains all the information in the log record.

Listing 1. Implementing a custom formatter in JUL


import java.util.Date;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

public class CustomFormatter extends Formatter {
    public String format(LogRecord record) {
        return String.format("<%s> [%s] : %s%n",
                new Date(record.getMillis()), record.getLoggerName(), record.getMessage());
    }
}

The custom formatter class then needs to be specified in the JUL configuration file, as shown in Listing 2.

Listing 2. Specifying the custom formatter class in the JUL configuration file


java.util.logging.ConsoleHandler.formatter = logging.jul.CustomFormatter

Log4j's formatter mechanism is simpler: the org.apache.log4j.PatternLayout class is responsible for formatting log records. No new Java class is needed for customization; instead, the required format is specified as a pattern in the configuration file. In the pattern, different placeholders represent different kinds of information: for example, %c is the name of the logger, %d the date, %m the message text of the log, %p the severity level, and %t the name of the thread. Listing 3 shows how to customize the log format in the log4j configuration file.

Listing 3. Customizing the log format in the log4j configuration file


log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} [%p] %c - %m%n

Handler

After a log record is formatted, it is processed by one or more handlers. Different handlers process records in different ways: the console handler outputs logs to the console, while the file handler writes them to a file. There are also handlers that write to databases, send email, write to a JMS queue, and so on.

A handler can also be configured with a minimum severity level; records below this level are not processed by it. This controls the volume of log records each destination receives. For example, the console handler's level is often set to INFO, while the file handler's is set to DEBUG. A configuration sketch follows.
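
As a sketch, a JUL logging.properties file might configure the two handlers like this (the file name pattern is illustrative; %h stands for the user's home directory):

# console shows INFO and above; the file also keeps FINE (debug-level) records
handlers = java.util.logging.ConsoleHandler, java.util.logging.FileHandler
java.util.logging.ConsoleHandler.level = INFO
java.util.logging.FileHandler.level = FINE
java.util.logging.FileHandler.pattern = %h/myapp%u.log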

Logging frameworks generally provide a number of handler implementations, and developers can also create their own.

Java log encapsulation API

In addition to logging libraries such as JUL and log4j, there is another type of library that encapsulates different logging libraries. Initially the Apache Commons Logging framework was the most popular encapsulation library; nowadays slf4j is more popular. The API of such an encapsulation library is relatively simple: it is a thin wrapper over the logging libraries' APIs that hides the differences between implementations. Because the APIs offered by logging implementations are all broadly similar, the encapsulation library's role is largely to provide syntactic consistency.

In the Apache Commons Logging library, the core APIs are the org.apache.commons.logging.LogFactory class and the org.apache.commons.logging.Log interface. The LogFactory class provides factory methods for creating implementation objects of the Log interface; for example, LogFactory.getLog can create a Log implementation object from a Java class or a name. The Log interface defines a group of methods for each of six severity levels. For the debug level, for example, three methods are defined: isDebugEnabled(), debug(Object message), and debug(Object message, Throwable t). In this respect the Log interface simplifies the use of loggers; a usage sketch follows.
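
The following is a minimal usage sketch, not from the original article (the process method and its business logic are hypothetical):

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class CommonsLoggingBasic {
    private static final Log LOG = LogFactory.getLog(CommonsLoggingBasic.class);

    public void process() {
        if (LOG.isDebugEnabled()) {
            LOG.debug("entering process()");
        }
        try {
            // ... hypothetical business logic ...
        } catch (RuntimeException e) {
            // the two-argument form records the exception's stack trace as well
            LOG.error("process() failed", e);
        }
    }
}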

The slf4j library is similar to Apache Commons Logging. Its core APIs are the org.slf4j.LoggerFactory class, which provides factory methods, and the org.slf4j.Logger interface, which records logs. Logger objects are obtained through the getLogger method of the LoggerFactory class. As in Commons Logging, the methods of the Logger interface are grouped by severity level, and the same isDebugEnabled method exists. However, the Logger interface's logging methods such as debug take a String message, and the message may contain “{}” placeholders that are filled in from additional parameters, as shown in Listing 4.

Listing 4. How to use slf4j


import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Slf4jBasic {
    private static final Logger LOGGER = LoggerFactory.getLogger(Slf4jBasic.class);

    public void logBasic() {
        if (LOGGER.isInfoEnabled()) {
            // "{}" is slf4j's placeholder syntax; "Alex" is substituted at log time
            LOGGER.info("My log message for {}", "Alex");
        }
    }
}

MDC

MDC (mapped diagnostic context) is a feature provided by log4j and logback that makes logging more convenient in multi-threaded programs. Some applications use multiple threads to handle requests from multiple users, and over the course of a single user's interaction, many different threads may do the processing. A typical example is a web application server: when a user accesses a page, the server may create a new thread to handle the request or reuse an existing thread from a thread pool, so during one user's session many threads may have processed that user's requests. This makes it difficult to distinguish the log records belonging to different users, and tracking one user's log records through the system becomes very troublesome.

One solution is to use a custom log format that encodes the user's information into each log record in some fixed way. The problem with this approach is that every class that uses a logger would need access to the user-related information so that it can be included when logging, a condition that is usually difficult to meet. MDC exists to solve this problem.

MDC can be regarded as a hash table bound to the current thread, into which key-value pairs can be put. The contents of the MDC are accessible to any code executing in the same thread, and a child thread inherits the MDC contents of its parent. When logging, the required information is simply taken from the MDC. The program stores the MDC contents at an appropriate time; for a web application, the data is usually stored at the start of request processing. A basic example is shown in Listing 5.

Listing 5. MDC usage example


import org.apache.log4j.Logger;
import org.apache.log4j.MDC;

public class MdcSample {
    private static final Logger LOGGER = Logger.getLogger("mdc");

    public void log() {
        // bind the current user's name to this thread's diagnostic context
        MDC.put("username", "Alex");
        if (LOGGER.isInfoEnabled()) {
            LOGGER.info("This is a message.");
        }
    }
}

In Listing 5, the value named “username” is stored in the MDC before logging. The data it contains can then be referenced directly when formatting the log record, as shown in Listing 6, where %X{username} refers to the value of “username” in the MDC.

Listing 6. Using the data recorded in MDC


log4j.appender.stdout.layout.ConversionPattern=%X{username} %d{yyyy-MM-dd HH:mm:ss} [%p] %c - %m%n
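
In a web application, the MDC is typically populated at the start of request processing, for example in a servlet filter. The following is a minimal sketch under that assumption (the MdcFilter class and the way the username is obtained are illustrative, not from the original article):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import org.apache.log4j.MDC;

public class MdcFilter implements Filter {
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        String username = ((HttpServletRequest) request).getRemoteUser();
        MDC.put("username", username != null ? username : "anonymous");
        try {
            chain.doFilter(request, response);
        } finally {
            // clear the entry so a reused pool thread does not leak this user's context
            MDC.remove("username");
        }
    }

    public void init(FilterConfig config) {}

    public void destroy() {}
}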

Logging best practices

Here are some good practices in logging.

Check whether the log can be recorded

When a logger receives a logging request whose severity level is lower than the logger object's effective level, the request is ignored. The logging method's implementation performs this check itself. However, it is still recommended to check before calling the logging method, to avoid unnecessary performance overhead, as shown in Listing 7.

Listing 7. Check if the log can be recorded


if (LOGGER.isDebugEnabled()) {
    LOGGER.debug("This is a message.");
}

The purpose of Listing 7 is to avoid the overhead of constructing the log message. Log messages usually contain information from the current context, and gathering that information and building the message text inevitably costs something. This matters especially for debug- and trace-level messages, which occur very frequently, so the accumulated overhead can be large. Checking first is therefore good practice for info-, debug-, and trace-level logs; warn-level and above generally do not need the check.
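
Note that with slf4j's parameterized messages (the “{}” placeholders of Listing 4), the message is only assembled when the level is enabled, so the explicit check can often be omitted. A minimal sketch with hypothetical names:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DeferredMessage {
    private static final Logger LOGGER = LoggerFactory.getLogger(DeferredMessage.class);

    public void onOrderProcessed(String orderId, long elapsedMillis) {
        // no isDebugEnabled() check needed: the placeholders are only
        // substituted when the DEBUG level is actually enabled
        LOGGER.debug("Order {} processed in {} ms", orderId, elapsedMillis);
    }
}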

Sufficient information in the log

The information contained in a log record should be sufficient. When recording a message, include as much relevant context as possible so that the required information is at hand when problems arise. For example, in an online payment feature, the payment-related logs should contain all the information about the current user, the order, and the payment method. A common mistake is to scatter the related information across records produced by different loggers; when a problem occurs, manually finding and matching the related records costs extra time and effort. Therefore, include as much information as possible in a single log record, as in the sketch below.
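
A minimal sketch of such a record (class, method, and parameter names are hypothetical; it assumes slf4j 1.6 or later, which treats a trailing Throwable argument as the exception to record):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PaymentLogging {
    private static final Logger LOGGER = LoggerFactory.getLogger(PaymentLogging.class);

    public void onPaymentFailure(String userId, String orderId, String payMethod, Exception e) {
        // one record carries all the context needed to diagnose the failure
        LOGGER.error("Payment failed: user={}, order={}, method={}", userId, orderId, payMethod, e);
    }
}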

Use the appropriate logger name

The general practice is to use the fully qualified name of the current Java class as the name of its logger. This yields a logger hierarchy that mirrors the Java class and package hierarchies, making it convenient to set logging levels per module.

However, for global or crosscutting concerns such as security and performance, function-related names are recommended. For example, a program might produce log records that carry performance profiling information; such records should be logged with a dedicated logger name, such as “performance” or “performance.web”. Then, enabling or disabling performance profiling only requires configuring the loggers with these names, as sketched below.
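
A minimal log4j properties sketch under this naming scheme (the logger names are from the example above; perfAppender is a hypothetical appender defined elsewhere in the file):

# enable profiling for the whole "performance" hierarchy...
log4j.logger.performance=DEBUG, perfAppender
# ...while keeping the web subsystem quieter
log4j.logger.performance.web=INFO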

Use semi-structured log messages

As mentioned in the discussion of formatters, a log record contains not only the basic log message but also metadata provided by the logging framework, and this data appears in the record in a given format. Such semi-structured formats make it possible to extract information from log records for analysis. When using the log API, it is recommended to organize the log messages themselves in a semi-structured way as well.

For example, on an e-commerce website, once a user logs in, the user's username can be included, in a fixed format, in the log records produced by that user's different operations, as shown in Listing 8.

Listing 8. Using semi-structured log messages

[user1] user login succeeded.
[user1] user successfully purchased product a.
[user2] order 003 payment failed.

When a user's problems need to be investigated through the logs, a simple regular expression is enough to quickly find that user's log records, as in the sketch below.
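
A minimal sketch (not from the original article) that filters records in the “[username] message” format of Listing 8:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LogGrep {
    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "[user1] user login succeeded.",
                "[user1] user successfully purchased product a.",
                "[user2] order 003 payment failed.");
        // a regular expression over the fixed "[username]" prefix
        // is enough to isolate one user's records
        List<String> user1Records = lines.stream()
                .filter(line -> line.matches("^\\[user1\\] .*"))
                .collect(Collectors.toList());
        user1Records.forEach(System.out::println);
    }
}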

Log aggregation and analysis

Writing appropriate log messages in the right places is only the first step in using logs properly. The real value of logging is to help developers quickly locate the cause when problems occur. A practical system, however, usually consists of many different parts: the program being developed as well as the third-party applications it relies on. Taking a typical e-commerce website as an example, besides the program itself there are the underlying operating system, the application server, the database, the HTTP server, proxy servers, caches, and so on. When a problem occurs, the real cause may lie in the program itself or in a third-party component it depends on, which means developers may need to check the logs of different applications on different servers to find it.

Log aggregation collects the logs generated by different applications on different servers and stores them on a single server for easy search and analysis. Many mature open-source tools meet this need well. This article introduces logstash, a popular open-source tool for event and log management. Logstash uses a simple processing model: input -> filter -> output.

Logstash can be installed as an agent on every machine whose logs need to be collected. It provides many plug-ins for different types of input, typically including the console, files, and syslog. The input data can then be processed with filters; a typical step is converting log messages into structured fields. The filtered results can finally be output to different destinations, such as Elasticsearch, files, email, or databases.

Logstash is easy to use: download the jar package from the official website and run it. A configuration file must be specified at runtime; it defines the input, filter, and output configuration. Listing 9 shows a simple logstash configuration file.

Listing 9. Example logstash configuration file


input {
  file {
    path => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]
    type => 'syslog'
  }
}
output {
  stdout {
    debug => true
    debug_format => "json"
  }
}

Listing 9 defines logstash's input and output configuration. The input type here is file. Each type of input has its own configuration; for files, the file paths must be configured. For each input you can also specify a type, which is used to distinguish records coming from different inputs. The output used here is the console. Once the configuration file is ready, logstash can be started with "java -jar logstash-1.1.13-flatjar.jar agent -f logstash-simple.conf".

In log analysis, structured information is what matters. A log record is usually just a piece of text in which different fields carry different meanings, and the log formats produced by different applications vary. What analysis needs to focus on is the individual fields a record contains. For example, an Apache HTTP server produces logs describing user access requests.

Such a log line contains various information about the visitor, including the IP address, time, HTTP status code, length of the response body, and user agent string. After logstash collects the log, the data in it can be extracted and named according to rules. Logstash provides the grok plug-in for this. Grok works with regular expressions and ships with many ready-made extraction patterns for common types of data, as shown in Listing 10.

Listing 10. Using grok to extract the contents of a log record

//Apache access log
49.50.214.136 GET /index.html 200 1150 "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17"
//Grok extraction mode
%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:status} %{NUMBER:bytes} %{QS:useragent}

After extraction by the grok plug-in, the Apache access log is converted into structured data with the fields client, method, request, status, bytes, and useragent, which can then be searched on. This is very helpful for analyzing problems and compiling statistics.
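
A corresponding filter section in the logstash configuration file might look like the following sketch (logstash 1.x-era syntax; treat the exact option names as illustrative):

filter {
  grok {
    pattern => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request} %{NUMBER:status} %{NUMBER:bytes} %{QS:useragent}"
  }
}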

Once log records have been collected and processed by logstash, they are usually saved to a database for analysis. The currently popular choice is Elasticsearch, whose indexing and search capabilities can then be used to analyze the logs. Much open-source software builds log management functionality on top of Elasticsearch, making search and analysis easy. This article introduces Graylog2.

Graylog2 consists of two parts: a server and a web interface. The server receives log records and saves them to Elasticsearch; the web interface lets you view and search logs and provides other auxiliary functions. Logstash provides the gelf output plug-in, which sends logstash's collected and processed log records to the Graylog2 server, so that the Graylog2 web interface can be used for query and analysis. Just change the output section of the logstash configuration file in Listing 9 to the one shown in Listing 11.

Listing 11. Configuring logstash output to graylog2


output {
  gelf {
    host => '127.0.0.1'
  }
}

When installing Graylog2, note that the Elasticsearch version must match the Graylog2 version; otherwise log records cannot be saved to Elasticsearch. Graylog2 server version 0.11.0 and Elasticsearch version 0.20.4 are used in this article.

Besides Graylog2, another popular open-source tool is Kibana. Kibana can be seen as a web interface for logstash and Elasticsearch, and it provides richer functionality for displaying and analyzing log records. Similar to the logstash configuration in the listings above, you only need to change the output to Elasticsearch; Kibana then automatically reads and displays the log records stored in Elasticsearch, as in the sketch below.
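
Such an output section might look like the following sketch (logstash 1.x-era syntax; the host value is illustrative):

output {
  elasticsearch {
    host => "127.0.0.1"
  }
}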

Summary

Logging is an important part of application development, yet it is easily neglected by developers because its value shows mainly during operations and maintenance rather than during development. For a production system, the importance of logging is self-evident. This article first introduced the main components and usage of the Java log API, using the java.util.logging package and log4j as examples, and also covered the Apache Commons Logging and slf4j encapsulation APIs. It then presented some best practices for logging, and finally described how to aggregate and analyze logs with open-source tools. With this, developers can learn how to use logs effectively in development.
