Docker-based log analysis platform (1): Introduction


1. Why analyze logs

In traditional web development, logs often get little attention; we only look at them when the application has a problem. Log storage is also very simple: write directly to a text file or put the entries into a database. For stand-alone applications there is nothing wrong with this approach. However, once the system architecture becomes distributed, the number of subsystems, large and small, keeps growing: the official website, forums, social features, transactions, and so on. Add the logs from the operating system, application services, and business logic, and managing and checking logs becomes more and more troublesome.

A large volume of log data is then spread across different machines or even different data centers. If we still logged in to each machine to view its logs, summarized them, and then sorted them across data centers, the process would be far too slow and error-prone. A centralized, real-time log analysis platform is therefore very important, and such a platform should include at least the following features:

  • Collection: collect logs from different sources, including web logs, request logs, local machines, and machines in other data centers
  • Storage: store and index log information reliably
  • Analysis: support analysis at various levels, with results displayed in a UI
  • Alerting: raise alarms at different error levels according to log content
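
As a rough illustration of the alerting feature, the sketch below parses raw log lines and picks an alert action by level. The line layout (`<timestamp> <LEVEL> <message>`), the `ALERT_ACTIONS` mapping, and all function names are assumptions made for this example; they are not part of any ELK component.

```python
import re

# Hypothetical log line layout: "<timestamp> <LEVEL> <message>".
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")

# Hypothetical mapping from log level to alert action.
ALERT_ACTIONS = {
    "ERROR": "page",
    "WARN": "email",
}


def parse_line(line):
    """Return a dict with ts, level, msg, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def alert_action(record):
    """Pick an alert action for a parsed record; None means no alert."""
    return ALERT_ACTIONS.get(record["level"])


def collect_alerts(lines):
    """Return (action, record) pairs for the lines that need an alert."""
    alerts = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # skip lines that do not follow the assumed layout
        action = alert_action(record)
        if action is not None:
            alerts.append((action, record))
    return alerts
```

In a real platform this decision would normally live in the analysis or alerting layer rather than in application code; the point here is only that the alert level is derived from the log content.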

2. The ELK stack

In fact, there are many log analysis products on the market: the simple Rsyslog, the commercial Splunk, and open-source options such as Scribe, Apache Flume (originally developed at Cloudera), and ELK. The ELK architecture is adopted here. After years of development, ELK (Elasticsearch, Logstash, Kibana) has now reached version 6.0.0, and there must be a reason for its rapid growth. Here is a brief introduction to the characteristics of the three components:

  • Elasticsearch offers high availability, real-time indexing, simple scaling, and a friendly interface
  • Logstash is a real-time data collection engine that can collect almost any kind of data
  • Kibana provides a web platform for analysis and visualization, used to query, analyze, and generate various reports

[Figure: architecture diagram of the log analysis platform]

From the architecture diagram, we can see that the principle of the overall log platform is not complicated. The log producers act as Shippers: they generate all kinds of logs, which are then transferred to Kafka. This transfer is itself done by Logstash, which reads from the producer and forwards to Kafka. Another Logstash instance then reads the log data from Kafka and stores it in Elasticsearch. Kafka sits in the middle purely as a buffer layer: if Logstash sent logs directly to Elasticsearch, data could be lost whenever Elasticsearch went down. That is why we use Kafka as a buffer.

Kafka is chosen here because, compared with most messaging systems, it offers better throughput along with built-in partitioning, replication, and failover. These properties suit large-scale message processing, and Internet application logs are almost always massive.
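
The buffering argument above can be sketched in a few lines of plain Python. This is a toy model, not Kafka or Logstash code: the `deque` stands in for a Kafka topic, and `FlakyStore` and `BufferedPipeline` are hypothetical names invented for the illustration. It only shows why logs survive a storage outage when the producer writes to a buffer instead of writing to the store directly.

```python
from collections import deque


class FlakyStore:
    """Stand-in for the storage layer; can be switched off to simulate an outage."""

    def __init__(self):
        self.available = True
        self.documents = []

    def index(self, doc):
        if not self.available:
            raise ConnectionError("store is down")
        self.documents.append(doc)


class BufferedPipeline:
    """Shipper -> buffer -> indexer -> store, with the buffer holding unsent logs."""

    def __init__(self, store):
        self.store = store
        self.buffer = deque()  # stand-in for a Kafka topic

    def ship(self, log_line):
        """Producer side: append to the buffer, never talk to the store directly."""
        self.buffer.append(log_line)

    def drain(self):
        """Consumer side: move buffered logs into the store; stop on failure."""
        delivered = 0
        while self.buffer:
            try:
                self.store.index(self.buffer[0])
            except ConnectionError:
                break  # leave the log in the buffer and retry on the next drain
            self.buffer.popleft()  # remove only after a successful write
            delivered += 1
        return delivered
```

A short usage example: logs shipped while the store is down stay in the buffer and are delivered, in order, once it comes back.

```python
store = FlakyStore()
pipeline = BufferedPipeline(store)

pipeline.ship("request 1")
store.available = False      # simulate the storage layer going down
pipeline.ship("request 2")
pipeline.drain()             # nothing delivered, nothing lost

store.available = True       # storage recovers
pipeline.drain()             # both logs are now stored
```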

3. Based on Docker

Docker is an epoch-making project in the era of cloud computing, and there are already many introductions and materials about it. docker-compose in particular gives Docker wings. Compared with traditional virtualization technology, a Docker application runs directly on the host kernel without booting a complete operating system. It can start in seconds or even milliseconds, which greatly reduces the time spent on development, testing, and deployment. It also keeps the runtime environment consistent, so problems of the "this code works on my machine" kind become much rarer.
