Building Prometheus + Grafana with Docker: a detailed walkthrough

Time:2022-8-18

1. Introduction to Prometheus

Prometheus is an open-source combination of monitoring system, alerting toolkit, and time series database, originally developed at SoundCloud. As it matured, more and more companies and organizations adopted it and its community became very active, so it was spun off as a standalone open-source project, and companies now help maintain it. The Google SRE book also mentions Prometheus as an implementation similar to Google's Borgmon monitoring system. In Kubernetes, the most widely used container management system, Prometheus is commonly used for monitoring.

The basic principle of Prometheus is to periodically scrape the state of monitored components over HTTP. The advantage is that any component can join the monitoring system as long as it exposes an HTTP endpoint, with no SDK or other integration work required. This makes it a good fit for virtualized environments such as VMs or Docker.
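To make the "just expose an HTTP endpoint" idea concrete, here is a minimal sketch of a component that serves its own metrics using only the Python standard library. The metric names, their values, and port 8000 are hypothetical placeholders, and real exporters normally use an official client library rather than hand-written output.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(values):
    """Render {name: number} as text exposition lines, one gauge per name."""
    lines = []
    for name, value in sorted(values.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    # Hypothetical readings; a real component would report its own state.
    values = {"demo_queue_length": 3, "demo_temperature_celsius": 21.5}

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(self.values).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve /metrics until interrupted (port 8000 is an arbitrary choice)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a scrape target at port 8000 would be enough for Prometheus to start collecting these two values.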

Prometheus is arguably one of the few monitoring systems well suited to Docker, Mesos, and Kubernetes environments.

The HTTP interface that exposes information about a monitored component is called an exporter. Exporters already exist for most components commonly used by Internet companies, such as Varnish, HAProxy, Nginx, MySQL, and Linux system information (disk, memory, CPU, network, and so on). For the supported sources, see: https://github.com/prometheus

Compared with other monitoring systems, the main features of Prometheus are:

  • A multidimensional data model (a time series is identified by a metric name and a set of key/value dimensions).
  • Very efficient storage: a sample takes about 3.5 bytes on average, and 3.2 million time series sampled every 30 seconds and kept for 60 days consume about 228 GB of disk.
  • A flexible query language.
  • No reliance on distributed storage; a single server node is autonomous.
  • Time series collection happens via a pull model over HTTP.
  • Pushing time series is supported through an intermediary gateway.
  • Targets are discovered through service discovery or static configuration.
  • Multiple modes of graphing and dashboard support.

2. Overview of Prometheus Architecture

This diagram illustrates the overall architecture of Prometheus and some of its ecosystem components:

The workflow is as follows: the Prometheus daemon periodically scrapes metrics from its targets, and each target must expose an HTTP endpoint for it to scrape.

Prometheus: Supports specifying scrape targets through configuration files, text files, ZooKeeper, Consul, DNS SRV lookup, and so on. It supports chart visualization in several ways, such as the very attractive Grafana, its own PromDash, and its own template engine. It also provides an HTTP API for queries, so you can build whatever output you need.

Alertmanager: A component independent of Prometheus. It works with Prometheus query expressions and provides very flexible alerting.

PushGateway: This component lets clients actively push metrics to the PushGateway, while Prometheus simply scrapes the gateway on its regular schedule.
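As a rough illustration of the push path, the sketch below builds a Pushgateway URL of the form /metrics/job/<job>/<label>/<value> and sends metrics with an HTTP PUT. The gateway address (9091 is the Pushgateway's default port), the job name, and the grouping labels are assumptions for the example, and label values containing a slash need special encoding that this sketch does not handle.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def push_url(gateway, job, grouping=None):
    """Build <gateway>/metrics/job/<job>[/<label>/<value>...]."""
    parts = [gateway.rstrip("/"), "metrics", "job", quote(job, safe="")]
    for name, value in (grouping or {}).items():
        parts += [quote(name, safe=""), quote(value, safe="")]
    return "/".join(parts)


def push(gateway, job, body, grouping=None):
    """PUT the text-format body; PUT replaces all metrics in that group."""
    request = Request(
        push_url(gateway, job, grouping),
        data=body.encode("utf-8"),
        method="PUT",
    )
    with urlopen(request) as response:
        return response.status
```

A short-lived batch job could call push("http://localhost:9091", "nightly_backup", "backup_duration_seconds 42\n") just before exiting; the body must end with a newline.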

If you have used statsd, this will feel very similar, except that statsd pushes data directly to the server, while Prometheus mainly relies on its own process actively scraping the data.

Most Prometheus components are written in Go, and they can easily be built and deployed as static binaries. Visit prometheus.io for full documentation, examples and guides.

3. The Prometheus data model

Fundamentally, Prometheus stores all data as time series. A metric name together with a set of labels (one or more) identifies one time series, and different label sets mean different time series. To support some queries, temporary time series are sometimes generated as well.

Metric names and labels

Each time series is uniquely identified by its metric name and a set of labels (key=value pairs).

Metric name: Usually a name given to the monitored object, such as http_requests_total. Names follow some rules: they may contain letters, digits, underscores, and the like. A common pattern is application name, monitored object, value, and unit, joined by underscores. For example: push_total, userlogin_mysql_duration_seconds, app_memory_usage_bytes.

Label: Identifies the different dimensions of a time series. For example, whether an HTTP request uses POST or GET and which endpoint it hits must be recorded in labels. The resulting identifier looks like this: http_requests_total{method="POST", endpoint="/api/tracks"}.

Remember that adding or removing a label on the metric name http_requests_total produces a new time series.

Queries can aggregate results by any combination of these labels.

Seen through the lens of a traditional database, http_requests_total is the table name, the labels are columns, the timestamp is the primary key, and there is one float64 column holding the value. (All values in Prometheus are stored as float64.)
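The identity rule above can be sketched as a toy in-memory store. This only illustrates the model and is not how Prometheus stores data internally; the class and function names are invented for the example.

```python
def series_key(name, labels):
    """A series is identified by its metric name plus its full label set."""
    return (name, frozenset(labels.items()))


class TinyStore:
    """Toy store: series key -> list of (timestamp, float64-style value)."""

    def __init__(self):
        self.series = {}

    def append(self, name, labels, timestamp, value):
        samples = self.series.setdefault(series_key(name, labels), [])
        samples.append((timestamp, float(value)))


store = TinyStore()
post = {"method": "POST", "endpoint": "/api/tracks"}
store.append("http_requests_total", post, 1000, 1)
# Same labels in a different order: still the same series.
store.append("http_requests_total", {"endpoint": "/api/tracks", "method": "POST"}, 1010, 2)
# A different label value, or an extra label, creates a new series.
store.append("http_requests_total", {"method": "GET", "endpoint": "/api/tracks"}, 1010, 5)
store.append("http_requests_total", dict(post, status="200"), 1010, 1)
print(len(store.series))  # 3
```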

4. The four Prometheus metric types

Counter

A Counter accumulates a value, such as the number of requests, completed tasks, or errors. It only ever increases and never decreases. It is reset when the process restarts.

For example: a scrape returns http_response_total{method="GET",endpoint="/api/tracks"} 100, and a scrape 10 seconds later returns http_response_total{method="GET",endpoint="/api/tracks"} 100.
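Because a counter resets on restart, anything that computes an increase from successive scrapes has to tolerate a drop. The sketch below shows one simple way to do that, treating any reading lower than the previous one as a restart from zero. It is a simplified illustration and not the exact algorithm behind PromQL's rate functions.

```python
def counter_increase(samples):
    """Total increase across successive counter readings, tolerating resets.

    A reading lower than the previous one is treated as a restart from
    zero, so the whole new reading counts as increase.
    """
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        total += current - previous if current >= previous else current
    return total


print(counter_increase([100, 100]))         # 0.0
print(counter_increase([100, 130, 5, 20]))  # 50.0
```

In the second call the counter grows by 30, restarts and reaches 5, then grows by 15 more.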

Gauge

A Gauge is an ordinary value that can go up or down, such as temperature or memory usage. It is reset when the process restarts.

For example: memory_usage_bytes{host="master-01"} 100 (scraped), memory_usage_bytes{host="master-01"} 30, memory_usage_bytes{host="master-01"} 50, memory_usage_bytes{host="master-01"} 80 (scraped). Only the values present at scrape time are recorded; changes between scrapes are not seen.

Histogram

A Histogram is often used to track the size of events, such as request duration and response size. What makes it special is that it groups the observations into buckets and also provides a count and a sum of all observed values.

For example: {less than 10 = 5 times, less than 20 = 1 time, less than 30 = 2 times}, count = 7 times, sum = the sum of those 7 values.
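A small sketch of the bucketing idea follows. Note that Prometheus exposes histogram buckets cumulatively: each bucket, labelled with an upper bound le, counts every observation less than or equal to that bound, and a final +Inf bucket counts everything. The bounds and observations here are made up for illustration.

```python
import bisect


class Histogram:
    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        # bisect_left finds the first bound >= value: the smallest fitting bucket.
        self.bucket_counts[bisect.bisect_left(self.bounds, value)] += 1
        self.count += 1
        self.sum += value

    def cumulative(self):
        """Return [(upper bound, cumulative count)], ending with +Inf."""
        out, running = [], 0
        for bound, n in zip(self.bounds + [float("inf")], self.bucket_counts):
            running += n
            out.append((bound, running))
        return out


h = Histogram([10, 20, 30])
for v in [3, 7, 10, 15, 25, 28, 40]:
    h.observe(v)
print(h.cumulative())  # [(10, 3), (20, 4), (30, 6), (inf, 7)]
print(h.count, h.sum)  # 7 128.0
```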

Summary

A Summary is very similar to a Histogram and is also often used to track the size of events, such as request duration and response size. It likewise provides a count and a sum of all observed values.

For example: count = 7 times, sum = the sum of those 7 values.

It also provides quantiles, which split the observed results by percentage. For example, a quantile of 0.95 gives the value at or below which 95% of the sampled values fall.
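To show what a quantile means, here is a naive nearest-rank calculation over a full list of observations. This is an assumption-laden illustration only: a real Summary estimates quantiles over a stream of observations rather than sorting everything in memory.

```python
import math


def quantile(values, q):
    """Nearest-rank quantile: smallest value with at least a fraction q
    of the data at or below it."""
    if not values or not 0 < q <= 1:
        raise ValueError("need non-empty values and 0 < q <= 1")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))  # 1-based rank
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))   # made-up observations: 1..100
print(quantile(latencies_ms, 0.95))  # 95
print(quantile([4, 1, 3, 2], 0.5))   # 2
```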

5. Install and run Prometheus (Docker version)

The following describes how to use Prometheus and Grafana to monitor the performance of the local server.

To monitor the machine itself, only one exporter is required:

node_exporter: collects system data from the machine

Grafana is an open-source, feature-rich data visualization platform commonly used to visualize time series data. It has built-in support for a range of data sources, Prometheus among them.

The following is the architecture diagram we used when installing:

Note: This article uses ubuntu-16.04.5-server-amd64, and only one server is needed!

Install Docker


apt-get install -y docker.io

Note: Some articles online say to install docker-engine or docker-ce. On a stock Ubuntu system those packages are not in the default repositories, so apt cannot find them at all!

Just install docker.io and that's it!

On CentOS, install it with yum install -y docker-io

Pull the images


docker pull prom/node-exporter
docker pull prom/prometheus
docker pull grafana/grafana

Start node-exporter


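# --net="host" shares the host network stack, so port 9100 is reachable
# without -p (Docker discards published ports in host networking mode).
docker run -d \
 -v "/proc:/host/proc:ro" \
 -v "/sys:/host/sys:ro" \
 -v "/:/rootfs:ro" \
 --net="host" \
 prom/node-exporter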

Wait a few seconds, then check that the port is listening:


# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address     State    PID/Program name
tcp    0   0 0.0.0.0:22       0.0.0.0:*        LISTEN   1147/sshd    
tcp    0   36 192.168.91.132:22    192.168.91.1:63648   ESTABLISHED 2969/0     
tcp    0   0 192.168.91.132:22    192.168.91.1:63340   ESTABLISHED 1321/1     
tcp6    0   0 :::9100         :::*          LISTEN   3070/node_exporter

Visit the URL:


http://192.168.91.132:9100/metrics

The effect is as follows:

These are the metrics being collected; with them in place, the data can be displayed.
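The /metrics page is plain text, one sample per line. As a rough sketch of how such output can be read, the parser below handles simple lines of the form name{labels} value. The sample text is a made-up excerpt, and the parser skips details such as optional timestamps and leaves escaped characters in label values as they are.

```python
import re

SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Made-up excerpt in the shape of node_exporter output.
SAMPLE_TEXT = """\
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 0.05
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""


def parse_metrics(text):
    """Yield (name, labels, value) per sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            labels = dict(LABEL_RE.findall(raw_labels or ""))
            yield name, labels, float(value)


for sample in parse_metrics(SAMPLE_TEXT):
    print(sample)
```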

Start Prometheus

Create a new directory named prometheus and edit the configuration file prometheus.yml in it:


mkdir /opt/prometheus
cd /opt/prometheus/
vim prometheus.yml

The content is as follows:


global:
  scrape_interval: 60s
  evaluation_interval: 60s

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus

  - job_name: linux
    static_configs:
      - targets: ['192.168.91.132:9100']
        labels:
          instance: localhost

Note: Change the IP address to your own; 192.168.91.132 is this server's address.

Start the Prometheus container


docker run -d \
 -p 9090:9090 \
 -v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
 prom/prometheus

Wait a few seconds, then check the port status:


# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address     State    PID/Program name
tcp    0   0 0.0.0.0:22       0.0.0.0:*        LISTEN   1147/sshd    
tcp    0   36 192.168.91.132:22    192.168.91.1:63648   ESTABLISHED 2969/0     
tcp    0   0 192.168.91.132:22    192.168.91.1:63340   ESTABLISHED 1321/1     
tcp6    0   0 :::9100         :::*          LISTEN   3070/node_exporter
tcp6    0   0 :::22          :::*          LISTEN   1147/sshd    
tcp6    0   0 :::9090         :::*          LISTEN   3336/docker-proxy

Visit the URL:


http://192.168.91.132:9090/graph

The effect is as follows:

To view the targets, visit:


http://192.168.91.132:9090/targets

The effect is as follows:

If the state is not UP yet, wait a moment and it should change to UP.
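The same check can be scripted against the Prometheus HTTP API, whose /api/v1/targets endpoint returns the active targets and their health. The sketch below is one possible way to list targets that are not yet up; the helper names are invented for the example, and the base URL would be the address used above.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (job, scrapeUrl, health) for active targets not reported 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t.get("labels", {}).get("job"), t.get("scrapeUrl"), t.get("health"))
        for t in targets
        if t.get("health") != "up"
    ]


def fetch_targets(base_url):
    """GET <base_url>/api/v1/targets and decode the JSON response."""
    with urlopen(base_url.rstrip("/") + "/api/v1/targets") as response:
        return json.load(response)
```

For instance, unhealthy_targets(fetch_targets("http://192.168.91.132:9090")) returns an empty list once both jobs are UP.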

Start Grafana

Create a new empty folder, grafana-storage, to store its data:


mkdir /opt/grafana-storage

Set permissions:


chmod 777 -R /opt/grafana-storage

The grafana user inside the container writes files to this directory; setting 777 directly is the simple, crude approach!

Start the Grafana container


docker run -d \
 -p 3000:3000 \
 --name=grafana \
 -v /opt/grafana-storage:/var/lib/grafana \
 grafana/grafana

Wait a few seconds, then check the port status:


# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address     State    PID/Program name
tcp    0   0 0.0.0.0:22       0.0.0.0:*        LISTEN   1147/sshd    
tcp    0   36 192.168.91.132:22    192.168.91.1:63648   ESTABLISHED 2969/0     
tcp    0   0 192.168.91.132:22    192.168.91.1:63340   ESTABLISHED 1321/1     
tcp6    0   0 :::9100         :::*          LISTEN   3070/node_exporter
tcp6    0   0 :::22          :::*          LISTEN   1147/sshd    
tcp6    0   0 :::3000         :::*          LISTEN   3494/docker-proxy
tcp6    0   0 :::9090         :::*          LISTEN   3336/docker-proxy
tcp6    0   0 192.168.91.132:9100   172.17.0.2:55108    ESTABLISHED 3070/node_exporter

Visit the URL:


http://192.168.91.132:3000/

By default, you are redirected to the login page first. The default username and password are both admin.

After logging in, you are asked to reset the password. You can simply enter admin again if you like!

Once the password is set, you are redirected to the home page.

Click Add data source. Because Grafana was installed from the Docker image, the version is fairly new, so the interface differs from the screenshots in other articles online!

Name: enter Prometheus

Type: choose Prometheus, because the data comes from it

URL: enter the IP and port of Prometheus

Click Save & Test at the bottom; if it turns green, the data source works.

Back on the home page, click New dashboard

Click on Graph

The effect is as follows:

Click Edit below the title

The effect is as follows:

Type cpu and suggestions will appear below the input.

Here we monitor node_load15, which is the system's 15-minute load average. Click Add Query below.
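Grafana gets these numbers by sending the query expression to Prometheus. As a hedged sketch of the same request made by hand, the code below builds an instant-query URL for the /api/v1/query endpoint and extracts the values from a response. The helper names are invented for the example, and the timestamp and value in any sample response are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, expr):
    """Build the instant-query URL for a PromQL expression."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": expr})


def vector_values(payload):
    """Map each series' instance label to its float value (instant vector)."""
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return {
        item["metric"].get("instance", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


def query(base_url, expr):
    """Run an instant query and return {instance: value}."""
    with urlopen(query_url(base_url, expr)) as response:
        return vector_values(json.load(response))
```

With the setup in this article, query("http://192.168.91.132:9090", "node_load15") would return the current 15-minute load keyed by the instance label from prometheus.yml.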

The effect is as follows:

Add total memory

An extra line appears on the chart.

Click the delete button on the right to remove the total memory query.

Click General and change the title (a Chinese title is used here).

The effect of the chart is as follows:

Click the save button at the top

Enter a name

The effect is as follows:

Go back to the home page and the dashboard will be listed there.

Reference link for this article:

http://www.ywnds.com/?p=9656
