Cluster size estimation


Background: QPS = 2 million/s, sustained all day

A safer strategy is to keep the peak QPS at about 30% of the total QPS the cluster can carry.

That is, the cluster should be designed for an upper limit of 6 million to 7 million QPS (2 million / 0.3 is about 6.67 million, so the processing capacity of the cluster is 3 to 4 times the peak load).
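The headroom rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name required_capacity_qps and its parameters are made up for this example and do not come from any tool.

```python
def required_capacity_qps(peak_qps: float, target_utilization: float) -> float:
    """Capacity needed so that peak_qps uses only target_utilization of it."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return peak_qps / target_utilization


# 2 million QPS at peak, kept at about 30% of what the cluster can carry
capacity = required_capacity_qps(2_000_000, 0.30)
print(f"required capacity: {capacity:,.0f} QPS")
print(f"headroom factor:   {capacity / 2_000_000:.2f}x")
```

The result falls inside the 6 million to 7 million range quoted above; picking a different utilization target only changes the second argument.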

Data cluster estimation:

Assume each record is about 2 KB. Converted to GB, that is 2 / 1024 / 1024, so each record is about 1.9073486328125e-6 GB.

Storage estimate:

Daily data increment: 2000000 * 24 * 60 * 60 = 172800000000 records/day

172800000000 * (2 / 1024 / 1024) = 329589.84375 GB/day = 321.8650817871094 TB/day = 0.314321368932724 PB/day

0.314321368932724 PB/day * 3 * 365 = 344.1818989813328 PB/year (the factor of 3 is presumably the number of replicas kept for each record)

Generally, the data stored should not exceed 80% of the cluster's total storage capacity, so the total capacity needed for one year is:

344.1818989813328 PB / 0.8 = 430.227373726666 PB/year

Assuming each node stores 10 PB:

430.227373726666 PB / 10 PB = 43.0227373726666, so about 44 nodes are needed to store one year's data
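The storage arithmetic above can be reproduced with a short Python sketch. The constant names are invented for this example, and reading the factor of 3 as a replica count is an assumption, since the text does not say what it stands for.

```python
import math

QPS = 2_000_000          # records per second, sustained all day
RECORD_KB = 2            # assumed size of one record
REPLICATION = 3          # the "* 3" in the yearly formula; assumed to be replicas
MAX_FILL = 0.8           # keep stored data under 80% of raw capacity
NODE_CAPACITY_PB = 10    # storage per node used in the text

records_per_day = QPS * 24 * 60 * 60
gb_per_day = records_per_day * RECORD_KB / 1024 / 1024   # KB -> MB -> GB
pb_per_day = gb_per_day / 1024 / 1024                    # GB -> TB -> PB
pb_per_year = pb_per_day * REPLICATION * 365
provisioned_pb = pb_per_year / MAX_FILL
nodes = math.ceil(provisioned_pb / NODE_CAPACITY_PB)

print(f"records/day:    {records_per_day:,}")
print(f"GB/day:         {gb_per_day:,.2f}")
print(f"PB/year:        {pb_per_year:.2f}")
print(f"provisioned PB: {provisioned_pb:.2f}")
print(f"nodes:          {nodes}")
```

Keeping the inputs as named constants makes it easy to see how sensitive the node count is to record size or node capacity.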

Memory estimation:

There is no absolute standard for memory estimation. Some companies use Flink to process Internet of Things data with only a few machines, each with less than 10 GB of memory. In practice the memory requirement varies greatly with the components in use and with the workload: the number of tasks executed, the mix of real-time and offline tasks, algorithm models, and so on.

Generally, the resources occupied by real-time tasks are fixed and can be estimated from the number of services. Offline tasks can be estimated from the number of ETL tasks and the resource configuration of each task. When offline and real-time computing run at the same time, together they should not exceed 90% of the available resources.

Real-time tasks should occupy less than 50% of resources. At 2000000 records/s, a one-minute window holds 2000000 * 60 * (2 / 1024 / 1024) = 228.9 GB. If a five-minute window is used instead, that is about 1144.4 GB.

Keeping usage at no more than 50%: 1144.4 GB / 0.5 = 2288.8 GB (five-minute window); 228.9 GB / 0.5 = 457.8 GB (one-minute window)

2288.8 GB / 44 = 52 GB per node (or 457.8 GB / 44 = 10.40 GB per node)
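The window-based memory figures can be checked with a small Python sketch. The helper names window_memory_gb and per_node_memory_gb are hypothetical, and the 2 KB record size and 44-node count are the values used earlier in this estimate.

```python
def window_memory_gb(qps: int, window_seconds: int, record_kb: float = 2) -> float:
    """Raw data (in GB) held in memory for one window of the given length."""
    return qps * window_seconds * record_kb / 1024 / 1024


def per_node_memory_gb(window_gb: float, max_share: float, nodes: int) -> float:
    """Memory per node so that window_gb uses at most max_share of cluster memory."""
    return window_gb / max_share / nodes


QPS = 2_000_000
NODES = 44         # node count from the storage estimate
MAX_SHARE = 0.5    # real-time tasks limited to 50% of memory

for label, seconds in (("1-minute", 60), ("5-minute", 300)):
    window_gb = window_memory_gb(QPS, seconds)
    node_gb = per_node_memory_gb(window_gb, MAX_SHARE, NODES)
    print(f"{label} window: {window_gb:.1f} GB buffered, {node_gb:.1f} GB per node")
```

This only counts the raw records in a window; state, serialization overhead, and framework memory are not modeled here.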

CPU estimation:

The ratio of CPU cores to memory (in GB) is generally 1:2 or 1:4, although the right value also depends on the number of threads.

A 16-core CPU can generally handle one to two hundred threads. If the computational load is heavy, a CPU with more cores, such as 32 cores, is recommended.

The CPU should support hyper-threading and the SSE4.2 instruction set.
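As a rough illustration of the ratio rule, the Python sketch below derives a core count from the per-node memory estimate of 52 GB. It reads the ratio as cores to GB of memory, and rounding up to a power of two is an assumption made here to mimic common CPU sizes; both helper names are hypothetical.

```python
import math


def cores_for_memory(mem_gb: float, gb_per_core: int) -> int:
    """Cores implied by a 1:gb_per_core ratio of cores to GB of memory."""
    return math.ceil(mem_gb / gb_per_core)


def next_power_of_two(n: int) -> int:
    """Round up to a power of two (an assumed stand-in for common CPU sizes)."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


MEM_GB_PER_NODE = 52  # per-node estimate from the five-minute window

for gb_per_core in (2, 4):
    cores = cores_for_memory(MEM_GB_PER_NODE, gb_per_core)
    print(f"1:{gb_per_core} ratio: {cores} cores, "
          f"rounded up to {next_power_of_two(cores)}")
```

Under these assumptions the 1:2 ratio lands on 32 cores and the 1:4 ratio on 16 cores, which matches the two CPU sizes discussed above.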

To sum up, the recommended configuration:

Nodes     44

Mem       56 GB per node

CPU       32 cores, supporting the SSE4.2 instruction set