[springboot DB series] hyperloglog of redis advanced features

Time:2021-10-25

The HyperLogLog algorithm uses very little space to estimate the number of distinct elements in very large data sets. For example, the earlier post on bitmap covered counting daily active users; once the data volume reaches the millions, HyperLogLog is the better storage choice. This post introduces the basic principle of HyperLogLog and how to use it in Redis.

1. Basic use

1. Configuration

The project environment is built with Spring Boot 2.2.1.RELEASE. Add the Redis dependency directly in pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

If Redis runs with its default configuration, no extra configuration is needed. Otherwise, configure it in application.yml as follows:

spring:
  redis:
    host: 127.0.0.1
    port: 6379
    password:

2. Usage

Let's look at how to use it first. The principle is explained later.

In Redis, HyperLogLog is very simple to use. There are generally two commands: pfadd to add elements and pfcount to count them. There is also the less commonly used pfmerge.

a. add

Add a record

public boolean add(String key, String obj) {
    // pfadd key obj
    return stringRedisTemplate.opsForHyperLogLog().add(key, obj) > 0;
}

b. pfcount

Approximate count (Redis documents a standard error of about 0.81%)

public long count(String key) {
    // pfcount key: approximate number of distinct elements added to key
    return stringRedisTemplate.opsForHyperLogLog().size(key);
}

c. merge

Merge multiple HyperLogLogs into a new one. In my experience, not many scenarios need this.

public boolean merge(String out, String... key) {
    // pfmerge out key1 key2 --> merge key1 and key2 into a new HyperLogLog named out
    return stringRedisTemplate.opsForHyperLogLog().union(out, key) > 0;
}

3. Principle description

I will not go into the principle of HyperLogLog in detail here. To be honest, I don't fully understand the algorithm or the harmonic mean formula, so this is only my simplified understanding.

A HyperLogLog in Redis is divided into 2^14 = 16384 buckets, with 6 bits per bucket.

Before an element is inserted into the HyperLogLog, it is hashed to a 64-bit binary value:

  • The low 14 bits locate the index of the bucket
  • In the high 50 bits, scanning from low to high, find the position n of the first 1

    • If the value already in the bucket is greater than n, n is discarded
    • Otherwise, the value in the bucket is set to n
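The bucket update described above can be sketched in plain Java. This is a toy model for illustration only, not Redis's implementation: the class name `HllBucketSketch`, the FNV-1a hash used as a stand-in (Redis itself uses a different hash function), and returning 51 when the high 50 bits contain no 1 are all assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;

/** Toy model of the bucket update step; names and hash choice are hypothetical. */
final class HllBucketSketch {
    static final int INDEX_BITS = 14;            // low 14 bits select the bucket
    static final int BUCKETS = 1 << INDEX_BITS;  // 2^14 = 16384 buckets

    // One byte per bucket for simplicity; the stored values (1..51) would fit in 6 bits.
    final byte[] buckets = new byte[BUCKETS];

    /** Bucket index: the low 14 bits of the 64-bit hash. */
    static int bucketIndex(long hash) {
        return (int) (hash & (BUCKETS - 1));
    }

    /** 1-based position of the first 1 in the high 50 bits, scanning from the low end. */
    static int firstOnePosition(long hash) {
        long high = hash >>> INDEX_BITS;
        if (high == 0) {
            return 64 - INDEX_BITS + 1;          // assumption: no 1 found, use 51
        }
        return Long.numberOfTrailingZeros(high) + 1;
    }

    /** Keeps the larger of the stored value and n; returns true if the bucket changed. */
    boolean offer(long hash) {
        int idx = bucketIndex(hash);
        int n = firstOnePosition(hash);
        if (buckets[idx] >= n) {
            return false;                        // existing value wins, n is discarded
        }
        buckets[idx] = (byte) n;
        return true;
    }

    /** Stand-in 64-bit hash (FNV-1a) so that strings can be fed into offer(). */
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xffL);
            h *= 0x100000001b3L;
        }
        return h;
    }
}
```

Adding an element is then `sketch.offer(HllBucketSketch.hash64(userId))`. Offering the same element again never changes a bucket, which is why duplicates do not inflate the count.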

So how to count?

  • Take the values in all buckets and substitute them into the following formula for calculation

[Figure: HyperLogLog estimation formula]
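For reference, the standard HyperLogLog estimate is a normalized harmonic mean over the bucket values. That this is the formula meant here is an assumption. The symbols are not from the original post: m is the number of buckets, M_j is the value stored in bucket j, and α_m is a bias-correction constant.

```latex
E = \alpha_m \, m^2 \left( \sum_{j=1}^{m} 2^{-M_j} \right)^{-1},
\qquad m = 2^{14} = 16384,
\qquad \alpha_m \approx \frac{0.7213}{1 + 1.079/m}
```

Real implementations, Redis included, refine this raw estimate further. Those refinements are omitted here.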

How is the above formula derived?

I read an article on this earlier and found it helpful. If you are interested in the principle, see: https://www.jianshu.com/p/55defda6dcd2

4. Application scenarios

HyperLogLog is usually used for approximate counting. The earlier post on daily active user statistics used a bitmap. However, a bitmap is a poor fit when user IDs are unevenly distributed, with some very small and some very large.

With large data volumes, HyperLogLog has a big advantage: the storage it occupies is bounded by its 2^14 buckets of 6 bits each, about 12 KB per key.
The following figure is from the blog post “How to count the daily and monthly active users”

[Figure: daily and monthly active user statistics]
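To put the storage claim in numbers, the following sketch compares the dense HyperLogLog size with a bitmap indexed directly by user ID. The class and method names are hypothetical, and the bitmap size assumes one bit per possible ID up to the largest one.

```java
/** Back-of-the-envelope size comparison; names are hypothetical. */
final class HllVsBitmapSize {
    /** Dense HyperLogLog: 2^14 buckets, 6 bits each. */
    static long hyperLogLogBytes() {
        return (1L << 14) * 6 / 8;
    }

    /** Bitmap with one bit per user ID from 0 up to maxUserId inclusive. */
    static long bitmapBytes(long maxUserId) {
        return maxUserId / 8 + 1;
    }
}
```

`hyperLogLogBytes()` gives 12288 bytes (12 KB) regardless of the IDs involved, while `bitmapBytes(1_000_000_000L)` gives 125000001 bytes (about 119 MB) even if only a handful of users with large IDs are active.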

The design of daily active user statistics with HyperLogLog is relatively simple:

  • Generate one key per day
  • When a user visits, execute pfadd key userId
  • To get the total, execute pfcount key
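The one-key-per-day step can be sketched as a small helper. The class name `DailyActiveKeys`, the `dau:` prefix, and the `yyyyMMdd` date format are assumptions for illustration; any stable naming scheme works.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Builds per-day HyperLogLog keys; prefix and format are hypothetical choices. */
final class DailyActiveKeys {
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE; // yyyyMMdd

    /** Key holding the active users of a single day, e.g. dau:20211025. */
    static String dailyKey(LocalDate day) {
        return "dau:" + DAY.format(day);
    }

    /** Keys for the `days` days ending at `end` (inclusive), oldest first. */
    static List<String> lastDaysKeys(LocalDate end, int days) {
        List<String> keys = new ArrayList<>();
        for (int i = days - 1; i >= 0; i--) {
            keys.add(dailyKey(end.minusDays(i)));
        }
        return keys;
    }
}
```

With the add and count methods from the usage section, a visit becomes `add(DailyActiveKeys.dailyKey(LocalDate.now()), userId)` and the day's total is `count(DailyActiveKeys.dailyKey(LocalDate.now()))`. Passing `lastDaysKeys(...)` to merge is one way to get a multi-day figure.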

2. Other

0. Project

Series blog posts

  • [DB series] Redis advanced features: publish and subscribe
  • [DB series] Redis advanced features: bitmap usage and application scenarios
  • [DB series] Redis pipeline usage
  • [DB series] Redis cluster environment configuration
  • [DB series] Building a simple site statistics service with Redis (application)
  • [DB series] Implementing a ranking feature with Redis (application)
  • [DB series] Redis ZSet data structure usage
  • [DB series] Redis Set data structure usage
  • [DB series] Redis Hash data structure usage
  • [DB series] Redis List data structure usage
  • [DB series] Reading and writing the Redis String data structure
  • [DB series] Redis Jedis configuration
  • [DB series] Basic configuration of Redis

Engineering source code

  • Project: https://github.com/liuyueyi/spring-boot-demo
  • Project source code: https://github.com/liuyueyi/spring-boot-demo/tree/master/spring-boot/122-redis-template

1. A gray blog

The above is only one person's view. Given my limited ability, omissions and mistakes are inevitable. If you find a bug or have a better suggestion, corrections are welcome and appreciated.

Below is my personal blog, which collects all the posts from my study and work. You are welcome to visit.

  • A personal blog https://blog.hhui.top
  • A gray blog spring special blog http://spring.hhui.top
