First scene, first mirror

Time:2020-3-26

As the last spring breeze blows and the air fills with the taste of summer, it's time to get rolling and learn something.
Recently, while learning Redis, I came across a fun thing called the Bloom filter. My level isn't high enough to study the source code yet, so I wrote a simple one to play with.

Principle

Please forgive my lecturing. As I understand it, a Bloom filter is used to determine whether a key exists, and it is built on a bitmap. Its defining characteristic: if it says a key does not exist, you can trust it completely; if it says a key exists, you should take that with a grain of salt. To be specific, take a key and run it through n different hash functions to produce n bit indexes, then set those bits to 1 in the bitmap. To look up another key, run it through the same hashes and check whether every bit at the resulting indexes is 1. If all of them are 1, the key (possibly) exists; otherwise it (certainly) does not. The inaccuracy comes from collisions: bits set by other keys can happen to cover all of a new key's indexes. You can Google it for an explanation much clearer than mine.
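
To put a rough number on that "possibly", here is a minimal Go sketch of the textbook false-positive estimate p ≈ (1 - e^(-kn/m))^k for a filter with m bits, k hash functions and n inserted keys. The function name falsePositiveRate and its parameters are my own illustration, and the formula assumes every hash spreads keys uniformly over all m bits, which the toy hashes later in this post do not.

```go
package main

import "math"

// falsePositiveRate estimates the chance that looking up a key that was never
// added still finds all k of its bits set, in a filter of m bits holding n keys.
// It assumes uniform, independent hashes, which is an idealization.
func falsePositiveRate(m, k, n int) float64 {
	if m <= 0 || k <= 0 {
		return 1 // degenerate filter: treat every lookup as a possible hit
	}
	// Fraction of bits expected to be 1 after n insertions of k bits each.
	filled := 1 - math.Exp(-float64(k)*float64(n)/float64(m))
	return math.Pow(filled, float64(k))
}
```

For example, falsePositiveRate(101, 3, 10) comes out at roughly 0.017, and the estimate climbs as more keys are added to the same number of bits.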

Start

First of all, we need a bitmap. Since we've decided to write it by hand, the DIY starts from zero.
On to the code:

package bitMap

import "unsafe"

// BM is the element type of the bitmap slice.
type BM int64

// BitMap uses a slice as the bitmap; within each element, bits are counted from the lowest bit upward.
type BitMap struct {
    BitSlice []BM
    BitNum   uint
}

// Number of bits in one BM element. There must be a better way to calculate this.
var bmBitNum = uint(unsafe.Sizeof(BM(1)) * 8)

// NewBitMap creates a bitmap; n is the number of bits required.
func NewBitMap(n int) *BitMap {
    // Work out how many elements are needed to hold n bits (+1 covers a partial last element)
    elemNum := uint(n)/bmBitNum + 1
    return &BitMap{
        BitSlice: make([]BM, elemNum),
        BitNum:   uint(n),
    }
}

// Set sets bit n to 1; n is the bit index in the bitmap.
func (bm *BitMap) Set(n uint) {
    // Valid indexes are 0 .. BitNum-1
    if n >= bm.BitNum {
        return
    }
    // Find which slice element holds the bit
    elemIndex := n / bmBitNum
    // Find the bit's position within that element
    bitIndex := n % bmBitNum
    // Set the bit to 1 with a bitwise OR; shifting BM(1) keeps the full 64-bit width on every platform
    bm.BitSlice[elemIndex] |= BM(1) << bitIndex
}

// Get uses the same indexing to report whether bit n has been set to 1.
func (bm *BitMap) Get(n uint) bool {
    if n >= bm.BitNum {
        return false
    }
    elemIndex := n / bmBitNum
    bitIndex := n % bmBitNum
    return (bm.BitSlice[elemIndex] & (BM(1) << bitIndex)) != 0
}
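
The comment above wonders whether there is a better way to get the element width. One option, sketched here under my own assumptions, is to back the bitmap with a fixed-width uint64 so the width is simply the constant 64. The names bitSet, newBitSet, set and get are hypothetical stand-ins, not part of the code in this post.

```go
package main

// wordBits is the width of one slice element. With a fixed-width type such as
// uint64 it can be a plain constant, so unsafe.Sizeof is not needed.
const wordBits = 64

// bitSet is a hypothetical stand-in for the BitMap type above.
type bitSet struct {
	words []uint64
	size  uint // number of addressable bits
}

// newBitSet allocates enough words to hold n bits, rounding up.
func newBitSet(n uint) *bitSet {
	return &bitSet{words: make([]uint64, (n+wordBits-1)/wordBits), size: n}
}

// set turns bit i on; out-of-range indexes are ignored.
func (b *bitSet) set(i uint) {
	if i >= b.size {
		return
	}
	b.words[i/wordBits] |= 1 << (i % wordBits)
}

// get reports whether bit i is on; out-of-range indexes read as false.
func (b *bitSet) get(i uint) bool {
	if i >= b.size {
		return false
	}
	return b.words[i/wordBits]&(1<<(i%wordBits)) != 0
}
```

With newBitSet(101), indexes 0 through 100 are usable and fit in two words. If the element type has to stay a platform-sized uint, the standard library constant bits.UintSize in math/bits gives its width.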

OK, so we've finished a simple bitmap. Further optimization will have to wait until my level is higher.
With the bitmap in hand, we can build the Bloom filter.
Here it is:

// This code lives in its own package and imports the bitMap package written above
// (the import path depends on your project layout).

// These two slices drive the hashes: cap holds the multipliers and mod the moduli.
// Three hashes generate three bits. The largest modulus is the prime 101,
// so the largest bit index a hash can produce is 100.
// Note: the name cap shadows Go's builtin cap inside this package.
var cap = []uint{7, 11, 13}
var mod = []uint{31, 37, 101}

// BloomFilter puts the hand-written bitmap to use.
type BloomFilter struct {
    BitMap *bitMap.BitMap
}

// NewBloomFilter creates a filter; n is still the number of bits needed.
// n should be at least 101 so that every index the hashes produce fits in the bitmap.
func NewBloomFilter(n int) *BloomFilter {
    return &BloomFilter{
        BitMap: bitMap.NewBitMap(n),
    }
}

// Set hashes the value three times and sets the three resulting bits.
func (bf *BloomFilter) Set(value string) {
    for i := 0; i < len(cap); i++ {
        bf.BitMap.Set(hash(value, i))
    }
}

// Exist applies the same three hashes and reports whether all three bits are set.
func (bf *BloomFilter) Exist(value string) bool {
    for i := 0; i < len(cap); i++ {
        if !bf.BitMap.Get(hash(value, i)) {
            return false
        }
    }
    return true
}

// hash is an algorithm I wrote myself, with a strong home-grown flavor.
// I will definitely make a better one after studying for a while!
// The byte subtraction s[i] - 'a' wraps around for bytes below 'a', which is harmless here.
func hash(s string, index int) uint {
    bit := uint(1)
    for i := 0; i < len(s); i++ {
        bit = (bit*cap[index] + (uint(s[i]-'a') + uint(1))) % mod[index]
    }
    return bit
}
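
As one possible direction for that "better hash", here is a minimal sketch that swaps the hand-written hash for FNV-1a from Go's standard hash/fnv package and derives k bit positions from a single 64-bit hash by double hashing (h1 + i*h2). This is a substitute technique, not the method used above; the names fnvBloom, newFnvBloom, indexes, add and mayContain are hypothetical, and a []bool stands in for the bitmap to keep the example short.

```go
package main

import "hash/fnv"

// fnvBloom is a hypothetical Bloom filter of len(bits) bits probed k times per key.
type fnvBloom struct {
	bits []bool // one bool per bit keeps the sketch short; a packed bitmap works too
	k    uint64
}

// newFnvBloom creates a filter with m bits and k probes (both clamped to at least 1).
func newFnvBloom(m, k int) *fnvBloom {
	if m < 1 {
		m = 1
	}
	if k < 1 {
		k = 1
	}
	return &fnvBloom{bits: make([]bool, m), k: uint64(k)}
}

// indexes derives k bit positions from one FNV-1a hash by splitting it into
// two 32-bit halves and combining them as h1 + i*h2.
func (f *fnvBloom) indexes(key string) []uint64 {
	h := fnv.New64a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	sum := h.Sum64()
	h1, h2 := sum&0xffffffff, sum>>32
	h2 |= 1 // keep the step non-zero
	m := uint64(len(f.bits))
	out := make([]uint64, f.k)
	for i := uint64(0); i < f.k; i++ {
		out[i] = (h1 + i*h2) % m
	}
	return out
}

// add marks every bit position derived from key.
func (f *fnvBloom) add(key string) {
	for _, i := range f.indexes(key) {
		f.bits[i] = true
	}
}

// mayContain is false only if key was definitely never added.
func (f *fnvBloom) mayContain(key string) bool {
	for _, i := range f.indexes(key) {
		if !f.bits[i] {
			return false
		}
	}
	return true
}
```

Unlike the three small moduli above, the positions here range over the whole bitmap, so making the filter larger actually lowers the chance of a false positive.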

Well, that's it. What I really want to say is that learning anything should start from the principle, followed by practicing like crazy. Don't settle for being a half-trained coder.

Finally, let me share my official account, Algorithmic dreamer. Come and play with algorithms, music and literary creation with me. Let's fly!