Redis source code learning (2) – Implementation of dynamic strings in Redis (1)

Time:2020-11-15

The dynamic string type of Redis is defined in src/sds.h. Callers only need to use the API to append data to a string and do not have to care about capacity expansion. Redis uses typedef char *sds; to describe this dynamic string. In memory it is laid out as a string header followed by a contiguous block of dynamically allocated memory, and sds points to the first byte of the contiguous memory that follows the string header. The memory layout is shown in the following figure:

[Figure: sds memory layout, with the string header followed by the contiguous buffer that sds points to]


Header information of strings

The header used by an sds depends on the size of the allocated buffer and the size of the used buffer. Redis defines five sds header types:

  • sdshdr5, with type SDS_TYPE_5
  • sdshdr8, with type SDS_TYPE_8
  • sdshdr16, with type SDS_TYPE_16
  • sdshdr32, with type SDS_TYPE_32
  • sdshdr64, with type SDS_TYPE_64

Each of the five sdshdr types above corresponds to a maximum buffer size that can be allocated: sdshdr5 covers buffers of fewer than 1 << 5 bytes, sdshdr8 buffers of fewer than 1 << 8 bytes, and so on. Apart from sdshdr5, the remaining headers all share the following format (taking sdshdr32 as an example):

struct __attribute__ ((__packed__)) sdshdr32
{
    uint32_t len;        // length of the used buffer
    uint32_t alloc;      // size of the allocated buffer, excluding the header and the null terminator
    unsigned char flags; // header type in the lower three bits, here SDS_TYPE_32
    char buf[];          // dynamically allocated buffer
};

The sdshdr5 header is defined as follows:

struct __attribute__ ((__packed__)) sdshdr5
{
    unsigned char flags; // lower three bits hold the type, upper five bits hold the string length
    char buf[];          // dynamically allocated buffer
};

Choosing an sdshdr of the right size for the string we need saves space. From the sdshdr definitions we can see that, whichever sdshdr is used, the field immediately before sdshdr.buf is always sdshdr.flags. The sds variable we actually use in Redis points to sdshdr.buf, and the whole SDS is one contiguous allocation. So sds[-1], one byte before the sds pointer, must be the sdshdr.flags field of this SDS. A bit operation on it tells us which sdshdr the SDS uses, and a pointer offset then gives us the whole SDS header. Most of the basic SDS operations described below are implemented this way.
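As a minimal sketch of this layout, the following code builds a string with a single 8-bit header and recovers the header from the string pointer alone. The demo_ names are hypothetical and are not part of Redis; the constants are assumed to match SDS_TYPE_8 and SDS_TYPE_MASK in sds.h, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, assumed to match SDS_TYPE_8 and SDS_TYPE_MASK. */
#define DEMO_TYPE_8    1
#define DEMO_TYPE_MASK 7

struct __attribute__ ((__packed__)) demo_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Allocate header, payload and terminator in one block and return buf. */
static char *demo_new8(const char *init) {
    size_t len = strlen(init);
    struct demo_hdr8 *sh;
    assert(len < 256);                 /* must fit the 8-bit len field */
    sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = DEMO_TYPE_8;
    memcpy(sh->buf, init, len + 1);    /* copies the terminator too */
    return sh->buf;
}

/* The header type lives in the byte just before the string pointer. */
static int demo_type(const char *s) {
    return (unsigned char)s[-1] & DEMO_TYPE_MASK;
}

/* Step back over the header to recover it from the string pointer. */
static struct demo_hdr8 *demo_hdr(char *s) {
    return (struct demo_hdr8 *)(s - sizeof(struct demo_hdr8));
}
```

Because the returned pointer is the buffer itself, it can be passed directly to functions that expect a C string, while the header remains reachable through the negative offset.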


General low-level operations on strings

Several macros and static functions are defined in the header file to implement the basic operations on sds.

#define SDS_HDR(T,s) ((struct sdshdr##T *)((s)-(sizeof(struct sdshdr##T))))

Given an sds, SDS_HDR obtains the pointer to its corresponding sdshdr. In practice, sdshdr.flags is read from the SDS first to determine the header type, and SDS_HDR is then called to get the whole header, for example:

unsigned char flags = s[-1];
switch (flags & SDS_TYPE_MASK)
{
    ...
    case SDS_TYPE_8:
        SDS_HDR(8, s)->len = new_len;
        break;
    ...
}

#define SDS_HDR_VAR(T,s) struct sdshdr##T *sh = (void*)((s)-(sizeof(struct sdshdr##T)));

Given an sds, this macro declares an sdshdr pointer variable sh and assigns the pointer to the corresponding sdshdr header to it.

static inline size_t sdslen(const sds s);

Given an sds, return the length of its used buffer. The implementation works as follows:

  1. Relying on the memory layout of sdshdr and sds, read the flags field of the string header through s[-1].
  2. Compute the corresponding sdshdr type from flags.
  3. Call the SDS_HDR macro to get the corresponding string header, then read its len field.

static inline size_t sdsalloc(const sds s);

Given an sds, return the total size of its allocated buffer sdshdr.buf.

static inline size_t sdsavail(const sds s);

Given an sds, return the free space left in its buffer. This can be understood as sdsavail(s) == sdsalloc(s) - sdslen(s).
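The three accessors above all follow the same pattern: read the type from s[-1], then dispatch to the matching header. Below is a hedged sketch with only two header widths. All demo_ and DEMO_ names are hypothetical, the constants are assumed to match the values in sds.h, and the packed attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to match SDS_TYPE_8, SDS_TYPE_16 and SDS_TYPE_MASK in sds.h. */
#define DEMO_TYPE_8    1
#define DEMO_TYPE_16   2
#define DEMO_TYPE_MASK 7

struct __attribute__ ((__packed__)) demo_hdr8 {
    uint8_t len, alloc;
    unsigned char flags;
    char buf[];
};
struct __attribute__ ((__packed__)) demo_hdr16 {
    uint16_t len, alloc;
    unsigned char flags;
    char buf[];
};

#define DEMO_HDR(T, s) ((struct demo_hdr##T *)((s) - sizeof(struct demo_hdr##T)))

/* Allocate an empty string with the requested header type and capacity. */
static char *demo_new(int type, size_t alloc) {
    size_t hdrlen = (type == DEMO_TYPE_8) ? sizeof(struct demo_hdr8)
                                          : sizeof(struct demo_hdr16);
    char *mem = malloc(hdrlen + alloc + 1);
    char *s;
    if (mem == NULL) return NULL;
    s = mem + hdrlen;
    s[-1] = (char)type;  /* flags is always the byte just before buf */
    if (type == DEMO_TYPE_8) {
        DEMO_HDR(8, s)->len = 0;
        DEMO_HDR(8, s)->alloc = (uint8_t)alloc;
    } else {
        DEMO_HDR(16, s)->len = 0;
        DEMO_HDR(16, s)->alloc = (uint16_t)alloc;
    }
    s[0] = '\0';
    return s;
}

/* Used length: dispatch on the type stored in s[-1]. */
static size_t demo_len(char *s) {
    switch ((unsigned char)s[-1] & DEMO_TYPE_MASK) {
        case DEMO_TYPE_8:  return DEMO_HDR(8, s)->len;
        case DEMO_TYPE_16: return DEMO_HDR(16, s)->len;
    }
    return 0;
}

/* Allocated buffer size, excluding header and terminator. */
static size_t demo_alloc(char *s) {
    switch ((unsigned char)s[-1] & DEMO_TYPE_MASK) {
        case DEMO_TYPE_8:  return DEMO_HDR(8, s)->alloc;
        case DEMO_TYPE_16: return DEMO_HDR(16, s)->alloc;
    }
    return 0;
}

/* Free space is whatever is allocated but not yet used. */
static size_t demo_avail(char *s) {
    return demo_alloc(s) - demo_len(s);
}

/* Set the used length; unlike sdssetlen, this sketch checks the bound. */
static void demo_setlen(char *s, size_t newlen) {
    assert(newlen <= demo_alloc(s));
    switch ((unsigned char)s[-1] & DEMO_TYPE_MASK) {
        case DEMO_TYPE_8:  DEMO_HDR(8, s)->len = (uint8_t)newlen; break;
        case DEMO_TYPE_16: DEMO_HDR(16, s)->len = (uint16_t)newlen; break;
    }
}

/* Release the whole allocation, starting from the header. */
static void demo_free(char *s) {
    if (((unsigned char)s[-1] & DEMO_TYPE_MASK) == DEMO_TYPE_8)
        free(DEMO_HDR(8, s));
    else
        free(DEMO_HDR(16, s));
}
```

The caller never needs to know which header width is in use; every function rediscovers it from the flags byte.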

static inline void sdssetlen(sds s, size_t newlen);

Given an sds and a new length newlen, set the len field in the SDS header to newlen.

static inline void sdsinclen(sds s, size_t inc);

Given an sds and a length increment inc, increase the len field in the SDS header by inc. Note that sdsinclen does not check whether inc is legal; it simply adds inc to sdshdr.len. The caller must therefore check that the length is valid before calling sdsinclen.

static inline void sdssetalloc(const sds s, size_t newlen);

Given an sds and a new length newlen, set the alloc field in the SDS header to newlen.

Reading the source, we can see that an sds of type SDS_TYPE_5 differs considerably from the other sds types, both in the sdshdr structure and in the basic operation interfaces. The commit message written by the Redis author when introducing this new sds type says:

A new type, SDS_TYPE_5 is introduced having a one byte header with just the string length, without information about the available additional length at the end of the string.

Combined with how the other interfaces handle SDS_TYPE_5, we can conclude that this type of sds is mainly meant for data shorter than 32 bytes whose buffer will not be reallocated. On this point the Redis author also gives some advice:

Don’t use TYPE 5 if strings are going to be reallocated, since it sucks not having a free space left field.


Functions for constructing and releasing strings

static inline int sdsHdrSize(char type);
static inline char sdsReqType(size_t string_size);

These two static functions are defined in src/sds.c. The first returns the size of the header structure for a given header type; the second returns the appropriate header type for a given string_size.
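A hedged sketch of what these two helpers compute is shown below. The demo_ names and the enum are hypothetical; the thresholds follow the 1 << 5, 1 << 8, ... limits described earlier, and the header sizes fall out of the packed struct definitions (GCC or Clang assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type codes, assumed to follow the order used in sds.h. */
enum { DEMO_TYPE_5, DEMO_TYPE_8, DEMO_TYPE_16, DEMO_TYPE_32, DEMO_TYPE_64 };

struct __attribute__ ((__packed__)) demo_hdr5  { unsigned char flags; char buf[]; };
struct __attribute__ ((__packed__)) demo_hdr8  { uint8_t  len, alloc; unsigned char flags; char buf[]; };
struct __attribute__ ((__packed__)) demo_hdr16 { uint16_t len, alloc; unsigned char flags; char buf[]; };
struct __attribute__ ((__packed__)) demo_hdr32 { uint32_t len, alloc; unsigned char flags; char buf[]; };
struct __attribute__ ((__packed__)) demo_hdr64 { uint64_t len, alloc; unsigned char flags; char buf[]; };

/* Size in bytes of the header that precedes buf for a given type. */
static int demo_hdr_size(int type) {
    assert(type >= DEMO_TYPE_5 && type <= DEMO_TYPE_64);
    switch (type) {
        case DEMO_TYPE_5:  return (int)sizeof(struct demo_hdr5);
        case DEMO_TYPE_8:  return (int)sizeof(struct demo_hdr8);
        case DEMO_TYPE_16: return (int)sizeof(struct demo_hdr16);
        case DEMO_TYPE_32: return (int)sizeof(struct demo_hdr32);
        case DEMO_TYPE_64: return (int)sizeof(struct demo_hdr64);
    }
    return 0;
}

/* Smallest header type whose length field can hold string_size. */
static int demo_req_type(size_t string_size) {
    if (string_size < (1u << 5))  return DEMO_TYPE_5;
    if (string_size < (1u << 8))  return DEMO_TYPE_8;
    if (string_size < (1u << 16)) return DEMO_TYPE_16;
    if ((uint64_t)string_size < (UINT64_C(1) << 32)) return DEMO_TYPE_32;
    return DEMO_TYPE_64;
}
```

Packing matters here: without it the compiler could pad the header, and flags would no longer be guaranteed to sit in the byte immediately before buf.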

sds sdsnewlen(const void *init, size_t initlen);
sds sdsempty(void);
sds sdsnew(const char* init);
sds sdsdup(const sds s);
void sdsfree(sds s);

Among these, sdsnewlen is the basis of the whole series. Given a pointer init to the initial content and an initial length initlen, it builds an sds. The function uses sdsReqType to select a header type (8-bit, 16-bit, 32-bit or 64-bit) suited to initlen, then calls s_malloc to allocate headerSize + initlen + 1 bytes. The extra byte is needed because an sds in Redis always ends with a 0 terminator. At the same time, sds is binary safe, so a 0 byte may also appear in the middle of the data, which is why the header structure needs the sdshdr.len field. The function then initializes the type, len and alloc fields of the header, copies the data pointed to by init into the sds buffer with memcpy, and appends 0 as the null terminator. The following three interfaces do their work by calling sdsnewlen:

  • sdsempty creates an empty sds.
  • sdsnew creates an sds from a null-terminated C-style string. Note that this interface is not binary safe, because it uses strlen internally to compute the length of the incoming data.
  • sdsdup creates and returns a new sds that is a copy of a given sds.

The last interface, sdsfree, releases a given sds by calling s_free. Note that the whole allocation is released: the header in front of the sds pointer as well as the buffer that follows the header.
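The difference between the length-based constructor and the strlen-based one can be illustrated with a simplified sketch. It uses one fixed 8-bit header instead of selecting a header type, and plain malloc and free instead of s_malloc and s_free; all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct __attribute__ ((__packed__)) demo_hdr {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Build a string from initlen raw bytes; embedded zero bytes are preserved. */
static char *demo_newlen(const void *init, size_t initlen) {
    struct demo_hdr *sh;
    assert(initlen < 256);                      /* single 8-bit header only */
    sh = malloc(sizeof(*sh) + initlen + 1);     /* header + data + terminator */
    if (sh == NULL) return NULL;
    sh->len = (uint8_t)initlen;
    sh->alloc = (uint8_t)initlen;
    sh->flags = 1;                              /* assumed value of SDS_TYPE_8 */
    if (init == NULL)
        memset(sh->buf, 0, initlen);
    else
        memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return sh->buf;
}

/* Build a string from a C string; the length comes from strlen, so this is not binary safe. */
static char *demo_new(const char *init) {
    return demo_newlen(init, init == NULL ? 0 : strlen(init));
}

/* Length recorded in the header, independent of any zero bytes in the data. */
static size_t demo_len(char *s) {
    return ((struct demo_hdr *)(s - sizeof(struct demo_hdr)))->len;
}

/* Free from the start of the header, not from the string pointer. */
static void demo_free(char *s) {
    if (s != NULL) free(s - sizeof(struct demo_hdr));
}
```

With the five bytes 'a', 'b', 0, 'c', 'd' as input, demo_newlen keeps all five and records the length in the header, whereas demo_new stops at the first zero byte.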


Functions for adjusting string length information

void sdsupdatelen(sds s);

Updates the length of an sds by calling strlen on its internal data. This interface is useful when the sds buffer has been rewritten manually. In general it is used to shorten the sdshdr.len field of the sds, and it does not modify the data in sdshdr.buf.

void sdsclear(sds s);

Empties the content of an sds. Like sdsupdatelen, this function does not release the existing buffer. It only resets the sdshdr.len field to zero and places a terminator at the first byte; the allocated space is still there and can be reused.
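The effect of these two calls on the header can be sketched as follows, again with a single 8-bit header and hypothetical demo_ names. The sketch assumes that clearing also writes a terminator at the first byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct __attribute__ ((__packed__)) demo_hdr {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Recover the header from the string pointer. */
static struct demo_hdr *demo_hdr_of(char *s) {
    return (struct demo_hdr *)(s - sizeof(struct demo_hdr));
}

/* Create a string from a C string with no spare capacity. */
static char *demo_new(const char *init) {
    size_t len = strlen(init);
    struct demo_hdr *sh;
    assert(len < 256);
    sh = malloc(sizeof(*sh) + len + 1);
    if (sh == NULL) return NULL;
    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = 1;                     /* assumed value of SDS_TYPE_8 */
    memcpy(sh->buf, init, len + 1);
    return sh->buf;
}

/* In the spirit of sdsupdatelen: trust the first terminator in the buffer. */
static void demo_updatelen(char *s) {
    demo_hdr_of(s)->len = (uint8_t)strlen(s);
}

/* In the spirit of sdsclear: reset the length, keep the allocation. */
static void demo_clear(char *s) {
    demo_hdr_of(s)->len = 0;
    s[0] = '\0';
}
```

After writing a zero byte into the middle of the buffer by hand, the recorded length is stale until demo_updatelen is called; in both functions alloc stays unchanged, so the space remains available for later writes.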

sds sdsMakeRoomFor(sds s, size_t addlen);

sdsMakeRoomFor enlarges the free space of a given sds, so that after the call the user can be sure of writing a further addlen bytes into the buffer. The operation does not change the size of the used buffer; that is, it does not change the result of sdslen.

Several details are as follows:

  1. If the free space of the current sds, as reported by sdsavail, is greater than or equal to addlen, the function does nothing.
  2. To reduce the system overhead caused by repeated allocation, sdsMakeRoomFor always allocates more than requested, reserving up to 1 MB of extra buffer:

newlen = (len+addlen);
if (newlen < SDS_MAX_PREALLOC)
    newlen *= 2;
else
    newlen += SDS_MAX_PREALLOC;

  3. If the operation does not change the header type (a change would be, for example, an upgrade from an 8-bit header to a 16-bit one), s_realloc is called to grow the buffer.
  4. The expansion never uses a header of type SDS_TYPE_5, because this header type cannot record the size of the free space. With an SDS_TYPE_5 sds, every append operation would have to call sdsMakeRoomFor and reallocate the buffer. So once sdsMakeRoomFor has been called on an SDS_TYPE_5 sds, it is upgraded to at least SDS_TYPE_8.
  5. If the header type changes, s_malloc is called to allocate a new sds, the original data is copied into it, and the new sds pointer is returned. This means the caller cannot assume the original sds pointer is still valid after the call, and must use the return value for all subsequent operations, for example s = sdsMakeRoomFor(s, newlen);
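The sizing rule in the list above can be written as a pure function, which makes the doubling and the 1 MB cap easy to check in isolation. This is a sketch under stated assumptions: demo_grow and DEMO_MAX_PREALLOC are hypothetical names, and the constant is assumed to equal SDS_MAX_PREALLOC (1 MB).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror SDS_MAX_PREALLOC (1 MB). */
#define DEMO_MAX_PREALLOC (1024 * 1024)

/* Return the buffer size to request so that addlen more bytes fit
 * after len used bytes, given avail bytes already free. */
static size_t demo_grow(size_t len, size_t avail, size_t addlen) {
    size_t newlen;
    if (avail >= addlen) return len + avail;  /* enough room: size unchanged */
    newlen = len + addlen;
    assert(newlen > len);                     /* guard against size_t overflow */
    if (newlen < DEMO_MAX_PREALLOC)
        newlen *= 2;                          /* small strings: double */
    else
        newlen += DEMO_MAX_PREALLOC;          /* large strings: add 1 MB */
    return newlen;
}
```

For example, a string with 10 used bytes and no free space that needs 6 more bytes ends up with a 32-byte buffer, while a 1 MB string that needs one more byte gets 1 MB of extra room rather than being doubled.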

For sdsMakeRoomFor, the Redis author's advice is as follows:

Don’t call sdsMakeRoomFor() when obviously not needed.

That is to say, do not call this interface when we can be sure that the sds already has enough free space.

sds sdsRemoveFreeSpace(sds s);

The purpose of sdsRemoveFreeSpace is to shrink the sds buffer by releasing the extra free space, so that it holds exactly sdslen bytes of data.

The details are as follows:

  1. If the shrink changes the header type, s_malloc is called to allocate a new sds; the data is copied into it and the new sds is returned.
  2. If the shrink does not change the header type, s_realloc is called to adjust the buffer size and release the free space.

size_t sdsAllocSize(sds s);

sdsAllocSize returns the total size of the memory allocated for the specified sds.

It includes:

  1. The size of the header in front of the sds pointer
  2. The size of the used data in the buffer
  3. The size of the free space
  4. The size of the 0 terminator

This differs from sdsalloc, which returns only the size of the buffer allocated for sdshdr.buf.

void* sdsAllocPtr(sds s);

sdsAllocPtr returns the pointer to the memory that was actually allocated for an sds, that is, the pointer to its header.

void sdsIncrLen(sds s, ssize_t incr);

This can be understood as increasing the sdshdr.len of an sds by incr, which reduces the free space reported by sdsavail by the same amount. The interface only adjusts the length information of the sds and does not change its contents. The basic usage scenario is:

  1. Call sdsMakeRoomFor to enlarge the sds.
  2. Write data into the sds buffer.
  3. Call sdsIncrLen to adjust the length reported by sdslen after the data has been written.

oldlen = sdslen(s);
s = sdsMakeRoomFor(s, BUFFER_SIZE);
nread = read(fd, s+oldlen, BUFFER_SIZE);
/* ... check for nread <= 0 and handle it ... */
sdsIncrLen(s, nread);

This interface is similar to sdsinclen, but it is intended for users to call and adds an internal length check. We must therefore make sure, for example by calling sdsMakeRoomFor beforehand, that incr does not exceed sdsavail; otherwise an assertion is triggered.

