Memcached for multi structured data management


1. Introduction to memcached

  • Memcached is a high-performance distributed memory object caching system, used by dynamic web applications to reduce database load.
  • Basic function: it caches data and objects in memory so that dynamic, database-driven websites respond faster, which reduces the number of database reads and the associated disk overhead.
  • Distributed cache: it can be accessed by multiple users on different hosts at the same time, which removes the limits of a single-machine application.
  • Uses its own page-block (slab) allocator.
  • Uses a hash table (HashMap) to look up items.
  • Provides no redundancy (such as replicating hash table entries). When a server S stops running or crashes, all key-value pairs stored on S are lost.

2. Memcached application rules

  • Frequently accessed tables: cache tables such as user and user_details.
  • Lifetime: set a lifetime (expiration time) for each item stored in memcached.
  • Active user information: preload it into memcached.
  • Memcached service deployment: start the service on multiple machines.
  • Monitoring the memcached service: write a corresponding monitoring script.
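
The lifetime and preloading rules above can be sketched as follows. This is a minimal illustration, not a real memcached client: `TTLCache` is a hypothetical in-process stand-in, `load_user_from_db` is a placeholder for a query against the user table, and the 300-second default lifetime is an assumed value.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for a memcached client."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # key -> (value, expiry deadline or None)

    def set(self, key, value, expire=0):
        # expire=0 means "never expires", mirroring memcached's convention.
        deadline = self._clock() + expire if expire else None
        self._store[key] = (value, deadline)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and self._clock() >= deadline:
            del self._store[key]  # lifetime elapsed: treat as a miss
            return None
        return value


def load_user_from_db(user_id):
    # Hypothetical placeholder for a real query against the `user` table.
    return {"id": user_id, "name": f"user{user_id}"}


def get_user(cache, user_id, lifetime=300):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:
        user = load_user_from_db(user_id)
        cache.set(key, user, expire=lifetime)
    return user


def preload_active_users(cache, active_ids, lifetime=300):
    """Warm the cache with users that are known to be active."""
    for user_id in active_ids:
        cache.set(f"user:{user_id}", load_user_from_db(user_id), expire=lifetime)
```

With a real deployment, the same two functions would call a memcached client library instead of `TTLCache`; the read path and the warm-up step stay the same.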

3. Memcached operation principle

Memcached is called a distributed cache server, but note the following.

Server side: there is no distributed functionality.

Memcached instances do not communicate with each other or share information.

How data is distributed depends entirely on the client implementation.

Memcached uses libevent as its underlying network processing component.

3.1 libevent

Libevent tutorial link: tea666/article/details/92637297

Libevent GitHub link:

Libevent is an asynchronous event-handling library that wraps the event mechanisms of epoll on Linux and kqueue on BSD operating systems behind a unified interface.

All registered I/O and signal events are stored in doubly linked lists, and a min-heap (min_heap) is used to manage timeout events.

The main loop continuously checks the registered events. When an event occurs, it is placed on the ready list and its callback function is invoked to carry out the business logic.

The libevent interface handles three kinds of events in a unified way:

  • Events on a file descriptor
  • Timer events
  • Signals

Libevent takes the place of the hand-written event loop in an event-driven network server: each callback function is executed when its event occurs. The user only needs to call the event_dispatch() function and can then add or remove events dynamically.
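
The register-then-dispatch pattern can be illustrated with a short sketch. It does not use libevent itself: Python's standard selectors module stands in for it, because it also picks epoll or kqueue depending on the platform. The names `run_once` and `on_readable` are hypothetical, and a real server would call `run_once` in a loop rather than once.

```python
import selectors
import socket


def run_once(sel, timeout=1.0):
    """One loop iteration: wait for ready events, then run their callbacks."""
    for key, mask in sel.select(timeout):
        callback = key.data  # the callback stored when the event was registered
        callback(key.fileobj, mask)


sel = selectors.DefaultSelector()  # chooses epoll, kqueue, etc. per platform
reader, writer = socket.socketpair()
received = []


def on_readable(sock, mask):
    # Business logic goes here; this sketch only records what arrived.
    received.append(sock.recv(1024))


# Register interest in "readable" events on the socket, with a callback.
sel.register(reader, selectors.EVENT_READ, on_readable)

writer.sendall(b"get user:1\r\n")
run_once(sel)  # the callback fires because the socket became readable

sel.unregister(reader)
reader.close()
writer.close()
sel.close()
```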

4. Memcached memory allocation

Early versions of memcached allocated memory by calling malloc and free for every record. This approach had two problems:

  1. It easily produces memory fragmentation;
  2. It increases the burden on the operating system's memory manager.

Improvement: memcached now uses the Slab Allocator mechanism by default to allocate and manage memory.

Basic principle of the Slab Allocator:

Chunk: the allocated memory is divided into blocks of specific, predetermined lengths.

Slab class: chunks of the same size are organized into groups (sets of chunks).

Allocated memory is never released; it is reused instead.

The Slab Allocator solves the original memory fragmentation problem, but it also creates a new one: because memory is allocated in fixed lengths, the allocated memory may not be fully used. Put bluntly, bytes are wasted: caching 100 bytes of data in a 128-byte chunk wastes the remaining 28 bytes.
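
The chunk selection and the resulting waste can be worked through in a few lines. This is a sketch under assumptions: the list `CHUNK_SIZES` is hypothetical and chosen only so that it contains the 128-byte chunk from the example above; real memcached derives its chunk sizes from a configurable growth factor.

```python
import bisect

# Hypothetical chunk sizes, one per slab class (assumed for illustration).
CHUNK_SIZES = [64, 128, 256, 512, 1024]


def pick_chunk(item_size):
    """Return (chunk_size, wasted_bytes) for the smallest chunk that fits."""
    index = bisect.bisect_left(CHUNK_SIZES, item_size)
    if index == len(CHUNK_SIZES):
        raise ValueError("item is larger than the largest chunk")
    chunk = CHUNK_SIZES[index]
    return chunk, chunk - item_size


print(pick_chunk(100))  # (128, 28): 100 bytes stored in a 128-byte chunk
```

An item of exactly 128 bytes wastes nothing, while an item of 129 bytes falls into the 256-byte class and wastes 127 bytes, which is why the spacing of chunk sizes matters.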

5. Memcached distributed storage processing

Memcached achieves distribution by storing different keys on different servers. As the number of servers grows, the keys are spread across them. Even if one memcached server fails, the other cache nodes are not affected and the system can continue to run.

The standard distribution method of memcached assigns keys according to the remainder after dividing by the number of servers:

1) Compute the integer hash value of the key;

2) Divide it by the number of servers and select the server according to the remainder;

3) If the connection to the selected server fails, rehash: append the number of connection attempts to the key, compute the hash value again, and retry the connection.

Advantages: the method is simple, and the data is generally well dispersed.

Disadvantages: reorganizing the cache is costly when servers are added or removed, because most keys are then mapped to a different server.
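
The cost of adding a server under the remainder method can be measured with a small sketch. The hash function (CRC32 from zlib), the server names, and the key format are assumptions made for illustration; the rehash-on-failure step is omitted.

```python
import zlib


def pick_server(key, servers):
    """Select a server by hash(key) mod number of servers."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


def moved_fraction(keys, before, after):
    """Fraction of keys whose server changes between two server lists."""
    moved = sum(pick_server(k, before) != pick_server(k, after) for k in keys)
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10000)]
three = ["cache1", "cache2", "cache3"]
four = three + ["cache4"]

fraction = moved_fraction(keys, three, four)
print(f"{fraction:.0%} of keys move when a fourth server is added")
```

For a uniform hash, a key keeps its server only when its hash has the same remainder modulo 3 and modulo 4, which holds for 3 of every 12 hash values. Roughly three quarters of the keys therefore move, and each of them becomes a cache miss.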

Improved distribution method: Consistent Hashing

1) Compute the hash value of each server node and place it on a circle running from 0 to 2^32;

2) In the same way, compute the hash value of the key under which the data is stored and map it onto the circle;

3) Starting from the position the data is mapped to, search clockwise and save the data to the first server found;

4) If no server is found before passing 2^32, save the data to the first server on the circle.
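
The four steps above can be sketched as a small ring lookup. The hash function (the first four bytes of an MD5 digest) and the names `ring_position` and `HashRing` are assumptions for illustration, not the implementation of any particular client library.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a string to a point on the circle from 0 to 2^32 - 1."""
    digest = hashlib.md5(name.encode()).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    def __init__(self, servers):
        # Step 1: place every server on the circle, sorted by position.
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [p for p, _ in self._points]

    def lookup(self, key):
        # Steps 2 and 3: map the key, then take the first server clockwise.
        i = bisect.bisect_left(self._positions, ring_position(key))
        if i == len(self._points):
            i = 0  # Step 4: passed 2^32, wrap around to the first server
        return self._points[i][1]


ring = HashRing(["cache1", "cache2", "cache3"])
print(ring.lookup("user:42"))
```

Removing a server from this ring only reassigns the keys that server held; every other key still finds the same first server clockwise.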

With an ordinary hash function, the servers may be mapped to unevenly spaced positions on the circle, so some servers receive far more keys than others.

Virtual nodes:

Each physical node (server) is assigned 100 to 200 points on the ring. This evens out the distribution and minimizes the amount of cache redistribution when servers are added or removed.
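
A sketch of virtual nodes, assuming 150 points per server (a value inside the range above), an MD5-based hash, and hypothetical server names. Each virtual node is derived by hashing the server name with a numeric suffix, which is one common convention rather than a fixed rule.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(name):
    """Map a string to a point on the circle from 0 to 2^32 - 1."""
    digest = hashlib.md5(name.encode()).digest()
    return int.from_bytes(digest[:4], "big")


def build_ring(servers, replicas):
    """Place `replicas` virtual nodes per server on the ring."""
    points = sorted((ring_position(f"{s}#{i}"), s)
                    for s in servers for i in range(replicas))
    return [p for p, _ in points], [s for _, s in points]


def lookup(ring, key):
    positions, owners = ring
    # The modulo wraps a key past the last point back to the first one.
    i = bisect.bisect_left(positions, ring_position(key)) % len(positions)
    return owners[i]


def load_spread(servers, replicas, keys):
    """Ratio of the busiest to the least busy server (1.0 is perfectly even)."""
    ring = build_ring(servers, replicas)
    counts = Counter(lookup(ring, k) for k in keys)
    return max(counts.values()) / max(1, min(counts.get(s, 0) for s in servers))


keys = [f"user:{i}" for i in range(20000)]
servers = ["cache1", "cache2", "cache3", "cache4"]

print("spread with 1 point per server:   ", load_spread(servers, 1, keys))
print("spread with 150 points per server:", load_spread(servers, 150, keys))

# Adding a fifth server: only keys claimed by its new points change owner.
before = build_ring(servers, 150)
after = build_ring(servers + ["cache5"], 150)
moved = sum(lookup(before, k) != lookup(after, k) for k in keys) / len(keys)
print(f"{moved:.0%} of keys move when cache5 is added")
```

With one point per server the spread is typically far from 1.0, while with 150 points it is usually close to it. The share of keys that move when a fifth server joins should be near one fifth, compared with most of the keys under the remainder method.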

6. Memcached architecture example

If there are about 200 memcached servers with 3 GB of capacity each, the system as a whole provides a huge in-memory store of nearly 600 GB.

Reference material:
Slides (PPT) by Mr. Pan Peng