700 Million Requests per Second: How Does Alibaba's New-Generation Database Support It?

Time: 2020-10-21


Reading guide: Lindorm is an important part of big data storage and processing in Feitian, Alibaba's cloud operating system. Lindorm is a distributed NoSQL database developed on the basis of HBase and oriented to big data. It combines large scale, high throughput, low latency, elasticity, and real-time hybrid capabilities, providing world-leading high-performance, cross-domain, multi-model hybrid storage and processing for massive data scenarios. Today, Lindorm fully serves the structured and semi-structured big data storage scenarios across the Alibaba economy.

Note: Lindorm is the name of Alibaba's internal HBase branch; the version sold on Alibaba Cloud is called the HBase Enhanced Edition. "HBase Enhanced Edition" and "Lindorm" in this article refer to the same product.

Since 2019, Lindorm has served dozens of business units (BUs), including Taobao, Tmall, Ant Financial, Cainiao, Alimama, Youku, Amap, Damai, and more. In this year's Double 11, Lindorm's peak load reached 750 million requests per second, daily throughput was 22.9 trillion requests, average response time was under 3 ms, and total data stored reached hundreds of petabytes. Behind these figures are many years of hard work by the HBase & Lindorm team. Born out of HBase, Lindorm is the brand-new product the team built by comprehensively restructuring and upgrading the engine, after years of carrying hundreds of petabytes of data, requests on the order of hundreds of millions per second, and thousands of businesses, facing both scale-driven cost pressure and HBase's own defects. Compared with HBase, Lindorm has made a great leap in performance, functionality, and usability. This article introduces Lindorm's core capabilities and business results from the angles of functionality, availability, performance and cost, and service ecosystem, and finally shares some of our ongoing projects.

Extreme optimization, superior performance

Compared with HBase, Lindorm is deeply optimized in RPC, memory management, caching, log writing, and other areas, and introduces many new technologies, greatly improving read/write performance. Under the same hardware, its throughput reaches more than 5 times that of HBase, while latency jitter is only about one tenth of HBase's. These performance numbers were not produced under laboratory conditions; they were obtained with the open-source benchmark tool YCSB without changing any parameters. We publish the testing tools and scenarios in the Alibaba Cloud help documentation, and anyone can reproduce the same results by following the guide.


Behind such excellent performance is the "black technology" Lindorm has accumulated over many years. Below, we briefly introduce some of the "black technology" used in the Lindorm kernel.

Trie Index

LDFile (similar to HFile in HBase) is a read-only B+ tree structure, in which the file index is the most important data structure. Index blocks are given high priority in the block cache and should stay resident in memory as much as possible. If we can shrink the file index, we either free up precious block-cache memory, or, keeping the index footprint unchanged, increase index density and shrink the data-block size, improving performance. HBase's index blocks store full rowkeys, yet in a sorted file many rowkeys share a common prefix.

A trie (prefix tree) stores a common prefix only once and avoids wasting space on duplicates. In a traditional prefix tree, however, the pointers from one node to the next take up too much space, so overall the trade-off is not worth it. The succinct trie solves exactly this problem. SuRF, the SIGMOD 2018 best paper, proposed replacing the Bloom filter with a succinct trie that additionally supports range filtering. Inspired by that paper, we implemented our file block index with a succinct trie.
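
To make the prefix-sharing idea concrete, here is a minimal Java sketch of a trie-based block index: shared rowkey prefixes are stored once, and a seek walks the trie to find the block that may contain a key. All names are hypothetical; Lindorm's real index is a succinct trie in the spirit of SuRF, which encodes the tree shape in bit vectors instead of per-node pointers.

    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Hypothetical sketch of a trie-based block index: shared prefixes are
    // stored once; each node that terminates a block's first key records the
    // block's file offset.
    final class TrieIndexSketch {
        private static final class Node {
            final TreeMap<Character, Node> children = new TreeMap<>();
            long blockOffset = -1; // >= 0 on nodes that terminate an index key
        }

        private final Node root = new Node();

        /** Index entry: the first rowkey of a data block and that block's offset. */
        void put(String firstKeyOfBlock, long blockOffset) {
            Node n = root;
            for (char c : firstKeyOfBlock.toCharArray()) {
                n = n.children.computeIfAbsent(c, k -> new Node());
            }
            n.blockOffset = blockOffset;
        }

        /** Returns the offset of the last block whose first key is <= rowKey, or -1. */
        long seek(String rowKey) {
            Node n = root;
            long best = -1;
            for (char c : rowKey.toCharArray()) {
                // Blocks under a strictly smaller sibling start before rowKey;
                // remember the closest one as a fallback.
                NavigableMap<Character, Node> smaller = n.children.headMap(c, false);
                if (!smaller.isEmpty()) {
                    best = rightmostOffset(smaller.lastEntry().getValue());
                }
                Node next = n.children.get(c);
                if (next == null) return best;
                n = next;
                if (n.blockOffset >= 0) best = n.blockOffset; // first key is a prefix of rowKey
            }
            return best;
        }

        private static long rightmostOffset(Node n) {
            while (!n.children.isEmpty()) n = n.children.lastEntry().getValue();
            return n.blockOffset;
        }
    }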


We have used the Trie Index implementation in many online businesses. The results show that the Trie Index greatly reduces index size in every scenario, compressing the index space by up to 12x! The saved space lets more index and data blocks fit in the memory cache, which greatly improves request performance.


ZGC: an average 5 ms pause on a 100 GB heap

ZGC (powered by the Dragonwell JDK) is one of the representatives of the next generation of pauseless GC algorithms. Its core idea is that the mutator uses a read barrier to detect pointer changes, so most of the mark and relocate work can run in concurrent phases. For such an experimental technology, the Lindorm team and the AJDK team worked closely together on a great deal of improvement and hardening. The main work includes:

  1. Lindorm's memory self-management, which reduces object counts and the allocation rate by orders of magnitude (for example CCSMap, which the Alibaba HBase team contributed to the community); see the sketch after this list.
  2. AJDK ZGC page-cache optimization (locking, page-cache policy).
  3. AJDK ZGC trigger-timing optimization, eliminating ZGC concurrent-mode failures.

AJDK ZGC has now run stably on Lindorm for two months and stood the test of Double 11. JVM pauses are stable at around 5 ms, with a maximum below 8 ms. ZGC greatly improved the online clusters' RT and jitter metrics: average RT improved by 15%-20%, and p999 RT was cut in half. In this year's Double 11, on the Ant risk-control cluster, ZGC brought the p999 latency down from 12 ms to 5 ms.
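
As an illustration of the first point, here is a minimal Java sketch (hypothetical names, not the actual CCSMap code) of the memory self-management idea: cell bytes are copied into large reusable chunks, so the GC sees a handful of long-lived arrays instead of millions of short-lived objects, and the allocation rate drops dramatically.

    import java.util.ArrayDeque;

    // Chunks are recycled through a pool instead of becoming garbage.
    final class ChunkPool {
        private static final int CHUNK_SIZE = 4 * 1024 * 1024; // 4 MB chunks
        private final ArrayDeque<byte[]> free = new ArrayDeque<>();

        synchronized byte[] borrow() {
            byte[] c = free.poll();
            return c != null ? c : new byte[CHUNK_SIZE];
        }

        synchronized void release(byte[] chunk) {
            free.push(chunk); // reused on the next borrow
        }
    }

    final class ChunkWriter {
        private final ChunkPool pool;
        private byte[] current;
        private int offset;

        ChunkWriter(ChunkPool pool) {
            this.pool = pool;
            this.current = pool.borrow();
        }

        /** Copies a cell into chunk memory; returns its start offset for later lookup. */
        int append(byte[] cell) {
            if (cell.length > current.length) {
                throw new IllegalArgumentException("cell larger than a chunk");
            }
            if (offset + cell.length > current.length) {
                current = pool.borrow(); // the full chunk stays referenced by the index
                offset = 0;
            }
            System.arraycopy(cell, 0, current, offset, cell.length);
            int start = offset;
            offset += cell.length;
            return start;
        }
    }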

(Figure: JVM pause-time monitoring under ZGC.)
Note: the unit in the figure is μs; the average GC pause is 5 ms.

LindormBlockingQueue

(Figure: the path of an RPC request in HBase, from network read to handler dispatch.)

The figure above shows how HBase reads RPC requests off the network and dispatches them to handlers. HBase's RPC readers read requests from the socket and put them into a BlockingQueue; handlers subscribe to this queue and execute the requests. For this BlockingQueue, HBase uses the JDK's native LinkedBlockingQueue.

LinkedBlockingQueue uses locks and conditions to guarantee thread safety and coordination between threads. Classic and easy to understand as it is, this queue becomes a serious performance bottleneck as throughput rises. Lindorm therefore designed LindormBlockingQueue, which keeps elements in a slot array, maintains head and tail pointers, reads and writes the queue via CAS operations, and eliminates the critical section entirely. It also uses cache-line padding and dirty-read caching for acceleration, and supports customizable wait strategies (spin/yield/block) to avoid frequently parking threads when the queue is empty or full. LindormBlockingQueue performs outstandingly, more than four times better than the original LinkedBlockingQueue.
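
The following Java sketch shows the core mechanism just described: a fixed slot array whose head and tail sequences advance by CAS, with no lock-protected critical section. Per-slot sequence numbers (in the style of well-known lock-free MPMC ring buffers) keep producers and consumers from colliding across laps; the real LindormBlockingQueue additionally applies cache-line padding and pluggable spin/yield/block wait strategies. Names here are hypothetical.

    import java.util.concurrent.atomic.AtomicLong;

    final class CasRingQueue<E> {
        private static final class Slot<E> {
            final AtomicLong seq;
            volatile E item;
            Slot(long s) { seq = new AtomicLong(s); }
        }

        private final Slot<E>[] slots;
        private final int mask;
        private final AtomicLong head = new AtomicLong(); // next slot to take
        private final AtomicLong tail = new AtomicLong(); // next slot to fill

        @SuppressWarnings("unchecked")
        CasRingQueue(int capacityPowerOfTwo) {
            slots = (Slot<E>[]) new Slot[capacityPowerOfTwo];
            for (int i = 0; i < capacityPowerOfTwo; i++) slots[i] = new Slot<>(i);
            mask = capacityPowerOfTwo - 1;
        }

        void put(E e) {
            while (true) {
                long t = tail.get();
                Slot<E> s = slots[(int) (t & mask)];
                // seq == t means the slot is empty and it is this lap's turn.
                if (s.seq.get() == t && tail.compareAndSet(t, t + 1)) {
                    s.item = e;
                    s.seq.set(t + 1);      // publish: readable at sequence t + 1
                    return;
                }
                Thread.yield();            // wait strategy: spin/yield, no parking
            }
        }

        E take() {
            while (true) {
                long h = head.get();
                Slot<E> s = slots[(int) (h & mask)];
                // seq == h + 1 means a producer has published this slot.
                if (s.seq.get() == h + 1 && head.compareAndSet(h, h + 1)) {
                    E e = s.item;
                    s.item = null;
                    s.seq.set(h + mask + 1); // recycle the slot for the next lap
                    return e;
                }
                Thread.yield();
            }
        }
    }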


VersionBasedSynchronizer


LDLog is the log Lindorm uses to recover data after a system failure, guaranteeing atomicity and durability of writes. Every write must go to LDLog first; only after the log write succeeds can subsequent operations, such as writing to the MemStore, proceed. A handler in Lindorm must therefore wait for the WAL write to complete before being woken for the next step. Under high load, useless wake-ups cause large numbers of CPU context switches and degrade performance. To solve this, Lindorm developed the VersionBasedSynchronizer, which greatly reduces context switching.

The main idea of the VersionBasedSynchronizer is to make a handler's wait condition visible to the notifier, reducing the notifier's wake-up burden. Module-level tests show the VersionBasedSynchronizer is more than twice as efficient as the JDK's built-in ObjectMonitor and J.U.C (the java.util.concurrent package).
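
Here is a minimal sketch, with hypothetical names, of the version-based idea: handlers wait for a monotonically increasing "synced version" to reach their own WAL sequence number, spinning briefly on a volatile read before parking, so in the common case the notifier just bumps a counter instead of waking every waiting thread.

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.atomic.AtomicLong;
    import java.util.concurrent.locks.LockSupport;

    final class VersionSynchronizerSketch {
        private final AtomicLong syncedVersion = new AtomicLong();
        // Parked waiters; a plain concurrent queue keeps the sketch short.
        private final Queue<Thread> waiters = new ConcurrentLinkedQueue<>();

        /** Called by the WAL sync thread once everything up to v is durable. */
        void advanceTo(long v) {
            syncedVersion.set(v);
            for (Thread t : waiters) LockSupport.unpark(t); // waiters re-check the version
        }

        /** Called by a handler that must not proceed until version v is synced. */
        void awaitVersion(long v) {
            for (int spin = 0; spin < 1000; spin++) {       // cheap spin phase first
                if (syncedVersion.get() >= v) return;
            }
            Thread self = Thread.currentThread();
            waiters.add(self);
            while (syncedVersion.get() < v) {
                LockSupport.park(this);                      // fall back to parking
            }
            waiters.remove(self);
        }
    }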


Fully lock-free

The HBase kernel takes many locks on its critical paths, causing thread contention and performance degradation under high concurrency. The Lindorm kernel makes the critical paths lock-free, for example in the MVCC and WAL modules. Moreover, HBase produces all sorts of metrics at runtime, such as QPS, RT, and cache hit rate, and there are plenty of locks even in the "humble" operations that record them. Facing these problems, Lindorm borrowed the idea of tcmalloc and developed LindormThreadCacheCounter to solve the metrics performance problem.
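
The sketch below illustrates the thread-cached counter idea with hypothetical names (the JDK's LongAdder takes a similar striping approach): each thread bumps its own cell with no synchronization, and the rare metrics read pays the aggregation cost instead of the hot path.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    final class ThreadCachedCounter {
        // One cell per thread; the map is only touched on a thread's first use.
        // Cells of dead threads are leaked in this sketch for simplicity.
        private final Map<Thread, long[]> cells = new ConcurrentHashMap<>();
        private final ThreadLocal<long[]> local = ThreadLocal.withInitial(
                () -> cells.computeIfAbsent(Thread.currentThread(), t -> new long[1]));

        void increment() {
            local.get()[0]++;            // uncontended: no CAS, no lock
        }

        long sum() {                     // called rarely, e.g. per metrics scrape;
            long s = 0;                  // slightly stale values are acceptable
            for (long[] cell : cells.values()) s += cell[0];
            return s;
        }
    }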


Handler coroutines

In high-concurrency applications, serving one RPC request usually involves multiple submodules and several I/Os. These submodules cooperate with one another, and the system's context switches become quite frequent. Optimizing context switching is an unavoidable topic in high-concurrency systems, and the industry has many ideas and practices; coroutines and SEDA (staged event-driven architecture) were the two directions we focused on. Weighing engineering cost, maintainability, and code readability, Lindorm chose coroutines for its asynchronization work. We use the Wisp2 feature built into the Dragonwell JDK, provided by the Alibaba JVM team, to run HBase handlers as coroutines. Wisp2 works out of the box and effectively reduces the system's resource consumption, with a considerable optimization effect.
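
Part of the appeal is that coroutine scheduling needs no code changes. Ordinary blocking, thread-per-request code like the plain-JDK sketch below keeps its synchronous shape; when Wisp2 is enabled (via a Dragonwell JVM flag such as -XX:+UseWisp2, per the Dragonwell documentation), the JVM schedules such threads as coroutines, so a blocking socket read parks only the coroutine, not the carrier thread.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Thread-per-request echo server; nothing here is Wisp2-specific.
    public final class BlockingHandlerDemo {
        public static void main(String[] args) throws IOException {
            try (ServerSocket server = new ServerSocket(8080)) {
                while (true) {
                    Socket client = server.accept();
                    new Thread(() -> handle(client)).start(); // cheap under Wisp2
                }
            }
        }

        private static void handle(Socket client) {
            try (client;
                 InputStream in = client.getInputStream();
                 OutputStream out = client.getOutputStream()) {
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) { // blocks only this coroutine
                    out.write(buf, 0, n);          // echo the bytes back
                }
            } catch (IOException ignored) {
                // connection closed; try-with-resources handles cleanup
            }
        }
    }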

New encoding algorithm

From a performance perspective, HBase usually needs to load meta information into the block cache. If blocks are small, there is more meta information, the metadata may not fully fit in the cache, and performance drops. If blocks are large, sequentially scanning an encoded block becomes the bottleneck of random reads. For this situation, Lindorm developed Indexable Delta Encoding, with which lookups can also be done quickly inside a block via an index, greatly improving seek performance. The principle of Indexable Delta Encoding is shown in the figure below.

(Figure: the principle of Indexable Delta Encoding.)

With Indexable Delta Encoding, the random seek performance of HFiles doubles compared with before. Taking 64 KB blocks as an example, random seek performance is basically on par with unencoded blocks (other encoding algorithms all lose some performance). In random Get scenarios with a full cache hit, RT drops 50% compared with DIFF encoding.
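
Lindorm's on-disk format is not published in this article, so the Java sketch below only illustrates the general restart-point technique that makes a delta-encoded block seekable: every Nth key is stored in full and its position recorded, so a seek binary-searches the restart points and then decodes at most one short delta run.

    import java.util.ArrayList;
    import java.util.List;

    final class IndexableDeltaBlockSketch {
        static final int RESTART_INTERVAL = 16;

        private final List<String> fullKeys = new ArrayList<>();     // every 16th key, in full
        private final List<Integer> restartRows = new ArrayList<>(); // row index of each full key
        private final List<String> rows = new ArrayList<>();         // decoded view, for the sketch

        void append(String key) {
            if (rows.size() % RESTART_INTERVAL == 0) {
                fullKeys.add(key);            // restart point: no delta, directly seekable
                restartRows.add(rows.size());
            }
            // A real format would store only the suffix after the shared prefix.
            rows.add(key);
        }

        /** Seek: binary-search restart points, then scan at most one delta run. */
        int seek(String target) {
            int lo = 0, hi = fullKeys.size() - 1, start = 0;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                if (fullKeys.get(mid).compareTo(target) <= 0) {
                    start = restartRows.get(mid);
                    lo = mid + 1;
                } else {
                    hi = mid - 1;
                }
            }
            for (int i = start; i < rows.size() && i < start + RESTART_INTERVAL; i++) {
                if (rows.get(i).compareTo(target) >= 0) return i; // first row >= target
            }
            return -1; // target is beyond this block
        }
    }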

Others

Compared with the community version of HBase, Lindorm carries dozens of performance optimizations and refactorings and introduces many new technologies. Space being limited, we can only list some of the other core ones:

  • CCSMap
  • Quorum-based write protocol that automatically avoids failed nodes
  • Efficient group commit
  • High-performance fragmentation-free cache: Shared BucketCache
  • MemStore BloomFilter
  • Data structures designed for efficient reads and writes
  • GC-invisible memory management
  • Architectural separation of online serving and offline jobs
  • Deep JDK/OS optimization
  • FPGA offloading of compaction
  • User-space TCP acceleration
  • ……

Rich query models lower the development barrier

Native HBase only supports KV queries. Simple as they are, they cannot satisfy the complex needs of diverse businesses. In Lindorm we therefore developed multiple query models tailored to different business characteristics, lowering the development barrier through APIs and index designs closer to each scenario.

WideColumn model (native HBase API)

WideColumn is an access model and data structure fully consistent with HBase, making Lindorm 100% compatible with the HBase API. Users can access Lindorm through the WideColumn API of Lindorm's high-performance native client, or use the AliHBase Connector plug-in to access Lindorm directly with the HBase client and API (no code changes needed). Meanwhile, Lindorm adopts a thin-client design that pushes much of the data routing, batch dispatch, timeout, and retry logic down to the server, plus heavy optimization of the network transport layer, which saves a great deal of application-side CPU. As the table below shows, compared with HBase, application-side CPU efficiency improves by 60% and network bandwidth efficiency by 25% with Lindorm.

(Table: application-side resource consumption of the HBase client vs. the Lindorm client.)
Note: the client CPU column is the CPU consumed by the HBase/Lindorm client; lower is better.
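
Because the WideColumn model is fully HBase-compatible, standard HBase client code runs against Lindorm unchanged; only the connection endpoint differs. A minimal example (the endpoint and table names are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public final class WideColumnDemo {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // Placeholder endpoint; use your cluster's actual connection string.
            conf.set("hbase.zookeeper.quorum", "your-lindorm-endpoint");
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("demo_table"))) {
                // Write one cell, then read it back: the same code works on
                // HBase and on Lindorm.
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"),
                        Bytes.toBytes("value1"));
                table.put(put);

                Result result = table.get(new Get(Bytes.toBytes("row1")));
                System.out.println(Bytes.toString(
                        result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col1"))));
            }
        }
    }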

On top of the native HBase API we also provide exclusive support for high-performance secondary indexes. Writing through the native HBase API, index data is transparently written into the index table; at query time, a scan + filter that might otherwise sweep the whole table can instead query the index table first, which greatly improves query performance. For the high-performance native secondary index, see:
https://help.aliyun.com/docum…

TableService model (SQL, secondary indexes)

HBase supports lookups only by rowkey, which is inefficient for multi-field queries, so users end up maintaining several tables to serve different query patterns; this raises application complexity and cannot perfectly guarantee data consistency or write efficiency. Furthermore, HBase offers only a KV API with simple operations such as Put, Get, and Scan, and it has no data types: users must encode and store everything themselves. For developers used to SQL, the entry barrier is high and mistakes come easily.

To remove these pain points, lower the barrier to entry, and raise development efficiency, we added the TableService model to Lindorm. It offers rich data types and a structured query API, and supports SQL access and global secondary indexes, solving many technical challenges and greatly lowering the barrier for ordinary users. Through SQL and SQL-like APIs, users can work with Lindorm as easily as with a relational database. Here is a simple example of Lindorm SQL.

    -- Main table and index DDL
    create table shop_item_relation (
        shop_id varchar,
        item_id varchar,
        status varchar,
        constraint primary key(shop_id, item_id));
    create index idx1 on shop_item_relation (item_id) include (all);  -- index on the second primary-key column, all columns redundant
    create index idx2 on shop_item_relation (shop_id, status) include (all);  -- multi-column index, all columns redundant

    -- Writes update both indexes synchronously
    upsert into shop_item_relation values('shop1', 'item1', 'active');
    upsert into shop_item_relation values('shop1', 'item2', 'invalid');

    -- The suitable index is picked automatically from the where clause
    select * from shop_item_relation where item_id = 'item2';                         -- hits idx1
    select * from shop_item_relation where shop_id = 'shop1' and status = 'invalid';  -- hits idx2

Compared with the SQL of a relational database, Lindorm lacks multi-row transactions and complex analytics (such as JOIN and GROUP BY), which is exactly the difference in positioning between the two. Compared with the secondary index provided by the Phoenix component on HBase, Lindorm's secondary index far exceeds Phoenix in function, performance, and stability. The figure below is a simple performance comparison.

(Figure: performance comparison between the Lindorm secondary index and Phoenix.)

Note: this model is currently in invitational testing on the Alibaba Cloud HBase Enhanced Edition. Interested users can contact the Cloud HBase support DingTalk group or file a ticket on Alibaba Cloud.

FeedStream model

In modern Internet architectures, message queues carry a very important role and can greatly improve the performance and stability of core systems. Typical scenarios include system decoupling, peak shaving and rate limiting, log collection, eventual-consistency guarantees, and distribution/push. Common message queues include RabbitMQ, Kafka, and RocketMQ. Although these systems differ slightly in architecture, usage, and performance, their basic usage scenarios are quite close. Traditional message queues are not perfect, however; in message push, feed stream, and similar scenarios they have the following problems:

  • Storage: unsuitable for long-term data storage; retention is usually on the order of days
  • Deletion: deleting a specified data entry is not supported
  • Query: complex query and filter conditions are not supported
  • Consistency and performance are hard to guarantee at once: Kafka-like systems lean toward throughput and may lose data in some cases in exchange for performance, while queues with stronger transactional capability have limited throughput
  • Fast partition scaling: the number of partitions under a topic is usually fixed and cannot be scaled out quickly
  • Physical vs. logical queues: usually only a small number of physical queues are supported (each partition can be seen as a queue), so businesses must simulate logical queues on top of physical ones; an IM system, for example, keeps a logical message queue per user, which often costs a lot of extra development work

To meet these needs, Lindorm introduced a queue model, FeedStreamService, which handles message synchronization, device notification, auto-increment ID allocation, and similar problems at massive user scale.
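
The FeedStream API itself is not shown in this article, so the sketch below (all names hypothetical, with an in-memory sorted map standing in for the store) only illustrates the underlying modeling idea: a logical per-user queue laid out as rows keyed by (userId, sequence) in a sorted wide-column store gives cheap appends, cursor-based range reads, and per-entry deletion, which is exactly what traditional physical queues struggle with.

    import java.util.List;
    import java.util.NavigableMap;
    import java.util.concurrent.ConcurrentSkipListMap;
    import java.util.concurrent.atomic.AtomicLong;

    final class LogicalQueueSketch {
        // rowkey = userId + '#' + zero-padded sequence; sorted like a wide-column store.
        private final NavigableMap<String, String> store = new ConcurrentSkipListMap<>();
        // A real implementation would allocate sequences per user.
        private final AtomicLong seq = new AtomicLong();

        void push(String userId, String message) {
            store.put(rowKey(userId, seq.incrementAndGet()), message);
        }

        /** Pull everything after the consumer's cursor, like one feed page. */
        List<String> pullAfter(String userId, long cursor) {
            return List.copyOf(store.subMap(
                    rowKey(userId, cursor), false,
                    rowKey(userId, Long.MAX_VALUE), true).values());
        }

        void delete(String userId, long entrySeq) {   // per-entry deletion
            store.remove(rowKey(userId, entrySeq));
        }

        private static String rowKey(String userId, long s) {
            // Zero-pad so string order matches numeric order.
            return String.format("%s#%019d", userId, s);
        }
    }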


The FeedStream model took an important role in this year's Mobile Taobao message system, solving problems such as message push and order protection for Mobile Taobao. During this year's Double 11, Lindorm also served the building-stacking game and the big red-envelope pushes. In Mobile Taobao message push, the peak exceeded 1,000,000 messages per second, and users across the whole network could be reached within minutes.

Note: this model is currently in invitational testing on the Alibaba Cloud HBase Enhanced Edition. Interested users can contact the Cloud HBase support DingTalk group or file a ticket on Alibaba Cloud.

Full-text index model

Although Lindorm's TableService model provides data types and secondary indexes, it still cannot cover every complex query condition or full-text search need, and Solr and Elasticsearch (ES) are excellent full-text search engines. Lindorm + Solr/ES maximizes the strengths of both, enabling complex big-data storage and retrieval services. Lindorm has a built-in external-index synchronization component that automatically syncs data written to Lindorm into external index components such as Solr or ES. This model suits businesses that store large volumes of data while the queried fields are only a small part of the original data, with conditions combined arbitrarily, for example:

  • Common logistics scenarios, which store a large amount of tracking information and query it by arbitrary combinations of multiple fields
  • Traffic monitoring scenarios, which keep massive vehicle-passing records and retrieve records of interest by any combination of vehicle attributes
  • All kinds of website member and product search scenarios, which generally store large amounts of product/member information and need complex, arbitrary queries over a few fields to satisfy users' arbitrary searches


The full-text index model is already live on Alibaba Cloud with support for Solr/ES external indexes. At present, users still query Solr/ES directly and then read the matching rows back from Lindorm. Later we will wrap the external-index query in TableService syntax, so users interact only with Lindorm throughout and still get full-text indexing.
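
Until that wrapping lands, a query follows the two-phase pattern just described: ask Solr/ES for the matching row keys, then batch-read the rows from Lindorm through the HBase-compatible API. A sketch, where searchIndexForRowKeys is a placeholder for your Solr/ES client call and the table name is hypothetical:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public final class FullTextLookup {
        static List<Result> query(Connection conn, String condition) throws Exception {
            // Phase 1: the external index answers the complex condition.
            List<String> rowKeys = searchIndexForRowKeys(condition);
            // Phase 2: batch-read the full rows from Lindorm.
            try (Table table = conn.getTable(TableName.valueOf("main_table"))) {
                List<Get> gets = new ArrayList<>();
                for (String rk : rowKeys) gets.add(new Get(Bytes.toBytes(rk)));
                return Arrays.asList(table.get(gets));
            }
        }

        private static List<String> searchIndexForRowKeys(String condition) {
            throw new UnsupportedOperationException("plug in your Solr/ES client here");
        }
    }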

More models on the way

Beyond the models above, we will build more simple, easy-to-use models around business needs and pain points to make Lindorm easier to adopt. A time-series model and a graph model, among others, are already on the way; stay tuned.

High availability: zero intervention, second-level recovery

On its way from infancy to adulthood, Alibaba HBase stumbled many times and took some hard knocks, and we were lucky to grow up under our customers' trust. Over nine years of use inside Alibaba we accumulated a large set of high-availability technologies, all of which have been applied to the HBase Enhanced Edition.

MTTR optimization

HBase is an open-source implementation of Google's famous BigTable paper. At its core, data is persisted in the underlying distributed file system HDFS, and the system's high reliability rests on HDFS maintaining multiple replicas. HBase itself does not have to manage replicas or their consistency, which simplifies the overall engineering, but it also introduces a single point of service: after a node goes down, its data must be recovered by replaying logs to rebuild the in-memory state and by redistributing its regions to new nodes.

When the cluster is large, HBase can take 10-20 minutes to recover from a single point of failure, and recovery from a large-scale cluster outage can take hours! In the Lindorm kernel we made a series of MTTR (mean time to recovery) optimizations, including bringing regions online first, parallel log replay, and reducing small-file generation. Together they speed up failure recovery by more than 10x, essentially reaching the theoretical limit of the HBase design.

Tunable multi-replica consistency

In the original HBase architecture, each region is served by only one RegionServer. If that RegionServer goes down, the region must go through reassignment, per-region WAL splitting, and WAL replay before reads and writes resume. This recovery can take minutes, an unsolvable pain point for demanding businesses. Moreover, although HBase supports primary/standby synchronization, failover can only be triggered manually at cluster granularity, and the primary and standby are only eventually consistent, while some businesses accept nothing less than strong consistency, which HBase cannot deliver.

Lindorm implements a consistency protocol based on a shared log. Through a partitioned multi-replica mechanism it restores service automatically and quickly after a failure, and it fits the storage/compute separation architecture perfectly. The same system can provide strongly consistent semantics, or trade consistency away for better performance and availability, enabling capabilities such as multi-active deployment and high availability.

Under this architecture, Lindorm offers the following consistency levels, which users can choose freely according to their business:

(Table: consistency levels offered by Lindorm.)
Note: this feature is not yet open to the public on the Alibaba Cloud HBase Enhanced Edition.

High-availability client switching

Although HBase can be deployed as primary/standby clusters, there has been no efficient client switching scheme on the market. An HBase client can only reach a cluster at a fixed address; if the primary cluster fails, the user must stop the client, modify its configuration, and restart it to reach the standby, or else design a complex access layer in the business code for primary/standby access. Alibaba HBase reworked the HBase client: traffic switching happens inside the client, the switch command is delivered over a highly available channel, and the client closes the old connections, opens connections to the standby cluster, and retries the request.


If you need this feature, please refer to the high-availability help document: https://help.aliyun.com/docum…

Cloud native, lower cost

From the very start of the project, Lindorm was designed for the cloud, reusing cloud infrastructure wherever possible and optimizing for the cloud environment. For example, on the cloud we support not only cloud disks but also storing data in OSS, a low-cost object store, to reduce cost. We have also heavily optimized ECS deployment, adapted to small-memory instance types, and made deployment more flexible. Everything is cloud native, all to save customers money.

Extreme elasticity with ECS + cloud disks

On the cloud, Lindorm (the HBase Enhanced Edition) is currently deployed on ECS with cloud disks (some large customers use local disks). The ECS + cloud disk setup gives Lindorm extreme elasticity.


Inside the group, HBase started out on physical machines: before every business launch, machine counts and disk sizes had to be planned. Physical-machine deployment brings several problems that are hard to solve:

  • Business elasticity is hard to satisfy: when a sudden traffic peak or abnormal request surge hits, it is difficult to find new physical machines to scale out on short notice.
  • Storage and compute are bound together, so flexibility is poor: the CPU-to-disk ratio of a physical machine is fixed, but every business is different. On the same machine type, some businesses run out of compute while storage sits idle, and others the opposite. Especially after HBase introduced hybrid storage, setting the HDD/SSD ratio became very hard: demanding businesses fill the SSDs while the HDDs have headroom, and massive offline businesses cannot use the SSDs at all.
  • Operations pressure is high: with physical machines, operations must constantly watch for machines out of warranty and for hardware failures such as bad disks or NICs. Repairs take long and require downtime, so the pressure is enormous; for a mass-storage service like HBase, several disk failures a day is perfectly normal. Once Lindorm moved to ECS + cloud disks, these problems were solved.

ECS provides an almost infinite resource pool. Facing an urgent scale-out, we simply request new ECS instances from the pool and join them to the cluster; the whole process takes minutes, and traffic peaks hold no fear. With the storage/compute separation that cloud disks bring, we can flexibly allocate different disk space to different businesses and expand or shrink disks online when space runs short. Operations no longer have to think about hardware failures: when an ECS instance fails it is brought up on another host, and the cloud disk completely shields the upper layers from bad-disk handling. Extreme elasticity also optimizes cost: we no longer reserve excess resources for a business, and when a promotion ends we scale in quickly and cut the bill.


Integrated hot/cold separation

In massive big-data scenarios, part of a table's data turns into archive data or sees very low access frequency as time passes, yet the historical volume is huge, as with order data or monitoring data. Cutting the storage cost of such data saves enterprises a great deal. Lindorm's hot/cold separation was built to slash storage cost with minimal operations and configuration cost: Lindorm offers a new storage medium for cold data whose cost is only 1/3 that of an efficient cloud disk.

Lindorm separates hot and cold data within a single table. The system automatically archives the table's cold data into cold storage according to the hot/cold boundary the user sets. At query time the user only needs to supply a query hint or a TimeRange, and the system decides from the conditions whether the query should land in the hot or the cold zone. To the user it is always one table, almost completely transparent. See: https://yq.aliyun.com/article…
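
With the HBase-compatible API, the TimeRange mentioned above is just the standard Scan time range; a scan restricted to timestamps newer than the hot/cold boundary, as in this small example, can be served entirely from hot storage:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Scan;

    public final class HotZoneQuery {
        /** Scan only rows newer than the hot/cold boundary. */
        public static Scan recentRows(long coldBoundaryMillis) throws IOException {
            Scan scan = new Scan();
            // Timestamps in [coldBoundaryMillis, now) stay in the hot zone.
            scan.setTimeRange(coldBoundaryMillis, Long.MAX_VALUE);
            return scan;
        }
    }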


ZSTD-V2: compression ratio improved by 100%

As early as two years ago we replaced the group's storage compression algorithm with zstd, gaining an extra 25% of compression over the previous Snappy. This year we went a step further and developed the new ZSTD-V2 algorithm: for compressing small data, we train a dictionary on pre-sampled data and then use the dictionary to accelerate compression. When building an LDFile, Lindorm first samples the data, builds a dictionary, and then compresses with it. Across tests on different businesses' data, we measured a compression ratio 100% higher than plain zstd, which means another 50% of storage cost saved for customers.
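
The ZSTD-V2 implementation is internal, but the train-then-compress flow can be sketched with the open-source zstd-jni bindings (API names are zstd-jni's and may vary slightly across versions; real dictionary training also needs far more samples than shown here):

    import com.github.luben.zstd.ZstdCompressCtx;
    import com.github.luben.zstd.ZstdDictTrainer;
    import java.nio.charset.StandardCharsets;
    import java.util.List;

    public final class DictCompressionDemo {
        public static void main(String[] args) {
            // Stand-ins for pre-sampled data blocks; real training wants
            // thousands of samples, or trainSamples() may fail.
            List<byte[]> samples = List.of(
                    "order:1001:shop1:active".getBytes(StandardCharsets.UTF_8),
                    "order:1002:shop1:invalid".getBytes(StandardCharsets.UTF_8),
                    "order:1003:shop2:active".getBytes(StandardCharsets.UTF_8));

            // Train a 16 KB dictionary from up to 1 MB of sampled data.
            ZstdDictTrainer trainer = new ZstdDictTrainer(1 << 20, 16 * 1024);
            for (byte[] s : samples) trainer.addSample(s);
            byte[] dict = trainer.trainSamples();

            try (ZstdCompressCtx ctx = new ZstdCompressCtx()) {
                ctx.loadDict(dict);
                ctx.setLevel(3);
                byte[] compressed = ctx.compress(samples.get(0)); // dictionary-accelerated
                System.out.println(samples.get(0).length + " -> "
                        + compressed.length + " bytes");
            }
        }
    }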


HBase Serverless, the first choice for beginners

Alibaba Cloud HBase Serverless Edition is a new HBase service built on the Lindorm kernel with a serverless architecture. It truly turns HBase into a service: users no longer need to plan resources in advance, pick CPU and memory sizes, or purchase a cluster; there are no complex scale-out operations when business peaks or data grows, and no idle resources are wasted during a lull.

In daily use, users purchase request volume and space according to the current business volume. With HBase Serverless, users effectively hold an HBase cluster with unlimited resources that absorbs sudden traffic swings at any moment, while paying only for the resources they actually use.


For an introduction to HBase Serverless and how to use it, see: https://developer.aliyun.com/…

Security and multi-tenancy for large enterprise customers

The Lindorm engine has a complete built-in username/password system with multiple levels of permission control, authenticating every request to prevent unauthorized data access and keep user data secure. To meet the demands of large enterprise customers, Lindorm also ships multi-tenant isolation features such as groups and quota limiting, so that all of an enterprise's businesses can share one big-data platform safely and efficiently without affecting one another.

User and ACL system

The Lindorm kernel provides an easy-to-use user authentication and ACL system. To authenticate, a user only fills in a username and password in the configuration. Passwords are stored on the server in non-plaintext form and are never transmitted in plaintext; even if the ciphertext exchanged during authentication is intercepted, it cannot be replayed or forged.

Lindorm has three permission levels: global, namespace, and table, each covering the levels below it. For example, if user1 is granted read/write at the global level, it can read and write all tables in all namespaces; if user2 is granted read/write on namespace1, it automatically gets read/write on all tables in namespace1.

Group isolation

When multiple users or businesses share one HBase cluster, resource contention is common, and the reads and writes of important online businesses can be disturbed by the batch reads and writes of offline jobs. The HBase Enhanced Edition (Lindorm) provides the group feature to solve multi-tenant isolation: RegionServers are divided into groups, and each group hosts different tables, achieving resource isolation.

(Figure: group isolation, with RegionServer1-4 divided between group1 and group2.)

For example, in the figure above we create group1 and assign RegionServer1 and RegionServer2 to it, then create group2 and assign RegionServer3 and RegionServer4 to it. We also move table1 and table2 into group1, so all regions of table1 and table2 are placed only on RegionServer1 and RegionServer2 in group1.

Likewise, during assignment and balancing, the regions of table3 and table4, which belong to group2, land only on RegionServer3 and RegionServer4. As a result, requests to table1 and table2 are served only by RegionServer1 and RegionServer2, and requests to table3 and table4 only by RegionServer3 and RegionServer4, achieving resource isolation.

Quota limiting

A complete quota system is built into the Lindorm kernel to limit each user's resource usage. For every request, the kernel accurately computes the CUs (capacity units) consumed, which reflect the resources actually spent; for example, because of filters, a user's scan request may burn a lot of CPU and I/O on the RegionServer just filtering data, and all of that real consumption counts toward the CUs. When using Lindorm as a big-data platform, an enterprise administrator can assign one user per business and then use the quota system to cap a user's read CUs per second or total CUs, preventing any one user from hogging resources and hurting others. Quota limiting also supports namespace-level and table-level limits.

Finally

The new-generation NoSQL database Lindorm is the crystallization of nine years of technical accumulation by the Alibaba HBase & Lindorm team. Lindorm provides world-leading high-performance, cross-domain, multi-model hybrid storage and processing for massive data scenarios. It focuses on simultaneously serving big data (unlimited expansion, high throughput), online services (low latency, high availability), and multi-functional queries, giving users seamless expansion, high throughput, continuous availability, millisecond-level stable responses, tunable strong/weak consistency, low storage cost, and rich indexing. Lindorm has become one of the core products of Alibaba's big-data ecosystem, successfully supporting more than 1,000 businesses across the group's BUs and withstanding repeated tests in Tmall's Double 11. Alibaba's CTO has said that Alibaba's technology should be exported through Alibaba Cloud to benefit millions of customers in all walks of life. So, starting this year, Lindorm is available on Alibaba Cloud and in the proprietary cloud as the "HBase Enhanced Edition", letting cloud customers enjoy Alibaba's technology dividend and helping their businesses take off!
