Supplementary course of IM development basic knowledge (6): NoSQL or SQL for database? Reading this is enough!

Time:2019-12-7

Source: the 51CTO Technology Stack official account. Original title: "NoSQL or SQL? This article makes it clear". Revised and edited for this collection.

1. Introduction

With the arrival of the Internet big-data era, more and more websites and applications need to store massive amounts of data while also meeting requirements for high concurrency, high availability and high scalability.

In many cases traditional relational databases can no longer cope with these demands, and many hard-to-solve problems have been exposed.

As a result, a variety of NoSQL (Not Only SQL) databases have developed rapidly as a powerful complement to traditional relational databases.

This article analyzes some problems of traditional databases (i.e. SQL databases), as well as the characteristics, advantages and disadvantages of the main categories of NoSQL databases currently on the market, in the hope of providing some reference for choosing storage technology in different business scenarios.

Comments: in a community dedicated to instant messaging (IM) development, the first question many IM developers ask during architecture design is which database to choose: MySQL? Oracle? SQL Server? Or NoSQL? There is obviously no standard answer, because every product, system and architecture has its own user scale, usage scenarios, cost constraints and so on. This article may not give you an exact answer, but once you have read it and understand the technical features, applicable scenarios, advantages and disadvantages of the main databases on the market (including NoSQL databases), you should be able to find a database solution that suits your own product or system. That is the purpose of this article.

Learning and discussion:

  • Instant messaging / push technology development discussion group 5: 215477170 [recommended]
  • Introductory article on mobile IM development: "novice is enough: develop mobile IM from scratch"

(this article is published synchronously at: http://www.52im.net/thread-27…)

2. About the author

CAISON: mainly engaged in server-side development, requirements analysis, system design, optimization and refactoring. His main development language is Java. He is currently a server-side R&D engineer at Guangzhou beichat.

Chen Caihua has also shared several other technical articles. If you are interested, you can read them as well:

"Novice: the most thorough analysis so far of Netty's high-performance principles and framework architecture"

"High performance network programming (5): understanding the I/O model in high performance network programming in one article"

"High performance network programming (6): understanding the thread model in high performance network programming in one article"

3. Series of articles

▼ IM development practical-knowledge series (this is the 18th article):

Implementation of the IM message delivery guarantee mechanism (1): ensuring reliable delivery of online real-time messages

Implementation of the IM message delivery guarantee mechanism (2): ensuring reliable delivery of offline messages

How to guarantee the "ordering" and "consistency" of IM real-time messages?

Should online status in IM one-to-one chat and group chat be synchronized by "push" or "pull"?

IM group chat messages are so complex: how to ensure they are neither lost nor duplicated?

Design and implementation of an IM intelligent heartbeat algorithm for Android (including sample code)

How to save traffic when pulling data at mobile IM login?

Easy to understand: sharing a cluster-based load balancing scheme for the mobile IM access layer

On the principles of multi-point login and message roaming in mobile IM

Supplementary course of IM development basic knowledge (1): correctly understanding the principle of the front-end HTTP SSO single sign-on interface

Supplementary course of IM development basic knowledge (2): how to design a server-side storage architecture for a large number of image files?

Supplementary course of IM development basic knowledge (3): quickly understanding the principle of server-side database read/write splitting, with practical suggestions

Supplementary course of IM development basic knowledge (4): correctly understanding cookies, sessions and tokens in HTTP short connections

How to implement read receipts for IM group chat messages?

Should IM group chat messages be stored as one copy (read diffusion) or as multiple copies (write diffusion)?

Supplementary course of IM development basic knowledge (5): easy to understand, correctly understanding and using MQ message queues

A low-cost approach to guaranteeing IM message ordering

Supplementary course of IM development basic knowledge (6): NoSQL or SQL for database? Reading this is enough! (this article)

If you are a beginner in IM development, it is strongly recommended that you read "Beginner's guide: developing mobile IM from scratch".

4. Disadvantages of traditional SQL databases

Traditional relational databases have the following disadvantages.

1) High I/O in big-data scenarios: because data is stored by row, the database reads entire rows from the storage device into memory even when a computation needs only one column, resulting in high I/O.

2) Data is stored as row records: nested data structures cannot be stored directly.

3) Extending the table schema is inconvenient: changing the table structure requires executing DDL (data definition language) statements. The table may be locked during the change, making some services unavailable.

4) Full-text search is weak: a relational database can only perform substring matching. As a table grows, LIKE queries become very slow even when the column is indexed, typically because a pattern with a leading wildcard cannot use the index. Moreover, relational databases are not well suited to indexing long text fields.

5) Storing and processing complex relationship data is weak: many applications need to understand and navigate relationships in highly connected data, for use cases such as social applications, recommendation engines, fraud detection, knowledge graphs, life sciences and IT/network operations. Traditional relational databases, however, are not good at handling relationships between data points. Their tabular data model and rigid schema make it difficult to add new or different kinds of relationship information.

5. NoSQL solution

NoSQL (Not Only SQL) generally refers to non-relational databases and can be understood as a powerful complement to SQL.

In many respects NoSQL performs much better than relational databases, but this usually comes at the cost of some missing features. The most common is the lack of the transaction support that a transactional database provides.

The four properties required for the correct execution of database transactions, known as ACID, are:

A (atomicity): all operations in a transaction either complete together or have no effect at all;

C (consistency): a transaction moves the database from one valid state to another;

I (isolation): concurrently executing transactions do not interfere with each other;

D (durability): once a transaction is committed, its changes are not lost.

Next, we introduce the technical features of five kinds of NoSQL database and show which disadvantage of traditional relational databases each one addresses.

6. Column database

A column database (column-oriented database) stores data column by column. It is mainly suited to batch data processing and ad hoc queries.

Its counterpart, the row database, stores data row by row. It is mainly suited to processing small batches of data and is commonly used for online transaction processing (OLTP).

Thanks to column storage, a column database can solve the high I/O problem of relational databases in certain scenarios.

6.1 basic principles

A traditional relational database stores data by row and is therefore called a "row database", while a column database stores data by column.

There are two ways to lay a table out in the storage system, and most of us are used to row storage. Row storage puts each row in a contiguous physical location, much like traditional records and file systems.

Column storage instead stores the values of each column contiguously, so the values of one column across all rows sit next to each other.
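
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration on a made-up three-row table (the column names `id`, `name` and `age` and the helper functions are assumptions, not part of any real database): it shows that an aggregate over one column needs only that column's contiguous values in the column layout.

```python
# Hypothetical three-row table, used only for illustration.
rows = [
    {"id": 1, "name": "alice", "age": 30},
    {"id": 2, "name": "bob", "age": 25},
    {"id": 3, "name": "carol", "age": 35},
]

def to_row_layout(rows):
    """Row storage: the values of each row sit next to each other."""
    layout = []
    for row in rows:
        layout.extend([row["id"], row["name"], row["age"]])
    return layout

def to_column_layout(rows):
    """Column storage: the values of each column sit next to each other."""
    return {col: [row[col] for row in rows] for col in ("id", "name", "age")}

row_layout = to_row_layout(rows)
column_layout = to_column_layout(rows)

# In the row layout the ages are interleaved with the other fields,
# so reading them means stepping over every row.
ages_from_rows = row_layout[2::3]

# In the column layout the ages are one contiguous list.
ages_from_columns = column_layout["age"]
average_age = sum(ages_from_columns) / len(ages_from_columns)
```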

[Figure: graphical comparison of the two storage methods, row storage and column storage]

6.2 common column database

HBase: an open-source, non-relational distributed database (NoSQL) modeled after Google's Bigtable and written in Java.

It is part of the Apache Software Foundation's Hadoop project. It runs on top of the HDFS file system and provides Bigtable-like capabilities for Hadoop, so it can store large amounts of sparse data in a fault-tolerant way.

Bigtable: a compressed, high-performance, highly scalable data storage system built on the Google File System (GFS). It is used to store large-scale structured data and is suitable for cloud computing.

6.3 relevant characteristics

1) the advantages are as follows:

Efficient use of storage space: a column database can apply a different compression algorithm to each column according to the characteristics of its data, so it usually achieves a higher compression ratio than a row database.

The compression ratio of a typical row database is about 3:1 to 5:1, while that of a column database is about 8:1 to 30:1.

A common approach is dictionary compression: each distinct string in a column is stored once in a dictionary table, and the strings in the data table are replaced by their numeric codes.

Compression is achieved precisely because each string appears only once in the dictionary table (somewhat like normalization versus denormalization).
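
A minimal sketch of dictionary compression follows. The `city_column` data and the function names are made up for illustration, and real column stores use more elaborate encodings; the point is only that repeated strings collapse into small integers plus one dictionary entry each.

```python
def dictionary_encode(values):
    """Replace each string with a small integer code.

    Returns (dictionary, codes), where dictionary[i] is the string for code i.
    """
    dictionary = []
    code_of = {}
    codes = []
    for value in values:
        if value not in code_of:
            code_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(code_of[value])
    return dictionary, codes

def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]

# Hypothetical column with many repeated values.
city_column = ["Guangzhou", "Beijing", "Guangzhou", "Shanghai", "Beijing", "Guangzhou"]
dictionary, codes = dictionary_encode(city_column)
restored = dictionary_decode(dictionary, codes)
```

Each distinct city is stored once in `dictionary`; the column itself becomes a list of integers that can be compared and compressed far more cheaply than the strings.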

High query efficiency: reading the same column across many records is efficient because the values of a column are stored together. A single disk operation can read all the data of the specified columns into memory.

The advantages of columnar storage (and data compression) can be illustrated by how a query with conditions on several columns is executed.

The steps are as follows:

A. look up the string in the dictionary to find its corresponding number (only one string comparison is needed);

B. match that number against the column, and set the bit at each matching position to 1;

C. perform a bitwise operation on the matching results of the different columns to obtain the record positions that satisfy all conditions;

D. use these positions to assemble the final result set.
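
Steps A to D can be sketched as follows. This is a toy model, not the implementation of any particular column database: the `columns` data, the use of Python integers as bitmaps, and the function names are all assumptions made for illustration.

```python
def encode(values):
    """Dictionary-encode one column: returns (string -> code, list of codes)."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return code_of, [code_of[value] for value in values]

def match_bitmap(code_of, codes, wanted):
    """Steps A and B: one dictionary lookup, then integer comparisons."""
    code = code_of.get(wanted)              # step A
    bitmap = 0
    if code is None:
        return bitmap
    for position, value_code in enumerate(codes):
        if value_code == code:              # step B
            bitmap |= 1 << position
    return bitmap

def query(columns, conditions):
    """Return the rows that satisfy every column == value condition."""
    n_rows = len(next(iter(columns.values())))
    encoded = {name: encode(values) for name, values in columns.items()}
    result = (1 << n_rows) - 1              # start with every row selected
    for name, wanted in conditions.items():
        code_of, codes = encoded[name]
        result &= match_bitmap(code_of, codes, wanted)   # step C
    positions = [i for i in range(n_rows) if (result >> i) & 1]
    # Step D: assemble the result set from the matching positions.
    return [{name: values[i] for name, values in columns.items()} for i in positions]

# Hypothetical table stored column by column.
columns = {
    "city": ["Guangzhou", "Beijing", "Guangzhou", "Shanghai"],
    "device": ["android", "ios", "ios", "android"],
    "user": ["u1", "u2", "u3", "u4"],
}
matches = query(columns, {"city": "Guangzhou", "device": "ios"})
```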

A column database is also well suited to aggregation, and to large volumes of data rather than small ones.

2) the disadvantages are as follows:

Not suitable for scanning small amounts of data;

Not suitable for random updates;

Not suitable for real-time operations that include deletes and updates;

Operations on a single row are ACID. Multi-row transactions, however, cannot be rolled back normally: I (isolation, since transactions are committed serially) and D (durability) are supported, but A (atomicity) and C (consistency) cannot be guaranteed.

6.4 use scenarios

Take HBase as an example:

1) large data volumes (hundreds of terabytes) that need fast random access;

2) write-intensive applications that write huge amounts of data every day but read comparatively little, such as IM message history and game logs;

3) data that does not need to be queried with complex conditions. HBase only supports queries based on the row key. Fetching a single record or scanning a small range works well; large-range queries may suffer some performance impact because the data is distributed. HBase is not suitable for data models with joins, multi-level indexes or complex table relationships;

4) applications with very high performance and reliability requirements. HBase has no single point of failure, so availability is very high;

5) applications whose data volume is large, whose growth is unpredictable and which need to scale gracefully. HBase supports online expansion, so even a sudden surge in data volume can be absorbed by scaling HBase horizontally;

6) storage of structured and semi-structured data.
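
Because queries go through the row key, the row key design decides which lookups are cheap. The sketch below is a toy in-memory stand-in, not the HBase client API: `SortedKeyStore`, `message_key` and the key format `conversation:timestamp` are hypothetical names chosen to show how IM history for one conversation becomes a small range scan over sorted keys.

```python
import bisect

class SortedKeyStore:
    """Toy stand-in for a store that keeps rows sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in sorted order
        self._rows = {}   # row key -> row data

    def put(self, key, row):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = row

    def get(self, key):
        """Single-record lookup by exact row key."""
        return self._rows.get(key)

    def scan(self, start_key, stop_key):
        """Rows with start_key <= key < stop_key, in key order."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]

def message_key(conversation_id, timestamp_ms):
    # Zero-padding keeps lexicographic order equal to numeric order.
    return f"{conversation_id}:{timestamp_ms:013d}"

store = SortedKeyStore()
store.put(message_key("conv42", 1575700000000), {"text": "hello"})
store.put(message_key("conv42", 1575700005000), {"text": "are you there?"})
store.put(message_key("conv43", 1575700001000), {"text": "other chat"})

# All messages of one conversation form one contiguous key range.
history = store.scan(message_key("conv42", 0), message_key("conv42", 9999999999999))
```

A query such as "all messages containing a given word" has no matching key range in this design and would require scanning everything, which is the kind of complex query the list above advises against.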

7. K-V database

A K-V database stores data as key-value pairs: data is organized, indexed and stored in key-value form.

K-V storage is well suited to business data that involves few relationships. It can effectively reduce the number of disk reads and writes, offers better read and write performance than SQL databases, and solves the problem that relational databases cannot store data structures.

7.1 common K-V database

Redis: an open-source, networked, in-memory key-value store with optional persistence, written in ANSI C.

Redis development has been sponsored by Redis Labs since June 2015; from May 2013 to June 2015 it was sponsored by Pivotal.

Before May 2013 it was sponsored by VMware. According to db-engines.com, a site that ranks databases monthly, Redis is the most popular key-value database.

Cassandra: Apache Cassandra (C*) is an open-source distributed NoSQL database system.

It was originally developed by Facebook to store simply formatted data such as inbox data, combining the data model of Google Bigtable with the fully distributed architecture of Amazon Dynamo.

Facebook open-sourced Cassandra in 2008. Since then, thanks to its good scalability and performance, it has been adopted by Apple, Comcast, Instagram, Spotify, eBay, Rackspace, Netflix and other well-known companies, becoming a popular distributed structured data storage solution.

LevelDB: an embedded key-value database library developed by Google and released under an open-source BSD license.

7.2 relevant characteristics

Taking Redis as an example, the advantages of a K-V database are as follows:

1) high performance: Redis can support more than 100,000 TPS;

2) rich data types: Redis supports String, Hash, List, Set, Sorted Set, Bitmap and HyperLogLog;

3) rich features: Redis also supports publish/subscribe, notifications, key expiration and more.

The disadvantages are as follows:

In ACID terms, Redis transactions do not guarantee atomicity and durability (A and D); they only provide isolation and consistency (I and C).

Note in particular: the atomicity that cannot be guaranteed here refers to Redis transactions, which do not support rollback. Individual Redis commands are atomic, thanks to Redis's single-threaded model.
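
What "no rollback" means in practice can be shown with a toy model. The `ToyStore` class below is purely hypothetical: it is not Redis and not a Redis client, it only imitates the behavior that a queued batch keeps going after one command fails at run time and that the commands which already ran are not undone.

```python
class ToyStore:
    """Toy in-memory store; NOT Redis and NOT a Redis client."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return "OK"

    def incr(self, key):
        value = self.data.get(key, 0)
        if not isinstance(value, int):
            raise TypeError("value is not an integer")
        self.data[key] = value + 1
        return self.data[key]

    def execute_batch(self, commands):
        """Run queued commands in order; a failure does not undo earlier ones."""
        results = []
        for name, args in commands:
            try:
                results.append(getattr(self, name)(*args))
            except TypeError as error:
                results.append(error)   # record the error and keep going
        return results

store = ToyStore()
store.set("nickname", "caison")
results = store.execute_batch([
    ("incr", ("unread",)),            # succeeds: unread becomes 1
    ("incr", ("nickname",)),          # fails: nickname holds a string
    ("set", ("status", "online")),    # still runs after the failure
])
```

After the batch, `unread` and `status` have been written even though the middle command failed, so the batch as a whole was not all-or-nothing.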

Most businesses do not need to follow ACID strictly, for example real-time game leaderboards or follower relationships. Even if persisting some data fails, the business impact is very small. The design should therefore be chosen according to business characteristics and requirements.

7.3 use scenarios

Applicable scenarios:

Storing user information (such as sessions), profiles, parameters, shopping carts and so on. Such information is generally associated with an ID (the key).

Not applicable scenarios:

1) queries by value rather than by key: a key-value database offers no way to query by value;

2) storing relationships between data: a key-value database cannot associate two or more keys with each other;

3) transaction support: a key-value database cannot roll back when a failure occurs.

8. Document database

A document database (also known as a document-oriented database) is designed to store semi-structured data as documents. Document databases usually store data in JSON or XML format.

Because a document database is schema-free, it can store and read arbitrary data.

The data format is JSON or BSON. Because JSON data is self-describing, fields do not need to be defined before use, and reading a field that does not exist in a JSON document does not cause a syntax error as it would in SQL. This solves the problem that extending the table schema of a relational database is inconvenient.

8.1 common document database

MongoDB: a document-oriented database management system written in C++, intended to solve a large number of practical problems faced by the application development community. The 10gen team began developing MongoDB in October 2007, and it was first released in February 2009.

CouchDB: Apache CouchDB is an open-source database that focuses on ease of use and on being "a database that completely embraces the web".

It is a NoSQL database that uses JSON as its storage format, JavaScript as its query language, and MapReduce and HTTP as its API.

One notable feature is multi-master replication. The first version of CouchDB was released in 2005, and it became an Apache project in 2008.

8.2 relevant characteristics

Taking MongoDB as an example, the advantages of a document database are as follows:

1) adding fields is simple: there is no need to execute DDL statements to change the table structure as in a relational database; application code can read and write the new field directly;

2) compatibility with historical data is easy: historical documents that lack a new field do not cause errors, they simply return a null value, which the code can handle for compatibility;

3) storing complex data is easy: JSON is a powerful description language that can describe complex data structures.
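
The three points above can be sketched with plain JSON documents in Python. No database is involved here; the field names (`signature`, `devices`) and the helper `read_signature` are made up for illustration.

```python
import json

# A document written before the "signature" field existed.
old_document = json.loads('{"user_id": 1, "nickname": "alice"}')

# A newer document simply carries the extra field and a nested structure;
# no schema change was needed to start writing it.
new_document = json.loads(
    '{"user_id": 2, "nickname": "bob", "signature": "hi", '
    '"devices": [{"os": "android", "last_login": "2019-12-07"}]}'
)

def read_signature(document):
    # A missing field yields None instead of an error;
    # the application code decides on the fallback value.
    signature = document.get("signature")
    return signature if signature is not None else ""

signatures = [read_signature(doc) for doc in (old_document, new_document)]
first_device_os = new_document["devices"][0]["os"]
```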

Compared with a traditional relational database, the main disadvantage of a document database is its weak transaction support across multiple records, as shown below:

1) atomicity: only single-row/single-document atomicity is supported; multi-row, multi-document and multi-statement atomicity is not;

2) isolation: only the read committed isolation level is supported, which may lead to non-repeatable reads and phantom reads;

3) complex queries such as joins are not supported. If you need a join, you have to query the database several times.

MongoDB does support consistency and durability. Multi-document ACID transactions were officially introduced in MongoDB 4.0 (for replica sets), so the atomicity limitation above describes earlier versions; how well the feature holds up in practice should still be verified against your own workload.

8.3 use scenarios

Applicable scenarios:

1) the data volume is large, or will become large in the future;

2) the table structure is not clearly defined and fields keep being added, as in content management systems and information management systems.

Not applicable scenarios:

1) transactions that span different documents. Document-oriented databases have traditionally offered little or no support for transactions across documents;

2) complex queries across multiple documents, such as joins.

9. Full text search engine

A traditional relational database relies mainly on indexes to make queries fast. For full-text search, however, such indexes are of little help.

This shows up mainly in two ways:

1) full-text search conditions can be combined arbitrarily. Covering them with indexes would require a very large number of indexes;

2) the fuzzy matching used in full-text search cannot be served by an index, so only LIKE queries can be used. These scan the whole table and are very inefficient.

Full-text search engines emerged to solve the weak full-text search capability of relational databases.

9.1 basic principles

The technique behind full-text search engines is called the "inverted index", an indexing method whose basic principle is to build an index from words to documents. Its counterpart is the "forward index", whose basic principle is to build an index from documents to words.

[Figure: an example document collection]

[Figure: the forward index of the collection]

As the forward index shows, it is suited to looking up a document's content by document name.

[Figure: a simple inverted index of the collection]

[Figure: an inverted index with word frequency information]

As the inverted index shows, it is suited to looking up documents by keyword.
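
The three index forms can be built in a few lines. The document collection below is made up for illustration (it is not the collection from the original figures), and splitting on whitespace stands in for the tokenization a real search engine would perform.

```python
from collections import Counter, defaultdict

# Hypothetical document collection.
documents = {
    "doc1": "nosql databases scale out",
    "doc2": "sql databases support transactions",
    "doc3": "nosql and sql databases",
}

# Forward index: document -> words.
forward_index = {name: text.split() for name, text in documents.items()}

# Simple inverted index: word -> set of documents containing it.
inverted_index = defaultdict(set)
for name, words in forward_index.items():
    for word in words:
        inverted_index[word].add(name)

# Inverted index with word frequency: word -> {document: occurrences}.
inverted_with_frequency = defaultdict(dict)
for name, words in forward_index.items():
    for word, count in Counter(words).items():
        inverted_with_frequency[word][name] = count

def search(*keywords):
    """Names of the documents that contain every keyword."""
    postings = [inverted_index.get(keyword, set()) for keyword in keywords]
    return sorted(set.intersection(*postings)) if postings else []
```

A keyword query becomes a set intersection over the posting lists, with no scan over document text, which is why arbitrary keyword combinations stay cheap.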

9.2 common full-text search engines

Elasticsearch: a search engine based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.

Elasticsearch is developed in Java and released as open source under the terms of the Apache License.

According to DB-Engines, Elasticsearch is the most popular enterprise search engine, followed by Apache Solr, which is also based on Lucene.

Solr: the open-source enterprise search platform of the Apache Lucene project. Its main features include full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (such as Word and PDF) handling. Solr is highly scalable and provides distributed search and index replication.

9.3 relevant characteristics

Taking Elasticsearch as an example, the advantages of a full-text search engine are as follows:

1) high query efficiency: massive amounts of data can be processed in near real time;

2) scalability: based on a cluster environment, it scales horizontally with ease and can handle petabyte-scale data;

3) high availability: an Elasticsearch cluster is resilient. It detects new or failed nodes, and reorganizes and rebalances data to keep the data safe and accessible.

The disadvantages are as follows:

1) insufficient ACID support: operations on a single document are ACID, but a transaction that spans multiple documents cannot be rolled back normally. I (isolation, based on an optimistic locking mechanism) and D (durability) are supported, while A (atomicity) and C (consistency) are not;

2) weak support for complex multi-table joins of the kind a relational database performs through foreign keys;

3) there is some read/write latency: written data becomes searchable after about 1 second at the earliest;

4) update performance is low: under the hood, an update deletes the old data and then inserts the new data;

5) memory consumption is high, because Lucene loads part of the index into memory.

9.4 use scenarios

Applicable scenarios are as follows:

1) distributed search engine and data analysis engine;

2) full text search, structured search and data analysis;

3) near-real-time processing of massive data: the data can be spread across multiple servers for storage and retrieval.

Not applicable scenarios are as follows:

1) data needs to be updated frequently;

2) complex associated query is required.

10. Graph database

A graph database applies graph theory to store relationship information between entities. The most common example is the relationships between people in a social network.

Relational databases do a poor job of storing "relationship" data: the queries are complex, slow and often perform worse than expected.

The distinctive design of graph databases makes up for this shortcoming and solves the problem that relational databases are weak at storing and processing complex relationship data.

10.1 common graph database

Neo4j: a graph database management system developed by Neo4j, Inc. According to the DB-Engines ranking, Neo4j is the most popular graph database.

ArangoDB: a native multi-model database system developed by triAGENS GmbH. It supports three important data models (key/value, document and graph) with one database core and a unified query language, AQL (ArangoDB Query Language).

The query language is declarative and allows different data access patterns to be combined in a single query. ArangoDB is a NoSQL database system, but AQL resembles SQL in many ways.

Titan: a scalable graph database optimized for storing and querying graphs that contain tens of billions of vertices and edges distributed across a multi-machine cluster.

Titan is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time.

10.2 relevant characteristics

Taking Neo4j as an example: Neo4j models data with the graph concept from data structures. Its two basic concepts are nodes and edges.

Nodes represent entities, and edges represent relationships between entities. Both nodes and edges can have their own properties. Different entities are connected by different relationships to form a complex object graph.

A relational database and a graph database store relationship data in different structures.

Neo4j stores nodes using "index-free adjacency": each node holds pointers to its neighboring nodes, so a neighbor can be found in O(1) time.

In addition, according to the official documentation, edges are what matter most in Neo4j: they are "first-class entities". They are therefore stored separately, which speeds up graph traversal and makes it easy to traverse in either direction.
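
The idea of following neighbor references instead of consulting a global index can be sketched in Python. This is an in-memory illustration only and says nothing about Neo4j's actual storage format; the `Node` class, `connect` and `friends_within` are hypothetical names.

```python
from collections import deque

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # direct references to adjacent nodes

def connect(a, b):
    """Add an undirected relationship between two nodes."""
    a.neighbors.append(b)
    b.neighbors.append(a)

def friends_within(start, max_hops):
    """Breadth-first traversal that only follows neighbor references."""
    hops = {start.name: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node.name] == max_hops:
            continue   # do not expand beyond the hop limit
        for neighbor in node.neighbors:
            if neighbor.name not in hops:
                hops[neighbor.name] = hops[node.name] + 1
                queue.append(neighbor)
    return sorted(name for name, distance in hops.items() if distance > 0)

# Hypothetical social graph: alice - bob - carol - dave.
alice, bob, carol, dave = (Node(name) for name in ("alice", "bob", "carol", "dave"))
connect(alice, bob)
connect(bob, carol)
connect(carol, dave)

friends_of_friends = friends_within(alice, 2)
```

The traversal touches only the nodes reachable within the hop limit, so its cost depends on the local neighborhood rather than on the total number of nodes in the graph.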

The advantages are as follows:

1) high performance: graph traversal is an algorithm specific to the graph data structure. Starting from one node and following its connections, neighboring nodes can be found quickly and easily.

This way of finding data is not affected by the total amount of data, because a neighbor lookup always touches a limited, local part of the data and never searches the whole database.

2) design flexibility: the naturally extensible data structure and the unstructured data format give graph database design great scalability and flexibility.

Nodes, relationships and properties that are added as requirements change do not affect the normal use of existing data.

3) development agility: the data model is intuitive. From requirements discussion through development and implementation to final storage in the database, the model hardly changes its shape and may even stay exactly the same.

4) full ACID support: unlike other NoSQL databases, Neo4j offers full transaction management with complete ACID support.

The disadvantages are as follows:

1) the number of nodes, relationships and properties it can hold is limited;

2) sharding is not supported.

10.3 use scenarios

Applicable scenarios are as follows:

1) data with strong relationships, such as social networks;

2) recommendation engines: representing the data as a graph is very helpful for producing recommendations.

Not applicable scenarios are as follows:

1) recording large amounts of event-based data (such as log entries or sensor data);

2) processing large-scale distributed data, in the manner of Hadoop;

3) structured data that is well suited to storage in a relational database;

4) binary data storage.

11. Summary

When choosing between a relational database and a NoSQL database, several factors need to be considered:

1) data volume;

2) concurrency;

3) real-time requirements;

4) consistency requirements;

5) read/write distribution and types;

6) security;

7) operation and maintenance cost.

Reference choices for common software systems are as follows:

1) internal management systems, such as operations systems: data volume and concurrency are both small, so a relational database is preferred;

2) high-traffic systems, such as an e-commerce product detail page: use a relational database on the back end and an in-memory database on the front end;

3) log systems: consider a column database for the raw data and an inverted index for log search;

4) search systems, such as in-site search (as opposed to general web search), for example product search: use a relational database on the back end and an inverted index on the front end;

5) transactional systems, such as inventory, trading and bookkeeping: consider a relational database plus a cache plus a consistency protocol;

6) offline computation, such as large-scale data analysis: consider a column database, or a relational database;

7) real-time computation, such as real-time monitoring: choose an in-memory database or a column database.

In design practice, the architecture should be driven by requirements and the business. Whether you choose RDB, NoSQL or DRDB, the choice must be requirement-oriented, and the final data storage solution will inevitably be a comprehensive design that balances various trade-offs.

Appendix: summary of more im architecture and other hot issues

[1] articles on IM architecture design:

On the architecture design of IM system

Brief introduction to the pits of mobile IM development: architecture design, communication protocol and client

A set of practice sharing of mobile IM architecture design for mass online users (including detailed graphics and text)

A theoretical framework scheme of original distributed instant messaging (IM) system

From zero to excellence: the evolution of the technical architecture of JD customer service instant messaging system

Architecture selection of instant messaging / IM server development in mushroom Street

Ppt: technical challenges and architecture evolution of Tencent QQ 140 million online users

Design practice of cold and hot hierarchical architecture for massive data based on time sequence in wechat background

Wechat technical director talks about the structure: the way of wechat — from the avenue to the Jane (full speech)

How to interpret wechat technical director talking about architecture: the way of wechat

Fast fission: witness the evolution of wechat’s powerful background architecture from 0 to 1 (1)

17 years of practice: technical methodology of Tencent’s mass products

How to ensure the efficiency and real-time of large-scale group message push in mobile IM? “

Discussion on synchronization and storage of chat messages in modern IM system

Supplementary course of IM development basic knowledge (2): how to design a server storage architecture for a large number of image files? “

Supplementary course for basic knowledge of IM development (3): quick understanding of the principle of separation of reading and writing of server-side database and practical suggestions

Make up for basic knowledge of IM development (4): correct understanding of cookies, sessions and tokens in HTTP short connections

WhatsApp technology practice sharing: Technology myth created by 32 person engineering team

Technical challenges and practice summary behind the 100 billion visits of wechat friend circle

Behind the glory of the king 200 million users: product positioning, technical architecture, network solutions, etc

Selection of MQ message middleware in IM system: Kafka or rabbitmq? “

“Summary of Tencent senior architects: an article to understand all aspects of large-scale distributed system design”

Take microblog application scenarios as an example to summarize the architecture design steps of mass social system

Quick understanding of load balancing technology principle of high performance HTTP server

Behind bullet message Brilliance: Technology Practice of sharing billion level IM platform by chief architect of Netease Yunxin

Zhihu technology sharing: the road to redis high performance cache practice from single machine to 20 million QPS concurrency

Supplementary course of IM development basic knowledge (5): easy to understand, correctly understand and use MQ message queue

WeChat technology sharing: WeChat’s massive IM chat message serial number generation practice (algorithm principle part)

WeChat technology sharing: WeChat’s massive IM chat message serial number generation practice (disaster recovery plan)

For beginners: understanding the evolution history, technical principles and best practices of large-scale distributed architecture from scratch

Design practice of a set of highly available, scalable and highly concurrent IM group chat and single chat architecture

Alibaba technology sharing: in depth disclosure of the 10-year history of Alibaba database technology solutions

Alibaba technology sharing: the hard road of Alibaba developing its own financial-grade database OceanBase

Decryption of social software red packet technology (1): comprehensive decryption of QQ red packet technology scheme, including architecture, technology implementation, etc.

Decryption of social software red packet technology (2): the evolution of WeChat shake red packet technology from 0 to 1

Decryption of social software red packet technology (3): technical details behind the WeChat shake red packet rain

Decryption of social software red packet technology (4): how the WeChat red packet system responds to high concurrency

Decryption of social software red packet technology (5): how the WeChat red packet system achieves high availability

Decryption of social software red packet technology (6): evolution practice of the storage layer architecture of the WeChat red packet system

Decryption of social software red packet technology (7): Alipay red packet massive-concurrency technology practice

Decryption of social software red packet technology (8): comprehensive decryption of microblog red packet technology scheme

Decryption of social software red packet technology (9): functional logic, disaster recovery, operation and maintenance, architecture, etc. of the Mobile QQ red packet

New to instant messaging: what is Nginx? Can it achieve IM load balancing?

New to instant messaging: a quick understanding of RPC technology – basic concepts, principles and uses

Multi dimensional comparison of 5 mainstream distributed MQ message queues, my mother is no longer worried about my technology selection

From the guerrillas to the regular army: the evolution of the IM system architecture of the Mafengwo travel site

Supplementary course of IM development basic knowledge (6): NoSQL or SQL for database? Reading this is enough!


[2] More articles related to architecture design:

Summary of Tencent senior architects: an article to understand all aspects of large-scale distributed system design

Quick understanding of load balancing technology principle of high performance HTTP server

Behind the success of Bullet Messenger: the chief architect of NetEase Yunxin shares the technology practice of a billion-level IM platform

Zhihu technology sharing: Redis high-performance caching in practice, from a single machine to 20 million QPS

For beginners: understanding the evolution history, technical principles and best practices of large-scale distributed architecture from scratch

Alibaba technology sharing: in depth disclosure of the 10-year history of Alibaba database technology solutions

Alibaba technology sharing: the hard road of Alibaba developing its own financial-grade database OceanBase

The evolution practice of the Dada O2O back-end architecture: the effort behind high-concurrency requests from 0 to 4000

Knowledge of excellent back-end architects: summary of the most complete MySQL large table optimization scheme in history

Xiaomi technology sharing: decrypting the evolution and practice of the ten-million-level high-concurrency architecture of the Xiaomi flash-sale system

Read and understand the load balancing technology under the distributed architecture: classification, principle, algorithm, common scheme, etc

Easy to understand: how to design a database architecture that can support millions of concurrent requests?

Multi dimensional comparison of 5 mainstream distributed MQ message queues, my mother is no longer worried about my technology selection

From novice to architect, one article is enough: the evolution of architecture with high concurrency from 1 to 10 million

Meituan technology sharing: deep decryption of Meituan’s distributed ID generation algorithm


[3] Comprehensive articles on IM development:

One article is enough for beginners: developing mobile IM from scratch

Required reading for mobile IM developers (1): an easy-to-understand look at the “weak” and “slow” of mobile networks

Required reading for mobile IM developers (2): summary of the most complete mobile weak network optimization methods in history

Talking about the message reliability and delivery mechanism of mobile IM from the perspective of client

Summary of optimization methods for short connection of modern mobile network: request speed, weak network adaptation and security

Tencent technology sharing: the evolution of bandwidth compression technology for social network pictures

Must-read for beginners: a casual talk about sessions and tokens in HTTP short connections

Supplementary course of IM development basic knowledge: correct understanding of the principle of pre HTTP SSO single point login interface

How to ensure the efficiency and real-time delivery of large-scale group message push in mobile IM?

Technical problems for mobile IM development

Is it better to develop IM by using byte stream or character stream?

Do you know the mainstream way of voice message chat?

Implementation of IM message delivery assurance mechanism (1): ensure reliable delivery of online real-time messages

Implementation of IM message delivery assurance mechanism (2): ensure reliable delivery of offline messages

How to ensure the “ordering” and “consistency” of IM real-time messages?

A low-cost approach to ensure IM message ordering

Should I use “push” or “pull” to synchronize online status in IM single chat and group chat?

IM group chat messages are so complex: how to ensure that they are neither lost nor duplicated?

On the optimization of login request in the development of mobile IM

How to save traffic by pulling data when mobile IM logs in?

On the principle of multi point login and message roaming of mobile IM

How to design a “failure retry” mechanism for self-developed IM?

Easy to understand: sharing of a cluster-based load balancing scheme for the mobile IM access layer

Technical test and analysis of the influence of WeChat on the network (full text of the paper)

Principle, technology and application of instant messaging system (Technical Paper)

The current state of the open source IM project “Mogujie TeamTalk”: an open source show that started but never finished

QQ music team sharing: detailed explanation of image compression technology in Android (Part 1)

QQ music team sharing: detailed explanation of image compression technology in Android (Part 2)

Tencent original sharing (1): how to greatly improve the picture transmission speed and success rate of Mobile QQ under the mobile network

Tencent original sharing (2): how to significantly reduce the traffic consumption of apps under the mobile network (Part 1)

Tencent original sharing (3): how to significantly reduce the traffic consumption of apps under the mobile network (Part 2)

As promised: Mars, the cross-platform mobile IM network layer component library WeChat uses itself, has been officially open-sourced

How does Yelp, a service built on social networking, achieve lossless compression of massive user images?

Tencent technology sharing: how Tencent greatly reduces bandwidth and network traffic (picture compression)

Tencent technology sharing: how Tencent greatly reduces bandwidth and network traffic (audio and video technology)

About character encoding: a quick understanding of ASCII, Unicode, GBK, and UTF-8

Master the characteristics, performance, and optimization of mainstream image formats on mobile terminals

Behind the success of Bullet Messenger: the chief architect of NetEase Yunxin shares the technology practice of a billion-level IM platform

Supplementary course of IM development basic knowledge (5): easy to understand, correctly understand and use MQ message queue

WeChat technology sharing: WeChat’s massive IM chat message serial number generation practice (algorithm principle part)

Is it so hard to develop IM by yourself? A hands-on guide to building a simple Android IM (with source code)

RongCloud technology sharing: decrypting the chat message ID generation strategy of RongCloud IM products

Supplementary course of IM development basic knowledge (6): NoSQL or SQL for database? Reading this is enough!


(this article is published synchronously at: http://www.52im.net/thread-27…)