Summary: What is a distributed system? Why use distributed systems? How are distributed systems distributed? Do you know?
Distributed systems are an old and broad topic. In recent years, with the rise of "big data", the topic has gained new vitality. This article discusses distributed systems through the following questions:
What is a distributed system?
Why use distributed systems?
Distributed system design deduction
What is the CAP theorem?
How do distributed systems distribute?
What types of architectures are commonly used in distributed applications?
What are the advantages and disadvantages of distributed systems?
- What is a distributed system?
In short, a distributed system is a group of computers working together that appears to the end user as a single computer.
These computers have shared state, operate concurrently, and the failure of an individual machine does not affect the normal operation of the whole system.
Let’s take an example. A traditional database is stored on the file system of a single machine. Whenever we read or insert information, we interact directly with that machine.
Now let’s turn this traditional database into a distributed database. Suppose we use three machines to build it. The goal is that a record inserted on machine 1 can be returned by machine 3. Of course, machines 1 and 2 should be able to return this record as well.
- Why use distributed systems?
Managing distributed systems is a very complex topic, full of traps and landmines. Deploying, maintaining and debugging distributed systems is also a headache, so why do it at all?
The biggest advantage of a distributed system is that it lets you scale the system horizontally.
Taking the single database mentioned above as an example, the only way for it to handle more traffic is to upgrade the hardware the database runs on. This is vertical expansion (scaling up).
Vertical expansion has limits. Past a certain point, we find that even the best hardware cannot meet the current traffic requirements.
Horizontal expansion (scaling out) means improving the performance of the whole system by adding more machines, rather than upgrading the hardware of a single computer.
In terms of cost, horizontal expansion is easier to control than vertical expansion.
The most fundamental problem is that vertical expansion is strictly limited. Even after upgrading to the latest hardware, a single machine still cannot meet the technical requirements of a medium or large workload.
Horizontal expansion does not have this limit. Whenever performance drops, you add another machine, so in theory it can support an unlimited workload.
In addition, distributed systems offer advantages in fault tolerance and low latency. Fault tolerance means that an error in one node of your distributed system will not paralyze the whole system, whereas an error in a single-machine system can bring everything down.
Low latency comes from deploying machines in different physical locations and serving each request from a nearby one, which reduces access delay.
The benefits of distributed systems are discussed above, but we must be clear that designing and running distributed systems is not easy.
- Distributed system design deduction
Let’s start with a scenario. Our existing web application is becoming more and more popular and the number of services is also increasing, so the application receives far more requests per second than it can normally handle. This leads to a significant decline in application performance, which users will notice.
Let’s scale our application to meet the higher demand. Generally speaking, information is read much more often than it is inserted or modified.
We use a master-slave (primary-replica) replication strategy to scale the system. We create two new database servers that are kept in sync with the primary server. User traffic can only read from these two new databases. Every time information is inserted or modified in the primary database, the replica databases are asynchronously notified of the change.
At this point we can already support three times the read traffic of the original system. But there is a problem. Database transactions are designed to follow the ACID principles, yet because the other two databases are updated asynchronously, there is a time window in which consistency is lost. If the two new databases are queried within this window, the data may not be found. If instead the three databases are kept synchronized on every write, the performance of write operations suffers.
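The inconsistency window described above can be illustrated with a small in-memory sketch. It is not how any particular database implements replication; the `Primary` and `Replica` classes and the explicit `replicate()` step are hypothetical names introduced here to stand in for an asynchronous replication log.

```python
from collections import deque


class Replica:
    """Read-only copy of the primary's data."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Accepts writes and ships them to replicas later (asynchronously)."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # changes not yet applied on the replicas

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))  # returns before replicas are updated

    def replicate(self):
        # Stand-in for the background replication process.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replicas = [Replica(), Replica()]
primary = Primary(replicas)

primary.write("user:1", "alice")
stale = replicas[0].read("user:1")  # inside the window: the record is missing
primary.replicate()
fresh = replicas[0].read("user:1")  # after replication: the record is visible
```

Making `write` call `replicate()` before returning would close the window, at the price of slower writes, which is the trade-off the text describes.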
This is one of the costs we have to bear when designing distributed systems.
The master-slave replication strategy above meets the demand for read performance, but when the amount of data reaches a level that can no longer be stored on one machine, we need to scale write performance as well. To solve this problem we can use partitioning (sharding). Partitioning means writing to different databases according to a specific algorithm, for example splitting user names from A to Z into different partitions. Each database that accepts writes has several read replicas synchronized from it to improve read performance.
Of course, this makes the whole system more complex. The main difficulty is the partitioning algorithm. Imagine that far more user names start with C than with any other letter. Partition C then holds a very large amount of data, and it receives far more requests than the other partitions. Partition C becomes a hot spot. To avoid hot spots, you need to split partition C. At that point, re-sharding the data becomes very expensive and may even lead to downtime.
If everything goes ideally, we can handle N times the write traffic, where N is the number of partitions.
Of course, there is a trap here. Once the data is partitioned, queries on anything other than the partition key become very inefficient. This is especially bad for SQL statements such as joins, to the point that some complex queries cannot be used at all.
This leaves a question:
How do we choose a better partitioning algorithm?
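As a rough illustration of the first-letter scheme and the hot spot it can create, here is a minimal sketch. The `partition_for` function and the sample user names are made up for the example, and sending non-letter names to partition 0 is an arbitrary assumption.

```python
from collections import Counter


def partition_for(username):
    """Partition by first letter: 'a' -> 0, 'b' -> 1, ..., 'z' -> 25."""
    first = username[0].lower()
    if not "a" <= first <= "z":
        return 0  # assumption: anything else is lumped into partition 0
    return ord(first) - ord("a")


# Hypothetical, deliberately skewed sample: many names start with "c".
usernames = ["carol", "carl", "cathy", "chris", "alice", "bob", "zoe"]

load = Counter(partition_for(name) for name in usernames)
hottest, count = load.most_common(1)[0]  # partition with the most records
```

Here partition 2 (the letter C) holds four of the seven records. Counting records per partition like this is one simple way to spot a hot spot before deciding how to split it.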
- What is the CAP theorem?
This theorem states that a distributed system cannot provide consistency, availability and partition tolerance all at the same time.
Consistency: a read returns what was most recently written.
Availability: the whole system does not go down, and every non-failing node always returns a response.
Partition tolerance: despite network partitions, the system continues to operate and to uphold its consistency or availability guarantees.
For any distributed system, partition tolerance is a given. Without it, neither consistency nor availability can be achieved. Imagine two nodes that have lost their connection to each other: how could they be both available and consistent?
In the end, when a network partition occurs, you can only choose one of the two for your system: strong consistency or high availability.
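The choice can be made concrete with a toy model of two replicas joined by a link that may be cut. This is only a sketch of the trade-off, not a real protocol; `Pair`, `Node`, `Unavailable` and the `"CP"`/`"AP"` mode strings are names invented for the example.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


class Node:
    def __init__(self, name):
        self.name = name
        self.value = None


class Pair:
    """Two replicas of a single value, connected by a link that can be cut."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Node("a")
        self.b = Node("b")
        self.partitioned = False

    def write(self, node, value):
        other = self.b if node is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let replicas diverge.
                raise Unavailable("peer unreachable, write refused")
            node.value = value  # Availability first: accept, peer is now stale.
            return
        node.value = value
        other.value = value  # Healthy link: both replicas are updated.


cp = Pair("CP")
cp.partitioned = True
try:
    cp.write(cp.a, 1)
    cp_accepted = True
except Unavailable:
    cp_accepted = False

ap = Pair("AP")
ap.partitioned = True
ap.write(ap.a, 1)
ap_diverged = ap.a.value != ap.b.value
```

During the partition, the CP pair stays consistent but rejects the write, while the AP pair accepts the write and its two replicas temporarily disagree.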
Practice has shown that most applications value availability more. The main reason is that when you have to synchronize machines to achieve stronger consistency, network latency can become a problem.
Such factors often lead applications to choose solutions that provide high availability.
Such systems settle for the weakest consistency model, eventual consistency. This model guarantees that if no new updates are made to an item, eventually all accesses to that item return the latest value.
These systems provide BASE properties, in contrast to the ACID properties of traditional databases:
Basically available: the system always returns a response.
Soft state: the state of the system can change over time even without input, for example through the synchronization that maintains eventual consistency.
Eventual consistency: in the absence of new input, the data will sooner or later spread to every node, so that all nodes become consistent.
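One way to picture eventual consistency is replicas that repeatedly exchange state until they agree. The sketch below uses a last-write-wins merge on a per-key version number. The version counter and the ring-shaped exchange pattern are simplifying assumptions for the example; real systems use mechanisms such as timestamps or vector clocks.

```python
def merge(local, remote):
    """Last-write-wins: keep, per key, the entry with the higher version."""
    for key, (version, value) in remote.items():
        if key not in local or local[key][0] < version:
            local[key] = (version, value)


def anti_entropy_round(replicas):
    """Each replica exchanges state with its right-hand neighbor, both ways."""
    n = len(replicas)
    for i in range(n):
        j = (i + 1) % n
        merge(replicas[i], replicas[j])
        merge(replicas[j], replicas[i])


# Three replicas that disagree: each maps key -> (version, value).
replicas = [{"x": (2, "new")}, {"x": (1, "old")}, {}]

rounds = 0
while any(replica != replicas[0] for replica in replicas):
    anti_entropy_round(replicas)
    rounds += 1
```

With no further writes arriving, the loop terminates once every replica holds version 2 of `x`. Until then, a read may return the old value or nothing at all, depending on which replica answers.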
Cassandra is an example of a highly available distributed database. Databases that favor strong consistency include HBase, Redis, and ZooKeeper.
- How do distributed systems distribute?
Let’s take a look at the common ways of distributing data in a distributed system:
- Hash method
The hash method hashes a value taken from each record and maps the result to a machine or node. This method is hard to scale out, because the data is already scattered across many machines and changing the number of machines forces much of it to move. It is also prone to uneven distribution. Common values to hash include IP, URL, ID, etc.
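The scaling difficulty is easy to see with the simplest form of hash distribution, `hash(key) % number_of_nodes`. The sketch below is an illustration under assumptions: the `node_for` helper, the choice of MD5 as the hash, and the `user-N` keys are not from any specific system.

```python
import hashlib


def node_for(key, num_nodes):
    """Map a key to a node index with hash(key) mod num_nodes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


keys = [f"user-{i}" for i in range(10_000)]

# Grow the cluster from 4 to 5 nodes and count the keys that change node.
moved = sum(node_for(key, 4) != node_for(key, 5) for key in keys)
fraction_moved = moved / len(keys)
```

A key keeps its node only when its hash gives the same remainder modulo 4 and modulo 5, which holds for 4 out of every 20 hash values. So roughly 80% of the keys have to move when a single node is added.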
- Data range
Data is distributed by range. For example, records with IDs 1 ~ 100 are on machine A, those with IDs 101 ~ 200 are on machine B, and so on. This distribution method makes the data more uniform. If a node has limited processing capacity, its range can simply be split.
If the metadata that records the data distribution grows very large, it may become a single-point bottleneck.
So be sure to strictly control the amount of metadata.
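A range lookup and a range split can be sketched with a sorted list of upper bounds, which is exactly the metadata the text warns about. `RangeMap` is a hypothetical class for this example. It assumes positive integer IDs and inclusive upper bounds, so IDs up to 100 belong to machine A.

```python
import bisect


class RangeMap:
    """Range-distribution metadata: sorted inclusive upper bounds -> nodes."""

    def __init__(self, upper_bounds, nodes):
        if len(upper_bounds) != len(nodes):
            raise ValueError("need exactly one node per range")
        self.upper_bounds = list(upper_bounds)
        self.nodes = list(nodes)

    def node_for(self, record_id):
        # First range whose upper bound is >= record_id.
        i = bisect.bisect_left(self.upper_bounds, record_id)
        if i == len(self.nodes):
            raise KeyError(f"id {record_id} is outside every range")
        return self.nodes[i]

    def split(self, index, new_bound, new_node):
        """Split range `index`: ids <= new_bound stay, the rest go to new_node."""
        low = self.upper_bounds[index - 1] if index > 0 else float("-inf")
        if not low < new_bound < self.upper_bounds[index]:
            raise ValueError("new_bound must fall inside the range being split")
        self.upper_bounds.insert(index, new_bound)
        self.nodes.insert(index + 1, new_node)


ranges = RangeMap([100, 200], ["A", "B"])
owner_before = ranges.node_for(75)  # machine A owns ids 1..100
ranges.split(0, 50, "C")            # A keeps 1..50, C takes 51..100
owner_after = ranges.node_for(75)
```

Each split adds one entry to the metadata, so many small ranges mean a larger table to store and consult on every request.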
- Data volume
Distributing by data volume means dividing the data into blocks of a relatively fixed size and then placing different blocks on different servers.
Which block lives on which server also has to be recorded and managed as metadata.
When the cluster is large, the amount of metadata grows as well.
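A minimal sketch of distribution by data volume follows. The 64-byte block size, the round-robin placement and the names `split_into_chunks` and `place_chunks` are assumptions made for the demo. Real systems use far larger blocks and smarter placement.

```python
CHUNK_SIZE = 64  # bytes; deliberately tiny for the demo


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut the data into blocks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(file_name, data, servers, chunk_size=CHUNK_SIZE):
    """Return metadata mapping (file_name, chunk_index) -> server name.

    Chunks are assigned round-robin, a placeholder placement policy.
    """
    chunks = split_into_chunks(data, chunk_size)
    return {
        (file_name, index): servers[index % len(servers)]
        for index in range(len(chunks))
    }


payload = b"x" * 200
chunks = split_into_chunks(payload)
metadata = place_chunks("log.txt", payload, ["s1", "s2", "s3"])
```

The metadata has one entry per block, so it grows with the total data volume rather than with the number of files.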
- Replica and data distribution
This means that the same data is placed on multiple servers. If one of them fails, requests are forwarded to another. Because the machines are copies of each other, this is an ideal way to share load.
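The forwarding behavior can be sketched as a read that tries each replica in turn. `Server`, `ServerDown` and `read_with_failover` are hypothetical names. A real client would also need timeouts and health checks, which are left out here.

```python
class ServerDown(Exception):
    """Raised when a server cannot answer."""


class Server:
    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.alive = True

    def get(self, key):
        if not self.alive:
            raise ServerDown(self.name)
        return self.data[key]


def read_with_failover(servers, key):
    """Try each replica in order and return (server_name, value)."""
    for server in servers:
        try:
            return server.name, server.get(key)
        except ServerDown:
            continue  # forward the request to the next replica
    raise ServerDown("all replicas are down")


records = {"user:1": "alice"}
servers = [Server("s1", records), Server("s2", records), Server("s3", records)]

servers[0].alive = False  # simulate a failure of the first replica
served_by, value = read_with_failover(servers, "user:1")
```

Since every replica holds the same data, the same list of servers can also be rotated between requests to spread load.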
- Consistent Hashing
Consistent hashing builds a hash ring over the hash value space. When a machine is added, only the nodes next to it on the ring are affected, and the new machine takes over part of their load. Metadata is maintained in the same way as for distribution by data volume.
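A hash ring can be sketched in a few lines. The version below is illustrative only: the `HashRing` class, the MD5-based `_hash` helper and the use of 100 virtual points per node (`vnodes`) are assumptions of this example, not details from the systems discussed here.

```python
import bisect
import hashlib


def _hash(value):
    """Position on the ring: first 8 bytes of MD5 as an integer."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual points per node, to even out the load
        self._points = []     # sorted positions on the ring
        self._owner = {}      # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def node_for(self, key):
        """Owner is the first point clockwise from the key (ring must be non-empty)."""
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"user-{i}" for i in range(10_000)]
ring = HashRing(["n1", "n2", "n3", "n4"])
before = {key: ring.node_for(key) for key in keys}

ring.add("n5")
after = {key: ring.node_for(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Every key that moves ends up on the new node `n5`, and the rest stay where they were. Only a fraction of the keys is remapped, instead of most of them as with plain modulo hashing.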
Let’s now look at examples of systems that use the methods above:
GFS, HDFS: distributed by data volume.
MapReduce: localized according to the GFS data distribution.
BigTable and HBase: distributed by data range.
PNUTS: distributed by hash method or data range.
Dynamo, Cassandra: distributed by consistent hashing.
Mola, Armor, BigPipe: distributed by hash.
Doris: a combination of hash and data volume distribution.
- What types of architectures are commonly used in distributed applications?
6.1 Client-server
In this type of architecture, a server acts as a shared resource, such as a printer, a database, or a web server. Multiple clients decide when to use the shared resource, how to use and display it, change data, and send it back to the server. A code repository such as Git is a good example.
6.2 Three-tier architecture
This architecture divides the system into a presentation layer, a logic layer and a data layer, which simplifies application deployment. Most early web applications were three-tier.
6.3 Multi-tier architecture
The three-tier architecture above is a special case of the multi-tier architecture.
Typically, the three tiers above are divided further, for example by business function.
6.4 Peer-to-peer architecture
In this architecture, no dedicated machine provides services or manages network resources. Instead, responsibility is distributed evenly across all machines, which are called peers. A peer can act as either a client or a server. Examples of this architecture include BitTorrent and blockchain.
6.5 Database-centric
This architecture uses a shared database, which lets distributed nodes work together without any form of direct communication.
- What are the advantages and disadvantages of distributed systems?
7.1 Advantages of distributed systems
- All nodes in a distributed system are interconnected, so nodes can easily share data with other nodes.
- More nodes can easily be added, so the system can be expanded as needed.
- The failure of one node does not lead to the failure of the whole distributed system. The other nodes can still communicate with each other.
- Hardware resources can be shared among multiple nodes rather than being tied to a single one.
7.2 Disadvantages of distributed systems
- It is difficult to provide adequate security in a distributed system, because both the nodes and the connections between them must be secured.
- Some messages and data may be lost in the network while being transferred from one node to another.
- Compared with a single-user system, a database connected to a distributed system is quite complex and difficult to handle.
- If all nodes in the distributed system try to send data at the same time, the network may become overloaded.
Finally, a word on the relationship between distributed systems and clusters. My view is that the two are not opposites.
A distributed system completes a task through a cluster of multiple nodes, so that to the outside world it looks like a single system.
A distributed system can contain multiple clusters, which can be divided by business or by physical region. Each cluster can act as one node of the distributed system.
The distributed system made up of these cluster nodes can in turn act as a single node and form a cluster with other nodes.