Latest Alibaba C/C++ Linux backend development interview questions and answers


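1. What storage engines does MySQL have? Please describe the differences in detail.

InnoDB: a transactional storage engine with row-level locking, suited to workloads with frequent concurrent reads and writes

Memory: a storage engine that keeps its data in memory; suited to small data sets and very fast, but the data is lost on restart

Archive: has a good compression mechanism, suited to storing historical data that is rarely modified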

2. How do you design the database layer of a high-concurrency system? What types of database locks are there, and how are they implemented?

Designing the database layer:

1. Split databases and tables (sharding): distribute the data evenly across the same table (or different tables) in different databases to reduce the pressure on any single table. If a database is still too large, split it into multiple tables and decide which table a row goes to by a hash value or other routing logic

2. Read/write separation: the databases are deployed as master and slaves. Queries run on the slave servers; inserts, deletes, and updates run on the master server

3. Separate archive tables from operational tables: create an archive table for historical data, and keep the data that still needs to be operated on in its own table

4. Indexes: for a single table with a large amount of data (over one million rows) where inserts, deletes, and updates are infrequent, a bitmap index can make queries much faster in databases that support them (for example, Oracle)

Types of database locks:

1. Shared lock: several readers can hold it at the same time; a writer can operate only after the holders release the lock

2. Update lock: prevents deadlocks; others can read the data but cannot modify it

3. Exclusive lock: other transactions can neither read nor write the locked data

4. Xlock: locks part of the data in the table; queries can skip the locked data

5. Schema lock: while the table structure is being operated on, other sessions cannot access the table

3. Talk about distributed unique IDs.

Store the ID in 64 bits. In binary, a 64-bit value looks like 00000000...1100...0101. Cut the 64 bits into sections, each section carrying one piece of information. For example, the first 41 bits hold the timestamp in milliseconds, the next 9 bits identify the machine (derived from its IP), and the remaining bits are an auto-incrementing sequence that counts the IDs generated within the same millisecond. With 14 sequence bits (41 + 9 + 14 = 64), the limit of this generator is 2^14 IDs per machine per millisecond; with 10 machines deployed, each of them can still generate up to 2^14 IDs in 1 ms.

Distributed unique ID = timestamp (41 bits) + integer server number (10 bits) + auto-incrementing sequence. Each timestamp can produce only a fixed number of sequence values (for example 100000). When the maximum is reached, the generator waits for the next timestamp and the sequence restarts from 0. Putting the milliseconds in the highest bits ensures that the generated IDs trend upward, and the remaining sections ensure that the IDs generated by each business line, each machine room, and each machine are different.

For example, reserve 39 bits for the milliseconds, 4 bits for the business line, 2 bits for the machine room, 7 bits for the machine, and 7 bits for the sequence number:

If the system runs for 10 years, it needs at least 10 years x 365 days x 24 hours x 3600 seconds x 1000 milliseconds ≈ 320 x 10^9 values, which takes about 39 bits for the milliseconds.

If the peak concurrency of a single machine stays below 100 IDs per millisecond, about 7 bits are needed for the auto-increment number within each millisecond.

2 bits (up to 4 values) are reserved for the machine room identifier, sized for the number of machine rooms expected within 5 years.

Each machine room has fewer than 100 machines, so 7 bits are reserved for the machines in each room.

There are fewer than 10 business lines, so 4 bits are reserved for the business line identifier.

Another layout is a 64-bit distributed ID made of 42 bits of milliseconds + 5 bits of machine ID + 12 bits of auto-increment, and so on.

Methods of generating distributed IDs:

A. Two auto-increment tables whose step lengths are offset from each other

B. Time in milliseconds or nanoseconds

C. UUID

D. The 64-bit sectioned ID described above

4. When Redis memory usage grows to a certain size, it applies a data eviction policy. What are the six eviction policies Redis provides?

volatile-lru: evict the least recently used data among the keys that have an expiration time set

volatile-random: evict arbitrary data among the keys that have an expiration time set

volatile-ttl: evict the data closest to expiring among the keys that have an expiration time set

allkeys-lru: evict the least recently used data among all keys

allkeys-random: evict arbitrary data among all keys

noeviction: never evict data; writes that need more memory return an error

For example, MySQL holds 20 million rows while Redis stores only the 200,000 hottest ones. LRU or TTL fits this case, because hot data is read often and is unlikely to time out.

Redis features: it is fast, with O(1) key lookups; it has rich data types; it supports transactions with atomic execution; it can be used as a cache; it is faster than memcached; and it can persist data.

Common problems and solutions:

The master should preferably not do persistence work such as RDB snapshots or AOF log files. If the data matters, enable AOF on one slave to back up the data, with the policy set to sync once per second.

For replication speed and stability, keep the master and slaves in the same LAN.

Do not arrange master-slave replication as a graph; use a one-way chain (M-S-S-S...), which is more stable.

For expired keys, Redis combines lazy deletion and periodic deletion. Lazy deletion checks whether a key has expired on get/set and deletes it if so; periodic deletion traverses each DB at regular intervals and checks several keys.

Adjust the concurrency level to the server's performance.

Handling stale data: data written to Redis carries a validity period. Within that period the cached data is treated as correct, regardless of the real state. For services such as payment, use a version number instead: each item in Redis carries a version number, and the corresponding row in the DB carries one too. The cached value is considered valid only when its version matches the one in the DB. This still requires a DB access on every read, but only the version field needs to be queried.


5. How does ZooKeeper achieve distributed high availability?

While ZooKeeper is running, more than half of the machines in the cluster hold the latest data. As long as more than half of the machines in the cluster work normally, the cluster can serve external requests.

ZooKeeper can select N machines as hosts, which gives M:N backup; Keepalived can select only one machine as the host, so it can only provide M:1 backup.

There are usually two deployment schemes:

Dual machine room deployment: the machine room with better stability and more reliable equipment is the main room, and the other room is cheaper. For example, for a ZooKeeper cluster of seven machines, four are usually deployed in the main room and the remaining three in the other room.

Three machine room deployment: whichever room fails, the machines in the remaining two rooms still number more than half. The machines are spread across three rooms to form one ZooKeeper cluster. Assuming the total number of machines is N, the number in each room is: N1 = (N-1)/2, N2 = 1 to (N-N1)/2, N3 = N - N1 - N2.

Horizontal scaling means adding more machines to the cluster. ZooKeeper offers two (imperfect) ways to do it: restart the whole cluster at once, or restart the servers one by one.

6. How do you split data across databases in Redis?

By default Redis provides 16 databases, which are selected by database index; the default index is 0.

For example, the Jedis client offers two ways to select the database:

1: JedisPool(org.apache.commons.pool.impl.GenericObjectPool.Config poolConfig, String host, int port, int timeout, String password, int database); select the database through the database parameter of the constructor. If it is not set, the default is 0.

2: jedis.select(index); call the select method of Jedis.

7. How do you handle idempotency?

1. Query and delete operations are naturally idempotent

2. Unique index, to prevent new dirty data

3. Token mechanism, to prevent duplicate page submission

4. Pessimistic lock (for update)

5. Optimistic lock (implemented with a version number or timestamp, or restricted by a condition such as where avai_amount - #subAmount# >= 0)

6. Distributed lock

7. State machine idempotency (if the state machine is already in a later state and a change for an earlier state arrives, the change is rejected; this guarantees the idempotency of the finite state machine)

8. Select + insert (for background systems with low concurrency, or for some job tasks, so that repeated execution is supported)


8. HTTPS workflow?

a. The client sends the encryption rules it supports to the server, signalling that it wants to connect

b. The server selects a set of encryption and hash algorithms and sends them back to the browser together with its own identity information (address, etc.) in the form of a certificate. The certificate contains the server information, the public key used for encryption, and the certificate issuer

c. After receiving the website's certificate, the client does the following:

c1. Verifies the validity of the certificate

c2. If the certificate passes verification, the browser generates a string of random numbers as the key K and encrypts it with the public key in the certificate

c3. Computes the hash of the handshake message with the agreed hash algorithm, encrypts the message with the generated key K, and sends everything to the server

d. After receiving the information from the client, the server does the following:

d1. Decrypts the key K with its private key, uses K to decrypt the handshake message, and verifies that the hash value matches the one sent by the browser

d2. Encrypts a handshake message with the key K and sends it back

If the hash values match, the handshake succeeds

9. How do you deal with a RabbitMQ message backlog?

Increase the processing capacity of the consumers (for example by optimizing the code), or reduce the publishing rate

Upgrading hardware alone is not a solution; it only helps temporarily

Consider a maximum queue length limit, which RabbitMQ supports since version 3.1

Set a TTL on messages and discard them when they time out

By default a RabbitMQ consumer consumes serially on a single thread. Two key properties control concurrent consumption: concurrentConsumers and prefetchCount. concurrentConsumers sets the number of concurrent consumers created for each listener at initialization, and prefetchCount is the number of messages fetched from the broker at a time

Create a new queue and have the consumers subscribe to both the new and the old queue

Have the producer buffer data and send it to MQ only after MQ has been drained

Use flow control on the consumer side: when the QoS quota is used up and MQ has not yet received new acks, MQ leaves its sending loop and stops delivering until acks arrive. The consumer can also block actively: when it finds that messages arrive too fast, it blocks the receiving thread and uses block and unblock calls to adjust the receive rate; while the receiving thread is blocked, MQ leaves its sending loop

For a very large backlog, scale out temporarily: create a new topic with 10 times the original number of partitions; write a temporary distributor consumer that consumes the backlog, does no time-consuming processing, and simply writes the messages round-robin into the 10x temporary queues; temporarily requisition 10 times the machines to deploy consumers, each batch consuming one temporary queue; once the backlog has been consumed, restore the original deployment architecture and go back to consuming with the original consumer machines

10. A thread pool is processing tasks. What should you do if the power suddenly fails?

Persist the task queue to storage and reload it automatically at the next startup.

The details depend on the actual situation, but this is the general idea.

Add a status flag to each task: 0 for unprocessed, 1 for processing, 2 for processed. At every startup, reset all tasks in state 1 to 0, or let a timer handle them.

For critical applications, equip the machines with a UPS.


11. What are the properties of red-black trees?

(1) Each node is either black or red.

(2) The root node is black.

(3) Each leaf node (NIL) is black. [Note: a leaf node here means an empty (NIL or NULL) node.]

(4) If a node is red, its children must be black.

(5) All paths from a node to its descendant leaf nodes contain the same number of black nodes. [The paths here are those ending at NIL leaf nodes.]

12. What are the characteristics of distributed service invocation?

Distributed service calls should be traceable across the system: add a call chain ID to the business logs, and record call latency and QPS for the RPC in each link.

Non-business components should contain as little business code as possible. Service calls are instrumented with tracing points that record the context of the current node: traceId, rpcId, start and end time, type, protocol, caller IP and port, service name, plus extensions such as exception information and messages. Logs are collected both offline and in real time, for example with Flume combined with Kafka. The logs should be grouped by traceId and sorted in rpcId order.

13. How do you write high-quality C++ code?

1) extern "C" tells the C++ compiler to give the declared functions C linkage, so that the linker looks them up by their C names. This makes it convenient for a C++ program to call C code.

2) C++ style comments (// ...) are recommended. For header file descriptions and annotations of default function parameters, however, C style comments (/* */) work better.

3) Don't write code that depends heavily on the compiler

For example, in printf("a %d %d", p(), q()) the order in which p and q are executed depends on the compiler implementation, so this kind of code should be avoided. The same applies to c = p() * q() * r().

4) Prefer const, enum, and inline to #define

The inline keyword asks the compiler to expand function calls in place. Functions defined inside a class declaration are implicitly inline. To make other functions inline, put the keyword on the function definition so that the compiler will try to inline it. Whether inlining actually happens also depends on the function body meeting certain conditions; the general rule is to keep such functions short and simple.

When you do use #define, protect the macro arguments with parentheses. For example: #define MAX(a, b) ((a) > (b) ? (a) : (b))

5) struct in C versus C++: a C struct cannot contain member functions, while a C++ struct can.

6) Make all data members private. If a derived class needs one, change it to protected at that point; otherwise keep it private and hidden. In the declaration you can use several sections grouped by kind, such as private control members and private data.

7) When a class has at least one virtual function, its destructor should be virtual. Do not call virtual functions in constructors or destructors.

8) In behavior-centered class design, put the public functions first, then the protected virtual functions meant to be overridden, then the private virtual functions, ordinary functions, and member variables.

9) Behind the syntax lies the semantics. An interface should have clear semantics and should not be ambiguous.

10) If an exception occurs at a low level, it should propagate upward level by level until it reaches a level that is able to handle it. If the program never handles it, the C++ runtime calls std::terminate and the program ends. Exceptions separate where an error occurs from where it is handled.

11) In general, throw exceptions by value and catch them by const reference. This avoids having to clean up exception objects and avoids object slicing. After handling an exception at one level, throw; can be used to rethrow it.

12) Prefer shared_ptr. It works by reference counting, its reference count updates are thread-safe, and it is extensible (for example with custom deleters); recommended.
