Preface: many readers have left comments saying they could not follow the Kafka application example in my earlier blog post, or understand why the code was written the way it was. I dug out the slides on Kafka internals that I prepared some time ago, and I have summarized the key points and underlying principles here for your reference.
1. Introduction to message queues
Kafka definition: Kafka is a distributed message queue based on the publish/subscribe model, used mainly for real-time processing of big data.
Definition of message queue
The difference between the two models
How does the publish/subscribe model achieve load balancing?
2. Model comparison of popular queues
The producer sends a message to a queue through routing, and each message in that queue is consumed by only one consumer.
When RabbitMQ needs to support multiple subscribers, the message sent by the publisher is routed to several queues at once, and each queue is consumed by a different subscriber group.
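The two routing behaviours described above can be illustrated with a small in-memory sketch. This is not RabbitMQ client code: the `Exchange` class and its `bind`, `publish` and `consume` methods are hypothetical names made up for the illustration, and it assumes the simplest case where every bound queue receives a copy of every message.

```python
from collections import deque


class Exchange:
    """Toy router (hypothetical, not a RabbitMQ API): copies each message to every bound queue."""

    def __init__(self):
        self.queues = {}  # queue name -> deque of pending messages

    def bind(self, queue_name):
        # Create the queue if it does not exist yet.
        self.queues.setdefault(queue_name, deque())

    def publish(self, message):
        # One copy of the message goes to each bound queue.
        for queue in self.queues.values():
            queue.append(message)

    def consume(self, queue_name):
        # Taking a message removes it, so a second consumer on the
        # same queue will not see it again.
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None


# Single queue: consumers compete, only one of them gets the message.
single = Exchange()
single.bind("orders")
single.publish("order-1")
first = single.consume("orders")
second = single.consume("orders")

# Two queues: each subscriber group has its own queue and its own copy.
multi = Exchange()
multi.bind("billing")
multi.bind("shipping")
multi.publish("order-1")
billing_copy = multi.consume("billing")
shipping_copy = multi.consume("shipping")
```

With a single queue, `first` holds the message and `second` is `None`; with two queues, both subscriber groups receive their own copy.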
3. Kafka architecture
Note 1: within a consumer group, the messages in a given partition of a topic can be consumed by only one consumer.
Note 2: different consumer groups are independent of each other, however: the messages in a partition can be consumed at the same time by consumers that belong to different consumer groups.
Note 3: when a consumer group has more consumers than the topic has partitions, the extra consumers are assigned no partition and sit idle, which wastes consumer resources.
Note 4: when a consumer group has fewer consumers than the topic has partitions, a single consumer consumes messages from more than one partition at the same time (for example, two partitions).
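The four notes can be checked with a small simulation of how partitions are spread over the consumers of one group. This is only a sketch: `assign_partitions` is a hypothetical helper using plain round-robin, not one of Kafka's real assignment strategies, but it reproduces the same outcomes (one owner per partition inside a group, idle consumers when there are too many, several partitions per consumer when there are too few).

```python
def assign_partitions(partitions, consumers):
    """Round-robin sketch (hypothetical helper): map each consumer of ONE group to its partitions."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        # Each partition gets exactly one owner inside the group (Note 1).
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# Fewer consumers than partitions: one consumer owns two partitions (Note 4).
small_group = assign_partitions(partitions, ["c1", "c2"])

# More consumers than partitions: the fourth consumer is idle (Note 3).
large_group = assign_partitions(partitions, ["c1", "c2", "c3", "c4"])

# A second group is assigned independently, so partition 0 is read by
# one consumer in each group at the same time (Note 2).
other_group = assign_partitions(partitions, ["d1"])
```

Here `small_group` gives `c1` two partitions, `large_group` leaves `c4` with an empty list, and `other_group` assigns all three partitions to `d1` regardless of what the first group is doing.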
4. Kafka production process analysis
1. Write mode: the producer uses push mode to publish messages to the broker.
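A minimal sketch of push-mode writing, under some assumptions: `Broker` and `Producer` are toy in-memory classes invented for the illustration, not the Kafka client API, and CRC32 is used only as a stand-in hash to choose a partition from the record key (Kafka's own default partitioner uses a different hash). The point it shows is that the producer initiates the write and the broker appends the record to the end of a partition, returning its offset.

```python
import zlib


class Broker:
    """Toy in-memory broker (hypothetical): one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, record):
        log = self.partitions[partition]
        log.append(record)
        return len(log) - 1  # offset of the record just written


class Producer:
    """Toy producer (hypothetical): pushes records to the broker on its own initiative."""

    def __init__(self, broker):
        self.broker = broker

    def send(self, key, value):
        # Assumption: CRC32 stands in for the real key hash.
        # Records with the same key always land in the same partition.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.broker.partitions)
        offset = self.broker.append(partition, (key, value))
        return partition, offset


broker = Broker(num_partitions=3)
producer = Producer(broker)
first_partition, first_offset = producer.send("user-1", "login")
second_partition, second_offset = producer.send("user-1", "logout")
```

Because both records share the key `user-1`, they go to the same partition, and the second record receives the next offset after the first, so their order inside that partition is preserved.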
5. Kafka’s storage strategy
File storage mode