Kafka client parameter description (Kafka client version 2.4):

Time: 2021-10-08

1: Consumer side
Consumer parameters are defined in the class org.apache.kafka.clients.consumer.ConsumerConfig.

1.1: bootstrap.servers: default: null
A list of host/port pairs used to establish the initial connection to the Kafka cluster. The client will make use of all servers in the cluster, not just the nodes configured here, because these addresses are only used for the initial connection, through which all cluster members are discovered (and the cluster can change at any time). The list therefore does not need to contain the complete set of brokers, but it is worth configuring more than one in case a configured server is down.
Format:host1:port1,host2:port2,...

1.2: client.dns.lookup: default: default
Controls how the client uses DNS lookups. If set to use_all_dns_ips, the client connects to each returned IP address in turn until a connection is successfully established. If set to resolve_canonical_bootstrap_servers_only, each bootstrap address is resolved into a list of canonical names.

1.3: group.id: default: null
The unique identifier of the consumer group this consumer belongs to. group.id must be set in two cases: (1) the consumer uses Kafka-based offset management; (2) the KafkaConsumer subscribes to topics via the subscribe interface.

1.4: group.instance.id: default: null
The consumer instance ID; only non-empty strings are allowed. If set, the consumer is treated as a static member, meaning that only one instance with this ID is allowed in the consumer group at any time. If not set, the consumer joins the group as a dynamic member, which is the traditional behavior.

1.5: session.timeout.ms: default: 10000 ms
The session timeout used to detect consumer failures within a consumer group. The consumer sends heartbeat messages to the server at a fixed interval; if the server receives no heartbeat within this timeout, it considers the consumer lost, removes it from the consumer group, and triggers a rebalance. Note that this value must lie between the broker's group.min.session.timeout.ms and group.max.session.timeout.ms.

1.6: heartbeat.interval.ms: default: 3000 ms
The expected interval between heartbeats within the Kafka consumer group. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when consumers join or leave the group. The value must be set lower than session.timeout.ms, and should usually be no more than one third of that value.
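
As an illustration of how these two settings are usually tuned together (a hedged sketch; the values 15000 and 5000 below are placeholders, not recommendations):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class SessionTimeoutSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Must lie between the broker's group.min.session.timeout.ms and group.max.session.timeout.ms.
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
        // Kept at no more than one third of session.timeout.ms, as recommended above.
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "5000");
        System.out.println(props);
    }
}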

1.7: partition.assignment.strategy: default value: RangeAssignor
The class name of the partition assignment strategy the client uses, under group management, to distribute partition ownership among consumer instances. A custom assignment strategy can be plugged in by implementing the org.apache.kafka.clients.consumer.ConsumerPartitionAssignor interface.
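
For example (a hypothetical sketch, not from the original text), switching from the default RangeAssignor to the built-in RoundRobinAssignor would look roughly like this:

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.RoundRobinAssignor;

public class AssignorSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Replace the default RangeAssignor with the built-in round-robin strategy.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                  RoundRobinAssignor.class.getName());
        System.out.println(props);
    }
}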

1.8: metadata.max.age.ms: default: 5 * 60 * 1000 ms
The period after which a metadata refresh is forced, so that new brokers or partitions are discovered proactively even when no partition leadership changes have been observed.

1.9: enable.auto.commit: default: true
If true, the consumer's offsets are committed automatically in the background at a regular interval.

1.10: auto.commit.interval.ms: default: 5000 ms
How often the consumer automatically commits offsets. Only effective when enable.auto.commit is true.
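
The sketch below (with hypothetical names host1:9092, demo-group and demo-topic) shows the common alternative to auto-commit: disabling enable.auto.commit and committing manually only after the records have been processed.

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "host1:9092");   // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");            // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");       // take over offset commits

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic"));    // placeholder topic
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
            // Commit only after the records have been processed.
            consumer.commitSync();
        }
    }
}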

1.11: client.id: default value: ""
An ID string passed to the server when making requests. Its purpose is to allow a logical application name to be included in server-side request logging, so that request sources can be traced beyond just IP/port.

1.12: client.rack: default: ""
A rack identifier for this client. It can be any string value indicating the physical location of the client. It corresponds to the broker configuration broker.rack.

1.13: max.partition.fetch.bytes: default: 1 * 1024 * 1024 (1 MB)
The maximum amount of data the server returns per partition per fetch request. The consumer fetches data from the server in batches. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch is still returned so that the consumer can make progress. The maximum batch size accepted by the broker is defined by message.max.bytes (a broker configuration) or by the topic-level max.message.bytes configuration.

1.14: send.buffer.bytes: default value: 128 * 1024
The size of the TCP send buffer (SO_SNDBUF) used when sending data. If the value is -1, the OS default is used.

1.15: receive.buffer.bytes: default value: 64 * 1024
The size of the TCP receive buffer (SO_RCVBUF) used when reading data. If the value is -1, the OS default is used.

1.16: fetch.min.bytes: default: 1
The minimum amount of data the server should return for a fetch request. If insufficient data is available, the request waits for that much data to accumulate before answering. The default of 1 byte means a fetch request is answered as soon as a single byte of data is available, or when the request times out waiting for data to arrive. Setting this to something larger than 1 makes the server wait for a larger amount of data to accumulate, which improves server throughput somewhat at the cost of some additional latency.

1.17: fetch.max.bytes: default: 50 * 1024 * 1024 (50 MB)
The maximum amount of data the server returns per fetch request. The consumer fetches data from the server in batches. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch is still returned so that the consumer can make progress. The maximum batch size accepted by the broker is defined by message.max.bytes (a broker configuration) or by the topic-level max.message.bytes configuration.

1.18: fetch.max.wait.ms: default: 500 ms.
The maximum time the server will block before answering a fetch request when there is not enough data to satisfy the requirement given by fetch.min.bytes.
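
A hedged tuning sketch covering the fetch-related settings from 1.13 and 1.16 to 1.18 (the 64 KB and 1000 ms values are illustrative only):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class FetchTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Wait for at least 64 KB of data per fetch...
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, String.valueOf(64 * 1024));
        // ...but never block longer than one second on the broker side.
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, "1000");
        // Overall per-request and per-partition ceilings.
        props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, String.valueOf(50 * 1024 * 1024));
        props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, String.valueOf(1024 * 1024));
        System.out.println(props);
    }
}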

1.19: reconnect.backoff.ms: default: 50 ms
The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to the host in a tight loop. This backoff applies to all connection attempts from the client to a broker.

1.20: reconnect.backoff.max.ms: default: 1000 ms
The maximum time, in milliseconds, to wait when reconnecting to a broker that has repeatedly failed to connect.

1.21: retry.backoff.ms: default: 100ms
The amount of time to wait before attempting to retry a failed request for a given topic partition. This can avoid sending requests repeatedly in a tight loop in some fault situations.

1.22: auto.offset.reset: default: latest. Optional values: "latest", "earliest", "none"
What to do when there is no initial offset in Kafka, or the current offset no longer exists on the server (for example, because the data has been deleted):
earliest: automatically reset the offset to the earliest offset
latest: automatically reset the offset to the latest offset
none: throw an exception if no previous offset is found for the consumer group

1.23: check.crcs: default: true
Automatically check the CRC32 of consumed records. This ensures that no on-the-wire or on-disk corruption of the messages has occurred. The check adds some overhead, so it may be disabled when seeking extreme performance.

1.24: metrics.sample.window.ms: default: 30000 ms
The window of time over which metrics samples are computed.

1.25: metrics.num.samples: default: 2
The number of samples retained to compute metrics.

1.26: metrics.recording.level: default: info. Optional values: debug, info
The highest recording level for metrics.

1.27: key.deserializer: default value: none
An implementation class of the org.apache.kafka.common.serialization.Deserializer interface, used to deserialize the key.

1.28: value.deserializer: default value: none
An implementation class of the org.apache.kafka.common.serialization.Deserializer interface, used to deserialize the value.

1.29: request.timeout.ms: default: 30000 ms
Controls the maximum amount of time the client waits for a response to a request. If no response is received before the timeout elapses, the client retries the request while retries remain, and fails the request once they are exhausted.

1.30: default.api.timeout.ms: default: 60000 ms
The default timeout for consumer API calls that may block. This value is used when no timeout parameter is supplied to the call.

1.31: connections.max.idle.ms: default: 9 * 60000 ms
The maximum idle time of a connection. After this time the connection is closed.

1.32: interceptor.classes: default: empty list
A list of implementation classes of the org.apache.kafka.clients.consumer.ConsumerInterceptor interface, used to intercept records before they are received by the consumer.
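
A minimal, hypothetical interceptor that only counts records might look like the sketch below; it would then be registered by putting its class name under interceptor.classes (ConsumerConfig.INTERCEPTOR_CLASSES_CONFIG).

import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerInterceptor;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical interceptor that only counts what passes through it.
public class CountingConsumerInterceptor implements ConsumerInterceptor<String, String> {
    private long consumed;

    @Override
    public ConsumerRecords<String, String> onConsume(ConsumerRecords<String, String> records) {
        consumed += records.count();   // inspect (or transform) records before the application sees them
        return records;
    }

    @Override
    public void onCommit(Map<TopicPartition, OffsetAndMetadata> offsets) {
        // called after offsets have been committed
    }

    @Override
    public void close() {
        System.out.println("records seen: " + consumed);
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed for this sketch
    }
}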

1.33: max.poll.records: default: 500
The maximum number of records returned in a single call to poll().

1.34: max.poll.interval.ms: default: 300000 ms
The maximum delay between invocations of poll() when using consumer group management. This places an upper bound on how long the consumer can be idle before fetching more records. If poll() is not called before this timeout expires, the consumer is considered failed and the group rebalances, reassigning its partitions to other consumers. If the consumer's group.instance.id is non-empty, the partitions are not reassigned immediately when this timeout is reached; the consumer simply stops sending heartbeats, and the partitions are reassigned only after session.timeout.ms expires. This mirrors the behavior of a consumer that has been closed.

1.35: exclude.internal.topics: default: true
Whether internal topics matching a subscribed pattern should be excluded from the subscription.

1.36: internal.leave.group.on.close: default: true
Whether the consumer should leave the consumer group when it is closed.

1.37: isolation.level: default: read_uncommitted. Optional values: read_committed, read_uncommitted
Controls how messages written transactionally are read. If set to read_committed, poll() only returns transactional messages that have been committed. If set to read_uncommitted, poll() returns all messages, even those from transactions that have been aborted. Non-transactional messages are returned unconditionally in either mode.
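
For example, a consumer that should only see committed transactional messages would set (sketch):

import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class IsolationLevelSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Only return messages from committed transactions to poll().
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        System.out.println(props);
    }
}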

1.38: allow.auto.create.topics: default: true
Whether to allow automatic topic creation on the broker when subscribing to or being assigned a topic. Topics are only created automatically if the broker's auto.create.topics.enable configuration is also true.

1.39: security.providers: default: null
A list of implementation classes of the org.apache.kafka.common.security.auth.SecurityProviderCreator interface, used to plug in security algorithms.

1.40: security.protocol: default value: PLAINTEXT
The protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.

2: Producer side
Producer parameters are defined in the class org.apache.kafka.clients.producer.ProducerConfig.

2.1: bootstrap.servers: default: null
A list of host/port pairs used to establish the initial connection to the Kafka cluster. The client will make use of all servers in the cluster, not just the nodes configured here, because these addresses are only used for the initial connection, through which all cluster members are discovered (and the cluster can change at any time). The list therefore does not need to contain the complete set of brokers, but it is worth configuring more than one in case a configured server is down.
Format:host1:port1,host2:port2,...

2.2: client.dns.lookup: default: default
Controls how the client uses DNS lookups. If set to use_all_dns_ips, the client connects to each returned IP address in turn until a connection is successfully established. If set to resolve_canonical_bootstrap_servers_only, each bootstrap address is resolved into a list of canonical names.

2.3: buffer.memory: default value: 32 * 1024 * 1024
The total number of bytes of memory the producer can use to buffer records waiting to be sent to the server. If records are sent faster than they can be delivered to the server, the producer blocks for up to max.block.ms and then throws an exception. This setting should correspond roughly to the total memory the producer will use, but it is not a hard bound, because not all producer memory is used for buffering.

2.4: retries: default: Integer.MAX_VALUE. Range: [0, Integer.MAX_VALUE]
Setting a value greater than zero causes the client to resend any record whose send failed with a potentially transient error. This retry is no different from the client resending the record after receiving the error. If max.in.flight.requests.per.connection is not set to 1, retries may change the order in which messages arrive at a partition: for example, if the first message fails and is retried while the second succeeds, the second message may reach the partition before the first. Once the timeout configured by delivery.timeout.ms has expired, the send fails even if the retries have not been exhausted. In general, it is preferable to leave this setting alone and use delivery.timeout.ms to control retry behavior instead.

2.5: acks: default value: "1", optional values: "all", "-1", "0", "1"
The number of acknowledgments the producer requires the leader to have received before considering a request complete.
If set to 0, the producer does not wait for any response from the server; the record is immediately added to the socket buffer and considered sent. In this case there is no guarantee that the server has received the record, and the retries configuration has no effect.
If set to 1, the leader writes the record to its local log and responds without waiting for acknowledgment from the followers. In this case the leader confirms receipt of the record, but the record may be lost if the leader fails before the followers have replicated it.
If set to all, the leader waits for the full set of in-sync replicas to acknowledge the record before responding. This gives the strongest guarantee that the record will not be lost.
If set to -1, the effect is the same as all.

2.6: compression.type: default: none
The compression type for all data generated by the producer. The default is none, meaning no compression. Valid values are none, gzip, snappy, lz4 and zstd. Compression is applied to whole batches of data, so the effectiveness of batching also affects the compression ratio.

2.7: batch.size: default: 16384
Whenever multiple records are sent to the same partition, the producer attempts to batch them together into fewer requests, which helps performance on both the client and the server. This configuration controls the default batch size in bytes. No attempt is made to batch records larger than this size. A request sent to a broker contains multiple batches, one for each partition with data to send. A small batch size makes batching less common and may reduce throughput; a very large batch size may waste memory, because a buffer of the configured batch size is always allocated in anticipation of additional records.

2.8: linger.ms: default: 0
The producer groups together any records that arrive between request transmissions into a single batched request. Normally this only happens when records arrive faster than they can be sent out, but in some circumstances the client may want to reduce the number of requests even under moderate load. This setting adds a small artificial delay: rather than sending a record immediately, the producer waits up to the configured time so that records can be sent together in a batch, similar to the Nagle algorithm in TCP. Once batch.size worth of records has accumulated for a partition, the batch is sent regardless of this setting; otherwise linger.ms is the upper limit on how long the producer waits, even if the batch has not filled up. A value of 0 means no waiting.
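
A hedged sketch combining 2.6 to 2.8 to trade a little latency for larger, better-compressed batches (the 64 KB and 10 ms values are illustrative only):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BatchingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Allow up to 64 KB per partition batch instead of the 16 KB default.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, String.valueOf(64 * 1024));
        // Wait up to 10 ms for a batch to fill before sending.
        props.put(ProducerConfig.LINGER_MS_CONFIG, "10");
        // Compress whole batches; larger batches usually compress better (see 2.6).
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        System.out.println(props);
    }
}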

2.9: delivery.timeout.ms: default: 120 * 1000 ms
An upper bound on the time to report success or failure after send() is called. This limits the total time a record may be delayed before sending, the time spent waiting for acknowledgment from the server, and the time allowed for retriable send failures. The producer may report failure earlier than this timeout if an unrecoverable error is encountered or the retries have been exhausted. This value should not be smaller than the sum of request.timeout.ms and linger.ms.

2.10: client.id: default value: ""
An ID string passed to the server when making requests. Its purpose is to allow a logical application name to be included in server-side request logging, so that request sources can be traced beyond just IP/port.

2.11: send.buffer.bytes: default value: 128 * 1024
The size of the TCP send buffer (SO_SNDBUF) used when sending data. If the value is -1, the OS default is used.

2.12: receive.buffer.bytes: default value: 32 * 1024
The size of the TCP receive buffer (SO_RCVBUF) used when reading data. If the value is -1, the OS default is used.

2.13: max.request.size: default value: 1024 * 1024
The maximum size of a request in bytes. When the producer sends records in batches, this setting caps the request size to avoid sending an oversized request. Note that the broker enforces its own maximum request size, which may differ from this value.

2.14: reconnect.backoff.ms: default value: 50 ms
The base amount of time to wait before attempting to reconnect to a given host. This avoids repeated connections to the host in a tight loop.

2.15: reconnect.backoff.max.ms: default: 1000 ms
The maximum time to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host increases for each consecutive connection failure, up to this maximum.

2.16: retry.backoff.ms: default value: 100 ms
The amount of time to wait before attempting to retry a failed request for a given topic partition. This can avoid sending requests repeatedly in a tight loop in some fault situations.

2.17: max.block.ms: default value: 60 * 1000
Controls how long KafkaProducer.send() and KafkaProducer.partitionsFor() will block. These methods can block because the buffer is full or metadata is unavailable. Blocking in a user-supplied serializer or partitioner does not count towards this timeout.

2.18: request.timeout.ms: default value: 30 * 1000
Controls the maximum amount of time the client waits for a response to a request. If no response is received before the timeout elapses, the client retries the request while retries remain, and fails it once they are exhausted. This value should be larger than the broker's replica.lag.time.max.ms to reduce the possibility of message duplication due to unnecessary producer retries.

2.19: metadata.max.age.ms: default: 5 * 60 * 1000
The period, in milliseconds, after which a metadata refresh is forced, so that new brokers or partitions are discovered proactively even when no partition leadership changes have been observed.

2.20: metrics.sample.window.ms: default value: 30000
The window of time over which metrics samples are computed.

2.21: metrics.num.samples: default: 2
The number of samples retained to compute metrics.

2.22: metrics.recording.level: default: info. Optional values: info, debug
The highest recording level for metrics.

2.23: metric.reporters: default: empty list
A list of classes to use as metrics reporters; each is an implementation of the org.apache.kafka.common.metrics.MetricsReporter interface. The JmxReporter is always included to register JMX statistics.

2.24: max.in.flight.requests.per.connection: default: 5
The maximum number of unacknowledged requests that a client will send on a single connection before blocking. Note that if this setting is set to greater than 1 and there is a failed send, there is a risk of message reordering due to retry.

2.25: key.serializer: default: none
An implementation class of the org.apache.kafka.common.serialization.Serializer interface, used to serialize keys.

2.26: value.serializer: default: none
An implementation class of the org.apache.kafka.common.serialization.Serializer interface, used to serialize values.
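
Putting 2.1, 2.5 and 2.25/2.26 together, a minimal producer sketch (with placeholder address host1:9092 and topic demo-topic) could look like this:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class MinimalProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "host1:9092");   // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");                       // see 2.5

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("demo-topic", "key", "value");     // placeholder topic
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("wrote to %s-%d@%d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
        } // close() flushes any outstanding records
    }
}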

2.27: connections.max.idle.ms: default value: 9 * 60 * 1000
Idle connections are closed after the number of milliseconds specified by this configuration.

2.28: partitioner.class: default: org.apache.kafka.clients.producer.internals.DefaultPartitioner
An implementation class of the org.apache.kafka.clients.producer.Partitioner interface, used to customize how records are assigned to partitions.
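
A hypothetical implementation (not from the original text) that sends keyless records to partition 0 and hashes keyed records could look like the sketch below; it would be registered via partitioner.class (ProducerConfig.PARTITIONER_CLASS_CONFIG).

import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.utils.Utils;

// Hypothetical partitioner: keyless records always go to partition 0, keyed records are hashed.
public class SimplePartitioner implements Partitioner {

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        if (keyBytes == null) {
            return 0;
        }
        return Utils.toPositive(Utils.murmur2(keyBytes)) % partitions.size();
    }

    @Override
    public void close() { }

    @Override
    public void configure(Map<String, ?> configs) { }
}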

2.29: interceptor.classes: default: empty list
A list of implementation classes of the org.apache.kafka.clients.producer.ProducerInterceptor interface, allowing records to be intercepted before they are sent to the Kafka cluster. By default there are no interceptors.

2.30: security.protocol: default value: PLAINTEXT
The protocol used to communicate with brokers. Valid values are: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL.

2.31: security.providers: default: null
A list of implementation classes of the org.apache.kafka.common.security.auth.SecurityProviderCreator interface, used to plug in security algorithms.

2.32: enable.idempotence: default: false
When set to true, the producer ensures that exactly one copy of each message is written to the stream. When false, producer retries caused by broker failures may write duplicates of the retried message to the stream. Note that enabling idempotence requires max.in.flight.requests.per.connection to be less than or equal to 5, retries to be greater than 0, and acks to be all. If the user does not explicitly set these values, suitable values are chosen automatically; if incompatible values are set, a ConfigException is thrown.
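
A sketch of an idempotent producer configuration consistent with these constraints (the values are shown explicitly only for illustration; leaving them unset lets the client pick compatible defaults):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class IdempotenceSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        // If set explicitly, the settings below must satisfy these constraints:
        props.put(ProducerConfig.ACKS_CONFIG, "all");                                 // must be all
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");         // must be <= 5
        props.put(ProducerConfig.RETRIES_CONFIG, String.valueOf(Integer.MAX_VALUE));  // must be > 0
        System.out.println(props);
    }
}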

2.33: transaction.timeout.ms: default value: 60000
The maximum time, in milliseconds, that the transaction coordinator will wait for a transaction status update from the producer before proactively aborting an ongoing transaction. If this value is larger than the transaction.max.timeout.ms setting on the broker, the request fails with an InvalidTransactionTimeout error.

2.34: transactional.id: default: null
The TransactionalId to use for transactional delivery. This enables reliability semantics that span multiple producer sessions, since it lets the client guarantee that transactions using the same TransactionalId have completed before any new transaction is started. If no TransactionalId is provided, the producer is limited to idempotent delivery. Note that if a TransactionalId is configured, enable.idempotence must be true. The default is null, which means transactions cannot be used.
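
A minimal transactional producer sketch (placeholder TransactionalId demo-tx-id, address host1:9092 and topic demo-topic; error handling reduced to the bare minimum):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "host1:9092");    // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-tx-id");     // placeholder TransactionalId
        // enable.idempotence is implied (and must be true) once transactional.id is set.

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();     // fences earlier sessions with the same TransactionalId
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("demo-topic", "key", "value")); // placeholder topic
                producer.commitTransaction();
            } catch (KafkaException e) {
                // For fatal errors (e.g. ProducerFencedException) the producer should be closed instead.
                producer.abortTransaction(); // consumers using read_committed will never see these records
            }
        }
    }
}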
