Learning Flink: ensuring end-to-end exactly-once semantics


In December 2017, Apache Flink 1.4.0 was released with a milestone feature: the two-phase commit sink function, TwoPhaseCommitSinkFunction (see the relevant Jira). TwoPhaseCommitSinkFunction splits the final write to storage into two phases of a commit, which makes it possible to build end-to-end exactly-once Flink applications, from the data source all the way to the data output. Among the outputs that TwoPhaseCommitSinkFunction supports is Apache Kafka 0.11 and above. Flink provides TwoPhaseCommitSinkFunction as an abstract class so that developers can implement end-to-end exactly-once semantics with little code (see the TwoPhaseCommitSinkFunction documentation).

Next, let’s look at this Flink feature in more depth, covering:

The role of checkpoints in Flink in guaranteeing exactly-once semantics

How Flink guarantees exactly-once semantics from data source to data output through the two-phase commit protocol

An example of how to use TwoPhaseCommitSinkFunction to implement an exactly-once sink

Exactly-once semantics means that the final processing results reflect each ingested record exactly once, with no data loss and no duplication.

A Flink checkpoint contains the current state of the Flink application together with the position in the input stream (for Kafka, the offset).

Checkpoints can be persisted asynchronously to a storage system such as S3 or HDFS. If the Flink application fails or is upgraded, the state in the checkpoint can be restored so that processing resumes from the point of the last successful checkpoint.
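As a toy illustration of this recovery model (plain Python, not Flink code), a checkpoint can be thought of as the application state plus the input offset; restoring it replays the input from the saved offset, so nothing is lost and nothing is double-counted:

```python
import copy

# Toy model: a "checkpoint" is application state plus the input read position.
records = [1, 2, 3, 4]           # the input stream
state = {"sum": 0, "offset": 0}  # application state + offset into the input

def run(state, records, until):
    """Process records from the saved offset up to (not including) `until`."""
    for off in range(state["offset"], until):
        state["sum"] += records[off]
        state["offset"] = off + 1

run(state, records, 2)
snapshot = copy.deepcopy(state)  # persisted to S3/HDFS in real Flink

run(state, records, 4)           # ...then the job "fails" before the next checkpoint
state = copy.deepcopy(snapshot)  # recover: pull the state from the checkpoint
run(state, records, 4)           # replay from offset 2: same result, no duplicates
```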

Before Flink 1.4.0, Flink used checkpoints to guarantee exactly-once semantics within the application itself. Now, with TwoPhaseCommitSinkFunction, the exactly-once guarantee extends end to end.

To provide this guarantee, the external systems that Flink connects to must support two-phase commit; that is, they must support pre-committing data and rolling it back if it is never finally committed. Later, we will discuss how Flink runs the two-phase commit protocol with an external system to guarantee these semantics.

Using Flink to ensure that end-to-end data is neither lost nor duplicated

Next, let’s walk through an example of using Flink to consume from and write to Kafka, guaranteeing exactly-once semantics through the two-phase commit.

Kafka has supported transactions since version 0.11. To use Flink’s end-to-end exactly-once semantics, the Kafka cluster behind Flink’s sink must be version 0.11 or above. Dell/EMC Pravega also supports end-to-end exactly-once semantics with Flink.
This example includes the following steps:

Read data from Kafka

A windowed aggregation operation

Write data to Kafka

To achieve exactly-once, all writes to Kafka must be transactional. Data is committed in batches between two checkpoints, so that uncommitted data can be rolled back if a task fails.
However, a simple commit-and-rollback is not enough for a distributed stream processing system. Let’s look at how Flink solves this problem.

The first phase of the two-phase commit protocol is the pre-commit. Flink’s JobManager injects a checkpoint barrier into the data stream (this barrier separates the records belonging to this checkpoint from those of the next one).

The barrier is passed through the entire DAG. Whenever an operator in the DAG encounters the barrier, it triggers a snapshot of that operator’s state.
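A minimal simulation of this barrier mechanism (a plain-Python sketch, not Flink’s actual runtime) shows why the snapshots for one checkpoint are consistent: every operator snapshots exactly when the barrier passes through it, so each snapshot covers the same prefix of the stream:

```python
BARRIER = object()  # stand-in for Flink's checkpoint barrier

class Operator:
    def __init__(self, name):
        self.name = name
        self.count = 0       # toy state: number of records processed
        self.snapshots = {}  # checkpoint id -> snapshotted state

    def process(self, record, checkpoint_id):
        if record is BARRIER:
            # The barrier reached this operator: snapshot state, pass barrier on.
            self.snapshots[checkpoint_id] = self.count
        else:
            self.count += 1
        return record

# A linear "DAG": source -> window -> sink.
ops = [Operator("source"), Operator("window"), Operator("sink")]
for record in ["a", "b", BARRIER, "c"]:
    for op in ops:
        record = op.process(record, checkpoint_id=1)
# Every operator snapshotted after exactly 2 records; "c" belongs to the
# next checkpoint.
```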

The operator that reads from Kafka stores its Kafka offsets when it encounters the checkpoint barrier, then passes the barrier on to the next operator.
Next come Flink’s internal, in-memory operators. These do not need to take part in the two-phase commit protocol, because their state is updated or rolled back together with Flink’s overall checkpoint state.

When external systems are involved, however, the two-phase commit protocol is needed to guarantee that data is neither lost nor duplicated. In the pre-commit phase, all data written to Kafka is only pre-committed.

When all operators have completed their snapshots, i.e., the checkpoint is complete, Flink’s JobManager notifies every operator that the checkpoint has completed, and the operator responsible for writing to Kafka then formally commits the previously written data. If a task fails at any stage, it is restored from the previous checkpointed state, and all data that was not formally committed is rolled back.
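The coordination just described can be sketched as a toy two-phase commit in plain Python (in real Flink, the JobManager plays the coordinator role and the sinks are the participants; the names below are illustrative):

```python
class Sink:
    """Toy participant: pre-commits, then either commits or aborts."""
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.state = "open"

    def pre_commit(self):
        self.state = "pre-committed" if self.healthy else "failed"
        return self.healthy

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(participants):
    """Phase 1: everyone pre-commits. Phase 2: commit only if all succeeded."""
    prepared = []
    for p in participants:
        if p.pre_commit():
            prepared.append(p)
        else:
            for q in prepared:   # any failure: roll the prepared ones back
                q.abort()
            return False
    for p in participants:       # checkpoint complete: formal commit
        p.commit()
    return True

ok = [Sink(), Sink()]
bad = [Sink(), Sink(healthy=False)]
committed = two_phase_commit(ok)     # both sinks end up "committed"
rolled_back = two_phase_commit(bad)  # the prepared sink is "aborted"
```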

To summarize the two phases of Flink’s commit:

Once a checkpoint starts, every operator pre-commits: internal operators snapshot their state, and sinks pre-commit their writes to the external system.

When all operators have completed their snapshots, the formal commit is performed.

If any subtask fails during the pre-commit phase, all the others stop immediately and the job rolls back to the state of the last successful snapshot.

After a pre-commit succeeds, the external system must keep the pre-committed data available until the formal commit. If a commit fails, the entire Flink application enters the failed state and restarts; after restarting, it continues to retry the commit from the last persisted state.

Using the two-phase commit operator in Flink

To use the two-phase commit operator, we can extend the abstract class TwoPhaseCommitSinkFunction.

Let’s explain this abstract class through a simple example of writing to a file. This two-phase commit sink involves four steps:

1. beginTransaction – create a temporary folder to write data into.

2. preCommit – write the data cached in memory to a file and close the file.

3. commit – move the previously written temporary files into the target directory. This means the final data becomes visible with some latency.

4. abort – discard the temporary files.
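Those four steps can be sketched in plain Python with real temporary files (an illustration of the idea only, not the actual TwoPhaseCommitSinkFunction API; the class and method names are chosen for the example):

```python
import os
import shutil
import tempfile

class FileSink:
    """Toy two-phase-commit file sink mirroring the four steps above."""
    def __init__(self, target_dir):
        self.target_dir = target_dir
        os.makedirs(target_dir, exist_ok=True)

    def begin_transaction(self):
        # 1. create a temporary folder to write data into
        self.tmp_dir = tempfile.mkdtemp()
        self.buffer = []

    def write(self, record):
        self.buffer.append(record)  # data is cached "in memory"

    def pre_commit(self):
        # 2. flush the cached data to a file and close it
        self.tmp_file = os.path.join(self.tmp_dir, "part-0")
        with open(self.tmp_file, "w") as f:
            f.write("\n".join(self.buffer))

    def commit(self):
        # 3. move the temp file into the target directory (hence the latency
        #    before the final data becomes visible)
        shutil.move(self.tmp_file, os.path.join(self.target_dir, "part-0"))
        shutil.rmtree(self.tmp_dir, ignore_errors=True)

    def abort(self):
        # 4. discard the temporary files
        shutil.rmtree(self.tmp_dir, ignore_errors=True)

target = tempfile.mkdtemp()
sink = FileSink(target)
sink.begin_transaction()
sink.write("hello")
sink.write("world")
sink.pre_commit()
sink.commit()  # the file is now visible under `target`
```

Until commit runs, the data exists only in the temporary folder, so a failure before the commit leaves the target directory untouched and abort can simply delete the temp files.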

If a failure occurs after a successful pre-commit but before the formal commit, Flink can, based on the persisted state, either commit the pre-committed data or delete it.


Flink guarantees end-to-end exactly-once semantics through its state mechanism and the two-phase commit protocol.

In batch processing, Flink does not need to persist every intermediate computation result.

Flink supports exactly-once producer semantics on top of Pravega and Kafka 0.11 and above.