Apache Flink zero basic introduction (4): five modes of client operation

Time:2019-12-29

Author: Zhou Kaibo

1. Environmental description

In the previous several courses, I talked about the building of Flink development environment, the deployment and operation of applications. Today’s course is mainly about the client operation of Flink. This explanation focuses on practical operation. This course is community-based Flink version 1.7.2. The operating system is Mac, and the browser is Google Chrome. For the preparation of development environment and the deployment of cluster, please refer to the content of “construction of development environment and configuration, deployment and operation of application”.

2. Course summary

As shown in the figure below, Flink provides rich client operations to submit tasks and interact with tasks, including Flink command line, Scala shell, SQL client, restful API and web. The most important thing Flink provides first is the command line, then the SQL client is used to submit the running of SQL tasks, and the scala shell is used to submit the table API tasks. At the same time, Flink also provides restful service, which users can call through HTTP. In addition, there is a Web way to submit tasks.

Apache Flink zero basic introduction (4): five modes of client operation

In the bin directory of the Flink installation directory, you can see the files such as Flink, start-scala-shell.sh and sql-client.sh, which are all the entries for client operations.

Apache Flink zero basic introduction (4): five modes of client operation

3. Flink client operation

3.1 Flink command line

There are many command line parameters for Flink. Enter Flink – h to see the complete description:

  flink-1.7.2 bin/flink -h

If you want to see the parameters of a command, such as the run command, enter:

  flink-1.7.2 bin/flink run -h

This article mainly explains some common operations. For more details, please refer to the official Flink command line documentation.

3.1.1 Standalone

Start a standalone cluster first:

  flink-1.7.2 bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host zkb-MBP.local.
Starting taskexecutor daemon on host zkb-MBP.local.

Open http://127.0.0.1:8081 to see the web interface.

Run

To run a task, take the example that comes with Flink, topspeedwindowing:

  flink-1.7.2 bin/flink run -d examples/streaming/TopSpeedWindowing.jar
Starting execution of program
Executing TopSpeedWindowing example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Job has been submitted with JobID 5e20cb6b0f357591171dfcca2eea09de

After running, the default is one Concurrency:

Apache Flink zero basic introduction (4): five modes of client operation

Click task manager on the left, and then click stdout to see the output log:

Apache Flink zero basic introduction (4): five modes of client operation

Or view the *. Out file in the local log directory:

Apache Flink zero basic introduction (4): five modes of client operation

List

To view the task list:

  flink-1.7.2 bin/flink list -m 127.0.0.1:8081
Waiting for response...
------------------ Running/Restarting Jobs -------------------
24.03.2019 10:14:06 : 5e20cb6b0f357591171dfcca2eea09de : CarTopSpeedWindowingExample (RUNNING)
--------------------------------------------------------------
No scheduled jobs.

Stop

Stop the task. Use – m to specify the host address and port of the jobmanager to stop.

  flink-1.7.2 bin/flink stop -m 127.0.0.1:8081 d67420e52bd051fae2fddbaa79e046bb
Stopping job d67420e52bd051fae2fddbaa79e046bb.
------------------------------------------------------------
The program finished with the following exception:
  org.apache.flink.util.FlinkException: Could not stop the job   d67420e52bd051fae2fddbaa79e046bb.
  at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:554)
  at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)
  at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:547)
  at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1062)
  at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
  at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [Job termination (STOP) failed: This job is not stoppable.]
  at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
  at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
  at org.apache.flink.client.program.rest.RestClusterClient.stop(RestClusterClient.java:392)
  at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:552)
... 9 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: [Job termination (STOP) failed: This job is not stoppable.]
  at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:380)
  at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:364)
  at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
  at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
  at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)

It can be seen from the log that the stop command failed to execute. A job can be stopped. All sources can be stopped. That is to say, the stoppablefunction interface is implemented.

/**
 *A function that can be stopped must implement this interface, such as the source of a streaming task.
 *The stop () method is called when the task receives a stop signal.
 *After receiving this signal, source must stop sending new data and stop gracefully.
 */
@PublicEvolving
public interface StoppableFunction {
    /**
      *Stop source. Unlike cancel(), this is a request for source to stop gracefully.
     *The waiting data can continue to be sent out without stopping immediately.
      */
    void stop();
}

Cancel

Cancel the task. If state.savepoints.dir is configured in conf / flip-conf.yaml, savepoint will be saved, otherwise savepoint will not be saved.

  flink-1.7.2 bin/flink cancel -m 127.0.0.1:8081 5e20cb6b0f357591171dfcca2eea09de
 
Cancelling job 5e20cb6b0f357591171dfcca2eea09de.
Cancelled job 5e20cb6b0f357591171dfcca2eea09de.

You can also display the specified savepoint directory when you stop.

  flink-1.7.2 bin/flink cancel -m 127.0.0.1:8081 -s /tmp/savepoint 29da945b99dea6547c3fbafd57ed8759
 
Cancelling job 29da945b99dea6547c3fbafd57ed8759 with savepoint to /tmp/savepoint.
Cancelled job 29da945b99dea6547c3fbafd57ed8759. Savepoint stored in file:/tmp/savepoint/savepoint-29da94-88299bacafb7.
 
  flink-1.7.2 ll /tmp/savepoint/savepoint-29da94-88299bacafb7
total 32K
-rw-r--r-- 1 baoniu 29K Mar 24 10:33 _metadata

The difference between canceling and stopping (streaming jobs) is as follows:

  • Cancel() call to immediately call the cancel() method of the job operator to cancel them as soon as possible. If the operator does not stop after receiving the cancel() call, Flink will start to interrupt the execution of the operator thread regularly until all operators stop.
  • The stop () call is a more elegant way to stop a running stream job. Stop() only applies to jobs where the source implements the stoppablefunction interface. When the user requests to stop the job, all sources of the job receive the stop () method call. The job will not end normally until all sources are shut down normally. In this way, the job can finish all jobs normally.

Savepoint

Trigger savepoint.

  flink-1.7.2 bin/flink savepoint -m 127.0.0.1:8081 ec53edcfaeb96b2a5dadbfbe5ff62bbb /tmp/savepoint
Triggering savepoint for job ec53edcfaeb96b2a5dadbfbe5ff62bbb.
Waiting for response...
Savepoint completed. Path: file:/tmp/savepoint/savepoint-ec53ed-84b00ce500ee
You can resume your program from this savepoint with the run command.

Note: the difference between savepoint and checkpoint (see the document for details):

  • Checkpoint is incremental, with a short time and a small amount of data each time. It will be triggered automatically as long as it is enabled in the program, and the user does not need to sense it. Checkpoint is used automatically when the job fails, and does not need to be specified by the user.
  • Savepoint is made in full amount, with a long time each time and a large amount of data, which needs to be triggered actively by the user. Savepoint is generally used for program version update (see the document for details), bug repair, a / B test and other scenarios, which need to be specified by the user.

Start from the specified savepoint with the – S parameter:

  flink-1.7.2 bin/flink run -d -s /tmp/savepoint/savepoint-f049ff-24ec0d3e0dc7 ./examples/streaming/TopSpeedWindowing.jar
Starting execution of program
Executing TopSpeedWindowing example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.

View the log of jobmanager, and you can see a log similar to this:

2019-03-28 10:30:53,957 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     
- Starting job 790d7b98db6f6af55d04aec1d773852d from savepoint /tmp/savepoint/savepoint-f049ff-24ec0d3e0dc7 ()
2019-03-28 10:30:53,959 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator    
 - Reset the checkpoint ID of job 790d7b98db6f6af55d04aec1d773852d to 2.
2019-03-28 10:30:53,959 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     
- Restoring job 790d7b98db6f6af55d04aec1d773852d from latest valid checkpoint: Checkpoint 1 @ 0 for 790d7b98db6f6af55d04aec1d773852d.

Modify

Modify task parallelism.

To facilitate the demonstration, we modified conf / flink-conf.yaml to change the number of task slots from the default 1 to 4, and configured the savepoint directory. (modify parameter followed by – s specifies that there may be bugs in the current version of savepoint path, indicating unrecognized)

taskmanager.numberOfTaskSlots: 4
state.savepoints.dir: file:///tmp/savepoint

After modifying the parameters, restart the cluster to take effect, and then start the task:

  flink-1.7.2 bin/stop-cluster.sh && bin/start-cluster.sh
Stopping taskexecutor daemon (pid: 53139) on host zkb-MBP.local.
Stopping standalonesession daemon (pid: 52723) on host zkb-MBP.local.
Starting cluster.
Starting standalonesession daemon on host zkb-MBP.local.
Starting taskexecutor daemon on host zkb-MBP.local.
 
  flink-1.7.2 bin/flink run -d examples/streaming/TopSpeedWindowing.jar
Starting execution of program
Executing TopSpeedWindowing example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Job has been submitted with JobID 7752ea7b0e7303c780de9d86a5ded3fa

From the page, you can see that the task slot changes to 4. At this time, the default concurrency of the task is 1.

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

By modifying the concurrency to 4 and 3, you can see that each modify command will trigger a savepoint.

  flink-1.7.2 bin/flink modify -p 4 7752ea7b0e7303c780de9d86a5ded3fa
Modify job 7752ea7b0e7303c780de9d86a5ded3fa.
Rescaled job 7752ea7b0e7303c780de9d86a5ded3fa. Its new parallelism is 4.
 
  flink-1.7.2 ll /tmp/savepoint
total 0
drwxr-xr-x 3 baoniu 96 Jun 17 09:05 savepoint-7752ea-00c05b015836/
 
  flink-1.7.2 bin/flink modify -p 3 7752ea7b0e7303c780de9d86a5ded3fa
Modify job 7752ea7b0e7303c780de9d86a5ded3fa.
Rescaled job 7752ea7b0e7303c780de9d86a5ded3fa. Its new parallelism is 3.
 
  flink-1.7.2 ll /tmp/savepoint
total 0
drwxr-xr-x 3 baoniu 96 Jun 17 09:08 savepoint-7752ea-449b131b2bd4/

Apache Flink zero basic introduction (4): five modes of client operation

To view the log of jobmanager, you can see:

2019-06-17 09:05:11,179 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Starting job 7752ea7b0e7303c780de9d86a5ded3fa from savepoint file:/tmp/savepoint/savepoint-790d7b-3581698f007e ()
2019-06-17 09:05:11,182 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Reset the checkpoint ID of job 7752ea7b0e7303c780de9d86a5ded3fa to 3.
2019-06-17 09:05:11,182 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Restoring job 790d7b98db6f6af55d04aec1d773852d from latest valid checkpoint: Checkpoint 2 @ 0 for 7752ea7b0e7303c780de9d86a5ded3fa.
2019-06-17 09:05:11,184 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - No master state to restore
2019-06-17 09:05:11,184 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job CarTopSpeedWindowingExample (7752ea7b0e7303c780de9d86a5ded3fa) switched from state RUNNING to SUSPENDING.
org.apache.flink.util.FlinkException: Job is being rescaled.

Info

The info command is used to view the execution plan (streamgraph) of the Flink task.

  flink-1.7.2 bin/flink info examples/streaming/TopSpeedWindowing.jar
----------------------- Execution Plan -----------------------
{"nodes":[{"id":1,"type":"Source: Custom Source","pact":"Data Source","contents":"Source: Custom Source","parallelism":1},{"id":2,"type":"Timestamps/Watermarks","pact":"Operator","contents":"Timestamps/Watermarks","parallelism":1,"predecessors":[{"id":1,"ship_strategy":"FORWARD","side":"second"}]},{"id":4,"type":"Window(GlobalWindows(), DeltaTrigger, TimeEvictor, ComparableAggregator, PassThroughWindowFunction)","pact":"Operator","contents":"Window(GlobalWindows(), DeltaTrigger, TimeEvictor, ComparableAggregator, PassThroughWindowFunction)","parallelism":1,"predecessors":[{"id":2,"ship_strategy":"HASH","side":"second"}]},{"id":5,"type":"Sink: Print to Std. Out","pact":"Data Sink","contents":"Sink: Print to Std. Out","parallelism":1,"predecessors":[{"id":4,"ship_strategy":"FORWARD","side":"second"}]}]}
--------------------------------------------------------------

Copy the output JSON content and paste it to this website: http://flick.apache.org/visualizer/

Apache Flink zero basic introduction (4): five modes of client operation

It can be compared with the actual physical execution plan:

Apache Flink zero basic introduction (4): five modes of client operation

3.1.2 Yarn per-job

Single task attach mode

The default is attach mode, that is, the client will wait until the end of the program before exiting.

  • Specify the yarn mode through – M yarn cluster
  • The display name on the yarn is Flink session cluster. The wordcount task of this batch will be finished and finished.
  • The client can see the result output
[[email protected] /home/admin/flink/flink-1.7.2]
$echo $HADOOP_CONF_DIR
/etc/hadoop/conf/
 
[[email protected] /home/admin/flink/flink-1.7.2]
$./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar
 
2019-06-17 09:15:24,511 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-06-17 09:15:24,690 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-17 09:15:24,690 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-17 09:15:24,907 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=1, slotsPerTaskManager=4}
2019-06-17 09:15:25,430 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory       - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2019-06-17 09:15:25,438 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The configuration directory ('/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-06-17 09:15:36,239 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Submitting application master application_1532332183347_0724
2019-06-17 09:15:36,276 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1532332183347_0724
2019-06-17 09:15:36,276 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for the cluster to be allocated
2019-06-17 09:15:36,281 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying cluster, current state ACCEPTED
2019-06-17 09:15:40,426 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - YARN application has been deployed successfully.
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
(a,5)
(action,1)
(after,1)
(against,1)
(all,2)
... ...
(would,2)
(wrong,1)
(you,1)
Program execution finished
Job with JobID 8bfe7568cb5c3254af30cbbd9cd5971e has finished.
Job Runtime: 9371 ms
Accumulator Results:
- 2bed2c5506e9237fb85625416a1bc508 (java.util.ArrayList) [170 elements]

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

If we run the streaming task in attach mode, the client will wait for us to exit. You can run the following example to test:

./bin/flink run -m yarn-cluster ./examples/streaming/TopSpeedWindowing.jar

Single task detached mode

  • Due to the detached mode, the client exits after submitting the task
  • Flink per job cluster is displayed on yarn
$./bin/flink run -yd -m yarn-cluster ./examples/streaming/TopSpeedWindowing.jar
 
2019-06-18 09:21:59,247 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-06-18 09:21:59,428 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-18 09:21:59,428 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-18 09:21:59,940 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=1, slotsPerTaskManager=4}
2019-06-18 09:22:00,427 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory       - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2019-06-18 09:22:00,436 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The configuration directory ('/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
^@2019-06-18 09:22:12,113 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Submitting application master application_1532332183347_0729
2019-06-18 09:22:12,151 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1532332183347_0729
2019-06-18 09:22:12,151 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for the cluster to be allocated
2019-06-18 09:22:12,155 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying cluster, current state ACCEPTED
2019-06-18 09:22:16,275 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - YARN application has been deployed successfully.
2019-06-18 09:22:16,275 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The Flink YARN client has been started in detached mode. In order to stop Flink on YARN, use the following command or a YARN web interface to stop it:
yarn application -kill application_1532332183347_0729
Please also note that the temporary files of the YARN session in the home directory will not be removed.
Job has been submitted with JobID e61b9945c33c300906ad50a9a11f36df

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

3.1.3 Yarn session

Start session

./bin/yarn-session.sh -tm 2048 -s 3

It means to start a yarn session cluster. The memory of each TM is 2G, and each TM has three slots. (Note: – n parameter is not valid)

  flink-1.7.2 ./bin/yarn-session.sh -tm 2048 -s 3
2019-06-17 09:21:50,177 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-06-17 09:21:50,179 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-06-17 09:21:50,179 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-06-17 09:21:50,179 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-06-17 09:21:50,179 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2019-06-17 09:21:50,179 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: state.savepoints.dir, file:///tmp/savepoint
2019-06-17 09:21:50,180 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-06-17 09:21:50,180 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: rest.port, 8081
2019-06-17 09:21:50,644 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-06-17 09:21:50,746 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to baoniu (auth:SIMPLE)
2019-06-17 09:21:50,848 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-06-17 09:21:51,148 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=2048, numberTaskManagers=1, slotsPerTaskManager=3}
2019-06-17 09:21:51,588 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory       - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2019-06-17 09:21:51,596 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The configuration directory ('/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
^@2019-06-17 09:22:03,304 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Submitting application master application_1532332183347_0726
2019-06-17 09:22:03,336 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1532332183347_0726
2019-06-17 09:22:03,336 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for the cluster to be allocated
2019-06-17 09:22:03,340 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying cluster, current state ACCEPTED
2019-06-17 09:22:07,722 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - YARN application has been deployed successfully.
2019-06-17 09:22:08,050 INFO  org.apache.flink.runtime.rest.RestClient                      - Rest client endpoint started.
Flink JobManager is now running on z07.sqa.net:37109 with leader id 00000000-0000-0000-0000-000000000000.
JobManager Web Interface: http://z07.sqa.net:37109

The client defaults to attach mode and does not exit:

  • You can exit with Ctrl + C, and then connect it through. / bin / yarn-session.sh – ID application;
  • Or use – D at startup to set the selected mode

It is shown as Flink session cluster on Yan;

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

  • A file is generated in the local temporary directory (some machines are / tmp directory):
  flink-1.7.2 cat /var/folders/2b/r6d49pcs23z43b8fqsyz885c0000gn/T/.yarn-properties-baoniu
#Generated YARN properties file
#Mon Jun 17 09:22:08 CST 2019
parallelism=3
dynamicPropertiesString=
applicationID=application_1532332183347_0726

Submit tasks

./bin/flink run ./examples/batch/WordCount.jar

It will be submitted to the newly started session according to the contents of the / TMP /. Yarn properties admin file.

  flink-1.7.2 ./bin/flink run ./examples/batch/WordCount.jar
2019-06-17 09:26:42,767 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/2b/r6d49pcs23z43b8fqsyz885c0000gn/T/.yarn-properties-baoniu.
2019-06-17 09:26:42,767 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /var/folders/2b/r6d49pcs23z43b8fqsyz885c0000gn/T/.yarn-properties-baoniu.
2019-06-17 09:26:43,058 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - YARN properties set default parallelism to 3
2019-06-17 09:26:43,058 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - YARN properties set default parallelism to 3
YARN properties set default parallelism to 3
2019-06-17 09:26:43,097 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-06-17 09:26:43,229 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-17 09:26:43,229 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-06-17 09:26:43,327 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Found application JobManager host name 'z05c07216.sqa.zth.tbsite.net' and port '37109' from supplied application id 'application_1532332183347_0726'
Starting execution of program
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
^@(a,5)
(action,1)
(after,1)
(against,1)
(all,2)
(and,12)
... ...
(wrong,1)
(you,1)
Program execution finished
Job with JobID ad9b0f1feed6d0bf6ba4e0f18b1e65ef has finished.
Job Runtime: 9152 ms
Accumulator Results:
- fd07c75d503d0d9a99e4f27dd153114c (java.util.ArrayList) [170 elements]

TM’s resources are released at the end of the run.

Apache Flink zero basic introduction (4): five modes of client operation

Submit to the specified session

Submit to the specified session with the – YID parameter.

$./bin/flink run -d -p 30 -m yarn-cluster -yid application_1532332183347_0708 ./examples/streaming/TopSpeedWindowing.jar
 
2019-03-24 12:36:33,668 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-03-24 12:36:33,773 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-03-24 12:36:33,773 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-03-24 12:36:33,837 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Found application JobManager host name 'z05c05218.sqa.zth.tbsite.net' and port '60783' from supplied application id 'application_1532332183347_0708'
Starting execution of program
Executing TopSpeedWindowing example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
Job has been submitted with JobID 58d5049ebbf28d515159f2f88563f5fd

Apache Flink zero basic introduction (4): five modes of client operation

Note: the difference between blink’s session and Flink’s session:

  • The session-n parameter of Flink does not take effect, and TM will not be started in advance;
  • Blink’s session can specify how many TM’s to start through – N, and TM will get up in advance;

3.2 Scala Shell

Official document: https://ci.apache.org/projects/flick/flick-docs-release-1.7/ops/scala_shell.html

3.2.1 Deploy

Local

$bin/start-scala-shell.sh local
Starting Flink Shell:
Starting local Flink cluster (host: localhost, port: 8081).
Connecting to Flink cluster (host: localhost, port: 8081).
... ...
scala>

Task operation description:

  • The batch task has built-in benv variables, and outputs the results to the console through print();
  • The streaming task has built-in SENV variable, which submits the task through senv.execute (“job name”), and the output of datastream is only printed to the console in local mode;

Remote

Start a yarn session cluster:

$./bin/yarn-session.sh  -tm 2048 -s 3

2019-03-25 09:52:16,341 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-03-25 09:52:16,342 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-03-25 09:52:16,342 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-03-25 09:52:16,343 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-03-25 09:52:16,343 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2019-03-25 09:52:16,343 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-03-25 09:52:16,343 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: state.savepoints.dir, file:///tmp/savepoint
2019-03-25 09:52:16,343 INFO  org.apache.flink.configuration.GlobalConfiguration            
… ...
Flink JobManager is now running on z054.sqa.net:28665 with leader id 00000000-0000-0000-0000-000000000000.
JobManager Web Interface: http://z054.sqa.net:28665

Start the scala shell and connect to JM:

$bin/start-scala-shell.sh remote z054.sqa.net 28665
Starting Flink Shell:
Connecting to Flink cluster (host: z054.sqa.net, port: 28665).
... ...
scala>

Yarn

$./bin/start-scala-shell.sh yarn -n 2 -jm 1024 -s 2 -tm 1024 -nm flink-yarn

Starting Flink Shell:
2019-03-25 09:47:44,695 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.address, localhost
2019-03-25 09:47:44,697 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.rpc.port, 6123
2019-03-25 09:47:44,697 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: jobmanager.heap.size, 1024m
2019-03-25 09:47:44,697 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.heap.size, 1024m
2019-03-25 09:47:44,697 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: taskmanager.numberOfTaskSlots, 4
2019-03-25 09:47:44,698 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: parallelism.default, 1
2019-03-25 09:47:44,698 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: state.savepoints.dir, file:///tmp/savepoint
2019-03-25 09:47:44,698 INFO  org.apache.flink.configuration.GlobalConfiguration            - Loading configuration property: rest.port, 8081
2019-03-25 09:47:44,717 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found Yarn properties file under /tmp/.yarn-properties-admin.
2019-03-25 09:47:45,041 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at z05c05217.sqa.zth.tbsite.net/11.163.188.29:8050
2019-03-25 09:47:45,098 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-03-25 09:47:45,266 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-03-25 09:47:45,275 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The argument yn is deprecated in will be ignored.
2019-03-25 09:47:45,357 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Cluster specification: ClusterSpecification{masterMemoryMB=1024, taskManagerMemoryMB=1024, numberTaskManagers=2, slotsPerTaskManager=2}
2019-03-25 09:47:45,711 WARN  org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory       - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2019-03-25 09:47:45,718 WARN  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - The configuration directory ('/home/admin/flink/flink-1.7.2/conf') contains both LOG4J and Logback configuration files. Please delete or rename one of them.
2019-03-25 09:47:46,514 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Submitting application master application_1532332183347_0710
2019-03-25 09:47:46,534 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl         - Submitted application application_1532332183347_0710
2019-03-25 09:47:46,534 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Waiting for the cluster to be allocated
2019-03-25 09:47:46,535 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Deploying cluster, current state ACCEPTED
2019-03-25 09:47:51,051 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor           - YARN application has been deployed successfully.
2019-03-25 09:47:51,222 INFO  org.apache.flink.runtime.rest.RestClient                      - Rest client endpoint started.

Connecting to Flink cluster (host: 10.10.10.10, port: 56942).

Apache Flink zero basic introduction (4): five modes of client operation

After pressing Ctrl + C to exit the shell, the Flink cluster will continue to run and will not exit.

3.2.2 Execute

DataSet

  flink-1.7.2 bin/stop-cluster.sh
No taskexecutor daemon to stop on host zkb-MBP.local.
No standalonesession daemon to stop on host zkb-MBP.local.
  flink-1.7.2 bin/start-scala-shell.sh local
Starting Flink Shell:
Starting local Flink cluster (host: localhost, port: 8081).
Connecting to Flink cluster (host: localhost, port: 8081).

scala> val text = benv.fromElements("To be, or not to be,--that is the question:--")
text: org.apache.flink.api.scala.DataSet[String] = [email protected]

scala> val counts = text.flatMap { _.toLowerCase.split("\W+") }.map { (_, 1) }.groupBy(0).sum(1)
counts: org.apache.flink.api.scala.AggregateDataSet[(String, Int)] = [email protected]

scala> counts.print()
(be,2)
(is,1)
(not,1)
(or,1)
(question,1)
(that,1)
(the,1)
(to,2)

For a dataset task, print () triggers the execution of the task.

Apache Flink zero basic introduction (4): five modes of client operation

You can also output the result to a file (delete / TMP / out1 first, or an error will be reported that the file with the same name already exists). Continue to execute the following command:

scala> counts.writeAsText("/tmp/out1")
res1: org.apache.flink.api.java.operators.DataSink[(String, Int)] = DataSink '<unnamed>' (TextOutputFormat (/tmp/out1) - UTF-8)

scala> benv.execute("batch test")
res2: org.apache.flink.api.common.JobExecutionResult = [email protected]

Look at the / TMP / out1 file to see the output.

  flink-1.7.2 cat /tmp/out1
(be,2)
(is,1)
(not,1)
(or,1)
(question,1)
(that,1)
(the,1)
(to,2)

Apache Flink zero basic introduction (4): five modes of client operation

DataSteam

scala> val textStreaming = senv.fromElements("To be, or not to be,--that is the question:--")
textStreaming: org.apache.flink.streaming.api.scala.DataStream[String] = [email protected]

scala> val countsStreaming = textStreaming.flatMap { _.toLowerCase.split("\W+") }.map { (_, 1) }.keyBy(0).sum(1)
countsStreaming: org.apache.flink.streaming.api.scala.DataStream[(String, Int)] = [email protected]

scala> countsStreaming.print()
res3: org.apache.flink.streaming.api.datastream.DataStreamSink[(String, Int)] = [email protected]f

scala> senv.execute("Streaming Wordcount")
(to,1)
(be,1)
(or,1)
(not,1)
(to,2)
(be,2)
(that,1)
(is,1)
(the,1)
(question,1)
res4: org.apache.flink.api.common.JobExecutionResult = [email protected]

For a datastream task, print() does not trigger the execution of the task. It needs to display the call execute (“job name”) to execute the task.

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

TableAPI

In the open source version of blink, tasks are submitted in the form of tableapi (btenv.sqlquery can be used to submit SQL queries). The community version of Flink 1.8 will support tableapi: https://issues.apache.org/jira/browse/flink-9555

3.3 SQL Client Beta

At present, SQL client is only a beta version, which is in the development stage and can only be used for prototype verification of SQL. It is not recommended to be used in the production environment.

3.3.1 basic usage

  flink-1.7.2 bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host zkb-MBP.local.
Starting taskexecutor daemon on host zkb-MBP.local.

  flink-1.7.2 ./bin/sql-client.sh embedded
No default environment specified.
Searching for '/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf/sql-client-defaults.yaml'...found.
Reading default environment from: file:/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf/sql-client-defaults.yaml
No session environment specified.
Validating current environment...done.
… …

Flink SQL> help;
The following commands are available:

QUIT        Quits the SQL CLI client.
CLEAR        Clears the current terminal.
HELP        Prints the available commands.
SHOW TABLES        Shows all registered tables.
SHOW FUNCTIONS        Shows all registered user-defined functions.
DESCRIBE        Describes the schema of a table with the given name.
EXPLAIN        Describes the execution plan of a query or table with the given name.
SELECT        Executes a SQL SELECT query on the Flink cluster.
INSERT INTO        Inserts the results of a SQL SELECT query into a declared table sink.
CREATE VIEW        Creates a virtual table from a SQL query. Syntax: 'CREATE VIEW <name> AS <query>;'
DROP VIEW        Deletes a previously created virtual table. Syntax: 'DROP VIEW <name>;'
SOURCE        Reads a SQL SELECT query from a file and executes it on the Flink cluster.
SET        Sets a session configuration property. Syntax: 'SET <key>=<value>;'. Use 'SET;' for listing all properties.
RESET        Resets all session configuration properties.

Hint: Make sure that a statement ends with ';' for finalizing (multi-line) statements.

Select query

Flink SQL> SELECT 'Hello World';

Apache Flink zero basic introduction (4): five modes of client operation

Press “Q” to exit this interface
Open http://127.0.0.1:8081 to see that the query task generated by this select statement has ended. This query uses the custom source to read the fixed data set and the stream collect sink to output only one result.

Note: if a file similar to. Yarn properties baoiu exists in the local temporary directory, the task will be submitted to yarn.

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Explain

The explain command can view the execution plan of SQL.

Flink SQL> explain SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name;

==Abstract syntax tree = = // abstract syntax tree
LogicalAggregate(group=[{0}], cnt=[COUNT()])
  LogicalValues(tuples=[[{ _UTF-16LE'Bob  ' }, { _UTF-16LE'Alice' }, { _UTF-16LE'Greg ' }, { _UTF-16LE'Bob  ' }]])

==Optimized logical plan = = // optimized logical execution plan
DataStreamGroupAggregate(groupBy=[name], select=[name, COUNT(*) AS cnt])
  DataStreamValues(tuples=[[{ _UTF-16LE'Bob  ' }, { _UTF-16LE'Alice' }, { _UTF-16LE'Greg ' }, { _UTF-16LE'Bob  ' }]])

==Physical execution plan = = // physical execution plan
Stage 3 : Data Source
    content : collect elements with CollectionInputFormat

    Stage 5 : Operator
        content : groupBy: (name), select: (name, COUNT(*) AS cnt)
        ship_strategy : HASH

3.3.2 result display

SQL client supports two modes to maintain and display query results:

  • Table mode: materialize the query results in memory and display them in the form of paging table. The user can enable table mode through the following command;
 SET execution.result-mode=table
  • Changing mode: it does not materialize query results, but directly displays the results of adding and withdrawing generated by continuous query.
SET execution.result-mode=changelog

Next, the actual example is used for demonstration.

Table mode

Flink SQL> SET execution.result-mode=table;
[INFO] Session property has been set.

Flink SQL> SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name;

The operation results are as follows:

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Changlog mode

Flink SQL> SET execution.result-mode=changelog;
[INFO] Session property has been set.

Flink SQL> SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name;

The operation results are as follows:

Apache Flink zero basic introduction (4): five modes of client operation

Where ‘-‘ stands for the withdrawal message.

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

3.3.3 Environment Files

At present, SQL client does not support DDL statement. It can only define the table, UDF, running parameters and other information needed by SQL query through yaml file.

First, prepare the env.yaml and input.csv files.

  flink-1.7.2 cat /tmp/env.yaml
tables:
  - name: MyTableSource
    type: source-table
    update-mode: append
    connector:
      type: filesystem
      path: "/tmp/input.csv"
    format:
      type: csv
      fields:
        - name: MyField1
          type: INT
        - name: MyField2
          type: VARCHAR
      line-delimiter: "\n"
      comment-prefix: "#"
    schema:
      - name: MyField1
        type: INT
      - name: MyField2
        type: VARCHAR
  - name: MyCustomView
    type: view
    query: "SELECT MyField2 FROM MyTableSource"
  - name: MyTableSink
    type: sink-table
    update-mode: append
    connector:
      type: filesystem
      path: "/tmp/output.csv"
    format:
      type: csv
      fields:
        - name: MyField1
          type: INT
        - name: MyField2
          type: VARCHAR
    schema:
      - name: MyField1
        type: INT
      - name: MyField2
        type: VARCHAR
        
# Execution properties allow for changing the behavior of a table program.

execution:
  type: streaming                   # required: execution mode either 'batch' or 'streaming'
  result-mode: table                # required: either 'table' or 'changelog'
  max-table-result-rows: 1000000    # optional: maximum number of maintained rows in
                                    #   'table' mode (1000000 by default, smaller 1 means unlimited)
  time-characteristic: event-time   # optional: 'processing-time' or 'event-time' (default)
  parallelism: 1                    # optional: Flink's parallelism (1 by default)
  periodic-watermarks-interval: 200 # optional: interval for periodic watermarks (200 ms by default)
  max-parallelism: 16               # optional: Flink's maximum parallelism (128 by default)
  min-idle-state-retention: 0       # optional: table program's minimum idle state time
  max-idle-state-retention: 0       # optional: table program's maximum idle state time
  restart-strategy:                 # optional: restart strategy
    type: fallback                  #   "fallback" to global restart strategy by default

# Deployment properties allow for describing the cluster to which table programs are submitted to.

deployment:
  response-timeout: 5000

  flink-1.7.2 cat /tmp/input.csv
1,hello
2,world
3,hello world
1,ok
3,bye bye
4,yes

Start SQL client:

  flink-1.7.2 ./bin/sql-client.sh embedded -e /tmp/env.yaml
No default environment specified.
Searching for '/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf/sql-client-defaults.yaml'...found.
Reading default environment from: file:/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/conf/sql-client-defaults.yaml
Reading session environment from: file:/tmp/env.yaml
Validating current environment...done.

Flink SQL> show tables;
MyCustomView
MyTableSink
MyTableSource

Flink SQL> describe MyTableSource;
root
 |-- MyField1: Integer
 |-- MyField2: String

Flink SQL> describe MyCustomView;
root
 |-- MyField2: String

Flink SQL> create view MyView1 as select MyField1 from MyTableSource;
[INFO] View has been created.

Flink SQL> show tables;
MyCustomView
MyTableSource
MyView1

Flink SQL> describe MyView1;
root
 |-- MyField1: Integer

Flink SQL> select * from MyTableSource;

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Use insert into to write the result table:

Flink SQL> insert into MyTableSink select * from MyTableSource;
[INFO] Submitting SQL update statement to the cluster...
[INFO] Table update statement has been successfully submitted to the cluster:
Cluster ID: StandaloneClusterId
Job ID: 3fac2be1fd891e3e07595c684bb7b7a0
Web interface: http://localhost:8081

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Query the generated result data file:

  flink-1.7.2 cat /tmp/output.csv
1,hello
2,world
3,hello world
1,ok
3,bye bye
4,yes

You can also define UDF in the environment file, query and use it in the SQL client through “how functions”, which will not be explained here.

The SQL client functional community is still under development. See flip-24 for details.

3.4 Restful API

Next, we show how to submit jar packages and execute tasks through the rest API.

For more details, please refer to Flink’s restful API document: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html

  flink-1.7.2 curl http://127.0.0.1:8081/overview
{"taskmanagers":1,"slots-total":4,"slots-available":0,"jobs-running":3,"jobs-finished":0,"jobs-cancelled":0,"jobs-failed":0,"flink-version":"1.7.2","flink-commit":"ceba8af"}%

  flink-1.7.2 curl -X POST -H "Expect:" -F "[email protected]/Users/baoniu/Documents/work/tool/flink/flink-1.7.2/examples/streaming/TopSpeedWindowing.jar" http://127.0.0.1:8081/jars/upload
{"filename":"/var/folders/2b/r6d49pcs23z43b8fqsyz885c0000gn/T/flink-web-124c4895-cf08-4eec-8e15-8263d347efc2/flink-web-upload/6077eca7-6db0-4570-a4d0-4c3e05a5dc59_TopSpeedWindowing.jar","status":"success"}%       
                                                                                                                                                                                                     flink-1.7.2 curl http://127.0.0.1:8081/jars
{"address":"http://localhost:8081","files":[{"id":"6077eca7-6db0-4570-a4d0-4c3e05a5dc59_TopSpeedWindowing.jar","name":"TopSpeedWindowing.jar","uploaded":1553743438000,"entry":[{"name":"org.apache.flink.streaming.examples.windowing.TopSpeedWindowing","description":null}]}]}%
                                                                                                                                         
  flink-1.7.2 curl http://127.0.0.1:8081/jars/6077eca7-6db0-4570-a4d0-4c3e05a5dc59_TopSpeedWindowing.jar/plan
{"plan":{"jid":"41029eb3feb9132619e454ec9b2a89fb","name":"CarTopSpeedWindowingExample","nodes":[{"id":"90bea66de1c231edf33913ecd54406c1","parallelism":1,"operator":"","operator_strategy":"","description":"Window(GlobalWindows(), DeltaTrigger, TimeEvictor, ComparableAggregator, PassThroughWindowFunction) -> Sink: Print to Std. Out","inputs":[{"num":0,"id":"cbc357ccb763df2852fee8c4fc7d55f2","ship_strategy":"HASH","exchange":"pipelined_bounded"}],"optimizer_properties":{}},{"id":"cbc357ccb763df2852fee8c4fc7d55f2","parallelism":1,"operator":"","operator_strategy":"","description":"Source: Custom Source -> Timestamps/Watermarks","optimizer_properties":{}}]}}%                                                                                                                                                      flink-1.7.2 curl -X POST http://127.0.0.1:8081/jars/6077eca7-6db0-4570-a4d0-4c3e05a5dc59_TopSpeedWindowing.jar/run
{"jobid":"04d80a24b076523d3dc5fbaa0ad5e1ad"}%

Apache Flink zero basic introduction (4): five modes of client operation

Apache Flink zero basic introduction (4): five modes of client operation

Restful API also provides a lot of monitoring and metrics related functions, and also supports a comprehensive operation of task submission.

3.5 Web

On the left side of the Flink dashboard page, you can see a place called “submit new job”. Users can upload jar packages and display execution plans and submit tasks. The web submission function is mainly used for novice introduction and demonstration.

Apache Flink zero basic introduction (4): five modes of client operation

4. end

This course is over here. We mainly discuss five ways of task submission of Flink. Mastering various task submission methods is conducive to improving our daily development and operation and maintenance efficiency.

Video review: https://zh.ververica.com/deve