Learn Flink from 0 to 1 – How to customize a data sink?

Time: 2021-11-25

Preface

The previous article, Learn Flink from 0 to 1 – Introduction to data sink, introduced Flink's data sinks and the sinks Flink ships with. So how do you customize your own sink? This article walks through a demo that sinks data from a Kafka source into MySQL.

Preparation

Let's look at a demo in which Flink reads data from a Kafka topic. First, you need to have Flink and Kafka installed.

Start Flink, ZooKeeper, and Kafka:

[Screenshots: Flink, ZooKeeper, and Kafka starting up]

All right, everything is up and running!

Database table creation

DROP TABLE IF EXISTS `student`;
CREATE TABLE `student` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(25) COLLATE utf8_bin DEFAULT NULL,
  `password` varchar(25) COLLATE utf8_bin DEFAULT NULL,
  `age` int(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

Entity class

Student.java

package com.zhisheng.flink.model;

/**
 * Desc:
 * weixin: zhisheng_tian
 * blog: http://www.54tianzhisheng.cn/
 */
public class Student {
    public int id;
    public String name;
    public String password;
    public int age;

    public Student() {
    }

    public Student(int id, String name, String password, int age) {
        this.id = id;
        this.name = name;
        this.password = password;
        this.age = age;
    }

    @Override
    public String toString() {
        return "Student{" +
                "id=" + id +
                ", name='" + name + '\'' +
                ", password='" + password + '\'' +
                ", age=" + age +
                '}';
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPassword() {
        return password;
    }

    public void setPassword(String password) {
        this.password = password;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}
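Both the tool class below and the Flink job rely on fastjson to carry Student objects through Kafka as JSON strings. Here is a quick round-trip sanity check; this is just a sketch, assuming fastjson is on the classpath:

import com.alibaba.fastjson.JSON;

public class JsonRoundTrip {
    public static void main(String[] args) {
        Student s = new Student(1, "zhisheng1", "password1", 19);
        String json = JSON.toJSONString(s);                   // serialize to JSON
        Student back = JSON.parseObject(json, Student.class); // parse back into a Student
        System.out.println(json); // {"age":19,"id":1,"name":"zhisheng1","password":"password1"}
        System.out.println(back); // Student{id=1, name='zhisheng1', password='password1', age=19}
    }
}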

Tool class

This tool class sends data to the Kafka topic student:

import com.alibaba.fastjson.JSON;
import com.zhisheng.flink.model.Student;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

/**
 * Writes data to Kafka.
 * You can run this main function to test it.
 * weixin: zhisheng_tian
 * blog: http://www.54tianzhisheng.cn/
 */
public class KafkaUtils2 {
    public static final String broker_list = "localhost:9092";
    public static final String topic = "student";  // must match the topic the Flink program consumes

    public static void writeToKafka() throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", broker_list);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);

        for (int i = 1; i <= 100; i++) {
            Student student = new Student(i, "zhisheng" + i, "password" + i, 18 + i);
            ProducerRecord<String, String> record = new ProducerRecord<>(topic, JSON.toJSONString(student));
            producer.send(record);
            System.out.println("send data: " + JSON.toJSONString(student));
        }
        producer.flush();
    }

    public static void main(String[] args) throws InterruptedException {
        writeToKafka();
    }
}
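Older Kafka brokers often auto-create topics on first write, but if yours does not, you can create the student topic up front with Kafka's AdminClient API (available since kafka-clients 0.11). A minimal sketch, assuming the same localhost:9092 broker as above:

import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateStudentTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // AdminClient is AutoCloseable, so try-with-resources cleans it up
        try (AdminClient admin = AdminClient.create(props)) {
            // 1 partition, replication factor 1 -- enough for a local demo
            NewTopic topic = new NewTopic("student", 1, (short) 1);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}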

SinkToMySQL

This class is our sink function: it extends RichSinkFunction and overrides its methods, inserting data into MySQL in the invoke() method.

package com.zhisheng.flink.sink;

import com.zhisheng.flink.model.Student;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

/**
 * Desc:
 * weixin: zhisheng_tian
 * blog: http://www.54tianzhisheng.cn/
 */
public class SinkToMySQL extends RichSinkFunction<Student> {
    private PreparedStatement ps;
    private Connection connection;

    /**
     * Establish the connection in open() so that a connection is not created and released on every invoke() call.
     *
     * @param parameters
     * @throws Exception
     */
    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        connection = getConnection();
        String sql = "insert into student(id, name, password, age) values(?, ?, ?, ?);";
        ps = this.connection.prepareStatement(sql);
    }

    @Override
    public void close() throws Exception {
        super.close();
        // Close the statement first, then the connection, to free resources
        if (ps != null) {
            ps.close();
        }
        if (connection != null) {
            connection.close();
        }
    }

    /**
     * The invoke() method is called once for each record.
     *
     * @param value
     * @param context
     * @throws Exception
     */
    @Override
    public void invoke(Student value, Context context) throws Exception {
        //Assemble the data and perform the insert operation
        ps.setInt(1, value.getId());
        ps.setString(2, value.getName());
        ps.setString(3, value.getPassword());
        ps.setInt(4, value.getAge());
        ps.executeUpdate();
    }

    private static Connection getConnection() {
        Connection con = null;
        try {
            Class.forName("com.mysql.jdbc.Driver");
            con = DriverManager.getConnection("jdbc:mysql://localhost:3306/test?useUnicode=true&characterEncoding=UTF-8", "root", "root123456");
        } catch (Exception e) {
            System.out.println("-----------mysql get connection has exception , msg = "+ e.getMessage());
        }
        return con;
    }
}
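A side note on the invoke() above: it issues one executeUpdate() per record, which is fine for a demo but means one database round trip per element. If you need more throughput, one common variation (a sketch, not from the original article) is to buffer rows with JDBC batching and flush every N records. These members would replace the invoke() in SinkToMySQL above:

    // Hypothetical batched variant: buffer rows locally and flush in groups
    private static final int BATCH_SIZE = 100;
    private int bufferedRows = 0;

    @Override
    public void invoke(Student value, Context context) throws Exception {
        ps.setInt(1, value.getId());
        ps.setString(2, value.getName());
        ps.setString(3, value.getPassword());
        ps.setInt(4, value.getAge());
        ps.addBatch(); // queue the row instead of sending it immediately
        if (++bufferedRows >= BATCH_SIZE) {
            ps.executeBatch(); // one round trip for the whole batch
            bufferedRows = 0;
        }
    }

Remember to also call ps.executeBatch() in close() so the tail of the stream is not lost, and note that rows buffered this way can still be lost on a failure unless flushing is tied to Flink's checkpoints.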

Flink program

Here the source reads JSON data from Kafka; Flink parses each string into a Student object with Alibaba's fastjson, and then addSink uses the SinkToMySQL we wrote above to store the data in MySQL.

package com.zhisheng.flink;

import com.alibaba.fastjson.JSON;
import com.zhisheng.flink.model.Student;
import com.zhisheng.flink.sink.SinkToMySQL;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

import java.util.Properties;

/**
 * Desc:
 * weixin: zhisheng_tian
 * blog: http://www.54tianzhisheng.cn/
 */
public class Main3 {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("zookeeper.connect", "localhost:2181");
        props.put("group.id", "metric-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "latest");

        SingleOutputStreamOperator<Student> student = env.addSource(new FlinkKafkaConsumer011<>(
                "student", // this Kafka topic must match the topic used by the tool class above
                new SimpleStringSchema(),
                props)).setParallelism(1)
                .map(string -> JSON.parseObject(string, Student.class)); // fastjson parses the JSON string into a Student object

        student.addSink(new SinkToMySQL()); // Data sink to MySQL

        env.execute("Flink add sink");
    }
}

Result

Run the Flink program, then run the KafkaUtils2.java tool class to send the data.

To confirm the inserts succeeded, let's check the database:

[Screenshot: the student table in MySQL with the inserted rows]

All 100 records we sent through Kafka have been inserted into the database, which proves that our SinkToMySQL works. Isn't it simple?

Project structure

In case the project structure isn't clear, here's a screenshot:

[Screenshot: project structure]

Finally

This article used a demo to show how to write a custom sink function that moves data from Kafka into MySQL. If your project has a different data source, you can swap in the corresponding source; likewise, if you need to sink to some other system or in some other way, the routine is the same: extend the RichSinkFunction abstract class and override the invoke() method.
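As a reference, here is a minimal sketch of that skeleton for a generic target system. MyClient is a hypothetical placeholder for whatever client library your storage system provides:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class MyCustomSink<T> extends RichSinkFunction<T> {
    private transient MyClient client; // hypothetical client for your target system

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        client = MyClient.connect(); // placeholder: open the connection once per parallel instance
    }

    @Override
    public void invoke(T value, Context context) throws Exception {
        client.write(value); // placeholder: write one record
    }

    @Override
    public void close() throws Exception {
        if (client != null) {
            client.close(); // release the connection when the task shuts down
        }
        super.close();
    }
}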

Follow me

Please credit the original address when reposting: http://www.54tianzhisheng.cn/2018/10/31/flink-create-sink/

I have also compiled some Flink learning materials and posted them on my WeChat official account. Add me on WeChat (zhisheng_tian) and reply with the keyword "Flink" to get them for free.

[WeChat official account QR code]

Related articles

1. Learn Flink from 0 to 1 – Introduction to Apache Flink

2. Learn Flink from 0 to 1 – Building a Flink 1.6.0 environment on Mac and running a simple program

3. Learn Flink from 0 to 1 – Flink configuration files explained

4. Learn Flink from 0 to 1 – Introduction to data source

5. Learn Flink from 0 to 1 – How to customize a data source?

6. Learn Flink from 0 to 1 – Introduction to data sink

7. Learn Flink from 0 to 1 – How to customize a data sink?
