Linkis 1.0.2 Installation and Use Guide

Time: 2022-05-14

Linkis installation and use guide

This article mainly guides users through the installation and deployment of Linkis
and DataSphereStudio, and tests the Hive, Spark and Flink engine scripts in the Scriptis function,
so that users can quickly get started with Linkis and understand its core capabilities. Functions such as data exchange, data service, data quality and task scheduling are not tested here; they can be installed and tested by following the official documentation.

1. Background

The company's self-developed big data middle-platform product helps users quickly collect and organize data, build data warehouses, and manage data services and data assets. Many big data components are involved, and each component has its own
API, which leads to high learning costs and difficult maintenance for developers. We therefore considered extracting a separate computing layer that is responsible for interfacing with the upper-layer applications, while the work of connecting to the underlying big data storage and computing engines is also handled by this computing layer.
Linkis provides exactly this capability: it connects multiple computing and storage engines (such as Spark, Flink, Hive, Python, etc.) and exposes unified REST / WebSocket / JDBC interfaces. Therefore, we installed Linkis to test its core functions.

2. Introduction

2.1 Linkis

As the computation middleware between upper-layer applications and underlying engines, Linkis lets upper-layer applications easily connect to and access MySQL / Spark / Hive / Presto / Flink through the standard REST / WebSocket / JDBC interfaces it provides,
while also enabling user resources such as variables, scripts, functions and resource files to be shared across upper-layer applications. As a computation middleware, Linkis provides powerful connectivity, reuse, orchestration, expansion and governance capabilities.
By decoupling the application layer from the engine layer, it simplifies the complex network call relationships, reduces overall complexity, and saves overall development and maintenance costs.

On August 2, 2021, WeBank's open source project Linkis officially passed the voting resolution of the Apache Software Foundation (ASF), the top international open source organization, by a unanimous vote and became an ASF incubator project.

2.1.1 core features

  • Rich support for underlying computing and storage engines.
    Currently supported computing/storage engines: Spark, Hive, Python, Presto, ElasticSearch, MLSQL, TiSpark, JDBC, Shell, Flink, etc.
    Supported scripting languages: SparkSQL, HiveQL, Python, Shell, PySpark, R, Scala, JDBC, etc.
  • Strong computation governance. Based on services such as Orchestrator, Label Manager and a customized Spring Cloud Gateway, Linkis provides multi-level-label-based cross-cluster / cross-IDC
    fine-grained routing, load balancing, multi-tenancy, traffic control, resource control and orchestration strategies (such as active-active and active-standby).
  • Full-stack computing/storage engine architecture support. It can receive, execute and manage tasks and requests for various computing and storage engines, including offline batch tasks, interactive query tasks, real-time streaming tasks and storage tasks.
  • Resource management capability. The ResourceManager not only manages resources for YARN and the Linkis EngineManager, but will also provide label-based multi-level resource allocation and recycling, giving the ResourceManager
    strong resource management capabilities across clusters and across computing resource types.
  • Unified context service. A context ID is generated for each computing task; user and system resource files (jar, zip, properties, etc.), result sets, parameter variables, functions, etc. are managed across users, systems and computing engines: set in one place, automatically referenced everywhere.
  • Unified materials. System- and user-level material management that can be shared and circulated across users and systems.

2.2 DataSphereStudio

DataSphere Studio (DSS for short) is a data application development and management framework developed by WeBank. Based on a plug-in integration framework design and on the computation middleware Linkis, it can easily integrate various upper-layer data application systems, making data development simple and easy to use. Under a unified UI,
with a workflow-style graphical drag-and-drop development experience, DataSphere Studio covers the whole data application development process, from data exchange, desensitization and cleaning, analysis and mining, quality checking, visualization and scheduled execution to data output and application.

DSS is highly integrated. At present, the integrated systems include:

3. Installation

Linkis 1.0.2 and DSS 1.0.0 are used for the installation and testing in this article. Since the installation was done during the internal testing stage, the DSS 1.0 + Linkis 1.0.2 one-click deployment package was used directly; you can download and install it as-is. This deployment package mainly includes Scriptis (the data development panel) and the management console (engine and microservice management, global history logs). For visualization, data quality, workflow scheduling, data exchange, data service and other functions, please refer to the official documentation; they are not covered in this article.

The components involved in this installation include Hadoop, Hive, Spark and Flink. The jar packages related to this environment are also placed on the network disk, including Hive's support for the Tez engine, Spark's support for Hive, and Flink's support for various connectors. In addition, the jar packages in the lib directories of the Hive and Flink engines are uploaded for reference, since some problems are caused by missing jar packages or version conflicts.

Link: https://pan.baidu.com/s/17g05rtfE_JSt93Du9TXVug 
Extraction code: zpep

 Computing layer
    ├─ linkis engine      # linkis engine plug-in packages
    │      flink_engine.zip
    │      hive_engine.zip   # supports tez
    │      spark_engine.zip
    ├─ local cluster      # local cluster configuration and jar packages
    │      flink_linkis.zip
    │      hive_linkis.zip
    │      spark_linkis.zip
    └─ udf                # custom function test jar packages
           hive_udf.jar
           flink_udf.jar

If you encounter problems during installation, you can first consult the official Q&A, which records common problems during installation and use. The address is: https://docs.qq.com/doc/DSGZhdnpMV3lTUUxq

Since this is only a functional test, the DSS and Linkis installed in this article are both stand-alone versions, without multi-active or multi-replica deployment. For multi-node deployment, please refer to the official document Cluster_Deployment.

3.1 version description of components involved

[Image: version description table of the components involved]

Because our cluster component versions differ from the engine component versions supported by Linkis by default, we need to compile the corresponding plug-ins ourselves: download the Linkis source code, modify the corresponding component versions and recompile.

3.2 environment dependent installation

As a computation middleware, Linkis relies on MySQL to store its own metadata, and the computing and storage engines are installed according to our needs. This article mainly uses the Hive, Spark and Flink engines; the Flink engine additionally involves
components such as Kafka, Redis, MongoDB and ElasticSearch. Before installing Linkis, make sure these components are already installed and working normally. The cluster used for this test is a non-secure cluster and Kerberos authentication is not enabled.

Note that the Spark package from the official website does not include Hive support, so Spark needs to be compiled with Hive support: specify the correct Hadoop and Scala versions, add Hive support, and make sure SparkSQL can run successfully locally.
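As a rough reference, one possible way to build such a distribution from the Spark source tree is sketched below; the profile names and versions are assumptions and must be adjusted to your own Hadoop and Scala versions:

# Hedged sketch: build a Spark distribution with Hive support (run inside the Spark source tree).
# Profiles and versions below are illustrative; match them to your cluster before building.
./dev/make-distribution.sh --name linkis-spark --tgz \
  -Pyarn -Phive -Phive-thriftserver \
  -Phadoop-2.7 -Dhadoop.version=2.8.5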

Theoretically, the server installing linkis only needs to ensure network interoperability with the server installing the above services.

3.3 preparation of installation package

You can use the DSS 1.0 + Linkis 1.0.2 one-click deployment package for installation, but because the engine plug-in versions are inconsistent with ours, you need to change the corresponding component versions globally and recompile Linkis. In addition, although the Flink
engine is already supported in version 1.0.2, it is not added to the installation package during compilation; it needs to be compiled separately and added as a new plug-in (a build sketch follows the compile commands below), which is also described in detail later.

The following are the compilation commands:

//To pull the code for the first time, you need to execute the following commands to complete initialization
mvn -N install
//Execute package command
mvn clean install -Dmaven.test.skip=true
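
Because the Flink plug-in has to be compiled separately, the following is a rough sketch of rebuilding a single engine plug-in module; the module path is an assumption based on the 1.0.2 source layout and should be verified in your own checkout:

# Hedged sketch: rebuild only the Flink engine plug-in after adjusting its component version in pom.xml.
cd linkis-engineconn-plugins/engineconn-plugins/flink   # assumed module path in the linkis 1.0.2 source
mvn clean install -Dmaven.test.skip=true
# Copy the packaged flink engine zip into linkis/lib/linkis-engineconn-plugins/ on the server afterwards.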

3.4 installation

3.4.1 installation environment inspection

Before the formal installation of linkis, some preparations need to be done:

  • The hardware environment check mainly ensures that the microservices can start normally and will not fail to start because of insufficient resources.

  • The dependency environment check mainly ensures that Linkis can be used normally after startup, so as to avoid script execution failures caused by missing commands.

  • The installation user check mainly checks whether the installation user exists and configures the corresponding permissions. Linkis supports specifying separate submission and execution users.

  • The installation command check mainly ensures a smooth installation; some commands are used during the installation process, so check them in advance.

  • The directory check mainly ensures that the cache directories configured for Linkis exist, to avoid directory-not-found errors during execution.

3.4.1.1 hardware environment inspection

By default, the heap memory of each microservice JVM is 512 MB. You can modify SERVER_HEAP_SIZE to uniformly adjust the heap memory of each microservice. If server resources are limited, it is recommended to set this parameter to 128 MB, as follows:

    vim ${LINKIS_HOME}/config/linkis-env.sh
    # java application default jvm memory.
    export SERVER_HEAP_SIZE="128M"

Installing the DSS and Linkis services will start 6 DSS microservices and 8 Linkis microservices. When Linkis executes Hive, Spark, Flink and other tasks, it will also start LINKIS-CG-ENGINECONN
microservices. Since this is a stand-alone installation, you need to make sure that all microservices can be started.

3.4.1.2 environment dependency inspection

Hadoop environment: The HADOOP_HOME and HADOOP_CONF_DIR environment variables need to be configured, and these two directories must exist. It must be possible to execute the hadoop fs -ls / command on the server where Linkis is installed.

Hive environment: The HIVE_HOME and HIVE_CONF_DIR environment variables need to be configured, and these two directories must exist. If the Hive configuration files cannot be read, metadata information may not be obtained correctly and the built-in Derby will be used as the Hive metastore.

Spark environment: The SPARK_HOME and SPARK_CONF_DIR environment variables need to be configured, and these two directories must exist. It must be possible to execute the spark-submit --version command on the server where the Spark engine plug-in is installed; Spark tasks are submitted to YARN through this command. To ensure SparkSQL's support for Hive, in addition to making sure the spark-sql command runs successfully locally, you also need to make sure SparkSQL on YARN works; the specific command is ./spark-sql --master yarn --deploy-mode client, then test a SQL task in the client.

Flink environment: The FLINK_HOME, FLINK_CONF_DIR and FLINK_LIB_DIR environment variables need to be configured, and these three directories must exist.

It is recommended to copy the Hadoop, Hive, Spark and Flink directories and their subdirectories directly to the corresponding nodes and configure the environment variables. After the environment variables are modified, they need to take effect: run source /etc/profile. The environment variables are referenced as follows:

export JAVA_HOME=/opt/jdk1.8
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
export HADOOP_HOME=/opt/install/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/opt/install/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export FLINK_HOME=/opt/install/flink
export FLINK_CONF_DIR=/opt/install/flink/conf
export FLINK_LIB_DIR=/opt/install/flink/lib
export SPARK_HOME=/opt/install/spark
export SPARK_CONF_DIR=$SPARK_HOME/conf
export PATH=$MAVEN_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SPARK_HOME/bin:$SQOOP_HOME/bin/:$FLINK_HOME/bin:$FLINKX_HOME/bin:$JAVA_HOME/bin:$PATH

Check whether the environment variable is effective:

sudo su - ${username}
echo ${JAVA_HOME}
echo ${FLINK_HOME}

MySQL environment: Because Linkis uses MySQL to store metadata and some of the query syntax it uses is incompatible with MySQL's default configuration, ONLY_FULL_GROUP_BY errors will appear, so sql_mode needs to be modified. In addition, the Flink engine test requires MySQL binlog to be enabled, which is convenient to do while checking the environment; if binlog is not needed, it can be left unchanged.

i. Modify the sql_mode configuration:

1. View the current sql_mode
select @@global.sql_mode;
2. Modify sql_mode
vim /etc/my.cnf
sql_mode=STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
3. Restart the MySQL service
service mysqld restart
service mysqld status

ii. Enable binlog

1. Modify the configuration: vim /etc/my.cnf and add the following settings
   server_id=1
   log_bin=mysql-bin
   binlog_format=ROW
   expire_logs_days=30
2. Restart MySQL service
service mysqld restart
service mysqld status
3. View status
show VARIABLES LIKE 'log_bin';
show global variables like "binlog%";
3.4.1.3 installation user inspection

For example: the deployment user is the hadoop account.

First check whether the hadoop user exists in the system. After the cluster is set up, the hadoop user may already exist; if so, you can grant permissions directly. If it does not exist, create the user first and then grant permissions.

  • Check whether the hadoop user exists. The command is: cat /etc/passwd | grep hadoop
httpfs:x:983:976:Hadoop HTTPFS:/var/lib/hadoop-httpfs:/bin/bash
mapred:x:982:975:Hadoop MapReduce:/var/lib/hadoop-mapreduce:/bin/bash
kms:x:979:972:Hadoop KMS:/var/lib/hadoop-kms:/bin/bash
  • If it does not exist, create a hadoop user and add it to the hadoop user group. The command is: sudo useradd hadoop -g hadoop

  • Grant sudo permission to the hadoop user. The command is: vi /etc/sudoers, then add the line hadoop ALL=(ALL) NOPASSWD: ALL. Because the file is read-only, use wq! to force-save.

  • Modify the environment variables of the installation user: vim /home/hadoop/.bashrc, and configure the environment variables as follows:

export JAVA_HOME=/opt/jdk1.8
export HADOOP_HOME=/opt/install/hadoop
export HADOOP_CONF_DIR=/opt/install/hadoop/etc/hadoop
export HIVE_HOME=/opt/install/hive
export HIVE_CONF_DIR=/opt/install/hive/conf
export FLINK_HOME=/opt/install/flink
export FLINK_CONF_DIR=/opt/install/flink/conf
export FLINK_LIB_DIR=/opt/install/flink/lib
export SPARK_HOME=/opt/install/spark
export SPARK_CONF_DIR=/opt/install/spark/conf
3.4.1.4 installation command inspection

Command tools required by Linkis (before the formal installation, the script automatically detects whether these commands are available; if a command does not exist, the script tries to install it automatically, and if that fails, the user needs to install the following basic shell command tools manually):

  • telnet
  • tar
  • sed
  • dos2unix
  • yum
  • java
  • unzip
  • expect

You can view the commands checked in the bin/checkEnv.sh script; the command checks for unnecessary functions can be commented out, for example the check of the python command.

3.4.1.5 directory inspection

The Linkis service requires the user to configure the local engine directory ENGINECONN_ROOT_PATH and the log cache directory HDFS_USER_ROOT_PATH; logs can be cached either on HDFS or locally. If an HDFS path is configured, logs and execution results are written to HDFS by default.

ENGINECONN_ROOT_PATH is a local directory. Users normally need to create it in advance and authorize it with chmod -R 777 <directory>; in Linkis 1.0.2 there is no need to create and authorize it in advance, as the scripts and programs do this automatically.

HDFS_USER_ROOT_PATH is a path on HDFS; it needs to be created in advance and authorized with hadoop fs -chmod -R 777 <directory>. A short sketch of both commands follows.
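
A minimal sketch of preparing both directories, assuming the default paths used in the example config.sh later in this article:

# Hedged sketch: pre-create and authorize the engine directory and the HDFS log cache directory.
sudo mkdir -p /appcom/tmp && sudo chmod -R 777 /appcom/tmp                # ENGINECONN_ROOT_PATH
hadoop fs -mkdir -p /tmp/linkis && hadoop fs -chmod -R 777 /tmp/linkis    # HDFS_USER_ROOT_PATH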

3.4.2 unzip the installation package

Use the unzip command to decompress the package; it contains the installation packages of Linkis, DSS and web. Each component also has its own installation and configuration scripts. The general principle is to modify the configuration files in the conf directory as needed; after the configuration is modified, use the installation and startup scripts in the bin directory to complete the installation and startup.
Users can use the one-click install command, or decompress each package and install it separately. If you use the one-click installation, the unified configuration may not be fully synchronized to the Linkis, DSS and web components, so check carefully before starting.

The decompression directory is as follows:

│  wedatasphere-dss-1.0.0-dist.tar.gz   # dss back-end installation package; automatically decompressed by the one-click install command
│  wedatasphere-dss-web-1.0.0-dist.zip   # web front-end installation package; automatically decompressed by the one-click install command
│  wedatasphere-linkis-1.0.2-combined-package-dist.tar.gz   # linkis back-end installation package; automatically decompressed by the one-click install command
│
├─bin
│      checkEnv.sh    # pre-installation command check script; checks for unnecessary commands can be commented out and skipped
│      install.sh     # one-click install command; decompresses the packages, creates the necessary directories, imports metadata, etc.
│      replace.sh     # used internally by the scripts to apply the unified configuration
│      start-all.sh   # one-click start of all microservices: linkis first, then the DSS back end, then the DSS front end
│      stop-all.sh    # one-click stop of all microservices
│
└─conf
        config.sh     # unified configuration script, applied to each component's microservices through the replace.sh script
        db.sh         # unified database configuration script, including the linkis metabase and hive metabase configuration

3.4.3 modify configuration

Users need to configure the metadata database connection information of Linkis and Hive in conf/db.sh, and configure the installation and startup information of DSS and Linkis in conf/config.sh. Since there are more than a dozen microservices, pay special attention when configuring the microservice port numbers to avoid ports that are already occupied.

View port number occupancy:

#View all port numbers
netstat -ntlp
#Check whether it is currently occupied
netstat -tunlp |grep  8080

db.sh configuration example:

## for DSS-Server and Eventchecker APPJOINT
MYSQL_HOST=host
MYSQL_PORT=port
MYSQL_DB=db
MYSQL_USER=user
MYSQL_PASSWORD=password
##Hive configuration
HIVE_HOST=host
HIVE_PORT=port
HIVE_DB=db
HIVE_USER=user
HIVE_PASSWORD=password

config.sh configuration example:

### deploy user
deployUser=hadoop
### Linkis_VERSION
LINKIS_VERSION=1.0.2
### DSS Web
DSS_NGINX_IP=127.0.0.1
DSS_WEB_PORT=8088
### DSS VERSION
DSS_VERSION=1.0.0
##################################################################################
### Generally local directory
WORKSPACE_USER_ROOT_PATH=file:///tmp/linkis/ 
### User's root hdfs path
HDFS_USER_ROOT_PATH=hdfs:///tmp/linkis 
### Path to store job ResultSet:file or hdfs path
RESULT_SET_ROOT_PATH=hdfs:///tmp/linkis 
### Path to store started engines and engine logs, must be local
ENGINECONN_ROOT_PATH=/appcom/tmp
###Engine environment variable configuration
HADOOP_CONF_DIR=/opt/install/hadoop/etc/hadoop
HIVE_CONF_DIR=/opt/install/hive/conf
SPARK_CONF_DIR=/opt/install/spark/conf
##YARN REST URL  spark engine required
YARN_RESTFUL_URL=http://127.0.0.1:8088
### for install
LINKIS_PUBLIC_MODULE=lib/linkis-commons/public-module
##Microservice port configuration
###  You can access it in your browser at the address below:http://${EUREKA_INSTALL_IP}:${EUREKA_PORT}
#LINKIS_EUREKA_INSTALL_IP=127.0.0.1         # Microservices Service Registration Discovery Center
LINKIS_EUREKA_PORT=20303
###  Gateway install information
#LINKIS_GATEWAY_INSTALL_IP=127.0.0.1
LINKIS_GATEWAY_PORT=8001
### ApplicationManager
#LINKIS_MANAGER_INSTALL_IP=127.0.0.1
LINKIS_MANAGER_PORT=8101
### EngineManager
#LINKIS_ENGINECONNMANAGER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONNMANAGER_PORT=8102
### EnginePluginServer
#LINKIS_ENGINECONN_PLUGIN_SERVER_INSTALL_IP=127.0.0.1
LINKIS_ENGINECONN_PLUGIN_SERVER_PORT=8103
### LinkisEntrance
#LINKIS_ENTRANCE_INSTALL_IP=127.0.0.1
LINKIS_ENTRANCE_PORT=8104
###  publicservice
#LINKIS_PUBLICSERVICE_INSTALL_IP=127.0.0.1
LINKIS_PUBLICSERVICE_PORT=8105
### cs
#LINKIS_CS_INSTALL_IP=127.0.0.1
LINKIS_CS_PORT=8108
##########Linkis micro service configuration completed##### 
################### The install Configuration of all DataSphereStudio's Micro-Services #####################
#Used to store temporary zip package files published to schedule
WDS_SCHEDULER_PATH=file:///appcom/tmp/wds/scheduler
### This service is used to provide dss-framework-project-server capability.
#DSS_FRAMEWORK_PROJECT_SERVER_INSTALL_IP=127.0.0.1
DSS_FRAMEWORK_PROJECT_SERVER_PORT=9007
### This service is used to provide dss-framework-orchestrator-server capability.
#DSS_FRAMEWORK_ORCHESTRATOR_SERVER_INSTALL_IP=127.0.0.1
DSS_FRAMEWORK_ORCHESTRATOR_SERVER_PORT=9003
### This service is used to provide dss-apiservice-server capability.
#DSS_APISERVICE_SERVER_INSTALL_IP=127.0.0.1
DSS_APISERVICE_SERVER_PORT=9004
### This service is used to provide dss-workflow-server capability.
#DSS_WORKFLOW_SERVER_INSTALL_IP=127.0.0.1
DSS_WORKFLOW_SERVER_PORT=9005
### dss-flow-Execution-Entrance
### This service is used to provide flow execution capability.
#DSS_FLOW_EXECUTION_SERVER_INSTALL_IP=127.0.0.1
DSS_FLOW_EXECUTION_SERVER_PORT=9006
### This service is used to provide dss-datapipe-server capability.
#DSS_DATAPIPE_SERVER_INSTALL_IP=127.0.0.1
DSS_DATAPIPE_SERVER_PORT=9008
##########DSS micro service configuration completed#####
#################################################################################
## java application minimum jvm memory
export SERVER_HEAP_SIZE="128M"
##Sendemail configuration only affects the email function in DSS workflow
EMAIL_HOST=smtp.163.com
EMAIL_PORT=25
[email protected]
EMAIL_PASSWORD=xxxxx
EMAIL_PROTOCOL=smtp

3.4.4 installation directory and configuration inspection

i. Installation

After modifying the configuration, use the one-click install command bin/install.sh to complete the installation.
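
A minimal sketch of this step; the deployment package directory name is illustrative:

# Hedged sketch: run the one-click installation from the decompressed deployment package directory.
cd dss_linkis    # illustrative name of the decompressed one-click deployment package
sh bin/install.sh
# The script typically prompts before (re)initializing the databases configured in conf/db.sh.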

After installation, three directories are generated: linkis, dss and web. The directory tree of each is listed below; only the main directories are shown.

The linkis directory tree is as follows:

├── linkis
│   ├── bin # mainly stores commands related to linkis functions, such as the clients for executing hive and spark tasks
│   │   ├── linkis-cli
│   │   ├── linkis-cli-hive
│   │   ├── linkis-cli-spark-sql
│   │   ├── linkis-cli-spark-submit
│   │   └── linkis-cli-start
│   ├── conf # configuration files of the linkis microservices
│   │   ├── application-eureka.yml
│   │   ├── application-linkis.yml
│   │   ├── linkis-cg-engineconnmanager.properties
│   │   ├── linkis-cg-engineplugin.properties
│   │   ├── linkis-cg-entrance.properties
│   │   ├── linkis-cg-linkismanager.properties
│   │   ├── linkis-cli
│   │   │   ├── linkis-cli.properties
│   │   │   └── log4j2.xml
│   │   ├── linkis-env.sh
│   │   ├── linkis-mg-gateway.properties
│   │   ├── linkis.properties
│   │   ├── linkis-ps-cs.properties
│   │   ├── linkis-ps-publicservice.properties
│   │   ├── log4j2.xml
│   │   └── token.properties
│   ├── db # SQL scripts for linkis metadata initialization
│   │   ├── linkis_ddl.sql
│   │   ├── linkis_dml.sql
│   ├── lib # dependency packages of the linkis modules
│   │   ├── linkis-commons
│   │   ├── linkis-computation-governance
│   │   │   ├── linkis-cg-engineconnmanager
│   │   │   ├── linkis-cg-engineplugin
│   │   │   ├── linkis-cg-entrance
│   │   │   ├── linkis-cg-linkismanager
│   │   │   └── linkis-client
│   │   │       └── linkis-cli
│   │   ├── linkis-engineconn-plugins
│   │   │   ├── appconn
│   │   │   ├── flink
│   │   │   ├── hive
│   │   │   ├── python
│   │   │   ├── shell
│   │   │   └── spark
│   │   ├── linkis-public-enhancements
│   │   │   ├── linkis-ps-cs
│   │   │   └── linkis-ps-publicservice
│   │   └── linkis-spring-cloud-services
│   │       ├── linkis-mg-eureka
│   │       └── linkis-mg-gateway
│   ├── LICENSE
│   ├── README_CN.md
│   ├── README.md
│   └── sbin # linkis startup scripts, used to start the various microservices
│       ├── common.sh
│       ├── ext
│       │   ├── linkis-cg-engineconnmanager
│       │   ├── linkis-cg-engineplugin
│       │   ├── linkis-cg-entrance
│       │   ├── linkis-cg-linkismanager
│       │   ├── linkis-common-start
│       │   ├── linkis-mg-eureka
│       │   ├── linkis-mg-gateway
│       │   ├── linkis-ps-cs
│       │   └── linkis-ps-publicservice
│       ├── linkis-daemon.sh
│       ├── linkis-start-all.sh
│       └── linkis-stop-all.sh

The DSS directory tree is as follows:

├── dss
│   ├── bin # dss installation script directory
│   │   ├── appconn-install.sh
│   │   ├── checkEnv.sh
│   │   ├── excecuteSQL.sh
│   │   └── install.sh
│   ├── conf # configuration directory of the dss microservices
│   │   ├── application-dss.yml
│   │   ├── config.sh
│   │   ├── db.sh
│   │   ├── dss-apiservice-server.properties
│   │   ├── dss-datapipe-server.properties
│   │   ├── dss-flow-execution-server.properties
│   │   ├── dss-framework-orchestrator-server.properties
│   │   ├── dss-framework-project-server.properties
│   │   ├── dss.properties
│   │   ├── dss-workflow-server.properties
│   │   ├── log4j2.xml
│   │   ├── log4j.properties
│   │   └── token.properties
│   ├── dss-appconns # directory for systems integrated with dss, such as visualization, data quality, scheduling, etc.
│   ├── lib # dependency packages of the dss microservices
│   ├── README.md
│   └── sbin # dss microservice startup script directory, supporting both one-click startup and individual startup
│       ├── common.sh
│       ├── dss-daemon.sh
│       ├── dss-start-all.sh
│       ├── dss-stop-all.sh
│       └── ext
│           ├── dss-apiservice-server
│           ├── dss-datapipe-server
│           ├── dss-flow-execution-server
│           ├── dss-framework-orchestrator-server
│           ├── dss-framework-project-server
│           └── dss-workflow-server

The web directory tree is as follows:

├── web
│   ├── config.sh # configuration script of the web front end, e.g. the gateway address
│   ├── dist # dss front-end static files
│   ├── dss # linkis front-end static files (the management console is provided by linkis)
│   │   └── linkis
│   └── install.sh # installation and startup script; installs and configures nginx

ii. Check configuration

Configuration check: after installing with the one-click install command, some configurations are not completely overwritten and need to be checked by the user. The following problems were encountered during installation (a quick check is sketched after the list):

1. The gateway address in DSS is configured incorrectly: modify the dss.properties configuration file to set the correct gateway address.
2. In config.sh of the web directory, the gateway address is configured incorrectly and needs to be corrected by the user.
3. In Linkis 1.0.2 the engine directory is authorized automatically before the engine is created, and the proxy needs to be enabled: modify linkis-cg-engineconnmanager.properties and add wds.linkis.storage.enable.io.proxy=true.
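
A minimal sketch of this check, assuming the default install directories generated above:

# Hedged sketch: verify the gateway address and the engine proxy setting after install.sh finishes.
grep -i "gateway" dss/conf/dss.properties
grep -i "gateway" web/config.sh
# If either address differs from the gateway IP/port set in conf/config.sh, correct it before starting.
echo "wds.linkis.storage.enable.io.proxy=true" >> linkis/conf/linkis-cg-engineconnmanager.properties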

3.4.5 start service

i. Start service

After completing the installation and configuration check steps, there are two ways to start the microservice:

One way is to use the one-click start script bin/start-all.sh to start all microservices, including the Linkis back end, the DSS back end and the web front end.

The other way is to enter each installation directory and start the microservices yourself (as sketched below). First start the Linkis services with linkis/sbin/linkis-start-all.sh; for Linkis you can of course also start and stop each microservice individually. Then start the DSS
services with dss/sbin/dss-start-all.sh. Finally start the web service with web/install.sh, which automatically checks whether nginx is installed and, if not, downloads, installs and configures it. Also note that the web/install.sh script configures nginx by overwriting: if multiple web services need to be started and multiple nginx listeners need to be configured on one server, you need to modify the script to avoid the nginx
configuration being overwritten.
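
A minimal sketch of the second approach, using the directory layout generated by install.sh:

# Hedged sketch: start each component separately, in dependency order.
sh linkis/sbin/linkis-start-all.sh   # linkis back-end microservices
sh dss/sbin/dss-start-all.sh         # dss back-end microservices
sh web/install.sh                    # installs/configures nginx and deploys the front end
# A single linkis microservice can be restarted with linkis/sbin/linkis-daemon.sh, e.g.:
# sh linkis/sbin/linkis-daemon.sh restart cg-engineconnmanager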

ii. Check whether the startup is successful

You can view the startup status of the Linkis and DSS microservices in the Eureka interface. When no task is running, Linkis has 8 microservices and DSS has 6; when a Scriptis task is executed, Linkis
also starts the LINKIS-CG-ENGINECONN service. The default microservice log directories are given below:

// 1. Linkis microservice log directory. The logs of the eight microservices started by default are all here; you can view each microservice's log accordingly
linkis/logs
// 2. The engine uses `ENGINECONN_ROOT_PATH` as its root directory. In general, if an engine fails to start, check the linkis-cg-engineconnmanager log; if it starts successfully, check the engine log; if the engine starts but the task fails, check the engine log first, and if there is no specific information, check the yarn logs for the concrete error
${ENGINECONN_ROOT_PATH}/hadoop/workDir/UUID/logs
// 3. DSS microservice logs. The six microservices started by default all log here; you can view each microservice's log accordingly
dss/logs
// 4. For front-end problems, open the browser's debugging tools, find the failing request, identify the microservice interface from it, and then check that microservice's log in the directories above

Eureka microservice interface:

[Image: DSS microservices registered in Eureka]

[Image: Linkis microservices registered in Eureka]

iii. Access via Google Chrome

Please use Google Chrome to access the front-end address http://DSS_NGINX_IP:DSS_WEB_PORT (the startup log prints this access address). When logging in, the administrator's user name and password are both the deployment user name: if the deployment user is hadoop, the administrator's
user name / password is hadoop / hadoop.

LDAP information can be configured in linkis-mg-gateway.properties to connect to an internal LDAP service.

This trial is based on DSS 1.0, so many functions are limited:

  • After logging in, the main function panels and cases are displayed on the home page;

[Image: home page]

  • The Scriptis panel is the focus of this installation and test; it is used to write hive, spark, Flink and other scripts and to manage functions;

[Image: Scriptis]

  • The management console is integrated from the linkis front end and mainly includes global history (script execution logs), resource management (engine resource usage, shown when engines are started), parameter configuration (yarn resource queue, engine resource configuration, etc.), global variables (global variable configuration), ECM management (ECM instance management, which can also manage the engines under an ECM) and microservice management (the microservice management panel).

[Image: management console]

3.4.6 function test

This article mainly tests the hive, spark and Flink engines. The Flink engine is not integrated in the default Linkis installation, so the hive and spark engines are tested first. Custom functions are also tested.

Some errors encountered during use and their solutions are described below in the Best Practices section; if you hit an error, you can refer to Best Practices.

3.4.6.1 Hive

i. Hive configuration file

Hive supports multiple computation engines, such as MR, Tez and Spark; the MR engine is used by default and must be specified in hive-site.xml. The configuration below was used for this test and has not been optimized; it is for reference only:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://host:9083</value>
    </property>
    <property>
        <name>spark.master</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://host:3306/hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>MySQL5.7</value>
    </property>
    <property>
        <name>hive.auto.convert.join</name>
        <value>false</value>
        <description>Enables the optimization about converting common join into mapjoin</description>
    </property>
</configuration>

ii. Script test

Create a new script in the Scriptis panel and select hive as the script type. The test should use a slightly more complex SQL statement, so that Hive does not answer it as a purely local query without starting an MR job. Reference script:

show tables;
select name, addr, id
from linkis_1
group by name, addr, id
order by id;
select a.name, a.addr, b.phone
from linkis_1 a
         left join linkis_2 b on a.id = b.id
group by a.name, a.addr, b.phone
order by a.name;

iii. Benchmark test

If you need a benchmark, you can refer to hive-testbench, a benchmark framework based on the TPC-DS and TPC-H data generators and sample queries. TPC-DS uses star, snowflake and other multidimensional data models; it contains 7 fact tables and 17 dimension tables, with an average of 18 columns per table. Its workload includes 99 SQL queries, covering the core parts of SQL99, SQL:2003 and OLAP.
This test set contains complex applications such as statistics, report generation, online query and data mining over large data sets; the data and values used in the test are skewed and consistent with real data. TPC-DS is arguably the best test set for objectively comparing different Hadoop versions and SQL-on-Hadoop technologies.
This benchmark has the following main features (a setup sketch follows the list):

  • A total of 99 test cases, following the SQL99 and SQL:2003 syntax standards; the SQL cases are relatively complex
  • The amount of data analyzed is large, and the test cases answer real business questions
  • The test cases include various business models (analysis report, iterative online analysis, data mining, etc.)
  • Almost all test cases have high IO load and CPU computing requirements
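
The sketch below shows one way to prepare such a TPC-DS data set with hive-testbench; the script names follow the public hive-testbench repository and the scale factor is illustrative:

# Hedged sketch: generate TPC-DS data and Hive tables with hive-testbench.
git clone https://github.com/hortonworks/hive-testbench.git
cd hive-testbench
./tpcds-build.sh      # build the TPC-DS data generator
./tpcds-setup.sh 10   # generate roughly 10 GB of data and create the Hive tables
# The queries under sample-queries-tpcds/ can then be pasted into a Scriptis hive script.
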
3.4.6.2 Spark

Linkis's Spark engineconn plug-in basically requires no changes. The main points are: first, when compiling the Spark plug-in, choose the same Scala and JDK versions as the Spark cluster environment; second, the Spark
cluster environment must be configured correctly. If the following steps can be executed correctly locally, the Linkis plug-in can generally be executed correctly as well.

i. Local test

// 1. Make sure a spark job can be submitted successfully. The test command is as follows:

./spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--executor-memory 1G \
--total-executor-cores 2 \
/opt/install/spark/examples/jars/spark-examples_2.11-2.4.3.jar \
100

// 2. Make sure spark on hive can be executed successfully in yarn mode. The default startup mode is local mode, which succeeds as long as the hive dependencies exist locally; in yarn mode, the jar packages under spark's jars directory need to be uploaded to HDFS

./spark-sql  --master yarn --deploy-mode client 

//You can execute the following SQL to test
show tables;
select name,addr,id from linkis_1 group by name,addr,id order by id;
select a.name,a.addr,b.phone from linkis_1 a left join linkis_2 b on a.id=b.id group by a.name,a.addr,b.phone  order by a.name;

ii. Spark configuration files

  • spark-env.sh
#!/usr/bin/env bash
SPARK_CONF_DIR=/opt/install/spark/conf
HADOOP_CONF_DIR=/opt/install/hadoop/etc/hadoop
YARN_CONF_DIR=/opt/install/hadoop/etc/hadoop
SPARK_EXECUTOR_CORES=3
SPARK_EXECUTOR_MEMORY=4g
SPARK_DRIVER_MEMORY=2g
  • spark-defaults.conf
spark.yarn.historyServer.address=host:18080
spark.yarn.historyServer.allowTracking=true
spark.eventLog.dir=hdfs://host/spark/eventlogs
spark.eventLog.enabled=true
spark.history.fs.logDirectory=hdfs://host/spark/hisLogs
spark.yarn.jars=hdfs://host/spark-jars/*

iii. Linkis test

Create a new script in the Scriptis panel and select sql as the script type. The hive-testbench based test also provides spark query statements, which can be used as a reference for testing.

show tables;
select name, addr, id
from linkis_1
group by name, addr, id
order by id;
select a.name, a.addr, b.phone
from linkis_1 a
         left join linkis_2 b on a.id = b.id
group by a.name, a.addr, b.phone
order by a.name;

Similarly, you can also select the Scala script type and execute SQL statements directly through the sqlContext initialized in the script environment.

val sql = "show tables"
val df = sqlContext.sql(sql)
df.show()
3.4.6.3 UDF function

Linkis provides a convenient way for users to implement custom functions and use them in scripts. Currently the hive and spark engine plug-ins support custom functions; after testing, the Flink engine does not support creating functions for the time being, as the current plug-in version only supports part of the syntax.

Functions created through the DSS console are temporary by default: once the engine plug-in is started, they are valid in the current session.

  • i. Usage process
1. Develop the UDF function locally and package it into a jar.
2. Upload the jar package on the Scriptis interface of DSS.
3. Create a function in the DSS interface, specifying the jar package, the function name and the function format (fill in the main class).
4. Choose whether to load it (loaded by default). When the engine is initialized, a temporary function is created; adding or modifying functions requires restarting the engine to take effect.
[Image: UDF creation step 1]

[Image: UDF creation step 2]
  • ii. Loading process
1. When the EngineConn is created in EngineConnServer, there is execution logic both before and after engine creation.
2. The afterExecutionExecute method is executed: all UDF functions to be loaded are obtained from the UDFLoadEngineConnHook, the UDF registration format is checked, and the functions are registered one by one.
3. From the loading process it can be seen that the life cycle of a UDF function is the life cycle of the engine; after a UDF function is modified, the engine must be restarted for the change to take effect.
4. If a UDF function is selected to be loaded, its jar package is placed on the engine's classpath and registered when the engine is created; if it is not loaded, the jar package is not on the classpath and the function is not registered. In both cases they are session-level functions by default.
5. The detailed loading process can be traced by searching for the UDFInfo keyword and reading the corresponding logic.
  • iii. API call

If you do not create or modify functions through the DSS console, you can refer to the UDFApi to view the list of supported APIs. Here is an example (a curl sketch follows the request body):

POST http://gateway_ip:8001/api/rest_j/v1/udf/update

{"isShared":false,"udfInfo":{"id":4,"udfName":"testudf2","description":"7777","path":"file:///tmp/linkis/hadoop/udf/hive/hive_function.jar","shared":false,"useFormat":"testudf2()","load":true,"expire":false,"registerFormat":"create temporary function testudf2 as \" com.troila.hive.udf.MaskFromEnds\"","treeId":9,"udfType":0}}
3.4.6.4 linkis debugging mode

In addition to script debugging on the DSS console, the client mode, the SDK and other methods can also be used.

3.4.6.4.1 client mode

Use example:

./linkis-cli -engineType spark-2.4.3 -codeType sql -code "select count(*) from default.ct_test;"  -submitUser hadoop -proxyUser hadoop 
./linkis-cli -engineType hive-2.3.3 -codeType sql -code "select count(*) from default.ct_test;"  -submitUser hadoop -proxyUser hadoop 
./linkis-cli -engineType hive-2.3.3 -codeType sql -code "select * from ${table};" -varMap table=default.ct_test  -submitUser hadoop -proxyUser hadoop
3.4.6.4.2 SDK mode
  • Introduce dependency

<dependency>
    <groupId>com.webank.wedatasphere.linkis</groupId>
    <artifactId>linkis-computation-client</artifactId>
    <version>${linkis.version}</version>
    <exclusions>
        <exclusion>
            <artifactId>commons-codec</artifactId>
            <groupId>commons-codec</groupId>
        </exclusion>
        <exclusion>
            <artifactId>slf4j-api</artifactId>
            <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
            <artifactId>commons-beanutils</artifactId>
            <groupId>commons-beanutils</groupId>
        </exclusion>
    </exclusions>
</dependency>
  • Scala code example
package com.troila.bench.linkis.spark

import com.webank.wedatasphere.linkis.common.utils.Utils
import com.webank.wedatasphere.linkis.httpclient.dws.authentication.StaticAuthenticationStrategy
import com.webank.wedatasphere.linkis.httpclient.dws.config.DWSClientConfigBuilder
import com.webank.wedatasphere.linkis.manager.label.constant.LabelKeyConstant
import com.webank.wedatasphere.linkis.ujes.client.UJESClient
import com.webank.wedatasphere.linkis.ujes.client.request.{JobSubmitAction, ResultSetAction}
import org.apache.commons.io.IOUtils
import java.util
import java.util.concurrent.TimeUnit

object ScalaClientTest {

  def main(args: Array[String]): Unit = {
    val user = "hadoop"
    val username = "hadoop"
    val password = "hadoop"
    val yarnQueue = "default"
    val executeCode = "select name,addr,id from linkis_1 group by name,addr,id order by id"
    val gatewayUrl = "http://gateway_ip:8001"

    // 1. Configure DWSClientConfigBuilder and obtain a DWSClientConfig from it
    val clientConfig = DWSClientConfigBuilder.newBuilder()
      .addServerUrl(gatewayUrl) // address of the linkis gateway, e.g. http://{ip}:{port}
      .connectionTimeout(30000) // client connection timeout
      .discoveryEnabled(false).discoveryFrequency(1, TimeUnit.MINUTES) // whether to enable registry discovery; if enabled, newly started gateways are discovered automatically
      .loadbalancerEnabled(true) // whether to enable load balancing; meaningless if registry discovery is disabled
      .maxConnectionSize(5) // maximum number of concurrent connections
      .retryEnabled(false).readTimeout(30000) // whether to retry on failure; read timeout
      .setAuthenticationStrategy(new StaticAuthenticationStrategy()) // linkis authentication strategy
      .setAuthTokenKey(username).setAuthTokenValue(password) // authentication key (usually the user name) and value (usually the password)
      .setDWSVersion("v1").build() // version of the linkis background protocol, currently v1

    // 2. Obtain a UJESClient from the DWSClientConfig
    val client = UJESClient(clientConfig)

    try {
      // 3.  Start code execution
      println("user : " + user + ", code : [" + executeCode + "]")
      val startupMap = new java.util.HashMap[String, Any]()
      startupMap.put("wds.linkis.yarnqueue", yarnQueue) // startup parameter configuration
      //Specify label
      val labels: util.Map[String, Any] = new util.HashMap[String, Any]
      //Add the tags that this execution depends on, such as enginelabel
      labels.put(LabelKeyConstant.ENGINE_TYPE_KEY, "spark-2.4.3")
      labels.put(LabelKeyConstant.USER_CREATOR_TYPE_KEY, "hadoop-IDE")
      labels.put(LabelKeyConstant.CODE_TYPE_KEY, "sql")
      //Specify source
      val source: util.Map[String, Any] = new util.HashMap[String, Any]
      //Parameter substitution
      val varMap: util.Map[String, Any] = new util.HashMap[String, Any]
      //      varMap.put("table", "linkis_1")

      val jobExecuteResult = client.submit(JobSubmitAction.builder
        .addExecuteCode(executeCode)
        .setStartupParams(startupMap)
        .setUser(user) // job submitting user; used for user-level multi-tenant isolation
        .addExecuteUser(user) // actual execution user
        .setLabels(labels)
        .setSource(source)
        .setVariableMap(varMap)
        .build)
      println("execId: " + jobExecuteResult.getExecID + ", taskId: " + jobExecuteResult.taskID)

      // 4.  Gets the execution status of the script
      var jobInfoResult = client.getJobInfo(jobExecuteResult)
      val sleepTimeMills: Int = 1000
      while (!jobInfoResult.isCompleted) {
        // 5.  Get the progress of script execution
        val progress = client.progress(jobExecuteResult)
        val progressInfo = if (progress.getProgressInfo != null) progress.getProgressInfo.toList else List.empty
        println("progress: " + progress.getProgress + ", progressInfo: " + progressInfo)
        Utils.sleepQuietly(sleepTimeMills)
        jobInfoResult = client.getJobInfo(jobExecuteResult)
      }
      if (!jobInfoResult.isSucceed) {
        println("Failed to execute job: " + jobInfoResult.getMessage)
        throw new Exception(jobInfoResult.getMessage)
      }
      // 6.  Get the job information of the script
      val jobInfo = client.getJobInfo(jobExecuteResult)
      // 7.  Get the list of result sets (multiple result sets will be generated if users submit multiple SQL at one time)
      val resultSetList = jobInfoResult.getResultSetList(client)
      println("All result set list:")
      resultSetList.foreach(println)
      val oneResultSet = jobInfo.getResultSetList(client).head
      // 8.  Obtain the specific result set through a result set information
      val fileContents = client.resultSet(ResultSetAction.builder().setPath(oneResultSet).setUser(jobExecuteResult.getUser).build()).getFileContent
      println("First fileContents: ")
      println(fileContents)
    } catch {
      case e: Exception => {
        e.printStackTrace()
      }
    }
    IOUtils.closeQuietly(client)
  }
}

3.5 extended functions

This part mainly covers the practice of hive on tez, including support for LLAP. In addition, we compiled the Flink engine plug-in, tested the Kafka, ElasticSearch, MySQL, CDC and other connectors, and implemented
sink connectors for Redis and MongoDB.

3.5.1 hive supports tez engine

There are two main changes needed to support the Tez engine: first, the Hive cluster environment must support Tez; second, the Linkis engine plug-in also needs the corresponding dependencies. If an error is reported after switching to the Tez engine, it is mostly caused by missing jar packages or a guava package conflict. The jar packages used in testing
have been uploaded to the network disk for reference.

Users need to download and compile Tez, complete the local configuration, and start the Hive client locally to make sure it runs with the Tez engine and executes SQL successfully. This process is not repeated in this article.

3.5.1.1 linkis operation

To support the Tez engine, you need to copy the tez-* jar packages to the engine dependency path of Linkis and then restart the ECM service.

In early testing you may need to adjust the jar packages and restart the ECM service frequently, which makes the whole process slow. During the test stage you can copy the jar packages directly into the engineConnPublickDir directory: after the ECM starts, the engine's lib dependencies and conf are
placed in this public directory, and when an engine starts it creates soft links from this directory, so you can copy the required jar packages directly into it without restarting the ECM service. After the test succeeds, remember to put the jar
packages into the linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.7/lib directory to avoid losing them when the service is restarted (a copy sketch follows the jar list below).

List of jar packages to be copied:

// linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.7/lib
// When the engine starts for the first time, a lib.zip cache package is generated in this directory. If the jar packages under lib are modified but the compressed package is not updated, the latest jar packages cannot be used
tez-api-0.9.2.jar
tez-build-tools-0.9.2.jar
tez-common-0.9.2.jar
tez-dag-0.9.2.jar
tez-examples-0.9.2.jar
tez-ext-service-tests-0.9.2.jar
tez-history-parser-0.9.2.jar
tez-javadoc-tools-0.9.2.jar
tez-job-analyzer-0.9.2.jar
tez-mapreduce-0.9.2.jar
tez-protobuf-history-plugin-0.9.2.jar
tez-runtime-internals-0.9.2.jar
tez-runtime-library-0.9.2.jar
tez-tests-0.9.2.jar
tez-yarn-timeline-history-0.9.2.jar
tez-yarn-timeline-history-with-acls-0.9.2.jar
hadoop-yarn-registry-2.8.5.jar
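
A minimal sketch of this step, assuming the default install layout used in this article:

# Hedged sketch: copy the tez jars into the hive engine plug-in lib and refresh the cached lib.zip.
cp tez-*.jar hadoop-yarn-registry-2.8.5.jar \
   linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.7/lib/
rm -f linkis/lib/linkis-engineconn-plugins/hive/dist/v2.3.7/lib.zip   # if present, force it to be regenerated
sh linkis/sbin/linkis-daemon.sh restart cg-engineconnmanager          # restart ECM to pick up the new jars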
3.5.1.2 local cluster configuration

In hive on tez mode, Hive has two execution modes: container mode and LLAP mode. LLAP provides a hybrid model that includes resident daemons for direct IO interaction with the DataNodes and is tightly integrated into the
DAG framework, which can significantly improve the efficiency of Hive queries.

3.5.1.2.1 container mode
  • Prepare the tez dependency package, upload it to HDFS and complete the authorization.
# The tez official documentation indicates that this path can be either a compressed package or decompressed jar files. After testing, it is recommended to upload the decompressed jar files directly.
hdfs dfs -mkdir /tez_linkis
# The tez directory contains the complete set of jars produced by compiling tez
hdfs dfs -put tez /tez_linkis
# Complete the authorization and make sure the user submitting linkis jobs can read the tez files
hadoop fs -chmod -R 755 /tez_linkis
  • Modify hive-site.xml: switch the engine and configure container mode
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://host:9083</value>
    </property>
    <property>
        <name>spark.master</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://host:3306/hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>MySQL5.7</value>
    </property>
    <property>
        <name>hive.execution.engine</name>
        <value>tez</value>
        <description>Change the hive execution engine to tez</description>
    </property>

    <!-- container -->
    <property>
        <name>hive.execution.mode</name>
        <value>container</value>
    </property>
</configuration>
  • Modify tez-site.xml under ${HADOOP_CONF_DIR} to configure the tez dependencies
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>tez.lib.uris</name>
        <value>${fs.defaultFS}/tez_linkis/tez</value>
    </property>
    <!-- tez.lib.uris.classpath: path for some custom extensions; it can be left unset -->
    <property>
        <name>tez.lib.uris.classpath</name>
        <value>${fs.defaultFS}/tez_linkis/tez</value>
    </property>
    <property>
        <name>tez.use.cluster.hadoop-libs</name>
        <value>true</value>
    </property>
    <property>
        <name>tez.history.logging.service.class</name>
        <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
    </property>
</configuration>
3.5.1.2.2 LLAP mode

The LLAP service must be started by the same user as the one Linkis uses to submit jobs; otherwise the error No LLAP Daemons are running will occur.

  • Refer to the container mode to complete the dependency upload and configuration of tez

  • Modify hive-site.xml: switch the engine and configure LLAP mode

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://host:9083</value>
    </property>
    <property>
        <name>spark.master</name>
        <value>yarn-cluster</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://host:3306/hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>MySQL5.7</value>
    </property>

    <property>
        <name>hive.execution.engine</name>
        <value>tez</value>
        <description>Change the hive execution engine to tez</description>
    </property>

    <!-- llap -->
    <property>
        <name>hive.execution.mode</name>
        <value>llap</value>
    </property>
    <property>
        <name>hive.llap.execution.mode</name>
        <value>all</value>
    </property>
    <property>
        <name>hive.llap.daemon.service.hosts</name>
        <value>@llap_service</value>
    </property>
    <property>
        <name>hive.zookeeper.quorum</name>
        <value>ct4:2181,ct5:2181,ct6:2181</value>
    </property>
    <property>
        <name>hive.zookeeper.client.port</name>
        <value>2181</value>
    </property>
    <property>
        <name>hive.llap.daemon.num.executors</name>
        <value>1</value>
    </property>
</configuration>
  • Deploy LLAP service

If the Hadoop YARN version used is below 3.1.0 (not including 3.1.0), you need to deploy with Apache Slider, because before Hadoop YARN 3.1.0, YARN itself did not support long-running services, while the Slider component can package, manage and deploy long-running services to run on YARN.

If the Hadoop YARN version is 3.1.0 or above, the Slider component is not needed at all, because since Hadoop YARN 3.1.0, YARN has merged support for long-running services and the Slider
project has stopped updating.

Our Hadoop version is 2.8.5, so we need to deploy the LLAP service with the help of Apache slider. The specific process is as follows:

1. Install Slider and configure the environment variables SLIDER_HOME, PATH, etc. (a sketch of these variables is given after this list).
2. Execute the hive command to generate the LLAP startup package. Make sure the service name here is consistent with the name configured in hive-site.xml.
hive --service llap --name llap_service  --instances 2 --cache 512m --executors 2 --iothreads 2 --size 1024m --xmx 512m --loglevel INFO --javaHome /opt/jdk1.8
3. Because Linkis submits tasks as the hadoop user, the LLAP service must be started by the hadoop user as well so that tez applications can find the LLAP processes. If Linkis submits jobs as another user, LLAP should also be started by that same user. By default, Linkis uses the hadoop user for the DSS console.
su hadoop;./llap-slider-31Aug2021/run.sh
4. Verify the service: on the YARN web UI, confirm that llap_service has been submitted successfully and that its user is hadoop, then run the jps command on the server to check the processes; the presence of LlapDaemon indicates success.
5. As long as the submitting user is the same, this service can be shared by other applications, so you only need to start it on one Hive node; other Hive nodes do not need to install Slider, the LLAP Slider startup package, etc.
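
A minimal sketch of step 1, assuming Slider has been extracted to /opt/slider (a hypothetical path; adjust it to your environment):

export SLIDER_HOME=/opt/slider
export PATH=$SLIDER_HOME/bin:$PATH
# Verify that the Slider client can start
slider version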
3.5.1.3 Linkis script test

After the local cluster configuration is completed and the local test succeeds, you can test on the DSS console. Make sure the DSS login user and the user who started the LLAP service are the same; otherwise the error No LLAP Daemons are running may occur. You can also use API mode and switch the execution user:

// userCreator can be specified as hadoop-IDE so that the execution user is hadoop.
POST http://gateway_ip:8001/api/rest_j/v1/entrance/submit

{
    "executionContent": {"code": "select name,addr,id from linkis_1 group by name,addr,id order by id", "runType":  "sql"},
    "params": {"variable": {}, "configuration": {}},
    "source":  {"scriptPath": ""},
    "labels": {
        "engineType": "hive-2.3.7",
        "userCreator": "root-IDE"
    }
}
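
The same request can be sent with curl. This is only a sketch: gateway_ip is a placeholder, and any login cookie or token required by the gateway is omitted here. The userCreator label is set to hadoop-IDE so that the execution user is hadoop, as noted in the comment above.

curl -X POST "http://gateway_ip:8001/api/rest_j/v1/entrance/submit" \
  -H "Content-Type: application/json" \
  -d '{
    "executionContent": {"code": "select name,addr,id from linkis_1 group by name,addr,id order by id", "runType": "sql"},
    "params": {"variable": {}, "configuration": {}},
    "source": {"scriptPath": ""},
    "labels": {"engineType": "hive-2.3.7", "userCreator": "hadoop-IDE"}
  }'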

3.5.2 Flink engine support

The Flink engine has been integrated in Linkis 1.0.2, but it is not put into the installation package at compile time; the engine needs to be added and configured manually.

Most problems when debugging the Flink engine plug-in are caused by jar packages. You must ensure that the required connector and format packages exist both in ${flink_lib_dir} and in the Flink engine directory of Linkis. So far the Kafka, MySQL, CDC, Elasticsearch, Redis, MongoDB and other connectors have passed debugging, and CSV, JSON and other data formats are supported. The complete set of jar packages is stored on the network disk.

Generally speaking, Flink connector debugging mostly comes down to class-not-found errors, which can be approached as follows:

  • If the error is Could not find any factory for identifier 'elasticsearch-7' that implements 'org.apache.flink.table.factories.DynamicTableFactory' in the classpath, the corresponding connector package is usually missing from the Linkis engine plug-in directory, because the packages in that directory are placed on the classpath when the engine starts.

  • If the error is Caused by: java.lang.ClassNotFoundException: org.apache.flink.connector.jdbc.table.JdbcRowDataInputFormat, the package containing this class is already on the Linkis classpath, but it is usually missing from Flink's lib directory.

  • In addition, for connectors that also provide an SQL connector package, prefer the SQL connector package: it already bundles the plain connector package, so it can be used directly.

  • For some special data formats, you need to compile the Flink format packages yourself and put them in Flink's lib directory and the Linkis engine directory. CSV and JSON formats are currently supported; debezium, Maxwell, canal and so on need to be compiled by yourself.

3.5.2.1 install Flink locally

i. Download installation package

You can download the compiled installation package provided by the official website, or download the source code and compile it yourself. Use the package built for Scala 2.11 to avoid problems caused by inconsistent Scala versions.
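
For example, the prebuilt package for Flink 1.12.2 with Scala 2.11 can be fetched from the Apache archive (verify the exact URL and version against your environment before downloading):

wget https://archive.apache.org/dist/flink/flink-1.12.2/flink-1.12.2-bin-scala_2.11.tgz
tar -zxvf flink-1.12.2-bin-scala_2.11.tgz -C /opt/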

ii. Flink configuration files

  • flink-conf.yaml

Edit flink-conf.yaml to configure the JDK and other information. Configuration example:

jobmanager.archive.fs.dir: hdfs://ct4:8020/flink-test
env.java.home: /opt/jdk1.8
classloader.resolve-order: parent-first
parallelism.default: 1
taskmanager.numberOfTaskSlots: 1
historyserver.archive.fs.dir: hdfs://ct4:8020/flink
jobmanager.rpc.port: 6123
taskmanager.heap.size: 1024m
jobmanager.heap.size: 1024m
rest.address: ct6
  • Configure environment variables

Examples of environment variables have been given in the environment check. Refer to the examples.
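
For reference, a minimal sketch of the Flink-related variables, assuming Flink is installed under /opt/flink-1.12.2 (paths are examples; match the variable set to the one shown in the environment check section):

export FLINK_HOME=/opt/flink-1.12.2
export FLINK_CONF_DIR=$FLINK_HOME/conf
export FLINK_LIB_DIR=$FLINK_HOME/lib
export PATH=$FLINK_HOME/bin:$PATH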

iii. Selective compilation

Flink provides a variety of formats to support the conversion of different data formats. CSV, JSON and some other formats are provided in the default installation package; Avro, ORC, raw, parquet, Maxwell, canal and debezium need to be compiled by yourself.

Flink provides a variety of connectors for the sources and sinks of different data systems. Not all of them are included in the default installation package; the missing ones need to be compiled by yourself.

  • Compilation process:
1. Format the code first 
mvn spotless:apply
2. Package and compile 
mvn clean install -Dmaven.test.skip=true
3.5.2.2 add Flink engine plug-in

Because Linkis 1.0.2 does not automatically include the Flink engine as an engine plug-in, you need to add the engine plug-in manually. For details, refer to the engine plug-in installation documentation.

The Linkis 1.0.2 process is slightly different from the official documentation. The following is the installation process used by our computing layer:

1. Manually compile the Flink plug-in in the Linkis project, then copy and upload flink-engineconn.zip
mvn clean install -Dmaven.test.skip=true

2. Unzip flink-engineconn.zip into the ${LINKIS_HOME}/lib/linkis-engineconn-plugins directory
unzip flink-engineconn.zip

3. Upload the required connector packages and data format conversion packages to the following two directories:
${LINKIS_HOME}/lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib
${FLINK_HOME}/lib

4. Refresh the engine. Hot-load the engine by calling the RESTful interface of the linkis-cg-engineplugin service; the port number of this service can be found in its configuration file (a curl sketch is given after this list).
POST http://LINKIS-CG-ENGINEPLUGIN_IP:LINKIS-CG-ENGINEPLUGIN_PORT/api/rest_j/v1/rpc/receiveAndReply
{
  "method": "/enginePlugin/engineConn/refreshAll"
}

5. Optional. To manage the new engine's parameters dynamically, you can add the engine parameters to the Linkis meta database, so that the engine startup parameters can be modified visually in the management console under Parameter Configuration. Refer to the initialization SQL statements and the Flink plug-in configuration for the inserts.
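
A curl sketch of the refresh request in step 4; replace the IP and port with the values from the linkis-cg-engineplugin configuration:

curl -X POST "http://LINKIS-CG-ENGINEPLUGIN_IP:LINKIS-CG-ENGINEPLUGIN_PORT/api/rest_j/v1/rpc/receiveAndReply" \
  -H "Content-Type: application/json" \
  -d '{"method": "/enginePlugin/engineConn/refreshAll"}'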
  • Basic test

Submit a basic test script through the DSS console to make sure it executes normally:

SELECT 'linkis flink engine test!!!';
SELECT name, COUNT(*) AS cnt
FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name)
GROUP BY name;
3.5.2.3 Flink connector debugging

Most Flink connector debugging issues are jar package problems; the complete set of jar packages has been put on the network disk.

Link: https://pan.baidu.com/s/17g05rtfE_JSt93Du9TXVug 
Extraction code: zpep

Again, make sure the connector packages and format packages are uploaded to both the Linkis engine directory and the Flink installation directory.
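
Taking the Kafka SQL connector as an example, uploading a package to both locations looks like this (directories as listed in section 3.5.2.2):

cp flink-sql-connector-kafka_2.11-1.12.2.jar ${FLINK_HOME}/lib/
cp flink-sql-connector-kafka_2.11-1.12.2.jar ${LINKIS_HOME}/lib/linkis-engineconn-plugins/flink/dist/v1.12.2/lib/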

3.5.2.3.1 Kafka Connector

Kafka connector can be used as either source or sink.

  • Compile the flink-sql-connector-kafka_2.11-1.12.2.jar package according to the compilation method above and upload it to the two directories mentioned above

  • Test script

CREATE TABLE source_kafka
(
    id   STRING,
    name STRING,
    age  INT
) WITH (
      'connector' = 'kafka',
      'topic' = 'flink_sql_1',
      'scan.startup.mode' = 'earliest-offset',
      'properties.bootstrap.servers' = 'ct4:9092,ct5:9092,ct6:9092',
      'format' = 'json'
      );
CREATE TABLE sink_kafka
(
    id   STRING,
    name STRING,
    age  INT,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
      'connector' = 'upsert-kafka',
      'topic' = 'flink_sql_3',
      'properties.bootstrap.servers' = 'ct4:9092,ct5:9092,ct6:9092',
      'key.format' = 'json',
      'value.format' = 'json'
      );
INSERT INTO sink_kafka
SELECT `id`,
       `name`,
       `age`
FROM source_kafka;

For detailed configuration, refer to Apache Kafka SQL Connector.

3.5.2.3.2 Mysql Connector

The MySQL connector can be used as either a source or a sink. As a source, it does not monitor database changes in real time.

  • Upload flink-connector-jdbc_2.11-1.12.2.jar and mysql-connector-java-5.1.49.jar to the above two directories

  • Test script

CREATE TABLE source_mysql
(
    id   STRING,
    name STRING,
    age  int,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
      'connector' = 'jdbc',
      'url' = 'jdbc:mysql://host:3306/computation',
      'table-name' = 'flink_sql_1',
      'username' = 'root',
      'password' = 'MySQL5.7'
      );
CREATE TABLE sink_kafka
(
    id   STRING,
    name STRING,
    age  INT,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
      'connector' = 'upsert-kafka',
      'topic' = 'flink_sql_3',
      'properties.bootstrap.servers' = 'ct4:9092,ct5:9092,ct6:9092',
      'key.format' = 'json',
      'value.format' = 'json'
      );
INSERT INTO sink_kafka
SELECT `id`,
       `name`,
       `age`
FROM source_mysql;

For detailed configuration, refer to JDBC SQL Connector.

3.5.2.3.3 Mysql CDC Connector

The MySQL CDC connector can be used as a source. It monitors database changes in real time and feeds them to the Flink SQL source, which eliminates the need to forward changes with tools such as debezium, canal or Maxwell.

You can download the flink-connector-mysql-cdc package directly from Ververica; it is compatible with Flink 1.12 and can be used as is.

  • Upload flink-connector-mysql-cdc-1.2.0.jar and mysql-connector-java-5.1.49.jar to the above two directories

  • Test script

CREATE TABLE mysql_binlog
(
    id   STRING NOT NULL,
    name STRING,
    age  INT
) WITH (
      'connector' = 'mysql-cdc',
      'hostname' = 'host',
      'port' = '3306',
      'username' = 'root',
      'password' = 'MySQL5.7',
      'database-name' = 'flink_sql_db',
      'table-name' = 'flink_sql_2',
      'debezium.snapshot.locking.mode' = 'none' -- recommended, otherwise the table will be locked
      );
CREATE TABLE sink_kafka
(
    id   STRING,
    name STRING,
    age  INT,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
      'connector' = 'upsert-kafka',
      'topic' = 'flink_sql_3',
      'properties.bootstrap.servers' = 'ct4:9092,ct5:9092,ct6:9092',
      'key.format' = 'json',
      'value.format' = 'json'
      );
INSERT INTO sink_kafka
SELECT `id`,
       `name`,
       `age`
FROM mysql_binlog;
3.5.2.3.4 Elasticsearch Connector

The Elasticsearch connector can be used as a sink to persist data into ES. Select the corresponding version and compile it. For Flink SQL, it is recommended to compile flink-sql-connector-elasticsearch7_2.11 directly.

  • Upload flink-sql-connector-elasticsearch7_2.11-1.12.2.jar to the above two directories

  • Test script

CREATE TABLE mysql_binlog
(
    id   STRING NOT NULL,
    name STRING,
    age  INT
) WITH (
      'connector' = 'mysql-cdc',
      'hostname' = 'host',
      'port' = '3306',
      'username' = 'root',
      'password' = 'MySQL5.7',
      'database-name' = 'flink_sql_db',
      'table-name' = 'flink_sql_2',
      'debezium.snapshot.locking.mode' = 'none' -- recommended, otherwise the table will be locked
      );
CREATE TABLE sink_es
(
    id   STRING,
    name STRING,
    age  INT,
    PRIMARY KEY (id) NOT ENFORCED
) WITH (
      'connector' = 'elasticsearch-7',
      'hosts' = 'http://host:9200',
      'index' = 'flink_sql_cdc'
      );
INSERT INTO sink_es
SELECT `id`,
       `name`,
       `age`
FROM mysql_binlog;

For detailed configuration, refer to Elasticsearch SQL Connector.

3.5.2.4 custom development connector

The connectors officially provided by Flink are limited and do not cover scenarios such as pushing data to Redis or MongoDB through Flink SQL, so corresponding connectors need to be developed for these cases. The Redis and MongoDB connectors developed so far only support sink operations.

The complete code has been uploaded to GitHub; see flink-connector.

In addition, bahir-flink maintains many connectors that Flink does not officially provide; refer to it if needed.

3.5.2.4.1 Redis Connector

The Redis connector is developed on the basis of the Redis connector in bahir-flink, which supports sentinel mode and cluster mode configuration. It has been optimized in two aspects:

1. Added connection configuration and processing logic for stand-alone Redis
2. Removed the obsolete code and implemented dynamic sink processing with the new DynamicTableSink and DynamicTableSinkFactory interfaces
  • Upload flink-connector-redis_2.11.jar to the above two directories

  • Test script

CREATE TABLE datagen
(
    id   INT,
    name STRING
) WITH (
      'connector' = 'datagen',
      'rows-per-second' = '1',
      'fields.name.length' = '10'
      );
CREATE TABLE redis
(
    name STRING,
    id   INT
) WITH (
      'connector' = 'redis',
      'redis.mode' = 'single',
      'command' = 'SETEX',
      'single.host' = '172.0.0.1',
      'single.port' = '6379',
      'single.db' = '0',
      'key.ttl' = '60',
      'single.password' = 'password'
      );
insert into redis
select name, id
from datagen;

For a detailed description, refer to the Flink connector redis description.

3.5.2.4.2 MongoDB Connector

The MongoDB connector is developed with reference to the MongoDB connector in Ververica-Connector and retains its core processing logic.

For the development process, refer to the Flink SQL connector mongodb Development Guide.

  • Upload flink-connector-mongodb_2.11.jar to the above two directories

  • Test script

CREATE TABLE datagen
(
    id   INT,
    name STRING
) WITH (
      'connector' = 'datagen',
      'rows-per-second' = '1',
      'fields.name.length' = '10'
      );
CREATE TABLE mongoddb
(
    id   INT,
    name STRING
) WITH (
      'connector' = 'mongodb',
      'database' = 'mongoDBTest',
      'collection' = 'flink_test',
      'uri' = 'mongodb://user:[email protected]:27017/?authSource=mongoDBTest',
      'maxConnectionIdleTime' = '20000',
      'batchSize' = '1'
      );
insert into mongoddb
select id, name
from datagen;
3.5.2.5 submit Flink job

Submitting a Flink SQL job through Scriptis in DSS starts session mode, which is suitable for SELECT statements, viewing data, or testing. For INSERT statements, the task is killed after a default timeout of 3 minutes, so Scriptis is not suitable for long-running tasks. In a production environment, such tasks should be submitted as a once job, that is, Flink's per-job mode.

  • Add the linkis-computation-client POM dependency

<dependency>
    <groupId>com.webank.wedatasphere.linkis</groupId>
    <artifactId>linkis-computation-client</artifactId>
    <version>${linkis.version}</version>
    <exclusions>
        <exclusion>
            <artifactId>commons-codec</artifactId>
            <groupId>commons-codec</groupId>
        </exclusion>
        <exclusion>
            <artifactId>slf4j-api</artifactId>
            <groupId>org.slf4j</groupId>
        </exclusion>
        <exclusion>
            <artifactId>commons-beanutils</artifactId>
            <groupId>commons-beanutils</groupId>
        </exclusion>
    </exclusions>
</dependency>
  • Configure linkis.properties under resources to specify the gateway address
wds.linkis.server.version=v1
wds.linkis.gateway.url=http://host:8001/
  • Code example
import com.webank.wedatasphere.linkis.common.conf.Configuration
import com.webank.wedatasphere.linkis.computation.client.once.simple.SimpleOnceJob
import com.webank.wedatasphere.linkis.computation.client.utils.LabelKeyUtils
import com.webank.wedatasphere.linkis.manager.label.constant.LabelKeyConstant

/**
 * Created on 2021/8/24.
 *
 * @author MariaCarrie
 */
object OnceJobTest {

  def main(args: Array[String]): Unit = {
    val sql =
      """CREATE TABLE source_from_kafka_8 (
        |  id STRING,
        |  name STRING,
        |  age INT
        |) WITH (
        |    'connector' = 'kafka',
        |    'topic' = 'flink_sql_1',
        |    'scan.startup.mode' = 'earliest-offset',
        |    'properties.bootstrap.servers' = 'ct4:9092,ct5:9092,ct6:9092',
        |    'format' = 'json'
        |);
        |CREATE TABLE sink_es_table1 (
        |  id STRING,
        |  name STRING,
        |  age INT,
        |  PRIMARY KEY (id) NOT ENFORCED
        |) WITH (
        |  'connector' = 'elasticsearch-7',
        |  'hosts' = 'http://host:9200',
        |  'index' = 'flink_sql_8'
        |);
        |INSERT INTO
        |  sink_es_table1
        |SELECT
        |  `id`,
        |  `name`,
        |  `age`
        |FROM
        |  source_from_kafka_8;
        |""".stripMargin

    val onceJob = SimpleOnceJob.builder().setCreateService("Flink-Test").addLabel(LabelKeyUtils.ENGINE_TYPE_LABEL_KEY, "flink-1.12.2")
      .addLabel(LabelKeyUtils.USER_CREATOR_LABEL_KEY, "hadoop-Streamis").addLabel(LabelKeyUtils.ENGINE_CONN_MODE_LABEL_KEY, "once")
      .addStartupParam(Configuration.IS_TEST_MODE.key, true)
      .addStartupParam("flink.taskmanager.numberOfTaskSlots", 4)
      .addStartupParam("flink.container.num", 4)
      .addStartupParam("wds.linkis.engineconn.flink.app.parallelism", 8)
      .setMaxSubmitTime(300000)
      .addExecuteUser("hadoop").addJobContent("runType", "sql").addJobContent("code", sql).addSource("jobName", "OnceJobTest")
      .build()

    onceJob.submit()
    onceJob.waitForCompleted()
    System.exit(0)
  }
}

4. Best practices

4.1 Hive

4.1.1 engine startup failure due to insufficient permissions

  • Problem description

The engine failed to start after submitting the job through linkis.

  • Detailed error reporting
Error: Could not find or load main class com.webank.wedatasphere.linkis.engineconn.launch.EngineConnServer

Caused by: LinkisException{errCode=10010, desc='DataWorkCloud service must set the version, please add property [[wds.linkis.server.version]] to properties file.', ip='null', port=0, serviceKind='null'}
  • Solution

Both errors are caused by insufficient engine permissions: the jar files or configuration files cannot be loaded. When the engine is started for the first time, Linkis puts the dependencies of each engine, including lib and conf, under engineConnPublickDir. When an engine is created, the engine directory is created, engineConnExec.sh is generated, and soft links are established to lib and conf under engineConnPublickDir. The problem is caused by insufficient permissions on engineConnPublickDir.

Optimize the handleInitEngineConnResources method so that permissions are granted when the engine resources are initialized. Recompile the linkis-engineconn-manager-server package, replace the jar in the linkis/lib/linkis-computation-governance/linkis-cg-engineconnmanager directory, and then restart the ECM service separately. The code is as follows:

// todo fix bug. Failed to load com.webank.wedatasphere.linkis.engineconn.launch.EngineConnServer.
val publicDir = localDirsHandleService.getEngineConnPublicDir
val bmlResourceDir = Paths.get(publicDir).toFile.getPath
val permissions = Array(PosixFilePermission.OWNER_READ, PosixFilePermission.OWNER_WRITE, PosixFilePermission.OWNER_EXECUTE, PosixFilePermission.GROUP_READ, PosixFilePermission.GROUP_WRITE, PosixFilePermission.GROUP_EXECUTE, PosixFilePermission.OTHERS_READ, PosixFilePermission.OTHERS_WRITE, PosixFilePermission.OTHERS_EXECUTE)
//Authorization root path
warn(s"Start changePermission ${ENGINECONN_ROOT_DIR}")
changePermission(ENGINECONN_ROOT_DIR, true, permissions)

private def changePermission(pathStr: String, isRecurisive: Boolean, permissions: Array[PosixFilePermission]): Unit = {
  val path = Paths.get(pathStr)
  if (!Files.exists(path)) {
    warn(s"ChangePermission ${pathStr} not exists!")
    return
  }
  try {
    val perms = new util.HashSet[PosixFilePermission]()
    for (permission <- permissions) {
      perms.add(permission)
    }
    Files.setPosixFilePermissions(path, perms)
    warn(s"Finish setPosixFilePermissions ${pathStr} ")
  } catch {
    case e: IOException =>
      if (e.isInstanceOf[UserPrincipalNotFoundException]) {
        return
      }
      e.printStackTrace()
  }
  //When it is a directory, set file permissions recursively
  if (isRecurisive && Files.isDirectory(path)) {
    try {
      val ds = Files.newDirectoryStream(path)
      import java.io.File
      import scala.collection.JavaConversions._
      for (subPath <- ds) {
        warn(s"Recurisive setPosixFilePermissions ${subPath.getFileName} ")
        changePermission(pathStr + File.separator + subPath.getFileName, true, permissions)
      }
    } catch {
      case e: Exception => e.printStackTrace()
    }
  }
}

4.1.2 Container exited with a non-zero exit code 1

  • Problem description

After switching to the tez engine and submitting a Hive job through Linkis, the engine starts successfully and the job is submitted to YARN, but the execution status is always failed.

  • Detailed error reporting
2021-08-30 11:18:33.018 ERROR SubJob : 73 failed to execute task, code : 21304, reason : Task is Failed,errorMsg: errCode: 12003 ,desc: MatchError: LinkisException{errCode=30002, desc='failed to init engine .reason:SessionNotRunning: TezSession has already shutdown. Application application_1630056358308_0012 failed 2 times due to AM Container for appattempt_1630056358308_0012_000002 exited with  exitCode: 1

Application error information on YARN: error: could not find or load main class org.apache.tez.dag.app.DAGAppMaster
  • Solution

The tez engine is enabled, but the jar packages the engine depends on cannot be read completely. The tez documentation supports configuring tez.lib.uris with either a compressed archive or an extracted directory, but the compressed-archive configuration causes this problem when integrating with Linkis.

Upload the locally extracted tez dependency folder and set tez.lib.uris in tez-site.xml to the extracted directory and its subdirectories.
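
A sketch of the upload, assuming tez was extracted locally to /opt/tez (a hypothetical path) and using the /tez_linkis/tez HDFS path from the tez-site.xml above:

hdfs dfs -mkdir -p /tez_linkis/tez
hdfs dfs -put /opt/tez/* /tez_linkis/tez/
# tez.lib.uris should point to the extracted directory, not to a tar.gz archive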

4.1.3 NoSuchMethodError

  • Problem description

After switching to the tez engine and setting hive.execution.mode to llap, the engine can be created successfully when a job is submitted through Linkis, and the job is submitted to YARN, but execution fails.

  • Detailed error reporting
Error reported on the Linkis console: return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

Application error on yarn:
2021-08-30 16:04:35,564 [FATAL] [[email protected]] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[[email protected],5,main] threw an Error.  Shutting down now...
java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.elapsed(Ljava/util/concurrent/TimeUnit;)J
    at org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:185)
    at java.lang.Thread.run(Thread.java:748)
  • Solution

The guava version that tez depends on is too old. When Hive is executed locally it can load the newer guava from the local classpath, but the guava bundled in the tez dependencies uploaded to HDFS is too old.

Copy a newer guava package from hive/lib and upload it to the tez.lib.uris directory.
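
A sketch of the fix, assuming Hive is installed under /opt/hive (a hypothetical path); the exact guava version depends on your Hive distribution:

hdfs dfs -put /opt/hive/lib/guava-*.jar /tez_linkis/tez/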

4.1.4 No LLAP Daemons are running

  • Problem description

After switching to the tez engine and setting hive.execution.mode to llap, the engine can be created successfully when a job is submitted through Linkis, and the job is submitted to YARN, but an error is reported in the Linkis engine log.

  • Detailed error reporting
2021-08-31 18:05:11.421 ERROR [BDP-Default-Scheduler-Thread-3] SessionState 1130 printError - Status: Failed
Dag received [DAG_TERMINATE, SERVICE_PLUGIN_ERROR] in RUNNING state.
2021-08-31 18:05:11.421 ERROR [BDP-Default-Scheduler-Thread-3] SessionState 1130 printError - Dag received [DAG_TERMINATE, SERVICE_PLUGIN_ERROR] in RUNNING state.
Error reported by TaskScheduler [[2:LLAP]][SERVICE_UNAVAILABLE] No LLAP Daemons are running
2021-08-31 18:05:11.422 ERROR [BDP-Default-Scheduler-Thread-3] SessionState 1130 printError - Error reported by TaskScheduler [[2:LLAP]][SERVICE_UNAVAILABLE] No LLAP Daemons are running
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1630056358308_0143_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1, Vertex vertex_1630056358308_0143_1_02 [Reducer 3] killed/failed due to:DAG_TERMINATED]
2021-08-31 18:05:11.422 ERROR [BDP-Default-Scheduler-Thread-3] SessionState 1130 printError - Vertex killed, vertexName=Reducer 3, vertexId=vertex_1630056358308_0143_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1, Vertex vertex_1630056358308_0143_1_02 [Reducer 3] killed/failed due to:DAG_TERMINATED]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1630056358308_0143_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1, Vertex vertex_1630056358308_0143_1_01 [Reducer 2] killed/failed due to:DAG_TERMINATED]
  • Solution

Because the user who started the LLAP service is different from the user Linkis uses to submit tasks, the Linkis user cannot obtain the LLAP daemons.

Specify the corresponding execution user in Linkis, or start the LLAP service with the same user that Linkis uses.

4.2 Spark

4.2.1 ClassNotFoundException

  • Problem description

After the local Spark cluster is installed, a Spark SQL job is submitted through Linkis; the Spark engine starts successfully, but the Spark job itself fails with an error.

  • Detailed error reporting
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
68e048a8-c4b2-4bc2-a049-105064bea6dc:Caused by: java.lang.ClassNotFoundException: scala.Product$class
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
68e048a8-c4b2-4bc2-a049-105064bea6dc:   ... 20 more
  • Solution

Because the local Spark cluster is compiled against Scala 2.12 while the Spark engine plug-in is compiled against Scala 2.11, the scala.Product class cannot be found.

Recompile the local cluster with Scala 2.11. Make sure the Scala and SDK versions of the Linkis engine plug-in are consistent with those of the cluster.
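
Two quick ways to confirm which Scala version the local Spark build uses, assuming SPARK_HOME is set:

$SPARK_HOME/bin/spark-submit --version        # the startup banner prints the Scala version
ls $SPARK_HOME/jars | grep scala-library      # e.g. scala-library-2.11.12.jar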

4.2.2 ClassCastException

  • Problem description

Spark SQL jobs submitted locally with spark-sql can be executed successfully, but Spark SQL jobs submitted through Linkis fail to run.

  • Detailed error reporting
Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
  • Solution

Because spark-sql does not specify the YARN deploy mode, it pulls its dependencies from the local spark/lib directory, while the jars under the spark.yarn.jars path configured in spark-defaults.conf do not contain the Hive dependencies.

Upload the local dependencies with Hive support to the spark.yarn.jars path, run ./spark-sql --master yarn --deploy-mode client to make sure it executes successfully, and then submit the task through Linkis.
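
A sketch of the upload, assuming spark.yarn.jars points to hdfs:///spark/jars (use the value from your spark-defaults.conf) and that the local Spark build already includes Hive support; on Spark 2.x the jars live under $SPARK_HOME/jars:

hdfs dfs -mkdir -p /spark/jars
hdfs dfs -put $SPARK_HOME/jars/*.jar /spark/jars/
# Verify locally before submitting through Linkis
$SPARK_HOME/bin/spark-sql --master yarn --deploy-mode client -e "show databases;"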

4.3 Flink

4.3.1 method did not exist

  • Problem description

After the MongoDB connector was developed, it worked in local tests and data could be written successfully. However, after uploading the jar package to the Linkis engine plug-in directory and submitting the Flink job through Linkis, an error is reported and the Spring Boot application cannot start normally.

  • Detailed error reporting
***************************
APPLICATION FAILED TO START
***************************
Description:
An attempt was made to call a method that does not exist. The attempt was made from the following location:
    org.springframework.boot.autoconfigure.mongo.MongoClientFactorySupport.applyUuidRepresentation(MongoClientFactorySupport.java:85)
The following method did not exist:
    com.mongodb.MongoClientSettings$Builder.uuidRepresentation(Lorg/bson/UuidRepresentation;)Lcom/mongodb/MongoClientSettings$Builder;
  • Solution

Because the MongoDB connector is introduced, Spring Boot's MongoAutoConfiguration tries to use the MongoClient, but the Mongo driver version bundled in the custom MongoDB connector is inconsistent with the one Spring Boot expects.

When the connector is used in a Spring Boot project, pay special attention to the Mongo driver version built into Spring Boot. In this case, the Mongo driver version in the connector was upgraded.
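
When such a conflict appears, it can help to inspect which Mongo driver versions are actually on the project classpath, for example with Maven's dependency tree (org.mongodb is the driver's group id):

mvn dependency:tree -Dincludes=org.mongodb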

5. Reference

https://github.com/WeBankFinTech/Linkis-Doc

https://github.com/WeBankFinTech/DataSphereStudio-Doc

https://github.com/apache/tez

Flink table & SQL connectors website