An interesting trigger implementation based on sql-on-hbase products

Time:2019-10-23

background

Now there are many SQ on HBase products: Trafodion, Impala, Greplum and so on, but few have support the trigger function of products, perhaps because under this application scenario is rarely used to trigger, but there is no denying that the trigger is a very useful function, this article focuses on some lack of trigger function products provide an interesting idea.

An overview of

As shown below:

An interesting trigger implementation based on sql-on-hbase products

  1. When data is inserted into SQEngine, it is stored in HBase.
  2. With the Observer in HBase, BaseRegionObserver will be notified when there is data Trigger, and prePut, PostPut and other methods will be implemented to complete the processing of Event according to the needs of the project. In this case, it will be sent to Trigger Server in the form of HTTP request.
  3. Execute PLSQL or SQL, this step can be decided according to the characteristics of the product, if there is PLSQL function, PLSQL syntax can be written in the Trigger content, otherwise it can be executed in the form of SQL.
  4. PLSQL is ultimately executed through the SQ Engine, so this step is generally a function of the data database itself.

Realize the HBase Observer

Inherit the BaseRegionObserver, implement the corresponding method, and in this case implement put, which triggers the action at insert/update time.

import org.apache.hadoop.conf.Configuration;  
import org.apache.hadoop.hbase.*;  
import org.apache.hadoop.hbase.client.*;  
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;  
import org.apache.hadoop.hbase.coprocessor.ObserverContext;   
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;  
import org.apache.hadoop.hbase.filter.ByteArrayComparable;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.regionserver.RegionScanner;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.net.URI;
import java.io.IOException;
import java.util.List;
public class EsgObserver extends BaseRegionObserver {
    @Override
    public void prePut(ObserverContext<RegionCoprocessorEnvironment> e, Put put, WALEdit edit,
    Durability durability) throws IOException {
    String tableName = e.getEnvironment().getRegion().getTableDesc().getNameAsString();
    e.getEnvironment().getRegionInfo().getRegionId();
    if(put.getTimeStamp() != ts){
        sendRequest(tableName,"prePut");
    }
    this.ts = put.getTimeStamp();
}
private void sendRequest(String tableName,     String API){
    HttpClient client = new DefaultHttpClient();  
    HttpGet get = new HttpGet();
    HttpResponse response;
    String url, ip, port;
    Configuration conf = HBaseConfiguration.create();
    try {
        Connection conn = ConnectionFactory.createConnection(conf);
        Table t = conn.getTable(TableName.valueOf(tableName));
        HTableDescriptor htd = t.getTableDescriptor();
        String value = htd.getValue(“coprocessor$4”);
        String args = value.split(“\|”)[3];
        ip = (args.split(“,”)[0]).split(“=”)[1];port = (args.split(“,”)[1]).split(“=”)[1];
        url = “http://” + ip + “:” + port + “/esg/exec/”;
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
        return;
        
    }
    String table = tableName.split(“:”)[1];
    String uri = url + API + “?tableName=” + table;
    logger.info(“uri:” + uri);
    try{
        get.setURI(new URI(uri));
        response = client.execute(get);
        logger.info(“request response code:” + response.getStatusLine().getStatusCode());
        logger.info(“request response reason:” + response.getStatusLine().getReasonPhrase());
    } catch (Exception e){
        logger.error(e.getMessage(),e);
    }
    }
}

The core implementation of Trigger Server

Trigger Server adopts Spring boot + Antlr4 implementation. Spring boot is responsible for the construction of the web project, and Antlr4 implements the parsing separation of trigger statements.

Statement parsing is as follows:

Statement parsing is mainly used to trigger the creation, modification and deletion of statements.

grammar Sql;
@header {
    package com.esg.hbase.generated;
}

program : block EOF;

//block: T_CREATE T_OR T_REPLACE;
block: create_trigger_stmt | drop_stmt | switch_trigger_stmt;

create_trigger_stmt: T_CREATE has_replace T_TRIGGER trigger_name before_after cuid_ops T_ON table_name for_each_row plsql_block;

has_replace: (T_OR T_REPLACE)? ;

trigger_name: L_ID;

before_after: (T_BEFORE | T_AFTER);

cuid_ops: (T_INSERT | T_DELETE | T_UPDATE T_OF L_ID);

table_name: L_ID;

for_each_row: (T_FOR T_EACH T_ROW)?;

plsql_block: L_PLSQL_BLOCK;

drop_stmt: T_DROP T_TRIGGER trigger_name T_SEMICOLON?;

enable_disable: (T_ENABLE | T_DISABLE);

switch_trigger_stmt: (switch_trigger_all_stmt | switch_trigger_specify_stmt);

switch_trigger_specify_stmt: T_ALTER T_TRIGGER trigger_name enable_disable;

switch_trigger_all_stmt: T_ALTER T_TABLE table_name enable_disable T_ALL T_TRIGGERS;
The core receives the request
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Iterator;
import java.util.List;

@RequestMapping("esg/exec") 
@RestController 
public class ExecutorController {
  @Autowired private TriggerService triggerService;
 @GetMapping("/prePut") public void prePut(String tableName) {
    List<Trigger> triggers = this.triggerService.find(tableName, true, 1);
    Iterator<Trigger> iter = triggers.iterator();
    while(iter.hasNext()){
      Trigger trigger = iter.next();
      if (trigger.getHasEnabled()){
        String plsql = trigger.getPlsql();
        executeQuery(url, username, password, plsql);
      }
    }
  }
 }

conclusion

This paper provides an idea of trigger implementation on sql-on-hbase database, which realizes the independent deployment and execution of trigger function in this way, which is convenient and concise, and completes the execution of trigger with the help of Web system. This scheme can realize data synchronization, such as synchronization to Kafka messaging system and other libraries, and there are some suitable application scenarios.
However, this method also has some obvious defects, which may cause a big problem in performance. All the triggering events need to pass through the same web server, which increases the burden of the web system. Once there is a large amount of data inserted and modified, the performance will become the bottleneck. On the transaction side, the statement in the trigger is not part of the same transaction as the original insert statement, resulting in inconsistent data. Fault tolerance is not considered in this scenario.
This article for the introduction of jade, welcome everyone to discuss a better and more reasonable scheme.

Recommended Today

The solution of flash blocking layer

Problems: 1. There is flash animation under the pull-down menu made by layer, and the menu is blocked by animation2. When the layer floating advertisement in the page passes through the flash animation, the floating layer passes through the animation drawing and the actual display is blocked resolvent: Set the properties of the flash object […]