How does Java implement a database difference comparison tool from zero?

Time:2021-7-28

The pain of comparing data

I wonder if you are like me. When writing code happily, you must compare some database data.

Admittedly, there are general data differences. For example, the content of each line is different. There are relatively mature comparison tools such as compare2 in the market.

However, if each column of the comparison data is the same, this will become more troublesome.

How does Java implement a database difference comparison tool from zero?

V1.0 pure manual comparison

When we perform some data migration and other functions, we often need to compare whether the data is correct. The most common method is to manually compare one column by one.

At the beginning, Lao Ma was right with his colleagues. After a few days, he felt that the efficiency was really low, and he was easy to see.

So I just thought, this thing should be much easier to compare with programs.

V2.0 semi manual comparison

Just do it. I spent half a day implementing a jsqlparser based parsing similar toinsert into xxx (xx, xx, xx) values (xx, xx, xx);Tool class.

Then compare the data on two sides. For the comparison of hundreds of fields in a table, some become much faster and the accuracy is much higher.

Don’t ask me why there are hundreds of fields, which are treasures precipitated by history…

PS: whether the insert into statement is manually exported through the database connection tool.

Later, I found another problem: there are too many tables. If I want to change a data comparison, it will take me dozens of minutes to export it manually. The key is repetition and boring.

How does Java implement a database difference comparison tool from zero?

Since it is repeated, can it be implemented by program?

V3.0 vs. basic automation

So I stayed up late after work to implement this version: the Java program implements the export and persistence of data, and then compares the differences before and after modification.

Let me share my thoughts and core source code. There are download benefits at the end of the article.

Hope to help you with your work and study.

Overall concept

I hope this tool is the concept of MVP, from simple to complex, and gradually enrich the features in the later stage.

It should be scalable. At present, it supports mainstream databases such as MySQL / Oracle / SQL server, and users can customize the development.

Minimize dependencies, use native JDBC, and do not need to introduce frameworks such as mybatis.

Core dependency

Here are the core dependencies I use:

Fastjson is used to persist data to JSON

MySQL connector Java database connection driver

Jsqlparser auxiliary tool, used to parse SQL, not required

Realization idea

  1. Automatically select the corresponding JDBC implementation according to the specified JDBC connection information.
  2. Execute the corresponding SQL, parse the result into a map, and persist JSON
  3. Compare the differences between the persistent JSON and show the different results

With this idea, everything will become plain.

Of course, before that, we need to implement the code. Let’s enter the bug writing link:

How does Java implement a database difference comparison tool from zero?

JDBC implementation

Core interface

Considering the later implementation of different databases, we uniformly define a query interface

/**
 *JDBC access layer
 *@ author old ma xiaoxifeng
 * @date 2017/8/1
 */
public interface JdbcMapper {

    /**
     *Execute query statement
     * @param querySql
     * @return
     */
    ResultSet query(String querySql);

}

Abstract implementation

The basic Abstract implementation is provided here.

Subclasses only need to implement the corresponding connection to obtain information.

public abstract class AbstractJdbcMapper implements JdbcMapper {

    protected JdbcVo jdbcVo;

    public AbstractJdbcMapper(JdbcVo jdbcVo) {
        this.jdbcVo = jdbcVo;
    }

    /**
     *Get database connection
     * @return
     */
    protected abstract Connection getConnection();

    @Override
    public ResultSet query(String querySql) {
        ResultSet rs = null;
        Connection connection = getConnection();
        try {
            Statement stmt = null;
            stmt = connection.createStatement();
            rs = stmt.executeQuery(querySql);
        } catch (Exception e) {
            System.out.println("SQL: " + querySql);
            throw new ExportdbException(e);
        }
        return rs;
    }

}

Jdbcvo connection information

This object is mainly the database connection information object:

public class JdbcVo {

    /**
     *Driver class name
     */
    private String driverClassName;

    /**
     *Database link
     */
    private String url;

    /**
     *User name
     */
    private String username;

    /**
     *Code
     */
    private String password;

    //getter & setter
}

MySQL implementation

Take MySQL as an example:

import com.github.houbb.exportdb.dto.JdbcVo;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/**
 *MySQL implementation
 *@ author old ma xiaoxifeng
 * @date 2017/8/1
 */
public class MySqlJdbcMapper extends AbstractJdbcMapper {

    public MySqlJdbcMapper(JdbcVo jdbcVo) {
        super(jdbcVo);
    }

    @Override
    protected Connection getConnection() {
        try {
            Class.forName(jdbcVo.getDriverClassName());
            return DriverManager.getConnection(jdbcVo.getUrl(),
                    jdbcVo.getUsername(),
                    jdbcVo.getPassword());
        } catch (ClassNotFoundException | SQLException e) {
            e.printStackTrace();
        }
        return null;
    }
}

This is mainly about the initialization of connections. Connecting to different databases requires the introduction of corresponding data sources.

Row data export implementation

The following is the core implementation of the export:

Interface definition

public interface IExportdb {

    /**
     *Inquiry
     *Context @ param
     * @param sql sql
     *@ return result
     * @since 0.0.1
     */
    QueryResultVo query(final ExportdbContext context, final String sql);

}

The SQL to be executed is specified here.

In the context, there is only jdbcmapper for later expansion.

The returned queryresultvo is the query result, which is defined as follows:

public class QueryResultVo {
    /**
     *Table name
     */
    private String tableName;

    /**
     *Database name
     *
     * @since 0.0.2
     */
    private String databaseName;

    /**
     *Result set
     */
    private List<Map<String, Object>> resultMaps;

    /**
     *Executed SQL
     */
    private String sql;

    //getter & setter
}

Default implementation

The default export implementation is as follows:

import com.github.houbb.exportdb.core.ExportdbContext;
import com.github.houbb.exportdb.core.IExportdb;
import com.github.houbb.exportdb.dal.JdbcMapper;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.exception.ExportdbException;
import com.github.houbb.heaven.util.lang.StringUtil;
import net.sf.jsqlparser.JSQLParserException;
import net.sf.jsqlparser.parser.CCJSqlParserUtil;
import net.sf.jsqlparser.statement.Statement;
import net.sf.jsqlparser.statement.insert.Insert;
import net.sf.jsqlparser.statement.select.PlainSelect;
import net.sf.jsqlparser.statement.select.Select;

import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * @author binbin.hou
 * @since 0.0.1
 */
public class Exportdb implements IExportdb {

    @Override
    public QueryResultVo query(ExportdbContext context, String sql) {
        try {
            final JdbcMapper jdbcMapper = context.jdbcMapper();

            ResultSet resultSet = jdbcMapper.query(sql);
            List<Map<String, Object>> maps = new ArrayList<>();

            String tableName = null;
            while (resultSet.next()) {
                final ResultSetMetaData metaData = resultSet.getMetaData();
                //Set table name
                if(tableName == null) {
                    tableName = metaData.getTableName(1);
                }

                Map<String, Object> map = new LinkedHashMap<>();
                //If it is blank, it will be returned directly. If it is greater than 1, an error will be reported
                //Total number of columns
                int columnCount = metaData.getColumnCount();

                for (int i = 1; i <= columnCount; i++) {
                    String columnName = metaData.getColumnName(i);
                    Object value = resultSet.getObject(columnName);

                    map.put(columnName, value);
                }

                maps.add(map);
            }

            if(StringUtil.isEmptyTrim(tableName)) {
                Statement statement = CCJSqlParserUtil.parse(sql);
                Select select = (Select)statement;
                PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
                tableName = plainSelect.getFromItem().toString();
            }

            return QueryResultVo.newInstance().tableName(tableName)
                    .databaseName("")
                    .sql(sql)
                    .resultMaps(maps);
        } catch (SQLException | JSQLParserException throwables) {
            throw new ExportdbException(throwables);
        }
    }
}

In fact, the implementation is very simple. We mainly talk about two points:

(1) Table name

After testing, MySQL can be obtained in the following ways:

resultSet.getMetaData();
tableName = metaData.getTableName(1);

When I was testing Oracle, I found that I couldn’t get it. Therefore, it is obtained by parsing our query statements with the help of sqlparser.

For the time being, it mainly supports query, so some of the information written here is fixed and can be optimized later.

if(StringUtil.isEmptyTrim(tableName)) {
    Statement statement = CCJSqlParserUtil.parse(sql);
    Select select = (Select)statement;
    PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
    tableName = plainSelect.getFromItem().toString();
}

(2) Column information

Each query may correspond to multiple records.

Let’s look at the construction of each record:

while (resultSet.next()) {
    final ResultSetMetaData metaData = resultSet.getMetaData();
    Map<String, Object> map = new LinkedHashMap<>();
    //If it is blank, it will be returned directly. If it is greater than 1, an error will be reported
    //Total number of columns
    int columnCount = metaData.getColumnCount();
    for (int i = 1; i <= columnCount; i++) {
        String columnName = metaData.getColumnName(i);
        Object value = resultSet.getObject(columnName);
        map.put(columnName, value);
    }
    maps.add(map);
}

This little friend who often writes JDBC must be no stranger.

You say you use mybatis now. Who still writes JDBC? It’s so low.

Well, if you write a mybatis by yourself, these are also inevitable.

Handwritten mybatis (I) MVP version from scratch

Difference comparison

Use of export

We can export a row of data, which can be exported before and after modification.

If you export to different libraries and tables, you can export between different library tables.

After exporting the results, you need to compare them.

Contrast implementation

Interface definition

For the processing of export results, you can choose according to your actual situation.

For example, export to CSV / JSON / insert, etc. the comparison difference can also be customized according to your own needs.

public interface IQueryResultHandler {

    /**
     *Result processing class
     *@ param queryresultvo query result
     */
    void handler(final QueryResultVo queryResultVo);

}

Persistence

Here is a simple and practical way: JSON persistence.

import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.support.result.IQueryResultHandler;
import com.github.houbb.heaven.util.io.FileUtil;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 *@ author old ma xiaoxifeng
 * @since 0.0.1
 */
public class FileJsonQueryResultHandler implements IQueryResultHandler {

    /**
     *Default file output path
     *
     *Automatically set according to the operating system
     * @since 0.0.1
     */
    private final String dir;

    public FileJsonQueryResultHandler(String dir) {
        this.dir = dir;
    }

    public FileJsonQueryResultHandler() {
        this("D:\\exportdb\\");
    }

    /**
     *Result processing class
     *
     *@ param queryresultvo query result
     */
    @Override
    public void handler(final QueryResultVo queryResultVo) {
        String path = dir+queryResultVo.tableName()+".edb";
        System.out.println ("file path:" + path);

        List<Map<String, Object>> list = queryResultVo.resultMaps();
        List<String> lines = new ArrayList<>(list.size()+1);

        lines.add("-- "+queryResultVo.sql());
        for(Map<String, Object> map : list) {
            lines.add(JSON.toJSONString(map, SerializerFeature.WriteMapNullValue));
        }

        FileUtil.write(path, lines);
    }

}

We persist the row data to a file. Note that it is specified hereJSON.toJSONString(map, SerializerFeature.WriteMapNullValue)

In this way, the null field can also be output, which is more convenient for comparison.

File difference comparison implementation

Let’s assume that the file is output to two files, and the file path can be specified below for comparison:

/**
 *Difference comparison
 *@ param oldpath original path
 *@ param newpath new path
 */
public static void differ(final String oldPath, final String newPath) {
    List<String> oldLines = FileUtil.readAllLines(oldPath);
    List<String> newLines = FileUtil.readAllLines(newPath);
    System. Out. Println (fileutil. Getfilename (oldpath) + "comparison start ------------);
    for(int i = 0; i < oldLines.size(); i++) {
        String oldL = oldLines.get(i);
        String newL = newLines.get(i);
        if(oldL.startsWith("--")) {
            continue;
        }
        System. Out. Println ("line" + (I + 1) + "comparison:");
        differMaps(oldL, newL);
    }
    System. Out. Println (fileutil. Getfilename (oldpath) + "comparison end ------------);
    System.out.println();
}

private static void differMaps(final String oldMap, final String newMap) {
    Map<String, Object> om = JSON.parseObject(oldMap);
    Map<String, Object> nm = JSON.parseObject(newMap);
    for(Map.Entry<String, Object> entry : om.entrySet()) {
        String key = entry.getKey();
        Object oldV = om.get(key);
        Object newV = nm.get(key);
        //Skip null comparison
        if(oldV == null && newV == null) {
            continue;
        }
        if(!ObjectUtil.isEquals(oldV, newV)) {
            System. Out. Println ("difference column:" + key + ", old value:" + oldv + ", new value:" + newv ");
        }
    }
}

Here, the difference content will be output directly to the console.

folder

Of course, we can also compare the contents of the two folders.

The implementation is as follows:

public static void differDir(final String oldDir, final String newDir) {
    File[] oldFiles = new File(oldDir).listFiles();

    for(File file : oldFiles) {
        String fileName = file.getName();
        String aop = file.getAbsolutePath();
        String anp = newDir+fileName;
        differ(aop, anp);
    }
}

Boot class

Convenience

We have completed the core implementation above, but it is not convenient for users to use. Because the configuration is not elegant enough.

Therefore, we introduce the boot class to help users quickly use:

/**
 *@ author old ma xiaoxifeng
 * @since 0.0.1
 */
public class ExportdbBs {

    private ExportdbBs(){}

    /**
     *Export implementation
     * @since 0.0.1
     */
    private final IExportdb exportdb = new Exportdb();

    /**
     *Driver class name
     */
    private String driverName = DriverNameConstant.MYSQL;

    /**
     *Database link
     */
    private String url = "jdbc:mysql://localhost:3306/test";

    /**
     *User name
     */
    private String username = "root";

    /**
     *Code
     */
    private String password = "123456";


    public static ExportdbBs newInstance() {
        return new ExportdbBs();
    }

    public ExportdbBs driverName(String driverName) {
        this.driverName = driverName;
        return this;
    }

    public ExportdbBs url(String url) {
        this.url = url;
        return this;
    }

    public ExportdbBs username(String username) {
        this.username = username;
        return this;
    }

    public ExportdbBs password(String password) {
        this.password = password;
        return this;
    }

    /**
     *Inquiry
     * @param sql sql
     *@ return result
     * @since 0.0.1
     */
    public QueryResultVo query(final String sql) {
        //1. Build Vo
        JdbcVo jdbcVo = new JdbcVo(driverName, url, username, password);

        //2. Get mapper
        final JdbcMapper jdbcMapper = getJdbcMapper(jdbcVo);

        //3. Build context
        final ExportdbContext context = ExportdbContext.newInstance().jdbcMapper(jdbcMapper);
        return this.exportdb.query(context, sql);
    }

    /**
     *Inquiry并且处理
     *@ param queryresulthhandler query result processor
     * @param sql sql
     * @since 0.0.1
     */
    public void queryAndHandle(final IQueryResultHandler queryResultHandler,
                               final String sql, final String... otherSqls) {
        QueryResultVo queryResultVo = this.query(sql);
        queryResultHandler.handler(queryResultVo);

        //Handle other SQL in the same way
        for(String os : otherSqls) {
            QueryResultVo vo = this.query(os);
            queryResultHandler.handler(vo);
        }
    }

    /**
     *Inquiry并且处理
     *@ param queryresulthhandler query result processor
     *@ param sqllist SQL list
     * @since 0.0.2
     */
    public void queryAndHandle(final IQueryResultHandler queryResultHandler,
                               List<String> sqlList) {
        //Handle other SQL in the same way
        for(String sql : sqlList) {
            System. Out. Println ("start execution:" + SQL);
            QueryResultVo vo = this.query(sql);
            queryResultHandler.handler(vo);
        }
    }

    private JdbcMapper getJdbcMapper(JdbcVo jdbcVo) {
        if(DriverNameConstant.MYSQL.equalsIgnoreCase(driverName)) {
            return new MySqlJdbcMapper(jdbcVo);
        }
        if(DriverNameConstant.ORACLE.equalsIgnoreCase(driverName)) {
            return new OracleJdbcMapper(jdbcVo);
        }
        if(DriverNameConstant.SQL_SERVER.equalsIgnoreCase(driverName)) {
            return new SqlServerJdbcMapper(jdbcVo);
        }

        throw new UnsupportedOperationException();
    }

}

Here, the user is provided with the most basic configuration of MySQL and common query processing methods.

test

Let’s take a look at the effect of the test:

Direct query

QueryResultVo resultVo = ExportdbBs.newInstance().query("select * from user;");
System.out.println(resultVo);

Query and process

final String sql = "select * from user;";
final IQueryResultHandler handler = new FileJsonQueryResultHandler();
ExportdbBs.newInstance().queryAndHandle(handler, sql);

File paths can be specified for two exports, for example:

D:\exportdb\old\andD:\exportdb\new\

Comparison of two results

final String oldP = "D:\\exportdb\\old\\";
final String newP = "D:\\exportdb\\new\\";

CompareUtil.differDir(oldP, newP);

The difference result will be output to the console.

How does Java implement a database difference comparison tool from zero?

Everything is going well, but the revolution has not yet succeeded. Students still need to work overtime~~~

deficiencies

This is a version of v0.0.1, with many shortcomings.

For example:

  • Export as CSV
  • Export as insert / UPDATE statement
  • Exported file name custom policy
  • You can specify whether multiple SQL statements are generated in the same file
  • The export path is changed automatically according to the operating system
  • It is more convenient to use. For example, the page specifies the data source + SQL, and the page displays the corresponding difference results.

However, it is also basically available, which is in line with our initial assumption.

Summary

I don’t know how you usually compare the data?

If you need this tool, you can pay attention to [old horse roaring west wind] and reply to [comparison] in the background.

I hope this article is helpful to you. If you have other ideas, you can also share them with you in the comment area.

everybodyGeekMy favorite collection and forwarding is the biggest driving force for Lao Ma to continue writing!

How does Java implement a database difference comparison tool from zero?

Recommended Today

VBS obtains the operating system and its version number

VBS obtains the operating system and its version number ? 1 2 3 4 5 6 7 8 9 10 11 12 ‘************************************** ‘*by r05e ‘* operating system and its version number ‘************************************** strComputer = “.” Set objWMIService = GetObject(“winmgmts:” _  & “{impersonationLevel=impersonate}!\\” & strComputer & “\root\cimv2”) Set colOperatingSystems = objWMIService.ExecQuery _  (“Select * from […]