Orca is an open-source query optimizer for Postgres and Greenplum. Compared with the built-in optimizers of Greenplum and Postgres, Orca performs considerably better on complex queries, partitioned tables, and similar workloads. This article shows how to build Greenplum with the Orca optimizer enabled and how to run Greenplum's installcheck-world test suite.
Setting up the development environment
Before you start, install the build and runtime dependencies of both Greenplum and Orca. The build environment used here is CentOS 7. First, install the system packages by running the following commands:
sudo yum -y groupinstall "Development Tools"
sudo yum -y install readline-devel zlib-devel curl-devel apr-devel libevent-devel libxml2-devel bzip2-devel python-devel openssl-devel which iproute net-tools perl-Env wget
sudo yum install -y epel-release centos-release-scl
sudo yum install -y python-pip python-psutil cmake3
sudo yum install -y devtoolset-6-toolchain
sudo yum install -y xerces-c-devel
The README.CentOS.bash script in the Greenplum source tree lists the required dependency packages; the commands here add a few dependencies beyond those in README.CentOS.bash. Next, install the required Python dependencies with the following commands:
sudo pip install --upgrade pip
sudo pip install --no-cache-dir lockfile paramiko setuptools psutil conan
Then install Ninja, the build tool that Orca relies on. Version 1.8.2 is used here, because the newer 1.9 release depends on a newer C++ runtime library and reports an error when run on CentOS 7.
wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
unzip ninja-linux.zip
sudo mv ninja /usr/local/bin/
Finally, configure additional environment settings:
sudo mkdir /usr/local/gpdb
sudo chown -R `whoami` /usr/local/gpdb
source scl_source enable devtoolset-6
sudo ln -sf /usr/bin/cmake3 /usr/bin/cmake
Now we have finished all the preparations and can start compiling.
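Before moving on, it can help to confirm that the tools installed above are actually on the PATH. The following is only a minimal sketch: the helper name missing_commands is made up for this illustration, and the tool list in the usage comment is simply the set of commands this article relies on.

```shell
# missing_commands: print each given command name that is not found on PATH.
# (Hypothetical helper for illustration; not part of Greenplum or Orca.)
missing_commands() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example usage (prints nothing when every tool is available):
#   missing_commands gcc cmake ninja git wget unzip
```

If the function prints any names, revisit the corresponding installation step before compiling.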
Prepare Xerces dependency library (optional)
Before compiling Orca, you need to prepare its dependency Xerces, which Orca uses to read and write XML data. This step is optional, because you can use either the system version installed earlier or the version with the Greenplum patch (https://github.com/greenplum-db/gp-xerces). To use the patched version, run the following commands:
git clone https://github.com/greenplum-db/gp-xerces.git
cd gp-xerces/
mkdir build && cd build
../configure --prefix=/usr/local/gpdb
make && make install
Compiling Orca requires cmake3 and gcc-6, both of which were set up in the first step. Before starting, you can confirm the versions with the following commands:
gcc --version
cmake --version
If the version is incorrect or the command is not found, make sure that the following two operations are performed correctly:
source scl_source enable devtoolset-6
sudo ln -sf /usr/bin/cmake3 /usr/bin/cmake
The command 'source scl_source enable devtoolset-6' switches the active GCC version. It can be added to the login script so that it runs automatically:
echo 'source scl_source enable devtoolset-6' >> ~/.bashrc
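The version check can also be scripted. The sketch below assumes the minimum versions are GCC 6 and CMake 3, as stated above. The helper version_ge is hypothetical; it compares dotted version numbers component by component with awk.

```shell
# version_ge HAVE WANT: succeed if dotted version HAVE is >= WANT.
# Missing components count as 0, so "6" equals "6.0.0".
# (Hypothetical helper for illustration.)
version_ge() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        n = split(have, h, ".")
        m = split(want, w, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            a = (i <= n) ? h[i] + 0 : 0
            b = (i <= m) ? w[i] + 0 : 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}

# Example usage:
#   version_ge "$(gcc -dumpversion)" 6 || echo "gcc is older than 6"
#   version_ge "$(cmake --version | awk 'NR == 1 { print $3 }')" 3 || echo "cmake is older than 3"
```

Here gcc -dumpversion prints only the compiler version, and the awk filter takes the third word of the first line of the cmake --version output.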
Because the Greenplum and Orca versions must match, we build the Orca version that ships in the Greenplum source code. The steps are as follows:
git clone https://github.com/greenplum-db/gpdb.git --branch 6X_STABLE --single-branch --depth 1 6X_STABLE
cd 6X_STABLE/depends
CFLAGS="-L/usr/local/gpdb/lib/" ./configure --prefix=/usr/local/gpdb
make
make install_local
Orca is now installed in the /usr/local/gpdb directory.
Greenplum has many optional features that can be switched on or off on the configure command line. Since the focus here is on compiling with Orca, the configuration below turns off several other build options. Run the following commands in the root directory of the Greenplum source:
export LD_LIBRARY_PATH=/usr/local/gpdb/lib
CFLAGS="-I/usr/local/gpdb/include" LDFLAGS="-L/usr/local/gpdb/lib/" ./configure --enable-orca --without-perl --without-python --with-libxml --without-gssapi --disable-pxf --without-zstd --without-openssl
make -j4 && make install
Once the commands complete successfully, your own build of Greenplum is ready. The compiled Greenplum is installed in the /usr/local/gpdb directory, which can be packaged as a whole and deployed to the same location on other machines for cluster testing. The installed gpdb directory looks roughly like this:
$ ls /usr/local/gpdb
bin docs etc greenplum_path.sh include lib sbin share
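The directory listing above can be turned into a quick sanity check after make install, or after copying the directory to another machine. This is a sketch only: missing_from_prefix is a hypothetical helper, and the expected entries are taken directly from the listing above.

```shell
# missing_from_prefix PREFIX: print the entries expected under a Greenplum
# install prefix that do not exist there. The expected names come from the
# directory listing in this article. (Hypothetical helper for illustration.)
missing_from_prefix() {
    prefix=$1
    for entry in bin docs etc greenplum_path.sh include lib sbin share; do
        if [ ! -e "$prefix/$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}

# Example usage (prints nothing when the layout is complete):
#   missing_from_prefix /usr/local/gpdb
```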
Run test cases
After Greenplum is compiled, we need to make sure that all the relevant tests run correctly. Greenplum inherits the Postgres test framework and provides its own test target, installcheck-world. Greenplum also ships a configuration for creating a test cluster, so our goal is to run installcheck-world against that test cluster.
Prepare system configuration
To make sure the tests run properly, it is strongly recommended to configure the system files described in the official Greenplum documentation, including /etc/security/limits.conf and /etc/sysctl.conf. These settings are also described in README.linux.md in the source directory.
In addition, Greenplum uses SSH to execute commands even on a single machine, so passwordless access needs to be configured:
ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
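To confirm that the key really ended up in authorized_keys, a small check like the following can be used. It is a sketch under the assumption that the public key file holds a single key line, as ssh-keygen produces by default; key_is_authorized is a hypothetical helper.

```shell
# key_is_authorized PUBKEY_FILE AUTH_FILE: succeed if the public key stored in
# PUBKEY_FILE appears as a whole line in AUTH_FILE.
# (Hypothetical helper for illustration.)
key_is_authorized() {
    pubkey_file=$1
    auth_file=$2
    if [ ! -r "$pubkey_file" ] || [ ! -r "$auth_file" ]; then
        return 1
    fi
    # -F: fixed strings, -x: match whole lines, -f: read patterns from a file
    grep -qxF -f "$pubkey_file" "$auth_file"
}

# Example usage:
#   key_is_authorized ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys && echo "key installed"
```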
Create test cluster
A Greenplum cluster with 3 primaries and 3 mirrors can be created with the following commands:
source /usr/local/gpdb/greenplum_path.sh
make create-demo-cluster
When you see the following output, the demo cluster has been configured successfully.
 optimizer
-----------
 on
(1 row)

 gp_opt_version
----------------------------------------------
 GPOPT version: 3.48.0, Xerces version: 3.1.2
(1 row)
If an error occurs, fix it as the error message suggests. After the fix, you can use the command
to force an incomplete operation to end, and then run the cluster creation again.
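If you want to check the Orca version from a script, the version string shown above can be parsed. The following is a sketch: gpopt_version is a hypothetical helper, and the psql call in the usage comment assumes the demo cluster's master listens on port 15432 (the port used for the tests later in this article) and that the default postgres database exists.

```shell
# gpopt_version: read gp_opt_version output on stdin and print only the
# GPOPT (Orca) version number. Prints nothing if no version is found.
# (Hypothetical helper for illustration.)
gpopt_version() {
    sed -n 's/.*GPOPT version: \([0-9.]*\).*/\1/p'
}

# Example usage (port and database are assumptions):
#   psql -p 15432 -d postgres -At -c 'select gp_opt_version();' | gpopt_version
```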
After creating the test cluster, you can run the test. Execute the following command in the Greenplum source root directory:
PGPORT=15432 make installcheck-world
All tests run with Orca enabled. Although we use the stable branch, some tests may still fail. Patches and bug reports are welcome.
For more in-depth technical articles about Greenplum, please visit the Greenplum Chinese community website.
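When some tests do fail, the Postgres test driver pg_regress normally leaves a regression.diffs file in the directory of the failing test suite. A small sketch for collecting those files from the source tree follows; list_regression_diffs is a hypothetical helper, and the exact set of directories that produce such files depends on the build.

```shell
# list_regression_diffs DIR: print every regression.diffs file found below
# DIR, sorted by path. (Hypothetical helper for illustration.)
list_regression_diffs() {
    find "$1" -type f -name regression.diffs | sort
}

# Example usage, run from the Greenplum source root:
#   list_regression_diffs .
```

Each file listed shows the difference between the expected and the actual test output, which is useful material for a bug report.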