Compiling Hadoop with Docker


Abstract: Installing the dependencies needed to compile Hadoop into a Docker image, and then compiling Hadoop inside a Docker container, makes compilation more efficient and avoids polluting the host. The same approach can be used when compiling other software.

GitHub address:

In my previous blog post, I introduced the steps to compile Hadoop on 64-bit Ubuntu. This post describes how to compile Hadoop using Docker.

I. Compilation steps

1. Download the Docker image

sudo docker pull kiwenlau/compile-hadoop

Or build the Docker image yourself, from the directory that contains the Dockerfile:

sudo docker build -t kiwenlau/compile-hadoop .

2. Download and unpack the Hadoop source files

export VERSION=2.7.2
tar -xzvf hadoop-$VERSION-src.tar.gz
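
The download command itself is not shown above. As a minimal sketch, assuming the source archive is fetched from the Apache archive server, the step could look like the following. The URL layout and the helper names hadoop_src_url and fetch_hadoop_src are assumptions for illustration, not taken from the original setup.

```shell
#!/bin/sh
# Sketch: build the download URL for a Hadoop source archive.
# Assumption: the archive is hosted on archive.apache.org under this layout.
hadoop_src_url() {
    version="$1"
    printf 'https://archive.apache.org/dist/hadoop/common/hadoop-%s/hadoop-%s-src.tar.gz\n' \
        "$version" "$version"
}

# Hypothetical helper: download and unpack the sources for one version.
# Needs network access and wget; unpacks into ./hadoop-<version>-src.
fetch_hadoop_src() {
    version="$1"
    wget "$(hadoop_src_url "$version")" &&
        tar -xzf "hadoop-$version-src.tar.gz"
}
```

With these functions loaded, running fetch_hadoop_src "$VERSION" would download and unpack the archive in the current directory.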

3. Run a Docker container and compile Hadoop inside it

sudo docker run -v $(pwd)/hadoop-$VERSION-src:/hadoop-$VERSION-src kiwenlau/compile-hadoop /root/ $VERSION

This step is time-consuming and takes about 15-30 minutes.

A successful build ends with output like the following:

[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23:46.056s
[INFO] Finished at: Tue May 31 16:40:53 UTC 2016
[INFO] Final Memory: 210M/915M
[INFO] ------------------------------------------------------------------------

comile hadoop 2.7.2 success!
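
To check the result from a script rather than by eye, one option is to save the container output to a file and search it for the "BUILD SUCCESS" line that Maven prints just above its timing summary. This is only a sketch: the function name build_succeeded and the idea of saving the output to a log file are assumptions, not part of the original workflow.

```shell
#!/bin/sh
# Sketch: report whether a saved Maven build log ended successfully.
# Assumption: the output of the docker run command was saved to a file,
# for example by piping it through "tee build.log".
build_succeeded() {
    log_file="$1"
    # Maven prints "BUILD SUCCESS" only when every module has built.
    grep -q 'BUILD SUCCESS' "$log_file"
}
```

The function returns exit status 0 on success, so it can be used directly in a condition such as: if build_succeeded build.log; then echo ok; fi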

The compiled binary package is located in the


The steps for compiling other versions of Hadoop are the same; just change the value of VERSION.

You can also use wget to download the compiled Hadoop binary package directly from GitHub.


II. Method summary

The method introduced in this article also applies when compiling other software. For the specific details, see the source code of kiwenlau/compile-hadoop.

1. Build the docker image needed for compilation

Compiling software often requires installing many dependencies, and different software sometimes requires different versions of the same dependency. Installing these dependencies directly on the host pollutes the host and makes the build hard to reproduce. Putting them in a Docker image avoids both problems.

2. Download the software source code

The source code is not placed in the Docker image. This makes it easy to compile different versions of the software with the same image, and it keeps building the image fast.

3. Run a Docker container to compile the software

The source code is mounted into the container as a volume, so the compiled executables are also written to that volume and remain available on the host.
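
The three steps above can be sketched as a small helper for other projects. This is a minimal sketch under stated assumptions: the function name compile_in_container, the /src mount point, and the image and build command passed to it are all hypothetical placeholders. The function only prints the docker command so that it can be inspected first.

```shell
#!/bin/sh
# Sketch: print the docker command that would compile a source tree
# inside a container. All names here are hypothetical placeholders.
compile_in_container() {
    image="$1"      # image that already contains the build dependencies
    src_dir="$2"    # absolute path to the source tree on the host
    shift 2         # remaining arguments: build command to run in the container
    # Mount the source tree at /src and use it as the working directory;
    # anything the build writes there stays on the host after the run.
    echo docker run --rm -v "$src_dir:/src" -w /src "$image" "$@"
}
```

For example, compile_in_container my/build-image /home/me/project make prints the command for building that directory with make. Removing the echo makes the function run the container instead of printing the command.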

Copyright notice
When reprinting, please credit the author, KiwenLau, and include this article's address: