Hadoop learning – Deployment

Time: 2020-10-23

System information

  • master

    os: Mac OS X 10.10
    ip: 192.168.2.108
    hostname: master
  • slave 1

    os: Mac OS X 10.10
    ip: 192.168.2.104
    hostname: s1

Edit /etc/hosts

On both the master and the slave, edit /etc/hosts:

# add config
192.168.2.108   master
192.168.2.104   s1
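
Since the same two entries have to be added on every machine, it can help to make the edit repeatable. The following is a minimal sketch, assuming a POSIX shell; add_host_entry is a hypothetical helper name, not something shipped with Hadoop or Mac OS X. It appends an entry only when the hostname is not already mapped.

```shell
#!/bin/sh
# add_host_entry (hypothetical helper): append "ip<TAB>name" to a hosts-style
# file unless a non-comment line already mentions that hostname.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Drop comment lines, then look for the hostname as a whole word.
    if grep -v '^[[:space:]]*#' "$hosts_file" 2>/dev/null | grep -qw -- "$name"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
}
```

Try it on a scratch copy first, for example add_host_entry /tmp/hosts.test 192.168.2.108 master. Writing to the real /etc/hosts requires root privileges.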

Passwordless SSH login configuration

http://www.jianshu.com/p/1fdc…

Hadoop installation

  • Hadoop installation directory preparation

    <1>. download hadoop-2.7.3.tar.gz
    <2>. copy it to the Hadoop deployment directory
    <3>. decompress it and rename the extracted directory
# create the deployment root; ~/work/hadoop/hadoop is produced by the rename below
mkdir -p ~/work/hadoop
mkdir -p ~/work/hadoop/hbase
cp ~/Documents/tool/hadoop/hadoop-2.7.3.tar.gz ~/work/hadoop/
cd ~/work/hadoop
tar xzvf hadoop-2.7.3.tar.gz
# rename the extracted directory, not the tarball
mv hadoop-2.7.3 hadoop
  • Hadoop configuration directory preparation

mkdir -p ~/work/hadoop/hadoop-config

~/.bash_profile preparation

# config java env
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home
# config hadoop
export HADOOP_CONF_DIR=/Users/jingchen/work/hadoop/hadoop-config
export HBASE_CONF_DIR=/Users/jingchen/work/hadoop/hbase-config
export HADOOP_HOME=/Users/jingchen/work/hadoop/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME

## config native lib of hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="${HADOOP_OPTS} -Djava.library.path=${HADOOP_HOME}/lib/native/"
export HADOOP_ROOT_LOGGER=DEBUG,console

Hadoop configuration file

  • Configuration file preparation
    Copy the configuration files from the etc/hadoop directory of the Hadoop installation to the hadoop-config directory, then modify the following files (mapred-site.xml is created by copying mapred-site.xml.template):

core-site.xml            hdfs-site.xml            mapred-site.xml.template yarn-env.cmd             yarn-site.xml
hadoop-env.sh                    slaves                   yarn-env.sh
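
The copy step described above can be sketched as a small shell function. This is only an illustration under stated assumptions: copy_hadoop_conf is a hypothetical helper name, and the two directories in the comments are the ones used in this article.

```shell
#!/bin/sh
# copy_hadoop_conf (hypothetical helper): copy the stock config files into a
# separate config directory and create mapred-site.xml from its template.
copy_hadoop_conf() {
    src=$1   # e.g. ~/work/hadoop/hadoop/etc/hadoop
    dst=$2   # e.g. ~/work/hadoop/hadoop-config
    mkdir -p "$dst" || return 1
    # "src/." copies the directory contents, including dotfiles.
    cp -R "$src"/. "$dst"/ || return 1
    # Only the template is shipped; do not overwrite an existing mapred-site.xml.
    if [ -f "$dst/mapred-site.xml.template" ] && [ ! -f "$dst/mapred-site.xml" ]; then
        cp "$dst/mapred-site.xml.template" "$dst/mapred-site.xml"
    fi
}
```

Run it once, before editing anything: copy_hadoop_conf ~/work/hadoop/hadoop/etc/hadoop ~/work/hadoop/hadoop-config. Running it again later would overwrite edited files such as core-site.xml with the stock versions.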

slaves

List the slave nodes, one hostname per line:

s1

core-site.xml

<configuration>
<property>
<!-- fs.defaultFS replaces the deprecated fs.default.name in Hadoop 2.x -->
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/Users/jingchen/work/hadoop/hadoop/tmp</value>
</property>

</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<!-- host:port of the secondary NameNode web UI, without an hdfs:// scheme -->
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/Users/jingchen/work/hadoop/hadoop/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/Users/jingchen/work/hadoop/hadoop/data/</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

</configuration>

mapred-site.xml

<configuration>
<!--
<property>
<name>mapred.job.tracker</name>
<value>hdfs://master:9001/</value>
</property>
-->

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>

</configuration>

yarn-env.sh

# add java env conf
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home

# add native lib conf
YARN_OPTS="$YARN_OPTS -Djava.library.path=${HADOOP_HOME}/lib/native/"

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
<property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
</property>

<property>
<!-- the key must use the same service name as aux-services: mapreduce_shuffle -->
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>

<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>

</configuration>

hadoop-env.sh

# add config item
export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home"
export HADOOP_PID_DIR="/Users/jingchen/work/hadoop/hadoop/tmp"
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# Blank out the Kerberos realm/KDC settings (commonly done on Mac OS X to avoid the "Unable to load realm info from SCDynamicStore" error).
#export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"
HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.conf=/dev/null"
export HADOOP_ROOT_LOGGER=DEBUG,console
export HADOOP_COMMON_LIB_NATIVE_DIR="${HADOOP_HOME}/lib/native"
export HADOOP_OPTS="${HADOOP_OPTS} -Djava.library.path=${HADOOP_HOME}/lib/native/"

Deploy the Hadoop files to the slave

# replace "user" with your login name on s1; ~/work/hadoop must already exist there
scp -r ~/work/hadoop/hadoop  user@s1:~/work/hadoop/

scp -r ~/work/hadoop/hadoop-config  user@s1:~/work/hadoop/
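
The scp commands above cover a single slave. With more slaves, the same hostnames already listed in the slaves file can drive the copy. The sketch below only prints the commands so they can be reviewed first; print_deploy_commands is a hypothetical helper, and the remote login name passed to it is an assumption, since it depends on how the slaves were set up.

```shell
#!/bin/sh
# print_deploy_commands (hypothetical helper): for every host in a slaves
# file, print the scp commands that would copy Hadoop and its config there.
print_deploy_commands() {
    slaves_file=$1
    remote_user=$2          # assumed login name on every slave
    base=$3                 # remote parent directory, optional
    [ -n "$base" ] || base='~/work/hadoop'   # "~" is expanded on the remote side
    while IFS= read -r host || [ -n "$host" ]; do
        # Ignore blank lines and comment lines in the slaves file.
        case $host in
            ''|'#'*) continue ;;
        esac
        printf 'scp -r ~/work/hadoop/hadoop %s@%s:%s/\n' "$remote_user" "$host" "$base"
        printf 'scp -r ~/work/hadoop/hadoop-config %s@%s:%s/\n' "$remote_user" "$host" "$base"
    done < "$slaves_file"
}
```

For example, print_deploy_commands ~/work/hadoop/hadoop-config/slaves user prints two scp lines per slave; pipe the output to sh once it looks right.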

Mac environment (very important): compiling the Hadoop native library

Why the native lib must be compiled:

Hadoop's native libraries are stored in the lib/native directory of the installation; in this article that is ~/work/hadoop/hadoop/lib/native.
However, when Hadoop is started on a Mac after installation, the following warning is reported:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Many of the prebuilt native libraries and build recipes found online do not work, so it is recommended to compile the library locally from the source release on the official website. The official documentation says:

The pre-built 32-bit i386-Linux native hadoop library is available as part of the hadoop distribution and is located in the lib/native directory. You can download the hadoop distribution from Hadoop Common Releases.
The native hadoop library is supported on *nix platforms only. The library does not to work with Cygwin or the Mac OS X platform.
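
Before starting a local build it is worth checking that the build tools are installed. The sketch below is a hedged illustration: missing_tools is a hypothetical helper, and the tool list in the usage note (java, mvn, cmake, protoc) is an assumption taken from the requirements in the BUILDING.txt file of the source release, which for Hadoop 2.7.3 also asks for Protocol Buffers 2.5.0 specifically.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print each named command that is not
# on PATH, one per line, and return non-zero if any of them is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            status=1
        fi
    done
    return $status
}

# Once nothing is reported missing, BUILDING.txt gives the native build as:
#   cd hadoop-2.7.3-src && mvn package -Pdist,native -DskipTests -Dtar
```

Usage: missing_tools java mvn cmake protoc. The helper only checks that the commands exist, not their versions.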

Using the new native lib

Copy the compiled native libraries into the corresponding directory of the downloaded binary Hadoop distribution:

## 1. back up the old native libs
mv ~/work/hadoop/hadoop/lib/native ~/work/hadoop/hadoop/lib/native_bak
## 2. recreate the directory and copy the newly built native libs into it
mkdir -p ~/work/hadoop/hadoop/lib/native
cp hadoop-2.7.3-src/hadoop-dist/target/hadoop-2.7.3/lib/native/* ~/work/hadoop/hadoop/lib/native/
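
After the copy, a quick sanity check is to confirm that a libhadoop file actually landed in the native directory. This is a minimal sketch; has_libhadoop is a hypothetical helper, and the libhadoop.* pattern is an assumption meant to cover both .dylib and .so builds.

```shell
#!/bin/sh
# has_libhadoop (hypothetical helper): succeed if the given directory
# contains at least one file whose name starts with "libhadoop.".
has_libhadoop() {
    dir=$1
    for f in "$dir"/libhadoop.*; do
        # When nothing matches, the pattern stays literal and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

Usage: has_libhadoop ~/work/hadoop/hadoop/lib/native && echo "native lib present". To see what Hadoop itself manages to load, run hadoop checknative -a.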