[HDFS 12] HA High Availability – HDFS-HA Cluster Configuration

Time:2021-3-8

Just keep your rhythm going

HDFS-HA cluster configuration

(1) Environment preparation

  • Modify the IP address
  • Modify the host name and the mapping between host names and IP addresses
  • Turn off the firewall
  • Configure SSH password-free login (see the sketch after this list)
  • Install the JDK and configure environment variables
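
A minimal sketch of the SSH password-free login step, run on each machine as the cluster user (the key type and user name here are just illustrative, not mandated by the original):

ssh-keygen -t rsa        # accept the defaults at each prompt
ssh-copy-id hadoop102    # copy the public key to every node, including the local one
ssh-copy-id hadoop103
ssh-copy-id hadoop104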

(2) Cluster planning

As the plan below shows, the NameNode is distributed across two machines to ensure the high availability of the cluster.

hadoop102          hadoop103          hadoop104
NameNode           NameNode
JournalNode        JournalNode        JournalNode
DataNode           DataNode           DataNode
ZK                 ZK                 ZK
                   ResourceManager
NodeManager        NodeManager        NodeManager

(3) Configure zookeeper cluster

1. Cluster planning

ZooKeeper is deployed on the hadoop102, hadoop103, and hadoop104 nodes.

2. Extract and install

(1) Unzip the ZooKeeper installation package to the /opt/module/ directory

tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/module/

(2) Create a zkData directory under /opt/module/zookeeper-3.4.10/

mkdir -p zkData

(3) In the directory /opt/module/zookeeper-3.4.10/conf, rename zoo_sample.cfg to zoo.cfg

mv zoo_sample.cfg zoo.cfg

3. Configure the zoo.cfg file

(1) Specific configuration

dataDir=/opt/module/zookeeper-3.4.10/zkData

Add the following configuration

#######################cluster##########################

server.2=hadoop102:2888:3888

server.3=hadoop103:2888:3888

server.4=hadoop104:2888:3888
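
For reference, the resulting zoo.cfg might look like this as a whole; tickTime, initLimit, syncLimit, and clientPort are the stock values shipped in zoo_sample.cfg:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/module/zookeeper-3.4.10/zkData
clientPort=2181
#######################cluster##########################
server.2=hadoop102:2888:3888
server.3=hadoop103:2888:3888
server.4=hadoop104:2888:3888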

(2) Interpretation of configuration parameters

server.A=B:C:D

A is a number that identifies the server (its ID within the cluster);

B is the IP address or host name of the server;

C is the port the server uses to exchange information with the leader of the cluster;

D is the port used for leader election: if the leader in the cluster goes down, the servers communicate with each other over this port to elect a new leader.

In cluster mode, a file named myid must be created in the dataDir directory. It contains the value of A for that server. When ZooKeeper starts, it reads this file and compares the value against the configuration in zoo.cfg to determine which server it is.

4. Cluster operation

(1) Create a myid file in the directory /opt/module/zookeeper-3.4.10/zkData

touch myid
Note that the myid file must be created on Linux; a file created in Notepad++ may end up garbled.

(2) Edit myid file

vi myid
Add the number corresponding to this server in the file, for example, 2.
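
Equivalently, the file can be written in one command (run inside the zkData directory):

echo 2 > myid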

(3) Copy the configured zookeeper to other machines

scp -r zookeeper-3.4.10/ <user>@hadoop103:/opt/app/

scp -r zookeeper-3.4.10/ <user>@hadoop104:/opt/app/

Then modify the contents of the myid file on those machines to 3 and 4 respectively.
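
With password-free SSH already configured, the two remote myid files can be rewritten without logging in interactively; a small sketch assuming the /opt/app target path used by the scp commands above:

ssh hadoop103 "echo 3 > /opt/app/zookeeper-3.4.10/zkData/myid"
ssh hadoop104 "echo 4 > /opt/app/zookeeper-3.4.10/zkData/myid"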

(4) Start ZooKeeper on each node

[root@hadoop102 zookeeper-3.4.10]# bin/zkServer.sh start

[root@hadoop103 zookeeper-3.4.10]# bin/zkServer.sh start

[root@hadoop104 zookeeper-3.4.10]# bin/zkServer.sh start

(5) View status

[root@hadoop102 zookeeper-3.4.10]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower

[root@hadoop103 zookeeper-3.4.10]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: leader

[root@hadoop104 zookeeper-3.4.10]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg
Mode: follower
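
Since password-free SSH is in place, the status of the whole ensemble can also be checked from a single machine. A convenience sketch, not part of the original steps (the install path is the one used above, and zkServer.sh must be able to find JAVA_HOME in a non-interactive shell):

for host in hadoop102 hadoop103 hadoop104
do
    ssh $host "/opt/module/zookeeper-3.4.10/bin/zkServer.sh status"
done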

(4) Configure the HDFS-HA cluster

1. Official address

http://hadoop.apache.org/

2. Create a ha folder in the /opt directory

mkdir ha

3. Copy hadoop-2.7.2 from /opt/app/ to the /opt/ha directory

cp -r hadoop-2.7.2/ /opt/ha/

4. Configure hadoop-env.sh

export JAVA_HOME=/opt/module/jdk1.8.0_144

5. Configure core-site.xml

<configuration>
        <!-- assemble the addresses of the two NameNodes into one cluster, mycluster -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://mycluster</value>
        </property>

        <!-- specify the storage directory for files generated at Hadoop runtime -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/opt/ha/hadoop-2.7.2/data/tmp</value>
        </property>
</configuration>

6. Configure hdfs-site.xml

<configuration>
    <!-- fully distributed cluster name -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <!-- the NameNodes in the cluster -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>

    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop102:9000</value>
    </property>

    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop103:9000</value>
    </property>

    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop102:50070</value>
    </property>

    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop103:50070</value>
    </property>

    <!-- storage location of NameNode metadata on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop102:8485;hadoop103:8485;hadoop104:8485/mycluster</value>
    </property>

    <!-- configure the fencing mechanism, so that only one NameNode can respond at a time -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>

    <!-- sshfence requires password-free SSH login -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/zhutiansama/.ssh/id_rsa</value>
    </property>

    <!-- declare the JournalNode storage directory -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/opt/ha/hadoop-2.7.2/data/jn</value>
    </property>

    <!-- turn off permission checking -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

    <!-- proxy class the HDFS client uses to determine the Active NameNode and switch over on failure -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
</configuration>

7. Copy the configured Hadoop environment to other nodes
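
The original gives no command here; a plausible sketch, reusing the root user from the earlier prompts (adjust the user to your setup, and make sure /opt/ha exists on the target machines first):

scp -r /opt/ha/hadoop-2.7.2/ root@hadoop103:/opt/ha/
scp -r /opt/ha/hadoop-2.7.2/ root@hadoop104:/opt/ha/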

(5) Start the HDFS-HA cluster

1. On each JournalNode machine, run the following command to start the JournalNode service

sbin/hadoop-daemon.sh start journalnode
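
A quick way to confirm on each machine that the daemon came up is jps, which should now list a JournalNode process:

jps | grep JournalNode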

2. On [nn1], format the NameNode and start it

bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode

3. On [nn2], synchronize the metadata from nn1

bin/hdfs namenode -bootstrapStandby

4. Start [nn2]

sbin/hadoop-daemon.sh start namenode

5. View the web pages of the two NameNodes

(screenshot: NameNode web UI; per the configuration above, the pages are at hadoop102:50070 and hadoop103:50070)

6. On [nn1], start all DataNodes

sbin/hadoop-daemons.sh start datanode

7. Switch [nn1] to Active

bin/hdfs haadmin -transitionToActive nn1    

8. Check whether it is active

bin/hdfs haadmin -getServiceState nn1
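
If the switch succeeded, the command prints the service state on its own line:

bin/hdfs haadmin -getServiceState nn1
active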

(6) Configure HDFS-HA automatic failover

1. Specific configuration

(1) Add the following to hdfs-site.xml

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

(2) Add the following to core-site.xml

<property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>

2. Start up

(1) Shut down all HDFS services:

sbin/stop-dfs.sh

(2) Start the ZooKeeper cluster:

bin/zkServer.sh start

(3) Initialize the HA state in ZooKeeper:

bin/hdfs zkfc -formatZK
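
formatZK creates a znode for the nameservice under /hadoop-ha in ZooKeeper. One way to confirm, using the ZooKeeper CLI from the ZooKeeper installation directory:

bin/zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /hadoop-ha
[mycluster]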

(4) Start HDFS service:

sbin/start-dfs.sh

(5) Start the DFSZKFailoverController (ZKFC) on each NameNode machine; the NameNode on the machine where ZKFC is started first becomes the Active NameNode

sbin/hadoop-daemon.sh start zkfc

3. Verification

(1) Kill the Active NameNode process

kill -9 <NameNode process ID>

(2) Disconnect the Active NameNode machine from the network

service network stop
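
Putting the checks together, a minimal verification run might look like this, assuming nn1 is currently Active (the PID placeholder must be filled in from the jps output):

jps                                    # find the NameNode process ID on the Active machine
kill -9 <NameNode process ID>
bin/hdfs haadmin -getServiceState nn2  # nn2 should now report: active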

