System environment
RHEL 7.5, with SELinux and iptables disabled. The Hadoop, JDK, and ZooKeeper installations share their configuration files across nodes over NFS.
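As a rough sketch of that NFS sharing, assuming server1 exports the hadoop user's home directory (which holds the Hadoop, JDK, and ZooKeeper trees) and the other nodes mount it:

# On server1 (NFS server) -- assumed export of the hadoop home directory
[root@server1 ~]# cat /etc/exports
/home/hadoop    *(rw,sync,anonuid=1000,anongid=1000)
[root@server1 ~]# systemctl enable nfs-server
[root@server1 ~]# systemctl start nfs-server
# On server2/3/4/5 (NFS clients), mount the shared home directory
[root@server2 ~]# mount 172.25.5.1:/home/hadoop /home/hadoop

With this layout, editing a configuration file on server1 is immediately visible on every other node.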
Software versions
hadoop-3.2.1.tar.gz zookeeper-3.4.9.tar.gz jdk-8u181-linux-x64.tar.gz hbase-1.2.4-bin.tar.gz
Host list
IP            Hostname               Roles
172.25.5.1    server1.example.com    NameNode, DFSZKFailoverController, ResourceManager
172.25.5.5    server5.example.com    NameNode, DFSZKFailoverController, ResourceManager
172.25.5.2    server2.example.com    JournalNode, QuorumPeerMain, DataNode, NodeManager
172.25.5.3    server3.example.com    JournalNode, QuorumPeerMain, DataNode, NodeManager
172.25.5.4    server4.example.com    JournalNode, QuorumPeerMain, DataNode, NodeManager
Stop DFS on server1
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh
[hadoop@server1 hadoop]$ cd etc/hadoop/
[hadoop@server1 hadoop]$ vim mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
[hadoop@server1 hadoop]$ vim yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start the services
[hadoop@server1 hadoop]$ cd ../..
[hadoop@server1 hadoop]$ sbin/start-yarn.sh
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
[hadoop@server1 hadoop]$ jps
Check on the slave nodes
[hadoop@server2 hadoop]$ jps
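At this point jps should show roughly the following daemons on each type of node (plus the Jps process itself):

[hadoop@server1 hadoop]$ jps    # master: NameNode, ResourceManager, and typically a SecondaryNameNode before HA is configured
[hadoop@server2 hadoop]$ jps    # slave:  DataNode, NodeManager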
Because server4 was never added to the cluster through the workers file on the master, it does not show the expected processes in jps. So on server1, edit the workers file and hdfs-site.xml to add server4 to the cluster (a sample workers file is shown after the commands below).
[hadoop@server1 hadoop]$ cd etc/hadoop/
[hadoop@server1 hadoop]$ vim workers
[hadoop@server1 hadoop]$ vim hdfs-site.xml
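For reference, the workers file simply lists one DataNode per line; with server4 added it would contain the three DataNode addresses (assuming the IPs from the host list are used rather than hostnames):

[hadoop@server1 hadoop]$ cat workers
172.25.5.2
172.25.5.3
172.25.5.4

The matching edit in hdfs-site.xml is presumably raising dfs.replication to 3, which is the value that appears in the full hdfs-site.xml further down.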
Restart the services
[hadoop@server1 hadoop]$ sbin/start-yarn.sh
[hadoop@server1 hadoop]$ sbin/start-dfs.sh
Check on server4
[hadoop@server4 hadoop]$ jps
Stop the DFS and YARN services
[hadoop@server1 hadoop]$ sbin/stop-dfs.sh
[hadoop@server1 hadoop]$ sbin/stop-yarn.sh
Delete everything under /tmp
[root@server1 hadoop]# rm -rf /tmp/*
Also delete everything under /tmp on server2, server3, and server4.
Now set up ZooKeeper. A ZooKeeper ensemble needs at least three nodes, and the total number of nodes should be odd: an ensemble of 2n+1 servers tolerates n failures, so adding a fourth node to a three-node ensemble gains no extra fault tolerance.
On the physical host, send the ZooKeeper tarball to server1:
[root@foundation5 ~]# scp zookeeper-3.4.9.tar.gz hadoop@172.25.5.1:/home/hadoop/
Configure ZooKeeper on server1
[hadoop@server1 ~]$ ls
hadoop        hadoop-3.2.1.tar.gz  jdk1.8.0_181                zookeeper-3.4.9.tar.gz
hadoop-3.2.1  java                 jdk-8u181-linux-x64.tar.gz
[hadoop@server1 ~]$ tar zxf zookeeper-3.4.9.tar.gz
[hadoop@server1 ~]$ ls
hadoop        hadoop-3.2.1.tar.gz  jdk1.8.0_181                zookeeper-3.4.9
hadoop-3.2.1  java                 jdk-8u181-linux-x64.tar.gz  zookeeper-3.4.9.tar.gz
[hadoop@server1 ~]$ cd zookeeper-3.4.9/conf/
[hadoop@server1 conf]$ ls
configuration.xsl  log4j.properties  zoo_sample.cfg
[hadoop@server1 conf]$ cp zoo_sample.cfg zoo.cfg
[hadoop@server1 conf]$ vim zoo.cfg
server.1=172.25.5.2:2888:3888
server.2=172.25.5.3:2888:3888
server.3=172.25.5.4:2888:3888
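These three server.N lines are appended to the settings inherited from zoo_sample.cfg; the stock values in ZooKeeper 3.4.9's sample file are:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181

In particular, dataDir=/tmp/zookeeper is why the myid files below are created under /tmp/zookeeper.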
Configure ZooKeeper on server2, server3, and server4
All nodes use the same configuration file. In addition, each node needs a myid file in the /tmp/zookeeper directory (the dataDir) containing a unique number in the range 1-255. For example, on 172.25.5.2 the myid file contains "1", matching the definition in the configuration file (server.1=172.25.5.2:2888:3888); the other nodes follow the same pattern.
server2:
[hadoop@server2 ~]$ mkdir /tmp/zookeeper
[hadoop@server2 ~]$ echo 1 > /tmp/zookeeper/myid
[hadoop@server2 ~]$ cd zookeeper-3.4.9/
[hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh start
[hadoop@server2 zookeeper-3.4.9]$ bin/zkServer.sh status
server3:
[hadoop@server3 ~]$ mkdir /tmp/zookeeper
[hadoop@server3 ~]$ echo 2 > /tmp/zookeeper/myid
[hadoop@server3 ~]$ cd zookeeper-3.4.9/
[hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh start
[hadoop@server3 zookeeper-3.4.9]$ bin/zkServer.sh status
server4:
[hadoop@server4 ~]$ mkdir /tmp/zookeeper
[hadoop@server4 ~]$ echo 3 > /tmp/zookeeper/myid
[hadoop@server4 ~]$ cd zookeeper-3.4.9/
[hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh start
[hadoop@server4 zookeeper-3.4.9]$ bin/zkServer.sh status
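With all three nodes started, zkServer.sh status should report one node as leader and the other two as followers. To confirm the ensemble also answers client requests, a quick check with the bundled CLI (a brand-new ensemble contains only the /zookeeper znode) looks roughly like this:

[hadoop@server2 zookeeper-3.4.9]$ bin/zkCli.sh -server 172.25.5.2:2181
[zk: 172.25.5.2:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: 172.25.5.2:2181(CONNECTED) 1] quit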
Hadoop configuration
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ cd etc/
[hadoop@server1 etc]$ cd hadoop/
[hadoop@server1 hadoop]$ vim core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://masters</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>172.25.5.2:2181,172.25.5.3:2181,172.25.5.4:2181</value>
    </property>
</configuration>
[hadoop@server1 hadoop]$ vim hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>masters</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.masters</name>
        <value>h1,h2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.masters.h1</name>
        <value>172.25.5.1:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.masters.h1</name>
        <value>172.25.5.1:9870</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.masters.h2</name>
        <value>172.25.5.5:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.masters.h2</name>
        <value>172.25.5.5:9870</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://172.25.5.2:8485;172.25.5.3:8485;172.25.5.4:8485/masters</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/tmp/journaldata</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.masters</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>
Start the HDFS cluster (start the components in order)
Start a JournalNode on each of the three DataNode hosts in turn (the JournalNodes must be running before HDFS is started for the first time):
server2:
[hadoop@server2 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server2 hadoop]$ jps
server3:
[hadoop@server3 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server3 hadoop]$ jps
server4:
[hadoop@server4 hadoop]$ bin/hdfs --daemon start journalnode
[hadoop@server4 hadoop]$ jps
Format the HDFS cluster
server1:
[hadoop@server1 hadoop]$ bin/hdfs namenode -format