First, create four virtual machines, planned as follows:
IP address        Hostname   Installed software        Running services
192.168.16.134    hadoop1    jdk, hadoop               namenode, Journalnode, ZKFC, Resourcemanager
192.168.16.135    hadoop2    jdk, hadoop, zookeeper    namenode, datanode, Journalnode, ZKFC, Resourcemanager, zookeeper
192.168.16.136    hadoop3    jdk, hadoop, zookeeper    datanode, Journalnode, zookeeper
192.168.16.137    hadoop4    jdk, hadoop, zookeeper    datanode, zookeeper
Next, disable the firewall and SELinux by running the following commands on every server (run on: all nodes):
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
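A quick sanity check that SELinux and the firewall really are off. This is a sketch: every command is guarded, so it is harmless on machines that lack SELinux or systemd.

```shell
# Verify SELinux and firewalld state; the guards keep this safe anywhere.
if [ -f /etc/selinux/config ]; then
    grep '^SELINUX=' /etc/selinux/config || true         # expect SELINUX=disabled
fi
if command -v getenforce >/dev/null 2>&1; then
    getenforce || true                                   # expect Permissive or Disabled
fi
if command -v systemctl >/dev/null 2>&1; then
    systemctl is-enabled firewalld 2>/dev/null || true   # expect disabled
fi
CHECKS=done
```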
Change each host's hostname (run on: all nodes):
hostnamectl set-hostname hadoop1    # use hadoop2/3/4 on the other hosts
Add hosts entries on every host (run on: all nodes):
echo "
192.168.16.134 hadoop1
192.168.16.135 hadoop2
192.168.16.136 hadoop3
192.168.16.137 hadoop4" >> /etc/hosts
Create the hadoop user on all hosts (run on: all nodes):
useradd hadoop
passwd hadoop
Switch to the hadoop user and set up passwordless SSH (run on: hadoop1):
su - hadoop
# ssh-keygen: just press Enter at every prompt
ssh-keygen
# copy the key to the local machine as well
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop3
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop4
Switch back to root and install the JDK. OpenJDK is the easiest to install, so install it with yum, then create the /hadoop directory (run on: all nodes):
yum install -y java-1.8.0-openjdk.x86_64
# verify the installation
java -version
# JAVA_HOME is: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-0.el7_6.x86_64/jre
mkdir /hadoop
chown -R hadoop:hadoop /hadoop
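The JAVA_HOME path above bakes in an exact OpenJDK build number, which breaks after the next yum update. A hedged sketch that derives it from whatever `java` is on the PATH instead (guarded, so it is a no-op on machines without java):

```shell
# Resolve JAVA_HOME from the java binary on the PATH instead of
# hardcoding the versioned directory name.
if command -v java >/dev/null 2>&1; then
    JAVA_BIN=$(readlink -f "$(command -v java)")     # e.g. .../jre/bin/java
    JAVA_HOME=$(dirname "$(dirname "$JAVA_BIN")")    # strip the trailing /bin/java
    echo "JAVA_HOME=$JAVA_HOME"
fi
```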
Next, set up ZooKeeper. Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/
su - hadoop
cd /hadoop
curl -O https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.5.5/apache-zookeeper-3.5.5-bin.tar.gz
# unpack the archive
tar -zxvf apache-zookeeper-3.5.5-bin.tar.gz
cd apache-zookeeper-3.5.5-bin/conf/
# rename the sample configuration file
mv zoo_sample.cfg zoo.cfg
# create the zookeeper data directory
mkdir -p /hadoop/data/zookeeper
The zoo.cfg configuration file:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/hadoop/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=hadoop2:2888:3888
server.2=hadoop3:2888:3888
server.3=hadoop4:2888:3888
Create a myid file in the dataDir directory on each ZooKeeper node, containing 1, 2, and 3 respectively:
# hadoop2
echo 1 > /hadoop/data/zookeeper/myid
# hadoop3
echo 2 > /hadoop/data/zookeeper/myid
# hadoop4
echo 3 > /hadoop/data/zookeeper/myid
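Typing a different myid on each host by hand is easy to mix up. A sketch that derives the id from the hostname instead; HOST and DATADIR are stand-ins here so the snippet can run anywhere, while on a real node you would use HOST=$(hostname) and DATADIR=/hadoop/data/zookeeper:

```shell
# Derive the ZooKeeper myid from the hostname instead of typing it by hand.
HOST=hadoop3              # stand-in; on a real node: HOST=$(hostname)
DATADIR=$(mktemp -d)      # stand-in for /hadoop/data/zookeeper
case "$HOST" in
    hadoop2) ID=1 ;;
    hadoop3) ID=2 ;;
    hadoop4) ID=3 ;;
    *) echo "not a zookeeper node: $HOST" >&2; exit 1 ;;
esac
echo "$ID" > "$DATADIR/myid"
cat "$DATADIR/myid"       # → 2
```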
Configure the environment variables:
vim ~/.bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
export ZOOKEEPER_HOME=/hadoop/apache-zookeeper-3.5.5-bin
PATH=$PATH:$ZOOKEEPER_HOME/bin
Reload it:
source ~/.bashrc
Start the ZooKeeper cluster (run on: hadoop2, hadoop3, hadoop4):
zkServer.sh start
# check that it started
$ zkServer.sh status
/bin/java
ZooKeeper JMX enabled by default
Using config: /hadoop/apache-zookeeper-3.5.5-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader
Hadoop deployment (run on: hadoop1)
Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable2/hadoop-3.2.0.tar.gz
cd /hadoop
curl -O https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/stable2/hadoop-3.2.0.tar.gz
tar -zxvf hadoop-3.2.0.tar.gz
Add the environment variables:
On hadoop1, append to ~/.bashrc:
export HADOOP_HOME=/hadoop/hadoop-3.2.0
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
On hadoop2, hadoop3, and hadoop4, the full ~/.bashrc becomes:
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
export ZOOKEEPER_HOME=/hadoop/apache-zookeeper-3.5.5-bin
export HADOOP_HOME=/hadoop/hadoop-3.2.0
PATH=$PATH:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Edit the configuration files:
In hadoop-env.sh, add the following line anywhere:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-0.el7_6.x86_64/jre
In core-site.xml, add the following inside the <configuration> element:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/hadoop/data</value>
</property>
In hdfs-site.xml:
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>hadoop1:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>hadoop2:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>hadoop1:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>hadoop2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/hadoop/data</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
</property>
In mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- enable resourcemanager HA (default is false) -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- declare the two resourcemanagers -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmcluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop2</value>
  </property>
  <!-- the zookeeper quorum address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
  </property>
  <!-- enable automatic recovery: if the RM dies mid-job, resume from saved state (default is false) -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- store resourcemanager state in the zookeeper cluster (the default stores it on the FileSystem) -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
The workers file:
hadoop2
hadoop3
hadoop4
Copy the hadoop directory to the other three machines with scp.
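The copy step can be sketched as a small loop. DRY_RUN=echo makes the loop only print the commands, so the sketch is safe to run anywhere; clear it (DRY_RUN=) on the real hadoop1.

```shell
# Sketch of copying the extracted hadoop directory to the other nodes.
DRY_RUN=echo
for h in hadoop2 hadoop3 hadoop4; do
    $DRY_RUN scp -r /hadoop/hadoop-3.2.0 "hadoop@$h:/hadoop/"
done
```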
The yarn.resourcemanager.ha.id property must differ per node: in the yarn-site.xml on hadoop1 keep rm1, and on hadoop2 change it to rm2:
<!-- hadoop1 -->
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value>
</property>
<!-- hadoop2 -->
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
</property>
Start ZooKeeper (if it is not already running).
Start the journalnodes on hadoop1, hadoop2, and hadoop3:
hadoop-daemon.sh start journalnode
Format the namenode on hadoop1:
hdfs namenode -format
Start the namenode on hadoop1:
hadoop-daemon.sh start namenode
Sync the metadata from hadoop1 to hadoop2. On hadoop2, run:
hdfs namenode -bootstrapStandby
Start the namenode on hadoop2:
hadoop-daemon.sh start namenode
Format ZooKeeper for automatic failover; once it finishes you can check the result inside ZooKeeper:
hdfs zkfc -formatZK
On hadoop1, run:
start-dfs.sh
start-yarn.sh
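Once both start scripts finish, a few commands confirm the HA roles came up as expected. This is a sketch: nn1/nn2 and rm1 are the ids defined in the configs above, and each command is guarded so the snippet does nothing off-cluster.

```shell
# Confirm HA state after start-dfs.sh / start-yarn.sh.
if command -v hdfs >/dev/null 2>&1; then
    hdfs haadmin -getServiceState nn1 || true    # expect: active
    hdfs haadmin -getServiceState nn2 || true    # expect: standby
fi
if command -v yarn >/dev/null 2>&1; then
    yarn rmadmin -getServiceState rm1 || true    # expect: active
fi
if command -v jps >/dev/null 2>&1; then
    # expect NameNode, DataNode, JournalNode, DFSZKFailoverController,
    # ResourceManager, QuorumPeerMain on the relevant nodes
    jps || true
fi
STATE_CHECK=done
```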
Reposted from: https://www.cnblogs.com/hope123/p/11274172.html