For the basic environment setup, please refer to my other article: Basic Hadoop Deployment.
Planning

                  master   slave1   slave2
namenode            √        √
datanode            √        √        √
nodemanager         √        √        √
resourcemanager     √        √
journalnode         √        √        √
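The hostnames master, slave1 and slave2 used throughout this article must resolve on every node. A minimal /etc/hosts sketch; the IP addresses below are placeholders only, replace them with your own:

# /etc/hosts on all three nodes (example IPs only)
192.168.1.10   master
192.168.1.11   slave1
192.168.1.12   slave2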
1. Modify hadoop-env.sh

export JAVA_HOME=/usr/local/java
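Before going further you can quickly confirm that the JDK really lives under /usr/local/java on every node:

/usr/local/java/bin/java -version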
2. Modify core-site.xml

<!-- Set the HDFS nameservice to ler (the name can be customized) -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ler</value>
</property>
<!-- Hadoop working directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>
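The ZooKeeper ensemble referenced by ha.zookeeper.quorum must already be installed and running on master, slave1 and slave2. A quick check on each node, assuming ZooKeeper's bin directory is on the PATH:

zkServer.sh status    # expect "Mode: leader" on one node and "Mode: follower" on the others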
3. Modify hdfs-site.xml

<!-- Replication factor; must not exceed the number of datanodes -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<!-- HDFS nameservice, ler, must match core-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>ler</value>
</property>
<!-- Two namenodes for high availability -->
<property>
  <name>dfs.ha.namenodes.ler</name>
  <value>nn1,nn2</value>
</property>
<!-- RPC addresses of nn1 and nn2 -->
<property>
  <name>dfs.namenode.rpc-address.ler.nn1</name>
  <value>master:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ler.nn2</name>
  <value>slave1:9000</value>
</property>
<!-- HTTP addresses of nn1 and nn2 -->
<property>
  <name>dfs.namenode.http-address.ler.nn1</name>
  <value>master:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.ler.nn2</name>
  <value>slave1:50070</value>
</property>
<!-- Location of the namenode shared edits on the journalnodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;slave1:8485;slave2:8485/ler</value>
</property>
<!-- Local storage directory of the journalnodes -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/usr/local/hadoop/journaldata</value>
</property>
<!-- Enable automatic failover when a namenode fails -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.ler</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<!-- SSH private key used for fencing -->
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/mr/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
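The sshfence method requires that the two namenode hosts can ssh into each other without a password, using the private key configured above. A sketch, assuming the Hadoop user is mr (matching /home/mr/.ssh/id_rsa):

# run as user mr, once on master and once on slave1
ssh-keygen -t rsa          # accept the default key path ~/.ssh/id_rsa
ssh-copy-id mr@master
ssh-copy-id mr@slave1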
4. Modify mapred-site.xml

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master:19888</value>
</property>
5. Modify yarn-site.xml

<!-- Enable ResourceManager HA -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<!-- Cluster id (can be customized) -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>lyarn</value>
</property>
<!-- Names of the two resourcemanagers -->
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<!-- Hostnames of the resourcemanagers -->
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>slave1</value>
</property>
<!-- Id of the current machine -->
<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm1</value>
</property>
<!-- ZooKeeper addresses -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
6. Configure the slaves file

master
slave1
slave2
7. Distribute the configured hadoop directory to slave1 and slave2 (run from /usr/local on the master node)

scp -r hadoop slave1:/usr/local/
scp -r hadoop slave2:/usr/local/
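After the copy finishes you can spot-check that the configuration actually arrived, assuming a Hadoop 2.x layout where the configs live under etc/hadoop:

ssh slave1 ls /usr/local/hadoop/etc/hadoop/hdfs-site.xml
ssh slave2 ls /usr/local/hadoop/etc/hadoop/hdfs-site.xml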
8. Modify yarn-site.xml on slave1: change the original rm1 to rm2

<property>
  <name>yarn.resourcemanager.ha.id</name>
  <value>rm2</value>
</property>
9. Start the journalnode on every node

hadoop-daemon.sh start journalnode
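Before moving on to formatting, each of the three nodes should now show a JournalNode process:

jps    # expect a JournalNode entry on master, slave1 and slave2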
10. Format the namenode on the master node

hadoop namenode -format
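If the format succeeded, the namenode metadata directory should have been created under hadoop.tmp.dir (with the default dfs.namenode.name.dir this is ${hadoop.tmp.dir}/dfs/name):

ls /usr/local/hadoop/tmp/dfs/name/current    # should contain fsimage_* and VERSION files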
11. Synchronize the metadata of the two namenodes

a) First, start the namenode on the master node:

hadoop-daemon.sh start namenode
b) Then log in to slave1 and run:

hadoop namenode -bootstrapStandby
12. Format zkfc on the master node

hdfs zkfc -formatZK
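Formatting zkfc creates a znode for the nameservice in ZooKeeper; you can verify it with the ZooKeeper CLI (assuming zkCli.sh is on the PATH):

zkCli.sh -server master:2181 ls /hadoop-ha    # should list the nameservice, here: [ler]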
13. Start dfs

start-dfs.sh
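Once HDFS is up, confirm that one namenode became active and the other standby, either in the web UIs (master:50070 and slave1:50070) or on the command line:

hdfs haadmin -getServiceState nn1    # one of active / standby
hdfs haadmin -getServiceState nn2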
14. Start yarn

start-yarn.sh

(Remember: do not use start-all.sh.)
15. Log in to slave1 and start its resourcemanager (only needed on the first startup)

yarn-daemon.sh start resourcemanager
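With both resourcemanagers running, their HA state can be checked the same way as the namenodes:

yarn rmadmin -getServiceState rm1    # one should be active, the other standby
yarn rmadmin -getServiceState rm2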
Start the job history server (on master):

mr-jobhistory-daemon.sh start historyserver
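As a final smoke test you can submit the bundled example job and then look for it in the history server web UI at master:19888. The jar path below assumes a standard Hadoop 2.x binary distribution under /usr/local/hadoop:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10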