搭建hadoop2.6.0集群环境

mac2022-06-30  28

一、规划 (一)硬件资源 10.171.29.191 master 10.171.94.155  slave1 10.251.0.197 slave3

(二)基本资料 用户:  jediael 目录:/mnt/jediael/ 二、环境配置 (一)统一用户名密码,并为jediael赋予执行所有命令的权限 #passwd   # useradd jediael   # passwd jediael   # vi /etc/sudoers   增加以下一行: jediael ALL=(ALL) ALL 

(二)创建目录/mnt/jediael $sudo chown jediael:jediael /opt   $ cd /opt   $ sudo mkdir jediael   注意:/opt必须是jediael的,否则会在format namenode时出错。 (三)修改用户名及/etc/hosts文件 1、修改/etc/sysconfig/network NETWORKING=yes   HOSTNAME=*******

  2、修改/etc/hosts 10.171.29.191 master 10.171.94.155  slave1 10.251.0.197 slave3 注 意hosts文件不能有127.0.0.1  *****配置,否则会导致出现异常。org.apache.hadoop.ipc.Client: Retrying connect to server: master/10.171.29.191:9000. Already trie 

3、hostname命令 hostname ****   (四)配置免密码登录 以上命令在master上使用jediael用户执行: $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa   $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys   然后,将authorized_keys复制到slave1,slave2 scp ~/.ssh/authorized_keys slave1:~/.ssh/   scp ~/.ssh/authorized_keys slave2:~/.ssh/   注意 (1)若提示.ssh目录不存在,则表示此机器从未运行过ssh,因此运行一次即可创建.ssh目录。 (2).ssh/的权限为600,authorized_keys的权限为700,权限大了小了都不行。 (五)在3台机器上分别安装java,并设置相关环境变量 参考http://blog.csdn.net/jediael_lu/article/details/38925871 (六)下载hadoop-2.6.0.tar.gz,并将其解压到/mnt/jediael wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz tar -zxvf hadoop-2.6.0.tar.gz三、修改配置文件 【3台机器上均要执行,一般先在一台机器上配置完成,再用scp复制到其它机器】 (一)hadoop_env.sh export JAVA_HOME=/usr/java/jdk1.7.0_51   (二)修改core-site.xml        

<property> <name>hadoop.tmp.dir</name> <value>/mnt/tmp</value> <description>Abase for other temporary directories.</description> </property> <property> <name>fs.defaultFS</name> <value>hdfs://master:9000</value> </property> <property> <name>io.file.buffer.size</name> <value>4096</value> </property> (三)修改hdfs-site.xml <property> <name>dfs.replication</name> <value>2</value> </property> (四)修改mapred-site.xml   <property> <name>mapreduce.framework.name</name> <value>yarn</value> <final>true</final> </property> <property> <name>mapreduce.jobtracker.http.address</name> <value>master:50030</value> </property> <property> <name>mapreduce.jobhistory.address</name> <value>master:10020</value> </property> <property> <name>mapreduce.jobhistory.webapp.address</name> <value>master:19888</value> </property> <property> <name>mapred.job.tracker</name> <value>http://master:9001</value> </property> (五)修改yarn.xml <property> <name>yarn.resourcemanager.hostname</name> <value>master</value> </property> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.resourcemanager.address</name> <value>master:8032</value> </property> <property> <name>yarn.resourcemanager.scheduler.address</name> <value>master:8030</value> </property> <property> <name>yarn.resourcemanager.resource-tracker.address</name> <value>master:8031</value> </property> <property> <name>yarn.resourcemanager.admin.address</name> <value>master:8033</value> </property> <property> <name>yarn.resourcemanager.webapp.address</name> <value>master:8088</value> </property> (六)修改slaves 【不用修改masters文件??】 slaves:   slave1 slave3 四、启动并验证 1、格式 化namenode [jediael@master hadoop-1.2.1]$  bin/hadoop namenode -format   2、启动hadoop【此步骤只需要在master上执行】 [jediael@master hadoop-1.2.1]$ bin/start-all.sh    3、验证1:向hdfs中写入内容 [jediael@master hadoop-2.6.0]$ bin/hadoop fs -ls / [jediael@master hadoop-2.6.0]$ bin/hadoop fs -mkdir /test [jediael@master hadoop-2.6.0]$ bin/hadoop fs -ls /        Found 1 items drwxr-xr-x   - jediael supergroup          0 2015-04-19 23:41 /test 4、验证:登录页面 NameNode    http://ip:50070    5、查看各个主机的java进程 (1)master: $ jps 3694 NameNode 3882 SecondaryNameNode 7216 Jps 4024 ResourceManager

(2)slave1: $ jps 1913 NodeManager 2673 Jps 1801 DataNode

(3)slave3: $ jps 1942 NodeManager 2252 Jps 1840 DataNode 五、运行一个完整的mapreduce程序:运行自带的wordcount程序 $ bin/hadoop fs -mkdir /input

$ bin/hadoop fs -ls /         Found 2 items drwxr-xr-x   - jediael supergroup          0 2015-04-20 18:04 /input drwxr-xr-x   - jediael supergroup          0 2015-04-19 23:41 /test

$ bin/hadoop fs -copyFromLocal etc/hadoop/mapred-site.xml.template /input  

$ pwd /mnt/jediael/hadoop-2.6.0/share/hadoop/mapreduce

$ /mnt/jediael/hadoop-2.6.0/bin/hadoop jar hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output 15/04/20 18:15:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 15/04/20 18:15:48 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 15/04/20 18:15:48 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 15/04/20 18:15:49 INFO input.FileInputFormat: Total input paths to process : 1 15/04/20 18:15:49 INFO mapreduce.JobSubmitter: number of splits:1 15/04/20 18:15:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local657082309_0001 15/04/20 18:15:50 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 15/04/20 18:15:50 INFO mapreduce.Job: Running job: job_local657082309_0001 15/04/20 18:15:50 INFO mapred.LocalJobRunner: OutputCommitter set in config null 15/04/20 18:15:50 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 15/04/20 18:15:50 INFO mapred.LocalJobRunner: Waiting for map tasks 15/04/20 18:15:50 INFO mapred.LocalJobRunner: Starting task: attempt_local657082309_0001_m_000000_0 15/04/20 18:15:50 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 15/04/20 18:15:50 INFO mapred.MapTask: Processing split: hdfs://master:9000/input/mapred-site.xml.template:0+2268 15/04/20 18:15:51 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 15/04/20 18:15:51 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 15/04/20 18:15:51 INFO mapred.MapTask: soft limit at 83886080 15/04/20 18:15:51 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 15/04/20 18:15:51 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 15/04/20 18:15:51 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 15/04/20 18:15:51 INFO mapred.LocalJobRunner: 15/04/20 18:15:51 INFO mapred.MapTask: Starting flush of map output 15/04/20 18:15:51 INFO mapred.MapTask: Spilling map output 15/04/20 18:15:51 INFO mapred.MapTask: bufstart = 0; bufend = 1698; bufvoid = 104857600 15/04/20 18:15:51 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213916(104855664); length = 481/6553600 15/04/20 18:15:51 INFO mapred.MapTask: Finished spill 0 15/04/20 18:15:51 INFO mapred.Task: Task:attempt_local657082309_0001_m_000000_0 is done. And is in the process of committing 15/04/20 18:15:51 INFO mapred.LocalJobRunner: map 15/04/20 18:15:51 INFO mapred.Task: Task 'attempt_local657082309_0001_m_000000_0' done. 15/04/20 18:15:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local657082309_0001_m_000000_0 15/04/20 18:15:51 INFO mapred.LocalJobRunner: map task executor complete. 15/04/20 18:15:51 INFO mapred.LocalJobRunner: Waiting for reduce tasks 15/04/20 18:15:51 INFO mapred.LocalJobRunner: Starting task: attempt_local657082309_0001_r_000000_0 15/04/20 18:15:51 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ] 15/04/20 18:15:51 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@39be5e01 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=363285696, maxSingleShuffleLimit=90821424, mergeThreshold=239768576, ioSortFactor=10, memToMemMergeOutputsThreshold=10 15/04/20 18:15:51 INFO reduce.EventFetcher: attempt_local657082309_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events 15/04/20 18:15:51 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local657082309_0001_m_000000_0 decomp: 1566 len: 1570 to MEMORY 15/04/20 18:15:51 INFO reduce.InMemoryMapOutput: Read 1566 bytes from map-output for attempt_local657082309_0001_m_000000_0 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1566, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->1566 15/04/20 18:15:51 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning 15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied. 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs 15/04/20 18:15:51 INFO mapred.Merger: Merging 1 sorted segments 15/04/20 18:15:51 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1560 bytes 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merged 1 segments, 1566 bytes to disk to satisfy reduce memory limit 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merging 1 files, 1570 bytes from disk 15/04/20 18:15:51 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce 15/04/20 18:15:51 INFO mapred.Merger: Merging 1 sorted segments 15/04/20 18:15:51 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1560 bytes 15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied. 15/04/20 18:15:51 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords 15/04/20 18:15:51 INFO mapreduce.Job: Job job_local657082309_0001 running in uber mode : false 15/04/20 18:15:51 INFO mapreduce.Job:  map 100% reduce 0% 15/04/20 18:15:51 INFO mapred.Task: Task:attempt_local657082309_0001_r_000000_0 is done. And is in the process of committing 15/04/20 18:15:51 INFO mapred.LocalJobRunner: 1 / 1 copied. 15/04/20 18:15:51 INFO mapred.Task: Task attempt_local657082309_0001_r_000000_0 is allowed to commit now 15/04/20 18:15:51 INFO output.FileOutputCommitter: Saved output of task 'attempt_local657082309_0001_r_000000_0' to hdfs://master:9000/output/_temporary/0/task_local657082309_0001_r_000000 15/04/20 18:15:51 INFO mapred.LocalJobRunner: reduce > reduce 15/04/20 18:15:51 INFO mapred.Task: Task 'attempt_local657082309_0001_r_000000_0' done. 15/04/20 18:15:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local657082309_0001_r_000000_0 15/04/20 18:15:51 INFO mapred.LocalJobRunner: reduce task executor complete. 15/04/20 18:15:52 INFO mapreduce.Job:  map 100% reduce 100% 15/04/20 18:15:52 INFO mapreduce.Job: Job job_local657082309_0001 completed successfully 15/04/20 18:15:52 INFO mapreduce.Job: Counters: 38         File System Counters                 FILE: Number of bytes read=544164                 FILE: Number of bytes written=1040966                 FILE: Number of read operations=0                 FILE: Number of large read operations=0                 FILE: Number of write operations=0                 HDFS: Number of bytes read=4536                 HDFS: Number of bytes written=1196                 HDFS: Number of read operations=15                 HDFS: Number of large read operations=0                 HDFS: Number of write operations=4         Map-Reduce Framework                 Map input records=43                 Map output records=121                 Map output bytes=1698                 Map output materialized bytes=1570                 Input split bytes=114                 Combine input records=121                 Combine output records=92                 Reduce input groups=92                 Reduce shuffle bytes=1570                 Reduce input records=92                 Reduce output records=92                 Spilled Records=184                 Shuffled Maps =1                 Failed Shuffles=0                 Merged Map outputs=1                 GC time elapsed (ms)=123                 CPU time spent (ms)=0                 Physical memory (bytes) snapshot=0                 Virtual memory (bytes) snapshot=0                 Total committed heap usage (bytes)=269361152         Shuffle Errors                 BAD_ID=0                 CONNECTION=0                 IO_ERROR=0                 WRONG_LENGTH=0                 WRONG_MAP=0                 WRONG_REDUCE=0         File Input Format Counters                 Bytes Read=2268         File Output Format Counters 

$ /mnt/jediael/hadoop-2.6.0/bin/hadoop fs -cat /output/*

转载于:https://www.cnblogs.com/jinhong-lu/p/4559329.html

相关资源:JAVA上百实例源码以及开源项目
最新回复(0)