Enterprise Hands-On: Hadoop Big Data Platform, Standalone and Pseudo-Distributed Deployment

mac · 2024-04-11

Experiment environment

Host                              Role
server1 (172.25.5.1)              hadoop
physical machine (172.25.5.250)   testing

[root@foundation5 ~]# scp jdk-8u181-linux-x64.tar.gz hadoop-3.2.1.tar.gz server1:   ## send the install packages from the physical machine to server1

At this point, the basic experiment environment is in place.

Standalone (local) mode setup

[root@server1 ~]# useradd hadoop   ## create an unprivileged hadoop user, since the Java processes will spawn many threads and should not run as root
[root@server1 ~]# id hadoop

[root@server1 ~]# mv * /home/hadoop/   ## move the packages into hadoop's home directory
[root@server1 ~]# su - hadoop
[hadoop@server1 ~]$ tar zxf hadoop-3.2.1.tar.gz   ## extract
[hadoop@server1 ~]$ tar zxf jdk-8u181-linux-x64.tar.gz   ## extract
[hadoop@server1 ~]$ ln -s hadoop-3.2.1 hadoop   ## symlinks keep the paths short and make upgrades easier
[hadoop@server1 ~]$ ln -s jdk1.8.0_181 java
[hadoop@server1 ~]$ vim .bash_profile   ## put the java binaries on PATH so the commands can be used directly
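The article does not show the `.bash_profile` edit itself. A minimal sketch, assuming the `java` and `hadoop` symlinks created above, would append something like:

```shell
# Assumed additions to ~/.bash_profile (paths follow the symlinks made above)
export JAVA_HOME=$HOME/java
export PATH=$PATH:$JAVA_HOME/bin:$HOME/hadoop/bin
```

The exact lines are an assumption; any form that puts `$HOME/java/bin` on PATH works.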

[hadoop@server1 ~]$ source .bash_profile   ## apply the changes
[hadoop@server1 ~]$ cd hadoop
[hadoop@server1 hadoop]$ cd etc/
[hadoop@server1 etc]$ cd hadoop/
[hadoop@server1 hadoop]$ vim hadoop-env.sh   ## declare the Java installation location for Hadoop
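The `hadoop-env.sh` change is a single line; assuming the `java` symlink made earlier, it would look like:

```shell
# In etc/hadoop/hadoop-env.sh -- the path is an assumption based on the symlink above
export JAVA_HOME=/home/hadoop/java
```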

[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server1 hadoop]$ cd ../..
[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop
[hadoop@server1 hadoop]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
[hadoop@server1 hadoop]$ mkdir input   ## create an input directory
[hadoop@server1 hadoop]$ ls
bin  etc  include  input  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
[hadoop@server1 hadoop]$ cp etc/hadoop/*.xml input/
[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar grep input output 'dfs[a-z.]+'   ## standalone sanity check: run one of the bundled example jobs
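As an aside, the example job above just counts matches of the regex `dfs[a-z.]+` across the copied XML files. A plain-shell approximation of the same computation (no Hadoop involved; the sample file below is made up for illustration):

```shell
# Approximate the hadoop-mapreduce-examples "grep" job with ordinary shell tools:
# extract every token matching dfs[a-z.]+ and count occurrences.
mkdir -p demo_input
printf '<name>dfs.replication</name>\n<name>dfs.permissions</name>\n' > demo_input/sample.xml
grep -ohE 'dfs[a-z.]+' demo_input/*.xml | sort | uniq -c | sort -rn
```

This is only a sketch of what the job computes; the real job runs the same pattern match as distributed map and reduce tasks.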

At this point the basic standalone setup is complete; next comes the pseudo-distributed setup.

Pseudo-distributed mode setup

1. Set up passwordless SSH to this machine, since pseudo-distributed mode still runs everything on a single node.

[hadoop@server1 hadoop]$ ssh-keygen   ## accept the defaults
[hadoop@server1 hadoop]$ ssh-copy-id localhost   ## authorize the key for passwordless login to this machine

2. The workers file may list either localhost or an IP address; the IP address is used here for convenience in later experiments.
3. Set the slave (worker) node to this machine:

[hadoop@server1 hadoop]$ vim workers
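Per steps 2-3, the workers file then contains a single line, this lab's IP (localhost would work equally well on a single node):

```
172.25.5.1
```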

[hadoop@server1 hadoop]$ vim hdfs-site.xml
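The `hdfs-site.xml` edit sets the replica count described in step 4; `dfs.replication` is the standard property, and the value 1 matches the single DataNode here:

```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```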

4. Set the replica count to 1, since only this machine runs a DataNode process.
5. Set the master (NameNode) node, also this machine:

[hadoop@server1 hadoop]$ vim core-site.xml
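For step 5, `core-site.xml` points the default filesystem at this machine via the standard `fs.defaultFS` property. The address below assumes server1's IP from the environment table and the conventional HDFS port 9000:

```xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://172.25.5.1:9000</value>
    </property>
</configuration>
```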

6. Format the NameNode:

[hadoop@server1 hadoop]$ pwd
/home/hadoop/hadoop/etc/hadoop
[hadoop@server1 hadoop]$ cd ..
[hadoop@server1 etc]$ cd ..
[hadoop@server1 hadoop]$ ls
bin  etc  include  input  lib  libexec  LICENSE.txt  NOTICE.txt  output  README.txt  sbin  share
[hadoop@server1 hadoop]$ bin/hdfs namenode -format

7. Note that formatting generates some temporary directories and process pid files under /tmp.
8. Start the services:

[hadoop@server1 hadoop]$ sbin/start-dfs.sh

9. The DataNode and NameNode processes are now both running on this node:

[hadoop@server1 hadoop]$ jps

10. Check which service ports are listening:

[hadoop@server1 hadoop]$ netstat -antlupe

11. After setting up name resolution for server1 on the physical machine, test in a browser: the NameNode web UI is displayed.
12. Check host status (live or dead):

[hadoop@server1 hadoop]$ bin/hdfs dfsadmin -report
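The name resolution mentioned in step 11 is presumably an /etc/hosts entry on the physical machine, e.g.:

```
172.25.5.1  server1
```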

13. Create a data directory in HDFS and upload data:

[hadoop@server1 hadoop]$ bin/hdfs dfs -mkdir -p /user/hadoop   ## -p creates the parent /user as well; this is the hadoop user's HDFS home, so relative paths below resolve here
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
[hadoop@server1 hadoop]$ bin/hdfs dfs -put input   ## upload the local input directory
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls input/

14. The files just uploaded are visible in the browser.
15. The web UI does not have permission to delete files directly.

[hadoop@server1 hadoop]$ rm -fr input/ output/   ## remove the local copies; the job now reads its input from HDFS
[hadoop@server1 hadoop]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[hadoop@server1 hadoop]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar grep input output 'dfs[a-z.]+'

[hadoop@server1 hadoop]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[hadoop@server1 hadoop]$ bin/hdfs dfs -ls
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2019-10-30 13:55 input
drwxr-xr-x   - hadoop supergroup          0 2019-10-30 14:11 output
[hadoop@server1 hadoop]$ bin/hdfs dfs -cat output/*
1       dfsadmin
[hadoop@server1 hadoop]$ bin/hdfs dfs -get output   ## download the result directory from HDFS
[hadoop@server1 hadoop]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  output  README.txt  sbin  share
[hadoop@server1 hadoop]$ cd output/
[hadoop@server1 output]$ ls
part-r-00000  _SUCCESS
[hadoop@server1 output]$ cat *
1       dfsadmin
[hadoop@server1 output]$ cd ..
[hadoop@server1 hadoop]$ rm -fr output/
[hadoop@server1 hadoop]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share

16. Verify the results in the browser.
