Hadoop (Part 1): Environment Setup

This article walks through the full process of installing and configuring Hadoop 2.7.3 on Ubuntu 16.

  1. Install Ubuntu

  2. Install the SSH server and start it

sudo apt-get install openssh-server
sudo /etc/init.d/ssh start
  3. Install vim
sudo apt install vim
  4. Download the JDK and extract it to /usr/lib/jvm
sudo mkdir -p /usr/lib/jvm
sudo tar zxvf jdk-8u111-linux-x64.tar.gz -C /usr/lib/jvm
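  5. Configure the JDK environment variables and make them take effect (look up vim usage yourself if needed)
sudo vim /etc/profile

Append the following lines to /etc/profile:
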
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_111
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_LIB=$JAVA_HOME/lib
export CLASSPATH=.:$JAVA_LIB/tools.jar:$JAVA_LIB/dt.jar

Then edit /etc/environment:

sudo vim /etc/environment

Make sure it contains the following lines:

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/jvm/jdk1.8.0_111/bin"
CLASSPATH="/usr/lib/jvm/jdk1.8.0_111/lib"
JAVA_HOME="/usr/lib/jvm/jdk1.8.0_111"

Reload the profile so that the variables take effect in the current shell:

source /etc/profile

Tell Ubuntu to use the Oracle (formerly Sun) JDK rather than OpenJDK:

sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/jdk1.8.0_111/bin/java 300
sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/jdk1.8.0_111/bin/javac 300
sudo update-alternatives --config java

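If only one JDK is listed, the configuration is finished. If several are listed, the output looks like this:

sudo update-alternatives --config java
There are 2 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                       Priority   Status
------------------------------------------------------------
* 0            /usr/lib/jvm/java-6-openjdk/jre/bin/java    1061      auto mode
  1            /usr/lib/jvm/java-6-openjdk/jre/bin/java    1061      manual mode
  2            /usr/lib/jvm/jdk1.8.0_05/bin/java           300       manual mode

Press <enter> to keep the current choice[*], or type selection number: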

Type the number of the entry you want to use, as in the listing above. That selects the JDK the system will use.

/* Verify that the JDK is active */
tyrival@ubuntu:/usr/java$ java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
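
Beyond running java -version by hand, the JAVA_HOME setting itself can be checked with a small script. The following is only a minimal sketch: check_java_home is a hypothetical helper name, not part of the JDK or Ubuntu, and it merely tests that the directory contains an executable bin/java.

```shell
# Hypothetical helper (not a JDK or Ubuntu tool): succeeds only if the
# given directory looks like a JDK install, i.e. contains bin/java.
check_java_home() {
    jdk_dir="$1"
    if [ -z "$jdk_dir" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$jdk_dir/bin/java" ]; then
        echo "no executable java under $jdk_dir/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks valid: $jdk_dir"
}

# Usage, after the profile has been reloaded:
#   check_java_home "$JAVA_HOME"
```

If the function reports a problem, the usual cause is a typo in the path written to /etc/profile or a shell that has not reloaded it yet.
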
  6. Download Hadoop and extract it to /usr/local/hadoop
sudo tar zxvf hadoop-2.7.3.tar.gz -C /usr/local
sudo mv /usr/local/hadoop-2.7.3 /usr/local/hadoop
  7. Set access permissions on /usr/local/hadoop (if startup fails with a permission error, this step was most likely skipped)
sudo chmod 777 /usr/local/hadoop
  8. Configure the .bashrc file
vim ~/.bashrc

Append the following to the end of the file:

#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_111
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
#HADOOP VARIABLES END

Make the new environment variables take effect:

source ~/.bashrc
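
If this setup is ever repeated, appending the same lines to .bashrc again leaves duplicates behind. One way around that, sketched here under the assumption that each setting lives on its own line, is to append a line only when it is not already present. append_once is a hypothetical helper name introduced for illustration.

```shell
# Hypothetical helper: append LINE to FILE only if FILE does not
# already contain exactly that line.
append_once() {
    target_file="$1"
    line="$2"
    touch "$target_file"
    # -x matches whole lines only, -F treats the pattern as a fixed string
    if ! grep -qxF -e "$line" "$target_file"; then
        printf '%s\n' "$line" >> "$target_file"
    fi
}

# Usage (single quotes keep $PATH unexpanded in the file):
#   append_once ~/.bashrc 'export HADOOP_INSTALL=/usr/local/hadoop'
#   append_once ~/.bashrc 'export PATH=$PATH:$HADOOP_INSTALL/bin'
```
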
  9. Configure Hadoop
sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Set the following in hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_111
export HADOOP=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin

sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh

Set JAVA_HOME in yarn-env.sh:

# export JAVA_HOME=/home/y/libexec/jdk1.6.0/
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_111

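Create the directory /home/tyrival/hadoop_tmp under the home directory. Create it as your own user rather than with sudo, so that Hadoop can write to it:

mkdir -p /home/tyrival/hadoop_tmp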

sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml

Set the configuration element in core-site.xml as follows:

<configuration>
<!-- Address of the HDFS master (NameNode) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<!-- Directory where Hadoop stores the files it generates at runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/tyrival/hadoop_tmp</value>
</property>
</configuration>
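
For a repeatable setup, the same file can be written from a script instead of being edited in vim. This is only a sketch: write_core_site and its two parameters are hypothetical names, and the property values are the ones used in this article. Writing into /usr/local/hadoop/etc/hadoop would need root privileges, so the target directory is passed in.

```shell
# Hypothetical helper: write a core-site.xml with the two properties
# used in this article into CONF_DIR, and create the temp directory.
write_core_site() {
    conf_dir="$1"
    tmp_dir="$2"
    mkdir -p "$tmp_dir" || return 1
    cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
</configuration>
EOF
}

# Usage (write to a scratch directory first, then copy with sudo):
#   write_core_site /tmp/hadoop-conf /home/tyrival/hadoop_tmp
```
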

sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Set the configuration element in hdfs-site.xml as follows:

<configuration>
<!-- Number of HDFS replicas -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

Set the configuration element in yarn-site.xml as follows:

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>127.0.0.1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>127.0.0.1:8031</value>
</property>
</configuration>
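
Typos in these XML files are easy to miss, so it can help to read a value back before starting anything. The sketch below is a plain-text lookup, not an XML parser: get_prop is a hypothetical helper, and it assumes the layout used above, with each name and value element on a single line.

```shell
# Hypothetical helper: print the <value> that follows the <name> KEY
# in a Hadoop-style XML file. Exits non-zero if KEY is not found.
get_prop() {
    awk -v key="$2" '
        /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                    if (n == key) { print v; found = 1; exit } }
        END { exit !found }
    ' "$1"
}

# Usage:
#   get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS
#   get_prop /usr/local/hadoop/etc/hadoop/hdfs-site.xml dfs.replication
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolves, which is the more authoritative check.
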
  10. Format the NameNode
hdfs namenode -format

/* The following messages indicate success */
...
... INFO common.Storage: Storage directory /home/windghoul/tmp/dfs/name has been successfully formatted.
...
...
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu/127.0.1.1
************************************************************/
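
Formatting a NameNode that already holds data discards the existing HDFS metadata, so a guard in front of the format command can be useful. A minimal sketch follows. It assumes the NameNode directory is left at its default location, dfs/name under hadoop.tmp.dir, and namenode_is_formatted is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if a formatted NameNode directory
# already exists under the given hadoop.tmp.dir.
# Assumes dfs.namenode.name.dir is left at its default (dfs/name).
namenode_is_formatted() {
    [ -f "$1/dfs/name/current/VERSION" ]
}

# Usage: format only when no existing metadata is found.
#   namenode_is_formatted /home/tyrival/hadoop_tmp || hdfs namenode -format
```
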
  11. Start Hadoop. You may be asked for your password several times along the way
start-all.sh
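
The password prompts come from the start scripts logging in to localhost over SSH. Key-based login removes them. The sketch below generates a passphrase-less RSA key if none exists and authorizes it for the current user. setup_local_ssh_key is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory, since sshd itself reads ~/.ssh.

```shell
# Hypothetical helper: create a passphrase-less RSA key (if missing)
# and add its public half to authorized_keys exactly once.
setup_local_ssh_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
    fi
    touch "$ssh_dir/authorized_keys"
    # Append the public key only if that exact line is not there yet
    if ! grep -qxF -f "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys"; then
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    fi
    chmod 600 "$ssh_dir/authorized_keys"
}

# Usage:
#   setup_local_ssh_key
#   ssh localhost    # should now log in without a password prompt
```

A key without a passphrase is convenient on a single-machine test setup like this one. On a shared machine an SSH agent would be the safer choice.
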

  12. Check the running processes with jps
jps

/* The following output indicates that everything is running normally */
5760 Jps
3058 DataNode
3286 SecondaryNameNode
2879 NameNode
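
Rather than reading the jps listing by eye, the check can be scripted. This sketch reads jps output on standard input and reports which of the three HDFS daemons shown above are absent. missing_daemons is a hypothetical helper name, and the daemon list is an assumption taken from the sample output in this article.

```shell
# Hypothetical helper: read `jps` output on stdin and report which of
# the HDFS daemons listed in this article are not running.
missing_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in NameNode DataNode SecondaryNameNode; do
        # Compare the second column exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            missing="$missing $daemon"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all HDFS daemons running"
}

# Usage:
#   jps | missing_daemons
```
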

  13. Open http://localhost:50070 and http://localhost:8088 in a browser. If both pages load, the setup succeeded.
