For a basic overview, see: http://hadoop.apache.org
Download Hadoop; the current version is 0.16.0:
http://apache.mirror.phpchina.com/hadoop/core/hadoop-0.16.0/hadoop-0.16.0.tar.gz
QuickStart documentation:
http://hadoop.apache.org/core/docs/current/quickstart.html
I used Linux as the development platform and set up a single-node Hadoop installation.
Linux version:
Red Hat Enterprise Linux AS release 3 (Taroon Update 4)
Java version:
1.5.0_06-b05
ssh version:
SSH Version 1.2.33 [i686-unknown-linux], protocol version 1.5.
rsync version:
version 2.5.7 protocol version 26
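1: Unpack the downloaded Hadoop archive into your Linux user's home directory.
2: Edit hadoop/conf/hadoop-env.sh: find the line # export JAVA_HOME=/usr/lib/j2sdk1.5-sun, uncomment it, and point it at your own JAVA_HOME. (If JAVA_HOME is already exported in your .bash_profile, can this step perhaps be skipped?)
3: Run bin/hadoop; it prints a help-style usage message.
4: Next, following the QuickStart, I ran the sample, and it worked without any problems: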
mkdir input
cp conf/*.xml input
bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
cat output/*
5: Edit the conf/hadoop-site.xml configuration file as follows:
<configuration>
<property>
<name>fs.default.name</name>
<value>localhost:10100</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:10101</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/zxf/hadoop/tmp/</value>
</property>
</configuration>
6: Set up passphraseless ssh:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
The login commands in the QuickStart are for ssh2; mine is ssh1, so I used the following commands instead:
cd ~/.ssh ; ssh-keygen; cat identity.pub >> authorized_keys; chmod 644 identity.pub; chmod 600 authorized_keys
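Re-running the key setup appends the same key to authorized_keys again each time. The following is a minimal sketch of an idempotent variant; the function name authorize_key is hypothetical and not part of ssh or Hadoop, and it assumes the public key file holds a single line.

```shell
# Hypothetical helper (not part of ssh or Hadoop): append a public key to an
# authorized_keys file only if that exact line is not already present, then
# restrict the file's permissions.
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    [ -r "$pubkey_file" ] || { echo "cannot read $pubkey_file" >&2; return 1; }
    touch "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}

# Usage (paths as in the ssh1 setup above):
#   authorize_key ~/.ssh/identity.pub ~/.ssh/authorized_keys
#   ssh localhost true    # should now succeed without asking for a password
```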
7: Format a new distributed filesystem:
bin/hadoop namenode -format
9: Start the Hadoop daemons:
bin/start-all.sh
10:Copy the input files into the distributed filesystem
Running bin/hadoop dfs -put conf input printed a pile of INFO and WARN messages.
11:Run some of the examples provided:
Running bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' printed the same kind of errors.
12:Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and examine them:
12.1
bin/hadoop dfs -get output output
This reported: get: output: No such file or directory
cat output/*
This still showed the results of the first run (the local run in step 4).
12.2
bin/hadoop dfs -cat output/*
Output: cat: java.io.IOException: Cannot open filename /user/zxf/output/part-00000
13:
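The files under output had an old last-modified time, so I suspected that the steps from 10 onward had not actually succeeded (at least one of them had a problem).
bin/stop-all.sh
I saw "localhost: no datanode to stop", which means the datanode daemon had never started.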
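A missing daemon can also be spotted right after startup, without waiting for stop-all.sh to complain. The sketch below assumes the JDK's jps tool is on the PATH and prints lines of the form "<pid> <ClassName>", and that start-all.sh is expected to bring up the five daemons named in the loop; the function name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read jps-style output ("<pid> <ClassName>" per line)
# on stdin and print each expected Hadoop daemon that is NOT listed.
# The daemon list is an assumption about what bin/start-all.sh launches.
missing_daemons() {
    running=$(awk '{print $2}')
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -x matches the whole line, so NameNode does not match SecondaryNameNode
        echo "$running" | grep -qx "$d" || echo "$d"
    done
}

# Usage:
#   jps | missing_daemons
# No output means all five daemons are listed.
```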
Looking at the datanode log, I found the following error at startup:
2008-03-18 10:20:57,180 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-zxf/dfs/data: namenode namespaceID = 611295554; datanode namespaceID = 1744145012
at org.apache.hadoop.dfs.DataStorage.doTransition(DataStorage.java:298)
at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:142)
at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:236)
at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:162)
at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2510)
at org.apache.hadoop.dfs.DataNode.run(DataNode.java:2454)
at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2475)
at org.apache.hadoop.dfs.DataNode.main(DataNode.java:2671)
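The two IDs in the error can be compared directly on disk. This is a rough sketch that assumes each storage directory keeps its ID in a current/VERSION file containing a namespaceID=... line; that layout is an assumption about this Hadoop version, and the function names are hypothetical.

```shell
# Hypothetical helper: print the namespaceID stored in <dir>/current/VERSION.
# Assumes the file contains a line of the form namespaceID=<number>.
namespace_id() {
    sed -n 's/^namespaceID=//p' "$1/current/VERSION"
}

# Succeeds only if both directories carry the same, non-empty namespaceID.
# Returns 2 if either VERSION file cannot be read.
same_namespace() {
    name_id=$(namespace_id "$1") || return 2
    data_id=$(namespace_id "$2") || return 2
    [ -n "$name_id" ] && [ "$name_id" = "$data_id" ]
}

# Usage (paths taken from the error message above; adjust to your hadoop.tmp.dir):
#   same_namespace /tmp/hadoop-zxf/dfs/name /tmp/hadoop-zxf/dfs/data \
#       || echo "namespaceID mismatch or unreadable VERSION file"
```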
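At this point I added the following property to hadoop-site.xml:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/zxf/hadoop/tmp/</value>
</property>
Then I formatted the namenode again
and restarted; the system worked normally again.
After formatting the namenode once more and restarting, the datanode failed to start again, and the cause became fairly clear: every namenode format creates a new namespaceID, while tmp/dfs/data still contains the ID from the previous format. Formatting clears the namenode's data but does not clear the datanode's data, so the datanode fails at startup. The fix is to empty everything under tmp before each format; after that, startup works normally.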
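The cleanup before a reformat can be wrapped in a small function. The sketch below is a narrower variant of the fix described above: it removes only the dfs/data subdirectory (the one holding the stale namespaceID) rather than everything under tmp. The function name clear_datanode_storage is hypothetical, and the tmp path is whatever hadoop.tmp.dir is set to.

```shell
# Hypothetical helper: delete the datanode storage under a given hadoop.tmp.dir
# so that the next "namenode -format" is not followed by an ID mismatch.
# WARNING: this destroys all HDFS block data stored on this node.
clear_datanode_storage() {
    tmp_dir=$1
    # Refuse an empty path or the filesystem root to avoid deleting the wrong thing.
    case "$tmp_dir" in
        ""|"/") echo "refusing to clean '$tmp_dir'" >&2; return 1 ;;
    esac
    # ${tmp_dir%/} strips one trailing slash, if present.
    rm -rf "${tmp_dir%/}/dfs/data"
}

# Usage (stop the daemons first):
#   bin/stop-all.sh
#   clear_datanode_storage /home/zxf/hadoop/tmp/
#   bin/hadoop namenode -format
#   bin/start-all.sh
```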
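Newcomers following the QuickStart should watch out for this.
Later I found this problem reported online; it appears to be a bug: http://issues.apache.org/jira/browse/HADOOP-1212 . Oddly, the bug showed up in 0.12.2 and still has not been fixed. Below is an excerpt from the discussion there: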
Data-nodes should be reformatted whenever the name-node is. I see 2 approaches here:
1) In order to reformat the cluster we call "start-dfs -format" or make a special script "format-dfs".
This would format the cluster components all together. The question is whether it should start
the cluster after formatting?
2) Format the name-node only. When data-nodes connect to the name-node it will tell them to
format their storage directories if it sees that the namespace is empty and its cTime=0.
The drawback of this approach is that we can loose blocks of a data-node from another cluster
if it connects by mistake to the empty name-node.
With that, a single-node setup in pseudo-distributed mode is running successfully.