Up to Running in Standalone Mode
Installing sun-java6-jdk
$ sudo apt-get install sun-java6-jdk
$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) Client VM (build 19.1-b02, mixed mode, sharing)
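Before moving on, it can help to confirm the installed version programmatically rather than reading it off by eye. A small sketch (the `sed` pattern is an assumption based on the `java version "..."` format shown above):

```shell
# Pull the quoted version string (e.g. 1.6.0_24) out of `java -version`,
# which prints to stderr, hence the 2>&1 redirection.
ver=$(java -version 2>&1 | head -n 1 | sed 's/.*"\(.*\)".*/\1/')
echo "detected java version: ${ver:-unknown}"
```

If `java` is missing or the format differs, `ver` simply holds the raw first line, so the check degrades gracefully.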
Download a stable release from one of the Apache mirrors listed at:
http://www.apache.org/dyn/closer.cgi/hadoop/core/
$ wget http://www.meisei-u.ac.jp/mirror/apache/dist//hadoop/core/stable/hadoop-0.20.2.tar.gz
$ tar zxvf hadoop-0.20.2.tar.gz
$ cd hadoop-0.20.2/conf/
$ which java
/usr/bin/java
$ vi hadoop-env.sh
export JAVA_HOME=/usr/
$ cd ../bin/
$ ./hadoop
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
  namenode -format     format the DFS filesystem
  secondarynamenode    run the DFS secondary namenode
  namenode             run the DFS namenode
  datanode             run a DFS datanode
  dfsadmin             run a DFS admin client
  mradmin              run a Map-Reduce admin client
  fsck                 run a DFS filesystem checking utility
  fs                   run a generic filesystem user client
  balancer             run a cluster balancing utility
  jobtracker           run the MapReduce job Tracker node
  pipes                run a Pipes job
  tasktracker          run a MapReduce task Tracker node
  job                  manipulate MapReduce jobs
  queue                get information regarding JobQueues
  version              print the version
  jar <jar>            run a jar file
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME <src>* <dest> create a hadoop archive
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
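Hard-coding `export JAVA_HOME=/usr/` in hadoop-env.sh happens to work here because `which java` returned /usr/bin/java, but as a sketch (an addition, not part of the original steps), JAVA_HOME can be derived from whatever `java` is on the PATH, which is less fragile across machines:

```shell
# Derive JAVA_HOME from the java binary on PATH instead of hard-coding
# /usr/ in hadoop-env.sh. readlink -f follows the symlink chain
# (e.g. the Debian alternatives links) to the real binary location.
java_bin=$(readlink -f "$(command -v java || echo /usr/bin/java)")
JAVA_HOME=${java_bin%/bin/java}
export JAVA_HOME
echo "JAVA_HOME=$JAVA_HOME"
```

The resulting value can then be pasted into the `export JAVA_HOME=` line of hadoop-env.sh.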
Running in Standalone Mode
$ cd ..
$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
11/04/21 23:59:29 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/04/21 23:59:29 INFO mapred.FileInputFormat: Total input paths to process : 5
11/04/21 23:59:29 INFO mapred.FileInputFormat: Total input paths to process : 5
11/04/21 23:59:29 INFO mapred.JobClient: Running job: job_local_0001
11/04/21 23:59:29 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:29 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:30 INFO mapred.JobClient:  map 0% reduce 0%
11/04/21 23:59:30 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:30 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:30 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:31 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
11/04/21 23:59:31 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/input/capacity-scheduler.xml:0+3936
11/04/21 23:59:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
11/04/21 23:59:31 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:31 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:31 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:31 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:31 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:31 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
11/04/21 23:59:31 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/input/core-site.xml:0+178
11/04/21 23:59:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
11/04/21 23:59:31 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:31 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:31 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:31 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:31 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:31 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000002_0 is done. And is in the process of commiting
11/04/21 23:59:31 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/input/hdfs-site.xml:0+178
11/04/21 23:59:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000002_0' done.
11/04/21 23:59:31 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:31 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:31 INFO mapred.JobClient:  map 100% reduce 0%
11/04/21 23:59:32 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:32 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:32 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:32 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
11/04/21 23:59:32 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/input/mapred-site.xml:0+178
11/04/21 23:59:32 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000003_0' done.
11/04/21 23:59:32 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:32 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:33 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:33 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:33 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:33 INFO mapred.MapTask: Finished spill 0
11/04/21 23:59:33 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000004_0 is done. And is in the process of commiting
11/04/21 23:59:33 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/input/hadoop-policy.xml:0+4190
11/04/21 23:59:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000004_0' done.
11/04/21 23:59:33 INFO mapred.LocalJobRunner:
11/04/21 23:59:33 INFO mapred.Merger: Merging 5 sorted segments
11/04/21 23:59:33 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 21 bytes
11/04/21 23:59:33 INFO mapred.LocalJobRunner:
11/04/21 23:59:33 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
11/04/21 23:59:33 INFO mapred.LocalJobRunner:
11/04/21 23:59:33 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
11/04/21 23:59:33 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to file:/home/ubuntu/hadoop-0.20.2/grep-temp-859935326
11/04/21 23:59:33 INFO mapred.LocalJobRunner: reduce > reduce
11/04/21 23:59:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
11/04/21 23:59:34 INFO mapred.JobClient:  map 100% reduce 100%
11/04/21 23:59:34 INFO mapred.JobClient: Job complete: job_local_0001
11/04/21 23:59:34 INFO mapred.JobClient: Counters: 13
11/04/21 23:59:34 INFO mapred.JobClient:   FileSystemCounters
11/04/21 23:59:34 INFO mapred.JobClient:     FILE_BYTES_READ=969833
11/04/21 23:59:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1030667
11/04/21 23:59:34 INFO mapred.JobClient:   Map-Reduce Framework
11/04/21 23:59:34 INFO mapred.JobClient:     Reduce input groups=1
11/04/21 23:59:34 INFO mapred.JobClient:     Combine output records=1
11/04/21 23:59:34 INFO mapred.JobClient:     Map input records=219
11/04/21 23:59:34 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/04/21 23:59:34 INFO mapred.JobClient:     Reduce output records=1
11/04/21 23:59:34 INFO mapred.JobClient:     Spilled Records=2
11/04/21 23:59:34 INFO mapred.JobClient:     Map output bytes=17
11/04/21 23:59:34 INFO mapred.JobClient:     Map input bytes=8660
11/04/21 23:59:34 INFO mapred.JobClient:     Combine input records=1
11/04/21 23:59:34 INFO mapred.JobClient:     Map output records=1
11/04/21 23:59:34 INFO mapred.JobClient:     Reduce input records=1
11/04/21 23:59:34 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
11/04/21 23:59:34 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/21 23:59:34 INFO mapred.FileInputFormat: Total input paths to process : 1
11/04/21 23:59:34 INFO mapred.JobClient: Running job: job_local_0002
11/04/21 23:59:34 INFO mapred.FileInputFormat: Total input paths to process : 1
11/04/21 23:59:34 INFO mapred.MapTask: numReduceTasks: 1
11/04/21 23:59:34 INFO mapred.MapTask: io.sort.mb = 100
11/04/21 23:59:35 INFO mapred.MapTask: data buffer = 79691776/99614720
11/04/21 23:59:35 INFO mapred.MapTask: record buffer = 262144/327680
11/04/21 23:59:35 INFO mapred.MapTask: Starting flush of map output
11/04/21 23:59:35 INFO mapred.MapTask: Finished spill 0
11/04/21 23:59:35 INFO mapred.TaskRunner: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
11/04/21 23:59:35 INFO mapred.LocalJobRunner: file:/home/ubuntu/hadoop-0.20.2/grep-temp-859935326/part-00000:0+111
11/04/21 23:59:35 INFO mapred.TaskRunner: Task 'attempt_local_0002_m_000000_0' done.
11/04/21 23:59:35 INFO mapred.LocalJobRunner:
11/04/21 23:59:35 INFO mapred.Merger: Merging 1 sorted segments
11/04/21 23:59:35 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 21 bytes
11/04/21 23:59:35 INFO mapred.LocalJobRunner:
11/04/21 23:59:35 INFO mapred.TaskRunner: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
11/04/21 23:59:35 INFO mapred.LocalJobRunner:
11/04/21 23:59:35 INFO mapred.TaskRunner: Task attempt_local_0002_r_000000_0 is allowed to commit now
11/04/21 23:59:35 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to file:/home/ubuntu/hadoop-0.20.2/output
11/04/21 23:59:35 INFO mapred.LocalJobRunner: reduce > reduce
11/04/21 23:59:35 INFO mapred.TaskRunner: Task 'attempt_local_0002_r_000000_0' done.
11/04/21 23:59:35 INFO mapred.JobClient:  map 100% reduce 100%
11/04/21 23:59:35 INFO mapred.JobClient: Job complete: job_local_0002
11/04/21 23:59:35 INFO mapred.JobClient: Counters: 13
11/04/21 23:59:35 INFO mapred.JobClient:   FileSystemCounters
11/04/21 23:59:35 INFO mapred.JobClient:     FILE_BYTES_READ=640527
11/04/21 23:59:35 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=684253
11/04/21 23:59:35 INFO mapred.JobClient:   Map-Reduce Framework
11/04/21 23:59:35 INFO mapred.JobClient:     Reduce input groups=1
11/04/21 23:59:35 INFO mapred.JobClient:     Combine output records=0
11/04/21 23:59:35 INFO mapred.JobClient:     Map input records=1
11/04/21 23:59:35 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/04/21 23:59:35 INFO mapred.JobClient:     Reduce output records=1
11/04/21 23:59:35 INFO mapred.JobClient:     Spilled Records=2
11/04/21 23:59:35 INFO mapred.JobClient:     Map output bytes=17
11/04/21 23:59:35 INFO mapred.JobClient:     Map input bytes=25
11/04/21 23:59:35 INFO mapred.JobClient:     Combine input records=0
11/04/21 23:59:35 INFO mapred.JobClient:     Map output records=1
11/04/21 23:59:35 INFO mapred.JobClient:     Reduce input records=1
$ cat output/*
1       dfsadmin
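What the two jobs above compute can be mimicked in plain shell, which makes the final `1 dfsadmin` output easier to interpret: every match of the regex is extracted from the input files and the occurrences of each distinct match are counted. A minimal sketch with hypothetical sample files (not the real contents of conf/*.xml):

```shell
# Plain-shell analogue of the Hadoop grep example: the map step is grep,
# the shuffle/reduce step is sort | uniq -c (count per distinct match).
tmp=$(mktemp -d)
printf '<name>dfs.replication</name>\n' > "$tmp/hdfs-site.xml"
printf '<name>fs.default.name</name>\n' > "$tmp/core-site.xml"
# -o: print only the matched text, -h: no filenames, -E: extended regex
grep -ohE 'dfs[a-z.]+' "$tmp"/* | sort | uniq -c | sort -rn
```

Here only the first sample file matches, so the pipeline prints a count of 1 for `dfs.replication`, mirroring how the real run found only `dfsadmin` in the copied config files.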
References
http://oss.infoscience.co.jp/hadoop/common/docs/current/quickstart.html