
Set up an environment for MapReduce development on Hadoop




  1. Set up an environment for MapReduce development on Hadoop

  2. Outline • Hadoop server information • Create a user for running on the server • Set up the JDK • Get your own Hadoop • Get eclipse and the MapReduce plugin • Something important • Experiences

  3. Hadoop server information • Server IP: 166.111.68.153 • Account: the last three digits of your student number (the password is the same; ask the TA to change it). The account is used to run MapReduce jobs on the server. If you use it to ssh to the server, please don't remove or modify any files there.

  4. Create a user for running on the server • To run MapReduce on Hadoop, you must have the same user on your machine as on the server. • Use useradd to create it. • Then work as that user.
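As a sketch of the steps above, assuming a student number ending in 153 (both the number and the resulting user name below are made-up examples):

```shell
# Hypothetical student number; the server account is its last three digits.
STUDENT_NO=2007013153
USER_NAME=${STUDENT_NO: -3}
echo "$USER_NAME"    # -> 153

# On your own machine, as root, create the same user and then work as it:
# useradd -m "$USER_NAME"
# su - "$USER_NAME"
```

The useradd/su lines are commented out here because they require root; run them on your own machine.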

  5. Set up the JDK • You can get the JDK by: scp user@166.111.68.153:/course_files/download/jdk-1_5_0_12-linux-i586.bin local_path • Install it on your machine and note where it is installed.
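A sketch of this step (the scp command is from the slide; the local handling is an assumption — the real .bin is a self-extracting installer that unpacks when executed):

```shell
# From the slide: fetch the JDK installer from the course server.
# scp user@166.111.68.153:/course_files/download/jdk-1_5_0_12-linux-i586.bin .

touch jdk-1_5_0_12-linux-i586.bin    # stand-in for the downloaded installer
chmod +x jdk-1_5_0_12-linux-i586.bin
# ./jdk-1_5_0_12-linux-i586.bin      # unpacks the JDK; note the install path
ls -l jdk-1_5_0_12-linux-i586.bin
```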

  6. Get your own Hadoop • You can get Hadoop by: scp user@166.111.68.153:/course_files/download/hadoop.tar.gz local_path • Extract the archive • Run scripts/make_dirs.sh • Set JAVA_HOME in hadoop/conf/hadoop-env.sh
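The last step above can be sketched as a one-line edit of hadoop-env.sh. The file contents and the JDK path below are assumptions for illustration; substitute the path where your JDK actually unpacked:

```shell
# Stand-in for hadoop/conf/hadoop-env.sh as shipped (setting commented out).
cat > hadoop-env.sh <<'EOF'
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
EOF

JDK_PATH=/usr/local/jdk1.5.0_12   # assumed install path from the JDK step
sed -i "s|^# export JAVA_HOME=.*|export JAVA_HOME=$JDK_PATH|" hadoop-env.sh
grep JAVA_HOME hadoop-env.sh      # -> export JAVA_HOME=/usr/local/jdk1.5.0_12
```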

  7. Get your own Hadoop • Now you can access the Hadoop on the server by running commands with your own Hadoop. • Find files on the HDFS: hadoop/bin/hadoop fs -ls /user/root/ • Run MapReduce on the server: hadoop/bin/hadoop jar hadoop/hadoop-0.13.1-examples.jar wordcount /user/root/lab0 test

  8. Get eclipse and the MapReduce plugin • You can get eclipse and the plugin by: scp user@166.111.68.153:/course_files/download/eclipse-SDK-3.2.2-linux-gtk.tar.gz local_path scp user@166.111.68.153:/course_files/download/mapreduce_tools.zip local_path

  9. Get eclipse and the MapReduce plugin • Extract eclipse and the plugin • Copy the com.ibm.hipods.mapreduce_1.0.4 folder into eclipse/plugins • Run eclipse; if you can see Window->Preferences->Hadoop Home Directory, the plugin is installed successfully.
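A sketch of the copy step, assuming both archives were extracted into the current directory (the plugin directory here is created locally as a stand-in for the real unzipped folder):

```shell
# Stand-in for the unzipped plugin folder from mapreduce_tools.zip.
mkdir -p com.ibm.hipods.mapreduce_1.0.4/META-INF
mkdir -p eclipse/plugins                 # stand-in for the extracted eclipse

cp -r com.ibm.hipods.mapreduce_1.0.4 eclipse/plugins/
ls eclipse/plugins    # -> com.ibm.hipods.mapreduce_1.0.4
```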

  10. Get eclipse and the MapReduce plugin • Information about the plugin: http://www.alphaworks.ibm.com/tech/mapreducetools • I found I couldn't pass arguments to Hadoop when running MapReduce from the plugin, so I use the plugin only to write code, then build the jar package with ant and run it in the terminal. If you can resolve this problem, please tell me. Thanks.

  11. Something important • Don't ssh to the server. If you have to, please don't remove or modify any files. • Access the HDFS and run MapReduce on the server by running commands with your own Hadoop.

  12. Experiences • When you write MapReduce on Hadoop, you can get help from: http://wiki.apache.org/lucene-hadoop/ http://lucene.apache.org/hadoop/api/index.html the Hadoop examples, the Hadoop API, and the source

  13. That's all. Enjoy our labs.
