
Tutorial: To run the MapReduce EEMD code with Hadoop on Futuregrid

By Rewati Ovalekar


Presentation Transcript


  1. Tutorial: To run the MapReduce EEMD code with Hadoop on FutureGrid, by Rewati Ovalekar

  2. Step 1:
     • The code is available at: http://code.google.com/p/cyberaide/
     • Download the code from: http://code.google.com/p/cyberaide/source/browse/#svn%2Ftrunk%2Fproject%2Fspring2011%2FEEMDAnalysis%2FEEMDJava

  3. Step 2:
     • Create a FutureGrid account.
     • For further details, refer to the FutureGrid tutorials: https://portal.futuregrid.org/tutorials

  4. Step 3:
     • Log in to FutureGrid:
       ssh username@india.futuregrid.org
     • A welcome message is displayed on successful login.

  5. Step 4:
     • Create a jar file.
     Step 5:
     • Transfer the jar file and the input file:
       sftp username@india.futuregrid.org
       put /../filepath

  6. Step 6:
     • To run Hadoop on FutureGrid, create a Eucalyptus account.
     • For further details, refer to: https://portal.futuregrid.org/tutorials/eucalyptus
     Step 7:
     • Once the account is approved, load the Eucalyptus tools:
       module load euca2ools

  7. Step 8:
     • Make sure that the jar file and the input file are in the same directory as the username.private key.
     • Run the image that has Hadoop installed:
       euca-run-instances -k rovaleka -t c1.xlarge emi-D778156D
       -k is the key name
       -t is the instance type
       emi-D778156D is the image ID
       -n (optional, not used above) is the number of instances to launch
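The launch command above can be wrapped in a small helper so that the key name, instance type, image ID and instance count are explicit. This is a minimal sketch: the function name `launch_cmd` is hypothetical, and it only prints the command for review rather than launching anything.

```shell
# Hypothetical helper: print the euca-run-instances command for review.
# Arguments: key name, instance type, image ID, optional instance count (default 1).
launch_cmd() {
    key=$1
    type=$2
    image=$3
    count=${4:-1}
    echo "euca-run-instances -k $key -t $type -n $count $image"
}

# Example with the values used in this tutorial:
launch_cmd rovaleka c1.xlarge emi-D778156D
```

Copy the printed line into the shell to launch the instance once the values look right.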

  8. Step 8 (continued):
     • Check the status using:
       euca-describe-instances
     • Keep checking until the status is "running". Once the instance is running, you can log in to run Hadoop.
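Rather than re-running the status command by hand, the check can be put in a loop. The sketch below is one possible way to do it, not part of the original tutorial: `wait_until_running` is a hypothetical name, the limit of 60 polls is arbitrary, and it assumes the listing prints one line per instance that contains both the instance ID and its state.

```shell
# Hypothetical helper: poll until the given instance reports "running".
# $1 = instance ID, $2 = seconds between polls (default 10),
# $3 = command that lists instances (default: euca-describe-instances).
# Assumption: the listing has one line per instance containing both
# the instance ID and its state.
wait_until_running() {
    id=$1
    interval=${2:-10}
    describe=${3:-euca-describe-instances}
    tries=0
    while [ "$tries" -lt 60 ]; do
        if "$describe" | grep "$id" | grep -q "running"; then
            echo "$id is running"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$interval"
    done
    echo "$id did not reach running state" >&2
    return 1
}
```

Call it with the instance ID reported when the instance was launched, for example `wait_until_running <instance-id> 15`.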

  9. Step 9:
     • Transfer the input file and the jar file to the VM using:
       scp -i username.private filename root@149.165.146.171:/
       (Make sure the address is the one assigned to your instance; otherwise you will be asked for a password.)
     • Log in using:
       ssh -i username.private root@149.165.146.171
       (Again, use the address assigned to your instance.)
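Because both the jar file and the input file go to the same place, the copy can be done in one call. A minimal sketch, assuming the files should land in `/` on the VM as in the command above; `copy_to_vm` and the `SCP` override variable are hypothetical names introduced here.

```shell
# Hypothetical helper: copy one or more files to / on the VM.
# $1 = private key file, $2 = public IP assigned to your instance,
# remaining arguments = files to copy.
# SCP may be set to another command to preview what would be run.
copy_to_vm() {
    key=$1
    host=$2
    shift 2
    ${SCP:-scp} -i "$key" "$@" "root@$host:/"
}
```

For example, `copy_to_vm username.private 149.165.146.171 jarfile inputfile`, with the address replaced by the one assigned to you.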

  10. Step 10:
     • A welcome message is displayed on successful login.
     • Locate the transferred files and move them into the Hadoop folder:
       cd /..
       mv filename /opt/hadoop-0.20.2
       cd /opt/hadoop-0.20.2
     (Single-node setup)

  11. Step 11:
     • To start Hadoop:
       cd /opt/hadoop-0.20.2
       bin/start-all.sh
     • To check that everything has started:
       jps
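The `jps` listing can be checked automatically. The sketch below reads `jps` output and reports any of the five daemons that `start-all.sh` launches on a single-node Hadoop 0.20.x setup (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) that are absent. The function name `missing_daemons` is hypothetical.

```shell
# Hypothetical helper: read `jps` output on stdin and report missing daemons.
missing_daemons() {
    listing=$(cat)
    missing=""
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -w matches whole words, so "SecondaryNameNode" is not
        # counted as a match for "NameNode".
        if ! printf '%s\n' "$listing" | grep -qw "$d"; then
            missing="$missing $d"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all daemons running"
}
```

Use it as `jps | missing_daemons`; a non-zero exit status means at least one daemon did not start, and the logs folder is the place to look next.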

  12. Step 12:
     • Copy the input file to HDFS:
       bin/hadoop dfs -copyFromLocal inputfile name_in_HDFS
     • To check that it is present on HDFS:
       bin/hadoop dfs -ls
     NOTE: The input file must be copied to HDFS again each time Hadoop is started.

  13. Step 13:
     • To run the code:
       bin/hadoop jar [jarFile] EEMDHadoop [inputfilename] [required_output_file]

  14. Step 14:
     • Retrieve the output:
       bin/hadoop dfs -copyToLocal [outputFileName] [outputfileNameToBeGiven]
       (The output will be available in the part-00000 file.)
     • To check the logs and debug the code, go to the logs/userlogs folder.
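Steps 12 to 14 can be chained so that a failure in one step stops the rest. This is a sketch under stated assumptions, not the author's script: `run_eemd_job` and the `HADOOP` override variable are hypothetical, the input file is assumed to be a plain file name in the current directory (it is reused as the name in HDFS), and the function is meant to be run from /opt/hadoop-0.20.2.

```shell
# Hypothetical wrapper around Steps 12-14; run it from /opt/hadoop-0.20.2.
# $1 = jar file, $2 = local input file (also used as its name in HDFS),
# $3 = output name in HDFS, $4 = local name for the retrieved output.
# HADOOP may be set to another command for a dry run; default is bin/hadoop.
run_eemd_job() {
    jar=$1
    input=$2
    output=$3
    local_out=$4
    hadoop=${HADOOP:-bin/hadoop}
    # Each step runs only if the previous one succeeded.
    "$hadoop" dfs -copyFromLocal "$input" "$input" &&
    "$hadoop" jar "$jar" EEMDHadoop "$input" "$output" &&
    "$hadoop" dfs -copyToLocal "$output" "$local_out"
}
```

After a successful run, the result is expected in the part-00000 file under the local output name, as noted in Step 14.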

  15. Step 15:
     • Stop Hadoop:
       bin/stop-all.sh
       exit

  16. Thank you!!!
