
An example Apache Hadoop Yarn upgrade

This is a simple example of how Hadoop on Ubuntu Linux can be upgraded from V1 to Yarn. It shows the steps, the configuration, a mapreduce check and the errors encountered.

semtechs

Presentation Transcript


  1. Apache Yarn Upgrade
     • Example upgrade
     • From V1 -> Yarn
     • Environment
     • Approach
     • Install steps
     • Install check
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz

  2. Yarn Upgrade Environment
     • Java OpenJDK 1.6.0_27
     • Ubuntu 12.04
     • Maven 3.0.4
     • Hadoop 1.2.0
     • Mahout 0.9
     • Hadoop to install
         • 2.0.6-alpha
     Full details are available from our web site under the guides folder

  3. Yarn Upgrade Approach
     • Install alongside existing Hadoop on all nodes
     • Use existing hdfs
     • Change cfg files on all nodes
     • Set up as single nodes and test via mapreduce
     • Create cluster and test via mapreduce
     • Check web GUI access
     Full details are available from our web site under the guides folder

  4. Yarn Upgrade Install
     • Build with Maven into a distribution directory
         mvn clean package -Pdist -Dtar -DskipTests -Pnative
       release created under ./hadoop-dist/target/hadoop-2.0.6-alpha
     • Only skip tests after first build to speed things up
     • Configure $HOME/.bashrc
         • HADOOP_COMMON_HOME
         • HADOOP_HDFS_HOME
         • HADOOP_MAPRED_HOME
         • HADOOP_YARN_HOME
         • HADOOP_CONF_DIR
         • YARN_CONF_DIR
         • MAPRED_CONF_DIR
         • HADOOP_PREFIX
         • PATH
         • YARN_CLASSPATH
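The variables listed on slide 4 could be set in $HOME/.bashrc along the following lines. This is only a sketch: the install location under $HOME and the choice to point every *_HOME variable at the same distribution tree are assumptions, not taken from the slides. YARN_CLASSPATH is left out because the slides do not say what it should contain.

```shell
# Assumed install location (hypothetical): the built distribution copied to $HOME.
export HADOOP_PREFIX="$HOME/hadoop-2.0.6-alpha"

# Assumption: all components live in the same distribution tree.
export HADOOP_COMMON_HOME="$HADOOP_PREFIX"
export HADOOP_HDFS_HOME="$HADOOP_PREFIX"
export HADOOP_MAPRED_HOME="$HADOOP_PREFIX"
export HADOOP_YARN_HOME="$HADOOP_PREFIX"

# Configuration files sit under etc/hadoop, matching the cd commands on later slides.
export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
export YARN_CONF_DIR="$HADOOP_CONF_DIR"
export MAPRED_CONF_DIR="$HADOOP_CONF_DIR"

# Put the new bin and sbin directories ahead of the existing Hadoop 1.2.0 ones.
export PATH="$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$PATH"
```

Placing the new directories first in PATH means that commands such as hadoop and hdfs resolve to the 2.0.6-alpha install while the V1 install stays on disk.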

  5. Yarn Upgrade Install
     • Set up core-site.xml
         cd $HADOOP_COMMON_HOME/etc/hadoop
     • Alter values for
         • fs.default.name
         • hadoop.tmp.dir
         • fs.checkpoint.dir
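As an illustration of slide 5, a small helper could generate a core-site.xml that carries the three properties named there. The function name write_core_site is hypothetical, and the host, port and paths in the example call are placeholders rather than values from the slides.

```shell
# write_core_site DIR FS_URI TMP_DIR CHECKPOINT_DIR
# Hypothetical helper: writes a minimal core-site.xml into DIR.
# Every value is supplied by the caller; nothing here comes from the slides.
write_core_site() {
  conf_dir=$1; fs_uri=$2; tmp_dir=$3; checkpoint_dir=$4
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>$fs_uri</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <value>$checkpoint_dir</value>
  </property>
</configuration>
EOF
}

# Example call with placeholder host, port and paths:
# write_core_site "$HADOOP_COMMON_HOME/etc/hadoop" \
#   hdfs://namenode-host:9000 /var/hadoop/tmp /var/hadoop/checkpoint
```

Since the approach is to reuse the existing hdfs, the values passed in would normally mirror those already present in the V1 configuration.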

  6. Yarn Upgrade Install
     • Set up hdfs-site.xml
         cd $HADOOP_HDFS_HOME/etc/hadoop
     • Alter values for
         • dfs.name.dir
         • dfs.data.dir
         • dfs.http.address
         • dfs.secondary.http.address
         • dfs.https.address

  7. Yarn Upgrade Install
     • Set up yarn-site.xml
         cd $YARN_CONF_DIR
     • Alter values for
         • yarn.resourcemanager.resource-tracker.address
         • yarn.resourcemanager.scheduler.address
         • yarn.resourcemanager.scheduler.class
         • yarn.resourcemanager.address
         • yarn.nodemanager.local-dirs
         • yarn.nodemanager.address
         • yarn.nodemanager.resource.memory-mb
         • yarn.nodemanager.remote-app-log-dir
         • yarn.nodemanager.log-dirs
         • yarn.nodemanager.aux-services
         • yarn.web-proxy.address
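With this many properties to set on every node, a quick check that none were forgotten may help. The sketch below uses a hypothetical function, missing_properties, that simply greps a config file for each expected name element; it does not validate the XML or the values.

```shell
# missing_properties FILE NAME...
# Hypothetical helper: prints each property NAME that has no
# <name>NAME</name> entry in FILE. Returns 0 if nothing is missing, else 1.
missing_properties() {
  file=$1; shift
  status=0
  for name in "$@"; do
    if ! grep -qF "<name>$name</name>" "$file"; then
      echo "$name"
      status=1
    fi
  done
  return $status
}

# Example call against the file edited on this slide:
# missing_properties "$YARN_CONF_DIR/yarn-site.xml" \
#   yarn.resourcemanager.address \
#   yarn.resourcemanager.scheduler.address \
#   yarn.nodemanager.local-dirs \
#   yarn.nodemanager.aux-services
```

The same helper could be pointed at hdfs-site.xml, mapred-site.xml or the capacity scheduler file with the property names from those slides.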

  8. Yarn Upgrade Install
     • Set up mapred-site.xml
         cd $MAPRED_CONF_DIR
     • Alter values for
         • mapreduce.cluster.temp.dir
         • mapreduce.cluster.local.dir
         • mapreduce.jobhistory.address
         • mapreduce.jobhistory.webapp.address

  9. Yarn Upgrade Install
     • Set up capacity-scheduler.xml
         cd $HADOOP_YARN_HOME/etc/hadoop
     • Alter values for
         • yarn.scheduler.capacity.maximum-applications
         • yarn.scheduler.capacity.maximum-am-resource-percent
         • yarn.scheduler.capacity.resource-calculator
         • yarn.scheduler.capacity.root.queues
         • yarn.scheduler.capacity.child.queues
         • yarn.scheduler.capacity.child.unfunded.capacity
         • yarn.scheduler.capacity.child.default.capacity
         • yarn.scheduler.capacity.root.capacity
         • yarn.scheduler.capacity.root.unfunded.capacity
         • yarn.scheduler.capacity.root.default.capacity
         • yarn.scheduler.capacity.root.default.user-limit-factor
         • yarn.scheduler.capacity.root.default.maximum-capacity
         • yarn.scheduler.capacity.root.default.state
         • yarn.scheduler.capacity.root.default.acl_submit_applications
         • yarn.scheduler.capacity.root.default.acl_administer_queue
         • yarn.scheduler.capacity.node-locality-delay

  10. Yarn Upgrade Install
     • Start Resource Manager
         cd $HADOOP_YARN_HOME
         sbin/yarn-daemon.sh start resourcemanager
     • Start Node Manager
         cd $HADOOP_YARN_HOME
         sbin/yarn-daemon.sh start nodemanager
     • Test via map reduce job
         cd $HADOOP_MAPRED_HOME/share/hadoop/mapreduce
         $HADOOP_COMMON_HOME/bin/hadoop jar \
           hadoop-mapreduce-examples-2.0.6-alpha.jar randomwriter out

  11. Yarn Upgrade Install
     • Mapreduce job should end with
         BYTES_WRITTEN=1073750341
         RECORDS_WRITTEN=102099
         File Input Format Counters
           Bytes Read=0
         File Output Format Counters
           Bytes Written=1085699265
         Job ended: Sun Aug 25 12:45:35 NZST 2013
         The job took 89 seconds.
     • Run this test on each node being upgraded
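Because the same test is repeated on every node, the check for a completed job could be scripted. The sketch below assumes the job's console output has been captured to a file (the file name randomwriter.log is a placeholder) and looks for the closing "The job took N seconds." line shown on slide 11. The function name job_seconds is hypothetical.

```shell
# job_seconds LOGFILE
# Hypothetical helper: prints N from a "The job took N seconds." line in
# LOGFILE. Fails (non-zero status) if no such line is present.
job_seconds() {
  sed -n 's/^.*The job took \([0-9][0-9]*\) seconds\..*$/\1/p' "$1" | grep .
}

# Example use, capturing both stdout and stderr of the test job:
# $HADOOP_COMMON_HOME/bin/hadoop jar \
#   hadoop-mapreduce-examples-2.0.6-alpha.jar randomwriter out 2>&1 | tee randomwriter.log
# job_seconds randomwriter.log || echo "job did not finish" >&2
```

This only confirms that the job reached its final line; the counters themselves still need to be read by eye.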

  12. Yarn Upgrade Install
     • Stop the servers
         cd $HADOOP_YARN_HOME
         sbin/yarn-daemon.sh stop resourcemanager
           stopping resourcemanager
         sbin/yarn-daemon.sh stop nodemanager
           stopping nodemanager
     • Alter Hadoop env
         cd $HADOOP_CONF_DIR
         vi hadoop-env.sh
       add a JAVA_HOME definition at the end. i.e.
         export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386

  13. Yarn Upgrade Install
     • Alter $HADOOP_CONF_DIR/slaves file
         • Add details (one per line) for slave nodes
     • Format the cluster
         • Do NOT have the cluster running, or you will lose data
         • hdfs namenode -format
     • Now proceed to start the cluster
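The warning about not formatting a running cluster can be turned into a guard. The sketch below assumes jps is on the PATH and that its output has the "PID Name" shape shown on slide 15; the function name hadoop_daemons_running is hypothetical.

```shell
# hadoop_daemons_running
# Hypothetical helper: reads jps-style output ("PID Name" per line) on stdin
# and succeeds if any HDFS or YARN daemon is listed.
hadoop_daemons_running() {
  grep -Eq ' (NameNode|DataNode|SecondaryNameNode|ResourceManager|NodeManager)$'
}

# Usage sketch: refuse to format while anything is still up.
# if jps | hadoop_daemons_running; then
#   echo "Stop the cluster before formatting" >&2
# else
#   hdfs namenode -format
# fi
```

The guard only sees daemons on the local node, so it would need to be run on each node listed in the slaves file as well.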

  14. Yarn Upgrade Install
         cd $HADOOP_COMMON_HOME
         sbin/hadoop-daemon.sh --config $HADOOP_COMMON_HOME/etc/hadoop --script hdfs start namenode
         cd $HADOOP_COMMON_HOME
         sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
         cd $HADOOP_YARN_HOME
         sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
         cd $HADOOP_YARN_HOME
         sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
         cd $HADOOP_YARN_HOME
         bin/yarn start proxyserver --config $HADOOP_CONF_DIR
         cd $HADOOP_MAPRED_HOME
         sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR

  15. Yarn Upgrade Install
     • Use jps to check servers running
         jps
         5856 DataNode
         6434 Jps
         5776 NameNode
         6181 NodeManager
         6255 WebAppProxyServer
         5927 ResourceManager
         6352 JobHistoryServer
     • Then run the same mapreduce job on the cluster
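The jps check on slide 15 can also be automated. The sketch below uses a hypothetical function, missing_daemons, which reads jps output on stdin and reports any expected process name that is absent. The list of names in the example call is taken from the jps listing on the slide.

```shell
# missing_daemons NAME...
# Hypothetical helper: reads jps output ("PID Name" per line) on stdin and
# prints each expected daemon NAME that is not listed.
# Returns 0 if all are present, else 1.
missing_daemons() {
  listing=$(cat)
  status=0
  for name in "$@"; do
    # The leading space stops "NameNode" from matching "SecondaryNameNode".
    if ! printf '%s\n' "$listing" | grep -q " $name\$"; then
      echo "$name"
      status=1
    fi
  done
  return $status
}

# Example call:
# jps | missing_daemons NameNode DataNode ResourceManager NodeManager \
#   WebAppProxyServer JobHistoryServer
```

On a multi-node cluster each node runs only some of these processes, so the expected list would differ between the master and the slave nodes.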

  16. Web Access

  17. Web Access

  18. Web Access

  19. Contact Us
     • Feel free to contact us at
         • www.semtech-solutions.co.nz
         • info@semtech-solutions.co.nz
     • We offer IT project consultancy
     • We are happy to hear about your problems
     • You can just pay for those hours that you need
         • To solve your problems
