
The Hadoop Distributed Filesystem: Balancing Portability and Performance

Jeffrey Shafer, Scott Rixner and Alan L. Cox. Presented by: Bhavani Sankar Ikkurthi, CS 775, Spring 2011, Old Dominion University.


Presentation Transcript


  1. The Hadoop Distributed Filesystem: Balancing Portability and Performance Jeffrey Shafer, Scott Rixner and Alan L. Cox Presented by: Bhavani Sankar Ikkurthi CS 775, Spring 2011, Old Dominion University

  2. Bottlenecks • Software Architectural Bottlenecks • Portability Limitations • Portability Assumptions

  3. Components • MapReduce Engine • Hadoop Distributed File System • HDFS Replication

  4. Setup

  5. Evaluation • Experimental Setup • 5-node cluster • 4 nodes: computation and storage • 1 node: scheduler and storage manager • 2-processor Opteron, 2.4 GHz, 4 GB RAM • UFS2 filesystem, 16 kB block size • Hadoop replication disabled

  6. Evaluation

  7. Evaluation

  8. Evaluation

  9. Evaluation

  10. Evaluation

  11. Evaluation

  12. Evaluation

  13. Solutions

  14. Solutions • Application Disk Scheduling

  15. Solutions • Non-portable • OS Hints • Filesystem Selection • Cache Bypass • Local Filesystem Elimination

  16. Conclusions • Interactions between Hadoop and storage are characterized • Bottlenecks found in HDFS implementation • Solutions are proposed to escape bottlenecks • Maintain Hadoop portability whenever possible

  17. Thank You Questions?
