
Jumbune Community 1.0.0 release: riding an elephant was never so easy…


jonny

Presentation Transcript


  1. Jumbune Community 1.0.0 release: riding an elephant was never so easy…

  2. Agenda
     • Differentiator
     • Ultra light Deployment
     • Features – Description
     • Strategic Users & Positioning

  3. Differentiator

  4. What does Jumbune solve?
     • Detects job code and data inaccuracies: Hadoop MapReduce analytics output is not as expected
     • Analyses job profiling: MapReduce jobs have performance bottlenecks
     • On-demand cluster monitoring: the whole cluster can't be monitored or unmonitored at will
     • Non-intrusive operation: users don't want intrusive deployments of monitoring or analysis daemons

  5. Key Differentiating Features
     • Cluster monitoring can be turned on on demand
     • MapReduce flow drill-down (world's only)
     • Installation decoupled from Hadoop
     • MapReduce phase-wise statistics (time vs. data flow rate vs. resource)

  6. Jumbune (Non-intrusive) ultra light deployment

  7. Decoupled Jumbune & Hadoop

  8. Features - Description

  9. Features & Users
     • Hadoop Cluster Monitoring
     • MR Job Profiling
     • HDFS Data Validator
     • MR Flow Debugger

  10. Features & Recommended Environments

  11. Supported Deployments

  12. Analytic Solution Costs & Solutions
     • MapReduce solution development costs:
        • Development and data staging are fault prone
        • Faults take days to resolve on real data (because of volume)
        • MapReduce jobs may contain performance bottlenecks
     • Hadoop cluster monitoring costs:
        • Administrator: analyzes each node separately
        • On each node: monitoring daemons must be installed and run
        • Cluster resources: running daemons consume them

  13. Hadoop Cluster Monitoring
     • Data centre and rack aware nodes view
     • Dynamic interval based monitoring
     • Hadoop JMX, node resource statistics
     • Network latency across Hadoop nodes
     • Per file, node-wise replica placement (which nodes have replicas of a given file?)
     • HDFS data placement view (is HDFS balanced?)
     • HDFS health statistics (is HDFS corrupted?)
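The two HDFS questions on this slide (which nodes hold replicas of a file, and whether HDFS is balanced) can be illustrated with a small sketch. This is not Jumbune's implementation; the function names and the input dictionaries are hypothetical. The balance test assumes the same idea as the HDFS balancer: a node counts as balanced when its utilization is within a threshold (10 percentage points by default in the balancer) of the cluster-wide utilization.

```python
from typing import Dict, List, Set


def nodes_with_replicas(block_locations: Dict[str, List[str]]) -> Set[str]:
    """Return every node holding at least one replica of a file's blocks.

    block_locations is a hypothetical mapping: block ID -> nodes with a replica.
    """
    return {node for nodes in block_locations.values() for node in nodes}


def is_balanced(used_bytes: Dict[str, int],
                capacity_bytes: Dict[str, int],
                threshold_pct: float = 10.0) -> bool:
    """Check that each node's utilization is within threshold_pct
    percentage points of the cluster-wide utilization."""
    cluster_util = 100.0 * sum(used_bytes.values()) / sum(capacity_bytes.values())
    return all(
        abs(100.0 * used_bytes[node] / capacity_bytes[node] - cluster_util) <= threshold_pct
        for node in used_bytes
    )


# Hypothetical sample data for one file with two blocks.
locations = {
    "blk_1": ["node1", "node2", "node3"],
    "blk_2": ["node2", "node3", "node4"],
}
print(sorted(nodes_with_replicas(locations)))
print(is_balanced({"node1": 50, "node2": 55}, {"node1": 100, "node2": 100}))
```

In a real cluster the inputs would come from the NameNode's block reports rather than hand-written dictionaries.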

  14. MR Job Profiling
     • Per job, phase wise:
        • Performance for each JVM
        • Data flow rate
        • Resource usage
     • Per job heap sites for Mapper & Reducer
     • Per job CPU cycles for Mapper & Reducer
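As a rough illustration of the phase-wise "data flow rate" statistic, the sketch below divides the bytes processed in each phase by the time spent in it. The phase names, the input dictionaries, and the function name are assumptions made for this example; the slides do not say how Jumbune computes the figure.

```python
from typing import Dict


def phase_flow_rates(bytes_by_phase: Dict[str, int],
                     seconds_by_phase: Dict[str, float]) -> Dict[str, float]:
    """Return bytes per second for each phase.

    A phase with no recorded duration gets a rate of 0.0 to avoid
    dividing by zero.
    """
    rates = {}
    for phase, n_bytes in bytes_by_phase.items():
        seconds = seconds_by_phase.get(phase, 0.0)
        rates[phase] = n_bytes / seconds if seconds > 0 else 0.0
    return rates


# Hypothetical measurements for a single job.
print(phase_flow_rates({"map": 1000, "reduce": 500},
                       {"map": 10.0, "reduce": 4.0}))
```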

  15. HDFS Data Validator
     • Validates inconsistencies in HDFS data in the form of:
        • Null checks
        • Data type checks
        • Regular expression checks
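The three kinds of check listed on this slide can be sketched for a single delimited record as follows. This is a minimal illustration, not Jumbune's validator: the function names, the field indexes, the violation labels, and the e-mail pattern are all hypothetical.

```python
import re
from typing import Callable, Dict, List, Pattern, Tuple


def is_int(value: str) -> bool:
    """Data type check: does the field parse as an integer?"""
    try:
        int(value)
        return True
    except ValueError:
        return False


def validate_record(line: str,
                    delimiter: str,
                    null_check_fields: List[int],
                    type_checks: Dict[int, Callable[[str], bool]],
                    regex_checks: Dict[int, Pattern[str]]) -> List[Tuple[int, str]]:
    """Return (field index, violation kind) pairs for one record."""
    fields = line.rstrip("\n").split(delimiter)
    violations = []
    # Null check: the field must exist and be non-empty.
    for idx in null_check_fields:
        if idx >= len(fields) or fields[idx].strip() == "":
            violations.append((idx, "null"))
    # Data type check: applied only to fields that are present and non-empty.
    for idx, check in type_checks.items():
        if idx < len(fields) and fields[idx].strip() and not check(fields[idx]):
            violations.append((idx, "data_type"))
    # Regular expression check: the whole field must match the pattern.
    for idx, pattern in regex_checks.items():
        if idx < len(fields) and fields[idx].strip() and not pattern.fullmatch(fields[idx]):
            violations.append((idx, "regex"))
    return violations


# Hypothetical rules: field 0 is an integer ID, field 1 must not be empty,
# field 2 should look like an e-mail address.
email = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
for record in ["1,alice,alice@example.com", "x,,bad"]:
    print(record, validate_record(record, ",", [0, 1], {0: is_int}, {2: email}))
```

On a cluster the same per-record function would run over every line of the HDFS files being validated, with violations aggregated per file and field.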

  16. MapReduce Flow Debugger
     • Verifies the flow of input records through the user's MapReduce implementation
     • Drill-down visualization helps developers quickly identify the problem
     • The only tool that helps developers find MapReduce implementation faults without any extra coding

  17. Thanks
