Stinger Initiative: Deep Dive

Stinger Initiative: Deep Dive Interactive Query on Hadoop Chris Harris E-Mail : charris@hortonworks.com Twitter : cj_harris5

Agenda • Key Hive Use Cases • Brief Refresher on Hive • The Stinger Initiative: Interactive Query for Hive

Key Hive Use Cases • RDBMS / MPP Offload • More data under query. • Database unable to keep up with SLAs. • Analysis of semi-structured data. • ETL / Data Refinement • +++ Increasingly: Business Intelligence and interactive query

BI Use Cases Enterprise Reports Dashboard / Scorecard Parameterized Reports Visualization Data Mining

Organize Tiers and Process with Metadata Access Tier Pig Conform, Summarize, Access HiveQL Gold Tier • Organize data based on source/derived relationships • Allows for fault and rebuild process HCat Pig Transform, Integrate, Storage MapReduce Work Tier Provides unified metadata access to Pig, Hive & MapReduce Pig Standardize, Cleanse, Transform MapReduce Raw Tier WebHDFS Extract & Load Flume Sqoop

Hive Current Focus Area Real-Time Interactive Non-Interactive Batch • Online systems • R-T analytics • CEP • Parameterized Reports • Drilldown • Visualization • Exploration • Data preparation • Incremental batch processing • Dashboards / Scorecards • Operational batch processing • Enterprise Reports • Data Mining Current Hive Sweet Spot 1m – 1h 1h+ 0-5s 5s – 1m Data Size

Stinger: Extending Hive’s Sweetspot Real-Time Interactive Non-Interactive Batch Current Hive Sweet Spot • Online systems • R-T analytics • CEP • Parameterized Reports • Drilldown • Visualization • Exploration • Data preparation • Incremental batch processing • Dashboards / Scorecards • Operational batch processing • Enterprise Reports • Data Mining Future Hive Expansion 1m – 1h 1h+ 0-5s 5s – 1m Data Size • Improve Latency & Throughput • Query engine improvements • New “Optimized RCFile” column store • Next-gen runtime (elim’s M/R latency) • Extend Deep Analytical Ability • Analytics functions • Improved SQL coverage • Continued focus on core Hive use cases

The top BI vendors support Hive today

Agenda • Key Hive Use Cases • Brief Refresher on Hive • The Stinger Initiative: Interactive Query for Hive

Brief Refresher on Hive The State of Hive Today (0.10)

Hive’s Origins Hive was originally developed at Facebook. More data than existing RDBMS could handle. 60,000+ Hive queries per day. More than 1,000 users per day. 100+ PB of data. 15+ TB of data loaded daily. Hive is a proven solution at extreme scale.

Hive 0.10 Capabilities • De-facto SQL Interface for Hadoop • Multiple persistence options: • Flat text for simple data imports. • Columnar format (RCFile) for high performance processing. • Secure and concurrent remote access • ODBC/JDBC connectivity • Highly extensible: • Supports User Defined Functions and User Defined Aggregation Functions. • Ships with more than 150 UDF/UDAF. • Extensible readers/writers can process any persisted data. • Support from 10+ BI vendors

HDP 1.2: ODBC Access for Popular BI Tools • Seamless integration with BI tools such as Excel, PowerPivot, MicroStrategy, and Tableau • Efficiently maps advanced SQL functionality into HiveQL • With configurable pass-through of HiveQL for Hive-aware apps • ODBC 3.52 standard compliant • Supports Linux & Windows Applications & Spreadsheets Visualization & Intelligence ODBC Hortonworks Data Platform High quality ODBC driver developed in partnership with Simba. Free to download & use with Hortonworks Data Platform.

0 to Big Data in 15 Minutes Hands on tutorials integrated into Sandbox HDP environment for evaluation

Agenda • Brief Refresher on Hive • Key Hive Use Cases • The Stinger Initiative: Interactive Query for Hive

The Stinger Initiative Interactive Query on Hadoop

Stinger Initiative: 2-Pronged Approach Making Hive Best for Interactive Query Improve Latency and Throughput Extend Deep Analytical Ability • Tez • New primitives move beyond map-reduce and beyond batch • Avoid unnecessary persistence of temporary data • Hive, Pig and others generate Tezplans for high perf • Query Engine Improvements • Cost-based optimizer • In-memory joins • Caching hot tables • Vector processing • State-of-the-art Column Store • “Optimized RCFile” or ORCFile • Minimizes disk IO and deserialization • TezService • Always-on service for query interactivity • Analytics Functions • SQL:2003 Compliant • OVER with PARTITION BY and ORDER BY • Wide variety of windowing functions: • RANK • LEAD/LAG • ROW_NUMBER • FIRST_VALUE • LAST_VALUE • Many more • Aligns well with BI ecosystem • Improved SQL Coverage • Non-correlated Subqueries using IN in WHERE • Expanded SQL types including DATETIME, VARCHAR, etc.

Hive: Performance Improvements

Stinger Initiative At A Glance

Base Optimizations: Intelligent Optimizer • Introduction of In-Memory Hash Join: • For joins where one side fits in memory: • New in-memory-hash-join algorithm. • Hive reads the small table into a hash table. • Scans through the big file to produce the output. • Introduction of Sort-Merge-Bucket Join: • Applies when tables are bucketed on the same key. • Dramatic speed improvements seen in benchmarks. • Other Improvements: • Lower the footprint of the fact tables in memory. • Enable the optimizer to automatically pick map joins.

Dimensionally Structured Data • Extremely common pattern in EDW. • Results in large “fact tables” and small “dimension tables”. • Dimension tables often small enough to fit in RAM. • Sometimes called Star Schema.

A Query on Dimensional Data • Derived from TPC-DS Query 27 • Dramatic speedup on Hive 0.11 SELECT col5, avg(col6) FROM fact_table join dim1 on (fact_table.col1 = dim1.col1) join dim2 on (fact_table.col2 = dim2.col1) join dim3 on (fact_table.col3 = dim3.col1) join dim4 on (fact_table.col4 = dim4.col1) GROUP BY col5 ORDER BY col5 LIMIT 100;

Star Schema Join Improvements in 0.11

Hive: Bucketing • Bucketing causes Hive to physically co-locate rows within files. • Buckets can be sorted or unsorted. CREATE EXTERNAL TABLE IF NOT EXISTS test_table ( Id INT, name String ) PARTITIONED BY (dt STRING, hour STRING) CLUSTERED BY(country,continent) SORTED BY(country,continent) INTO n BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LOCATION '/home/test_dir';

ORCFile - Optimized Column Storage • Make a better columnar storage file • Tightly aligned to Hive data model • Decompose complex row types into primitive fields • Better compression and projection • Only read bytes from HDFS for the required columns. • Store column level aggregates in the files • Only need to read the file meta information for common queries • Stored both for file and each section of a file • Aggregates: min, max, sum, average, count • Allows fast access by sorted columns • Ability to add bloom filters for columns • Enables quick checks for whether a value is present

Performance Futures - Vectorization • Operates on blocks of 1K or more records, rather than one record at a time • Each block contains an array of Java scalars, one for each column • Avoids many function calls, virtual dispatch, CPU pipeline stalls • Size to fit in L1 cache, avoid cache misses • Generate code for operators on the fly to avoid branches in code, maximize deep pipelines of modern processers • Up to 30x faster processing of records • Beta possible in 2H 2013

Performance Futures – Cost-Based Optimizer • Generate more intelligent DAGs based on properties of data being queried, e.g. table size, statistics, histograms, etc.

Performance Futures - Buffering • Query workloads always have hotspots: • Metadata • Small dimension tables • Build into YARN or Tez Service ways of buffering frequently used data into memory so it is not always read from disk. • Part of the “last mile” of latency efforts.

Yarn Moving Hive and Hadoop beyond MapReduce

Hadoop 2.0 Innovations - YARN • Focus on scale and innovation • Support 10,000+ computer clusters • Extensible to encourage innovation • Next generation execution • Improves MapReduce performance • Supports new frameworks beyond MapReduce • Low latency, Streaming, Services • Do more with a single Hadoop cluster Graph Processing Other MapReduce Tez YARN: Cluster Resource Management HDFS Redundant, Reliable Storage

Tez Moving Hive and Hadoop beyond MapReduce

Tez • Low level data-processing execution engine • Use it for the base of MapReduce, Hive, Pig, Cascading etc. • Enables pipelining of jobs • Removes task and job launch times • Hive and Pig jobs no longer need to move to the end of the queue between steps in the pipeline • Does not write intermediate output to HDFS • Much lighter disk and network usage • Built on YARN

Tez - Core Idea Task with pluggable Input, Processor & Output Processor Output Input Task Tez Task - <Input, Processor, Output> YARN ApplicationMaster to run DAG of Tez Tasks

Tez – Blocks for building tasks MapReduce ‘Map’ MapReduce ‘Reduce’ Map Processor Reduce Processor Reduce Processor Sorted Output HDFS Output Sorted Output HDFS Input Shuffle Input Shuffle Input Intermediate ‘Reduce’ for Map-Reduce-Reduce MapReduce ‘Map’ Task MapReduce ‘Reduce’ Task Intermediate ‘Reduce’ for Map-Reduce-Reduce

Tez – More tasks Special Pig/Hive ‘Map’ In-memory Map Map Processor Map Processor Reduce Processor Pipeline Sorter Output In-memory Sorted Output Sorted Output HDFS Input HDFSInput Shuffle Skip-merge Input Tez Task Tez Task Tez Task Special Pig/Hive ‘Reduce’

Pig/Hive-MR versus Pig/Hive-Tez SELECT a.state, COUNT(*), AVERAGE(c.price) FROM a JOIN bON (a.id = b.id) JOIN cON (a.itemId = c.itemId) GROUP BY a.state Job 1 Job 2 I/O Synchronization Barrier I/O Synchronization Barrier Single Job Job 3 Pig/Hive - Tez Pig/Hive - MR

FastQuery: Beyond Batch with YARN Always-On Tez Service Low latency processing for all Hadoop data processing Tez Generalizes Map-Reduce Simplified execution plans process data more efficiently

Tez Service • MR Query Startup Expensive • Job launch & task-launch latencies are fatal for short queries (in order of 5s to 30s) • Solution • Tez Service • Removes task-launch overhead • Removes job-launch overhead • Hive/Pig • Submit query-plan to Tez Service • Native Hadoop service, not ad-hoc

Tez Service Delivers Low Latency SELECT a.state, COUNT(*), AVERAGE(c.price) FROM a JOIN b ON (a.id = b.id) JOIN c ON (a.itemId = c.itemId) GROUP BY a.state * Numbers for illustration only

Recap and Questions: Hive Performance

Improving Hive’s SQL Support

Stinger: Deep Analytical Capabilities • SQL:2003 Window Functions • OVER clauses • Multiple PARTITION BY and ORDER BY supported • Windowing supported (ROWS PRECEDING/FOLLOWING) • Large variety of aggregates • RANK • FIRST_VALUE • LAST_VALUE • LEAD / LAG • Distrubutions

Hive Data Type Conformance • Data Types: • Add fixed point NUMERIC and DECIMAL type (in progress) • Add VARCHAR and CHAR types with limited field size • Add DATETIME • Add size ranges from 1 to 53 for FLOAT • Add synonyms for compatibility • BLOB for BINARY • TEXT for STRING • REAL for FLOAT • SQL Semantics: • Sub-queries in IN, NOT IN, HAVING. • EXISTS and NOT EXISTS

Questions?

Thank You! Questions & Answers

Stinger Initiative: Deep Dive