
Query Task Model (QTM): Modeling Query Execution with Tasks

Steffen Zeuch and Johann-Christoph Freytag


Presentation Transcript


  1. Query Task Model (QTM): Modeling Query Execution with Tasks • Steffen Zeuch and Johann-Christoph Freytag

  2. Motivation • Different DBMSs execute the same QEP using different schedules • Focus: run-time execution, not query optimization • No uniform scheduling format • Query execution in different DBMSs is not comparable • Major differences between DBMSs: • Chunk size: size of an operator's input • Scheduling strategy: execution model vs. run-time scheduler • Question: How can different schedules be made comparable, so that we can explain why one schedule performs better than another?

  3. Outline • Parallel Query Execution • QTM: Query Task Model • Evaluation • Outlook

  4. Chunk Size • Column-at-a-time • Buffer-at-a-time • Tuple-at-a-time • [Figure: a Selection operator consuming tuples t1 to t6 either as a whole column, in buffers (t1, t2, t3) and (t4, t5, t6), or one tuple at a time]
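The three chunk sizes on this slide can be illustrated with a small sketch. It is not taken from the presentation: the function names `chunks` and `select`, the integer stand-ins for t1 to t6, and the even-number predicate are all assumptions made for illustration. The only point is that the same selection yields the same result while the number of operator invocations changes with the chunk size.

```python
from typing import Callable, Iterator, List, Tuple


def chunks(tuples: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most chunk_size tuples."""
    for start in range(0, len(tuples), chunk_size):
        yield tuples[start:start + chunk_size]


def select(tuples: List[int], predicate: Callable[[int], bool],
           chunk_size: int) -> Tuple[List[int], int]:
    """Run a selection chunk by chunk; return (result, operator calls)."""
    result: List[int] = []
    calls = 0
    for chunk in chunks(tuples, chunk_size):
        calls += 1  # one operator invocation per chunk
        result.extend(t for t in chunk if predicate(t))
    return result, calls


column = [1, 2, 3, 4, 5, 6]  # hypothetical stand-ins for t1..t6


def is_even(t: int) -> bool:
    return t % 2 == 0


tuple_at_a_time = select(column, is_even, 1)             # 6 calls
buffer_at_a_time = select(column, is_even, 3)            # 2 calls
column_at_a_time = select(column, is_even, len(column))  # 1 call
```

Smaller chunks mean more invocations per query, larger chunks mean more data touched per invocation; that is the dimension the slide calls chunk size.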

  5. Scheduling Strategy • [Figure: query plan in which a Selection on T feeds Hash Probe (S) and then Hash Probe (R); the hash tables come from Hash Build operators on R and S]

  6. Volcano Execution Model (Open-Next-Close Iterator) • [Figure: the same plan (Selection on T, Hash Probe (S), Hash Probe (R), Hash Build on R and S); each operator calls Next on its child and receives a Tuple in return]
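A minimal sketch of the open-next-close iterator interface may help here. It assumes the plan shape from the slide (a selection on T that probes a hash table on S and then one on R), but the class names, the tuple layout, the join on the first field, and the sample rows are hypothetical and not from the presentation. For brevity the hash build happens inside `open` of the probe operator instead of in a separate build operator.

```python
class Scan:
    """Leaf operator: returns the rows of an in-memory table one by one."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of stream
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass


class Selection:
    """Pulls rows from its child until one satisfies the predicate."""

    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

    def close(self):
        self.child.close()


class HashProbe:
    """Builds a hash table on build_rows in open(), then probes per row."""

    def __init__(self, child, build_rows):
        self.child, self.build_rows = child, build_rows

    def open(self):
        self.table = {}
        for row in self.build_rows:  # join key: first field (assumption)
            self.table.setdefault(row[0], []).append(row)
        self.pending = []
        self.child.open()

    def next(self):
        while not self.pending:
            row = self.child.next()
            if row is None:
                return None
            self.pending = [row + m for m in self.table.get(row[0], [])]
        return self.pending.pop(0)

    def close(self):
        self.child.close()


def execute(root):
    """Drive the plan from the root: open, call next until None, close."""
    root.open()
    out = []
    while (row := root.next()) is not None:
        out.append(row)
    root.close()
    return out


T = [(1,), (2,), (3,), (4,)]
S = [(2, "s2"), (4, "s4")]
R = [(4, "r4")]
plan = HashProbe(HashProbe(Selection(Scan(T), lambda r: r[0] % 2 == 0), S), R)
result = execute(plan)
```

In this model the schedule is implicit: the call chain of `next` fixes the order in which operators touch each tuple, so there is no separate scheduler to configure.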

  7. (Run-time) Scheduler • Spatial locality vs. temporal locality • [Figure: two task orders over time for Sel, Prob_S and Prob_R on tuples t1 and t2; one runs each operator on both tuples before moving to the next operator, the other runs all three operators on t1 before starting t2] • Further optimization criteria: I/O, NUMA or memory usage
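The two task orders in the figure can be generated with a short sketch. The names `operator_major` and `tuple_major` are labels chosen here, not terms from the slides, and the transcript does not make it certain which of the two orders the slide labels "spatial" and which "temporal". The helper `respects_pipeline_order` is likewise hypothetical; it only checks that every tuple passes Sel before Prob_S before Prob_R.

```python
from itertools import product

operators = ["Sel", "Prob_S", "Prob_R"]  # pipeline order for one tuple
tuples = ["t1", "t2"]

# Run one operator on every tuple before moving to the next operator.
operator_major = [(op, t) for op, t in product(operators, tuples)]

# Push one tuple through all operators before starting the next tuple.
tuple_major = [(op, t) for t, op in product(tuples, operators)]


def respects_pipeline_order(schedule):
    """True if each task runs only after its predecessor on the same tuple."""
    seen = set()
    for op, t in schedule:
        idx = operators.index(op)
        if idx > 0 and (operators[idx - 1], t) not in seen:
            return False
        seen.add((op, t))
    return True


def labels(schedule):
    """Render tasks in the notation used on the slide, e.g. 'Sel(t1)'."""
    return [f"{op}({t})" for op, t in schedule]
```

Both orders are valid schedules for the same plan; a run-time scheduler is free to pick either, or anything in between, according to the criteria it optimizes for.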

  8. Dynamic Load Balancing • [Figure: tasks T1 to T5 of a plan with two joins (⋈) and two selections (σ) over R, S and T, distributed across CPU1 and CPU2]
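One common way to realize dynamic load balancing is to hand the next pending task to whichever CPU becomes free first. The sketch below simulates that policy; it is an illustration under assumptions, not the mechanism shown on the slide. The function name `balance` and the task costs are hypothetical.

```python
import heapq


def balance(task_costs, num_cpus=2):
    """Greedy simulation: each task goes to the CPU that is free earliest.

    task_costs is a list of (task_name, cost) pairs in queue order.
    Returns (assignment per CPU, time at which the last CPU finishes).
    """
    free_at = [(0, cpu) for cpu in range(1, num_cpus + 1)]
    heapq.heapify(free_at)
    assignment = {cpu: [] for cpu in range(1, num_cpus + 1)}
    for name, cost in task_costs:
        time_free, cpu = heapq.heappop(free_at)
        assignment[cpu].append(name)
        heapq.heappush(free_at, (time_free + cost, cpu))
    makespan = max(time_free for time_free, _ in free_at)
    return assignment, makespan


# Hypothetical costs: T1 is four times as expensive as the other tasks.
tasks = [("T1", 4), ("T2", 1), ("T3", 1), ("T4", 1), ("T5", 1)]
assignment, makespan = balance(tasks)
```

With these costs CPU1 works on T1 while CPU2 drains the four short tasks, so both finish at time 4. A static split of tasks by count (three and two) would leave one CPU idle for part of the run.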

  9. DBMS Landscape • [Figure: systems placed by chunk size (column-at-a-time, buffer-at-a-time, tuple-at-a-time) and scheduling strategy (Volcano execution model, (run-time) scheduler, dynamic load balancing). Systems shown: MonetDB MIL, MonetDB X100, DB2 BLU, StagedDB, Hyper, DB2, PostgreSQL, SAP HANA, System R, MySQL]

  10. Outline • Parallel Query Execution • QTM: Query Task Model • Evaluation • Outlook

  11. QTM: Query Task Model • Idea: A model that describes parallel query execution with tasks • QEP: Queue of tasks • Task: Encapsulates a piece of work on some data • Goals: • Open a design space for DBMS schedules • Make the main aspects of query scheduling comparable: execution order, degree of parallelism, thread coordination, and partitioning

  12. Query Task Model • [Figure: a task combines work and data; a task queue holds tasks T1, T2, T3; a data queue holds the table's tuples t1, t2, t3; processing strategies]
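The idea of a task as "a piece of work on some data" and a QEP as a queue of tasks can be sketched in a few lines. The `Task` fields, the `run` loop, and the sample selection are assumptions made for illustration; the slides do not specify the actual data structures.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Task:
    """A piece of work (a callable) bound to the data it operates on."""
    name: str
    work: Callable[[List[int]], List[int]]
    data: List[int]


def run(task_queue: Deque[Task]) -> Dict[str, List[int]]:
    """Execute tasks in queue order and collect each task's output."""
    results: Dict[str, List[int]] = {}
    while task_queue:
        task = task_queue.popleft()
        results[task.name] = task.work(task.data)
    return results


def keep_even(rows: List[int]) -> List[int]:
    """Hypothetical selection used as the 'work' of every task."""
    return [r for r in rows if r % 2 == 0]


# Data queue: the table split into three pieces (stand-ins for t1, t2, t3).
data_queue = [[1, 2], [3, 4], [5, 6]]
task_queue = deque(
    Task(f"T{i}", keep_even, piece) for i, piece in enumerate(data_queue, 1)
)
results = run(task_queue)
```

Because work and data are explicit, the same plan can be expressed with different task sizes and orders simply by building a different queue.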

  13. QTM Transformation: Input • QEP • Table Format • Hardware Architecture

  14. QTM Transformation • QEP • Choosing Hash Join • Max. Pipelines + Dependency Graph
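As a rough illustration of a dependency graph between pipelines, the sketch below assumes the hash-join plan from the earlier slides: two build pipelines (on R and S) and one probe pipeline that can only start once both hash tables exist. The pipeline names and the graph are assumptions, since the slide's figure is not part of the transcript. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Maps each pipeline to the set of pipelines it depends on (hypothetical).
dependencies = {
    "build_R": set(),
    "build_S": set(),
    "probe_T": {"build_R", "build_S"},
}

# Any topological order is a valid sequential execution order.
order = list(TopologicalSorter(dependencies).static_order())

# Pipelines with no unfinished dependencies could run in parallel.
sorter = TopologicalSorter(dependencies)
sorter.prepare()
first_wave = set(sorter.get_ready())
```

Here the two build pipelines form the first wave and may run concurrently, while the probe pipeline has to wait for both.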

  15. QTM: Task Configuration • Max. Pipelines + Dependency Graph Task Configurations (Task Blueprints)

  16. QTM: Tasks Instantiation Set of Tasks (TC Instantiation) Task Configuration (Task Blueprints)
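The step from a task configuration (a blueprint) to a set of tasks can be pictured as follows. This is a sketch under assumptions: the fields `pipeline` and `buffer_size`, the row-range representation of a task, and the function `instantiate` are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskConfiguration:
    """Blueprint: which pipeline to run and how many rows per task."""
    pipeline: str
    buffer_size: int


@dataclass(frozen=True)
class Task:
    """One instantiated task covering rows [first_row, last_row)."""
    pipeline: str
    first_row: int
    last_row: int


def instantiate(config: TaskConfiguration, num_rows: int) -> List[Task]:
    """Create one task per buffer-sized slice of the input."""
    return [
        Task(config.pipeline, start, min(start + config.buffer_size, num_rows))
        for start in range(0, num_rows, config.buffer_size)
    ]


tasks = instantiate(TaskConfiguration("probe_T", buffer_size=4), num_rows=10)
```

With a buffer size of 4 and 10 input rows this yields three tasks covering rows 0 to 4, 4 to 8, and 8 to 10; changing only the blueprint's buffer size changes the number and size of tasks.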

  17. QTM: Implementation • Compile-time • Run-time

  18. Outline • Parallel Query Execution • QTM: Query Task Model • Evaluation • Outlook

  19. Evaluation: Scenario

  20. Evaluation: Configuration

  21. Evaluation: Runtimes

  22. Evaluation: Sampling • Data-related Misses • Instruction-related Misses

  23. Evaluation: Miss Distribution

  24. Evaluation: Scalability

  25. Evaluation: Insights • Trade-off between data and instruction cache performance • Sweet spot: Largest private cache size vs. slightly larger buffer • Medium-sized tasks are data-efficient: • Pros: Buffer fits entirely into cache, high data locality • Cons: High number of tasks and instructions • Large tasks are instruction-efficient: • Pros: Fewer instructions and tasks, high instruction locality • Cons: More data cache misses if the cache size is exceeded • QTM: Cache performance can be adjusted via the buffer size
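The trade-off between buffer size and task count can be made concrete with a little arithmetic. The numbers below (a 256 KiB private cache, 8-byte tuples, one million input tuples) are hypothetical and are not the configuration used in the evaluation; the function name `buffer_plan` is also an assumption.

```python
import math


def buffer_plan(buffer_bytes: int, tuple_bytes: int, num_tuples: int):
    """Return (tuples per buffer, number of tasks) for a given buffer size."""
    tuples_per_buffer = buffer_bytes // tuple_bytes
    num_tasks = math.ceil(num_tuples / tuples_per_buffer)
    return tuples_per_buffer, num_tasks


CACHE_BYTES = 256 * 1024  # hypothetical private cache size
TUPLE_BYTES = 8           # hypothetical tuple width
NUM_TUPLES = 1_000_000    # hypothetical input size

# Buffer sized to fit the private cache: more, smaller tasks.
medium = buffer_plan(CACHE_BYTES, TUPLE_BYTES, NUM_TUPLES)

# Buffer four times the cache size: fewer tasks, but the buffer no
# longer fits into the cache.
large = buffer_plan(4 * CACHE_BYTES, TUPLE_BYTES, NUM_TUPLES)
```

Under these assumptions the cache-sized buffer holds 32,768 tuples and needs 31 tasks, while the larger buffer holds 131,072 tuples and needs only 8 tasks. That mirrors the slide's point: fewer, larger tasks reduce per-task overhead at the price of exceeding the cache.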

  26. Outline • Parallel Query Execution • QTM: Query Task Model • Evaluation • Outlook

  27. Outlook • Contributions: • QTM: A model for parallel query execution using tasks • Opens a design space for DBMS schedules • Makes the schedules used by different DBMSs comparable • Future Work: • Cost model • Transformation process for an arbitrary QEP • Thanks!
