Enhancing Query Processing with the Data Cyclotron Architecture

This paper presents the Data Cyclotron, a query processing scheme designed to exploit high-performance local networks for improved database management. By addressing the limitations of traditional load balancing in skewed workloads, this architecture minimizes query response time and maximizes throughput, all while preventing single points of failure. The paper explores motivations for this design, outlines the system overview, and describes the request propagation and chunk handling process. Experimental results validate the effectiveness of the Data Cyclotron in managing diverse workloads.

Presentation Transcript


  1. The Data Cyclotron • The Data Cyclotron Query Processing Scheme • Romulo Goncalves, Martin Kersten • ACM Trans. Database Syst. (TODS) 36(4):27 (2011)

  2. Outline • Introduction • Motivations • Design Goals • Interaction Protocols • Experimental Results • Conclusions

  3. System Overview

  4. Motivations • High-performance local networks can be faster than disks • Load balancing based on data allocation doesn't perform well with skewed workloads

  5. Design Goals • Exploit high-performance local networks • Balance load by moving data (to handle skewed workloads) • Minimize query response time • Maximize query throughput • No single point of failure

  6. System Overview

  7. Motivations: RDMA

  8. Motivations: RDMA

  9. Architecture

  10. The DBMS Layer

  11. Data Cyclotron Layer

  12. The Network Layer

  13. DC Layer: Request Propagation • Request_msg = (SenderID, ChunkID) • If SenderID == me: raise an exception (the data is missing!) • If ChunkID is stored here: • Was it already loaded into the ring? Ignore the request message • Is the ring full? • YES: add the request to the pending queue • NO: load the data into the ring • If I need the very same ChunkID: absorb the request • Else: forward the request to the next node in the ring
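
The decision steps on this slide can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Node` fields, the `Action` names, and `MissingDataError` are hypothetical, and the sketch assumes the "absorb" check sits at the same level as the "stored here" check, which the flattened slide text does not make explicit.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"    # chunk is already circulating in the ring
    QUEUE = "queue"      # ring is full, request parked in the pending queue
    LOAD = "load"        # owner puts the chunk into the ring
    ABSORB = "absorb"    # this node wants the same chunk, request stops here
    FORWARD = "forward"  # pass the request to the next node


class MissingDataError(Exception):
    """The request came back to its sender: no node owns the chunk."""


@dataclass
class Node:  # hypothetical node state, not taken from the slides
    node_id: int
    owned: set = field(default_factory=set)      # chunks stored on this node
    in_ring: set = field(default_factory=set)    # owned chunks already loaded
    wanted: set = field(default_factory=set)     # chunks local queries wait for
    pending: list = field(default_factory=list)  # requests waiting for space
    ring_full: bool = False


def handle_request(node, sender_id, chunk_id):
    """Apply the slide's rules to one Request_msg = (SenderID, ChunkID)."""
    if sender_id == node.node_id:
        raise MissingDataError(f"no owner found for chunk {chunk_id!r}")
    if chunk_id in node.owned:
        if chunk_id in node.in_ring:
            return Action.IGNORE
        if node.ring_full:
            node.pending.append((sender_id, chunk_id))
            return Action.QUEUE
        node.in_ring.add(chunk_id)
        return Action.LOAD
    if chunk_id in node.wanted:
        return Action.ABSORB
    return Action.FORWARD
```

Returning an `Action` instead of sending messages directly keeps the rules separate from the network layer, so the same function can be exercised without a real ring.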

  14. DC Layer: Pending Requests • Check for available space at regular intervals • Handle the oldest request that fits the space currently available in the ring
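
A minimal sketch of the selection rule on this slide, assuming the pending queue is kept in arrival order and each entry carries the chunk size. The function name `pick_pending` and the `(chunk_id, size)` tuple layout are hypothetical; the periodic timer that would call it is left out.

```python
def pick_pending(pending, free_space):
    """Remove and return the oldest pending chunk that fits in free_space.

    pending: list of (chunk_id, size) tuples, oldest first (assumed layout).
    Returns the chunk_id, or None when no pending chunk fits.
    """
    for index, (chunk_id, size) in enumerate(pending):
        if size <= free_space:
            del pending[index]
            return chunk_id
    return None
```

Note that an old but large request can be overtaken by a younger, smaller one under this rule, which is what "the oldest request that fits" implies.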

  15. DC Layer: handling data-chunks • Data-chunk header: • Data-chunk-uid • Owner • LOI (Level Of Interest) • Copies • Hops • Cycles
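
The header fields listed on this slide could be modeled as a small record. The field types, the defaults, and the reading of each counter in the comments are assumptions inferred from the field names and the neighboring slides, not definitions taken from the presentation.

```python
from dataclasses import dataclass


@dataclass
class ChunkHeader:
    uid: str            # Data-chunk-uid
    owner: int          # id of the node that stores the chunk
    loi: float = 0.0    # Level Of Interest
    copies: int = 0     # assumed: nodes that consumed the chunk
    hops: int = 0       # assumed: nodes the chunk has passed through
    cycles: int = 0     # assumed: completed trips around the ring
```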

  16. DC Layer: handling foreign chunks • Hops++ • If I was waiting for this Data-chunk-uid: • Wake up pinned queries • Copies++ • Forward the chunk
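
The steps for a foreign chunk can be sketched as below. The `Chunk` record, the `waiting` dictionary (chunk uid to the list of queries pinned on it), and the function name are hypothetical; forwarding to the next node is left to the caller, since the slide forwards the chunk in every case.

```python
from dataclasses import dataclass


@dataclass
class Chunk:  # only the header fields this step touches
    uid: str
    hops: int = 0
    copies: int = 0


def handle_foreign_chunk(chunk, waiting):
    """Update the header of a passing chunk and return the queries to wake.

    waiting: dict mapping chunk uid -> list of pinned query ids (assumed).
    The caller forwards the chunk to the next node afterwards.
    """
    chunk.hops += 1
    woken = waiting.pop(chunk.uid, [])
    if woken:
        # This node was waiting for the chunk, so it counts as one more copy.
        chunk.copies += 1
    return woken
```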

  17. DC Layer: handling local chunks • How to set the LOI threshold? • Fragments are loaded into or unloaded from the ring depending on their LOI

  18. Some questions • How does it scale? • Is there really no single point of failure? • How does it manage local bandwidth? • Is resource allocation fair?

  19. Experiments: limited ring capacity

  20. Experiments: limited ring capacity

  21. Experiments: limited ring capacity

  22. Experiments: skewed workloads

  23. Experiments: non-uniform workloads

  24. Possible extensions • UPDATE queries • Pulsating rings • Nomadic queries

  25. Thank you! • Q&A time!
