
Scalable Join Processing in Global-Scale Scientific Federations

This paper discusses the challenges and techniques for join processing in large-scale scientific federations, focusing on throughput optimization and global-scale join processing in heterogeneous networks. The authors propose a new metric and optimization goal to balance network usage and computation, and introduce algorithms for identifying network structure and optimizing join processing in non-uniform and non-metric networks.


Presentation Transcript


  1. Throughput-Optimized, Global-Scale Join Processing in Scientific Federations Xiaodan Wang, Randal Burns, and Andreas Terzis The Johns Hopkins University

  2. Data volume and geography deter scalability • Performance is network bound • Intermediate results are often hundreds of megabytes • 30 sites across North America, Europe, Asia • Community has identified 100 sites to be included

  3. Join Processing in Heterogeneous Networks • Query plans optimized for scalability • Without latency/response time constraints • On global-scale, heterogeneous networks • For applications that transfer hundreds of MBs among continents • Balanced utilization of all network paths • A new query optimization goal (metric) • Exploit excess capacity where available • Avoid narrow, long-haul paths when possible • Join processing techniques and algorithms • Identifying network structure: clusters of sites and path throughput • Optimize for non-uniform and non-metric networks • Balance network usage and computation

  4. Why do we need a new metric? • Minimizing response time • Consumes all available resources to achieve the goal • Minimizing computation costs • Does not address network bound applications • Minimizing the volume of network traffic • Insensitive to network heterogeneity • And we are concerned with • Polynomial-time algorithms for large-scale federations • Avoiding multi-objective optimization

  5. count * • SkyQuery’s computation oriented optimization • Schedule sites in order of increasing cardinality • Minimizes computation costs under several assumptions • Perfect join selectivity (holds in practice) • Computation costs linear in the size of intermediate results (because it’s an index join) • Occasionally transfers data across the Atlantic multiple times
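
The count * ordering on this slide can be sketched as follows. This is a minimal illustration, not SkyQuery's implementation: the site names, cardinalities, and continent labels are hypothetical, and `ocean_crossings` is a helper introduced here only to show how a cardinality-only ordering can cross the Atlantic repeatedly.

```python
def count_star_schedule(cardinality):
    """Order sites by increasing cardinality (the count * heuristic)."""
    return sorted(cardinality, key=cardinality.get)


def ocean_crossings(schedule, continent):
    """Count consecutive hops whose endpoints lie on different continents."""
    return sum(
        1 for a, b in zip(schedule, schedule[1:]) if continent[a] != continent[b]
    )


# Hypothetical federation: cardinalities alternate between continents.
cardinality = {"site_a": 1_000, "site_b": 5_000, "site_c": 20_000, "site_d": 80_000}
continent = {"site_a": "NA", "site_b": "EU", "site_c": "NA", "site_d": "EU"}

schedule = count_star_schedule(cardinality)
crossings = ocean_crossings(schedule, continent)
print(schedule, crossings)
```

Because the ordering ignores the network entirely, every hop in this made-up example crosses the ocean.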

  6. Balanced Network Utilization • Cost of using a path is the product of the volume of data transmitted and the inverse TCP throughput • Cost of a schedule is the sum over all paths • Takes advantage of path heterogeneity • By using higher-throughput paths proportionally more • Reduces contention on narrow, long-haul paths • By making them costly • But it is not a direct measure of scalability • Does not load balance paths over multiple queries
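
A small sketch of the metric as stated on this slide: the cost of a path is data volume times inverse throughput, and a schedule's cost is the sum over the paths it uses. The units (megabits and Mbps, so that cost reads as seconds) and all numbers are assumptions made for illustration.

```python
def path_cost(volume_megabits, throughput_mbps):
    """Volume of data transmitted times the inverse of the path's TCP throughput."""
    return volume_megabits / throughput_mbps


def schedule_cost(hops):
    """Sum of path costs over every (volume, throughput) hop in a schedule."""
    return sum(path_cost(volume, throughput) for volume, throughput in hops)


# Hypothetical comparison: ship 1600 megabits over one narrow long-haul path,
# or relay it over two higher-throughput paths.
direct = schedule_cost([(1600, 2)])
relayed = schedule_cost([(1600, 40), (1600, 10)])
print(direct, relayed)
```

In this example the two-hop route costs less than the single narrow hop, which is how the metric steers plans away from narrow, long-haul paths.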

  7. Path Throughput • Measure throughput among all federation sites pairwise • Using a nearby PlanetLab proxy site for each SkyQuery site • 3 times a day, bulk TCP transfer • TCP throughput reflects geography • Dominant 1/distance trend correlates well with 1/RTT • But, highly non-metric • Input to scheduling
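
One way to see what "non-metric" means for the measured paths is to look for triangle-inequality violations in the inverse-throughput costs. The sketch below assumes symmetric costs and uses hypothetical integer cost units; it is not taken from the presentation.

```python
from itertools import permutations


def triangle_violations(cost):
    """Return triples (a, b, c) where relaying a -> b -> c beats going a -> c directly."""
    sites = sorted(cost)
    return [
        (a, b, c)
        for a, b, c in permutations(sites, 3)
        if a < c and cost[a][b] + cost[b][c] < cost[a][c]  # a < c skips mirrored triples
    ]


# Hypothetical symmetric costs (inverse throughput, arbitrary units).
cost = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 2},
    "C": {"A": 5, "B": 2},
}
violations = triangle_violations(cost)
print(violations)
```

Any non-empty result means a relayed route is cheaper than the direct path between some pair of sites, so the network is not a metric space.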

  8. Throughput Stability • Should we measure throughput more often? • Accurate measurements are intrusive (bulk transfer) • Short-duration measurements are error-prone (cross-traffic) • Even the most volatile paths are stable • <30% throughput variation

  9. Join Scheduling • Assumptions • Accurate cardinality estimates • Perfect join selectivity • Ignore the effect of attribute aggregation • Simplify one aspect of optimization (selectivity) in order to consider non-uniform, non-metric networks • Cannot use dynamic programming in this environment because the problem lacks sub-problem optimality • Two algorithms based on Minimum Spanning Trees • Two-approximate balanced network utilization • Clustering variant defines computation and utilization trade-offs

  10. Spanning Tree Approximation (STA) • Inputs: pairwise throughputs, site cardinalities, and a node to which we deliver results • Min: node with lowest cardinality

  11. Spanning Tree Approximation (STA) • Construct a minimum spanning tree

  12. Achieving the Bound • From min to sink visiting all sites • Cost(STA) ≤ 2*Cost(MST) ≤ 2*OPT • Same intuition as 2-approximate Euclidean TSP • STA can visit each site more than once • Applies to non-metric networks
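
The spanning-tree walk described on slides 10 to 12 can be sketched as follows. This is one possible reading, not the authors' code: edge weights are assumed to be symmetric inverse throughputs, the data volume is treated as constant along the walk, and the site names and throughputs are hypothetical. The walk follows the tree path from the start to the sink and makes a round trip into every side branch, so each tree edge is used at most twice.

```python
import heapq


def minimum_spanning_tree(weights, root):
    """Prim's algorithm; returns the tree as an adjacency dict."""
    tree = {u: [] for u in weights}
    visited = {root}
    heap = [(w, root, v) for v, w in weights[root].items()]
    heapq.heapify(heap)
    while heap and len(visited) < len(weights):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue
        visited.add(v)
        tree[u].append(v)
        tree[v].append(u)
        for x, wx in weights[v].items():
            if x not in visited:
                heapq.heappush(heap, (wx, v, x))
    return tree


def tree_path(tree, start, goal):
    """Unique path between two nodes of a tree."""
    stack, seen = [(start, [start])], {start}
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        for nxt in tree[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    raise ValueError("goal not reachable")


def sta_walk(tree, start, sink):
    """Walk from start to sink visiting all sites along tree edges only."""
    spine = tree_path(tree, start, sink)
    on_spine, walk = set(spine), []

    def tour(node, parent):
        walk.append(node)
        for child in tree[node]:
            if child != parent and child not in on_spine:
                tour(child, node)
                walk.append(node)  # come back after the side branch

    for node in spine:
        tour(node, None)
    return walk


# Hypothetical pairwise throughputs in Mbps; weight = inverse throughput.
mbps = {("A", "B"): 10, ("A", "C"): 2, ("A", "D"): 1,
        ("B", "C"): 5, ("B", "D"): 1, ("C", "D"): 4}
weights = {s: {} for s in "ABCD"}
for (u, v), t in mbps.items():
    weights[u][v] = weights[v][u] = 1 / t

tree = minimum_spanning_tree(weights, "B")  # "B": assumed lowest-cardinality site
walk = sta_walk(tree, "B", "D")             # "D": assumed result sink
walk_cost = sum(weights[u][v] for u, v in zip(walk, walk[1:]))
mst_cost = sum(weights[u][v] for u in tree for v in tree[u]) / 2
print(walk, walk_cost, mst_cost)
```

In this example the walk revisits site B once and its cost stays below twice the spanning-tree cost, matching the bound on the slide.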

  13. Heuristic Improvement • For paths on which the triangle inequality holds • Route directly to next unvisited node • 30% improvement in practice • Identify and use metric regions in the network
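
A sketch of the shortcut idea on this slide: when a walk is about to backtrack through already-visited sites, go straight to the next unvisited site if the direct path is no more costly than the backtracking route (that is, the triangle inequality holds there). The walk, site names, and integer costs are hypothetical, and the function is an illustration rather than the authors' procedure.

```python
def shortcut(walk, weights):
    """Replace backtracking segments with a direct hop where that is no costlier."""
    result, visited = [walk[0]], {walk[0]}
    i, last = 0, len(walk) - 1
    while i < last:
        # Find the next unvisited site (or the final site of the walk).
        j = i + 1
        while j < last and walk[j] in visited:
            j += 1
        detour = sum(weights[walk[k]][walk[k + 1]] for k in range(i, j))
        if j > i + 1 and weights[walk[i]][walk[j]] <= detour:
            result.append(walk[j])            # direct hop is at least as cheap
        else:
            result.extend(walk[i + 1:j + 1])  # keep the original route
        visited.add(walk[j])
        i = j
    return result


def walk_cost(walk, weights):
    """Sum of per-hop costs along a walk."""
    return sum(weights[u][v] for u, v in zip(walk, walk[1:]))


# Hypothetical symmetric costs (inverse throughput, arbitrary integer units).
pairs = {("A", "B"): 1, ("B", "C"): 2, ("A", "C"): 2,
         ("C", "E"): 1, ("C", "D"): 2, ("E", "D"): 5}
weights = {s: {} for s in "ABCDE"}
for (u, v), w in pairs.items():
    weights[u][v] = weights[v][u] = w

original = ["B", "A", "B", "C", "E", "C", "D"]
improved = shortcut(original, weights)
print(improved, walk_cost(original, weights), walk_cost(improved, weights))
```

Here the hop from A directly to C replaces the backtrack through B, while the backtrack from E through C to D is kept because the direct E to D path is more expensive.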

  14. Clustered-STA • Well-connected clusters separated by narrow, long-haul paths • Optimize for computation inside clusters (count *) • Optimize balanced network utilization among clusters (STA)

  15. Clustering Sites • Organize sites using Bond-Energy Algorithm • Minimize difference between adjacent elements • Extract clusters with a threshold • 3 Mbps produces 6 clusters for 30 SkyQuery sites • Define computation versus utilization tradeoff • By tuning the extraction threshold
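
The threshold-extraction step can be sketched as below. The sketch assumes the Bond-Energy Algorithm has already produced a linear ordering of sites (that step is not shown), and it assumes clusters are cut wherever the throughput between adjacent sites in the ordering falls below the threshold; the slide does not spell out the exact rule. Site names and throughputs are hypothetical.

```python
def extract_clusters(ordering, throughput_mbps, threshold_mbps):
    """Split an ordering into clusters at adjacent pairs below the threshold."""
    clusters = [[ordering[0]]]
    for prev, cur in zip(ordering, ordering[1:]):
        if throughput_mbps[frozenset((prev, cur))] >= threshold_mbps:
            clusters[-1].append(cur)  # well connected: same cluster
        else:
            clusters.append([cur])    # narrow path: start a new cluster
    return clusters


# Hypothetical ordering and throughputs between adjacent sites.
ordering = ["us1", "us2", "eu1", "eu2", "asia1"]
throughput_mbps = {
    frozenset(("us1", "us2")): 40,
    frozenset(("us2", "eu1")): 2,
    frozenset(("eu1", "eu2")): 25,
    frozenset(("eu2", "asia1")): 1,
}

coarse = extract_clusters(ordering, throughput_mbps, threshold_mbps=3)
fine = extract_clusters(ordering, throughput_mbps, threshold_mbps=30)
print(coarse)
print(fine)
```

Raising the threshold yields more, smaller clusters, which is the tuning knob the slide refers to.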

  16. Network Utilization • Results are independent of assumptions • OPT is best serial plan • STA often finds OPT plan • C-STA performs poorly within clusters • Also poor on narrow paths due to attribute aggregation

  17. Computation Time • count * represents a “soft” lower bound • C-STA reduces computation costs

  18. Discussion • Balanced network utilization metric captures path heterogeneity • Avoids narrow, long-haul paths • Scheduling algorithms of low complexity • OPT is a viable alternative for serial plans • Limitations of C-STA • Does not really create meaningful utilization/computation tradeoffs • Threshold can only find natural clusters • Systematically aggregate attributes in each cluster • Semi-joins address these limitations • Extending this work to parallel schedules • Applicability to other workloads? OLAP?

  19. A World-Wide Telescope • Federations of sky surveys make the world’s best telescope • whole sky coverage • multi-spectral (optical, radio, infrared, x-ray) • data are always available (no clouds, no moon, day or night) • Multi-spectral and temporal experiments have already led to many new discoveries

  20. The Crossmatch Query SELECT O.object_id, O.right_ascension, T.object_id FROM SDSS:Photo_Object O, TWOMASS:Photo_Primary T, FIRST:Primary_Object P WHERE AREA(185.0, -0.5, 4.5) AND XMATCH(O, T, P) < 3.5 AND O.type = GALAXY AND (O.i_flux - T.i_flux) > 2
