This study presents Flux, a fault-tolerant, load-balancing exchange operator for continuous query (CQ) systems that adapts to changing conditions and addresses scalability issues. Flux handles both short-term and long-term load imbalances and adapts to memory constraints by repartitioning operator state across machines and disk. Experimental results demonstrate its effectiveness in balancing processing load and in degrading gracefully under adverse conditions. The study concludes with insights on Flux's adaptability and its significance for CQ system performance.
Flux: An Adaptive Partitioning Operator for Continuous Query Systems M.A. Shah, J.M. Hellerstein, S. Chandrasekaran, M.J. Franklin UC Berkeley Presenter: Bradley Momberger
Overview • Introduction • Background • Experiments and Considerations • Conclusion
Introduction • Continuous query (CQ) systems • Produce unbounded, streaming results from unbounded, streaming data sources. • Face long-run scalability challenges: fast response-time requirements, potentially large numbers of users, and potentially large histories to manage. • Run only as fast as their constituent operators allow.
Parallelism • Traditional parallelism techniques • Poor fit for CQ systems • Not adaptive • CQ requires adaptability to changing conditions
Overview • Introduction • Background • Experiments and Considerations • Conclusion
Background • Exchange • Producer-consumer pair • Ex-Prod: Intermediate producer instance connected to consumers • Ex-Cons: Intermediate consumer instance which polls inputs from all producers. • “Content sensitive” routing • RiverDQ • “Content insensitive” routing • Random choice of Ex-Cons target
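The contrast between the two routing styles can be made concrete with a short sketch. This is a minimal illustration, not code from the Flux paper or TelegraphCQ; the function names and the CRC-based hash are assumptions.

```python
import random
import zlib

# Illustrative sketch of the two routing styles; names are hypothetical,
# not from the Flux paper or TelegraphCQ.

def route_content_sensitive(key: str, num_consumers: int) -> int:
    """Exchange-style routing: the tuple's key determines its Ex-Cons,
    so every tuple of a given group lands on the same consumer."""
    return zlib.crc32(key.encode()) % num_consumers

def route_content_insensitive(num_consumers: int) -> int:
    """River DQ-style routing: any consumer will do; here we pick one at
    random (a least-loaded choice would also be content insensitive)."""
    return random.randrange(num_consumers)

if __name__ == "__main__":
    n = 4
    for key in ["alice", "bob", "alice"]:
        print(key, "->", route_content_sensitive(key, n))  # "alice" repeats
    print("random ->", route_content_insensitive(n))
```

Content-sensitive routing is what makes stateful operators like group-by possible downstream, but it is also what lets a skewed key distribution overload a single consumer.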
Flux • Flux: Fault-tolerant, Load-balancing eXchange • Load balancing through active repartitioning • Producer-consumer pair • Buffering and reordering • Detection of imbalances
Short Term Imbalances • A stage runs only as fast as its slowest Ex-Cons • Head-of-line blocking • Uneven distribution over time • The Flux-Prod solution • Transient Skew buffer • Hashtable buffer between producer and Flux-Prod • Get new tuples for each Flux-Cons as buffer space becomes available. • On-demand input reordering
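The producer-side buffering idea can be sketched as a small data structure, assuming a fixed slot count per consumer. The class and method names are hypothetical; the real Flux-Prod logic in the paper is more elaborate.

```python
from collections import deque

# Hypothetical sketch of a transient skew buffer: one bounded queue per
# Flux-Cons, so one slow consumer cannot block the others.

class TransientSkewBuffer:
    def __init__(self, num_consumers: int, capacity_per_consumer: int):
        self.queues = [deque() for _ in range(num_consumers)]
        self.capacity = capacity_per_consumer

    def offer(self, consumer_id: int, tup) -> bool:
        """Producer enqueues a tuple for its target consumer; False means
        that consumer is backed up, and the producer moves on to tuples
        bound for other consumers (on-demand reordering)."""
        q = self.queues[consumer_id]
        if len(q) >= self.capacity:
            return False
        q.append(tup)
        return True

    def poll(self, consumer_id: int):
        """A Flux-Cons drains its own queue as it becomes ready, freeing
        space for new tuples."""
        q = self.queues[consumer_id]
        return q.popleft() if q else None
```

The key point is that the buffer is partitioned per consumer: a full queue stalls only the tuples bound for that consumer, eliminating head-of-line blocking at the stage.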
Long Term Imbalances • Eventually overload fixed size buffers • Cannot use same strategy as short term • The Flux-Cons solution • Repartition at consumer level • Move states • Aim for maximal benefit per state moved • Avoid “thrashing”
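The repartitioning decision can be sketched as a greedy loop, assuming per-partition load measurements are available. The fixed threshold and the "largest partition that still fits" heuristic below are simplified stand-ins for the paper's benefit-per-state-moved policy.

```python
# Hypothetical greedy rebalancer: repeatedly move a partition from the most
# to the least loaded consumer, stopping when the gap is small (to avoid
# thrashing) or no move helps.

def plan_moves(consumer_loads, partition_loads, threshold=0.1, max_moves=100):
    """consumer_loads:  {consumer_id: total_load}
    partition_loads: {consumer_id: {partition_id: load}}
    Returns a list of (partition_id, src_consumer, dst_consumer) moves."""
    moves = []
    loads = dict(consumer_loads)
    parts = {c: dict(p) for c, p in partition_loads.items()}
    for _ in range(max_moves):
        src = max(loads, key=loads.get)            # most loaded consumer
        dst = min(loads, key=loads.get)            # least loaded consumer
        gap = loads[src] - loads[dst]
        if loads[src] == 0 or gap / loads[src] <= threshold:
            break                                  # near balance: stop, don't thrash
        # candidate partitions small enough not to flip the skew the other way
        fits = {p: l for p, l in parts[src].items() if 0 < l <= gap / 2}
        if not fits:
            break
        pid = max(fits, key=fits.get)              # biggest benefit per move
        load = parts[src].pop(pid)
        parts[dst][pid] = load
        loads[src] -= load
        loads[dst] += load
        moves.append((pid, src, dst))
    return moves

if __name__ == "__main__":
    loads = {0: 100.0, 1: 20.0}
    parts = {0: {"a": 30.0, "b": 40.0, "c": 30.0}, 1: {"d": 20.0}}
    print(plan_moves(loads, parts))  # [('b', 0, 1)] -> both consumers at 60.0
```

Moving a partition means moving operator state plus history, which is expensive; that is why the policy prizes the largest load reduction per state moved and refuses marginal moves.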
Memory Constrained Environment • First tests were done with adequate memory • Does not necessarily reflect reality • Memory shortages • Large histories • Extra operators • Options for shedding load when memory is scarce • Push state to disk • Move state to another site • Decrease history size • May not be acceptable in some applications
Flux and Constrained Memory • Dual-destination repartitioning • Other machines • Disk storage • Local mechanism • Flux-Cons spills to disk when memory is low • Retrieves from disk when memory becomes available • Global Memory Constrained Repartitioning • Poll Flux-Cons operators for memory usage • Repartition based on results
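A local spill mechanism might look like the following sketch, assuming partition state is serializable. The file layout, victim choice, and size accounting are all illustrative assumptions, not the paper's implementation.

```python
import os
import pickle
import tempfile

# Hypothetical Flux-Cons-side spill manager: push partition state to disk
# under memory pressure, pull it back on access when memory allows.

class SpillingConsumer:
    def __init__(self, memory_limit_bytes: int):
        self.in_memory = {}                   # partition_id -> state
        self.on_disk = {}                     # partition_id -> file path
        self.limit = memory_limit_bytes
        self.used = 0                         # simplified accounting (pickled sizes)
        self.dir = tempfile.mkdtemp(prefix="flux_spill_")

    def add(self, pid, state):
        blob = pickle.dumps(state)
        while self.used + len(blob) > self.limit and self.in_memory:
            self._spill_one()                 # shed memory before admitting new state
        self.in_memory[pid] = state
        self.used += len(blob)

    def _spill_one(self):
        pid, state = self.in_memory.popitem()  # arbitrary victim; a real system might use LRU
        blob = pickle.dumps(state)
        path = os.path.join(self.dir, f"part_{pid}.bin")
        with open(path, "wb") as f:
            f.write(blob)
        self.used -= len(blob)
        self.on_disk[pid] = path

    def get(self, pid):
        if pid in self.on_disk:               # retrieve from disk when touched again
            path = self.on_disk.pop(pid)
            with open(path, "rb") as f:
                state = pickle.load(f)
            os.remove(path)
            self.add(pid, state)
        return self.in_memory.get(pid)
```

Dual-destination repartitioning generalizes this: under global memory pressure, the controller can also pick another machine, not just local disk, as the target for displaced partitions.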
Overview • Introduction • Background • Experiments and Considerations • Conclusion
Experimental Methodology • Example operator • Hash-based, windowed group-by-aggregate • Statistic over a fixed-size history • Cluster hardware • CPU: 1000 MIPS • 1GB main memory • Network simulation • 1KB packet size, infinite bandwidth, 0.07ms latency • Virtual machines, simulated disk.
Experimental Methodology • Simulator • TelegraphCQ base system • Operators share physical CPU with event simulator • Aggregate evaluation and scheduler simulated • Testbed • Single producer-consumer stage • 32 nodes in simulated cluster • Ex-Cons operator dictates performance
Short Term Imbalance Experiment • Give the Flux stage a transient skew buffer • Compare to a base Exchange stage with equivalent buffer space • Comparison setup • 500ms load per virtual machine, round robin • Simulated process: 0.1ms processing, 0.05ms sleep • 16s runtime (32 machines × 0.5s/machine)
Long Term Imbalance Experiment • Operator stage • 64 partitions per virtual machine • 10,000-tuple (800KB) history per partition • 160KB skew buffer • 0.2μs per tuple for partition processing • Network • 500Mbps throughput for partitions • 250Mbps point-to-point
Memory Constrained Experiments • Memory “pressure” • 768MB initial memory load (6MB/partition × 128 partitions/machine) • Available memory reduced to 512MB (down from 1GB) • Change made after 1s of simulation • 14s required to push out the remaining 256MB • May go to disk or to other machines
Hybrid Policy • Combines previous policies • Memory-based policy when partitions are on disk • Minimize latency • Load-balancing policy when all partitions are in memory • Maximize throughput
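The switch between the two policies can be sketched in a few lines, assuming a hypothetical has_spilled_partitions() helper on each Flux-Cons.

```python
# Hypothetical controller-side policy selection for the hybrid scheme.

def choose_policy(consumers) -> str:
    """Prefer the memory-based policy while any partition sits on disk,
    since disk reads dominate latency; once everything fits in memory,
    switch to load balancing to maximize throughput."""
    if any(c.has_spilled_partitions() for c in consumers):
        return "memory_based"    # repartition to pull state off disk first
    return "load_balancing"      # repartition to even out processing load
```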
Comparative Review • [Chart: steady state, last 20 seconds of simulation]
Overview • Introduction • Background • Experiments and Considerations • Conclusion
Conclusions • Flux • Is a reusable mechanism • Encapsulates adaptive repartitioning • Extends the Exchange operator • Alleviates short- and long-term imbalances • Outperforms static partitioning when correcting imbalances • Can use hybrid policies to adapt to changing processing and memory requirements.