
Applying Control Theory to Stream Processing Systems

Wei Xu (xuw@cs.berkeley.edu), Bill Kramer (kramer@lbl.gov), Joe Hellerstein (hellers@us.ibm.com)


Presentation Transcript


  1. Applying Control Theory to Stream Processing Systems. Wei Xu (xuw@cs.berkeley.edu), Bill Kramer (kramer@lbl.gov), Joe Hellerstein (hellers@us.ibm.com)

  2. Description of the system • TCQ has a complex internal structure • TCQ drops tuples silently if the result queue is full • Diagram labels: Data Source, Input Buffer, TCQ

  3. Why do we need control? • The data source does not provide an accurate data rate

  4. Why do we need control? • The TCQ node drops tuples when the result queue fills up • Diagram labels: Source, Buffer, TCQ, Result Q

  5. Control Problems • Provide an accurate data source • Get the actual data rate • Regulate queue length on the TCQ node • Prevent dropped tuples • Maximize throughput (and adapt when disturbances occur)

  6. System with Control • Diagram labels: Queue Length Monitor, Controlled Data Source, Output Rate Controller

  7. The Control Architecture • Diagram labels: PI Controller, P Controller

  8. Result: an accurate data source • Plot panels: P Controller with Pre-compensation; PI Controller

  9. Result: regulating queue length • Diagram labels: Source, Buffer, TCQ, Result Q

  10. Result: under CPU contention • Diagram labels: Source, Buffer, TCQ, Result Q

  11. Why is theory useful? • One of my implementations ... what happened? • Diagram labels: Source, Buffer, TCQ, Result Q

  12. What is going on? • Diagram labels: Desired Queue Length, Queue Length Controller, Data Rate to TCQ, Controlled Output Thread (code reuse), Actual Queue Length

  13. Theory meets reality • Plot: output y from simulation; axes: queue length vs. time

  14. Tricky part of parameter estimation • Model evaluation: making the system operate in the desired range • Plot: data rate vs. free space, with the non-linear range marked • Easy for the data source, but queue length ..

  15. Settling time and overshoot matter • Many small disturbances in a Java program • Incremental garbage collection • Plot panels: P Controller; PI Controller

  16. Conclusion • Advantages of feedback control • Makes the system more robust under disturbance • Treats complex systems as black boxes • Copes with the system's characteristics instead of having to change them • Encourages reporting system statistics • Implementation is easy and has theoretical guarantees

  17. Future Work • Load balancer • Smaller sample time to reduce disturbance caused by Java GC? • Controller on scheduling of system shared by multiple streams

  18. Backup Slides

  19. Outline • Problems and Motivation • Controller design • Result • Discussion

  20. Description of the System • Diagram labels: Data Source, Tuple Blocks, Input Buffer, Load Splitter, Routing Logic, Tuples, TCQ Node (two), Queue length • Operation of the Load Splitter • Arriving blocks wait in the Input Buffer • Tuples are routed to balance TCQ queue lengths • Routing stops if a queue length is too large, to avoid tuple discards

  21. Compare to Open Loop Control • We know y(k), and we know what we want y(k+1) to be • Use the transfer function to solve for u(k) • (Expected result: accuracy and disturbance) [to be done]

  22. Estimation of the transfer function • Model: y(k+1) = a y(k) + b u(k) • Parameters a and b estimated by regression

  23. Tricky part of parameter estimation • Model evaluation: a data rate that makes the system operate in the linear range
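
The first-order model y(k+1) = a y(k) + b u(k), its estimation by regression, the one-step model inversion from the open-loop slide, and the P and PI controllers can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the gains, and the plant values used in the usage note (a = 0.8, b = 0.5) are all hypothetical.

```python
def simulate_plant(a, b, u, y0=0.0):
    """Return y(0..n) for y(k+1) = a*y(k) + b*u(k), given inputs u(0..n-1)."""
    y = [y0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y


def fit_first_order(y, u):
    """Least-squares estimate of (a, b) from samples, with len(y) == len(u) + 1.

    Solves the 2x2 normal equations of the regression of y(k+1)
    on the regressors y(k) and u(k).
    """
    n = len(u)
    s_yy = sum(y[k] * y[k] for k in range(n))
    s_yu = sum(y[k] * u[k] for k in range(n))
    s_uu = sum(u[k] * u[k] for k in range(n))
    s_ny = sum(y[k + 1] * y[k] for k in range(n))
    s_nu = sum(y[k + 1] * u[k] for k in range(n))
    det = s_yy * s_uu - s_yu * s_yu
    a = (s_ny * s_uu - s_nu * s_yu) / det
    b = (s_nu * s_yy - s_ny * s_yu) / det
    return a, b


def invert_model(a, b, y_k, y_next):
    """Solve the model for the input u(k) that moves y(k) to a desired y(k+1)."""
    return (y_next - a * y_k) / b


def run_controller(a, b, kp, ki, target, steps, disturbance=0.0):
    """Closed loop with u = kp*e + ki*sum(e); ki = 0 gives a pure P controller.

    `disturbance` is a constant offset added to the plant output each step
    (an assumed stand-in for effects such as CPU contention).
    """
    y, integral, history = 0.0, 0.0, []
    for _ in range(steps):
        error = target - y
        integral += error
        u = kp * error + ki * integral
        y = a * y + b * u + disturbance
        history.append(y)
    return history
```

With the assumed plant a = 0.8, b = 0.5, calling `run_controller(0.8, 0.5, kp=0.4, ki=0.0, target=100.0, steps=200)` settles at half the target, which is the steady-state error that a P controller needs pre-compensation to remove. Setting `ki=0.2` drives the output to the target even with a constant disturbance, because the integral term keeps accumulating until the error is zero.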
