
Applying Control Theory to Stream Processing Systems


Wei Xu (xuw@cs.berkeley.edu)

Bill Kramer (kramer@lbl.gov)

Joe Hellerstein (hellers@us.ibm.com)

TCQ drops tuples silently if result queue is full

[Diagram: Data Source, Input Buffer, TCQ (complex internal structure)]

- Data source does not provide an accurate data rate

- TCQ node drops tuples when the result queue fills up

[Diagram: Source → Buffer → TCQ → Result Q]

- Providing an accurate data source
- Get the actual data rate

- Regulate queue length on TCQ node
- Prevent dropping tuples
- Maximize throughput (and adapt when disturbances occur)


[Diagram labels: Queue Length Monitor, Controlled Data Source, Output Rate, Controller]

[Panel titles: P Controller, P Controller with Pre-compensation, PI Controller]
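The three controller types named on this slide (P, P with pre-compensation, PI) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class names, the `simulate` helper, the gains, and the first-order plant parameters `a` and `b` are all assumptions chosen for the example. Pre-compensation is modelled here as scaling the reference before it reaches a P controller.

```python
from dataclasses import dataclass


@dataclass
class PController:
    """Proportional control: the input is proportional to the current error."""
    kp: float

    def update(self, reference: float, measured: float) -> float:
        return self.kp * (reference - measured)


@dataclass
class PIController:
    """Proportional-integral control: also accumulates past errors."""
    kp: float
    ki: float
    integral: float = 0.0

    def update(self, reference: float, measured: float) -> float:
        error = reference - measured
        self.integral += error
        return self.kp * error + self.ki * self.integral


def simulate(controller, a, b, reference, steps, precomp=1.0):
    """Run the loop against a hypothetical plant y(k+1) = a*y(k) + b*u(k).

    `precomp` scales the reference before the controller sees it
    (1.0 means no pre-compensation).
    """
    y = 0.0
    trace = []
    for _ in range(steps):
        u = controller.update(precomp * reference, y)
        y = a * y + b * u
        trace.append(y)
    return trace


if __name__ == "__main__":
    a, b, kp, target = 0.8, 0.5, 0.4, 100.0  # illustrative values only
    # Pre-compensation gain that cancels the P controller's steady-state error.
    precomp = (1 - a + b * kp) / (b * kp)
    print("P       :", simulate(PController(kp), a, b, target, 200)[-1])
    print("P + pre :", simulate(PController(kp), a, b, target, 200, precomp)[-1])
    print("PI      :", simulate(PIController(kp, 0.2), a, b, target, 200)[-1])
```

With these assumed values the plain P controller settles below the target, while the pre-compensated P controller and the PI controller both settle at it. The pre-compensation gain depends on knowing `a` and `b` exactly, whereas the PI controller's integral term removes the offset without that knowledge.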


- One of my implementations .. What happened?

[Diagram: Source → Buffer → TCQ → Result Q]

[Diagram labels: Controlled Output Thread (code reuse), Queue Length Controller, Desired Queue Length, Data Rate to TCQ, Actual Queue Length]

[Plot: Output Y from simulation; axes: Queue length vs. Time]

Model evaluation – Making the system operate in the desired range

[Plot: Data rate vs. free space; axis: Free Space; annotation: Non-linear range]

Easy for data source, but queue length ..

Many small disturbances in a Java program

Incremental garbage collection

[Plot panels: P Controller, PI Controller]

- Advantages of feedback control
- Make system more robust under disturbance
- Treat complex systems as black boxes
- Cope with the system's characteristics instead of having to change them

- Encourage reporting system statistics
- Implementation is easy and has theoretical guarantees

- Load balancer
- Smaller sample time to reduce disturbance caused by Java GC?
- Controller on scheduling of system shared by multiple streams

- Problems and Motivation
- Controller design
- Result
- Discussion

[Diagram labels: Data Source, Tuple Blocks, Load Splitter (Input Buffer, Routing Logic), Tuples, TCQ Node (two instances), Queue length]

- Operation of Load Splitter
- Arriving blocks wait in Input Buffer
- Tuples are routed to balance TCQ queue lengths
- Stop routing if queue length is too large to avoid tuple discards
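The three steps above can be sketched as a small routing loop. This is only an illustration under stated assumptions: `route_blocks`, `max_queue_length`, and the shortest-queue-first policy are hypothetical choices, not the load splitter's actual code, and queue lengths are tracked locally rather than read from real TCQ nodes.

```python
from collections import deque


def route_blocks(input_buffer, queue_lengths, max_queue_length):
    """Route waiting blocks to TCQ nodes, keeping queue lengths balanced.

    input_buffer     -- deque of blocks (each block is a list of tuples)
    queue_lengths    -- list with the current queue length of each node
                        (must contain at least one node; updated in place)
    max_queue_length -- hypothetical limit above which routing stops

    Returns a list of (node_index, block) assignments. Blocks that cannot be
    routed without exceeding the limit stay in the input buffer.
    """
    assignments = []
    while input_buffer:
        # Pick the node with the shortest queue to balance the load.
        node = min(range(len(queue_lengths)), key=queue_lengths.__getitem__)
        block = input_buffer[0]
        if queue_lengths[node] + len(block) > max_queue_length:
            # Even the least-loaded node is too full: stop routing so that
            # no node has to discard tuples. The block keeps waiting.
            break
        input_buffer.popleft()
        queue_lengths[node] += len(block)
        assignments.append((node, block))
    return assignments


if __name__ == "__main__":
    waiting = deque([[1, 2], [3, 4], [5, 6]])
    lengths = [0, 3]
    print(route_blocks(waiting, lengths, max_queue_length=4))
    print("still waiting:", list(waiting), "queue lengths:", lengths)
```

Checking only the shortest queue is enough to decide when to stop: if the least-loaded node cannot accept the block, no other node can either.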

Revised

We know y(k), and we know what we want y(k+1) to be. Use the transfer function to solve for u(k)…

(Expected result – accuracy and disturbance) -- to be done

y(k+1)=ay(k)+bu(k)
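Solving this first-order model for the input gives a one-step control law. As a worked version, let r denote the value we want y(k+1) to take (r is a symbol introduced here for illustration, not one used in the slides), and assume b is nonzero:

```latex
y(k+1) = a\,y(k) + b\,u(k), \qquad y(k+1) = r
\;\Longrightarrow\;
u(k) = \frac{r - a\,y(k)}{b}, \qquad b \neq 0
```

This relies on a and b being known accurately; any modelling error or disturbance shows up directly as an error in y(k+1).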

Regression
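One plausible reading of this step is that the parameters a and b of the model y(k+1) = a·y(k) + b·u(k) are estimated from logged samples by ordinary least squares. The sketch below makes that assumption; `fit_first_order` is a hypothetical helper, not something from the slides, and it solves the two-parameter normal equations directly so that it needs no external libraries.

```python
def fit_first_order(y, u):
    """Least-squares estimate of (a, b) in y(k+1) = a*y(k) + b*u(k).

    y -- measured outputs y(0), ..., y(n)   (length n + 1)
    u -- applied inputs   u(0), ..., u(n-1) (length n)
    """
    if len(y) != len(u) + 1:
        raise ValueError("y must have exactly one more sample than u")

    y_now, y_next = y[:-1], y[1:]

    # Sums that appear in the normal equations:
    #   a*Syy + b*Syu = Sny
    #   a*Syu + b*Suu = Snu
    syy = sum(v * v for v in y_now)
    suu = sum(v * v for v in u)
    syu = sum(yk * uk for yk, uk in zip(y_now, u))
    sny = sum(yn * yk for yn, yk in zip(y_next, y_now))
    snu = sum(yn * uk for yn, uk in zip(y_next, u))

    det = syy * suu - syu * syu
    if det == 0:
        raise ValueError("data does not determine a and b uniquely")

    a = (sny * suu - snu * syu) / det
    b = (snu * syy - sny * syu) / det
    return a, b


if __name__ == "__main__":
    # Synthetic data from a known model, used only to exercise the fit.
    true_a, true_b = 0.8, 0.5
    inputs = [1.0, 0.0, 2.0, -1.0, 3.0, 0.5]
    outputs = [0.0]
    for u_k in inputs:
        outputs.append(true_a * outputs[-1] + true_b * u_k)
    print(fit_first_order(outputs, inputs))
```

On noise-free data generated by the model, the fit recovers the parameters up to floating-point rounding. On real measurements the input should vary enough across samples for both parameters to be identifiable.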

Model evaluation – A data rate that makes it operate in the linear range