
Replica Placement for High Availability in Distributed Stream Processing Systems

Vinay Dhareshwar. Overview: Introduction; Middleware; System Model; Designing Replica Placement for High Availability; Distributed Placement Protocol; Conclusion.


Presentation Transcript


  1. Replica Placement for High Availability in Distributed Stream Processing Systems Vinay Dhareshwar

  2. Overview • Introduction • Middleware • System Model • Designing Replica Placement for High Availability • Distributed Placement Protocol • Conclusion

  3. Distributed Stream Processing Systems • Event-based systems deal with large-volume, high-rate data feeds • Data streams are processed in or near real time • Application domains include network traffic management, financial trade surveillance, and e-commerce applications • Distributed stream processing systems provide low-latency, high-throughput processing of data streams

  4. Characteristics of Distributed Stream Processing Systems • Availability • Replication • Strict • Sharing of components • Failure Recovery • Large number of components

  5. Where does replica placement fit in? • Not all primary replicas can be hosted by the same server • Practical constraints in replica placement

  6. Middleware overview

  7. System Model • Residual processing capacity rp_{v_i} • Residual available bandwidth rb_{e_j} • Communication latency l_{e_j} • Component c_i • Query Plan • Application Component Graph • Replication Component Graph • Primary/backup replication scheme • Replication degree
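The quantities on this slide can be read as a small data model. The sketch below is only an illustration of that reading, not the presenter's implementation: the class and field names are hypothetical, and it assumes the replication degree counts the primary together with its backups, which the slide does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    residual_capacity: float  # residual processing capacity of node v_i


@dataclass
class Link:
    src: str
    dst: str
    residual_bandwidth: float  # residual available bandwidth of link e_j
    latency: float             # communication latency of link e_j


@dataclass
class Component:
    name: str
    # Assumption: the degree counts the primary plus its backups.
    replication_degree: int = 2


@dataclass
class Replica:
    component: str
    index: int
    is_primary: bool


def replicas_of(component: Component) -> List[Replica]:
    """Expand one component into its replicas; replica 0 is the primary."""
    return [
        Replica(component.name, i, is_primary=(i == 0))
        for i in range(component.replication_degree)
    ]
```

Under these assumptions, expanding every component of the application component graph with `replicas_of` gives the vertices of the replication component graph: one primary and the remaining backups per component.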

  8. Designing Replica Placement for High Availability • Maximizing Application Availability • Respecting Resource Availability • Maximizing Application Performance • Inter-operation communication • Intra-operation communication

  9. Distributed Placement Protocol • Phase 1: Bootstrapping • Phase 2: Propagation • Step 1: Primary Placement Selection • Step 2: Primary Placement Negotiation • Step 3: Primary Placement Evaluation • Step 4: Primary Placement Decision • Phase 3: Completion • Failure Handling
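The ordering of the phases can be made concrete with a short trace generator. This is a minimal sketch under one labeled assumption: the four propagation steps repeat once per node along the query-plan path. The names `Stage` and `protocol_trace` are hypothetical, and failure handling is not modeled.

```python
from enum import Enum
from typing import List


class Stage(Enum):
    BOOTSTRAPPING = "Phase 1: Bootstrapping"
    SELECTION = "Phase 2, Step 1: Primary Placement Selection"
    NEGOTIATION = "Phase 2, Step 2: Primary Placement Negotiation"
    EVALUATION = "Phase 2, Step 3: Primary Placement Evaluation"
    DECISION = "Phase 2, Step 4: Primary Placement Decision"
    COMPLETION = "Phase 3: Completion"


PROPAGATION_STEPS = [
    Stage.SELECTION,
    Stage.NEGOTIATION,
    Stage.EVALUATION,
    Stage.DECISION,
]


def protocol_trace(path_length: int) -> List[Stage]:
    """List the stages visited for a path of `path_length` nodes.

    Assumption: propagation runs its four steps once per node on the path.
    """
    trace = [Stage.BOOTSTRAPPING]
    for _ in range(path_length):
        trace.extend(PROPAGATION_STEPS)
    trace.append(Stage.COMPLETION)
    return trace
```

For a three-node path, `protocol_trace(3)` starts with bootstrapping, walks the four propagation steps three times, and ends with completion.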

  10. Algorithm 1: Placement algorithm • Input: query plan, replication degree, node v_s • Output: application component graph, replication component graph • for each node v_i in path • perform transient resource allocation at v_i • identify candidate nodes already used for placement • select candidate nodes meeting bandwidth requirements • sort candidate nodes by latency • for each primary replica of downstream component • send placement request or placement negotiation • receive placement reply • send placement decision • for each backup replica of current component • send placement decision
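Two of the steps in Algorithm 1, selecting candidate nodes that meet the bandwidth requirement and sorting them by latency, can be sketched in isolation. The code below covers only those two steps and is an illustration rather than the presenter's algorithm: the `Candidate` type, its fields, and `rank_candidates` are hypothetical names, and the messaging steps (request, negotiation, reply, decision) are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    node: str
    residual_bandwidth: float  # bandwidth still available on the link to this node
    latency: float             # communication latency of the link to this node


def rank_candidates(
    candidates: List[Candidate], required_bandwidth: float
) -> List[Candidate]:
    """Keep candidates that meet the bandwidth requirement, lowest latency first."""
    feasible = [
        c for c in candidates if c.residual_bandwidth >= required_bandwidth
    ]
    return sorted(feasible, key=lambda c: c.latency)


candidates = [
    Candidate("A", residual_bandwidth=10.0, latency=5.0),
    Candidate("B", residual_bandwidth=3.0, latency=1.0),
    Candidate("C", residual_bandwidth=8.0, latency=2.0),
]
ranked = rank_candidates(candidates, required_bandwidth=5.0)
print([c.node for c in ranked])  # prints ['C', 'A']
```

Node B has the lowest latency but is dropped because it cannot meet the bandwidth requirement. Filtering before sorting keeps resource constraints as a hard limit and treats latency as the preference among feasible nodes.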

  11. Conclusion • Design principles for replica placement • Maximize availability while respecting resource constraints • Make performance-aware decisions • Decentralized protocol

  12. Thank you
