
Go Stream

Go Stream. Matvey Arye, Princeton/Cloudflare; Albert Strasheim, Cloudflare. Awesome CDN service for websites big and small: millions of requests per second at peak, 24 data centers across the globe. Data analysis: customer-facing analytics, system health monitoring, security monitoring.


Presentation Transcript


  1. Go Stream • Matvey Arye, Princeton/Cloudflare • Albert Strasheim, Cloudflare

  2. Awesome CDN service for websites big & small • Millions of requests per second at peak • 24 data centers across the globe

  3. Data Analysis • Customer-facing analytics • System health monitoring • Security monitoring => Need a global view

  4. Functionality • Calculate aggregate functions on fast, big data • Aggregate across nodes (across datacenters) • Data stored at different time granularities

  5. Basic Design Requirements • Reliability – Exactly-once semantics • High Data Volumes

  6. Our Environment (diagram labels: Source, Source, Stream processing, Storage)

  7. Basic Programming Model (diagram: chains of operators, labeled Op, connecting Storage nodes)

  8. Existing Systems • S4: the reliability model is not consistent • Storm: exactly-once semantics requires batching • Reliability only inside the stream processing system • What if a source goes down? The DB?

  9. The Need For End-to-End Reliability (diagram labels: Source, Stream Processing, Storage) • When a source comes back up, where does it start sending data from? • If using something like Storm, additional reliability mechanisms are needed

  10. The Takeaway • Need end-to-end reliability, or multiple reliability mechanisms • Reliability of stream processing alone is not enough

  11. Design of Reliability • Avoid queuing because destination has failed • Rely on storage at the edges • Minimize replication • Minimize edge cases • No specialized hardware

  12. Big Design Decisions End-to-end reliability Only transient operator state

  13. Recovering From Failure • Source: "I am starting a stream with you. What have you already seen from me?" • Storage: "I've seen <X>." • Source: "Okie dokie. Here is all the new stuff."
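The exchange on this slide can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not the go-stream implementation: the `Storage` and `Source` types, their method names, and the use of 1-based log positions as item IDs are all hypothetical.

```go
package main

import "fmt"

// Storage is a hypothetical sink that remembers the highest item ID it has
// stored for each source.
type Storage struct {
	highest map[string]uint64 // source name -> highest ID seen
	items   []string
}

func NewStorage() *Storage {
	return &Storage{highest: make(map[string]uint64)}
}

// LastSeen answers the question "what have you already seen from me?".
func (s *Storage) LastSeen(source string) uint64 {
	return s.highest[source]
}

// Store accepts an item only if its ID is newer than anything seen so far
// from that source, so replayed items are ignored.
func (s *Storage) Store(source string, id uint64, payload string) bool {
	if id <= s.highest[source] {
		return false
	}
	s.highest[source] = id
	s.items = append(s.items, payload)
	return true
}

// Source keeps its own log at the edge. An item's ID is assumed to be its
// 1-based position in the log.
type Source struct {
	name string
	log  []string
}

// Resume performs the handshake and then sends only the items the storage has
// not seen yet. It returns the number of items accepted.
func (src *Source) Resume(st *Storage) int {
	seen := st.LastSeen(src.name)
	sent := 0
	for i := seen; i < uint64(len(src.log)); i++ {
		if st.Store(src.name, i+1, src.log[i]) {
			sent++
		}
	}
	return sent
}

func main() {
	st := NewStorage()
	src := &Source{name: "edge-1", log: []string{"a", "b", "c"}}
	fmt.Println("first connect, items sent:", src.Resume(st))

	// The source restarts with two more items in its log.
	src.log = append(src.log, "d", "e")
	fmt.Println("after restart, items sent:", src.Resume(st))
}
```

Because the source asks before sending, a restart neither loses items nor stores them twice, as long as the source can replay its log from an arbitrary position.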

  14. Tracking what you have seen • Option 1: store an identifier for all items • Option 2: store one identifier for the highest number (diagram: items numbered 4, 3, 2, 1)

  15. Tracking what you have seen • Store an identifier for all items: the answer to "what have I seen?" is huge, and it requires lots of storage for IDs • Store one identifier for the highest number: parallel processing of ordered data is tricky (diagram: items numbered 4, 3, 2, 1)

  16. Tension between • Ordering • Reliability • Parallelization • High-volume data

  17. Go Makes This Easier • Language from Google written for concurrency • Goroutines run code • Channels send data between goroutines • Most synchronization is done by passing data

  18. Goroutine Scheduling Channels are FIFO queues that have a maximum capacity. So a goroutine can be in one of 4 states: • Executing code • Waiting for a thread to execute code • Blocking to receive data from a channel • Blocking to send data to a channel. The scheduler optimizes the assignment of goroutines to threads.
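As a small illustration of the blocking behavior described here, the following sketch runs a producer goroutine that sends into a bounded channel while the caller receives from it. The function names `produce` and `collect` are made up for this example.

```go
package main

import "fmt"

// produce sends 1..n on a channel with the given capacity from its own
// goroutine. The send blocks whenever the channel buffer is full.
func produce(n, capacity int) <-chan int {
	ch := make(chan int, capacity)
	go func() {
		defer close(ch)
		for i := 1; i <= n; i++ {
			ch <- i // blocks while the buffer is full
		}
	}()
	return ch
}

// collect receives until the channel is closed. The receive blocks while the
// channel is empty.
func collect(ch <-chan int) []int {
	var out []int
	for v := range ch {
		out = append(out, v)
	}
	return out
}

func main() {
	// Values arrive in the order they were sent because channels are FIFO.
	fmt.Println(collect(produce(5, 2)))
}
```

No locks appear in the code: the two goroutines synchronize only by passing data through the channel.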

  19. Efficient Ordering Under The Hood • Source distributes items to workers in a specific order • Each worker has a count channel (count of output tuples for each input tuple) and a result channel (the actual result tuples) • Reading from each worker: read one tuple off the count channel and assign the count to X, then read X tuples off the result channel

  20. Intuition behind design • Multiple output channels allow each worker to write independently • Count channel tells the reader how many tuples to expect • Does not block except when a result is needed to satisfy ordering • Judicious blocking allows the scheduler to use blocking as a signal for which worker to schedule
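The count-channel scheme from the last two slides can be sketched as follows. This is a simplified reconstruction from the slide text, not code taken from go-stream: the round-robin distribution, the channel capacities, and the names `orderedParallel` and `repeat` are assumptions made for the example.

```go
package main

import "fmt"

// worker holds the channels for one parallel operator instance.
type worker struct {
	in      chan int // input tuples from the source
	counts  chan int // number of output tuples per input tuple
	results chan int // the actual output tuples
}

// orderedParallel applies f to every input using nWorkers goroutines
// (nWorkers must be positive) and returns all outputs in input order.
func orderedParallel(inputs []int, nWorkers int, f func(int) []int) []int {
	workers := make([]worker, nWorkers)
	for i := range workers {
		w := worker{
			in:      make(chan int, 4),
			counts:  make(chan int, 4),
			results: make(chan int, 16),
		}
		workers[i] = w
		go func() {
			defer close(w.counts)
			for v := range w.in {
				out := f(v)
				w.counts <- len(out) // tell the reader how many to expect
				for _, r := range out {
					w.results <- r
				}
			}
		}()
	}

	// The source hands items to workers in a fixed round-robin order.
	go func() {
		for i, v := range inputs {
			workers[i%nWorkers].in <- v
		}
		for _, w := range workers {
			close(w.in)
		}
	}()

	// The reader visits workers in the same order. It blocks only when the
	// next result in sequence is not ready yet.
	var ordered []int
	for i := 0; ; i++ {
		w := workers[i%nWorkers]
		x, ok := <-w.counts
		if !ok {
			return ordered // the worker whose turn it is has no more input
		}
		for j := 0; j < x; j++ {
			ordered = append(ordered, <-w.results)
		}
	}
}

// repeat emits x copies of x, so one input can yield zero or many outputs.
func repeat(x int) []int {
	out := make([]int, x)
	for i := range out {
		out[i] = x
	}
	return out
}

func main() {
	fmt.Println(orderedParallel([]int{3, 0, 1, 2}, 3, repeat))
}
```

Workers never wait on each other. A worker stalls only when its own buffers fill up, which happens when the reader is still draining an earlier item from another worker.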

  21. Throughput does not suffer

  22. The Big Picture - Reliability • Sources provide monotonically increasing IDs, per stream • Stream processor preserves ordering, per source-stream • Central DB maintains a mapping of: source-stream => highest ID processed
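One way to picture the central mapping is an in-memory table keyed by source and stream. The sketch below is an assumption-laden stand-in for the central DB: `StreamKey`, `Record`, `Watermarks`, and `Commit` are hypothetical names, and a real deployment would keep this mapping in durable storage rather than in a Go map.

```go
package main

import (
	"fmt"
	"sync"
)

// StreamKey identifies one stream from one source (hypothetical naming).
type StreamKey struct {
	Source string
	Stream string
}

// Record is one item carrying the monotonically increasing ID that its
// source assigned within the stream.
type Record struct {
	Key StreamKey
	ID  uint64
}

// Watermarks stands in for the central DB table:
// source-stream => highest ID processed.
type Watermarks struct {
	mu      sync.Mutex
	highest map[StreamKey]uint64
}

func NewWatermarks() *Watermarks {
	return &Watermarks{highest: make(map[StreamKey]uint64)}
}

// Commit applies a batch whose records are in per-stream order and returns
// how many were new. Records at or below the stored ID are treated as
// replays after a failure and skipped.
func (w *Watermarks) Commit(batch []Record) int {
	w.mu.Lock()
	defer w.mu.Unlock()
	applied := 0
	for _, r := range batch {
		if r.ID <= w.highest[r.Key] {
			continue
		}
		w.highest[r.Key] = r.ID
		applied++
	}
	return applied
}

// Highest reports the highest ID processed for one source-stream.
func (w *Watermarks) Highest(k StreamKey) uint64 {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.highest[k]
}

func main() {
	logs := StreamKey{Source: "edge-1", Stream: "requests"}
	errs := StreamKey{Source: "edge-1", Stream: "errors"}
	wm := NewWatermarks()

	fmt.Println(wm.Commit([]Record{{logs, 1}, {errs, 1}, {logs, 2}}))
	// A replay that overlaps the previous batch only applies the new record.
	fmt.Println(wm.Commit([]Record{{logs, 2}, {logs, 3}}))
	fmt.Println(wm.Highest(logs), wm.Highest(errs))
}
```

A single ID per source-stream is enough only because ordering is preserved along the whole path. If records could arrive out of order, skipping everything at or below the stored ID would drop data.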

  23. Functionality of Stream Processor • Compression, serialization • Partitioning for distributed sinks • Bucketing: take individual records and construct aggregates, across source nodes and across time (adjustable granularity) • Batching: submitting many records at once to the DB • Bucketing and batching are all done with transient state
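Bucketing and batching can be illustrated with a short sketch. The record layout, the names `bucketize` and `batches`, and the use of Unix seconds for timestamps are assumptions for the example and are not taken from go-stream.

```go
package main

import (
	"fmt"
	"sort"
)

// Request is a hypothetical input record: a Unix timestamp in seconds and a URL.
type Request struct {
	Unix int64
	URL  string
}

// Aggregate is the count of requests for one URL in one time bucket.
type Aggregate struct {
	BucketStart int64
	URL         string
	Count       int
}

// bucketize groups requests into time buckets of the given width (seconds)
// and counts them. The map is transient state that lives only for this call.
// Timestamps are assumed non-negative and width positive.
func bucketize(reqs []Request, width int64) []Aggregate {
	type key struct {
		start int64
		url   string
	}
	counts := make(map[key]int)
	for _, r := range reqs {
		counts[key{r.Unix - r.Unix%width, r.URL}]++
	}
	out := make([]Aggregate, 0, len(counts))
	for k, n := range counts {
		out = append(out, Aggregate{k.start, k.url, n})
	}
	// Sort so the output does not depend on map iteration order.
	sort.Slice(out, func(i, j int) bool {
		if out[i].BucketStart != out[j].BucketStart {
			return out[i].BucketStart < out[j].BucketStart
		}
		return out[i].URL < out[j].URL
	})
	return out
}

// batches splits aggregates into groups of at most size records, so each
// group can be submitted to the DB in one call.
func batches(aggs []Aggregate, size int) [][]Aggregate {
	var out [][]Aggregate
	for size > 0 && len(aggs) > 0 {
		n := size
		if len(aggs) < n {
			n = len(aggs)
		}
		out = append(out, aggs[:n])
		aggs = aggs[n:]
	}
	return out
}

func main() {
	reqs := []Request{
		{Unix: 60, URL: "bar.com/m"},
		{Unix: 61, URL: "bar.com/m"},
		{Unix: 61, URL: "foo.com/q"},
		{Unix: 125, URL: "bar.com/m"},
	}
	for i, b := range batches(bucketize(reqs, 60), 2) {
		fmt.Println("batch", i, b)
	}
}
```

Changing the `width` argument changes the time granularity without touching the rest of the pipeline.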

  24. Where to get the code • Stable: https://github.com/cloudflare/go-stream • Bleeding edge: https://github.com/cevian/go-stream • Contact: arye@cs.princeton.edu

  25. Data Model Streaming OLAP-like cubes Useful summaries of high-volume data

  26. Cube Dimensions (diagram: Time axis with 01:01:00 and 01:01:01; URL axis with bar.com/m, bar.com/n, foo.com/q, foo.com/r)

  27. Cube Aggregates (diagram: the cell at Time 01:01:01, URL bar.com/m holds the aggregates (Count, Max))

  28. Updating A Cube • Request #1: bar.com/m, 01:01:00, latency 90 ms (diagram: cube over Time 01:01:00 to 01:01:01 and URL bar.com/m, bar.com/n, foo.com/q, foo.com/r; every cell is (0,0))

  29. Map Request To Cell • Request #1: bar.com/m, 01:01:00, latency 90 ms (diagram: the request maps to the cell at URL bar.com/m, Time 01:01:00; every cell is still (0,0))

  30. Update The Aggregates • Request #1: bar.com/m, 01:01:00, latency 90 ms (diagram: the matching cell becomes (1,90); the other cells stay (0,0))

  31. Update In-Place • Request #2: bar.com/m, 01:01:00, latency 50 ms (diagram: the same cell becomes (2,90); the other cells stay (0,0))
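The update steps on slides 28 to 31 can be written out as a small sketch. The `Cube`, `Cell`, and `Agg` types are hypothetical names for this example. Latencies are assumed non-negative, so a fresh cell can start at (0,0) as on the slides.

```go
package main

import "fmt"

// Cell addresses one cube cell by its dimension values.
type Cell struct {
	Time string
	URL  string
}

// Agg holds the (Count, Max) aggregates stored in a cell.
type Agg struct {
	Count int
	Max   int // maximum latency in milliseconds
}

// Cube maps cells to aggregates. Missing cells read as the zero value (0,0).
type Cube map[Cell]Agg

// Update maps a request to its cell and updates the aggregates in place.
func (c Cube) Update(t, url string, latencyMs int) {
	k := Cell{Time: t, URL: url}
	a := c[k]
	a.Count++
	if latencyMs > a.Max {
		a.Max = latencyMs
	}
	c[k] = a
}

func main() {
	cube := Cube{}
	cube.Update("01:01:00", "bar.com/m", 90) // request #1
	fmt.Println(cube[Cell{"01:01:00", "bar.com/m"}])
	cube.Update("01:01:00", "bar.com/m", 50) // request #2
	fmt.Println(cube[Cell{"01:01:00", "bar.com/m"}])
}
```

The two printed values are `{1 90}` and `{2 90}`, matching the cell contents shown on slides 30 and 31.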

  32. Cube Slice (diagram: a slice of the cube covering Time 01:01:58 to 01:01:59; Time axis runs from 01:01:00 to 01:01:59; URL axis with bar.com/m, bar.com/n, foo.com/q, foo.com/r)

  33. Cube Rollup (diagram: cells rolled up to URL foo.com/*, Time 01:01:01 and URL bar.com/*, Time 01:01:01; URL axis with bar.com/m, bar.com/n, foo.com/q, foo.com/r)

  34. Rich Structure (diagram: cube over Time 01:01:00 to 01:01:59 and URL bar.com/m, bar.com/n, foo.com/q, foo.com/r, with regions labeled A to E and example cell values (5,90), (3,75), (8,199), (21,40))

  35. Key Property • 2 types of rollups: across dimensions and across sources • We use the same aggregation function for both • Powerful conceptual constraints • Semantic properties preserved when changing the granularity of reporting
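To make the key property concrete, the sketch below uses one `merge` function both to combine cubes from two sources and to roll up the URL dimension to a host wildcard. The type names, the `host/*` convention for the coarse URL, and the sample numbers are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

type Cell struct{ Time, URL string }
type Agg struct{ Count, Max int }
type Cube map[Cell]Agg

// merge is the single aggregation function used for every rollup:
// counts add up and the maximum is kept.
func merge(a, b Agg) Agg {
	out := Agg{Count: a.Count + b.Count, Max: a.Max}
	if b.Max > out.Max {
		out.Max = b.Max
	}
	return out
}

// combine rolls up across sources by merging cubes cell by cell.
func combine(cubes ...Cube) Cube {
	out := Cube{}
	for _, c := range cubes {
		for k, a := range c {
			out[k] = merge(out[k], a)
		}
	}
	return out
}

// rollupURL rolls up across a dimension by collapsing each URL to "host/*".
func rollupURL(c Cube) Cube {
	out := Cube{}
	for k, a := range c {
		host := strings.SplitN(k.URL, "/", 2)[0]
		coarse := Cell{Time: k.Time, URL: host + "/*"}
		out[coarse] = merge(out[coarse], a)
	}
	return out
}

func main() {
	sourceA := Cube{
		{"01:01:01", "bar.com/m"}: {5, 90},
		{"01:01:01", "bar.com/n"}: {3, 75},
	}
	sourceB := Cube{
		{"01:01:01", "bar.com/m"}: {8, 199},
		{"01:01:01", "foo.com/q"}: {21, 40},
	}
	// Either order of rollups gives the same coarse cube.
	fmt.Println(rollupURL(combine(sourceA, sourceB)))
	fmt.Println(combine(rollupURL(sourceA), rollupURL(sourceB)))
}
```

Because `merge` is associative and commutative, the order in which rollups are applied does not matter. That is one reading of why the slide says semantic properties are preserved when the reporting granularity changes.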

  36. Where to get the code • Stable: https://github.com/cloudflare/go-stream • Bleeding edge: https://github.com/cevian/go-stream • Contact: arye@cs.princeton.edu
