Real-time Stream Processing Architecture for Comcast IP Video

Real-time Stream Processing Architecture for Comcast IP Video. Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau. Agenda. Comcast VIPER Overview Architecture Overview Q & A. Comcast Video IP Engineering and Research (VIPER).


Presentation Transcript

Real-time Stream Processing Architecture for Comcast IP Video

Strata Conference + Hadoop World 2013

Chris Lintz

Gabriel Commeau


Agenda

  • Comcast VIPER Overview

  • Architecture Overview

  • Q & A

Comcast Video IP Engineering and Research (VIPER)

Preparation, Delivery, Video Players







Video Players

Xbox Live




Why Do We Focus on Real-time?

  • Proactively diagnose issues

  • Form real-time intelligence

  • Help deliver best possible video experience


Prime Time

Video Player Analytics Protocol

  • Live and On Demand

  • JSON event objects

  • Key metrics

    • Bitrate

    • Frame rate

    • Fragments

    • Errors

We collect and use all data in accordance with best consumer privacy practices and applicable laws.
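To make the event format concrete, here is a minimal sketch of what one JSON event object might look like. The slides name the metrics (bitrate, frame rate, fragments, errors) but not the schema, so every field name below (`session_id`, `type`, `bitrate_kbps`, and so on) is an assumption made for illustration.

```python
import json

# Hypothetical player event. Field names are illustrative only and are
# not taken from the actual Video Player Analytics Protocol.
def make_event(session_id, event_type, **metrics):
    return {"session_id": session_id, "type": event_type, "metrics": metrics}

event = make_event(
    "session-123",
    "heartbeat",
    bitrate_kbps=3500,
    frame_rate_fps=29.97,
    fragments_downloaded=42,
    errors=[],
)

encoded = json.dumps(event, sort_keys=True)  # what a player could send
decoded = json.loads(encoded)                # what a collector could parse
```

Carrying a session identifier on every event is what would let a downstream tier group events back into player sessions.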

Player Sessions: Key to Understanding Video Experience

High Level Architecture And Data Flow

Flume: Data Collection Tier

  • Collect, aggregate and move large amounts of data

  • Distributed, scalable, reliable, customizable

  • Multi-tier architecture

Storm: Stream Processing Tier

Player Sessions in Real-time

  • Sessions in Flume?

    • Technical issues: consistent hash and exactly-once semantics

    • Design goals

    • Separation of concerns

  • Session write-through rate?
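The consistent hash issue mentioned above presumably arises because building sessions requires every event of a session to reach the same node, even as nodes join or leave. The following is a generic sketch of a consistent-hash ring with virtual nodes, not the talk's implementation; the class name, node names, and replica count are all assumptions.

```python
import bisect
import hashlib

class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, replicas=64):
        # Each node is placed on the ring `replicas` times to even out load.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def node_for(self, session_id):
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(session_id)) % len(self._keys)
        return self._ring[idx][1]

ring = HashRing(["agent-a", "agent-b", "agent-c"])
owner = ring.node_for("session-123")
```

With this scheme, removing a node only reassigns the sessions that node owned; sessions on the remaining nodes stay where they are. It does not by itself address the exactly-once concern the slide also lists.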

Flume Edge Tier: Video Player Analytics End Point

  • Analytics events over HTTPS

  • HTTP Source

  • Re-batch with inner sink and source
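As a rough illustration of the edge tier, Flume's HTTP Source ships with a JSON handler that accepts an array of events, each with a `headers` map and a `body` string. The sketch below builds such a payload and regroups events into fixed-size batches. Whether the talk's deployment uses that default handler is not stated, and the function names and the `session_id` header are assumptions.

```python
import json

def to_flume_batch(player_events):
    """Wrap player events in the JSON array layout of Flume's JSON handler."""
    return json.dumps([
        {"headers": {"session_id": e["session_id"]}, "body": json.dumps(e)}
        for e in player_events
    ])

def rebatch(events, batch_size):
    """Regroup a stream of events into batches of at most batch_size."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch

events = [{"session_id": f"s{i}", "type": "heartbeat"} for i in range(5)]
payloads = [to_flume_batch(b) for b in rebatch(events, 2)]
```

Players tend to send small requests, so regrouping them into larger batches before forwarding is one plausible reason for the re-batch step.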

Flume Mid Tier: Processing and Routing Data

  • Video Player Event processing

    • Geo-location, asset metadata, validation, to-storm

  • Replication channel processor:

    • HDFS sink

    • Storm sink
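The mid tier's two jobs, enriching each event and then copying it to both sinks, can be sketched in a few lines. Flume's replicating channel selector copies each event into every configured channel; the plain lists below stand in for those channels. The lookup tables, field names, and validation rule are assumptions.

```python
def enrich(event, geo_lookup, asset_lookup):
    """Return an enriched copy of the event, or None if validation fails."""
    if "session_id" not in event or "asset_id" not in event:
        return None  # hypothetical validation rule
    enriched = dict(event)
    enriched["geo"] = geo_lookup.get(event.get("client_ip"), "unknown")
    enriched["asset"] = asset_lookup.get(event["asset_id"], {})
    return enriched

def replicate(event, channels):
    """Replicating behavior: every channel receives its own copy."""
    for channel in channels:
        channel.append(dict(event))

geo = {"203.0.113.7": "Philadelphia, PA"}
assets = {"a1": {"title": "Example Show"}}
hdfs_channel, storm_channel = [], []

raw = {"session_id": "s1", "asset_id": "a1", "client_ip": "203.0.113.7"}
processed = enrich(raw, geo, assets)
if processed is not None:
    replicate(processed, [hdfs_channel, storm_channel])
```

Replication means the archival path (HDFS) and the real-time path (Storm) each receive the full enriched stream.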

Bridging Flume to Storm: Flume2Storm Connector

  • Service discovery

  • Distributed, scalable and reliable

  • Low latency

Simplified Video Player Storm Topology

Requirements for Read/Writes from Storm Bolts

  • Functionality beyond key/value stores

  • Real-time and historic window queries

  • Speed of in-memory writes and durability of disk
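A window query is the kind of access a plain key/value store does not offer. The sketch below uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for a SQL store; the table, columns, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player_events "
    "(session_id TEXT, ts INTEGER, bitrate_kbps INTEGER, errors INTEGER)"
)
conn.executemany(
    "INSERT INTO player_events VALUES (?, ?, ?, ?)",
    [
        ("s1", 100, 3000, 0),
        ("s1", 160, 1500, 1),
        ("s2", 170, 4000, 0),
        ("s2", 400, 3500, 0),
    ],
)

def window_stats(conn, start_ts, end_ts):
    """Sessions, average bitrate, and error count for start_ts <= ts < end_ts."""
    return conn.execute(
        "SELECT COUNT(DISTINCT session_id), AVG(bitrate_kbps), SUM(errors) "
        "FROM player_events WHERE ts >= ? AND ts < ?",
        (start_ts, end_ts),
    ).fetchone()

recent = window_stats(conn, 100, 200)    # short, recent window
historic = window_stats(conn, 0, 1000)   # wider historic window
```

The same parameterized query serves both the recent and the historic window; only the time bounds change.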

Utilizing MemSQL for Persistence

  • Distributed in-memory SQL database

  • ACID, highly available, fault tolerant

  • Aggregators route queries to leaves

  • Leaves are auto-sharded

  • Solves our intense
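The aggregator and leaf roles can be pictured with a toy model: rows are hash-sharded across leaves, single-key queries go to one leaf, and global queries fan out to all of them. This is a conceptual sketch only and says nothing about MemSQL's internals; the class and method names are assumptions.

```python
import zlib

class Leaf:
    """Holds one shard of the data."""
    def __init__(self):
        self.rows = []

class Aggregator:
    """Routes inserts and queries to leaves by hashing the shard key."""
    def __init__(self, leaves):
        self.leaves = leaves

    def _leaf_for(self, shard_key):
        index = zlib.crc32(shard_key.encode("utf-8")) % len(self.leaves)
        return self.leaves[index]

    def insert(self, shard_key, row):
        self._leaf_for(shard_key).rows.append((shard_key, row))

    def lookup(self, shard_key):
        # Single-shard query: only one leaf is consulted.
        leaf = self._leaf_for(shard_key)
        return [row for key, row in leaf.rows if key == shard_key]

    def count_all(self):
        # Scatter-gather query: every leaf contributes a partial result.
        return sum(len(leaf.rows) for leaf in self.leaves)

agg = Aggregator([Leaf() for _ in range(4)])
for i in range(10):
    agg.insert(f"session-{i % 3}", {"seq": i})
```

Because the shard is derived from the key, callers never choose a leaf themselves, which is the sense in which sharding is automatic in this model.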


Isolated Analysts and Ingest Aggregators

Achievements In Utilizing MemSQL

  • Complex queries in milliseconds

  • Fault-tolerant Storm bolt state

  • Joins now available outside of Storm bolts

    • Foreign key shards

  • Complex data streams

    • Dynamic alters without locks or downtime

    • JSON type

Wrapping Up

  • Real-time at Comcast scale

    • Millions of video players

    • Horizontal scale everywhere

    • Aggregated metrics across US and complex analysis

    • Real-time API

  • Builds foundation

    • Advanced real-time analytics

    • Better platform for innovation

      • Alerts on complex objects

      • Supplemental real-time data back to clients

      • Popularity-based CDN

Thank You
