Presented by H. Mark Okada CMPT 820 February 18, 2009

Sections 14.1 - 14.4: Streaming Media on Demand and Live Broadcast • From: Multimedia over IP and Wireless Networks: Compression, Networking, and Systems • Mihaela van der Schaar & Philip A. Chou

Streaming Media • Media on demand: a user scenario characterised by the experience of local audio or video playback, as from a CD or DVD • interactive controls: fast forward, pause, seek, etc. • Live broadcast: a user scenario characterised by tuning into a radio or television program • the user can only join or leave a session • Both are prevalent on the Internet today, e.g. • interactive music and video playback • Internet radio • Chapter 14 looks at how these services are delivered • Sections 14.2-14.4 cover only media on demand

Overview Section 14.2 • Overview of • Architectures • Protocols • Format issues Section 14.3 • Buffering and timing fundamentals Section 14.4 • How media data is communicated for streaming on demand NOT COVERED - Section 14.5 • Live broadcast

Architectures - 14.2.1 • Streaming media on demand and live broadcast require different architectures Figure 14.1

Streaming media on demand • source media is encoded offline into a media file • the file is streamed using different protocols (Section 14.2.2) • the media file may be specialised to support various modes of streaming (Section 14.2.3) • client temporarily buffers encoded media in a decoder buffer • client temporarily buffers decoded media in a render buffer • the render buffer is fairly short (a frame or two) because decoded frames are large • interactivity is provided through playback commands • play, FF, stop, seek • Communication between server & client is tailored to • client's resources • network connection Figure 14.1a

Progressive downloading • a type of streaming in which media can be delivered faster than it is played back, i.e. the entire file is effectively downloaded • If the file can be decoded sequentially • progressive downloading can be done through simple file transfer protocols • e.g. FTP or HTTP, both over TCP/IP (i.e. from an FTP server or a web server) • If the client buffer is limited • progressive downloading can rely on simple TCP flow control • the client accepts data from TCP only if there is space in its media buffer • popularised by SHOUTcast, an early music streaming service • Requires: network bandwidth > media content bit rate (the source coding rate)

Progressive downloading (continued) • need to account for network jitter and temporary interference • want the highest possible source coding rate (not less than the worst-case network bandwidth) • These are the main issues for media on demand and for the communication protocol between client and server • Requires: network bandwidth > media content bit rate (the source coding rate)
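The buffer-limited case can be illustrated with a small simulation. This is only a sketch under stated assumptions: time advances in one-second ticks, the client accepts data only while its media buffer has space (standing in for TCP flow control), and the function name and all parameters are hypothetical, not taken from the chapter.

```python
def simulate_progressive_download(bandwidth, media_rate, buffer_bits,
                                  media_bits, preroll_bits):
    """Toy model of progressive download with a bounded client buffer.

    All quantities are in bits (or bits per one-second tick).
    Returns (stalls, peak_fullness): the number of ticks in which playback
    had to wait for data, and the largest buffer fullness observed.
    """
    if bandwidth <= 0 or media_rate <= 0:
        raise ValueError("rates must be positive")
    if media_rate > buffer_bits or preroll_bits > buffer_bits:
        raise ValueError("buffer too small for one tick of media or for the preroll")

    fullness = received = played = stalls = peak = 0
    playing = False
    while played < media_bits:
        # Flow control: accept no more than the free space in the buffer.
        accept = min(bandwidth, buffer_bits - fullness, media_bits - received)
        fullness += accept
        received += accept
        peak = max(peak, fullness)
        if playing:
            need = min(media_rate, media_bits - played)
            if fullness >= need:
                fullness -= need
                played += need
            else:
                stalls += 1          # buffer underflow: playback waits
        elif fullness >= preroll_bits or received == media_bits:
            playing = True           # preroll complete, playback starts next tick
    return stalls, peak
```

With bandwidth above the media bit rate the buffer fills to its limit and playback never waits; with bandwidth below the media bit rate the same client stalls repeatedly, which is the condition stated on the slide.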

Live broadcast • encoder may be directly connected to the server through an encoder buffer • encoder buffer holds a limited amount of data to maintain a fixed and short end-to-end delay • server accesses data only at the playback point, not at arbitrary points in a file • this restricts adaptivity, which matters when there are multiple receivers • interactive access to the media is not possible • difficult to adapt the transmission rate to varying clients** • difficult for the server to use retransmission-based error control • due to the negative acknowledgement (NAK) implosion problem • error control becomes a delicate issue for live broadcast **receiver-driven layered multicast (RLM) allows adaptation of the transmission rate. Also see: S. R. McCanne. Scalable Compression and Transmission of Internet Multicast Video. Ph.D. thesis, University of California, Berkeley, CA, December 1996. S. R. McCanne, V. Jacobson, and M. Vetterli. “Receiver-Driven Layered Multicast,” in Proc. SIGCOMM, pages 117–130, Stanford, CA, August 1996. ACM.

Protocols - 14.2.2 • streaming on demand requires many protocols at different levels • This section covers a subset of the protocols described in week 2 of this class • RTP: Real-time Transport Protocol • RTSP: Real Time Streaming Protocol • RTCP: RTP Control Protocol • SIP: Session Initiation Protocol

Real-time streaming protocol (RTSP) • RFC 2326 At the topmost level: • application level protocol • protocols for content discovery • connection to specific streaming media server Content discovery is done “out of band” eg. http://www.microsoft.com/directory/contentname.asx http://www.realnetworks.com/directory/contentname.ram http://www.apple.com/directory/contentname.mov • URL pointing to metadata that references a separate file on a webserver • different for each type: asx, ram, mov Client contacts server using URL for the content. eg. rtsp://wms.microsoft.com/directory/contentname.wmv rtsp://helixserver.example.com/audio1.rm?start=55&end=1:25 rtsp://qtserver.apple.com/directory/contentname.mov • Prefix: indicates the streaming protocol used • Suffix: info to the server, eg. seek, play speed, etc.
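The prefix/suffix split of a streaming URL can be sketched with Python's standard `urllib.parse`. The function name and the shape of the returned dictionary are illustrative assumptions; only the example URL comes from the slide.

```python
from urllib.parse import urlsplit, parse_qs


def describe_stream_url(url):
    """Split a streaming URL into protocol prefix, server, path and options."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value for readability.
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "protocol": parts.scheme,     # prefix: which streaming protocol to use
        "server": parts.hostname,
        "path": parts.path,
        "options": options,           # suffix: information passed to the server
    }


info = describe_stream_url(
    "rtsp://helixserver.example.com/audio1.rm?start=55&end=1:25")
```

Here `info["protocol"]` is `"rtsp"` and `info["options"]` holds the `start` and `end` values that the server would interpret as a seek range.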

Example of auxiliary file
Microsoft ASX file:
  <ASX Version="3.0">
    <ENTRY>
      <REF HREF="mms://streamingmedia/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
    </ENTRY>
    <ENTRY>
      <REF HREF="mms://winmedianw/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
    </ENTRY>
  </ASX>
RealNetworks RAM file:
  # First URL that opens a related info pane.
  rtsp://helixserver.example.com/video3.rm?rpcontextheight=350
  &rpcontextwidth=300&rpcontexturl="http://www.example.com/relatedinfo2.html"
  &rpcontexttime=5.5&rpvideofillcolor=rgb(30,60,200)
  #
  # Second URL that keeps the same related info pane,
  # but changes the media playback pane’s background color.
  rtsp://helixserver.example.com/video4.rm?rpcontexturl=_keep
  &rpvideofillcolor=red
Figure 14.2
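As a rough illustration of how a client might read such an auxiliary file, the sketch below extracts the stream references from the ASX example using the standard-library XML parser. It assumes the tags are written in upper case exactly as in the figure (`xml.etree.ElementTree` is case-sensitive), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

ASX_TEXT = """<ASX Version="3.0">
  <ENTRY>
    <REF HREF="mms://streamingmedia/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
  </ENTRY>
  <ENTRY>
    <REF HREF="mms://winmedianw/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
  </ENTRY>
</ASX>"""


def asx_references(text):
    """Return the HREF of every REF element, in document order."""
    root = ET.fromstring(text)
    return [ref.get("HREF") for ref in root.iter("REF")]


urls = asx_references(ASX_TEXT)
```

The two entries point at the same content on different servers, so a player could try them in order.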

Streaming protocol • playback commands are typically sent reliably over a TCP connection (in many forms) • Real Time Streaming Protocol (RTSP) is widely adopted (RFC 2326) • The idea is simple, but SET_PARAMETER can be complicated • a media file may contain multiple audio and video streams for different languages, subtitles, source coding rates, etc.

Real-time Transport Protocol (RTP) • Client is able to specify which lower-level data transport protocol to use • data transport is usually either • RTP over UDP, or • RTP over TCP • Both are preferred for bandwidth efficiency • RTP over UDP: some means of transmission rate control and error control must be provided • RTP defines no standard means of transmission rate control or error control • HTTP over TCP may be used to avoid firewall issues

RTP Control Protocol (RTCP) • RFC 3550 • often used with RTP • receivers often provide statistical feedback to the sender (reports) • the mix of interoperable and proprietary features limits its use as a standard

Windows Media system • RTP over UDP • transmission rate control is normally based on the source coding rate of the content • client can detect congestion • client signals the server to lower or raise the source coding rate

Alternative methods of transmission rate control 1) TFRC: TCP-friendly rate control 2) TCP-like congestion control algorithm • Both are being standardised as two profiles in the Datagram Congestion Control Protocol (DCCP) • Must be paired with a source coding rate control algorithm so that the coding rate matches the transmission rate • e.g. the rate-distortion optimised (RaDiO) scheduling algorithm • Error control in Windows Media uses selective retransmission • on detecting a gap in the received packets, the client sends a NAK (negative acknowledgement) to the server, causing retransmission • audio has higher priority than video • Windows Media Player stalls if audio packets are missing and waits for their arrival
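NAK-based selective retransmission can be sketched generically as gap detection over packet sequence numbers. This is not the Windows Media implementation: the function name is hypothetical, and sequence numbers are assumed to start at the first packet received and never wrap around.

```python
def naks_for_arrivals(arrivals):
    """Return the sequence numbers a client would NAK, in the order detected.

    `arrivals` lists packet sequence numbers in arrival order. A gap is
    detected when a packet arrives with a number higher than the next one
    expected; late or retransmitted packets (lower numbers) are ignored.
    """
    requests = []
    next_expected = arrivals[0] if arrivals else 0
    for seq in arrivals:
        if seq > next_expected:
            # Every number skipped over is reported missing.
            requests.extend(range(next_expected, seq))
        if seq >= next_expected:
            next_expected = seq + 1
    return requests


missing = naks_for_arrivals([0, 1, 4, 5, 2, 7])
```

In this run packets 2 and 3 are requested when packet 4 arrives, and packet 6 when packet 7 arrives; the late arrival of packet 2 triggers nothing further.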

File formats - 14.2.3 Challenging to adapt a fixed media file to various network and client conditions • encoding must be done before streaming (with no knowledge of the streaming context) • so flexibility must be built into the media file • Unrealistic to compress or transcode to the needs of every client • the best way is to let the server select which parts of the file to stream

Some streaming formats The major players • MPEG-4 format • QuickTime format (on which the MPEG-4 format is based) • RealMedia format • Microsoft Advanced Streaming Format (ASF) • All can contain/multiplex multiple media and multiple versions of each medium • each is recorded into a track (MPEG-4/QT) or stream (ASF) • data units: chunks (MPEG-4/QT) or packets (ASF)

Streaming formats • Each has a header containing metadata relating to the overall file and to specific tracks or streams • title, author, date, encryption, rights management, table of contents, track/stream enumeration & their descriptions • Information on individual track/stream properties • start time, duration, bit rate, buffer size, sampling rate, picture size, scalability capabilities • Time-varying metadata can be associated with each track/stream • network packetisation, decoding and presentation time stamps, SMPTE time codes, key frames, switch frames • Two types of metadata • static metadata: size independent of the length of the data, inexpensive to transmit over the network • time-varying metadata: size grows with the data, expensive to transmit

Streaming formats • … • provide a structure that lets the server select which parts of the data to transmit • Either • coarse grained: server streams only a particular subset of streams to the client • fine grained: in addition, a fraction of the data within a stream can be chosen • Can set a Lagrange multiplier parameter which determines which data units are not transmitted

Encoding media into a stream Two methods 1) Multibit rate (MBR) • multiple independent encodings (each at a different coding rate) are stored in separate streams (in the same file) • choice of which streams to play 2) Scalable coding • covered later, in Section 14.3.3

Data units • use packets • e.g. H.264/AVC uses the Network Abstraction Layer (NAL) • In general, formats designed for local playback/storage are not suitable for streaming • hard for the server to choose the right portions of the file to stream • difficult to randomly access (seek to) arbitrary points in the stream

Fundamental abstractions - 14.3 Fundamental abstractions of streaming media on demand • Section covers • leaky bucket models of bit streams • constant bit rate (CBR) vs. variable bit rate (VBR) • compound (multiple media) streams • preroll delay • playback speed • timing • clocks • decoder and presentation timestamps • Should know when it is safe for the client to begin playback

Buffering and leaky bucket models Scenario 1 - constant bit rate (CBR) • isochronous** noiseless communication channel • encoder buffer between encoder and channel • decoder buffer between channel and decoder • schedule: the sequence of times at which successive bits of an encoded bit stream pass a given point in the pipeline **isochronous: equal amounts of data are communicated in equal amounts of time Figure 14.3 Figure 14.4 • B bits = encoder buffer + decoder buffer

Buffer tube • The previous schedules can be viewed as a buffer tube • Characterised by 3 parameters • R - slope • B - height in bits • Fe - offset (fullness) from the bottom of the tube • or equivalently Fd - offset from the top of the tube • Fd = B - Fe • From a buffer point of view • encoder buffer overflow => decoder buffer underflow • encoder buffer underflow => decoder buffer overflow • B = encoder buffer + decoder buffer • Fe - initial fullness of encoder buffer • managed by a rate control algorithm • assigns a number of bits b(n) to each frame n

Buffer tube • Managed by a rate control algorithm • assigns a number of bits b(n) to each frame n • B = encoder buffer + decoder buffer • Fe - initial fullness of encoder buffer • De = Fe/R - initial delay before data enters the channel • Dd = Fd/R - delay between data arriving from the channel and being extracted by the decoder • Together these define an (R,B,F) tube • Aim to keep the decoder buffer delay Dd = Fd/R low Figure 14.5

Variable bit rate stream (VBR) Scenario 2 - variable bit rate stream (VBR) • Unlike CBR, VBR has a variable amount of data per time segment • higher bit rate for complex segments • lower bit rate for less complex segments • tends to have wider buffer tubes => larger start-up delay • part of a wider problem: it is difficult to determine the average bit rate of the stream

Variable bit rate stream (VBR) • Recall the (R,B,F) tube • its parameters are not unique for a given bit stream • Defining the average rate is nontrivial • fit the closest slope along the staircase, or • number of bits in stream / duration of stream

Variable bit rate • encoder does not use the channel continuously • channel has a peak transmission rate R higher than the average stream bit rate • when needed, packets are sent at rate R • otherwise at rate 0 • typical of packet networks and shared channels • best modelled by a leaky bucket, defined by (R, B, Fe) • n: frame number • b(n): number of bits of frame n placed in the leaky bucket • τ(n): time at which frame n is processed • R: bit rate at which data leaks out of the bucket • Fe(n): fullness of encoder buffer before frame n is added • Be(n): fullness of encoder buffer after frame n is added • The bucket follows the schedule $B^e(n) = F^e(n) + b(n)$ and $F^e(n+1) = \max\{0,\ B^e(n) - R[\tau(n+1) - \tau(n)]\}$

Leaky bucket • Be(n): fullness of encoder buffer after frame n is added to the bucket • Fe(n): fullness of encoder buffer before frame n is added to the bucket • the bucket does not overflow if Be(n) ≤ B for all n = 0, 1, …, N • Aim is to find the smallest decoder buffer size and the smallest decoder buffer delay

Leaky bucket For a given stream, define: • Minimum bucket capacity for leak rate R and given initial fullness Fe: $B_{\min}(R, F^e) = \max_n B^e(n)$ • Initial decoder buffer fullness: $F^d_{\min}(R) = B_{\min}(R) - F^e_{\min}(R)$ • It follows that a bucket starting with initial fullness Fe = Femin(R) achieves both the minimum capacity B = Bmin(R) and the minimum decoder buffer delay Dd = Fdmin(R) / R • Source coding rate (Rc): the maximum leak rate R such that a leaky bucket (R, B, Fe) does not underflow with initial fullness Fe = Femin(R) • larger leak rates R => smaller required capacity
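The leaky bucket definitions can be made concrete with a short simulation. The sketch assumes the usual recursion Be(n) = Fe(n) + b(n) and Fe(n+1) = max(0, Be(n) - R[τ(n+1) - τ(n)]); the function names and the sample frame sizes are illustrative only.

```python
def leaky_bucket_trace(bits, times, rate, initial_fullness):
    """Return [(Fe(n), Be(n))] for each frame n.

    bits[n]  : b(n), bits of frame n added to the bucket
    times[n] : tau(n), time at which frame n is added
    rate     : R, leak rate in bits per unit time
    """
    trace = []
    fullness = initial_fullness
    for n, b in enumerate(bits):
        after = fullness + b                      # Be(n) = Fe(n) + b(n)
        trace.append((fullness, after))
        if n + 1 < len(bits):
            drained = rate * (times[n + 1] - times[n])
            fullness = max(0, after - drained)    # bucket cannot go below empty
    return trace


def min_bucket_capacity(bits, times, rate, initial_fullness):
    """Smallest capacity B that the stream never overflows: max over n of Be(n)."""
    return max(after for _, after in
               leaky_bucket_trace(bits, times, rate, initial_fullness))


frame_bits = [3000, 1000, 1000, 3000]
frame_times = [0, 1, 2, 3]
capacity_fast = min_bucket_capacity(frame_bits, frame_times, 2000, 0)
capacity_slow = min_bucket_capacity(frame_bits, frame_times, 1000, 0)
```

For this toy stream the faster leak rate needs the smaller bucket, matching the last bullet of the slide.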

Leaky bucket • If transmission rate R > source coding rate Rc • required decoder buffer size is reduced • decoder buffer delay is also reduced • client can determine the required buffer size and preroll delay • using the functions Bmin(R) and Fdmin(R) • computed offline at a set of transmission rates R1 < R2 < · · · < RL • stored in the bit stream header as a set of leaky bucket parameters (Ri, Bi, Fi) • where Bi = Bmin(Ri) and Fi = Fdmin(Ri) • each i = 1, …, L is a breakpoint of the piecewise linear functions Bmin(R) and Fdmin(R) • at any other rate R, Bmin(R) and Fdmin(R) can be estimated by linear interpolation (and extrapolation at the ends) Figure 14.7

Leaky bucket Linear interpolation of Bmin(R) and Fdmin(R)
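A client-side estimate of the required buffer size and preroll delay might look like the sketch below. It assumes the header carries at least two (Ri, Bi, Fi) triples sorted by rate, and it extrapolates linearly beyond the ends as the slide describes; the function names and the sample table are hypothetical.

```python
from bisect import bisect_right


def interpolate_bucket(points, rate):
    """Estimate (Bmin(rate), Fdmin(rate)) from leaky bucket triples.

    points: list of (R_i, B_i, F_i) sorted by increasing R_i, length >= 2.
    Rates outside [R_1, R_L] are extrapolated from the nearest segment.
    """
    if len(points) < 2:
        raise ValueError("need at least two leaky bucket triples")
    rates = [p[0] for p in points]
    # Index of the upper end of the segment used, clamped to a valid segment.
    hi = min(max(bisect_right(rates, rate), 1), len(points) - 1)
    (r0, b0, f0), (r1, b1, f1) = points[hi - 1], points[hi]
    w = (rate - r0) / (r1 - r0)
    return b0 + w * (b1 - b0), f0 + w * (f1 - f0)


def preroll_delay(points, rate):
    """Preroll delay Fdmin(R) / R at the given transmission rate."""
    _, fullness = interpolate_bucket(points, rate)
    return fullness / rate


header = [(1000, 8000, 6000), (2000, 4000, 2000)]   # illustrative values
buffer_bits, start_fullness = interpolate_bucket(header, 1500)
```

Linear extrapolation can produce unrealistically small (even negative) values far above the highest stored rate, so a real client would probably clamp the result.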

Compound streams (Section 14.3.2) • Compound streams encapsulate many streams meant to be played and streamed concurrently • view them as a single compound stream with its own set of leaky buckets • a compound leaky bucket (R,B,F) is the sum of its component leaky buckets • e.g. if audio has bucket (Ra,Ba,Fa) and video has bucket (Rv,Bv,Fv), then the parameters sum: • R = Ra + Rv • B = Ba + Bv • F = Fa + Fv • Find a combination of component leaky buckets such that the combined leaky bucket won’t overflow

Compound streams • Find a combination of component leaky buckets such that the combined leaky bucket won’t overflow • a combination pairs an index i from the La audio buckets with an index j from the Lv video buckets • minimising with a Lagrangian shows that at most La + Lv index pairs (i, j) lie in the resulting set • this extends to M concurrent media streams

Multibit rate (MBR) • multiple independent encodings (each at a different source coding rate) are stored in separate streams (in the same file) • choice of which streams to play • encodings of the same medium are mutually exclusive • streams are combined across media (e.g. Na audio and Nv video streams), each with a different leaky bucket • most of the Na × Nv combinations are unlikely to be used; typically about Na + Nv are • use the distortion-rate approach to choose them

Distortion-rate approach Decide which streams to pair • assign a distortion Dia and source coding rate Ria to each audio stream i = 0… Na • assign a distortion Djv and source coding rate Rjv to each video stream j = 0… Nv • For each combined stream (i,j), define a combined distortion (a weighted sum of the audio and video distortions) and a combined source coding rate (Ria + Rjv) • where α is an arbitrary weight relative to video distortion • using a Lagrangian again, can find the lowest total distortion among all combinations with the same or lower total bit rate • this extends to other sets of media
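A minimal sketch of the Lagrangian pairing follows. It assumes the combined distortion is α·Da + Dv (placing α on the audio term is an assumption made here for illustration) and the combined rate is Ra + Rv; the function names and the sample distortion/rate values are hypothetical. Because the cost D + λR is then a sum of an audio term and a video term, each stream can be chosen independently for a given multiplier λ.

```python
def best_pair(audio, video, lam, alpha=1.0):
    """Pick the (audio index, video index) minimising D + lam * R.

    audio, video: lists of (distortion, rate) per stream.
    Assumed cost: (alpha * D_a + D_v) + lam * (R_a + R_v), which separates
    into independent audio and video minimisations.
    """
    i = min(range(len(audio)),
            key=lambda k: alpha * audio[k][0] + lam * audio[k][1])
    j = min(range(len(video)),
            key=lambda k: video[k][0] + lam * video[k][1])
    return i, j


def candidate_pairs(audio, video, lambdas, alpha=1.0):
    """Distinct pairs selected while sweeping the multiplier, in sweep order."""
    pairs = []
    for lam in lambdas:
        pair = best_pair(audio, video, lam, alpha)
        if pair not in pairs:
            pairs.append(pair)
    return pairs


audio_streams = [(10, 16), (4, 64), (2, 128)]        # (distortion, rate)
video_streams = [(100, 100), (40, 300), (20, 700)]
pairs = candidate_pairs(audio_streams, video_streams,
                        [0, 0.01, 0.04, 0.1, 0.2, 1.0])
```

Sweeping λ from small to large moves from the highest-rate pair to the lowest-rate pair while changing only one medium at a time, which is why far fewer than Na × Nv combinations show up in this example.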

Temporal coordinate systems and timestamps (Section 14.3.4) • Each frame has a decoding timestamp (DTS), in MPEG terminology • instructs the client when to decode it • also acts as a decoding deadline • a presentation buffer holds decoded frames before the renderer • each frame is assigned a presentation timestamp (PTS), which instructs the client when to play it • critical in synchronising different streams • PTS are a layer above the DTS • Note that presentation order ≠ decoding order • e.g. I0, B1, B2, P3, B4, B5, P6, ... (presentation order) I0, P3, B1, B2, P6, B4, B5, ... (decoding order) • frames are assumed to be time stamped with both DTS and PTS • the book uses only DTS
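The reordering in the example can be reproduced with a few lines of code. The sketch assumes a classic GOP structure in which each B frame depends only on the nearest preceding and following I or P frame, and that frame labels begin with their type letter; the function name is hypothetical.

```python
def decoding_order(presentation):
    """Reorder frame labels from presentation order to decoding order.

    A B frame cannot be decoded until the following anchor (I or P) frame
    is available, so each anchor is emitted ahead of the B frames that
    precede it in presentation order.
    """
    order, waiting_b = [], []
    for frame in presentation:
        if frame.startswith("B"):
            waiting_b.append(frame)
        else:
            order.append(frame)        # anchor frame goes first
            order.extend(waiting_b)    # then the B frames that needed it
            waiting_b = []
    order.extend(waiting_b)            # trailing B frames, if any
    return order


decoded = decoding_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"])
```

The result matches the decoding order listed on the slide: I0, P3, B1, B2, P6, B4, B5.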

Clocks (temporal coordinate system) • media time τ: clock of the device used to capture and timestamp the original content (real time) • client time t: clock of the device playing the content • e.g. • τDTS(0), τDTS(1), etc. • tDTS(0), tDTS(1), etc. • Converting is done by $t = t_0 + (\tau - \tau_0)/v$ • where • v is the playback rate (v=2 => playing at 2x speed) • t0 and τ0 are the times of a common initial event (the first frame after seeking/rebuffering)
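The conversion between the two clocks can be sketched as follows, assuming the linear mapping t = t0 + (τ - τ0)/v between media time and client time; the function name and the sample numbers are illustrative.

```python
def media_to_client_time(tau, tau0, t0, v):
    """Map a media timestamp tau to client time.

    tau0, t0 : media time and client time of the common initial event
    v        : playback rate (2.0 plays at twice normal speed)
    """
    if v <= 0:
        raise ValueError("playback rate must be positive")
    return t0 + (tau - tau0) / v


# Frame stamped 10 s into the media, playback started at client time 100 s.
normal = media_to_client_time(10.0, 0.0, 100.0, 1.0)
double = media_to_client_time(10.0, 0.0, 100.0, 2.0)
```

At double speed the same frame is due after 5 seconds of client time rather than 10.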

Leaky bucket update • In client time the leaky bucket update becomes $F^e(n+1) = \max\{0,\ B^e(n) - R'[t_{DTS}(n+1) - t_{DTS}(n)]\}$ • where • R´ = Rv is the arrival rate of bits into the client (unit: bits per second of client time) • R = R´/v is the rate that must be used to compute the required buffer size Bmin(R) and the initial decoder buffer fullness Fdmin(R) • preroll delay is Fdmin(R)/R´ = Fdmin(R)/Rv • larger playback speed => smaller preroll delay

Packet networks - 14.4 • RC: source coding rate • RS: sending rate - rate at which data is injected into the transport layer • measured in bits/s of client time • RX: transmission rate - rate at which data is injected into the network layer (below TCP or UDP) • RX - RS = error control overhead • RS / RX = channel coding rate • Ra: arrival rate • assumed to equal RS • usually set to Ra = vRc • Decoupling Rc and Ra has advantages Figure 14.8a

Decoupling Ra = vRc • Adjusting the source coding rate is the problem of source coding rate control • Choose Rc as a function of Ra • Change client buffer duration and history • Streams are available at a variety of average bit rates R(1), R(2), … • Each with a tight buffer tube (R(i),B(i),Fe(i)) • Playback can be delayed to guarantee continuous playback

Control theoretic model - 14.4.2.1 • Client buffer - gap between frame arrival time ta(n) and its playback deadline td(n) • Overflow when gap too large • Underflow when gap too small • If gap shrinks, must reduce Rc to adjust tb(n) Figure 14.9

Control Objective - 14.4.2.2 • Underflow is prevented as described in the previous section • Quality fluctuates with the complexity of the content • Target schedule has a margin of safety • Introduces penalties in the cost function for • deviation of the buffer tube from the target schedule • coding rate differences between successive frames

Target schedule design - 14.4.2.3 • Want the smallest client buffer duration • Start with a small delay, then increase the gap • Slope is the ratio of the average source coding rate to the average arrival rate • If the upper bound aligns with the target schedule • tb(n) = tT(n) • Eventually want logarithmic growth of the buffer Figure 14.10

Controller design - 14.4.2.4 • Adjusts the source coding rate • At time n the controller must choose the coding rate for frame n+2 • Uses the notion of an error e(n) and a feedback gain vector G • The optimal gain G* is solved for

Controller interpretation - 14.4.2.6 • A virtual frame rate is used to reduce the feedback rate, and because it is difficult to specify a frame rate for merged streams • Start with a source coding rate of 1/2 the arrival rate to build up the client buffer duration Figure 14.11a
