
Scaling and Fault Tolerance for Distributed Messages in a Service and Streaming Architecture

This thesis proposal discusses the motivation, goals, and example applications of an architecture that aims to provide fault tolerance, scalability, and dynamic replay service in a distributed messaging system. The proposal also includes a literature survey, research issues and tasks, milestones, typical scenarios, tests, and contributions.


Presentation Transcript


  1. Scaling and Fault Tolerance for Distributed Messages in a Service and Streaming Architecture • Thesis Proposal • Hasan Bulut • hbulut@cs.indiana.edu

  2. Outline • Motivation • Goals of the Architecture & Example Applications • Literature Survey • Research Issues and Tasks • Milestones • Typical Scenarios • Tests • Contributions • Summary

  3. Motivation • Collaboration systems enable people to collaborate with each other. However, these systems still have various open research issues. Some of them are: • A more fault tolerant system • A distributed and replicated archiving system • An architecture or framework to cope with network failures • A mechanism to recover from failures while a session is being recorded • Playback is available only after the session is over • Playback mechanism for live sessions

  4. Motivation • An architecture or framework to recover late or broken clients • Late clients will miss parts of the session that have already passed • Extending services to unicast clients • What happens if the multicast feature is disabled on the network? • Support for heterogeneous clients • Support for videoconferencing clients (e.g. H.323 clients) and streaming clients (e.g. RealOne player) • Support for desktops and mobile devices such as cellular phones.

  5. Goals of the Architecture • A service oriented architecture • Provide RTSP (Real Time Streaming Protocol) semantics • Compatible with Web Services standards and technologies • Persistent and fault tolerant architecture • A distributed and replicated archiving system in a messaging system environment • Dynamic replay service: the ability to switch among distributed replay services in case of node failures • Scalable architecture • Allow a large number of clients to connect to the system. • Allow heterogeneous (different types of) clients to connect to the system

  6. Goals of the Architecture • Provide a flexible and extendable framework for new services • Allow instant replay of streams. With this feature, it would be possible to annotate streams • Improve Quality of Service (QoS) • Time ordering of events • Maintaining the time spacing between consecutive events • Enable late and broken clients to receive the past events (streams) • A generic architecture that can work with any collaboration tool, such as audio/video, whiteboard, text chat etc.

  7. Example Applications • Consider a late client joining a live audio/video session. This client has three options: • Ignore the missed stream. • Play the missed stream in a faster mode until he/she catches up with the live stream. • Play the stream from the beginning and follow the live session from behind. • The stream is not necessarily a video stream. It can be events from shared displays/applications such as whiteboards or from other collaboration tools. • A client can play a 2-hour long archived stream in 30 min (scaling a 2-hour stream to a 30-min stream).
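
The faster-mode and time-scaling options on this slide reduce to simple arithmetic. The following is a minimal sketch, not part of the proposed architecture: the function names are hypothetical, and it assumes a constant playback speed and a live stream that advances in real time.

```python
def playback_speed(stream_seconds: float, target_seconds: float) -> float:
    """Speed factor needed to play `stream_seconds` of media in `target_seconds`."""
    if stream_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    return stream_seconds / target_seconds


def catch_up_time(lag_seconds: float, speed: float) -> float:
    """Wall-clock seconds a late client needs to reach the live position.

    While the client plays at `speed`, the live stream keeps advancing at 1x,
    so the gap closes at (speed - 1) stream-seconds per wall-clock second.
    """
    if speed <= 1.0:
        raise ValueError("speed must exceed 1.0 to catch up with a live stream")
    return lag_seconds / (speed - 1.0)
```

Under these assumptions, a 2-hour stream played in 30 minutes needs a 4x speed factor, and a client that joins 10 minutes late and plays at 1.5x reaches the live position after 20 minutes.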

  8. Literature Survey • Collaboration systems • Access Grid, InSORS, VRVS, Web based collaboration tools (WebEx, Centra) • Archiving and replay services used in collaboration systems • Voyager, IG Recorder • Streaming media standards • SMIL, RTSP (RFC 2326), RTP/RDT, data types such as H.261, H.263, MPEG-4, RealMedia • XGSP – XML Based General Session Protocol; GlobalMMCS • NaradaBrokering - Distributed messaging infrastructure

  9. Collaboration Systems • Access Grid (AG) • Uses Internet2 multicast for audio/video transmission. • Voyager: Open source archiving tool used to record audio/video streams in MBONE sessions. • InSORS: Can be viewed as a commercial version of AG. • IG Recorder • Similar to Voyager, it records audio/video streams as well as other data streams (e.g. PowerPoint slides) in AG sessions. • VRVS • Provides some integration of different A/V endpoints. • No information is available about its archiving system. • WebEx / Centra: Web based collaboration systems. • Recording and playback are done in the traditional way; the session is recorded to local storage.

  10. Streaming Media Standards • RTSP – Real Time Streaming Protocol • NOT a transport protocol. • VCR-like control protocol over media. • Stateful server-client communication. [Figure: RTSP States. States: Init, Ready, Playing / Recording. Transitions: SETUP, PLAY / RECORD, PAUSE, TEARDOWN.]
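
The RTSP states and methods named on this slide can be sketched as a small state machine. This is a simplified illustration based on the state tables in RFC 2326, not the implementation proposed here: the class and table names are hypothetical, and repeated SETUP or PLAY requests within a state are left out.

```python
from enum import Enum


class State(Enum):
    INIT = "Init"
    READY = "Ready"
    PLAYING = "Playing"
    RECORDING = "Recording"


# (current state, RTSP method) -> next state; simplified from RFC 2326.
TRANSITIONS = {
    (State.INIT, "SETUP"): State.READY,
    (State.READY, "PLAY"): State.PLAYING,
    (State.READY, "RECORD"): State.RECORDING,
    (State.PLAYING, "PAUSE"): State.READY,
    (State.RECORDING, "PAUSE"): State.READY,
    (State.READY, "TEARDOWN"): State.INIT,
    (State.PLAYING, "TEARDOWN"): State.INIT,
    (State.RECORDING, "TEARDOWN"): State.INIT,
}


class RtspSession:
    """Tracks the state of one RTSP session (hypothetical helper)."""

    def __init__(self):
        self.state = State.INIT

    def handle(self, method: str) -> State:
        """Apply a method; reject it if it is not valid in the current state."""
        key = (self.state, method)
        if key not in TRANSITIONS:
            raise ValueError(f"{method} is not valid in state {self.state.value}")
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table makes the stateful nature of the protocol explicit: a PLAY request is only meaningful after SETUP has moved the session to Ready.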

  11. Streaming Media Standards • SMIL - Synchronized Multimedia Integration Language • “An XML-based language that allows authors to write interactive multimedia presentations” • Multiple streams can be presented in a synchronized timeline. • Real Time Transport Protocol – RTP • Usually used in conjunction with RTCP. • RTSP server can deliver media data using RTP • RealNetworks’ Data Transport – RDT • RealNetworks’ proprietary standard to deliver media. • Can be used over UDP or TCP • Data types • H.261, H.263, JPEG, etc. (mostly used in VC systems) • RealMedia, MPEG, etc. (mostly used in RTSP streaming clients)

  12. Streaming Servers • Streaming servers are implementations of RTSP. The level of RTSP support may vary. • Helix Streaming Server • Streaming server from RealNetworks • Open source version has limited capability. Formats: RealMedia, MP3 • Commercial version provides live archiving to local storage (as media files). Formats: RealMedia, MP3, MPEG-4, QT and WM • Darwin Streaming Server • Open source streaming server from Apple. • Supports QT format. • Archives the session to local storage (as media files)

  13. XML Based General Session Protocol (XGSP) • XGSP is a conference control framework. • The goal of XGSP is to integrate heterogeneous systems into one collaboration system. • Includes three components: user session management, application session management and floor control. • SIP is a non-XML text-based signaling protocol for Internet conferencing, telephony and instant messaging • GlobalMMCS: A prototype system to verify and refine the XGSP conference control framework. • An XGSP media server • H.323, SIP gateways and Real Servers for A/V clients • XGSP A/V Session Server • The web server

  14. NaradaBrokering (NB) • Virtualizes communication transport and endpoints • UDP, TCP, Multicast, SSL, ... • Based on a distributed network of cooperating broker nodes. (Brokers support a software overlay network.) • Efficiently routes (content or endpoint-based) information from producers to consumers of content. • Subscriptions can be based on SQL, regular expressions and XPath queries. • Has been deployed and tested in the context of multimedia conferencing and Grid applications. • Introduces delays on the order of one to two milliseconds at each broker
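
To illustrate what a regular-expression subscription means in a publish/subscribe system, here is a toy in-process broker. It is an assumption-laden sketch only: `ToyBroker`, its methods and the topic names are hypothetical and do not reflect the NaradaBrokering API, and there is no networking or broker overlay.

```python
import re


class ToyBroker:
    """In-process stand-in for a broker; not the NaradaBrokering API."""

    def __init__(self):
        self._subscriptions = []  # list of (compiled pattern, callback)

    def subscribe(self, topic_pattern: str, callback) -> None:
        """Register a callback for every topic fully matching the regex."""
        self._subscriptions.append((re.compile(topic_pattern), callback))

    def publish(self, topic: str, event) -> int:
        """Deliver the event to matching subscribers; return the delivery count."""
        delivered = 0
        for pattern, callback in self._subscriptions:
            if pattern.fullmatch(topic):
                callback(topic, event)
                delivered += 1
        return delivered


# Usage: an archiver that records every stream of one (hypothetical) session.
received = []
broker = ToyBroker()
broker.subscribe(r"session/42/.*", lambda topic, event: received.append((topic, event)))
broker.publish("session/42/video", b"frame-1")   # matches the pattern
broker.publish("session/7/video", b"frame-2")    # does not match
```

A pattern-based subscription lets one archiving service follow all streams of a session without knowing each topic name in advance.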

  15. Research Issues • We need to research the capabilities/services that must exist in a messaging system to achieve a higher quality of service (QoS) for the archiving and replay service • Effect of • Timestamping events using NTP on achieving synchronization among streams • Time ordering of events using the buffering service and time-spaced release of events using the time differential service on stream quality. • A metadata management service for archiving and replay • How to build a session catalog to describe information regarding the streams in the session • How to manage messaging system topics for RTSP sessions • How to expose this service as a web service
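
Timestamping events consistently across machines depends on each node knowing its clock offset from a reference. The sketch below shows the standard NTP offset and round-trip delay calculation from four timestamps. It is not taken from this proposal, the function name is hypothetical, and like NTP itself it assumes the network delay is roughly symmetric.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Standard NTP calculation from one request/response exchange.

    t1: client clock when the request is sent
    t2: server clock when the request is received
    t3: server clock when the reply is sent
    t4: client clock when the reply is received
    Returns (offset of the server clock relative to the client, round-trip delay).
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

An entity would add the estimated offset to its local clock reading before stamping an event, so that events from different producers can be ordered on a common timeline.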

  16. Research Issues • Improving fault tolerance of the system • Redundancy in archiving/replay services • How to provide continuity of the stream in case of a replay service node crash • How the replay service can leverage fault tolerance • Scalable replay service • How many requests a replay service can support • Load balancing among replay services • Effect of network threshold • Supporting different types of clients with different capabilities • Other research issues • Systematic applications of major and minor event concepts in event driven systems • How to expose RTSP semantics as a web service • Synchronization of replaying multiple streams

  17. Research Tasks • RTSP semantics support in XGSP (service oriented architecture) • How RTSP clients can join XGSP sessions • An RTSP to XGSP signaling gateway • How XGSP will support RTSP clients • RTSP semantics support in NB (messaging system) • How to support active replay (play, pause, rewind, forward, absolute positioning, etc.) for both live and archived streams • Instant replay • How to support and provide seeking capability in live streams • Current RTSP servers do not support rewind in live streams • Changes to NB archive and replay service to support RTSP semantics • Do we need extensions to RTSP?

  18. Milestones I • NB Time Service • An implementation of Network Time Protocol (RFC 1305) • Entities generating events in the system should utilize Time Service to timestamp the events. • NB Buffering Service • The goal is to time-order events. • Delay introduced by the buffer service can vary based on the above parameter values. • Time Differential Service • Releases events preserving the time spacing between events. • Streaming Gateway • Transcodes audio/video streams into RealMedia format. • Targets both desktop PCs and cellular phones • Stream conversion is a CPU intensive application
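
The buffering and time differential services described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the NB implementation: the names `ReorderBuffer`, `hold_ms` and `release_schedule` are hypothetical, timestamps are assumed to come from synchronized clocks, and a single hold duration stands in for whatever parameters the real buffer uses.

```python
import heapq
import itertools


class ReorderBuffer:
    """Holds events briefly and releases them in timestamp order."""

    def __init__(self, hold_ms: int):
        self.hold_ms = hold_ms
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal timestamps

    def add(self, timestamp_ms: int, payload) -> None:
        heapq.heappush(self._heap, (timestamp_ms, next(self._seq), payload))

    def release(self, now_ms: int):
        """Return events with timestamp <= now_ms - hold_ms, oldest first."""
        released = []
        while self._heap and self._heap[0][0] <= now_ms - self.hold_ms:
            timestamp_ms, _, payload = heapq.heappop(self._heap)
            released.append((timestamp_ms, payload))
        return released


def release_schedule(timestamps_ms, start_ms: int):
    """Map time-ordered event timestamps to release times with identical spacing."""
    if not timestamps_ms:
        return []
    first = timestamps_ms[0]
    return [start_ms + (ts - first) for ts in timestamps_ms]
```

The hold duration is the trade-off knob in this sketch: a longer hold tolerates more out-of-order arrival but adds the same amount of delay to every event.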

  19. Milestones II • NB Replay Service • Should provide an API to support RTSP semantics. • RTSP Media/Topic Manager • Binds RTSP sessions to related NB topics. • XGSP Archive Manager • Provides RTSP RECORD semantics to start archiving of topics. • Session Metadata Service • Metadata service for the archiving system. • RTSP Server / Proxy • Ability to dynamically locate replay and archiving services. • Ability to switch between replicas. • We will apply these to the e-sports project

  20. Typical Scenario for Live Streaming and Recording • 1: XGSP client sends and receives RTP packets • 2, 3: Archiving service subscribes to the topic and records the sessions on different storages. • 4: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 5: RTSP client receives the stream from the topic. [Figure: Producer (XGSP client, MBONE tools, ...), NB broker with topic X, two Replay/Archiving Services each with NB Stable Storage, RTSP Server/Proxy, RTSP Client. Legend: two-way NB link; one-way NB link that carries stream; local storage access; communication channel; topic.]

  21. Typical Scenario for Live Streaming and Recording (with stream conversion) • 1: XGSP client sends and receives RTP packets • 2: Streaming Gateway (SG) subscribes to the stream topic and receives the stream • 3: SG publishes the stream over NB link • 4, 5: Archiving service subscribes to the topic and records the sessions on different storages. • 6: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 7: RTSP client receives the stream from the topic. [Figure: Producer (XGSP client, MBONE tools, ...), Streaming Gateway, NB broker with topic X, two Replay/Archiving Services each with NB Stable Storage, RTSP Server/Proxy, RTSP Client. Legend: two-way NB link; one-way NB link that carries stream; local storage access; communication channel; topic.]

  22. Typical Scenario for Playback • 1: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 2: Stream is published by replay service • 3: Alternate stream to 2 • 4: RTSP client receives the stream from the topic. [Figure: two Replay/Archiving Services each with NB Stable Storage, NB broker with topic X, RTSP Server/Proxy, RTSP Client. Legend: one-way NB link; one-way NB link that carries stream; communication channel; topic.]
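
The alternate stream in step 3 suggests a failover path: if the replay service publishing the stream fails, another replica continues from where the first one stopped. The following is a minimal in-process sketch of that idea, not the proposed mechanism. `ToyReplayService`, `replay_with_failover` and the `fail_after` crash simulation are hypothetical, and it assumes every replica holds the same time-ordered events.

```python
class ReplicaUnavailable(Exception):
    pass


class ToyReplayService:
    """Stand-in for a replay service holding a copy of the archived events."""

    def __init__(self, name, events, fail_after=None):
        self.name = name
        self.events = events          # list of (timestamp_ms, payload), time-ordered
        self.fail_after = fail_after  # simulate a crash after this many deliveries
        self._served = 0

    def read_from(self, after_ts):
        """Yield events with timestamp greater than after_ts."""
        for ts, payload in self.events:
            if ts <= after_ts:
                continue
            if self.fail_after is not None and self._served >= self.fail_after:
                raise ReplicaUnavailable(self.name)
            self._served += 1
            yield ts, payload


def replay_with_failover(replicas, start_after_ts=-1):
    """Yield (replica name, timestamp, payload), switching replicas on failure."""
    last_ts = start_after_ts
    for replica in replicas:
        try:
            for ts, payload in replica.read_from(last_ts):
                last_ts = ts
                yield replica.name, ts, payload
            return  # stream finished normally
        except ReplicaUnavailable:
            continue  # resume on the next replica after the last delivered event
    raise ReplicaUnavailable("all replicas failed")
```

Tracking the timestamp of the last delivered event is what lets the second replica continue without gaps or duplicates in this sketch; a real system would also need to detect the failure, which is not modeled here.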

  23. Typical Scenario for Instant Replay • 1: XGSP client sends and receives RTP packets • 2: Archiving service subscribes to the topic and records the sessions. • 3: RTSP client communicates with RTSP server/proxy and establishes an RTSP session. • 4: RTSP client receives the stream from the topic. • 5: RTSP client communicates with RTSP server/proxy for instant replay. • 6: Replay service publishes the archived stream to a topic • 7: RTSP client receives the archived stream. [Figure: Producer (XGSP client, MBONE tools, ...), NB broker with topics, Replay/Archiving Service with NB Stable Storage, RTSP Server/Proxy, RTSP Client. Legend: two-way NB link; one-way NB link that carries stream; local storage access; communication channel; topic.]
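
Instant replay requires seeking into a stream that is still being recorded. One way to picture this is an archive that appends live events and can be read from any earlier timestamp. The sketch below is a hypothetical illustration only: `StreamArchive` is not an NB class, storage is an in-memory list, and events are assumed to arrive already time-ordered.

```python
import bisect


class StreamArchive:
    """Append-only archive of a live stream that supports seeking by timestamp."""

    def __init__(self):
        self._timestamps = []
        self._payloads = []

    def record(self, timestamp_ms: int, payload) -> None:
        """Append a live event; events must arrive in timestamp order."""
        if self._timestamps and timestamp_ms < self._timestamps[-1]:
            raise ValueError("events must be recorded in timestamp order")
        self._timestamps.append(timestamp_ms)
        self._payloads.append(payload)

    def replay_from(self, seek_ms: int):
        """Return all recorded events with timestamp >= seek_ms."""
        start = bisect.bisect_left(self._timestamps, seek_ms)
        return list(zip(self._timestamps[start:], self._payloads[start:]))
```

Because recording and reading share one ordered structure, a client can rewind into the recent past while the archive keeps growing with new live events.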

  24. Tests • NB Time Service tests on several machines. • Time differential service performance test. • Measuring number of clients that can be supported by a single replay service and storage. • Measuring client scalability • Measuring latency of recovery from failures • How long will it take to dynamically switch between replay services during a node failure (node that provides the replay service)? • How long will it take for an archiving node to recover the missed events?

  25. Contribution of this Thesis • Combines the benefits of RTSP with a distributed messaging system and a service oriented architecture for archiving and replay in a geographically distributed large network • A fault tolerant architecture for collaboration systems • Enables late and broken clients to receive missed streams • An architecture for instant replay of live streams • A scalable replay architecture that benefits from the advantages of service oriented architecture and messaging systems • Support for heterogeneous clients

  26. Summary • This thesis addresses the following open research issues in collaboration systems • A framework for fault tolerance: • Support for late or broken clients in live sessions. • Distributed archiving/replay system • Support for different clients: Research extension of architectures to support different clients with different capabilities, e.g. cellular phone clients. • Client scalability: Research extension of architectures to support as many clients as possible. Centralized servers support a limited number of clients. • An instant replay mechanism for live streams.
