
Scalable, Robust Wide-area Control Architecture for Integrated Communications

Helen J. Wang. Qualifying Examination, March 8, 2000.


Presentation Transcript


  1. Scalable, Robust Wide-area Control Architecture for Integrated Communications. Helen J. Wang, Qualifying Examination, March 8, 2000

  2. Motivation [Figure labels: Cellular, Pager, PSTN, Internet] • Lack of support for: • Integrated use of heterogeneous devices (old & new) • Rapid arbitrary communication service customization

  3. Limitations of Existing Systems • Telecommunications network: • engineered with one app and device in mind • Existing Internet Telephony systems: • ease of service creation, but limited • scalability, availability and fault tolerance not fully addressed

  4. How good is a communication system? (Dissertation Goals) • Functionality: communication services it can support, and the ease of creating them • Viability: scalability, robustness • Focus on the control aspect: • control architecture = system components + signaling protocol (session setup, tear-down, and control)

  5. Problem Statement • Given heterogeneity, how to design a scalable, robust wide-area control architecture that supports easy creation of a wide range of communication services? And how should these services be created?

  6. Outline • Related Work and Research Contribution • Control Architecture • Signaling Protocol • Service Creation Model • Summary, Methodology, Research Agenda

  7. Related Work

  8. Overview of Research Contributions • A scalable control architecture • A robust signaling protocol • A user-level, easy service creation model • Publications: • “A Signaling System Using Light Weight Sessions,” accepted to Infocom 2000. • Helen J. Wang, et al. “ICEBERG, An Internet-Core Network Architecture for Integrated Communications,” accepted to IEEE Personal Communications, April 2000.

  9. Outline • Related Work and Research Contribution • Control Architecture • Signaling Protocol • Service Creation Model • Summary, Methodology, Research Agenda

  10. Control Architecture: Goals • Any-to-any communication • inter-working, composition of data transformation • Personal mobility • unique ID, name mapping • Personalized communication services • preference storage and management • Enable user-activity driven services • activity tracking

  11. Control Architecture: Components and Their Operations [Figure labels: Alice@domain1, Bob@domain2, dialed 333-2222, Pick up, Data Path; components shown twice: iPOP, IAP, Call Agent, PR, NMS, APC, PAC]

  12. Leverage Cluster Computing Platforms • iPOP must be scalable and robust: leverage cluster computing platforms such as Ninja, AS1 • Our requirements: • highly available service invocation: Ninja Base • fault tolerant service session: AS1 • session state maintained on client (IAP) • iPOP on Ninja Base augmented with client heartbeat support from AS1

  13. Control Architecture: Facts [Figure labels: Access net, IAP, iPOP, Call Agent, PR, PAC, Local area communication, Wide-area communication] • One Call Agent per caller per device • One type of IAP per access network

  14. Outline • Related Work and Research Contribution • Control Architecture • Signaling Protocol • Service Creation Model • Summary, Research Methodology, Agenda

  15. Signaling Protocol • Basic call service: building blocks for supplementary services • Conventional: two party, homogeneous devices • ICEBERG communication model: • multi-device communication • invitation-based participation • a large number of dynamic small-group communications • Richer primitives: add/remove an endpoint during a session • conference call and service handoff are first-class services; services that require endpoint changes become trivial to implement.

  16. Challenges in Signaling: Problems with SIP [Figure labels: Alice, Bob, Carol, Dale; Call Agents CA1 to CA5; messages: Invite Bob, Invite Alice, Invite (also Bob), Invite (also Alice)] • no consideration of session dynamics: membership, component failure • bridged conference: centralized component to maintain states, a single point of failure

  17. Problems with H.323 • Centralized approach for conferencing • Limited fault tolerance measures: • process-pair style • cannot capture new state during fault recovery • Complex

  18. Lessons Learned • Correctness and robustness: • need to maintain up-to-date membership and session state (call parties, device status, data path info) in the face of transient component failures, network partitions, and any exceptional conditions. • distributed approach rather than centralized

  19. Our Approach • Maintain membership and session state as soft state in a distributed fashion. • Soft state: expires unless refreshed; protocol action upon new state or timeout; error recovery is the same as normal operation • Question: call setup latency requirement? bandwidth scalability problems?
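
The soft-state idea on this slide can be illustrated with a minimal Python sketch. It is not the ICEBERG implementation: the class name `SoftStateTable`, the 3-second lifetime, and the key and value strings are hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class SoftStateTable:
    """State that disappears unless it is refreshed within `lifetime` seconds."""
    lifetime: float
    entries: dict = field(default_factory=dict)   # key -> (value, expiry time)

    def refresh(self, key, value, now):
        """Install or refresh an entry; return True when the state is new."""
        old = self.entries.get(key)
        is_new = old is None or old[1] <= now or old[0] != value
        self.entries[key] = (value, now + self.lifetime)
        return is_new          # caller runs its "new state" protocol action

    def expire(self, now):
        """Drop timed-out entries; return their keys for the timeout action."""
        dead = [k for k, (_, expiry) in self.entries.items() if expiry <= now]
        for k in dead:
            del self.entries[k]
        return dead


# Hypothetical usage: one endpoint's state, refreshed once and then lost.
table = SoftStateTable(lifetime=3.0)
first = table.refresh("alice-cell", "connected", now=0.0)    # new state
second = table.refresh("alice-cell", "connected", now=2.0)   # plain refresh
alive = table.expire(now=4.0)      # nothing expired yet (expiry is 5.0)
gone = table.expire(now=5.5)       # refreshes stopped, entry times out
```

Recovery needs no separate code path here: a component that lost its table simply repopulates it from the next round of refreshes.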

  20. Signaling Protocol: Session Membership • Session membership • membership: CAs • IP multicast’s group service is overkill for small-group communication • per-group state in routers, IP address scarcity, deployment issues: access control, accountability • Solution: run an application-level group membership protocol among participating IAPs

  21. Signaling Protocol: Capture the Complete Session State [Figure labels: Comm Session; three Call Agents, each with Session state; IAP, iPOP, APC; Announce/Listen; HB (heartbeat)]

  22. Signaling Protocol: Fault Tolerance [Figure labels: Comm Session; three Call Agents, each with Session state; IAP, iPOP, APC; Announce/Listen; HB (heartbeat)]

  23. Signaling Protocol: Fault Tolerance [Figure labels: Comm Session; three Call Agents, each with Session state; IAP, iPOP, APC; Announce/Listen; HB (heartbeat)]

  24. Signaling Protocol: Fault Tolerance [Figure labels: Comm Session; two Call Agents, each with Session state; IAP, iPOP, APC; Announce/Listen; HB (heartbeat)]

  25. Invitation Protocol • Invite a Call Agent to participate in a session • Also a soft state protocol for robustness: • IAP maintains the call state machine, sends stateful, keep-alive heartbeat to the iPOP • Call Agents advance call state machines on IAPs through periodic install-state messages until receiving a new heartbeat with the new state • Soft state inter-iPOP communication
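
The install-state loop described above can be sketched as a small Python simulation. This is a sketch under assumptions, not the protocol's real message format: the function name `install_state`, the state names, and the per-period delivery flags are all hypothetical.

```python
def install_state(target_state, iap_state, deliveries):
    """Simulate a Call Agent pushing `target_state` to an IAP.

    `deliveries` holds one (install_delivered, heartbeat_delivered) pair per
    soft-state period. Returns (final IAP state, install-state messages sent).
    """
    sent = 0
    for install_ok, heartbeat_ok in deliveries:
        sent += 1                        # periodic install-state message
        if install_ok:
            iap_state = target_state     # IAP advances its call state machine
        # The IAP's stateful heartbeat reports whatever state it now holds.
        if heartbeat_ok and iap_state == target_state:
            return iap_state, sent       # Call Agent saw the new state: stop
    return iap_state, sent


# Hypothetical run: first install lost, then the confirming heartbeat lost.
final_state, messages = install_state(
    "ringing", "idle",
    [(False, True), (True, False), (True, True)],
)
```

Message loss only delays convergence by extra periods; neither side needs a dedicated retransmission or error-handling path.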

  26. Bandwidth Scalability • Soft state period selection: call setup latency, fault recovery time vs. bandwidth overhead • An optimization problem: minimize bandwidth overhead, subject to the following constraints: • expected call setup latency (1.5 seconds) • standard deviation (0.5 second) • fault recovery time (1 and 4 seconds for local and wide area) • parameters: 2% wide-area loss rate, 0.2% local-area loss rate, 2 ms local-area propagation delay, 100 ms wide-area delay • local: 1 sec, 800 bps; wide: 3 sec, 233 bps; for a 64 kbps data stream, local-area control traffic is about 1%
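
As a quick check of the overhead figures quoted on this slide, the following Python sketch divides the control rates by the 64 kbps data rate. The constant and function names are illustrative; only the three rates come from the slide.

```python
DATA_STREAM_BPS = 64_000   # data stream rate from the slide
LOCAL_CONTROL_BPS = 800    # local-area control traffic at a 1 s period
WIDE_CONTROL_BPS = 233     # wide-area control traffic at a 3 s period


def overhead_percent(control_bps, data_bps=DATA_STREAM_BPS):
    """Control traffic as a percentage of the data stream rate."""
    return 100.0 * control_bps / data_bps


local_pct = overhead_percent(LOCAL_CONTROL_BPS)   # 1.25, quoted as about 1%
wide_pct = overhead_percent(WIDE_CONTROL_BPS)     # roughly 0.36
```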

  27. Processing Scalability • Compare our single cluster system against a class 4 switch which is a local (end) office: 250 calls/second • Our current prototype yields 10 calls/second on a PC due to inefficient RMI implementation (10’s ms), 25+ PCs = a class 4 switch
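
The "25+ PCs" estimate follows from dividing the two call rates. The short Python sketch below makes that arithmetic explicit; it assumes throughput scales linearly with the number of PCs, which the slide implies but does not state, and the names are illustrative.

```python
import math

CLASS4_CALLS_PER_SEC = 250      # class 4 switch load quoted on the slide
PROTOTYPE_CALLS_PER_SEC = 10    # measured prototype throughput on one PC


def pcs_needed(target_rate, per_pc_rate):
    """Smallest PC count whose combined rate reaches target_rate (linear scaling assumed)."""
    return math.ceil(target_rate / per_pc_rate)


pcs = pcs_needed(CLASS4_CALLS_PER_SEC, PROTOTYPE_CALLS_PER_SEC)   # 25
```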

  28. Outline • Related Work and Research Contribution • Control Architecture • Signaling Protocol • Service Creation Model • Research Agenda

  29. Service Creation Model • Focus: control, redirection services • Goal: end users can easily customize the control services in any arbitrary way • Issues: • service creation/customization • service invocation • service portability • system support

  30. Intelligent Network • Separate service logic from basic call processing [Figure labels: Switch, Service Logic, Trigger] • Service portability: standardize basic call state machine -> too strict a standard -> failed • Limitation: no user-level customization

  31. Proposed Approach • Customization independent of the call processing implementation: use high-level events, e.g., call request received, callee device busy, callee device not answering • Service creation: condition-action pairs • Condition: conjunction of high-level events, conditions of interest to the user, and boolean expressions • Action: composition of system primitives • Hypothesis: condition-action pairs are sufficient
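
A condition-action pair of this kind could be represented as in the Python sketch below. It is a hedged illustration only: the `Rule` type, the `select_actions` helper, the first-match-wins policy, and the primitive names are assumptions, while the event names paraphrase the examples on the slide.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple


@dataclass
class Rule:
    events: FrozenSet[str]               # conjunction of high-level events
    predicate: Callable[[dict], bool]    # extra boolean expression over user context
    actions: Tuple[str, ...]             # system primitives, applied in order


def select_actions(rules, observed_events, context):
    """Return the actions of the first rule whose whole condition holds."""
    for rule in rules:
        if rule.events <= observed_events and rule.predicate(context):
            return list(rule.actions)
    return []    # no rule matched: fall back to basic call processing


# Hypothetical preferences: send calls from family to the cell phone when busy.
rules = [
    Rule(frozenset({"call_request_received", "callee_device_busy"}),
         lambda ctx: ctx["caller"] in ctx["family"],
         ("redirect_to_cell_phone",)),
    Rule(frozenset({"call_request_received", "callee_device_busy"}),
         lambda ctx: True,
         ("redirect_to_voicemail",)),
]
ctx = {"caller": "carol", "family": {"carol"}}
chosen = select_actions(rules, {"call_request_received", "callee_device_busy"}, ctx)
```

Because rules mention only events and primitives, nothing in them depends on how a particular Call Agent implements its call state machine.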

  32. Proposed Approach: Service Invocation & Portability [Figure labels: Call Agent, Preference Registry, PAC, Condition/Action pairs, event, Activity, check, update] • Service portability: standardize the events and system primitives, which is much easier than standardizing the call state machine

  33. An Example: Completion of calls to busy subscriber • callee busy && caller hang up -> register with callee PAC • callee PAC reject -> exit • callee PAC notify -> invite caller; invite callee • caller busy -> wait 5 minutes; re-register with the callee PAC • hangup time > 1 hour -> de-register with callee PAC; exit

  34. An Example, Cont. • System support issues: • extended Call Agent life time • queue management on the PAC • track event sequence: stack of timed events, stack depth depending on user preferences

  35. How good is a communication system? • Functionality: services • component identification • powerful signaling protocol primitives • easy, user-centric service creation model • Viability: scalability, robustness • first application of soft state to signaling protocol, bandwidth overhead not an issue, can fulfill latency requirements • processing scalability, local area robustness by leveraging cluster computing platforms

  36. Outline • Related Work and Research Contribution • Control Architecture • Signaling Protocol • Service Platform • Methodology and Research Agenda

  37. Methodology: 1st Iteration (Completed) [Figure: Design, Prototype, Analysis, Evaluation cycle] • Control architecture • Session maintenance protocol • Control architecture • Signaling protocol • session maintenance protocol • Measured the current prototype • Simple soft state period analysis

  38. Methodology: 2nd Iteration Overview [Figure: Design, Prototype, Analysis, Evaluation cycle] • Wide-area testbed • Group membership protocol • Invitation protocol • Service creation model • Service creation model • Possibly revise the design of the control architecture and the signaling protocol • Completed work: • invitation protocol • membership protocol • Evaluation: scalability, robustness, service creation, hard/soft state comparison • Analysis: group membership protocol, service creation

  39. Research Agenda • Phase 1: complete and fine-tune service creation model design (1 month) • define events and system primitives • preference conflict resolution • identify service creation interaction with the control architecture and signaling • Planned paper submission on service creation model design to SmartNet 3/31

  40. Research Agenda • Phase 2: 2nd iteration prototyping (3 - 6 months) • invitation protocol, membership protocol • employ Ninja vSpace • release ICEBERG to Ericsson, TU Berlin, NTT and construct a wide-area test-bed • service creation model • Planned paper submission to ICNP (May) or INFOCOM (July) on protocols and analysis

  41. Research Agenda, Cont. • Phase 3: Evaluation (6 months) • processing scalability: measure call processing time, # of simultaneous sessions, compare against class 4 switch • bandwidth scalability: group membership protocol analysis; dynamic soft state period selection • robustness: emulate failure conditions (losses, long delays, component failures), run system over time • hard/soft state comparison: bandwidth usage, latency, fault recovery time

  42. Research Agenda, Cont. • Service creation evaluation: • comparable functionality: implement representative IN services such as “call completion upon busy” • new services such as policy-based call waiting • system extensibility: # of lines of code and amount of time to develop new primitives for new services • Planned paper submission on wide-area testbed experience and evaluation to SIGMETRICS 3/2001

  43. Research Agenda, Cont. • Phase 4: Write thesis (6 months) • compile the publications

  44. Acronyms Lookup • APC: Automatic Path Creation • CA: Call Agent • IAP: ICEBERG Access Point • iPOP: ICEBERG Point of Presence • NMS: Name Mapping Service • PAC: Personal Activity Coordinator • PR: Preference Registry

  45. Soft and Hard State • Soft State: • expires unless refreshed; protocol action upon new state and timeout • loss of state will not stop the system (robust) • eventual consistency • error recovery built into normal operation (simple, but longer latency, and no diagnosis) • Hard State: • explicit state setup once only (bandwidth and processing efficiency) • explicit error detection and recovery, synchronously at involved components (complex but immediate) • better consistency guarantees

  46. Signaling Protocol: Group Membership Protocol • Periodic membership exchange among members • no bootstrapping needed: every member knows at least one other member (invitation-based) • receive superset or disjoint set: immediate synchronization with the rest of the session • run among the IAPs for Call Agent fault recovery • time stamped <IAP, CA> list • Convergence efficiency rather than bandwidth efficiency
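
One way to read the time-stamped <IAP, CA> list exchange is sketched in Python below. The data layout, both function names, and the newest-timestamp-wins merge rule are assumptions made for illustration; the slides do not specify them, and member removal is left out.

```python
def merge_membership(local, received):
    """Merge two {iap: (call_agent, timestamp)} views, keeping the newest entry per IAP."""
    merged = dict(local)
    for iap, (call_agent, stamp) in received.items():
        if iap not in merged or stamp > merged[iap][1]:
            merged[iap] = (call_agent, stamp)
    return merged


def needs_immediate_sync(local, received):
    """True when the received view is a strict superset of, or disjoint from, ours."""
    mine, theirs = set(local), set(received)
    return theirs > mine or mine.isdisjoint(theirs)


# Hypothetical views: iap-b's Call Agent was restarted as ca-7 after a failure.
local = {"iap-a": ("ca-1", 10), "iap-b": ("ca-2", 12)}
received = {"iap-b": ("ca-7", 15), "iap-c": ("ca-3", 11)}
merged = merge_membership(local, received)
urgent = needs_immediate_sync(local, received)   # views overlap only partly
```

Here the merged view picks up both the new member `iap-c` and the replacement Call Agent for `iap-b`, which is how a periodic exchange among IAPs could support Call Agent fault recovery.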

  47. Period Selection • Soft State Period: dominates fault recovery time, affects bandwidth overhead • cannot trade latency for bandwidth scalability • Problem: what period values to select to fulfill the call setup latency, fault recovery latency requirements and minimize the bandwidth overhead? -- an optimization problem

  48. Period Selection: Problem Formulation • Call setup latency = receiving 8 local-area and 4 wide-area msgs in sequence + msg processing time • Receive a local-area msg = f (local-area period, local-area loss rate, local-area propagation delay) • The optimization problem: • find local-area and wide-area periods that minimize bandwidth overhead, subject to the following constraints • E(call setup latency) < 1.5 seconds • Standard deviation (call setup latency) < 0.5 second • local-area fault recovery time < 1 s; wide < 4 s • with parameters: 2% wide-area loss rate, 0.2% local-area loss rate, 2 ms local-area propagation delay, 100 ms wide-area delay
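
The slides do not give the form of f. The Python sketch below uses one simple stand-in, stated here as an assumption and not necessarily the model used in the analysis: a lost message is recovered only by the next periodic refresh, and losses are independent, so the number of extra periods is geometrically distributed. Only the expected value is modeled; the function names are hypothetical.

```python
def expected_receive_time(period_s, loss_rate, delay_s):
    """Expected time to receive a message that is resent every period until it arrives."""
    expected_retries = loss_rate / (1.0 - loss_rate)   # mean of a geometric retry count
    return delay_s + period_s * expected_retries


def expected_setup_latency(local_period_s, wide_period_s, processing_s=0.0):
    """8 local-area plus 4 wide-area messages in sequence, plus processing time."""
    local = expected_receive_time(local_period_s, loss_rate=0.002, delay_s=0.002)
    wide = expected_receive_time(wide_period_s, loss_rate=0.02, delay_s=0.100)
    return 8 * local + 4 * wide + processing_s


# With the periods reported in the deck (1 s local, 3 s wide), before processing time:
latency = expected_setup_latency(1.0, 3.0)   # about 0.68 s under this assumed model
```

Under this stand-in the expected latency stays below the 1.5 second bound, leaving the remainder as a budget for message processing.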

  49. Results: Period = f (processing) • fault recovery time constraints dominate the effects on period • local-area period = 1s • 800 bps overhead • wide-area period = 3s • 233 bps overhead • for 64kbps data stream, 1% * # of members

  50. Proposed Approach: Service Creation [Figure labels: User, GUI, Preference Registry, Condition/Action, Call Agent] • Condition: conjunction of high-level events, conditions of interest to the user, and boolean expressions • Action: sequence of system primitives • Advantage: independent of the call processing implementation • Hypothesis: condition-action pairs are sufficient
