RRAPID: Real-time Recovery based on Active Probing, Introspection, and Decentralization


Takashi Suzuki

Matthew Caesar

Motivation
  • Today’s internet core has bursty losses
    • Backbones have low average loss rates (<0.2%), but experience large bursts in loss
    • Loss durations vary from 10ms to 33.72sec
    • 6 out of 7 providers experienced large outages lasting 10-220 sec, 1-2 times per day
  • Difficult for multimedia applications to recover from repeated loss (e.g. with FEC)
  • Commonly used restoration techniques insufficient
    • Link layer recovery, MPLS not yet uniformly deployed
    • RON too slow (20 sec), not scalable
  • Hence, real-time recovery is desired
  • “Assessment of VoIP Quality over Internet Backbones,” Markopoulou, Tobagi, Karam (INFOCOM 2002)
Approach
  • RRAPID: Real-time Recovery based on Adaptive Probing, Introspection, and Dampening
  • Technique: Overlay based, real-time recovery
    • Use Link-state routing
    • Determine link cost from packet receipt delay
    • Adaptively dampen route advertisements
  • Desirable properties:
    • Speed: Low end-to-end failure time
    • Stability: Few route oscillations
    • Accuracy: Avoid reacting to transient failures
    • Scalability: Low probing/communication overhead
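A minimal sketch of the overlay link-state idea listed above, assuming link costs are measured delays in milliseconds. The topology, node names, cost values, and the `shortest_path` helper are hypothetical and not taken from the slides. It uses Dijkstra's algorithm, the usual shortest-path computation in link-state routing, and shows the route changing once one link's cost is raised.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a dict {node: {neighbor: cost}}; returns (cost, path)."""
    queue = [(0.0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, link_cost in graph[node].items():
            if nbr not in visited:
                heapq.heappush(queue, (cost + link_cost, nbr, path + [nbr]))
    return float("inf"), []

# Hypothetical 3-node overlay; costs are illustrative delays in ms.
overlay = {
    "A": {"B": 40.0, "C": 15.0},
    "B": {"A": 40.0, "C": 15.0},
    "C": {"A": 15.0, "B": 15.0},
}
before = shortest_path(overlay, "A", "B")  # relays through C

# Assumed failure on link A-C: its advertised cost is raised sharply.
overlay["A"]["C"] = overlay["C"]["A"] = 1000.0
after = shortest_path(overlay, "A", "B")   # falls back to the direct link
```

Here the traffic from A to B initially relays through C and moves to the direct link once the A-C cost rises.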
System Architecture: Reaction Mechanism

  • Route Stabilization (RS):
    • Dampens route flaps
  • Adaptive Tracking (AT):
    • Filters noise
    • Reacts quickly to changes
  • Link Cost Estimation (LCE):
    • Estimates failure probability from packet loss
    • “Delay-deficit algorithm”
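The three layers can be pictured as a pipeline in which each probe result passes through LCE, then AT, then RS. The slides do not give the algorithms, so the sketch below uses stand-ins that are assumptions: a sliding-window loss fraction for LCE (not the delay-deficit algorithm), an exponentially weighted moving average for AT, and a two-threshold hysteresis for RS. All class names, window sizes, and thresholds are hypothetical.

```python
from collections import deque

class LinkCostEstimator:
    """Stand-in LCE: fraction of the last `window` probes that were lost."""
    def __init__(self, window=3):
        self.recent = deque(maxlen=window)

    def update(self, probe_lost):
        self.recent.append(1.0 if probe_lost else 0.0)
        return sum(self.recent) / len(self.recent)

class AdaptiveTracker:
    """Stand-in AT: exponentially weighted moving average as a noise filter."""
    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.value = 0.0

    def update(self, sample):
        self.value += self.alpha * (sample - self.value)
        return self.value

class RouteStabilizer:
    """Stand-in RS: hysteresis so the advertised link state does not flap."""
    def __init__(self, down_at=0.6, up_at=0.2):
        self.down_at, self.up_at = down_at, up_at
        self.link_down = False

    def update(self, metric):
        if not self.link_down and metric >= self.down_at:
            self.link_down = True
        elif self.link_down and metric <= self.up_at:
            self.link_down = False
        return self.link_down

def run_pipeline(probe_losses):
    """Feed each probe result through LCE -> AT -> RS; return advertised states."""
    lce, at, rs = LinkCostEstimator(), AdaptiveTracker(), RouteStabilizer()
    return [rs.update(at.update(lce.update(lost))) for lost in probe_losses]

# One isolated loss, then four consecutive losses, then recovery.
trace = [False, True, False, False, True, True, True, True,
         False, False, False, False, False]
states = run_pipeline(trace)
```

With these illustrative settings the isolated loss is filtered out, the sustained run of losses marks the link down, and the link is advertised up again only after several probes succeed.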

Simulation Results: Layered Control
  • Show detailed actions of layers
    • LCE output: metric representing the probability that the link has failed
    • AT output: metric with noise filtered
    • RS output: advertised value for the link
    • Red spikes result from back-to-back packet losses
  • Setup
    • Link Failure at t=[150s-170s]
    • Probe every 300ms, 10% loss
  • Results
    • First detection in 0.92s, next at 5.42s
    • Several false positives due to cold start. Stabilizes in 100s.
    • 0.92s corresponds to 3 lost probes plus propagation delay of 0.02s
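The 0.92s figure can be checked with a short calculation. This sketch assumes detection fires after a fixed number of consecutive lost probes; the function name and parameter names are illustrative, and the values come from the setup above (300ms probe interval, 0.02s propagation delay).

```python
def detection_time(lost_probes, probe_interval_s, propagation_delay_s):
    """Time to detect a failure: consecutive lost probes plus propagation delay."""
    return lost_probes * probe_interval_s + propagation_delay_s

t = detection_time(lost_probes=3, probe_interval_s=0.300, propagation_delay_s=0.02)
print(f"{t:.2f} s")  # prints "0.92 s"
```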
Simulation Results: Reaction Speed
  • Reaction Speed
    • Probing faster improves speed
    • Probe intervals below 400ms can give ~1s reaction times
    • Packet loss degrades reaction time
  • Overhead
    • Probe intervals above 50ms give reasonable overhead
  • Effect of packet loss
    • Increasing packet loss decreases accuracy
    • Advertisements and probes are dropped
    • Subsecond reactions even at 5% loss
Simulation Results: Comparison
  • Compared RRAPID, RON, and “Oracle-based” routing.
  • Results:
    • RON requires 4 to 10x more advertisements than RRAPID
    • RON’s overhead increases exponentially with probe speed; RRAPID’s overhead increases linearly
    • Packet loss has an extreme effect on RON, moderate effect on RRAPID
Emulation Results: Real Internet Workload

  • Method
    • Measured performance on real Internet workload
    • Traces acquired between UIUC and Stanford
    • Emulated 2-path overlay topology, one trace for each path
    • One natural failure at t=[123.4s to 133.7s]; two failures introduced at t=[40s to 50s] and t=[60s to 70s]
  • Result
    • Stable, sub-second reactions

Figure: overlay paths 1 and 2; plotted series are the number of flows on link #1 and on link #2.

Analysis
  • Simplified model of system
    • Modeled RS layer as MIAD
      • Increase by 1, Decrease by 1/k
      • Advertisement threshold limited to n
    • Ignored AT layer effects
  • This yields an n*k-state Markov chain
  • Given:
    • Probe loss probability p
    • Number of paths N
    • Probe interval I
  • We can determine:
    • Speed: Average reaction time
    • Overhead: Average advertisement rate
  • Found best-case expected Overhead and Reaction time for variable transient loss rates.
  • Results
    • Can react quickly, stably for fairly large amounts of transient packet loss
    • Overhead and reaction time increase super-linearly with loss rate
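A minimal sketch of one possible reading of the simplified model, not the authors' analysis. It assumes the RS counter rises by 1 on each lost probe, falls by 1/k on each received probe, and is clamped between 0 and the threshold n, so the counter takes values c/k for integer c. The fraction of time spent at the threshold under transient loss p is used here as a rough proxy for spurious advertisements. Function names and the example parameters (n=3, k=4) are hypothetical.

```python
def stationary_distribution(n, k, p, iterations=5000):
    """Power-iterate the chain; integer state c stands for counter value c/k."""
    top = n * k
    dist = [0.0] * (top + 1)
    dist[0] = 1.0  # start with the counter at zero
    for _ in range(iterations):
        nxt = [0.0] * (top + 1)
        for c, mass in enumerate(dist):
            if mass == 0.0:
                continue
            nxt[min(c + k, top)] += p * mass        # probe lost: counter + 1
            nxt[max(c - 1, 0)] += (1.0 - p) * mass  # probe received: counter - 1/k
        dist = nxt
    return dist

def threshold_fraction(n, k, p):
    """Long-run fraction of probe intervals with the counter at threshold n."""
    return stationary_distribution(n, k, p)[n * k]

def best_case_reaction_time(n, probe_interval_s):
    """During a hard failure every probe is lost, so n probes reach the threshold."""
    return n * probe_interval_s

dist = stationary_distribution(n=3, k=4, p=0.10)
low = threshold_fraction(n=3, k=4, p=0.01)
high = threshold_fraction(n=3, k=4, p=0.10)
reaction = best_case_reaction_time(n=3, probe_interval_s=0.3)
```

Under these assumptions, raising the transient loss probability raises the time spent at the threshold, while the best-case reaction time depends only on n and the probe interval.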
Conclusions
  • Can achieve sub-second reactions on most links with reasonable stability
    • Congested links increase reaction time
    • Can react well on most internet links
  • Trade-off between overhead and reaction speed
  • Lossy links worsen reaction time
    • Hard to react quickly, stably if all paths have >10% loss.
  • Future work:
    • Improve scalability with route aggregation
    • Extend evaluation of system parameters
    • Consider wider range of topologies, cross traffic, offered loads