Redundancy in Network Traffic: Findings and Implications

Ashok Anand, Chitra Muthukrishnan, Aditya Akella (University of Wisconsin, Madison) • Ramachandran Ramjee (Microsoft Research Lab, India)

Redundancy in network traffic • Popular objects, partial content matches, headers • Redundancy elimination (RE) for improving network efficiency • Application layer object caching • Web proxy caches • Recent protocol independent RE approaches • WAN optimizers, de-duplication, WAN backups, etc.

Protocol independent RE • Message granularity: packet or object chunk • Different RE systems operate at different granularity (Figure: WAN link)

RE applications • Enterprise and data centers • Accelerate WAN performance • As a primitive in network architecture • Packet Caches [Sigcomm 2008] • Ditto [Mobicom 2008]

Protocol independent RE in enterprises • Globalized enterprise dilemma • Centralized servers • Simple management • Hit on performance • Distributed servers • Direct request to closest servers • Complex management • RE gives benefits of both worlds • Deployed in network middle-boxes • Accelerate WAN traffic while keeping management simple • RE for accelerating WAN backup applications (Figure: Data centers, WAN optimizer, ISP, WAN optimizer, Enterprises)

Recent proposals for protocol independent RE • Reduce load on ISP access links • Improve effective capacity • Packet caches [Sigcomm 2008] • RE on all routers • Ditto [Mobicom 2008] • Use RE on nodes in wireless mesh networks to improve throughput (Figure: RE deployment on ISP access links to improve capacity; labels: Web content, ISP, Enterprises, University)

Understanding protocol independent RE systems • Currently little insight into these RE systems • How far are these RE techniques from optimal? • Are there other better schemes? • When is network RE most effective? • Do end-to-end RE approaches offer performance close to network RE? • What fundamental redundancy patterns drive the design and bound the effectiveness? • Important for effective design of current systems as well as future architectures e.g. Ditto, packet caches

Large scale trace-driven study • First comprehensive study • Traces from multiple vantage points • Focus on packet level redundancy elimination • Performance comparison of different RE algorithms • Average bandwidth savings • Bandwidth savings in peak and 95th percentile utilization • Impact on burstiness • Origins of redundancy • Intra-user vs. Inter-user • Different protocols • Patterns of redundancy • Distribution of match lengths • Hit distribution • Temporal locality of matches

Data sets • Enterprise packet traces (3 TB) with payload • 11 enterprises • Small (10-50 IPs) • Medium (50-100 IPs) • Large (100+ IPs) • 2 weeks • Protocol composition • HTTP (20-55%) • Spring et al. (64%) • File sharing (25-70%) • Centralization of servers • UW Madison packet traces (1.6 TB) with payload • 10000 IPs; trace collected at campus border router • Outgoing /24, web server traffic • 2 different periods of 2 days each • Protocol composition • Incoming, HTTP 60% • Outgoing, HTTP 36%

Evaluation methodology • Emulate memory-bound (500 MB - 4GB) WAN optimizer • Entire cache resides in DRAM (packet-level RE) • Emulate only redundancy elimination • WAN optimizers do other optimizations also • Deployment across both ends of access links • Enterprise to data center • All traffic from University to one ISP • Replay packet trace • Compute bandwidth savings as (saved bytes/total bytes) • Includes packet headers in total bytes • Includes overhead of shim headers used for encoding
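
The savings metric on this slide can be sketched as below. This is a minimal illustration, not the tooling used in the study: the record fields and the 12-byte shim size are assumptions made only for the example.

```python
from dataclasses import dataclass

SHIM_BYTES = 12  # assumed size of one shim (encoding) header; not stated in the slides


@dataclass
class PacketRecord:
    header_len: int     # bytes of packet headers (counted in total bytes)
    payload_len: int    # bytes of payload
    matched_bytes: int  # payload bytes found in the packet cache
    match_regions: int  # encoded regions; one shim per region is an assumption


def bandwidth_savings(records):
    """Return saved bytes / total bytes, counting headers and shim overhead."""
    total = 0
    saved = 0
    for r in records:
        total += r.header_len + r.payload_len
        # Net saving: matched bytes minus the shims that replace them.
        saved += max(0, r.matched_bytes - r.match_regions * SHIM_BYTES)
    return saved / total if total else 0.0
```

Because headers are counted in the denominator, small packets with large headers pull the reported savings down even when their payloads match fully.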

Large scale trace-driven study • Performance comparison of different RE algorithms • Origins of redundancy • Patterns of redundancy • Distribution of match lengths • Hit distribution

Redundancy elimination algorithms • Redundancy suppression across different packets (uses history) • MODP (Spring et al.) • MAXP (new algorithm) • Data compression only within packets (no history) • GZIP and other variants

MODP • Spring et al. [Sigcomm 2000] • Compute fingerprints • Rabin fingerprinting over a sliding window of the packet payload • Value sampling: sample those fingerprints whose value is 0 mod p • Lookup fingerprints in fingerprint table (Figure: Packet payload, Window, Fingerprint table, Packet store, Payload-1, Payload-2)
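
The MODP steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: a polynomial rolling hash stands in for true Rabin fingerprinting, the window size w and sampling period p are arbitrary, and ModpCache is a hypothetical name for a toy fingerprint table plus packet store that only counts hits instead of encoding packets.

```python
BASE = 257
MOD = (1 << 61) - 1  # prime modulus for the stand-in rolling hash (not Rabin)


def rolling_fingerprints(payload, w=32):
    """Yield (offset, fingerprint) for every w-byte window of payload."""
    if len(payload) < w:
        return
    top = pow(BASE, w - 1, MOD)  # weight of the byte that leaves the window
    fp = 0
    for b in payload[:w]:
        fp = (fp * BASE + b) % MOD
    yield 0, fp
    for i in range(w, len(payload)):
        fp = ((fp - payload[i - w] * top) * BASE + payload[i]) % MOD
        yield i - w + 1, fp


def modp_select(payload, w=32, p=32):
    """Value sampling: keep only fingerprints whose value is 0 mod p."""
    return [(off, fp) for off, fp in rolling_fingerprints(payload, w) if fp % p == 0]


class ModpCache:
    """Toy fingerprint table and packet store (hypothetical helper)."""

    def __init__(self, w=32, p=32):
        self.w, self.p = w, p
        self.packets = []  # packet store
        self.table = {}    # fingerprint -> (packet index, offset)

    def process(self, payload):
        """Count sampled fingerprints that hit the table, then store the packet."""
        sampled = modp_select(payload, self.w, self.p)
        hits = 0
        for off, fp in sampled:
            entry = self.table.get(fp)
            if entry is not None:
                pkt, old = entry
                # Verify the bytes so a hash collision is not counted as a match.
                if self.packets[pkt][old:old + self.w] == payload[off:off + self.w]:
                    hits += 1
        index = len(self.packets)
        self.packets.append(payload)
        for off, fp in sampled:
            self.table[fp] = (index, off)
        return hits
```

A real system would additionally expand each hit left and right to find the full matching region and replace it with a shim; that step is omitted here.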

MAXP • Similar to MODP • Only the selection criterion changes • MODP: sample those fingerprints whose value is 0 mod p • May leave no fingerprint to represent a region (shaded in the figure) • MAXP: choose fingerprints that are local maxima (or minima) for p bytes region • Gives uniform selection of fingerprints
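
A minimal sketch of the MAXP selection rule, assuming "local maximum for a p bytes region" means the largest value among neighbours up to p positions away on either side. CRC32 is used as a stand-in fingerprint purely for illustration, and the tie-breaking rule is also an assumption.

```python
import zlib


def window_values(payload, w=16):
    """Stand-in fingerprints: CRC32 of every w-byte window (not Rabin)."""
    return [zlib.crc32(payload[i:i + w]) for i in range(len(payload) - w + 1)]


def maxp_select(values, p):
    """Return indices i where values[i] is the maximum of values[i-p : i+p+1].

    Ties go to the earliest index in the neighbourhood, so any two selected
    positions are more than p apart.
    """
    selected = []
    n = len(values)
    for i in range(n):
        lo, hi = max(0, i - p), min(n, i + p + 1)
        neighbourhood = values[lo:hi]
        best = max(neighbourhood)
        if values[i] == best and neighbourhood.index(best) == i - lo:
            selected.append(i)
    return selected
```

For example, maxp_select(window_values(payload), p=32) yields the offsets whose fingerprints would be indexed. Unlike the 0 mod p rule, the choice depends on neighbouring values rather than on an absolute value, which is what spreads the selected fingerprints more evenly over the payload.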

Optimal • Approximate upper bound on optimal • Store every fingerprint in a bloom filter • Identify fingerprint match if bloom filter contains the fingerprint • Low false positive for bloom filter: 0.1%
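
The approximate upper bound can be sketched as below, assuming every w-byte window is remembered in a Bloom filter sized for a 0.1% false-positive rate. The class, the function name, the window size and the use of SHA-256 double hashing are illustrative choices, not details from the study.

```python
import hashlib
import math


class BloomFilter:
    """Small Bloom filter sized from an expected item count and target FP rate."""

    def __init__(self, capacity, fp_rate=0.001):
        self.m = max(1, math.ceil(-capacity * math.log(fp_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for double hashing
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item))


def redundant_bytes_upper_bound(packets, w=32, capacity=1_000_000):
    """Count payload bytes covered by any window already seen in earlier packets."""
    seen = BloomFilter(capacity)
    matched_total = 0
    for payload in packets:
        windows = [payload[i:i + w] for i in range(len(payload) - w + 1)]
        covered = bytearray(len(payload))
        for i, win in enumerate(windows):
            if win in seen:  # may rarely be a false positive
                covered[i:i + w] = b"\x01" * w
        for win in windows:
            seen.add(win)
        matched_total += sum(covered)
    return matched_total
```

Because a Bloom filter never reports a false negative, no true match is missed; false positives can only inflate the count, which is why the result is an approximate upper bound rather than an exact optimum.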

Comparison of MODP, MAXP and optimal • MAXP outperforms MODP by 5-10% in most cases • Uniform sampling approach of MAXP • MODP loses due to non uniform clustering of fingerprints • New RE algorithm which performs better than classical MODP

Comparison of different RE algorithms ("->" means "followed by") • GZIP offers 3-15% benefit • (10 ms buffering) -> GZIP increases benefit by up to 5% • MAXP significantly outperforms GZIP, offers 15-60% bandwidth savings • MAXP -> (10 ms) -> GZIP further enhances benefit by up to 8% • We can use a combination of RE algorithms to enhance the bandwidth savings
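
To illustrate why buffering helps GZIP, the sketch below compares compressing each packet on its own with compressing packets batched into 10 ms buffers. It uses Python's zlib (DEFLATE, the algorithm behind GZIP); the function names and the (timestamp, payload) input format are assumptions for the example.

```python
import zlib


def compressed_bytes_per_packet(packets):
    """Total size when each (timestamp, payload) packet is compressed alone."""
    return sum(len(zlib.compress(payload)) for _, payload in packets)


def compressed_bytes_buffered(packets, window_s=0.010):
    """Total size when packets arriving within window_s seconds are compressed together."""
    total = 0
    buffer = []
    start = None
    for ts, payload in packets:
        if start is not None and ts - start > window_s:
            total += len(zlib.compress(b"".join(buffer)))  # flush the full buffer
            buffer, start = [], None
        if start is None:
            start = ts
        buffer.append(payload)
    if buffer:
        total += len(zlib.compress(b"".join(buffer)))
    return total
```

Batching lets the compressor see repeated content across nearby packets, at the cost of up to 10 ms of added delay; it still has no history beyond the current buffer, unlike MODP or MAXP.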

Large scale trace-driven study • Performance study of different RE algorithms • Origins of redundancy • Patterns of redundancy • Distribution of match lengths • Match distribution

Origins of redundancy • Different users accessing the same content, or same content being accessed repeatedly by same user? • Middle-box deployments can eliminate bytes shared across users • How much sharing across users in practice? • INTER-USER: sharing across users • INTER-SRC • INTER-DEST • INTER-NODE • INTRA-USER: redundancy within same user • (a) INTRA-FLOW • (b) INTER-FLOW (Figure: Flow-1, Flow-2, Flow-3 between Enterprise and Data Centers through two middleboxes)
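
One way to label each match with the categories above is sketched below. The exact definitions are an interpretation, not taken from the slides: a "user" is assumed to be a source-destination IP pair, and a flow is assumed to be a 5-tuple.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


def classify_match(current: Flow, matched: Flow) -> str:
    """Label a match between the current packet's flow and the cached packet's flow."""
    if current == matched:
        return "intra-flow"   # same 5-tuple
    same_src = current.src_ip == matched.src_ip
    same_dst = current.dst_ip == matched.dst_ip
    if same_src and same_dst:
        return "inter-flow"   # same IP pair, different connection (still intra-user)
    if same_dst:
        return "inter-src"    # same destination, different source
    if same_src:
        return "inter-dest"   # same source, different destination
    return "inter-node"       # neither endpoint shared
```

Summing saved bytes per label then gives the intra-user share (intra-flow plus inter-flow) against the inter-user share.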

Study of composition of redundancy • 90% of savings are across destinations for Uout/24 • For Uin/Uout, 30-40% of savings are due to intra-user • For enterprises, 75-90% of savings are due to intra-user (Chart legend: inter-user, intra-user)

Implication: End-to-end RE as a promising alternative • End-to-end RE as a compelling design choice • Similar savings • Deployment requires just software upgrade • Middle-boxes are expensive • Middle-boxes may violate end-to-end semantics (Figure: Enterprise, Data Centers, two middleboxes)

Large scale trace-driven study • Performance study of different RE algorithms • End-to-end RE versus network RE • Patterns of redundancy • Distribution of match lengths • Hit distribution

Match length analysis • Do most of the savings come from full packet matches? • If so, a simple technique of indexing full packets would suffice • For partial packet matches, what should be the minimum window size?

Match length analysis for enterprise • 70% of the matches are less than 150 bytes and contribute 20% of savings • 10% of the matches come from full matches and contribute 50% of savings • Need to index small chunks of size <= 150 bytes for maximum benefit (Figure axes: percentage vs. bins of different match lengths, in bytes)
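
The two percentages on this slide (share of matches and share of savings per match-length bin) can be computed as in the sketch below. It assumes the bytes saved by a match equal its length, ignoring shim overhead; the function name and bin edges are illustrative.

```python
from bisect import bisect_right


def match_length_profile(match_lengths, edges):
    """Return (share of matches, share of saved bytes) for each length bin.

    Bin k holds lengths with edges[k-1] <= length < edges[k]; the last bin is open-ended.
    """
    counts = [0] * (len(edges) + 1)
    saved = [0] * (len(edges) + 1)
    for length in match_lengths:
        k = bisect_right(edges, length)
        counts[k] += 1
        saved[k] += length
    n, total = sum(counts), sum(saved)
    match_share = [c / n for c in counts] if n else [0.0] * len(counts)
    bytes_share = [s / total for s in saved] if total else [0.0] * len(saved)
    return match_share, bytes_share
```

With three 100-byte matches and one 1400-byte match and a single edge at 150, the short bin holds 75% of the matches but only 300 of the 1700 saved bytes, the same kind of skew the slide reports.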

Hit distribution • Contributors of redundancy • Few pieces of content repeated multiple times • Small packet store would be sufficient • Many pieces of content repeated few times • Large packet store

Zipf-like distribution for chunk matches • Chunk ranking • Unique chunk matches sorted by their hit counts • Straight line shows the Zipfian distribution • Similar to web page access frequency • How much do popular chunks contribute to savings?

Savings due to hit distribution • 80% of savings come from 20% of chunks • Need to index 80% of chunks for remaining 20% of savings • Diminishing return for cache size
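
The "80% of savings from 20% of chunks" style of figure can be derived from per-chunk hit counts as sketched here. It assumes equal-sized chunks, so that savings are proportional to hits; the function name is illustrative.

```python
def savings_share_of_top(hit_counts, top_fraction):
    """Fraction of all hits contributed by the most popular top_fraction of chunks."""
    ranked = sorted(hit_counts, reverse=True)  # chunk ranking by hit count
    if not ranked or sum(ranked) == 0:
        return 0.0
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)
```

For instance, with hit counts [80, 5, 5, 5, 5] the top 20% of chunks (one chunk) account for 0.8 of the hits. Evaluating the function over a range of fractions traces the diminishing-returns curve that motivates small packet caches.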

Savings vs. cache size • Small packet caches (250 MB) provide significant percentage of savings • Diminishing returns for increasing packet cache size after 250 MB

Conclusion • First comprehensive study of protocol independent RE systems • Key Results • 15-60% savings using protocol independent RE • A new RE algorithm, which performs 5-10% better than the Spring et al. approach • Zipfian distribution of chunk hits; small caches are sufficient to extract most of the redundancy • End-to-end RE solutions are promising alternatives to memory-bound WAN optimizers for enterprises

Questions? Thank you!

Backup slides

Peak and 95th percentile savings

Effect on burstiness • Wavelet based multi-resolution analysis • Energy plot • higher energy means more burstiness • Compared with uniform compression • Results • Enterprise • No reduction in burstiness • Peak savings lower than average savings • University • Reduction in burstiness • Positive correlation of link utilization with redundancy

Redundancy across protocols • Large enterprise • University
