Network Flow Watermarking Attack on Low-Latency Anonymous Communication Systems

Xinyuan Wang, Shiping Chen, Sushil Jajodia. Presented by Eun Kyoung Kim

Content • Introduction • Network Flow Identification and Anonymous Communication • Interval Centroid Based Watermarking Scheme • Properties of the Interval Centroid Based Watermarking Scheme • Experiments • Conclusions • Discussions

Introduction • To address privacy concerns, anonymous communication systems have been designed to provide anonymity • Traditional methods of achieving anonymity include using proxies, MIXes, and various other flow transformations • We investigate the fundamental limitations of flow transformations by developing a novel flow watermarking technique

Network Flow Identification and Anonymous Communication(1/5) • Network information flow: the transmission path of some information along the network • Network flow identification problem: how to determine which network flows belong to a particular network information flow • Network flow identification is inherently related to anonymous communication, whose goal is to conceal the true identities of, and the relationships among, the communicating parties

Network Flow Identification and Anonymous Communication(2/5) • Anonymous communication systems usually mix multiple network information flows among multiple communicating parties and transform each network flow substantially • Existing network flow transformations can be divided into intra-flow transformations and inter-flow transformations

Network Flow Identification and Anonymous Communication(3/5)

Network Flow Identification and Anonymous Communication(4/5)

Network Flow Identification and Anonymous Communication(5/5) • Existing low-latency anonymous communication systems have used variations of the flow transformations in addition to any cryptographic operations they may use • Whether or not we could uniquely identify a network flow despite these flow transformations is a key problem that has a direct impact on some of the very foundations of existing anonymizing techniques

Interval Centroid Based Watermarking Scheme(1/6) • Goal: make a sufficiently long flow uniquely identifiable even after significant transformations have occurred • Method: given a packet flow of duration T_f, embed an l-bit watermark with redundancy r

Interval Centroid Based Watermarking Scheme(2/6) • Random grouping and assignment of intervals, where n = l × r
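The random grouping step can be sketched in Python as follows. This is a minimal sketch, not the authors' code: it assumes the flow is cut into 2n equal intervals, half placed in group A and half in group B (the slide only states n = l × r), and the function name `assign_intervals` and the use of Python's `random.Random` as the shared generator are illustrative choices.

```python
import random

def assign_intervals(l, r, seed):
    """Split 2*n interval indices (n = l*r) into groups A and B, then give
    each watermark bit r intervals from each group.

    Returns (a_bits, b_bits); a_bits[i] lists the group A interval indices
    assigned to watermark bit i, and likewise for b_bits.
    Assumption: 2*n intervals in total, n per group.
    """
    n = l * r
    rng = random.Random(seed)      # stand-in for the shared RNG and seed
    indices = list(range(2 * n))
    rng.shuffle(indices)           # random grouping
    group_a, group_b = indices[:n], indices[n:]
    # Both groups are already in random order, so cutting them into
    # consecutive chunks of r is a random assignment of intervals to bits.
    a_bits = [sorted(group_a[i * r:(i + 1) * r]) for i in range(l)]
    b_bits = [sorted(group_b[i * r:(i + 1) * r]) for i in range(l)]
    return a_bits, b_bits
```

Because the shuffle is driven only by the seed, an encoder and a decoder that share the seed recover the same assignment.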

Interval Centroid Based Watermarking Scheme(3/6) • Finding aggregated centroids • For each watermark bit i, aggregate the time stamps of all packets in its r group A intervals (I^A_{i,j}) and, separately, in its r group B intervals (I^B_{i,j}), and calculate the centroids A_i and B_i of the two sets of packets • Before watermark encoding • E(A_i) = E(B_i) = T/2 • E(Y_i) = 0, where Y_i = A_i - B_i
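Written out, the aggregated centroids might look like the following. The symbols \Delta t_k, N_{A,i} and N_{B,i} are notation introduced here for illustration: \Delta t_k is a packet's position inside its interval (with o the offset at which the first interval starts), and N_{A,i}, N_{B,i} count the packets in the r intervals of each group for bit i.

```latex
\Delta t_k = (t_k - o) \bmod T, \qquad
A_i = \frac{1}{N_{A,i}} \sum_{j=0}^{r-1} \; \sum_{t_k \in I^A_{i,j}} \Delta t_k, \qquad
B_i = \frac{1}{N_{B,i}} \sum_{j=0}^{r-1} \; \sum_{t_k \in I^B_{i,j}} \Delta t_k, \qquad
Y_i = A_i - B_i
```

If each \Delta t_k is uniformly distributed on [0, T), its mean is T/2, which gives E(A_i) = E(B_i) = T/2 and E(Y_i) = 0.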

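Interval Centroid Based Watermarking Scheme(4/6) • Encoding scheme • To encode bit ‘1’ or ‘0’, make Y_i positive or negative by increasing A_i or B_i, respectively • To increase A_i or B_i, delay each packet within each interval I^A_{i,j} or I^B_{i,j}, respectively • Delay strategy • After watermark encoding • The centroid of the delayed group has expectation (T+a)/2, that is, E(A'_i) = (T+a)/2 for bit 1 and E(B'_i) = (T+a)/2 for bit 0, while the other group stays at T/2 • E(Y_i) = a/2 when bit 1 is encoded and -a/2 when bit 0 is encoded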
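A sketch of the encoding step for a single bit follows. The delay rule is an assumption: the slide's delay-strategy figure is not in this transcript, so the code uses the linear map x → a + (T − a)·x/T, which squeezes offsets from [0, T) into [a, T) and gives a mean of (T + a)/2 for uniform offsets. Under this rule a is the largest delay any packet receives. The names `delay_offset` and `encode_bit` are illustrative.

```python
def delay_offset(x, T, a):
    """Map an offset x in [0, T) to a + (T - a) * x / T, which lies in [a, T).

    Assumed delay rule: no packet is delayed by more than a, and packets
    keep their order and stay inside their own interval.
    """
    return a + (T - a) * x / T

def encode_bit(timestamps, bit, a_intervals, b_intervals, offset, T, a):
    """Delay packets in the group A intervals (bit 1) or the group B
    intervals (bit 0) assigned to one watermark bit; other packets pass
    through unchanged."""
    target = set(a_intervals if bit == 1 else b_intervals)
    out = []
    for t in timestamps:
        idx, x = divmod(t - offset, T)   # interval index, offset within it
        if t >= offset and int(idx) in target:
            t = offset + idx * T + delay_offset(x, T, a)
        out.append(t)
    return out
```

For example, with T = 1.0 and a = 0.5, a packet at time 2.5 in a selected interval moves to 2.75, while one at the very start of the interval moves by the full 0.5.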

Interval Centroid Based Watermarking Scheme(5/6) • Decoding scheme • Calculate each Y_i (i = 0, …, l-1) given the exact interval grouping and assignment information <o, T, RNG, s> • Decode watermark bit i as 1 if Y_i is positive and as 0 if Y_i is negative
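The decoding rule can be sketched as below, assuming the decoder has already rebuilt the per-bit interval lists from the shared secret. The names `centroid` and `decode` are illustrative, and two choices are assumptions of this sketch rather than part of the slides: a tie (Y_i = 0) is decoded as 0, and a bit with no packets in one of its groups is reported as `None`.

```python
def centroid(timestamps, intervals, offset, T):
    """Mean position-within-interval of all packets that fall in any of
    the given intervals; None if no packet does."""
    chosen = set(intervals)
    rel = [(t - offset) % T for t in timestamps
           if t >= offset and int((t - offset) // T) in chosen]
    return sum(rel) / len(rel) if rel else None

def decode(timestamps, a_bits, b_bits, offset, T):
    """Decode one bit per (group A, group B) pair of interval lists."""
    bits = []
    for a_iv, b_iv in zip(a_bits, b_bits):
        A = centroid(timestamps, a_iv, offset, T)
        B = centroid(timestamps, b_iv, offset, T)
        if A is None or B is None:
            bits.append(None)                  # not enough packets to decide
        else:
            bits.append(1 if A - B > 0 else 0) # sign of Y_i; tie goes to 0
    return bits
```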

Interval Centroid Based Watermarking Scheme(6/6) • The Chebyshev inequality gives an upper bound on the decoding error probability • Given any T and a, we can reduce the error by increasing N_i, which can be achieved by increasing r, provided that the flow is long enough and contains sufficient packets
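The bound itself is not reproduced in this transcript. One way to obtain a bound of this kind, as a sketch under assumptions that may differ from the paper's exact form: suppose bit 1 is encoded, each group holds N_i packets, offsets are independent, group B offsets are uniform on [0, T), and the delayed group A offsets are uniform on [a, T).

```latex
\operatorname{Var}(A'_i) = \frac{(T-a)^2}{12\,N_i}, \qquad
\operatorname{Var}(B_i) = \frac{T^2}{12\,N_i}, \qquad
\operatorname{Var}(Y_i) = \frac{(T-a)^2 + T^2}{12\,N_i}

\Pr[\,Y_i \le 0\,]
  \;\le\; \Pr\!\left[\,\left|Y_i - \tfrac{a}{2}\right| \ge \tfrac{a}{2}\,\right]
  \;\le\; \frac{\operatorname{Var}(Y_i)}{(a/2)^2}
  \;=\; \frac{(T-a)^2 + T^2}{3\,N_i\,a^2}
```

For fixed T and a the right-hand side falls as 1/N_i, which matches the slide's point that more packets per bit drive the error toward zero.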

Properties of the Interval Centroid Based Watermarking Scheme(1/3) • Self-synchronization • Try a range of different offsets and find the offset that results in the closest match with the watermark • Problem: trying many offsets increases the false-positive rate • Solution: lower the false-positive rate of the single-offset decoding, which is possible if we have enough packets
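The offset search can be sketched as a loop over candidate offsets that keeps the one whose decoded bits are closest to the expected watermark. Here `decode_at` is a hypothetical callable that decodes the flow at a given offset; measuring closeness by Hamming distance is an assumption of this sketch.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def best_offset(decode_at, candidates, watermark):
    """Return (offset, distance) for the candidate offset whose decoded
    bits are closest to the watermark; the first best candidate wins ties."""
    best = None
    for o in candidates:
        d = hamming(decode_at(o), watermark)
        if best is None or d < best[1]:
            best = (o, d)
    return best
```

Each extra candidate is another chance for an unwatermarked flow to match by accident, which is the false-positive problem the slide describes.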

Properties of the Interval Centroid Based Watermarking Scheme(2/3) • Robustness Against Chaff and Flow Mixing • The chaff added to a watermarked flow tends to shift the centroid within each interval toward the center of the interval • How large is the impact of the chaff packets on the watermark detection error probability? • The upper bounds on the decoding error probabilities say that no matter how large R_A, R_B, and R are, we can always make the decoding error probabilities arbitrarily close to zero with a sufficiently large N_i, which can be achieved with a sufficiently large number of packets
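A small simulation can illustrate the centroid shift caused by chaff. It is a sketch under assumptions: watermarked offsets are uniform on [a, T) (the linear delay rule, itself an assumption), chaff offsets are uniform on [0, T), and `centroid_with_chaff` is an illustrative name. With a chaff ratio rho = n_chaff / n_marked (a symbol introduced here, not the slide's R_A, R_B or R), the expected mixed centroid is T/2 + a / (2·(1 + rho)).

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def centroid_with_chaff(T, a, n_marked, n_chaff, seed=0):
    """Return (centroid of watermarked packets alone,
               centroid after chaff is mixed in)."""
    rng = random.Random(seed)
    # Watermarked packets: offsets squeezed into [a, T) (assumed delay rule).
    marked = [a + (T - a) * rng.random() for _ in range(n_marked)]
    # Chaff packets: offsets spread uniformly over the whole interval.
    chaff = [T * rng.random() for _ in range(n_chaff)]
    return mean(marked), mean(marked + chaff)
```

With T = 1.0, a = 0.4 and equal numbers of watermarked and chaff packets, the first value is close to 0.7 and the second close to 0.6. The centroid moves toward T/2 but stays above it, so the sign of Y_i is preserved on average.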

Properties of the Interval Centroid Based Watermarking Scheme(3/3) • Robustness against packet dropping, repacketization, and flow splitting • When there are enough packets left in the flow, the centroids of all the intervals tend to remain the same

Experiments(1/2) • Real-time experiments on live anonymized web traffic

Experiments(2/2) • Offline experiments

Conclusions • We demonstrate that existing flow transformations do not necessarily make a long network flow indistinguishable from others • By developing a novel flow watermarking technique, we can uniquely identify a long flow even after drastic flow transformations • Our flow watermarking attack is applicable to all practical low-latency anonymous communication systems

Discussions • Potential research topics • How to preserve privacy against this attack • Make the flow “sufficiently” short • What low-latency anonymous communication systems can achieve in the presence of an active adversary
