400 likes | 406 Views
EE515/IS523 Think Like an Adversary Lecture 9 P2P. Yongdae Kim. Recap. http://security101.kr E-mail policy Include [ee515] or [is523] in the subject of your e-mail Student Survey http://bit.ly/SiK9M3 Student Presentation Send me email. Work on project. Attacking Kad Network.
E N D
EE515/IS523 Think Like an AdversaryLecture 9P2P Yongdae Kim
Recap • http://security101.kr • E-mail policy • Include [ee515] or [is523] in the subject of your e-mail • Student Survey • http://bit.ly/SiK9M3 • Student Presentation • Send me email. • Work on project
Attacking Kad Network E. Chan-Tin, P. Wang, J. Tyra, T. Malchow, D. Foo Kune, N. Hopper, Y. Kim, "Attacking the Kad Network - Real World Evaluation and High Fidelity Simulation using DVN -", Wiley Security and Communication Networks 2009
P2P Applications • File Sharing : Napster, Gnutella, BitTorrent, etc • Recent Commercial Applications • Skype • BitTorrent becomes legit • P2P TV by Yahoo Japan • Research community • P2P File and archival systems: Ivy, Kosha, Oceanstore, CFS • Web caching: Squirrel, Coral • Multicast systems: SCRIBE • P2P DNS: CoDNS and CoDoNS • Internet routing: RON • Next generation Internet Architecture: I3
Napster.com K V Match K V K V K V K V K V K V K V K V K V Napster Match O O P … A B X Query QueryHit Download P: a node looking for a file O: offerer of the file retrieve (K1) K V P2P Systems • How to find the desired information? • Centralized structured: Napster • Decentralized unstructured: Gnutella • Decentralized structured: Distributed Hash Table • Content Addressable! • A DHT provides a hash table’s simple put/get interface • Insert a data object, i.e., key-value pair (k,v) • Retrieve the value v using key k
C Q A X D Y B R k (k,v) DHT: Terminologies • Every node has a unique ID: nodeID • Every object has a unique ID: key • Keys and nodeIDs are logically arranged on a ring (ID space) • A data object is stored at its root(key) and severalreplica roots • Closest nodeID to the key (or successor of k) • Range:the set of keys that a node is responsible for • Routing table size: O(log(N)) • Routing delay: O(log(N)) hops • Content addressable!
Target P2P System • Kad • A peer-to-peer DHT based on Kademlia • Kad Network • Overnet: an overlay built on top of eDonkey clients • Used by P2P Bots • Overlay built using eD2K series clients • eMule, aMule, MLDonkey • Over 1 million nodes, many more firewalled users • BT series clients • Overlay on Azureus • Overlay on Mainline and BitComet
d(X, Y) = X XOR Y An entry in k-bucket shares at least k-bit prefix with the nodeID k=20 in overnet Add new contact if k-bucket is not full Parallel, iterative, prefix-matching routing Replica roots: k closest nodes Kademlia Protocol 10101100 11001011 01001011 123.24.3.1 00100101 23.37.12.13 0 K bucket 01011010 311.1.3.4 11000100 … 1 10101100 01000001 129.5.3.1 11011011 0 11001100 11000100 1 11111110 … 11001010 11010001 0 Find/store 10001011 1 10010100 10001110 … 0 10000001 1
No restriction on nodeID Replica root: |r, k| < K buckets with index [0,4] can be split if new contact is added to full bucket Wide routing table short routing path K bucket in i-th level covers 1/2i ID space A knows new node by asking or contact from other nodes Hello_req is used for liveness routing request can be used Kad Protocol 10101100 1 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 0 1 0 1 0 1 0 1 0 4 3 2 1 0 15 14 13 12 11 10 9 8 7 6 5 0 1 0 1
Vulnerabilities of Kad • No admission control, no verifiable binding • An attacker can launch a Sybil attack by generating an arbitrary number of IDs • Eclipse Attack • Stay long enough: Kad prefers long-lived contact • (ID, IP) update: Kad client will update IP for a given ID without any verification • Termination condition • Query terminates when A receives 300 matches. • Timeout • When M returns many contacts close to K, A contacts only those nodes and timeouts.
Actual Attack • Preparation phase • Backpointer Hijacking: 8 A, attacker M • Learns A’s Routing Table by sending appropriate queries • Then, change routing table by sending the following message. • Execution phase • Provide many non-existing contacts • Fact: Query will timeout after trying 25 contacts. Hello, B, IPM 0xD00D IPB IPM A M
Summary of Estimated Cost • Assumption • Total 1M nodes • 800 routing table entries • 100 Mbps network link • Preparation cost • 41.2GB bandwidth to hijack 30% of routing table • Takes 55 minutes with 100 Mbps link • Query prevention • 100 Mbps link is sufficient to stop 65% of WHOLE query messages.
Large scale simulation • 11,303 ~ 16,105 Kad nodes running on ~500 PlanetLab machines • Comparison between expected and measured • keyword query failures • Number of messages used to attack one node • Bandwidth usage
Self reflection attack • Fill node A’s routing table with A itself. A A C C IPC C C … … Hello, X, IPA IPG G G G Attack G • ≈ 100% queries failed after attack • Nodes can recover slowly • Second round of attack
Mitigations • Identity authentication • Routing correctness • Independent parallel routes • Incrementally deployable
The Frog-Boiling Attack: Limitations of Secure Network Coordinate Systems
What is Network Coordinate System? • P2P, CDN, DHT, … • How to measure distance between a pair of nodes • Efficient estimation of latency B A C
Embedding Error in Network Coordinates • Real value ≈ Estimated value? • What if the error is very big? • What a coordinate system should do?
Existing Network Coordinate Systems • Vivaldi • Decentralized • Resilient to network dynamics • Pyxida • Implementation of Vivaldi • Open source designed to operate on P2P [Dabek et al. ,2004] [Pyxida 2009]
Coordinate Update Relative Error =|Estimated-Real|/Real =|19.8-25|/25 = 20.8% (1,20) 25ms (14,5) Alice update coordinate that reduces error
Vivaldi is not secure A Real RTT T V’ V A’ Measured RTT
Existing secure mechanism A V’ V Mahalanobis distance, Kalman Filter, Veracity
Frog-Boiling Attack A V’ V Displacements falls inside the threshold
Targeted Attack • Goal: make victim report spoiled coordinate to the rest of the network and flagged as outliers and isolated • Is frog-boiling attack simple to achieve? A V A A
Three Variant Attacks • Basic-Targeted Attack: moves targeted nodes to some coordinate (a) • Network-Partition Attack: partitions a network into several subnetworks (b) • Closest-Node Attack: attacker becomes the closest node to the victim (c) Honest nodes are shaded whereas malicious nodes are white
Mahalanobis Distance • Distance measure of a point from a group of values based on the correlation between variables.
Mahalanobis Outlier Detection Vivaldi Spatial filter’s vector [ Peer’s reported error, change in the peer’s coordinate, the latency ] Mahalanobis Outlier Detection Temporal filter’s vector [ The remote error, local error, latency, change in the peer’s coordinate, local coordinate change ] • Used to update • Coordinate • History for filters
Convergence Time and Overhead • Long-Running Experiment • 400 PlanetLab nodes for almost 2 days with no attackers • Stabilized after 2 hours with low median relative error, rrl, ralp • A network coordinate system does not impose high latency penalty RRL(Relative Rank Loss): pairwise orderings of each neighbor RLAP(Relative Application Latency Penalty: percentage of lost latency
Basic-Targeted Attack Evaluation • Coordinate oscillation attack as a baseline • At time 120m, attacker nodes(11%) start to random coordinate attack • Basic-Targeted Attack
Aggressive Frog-Boiling Attack Evaluation • Aggressive step size • Different step size δ is used: 2, 5, 10, 50, 100, 250 (ms) • Effectiveness • For the value of 5 or 10, the attack is very effective and fast • For higher values of 50 or 100, the attack is less effective • For the value of 250, the frog-boiling attack is not successful
Network Partition Attack Evaluation • Both the victim nodes and the rest of the network are moved • Adversaries: 6%, Network1: 37%, Network2: 57% • Network1 is pushed to the location (1000,1000,1000,1000) with height 1000 while Network2 is pushed to the location (-1000, -1000,-1000,-1000) with height -1000
Closest-Node Attack Evaluation • Attacker nodes tries to become the closest node • Every victim node reports the closest neighbor at 10 minutes interval • The fraction of attackers was estimated • Snapshot of 500 minutes
Kalman Filter anomaly detection Comparison of predicted relative error and measured error with history and parameter Expectation-Maximization + Parameter calculation Parameter passing For Kalman filter Trusted nodes Unrusted nodes
Kalman Filter Attack Evaluation • 8% of nodes are set to trusted nodes with δ =10
Veracity Reputation System Γ = 7 2. Coordinate verification test V’s VSET (determined by V) P V 1. Coordinate report Λ =7 3. RTT delay verification test V’s RSET (randomly chosen by V)
Veracity Attack Evaluation • δ = 0.4, Δ = 20%, Γ = 7, Λ =7 • Each attack shifts its coordinate by +δ or - δ based on the victim’s unique global identifier GUID • Before replying with the forged coordinate to a victim node, the attacker sends out messages to the VSET members to compromise the first step of coordinate verification. Delaying RTT is not necessary.
Summary • A decentralized network coordinate system with proposed secure mechanisms can be entirely disrupted by more clever attack of the frog-boiling attack. • Frog-boiling attack manipulates peers’ required property of coordinate update, and the adversary slowly expands the range of data accepted by the node by influencing the node’s recent history. • A secure network coordinate system will need to provide some mechanism to verify a node’s reported coordinates and/or RTTs, which is a challenging problem.
Considerations • Limitary authentication mechanism dedicated for secure coordinate update can be considered • Mixing geographical information with a virtual coordinated can limit the range of victims that attackers can manipulate • Each node can be assigned a set of trusted nodes so that their coordinate information impacts majority of portion in a node’s coordinate change. • Periodically, each node can be made to verify its coordinate to a set of trusted nodes. If embedding error exceeds a boundary, its coordinate can be made reset by those trusted nodes and its neighbor list is reconstructed.