Internet Coordinate Systems Dr. Laurent Mathy Computing Department Lancaster University, UK laurent@comp.lancs.ac.uk RESCOM 2007
Aims of the talk • Review the main Internet Coordinate Systems and techniques • Discuss properties of the Internet as a delay space and the resulting embedding issues • Highlight (some) security issues for ICS and approaches to a solution
Why Internet Coordinate Systems? • Many applications, distributed systems and overlays benefit from "network topology awareness" • Closest server/neighbour selection • Distance ranking (which node is closer?) • Network–overlay topology congruence – e.g. CAN • Need measurements • But potentially high overhead • Many nodes to measure against • Many different overlays/applications measuring simultaneously • Especially since distance changes need to be tracked • "ping storms" on PlanetLab
Why Internet Coordinate Systems? (2) • Luckily, delays (RTTs) are statistically constant and predictable • At least constant over periods of several minutes • Changes mostly appear as sporadic "level shifts" • Predictable within 20% of the real value, 95% of the time • The idea is to map ("embed") the Internet delay space onto an appropriate metric space so that • Each node's coordinates are computed/tracked via sample measurements of a small number of nodes • The distance between any 2 nodes can be estimated without further measurements • Advantages • Low distance estimation/computation overhead • Low "full-mesh" distance communication overhead • O(k·d) vs O(k²), with k nodes and d dimensions
Relative positioning without ICS • Not all relative positioning problems need coordinates • Binning • Measure distances to a set of landmarks (8 to 15) • Order landmarks by increasing RTT to get the bin "Id" • Rationale: nodes close to each other will see similar RTTs to the landmarks and end up in the same bin • Improvement: add "range levels" to bins • E.g. ]0, 100] ms = level 0; ]100, 200] ms = level 1; >200 ms = level 2 • (Figure: two nodes measure RTTs to landmarks L1, L2, L3 and obtain bin Ids L2L3L1:002 and L1L3L2:012)
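The binning scheme above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the binning paper; the landmark names and RTT values are made up:

```python
def bin_id(rtts, levels=(100, 200)):
    """Compute a bin identifier from RTTs (in ms) to named landmarks.

    rtts: dict mapping landmark name -> measured RTT in ms.
    Returns the landmark ordering plus one range-level digit per landmark.
    """
    # Order landmarks by increasing RTT: nearby nodes see a similar order.
    ordered = sorted(rtts, key=rtts.get)
    order_part = "".join(ordered)

    # Range level: 0 for ]0, 100] ms, 1 for ]100, 200] ms, 2 for > 200 ms.
    def level(rtt):
        for i, bound in enumerate(levels):
            if rtt <= bound:
                return i
        return len(levels)

    level_part = "".join(str(level(rtts[name])) for name in ordered)
    return order_part + ":" + level_part

print(bin_id({"L1": 70, "L2": 56, "L3": 238}))  # "L2L1L3:002"
```

Two nodes with slightly different RTTs but the same ordering and levels end up in the same bin, which is the whole point of the scheme.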
ICS Embedding Principles • The main goal is to embed the Internet delay (RTT) space in a metric space to allow easy distance estimation • Metric space: given D(a,b) the distance function between a and b • (anti-reflexivity) D(a,b) = 0 iff a = b • (symmetry) D(a,b) = D(b,a) • (triangular inequality) D(a,b) <= D(a,c) + D(c,b) • An embedding is a mapping from one metric space to another • For now, ignore the fact that the Internet delay space is not metric… • Goodness-of-embedding metric: relative absolute error |d(a,b) - δ(a,b)| / δ(a,b) • d(a,b) is the estimated distance (= D(a,b)) • δ(a,b) is the (real) measured distance • This is what gets directly or indirectly minimized • Stabilisation of relative errors is often equated to system convergence • But bear in mind that in some pathological cases, errors may stabilize while the system is in chaos!
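As a trivial illustration of the goodness metric above (the distance values are made up):

```python
def relative_error(estimated, measured):
    """Relative absolute error |d(a,b) - delta(a,b)| / delta(a,b).

    estimated: d(a,b), the distance predicted by the embedding.
    measured:  delta(a,b), the real (non-zero) measured RTT.
    """
    return abs(estimated - measured) / measured

# An embedding that over-estimates a 100 ms RTT by 20 ms has error 0.2.
print(relative_error(120.0, 100.0))  # 0.2
```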
Global Network Positioning (GNP) • The pioneering ICS • Infrastructure-based: uses landmarks • (Figure: 2-D coordinate space with landmarks L1, L2, L3 embedded at (x1, y1), (x2, y2), (x3, y3) and a host at (x4, y4))
GNP (2) • Goal: find coordinates so that the overall error between measured and estimated distances is minimized • Embedding in 2 phases, based on multi-dimensional global minimization • Phase 1 • From full mesh measurements between landmarks, centrally minimize F = Σ_{i<j} ε(d(Li,Lj), δ(Li,Lj)) • where ε(.) is an error measurement function, e.g. ε(d,δ) = ((d − δ)/δ)² • Phase 2 • Each host minimizes Σ_i ε(d(H,Li), δ(H,Li)) over its own coordinates • Resolution by the simplex downhill method • Should find the global minimum but risks getting stuck in a local minimum
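A rough sketch of Phase 2 on a hypothetical 2-D layout. Plain gradient descent on the squared-relative-error objective is used here as a simple stand-in for the simplex downhill method GNP actually uses; the landmark positions and host location are made-up examples:

```python
import math

def embed_host(landmark_coords, measured, steps=2000, lr=0.05):
    """Position a host by minimising the sum of squared relative errors
    to already-embedded landmarks (gradient descent stand-in for
    simplex downhill)."""
    dim = len(landmark_coords[0])
    # Start from the centroid of the landmarks.
    x = [sum(L[k] for L in landmark_coords) / len(landmark_coords)
         for k in range(dim)]
    for _ in range(steps):
        grad = [0.0] * dim
        for L, delta in zip(landmark_coords, measured):
            d = math.dist(x, L) or 1e-9
            # Derivative of ((d - delta)/delta)^2 w.r.t. x_k is
            # 2*(d - delta)/delta^2 * (x_k - L_k)/d.
            coef = 2 * (d - delta) / (delta ** 2 * d)
            for k in range(dim):
                grad[k] += coef * (x[k] - L[k])
        x = [xk - lr * g for xk, g in zip(x, grad)]
    return x

# Three landmarks and exact RTTs to a host truly located at (3, 4).
Ls = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
rtts = [math.dist((3.0, 4.0), L) for L in Ls]
print(embed_host(Ls, rtts))  # ≈ [3.0, 4.0]
```

With exact distances and D+1 = 3 landmarks in 2-D, the minimum is unique; with noisy RTTs the same objective merely trades errors off across landmarks.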
GNP (3) • Landmarks embed more often than normal nodes, for accuracy • For a space with D dimensions, there must be at least D+1 landmarks • Found that a 7-D Euclidean space provides the best accuracy vs overhead trade-off • In practice, 8 to 20 well-placed landmarks are enough • But risk of high measurement overhead at the landmarks • And landmarks represent points of failure • Enter GNP's "derivatives"
Network Positioning System (NPS) • GNP's little brother • Hierarchical architecture for scalability • Membership servers designate positioned hosts as "reference points" (RPs) when existing landmarks/RPs are congested • Optimal is 3 layers, due to error amplification across layers • (Figure: hierarchy with landmarks L1, L2, L3 at layer 0 and RPs at layers 1–3)
NPS (2) • Landmark positioning is distributed • Based on the observation that GNP's objective function F(.) can be re-written as a sum of per-landmark terms: have each landmark minimize its "corresponding" term • Better accuracy when all landmarks reposition at roughly the same time • When a change in RTT is detected, a landmark triggers the others to reposition with a special probe • Malicious reference point detection • On embedding, a node computes its relative error to each of its RPs • Eliminates the RP with max relative error if • max_i(ER_i) > 0.01 and • max_i(ER_i) > C · median_i(ER_i)
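The malicious-RP test above translates almost directly into code. This is only a sketch: the constant C and the error values are illustrative assumptions, not values from the slides:

```python
from statistics import median

def filter_reference_points(rel_errors, C=4.0):
    """NPS-style malicious reference point test (sketch).

    rel_errors: dict RP name -> relative error observed when embedding.
    Returns the RP to discard, or None if no RP fails the test.
    """
    worst = max(rel_errors, key=rel_errors.get)
    # Discard only if the worst error is both non-trivial in absolute
    # terms and far above the median of the other observations.
    if (rel_errors[worst] > 0.01
            and rel_errors[worst] > C * median(rel_errors.values())):
        return worst
    return None

print(filter_reference_points({"RP1": 0.05, "RP2": 0.06, "RP3": 0.9}))  # RP3
```

Note the weakness discussed later in the deck: if attackers dominate the RP set, they also skew the median, and the test starts working for them.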
Practical Coordinate Computation (PIC) • Kind of an infrastructure-less NPS! • No more points of failure! • Idea: any node with a computed coordinate can be used as an RP/landmark • Again, D dimensions need at least D+1 RPs • If there are not enough nodes in the system yet, just work in a lower-dimension space • Better results when using roughly ½ close and ½ randomly chosen nodes • Hey, how do you know which nodes are close, as you've just arrived? • Do a first embedding with only random landmarks, then pick close neighbours based on these rough coordinates, and start again! • PIC has a malicious node detection based on the triangle inequality property
Lighthouse • Any node can be a landmark • Pick any D+1 nodes for a D-dimensional space, and use them as a local basis • Local bases are usually oblique • A node's coordinates therefore depend on oblique projections
Lighthouse (2) • In a local basis • A node's coordinates can be expressed as a linear combination of the (lighthouse) basis vectors • And computed by solving the linear system given by the measured distances to the lighthouses • With this, the full mesh of measurements between the nodes of the local basis and general triangle formulas, we get the node's coordinates in the local basis
Lighthouse (3) • How do we "reconcile" all those local bases – and all those coordinates? • By a simple basis-changing operation: given 2 bases B and B', coordinates transform as c' = T·c, where T is the transition matrix whose columns are the vectors of B expressed in B' • Pick any local basis as the global one and have each node maintain the transition matrix from its local to the global basis • All that's needed are the coordinates of the (local) lighthouses in the global basis
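A minimal sketch of the basis-changing operation above. The 2-D basis vectors are made-up examples, not anything from the Lighthouse paper:

```python
def to_global(transition, local_coord):
    """Change of basis: multiply the transition matrix by the local
    coordinate vector. The matrix's columns are the local basis
    vectors expressed in the global basis."""
    n = len(local_coord)
    return [sum(transition[i][j] * local_coord[j] for j in range(n))
            for i in range(n)]

# Hypothetical local basis: b1 = (1, 1) and b2 = (-1, 1) in global coords,
# so the transition matrix has them as columns.
T = [[1.0, -1.0],
     [1.0,  1.0]]

# A node at local coordinates (2, 1) sits at 2*b1 + 1*b2 = (1, 3) globally.
print(to_global(T, [2.0, 1.0]))  # [1.0, 3.0]
```

This is all a node needs to publish globally comparable coordinates while embedding against a purely local set of lighthouses.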
Vivaldi • Main peer-to-peer based proposal (no infrastructure) • Based on the simulation of a network of springs • A spring between every 2 neighbouring nodes • Rest length is the measured distance δ(i,j) • If the estimated distance d(i,j) is smaller, the embedding node is pushed away from the other node • If the estimated distance d(i,j) is bigger, the embedding node is pulled towards the other node • Nodes should attach to about ½ close nodes and ½ far nodes • Each node has, say, 32 or 64 neighbours • Initial coordinates are the origin
Vivaldi (2) • For stability, don't overreact if the other node has low confidence in its coordinates, and don't move too much if you are confident in yours • For convergence, try to move more when you are not confident in your coordinates • To this end, each node keeps a "local error" • The local error can be seen as the inverse of the confidence a node has in its coordinates • Used to compute an adaptive timestep
Vivaldi (3) • Algorithm summary (embedding step for node i, against neighbour j): • w = e_i / (e_i + e_j) • Sample weight balances local and remote error • ε_s = |d(i,j) – δ(i,j)| / δ(i,j) • Sample relative error • e_i = ε_s · c_e · w + e_i · (1 – c_e · w) • Update local error • Δ = c_c · w • Compute adaptive timestep • x_i = x_i + Δ · (δ(i,j) – d(i,j)) · u(x_i – x_j) • Update coordinates, with u(·) the unit vector
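The embedding step above transcribes almost directly into Python. The constants c_e = c_c = 0.25 are typical choices, and the single fixed neighbour in the demo loop is an illustrative simplification:

```python
import math

def vivaldi_step(xi, xj, ei, ej, rtt, ce=0.25, cc=0.25):
    """One Vivaldi embedding step for node i against neighbour j.

    xi, xj: coordinate lists; ei, ej: local errors; rtt: measured delay.
    Assumes xi != xj (real Vivaldi picks a random direction otherwise).
    """
    w = ei / (ei + ej)                    # sample weight
    d = math.dist(xi, xj)                 # estimated distance
    es = abs(d - rtt) / rtt               # sample relative error
    ei = es * ce * w + ei * (1 - ce * w)  # update local error
    delta = cc * w                        # adaptive timestep
    u = [(a - b) / d for a, b in zip(xi, xj)]  # unit vector u(xi - xj)
    xi = [a + delta * (rtt - d) * uk for a, uk in zip(xi, u)]
    return xi, ei

# A node at the origin repeatedly embeds against a fixed, confident
# neighbour at (10, 0) with a measured RTT of 15: the spring pushes it
# away until the estimated distance matches the measurement.
xi, ei = [0.0, 0.0], 1.0
for _ in range(200):
    xi, ei = vivaldi_step(xi, [10.0, 0.0], ei, 0.5, rtt=15.0)
print(xi)  # ≈ [-5.0, 0.0]
```

Note how the local error ei shrinks as predictions improve, which in turn shrinks the timestep Δ: exactly the stability/convergence trade-off of the previous slide.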
Internet delay space characteristics • A study by Shavitt et al. has shown that the Internet RTT space most resembles a hyperbolic space • This can be approximated by a 2-D Euclidean space augmented with a height vector • This is the preferred Vivaldi space • The Euclidean component represents the Internet core, with latencies proportional to geographic distances (no congestion) • The height vector represents the access link • Issue when estimating distances between nodes behind the same access link • But is the Internet delay space a metric space anyway? • … NO!
Internet delay space characteristics (2) • Is the Internet a metric space? • (anti-reflexivity) D(a,b) = 0 iff a = b • Holds if the timing facility has high enough resolution • (symmetry) D(a,b) = D(b,a) • Paths are not symmetrical • Holds for a round-trip path metric and a bit of goodwill • That's why "delay" here always means "RTT" • (triangular inequality) D(a,b) <= D(a,c) + D(c,b) • Does not hold • Estimates are that between 4% and 20% of all Internet paths exhibit Triangular Inequality Violations (TIVs) • The Internet is therefore a quasi-metric space, and embedding it into a metric space will create inaccuracies
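Checking a measured RTT matrix for TIVs is straightforward. A small sketch, using a made-up 3-node matrix in which one detour is shorter than the direct path:

```python
def count_tivs(D):
    """Count triangle inequality violations in a symmetric RTT matrix D.

    Returns (violating_pairs, total_pairs): a pair (a, b) is a TIV if
    some detour via a third node c is shorter than the direct path.
    """
    n = len(D)
    tivs = total = 0
    for a in range(n):
        for b in range(a + 1, n):
            total += 1
            if any(D[a][c] + D[c][b] < D[a][b]
                   for c in range(n) if c not in (a, b)):
                tivs += 1
    return tivs, total

# Hypothetical RTTs: d(1,2) = 4, d(1,3) = 8, d(2,3) = 13,
# so going 2 -> 1 -> 3 (4 + 8 = 12) beats the direct 13 ms path.
D = [[0, 4, 8],
     [4, 0, 13],
     [8, 13, 0]]
print(count_tivs(D))  # (1, 3)
```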
Where are TIVs from? • They can have several causes: • Intra-domain routing • Intra-domain routing is based on shortest-path routing • Discrepancies between actual link delays and link weights can create TIVs • Traffic engineering anyone? ;-) • Hot-potato routing • (Figure: router topology R1–R4 with link weights, giving d(2,3) = 13, d(2,1) = 4, d(1,3) = 8, so d(2,3) > d(2,1) + d(1,3): TIV!!!)
Where are TIVs from? (2) • Private peering links • Multihoming; bilateral, non-transitive peering relationships; interaction of intra- and inter-domain routing, etc. are even more causes of TIVs • (Figure: routers R1–R6 with a private peering link, giving d(2,3) = 28, d(2,1) = 2, d(1,3) = 4, so d(2,3) > d(2,1) + d(1,3): TIV!!!)
Impact of TIVs • At best, TIVs will just cause inaccuracies in the embedding • If TIVs are encountered during embedding, the resulting coordinates will lie "in-between" • If not encountered during embedding, coordinates will still inadequately predict real distances • At worst, coordinates will "oscillate" • Typically the case in Vivaldi • Because TIVs have a nasty habit of "pulling" on nodes, which then get pushed back by other neighbours • TIVs are the major cause of errors in ICS
Other oddity • ICS have been observed to drift • The centroid of the points in the metric space moves in a fairly constant direction, at a rate of a few hundred milliseconds per day • This has been observed on a large-scale Vivaldi system • This is probably due to the accumulation of errors caused by TIVs, RTT level shifts, embedding errors, etc. • For all practical purposes, this can be ignored as long as the embedding refresh period is small compared to a day
What are ICS good for anyway? • Some studies suggest that although the relative errors can be very small, coordinate systems can perform badly at specific applications • Especially closest neighbour selection and neighbour ranking • However, I have some doubts about the representativeness of the data used • Don't get me wrong, it is actually very hard to get a snapshot of measurements that truly represents the network • True, you say, but the same holds for the computation of the relative error • You believe who you want… • So the theory goes: the relative error may be too aggregated a metric to tell a good story… • … but the alternative is, so far, application-specific metrics (yurk!)
ICS security • Most ICS actually trade convergence time for scalability • Also, most of them become more accurate as the number of nodes increases • Because of this, you should expect ICS to be deployed as an always-on service • You must have a coordinate by the time you need one! • Great, but then they may become a prime target for attackers • Think of all the nice applications, distributed systems and overlays you can bring down with one stone!!! • Large-scale DDoS attack anyone? • What can an attacker do? • That depends on where/who they are…
ICS security (2) • Insider attack • Most ICS rely on full cooperation between the nodes to operate • Untrusted nodes can easily • Lie about their coordinates (to mess up estimation) • Tamper with your probes (usually delay them, to mess up measurement) • Lie about anything else they can lie about (e.g. local error in Vivaldi) • Combining both has been shown to be very effective • The result is a distortion of the coordinate space • This is insidious, because unsuspecting honest nodes will propagate errors for the bad guys! • Outsider attack • Inject rogue probes into the system to fool measurements • DDoS attacks on links • Impact still under study
ICS security (3) • Defending against insider attacks • Early methods are too primitive • The NPS median test can start working for the attacker when attackers dominate the set of measurements (and skew the median) • The PIC defence is based on the triangle inequality: the Internet messes it up for us even without bad guys! • Trust propagation models • But you must trust the trust propagation • Can be complex
ICS security (4) • Signal processing • It was shown that relative error evolution can be modelled by a linear state-space model (and tracked by a Kalman filter) • It was also shown that the model of error evolution for one node is a good match for the error model of nearby nodes • This means that a Kalman filter calibrated on one node can be used to predict the errors observed at a nearby node • The Kalman filter gives you the mean and variance of its innovation process, which is the difference between the input (measured error) and the predicted one • A simple hypothesis test is therefore possible on the deviation between the measured error and the predicted one • Idea: have a set of trusted infrastructure nodes (surveyors) that embed exclusively against each other – they see a "clean" space • Surveyors also help embed other nodes • Have nodes use a Kalman filter calibrated at (close-by) surveyors • At every embedding, use the filter to test whether the observed error is compatible with the prediction • If not, ignore/change your neighbour
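A toy version of the Kalman-filter test above. It uses a 1-D random-walk state model; the noise variances q and r, the 3σ threshold and the error sequences are all assumptions for illustration, not values from the slides:

```python
class ErrorTracker:
    """Minimal 1-D Kalman filter over a random-walk model of the
    relative embedding error (toy sketch; q, r are assumed variances)."""
    def __init__(self, q=1e-4, r=1e-2):
        self.x, self.p = 0.0, 1.0   # error estimate and its variance
        self.q, self.r = q, r       # process / measurement noise variances

    def step(self, measured_error):
        p = self.p + self.q                   # predict (random walk)
        innovation = measured_error - self.x  # measured minus predicted
        s = p + self.r                        # innovation variance
        k = p / s                             # Kalman gain
        self.x += k * innovation
        self.p = (1 - k) * p
        return innovation, s

def suspicious(innovation, s, n_sigmas=3.0):
    """Hypothesis test on the innovation: flag a measurement that
    deviates too far from what the calibrated filter predicts."""
    return abs(innovation) > n_sigmas * s ** 0.5

# Calibrate on a stream of honest measured errors around 0.1...
kf = ErrorTracker()
for e in [0.10, 0.11, 0.09, 0.10, 0.11]:
    kf.step(e)
# ...then test new samples against the prediction.
print(suspicious(*kf.step(0.10)))  # False: consistent with the prediction
print(suspicious(*kf.step(0.90)))  # True: likely a lying neighbour
```

In the surveyor scheme, the calibration stream would come from a nearby surveyor's "clean" embeddings rather than from the node's own possibly-polluted measurements.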
ICS security (5) • The previous signal processing method cannot defend against a node that lies about its coordinates during distance estimation (application phase) • In that case you need something else • Trust again? • Validity certificates? • ???
Conclusions • ICS are a relatively new field, and still very much a hot topic • Our understanding of them still improves steadily • On the other hand, several large-scale trials have shown that they are mostly fit for practical purposes • They are poised to play a critical role in supporting future overlays and intelligent applications • Serious deployment could be only a few years away • Most structured p2p systems have some kind of ICS prototype available to them • But these could of course become "famous last words" ;-)