Explore efficient sampling strategies for disseminating information to target hosts, comparing static and dynamic methods, and leveraging subnet distribution to minimize sampling. Experimental results evaluate optimal strategies based on data sets. Future research may focus on time calculation for reaching hosts.
SAMPLING STRATEGIES FOR EPIDEMIC-STYLE INFORMATION DISSEMINATION Milan Vojnovic, Varun Gupta, Thomas Karagiannis and Christos Gkantsidis
INTRODUCTION • Reaching a target fraction of the hosts. • Discovering nodes by random probing. • Optimal static and dynamic random probing strategies. • Non-uniform distribution of hosts over subnets. • Hosts are assumed to be partitioned into groups, or subnets.
INTRODUCTION • Summary of Results: • Identify the optimal static and dynamic strategies • The optimal static strategy is unique • There are multiple optimal dynamic strategies • Simple sampling strategies outperform global random scanning and local subnet preference strategies. • K-FAIL and K-CANDSET
INTRODUCTION • Related Work • Speed of propagation of the information to the hosts • Time required to reach the target fraction of the hosts
STATIC SUBNET PREFERENTIAL SAMPLING • Considers the class of sampling strategies for which the subnet sampling probabilities are fixed in time. • Uniform global random sampling strategy (UNI(Ω)). • The total fraction of susceptible hosts s(u) and the total number of samplings per host u are related as • s(u) = s(0) e^(−βu), u ≥ 0.
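The relation above can be inverted to estimate how many samplings per host uniform sampling needs before the susceptible fraction drops to a chosen level. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and β is treated as a given positive constant because the slide does not define it.

```python
import math


def susceptible_fraction(s0, beta, u):
    """Evaluate s(u) = s(0) * exp(-beta * u) for u >= 0 samplings per host."""
    if u < 0:
        raise ValueError("u must be non-negative")
    return s0 * math.exp(-beta * u)


def samplings_per_host(s0, s_target, beta):
    """Solve s(u) = s_target for u, i.e. u = ln(s0 / s_target) / beta.

    beta is assumed to be a known positive constant (not defined on the slide).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if not 0 < s_target <= s0:
        raise ValueError("need 0 < s_target <= s0")
    return math.log(s0 / s_target) / beta


if __name__ == "__main__":
    # Illustrative numbers only: start with 90% susceptible, stop at 10%.
    u = samplings_per_host(s0=0.9, s_target=0.1, beta=0.5)
    print(f"samplings per host needed: {u:.3f}")
```

Because the decay is exponential, each further reduction of the susceptible fraction by a constant factor costs the same additional number of samplings per host.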
STATIC SUBNET PREFERENTIAL SAMPLING • Optimal Static Strategy (OPT-STATIC) • OPT-STATIC dictates sampling over a set A • Need not necessarily sample the smallest number of initially densest subnets • Targeting the largest subsets may slow dissemination at the start but speeds it up toward the end.
STATIC SUBNET PREFERENTIAL SAMPLING • Theorem: • For any target fraction of infected hosts, the strategy OPT-STATIC is optimal for minimizing the total number of samplings over all static sampling strategies.
STATIC SUBNET PREFERENTIAL SAMPLING • The required number of samplings to reach the target fraction of hosts depends on: • Density of hosts over the address space • Initial fraction of infected hosts • Distribution of initially susceptible hosts over subnets • Distribution of subnet address sizes
DYNAMIC SAMPLING STRATEGIES • Optimal dynamic sampling strategy (OPT-DYNAMIC) • Extending the space of sampling strategies from static to dynamic does not give the optimum solution. • Assumes that the number of samples is minimized. • Adding infected hosts to the least dense subnets.
SAMPLING STRATEGIES THAT USE ONLY LOCAL KNOWLEDGE • Sampling strategies are local in that each host biases its sampling over subnets based only on the success or failure of its samplings. • We consider sampling strategies that at any time keep state for only a constant number of subnets.
SAMPLING STRATEGIES THAT USE ONLY LOCAL KNOWLEDGE • Local Subnet Preference • Each infected host in a subnet samples an address uniformly at random.
SAMPLING STRATEGIES THAT USE ONLY LOCAL KNOWLEDGE • K-FAIL Strategy • Each infected host starts with uniform random sampling. • When the strategy fails on a candidate subnet • When a host becomes infected
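The slide names the two events that drive K-FAIL but does not spell out the transition rules. The sketch below shows one hypothetical reading, offered only to illustrate how little per-host state such a rule needs. It is an assumption, not the authors' definition: a host samples uniformly until a sample succeeds, then prefers the subnet of that success, and falls back to uniform sampling after K consecutive failures there. The class name, method names, and the treatment of a newly infected host (starting with no candidate) are all assumed.

```python
import random


class KFailSampler:
    """Per-host state for an ASSUMED K-FAIL-style rule (illustrative only)."""

    def __init__(self, subnet_sizes, k, rng=None):
        self.subnet_sizes = list(subnet_sizes)  # addresses per subnet
        self.k = k                              # failures tolerated on a candidate
        self.rng = rng if rng is not None else random.Random()
        self.candidate = None                   # index of the preferred subnet, if any
        self.failures = 0                       # consecutive failures on the candidate

    def pick_subnet(self):
        """Return the index of the subnet to sample next."""
        if self.candidate is not None:
            return self.candidate
        # Uniform sampling over addresses picks a subnet with probability
        # proportional to its address-space size.
        indices = range(len(self.subnet_sizes))
        return self.rng.choices(indices, weights=self.subnet_sizes)[0]

    def record(self, subnet, success):
        """Update state after sampling `subnet`; `success` means a host was hit."""
        if success:
            self.candidate = subnet
            self.failures = 0
        elif subnet == self.candidate:
            self.failures += 1
            if self.failures >= self.k:
                # Assumed rule: give up on the candidate after k straight misses.
                self.candidate = None
                self.failures = 0
```

Under this reading each host stores a single subnet index and one counter, which matches the constant-state requirement described for local strategies.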
SAMPLING STRATEGIES THAT USE ONLY LOCAL KNOWLEDGE • K-CAND Strategy • Infected hosts are set arbitrarily • Each infected host samples an address uniformly • A host that becomes infected inherits the candidate set of the instigator host
EXPERIMENTAL RESULTS • Data Sets • WU: The data set refers to IIS logs collected at the Windows Update system • Hotmail: The data set consists of approximately 103 million IP addresses • DShield: The data set consists of roughly 7.6 million IP addresses • Witty A: A list of IPs (roughly 55,000) corresponding to hosts spreading the Witty worm
EXPERIMENTAL RESULTS • Evaluation of Optimal Sampling Strategy • Optimal sampling strategy depends on • Logarithmic term • KL divergence term • If the KL divergence term is negligible relative to the logarithmic term, then uniform random sampling over the address space is near optimal.
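To make the KL divergence term concrete, the sketch below computes the standard Kullback-Leibler divergence between two discrete distributions over subnets. The slide does not say which two distributions are compared. As an assumption for illustration, the example compares the share of hosts in each subnet with the share of address space in each subnet. The helper names and the counts are hypothetical.

```python
import math


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Return D(p || q) in nats for discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


if __name__ == "__main__":
    # Hypothetical counts: hosts per subnet and addresses per subnet.
    hosts = normalize([500, 300, 200])
    sizes = normalize([256, 256, 512])
    print(f"KL divergence: {kl_divergence(hosts, sizes):.4f} nats")
```

When hosts are spread across subnets in proportion to address-space size, the divergence is zero, which is the regime the slide describes as favoring uniform random sampling.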
CONCLUSION • Leveraging the distribution of hosts over subnets • Static and dynamic sampling strategies • The analysis characterizes the number of samplings needed to reach the target fraction of hosts • Future work: calculating the time required to reach the hosts