180 likes | 193 Views
Explore how parallel processing enhances the Kolmogorov-Smirnov 2D algorithm for comparing empirical distributions, with load balancing challenges and communication optimization. Investigate the impact of distributive memory and adaptive load balancing on performance across different data sets. Experimental results on uniformly distributed samples and standard normal distributions show the potential benefits and limitations of parallel KS2D processing. Emphasize the importance of load balancing strategies tailored to specific data characteristics for efficient computation.
E N D
Parallel 2D Kolmogorov-Smirnov Statistic Ian Chan 5/12/02 6.338J/18.337J http://web.mit.edu/ianchan/www/KS2D
Motivation: my friend’s research A colossal X-ray flare, likely sparked by a central Milky Way black hole, produced the bright spot in this Chandra image. [Source CNN]
1D Kolmogorov Smirnov Statistic • test difference in two empirical distributions F ¹ G nonparametrically • D statistic: maximum difference between 2 CDF’s
1D KS Test Bound • Kolmogorov(1933) asymptotic bound:
2D Analog of KS Test • Peacock J, Monthly Notices of the Royal Astronomical Society, 1983, vol 202 p615: Two-Dimensional Goodness-of-Fit Testing in Astronomy • D statistic: considering all possible quadrant divisions, the largest possible difference in CDFs
2D KS Test Bound • Monte Carlo simulated bounds Z = D n1/2
KS2D Test Brute Force Algorithm • O(n2), not exhaustive, quadrants centered at each data ponts • O(n3), exhaustive, quadrants centered at each possible data x and data y combination
O(nlogn) KS2D algorithm • Author: A. Cooke (1999) • construction of binary tree data structure ( O(nlogn) ), require pre-sorted sample data by y
How it works: (1) Tree construction • quadrants centered at (x,y) must have upper left quadrant contains all samples (a,b) where a < x AND b < y • If childless node, Dmin = Dmax = 1/Nsquare or –1/Ncircle, depending on class
How it works: (2) Upward Propagation At node (2,3), we find the MIN and MAX from the 3 choices: 1 inherit Dmin/max from its left child (1,2), which implies that Q excludes (2,3) where Q is the quadrant that contains the largest |D| 2 D = delta(left child) + (0/Ns-1/Nc), which implies Q contains (2,3) and has (2,3) on its border. Delta(x) = diff in CDF if quadrant contains all samples in subtree at x 3 D = delta(left child) + (0/Ns-1/Nc) + Dmin/max (right child), which implies Q contains (2,3) and (2,3) is not on its border
The other 3 quadrants • We have considered the Top Left Quadrant, but the Top Right quadrant can be obtained from the same tree structure if we modify the upward propagation rule by swapping left & right, i.e. At node (2,3), we find the MIN and MAX from the 3 choices: 1 inherit Dmin/max from its right child (1,2), which implies that Q excludes (2,3) where Q is the quadrant that contains the largest |D| 2 D = delta(right child) + (0/Ns-1/Nc), which implies Q contains (2,3) and has (2,3) on its border. Delta(x) = diff in CDF if quadrant contains all samples in subtree at x 3 D = delta(right child) + (0/Ns-1/Nc) + Dmin/max (left child), which implies Q contains (2,3) and (2,3) is not on its border • The Bottom Left/Right Quadrants can be obtained if the tree is built with samples sorted by reverse order of y.
Parallel KS2D Algorithm • Speed possibly scales linearly with number of processors during the upward propagation step, cannot parallelize the tree construction step • Problem size scales linearly with number of processors because sample nodes are stored in processors distributively Challenges • Load Balancing: Dividing the tree nodes equally among processors • Minimize communications: Try to store an entire subtree into a single processor so that less inter-processor communication is necessary.
Load Balancing and Minimum Communications • Ideally…
Load Balancing Strategy (1) Pre-processing Randomly sample 1000 data points. Sort them by x. Consider the 1/numproc*1000th, 2/numproc*1000th…, (numproc-1)/n*1000th positions and use them to define intervals for load balancing Drawback: assumes x and y to be more or less independent
Load Balancing Strategy (2): adaptive Keep a running average of the x values of nodes stored in each processor. For every CHECKPOINT(=2000) number of samples, if the load is skewed (if difference of load between the heaviest load processor and the lightest load processor > 30% of load of lightest processor) change the load balancing intervals to midpoints of the running averages.
Performance(1) • 20,000 and 200,000 samples from uniform [0,1] distribution
Performance (2) • Effects of adaptive load-balancing on performance for samples from standard normal distributions centered at (-0.7,0.7) and (0.7, -0.7)
Conclusion for Parallel KS2D • Speedup is not great, especially when more processors are used because of communication overhead. • Load balancing strategies is noticeably effective for certain data distributions, need dependent on samples • Distributive Memory: Gains the ability to solve larger problems