

HEURISTIC & SPECIAL CASE ALGORITHMS FOR DISPERSION PROBLEMS - RAVI, ROSENKRANTZ, TAYI. ROB CHURCHILL (THANKS TO BEHZAD). Problem: given V = {v1, v2, …, vn}, find a subset of p nodes (2 <= p <= n) such that some distance function between nodes is maximized.





  2. Problem: • given V = {v1, v2, …, vn}, find a subset of p nodes (2 <= p <= n) such that some distance function between nodes is maximized • My first reaction: sounds like a Max-k Cover Problem except instead of covering, maximizing distances

  3. Max-Min Facility Dispersion (MMFD) • Given a non-negative, symmetric distance function w(x, y), where x, y ∈ V • Find a subset $P = \{v_{i_1}, v_{i_2}, \ldots, v_{i_p}\}$ of V with |P| = p such that $f(P) = \min_{x, y \in P,\, x \neq y} w(x, y)$ is maximized.

  4. Max-Avg Facility Dispersion (MAFD) • Given a non-negative, symmetric distance function w(x, y), where x, y ∈ V • Find a subset $P = \{v_{i_1}, v_{i_2}, \ldots, v_{i_p}\}$ of V with |P| = p such that $f(P) = \frac{2}{p(p-1)} \sum_{\{x, y\} \subseteq P,\, x \neq y} w(x, y)$, the average distance over all pairs in P, is maximized.
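To make the two objectives concrete, here is a minimal Python sketch (not from the slides) that evaluates f(P) for MMFD and MAFD and finds the best subset by exhaustive search. The names `min_dispersion`, `avg_dispersion`, and `brute_force`, and the sample points, are illustrative assumptions; the search is exponential and only meant for tiny inputs.

```python
from itertools import combinations

def min_dispersion(P, w):
    """MMFD objective: the smallest pairwise distance within P."""
    return min(w(x, y) for x, y in combinations(P, 2))

def avg_dispersion(P, w):
    """MAFD objective: the mean distance over all unordered pairs in P."""
    p = len(P)
    return 2.0 / (p * (p - 1)) * sum(w(x, y) for x, y in combinations(P, 2))

def brute_force(V, p, w, objective):
    """Try every p-subset of V and keep one with the largest objective value."""
    return max(combinations(V, p), key=lambda P: objective(P, w))

# Hypothetical example data: five points on a line, absolute difference as w.
points = [0, 1, 4, 9, 10]
dist = lambda x, y: abs(x - y)
best_min = brute_force(points, 3, dist, min_dispersion)  # objective value 4
best_avg = brute_force(points, 3, dist, avg_dispersion)  # objective value 20/3
```

The two objectives can prefer different subsets: max-min favors evenly spread points, while max-avg favors points pushed toward the extremes.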

  5. MMFD & MAFD are NP-Hard • Even when the distance function is a metric • Shown by a reduction from the NP-Complete problem CLIQUE • CLIQUE asks whether a given graph G = (V, E) contains a clique of size >= J

  6. Reduction • w(x, y) = 1 if x and y are connected by an edge, 0 otherwise • Set p = J • For MAFD, the optimal value is 1 if and only if G contains a clique of size J; if the optimal value is < 1, then no clique of size J exists • For MMFD, the optimal value is 1 if G contains a clique of size J and 0 if it does not
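The reduction can be sketched in a few lines of Python. This is an illustration under assumptions, not code from the paper: `clique_to_dispersion` and `has_clique` are hypothetical names, and the MMFD instance is solved by exhaustive search, so it only shows the correspondence between the two problems on tiny graphs (with 2 <= J <= n).

```python
from itertools import combinations

def clique_to_dispersion(edges):
    """Build the 0/1 weight function of the reduction from an edge list."""
    edge_set = {frozenset(e) for e in edges}
    return lambda x, y: 1 if frozenset((x, y)) in edge_set else 0

def has_clique(nodes, edges, J):
    """Decide CLIQUE by solving the MMFD instance with p = J exhaustively."""
    w = clique_to_dispersion(edges)
    best = max(
        min(w(x, y) for x, y in combinations(P, 2))
        for P in combinations(nodes, J)
    )
    # The optimal max-min value is 1 exactly when some J nodes are pairwise adjacent.
    return best == 1

# Hypothetical example graph: a triangle a-b-c with a pendant node d attached to c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
```

Here `has_clique(nodes, edges, 3)` holds because of the triangle, while `has_clique(nodes, edges, 4)` does not because a and d are not adjacent.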

  7. How do we solve these? • If we can’t get an optimal solution, we will settle for a good approximation • There are no absolute approximation algorithms for MMFD or MAFD unless P = NP • We want a relative approximation algorithm

  8. Use Greedy Algorithms “Greed is good.” - Gordon Gekko

  9. Max-Min Greedy Algorithm • Step 1. Let vi and vj be the endpoints of an edge of maximum weight. • Step 2. P ← {vi, vj}. • Step 3. while ( |P| < p ) do • begin • a. Find a node v ∈ V \ P such that $\min_{v' \in P} w(v, v')$ is maximum among the nodes in V \ P. • b. P ← P ∪ {v}. • end • Step 4. Output P. • Provides a 2-approximation to the optimal value when w satisfies the triangle inequality
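The four steps above translate almost directly into Python. The following is a minimal sketch, assuming distinct hashable nodes and a symmetric distance function `w`; the name `greedy_max_min` and the sample data are illustrative and not taken from the slides.

```python
from itertools import combinations

def greedy_max_min(V, p, w):
    """Greedy heuristic for MMFD, following Steps 1-4 of the slide."""
    # Steps 1-2: start with the endpoints of a maximum-weight edge.
    P = list(max(combinations(V, 2), key=lambda e: w(*e)))
    remaining = [v for v in V if v not in P]
    # Step 3: add the node whose nearest already-chosen node is farthest away.
    while len(P) < p:
        v = max(remaining, key=lambda u: min(w(u, q) for q in P))
        P.append(v)
        remaining.remove(v)
    # Step 4: output the placement.
    return P

# Hypothetical example data: five points on a line, absolute difference as w.
points = [0, 1, 4, 9, 10]
dist = lambda x, y: abs(x - y)
placement = greedy_max_min(points, 3, dist)  # picks 0 and 10 first, then 4
```

Each iteration scans every remaining node against every chosen node, so this straightforward version takes O(n^2) time for the initial edge search plus O(n p^2) for the loop.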

  10. Max-Avg Greedy Algorithm • Step 1. Let vi and vj be the endpoints of an edge of maximum weight. • Step 2. P ← {vi, vj}. • Step 3. while ( |P| < p ) do • begin • a. Find a node v ∈ V \ P such that $\sum_{v' \in P} w(v, v')$ is maximum among the nodes in V \ P. • b. P ← P ∪ {v}. • end • Step 4. Output P. • Provides a 4-approximation to the optimal value when w satisfies the triangle inequality
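The max-avg variant differs from the max-min greedy only in the selection rule of Step 3a: it sums the distances to the chosen nodes instead of taking their minimum. A minimal sketch under the same assumptions (distinct nodes, symmetric `w`; `greedy_max_avg` and the sample data are illustrative names and values):

```python
from itertools import combinations

def greedy_max_avg(V, p, w):
    """Greedy heuristic for MAFD, following Steps 1-4 of the slide."""
    # Steps 1-2: start with the endpoints of a maximum-weight edge.
    P = list(max(combinations(V, 2), key=lambda e: w(*e)))
    remaining = [v for v in V if v not in P]
    # Step 3: add the node with the largest total distance to the chosen nodes.
    while len(P) < p:
        v = max(remaining, key=lambda u: sum(w(u, q) for q in P))
        P.append(v)
        remaining.remove(v)
    # Step 4: output the placement.
    return P

# Hypothetical example data: five points on a line, absolute difference as w.
points = [0, 1, 4, 9, 10]
dist = lambda x, y: abs(x - y)
placement = greedy_max_avg(points, 4, dist)
```

On this input the greedy choice ends up with the two points at each end of the line, which is what the average-distance objective rewards.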

  11. Special Cases • For one dimensional data points, you can solve MMFD & MAFD optimally in polynomial time • For two dimensional data points, a polynomial-time heuristic approximates MAFD within a factor of π/2 (about 1.57), better than the greedy algorithm's factor of 4 • 2-D MMFD is NP-Hard, 2-D MAFD is open

  12. 1-D MAFD & MMFD • Restricting the points to 1-D allows for a dynamic programming optimal solution in polynomial time • O(max{n log n, pn}) • V = {x1, x2, …, xn}

  13. How it works • Sort the points in V (n log n time) • w(x, y) = distance from x to y • OPT(j, k) = the solution value with k points picked from x1, …, xj • OPT(n, p) = optimal solution for the whole set

  14. Recursive Statement • OPT(j, k) = max {OPT(j-1, k), OPT(j-1, k-1) ∪ {xj}} • That is, either skip xj, or add xj to the best choice of k-1 points from x1, …, xj-1

  15. Runtime MAFD • OPT(j-1, k) and OPT(j-1, k-1) are constant time lookups • Store the representative of OPT(j-1, k-1) in μ(j-1, k-1) • OPT(j-1, k-1) ∪ {xj} is constant time: w(xj, μ(j-1, k-1)) + OPT(j-1, k-1)*(k-1) / k = average distance

  16. Runtime MMFD • Store the most recently picked element in the optimal solution in f(j-1, k-1) • This gives a constant time computation of OPT(j-1, k-1) ∪ {xj}: min {OPT(j-1, k-1), w(xj, f(j-1, k-1))}

  17. Runtime • Both are O(n log n + pn), since the computation time per iteration is constant if the right information is stored

  18. The Dynamic Programming Algorithm
      (* In the following, array F represents the function f in the formulation. *)
      Step 1. Sort the given points, and let {x1, x2, …, xn} denote the points in increasing order.
      Step 2. for j := 1 to n do F[0, j] ← 0;
      Step 3. F[1, 1] ← 0.
      Step 4. (* Compute the value of an optimal placement; treat F[k, j - 1] as -∞ when k > j - 1 *)
              for j := 2 to n do
                for k := 1 to min(p, j) do
                begin
                  t1 ← F[k, j - 1] + k(p - k)(xj - xj-1);
                  t2 ← F[k - 1, j - 1] + (k - 1)(p - k + 1)(xj - xj-1);
                  if t1 > t2 then F[k, j] ← t1   (* do not include xj *)
                  else F[k, j] ← t2;              (* include xj *)
                end;

  19. The Algorithm cont.
      Step 5. (* Construct an optimal placement *)
              P ← {x1}; k ← p; j ← n;
              while k > 1 do
              begin
                if F[k, j] = F[k - 1, j - 1] + (k - 1)(p - k + 1)(xj - xj-1) then
                begin   (* xj to be included in optimal placement *)
                  P ← P ∪ {xj};
                  k ← k - 1;
                end;
                j ← j - 1;
              end;
      Step 6. Output P.
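The terms k(p - k)(xj - xj-1) rest on a standard identity: for p sorted points on a line, the sum of all pairwise distances equals the sum, over each gap between consecutive input points, of (chosen points to the left) × (chosen points to the right) × (gap length). Below is a minimal Python sketch of Steps 1-6 under that reading. The name `max_avg_1d` is illustrative, and the -∞ initialisation of table entries with k > j is an assumption added here so that infeasible states are never selected.

```python
def max_avg_1d(points, p):
    """Return (placement, average pairwise distance) for 1-D MAFD, 2 <= p <= n."""
    xs = sorted(points)                      # Step 1
    n = len(xs)
    NEG = float("-inf")                      # marks infeasible states (assumption)
    # F[k][j]: best total gap contribution up to x_j with k of x_1..x_j chosen.
    F = [[NEG] * (n + 1) for _ in range(p + 1)]
    for j in range(1, n + 1):                # Step 2
        F[0][j] = 0.0
    F[1][1] = 0.0                            # Step 3

    def take(k, j):
        """Value t2 of the slide: x_j is included as the k-th chosen point."""
        gap = xs[j - 1] - xs[j - 2]
        return F[k - 1][j - 1] + (k - 1) * (p - k + 1) * gap

    for j in range(2, n + 1):                # Step 4
        gap = xs[j - 1] - xs[j - 2]
        for k in range(1, min(p, j) + 1):
            skip = F[k][j - 1] + k * (p - k) * gap   # value t1: x_j not included
            F[k][j] = max(skip, take(k, j))

    placement, k, j = [xs[0]], p, n          # Step 5: trace back through the table
    while k > 1:
        if F[k][j] == take(k, j):
            placement.append(xs[j - 1])
            k -= 1
        j -= 1
    average = 2.0 * F[p][n] / (p * (p - 1))
    return sorted(placement), average        # Step 6

# Hypothetical example data.
placement, average = max_avg_1d([0, 1, 4, 9, 10], 4)
```

The table has (p + 1)(n + 1) entries and each is filled in constant time, which matches the O(n log n + pn) bound once sorting is included.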

  20. 2-D MAFD Heuristic • Uses 1-D MAFD algorithm as the base • Gives a π/2-approximation

  21. How it works • given V = {v1, v2, …, vn} • vi = (xi, yi) (coordinates) • p <= n = |V|

  22. The Algorithm • Step 1. Obtain the projections of the given set V of points on each of the four axes defined by the equations • y = 0, y = x, x = 0, and y = -x • Step 2. Find optimal solutions to each of the four resulting instances of 1-D MAFD. • Step 3. Return the placement corresponding to the best of the four solutions found in Step 2.
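The three steps can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact procedure: `solve_1d`, `avg_distance`, and `max_avg_2d` are hypothetical names, the 1-D solver is a compact gap-contribution dynamic program with -∞ marking infeasible states, and "best of the four" is taken to mean the placement with the largest true 2-D average Euclidean distance.

```python
from itertools import combinations
from math import dist, sqrt

def solve_1d(values, p):
    """Indices of an optimal 1-D MAFD placement for the given coordinates."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    n, NEG = len(xs), float("-inf")
    F = [[NEG] * (n + 1) for _ in range(p + 1)]
    for j in range(1, n + 1):
        F[0][j] = 0.0
    F[1][1] = 0.0

    def take(k, j):  # value when the j-th sorted point is included
        return F[k - 1][j - 1] + (k - 1) * (p - k + 1) * (xs[j - 1] - xs[j - 2])

    for j in range(2, n + 1):
        for k in range(1, min(p, j) + 1):
            skip = F[k][j - 1] + k * (p - k) * (xs[j - 1] - xs[j - 2])
            F[k][j] = max(skip, take(k, j))
    chosen, k, j = [order[0]], p, n
    while k > 1:  # trace back which points were included
        if F[k][j] == take(k, j):
            chosen.append(order[j - 1])
            k -= 1
        j -= 1
    return chosen

# Unit direction vectors of the axes y = 0, y = x, x = 0, and y = -x.
R = 1 / sqrt(2)
AXES = [(1.0, 0.0), (R, R), (0.0, 1.0), (R, -R)]

def avg_distance(points):
    """2-D MAFD objective: mean Euclidean distance over all pairs."""
    p = len(points)
    return 2.0 / (p * (p - 1)) * sum(dist(a, b) for a, b in combinations(points, 2))

def max_avg_2d(points, p):
    best = None
    for dx, dy in AXES:
        proj = [x * dx + y * dy for x, y in points]           # Step 1: project
        placement = [points[i] for i in solve_1d(proj, p)]    # Step 2: solve 1-D
        if best is None or avg_distance(placement) > avg_distance(best):
            best = placement                                  # Step 3: keep best
    return best

# Hypothetical example data: two small clusters of points.
pts = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
result = max_avg_2d(pts, 2)
```

Scoring the four candidates by their actual 2-D objective is one reasonable choice; comparing their 1-D values instead would be another reading of Step 3.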

  23. Relation to Study Group Formation & High Variance Clusters • These create one maximum distance group, not k max distance groups • If you want k high-variance clusters, set p = n/k and run the algorithm (whichever you choose) k-1 times on the points not yet assigned (the last n/k points form the last cluster) • This could guarantee that the first couple of groups have a high variance, but not the later ones

  24. Study group formation • Most study groups only study one subject • If you wanted to assign students one study group per subject, you could simplify their attributes to one dimension per subject and solve each subject optimally. • Instead of the exact algorithm described, minimize the distance from the mean, but stay on the opposite side of the mean from the teacher node • Maybe have positive & negative distances to reflect which side of the mean a point is on • This would ensure that people who would learn (under the mean) would be picked before people who would not learn • You want multiple study groups and the highest amount of learning • Not sure how to do this…

  25. References • S.S. Ravi, D.J. Rosenkrantz, and G.K. Tayi. 1994. Heuristic and Special Case Algorithms for Dispersion Problems.
