
Efficient and Effective Practical Algorithms for the Set-Covering Problem

Efficient and Effective Practical Algorithms for the Set-Covering Problem. Qi Yang, Jamie McPeek, Adam Nofsinger Department of Computer Science and Software Engineering University of Wisconsin at Platteville. The Set-Covering Problem. Given N sets, let X be the union of all the sets.


Presentation Transcript


  1. Efficient and Effective Practical Algorithms for the Set-Covering Problem Qi Yang, Jamie McPeek, Adam Nofsinger Department of Computer Science and Software Engineering University of Wisconsin at Platteville

  2. The Set-Covering Problem • Given N sets, let X be the union of all the sets. A cover of X is a group of sets from the N sets such that every element of X belongs to a set in the group. • The set-covering problem is to find a cover of X of the minimum size.

  3. Matrix Representation of the Set-Covering Problem • Number of sets: N = 4 • Number of elements: M = 6 • One cover: S1, S3, S4 • One minimal cover: S1, S3 • Not a cover: S1, S2, S4 (a is not covered)
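
The slide's matrix did not survive in the transcript, so the sketch below uses hypothetical membership data chosen only to agree with the covers listed above (N = 4 sets, M = 6 elements a to f). The names `SETS`, `MATRIX`, `is_cover`, and `minimum_cover` are illustrative assumptions, and the exhaustive search is shown only to make "minimum cover" concrete, not as one of the presented algorithms.

```python
from itertools import combinations

# Hypothetical membership data: the slide's matrix is not in the transcript.
# These four sets were chosen only to agree with the covers the slide lists.
SETS = {
    "S1": {"b", "c", "e"},
    "S2": {"b", "d"},
    "S3": {"a", "d", "f"},
    "S4": {"c", "e"},
}
ELEMENTS = sorted(set().union(*SETS.values()))  # X, the union of all sets

# Matrix form: one row per set, one column per element, 1 = set contains element.
MATRIX = {name: [int(e in members) for e in ELEMENTS] for name, members in SETS.items()}


def is_cover(names):
    """Return True if the named sets together contain every element of X."""
    covered = set()
    for name in names:
        covered |= SETS[name]
    return covered == set(ELEMENTS)


def minimum_cover():
    """Exhaustive search for a smallest cover; exponential, for tiny inputs only."""
    for size in range(1, len(SETS) + 1):
        for names in combinations(SETS, size):
            if is_cover(names):
                return list(names)
    return []
```

With this data, `is_cover(["S1", "S2", "S4"])` is false because neither a nor f belongs to any of those sets.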

  4. NP-Hard Problem • Introduction to Algorithms by T. H. Cormen, C. E. Leiserson, R. L. Rivest • The set-covering problem has been proved to be NP-hard • A Greedy Algorithm

  5. Algorithm Greedy • ResultCover: the cover to be found (an approximation of the minimum cover) • Uncovered: the set of elements not covered yet 1. Set ResultCover to the empty set 2. Set Uncovered to the union of all sets 3. While Uncovered is not empty • select a set S that is not in ResultCover and covers the most elements of Uncovered • add S to ResultCover • remove all elements of S from Uncovered
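
The three steps above can be sketched with Python sets as follows. This is a minimal illustration rather than the authors' implementation: the function name `greedy_cover`, the dictionary input format, and the tie-breaking rule (first set in input order wins) are assumptions.

```python
def greedy_cover(sets):
    """Greedy heuristic: repeatedly take the set covering the most uncovered elements.

    `sets` maps a set name to a Python set of elements (an assumed input format).
    """
    result_cover = []
    uncovered = set().union(*sets.values())
    while uncovered:
        # Re-scan every remaining set on each pass, as step 3 describes.
        best = max(
            (name for name in sets if name not in result_cover),
            key=lambda name: len(sets[name] & uncovered),
        )
        result_cover.append(best)
        uncovered -= sets[best]
    return result_cover
```

Because every remaining set is re-counted on each pass, the cost per iteration grows with both the number of sets and their sizes.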

  6. Algorithm Check And Remove (CAR) • Identifying Redundant Search Engines in a Very Large Scale Metasearch Engine Context • 8th ACM International Workshop on Web Information and Data Management • The set-covering problem is equivalent to the problem of identifying redundant search engines on the Web • Algorithm CAR is much faster than Algorithm Greedy

  7. Algorithm CAR (Check And Remove) 1. Set ResultCover to the empty set 2. For each set S • determine if S has an element that is not covered by ResultCover • add S to ResultCover if S has such an element • exit the for loop if ResultCover is a cover of X 3. For each set S in ResultCover • determine if S has an element that is not covered by any other set of ResultCover • remove S from ResultCover if S has no such element
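
A minimal sketch of the check and remove phases, assuming the sets arrive as a name-to-set dictionary and are visited in insertion order. The name `car_cover` is hypothetical, and the remove phase is written for readability: it rebuilds the union of the other sets each time, so it does not attempt to match the O(M * N) bound reported in the talk.

```python
def car_cover(sets):
    """Check And Remove: one pass to add sets, one pass to drop redundant ones."""
    universe = set().union(*sets.values())
    result_cover = []
    covered = set()

    # Check phase: keep a set only if it contributes a not-yet-covered element.
    for name, members in sets.items():
        if members - covered:
            result_cover.append(name)
            covered |= members
        if covered == universe:
            break

    # Remove phase: drop a set when every element it holds is in another kept set.
    for name in list(result_cover):
        others = set()
        for other in result_cover:
            if other != name:
                others |= sets[other]
        if sets[name] <= others:
            result_cover.remove(name)
    return result_cover
```

The result depends on the order in which sets are visited, since the check phase never looks ahead for a better set.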

  8. Example
     Set           ResultCover      Uncovered
     (initial)     {}               {a, b, c, d, e, f}
     S1            {S1}             {a, d, f}
     S2            {S1, S2}         {a, f}
     S3            {S1, S2, S3}     {}
     Removing S2   {S1, S3}         {}

  9. Time Complexity • Algorithm Greedy: O(M * N * min(M, N)) • Algorithm CAR: O(M * N) • N: number of sets • M: number of elements of the union X

  10. CPU Time [Chart: CPU time (sec) versus actual cover size, comparing Greedy and CAR]

  11. Cover Size

  12. Implementation Details • Read data: a binary search tree, with a BitMap per element indicating which sets cover that element • Convert the tree to an array of BitMaps: the matrix representation of the set-covering problem • Find a cover

  13. Binary Search Tree and BitMap • Number of sets (N) is known • Number of elements of each set is known • The total number of elements is unknown • Elements are read one set at a time • Each tree node holds an element and a BitMap of size N recording which sets cover that element (a column of the matrix)
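
The reading step can be sketched as below. Two substitutions are assumptions made for brevity: a Python dictionary stands in for the binary search tree, and an arbitrary-precision `int` stands in for the N-bit BitMap (bit i is 1 when set i contains the element). The function names are hypothetical.

```python
def build_column_bitmaps(sets_in_order):
    """Map each element to an int whose bit i is 1 when set i contains the element."""
    columns = {}  # stands in for the binary search tree keyed by element
    for set_index, members in enumerate(sets_in_order):
        # Elements are read one set at a time; unseen elements get a new entry.
        for element in members:
            columns[element] = columns.get(element, 0) | (1 << set_index)
    return columns


def columns_to_array(columns):
    """Flatten the lookup structure into an element list and a parallel bitmap list."""
    elements = sorted(columns)
    return elements, [columns[e] for e in elements]
```

A keyed lookup structure is needed here because the total number of distinct elements is not known until every set has been read.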

  14. Array of Column BitMaps (one BitMap per element: e1, e2, e3, e4, ..., em-1, em) • Row Operations • Find the number of elements in a set that are not covered by the result cover • Determine if a set contains an element that is not covered by the result cover • Determine if a set in the result cover has an element that is not covered by any other sets in result cover • …

  15. Array of Row BitMaps • It takes some time to convert column BitMaps to row BitMaps. • But all row operations are performed within a row BitMap.
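
A sketch of the conversion and of two row operations, again using Python integers as BitMaps. The function names and the bit layout (bit j of a row corresponds to element j) are assumptions for illustration. Once the rows exist, each row operation reduces to a few bitwise operations on a single integer.

```python
def columns_to_rows(column_bitmaps, num_sets):
    """Transpose: row i gets bit j set when column j has bit i set."""
    rows = [0] * num_sets
    for j, column in enumerate(column_bitmaps):
        for i in range(num_sets):
            if (column >> i) & 1:
                rows[i] |= 1 << j
    return rows


def uncovered_count(row, covered):
    """Number of elements in a set (row bitmap) that the covered bitmap lacks."""
    return bin(row & ~covered).count("1")


def has_uncovered(row, covered):
    """True if the set has at least one element outside the covered bitmap."""
    return (row & ~covered) != 0
```

The transpose costs one pass over all N * M bits, which is the conversion time the slide mentions.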

  16. CPU Time The CPU time includes the time to convert column BitMaps to row BitMaps, but not the time to build the tree.

  17. CPU Time (Row BitMap)

  18. Algorithm Greedy 1. Set ResultCover to the empty set 2. Set Uncovered to the union of all sets 3. While Uncovered is not empty • select a set S that is not in ResultCover and covers the most elements of Uncovered • add S to ResultCover • remove all elements of S from Uncovered

  19. Algorithm Greedy Update • UncoveredCount: number of elements of a set not covered by ResultCover 1. Set ResultCover to the empty set 2. Set Uncovered to the union of all sets 3. For each set, set the UncoveredCount to the size of the set 4. While Uncovered is not empty • select a set that has the largest value of UncoveredCount among all sets not in ResultCover • add the set to ResultCover • remove all elements of the set from Uncovered • update the value of UncoveredCount for each set not in ResultCover

  20. Update Uncovered Count • For each element in the set to be added to ResultCover: if ResultCover does not already cover the element, then for each set not in ResultCover that contains the element, decrement that set's UncoveredCount by one
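
Algorithm Greedy Update and its counter-update step might look like the sketch below. It assumes a name-to-set dictionary as input; `greedy_update_cover` and the local variable names are hypothetical, and ties are broken by input order.

```python
def greedy_update_cover(sets):
    """Greedy selection driven by per-set counters instead of re-counting each pass."""
    uncovered = set().union(*sets.values())
    uncovered_count = {name: len(members) for name, members in sets.items()}
    result_cover = []
    chosen = set()

    while uncovered:
        best = max(
            (name for name in sets if name not in chosen),
            key=lambda name: uncovered_count[name],
        )
        result_cover.append(best)
        chosen.add(best)
        for element in sets[best]:
            if element in uncovered:  # not yet covered by ResultCover
                uncovered.discard(element)
                # Every other unchosen set holding this element loses one count.
                for name, members in sets.items():
                    if name not in chosen and element in members:
                        uncovered_count[name] -= 1
    return result_cover
```

Each element triggers the inner update loop at most once, at the moment it first becomes covered, which is where the saving over plain Greedy comes from.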

  21. Time Complexity • Algorithm Greedy: O(M * N * min(M, N)) • Algorithm CAR: O(M * N) • Algorithm Greedy Update: O(M * N)
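
One way to read these bounds, offered as a sketch and not as the authors' derivation: each Greedy iteration adds one new set and covers at least one new element, so there are at most min(M, N) iterations. The CAR line assumes the remove phase can test a set in O(M) time (for example with per-element cover counts), and the Greedy Update line assumes a constant-time membership test.

```latex
\begin{align*}
T_{\text{Greedy}} &\le \underbrace{\min(M, N)}_{\text{iterations}} \cdot
    \underbrace{N}_{\text{sets scanned}} \cdot
    \underbrace{M}_{\text{elements per set}}
    = O(M N \min(M, N)) \\
T_{\text{CAR}} &\le \underbrace{N \cdot M}_{\text{check phase}} +
    \underbrace{N \cdot M}_{\text{remove phase}}
    = O(M N) \\
T_{\text{Greedy Update}} &\le \underbrace{\min(M, N) \cdot N}_{\text{selecting the largest count}} +
    \underbrace{M \cdot N}_{\text{one update pass per newly covered element}}
    = O(M N)
\end{align*}
```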

  22. CPU Time

  23. Algorithm List And Remove (LAR) • Implements the matrix using linked lists instead of an array of BitMaps • Algorithm Greedy Update plus the remove phase from Algorithm CAR
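
The transcript does not show the linked-list layout, so the following is only a sketch of the idea under stated assumptions: plain Python lists stand in for linked lists, the matrix is stored sparsely in both directions (set to elements, element to sets), and a per-element counter `times_covered` is an added helper that makes the remove phase cheap. The name `lar_cover` is hypothetical.

```python
def lar_cover(sets):
    """Sketch of List And Remove: counter-updating greedy pass, then a remove pass."""
    # Sparse matrix as adjacency lists (Python lists stand in for linked lists).
    set_to_elements = {name: list(members) for name, members in sets.items()}
    element_to_sets = {}
    for name, members in set_to_elements.items():
        for element in members:
            element_to_sets.setdefault(element, []).append(name)

    uncovered_count = {name: len(m) for name, m in set_to_elements.items()}
    times_covered = {element: 0 for element in element_to_sets}
    remaining = len(element_to_sets)  # elements still uncovered
    result_cover = []
    chosen = set()

    while remaining:
        best = max((n for n in sets if n not in chosen), key=uncovered_count.get)
        result_cover.append(best)
        chosen.add(best)
        for element in set_to_elements[best]:
            if times_covered[element] == 0:
                remaining -= 1
                # Only the sets that actually hold this element are visited.
                for name in element_to_sets[element]:
                    uncovered_count[name] -= 1
            times_covered[element] += 1

    # Remove phase: a set is redundant if each of its elements is covered twice or more.
    for name in list(result_cover):
        if all(times_covered[e] >= 2 for e in set_to_elements[name]):
            result_cover.remove(name)
            for e in set_to_elements[name]:
                times_covered[e] -= 1
    return result_cover
```

Walking only the nonzero entries of the matrix is the design point: the update step touches just the sets that contain a newly covered element, instead of testing all N sets.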

  24. Linked List for Matrix

  25. CPU Time

  26. Cover Size

  27. Cover Size (Different Data Sets)

  28. Summary • Algorithm LAR runs faster than Algorithm CAR • Algorithm LAR generates smaller cover sets than Algorithm CAR • Algorithm: updating vs. searching every time • Data structure: linked list vs. array of BitMaps

  29. Questions?
