  1. Detailed Description of an Algorithm for Enumeration of Maximal Frequent Sets with Irredundant Dualization (Irredundant Border Enumerator) Takeaki Uno, Ken Satoh, National Institute of Informatics, JAPAN. 19/Nov/2003, FIMI 2003

  2. Outline of This Talk ・ Explanation of our algorithm (an improved version of the algorithm of Gunopulos et al.) ・ Algorithmic techniques using sparseness ・ Computational experiments on datasets

  3. Algorithm of Gunopulos et al. [Figure: the itemset lattice, from 00…0 (the empty set) up to 11…1 (the set of all items)] 1. Find, by dualization, the minimal itemsets not included in any maximal frequent set found so far. 2. If one of them is frequent, find a maximal frequent set including it, and go to 1.
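A minimal Python sketch of this loop, assuming a tiny in-memory dataset of transactions given as frozensets. The helper names (`is_frequent`, `extend_to_maximal`, `minimal_uncovered_sets`) are illustrative, and the brute-force subset enumeration only stands in for a real dualization algorithm:

```python
from itertools import combinations

def is_frequent(itemset, transactions, minsup):
    """An itemset is frequent if at least `minsup` transactions contain it."""
    return sum(itemset <= t for t in transactions) >= minsup

def extend_to_maximal(itemset, items, transactions, minsup):
    """Greedily add items while the set stays frequent (step 2 of the loop)."""
    current = set(itemset)
    for e in sorted(items):
        if e not in current and is_frequent(current | {e}, transactions, minsup):
            current.add(e)
    return frozenset(current)

def minimal_uncovered_sets(items, maximal_sets):
    """Minimal itemsets not contained in any maximal frequent set found so far.
    Brute force over subsets in increasing size; stands in for the dualization."""
    result = []
    for k in range(len(items) + 1):
        for combo in combinations(sorted(items), k):
            x = frozenset(combo)
            if any(x <= m for m in maximal_sets):
                continue                      # covered by a known maximal set
            if any(y < x for y in result):
                continue                      # not minimal: a smaller uncovered set exists
            result.append(x)
    return result

def dualize_and_advance(items, transactions, minsup):
    """The loop of Gunopulos et al.: dualize, then advance to a new maximal set."""
    maximal_sets = set()
    while True:
        candidates = minimal_uncovered_sets(items, maximal_sets)      # step 1
        frequent = [x for x in candidates
                    if is_frequent(x, transactions, minsup)]
        if not frequent:                      # every uncovered minimal set is infrequent: done
            return maximal_sets
        maximal_sets.add(                     # step 2
            extend_to_maximal(frequent[0], items, transactions, minsup))
```

For example, with `transactions = [frozenset('ab'), frozenset('abc'), frozenset('bcd')]`, `items = set('abcd')` and `minsup = 2`, the loop returns the two maximal frequent sets {a, b} and {b, c}.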


  6. Algorithm of Gunopulos et al. [Figure: the itemset lattice] The loop (1. find the minimal sets by dualization; 2. if one of them is frequent, find a maximal frequent set including it and go to 1) solves the dualization many times and finds the same minimal sets many times. Our algorithm dualizes and finds the maximal elements simultaneously: irredundant dualization.

  7. Our Algorithm [Figure: the itemset lattice] 1. When a frequent minimal set is found during the dualization, find a maximal frequent set including it and add it to the current collection of maximal sets.


  10. Our Algorithm [Figure: the itemset lattice] It finds each minimal set only once, it solves a single dualization, and the dualization can accept additional input sets. 1. When a frequent minimal set is found during the dualization, find a maximal frequent set including it and add it to the current collection. Incremental dualization: the algorithms of Kavvadias and Stavropoulos, or of Uno.

  11. Incremental Dualization [Figure: the search tree of the incremental dualization, whose nodes are labelled with itemsets such as φ, CE, CD, BCD, ACD, CDE, BCDE, ACDE] The algorithms of Kavvadias and Stavropoulos, and of Uno. (!! The input sets are the complements of the maximal sets, in terms of the dualization.)
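The sketch below, which reuses `is_frequent` and `extend_to_maximal` from the earlier sketch, ties slides 10 and 11 together: the dualization is run once, its input sets are the complements of the maximal frequent sets, and a new input set is appended whenever a frequent minimal set is found. The incremental step is a simple Berge-style transversal update, used here only as a stand-in for the Kavvadias-Stavropoulos and Uno algorithms; it illustrates the interface (the dualization accepts additional input sets), not the irredundancy guarantee of the actual algorithm:

```python
def minimal_elements(family):
    """Keep only the inclusion-minimal sets of a family."""
    family = set(family)
    return {s for s in family if not any(t < s for t in family)}

def add_input_set(transversals, new_edge):
    """Berge-style incremental step: minimal transversals after one more input set."""
    hit   = {t for t in transversals if t & new_edge}
    miss  = {t for t in transversals if not (t & new_edge)}
    grown = {t | {e} for t in miss for e in new_edge}
    return minimal_elements(hit | grown)

def enumerate_maximal_frequent(items, transactions, minsup):
    """A single dualization whose input grows as maximal frequent sets are found.
    The input sets are the complements of the maximal sets, so the minimal
    transversals are exactly the minimal itemsets not contained in any maximal
    frequent set found so far."""
    items = frozenset(items)
    maximal_sets = []
    transversals = {frozenset()}          # transversals of the empty input family
    pending = [frozenset()]               # minimal sets still to be tested for frequency
    while pending:
        x = pending.pop()
        if is_frequent(x, transactions, minsup):
            m = extend_to_maximal(x, items, transactions, minsup)
            maximal_sets.append(m)
            transversals = add_input_set(transversals, items - m)
            pending = list(transversals)  # re-test the updated minimal sets
        # otherwise x is a minimal infrequent set and stays in the final border
    return maximal_sets, transversals
```

At termination the remaining minimal transversals are exactly the minimal infrequent sets, i.e. the other half of the border.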

  12. Algorithm Technique: crit. (Notation: |items| = # items, |max sets| = # maximal sets.) The algorithms of Kavvadias and Stavropoulos and of Uno check minimality many times, and each check takes O(|max sets|×|items|) time. The algorithm of Uno checks it by using crit (critical elements): crit(e,H) ≠ φ for every e in H ⇔ H is minimal. crit can be updated for H∪{e} in O(|max sets|) time, an improving factor of O(|items|).
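A small sketch of the crit idea, stated for a generic transversal H over a family of input sets (in this setting the input sets are the complements of the maximal sets). The function names and the dictionary-based update are illustrative; the O(|max sets|)-time bound quoted on the slide comes from Uno's data structures, while the code only spells out the two identities that make such an update possible:

```python
def crit(e, H, input_sets):
    """Input sets that e alone hits among the elements of H.
    H (a transversal) is minimal  <=>  crit(e, H) is non-empty for every e in H."""
    return [S for S in input_sets if e in S and not (S & (H - {e}))]

def is_minimal_transversal(H, input_sets):
    return (all(H & S for S in input_sets)              # H hits every input set
            and all(crit(e, H, input_sets) for e in H))

def update_crit(crit_map, H, f, input_sets):
    """crit after growing H by one element f:
       crit(e, H∪{f}) = crit(e, H) minus the sets containing f   (sets only leave),
       crit(f, H∪{f}) = sets hit by f and by no element of H."""
    new_map = {e: [S for S in ce if f not in S] for e, ce in crit_map.items()}
    new_map[f] = [S for S in input_sets if f in S and not (S & H)]
    return new_map
```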

  13. Using Sparseness. Checking minimality for all H∪{e} takes O(|max. sets|×|items|) time. Checking them by tracing each maximal set instead reduces the factor |items| to the average size of the maximal sets. [Figure: the remaining maximal sets traced for items e1, …, e6 to obtain crit(*, H∪{*})]
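A hedged sketch of the kind of saving meant here, continuing the crit notation above: rather than testing every candidate item e separately, trace only the sets stored in the crit lists and count, for each item occurring in them, how many elements of H would lose their criticality in H∪{e}. The work is proportional to the total size of the traced sets rather than to |H|×|items|; relating that total to the average size of the maximal sets is the accounting done on the slide, not something this code proves:

```python
from collections import defaultdict

def criticality_losses(crit_map):
    """For every item e, count how many g in H satisfy crit(g, H∪{e}) = ∅.
    g loses its criticality exactly when e occurs in every set of crit(g, H),
    so tracing the items of those sets suffices; no loop over all items."""
    losses = defaultdict(int)
    for g, crit_sets in crit_map.items():
        occurrences = defaultdict(int)
        for S in crit_sets:
            for e in S:
                occurrences[e] += 1
        for e, count in occurrences.items():
            if e != g and count == len(crit_sets):
                losses[e] += 1
    return losses   # H∪{e} can be minimal only if losses[e] == 0
```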

  14. Summary - Irredundant dualization → a factor of O(1/|max. sets|) - Checking minimality by crit → a factor of O(1/|items|) - Speeding up by sparseness → a factor of O(size of max. sets / |items|) Altogether, the computation time is reduced by a factor of O(size of max. sets / (|items|² |max. sets|)).
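Written out, with |I| for the number of items, |MF| for the number of maximal frequent sets, and s for the (average) size of the maximal sets, the three factors above multiply to the stated overall reduction:

```latex
\frac{1}{|\mathcal{MF}|} \cdot \frac{1}{|I|} \cdot \frac{s}{|I|}
\;=\; \frac{s}{|I|^{2}\,|\mathcal{MF}|}
```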

  15. Comparison to Bottom-up Approaches - The computation time depends on: bottom-up approaches (e.g., Apriori) → #frequent sets, #closed sets; our algorithm → #maximal frequent sets, #minimal infrequent sets. For instances with few minimal infrequent sets, our algorithm performs well.

  16. Experiments

  17. Conclusion - We improved the algorithm of Gunopulos et al. by irredundant dualization and by algorithms that exploit sparseness. - The computation time depends on #maximal frequent sets and #minimal infrequent sets (reduced by a factor of size of max. sets / (|items|² |max. sets|)). For further improvements: - Speed up the dualization by pruning unnecessary items. - Speed up the updating of occurrences by the usual techniques.
