Detailed Description of an Algorithm for Enumeration of Maximal Frequent Sets with Irredundant Dualization
Irredundant Border Enumerator
Takeaki Uno, Ken Satoh
National Institute of Informatics, JAPAN
19/Nov/2003, FIMI 2003
Outline of This Talk
・ Explanation of our algorithm (improved version of Gunopulos et al.)
・ Algorithm technique using sparseness
・ Computational experiments for datasets
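Algorithm of Gunopulos et al.
1. Find the minimal sets not included in any maximal frequent set found so far, by dualization.
2. If one of them is frequent, find a maximal frequent set including it, and go to 1.
(Figure: itemset lattice from 00…0 at the bottom to 11…1 at the top.)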
Algorithm of Gunopulos et al.
- solves dualization many times
- finds the same minimal set many times
Our algorithm dualizes and finds maximal elements simultaneously: Irredundant Dualization
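The two-step loop of Gunopulos et al. can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dualization step is a brute-force search for minimal hitting sets of the complements of the maximal sets found so far, and all function names (`support`, `minimal_hitting_sets`, `extend_to_maximal`, `dualize_and_advance`) are hypothetical. The counter shows how often the dualization is restarted from scratch.

```python
from itertools import combinations

def support(itemset, transactions):
    # Number of transactions that contain the itemset.
    return sum(1 for t in transactions if itemset <= t)

def minimal_hitting_sets(family, items):
    # Brute force (assumption: tiny inputs only). Candidates are generated
    # smallest first, so any candidate containing an earlier hit is not minimal.
    found = []
    for k in range(len(items) + 1):
        for cand in combinations(sorted(items), k):
            c = frozenset(cand)
            if any(h <= c for h in found):
                continue
            if all(c & s for s in family):
                found.append(c)
    return found

def extend_to_maximal(itemset, items, transactions, minsup):
    # Greedily add items while the set stays frequent; one pass suffices
    # because support never increases when items are added.
    cur = set(itemset)
    for e in sorted(items):
        if e not in cur and support(cur | {e}, transactions) >= minsup:
            cur.add(e)
    return frozenset(cur)

def dualize_and_advance(transactions, minsup):
    items = frozenset().union(*transactions)
    maximal, dualizations = [], 0
    while True:
        # Step 1: dualize the complements of the maximal sets found so far.
        complements = [items - m for m in maximal]
        dualizations += 1
        minimal_sets = minimal_hitting_sets(complements, items)
        # Step 2: a frequent minimal set leads to a new maximal frequent set.
        frequent = [c for c in minimal_sets
                    if support(c, transactions) >= minsup]
        if not frequent:
            return maximal, minimal_sets, dualizations
        maximal.append(
            extend_to_maximal(frequent[0], items, transactions, minsup))
```

On the toy transactions abc, abc, ab, cd, cd with minimum support 2, this sketch returns the maximal frequent sets abc and cd and the minimal infrequent sets ad and bd, after three separate dualizations.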
Our Algorithm
1. When a frequent minimal set is found during dualization, find a maximal frequent set including it, and add it to the current set of maximal frequent sets.
(Figure: itemset lattice from 00…0 at the bottom to 11…1 at the top.)
Our Algorithm
- finds each minimal set once
- solves one dualization
- requires a dualization that can accept additional input: incremental dualization by Kavvadias and Stavropoulos, or by Uno
Incremental Dualization
- Algorithms of Kavvadias and Stavropoulos, and of Uno
  (!! the input sets are the complements in terms of dualization)
(Figure: example of incremental dualization on itemsets over the items A to E, starting from φ; the diagram is not recoverable.)
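The idea of a dualization that accepts additional input can be sketched as below. Note the swap: this uses the classical Berge-style update (extend the hitting sets that miss the new input set, then keep the minimal ones), not the depth-first algorithms of Kavvadias and Stavropoulos or of Uno named on the slide. All function names are hypothetical, and the input sets are the complements of the maximal frequent sets, as the slide warns.

```python
def support(itemset, transactions):
    # Number of transactions that contain the itemset.
    return sum(1 for t in transactions if itemset <= t)

def extend_to_maximal(itemset, items, transactions, minsup):
    # Greedily add items while the set stays frequent.
    cur = set(itemset)
    for e in sorted(items):
        if e not in cur and support(cur | {e}, transactions) >= minsup:
            cur.add(e)
    return frozenset(cur)

def add_set(min_hs, new_set):
    # Berge-style update: min_hs holds the minimal hitting sets of the
    # family seen so far; new_set is one additional input set.
    keep = [h for h in min_hs if h & new_set]
    extended = {h | {e} for h in min_hs if not (h & new_set) for e in new_set}
    fresh = [h for h in extended if not any(k <= h for k in keep)]
    fresh = [h for h in fresh if not any(g < h for g in fresh)]
    return keep + fresh

def enumerate_maximal(transactions, minsup):
    items = frozenset().union(*transactions)
    maximal = []
    border = [frozenset()]  # minimal hitting sets of the empty family
    while True:
        hit = next((h for h in border
                    if support(h, transactions) >= minsup), None)
        if hit is None:
            # Every minimal set is infrequent: the enumeration is complete.
            return maximal, border
        m = extend_to_maximal(hit, items, transactions, minsup)
        maximal.append(m)
        # Feed the complement of the new maximal set into the running
        # dualization instead of restarting it.
        border = add_set(border, items - m)
```

For simplicity the sketch rescans `border` on every round and recomputes supports; a real implementation would avoid both.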
Algorithm Technique: crit
(items: the itemset; |max sets|: the number of maximal sets)
- The algorithms of Kavvadias and Stavropoulos, and of Uno, check minimality many times (each check takes O(|max sets| × |items|) time)
- The algorithm of Uno checks it by using "crit" (critical elements): crit(e,H) ≠ φ for every e in H ⇔ H is minimal
- crit can be updated for H∪{e} in O(|max sets|) time
- improving factor = O(|items|)
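A small sketch of the crit bookkeeping, assuming the usual definition: for a set H and an element e of H, crit(e, H) is the collection of input sets S with S ∩ H = {e}. The names `crit`, `extend`, and `all_critical`, and the `uncov` list of input sets not yet hit by H, are hypothetical. The update touches each input set at most once, which is where the O(|max sets|) bound on the slide comes from.

```python
def crit(e, H, family):
    # Direct definition: input sets that H hits only through e.
    return [S for S in family if S & H == {e}]

def extend(H, crit_map, uncov, e_new):
    # Update crit for H ∪ {e_new} without rescanning the whole family.
    # Every input set sits in exactly one crit list or in uncov, or has
    # already been dropped, so this is one membership test per set.
    new_crit = {e: [S for S in sets if e_new not in S]
                for e, sets in crit_map.items()}
    new_crit[e_new] = [S for S in uncov if e_new in S]
    new_uncov = [S for S in uncov if e_new not in S]
    return H | {e_new}, new_crit, new_uncov

def all_critical(crit_map):
    # True iff crit(e, H) is non-empty for every e in H.
    return all(crit_map.values())

# Usage with the input sets {d} and {a, b}:
family = [frozenset("d"), frozenset("ab")]
H, crit_map, uncov = frozenset(), {}, list(family)
H, crit_map, uncov = extend(H, crit_map, uncov, "a")
H, crit_map, uncov = extend(H, crit_map, uncov, "d")
# H = {a, d} hits every input set and every element is critical.
```

Adding "b" to H = {a, d} empties crit("a"), which signals that {a, b, d} is not minimal.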
Using Sparseness
- Checking minimality for all H∪{e} takes O(|max. sets| × |items|) time
- Instead, check them by tracing each max. set: the factor |items| is replaced by the average size of the max. sets
(Figure: items e1 to e6 traced across the remaining max. sets to obtain crit(*, H∪*).)
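One possible reading of this slide, as a hedged sketch: since the dualization inputs are the complements of the max. sets, an input set contains e exactly when the corresponding max. set does not. The test for every candidate e can then be done by walking over the (sparse) max. sets themselves. The function `addable_items` and its index-based arguments are hypothetical and are not taken from the authors' code.

```python
from collections import Counter

def addable_items(H, items, max_sets, crit_idx, uncov_idx):
    # crit_idx[f]: indices i with H - max_sets[i] == {f}, for each f in H
    # uncov_idx:   indices i with H <= max_sets[i] (complement not yet hit)
    # Returns every e outside H such that each element of H ∪ {e}
    # still has a non-empty crit.
    safe = Counter()
    for f, idxs in crit_idx.items():
        seen = set()
        for i in idxs:
            # e in max_sets[i] means the i-th complement survives in crit(f).
            seen |= max_sets[i]
        for e in seen:
            safe[e] += 1
    in_uncov = Counter()
    for i in uncov_idx:
        in_uncov.update(max_sets[i])
    # e needs a critical set of its own: some unhit complement containing e,
    # that is, some max. set in uncov_idx that does not contain e.
    return {e for e in items - H
            if safe[e] == len(H) and in_uncov[e] < len(uncov_idx)}

# Usage with max. sets abc and cd over the items a, b, c, d:
items = frozenset("abcd")
max_sets = [frozenset("abc"), frozenset("cd")]
print(addable_items(frozenset("a"), items, max_sets, {"a": [1]}, [0]))
```

The work is proportional to the total size of the max. sets that are traced, rather than to |max. sets| × |items|.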
Summary
- Irredundant dualization: factor O(1/|max. sets|)
- Checking minimality by crit: factor O(1/|items|)
- Speeding up by sparseness: factor O(size of max sets / |items|)
Computation time is reduced by a factor of O(size of max sets / (|items|^2 × |max sets|))
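The overall factor is the product of the three individual factors. Writing s for the average size of the max. sets, n for |items|, and m for |max sets| (symbols introduced here only for this note):

```latex
\[
  \underbrace{\frac{1}{m}}_{\text{irredundant dualization}}
  \times
  \underbrace{\frac{1}{n}}_{\text{crit}}
  \times
  \underbrace{\frac{s}{n}}_{\text{sparseness}}
  \;=\;
  \frac{s}{n^{2}\, m}
\]
```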
Comparison to Bottom Up
- Computation time depends on:
  Bottom up approach (ex. apriori): #frequent sets, #closed sets
  Our algorithm: #max. frequent sets, #min. infrequent sets
- For instances with few minimal infrequent sets, our algorithm performs well
Conclusion
- We improved the algorithm of Gunopulos et al. by irredundant dualization and sparse algorithms
- The computation time depends on #max. frequent sets and #min. infrequent sets (reduced by a factor of O(size of max sets / (|items|^2 × |max sets|)))
For further improvements
- Speed up dualization by pruning unnecessary items
- Speed up updating occurrences by usual techniques