Expert Systems with Applications 34 (2008) 459–468

Multi-level fuzzy mining with multiple minimum supports. Yeong-Chyi Lee, Tzung-Pei Hong, Tien-Chin Wang. Presenter: Huai-Ping Chu. 2008/11/15.



  1. Expert Systems with Applications 34 (2008) 459–468. Multi-level fuzzy mining with multiple minimum supports. Yeong-Chyi Lee, Tzung-Pei Hong, Tien-Chin Wang. Presenter: Huai-Ping Chu. 2008/11/15

  2. Outline • Abstract • Introduction • Review of related mining algorithms • The proposed algorithm • An example • Conclusion

  3. Abstract • In real applications, different items may have different support criteria to judge their importance, taxonomic relationships among items may appear, and data may have quantitative values. • A fuzzy multiple-level mining algorithm for extracting knowledge implicit in quantitative transactions with multiple minimum supports of items is proposed to derive large itemsets and discover cross-level fuzzy association rules under the maximum-itemset minimum-taxonomy support constraint.

  4. Introduction • An association rule is expressed in the form A → B, where A and B are sets of items, such that the presence of A in a transaction implies the presence of B in the same transaction. • Srikant & Agrawal proposed a method for mining association rules from data sets using quantitative and categorical attributes. • Hong et al. proposed a fuzzy mining algorithm for managing quantitative data.

  5. Introduction (cont.) • Liu et al. proposed an approach for mining association rules with non-uniform minimum support values. It allows users to specify different minimum supports for different items and uses the lowest minimum support among the items in an itemset as the minimum support of the itemset. • Lee, Hong & Lin proposed a simple and efficient algorithm based on the apriori approach to generate large itemsets under the maximum constraints of multiple minimum supports.

  6. Introduction (cont.) • Han et al. and Agrawal et al. each proposed algorithms to discover association rules on multiple-level taxonomic relationships among items. • This paper thus proposes a fuzzy multiple-level mining algorithm with multiple supports of items for extracting implicit knowledge from transactions stored as quantitative values. It integrates fuzzy-set concepts, data-mining technologies and multiple-level taxonomy to find fuzzy association rules.

  7. Review of related mining algorithms • Mining multiple-level association rules. • Mining association rules with multiple minimum supports.

  8. 1. Mining multiple-level association rules • Relevant item taxonomies are usually predefined in real-world applications and can be represented as hierarchy trees. Terminal nodes on the trees represent actual items appearing in transactions; internal nodes represent classes or concepts formed from lower-level nodes.

  9. The method of Han & Fu: • Nodes in predefined taxonomies are first encoded using sequences of numbers and the symbol “*” according to their positions in the hierarchy tree. Level 1: (1**), (2**). Level 2: (11*), (12*), (21*), (22*). Level 3: (111), (112), (211), (212).
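The encoding on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the item names in `taxonomy` are made up, the helper `encode_taxonomy` is hypothetical, and it assumes at most nine children per node so that each level takes one digit.

```python
def encode_taxonomy(tree, prefix="", depth=3):
    """Map each node name to its path digits, padded with '*' to `depth` characters."""
    codes = {}
    for position, (name, children) in enumerate(tree.items(), start=1):
        path = prefix + str(position)          # digits from the root down to this node
        codes[name] = path.ljust(depth, "*")   # pad the unused lower levels with '*'
        codes.update(encode_taxonomy(children, path, depth))
    return codes


# Hypothetical three-level taxonomy; the names are illustrative only.
taxonomy = {
    "drink": {"milk": {"brand_a_milk": {}, "brand_b_milk": {}}, "juice": {}},
    "food": {"bread": {"white_bread": {}, "wheat_bread": {}}, "cookies": {}},
}
codes = encode_taxonomy(taxonomy)
```

With this input, `codes` contains exactly the ten codes listed on the slide, for example `drink` as `1**`, `milk` as `11*` and `wheat_bread` as `212`.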

  10. A top-down progressively deepening search approach is used and exploration of “level-crossing” association relationships is allowed. • Candidate itemsets at certain levels may thus contain items at lower levels. EX: Large items at level 2 may be paired with large items at level 1 to form candidate 2-itemsets at level 2 (such as {11*,2**}).

  11. 2. Mining association rules with multiple minimum supports • Liu et al. proposed an approach for mining association rules with non-uniform minimum support values, allowing users to specify different minimum supports for different items. The minimum support of an itemset is defined as the lowest minimum support among the items in the itemset.

  12. An item's minimum support is the threshold that its occurrence frequency must reach for the item to be considered in later mining steps. An item whose support falls below its threshold is not worth considering. • When the minimum support of an itemset is defined as the lowest minimum support of the items in it, the itemset may be large while some of the items in it are small.

  13. EX: The minimum support of item A is 20% and the minimum support of item B is 40%. If the support of item B is 30%, smaller than its minimum support of 40%, then the 2-itemset {A,B} should not be worth considering. • It is therefore meaningful to assign the minimum support of an itemset as the maximum of the minimum supports of the items contained in the itemset.
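The difference between the two conventions can be made concrete with a small Python sketch. The minimum supports of A and B come from the slide; the support of {A,B} (25%) and the helper name `itemset_min_support` are assumptions added for illustration.

```python
def itemset_min_support(itemset, min_supports, constraint="max"):
    """Threshold for an itemset: max (this paper) or min (Liu et al.) of its items' minimum supports."""
    values = [min_supports[item] for item in itemset]
    return max(values) if constraint == "max" else min(values)


min_supports = {"A": 0.20, "B": 0.40}   # from the slide
support_ab = 0.25                       # assumed support of {A,B}, not from the slide

# Minimum constraint: threshold is 0.20, so {A,B} counts as large although B alone is small.
large_under_min = support_ab >= itemset_min_support(("A", "B"), min_supports, "min")
# Maximum constraint: threshold is 0.40, so {A,B} is rejected.
large_under_max = support_ab >= itemset_min_support(("A", "B"), min_supports, "max")
```

Under the maximum constraint an itemset can only be large if every item in it is large as well, which is what keeps apriori-style pruning valid.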

  14. The proposed algorithm • The mining algorithm for fuzzy multiple-level association rules under the maximum-itemset minimum-taxonomy support constraint of multiple minimum supports: • INPUT: A set of quantitative transaction data, a taxonomy with the primitive items assigned their own minimum supports, a set of membership functions, and a minimum confidence value. • OUTPUT: A set of fuzzy multiple-level association rules under maximum constraints of multiple minimum supports.

  15. Step 1: Encode the taxonomy using a sequence of numbers and the symbol “*”. • Step 2: Translate the item names in the transaction data according to the encoding scheme. • Step 3: Group the items with the same first k digits in each transaction $D_i$, and add up the amounts of the items in the same group in $D_i$.
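Step 3 can be sketched as follows, assuming each transaction is a dictionary from three-character item codes to purchased amounts. The sample transaction and the helper name `group_by_level` are hypothetical.

```python
from collections import defaultdict


def group_by_level(transaction, k, code_length=3):
    """Sum the amounts of all items that share the same first k digits of their code."""
    grouped = defaultdict(int)
    for code, amount in transaction.items():
        group = code[:k].ljust(code_length, "*")   # e.g. '112' becomes '1**' for k = 1
        grouped[group] += amount
    return dict(grouped)


# Hypothetical encoded transaction: item code -> purchased amount.
transaction = {"111": 3, "112": 4, "211": 2, "221": 5}
level_1 = group_by_level(transaction, 1)   # {'1**': 7, '2**': 7}
level_2 = group_by_level(transaction, 2)   # {'11*': 7, '21*': 2, '22*': 5}
```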

  16. Step 4: Calculate the occurrence count of each group over all the transactions. Remove the groups whose counts are less than their respective support thresholds. • Step 5: Transform the quantitative value of each remaining group in each transaction into a fuzzy set $f_{ij}$ represented as $(f^k_{ij1}/R^k_{j1} + f^k_{ij2}/R^k_{j2} + \dots + f^k_{ijh}/R^k_{jh})$, where $k$ is the level number and $h$ is the number of fuzzy regions for $I^k_j$.
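The membership functions are an input to the algorithm and are not given in the transcript. The sketch below of Step 5 therefore assumes three hypothetical regions (Low, Middle, High) with breakpoints at 1, 6 and 11; the shapes and numbers are illustrative only.

```python
def membership(x, region):
    """Hypothetical membership functions over item quantities (breakpoints 1, 6, 11)."""
    if region == "Low":       # 1 up to x = 1, falling linearly to 0 at x = 6
        return max(0.0, min(1.0, (6 - x) / 5))
    if region == "Middle":    # triangle peaking at x = 6, zero at x = 1 and x = 11
        return max(0.0, 1 - abs(x - 6) / 5)
    if region == "High":      # 0 up to x = 6, rising linearly to 1 at x = 11 and beyond
        return max(0.0, min(1.0, (x - 6) / 5))
    raise ValueError(f"unknown region: {region}")


def fuzzify(amount):
    """Turn one quantity into a fuzzy set: region name -> membership value."""
    return {region: membership(amount, region) for region in ("Low", "Middle", "High")}


fuzzy_set = fuzzify(7)
```

For a quantity of 7, these assumed functions give a membership of about 0.8 in Middle, about 0.2 in High and 0 in Low.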

  17. Step 6: Collect the fuzzy regions (linguistic terms) with membership values > 0 to form the candidate set $C^k_1$. • Step 7: Check whether the value $count^k_{jl}$ of each region $R^k_{jl}$ in $C^k_1$ is ≧ the threshold, which is the minimum of the minimum supports of the primitive items descending from it. If $R^k_{jl}$ satisfies the threshold, put it into the large 1-itemset ($L^k_1$) for level k.
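The threshold used in Step 7 can be computed directly from the encoded item codes, since the primitive descendants of a node are the codes that share its non-`*` prefix. The minimum supports below and the helper name `taxonomy_min_support` are assumptions for illustration.

```python
def taxonomy_min_support(code, primitive_min_supports):
    """Minimum of the minimum supports of the primitive items descending from `code`."""
    prefix = code.rstrip("*")   # '21*' -> '21', '2**' -> '2'
    return min(
        support
        for item, support in primitive_min_supports.items()
        if item.startswith(prefix)
    )


# Hypothetical minimum supports assigned to primitive (level-3) items.
primitive_min_supports = {"211": 0.25, "212": 0.40, "221": 0.30}

threshold_21 = taxonomy_min_support("21*", primitive_min_supports)   # 0.25
threshold_22 = taxonomy_min_support("22*", primitive_min_supports)   # 0.30
threshold_2 = taxonomy_min_support("2**", primitive_min_supports)    # 0.25
```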

  18. Step 8: Generate the candidate set $C^k_2$ from $L^1_1, L^2_1, \dots, L^k_1$ to find “level-crossing” large itemsets satisfying the following conditions: • Each 2-itemset in $C^k_2$ must contain at least one item in $L^k_1$. • The two regions in a 2-itemset may not have the same item name. • The two item names in a 2-itemset may not be in a hierarchy relation in the taxonomy. • Both of the support values of the two large 1-itemsets comprising a candidate 2-itemset must be ≧ the maximum of the minimum supports of the two large 1-itemsets.
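The four conditions of Step 8 can be sketched as a filter over pairs of large fuzzy regions. All data below are hypothetical, thresholds are assumed to be expressed as counts, and a region is represented as an `(item_code, linguistic_term)` pair.

```python
from itertools import combinations


def is_ancestor(a, b):
    """True if code `a` is a proper ancestor of code `b`, e.g. '2**' of '21*'."""
    return a != b and b.startswith(a.rstrip("*"))


def candidate_2_itemsets(large_1, level_k, counts, thresholds):
    """Pair up large regions from levels 1..k that satisfy the four Step 8 conditions."""
    candidates = []
    for r1, r2 in combinations(sorted(large_1), 2):
        item1, item2 = r1[0], r2[0]
        if r1 not in level_k and r2 not in level_k:
            continue                      # needs at least one region from level k
        if item1 == item2:
            continue                      # two regions of the same item
        if is_ancestor(item1, item2) or is_ancestor(item2, item1):
            continue                      # items in a hierarchy relation
        bound = max(thresholds[item1], thresholds[item2])
        if counts[r1] >= bound and counts[r2] >= bound:
            candidates.append((r1, r2))
    return candidates


# Hypothetical large 1-itemsets found at levels 1 and 2, with their fuzzy counts.
large_1 = [("1**", "Low"), ("2**", "Middle"), ("21*", "Middle"), ("22*", "Low")]
level_2 = {("21*", "Middle"), ("22*", "Low")}
counts = {("1**", "Low"): 2.4, ("2**", "Middle"): 2.6,
          ("21*", "Middle"): 2.0, ("22*", "Low"): 2.2}
thresholds = {"1**": 1.5, "2**": 1.8, "21*": 1.8, "22*": 2.1}

candidates = candidate_2_itemsets(large_1, level_2, counts, thresholds)
```

Here only the two cross-level pairs that combine `1**` with a level-2 item survive: the pairs involving `2**` and one of its descendants are dropped by the hierarchy check, and `21*` with `22*` fails because the count 2.0 is below the larger threshold 2.1.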

  19. Step 9: Do the following substeps for each newly formed candidate 2-itemset $s$ with regions $(s_1, s_2)$ in $C^k_2$: • Calculate the fuzzy value of $s$ in each transaction $D_i$ as $f_{is} = f_{is_1} \wedge f_{is_2}$. • Calculate the scalar cardinality of $s$ over all the transaction data as $count_s = \sum_i f_{is}$. • If $count_s$ is ≧ the maximum of the minimum supports of the items contained in it, put $s$ into $L^k_2$.
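Step 9 amounts to taking the minimum of the membership values within each transaction and summing over transactions. A minimal sketch, assuming each fuzzified transaction is a dictionary from region label to membership value (labels and numbers below are made up):

```python
def fuzzy_count(itemset, transactions):
    """Scalar cardinality: per transaction take the min membership, then sum over transactions."""
    return sum(
        min(transaction.get(region, 0.0) for region in itemset)
        for transaction in transactions
    )


# Hypothetical fuzzified transactions: region label -> membership value.
transactions = [
    {"21*.Middle": 0.8, "22*.Low": 0.6},
    {"21*.Middle": 0.4, "22*.Low": 1.0},
    {"22*.Low": 0.2},                      # 21*.Middle absent, so its membership is 0
]
count_s = fuzzy_count(("21*.Middle", "22*.Low"), transactions)   # 0.6 + 0.4 + 0.0
```

The resulting `count_s` is then compared with the maximum of the minimum supports of the items involved to decide whether the itemset enters the large 2-itemsets.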

  20. Step 10: Repeat the above steps in a similar way to generate all large q-itemsets. • Step 11: Construct the fuzzy association rules for each large q-itemset by the following substeps: • Form all possible association rules as follows: $s_1 \wedge \dots \wedge s_{r-1} \wedge s_{r+1} \wedge \dots \wedge s_q \rightarrow s_r$, for $r = 1$ to $q$. • Calculate the confidence values of all association rules.
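The confidence formula itself does not appear in the transcript. A plausible reconstruction, assuming the usual fuzzy-mining definition (fuzzy count of the whole itemset divided by the fuzzy count of the rule's antecedent) and reusing the symbols of Step 9, would be:

```latex
\mathrm{conf}\left(s_1 \wedge \dots \wedge s_{r-1} \wedge s_{r+1} \wedge \dots \wedge s_q \rightarrow s_r\right)
  = \frac{\sum_i f_{is}}
         {\sum_i \left( f_{is_1} \wedge \dots \wedge f_{is_{r-1}} \wedge f_{is_{r+1}} \wedge \dots \wedge f_{is_q} \right)}
```

Here $f_{is} = f_{is_1} \wedge \dots \wedge f_{is_q}$ is the fuzzy value of the whole itemset in transaction $D_i$, and the sums run over all transactions.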

  21. Step 12: Output the rules with confidence values ≧ the predefined confidence value.

  22. An example

  23. All possible association rules are formed as follows: • If 2** = Middle, then 3** = Middle; • If 3** = Middle, then 2** = Middle; • If 21* = Middle, then 22* = Low; • If 22* = Low, then 21* = Middle; • If 22* = Low, then 32* = Middle; • If 32* = Middle, then 22* = Low.

  24. The confidence values of the above association rules are calculated: • If 2** = Middle, then 3** = Middle, with conf = 0.74. • If 3** = Middle, then 2** = Middle, with conf = 0.69. • If 21* = Middle, then 22* = Low, with conf = 0.82. • If 22* = Low, then 21* = Middle, with conf = 0.46. • If 22* = Low, then 32* = Middle, with conf = 0.97. • If 32* = Middle, then 22* = Low, with conf = 1.0.

  25. Assume the minimum confidence is set at 0.8 in this example. The following three association rules are then output. • If 21* = Middle, then 22* = Low, with conf = 0.82. • If 22* = Low, then 32* = Middle, with conf = 0.97. • If 32* = Middle, then 22* = Low, with conf = 1.0.
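The final filtering step of the example can be reproduced in a few lines. The rules and confidence values are those of the example; the tuple layout and variable names are only an illustration.

```python
# (antecedent, consequent, confidence) for the six candidate rules of the example.
rules = [
    ("2** = Middle", "3** = Middle", 0.74),
    ("3** = Middle", "2** = Middle", 0.69),
    ("21* = Middle", "22* = Low", 0.82),
    ("22* = Low", "21* = Middle", 0.46),
    ("22* = Low", "32* = Middle", 0.97),
    ("32* = Middle", "22* = Low", 1.0),
]
min_confidence = 0.8

# Keep only the rules whose confidence reaches the minimum confidence.
output_rules = [rule for rule in rules if rule[2] >= min_confidence]
for antecedent, consequent, conf in output_rules:
    print(f"If {antecedent}, then {consequent}, with conf = {conf}")
```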

  26. Conclusion • This algorithm offers a solution for three issues that usually occur in real mining applications: using different criteria to judge the importance of different items, managing taxonomic relationships among items, and dealing with quantitative data sets. • In this algorithm, the minimum support for an item at a higher taxonomic level is set as the minimum of the minimum supports of the items belonging to it, and the minimum support for an itemset is set as the maximum of the minimum supports of the items contained in the itemset.

  27. THANK YOU !!
