Association Rule Mining: Multi-Level and Multi-Dimensional Association Rule Mining

13/Sep/2006. S.P.Vimal, CS IS Group, BITS-Pilani.

Presentation Transcript


    1. Association Rule Mining: Multi-Level and Multi-Dimensional Association Rule Mining
       S.P.Vimal, Assistant Lecturer, CSIS/BITS-Pilani, vimalsp@bits-pilani.ac.in

    2. To discuss
       - Multi-Level Association Rules
         - Concepts
         - An Example
         - Mining
           - Uniform Support
           - Reduced Support
         - Redundant Rules
       - Mining Multi-Dimensional Association Rules
         - Concepts
         - Mining using Static Discretization
         - Mining using Dynamic Discretization (ARCS)
         - Mining for Distance-based Association Rules

    3. Multi-Level Association Rules - Concepts
       - Rules generated by mining data at different levels of abstraction.
       - Mining at different levels is essential for supporting business decision making.
       - Massive amounts of data are highly sparse at the primitive level.
       - Rules at a high concept level may only restate common-sense knowledge.
       - Rules at a low concept level may not always be interesting.

    4. Multi-Level Association Rules - An Example

    5. Multi-Level Association Rules - An Example
       - Items in the task-relevant data are at the primitive level.
       - Primitive data items occur least frequently.
       - buys(hp-laptop computer) ⇒ buys(canon-inkjet printer)
         vs. buys(laptop computer) ⇒ buys(inkjet printer)
         vs. buys(computer) ⇒ buys(printer)
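The effect of moving up the hierarchy can be illustrated with a small sketch. The concept hierarchy and the four transactions below are hypothetical and chosen only to show how the same itemset gains support as items are generalized; they are not taken from the slides.

```python
from typing import Dict, List, Set

# Hypothetical concept hierarchy (child -> parent); names are illustrative only.
PARENT: Dict[str, str] = {
    "hp-laptop computer": "laptop computer",
    "ibm-laptop computer": "laptop computer",
    "laptop computer": "computer",
    "canon-inkjet printer": "inkjet printer",
    "hp-inkjet printer": "inkjet printer",
    "inkjet printer": "printer",
}

# Toy transactions recorded at the primitive (most specific) level.
TRANSACTIONS: List[Set[str]] = [
    {"hp-laptop computer", "canon-inkjet printer"},
    {"ibm-laptop computer", "hp-inkjet printer"},
    {"hp-laptop computer", "hp-inkjet printer"},
    {"ibm-laptop computer", "canon-inkjet printer"},
]


def generalize(item: str, steps: int) -> str:
    """Climb `steps` levels up the hierarchy, stopping at the root."""
    for _ in range(steps):
        item = PARENT.get(item, item)
    return item


def support_at_level(itemset: Set[str], steps: int) -> float:
    """Fraction of transactions containing `itemset` after generalizing every item."""
    lifted = [{generalize(i, steps) for i in t} for t in TRANSACTIONS]
    hits = sum(1 for t in lifted if itemset <= t)
    return hits / len(lifted)


primitive = support_at_level({"hp-laptop computer", "canon-inkjet printer"}, 0)
middle = support_at_level({"laptop computer", "inkjet printer"}, 1)
top = support_at_level({"computer", "printer"}, 2)
```

In this toy data the brand-level itemset appears in only one of four transactions, while the generalized itemsets appear in all of them, which is the sparsity effect the slide describes.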

    6. Multi-Level Association Rules - Mining
       - Support-confidence framework.
       - Top-down strategy for accumulating counts.
       - Algorithms: Apriori and its variations.
       - Variations include:
         - Uniform support for all levels
         - Reduced support at lower levels

    7. Multi-Level Association Rules - Mining (Uniform Support)
       - The same minimum support threshold is used at all levels of abstraction.
       - Itemsets containing an item whose ancestor does not satisfy minimum support are not examined.
       - Higher support threshold ⇒ interesting associations at lower abstraction levels are lost.
       - Lower support threshold ⇒ many uninteresting associations at higher abstraction levels.

    8. Multi-Level Association Rules - Mining (Reduced Support)
       - Lower levels of abstraction are assigned lower support thresholds.
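A minimal sketch of the difference between uniform and reduced support, assuming illustrative support values and thresholds (none of these numbers appear in the slides):

```python
from typing import Dict, Set, Tuple

# Illustrative supports keyed by (abstraction level, item); level 1 is the most general.
SUPPORT: Dict[Tuple[int, str], float] = {
    (1, "computer"): 0.10,
    (2, "laptop computer"): 0.06,
    (2, "desktop computer"): 0.04,
}


def frequent_items(min_sup_by_level: Dict[int, float]) -> Set[str]:
    """Return the items whose support meets the threshold of their own level."""
    return {
        item
        for (level, item), sup in SUPPORT.items()
        if sup >= min_sup_by_level[level]
    }


# Uniform support: the same 5% threshold at every level.
uniform = frequent_items({1: 0.05, 2: 0.05})
# Reduced support: the threshold drops to 3% at the lower level.
reduced = frequent_items({1: 0.05, 2: 0.03})
```

With the uniform threshold "desktop computer" is lost, while the reduced threshold at level 2 keeps it.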

    9. Multi-Level Association Rules - Mining (Reduced Support): Alternate Search Strategies
       - Level-by-level independent
         - Full-breadth search.
         - No background knowledge is used for pruning.
         - Leads to examining many infrequent items.
       - Level-cross filtering by single item
         - Examine a node at level i only if its parent node at level i-1 is frequent.
         - Misses frequent items at lower abstraction levels (due to reduced support).
       - Level-cross filtering by k-itemset
         - Examine a k-itemset at level i only if the corresponding parent k-itemset at level i-1 is frequent.
         - Misses frequent k-itemsets at lower abstraction levels (due to reduced support).
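Level-cross filtering by single item can be sketched as a top-down traversal that descends only through frequent nodes. The hierarchy, supports, and thresholds below are hypothetical, and the function name `level_cross_filter` is introduced here for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical hierarchy and item supports (illustrative values only).
CHILDREN: Dict[str, List[str]] = {
    "computer": ["laptop computer", "desktop computer"],
}
SUPPORT: Dict[str, float] = {
    "computer": 0.10,
    "laptop computer": 0.06,
    "desktop computer": 0.04,
}


def level_cross_filter(
    root: str, min_sup_by_level: Dict[int, float]
) -> Tuple[List[str], List[str]]:
    """Examine a node's children only if the node itself is frequent.

    Returns (frequent items, examined items) in breadth-first order.
    """
    frequent: List[str] = []
    examined: List[str] = []
    queue = deque([(root, 1)])
    while queue:
        node, level = queue.popleft()
        examined.append(node)
        if SUPPORT[node] >= min_sup_by_level[level]:
            frequent.append(node)
            queue.extend((child, level + 1) for child in CHILDREN.get(node, []))
    return frequent, examined


# Parent passes its threshold, so both children are examined.
freq_ok, seen_ok = level_cross_filter("computer", {1: 0.05, 2: 0.03})
# Parent fails a stricter threshold, so the children are never examined,
# even though both would satisfy the reduced level-2 threshold.
freq_miss, seen_miss = level_cross_filter("computer", {1: 0.12, 2: 0.03})
```

The second call shows the weakness named on the slide: the children are frequent at their own level but are skipped because the parent was pruned.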

    10. Multi-Level Association Rules - Mining (Reduced Support)
        - Controlled level-cross filtering by single item
          - A modified version of level-cross filtering by single item.
          - Sets a level passage threshold for each level.
          - Allows inspection of lower abstraction levels even if the ancestor fails to satisfy the min_sup threshold.
        - Computer ⇒ Printer (items at the same abstraction level)
        - Computer ⇒ InkJet Printer (cross-level association rule; items at different abstraction levels)
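The controlled variant can be sketched by separating two decisions: whether a node is reported as frequent (min_sup) and whether its children are examined (level passage threshold). The data, thresholds, and the name `controlled_filter` are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical hierarchy and item supports (illustrative values only).
CHILDREN: Dict[str, List[str]] = {
    "computer": ["laptop computer", "desktop computer"],
}
SUPPORT: Dict[str, float] = {
    "computer": 0.10,
    "laptop computer": 0.06,
    "desktop computer": 0.04,
}


def controlled_filter(
    root: str,
    min_sup_by_level: Dict[int, float],
    passage_by_level: Dict[int, float],
) -> List[str]:
    """Report nodes meeting min_sup; descend through nodes meeting the passage threshold."""
    frequent: List[str] = []
    queue = deque([(root, 1)])
    while queue:
        node, level = queue.popleft()
        if SUPPORT[node] >= min_sup_by_level[level]:
            frequent.append(node)
        # The passage threshold, not min_sup, decides whether children are examined.
        if SUPPORT[node] >= passage_by_level[level]:
            queue.extend((child, level + 1) for child in CHILDREN.get(node, []))
    return frequent


# "computer" (10%) misses min_sup (12%) but clears the passage threshold (8%),
# so its children are still examined against the level-2 min_sup (3%).
result = controlled_filter(
    "computer",
    min_sup_by_level={1: 0.12, 2: 0.03},
    passage_by_level={1: 0.08, 2: 0.03},
)
```

Here the passage threshold for level 1 is set between the level-1 and level-2 minimum supports, which is one plausible choice rather than a prescribed one.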

    11. Multi-Level Association Rules - Redundancy
        - Laptop Computer ⇒ InkJet Printer (support = 10%, confidence = 70%)
          vs. HP Laptop Computer ⇒ InkJet Printer (support = 5%, confidence = 68%)
        - The second rule is redundant because the first rule is its ancestor.
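One common way to make this test concrete is to compare the descendant rule with what the ancestor rule would predict. The sketch below assumes that about half of all laptop sales are HP laptops; that share, the 10% tolerance, and the function name `is_redundant` are assumptions for illustration and are not stated on the slide.

```python
def is_redundant(
    anc_sup: float,
    anc_conf: float,
    desc_sup: float,
    desc_conf: float,
    share: float,
    tol: float = 0.10,
) -> bool:
    """Return True if the descendant rule is close to what its ancestor predicts.

    `share` is the assumed fraction of the ancestor item's sales that belong
    to the descendant item; `tol` is a relative tolerance.
    """
    expected_sup = anc_sup * share
    sup_close = abs(desc_sup - expected_sup) <= tol * expected_sup
    conf_close = abs(desc_conf - anc_conf) <= tol * anc_conf
    return sup_close and conf_close


# Slide values: ancestor (10%, 70%), descendant (5%, 68%); assumed HP share of 0.5.
redundant = is_redundant(0.10, 0.70, 0.05, 0.68, share=0.5)
# With an assumed share of 0.25 the expected support would be 2.5%, so the
# observed 5% carries new information and the rule is not redundant.
informative = not is_redundant(0.10, 0.70, 0.05, 0.68, share=0.25)
```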

    12. Multi-Dimensional Association Rules - Concepts
        - Rules involving more than one dimension or predicate.
        - buys(X, “IBM Laptop Computer”) ⇒ buys(X, “HP Inkjet Printer”)
          (single-dimensional)
        - age(X, “20..25”) ∧ occupation(X, “student”) ⇒ buys(X, “HP Inkjet Printer”)
          (multi-dimensional, inter-dimension association rule)
        - age(X, “20..25”) ∧ buys(X, “IBM Laptop Computer”) ⇒ buys(X, “HP Inkjet Printer”)
          (multi-dimensional, hybrid-dimension association rule)

    13. Multi-Dimensional Association Rules - Concepts
        - Attributes can be categorical or quantitative.
        - Quantitative attributes are numeric and incorporate a hierarchy (age, income, ...).
        - Numeric attributes must be discretized.
        - Three approaches to mining multi-dimensional association rules:
          - Using static discretization of quantitative attributes
          - Using dynamic discretization of quantitative attributes
          - Using distance-based discretization with clustering

    14. Multi-Dimensional Association Rules - Mining using Static Discretization
        - Discretization is static and occurs prior to mining.
        - Discretized attributes are treated as categorical.
        - Use the Apriori algorithm to find all frequent k-predicate sets.
        - Every subset of a frequent predicate set must be frequent.
        - In a data cube, if the 3-D cuboid (age, income, buys) is frequent, then (age, income), (age, buys), and (income, buys) are also frequent.
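The idea can be sketched as follows: discretize each record with fixed bins, turn it into a set of (attribute, value) predicates, and count predicate sets. The records and bin boundaries are hypothetical, and brute-force enumeration stands in here for Apriori's level-wise candidate generation, which would prune candidates instead of counting every subset.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Predicate = Tuple[str, str]

# Hypothetical customer records (illustrative values only).
RECORDS: List[dict] = [
    {"age": 23, "income": 32_000, "buys": "laptop"},
    {"age": 24, "income": 35_000, "buys": "laptop"},
    {"age": 25, "income": 38_000, "buys": "laptop"},
    {"age": 41, "income": 62_000, "buys": "desktop"},
]


def discretize(record: dict) -> FrozenSet[Predicate]:
    """Map a record to predicates using fixed bins chosen before mining (assumed bins)."""
    age = "20..29" if record["age"] < 30 else "30+"
    income = "30K..40K" if 30_000 <= record["income"] <= 40_000 else "other"
    return frozenset({("age", age), ("income", income), ("buys", record["buys"])})


def frequent_predicate_sets(
    rows: List[FrozenSet[Predicate]], min_count: int
) -> Dict[FrozenSet[Predicate], int]:
    """Count every predicate subset of every row and keep those meeting min_count."""
    counts: Counter = Counter()
    for row in rows:
        for k in range(1, len(row) + 1):
            for combo in combinations(sorted(row), k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


rows = [discretize(r) for r in RECORDS]
frequent = frequent_predicate_sets(rows, min_count=3)
three_set = frozenset({("age", "20..29"), ("income", "30K..40K"), ("buys", "laptop")})
```

In this toy data the 3-predicate set is frequent, and so is each of its 2-predicate and 1-predicate subsets, matching the subset property stated on the slide.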

    15. Multi-Dimensional Association Rules - Mining using Dynamic Discretization
        - Known as mining quantitative association rules.
        - Numeric attributes are dynamically discretized.
        - Consider rules of the type A_quan1 ∧ A_quan2 ⇒ A_cat (2-D quantitative association rules), e.g.
          age(X, “20..25”) ∧ income(X, “30K..40K”) ⇒ buys(X, “Laptop Computer”)
        - ARCS (Association Rule Clustering System): an approach for mining quantitative association rules.

    16. Multi-Dimensional Association Rules - ARCS
        - Map pairs of quantitative attributes onto a 2-D grid (using equi-width binning for discretization).
        - Search the grid for clusters of points, from which association rules satisfying minimum support and confidence are generated.
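The gridding step can be sketched with equi-width bins over two quantitative attributes. The points, bin origins, and bin widths below are hypothetical, and the sketch only counts points per cell; it does not reproduce the support and confidence checks or the clustering that ARCS performs.

```python
from collections import Counter
from typing import List, Tuple


def bin_index(value: float, low: float, width: float) -> int:
    """Equi-width bin index of `value`, with bins starting at `low`."""
    return int((value - low) // width)


def build_grid(
    points: List[Tuple[float, float]],
    age_low: float,
    age_width: float,
    inc_low: float,
    inc_width: float,
) -> Counter:
    """Count points per (age bin, income bin) cell of a 2-D grid."""
    grid: Counter = Counter()
    for age, income in points:
        cell = (bin_index(age, age_low, age_width), bin_index(income, inc_low, inc_width))
        grid[cell] += 1
    return grid


# Hypothetical (age, income in $K) pairs for customers who bought a laptop.
POINTS = [(23, 22), (23, 27), (24, 33), (24, 38), (24, 39), (31, 55)]
grid = build_grid(POINTS, age_low=20, age_width=1, inc_low=20, inc_width=5)

# Cells holding at least two points; a stand-in for a real support threshold.
dense = {cell for cell, n in grid.items() if n >= 2}
```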

    17. Multi-Dimensional Association Rules - ARCS
        - Let the generated rules be:
          age(X, 23) ∧ income(X, “20..25”) ⇒ buys(X, “Laptop Computer”)
          age(X, 23) ∧ income(X, “26..30”) ⇒ buys(X, “Laptop Computer”)
          age(X, 24) ∧ income(X, “31..35”) ⇒ buys(X, “Laptop Computer”)
          age(X, 24) ∧ income(X, “36..40”) ⇒ buys(X, “Laptop Computer”)
        - The four rules above can be generalized by a clustering algorithm into:
          age(X, “23..24”) ∧ income(X, “20..40”) ⇒ buys(X, “Laptop Computer”)
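The generalization of the four rules can be reproduced with a simple bounding-box merge over their grid cells. This is a simplification used for illustration, not the clustering algorithm ARCS actually uses; the names `bounding_rule` and `format_rule` are hypothetical.

```python
from typing import List, Tuple

# Each cell is (age_lo, age_hi, income_lo, income_hi), taken from the four rules.
Cell = Tuple[int, int, int, int]

CELLS: List[Cell] = [
    (23, 23, 20, 25),
    (23, 23, 26, 30),
    (24, 24, 31, 35),
    (24, 24, 36, 40),
]


def bounding_rule(cells: List[Cell]) -> Cell:
    """Smallest rectangle that covers every cell."""
    return (
        min(c[0] for c in cells),
        max(c[1] for c in cells),
        min(c[2] for c in cells),
        max(c[3] for c in cells),
    )


def format_rule(box: Cell, item: str) -> str:
    """Render the merged rectangle as a rule string (ASCII connectives)."""
    a_lo, a_hi, i_lo, i_hi = box
    return f'age(X, "{a_lo}..{a_hi}") AND income(X, "{i_lo}..{i_hi}") => buys(X, "{item}")'


merged = bounding_rule(CELLS)
rule_text = format_rule(merged, "Laptop Computer")
```

A bounding box can also cover grid cells that none of the input rules mention, so a real system would re-check support and confidence on the merged region before accepting the generalized rule.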

    18. Multi-Dimensional Association Rules - Distance-based Association Rules
        - Item_type(X, “electronic”) ∧ Manufacturer(X, “foreign”) ⇒ price(X, $250)
        - Binning methods such as equi-depth and equi-width do not capture the semantics of interval data.

    19. Multi-Dimensional Association Rules - Distance-based Association Rules
        - Two-step mining process:
          - Perform clustering to find the intervals of the attributes involved.
          - Obtain association rules by searching for groups of clusters that occur together.
        - The resulting rules must satisfy:
          - Clusters in the rule antecedent are strongly associated with clusters in the consequent.
          - Clusters in the antecedent occur together.
          - Clusters in the consequent occur together.
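The first step, finding intervals by clustering rather than by fixed bins, can be sketched for a single attribute with a gap-based grouping. The price values, the `max_gap` parameter, and the function name `cluster_1d` are assumptions for illustration; the slides do not specify a clustering algorithm, and the second step (finding groups of clusters that occur together) is not covered here.

```python
from typing import List, Tuple


def cluster_1d(values: List[float], max_gap: float) -> List[Tuple[float, float]]:
    """Group sorted values into intervals, starting a new one when the gap exceeds max_gap."""
    if not values:
        return []
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    # Report each cluster as a (low, high) interval.
    return [(c[0], c[-1]) for c in clusters]


# Illustrative prices: close values end up in the same interval.
PRICES = [7, 20, 22, 50, 51, 53]
intervals = cluster_1d(PRICES, max_gap=5)
```

Unlike equi-width bins such as [0, 10], [11, 20], [21, 30], which would split 20 and 22 into different bins, the intervals here follow the closeness of the data.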
