1 / 9

Ch5 Mining Frequent Patterns, Associations, and Correlations

Ch5 Mining Frequent Patterns, Associations, and Correlations. Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2010. What Is Frequent Pattern Analysis?. Frequent pattern : a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set

mari
Download Presentation

Ch5 Mining Frequent Patterns, Associations, and Correlations

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Ch5 Mining Frequent Patterns, Associations, and Correlations Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2010

  2. What Is Frequent Pattern Analysis? • Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set • First proposed by Agrawal, Imielinski, and Swami [AIS93] in the context of frequent itemsets and association rule mining

  3. What Is Frequent Pattern Analysis? • Motivation: Finding inherent regularities in data • What products were often purchased together? bread and milk? • What are the subsequent purchases after buying a PC? • What kinds of DNA are sensitive to this new drug? • Can we automatically classify web documents? • Applications • Basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click stream) analysis, and DNA sequence analysis.

  4. Association Rules

  5. Association Rules • support, s, probability that a transaction contains X  Y • confidence, c,conditional probability that a transaction having X also contains Y

  6. Association Rules • Let’s have an example • T100 1,2,5 • T200 2,4 • T300 2,3 • T400 1,2,4 • T500 1,3 • T600 2,3 • T700 1,3 • T800 1,2,3,5 • T900 1,2,3

  7. Association Rules with AprioriMinimum support=2/9Minimum confidence=70%

  8. The Apriori Algorithm • Pseudo-code: Ck: Candidate itemset of size k Lk : frequent itemset of size k L1 = {frequent items}; for(k = 1; Lk !=; k++) do begin Ck+1 = candidates generated from Lk; for each transaction t in database do increment the count of all candidates in Ck+1 that are contained in t Lk+1 = candidates in Ck+1 with min_support end returnkLk;

  9. Exercise • A dataset has five transactions, let min-support=60% and min_support=80% • Find all frequent itemsets using Apriori

More Related