Introduction of Data Mining and Association Rules

Introduction of Data Mining and Association Rules. cs157 Spring 2009 Instructor: Dr. Sin-Min Lee Student: Dongyi Jia. What is data mining?. The automated extraction of hidden predictive information from database Allows users to analyze large databases to solve business decision problems.

    Presentation Transcript
    1. Introduction of Data Mining and Association Rules cs157 Spring 2009 Instructor: Dr. Sin-Min Lee Student: Dongyi Jia

    2. What is data mining? • The automated extraction of hidden predictive information from databases. • Allows users to analyze large databases to solve business decision problems. • An extension of statistics, with a few artificial intelligence and machine learning twists thrown in. • Attempts to discover rules and patterns from data.

    3. Data Mining - On What Kind of Data • In principle, data mining should be applicable to any kind of information repository: ● relational databases ● data warehouses ● transactional and advanced databases ● flat files ● World Wide Web

    4. Data Mining Functionalities - What Kinds of Patterns Can Be Mined? • Association Analysis • Classification and Prediction • Cluster Analysis • Evolution Analysis

    5. Applications of data mining • Some applications require prediction. For example, when a person applies for a credit card, the credit-card company wants to predict whether the person is a good credit risk. • Other applications look for associations. For example, if a customer buys a book, an online bookstore may suggest other associated books.

    6. Association Rule Discovery • Task: discover association rules among items in a transaction database. • How are association rules mined from a large database? 1. Find all frequent itemsets: each of these itemsets occurs at least as frequently as a predetermined minimum support count. 2. Generate strong association rules from the frequent itemsets: these rules must satisfy minimum support and minimum confidence.
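Step 1 above can be sketched as follows. This is a minimal illustration, not the method used in the lecture: it enumerates candidate itemsets by brute force rather than with an optimized algorithm such as Apriori, and the transactions, the function name `frequent_itemsets`, and the threshold of 2 are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support_count):
    """Return {itemset: count} for every itemset whose count meets the minimum.

    Brute-force sketch: tries every combination of items, smallest first.
    """
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Count the transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_support_count:
                frequent[candidate] = count
                found_any = True
        if not found_any:
            # A superset is never more frequent than its subsets, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return frequent


# Hypothetical toy transactions, not taken from the slides.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread"}),
    frozenset({"milk", "eggs"}),
]
result = frequent_itemsets(transactions, min_support_count=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Step 2 would then derive candidate rules from each frequent itemset and keep those that also meet the minimum confidence.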

    7. Association Rules (cont.) • Retail shops are often interested in associations between items that people buy. • Someone who buys bread is quite likely also to buy milk. Association rule: bread => milk • A person who bought the book Database System Concepts is quite likely also to buy the book Operating System Concepts. Association rule: DSC => OSC

    8. Association Rules (cont.) • Two numbers describe a rule: • Support: a measure of what fraction of the population satisfies both the antecedent and the consequent of the rule. • Confidence: a measure of how often the consequent is true when the antecedent is true.

    9. Association Rules (cont.) • Let I = {i1, i2, …, im} be the total set of items, D a set of transactions, and d one transaction consisting of a set of items, d ⊆ I • Association rule: • X ⇒ Y where X ⊆ I, Y ⊆ I and X ∩ Y = ∅ • support = (# of transactions containing X ∪ Y) / |D| • confidence = (# of transactions containing X ∪ Y) / (# of transactions containing X)
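The two formulas can be written as a pair of small functions. This is a sketch under assumptions: the function names `support` and `confidence`, the use of Python sets for X, Y and each transaction, and the bread/milk/eggs transactions are all hypothetical choices for illustration.

```python
from fractions import Fraction


def support(transactions, X, Y):
    """Fraction of all transactions that contain every item of X and of Y."""
    both = X | Y
    n_both = sum(1 for d in transactions if both <= d)
    return Fraction(n_both, len(transactions))


def confidence(transactions, X, Y):
    """Among the transactions containing X, the fraction that also contain Y."""
    both = X | Y
    n_x = sum(1 for d in transactions if X <= d)
    if n_x == 0:
        raise ValueError("antecedent X does not occur in any transaction")
    n_both = sum(1 for d in transactions if both <= d)
    return Fraction(n_both, n_x)


# Hypothetical transactions, not taken from the slides.
D = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk", "eggs"},
    {"eggs"},
]
X, Y = {"bread"}, {"milk"}
print("support    =", support(D, X, Y))
print("confidence =", confidence(D, X, Y))
```

Exact fractions are used instead of floats so the results can be compared directly with hand-computed ratios.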

    10. Example • Example of transaction data: ● CD player, music CD, music book ● CD player, music CD ● music CD, music book ● CD player • I = {CD player, music CD, music book} • |D| = 4 • # of transactions containing both CD player and music CD = 2 • # of transactions containing CD player = 3 • CD player ⇒ music CD (sup = 2/4, conf = 2/3)
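The counts in this example can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative, and the item names are written as "music CD" and "music book".

```python
from fractions import Fraction

# The four transactions from the example.
transactions = [
    {"CD player", "music CD", "music book"},
    {"CD player", "music CD"},
    {"music CD", "music book"},
    {"CD player"},
]

antecedent = {"CD player"}
consequent = {"music CD"}

# Count transactions containing the antecedent, and those containing both sides.
n_antecedent = sum(1 for t in transactions if antecedent <= t)
n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

sup = Fraction(n_both, len(transactions))   # 2/4, which reduces to 1/2
conf = Fraction(n_both, n_antecedent)       # 2/3

print(f"support = {float(sup):.0%}, confidence = {float(conf):.0%}")
```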

    11. Association Rules (cont.) • Rule support and confidence reflect the usefulness and certainty of discovered rules. • A support of 50% for the association rule means that a CD player and a music CD are purchased together in 50% of all the transactions under analysis. • A confidence of 67% means that 67% of the customers who purchased a CD player also bought a music CD.

    12. Strong Association Rule • The user sets support and confidence thresholds. • Rules above the support threshold have LARGE support. • Rules above the confidence threshold have HIGH confidence. • Rules satisfying both are said to be STRONG.
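A minimal sketch of filtering for strong rules, using the CD player transactions from the earlier example. Several things here are assumptions: only rules with a single item on each side are considered, the thresholds of 50% support and 60% confidence are hypothetical, and a rule that exactly meets a threshold is counted as passing it.

```python
from fractions import Fraction
from itertools import permutations

transactions = [
    {"CD player", "music CD", "music book"},
    {"CD player", "music CD"},
    {"music CD", "music book"},
    {"CD player"},
]

# Hypothetical user-chosen thresholds.
MIN_SUPPORT = Fraction(1, 2)
MIN_CONFIDENCE = Fraction(3, 5)

items = sorted(set().union(*transactions))
strong_rules = []
for x, y in permutations(items, 2):  # every ordered pair gives a rule x => y
    n_x = sum(1 for t in transactions if x in t)
    n_both = sum(1 for t in transactions if x in t and y in t)
    sup = Fraction(n_both, len(transactions))
    conf = Fraction(n_both, n_x)  # n_x >= 1 because x comes from the data
    if sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE:
        strong_rules.append((x, y, sup, conf))

for x, y, sup, conf in strong_rules:
    print(f"{x} => {y}  (sup = {sup}, conf = {conf})")
```

With these thresholds, the rules involving both CD player and music book are rejected because the two items appear together in only one of the four transactions.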

    13. References • Professor Lee’s lectures, http://www.cs.sjsu.edu/~lee/cs157b/cs157b.html • Rui Zhao, SJSU, http://www.cs.sjsu.edu/~lee/cs157b/cs157b.html • Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers

    14. Thank you!