
Introduction to Data Mining


molly

Presentation Transcript


  1. Introduction to Data Mining Chapter 1

  2. Definition • DATA MINING: exploration & analysis • by automatic means • of large quantities of data • to discover actionable patterns & rules • Data mining is a way to use massive quantities of data that businesses generate • GOAL - improve marketing, sales, customer support through better understanding of customers

  3. Retail Outlets • Bar coding & scanning generate masses of data • customer service • inventory control • MICROMARKETING • CUSTOMER PROFITABILITY ANALYSIS • MARKET-BASKET ANALYSIS

  4. Political Data Mining • Grossman et al., 10/18/2004, Time, 38 • 2004 Election • Republicans: VoterVault • From Mid-1990s • About 165 million voters • Massive get-out-the-vote drive for those expected to vote Republican • Democrats: Demzilla • Also about 165 million voters • Names typically have 200 to 400 information items

  5. Medical Diagnosis • J. Morris, Health Management Technology Nov 2004, 20, 22-24 • Electronic Medical Records • Associated Cardiovascular Consultants • 31 physicians • 40,000 patients per year, southern New Jersey • Data mined to identify efficient medical practice • Enhance patient outcomes • Reduced medical liability insurance

  6. Mayo Clinic • Swartz, Information Management Journal Nov/Dec 2004, 8 • IBM developed EMR program • Complete records on almost 4.4 million patients • Doctors can ask how the last 100 Mayo patients with the same gender, age, and medical history responded to particular treatments

  7. Business Uses of Data Mining • 1. Customer profiling • Identify profitability of customers • 2. Targeting – used to manage customer churn • Determine characteristics of most profitable customers • 3. Market-Basket Analysis • Determine correlation of purchases by profile • Part of Customer Relationship Management

  8. Reasons why Data Mining is now effective • Data are there • Data are warehoused (computerized) • Walmart: 35 thousand queries per week • Computing economically available • Competitive pressure • Commercial products available

  9. Trends • Every business is a service business • hotel chains record your preferences • car rental companies do the same • service versus price • credit card companies • long distance providers • airlines • computer retailers

  10. Trends • Mass Customization • produce tailored products from standardized components • Levi-Strauss - custom fit jeans • The Custom Foot • Andersen Windows • Individual, Inc. • electronic clipping • customer profiles of interests • send custom newsletter

  11. Trends • Information as Product • Custom Clothing Technology Corporation • fit jeans, other clothing • Lands End • J. Crew • INFORMATION BROKERING • IMS - collects prescription data from pharmacies, sells to drug firms • AC Nielsen - TV

  12. Trends • Commercial Software Available • using statistical, artificial intelligence tools that have been developed • Enterprise Miner SAS • Intelligent Miner IBM • Clementine SPSS • PolyAnalyst Megaputer • Specialty products

  13. How Data Mining Is Being Used • U.S. Government • track down Oklahoma City bombers, Unabomber, many others • Treasury department - international funds transfers, money laundering • Internal Revenue Service

  14. How Data Mining Is Used • Safeway • offer Safeway Savings Club card • users given discounts • users must give personal information • every use, collect data • identify aggregate patterns (what sells well together; what should be sold together) • sell names for 5.5 cents per name to suppliers
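The "what sells well together" idea on this slide can be illustrated with a small pair-counting sketch. This is only a minimal illustration, not Safeway's actual method: the baskets are made-up data, and `pair_support` is a hypothetical helper that reports the fraction of baskets containing both items of each pair.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return {(item_a, item_b): support} for every pair that co-occurs.

    Support is the fraction of baskets that contain both items.
    """
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) gives each pair one canonical order per basket
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(baskets)
    return {pair: count / n for pair, count in pair_counts.items()}


# Hypothetical loyalty-card baskets (illustrative data, not from the slides)
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
    ["bread", "butter", "cereal"],
]

support = pair_support(baskets)
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
```

Pairs with high support are candidates for joint promotion or shelf placement; a fuller analysis would also compare each pair against how often the items sell on their own.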

  15. How Data Mining Is Used • Firefly • asks members to rate music and movies • subscribers clustered • clusters get custom-designed recommendations
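One way to picture "subscribers clustered" is a plain k-means pass over rating vectors. This is a sketch under stated assumptions, not Firefly's actual algorithm: the member names, rating categories, ratings, and starting centroids are all hypothetical, and k-means is simply one common clustering technique chosen for illustration.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its members. Returns the final
    cluster index of each point and the final centroids."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids


# Hypothetical ratings (1-5) per member for these made-up categories
categories = ["rock", "jazz", "action film", "drama film"]
ratings = {
    "ann": [5, 1, 5, 1],
    "bob": [4, 2, 5, 2],
    "cy": [1, 5, 1, 4],
    "dee": [2, 5, 2, 5],
}

points = list(ratings.values())
# Assumption: seed the two clusters with the first and third members
labels, centroids = kmeans(points, [points[0], points[2]])

# Recommend each cluster's highest-rated category on average
favorites = [categories[c.index(max(c))] for c in centroids]
print(labels, favorites)
```

Each cluster's centroid is its average rating profile, so the highest-scoring category in a centroid is a simple stand-in for a cluster-level recommendation.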

  16. Cross-selling • USAA • insurance • doubled number of products held by average customer due to data mining • detailed records on customers • predict products they might need • Fidelity Investments • regression - what makes customer loyal

  17. Warranty Claims Routing • Diesel engine manufacturer • stream of warranty claims • examine each by expert • determine whether charges are reasonable & appropriate • think of expert system to automate claims processing
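The expert-system idea on this slide can be sketched as a small set of rules that either auto-approve a claim or send it to a human expert with reasons attached. Everything below is hypothetical: the `Claim` fields, the part names, and the thresholds are invented for illustration and do not come from the manufacturer described in the slides.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    part: str
    labor_hours: float
    amount: float


# Hypothetical limits; a real system would derive these from expert knowledge
MAX_LABOR_HOURS = {"injector": 3, "turbocharger": 6}
DEFAULT_MAX_HOURS = 8
AUTO_APPROVE_LIMIT = 2000

# Each rule is (explanation, predicate); a rule "fires" when its predicate is true
RULES = [
    ("amount exceeds auto-approval limit",
     lambda c: c.amount > AUTO_APPROVE_LIMIT),
    ("labor hours unusually high for part",
     lambda c: c.labor_hours > MAX_LABOR_HOURS.get(c.part, DEFAULT_MAX_HOURS)),
]


def route(claim):
    """Return (decision, reasons): auto-approve if no rule fires,
    otherwise route to expert review with the reasons that fired."""
    reasons = [why for why, fires in RULES if fires(claim)]
    if reasons:
        return ("expert review", reasons)
    return ("auto-approve", [])


print(route(Claim("injector", 2, 450)))
print(route(Claim("injector", 5, 450)))
```

Keeping the explanation next to each rule means the expert sees why a claim was flagged, so only the questionable part of the claim stream needs manual examination.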

  18. Retaining Good Customers • Customer loss: • Banks - Attrition • Cellular Phone Companies - Churn • study who might leave, why • Southern California Gas • customer usage, credit information • direct mail contact - most likely best billing plan • who is price sensitive • Who should get incentives, whom to keep

  19. Fairbank & Morris • Credit card company’s most valuable asset: • INFORMATION ABOUT CUSTOMERS • Signet Banking Corporation • obtained behavioral data from many sources • built predictive models • aggressively marketed balance transfer card • First Union • who will move soon - improve retention

  20. In-class activity • Exercises 3-7, p. 15
