
Data Mining and Knowledge Discovery for Strategic Business Optimization



Presentation Transcript


  1. Data Mining and Knowledge Discovery for Strategic Business Optimization • Peter van der Putten • ALP Group, LIACS & KiQ Ltd • November 2004

  2. Why is a business in business? • Successful businesses create a lot of added value for their customers and capture it • Maximize long term profit • Optimize: Maximize sales, minimize costs, minimize risk

  3. Challenges • Businesses are bigger • Fragmentation of products, customer interaction channels, market segments • Fierce competition, chaotic economic climate and dynamic customer behavior • Data glut & information overflow • Solution: data mining & knowledge discovery for strategic business optimization

  4. Credit scoring case: minimizing loan risk while maximizing loan acceptance • [Figure: accepted volume out of all applications] • Expert knowledge: 29.8% accepted, 12.7% infection • Prediction model plus rules: 34.5% accepted, 9.1% infection

  5. Marketing case: maximizing direct mail response while minimizing cost • [Figure: cumulative gains chart for a logistic regression model; x-axis: Cases (%), y-axis: Cum. positive (%)] • A model was created that predicts the probability of responding to a mailing • By using the model to select customers to mail, we could reach 50% of the responders by mailing only 20% of all customers
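  The gain quoted on this slide (50% of responders reached with 20% of the mailings) is the kind of number read off a cumulative gains curve. A minimal Python sketch of that calculation follows. The helper name `cumulative_gains` and the ten-customer toy data are assumptions made for illustration; they are not the campaign data from the talk.

```python
def cumulative_gains(scores, responded, fraction):
    """Share of all responders reached when mailing the top `fraction`
    of customers, ranked by predicted response probability."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    # Rank customers from highest to lowest model score.
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    n_mailed = round(fraction * len(ranked))
    total = sum(responded)
    if total == 0:
        return 0.0
    reached = sum(flag for _, flag in ranked[:n_mailed])
    return reached / total

# Toy data (hypothetical): 10 customers, 4 of whom responded (1 = responded).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

gain_at_20 = cumulative_gains(scores, responded, 0.20)
print(f"Mailing the top 20% reaches {gain_at_20:.0%} of responders")
```

  Evaluating the function at every fraction from 0 to 1 gives the full curve; a model with no predictive power would stay close to the diagonal.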

  6. Siebel • OMEGA predicts a slight preference for general insurance and offers a one-click cross-sell button. • Although the next customer might have preferences as well, the exit risk is overriding. Using a combination of predictive models and business rules, OMEGA suggests to Siebel an immediate attempt to retain the customer. • OMEGA offers Siebel the appropriate text for its script engine. • Within general insurance, OMEGA predicts a preference for car insurance and offers one-click access to the appropriate script. • OMEGA again offers Siebel the appropriate text to execute a retention script.

  7. Overview • Why Data Mining? • The Data Mining Process • Data Mining Tasks • Data Mining Techniques • Future Outlook • Data Mining Opportunities by Sector and Function • Q&A

  8. Some working definitions… • ‘Data Mining’ and ‘Knowledge Discovery in Databases’ (KDD) are used interchangeably • Data mining = the discovery of interesting, meaningful and actionable patterns hidden in large amounts of data • Multidisciplinary field originating from artificial intelligence, pattern recognition, statistics, machine learning, econometrics, …

  9. Data mining is a process… • Model Development • Objective • Data collection & preparation • Model construction • Model evaluation • Combining models with business knowledge into decision logic • Model / decision logic deployment • Model / decision logic monitoring

  10. Data mining tasks • Undirected, explorative, descriptive, ‘unsupervised’ data mining • Matching & search • Profile & rule extraction • Clustering & segmentation • Directed, predictive, ‘supervised’ data mining • Predictive modeling

  11. Data mining task example: Clustering & segmentation

  12. Data mining task example: Clustering & segmentation

  13. Start Looking Glass

  14. Intermediate result: Looking Glass

  15. Result: Looking Glass

  16. Result: Looking Glass

  17. Data mining task example: predictive modeling • [Figure labels: Past experience, Data, Behaviour (Good / Bad), Model, Score (scale from 10 down to 1, running from "Better business" to "Worse business"); Case A scores 7, Case B scores 4]

  18. Data mining task example: predictive modeling • Collected data

  19. Data mining task example: predictive modeling • Known customer behaviour

  20. Data mining task example: predictive modeling • score = (0 x Income) + (-1 x Age) + (25 x Children)

  21. Data mining task example: predictive modeling • Recruitment • Who will respond to a mailing campaign? • To whom can we cross-sell which products? • What will be the customer value one year from now? • Retention • Who is going to cancel his/her mobile phone subscription? Should I attempt to keep this customer? • Which customers have accounts that will go dormant? • Risk • Should I sell a loan to this person? • How much money will someone claim on a policy? • Is this caller going to pay his bills?

  22. Data mining techniques for predictive modeling • Linear and logistic regression • Decision trees • Neural Networks • Genetic Algorithms • ….

  23. Linear Regression Models • score = (0 x Income) + (-1 x Age) + (25 x Children)
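  The scoring formula on this slide is a weighted sum of customer attributes. A minimal Python sketch, using the weights from the slide; the function name `linear_score` and the two customers are hypothetical, invented only to show the arithmetic.

```python
# Weights from the slide: score = (0 x Income) + (-1 x Age) + (25 x Children)
WEIGHTS = {"income": 0, "age": -1, "children": 25}

def linear_score(customer, weights=WEIGHTS):
    """Weighted sum of the customer's attributes."""
    return sum(weight * customer[name] for name, weight in weights.items())

# Hypothetical customers, not taken from the talk.
customer_1 = {"income": 40000, "age": 30, "children": 2}
customer_2 = {"income": 90000, "age": 55, "children": 0}

print(linear_score(customer_1))  # 0*40000 - 30 + 25*2 = 20
print(linear_score(customer_2))  # 0*90000 - 55 + 25*0 = -55
```

  With a weight of 0, income has no influence on the score at all, which is why the two very different incomes above do not matter.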

  24. Regression in pattern space • Only a single line is available in pattern space to separate classes • [Figure: class ‘square’ and class ‘circle’ plotted on the axes income and age]

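  25. Decision Trees • Root: 20000 customers, response 1% • Income > 150000? yes: 1200 customers; no: 18800 customers • The 1200 customers are split on balance > 50000? yes: 400 customers; no: 800 customers • The 18800 customers are split on Purchases > 10? etc. • Leaf response rates shown: 0.1% and 1.8%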
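  A decision tree like the one on this slide can be read as a chain of yes/no questions that routes each customer to a leaf. A minimal Python sketch follows, assuming a nested-dictionary representation. The split questions and thresholds follow the slide; the leaf names are hypothetical labels, and no response rates are attached because the transcript does not say which rate belongs to which leaf.

```python
# Each inner node asks "attribute > threshold?"; each leaf is a segment name.
TREE = {
    "question": ("income", 150000),
    "yes": {
        "question": ("balance", 50000),
        "yes": "high income, high balance",
        "no": "high income, low balance",
    },
    "no": {
        "question": ("purchases", 10),
        "yes": "lower income, frequent buyer",
        "no": "lower income, infrequent buyer",
    },
}

def assign_segment(customer, node=TREE):
    """Walk the tree from the root until a leaf (a string) is reached."""
    while isinstance(node, dict):
        attribute, threshold = node["question"]
        node = node["yes"] if customer[attribute] > threshold else node["no"]
    return node

# Hypothetical customer, invented for illustration.
customer = {"income": 200000, "balance": 20000, "purchases": 3}
print(assign_segment(customer))  # high income, low balance
```

  Measuring the response rate of the customers that land in each leaf then gives per-segment figures of the kind shown on the slide.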

  26. Decision Trees in Pattern Space • Line pieces perpendicular to the axes • Each line is a split in the tree, two answers to a question • [Figure: axes income and age]

  27. Infotrees (Genetic Programming) • Nested regression formulas • sum(average(region, spend), max(age, children)) • [Figure: the formula drawn as a tree, with the operators sum, max and average above the predictors age, children, region and spend]
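  The nested formula on this slide can be stored as a tree and evaluated recursively. A minimal Python sketch, assuming a tuple representation `(operator, child, child, ...)` in which a bare string is a predictor name. The representation, the `evaluate` helper and the numeric coding of `region` are assumptions for illustration, not the Infotree implementation itself.

```python
# Operators named on the slide, mapped to plain Python functions.
OPERATORS = {
    "sum": lambda *args: sum(args),
    "average": lambda *args: sum(args) / len(args),
    "max": lambda *args: max(args),
}

# sum(average(region, spend), max(age, children)) as a nested tuple.
FORMULA = ("sum", ("average", "region", "spend"), ("max", "age", "children"))

def evaluate(node, customer):
    """Evaluate a formula tree for one customer."""
    if isinstance(node, str):
        return customer[node]  # leaf: look up the predictor value
    operator, *children = node
    values = [evaluate(child, customer) for child in children]
    return OPERATORS[operator](*values)

# Hypothetical customer; region is assumed to be numerically coded.
customer = {"region": 2, "spend": 10, "age": 35, "children": 3}
result = evaluate(FORMULA, customer)
print(result)  # average(2, 10) + max(35, 3) = 6.0 + 35 = 41.0
```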

  28. Infotrees in Pattern Space • Infotrees can separate any class in pattern space, even if the class boundary is non-linear → Can model complex customer behavior • [Figure: axes income and age]

  29. Genetic Algorithms / Programming • How to find the best Infotree? Genetic algorithms • Based on the idea of evolution • Start with (random) Infotrees • Build a new generation • Fittest models can reproduce to create offspring, worst models die • Small amount of mutation occurs to keep exploring • Repeat process
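  The evolutionary loop listed on this slide can be sketched in a few lines of Python. To keep the example short it evolves fixed-length bit strings rather than Infotrees, and the fitness function (the number of ones) is a toy stand-in for model quality. The function name `evolve` and all parameter values are assumptions.

```python
import random

def evolve(fitness, genome_length=20, population_size=30,
           generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Start with random candidates.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # The fittest half survives and may reproduce; the worst half dies.
        population.sort(key=fitness, reverse=True)
        survivors = population[:population_size // 2]
        offspring = []
        while len(survivors) + len(offspring) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_length)  # cross-over point
            child = mother[:cut] + father[cut:]
            # A small amount of mutation keeps the search exploring.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            offspring.append(child)
        # Build the new generation and repeat.
        population = survivors + offspring
    return max(population, key=fitness)

best = evolve(fitness=sum)  # toy fitness: number of ones in the string
print(best, sum(best))
```

  Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next.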

  30. Notes about Infotree models: Cross-over • New models can be created by cross-over: • part of one model is swapped with part of another • parts may be chosen randomly or intelligently • [Figure: two old models and the resulting new model, each drawn as a tree of operators (s1, amean, quadv, convex, concave, invert) over predictors (region, spend, age, children, salary), with the cross-over points marked]
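  Cross-over between two tree-shaped models amounts to swapping one sub-tree of each. A minimal Python sketch with random choice of the swapped parts, assuming trees stored as nested tuples `(operator, child, ...)` with predictor names as leaves. The helper names and the two example models are hypothetical; the operator names are used only as labels and their meaning is not modelled.

```python
import random

def paths(tree, prefix=()):
    """Yield the position of every node as a tuple of child indices."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the node found by following `path` from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def cross_over(a, b, rng):
    """Swap a randomly chosen part of model `a` with a part of model `b`."""
    path_a = rng.choice(list(paths(a)))
    path_b = rng.choice(list(paths(b)))
    return (replace(a, path_a, get(b, path_b)),
            replace(b, path_b, get(a, path_a)))

# Hypothetical old models, built from labels that appear on the slide.
old_1 = ("amean", ("convex", "region", "spend"), "age")
old_2 = ("concave", "salary", ("invert", "children"))

rng = random.Random(0)
new_1, new_2 = cross_over(old_1, old_2, rng)
print(new_1)
print(new_2)
```

  Choosing the cross-over points "intelligently", as the slide puts it, would mean replacing the two `rng.choice` calls with a heuristic; that part is left out here.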

  31. Notes about Infotree models: Mutation • New models can be created by mutation: • part of a model (a sub-tree, operator or predictor) is changed • part and type of change may be chosen randomly or intelligently • [Figure: three example mutations, labelled Sub-tree, Operator and Predictor, applied to trees of operators (convex, concave, amean, s2) over predictors (age, children, region, spend, house, TV)]
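  The three kinds of mutation named on this slide can be sketched as follows, again assuming trees stored as nested tuples `(operator, child, ...)` with predictor names as leaves. The operator and predictor names are labels taken from the slide; the helper names, the example model and the assumption that any operator can take the place of any other are hypothetical simplifications.

```python
import random

OPERATORS = ["convex", "concave", "amean", "s2"]
PREDICTORS = ["age", "children", "region", "spend", "house", "TV"]

def paths(tree, prefix=()):
    """Yield the position of every node as a tuple of child indices."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def mutate(tree, kind, rng):
    """Apply one random mutation: 'sub-tree', 'operator' or 'predictor'.
    Assumes `tree` has at least one operator node."""
    nodes = list(paths(tree))
    if kind == "predictor":
        path = rng.choice([p for p in nodes if isinstance(get(tree, p), str)])
        options = [name for name in PREDICTORS if name != get(tree, path)]
        return replace(tree, path, rng.choice(options))
    if kind == "operator":
        path = rng.choice([p for p in nodes if isinstance(get(tree, p), tuple)])
        node = get(tree, path)
        options = [op for op in OPERATORS if op != node[0]]
        return replace(tree, path, (rng.choice(options),) + node[1:])
    if kind == "sub-tree":
        path = rng.choice(nodes)
        new = (rng.choice(OPERATORS), rng.choice(PREDICTORS), rng.choice(PREDICTORS))
        return replace(tree, path, new)
    raise ValueError(f"unknown mutation kind: {kind}")

# Hypothetical model built from labels on the slide.
model = ("convex", ("concave", "age", "children"), ("amean", "region", "spend"))
rng = random.Random(1)
for kind in ("sub-tree", "operator", "predictor"):
    print(kind, mutate(model, kind, rng))
```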

  32. Short Demo (if time allows…) • Model to predict caravan policy ownership • Combining this model with other models and business rules

  33. Data Mining: the Future • Business (marketing) • More fine-grained segmentation down to the cluster or individual level • More personalised actions, inbound and outbound, in all customer contact channels • Optimization of both value for the business and the customer • Privacy • Technical • From Data Mining to Decisioning, combining multiple models with business rules • Monitoring business and model performance • Data Mining Process Automation

  34. Let’s discuss: Data Mining Opportunities by Function • Marketing, Sales, CRM • Product Development, R&D • Manufacturing, Production, Logistics • Customer service • Finance • Procurement • Human Resources • IT • …

  35. Let’s discuss: Data Mining Opportunities by Sector • Retail • Telco • Pharma • Government • Automotive • Oil • Charity • Consumers / Citizens • …

  36. The Paper: Requirements • 2500 words ±10%, APA style references • No plagiarism / copying! Rephrase in your own words, reference, cite & quote • Two parts of 1250 words each • Your grasp of the research topic: what is data mining? Own interpretation, clear, put into context • Memo to CEO/CIO of a specific company / industry: what are the benefits/changes/opportunities and next steps (best practice, proof of concept)? Impact, convincing, plan of action.

  37. The Paper: Suggestions • Suggestions for ‘companies’ • KPN Mobile, Marketing: how to reduce loss of customers to competitors • Dutch Police, Strategic Innovation: opportunities for law enforcement, privacy implications • Pfizer, Drug Discovery: using data mining to find new drugs • Google, Product Management / R&D: opportunities for new data mining features to enhance customer experience • Your Idea!

  38. The Paper: Resources • Webpage for this talk: • http://www.liacs.nl/~putten/ictvision.html • General Writing Resources: • http://www.liacs.nl/~putten/writingpapers.html • Homepage: • www.liacs.nl/~putten , mail putten@liacs.nl

  39. Dilbert’s Perspective on Data Mining
