Data & Text Mining

Data & Text Mining. Abhay Ahluwalia, Chris Bruck, Christopher Stanton, Stefanie Felitto, Mike Paulus. BUAD 466: Introduction to Business Intelligence. November 30, 2011.

Presentation Transcript


  1. Data & Text Mining • Abhay Ahluwalia, Chris Bruck, Christopher Stanton, Stefanie Felitto, Mike Paulus • BUAD 466: Introduction to Business Intelligence • November 30, 2011

  2. Data Mining Background • Definition: the process of analyzing data from different perspectives and summarizing it into useful information • Data mining software (e.g., XL Miner) allows users to analyze data from many different dimensions, categorize it, and summarize the relationships identified

  3. The Basics of Data Mining • Analyzes relationships and patterns in stored transaction data based on open-ended user queries • Classes: Stored data is used to locate data in predetermined groups • Clusters: Data items are grouped according to logical relationships or consumer preferences • Associations: Data can be mined to identify associations • Sequential patterns: Data is mined to anticipate behavior patterns and trends
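
The "Associations" idea above can be made concrete with the two standard measures used for association rules, support and confidence. The following is a minimal sketch, not the method of any tool named in this presentation; the transactions are invented for illustration.

```python
# Hypothetical shopping baskets (illustrative data, not from the presentation).
transactions = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(set(antecedent) | set(consequent), transactions) / base

print(f"{support({'diapers', 'beer'}, transactions):.2f}")       # 0.40
print(f"{confidence({'diapers'}, {'beer'}, transactions):.2f}")  # 0.67
```

A rule such as "diapers → beer" is usually considered interesting only when both numbers clear thresholds chosen by the analyst.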

  4. Text Mining Background • Definition: the discovery by computer of previously unknown knowledge in text, by automatically extracting information from different written resources • Goal: to extract new, never-before encountered information • Text mining can expand the ability of data mining to deal with textual materials
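
One very simple form of extracting information from written resources is counting which terms occur most often across a set of documents. The sketch below assumes a tiny hand-picked stopword list and invented review texts; real text mining tools go much further (stemming, phrase detection, sentiment analysis).

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "was", "and", "a", "at", "but", "is", "to", "of", "very"}

def top_terms(documents, n=3):
    """Return the n most frequent non-stopword terms across all documents."""
    counts = Counter()
    for doc in documents:
        words = re.findall(r"[a-z']+", doc.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Hypothetical guest reviews (not data from the presentation).
reviews = [
    "The front desk was slow and the room was noisy.",
    "Great room, but the front desk was unhelpful.",
    "Front desk staff friendly; room very clean.",
]

print(top_terms(reviews))
```

Here the most frequent terms point to the topics guests write about most, which is the kind of signal a business would then investigate further.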

  5. Data are Key to Business Value • DATA: Measures of variables in categories • Support Decision Making • Provide Basis for Forecasting • Important to obtain data from new sources (text mining) and to integrate (mash) information from multiple sources

  6. Software Example #1: VAIM (Value-Added Information Mash) • MINING: finding patterns in data (pattern-oriented, record-oriented searches) • MASHING: Integrating information mined from multiple resources • Useful in Hospitals and for Government Campaigns
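
To illustrate the MASHING step in general terms, the sketch below joins records mined from two sources on a shared key. It is not VAIM's actual interface; the function name `mash`, the field names, and the records are all hypothetical.

```python
def mash(primary, secondary, key):
    """Combine records from two sources that share a `key` field.

    Every record in `primary` is kept; matching fields from `secondary`
    are added, and `primary` values win if both sources define a field.
    """
    index = {row[key]: row for row in secondary}
    merged = []
    for row in primary:
        extra = index.get(row[key], {})
        merged.append({**extra, **row})
    return merged

# Hypothetical hospital data: structured admissions plus a term mined from survey text.
admissions = [
    {"patient_id": 1, "ward": "oncology"},
    {"patient_id": 2, "ward": "cardiology"},
]
survey_terms = [{"patient_id": 1, "top_term": "nausea"}]

print(mash(admissions, survey_terms, "patient_id"))
```

Records without a match in the second source are kept unchanged, so nothing from the primary source is lost in the integration.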

  7. Software Example #2: IBM SPSS • Assists in statistical analysis for predicting trends • Categorizes data, performs statistical analysis • Multiple regressions to suggest causality

  8. Software Example #3: XL Miner • Add-in for Microsoft Excel • Builds on software that companies already possess • Assists in predictive forecasting based on observed data trends • Demonstration

  9. Business Value Example #1: Grocery Store • Data mining using Oracle • Analyzed buying patterns • Findings led to changes in marketing • Increased revenues

  10. Business Value Example #2: University of Rochester Cancer Center • Using KnowledgeSEEKER software • Studied the effect of anxiety on chemotherapy-related nausea • Analysis helped improve treatment of patients and improved quality of life

  11. Business Value Example #3: MGM Grand Hotel • Analyzed customer satisfaction and probability of return stay • Found that the front desk and room were most important • Focused the next 6 months on improving them • 10% improvement in attrition • Increased guest returns and profitability

  12. Business Applications • Pros: • Extracts new information and combines human linguistic capabilities with the speed and accuracy of a computer • Can answer the ‘Why?’ • Competitive advantage • Cons: • Expensive • Requires training • Dependent on structure of warehouses and repositories

  13. Complications & Concerns • Invasion of Privacy • According to Lita van Wel and Lambèr Royakkers in “Ethical issues in web data mining”, privacy is considered lost when information about an individual is obtained, used, or spread without that individual’s permission

  14. More Complications • Data is made anonymous before it is gathered into profiles, so there are no personal profiles; these applications therefore de-individualize users by judging them solely by their mouse clicks • De-individualization: the tendency to judge and treat people on the basis of group characteristics instead of their own individual characteristics

  15. More Concerns • Companies can claim to collect the data for one purpose and use it for another • The growing movement of selling personal data as a service encourages website owners to trade personal data obtained from their site • The companies that buy the data make it anonymous and assume ownership of the data that they release • http://www.youtube.com/watch?v=zdM6vzRHrG0

  16. Even More Complications • Some web mining algorithms might use controversial characteristics to categorize individuals, such as sex, race, religion, or sexual orientation • This process could result in the refusal of a service or privilege to an individual based on their sex, race, religion, or sexual orientation

  17. Application Recommendations & Conclusion • Sync data repositories (VAIM Software) • Training • Use Data Mining and Text Mining together

  18. Group Jeopardy:

  19. Data and Text Mining Background for 100: True or False: Clusters refer to data items that are grouped according to logical relationships or consumer preferences? • True

  20. Data and Text Mining Background for 200: What is the name of the data mining software that allows users to analyze data from different dimensions, categorize it, and summarize the relationships it identifies, all within a familiar Microsoft Office program? • XL Miner

  21. Data and Text Mining Background for 300: Name either 2 Pros or 2 Cons of the Business Applications of Data Mining. • Pros: extracts new info, can answer the why, creates a competitive advantage • Cons: expensive, requires training, dependent on structure of warehouses and repositories

  22. Business Applications for 100: What does VAIM stand for? • Value-Added Information Mashing

  23. Business Applications for 200: What is the difference between Text Mining and Text Mashing? • MINING: finding patterns in data (pattern-oriented, record-oriented searches) • MASHING: Integrating information mined from multiple resources

  24. Business Applications for 300: What is the greatest benefit of Text Mining for Businesses? • Extracts new information and combines human linguistic capabilities with the speed and accuracy of a computer

  25. Complications for 100: True or False: Companies that buy the data and make it anonymous are not responsible for potential legal actions against them for using the data? • False. They are responsible and can face serious legal action

  26. Complications for 200: What is the term used when individuals are judged and treated on the basis of group characteristics rather than individual characteristics? • De-individualization

  27. Complications for 300: Which two US Senators introduced the Commercial Privacy Bill of Rights? • John McCain (R-AZ) • John Kerry (D-MA)

  28. From the Examples for 100: When the grocery store analyzed men's buying trends, it found that when men purchased diapers, what other item did they also buy? • Beer

  29. From the Examples for 200: What software did the University of Rochester Cancer Center use to analyze the effects of Chemotherapy treatments on nausea? • KnowledgeSEEKER

  30. From the Examples for 300: What did Text Mining identify as the two most important areas of the MGM Grand Hotel? • The Front Desk and the Room
