
CSc288 Term Project: Data Mining to Predict the Voice-over-IP Phone Market ----- Huaqin Xu


Presentation Transcript


  1. CSc288 Term Project: Data Mining to Predict the Voice-over-IP Phone Market ----- Huaqin Xu

  2. Agenda • Abstract • Introduction • Methodology • Result • Conclusion • Learning Experience • References

  3. Abstract This project is based on VoIP survey data sets. Classifiers from the Weka Explorer were chosen as the data mining tool to build models that predict potential customers of VoIP phones and identify the most important features and services of two VoIP phone models.

  4. Introduction • Background • VoIP phones have a market opportunity given the wide use of Internet service. • Two VoIP phone models: Basic & Deluxe • Data mining scope • Customers • Product features and services

  5. Methodology • Data Mining Tools • C4.5/C5.0, Cubist • Weka • Microsoft SQL Server • SPSS • Chosen: Weka Explorer • Why? Free, easy to use, good interface, more choices

  6. Methodology • Explorer vs. KnowledgeFlow

  7. Methodology • Datasets: 94 instances in total

  8. Methodology • Preprocessing • Split table • Customer: 17 attributes • Basic-model: 14 attributes • Deluxe-model: 10 attributes • Processing missing data • Delete • Replace with “?” • Data transfer: SPSS → Excel → Weka
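The missing-data step above can be sketched in Python. This is a minimal illustration, assuming the survey rows are already in memory and are written out in Weka's ARFF text format, where "?" marks a missing value. The helper `to_arff`, the attribute names, and the sample rows are hypothetical and do not come from the survey itself.

```python
import io

MISSING = "?"  # ARFF marker for a missing value


def to_arff(relation, attributes, rows):
    """Render nominal data as ARFF text; None or empty cells become '?'."""
    out = io.StringIO()
    out.write(f"@relation {relation}\n\n")
    for name, values in attributes:
        out.write(f"@attribute {name} {{{','.join(values)}}}\n")
    out.write("\n@data\n")
    for row in rows:
        cells = [MISSING if v in (None, "") else v for v in row]
        out.write(",".join(cells) + "\n")
    return out.getvalue()


# Hypothetical attributes and rows, for illustration only.
attributes = [
    ("gender", ["Male", "Female"]),
    ("know_voip", ["yes", "no"]),
    ("class", ["Would_Buy", "Would_Not_Buy"]),
]
rows = [
    ["Male", "yes", "Would_Buy"],
    ["Female", None, "Would_Not_Buy"],  # missing answer for know_voip
]

arff_text = to_arff("customer", attributes, rows)
print(arff_text)
```

Keeping the "?" marker instead of deleting the row preserves the other answers of that respondent, which matters with only 94 instances.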

  9. Methodology • Algorithm selection • Classification • Clustering • Association • Chose: NNge Why? • High accuracy rate • Simple, clear Rules

  10. Methodology • NNge classifier • Nearest-neighbor-like algorithm using non-nested generalized exemplars. • A rule-based classifier. • Builds a sort of “hypergeometric” model. • Shows promise as an ML method that performs well on a wide range of datasets.
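The idea of classifying with generalized exemplars can be sketched roughly as follows. This is a simplified illustration for nominal attributes only, not the NNge implementation in Weka: each exemplar is assumed to hold a set of allowed values per attribute, and the distance is assumed to be the number of attributes whose value falls outside that set. The exemplars, the instances, and the class label `Would_Not_Buy` are hypothetical.

```python
def distance(instance, exemplar):
    """Count attributes whose value lies outside the exemplar's allowed set."""
    return sum(
        1
        for attr, allowed in exemplar["conditions"].items()
        if instance.get(attr) not in allowed
    )


def classify(instance, exemplars):
    """Return the class label of the nearest generalized exemplar."""
    nearest = min(exemplars, key=lambda e: distance(instance, e))
    return nearest["label"]


# Hypothetical exemplars, for illustration only.
exemplars = [
    {
        "label": "Would_Buy",
        "conditions": {
            "email": {"yes"},
            "cost": {"10-20", "20-30"},
            "gender": {"Male"},
        },
    },
    {
        "label": "Would_Not_Buy",
        "conditions": {
            "email": {"no"},
            "cost": {"0-10"},
            "gender": {"Male", "Female"},
        },
    },
]

instance_a = {"email": "yes", "cost": "10-20", "gender": "Female"}
instance_b = {"email": "no", "cost": "0-10", "gender": "Male"}

print(classify(instance_a, exemplars))
print(classify(instance_b, exemplars))
```

An instance that falls inside an exemplar has distance 0, which is why each exemplar can also be read as an if-then rule.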

  11. Result

  12. Result

  13. Result • Rules: • One of the customer rules: class Would_Buy IF : cost in {10-20} ^ phone in {yes} ^ email in {yes} ^ fax in {no} ^ chat in {yes,no} ^ other in {no} ^ service type in {Phone_cards_only} ^ price in {Somewhat_Dissatisfied, Somewhat_Satisfied} ^ voice_quality in {Somewhat_Dissatisfied, Somewhat_Satisfied} ^ service in {Somewhat_Dissatisfied} ^ convenience in {Somewhat_Satisfied} ^ promotion in {Somewhat_Dissatisfied} ^ Know VoIP in {yes,no} ^ marital status in {Single} ^ gender in {Male} (11)
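To make the rule above concrete, it can be read as a conjunction of set-membership tests. The sketch below encodes the conditions as a Python dictionary and checks one record against them. The attribute keys are normalized with underscores, and the sample record is hypothetical, not a row from the survey.

```python
SATISFACTION_MID = {"Somewhat_Dissatisfied", "Somewhat_Satisfied"}

# Conditions of the Would_Buy rule, one allowed-value set per attribute.
WOULD_BUY_RULE = {
    "cost": {"10-20"},
    "phone": {"yes"},
    "email": {"yes"},
    "fax": {"no"},
    "chat": {"yes", "no"},
    "other": {"no"},
    "service_type": {"Phone_cards_only"},
    "price": SATISFACTION_MID,
    "voice_quality": SATISFACTION_MID,
    "service": {"Somewhat_Dissatisfied"},
    "convenience": {"Somewhat_Satisfied"},
    "promotion": {"Somewhat_Dissatisfied"},
    "know_voip": {"yes", "no"},
    "marital_status": {"Single"},
    "gender": {"Male"},
}


def matches(record, rule):
    """True when every attribute value lies in the rule's allowed set."""
    return all(record.get(attr) in allowed for attr, allowed in rule.items())


# Hypothetical survey record, for illustration only.
record = {
    "cost": "10-20", "phone": "yes", "email": "yes", "fax": "no",
    "chat": "no", "other": "no", "service_type": "Phone_cards_only",
    "price": "Somewhat_Satisfied", "voice_quality": "Somewhat_Dissatisfied",
    "service": "Somewhat_Dissatisfied", "convenience": "Somewhat_Satisfied",
    "promotion": "Somewhat_Dissatisfied", "know_voip": "yes",
    "marital_status": "Single", "gender": "Male",
}

print(matches(record, WOULD_BUY_RULE))
print(matches({**record, "gender": "Female"}, WOULD_BUY_RULE))
```

Conditions that list every possible value, such as chat in {yes,no}, never exclude a record, so they carry no predictive weight in the rule.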

  14. Result • Statistics: • Class allocation • Feature weights

  15. Result • Basic-model & Deluxe-model • Scheme: meta.AttributeSelectedClassifier • Sub-scheme: rules.NNge • Selected attributes: 3,6,8,10,11,12 : 6 • Why? To avoid overfitting

  16. Result • Evaluation: ten-fold cross-validation • Summary: correctly classified instances > 85% • Detailed accuracy by class: TP, FP, Precision, Recall, F-measure • Confusion matrix: 12 of 94 instances misclassified
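The evaluation figures above can be checked with a few lines of arithmetic. The accuracy uses the counts stated on the slide (12 misclassified out of 94). The per-class confusion-matrix cells below are hypothetical, chosen only so that they add up to 94 instances with 12 errors; the actual matrix is not given in the transcript.

```python
TOTAL = 94          # instances in the data set (from the slide)
MISCLASSIFIED = 12  # misclassified instances (from the slide)

accuracy = (TOTAL - MISCLASSIFIED) / TOTAL
print(f"accuracy = {accuracy:.3f}")  # 0.872, consistent with "> 85%"

# Hypothetical confusion-matrix cells for one class, for illustration only.
tp, fn = 40, 5   # actual positives: predicted positive / predicted negative
fp, tn = 7, 42   # actual negatives: predicted positive / predicted negative
assert tp + fn + fp + tn == TOTAL and fn + fp == MISCLASSIFIED

precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found
f_measure = 2 * precision * recall / (precision + recall)

print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
print(f"F-measure = {f_measure:.3f}")
```

With ten-fold cross-validation, each instance is tested exactly once by a model trained on the other nine folds, so the 94 predictions can be pooled into a single confusion matrix like this one.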

  17. Result

  18. Conclusion • Limitation • Small Datasets • Incomplete Data source • Models • High accuracy rate • Help further Market Analysis • Help product design

  19. Learning Experience • Process a real data mining problem • Know Classification algorithms better • Numeric, Nominal • Missing data • Overfitting • Know Evaluation methods better • How to compare algorithms • Evaluation factors

  20. Learning Experience • Learn how to use Weka • Future work: learn how to modify source to perform better data mining • Learn from classmates

  21. References • “Data Mining: Concepts and Techniques” by Jiawei Han and Micheline Kamber, Morgan Kaufmann, 2001. • “Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations” by Ian H. Witten and Eibe Frank, Morgan Kaufmann, 2000. • Machine Learning: Weka Home Page, http://www.cs.waikato.ac.nz/~ml/index.html • “Marketing Research” by David A. Aaker, V. Kumar and George S. Day, eighth edition, Wiley, 2004.

  22. Thank you
