
An Introduction to Web Mining

Categories of web mining: web content mining (text, multimedia), web structure mining, and web usage mining. Reference: R. Kosala and H. Blockeel, “Web Mining Research: A Survey”, SIGKDD Explorations, vol. 2, issue 1, 2000.



  1. An Introduction to Web Mining

  2. Categories of Web Mining
     • Web Content Mining
       • Text
       • Multimedia
     • Web Structure Mining
     • Web Usage Mining
     Reference
     • R. Kosala and H. Blockeel, “Web Mining Research: A Survey”, SIGKDD Explorations, vol. 2, issue 1, 2000.
     • J. Srivastava et al., “Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data”, SIGKDD Explorations, vol. 1, issue 2, 2000.

  3. How Does It Work – Web Usage Mining Process
     [Diagram: Web Server Log and Site Data → Data Preparation → Clean Data → Data Mining → Usage Patterns]

  4. Web Usage Mining Techniques
     • Data Preparation
       • Data Collection
       • Data Selection
       • Data Cleaning
     • Data Mining
       • Navigation Patterns
       • Association Rules
       • Sequential Patterns
       • Clustering
       • Classification

  5. Data Mining Techniques – Navigation Patterns
     [Figure: Web page hierarchy of a web site, with pages A, B, C, D and E]

  6. Data Mining Techniques – Navigation Patterns
     [Figure: the same page hierarchy, with pages A, B, C, D and E]
     • A link could be provided from C to E

  7. Data Mining Techniques – Navigation Patterns (cont.)
     • Analysis Examples
       • 70% of users who accessed /company/product2 did so by starting at /company and proceeding through /company/new, /company/products and /company/product1
       • 80% of users who accessed the site started from /company/products
       • 65% of users left the site after four or fewer page references
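
Statistics like the ones above can be computed directly from per-visit page sequences. The following is a minimal sketch, assuming the log has already been split into sessions (one ordered list of page paths per visit). The sample sessions and all function names are hypothetical and are not taken from the slides.

```python
from typing import List, Sequence

Session = List[str]


def share_starting_at(sessions: Sequence[Session], page: str) -> float:
    """Fraction of sessions whose first page reference is `page`."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if s and s[0] == page) / len(sessions)


def share_leaving_within(sessions: Sequence[Session], max_refs: int) -> float:
    """Fraction of sessions that ended after at most `max_refs` page references."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) <= max_refs) / len(sessions)


def share_reaching_via(sessions: Sequence[Session], target: str,
                       path: Sequence[str]) -> float:
    """Among sessions that accessed `target`, the fraction whose page
    references before the first access to `target` are exactly `path`."""
    reached = [s for s in sessions if target in s]
    if not reached:
        return 0.0
    hits = sum(1 for s in reached if s[: s.index(target)] == list(path))
    return hits / len(reached)


# Hypothetical sessions, for illustration only.
sessions = [
    ["/company", "/company/new", "/company/products",
     "/company/product1", "/company/product2"],
    ["/company/products", "/company/product2"],
    ["/company/products", "/company/product1"],
    ["/company", "/company/new"],
]

print(share_starting_at(sessions, "/company/products"))
print(share_leaving_within(sessions, 4))
print(share_reaching_via(
    sessions, "/company/product2",
    ["/company", "/company/new", "/company/products", "/company/product1"]))
```

Requiring an exact path match is one possible reading of "proceeding through"; a looser variant could accept any session that visits those pages in order with other pages in between.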

  8. Data Mining Techniques – Association Rules
     • Supermarket example

       Transaction ID   Items Purchased
       1                butter, bread, milk, beer, diaper
       2                bread, milk, beer, egg
       3                Coke, Film, bread, butter, milk
       …                …

     • An association rule looks like: “If a customer buys diapers, in 60% of cases he/she also buys beer. This happens in 3% of all transactions.”
       • 60%: confidence
       • 3%: support
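
Support and confidence follow from simple counting. The sketch below uses only the three transactions listed on the slide, so its numbers differ from the 60% and 3% quoted above, which refer to the full data set (not shown). The function names are illustrative.

```python
from typing import Iterable, List, Set

# The three transactions listed on the slide (item names lowercased).
transactions: List[Set[str]] = [
    {"butter", "bread", "milk", "beer", "diaper"},
    {"bread", "milk", "beer", "egg"},
    {"coke", "film", "bread", "butter", "milk"},
]


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Of the transactions containing `antecedent`, the fraction that
    also contain `consequent`."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


# Rule: diaper -> beer
print(support({"diaper", "beer"}, transactions))
print(confidence({"diaper"}, {"beer"}, transactions))
```

Support is measured over all transactions, while confidence is measured only over the transactions that contain the antecedent.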

  9. Data Mining Techniques – Association Rules (cont.)
     • Web usage example
       • 40% of users who accessed the Web page with URL /company/product1 also accessed /company/product2
       • 30% of users who accessed /company/special placed an online order in /company/product1
       • 50% of users who bought the books by Michael Crichton also reviewed those by John Grisham in the same visit

  10. Data Mining Techniques – Sequential Patterns
      • Supermarket example

        Customer   Transaction Time    Purchased Items
        John       6/21/97 5:30 pm     Beer
        John       6/22/97 10:20 pm    Brandy
        Frank      6/20/97 10:15 am    Juice, Coke
        Frank      6/20/97 11:50 am    Beer
        Frank      6/21/97 9:25 am     Wine, Water, Cider
        Mitchell   6/21/97 3:20 pm     Beer, Gin, Cider
        Mary       6/20/97 2:30 pm     Beer
        Mary       6/21/97 6:17 pm     Wine, Cider
        Mary       6/22/97 5:05 pm     Brandy
        Robin      6/20/97 11:05 pm    Brandy

  11. Data Mining Techniques – Sequential Patterns (cont.)
      • Supermarket example: customer sequences

        Customer   Customer Sequence
        John       (Beer) (Brandy)
        Frank      (Juice, Coke) (Beer) (Wine, Water, Cider)
        Mitchell   (Beer, Gin, Cider)
        Mary       (Beer) (Wine, Cider) (Brandy)
        Robin      (Brandy)
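
The step from the transaction table to the customer sequences is a group-by-customer followed by a sort on transaction time. A minimal sketch of that step, using the rows from the supermarket example; the function name and the timestamp format string are assumptions made for this illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple

# (customer, transaction time, purchased items), as in the supermarket example.
transactions = [
    ("John", "6/21/97 5:30 pm", ["Beer"]),
    ("John", "6/22/97 10:20 pm", ["Brandy"]),
    ("Frank", "6/20/97 10:15 am", ["Juice", "Coke"]),
    ("Frank", "6/20/97 11:50 am", ["Beer"]),
    ("Frank", "6/21/97 9:25 am", ["Wine", "Water", "Cider"]),
    ("Mitchell", "6/21/97 3:20 pm", ["Beer", "Gin", "Cider"]),
    ("Mary", "6/20/97 2:30 pm", ["Beer"]),
    ("Mary", "6/21/97 6:17 pm", ["Wine", "Cider"]),
    ("Mary", "6/22/97 5:05 pm", ["Brandy"]),
    ("Robin", "6/20/97 11:05 pm", ["Brandy"]),
]


def build_sequences(rows) -> Dict[str, List[Tuple[str, ...]]]:
    """Group transactions by customer and order each group by time."""
    by_customer = defaultdict(list)
    for customer, when, items in rows:
        timestamp = datetime.strptime(when, "%m/%d/%y %I:%M %p")
        by_customer[customer].append((timestamp, tuple(items)))
    return {
        customer: [items for _, items in sorted(visits)]
        for customer, visits in by_customer.items()
    }


for customer, sequence in build_sequences(transactions).items():
    print(customer, sequence)
```

Each customer ends up with one ordered list of itemsets, which is the input format that sequential pattern mining works on.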

  12. Data Mining Techniques – Sequential Patterns (cont.)
      • Supermarket example: mining result

        Sequential Patterns with Support >= 40%   Supporting Customers
        (Beer) (Brandy)                           John, Mary
        (Beer) (Wine, Cider)                      Frank, Mary
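
A customer supports a sequential pattern when the pattern's itemsets appear, in order, inside that customer's sequence. The sketch below checks this against the five customer sequences from the supermarket example. It only tests given candidate patterns and does not enumerate patterns as a full mining algorithm would; the function names are illustrative.

```python
from typing import Dict, List, Set

Sequence_ = List[Set[str]]

# Customer sequences from the supermarket example.
customer_sequences: Dict[str, Sequence_] = {
    "John": [{"Beer"}, {"Brandy"}],
    "Frank": [{"Juice", "Coke"}, {"Beer"}, {"Wine", "Water", "Cider"}],
    "Mitchell": [{"Beer", "Gin", "Cider"}],
    "Mary": [{"Beer"}, {"Wine", "Cider"}, {"Brandy"}],
    "Robin": [{"Brandy"}],
}


def contains(sequence: Sequence_, pattern: Sequence_) -> bool:
    """True if each itemset of `pattern` is a subset of a distinct itemset
    of `sequence`, with the order of the pattern preserved."""
    pos = 0
    for wanted in pattern:
        # Advance to the next transaction that includes all wanted items.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # later pattern elements must match later transactions
    return True


def supporters(pattern: Sequence_, sequences: Dict[str, Sequence_]) -> List[str]:
    """Customers whose sequence contains `pattern`."""
    return [name for name, seq in sequences.items() if contains(seq, pattern)]


def support(pattern: Sequence_, sequences: Dict[str, Sequence_]) -> float:
    """Fraction of customers whose sequence contains `pattern`."""
    return len(supporters(pattern, sequences)) / len(sequences)


for pattern in ([{"Beer"}, {"Brandy"}], [{"Beer"}, {"Wine", "Cider"}]):
    print(pattern, supporters(pattern, customer_sequences),
          support(pattern, customer_sequences))
```

With five customers, the 40% threshold corresponds to at least two supporting customers.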

  13. Data Mining Techniques – Sequential Patterns (cont.)
      • Web usage examples
        • 30% of users who visited /company/products had done a search in Yahoo within the past week on keyword w
        • 60% of users who placed an online order in /company/product1 also placed an order in /company/product4 within 15 days

  14. Data Mining Techniques – Clustering
      [Figure labels: Customer Profile; dynamic; static]

  15. Data Mining Techniques – Clustering (cont.)
      [Figure: plot of Age (axis up to 100) against Income (axis up to $150,000), showing cluster 1, cluster 2 and cluster 3]
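
The slides do not say which clustering algorithm produced the three clusters. As one possible illustration, here is a small k-means sketch on (age, income) pairs. The customer data is hypothetical, and the min-max scaling and farthest-point initialization are choices made for this example so that the result is deterministic.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def min_max_scale(points: Sequence[Point]) -> List[Point]:
    """Rescale every feature to [0, 1] so that income does not dominate age."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [tuple((v - lo) / sp for v, lo, sp in zip(p, lows, spans))
            for p in points]


def farthest_point_init(points: Sequence[Point], k: int) -> List[Point]:
    """Deterministic start: first point, then repeatedly the point that is
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def k_means(points: Sequence[Point], k: int, iterations: int = 20) -> List[int]:
    """Return a cluster label (0..k-1) for every input point."""
    scaled = min_max_scale(points)
    centers = farthest_point_init(scaled, k)
    labels = [0] * len(scaled)
    for _ in range(iterations):
        # Assignment step: nearest center for each point.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in scaled]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(scaled, labels) if label == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


# Hypothetical (age, income) pairs, for illustration only.
customers = [
    (22, 20_000), (25, 24_000), (27, 30_000),
    (45, 120_000), (50, 140_000), (48, 130_000),
    (70, 40_000), (75, 35_000), (68, 45_000),
]
print(k_means(customers, k=3))
```

Scaling matters here: without it, distances would be driven almost entirely by income, because its numeric range is thousands of times larger than that of age.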

  16. Data Mining Techniques – Classification
      • Deployed methods
        • Decision Trees
      • Example

  17. Data Mining Techniques – Classification (cont.)
      • Example 1
      [Figure: decision tree that splits on Income (high / low); node labels D1 and D2]

  18. Data Mining Techniques – Classification (cont.)
      • Example 2
      [Figure: decision tree that splits on Income (high / low); node labels D1, D1a, D1b and D2]

  19. Data Mining Techniques – Classification
      • Web usage examples
        • 50% of users who placed an online order in /company/product2 were in the 20-25 age group and lived on the West Coast
        • If a user puts more than 2 items in the shopping cart, he/she will place an order during that visit to the site
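
A rule like the shopping-cart example can be learned from labelled sessions. The sketch below fits a one-level decision tree (a decision stump) on a single feature, the number of items in the cart. This is a deliberate simplification of the decision trees named earlier; the training sessions and function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical training data: (items in cart, whether an order was placed).
sessions: List[Tuple[int, bool]] = [
    (0, False), (1, False), (2, False), (2, False), (1, False),
    (3, True), (4, True), (5, True), (3, True),
]


def fit_stump(samples: List[Tuple[int, bool]]) -> Optional[int]:
    """Pick the threshold t for which the rule 'cart_items > t means order'
    misclassifies the fewest training samples."""
    best_threshold, best_errors = None, None
    for threshold in sorted({n for n, _ in samples}):
        errors = sum(1 for n, ordered in samples if (n > threshold) != ordered)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold: int, cart_items: int) -> bool:
    """Apply the learned rule to a new session."""
    return cart_items > threshold


threshold = fit_stump(sessions)
print(f"Rule: order expected if cart holds more than {threshold} items")
print(predict(threshold, 3), predict(threshold, 2))
```

A full decision tree would apply the same idea recursively, choosing a feature and threshold at every node instead of only once.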

  20. Challenges
      • Large data size
      • Sampling vs. accuracy
      • Data complexity
      • Need for hybrid data mining methods
      • Filtering of mining results
      • Incremental mining
      • On-line (real-time) mining

  21. Applications: Personalized Web Services
      [Diagram labels: users; Entry System; Business System; Real-Time Response System; Response & Recommendation; Warehouse; Data Preparation; Clean Warehouse; Data Mining; Business Rules; Rules Executor]
