Web mining

by: Katharotiya Manthan


Overview

  • Web Mining

  • Semantic Web

  • Ontologies

  • Semantic Web Mining

  • Future Work

  • References


Problems With Web Interaction

  • Finding Relevant Information

  • Creating New Knowledge using Existing Resources

  • Personalization of Information

  • Learning about Consumers or Individual Users


Web Mining

  • The term was coined by Oren Etzioni (1996)

  • Application of data mining techniques to web data

  • Web mining decomposes into subtasks:

    • Resource finding

    • Information Selection and pre-processing

    • Generalization

    • Analysis


Different Types

  • Web Usage Mining

  • Web Content Mining 

  • Web Structure Mining


Data Mining vs. Web Mining

  • Traditional data mining

    • data is structured and relational

    • well-defined tables, columns, rows, keys, and constraints.

  • Web data

    • Semi-structured and unstructured

    • readily available data

    • rich in features and patterns


Web Structure Mining

  • Generates a structural summary of the web site and its pages

    • Extraction of patterns from the hyperlinks

    • Mining of the structure of the document
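As a rough illustration of extracting patterns from hyperlinks, the sketch below pulls `href` targets out of HTML with Python's standard `html.parser` and counts how often each target is linked. The two-page site and the helper names (`LinkExtractor`, `extract_links`, `in_degree`) are made up for this example and do not come from the slides.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the hyperlink targets found in one HTML document."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def in_degree(pages):
    """pages maps a URL to its HTML; count how many links point at each target."""
    counts = Counter()
    for html in pages.values():
        counts.update(extract_links(html))
    return counts


# Hypothetical two-page site, invented for illustration.
pages = {
    "/company": '<a href="/company/products">Products</a> <a href="/company/new">New</a>',
    "/company/new": '<a href="/company/products">Products</a>',
}
link_counts = in_degree(pages)
print(link_counts.most_common())
```

Counting incoming links is only one simple structural summary; link-analysis methods build on the same extracted graph.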


Web Usage Mining

  • Discovering user ‘navigation patterns’ from web data.

    • Prediction of user behavior while the user interacts with the web.

    • Helps to improve large collections of resources.


Usage Mining Techniques

  • Data Preparation

    • Data Collection

    • Data Selection

    • Data Cleaning

  • Data Mining

    • Navigation Patterns

    • Sequential Patterns
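The data preparation steps can be sketched roughly as follows. This assumes the collected data has already been reduced to `(user, timestamp, path)` records; the 30-minute inactivity cut-off, the list of ignored file suffixes, and the sample records are assumptions made for the example, not values from the slides.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed cut-off, not from the slides
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js")  # assumed noise


def clean(records):
    """Data cleaning: drop requests for embedded resources such as images."""
    return [r for r in records if not r[2].lower().endswith(IGNORED_SUFFIXES)]


def sessionize(records):
    """Group each user's requests into sessions split by inactivity gaps."""
    by_user = defaultdict(list)
    for user, ts, path in sorted(records, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, path))

    sessions = []
    for user, hits in by_user.items():
        current = [hits[0][1]]
        for (prev_ts, _), (ts, path) in zip(hits, hits[1:]):
            if ts - prev_ts > SESSION_TIMEOUT:
                # Gap too long: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(path)
        sessions.append((user, current))
    return sessions


# Hypothetical collected records: (user id, timestamp in seconds, requested path).
raw = [
    ("u1", 0, "/company"),
    ("u1", 5, "/logo.gif"),
    ("u1", 60, "/company/products"),
    ("u1", 4000, "/company"),
    ("u2", 10, "/company/products"),
]
sessions = sessionize(clean(raw))
print(sessions)
```

The resulting per-user page sequences are the input that navigation-pattern and sequential-pattern mining would work on.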


Data Mining Techniques

  • Navigation Patterns

    • Example:

    • 70% of users who accessed /company/product2 did so by starting at /company and proceeding through /company/new, /company/products and /company/product1

    • 80% of users who accessed the site started from /company/products

    • 65% of users left the site after four or fewer page references
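Statistics of this kind can be computed from sessionized page sequences. The sketch below is a minimal illustration: the four sessions are invented, and the function names are hypothetical, so the resulting percentages are not the ones quoted in the example above.

```python
def _ratio(count, total):
    """Safe division that returns 0.0 for an empty denominator."""
    return count / total if total else 0.0


def via_path_share(sessions, target, path):
    """Among sessions containing target, the share that begin with path then target."""
    hits = [s for s in sessions if target in s]
    expected = list(path) + [target]
    matched = sum(1 for s in hits if s[:len(expected)] == expected)
    return _ratio(matched, len(hits))


def start_share(sessions, page):
    """Share of sessions whose first request is page."""
    return _ratio(sum(1 for s in sessions if s and s[0] == page), len(sessions))


def short_share(sessions, max_pages):
    """Share of sessions with at most max_pages page references."""
    return _ratio(sum(1 for s in sessions if len(s) <= max_pages), len(sessions))


# Hypothetical sessions, one list of visited paths per visit.
sessions = [
    ["/company", "/company/new", "/company/products",
     "/company/product1", "/company/product2"],
    ["/company/products", "/company/product2"],
    ["/company/products", "/company/product1"],
    ["/company", "/company/new"],
]
path = ["/company", "/company/new", "/company/products", "/company/product1"]

via = via_path_share(sessions, "/company/product2", path)
start = start_share(sessions, "/company/products")
short = short_share(sessions, 4)
print(via, start, short)
```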


Data Mining Techniques (Cont…)

  • Sequential Patterns

    • 30% of users who visited /company/product/ had run a Google search with ‘camera’ as the query text within the past week.

    • 60% of users who placed an online order in /company/product1 also placed an order in /company/product4 within 15 days
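A rule such as the order example can be checked by measuring how often one event is followed by another within a time window. The sketch below assumes events are `(user, day, item)` tuples and reads "within 15 days" as "at most 15 days later"; both the event data and that interpretation are assumptions for illustration.

```python
def followed_within(events, first, second, max_days):
    """Share of users with event `first` who also have `second` at most max_days later."""
    by_user = {}
    for user, day, item in events:
        by_user.setdefault(user, []).append((day, item))

    with_first = 0
    with_both = 0
    for history in by_user.values():
        first_days = [d for d, i in history if i == first]
        if not first_days:
            continue  # this user never triggered the first event
        with_first += 1
        second_days = [d for d, i in history if i == second]
        if any(0 <= s - f <= max_days for f in first_days for s in second_days):
            with_both += 1
    return with_both / with_first if with_first else 0.0


# Hypothetical order events: (user id, day number, page where the order was placed).
events = [
    ("u1", 1, "/company/product1"),
    ("u1", 10, "/company/product4"),
    ("u2", 2, "/company/product1"),
    ("u2", 30, "/company/product4"),
    ("u3", 3, "/company/product1"),
    ("u4", 5, "/company/product4"),
]
confidence = followed_within(events, "/company/product1", "/company/product4", 15)
print(round(confidence, 2))
```

Here only one of the three users who ordered on /company/product1 also ordered on /company/product4 inside the window, so the rule would hold for about a third of them in this toy data.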


Web Content Mining

  • The process of information or resource discovery from the content of millions of sources across the World Wide Web

    • E.g. web data contents: text, images, audio, video, metadata and hyperlinks

  • Goes beyond keyword extraction or simple statistics of words and phrases in documents.


    Semantic Web

    • The Semantic Web is an evolving development of the World Wide Web in which the meaning (semantics) of information and services on the web is defined, making it possible for the web to "understand" and satisfy the requests of people and machines to use the web content.


    XML, RDF and Web Data

    • Structured and Unstructured Data

    • W3C Standards for RDF

    • Semantic Web: Different Kinds of databases

    • Tight Coupling and Loose Coupling


    RDF - Resource Description Framework

    • Data Model consists of three object types:

      • Resources

      • Properties

      • Statements


    Example

    • Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila

    • This sentence has the following parts:

      • Subject (Resource): http://www.w3.org/Home/Lassila

      • Predicate (Property): Creator

      • Object (Literal): "Ora Lassila"
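To make the subject, predicate and object structure concrete, here is a minimal sketch that holds the statement as a Python tuple and writes it out as one N-Triples line. The slide names the property only as "Creator", so the predicate URI below is a placeholder assumption, and `Statement` and `to_ntriples` are hypothetical helper names.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str    # the resource being described (a URI)
    predicate: str  # the property (a URI)
    obj: str        # the value; here a plain literal


def to_ntriples(stmt):
    """Serialise one statement with a literal object as an N-Triples line."""
    escaped = stmt.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{stmt.subject}> <{stmt.predicate}> "{escaped}" .'


stmt = Statement(
    subject="http://www.w3.org/Home/Lassila",
    predicate="http://example.org/schema/Creator",  # placeholder URI, assumed
    obj="Ora Lassila",
)
line = to_ntriples(stmt)
print(line)
```

A real application would more likely use an RDF library than hand-built strings; the point here is only that every statement reduces to the same three parts.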


    Ontologies

    • Ontologies are developed to provide machine-processable semantics of information sources that can be communicated between different agents (software and humans).


    Developing an Ontology

    • Defining classes in the ontology

    • Arranging the classes in a taxonomic (subclass–superclass) hierarchy

    • Defining slots and describing allowed values for these slots

    • Filling in the values for slots for instances
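The four steps above can be mimicked in a few lines of Python. This is only a toy sketch: real ontologies are normally written in a dedicated language such as OWL, and the `Product` and `Camera` classes, their slots, and the instance values are invented for illustration.

```python
class OntologyClass:
    """A class in the ontology with an optional superclass and typed slots."""

    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        self.slots = slots or {}  # slot name -> allowed Python type

    def all_slots(self):
        """Slots defined here plus those inherited from superclasses."""
        inherited = self.parent.all_slots() if self.parent else {}
        return {**inherited, **self.slots}

    def is_subclass_of(self, other):
        """Walk up the taxonomy looking for other."""
        cls = self
        while cls is not None:
            if cls is other:
                return True
            cls = cls.parent
        return False


def make_instance(cls, **values):
    """Fill slot values for an instance, rejecting unknown slots or wrong types."""
    allowed = cls.all_slots()
    for slot, value in values.items():
        if slot not in allowed:
            raise ValueError(f"{cls.name} has no slot {slot!r}")
        if not isinstance(value, allowed[slot]):
            raise TypeError(f"slot {slot!r} expects {allowed[slot].__name__}")
    return {"class": cls.name, **values}


# Steps 1-3: classes, hierarchy, and slots with allowed value types.
product = OntologyClass("Product", slots={"name": str, "price": float})
camera = OntologyClass("Camera", parent=product, slots={"megapixels": int})

# Step 4: fill in slot values for an instance.
instance = make_instance(camera, name="Example Cam", price=199.0, megapixels=12)
print(instance)
```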


    Semantic Web Mining

    • Closing the gap between Semantic Web and Web Mining.

    • Use of ontologies


    Mining the Semantic in Web


    Evaluation Of Semantic Web Mining


    Web mining

    • Web Mining Vs. Semantic Web Mining

    • A Note On E-Commerce


    Research initiatives

    • Vivísimo proposes a clustering approach for web document organization

    • Haveliwala et al. also propose a methodology for evaluating strategies for similarity search on the Web.

      • Jaccard coefficient
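For reference, the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to word sets from two short strings; the sample texts and the convention of returning 1.0 for two empty sets are assumptions for the example, not details from the cited work.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical documents, reduced to sets of whitespace-separated words.
doc1 = "web mining applies data mining to web data".split()
doc2 = "semantic web mining uses ontologies".split()

similarity = jaccard(doc1, doc2)
print(similarity)
```

The two word sets share "web" and "mining" out of eight distinct words in total, which is where the value comes from.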


    Future Work

    • Demonstrating the utility of web mining can be done by making exploratory changes to web sites, e.g., adding links from hot parts of web site to cold parts and then extracting, visualizing and interpreting changes in access patterns.


    Future Work (Cont…)

    • There is often a tension in the design of algorithms between accommodating a wide range of data and customizing the algorithm to capitalize on known constraints or regularities.

    • Web content mining can also be incorporated into implementations of this architecture.


    References

    • http://en.wikipedia.org/wiki/Web_mining

    • http://www.engr.sjsu.edu/meirinaki/papers/NEMIS.pdf

    • http://www.w3.org

    • http://www.cs.washington.edu/research/projects/WebWare1/www/softbots/papers/agents97.pdf

    • http://infomesh.net/2001/swintro/

    • http://www.ksl.stanford.edu/people/dlm/etai/etai-abstract.html
