
Contents: 1 – Introduction to the subject of web mining and techniques

7ET023 – MSc Dissertation. Student Name: Colin Hopson. Student Number: 0482647. Course Title: MSc Computer Science (Internet Engineering). Research Question: What is the most suitable web mining technique for a specified business and mobile application case study?



Presentation Transcript


  1. 7ET023 – MSc Dissertation
     Student Name: Colin Hopson
     Student Number: 0482647
     Course Title: MSc Computer Science (Internet Engineering)
     Research Question: What is the most suitable web mining technique for a specified business and mobile application case study?
     Contents:
     • 1 – Introduction to the subject of web mining and techniques
     • 2 – Overview of research conducted (both theory and practical)
     • 3 – Software applications on which to test web mining techniques
     • 4 – Demonstration (Digital Solutions and Repairs)
     • 5 – Evaluating results (suitability and practicality)

  2. 7ET023 – MSc Dissertation
     1 – Introduction to the subject of web mining and techniques
     • Sequential research of techniques for an empirical study
     • Initial research into data mining (databases)
     • Previous knowledge of web services (RSS, REST, etc.)
     • Research into theory of web mining
        • Web usage mining – logs to examine navigation patterns
        • Web structure mining – examine link hierarchy
        • Web content mining – “the discovery of useful information from the Web by examining the data that is contained in the Web site” (Pendharkar, 2003, p. 243) *
     • Data extraction from HTML (machine learning algorithms)
        • Wrapper Induction
        • Semi-Automatic Extraction
     * Pendharkar, P.C. (2003) Managing data mining technologies in organizations: techniques and applications, Idea Group Pub, Hershey.
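The web usage mining item on this slide (logs examined for navigation patterns) can be illustrated with a minimal Python sketch. The log records, the visitor identifiers and the `navigation_patterns` helper are hypothetical and not taken from the dissertation. A real server log would first need parsing and splitting into sessions.

```python
from collections import Counter

# Hypothetical, simplified log records: (visitor id, requested path), in time order.
LOG = [
    ("v1", "/"), ("v1", "/repairs"), ("v1", "/contact"),
    ("v2", "/"), ("v2", "/repairs"),
    ("v3", "/"), ("v3", "/shop"),
]

def navigation_patterns(records):
    """Count how often a visitor moves directly from one page to the next."""
    last_page = {}           # most recent page seen for each visitor
    transitions = Counter()  # (from_page, to_page) -> number of occurrences
    for visitor, path in records:
        if visitor in last_page:
            transitions[(last_page[visitor], path)] += 1
        last_page[visitor] = path
    return transitions

patterns = navigation_patterns(LOG)
for (src, dst), count in patterns.most_common():
    print(f"{src} -> {dst}: {count}")
```

Counting direct page-to-page transitions is only one simple way to summarise navigation; longer paths or session-level statistics would need a richer model.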

  3. 7ET023 – MSc Dissertation
     2 – Overview of research conducted (both theory and practical)
     • Researching Theory of Data and Web Mining
        • Empirical research method to acquire knowledge
        • Research into data mining, web mining, data extraction algorithms, etc.
        • Sequential investigation of applicable techniques
     • Artefact Design and Development
        • E-commerce prototype website (Digital Solutions and Repairs)
        • Mobile application (Mobile Shopper)
     • Practical Research to Implement Techniques
        • Resolution of web services (Amazon APIs)
        • HTML extraction technique using XML, DOM, XPath and PHP arrays
        • Consuming Google API with REST, DOM, XPath and PHP arrays
        • Third-Party Software (Newprosoft and Automation Anywhere)
        • Functionality of XSLT
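The HTML extraction technique listed on this slide (DOM, XPath and arrays) was implemented in PHP in the dissertation. As a rough illustration only, the same idea is sketched here in Python with the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset. The sample page, the class names and the `extract_products` helper are assumptions, and the markup is assumed to be well-formed XHTML, which real pages often are not.

```python
import xml.etree.ElementTree as ET

# Hypothetical product listing; assumed to be well-formed XHTML.
PAGE = """
<html><body>
  <div class="product"><h2>Laptop battery</h2><span class="price">24.99</span></div>
  <div class="product"><h2>USB charger</h2><span class="price">9.50</span></div>
  <div class="advert"><h2>Sponsored</h2></div>
</body></html>
"""

def extract_products(markup):
    """Select product nodes with an XPath-style query and copy them into a list of dicts."""
    root = ET.fromstring(markup)
    products = []
    # Only <div class="product"> elements are selected; the advert is skipped.
    for node in root.findall(".//div[@class='product']"):
        products.append({
            "name": node.findtext("h2"),
            "price": float(node.findtext("span[@class='price']")),
        })
    return products

products = extract_products(PAGE)
for item in products:
    print(item["name"], item["price"])
```

Because the query is tied to one page layout, an extractor of this kind has to be rewritten whenever the target site changes its HTML.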

  4. 7ET023 – MSc Dissertation 3 – Software applications on which to test web mining techniques

  5. 7ET023 – MSc Dissertation
     4 – Demonstration (Digital Solutions and Repairs)
     • Web Mining Technique 1: Amazon API
        • (coded class/methods)
     • Web Mining Technique 2: HTML Extraction
        • (DOMDocument, XPath and PHP arrays)
     • Web Mining Technique 3: Google API
        • (REST, DOMDocument, XPath and PHP arrays)
     • Web Mining Technique 4: Third-Party Software
        • (Automation Anywhere and Newprosoft)
     • Web Mining Technique 5: None implemented, but XSLT investigated
     Website Demonstration >>>
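Technique 3 consumes a web service over REST and turns the XML response into arrays. The sketch below shows that general pattern in Python rather than the PHP used in the demonstration. The endpoint `https://api.example.com/products`, the parameter names and the response layout are hypothetical placeholders, not the real Google or Amazon interfaces, and no network request is made: a canned response stands in for the service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://api.example.com/products"  # hypothetical endpoint

def build_request_url(query, api_key, max_results=10):
    """Encode the search terms and key as a REST query string."""
    params = {"q": query, "key": api_key, "max-results": max_results}
    return f"{BASE_URL}?{urlencode(params)}"

# Hypothetical XML body standing in for what such a service might return.
SAMPLE_RESPONSE = """<feed>
  <entry><id>p-001</id><title>Laptop battery</title><seller>Shop A</seller></entry>
  <entry><id>p-002</id><title>Laptop battery (compatible)</title><seller>Shop B</seller></entry>
</feed>"""

def parse_entries(xml_text):
    """Turn each <entry> element into a dict keyed by its child tag names."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in entry}
            for entry in root.findall("entry")]

url = build_request_url("laptop battery", "DEMO")
entries = parse_entries(SAMPLE_RESPONSE)
print(url)
for entry in entries:
    print(entry["id"], entry["title"], entry["seller"])
```

In a live system the URL would be fetched over HTTP and the key obtained through the provider's registration process.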

  6. 7ET023 – MSc Dissertation
     5 – Evaluating results (suitability and practicality)
     • Web Mining Technique 1: Amazon API
        • Requires registration and associate keys
        • Product Advertising API has most requirements (plus more)
        • ASINs assist administration system
        • Top quality delivery and discounts
        • Regular updates although lengthy documentation
     • Web Mining Technique 2: HTML Extraction
        • No cost, but requires programming knowledge
        • Bespoke algorithm specific for HTML format
        • Limited to one online organisation
     • Web Mining Technique 3: Google API
        • Requires registration and associate keys
        • Searches products from many online organisations
        • GoogleId does not assist administration system
        • Web service retrieves limited product information
        • Top security measures, but lengthy documentation
     • Web Mining Technique 4: Third-Party Software
        • Limited free trial with subscription costs
        • Possible difficulty with integration with administration system
     • Web Mining Technique 5: XSLT investigated
        • Limited free trial with subscription costs
        • Integration difficulties with administration system

  7. 7ET023 – MSc Dissertation
     SUMMARY
     • Study of web mining and some of its techniques
        • Empirical study, data mining, web services, web content mining, data extraction algorithms
     • Sequential research conducted (theory and practical)
        • Web services (APIs), HTML extraction, Third-Party software, XSLT
     • E-commerce prototype website and mobile application
        • ‘Digital Solutions and Repairs’ and ‘Mobile Shopper’
     • Demonstration of web mining techniques
        • DSR computer repairs administration system
     • Evaluation of web mining techniques investigated
        • Comparison between APIs, HTML extraction, third-party software and XSLT
     Questions?
