
Data Mining – A First View

Data Mining – A First View. Roiger & Geatz. Definition. Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge contained within a database. Knowledge Discovery in Databases (KDD) is often treated as synonymous with data mining.


Presentation Transcript


  1. Data Mining – A First View Roiger & Geatz

  2. Definition • Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge contained within a database. • Knowledge Discovery in Databases (KDD) is often treated as synonymous with data mining. • The knowledge gained from a data mining session takes the form of a model or generalization of the data. • Induction-based learning – forming general concepts by observing specific examples.

  3. What Can Computers Learn? • Facts • Concepts • Procedures • Principles • Computers are good at learning concepts – concepts are the output of a data mining session.

  4. Three Concept Views • Classical view – all concepts have definite defining properties. • Probabilistic view – concepts are represented by properties that are probable for concept members. • Exemplar view – a given instance is determined to be an example of a particular concept if the instance is similar enough to a set of one or more known examples of that concept.
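
The three views can be contrasted with a small sketch. Everything here is an assumption made for illustration: the concept ("good credit risk"), the attribute names, the thresholds, and the distance function are hypothetical and do not come from the slides.

```python
# Hypothetical illustration of the three concept views.
# All attributes, thresholds, and data below are made up.

def classical_view(person):
    # Classical view: membership follows from definite defining properties.
    return person["income"] >= 30000 and person["years_employed"] >= 2

def probabilistic_view(person, threshold=0.5):
    # Probabilistic view: properties are only probable for members, so each
    # matching property adds evidence and the score is compared to a threshold.
    checks = [
        person["income"] >= 30000,
        person["years_employed"] >= 2,
        person["owns_home"],
    ]
    return sum(checks) / len(checks) > threshold

def exemplar_view(person, exemplars, max_distance=1.0):
    # Exemplar view: an instance belongs to the concept if it is similar
    # enough to at least one stored example of the concept.
    def distance(a, b):
        return (abs(a["income"] - b["income"]) / 10000
                + abs(a["years_employed"] - b["years_employed"]) / 5)
    return any(distance(person, e) <= max_distance for e in exemplars)

known_good = [
    {"income": 45000, "years_employed": 6, "owns_home": True},
    {"income": 32000, "years_employed": 3, "owns_home": False},
]
applicant = {"income": 28000, "years_employed": 4, "owns_home": True}

print(classical_view(applicant))             # strict rule
print(probabilistic_view(applicant))         # weighted evidence
print(exemplar_view(applicant, known_good))  # similarity to known members
```

The same applicant can fall outside the concept under the strict rule yet inside it under the other two views, which is the practical difference between them.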

  5. Supervised Learning • Also known as induction-based supervised concept learning. • Attribute-value matrix – Table 1.1 • Decision tree
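
As a rough sketch of how a decision tree can be induced from an attribute-value matrix: the rows below are invented for illustration and are not Table 1.1, and the splitting rule (pick the attribute with the fewest training errors) is a deliberately simple stand-in for the criteria real tree learners use.

```python
from collections import Counter

# Hypothetical attribute-value matrix. Each row is one instance; the last
# column is the output attribute whose classes are known in advance.
ATTRIBUTES = ["sore_throat", "fever", "swollen_glands"]
TRAINING = [
    ("yes", "yes", "yes", "strep throat"),
    ("no",  "no",  "no",  "allergy"),
    ("yes", "yes", "no",  "cold"),
    ("yes", "no",  "yes", "strep throat"),
    ("no",  "yes", "no",  "cold"),
    ("no",  "no",  "no",  "allergy"),
    ("no",  "no",  "yes", "strep throat"),
    ("yes", "no",  "no",  "allergy"),
]

def majority(rows):
    # Most frequent class among the given rows.
    return Counter(r[-1] for r in rows).most_common(1)[0][0]

def split(rows, i):
    # Group rows by their value for attribute index i.
    groups = {}
    for r in rows:
        groups.setdefault(r[i], []).append(r)
    return groups

def errors_after_split(rows, i):
    # Rows misclassified if every branch predicts its own majority class.
    return sum(len(g) - Counter(r[-1] for r in g).most_common(1)[0][1]
               for g in split(rows, i).values())

def build_tree(rows, attrs):
    # Stop when the rows are pure or no attributes remain.
    if len({r[-1] for r in rows}) == 1 or not attrs:
        return majority(rows)
    best = min(attrs, key=lambda i: errors_after_split(rows, i))
    rest = [a for a in attrs if a != best]
    branches = {v: build_tree(g, rest) for v, g in split(rows, best).items()}
    return (best, branches, majority(rows))

def classify(tree, instance):
    # Walk from the root to a leaf; unseen values fall back to the majority.
    while isinstance(tree, tuple):
        attr, branches, default = tree
        tree = branches.get(instance[attr], default)
    return tree

tree = build_tree(TRAINING, [0, 1, 2])
print(ATTRIBUTES[tree[0]])                  # attribute tested at the root
print(classify(tree, ("no", "yes", "no")))  # classify an unseen instance
```

The learned tree is the model: it generalizes the training rows and can be applied to instances whose class is not yet known.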

  6. Unsupervised Clustering • Builds models without predefined classes. • Table 1.3. • Example questions.
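
A minimal sketch of clustering without predefined classes, using k-means as one possible technique (the slides do not name a specific algorithm). The customer data, the choice of k = 2, and the deterministic starting centers are assumptions made for illustration.

```python
def kmeans(points, k, iterations=20):
    # Start from the first k points so the result is reproducible.
    centers = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            tuple(sum(v) / len(cl) for v in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return centers, clusters

# Hypothetical customers: (age, purchases per month). No class labels given.
customers = [(22, 2), (25, 3), (24, 1), (58, 9), (61, 11), (63, 10)]
centers, clusters = kmeans(customers, k=2)
for center, members in zip(centers, clusters):
    print(center, members)
```

The groups come from the data alone; deciding what each cluster means (for example, "younger occasional buyers") is left to the analyst.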

  7. Is Data Mining Appropriate? • Can we clearly define the problem? • Does potentially meaningful data exist? • Does the data contain hidden knowledge, or is the data factual and useful for reporting purposes only?

  8. Data Mining or Data Query • Shallow knowledge – factual, easily stored and manipulated. SQL is a good tool. • Multidimensional knowledge – also factual, but stored in a multidimensional format. OLAP tools are appropriate. • Hidden knowledge – patterns and regularities in data that cannot be easily found with SQL. Data mining algorithms are needed. • Deep knowledge – knowledge in a database that can be found only with some direction about what to look for. Current data mining tools are ineffective.
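
To make the first two categories concrete, here is a small sketch using Python's built-in `sqlite3` module. The table, column names, and rows are hypothetical, and the `GROUP BY` roll-up is only a rough stand-in for what a dedicated OLAP tool provides.

```python
import sqlite3

# Hypothetical sales data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, region TEXT, quarter TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("alice", "north", "Q1", 10.0),
        ("bob",   "north", "Q1", 5.5),
        ("alice", "south", "Q2", 4.5),
        ("carol", "south", "Q1", 20.0),
        ("bob",   "north", "Q2", 7.0),
    ],
)

# Shallow knowledge: a fact we already know how to ask for.
alice_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE customer = ?", ("alice",)
).fetchone()[0]

# Multidimensional knowledge: the same facts summarized along two
# dimensions (region and quarter), as an OLAP tool would do.
rollup = conn.execute(
    "SELECT region, quarter, SUM(amount) FROM sales "
    "GROUP BY region, quarter ORDER BY region, quarter"
).fetchall()

print(alice_total)
for row in rollup:
    print(row)
conn.close()
```

Both queries return facts the analyst already knew how to request. Hidden knowledge differs in that the question itself (which patterns exist) is not known in advance, so a query language alone is not enough.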

  9. Expert Systems or Data Mining • Data mining: data → data mining tool → knowledge • Expert systems: human expert → knowledge engineer → expert system (ES) building tool → knowledge

  10. Data Mining Applications • Fraud detection • Health care • Business and finance • Scientific applications • Sports and gaming
