Data Mining

Data Mining. Database Systems Timothy Vu. Mining.

Presentation Transcript


  1. Data Mining Database Systems Timothy Vu

  2. Mining Mining is the extraction of valuable minerals or other geological materials from the earth, such as bauxite, coal, diamonds, iron, precious metals, lead, limestone, nickel, phosphate, rock salt, tin, uranium, petroleum, natural gas, and even water. What is mined is usually something valuable, rare, or useful.

  3. What is Data Mining Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns. To achieve this, data mining uses computational techniques from statistics, machine learning, and pattern recognition. Machine learning – a method for creating computer programs by the analysis of data sets. Pattern recognition – classifies data (patterns) based on either a priori knowledge or statistical information extracted from the patterns.

  4. Why Data Mining • Data mining is a technique that helps individuals or companies find useful information in large amounts of data in order to make better decisions. • Reduce risks • Find problems and issues • Save money • Make high-confidence predictions • Simplify information

  5. Discussion Topics 1) Classification 2) Regression 3) Association 4) Clustering

  6. Classifiers Decision-tree classifiers – each leaf node has an associated class and each internal node has a predicate. Bayesian classifiers – estimate the distribution of attribute values for each class in the training data and predict the class with the maximum probability. Neural network classifiers – use the training data to train artificial neural networks.
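
As a rough illustration of the decision-tree description above, the sketch below builds a tiny tree by hand and walks a record down it: internal nodes hold a predicate, leaves hold a class. The attribute names (`degree`, `income`), the threshold, and the class labels are hypothetical and are not taken from the presentation; a real classifier would learn the tree from training data.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # the class associated with this leaf


@dataclass
class Internal:
    predicate: Callable[[dict], bool]  # test applied to a record
    if_true: "Node"                    # subtree followed when the predicate holds
    if_false: "Node"                   # subtree followed otherwise


Node = Union[Leaf, Internal]


def classify(node: Node, record: dict) -> str:
    """Follow predicates from the root until a leaf is reached."""
    while isinstance(node, Internal):
        node = node.if_true if node.predicate(record) else node.if_false
    return node.label


# Hypothetical hand-built tree (illustrative only).
tree = Internal(
    lambda r: r["degree"] == "masters",
    Internal(lambda r: r["income"] > 75000, Leaf("excellent"), Leaf("good")),
    Leaf("average"),
)

result_a = classify(tree, {"degree": "masters", "income": 80000})
result_b = classify(tree, {"degree": "bachelors", "income": 90000})
print(result_a, result_b)
```

The second record never reaches the income test because the first predicate already sends it to a leaf.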

  7. Regression Regression – deals with the prediction of a value rather than a class. Linear regression – predicts values with a linear polynomial; this is a form of curve fitting, meaning finding the coefficients that best fit the data.
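
One common way to find such coefficients is ordinary least squares. The sketch below fits a straight line y = slope * x + intercept to a handful of points; the function name `fit_line` and the sample data are assumptions made for illustration, and the slide does not say which fitting method the author had in mind.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # predict a value for an unseen x
print(slope, intercept, predicted)
```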

  8. Associations Finding the association or relationship between two or more items. Support – a measure of what fraction of the population satisfies both the antecedent and the consequent of the rule; e.g., MILK => Screwdrivers is likely to have low support, since few purchases include both. Confidence – how often the consequent is true when the antecedent is true; e.g., MILK => Bread.
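
The two measures can be computed directly from a list of transactions. The sketch below does so for the two rules named on the slide; the transaction data are invented for illustration, and the helper names `support` and `confidence` are assumptions rather than anything from the presentation.

```python
def support(transactions, antecedent, consequent):
    """Fraction of all transactions containing both sides of the rule."""
    both = antecedent | consequent
    return sum(1 for t in transactions if both <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Among transactions containing the antecedent, fraction that also contain the consequent."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        return 0.0
    return sum(1 for t in with_antecedent if consequent <= t) / len(with_antecedent)


# Made-up purchase data: each set is one customer's basket.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk", "screwdriver"},
    {"bread"},
    {"milk", "bread"},
]

milk_bread_support = support(transactions, {"milk"}, {"bread"})
milk_bread_confidence = confidence(transactions, {"milk"}, {"bread"})
milk_screwdriver_support = support(transactions, {"milk"}, {"screwdriver"})
print(milk_bread_support, milk_bread_confidence, milk_screwdriver_support)
```

In this toy data set, MILK => Bread has both higher support and higher confidence than MILK => Screwdrivers.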

  9. Clustering Clustering is the classification of similar objects into different groups, or more precisely, the partitioning of a data set into subsets (clusters), so that the data in each subset (ideally) share some common trait - often proximity according to some defined distance measure.
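
A minimal sketch of one well-known clustering method, k-means, using Euclidean distance as the distance measure. The slide does not name a specific algorithm, so k-means is an assumption here, as are the sample points and the starting centroids (which are fixed rather than random so the run is repeatable).

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up 2-D points forming two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
print(clusters)
```

Each resulting cluster contains points that are close to one another, which is the "common trait" in this example.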

  10. Applications of Data Mining • 1. Predictions: stock market, earthquakes, NBA games • 2. Association: store inventory, fashion trends • 3. Descriptive patterns: disease analysis, image recognition, fraud detection

  11. Gather Data

  12. Electrocardiogram

  13. Disease Analysis

  14. References • A. Silberschatz, H. F. Korth, S. Sudarshan. Database System Concepts, 5th ed. McGraw-Hill, 2006. • Marschall Runge, Magnus Ohman, and Frank Netter. Netter's Cardiology (Netter Clinical Science). W.B. Saunders Company, 2004. • "Data mining". Wikipedia. 4/1/2006. <http://en.wikipedia.org/wiki/Data_Mining>.
