Data Mining

Data Mining: Concepts and Techniques. By Asst. Prof. Muhammad Amir Alam. Data Warehousing and Data Mining. Introduction to Databases. Database: A database is an organized collection of data. Relational Database

Presentation Transcript


  1. Data Mining: Concepts and Techniques. By Asst. Prof. Muhammad Amir Alam

  2. Data Warehousing and Data Mining

  3. Introduction to Databases. Database: A database is an organized collection of data. Relational Database: A database system in which data is organized in the form of tables (columns and rows). It is based on the relational model (relational algebra) introduced by E. F. Codd. “Most databases are built as tables, so we need to cope with RDBMS.”
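
As a rough illustration of "data organized in tables (columns and rows)", the sketch below uses Python's built-in `sqlite3` module. The `student` table, its columns, and the sample rows are hypothetical and do not come from the slides; SQLite simply stands in for any relational database system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory relational database with one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    # A table is a set of named columns; every row supplies one value per column.
    conn.execute(
        "CREATE TABLE student ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    # Illustrative rows only (assumed data, not from the presentation).
    conn.executemany(
        "INSERT INTO student (id, name, city) VALUES (?, ?, ?)",
        [(1, "Ali", "Lahore"), (2, "Sara", "Karachi"), (3, "Omar", "Lahore")],
    )
    conn.commit()
    return conn


def names_in_city(conn, city):
    """Relational selection and projection: pick rows by city, keep only the name column."""
    rows = conn.execute(
        "SELECT name FROM student WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(names_in_city(db, "Lahore"))
```

The `WHERE` clause corresponds to selection and the column list to projection in relational algebra, the two simplest operations of the model the slide refers to.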

  4. RDBMS: Application software used to implement relational databases. Popular database management systems include: • Oracle • MS Access • MySQL • Microsoft SQL Server • FoxPro • DB2 • dBase “There are too many proprietary DBMSs.”

  5. Data Mining Defined: Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets, involving methods at the intersection of: • artificial intelligence • machine learning (which is also a branch of AI) • statistics • database systems

  6. Prerequisites of Data Mining • From the definition we can conclude that, before we start mining data, we need some knowledge of the following fields: 1. Artificial intelligence 2. Machine learning (Artificial intelligence (AI) is the branch of computer science that studies and develops intelligent machines and software. A machine cannot be intelligent until it is able to learn, so AI and machine learning are closely related.) 3. Statistics (we will use only basic statistics, or a computer to do the statistical work) 4. Database systems (we need to learn the basics of databases, so we start with an introduction to databases)

  7. Data mining involves six common classes of tasks:
     • Anomaly detection (outlier/change/deviation detection): The identification of unusual data records that might be interesting, or of data errors that require further investigation.
     • Association rule learning (dependency modeling): Searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
     • Clustering: The task of discovering groups and structures in the data that are in some way "similar", without using known structures in the data.
     • Classification: The task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
     • Regression: Attempts to find a function that models the data with the least error.
     • Summarization: Providing a more compact representation of the data set, including visualization and report generation.
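
To make the regression task ("find a function which models the data with the least error") more concrete, here is a minimal sketch of ordinary least squares for a straight line. The choice of a linear function and of squared error as the error measure is an assumption made for illustration; the slides do not name a specific method, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Points that lie exactly on y = 2x + 1 (made-up data).
    print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```

With noisy data the fitted line no longer passes through every point; it is the line for which the total squared vertical distance to the points is smallest.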

  8. Anomaly Detection: What is an anomaly? An object (point) that is markedly different from the other objects (points). In statistics, an outlier is an observation that is numerically distant from the rest of the data.
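
One simple way to put "numerically distant from the rest of the data" into practice is a z-score test, sketched below. The threshold of 2 standard deviations and the sample values are assumptions chosen for illustration, not part of the slides, and many other outlier tests exist.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so nothing is distant from the rest.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    readings = [10, 11, 9, 10, 12, 10, 11, 50]  # made-up data with one odd value
    print(zscore_outliers(readings))  # -> [50]
```

Because the outlier itself inflates the mean and the standard deviation, this test works best on larger samples; it is only a first approximation.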

  9. Association rule learning:
     Cust. 1 {milk, bread, jam, eggs}
     Cust. 2 {sugar, milk, bread, eggs}
     Cust. 3 {milk, bread}
     Cust. 4 {milk, yogurt, bread, eggs}
     Cust. 5 {milk, eggs, bread}
     If MILK → BREAD: that is a kind of association. We will discuss this topic in detail later.
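
The five baskets on this slide can be used to compute the two standard measures of an association rule, support and confidence. The sketch below is a minimal illustration; the function names are hypothetical, and the slides themselves defer the details to a later lecture.

```python
# The five customer baskets from the slide.
BASKETS = [
    {"milk", "bread", "jam", "eggs"},
    {"sugar", "milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "yogurt", "bread", "eggs"},
    {"milk", "eggs", "bread"},
]


def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in baskets if wanted <= basket) / len(baskets)


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    base = support(baskets, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the data")
    return support(baskets, set(antecedent) | set(consequent)) / base


if __name__ == "__main__":
    # Rule MILK -> BREAD: all five customers bought both items.
    print(support(BASKETS, {"milk", "bread"}))       # 1.0
    print(confidence(BASKETS, {"milk"}, {"bread"}))  # 1.0
    # Rule BREAD -> EGGS holds for four of the five baskets.
    print(confidence(BASKETS, {"bread"}, {"eggs"}))  # 0.8
```

Market basket analysis typically keeps only the rules whose support and confidence both exceed chosen minimum values.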
