
COMP 3503 Deductive Modeling and Visualization



Presentation Transcript


  1. COMP 3503 Deductive Modeling and Visualization, with Daniel L. Silver

  2. Agenda • Deductive and Inductive Modeling • Visualization and Graphical Exploratory Methods

  3. The KDD Process [Diagram: process stages, in order: Data Consolidation, Selection and Preprocessing, Data Mining, Interpretation and Evaluation. Other diagram labels: Data, Warehouse, p(x)=0.02, Knowledge]

  4. Selection and Preprocessing • Part of the TL (transform and load) functions of ETL • Generate/sample a set of examples • Explore the data • Reduce attribute dimensionality • Reduce attribute value ranges • Transform data • Encode data • OLAP and visualization tools play a key role
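A minimal sketch of three of the steps listed on this slide (sampling, transforming, and encoding), using only the Python standard library. The records, the field names, and the helper functions `sample`, `min_max_scale`, and `one_hot` are hypothetical and chosen for illustration; they are not taken from the course material.

```python
import random

# Hypothetical records; the field names are illustrative only.
records = [
    {"age": 23, "income": 31000, "region": "east"},
    {"age": 45, "income": 72000, "region": "west"},
    {"age": 31, "income": 48000, "region": "east"},
    {"age": 60, "income": 55000, "region": "north"},
    {"age": 38, "income": 64000, "region": "west"},
]


def sample(rows, k, seed=0):
    """Draw k examples without replacement (fixed seed for repeatability)."""
    return random.Random(seed).sample(rows, k)


def min_max_scale(values):
    """Transform a numeric attribute onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode a categorical attribute as 0/1 indicator vectors."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


subset = sample(records, 3)
scaled_age = min_max_scale([r["age"] for r in records])
encoded_region = one_hot([r["region"] for r in records])
```

Min-max scaling is one way to reduce attribute value ranges; one-hot encoding is one common choice for encoding categorical data, and other schemes would fit the slide equally well.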

  5. Deductive and Inductive Modeling

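  6. Induction versus Deduction • Deduction: top-down verification of a hypothesis, from a model or general rule down to examples (Example A, Example B, Example C) • Induction: bottom-up construction of a hypothesis, from examples (Example A, Example B, Example C) up to a model or general rule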

  7. Deductive Modeling • Top-down (toward the data) verification of a hypothesis • The hypothesis is generated within the mind of the data miner (and so is limited by human preconceptions) • Exploratory tools: query and response/report (SQL-like) software; data visualization software; OLAP (On-Line Analytical Processing) • Models are used for description
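The query-and-response style of deductive modeling can be sketched with Python's built-in `sqlite3` module: the analyst states a hypothesis first, then checks it against the data with an SQL query. The `sales` table, its values, and the hypothesis itself are made up for illustration.

```python
import sqlite3

# In-memory database with a small, hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("east", 80.0), ("west", 200.0), ("west", 260.0)],
)

# Hypothesis formed by the analyst before looking at the data:
# "the average sale in the west is higher than in the east".
rows = conn.execute(
    "SELECT region, AVG(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

# Top-down verification: compare the query result against the hypothesis.
avg_by_region = dict(rows)
hypothesis_holds = avg_by_region["west"] > avg_by_region["east"]
print(avg_by_region, hypothesis_holds)
```

The query only describes the data already in hand; it confirms or refutes the analyst's idea but does not generate a new one.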

  8. Inductive Modeling • Bottom-up (from the data) development of a hypothesis • The hypothesis is generated by the technology directly from the data • Statistical and machine learning tools such as regression, decision trees, and artificial neural networks are used • Models can be used for prediction
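As a small illustration of the bottom-up direction, the sketch below fits a straight line to example points by ordinary least squares and then uses the fitted model for prediction. Regression is one of the tools named on the slide; the data points and the function names `fit_line` and `predict` are assumptions made for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical examples; the model (the "hypothesis") is built from them.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Apply the induced model to an unseen input."""
    return slope * x + intercept


print(slope, intercept, predict(5.0))
```

Here the general rule comes out of the examples rather than out of the analyst's head, which is the contrast with the deductive approach.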

  9. Deductive Exploratory Methods: Interactive Visualization Tools • Graphs and statistics from data • Histograms of value distribution • 2D, 3D, plus colors and shapes for nD • Time-series plots and animations • Can require training and practice • Example tools: MS Excel, IBM Cognos [Figure: plot with axes labeled Response, Temp, and Velocity]

  10. Which type of graph do I use? • Depends on • The type of data • The type of analysis • The availability of statistical software • What you want to illustrate/explore • When creating graphs for others to interpret: • Keep in mind what you are trying to communicate • Be clear, concise, and consistent • Label all your documents! This slide courtesy Anders Stjarne

  11. Bar Charts • Summarizes categorical data • Horizontal axis represents categories, while vertical axis represents either counts (“frequencies”) or percentages (“relative frequencies”) • Used to illustrate the differences in percentages (or counts) between categories. This slide courtesy Anders Stjarne
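The counts and relative frequencies behind a bar chart can be computed with `collections.Counter`. This is a rough sketch with made-up categorical data; the text rendering draws the bars sideways, one row per category, purely to avoid depending on a plotting library.

```python
from collections import Counter

# Hypothetical categorical observations.
colors = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(colors)                      # frequencies
total = sum(counts.values())
relative = {c: n / total for c, n in counts.items()}  # relative frequencies


def text_bar_chart(category_counts):
    """Render one row per category, with bar length equal to the count."""
    lines = []
    for category, n in sorted(category_counts.items()):
        lines.append(f"{category:<6} {'#' * n} ({n})")
    return "\n".join(lines)


print(text_bar_chart(counts))
```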

  12. Histograms • Divide the measurement range into equal-sized intervals (bins) • Each bar’s height represents the number (or percent) of observations falling into that interval This slide courtesy Anders Stjarne
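A minimal sketch of the binning step behind a histogram, assuming equal-width bins between the smallest and largest observation. The function name `histogram` and the sample values are illustrative.

```python
def histogram(values, bins):
    """Count observations in `bins` equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot form bins")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical measurements.
values = [1, 2, 2, 3, 5, 6, 7, 9]
edges, counts = histogram(values, bins=4)
percents = [100 * c / len(values) for c in counts]
print(edges, counts, percents)
```

Each entry of `counts` (or `percents`) is the height of one bar; the number of bins is a choice left to the analyst.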

  13. Box Plots • The box runs from the lower quartile to the upper quartile, with the median marked inside it • “Whiskers” are drawn to not more than 1.5 times the length of the box beyond either quartile • “Outliers” are observations that fall beyond the whiskers (shown as asterisks) • For details see http://davidmlane.com/hyperstat/A37797.html This slide courtesy Anders Stjarne
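The statistics behind a box plot can be sketched with `statistics.quantiles` (Python 3.8+). The whisker rule follows the slide: whiskers reach at most 1.5 times the box length beyond either quartile, and anything farther out is an outlier. The data and the choice of the `inclusive` quartile method are assumptions; plotting packages differ in how they compute quartiles.

```python
import statistics

# Hypothetical measurements with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Lower quartile, median, upper quartile.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
box_length = q3 - q1  # interquartile range

# Limits for the whiskers: 1.5 box lengths beyond either quartile.
low_fence = q1 - 1.5 * box_length
high_fence = q3 + 1.5 * box_length

inside = [x for x in data if low_fence <= x <= high_fence]
outliers = [x for x in data if x < low_fence or x > high_fence]

# Each whisker ends at the most extreme observation still inside the limits.
whisker_low, whisker_high = min(inside), max(inside)
print(q1, median, q3, whisker_low, whisker_high, outliers)
```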

  14. Scatter Plots • Summarizes the relationship between two measurement variables. • Horizontal axis represents one variable and vertical axis represents second variable. • Plot one point for each pair of measurements. This slide courtesy Anders Stjarne

  15. Brushplots • One of the most common visualization techniques, and historically the first widely used one explicitly identified with exploratory data analysis, is known as brushing • Weka provides a brushing view This slide courtesy Anders Stjarne
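Brushing is usually understood as selecting ("brushing") a region in one plot and highlighting the same records in other, linked plots. The sketch below shows only that selection-and-linking logic, without any drawing. The records, attribute names, and the functions `brush` and `linked_view` are hypothetical and are not Weka's API.

```python
# Hypothetical records with three measured attributes.
rows = [
    {"id": 0, "temp": 10, "velocity": 1.0, "response": 5},
    {"id": 1, "temp": 20, "velocity": 2.5, "response": 9},
    {"id": 2, "temp": 30, "velocity": 1.5, "response": 14},
    {"id": 3, "temp": 40, "velocity": 3.0, "response": 22},
]


def brush(records, attr, lo, hi):
    """Return the ids of records whose `attr` lies in the brushed range."""
    return {r["id"] for r in records if lo <= r[attr] <= hi}


def linked_view(records, x, y, selected):
    """Points for another scatter plot, each flagged if it was brushed."""
    return [(r[x], r[y], r["id"] in selected) for r in records]


# Brush temperatures between 15 and 35 in one view ...
selected = brush(rows, "temp", 15, 35)
# ... and see the same records highlighted in a velocity/response view.
points = linked_view(rows, "velocity", "response", selected)
print(selected, points)
```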

  16. Deductive Exploratory Methods • DEMO: Excel and WEKA capabilities

  17. THE END • danny.silver@acadiau.ca
