# Final Exam Time and Place


Saturday, Dec 8, 9:00am - 12:00pm, EN 1054.



## Chapter 19.1 Exploratory Data Analysis

### What is Exploratory Data Analysis?

• An approach to analyzing data sets in order to:

  • Discover patterns

  • Find a better model

• It’s an iterative process

  • Refine to uncover patterns

### Confirmatory vs. Exploratory

| Confirmatory analysis | Exploratory analysis |
| --- | --- |
| What decision can be made? | What is the appropriate model? |
| How certain can we be? | What is data telling us? |
| What are values of parameters? | What is structure of model? |
| Sample | Batch of data |
| ONE use of a sample (data-grinding, otherwise) | Repeated use of a batch |
| Single analysis | Iterative search for pattern |
| p-value = ? | Explained variance = ? |
| Yes/no decision | Best model |
| Residuals acceptable? | Residuals show pattern? |
| Experimental design | Factor analysis |

But remember: pattern ≠ cause
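The exploratory questions "Explained variance = ?" and "Residuals show pattern?" can be illustrated with a small sketch. The data below are made up for illustration (a deliberately curved batch), and the helper names `fit_line` and `explained_variance` are hypothetical, not taken from the course material. The point is only that a straight line can explain most of the variance while the residuals still show a pattern, which prompts a refined model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def explained_variance(ys, fitted):
    """Fraction of the variance in ys accounted for by the fitted values (R^2)."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1 - ss_resid / ss_total


# Made-up batch of data with a curved relationship (illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [x ** 2 for x in xs]

# First pass: straight-line model.
slope, intercept = fit_line(xs, ys)
fitted_linear = [intercept + slope * x for x in xs]
residuals_linear = [y - f for y, f in zip(ys, fitted_linear)]
r2_linear = explained_variance(ys, fitted_linear)

# Residuals are positive at both ends and negative in the middle:
# a pattern, so the model is refined and the same batch is reused.
xs_squared = [x ** 2 for x in xs]
slope_q, intercept_q = fit_line(xs_squared, ys)
fitted_quadratic = [intercept_q + slope_q * x2 for x2 in xs_squared]
r2_quadratic = explained_variance(ys, fitted_quadratic)

print(f"linear R^2    = {r2_linear:.3f}")
print(f"residuals     = {[round(r, 1) for r in residuals_linear]}")
print(f"quadratic R^2 = {r2_quadratic:.3f}")
```

Here the straight line already explains about 95% of the variance, yet the U-shaped residuals indicate that a better model exists. Reusing the batch this way is acceptable in exploratory work; in a confirmatory analysis the sample would be used once.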

### Inference

• Confirmatory

  • Narrow form of inference

  • Relate one quantity (Q) to another Q (e.g. $\beta_{\mathrm{reg}}$)

• Exploratory

  • Trying to discover a pattern worth running through a confirmatory analysis

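$$
\begin{pmatrix} P_{\mathrm{corn}} \\ N_{\mathrm{corn}} \\ C_{\mathrm{corn}} \\ \vdots \end{pmatrix}
\sim
\begin{pmatrix} P_{\mathrm{soil}} \\ N_{\mathrm{soil}} \\ C_{\mathrm{soil}} \\ \vdots \end{pmatrix}
$$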

### Don’t confuse confirmatory and exploratory analyses

• Refining models using p-values ≠ exploratory analysis

• Repeated analysis of the same data set is data dredging (also known as data grinding, data mining, data fishing, data snooping, …)

• Any data set has some degree of randomness, so with enough comparisons a false association is bound to turn up
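The risk from multiple comparisons can be made concrete with a short sketch. It assumes the tests are independent and that every null hypothesis is true, in which case each p-value is uniformly distributed on [0, 1] and the chance of at least one false positive among m tests at level α is 1 − (1 − α)^m. The function names are hypothetical and the simulation draws random numbers rather than using real data.

```python
import random


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among n_tests
    independent tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def simulate_false_positive_rate(alpha, n_tests, n_datasets=10_000, seed=0):
    """Monte Carlo check: under the null each p-value is uniform on [0, 1),
    so draw n_tests uniform numbers per simulated data set and count how
    often at least one falls below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_datasets):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_datasets


for m in (1, 5, 20, 100):
    analytic = familywise_error_rate(0.05, m)
    simulated = simulate_false_positive_rate(0.05, m)
    print(f"{m:>3} tests: analytic {analytic:.3f}, simulated {simulated:.3f}")
```

With 20 comparisons at α = 0.05, the chance of at least one spurious "significant" result is already about 64%, even though no real association exists.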

### Characteristics of Exploratory Analyses

• Relies strongly on graphical analyses

http://gallery.r-enthusiasts.com/thumbs.php


• Simplify – determine best model for pattern

### Execution

• Define all quantities that are used

  • Procedure statement

  • Name and Symbol

  • Values with Units

• Identify response and explanatory variables

• Decide whether to undertake exploratory or confirmatory analysis, stating reasons for choice

• State screening criterion to distinguish exploratory from confirmatory analysis

  • Visual screening

  • P-value based (e.g. keep if <0.1)
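A p-value based screening criterion can be stated explicitly before any analysis is run. The sketch below uses the 0.1 threshold given as an example above; the variable names and p-values are hypothetical placeholders, not results from any data set, and `screen` is an illustrative helper rather than a function from a statistics library.

```python
SCREENING_THRESHOLD = 0.1  # stated in advance, e.g. "keep if p < 0.1"


def screen(p_values, threshold=SCREENING_THRESHOLD):
    """Return the names of candidate variables whose exploratory
    p-value falls below the screening threshold."""
    return sorted(name for name, p in p_values.items() if p < threshold)


# Hypothetical exploratory p-values, made up for illustration only.
candidate_p_values = {
    "light": 0.03,
    "nitrate": 0.08,
    "phosphate": 0.25,
    "temperature": 0.60,
}

kept = screen(candidate_p_values)
print(kept)
```

Passing the screen only nominates a variable as a pattern worth running through a confirmatory analysis; it is not itself a confirmatory result.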

### Box and Arrow Diagrams: Logic

• Gordon Riley is interested in the aquatic productivity of Georges Bank

Light

Nutrients (nitrates, phosphates)

Phytoplankton

Zooplankton