# Submit Predictions

Goal: predict who survived the Titanic disaster. Hypothesis: women and children first. Get data: read the dataset into Excel, R, etc. Data management: some Age values are missing, so analyze gender only. Statistics & analysis: 74% of women and 19% of men survived. Submit predictions: 320 / 418 = 76.6%.

#### Presentation Transcript

Goal

Predict who survived the Titanic Disaster

Hypotheses

Women and Children First

Get Data

Read dataset into Excel, R, etc.

Data Management

Some Age Data Missing; Analyze Gender Only

Statistics & Analysis

Survival rate: 74% of women, 19% of men

Submit Predictions

320 / 418 = 76.6%
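The gender-only rule behind this score can be sketched in a few lines. This is a minimal illustration, not the presenter's actual code: the function names `predict_by_gender` and `accuracy` and the four toy passengers are assumptions made for the example; only the counts 320 and 418 come from the slides.

```python
def predict_by_gender(sex):
    """Predict survival (1) for women and death (0) for men."""
    return 1 if sex == "female" else 0


def accuracy(predictions, actual):
    """Fraction of predictions that match the actual outcomes."""
    correct = sum(p == a for p, a in zip(predictions, actual))
    return correct / len(actual)


# Toy passengers (hypothetical, not taken from the Titanic data).
sexes = ["female", "male", "male", "female"]
survived = [1, 0, 1, 1]

predictions = [predict_by_gender(s) for s in sexes]
toy_accuracy = accuracy(predictions, survived)  # 3 of 4 correct

# Score reported on the slide: 320 correct out of 418 test passengers.
reported_score = 320 / 418
print(f"toy accuracy: {toy_accuracy:.2f}")
print(f"reported score: {reported_score:.4f}")
```

The score is simply the share of test passengers whose predicted outcome matches the true one, so the same `accuracy` function applies to any later model.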

### Predictor Variables

Age

• All: N = 891

• Data (age recorded): N = 714

• Missing: N = 177
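A quick check of how much of a variable is missing might look like the sketch below. The helper `missing_summary` and the toy age list are hypothetical; the counts 891 and 714 are the ones on the slide.

```python
def missing_summary(values):
    """Return (total, observed, missing) counts, treating None as missing."""
    total = len(values)
    missing = sum(v is None for v in values)
    return total, total - missing, missing


# Toy ages (hypothetical); None marks a passenger with no recorded age.
ages = [22.0, None, 38.0, 26.0, None, 35.0]
total, observed, missing = missing_summary(ages)
print(total, observed, missing)

# Counts from the slide: 891 passengers, 714 of them with an age.
slide_total, slide_observed = 891, 714
slide_missing = slide_total - slide_observed
share_missing = slide_missing / slide_total
print(slide_missing, f"{share_missing:.1%}")
```

Roughly one passenger in five has no age, which is why the first model in the slides uses gender alone.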

Decision Trees

• Dependent variable (Y): continuous or categorical

• Independent variables (X’s): continuous or categorical

At each node, the decision tree looks for the split of the sample that produces the greatest differentiation in Y.
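The split search at a single node can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: "differentiation" is measured here by the reduction in deviance, one continuous predictor is searched, and the names `node_deviance` and `best_split` as well as the toy data are hypothetical.

```python
import math


def node_deviance(labels):
    """Deviance of a node: -2 * sum over classes of n_k * log(n_k / n)."""
    n = len(labels)
    total = 0.0
    for cls in set(labels):
        n_k = labels.count(cls)
        total += n_k * math.log(n_k / n)
    return -2.0 * total


def best_split(x, y):
    """Find the threshold on x that most reduces the deviance of y.

    Returns (threshold, reduction); threshold is None if no split helps.
    """
    parent = node_deviance(y)
    best_threshold, best_reduction = None, 0.0
    values = sorted(set(x))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        reduction = parent - node_deviance(left) - node_deviance(right)
        if reduction > best_reduction:
            best_threshold, best_reduction = threshold, reduction
    return best_threshold, best_reduction


# Toy data (hypothetical): ages and survival outcomes.
ages = [4, 8, 30, 45, 60, 25]
survived = [1, 1, 0, 0, 0, 0]
threshold, reduction = best_split(ages, survived)
print(threshold, round(reduction, 3))
```

In the toy data the two children are the only survivors, so the search settles on a threshold between the oldest child and the youngest adult.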

Age

Decision Trees

• Splits are chosen to maximize the data likelihood (equivalently, to minimize the deviance).
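For a classification tree, the link between likelihood and deviance can be written out. The notation below is introduced here and is not from the slides: a node holds $n$ cases, of which $n_k$ belong to class $k$, and a multinomial model is assumed within the node.

```latex
\hat{p}_k = \frac{n_k}{n}, \qquad
\ell = \sum_{k} n_k \log \hat{p}_k, \qquad
D = -2\,\ell = -2 \sum_{k} n_k \log \hat{p}_k
```

Because $D = -2\,\ell$, any split that raises the log-likelihood $\ell$ lowers the deviance $D$ by twice that amount. A node containing a single class has $D = 0$.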

Prediction and Missing Values

Correlation, Association of Age with other Variables?

Gender

Gender and Age

• The tree grows by optimizing only the split at the current node rather than optimizing the entire tree

• The tree stops growing when further splits become ineffective
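Greedy growth and the stopping rule can be sketched with a small recursive function. This is a simplified illustration under stated assumptions: predictors are binary (0/1) indicators, a split is judged by its reduction in deviance, and "ineffective" is taken to mean a reduction no larger than a hypothetical `min_gain` threshold. The names `grow` and `deviance` and the toy rows are not from the slides.

```python
import math


def deviance(labels):
    """Node deviance: -2 * sum of n_k * log(n_k / n) over classes present."""
    n = len(labels)
    return -2.0 * sum(
        labels.count(c) * math.log(labels.count(c) / n) for c in set(labels)
    )


def grow(rows, labels, features, min_gain=1e-9):
    """Greedily grow a tree on binary (0/1) features.

    Each call picks the best split for the current node only; earlier
    splits are never revisited. Growth stops when no split reduces the
    deviance by more than min_gain.
    """
    best_feature, best_gain = None, min_gain
    for f in features:
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        if not left or not right:
            continue  # this feature does not separate the node
        gain = deviance(labels) - deviance(left) - deviance(right)
        if gain > best_gain:
            best_feature, best_gain = f, gain
    if best_feature is None:
        # Leaf: predict the majority class of the node.
        return max(set(labels), key=labels.count)
    split = {0: ([], []), 1: ([], [])}
    for r, y in zip(rows, labels):
        split[r[best_feature]][0].append(r)
        split[r[best_feature]][1].append(y)
    return {
        "feature": best_feature,
        0: grow(*split[0], features, min_gain),
        1: grow(*split[1], features, min_gain),
    }


# Toy passengers (hypothetical): indicators for female and child.
rows = [
    {"female": 1, "child": 0},
    {"female": 1, "child": 0},
    {"female": 0, "child": 1},
    {"female": 0, "child": 0},
    {"female": 0, "child": 0},
    {"female": 1, "child": 1},
]
labels = [1, 1, 1, 0, 0, 1]
tree = grow(rows, labels, ["female", "child"])
print(tree)
```

On the toy data the tree splits on gender first, then splits the males on the child indicator, and leaves the female branch as a leaf because no further split changes its deviance.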

Prediction: Gender + Age

Goal

Predict who survived the Titanic Disaster

Hypotheses

Women and Children First

Get Data

Read dataset into Excel, R, etc.

Data Management

Age + Gender

Statistics & Analysis

Submit Predictions

Kitchen Sink

Decision Trees

• Popular implementations

• CART (Classification And Regression Trees): uses binary splits

• CHAID (CHi-squared Automatic Interaction Detector): allows multiway splits, giving a wider tree
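The difference between a multiway and a binary split can be illustrated with the Pearson chi-squared statistic that gives CHAID its name. This is a rough sketch, not either algorithm: the function `chi_squared` and the counts are hypothetical, and a real CHAID implementation would compare p-values (which account for degrees of freedom) rather than raw statistics.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a table of observed counts.

    table[i][j] is the count for predictor level i and outcome j.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Toy counts (hypothetical): rows are three levels of a categorical
# predictor, columns are [died, survived].
multiway = [[10, 30], [20, 20], [30, 10]]

# A binary split must group the levels into two branches, e.g. the
# first level versus the other two combined.
binary = [multiway[0], [a + b for a, b in zip(multiway[1], multiway[2])]]

print(chi_squared(multiway))
print(chi_squared(binary))
```

The multiway table keeps one branch per level, while the binary table has to merge two levels into a single branch before the outcome can be compared.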