An overview of the mailing campaign model developed by Nan Yang (University of Central Florida), from data visualization through data preparation, model building, and model assessment with the ROC curve. It covers high-level categorical variables, skewed interval variables, missing values, imputation, transformations, variable selection, and interactions. The final model reaches AUC = 0.66. Acknowledgements go to the UCF Statistics Dept and BlueCross BlueShield of FL.
Mailing Campaign Model Nan Yang University of Central Florida 04/11/2008
Overview • Data Visualization • Data Preparation • Model Building • Variable Selection • Interaction • Model Assessment • ROC
Data Visualization • 63 Variables • Target is binary with 1 indicating people responded to the mailing campaign • Target is very unbalanced • Target rate is 1.13% for training set
Data Visualization • Categorical Variable • High-level variables • x2 ~ 57 levels • DATE variables (x10 & x11) ~ over 100 levels • Missing values • DATE variables ~ 30%-70% missing • Some variables have missing values coded as “Unknown” or “Uncoded”, e.g., x20
Data Visualization • Interval Variable • Skewness
Data Preparation • Missing Value Indicator (MVI) • Variables with > 5% missing • Binary • Capture the missing value information
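The missing value indicator step can be sketched as follows. This is a minimal illustration, assuming the data are held as a list of dicts with `None` marking a missing value; the function name `add_missing_indicators` and the `_mvi` suffix are hypothetical and do not come from the slides.

```python
def add_missing_indicators(rows, threshold=0.05, suffix="_mvi"):
    """Add a binary indicator for every variable whose missing rate exceeds `threshold`.

    `rows` is a list of dicts that share the same keys; None marks a missing value.
    Returns (new_rows, flagged_columns). The input rows are not modified.
    """
    if not rows:
        return [], []
    n = len(rows)
    columns = list(rows[0].keys())
    # Flag the variables with more than `threshold` (5% by default) missing values.
    flagged = [
        col for col in columns
        if sum(1 for r in rows if r.get(col) is None) / n > threshold
    ]
    new_rows = []
    for r in rows:
        new_row = dict(r)
        for col in flagged:
            # 1 = value was missing, 0 = value was observed.
            new_row[col + suffix] = 1 if r.get(col) is None else 0
        new_rows.append(new_row)
    return new_rows, flagged
```

The indicator columns preserve the fact that a value was missing even after the original variable is imputed.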
Data Preparation • Imputation • Unconditional imputation • Categorical variable • Tree/Tree Surrogate • Interval variable • Cluster
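As an illustration of the unconditional option only, the sketch below fills each variable from its own observed values, ignoring all other variables. The choice of median for interval variables and mode for categorical variables is an assumption, and the function name `impute_unconditional` is hypothetical; the tree/tree surrogate and cluster-based methods listed on the slide are not shown.

```python
from collections import Counter
from statistics import median


def impute_unconditional(values, kind):
    """Replace None entries with a single statistic of the observed values.

    kind="interval"    -> median of the observed values (assumed choice)
    kind="categorical" -> most frequent observed level (assumed choice)
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to learn a fill value from; return the column unchanged.
        return list(values)
    if kind == "interval":
        fill = median(observed)
    elif kind == "categorical":
        fill = Counter(observed).most_common(1)[0][0]
    else:
        raise ValueError("kind must be 'interval' or 'categorical'")
    return [fill if v is None else v for v in values]
```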
Data Preparation • Transformation • Right skewed • Log or Square Root transformation • Left skewed • Square transformation
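The transformation rule above can be sketched as a small helper. This is a hedged example: the skewness cutoff of 1.0, the use of `log1p` rather than a plain log, and the names `sample_skewness` and `reduce_skew` are assumptions, and the values are assumed to be non-negative.

```python
import math


def sample_skewness(values):
    """Moment-based skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant variable, no skew
    return m3 / m2 ** 1.5


def reduce_skew(values, cutoff=1.0):
    """Pick a transformation from the sign of the skewness.

    Right skewed (skew > cutoff)  -> log(1 + x)
    Left skewed  (skew < -cutoff) -> x squared
    Otherwise the values are returned unchanged.
    Assumes non-negative values; `cutoff` is an illustrative threshold.
    """
    skew = sample_skewness(values)
    if skew > cutoff:
        return [math.log1p(v) for v in values], "log"
    if skew < -cutoff:
        return [v ** 2 for v in values], "square"
    return list(values), "none"
```

A square root transformation could be substituted for `log1p` when the right skew is milder.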
Model Building • Variable selection • Individual predictive power • Logistic backward elimination • Keep the potential interaction terms • Logistic stepwise selection • Tree • Different criteria • 21 variables selected
Model Building • Interactions • SAS Enterprise Miner Regression node • 11 interaction terms selected • Model • Ensemble of different logistic models
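The slides do not say how the logistic models were combined, so the following is only one plausible sketch: interaction terms are added as products of two inputs, and the ensemble prediction is the plain average of the predicted probabilities. All names and the model representation (an intercept plus a coefficient list) are hypothetical.

```python
import math


def with_interactions(x, pairs):
    """Append the product x[i] * x[j] for each (i, j) in `pairs` to the feature list."""
    return list(x) + [x[i] * x[j] for i, j in pairs]


def logistic_predict(intercept, coefs, x):
    """Predicted probability of response from one logistic model."""
    z = intercept + sum(c * xi for c, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def ensemble_predict(models, x):
    """Average the predicted probabilities of several logistic models.

    `models` is a list of (intercept, coefs) tuples that all score the same
    feature vector `x`. Simple averaging is an assumed combination rule.
    """
    probs = [logistic_predict(b0, coefs, x) for b0, coefs in models]
    return sum(probs) / len(probs)
```

For example, `with_interactions([2.0, 3.0], [(0, 1)])` extends the two inputs with their product before the vector is passed to `ensemble_predict`.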
Model Assessment • AUC = 0.66
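For reference, AUC is the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder, with ties counted as one half. The sketch below computes it from ranks (the Mann-Whitney formulation); the function name `auc` and the input layout are illustrative, and this is not the SAS output behind the reported figure.

```python
def auc(labels, scores):
    """Area under the ROC curve from binary labels (0/1) and model scores."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the block of tied scores and give each member the average rank.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    # Mann-Whitney U statistic for the positives, scaled to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.66 indicates modest discrimination between responders and non-responders.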
Acknowledgement • UCF Statistics Dept • BlueCross BlueShield of FL