1 / 37

Marketing 8001: Database Marketing Class 3

Marketing 8001: Database Marketing Class 3. Discovering and Utilizing Reliable Patterns in Customer Data. Today’s Agenda. Exam 1 Next Week Reminder Class 3 Materials Accessing online SAS Data Step Issues Displaying Results: Proc Format Differentiating Customers Review Homework1

peta
Download Presentation

Marketing 8001: Database Marketing Class 3

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Marketing 8001:Database MarketingClass 3 Discovering and Utilizing Reliable Patterns in Customer Data Class 3

  2. Today’s Agenda • Exam 1 Next Week • Reminder • Class 3 Materials • Accessing online SAS • Data Step Issues • Displaying Results: Proc Format • Differentiating Customers • Review Homework1 • Class Exercise • Notes on Missing Data…Review on Your Own Class 3

  3. Exam 1 Next Week • Readings and class work to date, including material and homework for next week • SAS • Data Step programming • Procedures, including Proc Format • Make sure you can find topics in… • LSB • Online documentation of SAS • No multiple choice • You will need to be able to write SAS code and execute SAS Procs to answer questions about customers. • Exam should take 2 hours…but I’ve reserved the room until 7:30 p.m., in case some students need more time. Class 3

  4. Reminder • Keep up with readings • LSB • D&D • Missou Media Pack • Material in readings and Power Point slides can be on exams, even if not covered in class • Bring questions about readings and SAS to class or contact me • SAS is the tool we’ll use to do much of what is presented in D&D text • Do not wait until Tuesday to do homework, unless you’ve mastered SAS Class 3

  5. Class 3 Materials • Download Class 3 files to desktop…C:\M8001 • Class 3 PowerPoint => Class3_SP2008.ppt • Word document for SAS data analysis class exercise • ProcTabulateToProcExportExerciseClass3SP2008.doc • Work on the Proc Tabulate part of this problem (and outputting the Proc Tabulate table) BEFORE you look at the SAS code I’ve provided for you. • SAS program and data for SAS Class Exercise • ProcTabulateAndExportClass3SP2008.sas • HLPrimSampForTabExportClass3SP08.sas7bdat • GJC Notes on Proc Tabulate • Notes on Using Proc Tabulate.doc Class 3

  6. Accessing Online SAS • http://v8doc.sas.com/sashtml/ • For Procs • Click on + Base SAS Software • Click on + Base SAS Procedures Guide • Click on + Procedures • Click on Proc desired • For Data Step documentation • Click on + SAS Language Reference: Concepts Class 3

  7. SAS Data Step Issues • SAS is standard tool for DBM • Helps us deal with numbers, data • Numbers are not abstract…they represent measurements of attributes/characteristics of real customers • DBM is about marketing to people…and using data to do it well and profitably. Class 3

  8. SAS Data Step Issues • Creating & manipulating variables…see LSB • Variable = expression; • New_Variable = Old_Variable + 17; • Math operations • Add + Var3 = Var1 + Var2; • Subtract - Var3 = Var1 – Var2 • Multiply * Var3 = Var1 * Var2 • Divide / Var3 = Var1 / Var2 • Exponentiate ** Var3 = Var1**2 • For division…remember zero divide exception • Do not to forget to end statement with semi-colon…; Class 3

  9. SAS Data Step Issues • Standard math rules for precedence for one line of SAS code…see LSB • Exponentiation • Multiplication & Division • Addition & Subtraction • If many calculations in Data Step, remember SAS executes statements in order. • GJC Recommendation • Always use parentheses…don’t rely on standard math rules above • Remember to use as many left “(“ as right “)” • e.g. Logit_Score=Log(Prob_Score/(1-Prob_Score)) • Always check variables…especially complex ones • Proc Freq or Tabulate for categorical variables • Proc Print for continuous Class 3

  10. SAS Data Step Issues • SAS Functions • See LSB 3.3 for functions (pp 80-81) • If necessary, you may need to look online • Be familiar with functions…lookup when needed Class 3

  11. IF-THEN / ELSE Statements • Mnemonics…see LSB EQ (equals) (=) NE (not equals) (^=) GT (greater than) (>) LT (less than) (<) GE (greater than or equal) (>=) LE (less than or equal) (<=) • GJC Recommendation • Except for “=“ sign, use the character mnemonics • You’ll make fewer errors Class 3

  12. IF-THEN / ELSE Statements • IF condition THEN action; ELSE action; • IF Gender=‘F’ THEN Female=1; ELSE Female=0; • Note: Because Gender is character, need to place values in single quotes. • Validate Coding Proc Freq; Tables Gender * Female / Missing; Run; • Remember…the more complicated the code, the more detailed your checking must be. • Don’t want a preventable SAS bug to cost you money…or a lower grade on an exam Class 3

  13. IF-THEN / ELSE Statements • Always remember to initialize variables to zero or a default value • ELSE does it… IF Gender=‘F’ THEN Female=1; ELSE Female=0; • Alternative Coding, explicitly initialize value… Female=0; IF Gender=‘F’ THEN Female=1; • GJC Recommendation • Explicitly initialize variable, fewer errors Class 3

  14. Displaying Results: Proc Format • Very useful for categorizing and displaying continuous variables. • Can also be used to combine levels of categorical variables together. • Can define verbal labels for variables because: • Analysis takes place in your head, not in a computer! • Proc Format does not read data • Following code will generate an error because Proc Format does not read an input data file • Proc Format Data=Temp; Class 3

  15. Displaying Results: Proc Format • Business managers most comfortable with data in tables. • Formats created with Proc Format do not change any values for variables in your dataset…it only affects printing in Procs. Class 3

  16. Displaying Results: Proc Format • Example…first use Proc Format to create formats, then use the Format statement in analysis Proc to associate format with specific variable. Proc Format; Value AgeFmt 35-55=’35-55 years’ Other=‘Else’; Run; Proc Freq Data=Temp; Tables Age; Format Age AgeFmt. ;/* Period at end of format associates it with var */ Run; Class 3

  17. Displaying Results: Proc Format • We have seen how Proc Format works in Class 1 and 2 exercises • Make sure you can use it! • I recommend adapting code I’ve given you if you can. Class 3

  18. Differentiating Customers • DBM is all about differences among customers that can be captured in data about them. • DBMers want to use data to segment their customer base so that they can… • Market to them appropriately • Target them for increased profitability • So, DBMers are always looking for differences in response (order, pay, return, not pay) for customers with different characteristics Class 3

  19. So, How Do We Determine if Customers with Different Characteristics Have Different Response Rates? • First step is to conduct a test • Second step is to analyze the data Class 3

  20. From a Data Point-of-View, How Do We Differentiate Customers? • Let’s say we have conducted a test and have merged our test’s “frozen file” of customer characteristics with our response variable (Order). • The marketing team wants to know “how much higher than average do buyers of multiple Healthy Living books respond?” • The frozen file variable NHLBKPAY measures the number of HL books paid. Class 3

  21. From a Data Point-of-View, How Do We Differentiate Customers? • Method 1: Subset data via a Data Step and analyze separate datasets…this works, but is really inefficient. Data Multi NotMulti; /*Creates 2 files */ Set A.TestData; If NHLBKPAY GE 2 then Output Multi; Else Output NotMulti; Proc Means Data=Multi; Var Order; Run; Proc Means Data=NotMulti; Var Order; Run; Proc Means Data=A.TestData; Var Order; Run; Class 3

  22. From a Data Point-of-View, How Do We Differentiate Customers? • Method 2: Use Proc Freq • Cross characteristic with response variable • Need Proc Format for NHLBKPAY because it can have many values. • Proc Format Value Freq Low-1 = ‘NotMulti’ 2-High=‘Multi’; Run; • Proc Freq Data=A.TestData; Format NHLBKPAY Freq.; Tables NHLBKPAY * Order; Run; Class 3

  23. From a Data Point-of-View, How Do We Differentiate Customers? • Method 3: Use Where statement with Proc Means Proc Means Data=A.TestData; Where NHLBKPay GE 2; Var Order; /* Order rate for Multis */ Run; Proc Means Data=A.TestData; Where NHLBKPay LE 1; Var Order; /*Order rate for non-Multis*/ Run; Proc Means Data=A.TestData; Var Order; /*Order rate for all names, the average rate*/ Run; Class 3

  24. From a Data Point-of-View, How Do We Differentiate Customers? • Use Proc Tabulate to create tables • It’s complicated…but very powerful…so… • For simple tables, use Proc Freq and Proc Means • For more complicated tables, use Proc Tabulate Class 3

  25. More Complicated Tables • Let’s say we want order rates for cells created by crossing one customer characteristic with another…a complicated problem. • For example, calculate order rate for all 6 combinations of NHLBKBUY (in 3 categories) crossed with TLHLPAY (in 2 categories) • A 3 * 2 tables has three rows and two columns for a total of 6 cells or combinations…will also be row results and column results. • Draw a picture of such a table on paper! Class 3

  26. From a Data Point-of-View, How Do We Differentiate Customers? • What do we need to calculate order rates for names with a given characteristic?…or combination of characteristics? • Number orderers with characteristic (numerator) • Number all names with characteristic (denominator) • So, for a 3*2 table how can we get the numbers we need? Class 3

  27. More Complicated Tables: Use Proc Freq with Where Statement • Calculate numbers of orderers for each combination of characteristics in the 3 * 2 table • Proc Format Value Freq Low-0 = ‘Zero’ 1= ‘One’ 2-High=‘2 or More’; Value Recency Low-12=’12 Mo Recent’ 13-High=‘GT 12 Mo Rec’; Run; • Proc Freq Data=Temp; Tables NHLBKBUY * TLHLPAY; Where Order=1; /* 3*2 table for orderers in sample */ Format NHLBKBUY Freq. TLHLPAY Recency. ; Run; Class 3

  28. More Complicated Tables: Use Proc Freq with Where Statement • Calculate total number of namesfor one combination of characteristics in the 3 * 2 table Proc Format Value Freq Low-0 = ‘Zero’ 1= ‘One’ 2-High=‘2 or More’; Value Recency Low-12=’12 Mo Recent’ 13-High=‘GT 12 Mo Rec’; Run; • Proc Freq Data=Temp; Tables NHLBKBUY * TLHLPAY; /*3*2 table for names in sample */ Format NHLBKBUY Freq. TLHLPAY Recency. ; Run; Class 3

  29. More Complicated Tables: Use Proc Freq with Where Statement • Calculate order rates and indices by hand. • Have 6 numerators • Have 6 denominators Class 3

  30. More Complicated Tables: Use Proc Means with Where Statement • Mean of 0-1 coded order variable is order rate as a proportion. • Problem: You cannot use Proc Format with Proc Means. • But, Where statement does work for ranges of variables • Proc Means Data=Temp; Where NHLBKBUY LE 0 and TLHLPAY LE 12; Var Order; Run; • Proc Means Data=Temp; Where NHLBKBUY EQ 1 and TLHLPAY LE 12; Var Order; Run; • Need to make 4 more runs…but gets job done Class 3

  31. More Complicated Tables:Use Proc Tabulate • Need Proc Format to categorize variables Proc Tabulate Data=Temp; Class NHLBKBUY TLHLPAY ; Var Order; Table NHLBKBUY * (N PCTN MEAN*ORDER), TLHLPAY; Format NHLBKBUY Freq. TLHLPAY Recency. ; Run; • On paper, draw what this table would look like • Save the code for any Proc Tabulate that works…it’s a difficult Proc to use and adapting code is easier than re-writing it. Class 3

  32. Review Homework1 Class 3

  33. Class Exercise • Summarizing Data with Proc Tabulate • Outputting Proc Tabulate table to Excel Class 3

  34. Notes on Missing Data…a Quick Primer Review This Section on Your Own • How SAS represents missing data… • Numeric data => with a decimal point (.) • Character data => with blanks => ’ ‘ • In real world data, almost all data have missing values, e.g…. • In a SAS file with 15,000 names, the mean of VarA is based on 14,001 names • But, mean of VarB based on 832 names • Can’t compare means (or do correlations) because names may be different Class 3

  35. Notes on Missing Data • When you have missing data… • Need to determine you do not have data problems…never assume data are OK! • May need to create special flags to encode missing data for modeling and profiling, especially if names with missing data respond at different rate than names with data. • In example above… • 1 means female • 0 means not female (i.e., male or missing) Class 3

  36. Notes on Missing Data • Another example • 1 mean female • 2 means male • Missing Recoded to 3 where 3 means Don’t Know GenderNew = 0; If Gender = ‘ ‘ then GenderNew = 3; If Gender = ‘F’ then GenderNew = 1; If Gender = ‘M’ then GenderNew = 2; • Note: In this example the variable Gender is character…the variable GenderNew is numeric. • There should no names with a value of zero! Class 3

  37. Notes on Missing Data • The Bad News • Treatment of missing data is usually complicated • Variables with much missing data not very useful • In new business, transaction & promotion variables mostly missing…that is, they have little coverage • The Good News • Only our Healthy Living (HL) data have missing values • So, for class don’t worry about it • In real world, worry about it • If you want to work with missing data, practice with the HL data…in these data, missing values and other unique values of variables are encoded as negative integer numbers Class 3

More Related