Computing for Research I Spring 2014

Computing for Research I, Spring 2014
Exploratory Data Analysis and Hypothesis Testing
February 26
Primary Instructor: Elizabeth Garrett-Mayer

Exploratory Data Analysis
• We’ve already discussed some basic commands
  • sum and sum, detail
  • tab
• What other sorts of exploration might we do?
• Confidence intervals
  • for continuous variables
  • for categorical variables

Immediate command for CIs
• Continuous: cii N xbar s
• Binary: cii N phat or cii N x
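A minimal sketch of the immediate command, reusing the summary numbers from the t-test example in these slides (N = 20, mean = 48, SD = 2.75). The binary counts (18 successes out of 50) are made up for illustration. The slide's syntax is from Stata 13 and earlier; Stata 14 and later expect the keywords means and proportions, and run the older form only under version control.

```stata
* Slide syntax (Stata 13 and earlier), run here under version control
version 13: cii 20 48 2.75
* Binary outcome: 50 observations, 18 successes (hypothetical numbers)
version 13: cii 50 18

* Same intervals with the Stata 14+ syntax
cii means 20 48 2.75
cii proportions 50 18
```

On Stata 13 or earlier, drop the version prefix and skip the last two lines.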

Confidence intervals
• For a continuous variable: mean varlist
• Example:
    * estimate means of ceramide variables
    mean c18ceramide
    mean totalc-s1pc1

Additional options
    tab initialre initial
    mean c18ceramide, over(initialre)
    mean c18ceramide, vce(bootstr)
    mean c18ceramide, vce(bootstr) over(initialre)
    mean c18ceramide, level(90)
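The same options can be tried without the course data. This sketch assumes the auto dataset that ships with Stata; price, mpg and foreign are variables in that dataset, not in the ceramide data.

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Means, standard errors and 95% confidence intervals
mean price mpg

* Separate estimates for each level of a grouping variable
mean price, over(foreign)

* 90% instead of 95% confidence interval
mean price, level(90)

* Bootstrap standard errors; set the seed so the result is reproducible
set seed 2014
mean price, vce(bootstrap)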

Confidence intervals for proportion
    proportion varlist
    ci var, bin
Examples
    proportion failure
    proportion failure death initialre
    ci failure, bin
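A short sketch of both commands on the auto dataset that ships with Stata (an assumption; foreign and rep78 are variables in that dataset). The ci syntax on the slide is from Stata 13 and earlier; Stata 14 and later use ci proportions and run the older form only under version control.

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Estimated proportion in each category, with confidence intervals
proportion foreign
proportion foreign rep78

* Exact binomial CI for a 0/1 variable, slide syntax under version control
version 13: ci foreign, binomial

* Same interval with the Stata 14+ syntax
ci proportions foreign
```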

Hypothesis Testing
• A number of different approaches
• Options
  • nonparametric vs. parametric
  • continuous vs. categorical (vs. other?)
  • one vs. two vs. more than two groups

One sample t-tests
• ttesti N mean sd null
• ttest varname == null
• ttest var1 == var2 *paired
• Examples:
    ttesti 20 48 2.75 50
    ttest c18c == 10
    ttest frombaselines1p==100
    ttest frombaselinec18==100
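To see what the immediate form computes, the example ttesti 20 48 2.75 50 can be checked by hand: t = (mean - null) / (sd / sqrt(N)) with N - 1 = 19 degrees of freedom, which comes to about -3.25. A small sketch using only built-in functions:

```stata
* One-sample t statistic by hand for N = 20, mean = 48, sd = 2.75, null = 50
display (48 - 50) / (2.75 / sqrt(20))

* Two-sided p-value from the t distribution with 19 degrees of freedom
display 2 * ttail(19, abs((48 - 50) / (2.75 / sqrt(20))))

* The immediate command should report the same t statistic and p-value
ttesti 20 48 2.75 50
```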

Two sample t-tests
    ttesti N1 mean1 sd1 N2 mean2 sd2
    ttest varname1 == varname2, unpaired
    ttest varname, by(groupvar)
Examples:
    ttest c18, by(sex)
    ttest c18, by(sex) unequal
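A sketch of the two-sample forms on the auto dataset that ships with Stata (an assumption; mpg and foreign stand in for c18 and sex). The numbers passed to ttesti are hypothetical.

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Two-sample t-test assuming equal variances
ttest mpg, by(foreign)

* Same comparison without the equal-variance assumption
* (Satterthwaite's approximation for the degrees of freedom)
ttest mpg, by(foreign) unequal

* Immediate form from summary statistics (hypothetical numbers):
* N1 mean1 sd1 N2 mean2 sd2
ttesti 20 48 2.75 25 50 3.10
```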

Nonparametric?
• ranksum: two-group comparison
• kwallis: comparison of two or more groups
• signrank: matched-pairs signed-rank test
• signtest: sign test of matched pairs

Nonparametric?
    *nonparametric tests
    ranksum c18, by(sex)
    kwallis c18, by(sex)

    use ceramide.alldata, clear
    keep if cycle==3
    gen c18dif = frombaselinec18-100
    signrank c18dif=0
    signrank frombaselinec18=100
    signtest c18dif=0
    signtest frombaselinec18=100
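The same four tests can be tried on the auto dataset that ships with Stata (an assumption; it is not the ceramide data). The null value of 20 for mpg is arbitrary and chosen only for illustration.

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Wilcoxon rank-sum test: two groups
ranksum mpg, by(foreign)

* Kruskal-Wallis test: two or more groups
kwallis mpg, by(rep78)

* Signed-rank and sign tests against a fixed value,
* in the same form as signrank frombaselinec18=100
signrank mpg = 20
signtest mpg = 20
```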

Anova
• anova y x (note that x is assumed to be categorical)
• anova y x1 x2
Examples:
    anova c18c initialre
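Because anova treats every predictor as categorical, a continuous covariate has to be marked with the c. prefix (Stata 11 and later). A sketch on the auto dataset that ships with Stata (an assumption; the variables are from that dataset):

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* One-way ANOVA: rep78 is treated as categorical
anova price rep78

* Two categorical factors, main effects only
anova price rep78 foreign

* A continuous covariate must be flagged with c.
anova price foreign c.weight
```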

One sample binomial tests
• prtest and bitest
• Difference?
  • prtest uses large-sample approximations
  • bitest uses an exact test
    bitest varname == p0
    bitesti N x p0
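To compare the two approaches side by side, both can be run against the same null. This sketch assumes the auto dataset that ships with Stata, where foreign is coded 0/1. The null of 0.5 and the counts passed to bitesti are made up for illustration.

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Exact binomial test of H0: P(foreign = 1) = 0.5
bitest foreign == 0.5

* Large-sample (normal approximation) test of the same null
prtest foreign == 0.5

* Immediate exact test: 50 observations, 18 successes, null 0.5
bitesti 50 18 0.5
```

With small samples or proportions near 0 or 1, the two p-values can differ noticeably; the exact test is the safer choice there.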

One sample binomial tests
    use "SCBC2004.v9.dta", clear
    replace ercat=. if ercat==9
    gen ercatn=cond(ercat==2,0,1)
    replace ercatn=. if ercat==.
    tab ercat ercatn
    bitest ercatn == 0.50
    bitest ercatn == 0.65
    prtest ercatn == 0.65

Two (or more) sample binomial tests
    tab y x, exact
    tab y x, chi
    tab ercatn grade
    tab ercatn stage
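A sketch of both options on the auto dataset that ships with Stata (an assumption; foreign and rep78 stand in for the binary outcome and the grouping variable):

```stata
* Example data shipped with Stata (an assumption; not the course dataset)
sysuse auto, clear

* Pearson chi-squared test of association
tabulate foreign rep78, chi2

* Fisher's exact test, preferable when some cell counts are small
tabulate foreign rep78, exact

* Both tests with column percentages in one table
tabulate foreign rep78, chi2 exact column
```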
