STR Assignment 2

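This presentation uses R to draw a stratified sample from the adult dataset, summarize it, compare age and hours per week between the sexes (variance-ratio, t, and Wilcoxon tests), and run a chi-squared test of marital status against sex.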


Presentation Transcript


  1. STR Assignment 2: Solve using R. Presented by Shikha Rani Deo, BA (2014-2015)

  2. adult <- read.table("C:/Users/shikharanideo/Desktop/Praxis/statistics/adult.txt", header=FALSE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
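Because the file is read with header=FALSE, R names the columns V1, V2, ... automatically, which is why the later slides refer to V1, V10, V13 and so on. A minimal sketch of the same call on two made-up records follows; the records, the object name toy, and the explicit stringsAsFactors = TRUE (the default in R versions before 4.0) are assumptions for illustration, not part of the original slides.

```r
# Two hypothetical comma-separated records laid out like the rows of adult.txt.
raw <- c(
  "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K",
  "31, Private, 45781, Masters, 14, Never-married, Prof-specialty, Not-in-family, White, Female, 14084, 0, 50, United-States, >50K"
)

# Same arguments as on the slide, but reading from text instead of a file path.
toy <- read.table(text = raw, header = FALSE, sep = ",", na.strings = "NA",
                  dec = ".", strip.white = TRUE, stringsAsFactors = TRUE)

names(toy)                                          # automatic names V1 ... V15
str(toy[, c("V1", "V2", "V6", "V8", "V10", "V13")]) # the columns used on the slides
```

With strip.white=TRUE the blank after each comma is dropped, so the factor levels are "Male" and "Female" rather than " Male" and " Female".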

  3. STRATIFIED SAMPLING
     • Sample_17082014_1756 <- StrataSample(data = adult, strata1 = 10, sstotal = 2000, ppstype = "fixed", nminimum = 100, neyvar = )
     • View(Sample_17082014_1756)
     • showData(Sample_17082014_1756, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80, maxheight=30)
     • Output:
       > .Table <- table(Sample_17082014_1756$V10)
       > .Table  # counts for V10
       Female   Male
         1000   1000
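The table above shows an equal allocation of 1000 rows per sex. The following base-R sketch draws such a sample without the StrataSample function; it is not that function's implementation, and the simulated population pop and the name n_per_stratum are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical population with an unbalanced sex variable (V10) and an age (V1).
pop <- data.frame(
  V1  = round(runif(5000, 17, 90)),
  V10 = factor(sample(c("Female", "Male"), 5000, replace = TRUE,
                      prob = c(0.33, 0.67)))
)

n_per_stratum <- 1000  # fixed, equal allocation per stratum

# Split the row indices by stratum, then sample without replacement in each.
rows_by_stratum <- split(seq_len(nrow(pop)), pop$V10)
idx <- unlist(lapply(rows_by_stratum, function(rows) sample(rows, n_per_stratum)),
              use.names = FALSE)

strat <- pop[idx, ]
table(strat$V10)  # counts per stratum
```

Sampling inside each stratum separately is what guarantees the balanced counts; a simple random sample of 2000 rows would reproduce the population's imbalance instead.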

  4. SUMMARY OF SAMPLE • summary(Sample_17082014_1756)

  5. RATIO OF VARIANCES (Age)
     • Input: var.test(V1 ~ V10, alternative='two.sided', conf.level=.95, data=Sample_17082014_1756)  # V1 = age, V10 = sex
     • Output: ratio of variances = 1.104924 (> 1); ratio of standard deviations = sqrt(ratio of variances) = sqrt(1.104924) = 1.0511

  6. RATIO OF VARIANCES (Hours per week)
     • Input: var.test(V13 ~ V10, alternative='two.sided', conf.level=.95, data=Sample_17082014_1756)  # V13 = hours per week, V10 = sex
     • Output: ratio of variances = 1.015206 (approximately 1); ratio of standard deviations = sqrt(ratio of variances) = sqrt(1.015206) = 1.008
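The ratio of standard deviations on slides 5 and 6 is the square root of the variance ratio that var.test reports. A short sketch on simulated data shows where that number sits in the test object; the data frame d and its parameters are assumptions, not the original sample.

```r
set.seed(2)

# Hypothetical hours-per-week values for two groups of 1000.
d <- data.frame(
  V13 = c(rnorm(1000, mean = 37, sd = 12), rnorm(1000, mean = 42, sd = 12)),
  V10 = factor(rep(c("Female", "Male"), each = 1000))
)

vt <- var.test(V13 ~ V10, data = d, alternative = "two.sided", conf.level = 0.95)

ratio_var <- unname(vt$estimate)  # var(Female) / var(Male), first level over second
ratio_sd  <- sqrt(ratio_var)      # ratio of standard deviations
ci_sd     <- sqrt(vt$conf.int)    # square root is monotone, so this is a CI for the SD ratio

ratio_var
ratio_sd
ci_sd

# The arithmetic quoted on the slides:
sqrt(1.104924)
sqrt(1.015206)
```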

  7. WELCH TWO SAMPLE T-TEST (Age)
     • Input: t.test(V1~V10, alternative='two.sided', conf.level=.95, var.equal=FALSE, data=Sample_17082014_1756)  # V1 = age, V10 = sex; var.equal=FALSE gives the Welch test
     • Output: mean age for females < mean age for males

  8. TWO SAMPLE T-TEST (Hours per week)
     • Input: t.test(V13~V10, alternative='two.sided', conf.level=.95, var.equal=TRUE, data=Sample_17082014_1756)  # V13 = hours per week, V10 = sex; var.equal=TRUE gives the pooled-variance t-test, not the Welch test
     • Output: mean hours per week for females < mean hours per week for males
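Slide 7 sets var.equal=FALSE and slide 8 sets var.equal=TRUE, and that argument decides which test t.test runs. The sketch below runs both on the same simulated data so the difference is visible; the data frame d is an assumption for illustration.

```r
set.seed(3)

# Hypothetical ages for two groups of 1000 with slightly different spreads.
d <- data.frame(
  V1  = c(rnorm(1000, mean = 37, sd = 14), rnorm(1000, mean = 40, sd = 13)),
  V10 = factor(rep(c("Female", "Male"), each = 1000))
)

welch  <- t.test(V1 ~ V10, data = d, alternative = "two.sided",
                 conf.level = 0.95, var.equal = FALSE)
pooled <- t.test(V1 ~ V10, data = d, alternative = "two.sided",
                 conf.level = 0.95, var.equal = TRUE)

welch$method      # names the Welch test
pooled$method     # names the classical two-sample test
welch$parameter   # Welch-Satterthwaite degrees of freedom (not necessarily an integer)
pooled$parameter  # n1 + n2 - 2 degrees of freedom
welch$estimate    # group means, identical in both tests
```

When the variance ratio is close to 1, as on slide 6, the two tests give nearly the same result, which is a common reason for choosing the pooled version.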

  9. WILCOXON TEST (Age)
     • Input: wilcox.test(V1 ~ V10, alternative="two.sided", data=Sample_17082014_1756)  # V1 = age, V10 = sex
     • Output:

  10. WILCOXON TEST (Hours per week)
     • Input: wilcox.test(V13 ~ V10, alternative="two.sided", data=Sample_17082014_1756)  # V13 = hours per week, V10 = sex
     • Output:
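The output of these two tests is not reproduced in the transcript. As a sketch of what wilcox.test returns for a two-group formula, the example below uses simulated data (the data frame d is an assumption) and checks the W statistic against its definition: the number of (female, male) pairs in which the female value is larger, with ties counted as one half.

```r
set.seed(4)

# Hypothetical integer hours-per-week values, so ties occur as in real data.
d <- data.frame(
  V13 = c(rpois(1000, lambda = 36), rpois(1000, lambda = 42)),
  V10 = factor(rep(c("Female", "Male"), each = 1000))
)

wt <- wilcox.test(V13 ~ V10, alternative = "two.sided", data = d)
wt$statistic  # W
wt$p.value

# W computed directly from all female-male pairs.
f <- d$V13[d$V10 == "Female"]
m <- d$V13[d$V10 == "Male"]
W_manual <- sum(outer(f, m, ">")) + 0.5 * sum(outer(f, m, "=="))
W_manual
```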

  11. SUMMARY (full adult dataset)
     • Input: summary(adult)
     • Output:

  12. SUBSET 'sample' ("Not-in-family")
     • Input:
       NOFdata <- subset(Sample_17082014_2245, subset=V8=="Not-in-family")
       showData(NOFdata, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80, maxheight=30)
     • Output:

  13. SUBSET 'sample' ("Not-in-family" and "self employed")
     • Input:
       SEdata <- subset(NOFdata, subset=V2=="Self-emp-not-inc"|V2=="Self-emp-inc")
       showData(SEdata, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80, maxheight=30)
     • Output: P("Self Employed Person" and "Not-in-family") = 38/2000 = 0.019
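The probability on slide 13 is a sample proportion: the number of rows meeting both conditions divided by the sample size. The sketch below computes it in one step and via the two nested subsets used on slides 12 and 13; the simulated data frame smp is an assumption, so its counts differ from the 38 on the slide.

```r
set.seed(5)

# Hypothetical sample of 2000 rows with a work class (V2) and a relationship (V8).
smp <- data.frame(
  V2 = factor(sample(c("Private", "Self-emp-not-inc", "Self-emp-inc", "State-gov"),
                     2000, replace = TRUE)),
  V8 = factor(sample(c("Husband", "Not-in-family", "Own-child"),
                     2000, replace = TRUE))
)

# One step: a logical vector marking rows that satisfy both conditions.
both <- smp$V8 == "Not-in-family" &
  smp$V2 %in% c("Self-emp-not-inc", "Self-emp-inc")
sum(both)   # joint count
mean(both)  # joint proportion = count / nrow(smp)

# Two steps, mirroring slides 12 and 13.
nof <- subset(smp, subset = V8 == "Not-in-family")
se  <- subset(nof, subset = V2 == "Self-emp-not-inc" | V2 == "Self-emp-inc")
nrow(se) / nrow(smp)

# The figure reported on the slide:
38 / 2000
```

Dividing by the size of the whole sample gives a joint probability; dividing by nrow(nof) instead would give the conditional probability of being self-employed given "Not-in-family".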

  14. CODE (CONTINGENCY TABLE)
     • Input:
       .Table <- xtabs(~V10+V6, data=Sample_17082014_2245, subset=V6=="Divorced"|V6=="Married-civ-spouse"|V6=="Never-married")
       .Table
       .Test <- chisq.test(.Table, correct=FALSE)
       .Test
       round(.Test$residuals^2, 2)  # Chi-square Components
       remove(.Test)
       remove(.Table)

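  15. OUTPUT (CONTINGENCY TABLE)
     • Pearson's Chi-squared test: X-squared = NaN, p-value = NA
     • Chi-square Components
     • Interpretation: because X-squared is NaN and the p-value is NA, the test statistic is undefined, so this output gives no evidence either way about an association between marital status (V6) and sex (V10). A NaN statistic typically arises when the table contains zero-count rows or columns, for example factor levels of V6 that remain in the table after subsetting.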
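One likely cause of the NaN result is that xtabs keeps every level of the factor V6 even after the subset, leaving columns of zeros whose expected counts are also zero. The sketch below reproduces that behaviour on simulated data and shows the effect of xtabs's drop.unused.levels argument; the data frame smp and the vector keep are assumptions, and whether this was the cause in the original sample cannot be confirmed from the transcript.

```r
set.seed(6)

# Hypothetical sample: sex (V10) and a marital status factor (V6) with five levels.
smp <- data.frame(
  V10 = factor(sample(c("Female", "Male"), 2000, replace = TRUE)),
  V6  = factor(sample(c("Divorced", "Married-civ-spouse", "Never-married",
                        "Separated", "Widowed"), 2000, replace = TRUE))
)
keep <- c("Divorced", "Married-civ-spouse", "Never-married")

# Default: unused levels stay in the table as all-zero columns.
tab_all  <- xtabs(~ V10 + V6, data = smp, subset = V6 %in% keep)
test_all <- suppressWarnings(chisq.test(tab_all, correct = FALSE))
tab_all
test_all$statistic  # NaN, because some expected counts are 0

# Dropping unused levels leaves a 2 x 3 table on which the test is defined.
tab_used  <- xtabs(~ V10 + V6, data = smp, subset = V6 %in% keep,
                   drop.unused.levels = TRUE)
test_used <- chisq.test(tab_used, correct = FALSE)
tab_used
test_used
round(test_used$residuals^2, 2)  # chi-square components per cell
```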
