
APPLIED DATA ANALYSIS IN CRIMINAL JUSTICE



Presentation Transcript


  1. APPLIED DATA ANALYSIS IN CRIMINAL JUSTICE CJ 525 MONMOUTH UNIVERSITY Juan P. Rodriguez

  2. Perspective • Research Techniques • Accessing, Examining and Saving Data • Univariate Analysis – Descriptive Statistics • Constructing (Manipulating) Variables • Association – Bivariate Analysis • Association – Multivariate Analysis • Comparing Group Means – Bivariate • Multivariate Analysis - Regression

  3. Lecture 6 Comparing Group Means Bivariate Analysis

  4. Relationships between categorical and numerical variables • ANOVA: • Compares group means • Tests for significance • Bar Charts and Box Plots • Tests for Differences in means

  5. One Way ANOVA • Assesses how much the mean values of a numerical variable differ among the categories of a categorical variable, relative to the variation within each category
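
As a rough sketch of the comparison this slide describes, the one-way ANOVA F statistic can be written as the ratio of between-group to within-group variation. The notation is an assumption introduced here, not taken from the slides: k categories, n_j cases in category j, N cases in total, group means \bar{y}_j and grand mean \bar{y}.

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{j=1}^{k} n_j \,(\bar{y}_j - \bar{y})^2 \,/\, (k - 1)}
         {\sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 \,/\, (N - k)}
```

A large F means the group means are far apart compared with the spread of cases inside each group.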

  6. One Way ANOVA • Example: Relationship between television viewing and marital status in GSS98 dataset • TVHOURS: numerical variable – number of hours spent watching TV per day • MARITAL: categorical variable – married, widowed, divorced, separated and never married

  7. One Way ANOVA • Null Hypothesis: • No relationship – people in all groups watch, on average, the same amount of television • Alternative Hypothesis: • There is a relationship – at least two of the categories differ in the mean number of hours of television watched
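
The test of these hypotheses can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the lecture: the function name `one_way_anova` is hypothetical, and the hours are made-up values, not GSS98 data.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group -> list of values."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of categories
    n = len(all_values)    # total number of cases

    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: spread of cases around their own group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up hours of TV per day, NOT taken from GSS98.
tv_hours = {
    "married": [2, 3, 2, 3],
    "widowed": [4, 5, 4, 5],
    "never married": [3, 3, 4, 2],
}

f_stat, df_b, df_w = one_way_anova(tv_hours)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints "F(2, 9) = 9.75"
```

The null hypothesis is rejected when F is larger than the critical value of the F distribution with these degrees of freedom; a statistics package or an F table supplies that value or the matching p-value.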

  8-18. Analysis Of Variance (eleven slides with no transcribed text)

  19. Analysis Of Variance • The differences in the mean values for these groups are so large that they are unlikely to be due to chance: • There is a statistically significant relationship between marital status and television viewing

  20. Graphing ANOVA Results • Bar charts • Used to present data to general audiences or to people not well versed in statistics • Box Plots • Show both the central tendency and the distribution of each category

  21-26. Bar Chart (six slides with no transcribed text)

  27. Bar Charts - Results • Separated and widowed people watch more TV, on average, than people in the other categories
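
A bar chart of this kind plots one bar per category, with the bar height equal to the group mean. The sketch below computes those means and prints a plain-text bar chart. The numbers are made up for illustration and are not GSS98 results.

```python
from statistics import mean

# Made-up hours of TV per day by marital status, NOT taken from GSS98.
tv_hours = {
    "married": [2, 3, 2, 3],
    "widowed": [4, 5, 4, 5],
    "divorced": [3, 3, 4, 2],
    "separated": [4, 4, 5, 3],
    "never married": [3, 2, 3, 4],
}

# The quantity a bar chart of ANOVA results displays: one mean per category.
group_means = {status: mean(hours) for status, hours in tv_hours.items()}

# One row per category, largest mean first; bar length is proportional to the mean.
for status, avg in sorted(group_means.items(), key=lambda item: item[1], reverse=True):
    bar = "#" * round(avg * 4)
    print(f"{status:<14}{bar} {avg:.2f}")
```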

  28. Box Plots • Depict differences in both the spread and the center among groups. • Placing box plots side by side makes it easy to compare distributions
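
The numbers behind a box plot are the five-number summary of each group. The sketch below computes it with Python's `statistics.quantiles`. The helper name `five_number_summary` is hypothetical, the data are made up, and the "inclusive" quartile method is one of several conventions, so a statistics package may report slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box plot draws."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Made-up hours of TV per day, NOT taken from GSS98.
tv_hours = {
    "married": [0, 1, 2, 2, 3, 3, 4],
    "widowed": [1, 3, 4, 4, 5, 6, 10],
}

# Printing the summaries one under the other mimics side-by-side box plots.
for status, hours in tv_hours.items():
    lo, q1, med, q3, hi = five_number_summary(hours)
    print(f"{status:<10} min={lo} Q1={q1} median={med} Q3={q3} max={hi} IQR={q3 - q1}")
```

The median shows the center of each group, and the interquartile range (Q3 minus Q1) and the minimum-to-maximum range show its spread.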

  29-34. Box Plots (six slides with no transcribed text)

  35. Post-hoc Tests • ANOVA found significant differences among means with respect to TV viewing • Are only 2 means significantly different? • Are all of them significantly different? • Or anything in between? • Post-hoc tests tell us this
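
One common post-hoc approach is to compare every pair of groups and tighten the significance level with a Bonferroni adjustment. The slides do not say which post-hoc test was used, so this is only an illustration. The function name `pairwise_t_statistics` is hypothetical and the data are made up. The sketch stops at the t statistics, because p-values need the t distribution from a statistics package or table.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pairwise_t_statistics(groups):
    """Return {(a, b): t} using the pooled within-group variance from the ANOVA."""
    n_total = sum(len(v) for v in groups.values())
    k = len(groups)
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    ms_within = ss_within / (n_total - k)  # pooled variance estimate

    results = {}
    for a, b in combinations(groups, 2):
        diff = mean(groups[a]) - mean(groups[b])
        std_err = sqrt(ms_within * (1 / len(groups[a]) + 1 / len(groups[b])))
        results[(a, b)] = diff / std_err
    return results

# Made-up hours of TV per day, NOT taken from GSS98.
tv_hours = {
    "married": [2, 3, 2, 3],
    "widowed": [4, 5, 4, 5],
    "never married": [3, 3, 4, 2],
}

t_stats = pairwise_t_statistics(tv_hours)
n_comparisons = len(t_stats)            # k(k-1)/2 pairs
bonferroni_alpha = 0.05 / n_comparisons  # stricter per-comparison level

for (a, b), t in t_stats.items():
    print(f"{a} vs {b}: t = {t:.2f}")
print(f"Compare each p-value with alpha = {bonferroni_alpha:.4f}")
```

Each t statistic has N - k degrees of freedom. A pair is declared different only if its p-value falls below the adjusted level, which keeps the overall chance of a false positive near 0.05 across all comparisons.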

  36-44. Post-hoc Tests (nine slides with no transcribed text)

  45. Assumptions in ANOVA • Within each sample, the values are independent, and identically normally distributed (same mean and variance). • The samples are independent of each other. • The different samples are all assumed to come from populations with the same variance, allowing for a pooled estimate of the variance. • For a multiple comparisons test of the sample means to be meaningful, the populations are viewed as fixed, so that the populations in the experiment include all those of interest.

  46. Assumptions of ANOVA • Distributions are normal: • The one-way ANOVA F test is not much affected by skewed population distributions unless the sample sizes are seriously unbalanced. • If the sample sizes are balanced, the F test is not seriously affected by light or heavy tails unless the sample sizes are small (fewer than 5) or the departure from normality is extreme (kurtosis less than -1 or greater than 2). • When the data are nonnormal, a nonparametric test or a transformation may give a more powerful test.
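
The skewness and kurtosis mentioned above can be checked numerically. The sketch below assumes that the slide's thresholds refer to excess kurtosis (0 for a normal distribution). It uses simple moment estimators, so a statistics package that applies small-sample corrections may report slightly different values. The function name is hypothetical and the data are made up.

```python
from statistics import mean

def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(values)
    m = mean(values)
    m2 = sum((x - m) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - m) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - m) ** 4 for x in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

# Made-up values: a flat symmetric sample and one with a single large outlier.
flat = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 2, 3, 3, 3, 4, 4, 12]

for label, sample in [("flat", flat), ("with outlier", with_outlier)]:
    skew, kurt = skewness_and_excess_kurtosis(sample)
    print(f"{label:<13} skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

The flat sample has zero skewness and negative excess kurtosis (light tails); the sample with an outlier is right-skewed with positive excess kurtosis (heavy tails).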

  47. Assumptions of ANOVA • Samples are Independent • A lack of independence within a sample is often caused by the existence of an implicit factor in the data. • Values collected over time may be serially correlated (here time is the implicit factor). • If the data are in a particular order, consider the possibility of dependence. (If the row order of the data reflects the order in which the data were collected, an index plot [data value plotted against row number] can reveal patterns that suggest possible time effects.)
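
A numeric companion to the index plot is the lag-1 autocorrelation, which measures how strongly each value resembles the one collected just before it. The slide does not name this statistic, so it is introduced here only as an illustration. The function name is hypothetical and the data are made up.

```python
from statistics import mean

def lag1_autocorrelation(values):
    """Correlation between each value and the next one, in row (collection) order."""
    m = mean(values)
    num = sum((values[i] - m) * (values[i + 1] - m) for i in range(len(values) - 1))
    den = sum((x - m) ** 2 for x in values)
    return num / den

# Made-up series in collection order.
trending = [1, 2, 3, 4, 5, 6]      # steady drift over time
alternating = [1, 5, 1, 5, 1, 5]   # values flip from one row to the next

print(f"trending:    {lag1_autocorrelation(trending):.2f}")     # prints 0.50
print(f"alternating: {lag1_autocorrelation(alternating):.2f}")  # prints -0.83
```

Values near zero are consistent with independence; values well away from zero, in either direction, suggest that row order carries information and that the independence assumption deserves a closer look.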

  48. Assumptions of ANOVA • Variances are homogeneous: • Assessed by examining the relative sizes of the sample variances, either informally (including graphically) or with a robust variance test such as Levene's test. • The risk with unequal variances is reporting a significant difference in the means when none exists. The risk grows with the differences between variances, particularly when one sample variance is much larger than the others.

  49. Assumptions of ANOVA • Variances are homogeneous (continued) • The F test is fairly robust against inequality of variances if the sample sizes are equal • If both nonnormality and unequal variances are present, use a transformation • A nonparametric test like the Kruskal-Wallis test still assumes that the population variances are comparable.
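
Levene's test, mentioned above, amounts to a one-way ANOVA run on each value's absolute deviation from its group center. The sketch below computes that statistic. The function name `levene_statistic` is hypothetical, and the default of centering on the median (the Brown-Forsythe variant) is a choice made here; passing `center=mean` gives the original mean-centered version. The data are made up.

```python
from statistics import mean, median

def levene_statistic(groups, center=median):
    """Levene-type W: a one-way ANOVA F on absolute deviations from each group's center."""
    deviations = [[abs(x - center(g)) for x in g] for g in groups]
    all_dev = [d for g in deviations for d in g]
    grand = mean(all_dev)
    k, n = len(deviations), len(all_dev)

    # Same between/within decomposition as ANOVA, applied to the deviations.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in deviations)
    ss_within = sum((d - mean(g)) ** 2 for g in deviations for d in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up samples: the second group is twice as spread out as the first.
narrow = [1, 2, 3, 4, 5]
wide = [1, 3, 5, 7, 9]
shifted = [11, 12, 13, 14, 15]  # same spread as `narrow`, different mean

print(f"narrow vs wide:    W = {levene_statistic([narrow, wide]):.2f}")
print(f"narrow vs shifted: W = {levene_statistic([narrow, shifted]):.2f}")
```

W is compared with the F distribution on k - 1 and N - k degrees of freedom. A difference in means alone, as in the second comparison, leaves W at zero, because the test looks only at spread.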

  50. Assumptions - Normality
