Statistics for CS 312

wade-fulton

Presentation Transcript


  1. Statistics for CS 312

  2. Descriptive vs. inferential statistics • Descriptive – used to describe an existing population • Inferential – used to draw conclusions about a population from a sample of it

  3. Graphical descriptions • Histograms • Frequency polygons/curves • Pie charts

  4. Measures of central tendency • Mean – average – used most often • Median – midpoint value – used when data is skewed • Mode – most frequently occurring value – used when interested in what most people think
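
A minimal sketch of the three measures, using Python's standard `statistics` module. The data set `hours` is hypothetical and chosen so that one outlier skews the distribution, which shows why the median is preferred for skewed data.

```python
import statistics

# Hypothetical sample: hours per week that ten students spend programming.
# The single value 40 skews the data to the right.
hours = [2, 3, 3, 4, 5, 5, 5, 6, 7, 40]

mean_hours = statistics.mean(hours)      # arithmetic average, pulled upward by the outlier
median_hours = statistics.median(hours)  # midpoint of the sorted values
mode_hours = statistics.mode(hours)      # most frequently occurring value

print(mean_hours, median_hours, mode_hours)
```

Here the mean is 8 while the median and mode are both 5, so the mean overstates what a typical student reports.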

  5. Measures of variability • Range – highest value minus lowest value • Standard deviation – typical distance of the individual values from the mean, computed as the square root of the average squared deviation
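
The following sketch computes both measures by hand for a small hypothetical data set. It shows the population form (divide by n) and the sample form (divide by n - 1); the slide does not say which one the course uses, so both are given.

```python
import math

# Hypothetical data set
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value
value_range = max(values) - min(values)

# Standard deviation: square root of the average squared deviation from the mean
mean = sum(values) / len(values)
squared_deviations = [(x - mean) ** 2 for x in values]
population_sd = math.sqrt(sum(squared_deviations) / len(values))    # divide by n
sample_sd = math.sqrt(sum(squared_deviations) / (len(values) - 1))  # divide by n - 1

print(value_range, population_sd, sample_sd)
```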

  6. Normal curve • Bell shaped curve – about 68% of values lie within one standard deviation of the mean • Non-normal – skewed either negatively (tail to left) or positively (tail to right) • Percentiles – the value below which a given percentage of the values fall • Standard scores – distance from the mean in units of the standard deviation – z = (X - m) / s • Z scores – transformed standard scores (often called T scores) – Z = 10z + 50
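
The two score formulas and the 68% figure can be checked with a short sketch. The function names and the example score of 85 are illustrative assumptions; `NormalDist` comes from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

def standard_score(x, mean, sd):
    """z = (X - m) / s: distance from the mean in standard deviations."""
    return (x - mean) / sd

def transformed_score(z):
    """Z = 10z + 50: rescales z to a mean of 50 and a standard deviation of 10."""
    return 10 * z + 50

# Hypothetical exam score of 85 where the mean is 70 and the standard deviation is 10
z = standard_score(85, mean=70, sd=10)
big_z = transformed_score(z)

# Share of a normal distribution lying within one standard deviation of the mean
within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(z, big_z, round(within_one_sd, 4))
```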

  7. Variables • Quantitative – things that can be measured or counted (age, income, number of credits) • Qualitative – categories rather than amounts, often without an inherent order (college major, address)

  8. Populations and samples • Population – entire universe from which a sample is drawn • Sample – subset of population • Symbols (sample, population) – mean m, µ; standard deviation s, σ; variance s², σ²

  9. How representative is the sample • Random sample – use random numbers to choose members of the sample • Stratified sample – sample that represents subgroups proportionally
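
As a rough illustration of the two sampling methods, the sketch below draws a simple random sample and a proportional stratified sample. The population of 60 CS majors and 40 math majors, the 10% sampling fraction, and the helper `stratified_sample` are all hypothetical.

```python
import random

rng = random.Random(312)  # fixed seed so that reruns give the same samples

# Hypothetical population: 60 CS majors and 40 math majors
population = [("cs", i) for i in range(60)] + [("math", i) for i in range(40)]

# Random sample: every member has the same chance of being chosen
simple_sample = rng.sample(population, 10)

def stratified_sample(members, key, fraction, rng):
    """Sample the same fraction from each subgroup, so subgroups stay proportional."""
    strata = {}
    for member in members:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for group in strata.values():
        k = round(len(group) * fraction)
        sample.extend(rng.sample(group, k))
    return sample

stratified = stratified_sample(population, key=lambda m: m[0], fraction=0.1, rng=rng)
cs_count = sum(1 for major, _ in stratified if major == "cs")
math_count = sum(1 for major, _ in stratified if major == "math")
print(cs_count, math_count)
```

The stratified sample always contains 6 CS majors and 4 math majors, while the simple random sample matches those proportions only on average.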

  10. Hypothesis testing • Hypothesis as to relationship of variables – similar or different • Inference from a sample to the entire population

  11. Statistical significance • Accept true hypotheses and reject false ones • Based on probability (10 heads in a row occurs on average once in 1024 runs of 10 coin tosses) • Significant result means a significant departure from what might be expected from chance alone • Example – a result more than two standard deviations above the mean occurs about 2.3% of the time in a normally distributed population
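
Both probabilities on this slide can be reproduced in a few lines. This is a sketch using `NormalDist` from the standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

# Probability that a fair coin lands heads 10 times in a row: (1/2)^10 = 1/1024
p_ten_heads = 0.5 ** 10

# Probability that a normally distributed value lies more than
# two standard deviations above the mean (one tail only)
upper_tail = 1 - NormalDist().cdf(2)

print(p_ten_heads, round(upper_tail, 4))
```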

  12. Null hypothesis • Assumption that there is no difference between groups, or no relationship between variables • Example – Male and female college students do similar amounts of music downloading using BitTorrent. • Example – School use of computers is unrelated to income of the students’ families

  13. Levels of significance • 5 percent level – Event could occur by chance only 5 times in 100 • 1 percent level – Event could occur by chance only 1 time in 100 • Significance level should be chosen before doing the experiment

  14. Types of errors • Type I error – Rejection of a true null hypothesis • Type II error – Acceptance of (failure to reject) a false null hypothesis • For a fixed sample size, decreasing one type increases the other

  15. One and two tailed tests • One tailed test – Experimental values can contradict the null hypothesis in only one direction • Two tailed test – Values could occur on either the positive or negative tail of the curve
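
The difference between the two kinds of test shows up in the p-value. The sketch below assumes the test statistic is a z value from a standard normal distribution and that the one tailed alternative points in the positive direction; the value 1.8 is hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """Return (one tailed, two tailed) p-values for a standard normal statistic z."""
    std_normal = NormalDist()
    one_tailed = 1 - std_normal.cdf(z)             # upper tail only
    two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # both tails
    return one_tailed, two_tailed

one_tailed, two_tailed = p_values(1.8)
print(round(one_tailed, 4), round(two_tailed, 4))
```

With z = 1.8 the one tailed p-value falls below 0.05 but the two tailed p-value does not, so the same result is significant at the 5 percent level only under the one tailed test.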

  16. Estimation • Concerns the magnitude of relationships between variables • Hypothesis testing asks “is there a relationship” • Estimation asks “how large is the relationship” • Confidence interval – a range of values that, at a stated confidence level, is likely to contain the population mean
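
A confidence interval for a mean can be sketched as follows. This version assumes the normal approximation (a critical value of about 1.96 for 95% confidence), which is reasonable only for fairly large samples; small samples would normally use the t distribution instead. The function name and the sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    centre = mean(sample)
    margin = z_crit * stdev(sample) / math.sqrt(len(sample))  # critical value times standard error
    return centre - margin, centre + margin

# Hypothetical sample of 30 measurements
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13,
          17, 12, 14, 15, 13, 11, 16, 14, 13, 15,
          12, 14, 16, 13, 15, 14, 12, 13, 15, 14]

low, high = mean_confidence_interval(sample)
print(low, high)
```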

  17. Sequence of activities • Description • Tests of hypotheses • Estimation • Evaluation

  18. Correlation • Quantifiable relationship between two variables • Example – relationship between age and type of computer games played • Example – relationship between family income and speed of home computer connection.

  19. Correlation chart • Two (or more) dimensional table • Variables on the axes, could be intervals • Scattergram – positively correlated values scatter with positive slope, negatively correlated values with negative slope

  20. Product-moment coefficient • Formula based on deviations from means • If paired deviations tend to have the same sign, values are positively correlated • If paired deviations tend to have opposite signs, values are negatively correlated • Most correlations are somewhere in between +1 and -1
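
One way to compute the coefficient from deviations is sketched below: the sum of the products of paired deviations, divided by the square root of the product of the summed squared deviations. The function name `pearson_r` and the data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient computed from deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of X from its mean
    dy = [y - mean_y for y in ys]  # deviations of Y from its mean
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Deviations with matching signs give r = +1; opposite signs give r = -1
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_positive, r_negative)
```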

  21. [Figure: scattergram of points A, B, C, D plotted on X and Y axes] Perfect positive correlation: r = +1

  22. [Figure: scattergram of points A, B, C, D plotted on X and Y axes] Perfect negative correlation: r = -1
