
Welcome to MDM4U (Mathematics of Data Management, University Preparation)


nemo


Presentation Transcript


  1. Welcome to MDM4U (Mathematics of Data Management, University Preparation) http://www.wordle.net/

  2. AGENDA • Attendance • Course Outline • Chapter 1 Problem (CP1) • Assign textbooks

  3. 1.1 Displaying Data Visually • Learning goal: Classify data by type; create appropriate graphs • MSIP / Home Learning: p. 11 #2, 3ab, 4, 7, 8

  4. Chapter 1 Problem • Log on to a computer • You may pair up if no computers are available • Click MDM4U.LIEFF.CA • Save the file MDM4U CP1.PDF to your M:\ drive • Create an MDM4U folder • Create a Ch1 folder • Answer CP1 and CP2 in a Word document

  5. Why do we collect data? • We learn by observing • Collecting data is a systematic method of making observations • Allows others to repeat our observations • Good definitions for this chapter: http://www.stats.gla.ac.uk/steps/glossary/alphabet.html

  6. Types of Data • 1) Quantitative – can be represented by a number • E.g., age, height, weight, number of siblings • a) Discrete Data • Data where a fraction/decimal is impossible • E.g., age in whole years, number of siblings • b) Continuous Data • Data where fractions/decimals are possible • E.g., weight, height, academic average • 2) Qualitative – cannot be measured numerically • E.g., eye colour, hair colour, favourite band

  7. Who do we collect data from? • Population - the entire group from which we can collect data / draw conclusions • NOTE: Data does NOT have to be collected from every member • Census – data collected from every member of the pop’n • Data is representative of the population • Can be time-consuming and/or expensive • Sample - data collected from some members of the pop’n (min. 10%) • A good sample must be representative of the pop’n • Sampling methods in Ch2

  8. Organizing Data • A frequency table is often used to display data, listing each value or interval of the variable and its frequency • What type of data does this table contain? • Intervals can’t overlap • Use 3 to 12 intervals / categories
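A minimal sketch of building a frequency table with non-overlapping intervals. It borrows the 30 values listed on the stem and leaf slide of this presentation; the function name `frequency_table` and the choice of start value 100 and width 10 are illustrative assumptions, and the "low-high" labels only make sense for whole-number data.

```python
from collections import Counter

# Values copied from the stem and leaf slide of this presentation.
data = [101, 103, 107, 112, 114, 115, 115, 121, 123, 125,
        127, 127, 133, 134, 134, 136, 137, 138, 141, 144,
        146, 146, 146, 152, 152, 154, 159, 165, 167, 168]


def frequency_table(values, start, width):
    """Count the values in each interval [low, low + width).

    Hypothetical helper: assumes whole numbers, all >= start.
    Each value lands in exactly one interval, so intervals cannot overlap.
    """
    counts = Counter((v - start) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        low = start + k * width
        table.append((low, low + width - 1, counts.get(k, 0)))
    return table


table = frequency_table(data, start=100, width=10)
for low, high, freq in table:
    print(f"{low}-{high}: {freq}")
```

With a width of 10 this data needs 7 intervals, which sits inside the suggested range of 3 to 12.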

  9. Organizing Data (cont’d) • Another useful organizer is a stem and leaf plot. • This table represents the following data: 101 103 107 112 114 115 115 121 123 125 127 127 133 134 134 136 137 138 141 144 146 146 146 152 152 154 159 165 167 168

  10. Organizing Data (cont’d) • What type of data is this? • The class interval is the size of the grouping, and is 10 units here • 100-109, 110-119, 120-129, etc. • No decimals req’d • Stem can have as many numbers as needed • A leaf must be recorded each time the number occurs
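The stem and leaf plot for the 30 values on the previous slide can be sketched in a few lines. This is only an illustration: the function name `stem_and_leaf` is made up here, and it assumes whole-number data where the leaf is the last digit and the stem is everything before it.

```python
from collections import defaultdict

# Values copied from the previous slide.
data = [101, 103, 107, 112, 114, 115, 115, 121, 123, 125,
        127, 127, 133, 134, 134, 136, 137, 138, 141, 144,
        146, 146, 146, 152, 152, 154, 159, 165, 167, 168]


def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last) and leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        # A leaf is appended every time the number occurs, so repeats are kept.
        plot[v // 10].append(v % 10)
    return dict(plot)


plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each row covers a class interval of 10 units (100-109, 110-119, and so on), and the repeated value 146 shows up as three separate leaves on the stem 14.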

  11. Measures of Central Tendency • Used to indicate one value that best represents a group of values • Mean (Average) • Add all numbers and divide by the number of values • Affected greatly by outliers (values that are significantly different from the rest) • Median • Middle value • Place all values in order and choose middle number • For an even # of values, average the 2 middle ones • Not affected as much by outliers • Mode • Most common number • There can be none, one or many modes • Only choice for Qualitative data
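A short sketch of the three measures using Python's standard `statistics` module (`multimode` needs Python 3.8 or later). The numbers and colours below are hypothetical, chosen only to show how one outlier moves the mean much more than the median.

```python
from statistics import mean, median, multimode

values = [2, 3, 3, 4, 5, 5, 6]      # hypothetical data set
with_outlier = values + [60]         # same data plus one outlier

mean_before, mean_after = mean(values), mean(with_outlier)
median_before, median_after = median(values), median(with_outlier)
modes = multimode(values)            # returns every most-common value

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("modes: ", modes)

# For qualitative data only the mode makes sense.
colours = ["brown", "blue", "brown", "green"]   # hypothetical eye colours
colour_modes = multimode(colours)
print("most common colour:", colour_modes)
```

Adding the outlier moves the mean from 4 to 11, while the median only moves from 4 to 4.5.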

  12. Displaying Data – Bar Graphs • Typically used for qualitative/discrete data • Shows how certain categories compare • Why are the bars separated? • Would it be incorrect if you didn’t separate them? • [Figure: Number of police officers in Crimeville, 1993 to 2001]

  13. Bar graphs (cont’d) • Stacked bar graph • Compares 2 variables • Can be scaled to 100% • Double bar graph • Compares 2 sets of data • [Figure: Internet use at Redwood Secondary School, by sex, 1995 to 2002]

  14. Displaying Data - Histograms • Typically used for Continuous data • The bars are attached because the x-axis represents intervals • Choice of class interval size is important. Why?

  15. Displaying Data – Pie / Circle Graphs • A circle divided up to represent the data • Shows each category as a portion of the whole • See p. 8 of the text for an example of creating these by hand

  16. Scatter Plot • A scatter plot shows the relationship between two numeric variables • This relationship, called a correlation, can be positive, negative or none • A line or curve of best fit (regression line) can be used to model the relationship
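One common way to get a line of best fit is least squares, sketched below. The slide does not say which method the course uses, so treat this as an assumption; the function name `best_fit_line` and both data sets are hypothetical.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied vs. mark, and hours studied vs. absences.
hours = [1, 2, 3, 4, 5]
marks = [52, 60, 61, 70, 78]
absences = [10, 8, 6, 4, 2]

slope_marks, intercept_marks = best_fit_line(hours, marks)
slope_abs, intercept_abs = best_fit_line(hours, absences)

print(f"marks    = {slope_marks:.1f} * hours + {intercept_marks:.1f}")
print(f"absences = {slope_abs:.1f} * hours + {intercept_abs:.1f}")
```

The sign of the slope matches the direction of the correlation: positive for the marks data, negative for the absences data.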

  17. Examining Trends • A line graph shows long-term trends over time • e.g. stock price, currency, moving average

  18. Examining the spread of data • A box and whisker plot shows the spread of data • The quartiles divide the data into 4 sections with 25% of the data in each • Instructions for creating these may be found on page 9 of the text or at: http://regentsprep.org/Regents/math/data/boxwhisk.htm
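The five values a box and whisker plot is drawn from can be sketched as follows. This assumes the "median of each half" method for the quartiles; textbooks and software use several slightly different quartile rules, so check which one the text uses. The function name `five_number_summary` is made up here, and the data is the 30-value set from the stem and leaf slide.

```python
from statistics import median

# Values copied from the stem and leaf slide of this presentation.
data = [101, 103, 107, 112, 114, 115, 115, 121, 123, 125,
        127, 127, 133, 134, 134, 136, 137, 138, 141, 144,
        146, 146, 146, 152, 152, 154, 159, 165, 167, 168]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the number of values is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


summary = five_number_summary(data)
print("min, Q1, median, Q3, max =", summary)
```

The box runs from Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum.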

  19. MSIP / Home Learning • p. 11 #2, 3ab, 4, 7, 8

  20. Mystery Data • Gas prices in the GTA

  21. An example… • These are prices for Internet service packages • Find the mean, median and mode • Determine what type of data this is • Create a suitable frequency table, stem and leaf plot and graph • Data: 13.60 15.60 17.20 16.00 17.50 18.60 18.70 12.20 18.60 15.70 15.30 13.00 16.40 14.30 18.10 18.60 17.60 18.40 19.30 15.60 17.10 18.30 15.20 15.70 17.20 18.10 18.40 12.00 16.40 15.60

  22. Answers… • Mean = 494.30/30 = 16.48 • Median = average of the 15th and 16th values in the ordered list • Median = (16.40 + 17.10)/2 = 16.75 • Mode = 15.60 and 18.60 (three occurrences each) • The data is numerical, so it is at least interval data; it has an absolute starting point (a true zero), so it is ratio data • Decimals are possible, so the data is quantitative and continuous • Given this, a histogram is appropriate
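These answers can be checked with a few lines of Python using the standard `statistics` module. This is only a verification sketch; the variable names are arbitrary, and results are rounded to two decimals because the prices are stored as floating-point numbers.

```python
from statistics import mean, median, multimode

# The 30 Internet service package prices from the previous slide.
prices = [13.60, 15.60, 17.20, 16.00, 17.50, 18.60, 18.70, 12.20, 18.60, 15.70,
          15.30, 13.00, 16.40, 14.30, 18.10, 18.60, 17.60, 18.40, 19.30, 15.60,
          17.10, 18.30, 15.20, 15.70, 17.20, 18.10, 18.40, 12.00, 16.40, 15.60]

total = round(sum(prices), 2)
avg = round(mean(prices), 2)
mid = round(median(prices), 2)       # average of the 15th and 16th ordered values
modes = sorted(multimode(prices))    # every value tied for most frequent

print(f"n      = {len(prices)}")
print(f"sum    = {total:.2f}")
print(f"mean   = {avg:.2f}")
print(f"median = {mid:.2f}")
print(f"modes  = {modes}")
```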

  23. "Having the data is not enough. [You] have to show it in ways people both enjoy and understand." (Hans Rosling) • 1.2 Conclusions and Issues in Two Variable Data • Learning goal: Draw conclusions from two-variable graphs • MSIP / Home Learning: Read pp. 16–19; complete pp. 20–24 #1, 4, 9, 11, 14

  24. What conclusions are possible? • To draw a conclusion, a number of conditions must apply • data must be representative of the population • sample size must be large enough • data must address the question

  25. Types of statistical relationships • Correlation • two variables appear to be related • i.e., a change in one variable is associated with a change in the other • e.g., salary increases as age increases • Causation • a change in one variable is proven to cause a change in the other • usually requires an in-depth study i.e. WE WILL NOT DO THIS IN THIS COURSE!!! • e.g., incidence of cancer among smokers • Do not use the P-word!!!

  26. Example 1 – Split bar graph • Do females like school more than males do?

  27. Example 2 – Is there a correlation between attitude and performance?

  28. Example 3 – Examine all 1046 students

  29. Drawing Conclusions • Do females seem more likely to be interested in student government? • Does gender appear to have an effect on interest in student government? • Is this a correlation? • Is it likely that being female causes interest?

  30. References • Calkins, K. (2003). Definitions, Uses, Data Types, and Levels of Measurement. Retrieved August 23, 2004 from http://www.andrews.edu/~calkins/math/webtexts/stat01.htm • James Cook University (n.d.). ICU Studies Online. Retrieved August 23, 2004 from http://www.jcu.edu.au/studying/services/studyskills/scientific/data.html
