EXPLORING DATA

kovit

Presentation Transcript


  1. EXPLORING DATA

  2. Individuals • Definition of Individuals: objects described by a set of data • Examples: people, animals, things

  3. Variable • Definition of Variable: any characteristic of an individual • A variable can take different values for different individuals

  4. Types of Variables • Categorical Variable: places an individual into one of several groups or categories. Ex: Race, Gender, Grade Level • Quantitative Variable: takes numerical values for which arithmetic operations such as adding and averaging make sense. Ex: Time, Age, GPA

  5. Distribution • Distribution of a variable: the values the variable takes and how often it takes each value • The pattern of variation of a variable is its distribution.

  6. Displaying Distributions with Graphs

  7. Quantitative Graphs • Dot Plot: A simple graph used for small sets of data. • Stem Plot: An effective graph for small to medium size data sets. • Histogram: A common graph of the distribution of one quantitative variable. • Ogive: A relative cumulative frequency graph.

  8. Dot Plots • The number of goals scored by each team in the first round of the California Southern Division V high school playoffs is shown in the following table. [Table and dot plot not reproduced in this transcript. Dot plot title: Goals Scored in California Division V Soccer Playoff; horizontal axis: Number of Goals, 0 to 7]

  9. Stem Plots • Step 1: Separate each observation into a stem, consisting of all but the rightmost digit, and a leaf, the final digit. • Step 2: Write the stems vertically in increasing order from top to bottom and draw a vertical line to the right of the stems. Go through the data, writing each leaf to the right of its stem and spacing the leaves equally. • Step 3: Make a key for the data set.

  10. Caffeine content (in milligrams) for an 8 ounce serving of popular soft drinks
      Brand                     Caffeine  Brand                Caffeine
      A&W Cream Soda            20        IBC Cherry Cola      16
      Barq’s root beer          15        Kick                 38
      Cherry Coca Cola          23        KMX                  36
      Cherry RC Cola            29        Mello Yellow         35
      Coca Cola Classic         23        Mountain Dew         37
      Diet A&W Cream Soda       15        Mr. Pibb             27
      Diet Cherry Coca Cola     23        Nehi Wild Red Soda   33
      Diet Coke                 31        Pepsi One            37
      Diet Dr. Pepper           28        Pepsi Cola           25
      Diet Mellow Yellow        35        RC Edge              47
      Diet Mountain Dew         37        Red Flash            27
      Diet Mr Pibb              27        Royal Crown Cola     29
      Diet Pepsi                24        Ruby Red Squirt      26
      Diet Ruby Red Squirt      26        Sun Drop Cherry      43
      Diet Sun Drop             47        Sun Drop Regular     43
      Diet Sunkist Orange Soda  28        Sunkist Orange Soda  28
      Diet Wild Cherry Pepsi    24        Surge                35
      Dr Nehi                   28        TAB                  31
      Dr. Pepper                28        Wild Cherry Pepsi    25

  11. Stem Plots
      Stem | Leaves
      1 | 5 5 6
      2 | 0 3 3 3 4 4 5 5 6 6 7 7 7 8 8 8 8 8 9 9
      3 | 1 1 3 5 5 5 6 7 7 7 8
      4 | 3 3 7 7
      KEY: 2|3 = 23
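
The three steps for building a stem plot can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `stem_plot` is made up for this example, and the values are typed in from the caffeine table on slide 10. Grouping all 38 table values this way puts 20 leaves on stem 2.

```python
from collections import defaultdict

# Caffeine content (mg per 8 oz serving), typed in from the table on slide 10.
caffeine = [
    20, 15, 23, 29, 23, 15, 23, 31, 28, 35, 37, 27, 24, 26, 47, 28, 24, 28, 28,
    16, 38, 36, 35, 37, 27, 33, 37, 25, 47, 27, 29, 26, 43, 43, 28, 35, 31, 25,
]


def stem_plot(values):
    """Return the rows of a stem plot as strings such as '1 | 5 5 6'.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        # Step 1: the stem is all but the last digit, the leaf is the last digit.
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    # Step 2: list the stems in increasing order (empty stems included).
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(f"{stem} | {row}".rstrip())
    return rows


rows = stem_plot(caffeine)
for row in rows:
    print(row)
print("KEY: 2|3 = 23")  # Step 3: the key
```

Sorting the values first means the leaves on each stem come out in increasing order, which is how the slide displays them.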

  12. Split Stem Plots
      Stem | Leaves
      1 |
      1 | 5 5 6
      2 | 0 3 3 3 4 4
      2 | 5 5 6 6 7 7 7 8 8 8 8 8 9 9
      3 | 1 1 3
      3 | 5 5 5 6 7 7 7 8
      4 | 3 3
      4 | 7 7
      KEY: 2|3 = 23

  13. Tips for when to use a stem plot • When you split stems, be sure each stem is assigned an equal number of possible leaf digits. • There is no magic number for how many stems to use. If you use too few you will get a skyscraper look, and if you use too many the plot will be very flat. • Five stems is a good minimum. • You can get more flexibility by rounding the data so that the final digit after rounding is suitable as a leaf. Do this when the data has too many digits.

  14. Histograms • Age of presidents at inauguration
      President      Age  President       Age
      Washington     57   Arthur          51
      J. Adams       61   Cleveland       47
      Jefferson      57   B. Harrison     55
      Madison        57   Cleveland       55
      Monroe         57   McKinley        54
      J.Q. Adams     57   T. Roosevelt    42
      Jackson        61   Taft            51
      Van Buren      54   Wilson          56
      W.H. Harrison  68   Harding         55
      Tyler          51   Coolidge        51
      Polk           49   Hoover          54
      Taylor         64   F.D. Roosevelt  51
      Fillmore       50   Truman          60
      Pierce         48   Eisenhower      61
      Buchanan       65   Kennedy         43
      Lincoln        52   L.B. Johnson    55
      A. Johnson     56   Nixon           56
      Grant          46   Ford            61
      Hayes          54   Carter          52
      Garfield       49   Reagan          69
                          G. Bush         64
                          Clinton         46
                          G.W. Bush       54
      Class  Count
      40-44  2
      45-49  6
      50-54  13
      55-59  12
      60-64  7
      65-69  3
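
As a rough check on the class counts above, the ages can be tallied into equal-width classes with a short Python sketch. The function name `class_counts` and its parameters are invented for this illustration, and the ages are typed in from the slide's table; it is not how the slides' author or a calculator builds the histogram.

```python
# Ages at inauguration, typed in from the table on slide 14
# (Washington through Garfield, Arthur through Reagan, then the last three).
ages = [
    57, 61, 57, 57, 57, 57, 61, 54, 68, 51, 49, 64, 50, 48, 65, 52, 56, 46, 54, 49,
    51, 47, 55, 55, 54, 42, 51, 56, 55, 51, 54, 51, 60, 61, 43, 55, 56, 61, 52, 69,
    64, 46, 54,
]


def class_counts(values, start, width, n_classes):
    """Count how many values fall in each equal-width class.

    Hypothetical helper: class i covers start + i*width up to
    start + (i+1)*width - 1 for integer data.
    """
    counts = [0] * n_classes
    for v in values:
        index = (v - start) // width
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


counts = class_counts(ages, start=40, width=5, n_classes=6)
for i, count in enumerate(counts):
    low = 40 + 5 * i
    # A row of asterisks gives a crude sideways histogram.
    print(f"{low}-{low + 4}: {count:2d} {'*' * count}")
```

Because every class has the same width, the length of each row of asterisks can be compared directly, which is the point of the equal-width tip on the histogram tips slide.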

  15. Histograms

  16. Histogram Tips • There is no magic number for how many classes to use. If you use too few you will get a skyscraper look, and if you use too many the histogram will be very flat. • Five classes is a good minimum. • Make sure all classes are the same width so that we can talk about the distribution based on the height of each class. • Be careful when using your calculator to make histograms because it will choose its own class size, which may not be appropriate for what you want.

  17. OGIVE • An ogive is a relative cumulative frequency graph. • See pages 28-29 to see how to make an ogive.
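
The numbers behind an ogive can be sketched from the class counts on slide 14. This is an illustrative computation only, not the textbook procedure the slide refers to; the variable names are assumptions made for this example.

```python
from itertools import accumulate

# Classes and counts from the presidents' ages table on slide 14.
class_labels = ["40-44", "45-49", "50-54", "55-59", "60-64", "65-69"]
class_counts = [2, 6, 13, 12, 7, 3]

total = sum(class_counts)
# Running total of the counts: how many observations fall in this class or below.
cumulative = list(accumulate(class_counts))
# Divide by the total to get the relative cumulative frequency.
relative_cumulative = [c / total for c in cumulative]

for label, c, r in zip(class_labels, cumulative, relative_cumulative):
    print(f"{label}: cumulative {c:2d}, relative cumulative {r:.1%}")
```

Plotting each relative cumulative frequency against the upper end of its class and connecting the points would give the ogive; the last point always reaches 100%.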

  18. Categorical Graphs • Bar Graph: Similar to Histogram but used for Categorical Data. • Pie Graph: Categorical Data separated into percentages.

  19. Bar Graphs
      Company            Cases Sold (millions)
      Coca-Cola          4377.5
      Pepsi Cola         3119.5
      Dr. Pepper         1455.1
      Cott Corp          310
      National Beverage  205
      Royal Crown        115.4
      Other              347.5

  20. How to construct a bar graph • Step 1: Label your axes and title your graph. • Step 2: Scale your axes. Use the counts in each category to help you scale the vertical axis. • Step 3: Draw a vertical bar above each category to a height corresponding with the count in that category.
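
A minimal text-only sketch of these steps, using the cases-sold figures from slide 19. The function name `text_bar_graph` and the scale of one `#` per 100 million cases are assumptions chosen for this example, and the bars are drawn sideways because that is simpler in plain text than the vertical bars the slide describes.

```python
# Cases sold (millions), typed in from the table on slide 19.
cases_sold = {
    "Coca-Cola": 4377.5,
    "Pepsi Cola": 3119.5,
    "Dr. Pepper": 1455.1,
    "Cott Corp": 310,
    "National Beverage": 205,
    "Royal Crown": 115.4,
    "Other": 347.5,
}


def text_bar_graph(data, scale=100.0):
    """Return one labeled row per category; each '#' stands for `scale` units."""
    width = max(len(name) for name in data)  # align the category labels
    rows = []
    for name, value in data.items():
        bar = "#" * round(value / scale)  # bar length proportional to the value
        rows.append(f"{name:<{width}} | {bar} {value}")
    return rows


rows = text_bar_graph(cases_sold)
print("Cases sold (millions), one # per 100 million")  # title and scale
for row in rows:
    print(row)
```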

  21. Pie Graphs
      Company            Cases Sold (millions)
      Coca-Cola          4377.5
      Pepsi Cola         3119.5
      Dr. Pepper         1455.1
      Cott Corp          310
      National Beverage  205
      Royal Crown        115.4
      Other              347.5
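
A pie graph needs each category's share of the whole. The sketch below, which is an illustration rather than anything from the original slides, converts the cases-sold figures into percentages and the matching slice angles; the variable names are assumptions.

```python
# Cases sold (millions), typed in from the table on slide 21.
cases_sold = {
    "Coca-Cola": 4377.5,
    "Pepsi Cola": 3119.5,
    "Dr. Pepper": 1455.1,
    "Cott Corp": 310,
    "National Beverage": 205,
    "Royal Crown": 115.4,
    "Other": 347.5,
}

total = sum(cases_sold.values())
# Each slice's percentage of the whole, and its angle out of 360 degrees.
percentages = {name: 100 * value / total for name, value in cases_sold.items()}
angles = {name: 360 * value / total for name, value in cases_sold.items()}

for name in cases_sold:
    print(f"{name}: {percentages[name]:.1f}% ({angles[name]:.0f} degrees)")
```

The percentages must add up to 100, which is why a pie graph only makes sense when the categories together make up the whole (here, the "Other" row covers the remaining companies).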

  22. Other graphs • Time Plot: Use to follow trends based on time.

  23. Time Plot • A time plot of a variable plots each observation against the time at which it was measured. • Time is always on the horizontal axis. • Connecting the points on the graph helps to visualize the trend of the data over time.

  24. Overall Pattern of a Distribution • Center: The middle of the distribution • Spread: How far apart the data values are • Shape: Flat, Bell Shaped, Skewed, Bi-Modal • Outlier: An individual observation that falls outside the overall pattern of the data
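
Center and spread can be put into numbers in several ways; the slides do not name a specific measure here, so the sketch below simply assumes the median for center and the range (largest minus smallest) for spread, applied to the presidents' ages from slide 14.

```python
import statistics

# Ages at inauguration, typed in from the table on slide 14.
ages = [
    57, 61, 57, 57, 57, 57, 61, 54, 68, 51, 49, 64, 50, 48, 65, 52, 56, 46, 54, 49,
    51, 47, 55, 55, 54, 42, 51, 56, 55, 51, 54, 51, 60, 61, 43, 55, 56, 61, 52, 69,
    64, 46, 54,
]

center = statistics.median(ages)  # middle value of the sorted ages
smallest, largest = min(ages), max(ages)
spread = largest - smallest       # range: one simple measure of spread

print(f"Center (median): {center}")
print(f"Spread (range): {smallest} to {largest}, a range of {spread} years")
```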

  25. More about shape • Symmetric: The right and left sides of the distribution are approximately mirror images of each other

  26. Skewness

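  27. Shapes: Skewed Right and Left • Right Skewed: Skewed to the right means that the right tail (the larger values) is long and thin, extending much farther out than the left tail • Left Skewed: Skewed to the left means that the left tail (the smaller values) is long and thin, extending much farther out than the right tail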
