Important Ideas in Data Analysis for PreK-12 Students, Teachers, and Teacher Educators

Denise S. Mewborn, University of Georgia

Data analysis/statistics… • helps us answer questions. • helps us make better decisions. • helps us describe and understand our world. • helps us quantify variability.

What questions can we ask? • Where are you from? • How did you get here? • How long are you staying?/What day are you leaving? • How many times have you been to TEAM? • What is your day job?

Answering our questions • Collect data • Make a graph

Wait! There’s more!!!! • Analyze and interpret data • Answer the original question • Make inferences • Make predictions • What other questions can we answer with this data display?

Standards 2000 Instructional programs should enable all students to: • formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them; • select and use appropriate statistical methods to analyze data; • develop and evaluate inferences and predictions that are based on data.

GAISE http://www.amstat.org/education/gaise/

Statistical Problem Solving • Formulate Questions • clarify the problem at hand • formulate question(s) that can be answered with data • Collect Data • design a plan to collect appropriate data • employ the plan to collect the data • Analyze Data • select appropriate graphical or numerical methods • use these methods to analyze the data • Interpret Results • interpret the analysis • relate the interpretation to the original question

Main Points • We are not asking enough of students!!! • We are not providing them with rich enough experiences in data analysis to enable them to move confidently into higher grades or to make sense of the world. • Statistics is an opportunity to APPLY lots of other mathematical ideas in a context. • We need to end the “mean-median-mode ad nauseam” pattern we’ve been using.

Big ideas that need more attention • Context • Why do we want to know these things? • Variability • natural vs. induced • Inference, prediction

THE GAISE FRAMEWORK MODEL: Process Component by Level
Formulate Question • Level A: Beginning awareness of the statistics question distinction • Level B: Increased awareness of the statistics question distinction • Level C: Students can make the statistics question distinction
Collect Data • Level A: Do not yet design for differences • Level B: Awareness of design for differences • Level C: Students make designs for differences
Analyze Data • Level A: Use particular properties of distributions in context of specific example • Level B: Learn to use particular properties of distributions as tools of analysis • Level C: Understand and use distributions in analysis as a global concept
Interpret Results • Level A: Do not look beyond the data • Level B: Acknowledge that looking beyond the data is feasible • Level C: Able to look beyond the data in some contexts

THE FRAMEWORK MODEL: Variability by Level
Nature of Variability • Level A: Measurement variability, natural variability, induced variability • Level B: Sampling variability • Level C: Chance variability
Focus on Variability • Level A: Variability within a group • Level B: Variability within a group and variability between groups, co-variability • Level C: Variability in model fitting

Classroom Census • Most common and most appropriate type of data collection for PreK-5 • Involves collecting and analyzing data about us/our classroom • Examples • Favorite ______ • Type of shoes • Lunch count • Weather • Birthdays • Bus riders/car riders/walkers

Type of shoes we’re wearing • What is the most popular type of shoe in our class today?

Pushing to higher levels • Formulate questions • Allow children to generate questions from a context • Tie shoes vs. not tie shoes • Tie shoes, slip-on shoes, buckle shoes • Shoe color • Type of soles • Material from which shoe is made

Pushing… • Collect data • What data do we need in order to answer our question? • How could we get this data? • Use actual shoes • Raise hands and count • Use Unifix cubes to make towers • Use sticky notes to make a graph

Pushing… • Analyze data • Decide on an appropriate graphical representation • Describe the shape of the distribution • Locate individuals within group data

Pushing… • Interpret results • Answer the original question • Make inferences • Why might so many people be wearing tie shoes today? • Make predictions • Would you expect the same results if we collected this data in December? • Would we get the same results if we collected data from Ms. Murphy’s class? • Would we get the same results if we went to <local business> and collected data?

Pushing… • Extending to new problems • What other questions could we answer with this data? • How many more people are wearing tie shoes than slip-on shoes? • How many people are wearing tie shoes or buckle shoes?
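The tally questions above can be checked with a short Python sketch. The shoe data here is invented for illustration (the slides give no class data), and the category names are assumptions.

```python
from collections import Counter

# Hypothetical classroom data: one entry per child (not from the slides).
shoes = ["tie", "tie", "slip-on", "buckle", "tie", "slip-on",
         "tie", "buckle", "tie", "slip-on", "tie", "tie"]

counts = Counter(shoes)  # tally of each shoe type

# Original question: what is the most popular type of shoe today?
most_popular, most_count = counts.most_common(1)[0]

# Extension questions from the slide.
more_tie_than_slip_on = counts["tie"] - counts["slip-on"]
tie_or_buckle = counts["tie"] + counts["buckle"]

print(counts)
print(most_popular, more_tie_than_slip_on, tie_or_buckle)
```

With this made-up class, tie shoes are the most popular, 4 more children wear tie shoes than slip-on shoes, and 9 children wear tie or buckle shoes.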

Simple Experiment • Science experiment • Beans grown in dark or light • Comparison of 2 existing items • Sugar content in bubble gum vs. minty gum

Simple experiment • Formulate questions • What things affect how well a bean plant grows? (light, soil, water, temperature) • What does it mean that a bean “grows well?” • Which condition are we most interested in investigating?

Simple experiment • Collect Data • Plan the experiment • Decide what data to collect (height of beans) • How will we collect it? (ruler–inches vs. centimeters, Unifix cubes, string) • When will we collect it? • Conduct the experiment

Simple experiment • Analyze Data • Dot plot • Did all beans from one condition grow better than all beans from the other condition? • Answer the original question.

Simple experiment • Interpret Results • Does this fit with what you know and observe about growing flowers, plants, and vegetables? • Why didn’t some beans in the light sprout at all? • Does this mean we can’t grow plants inside? • Predict • Does it matter what kind of seeds we use? • Extend • How much taller was the tallest bean than the shortest bean?
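The analysis and extension questions for the bean experiment can be sketched in Python. The heights below are invented for illustration, the unit (centimeters) is an assumption, and `dot_plot` is a hypothetical helper that draws a rough text dot plot.

```python
# Hypothetical bean heights in centimeters (invented for illustration).
light = [0, 0, 9, 11, 12, 12, 14, 15]   # 0 means the bean did not sprout
dark = [0, 3, 4, 4, 5, 6, 6, 8]

def dot_plot(values, low, high):
    """Return a text dot plot: one column per value, one '*' per observation."""
    counts = [values.count(v) for v in range(low, high + 1)]
    rows = []
    for level in range(max(counts), 0, -1):
        rows.append(" ".join(" *" if c >= level else "  " for c in counts))
    rows.append(" ".join(f"{v:2d}" for v in range(low, high + 1)))
    return "\n".join(rows)

for name, heights in (("light", light), ("dark", dark)):
    print(name)
    print(dot_plot(heights, 0, 15))

# Did ALL beans in the light grow taller than ALL beans in the dark?
all_light_taller = min(light) > max(dark)

# Extension: how much taller was the tallest bean than the shortest bean?
all_heights = light + dark
height_range = max(all_heights) - min(all_heights)
print(all_light_taller, height_range)
```

In this made-up data the two unsprouted beans in the light mean the answer to the "all beans" question is no, even though most light beans are taller. That overlap is the kind of variability within and between groups the dot plot is meant to show.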

Evolution of the mean • Level A: fair share • Level B: balance point of a distribution • Level C: distribution of sample means • The Family Size Problem: How large are families today?

Level A • 9 children each represent their family size with cubes

2 3 3 4 4 5 6 7 9

How many people would be in each family if they were all the same size (i.e., no variability)?

All 43 Family Members

Results • Fair share value • Leads to algorithm for the mean
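A minimal sketch of the fair share idea as arithmetic, using the nine family sizes from the Level A slide. The variable names are illustrative only.

```python
family_sizes = [2, 3, 3, 4, 4, 5, 6, 7, 9]  # the nine stacks of cubes

total = sum(family_sizes)   # pool all the cubes together
n = len(family_sizes)       # number of families

# Deal the cubes out evenly: whole cubes per family, plus any left over.
fair_share, left_over = divmod(total, n)

mean = total / n            # the same sharing done as a single division
print(total, fair_share, left_over, round(mean, 2))  # prints: 43 4 7 4.78
```

With these nine values the 43 cubes do not share out evenly: each family gets 4 cubes with 7 left over, so the mean is 43/9, about 4.78.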

Upside down and backward • What if the mean is 6? • What could the 9 families look like?
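Working backward, nine families have a mean of 6 exactly when their sizes add up to 9 × 6 = 54. A small sketch of that check follows. The candidate sets are made up for illustration, except the last, which is the Level A data, and `has_fair_share` is a hypothetical helper name.

```python
def has_fair_share(sizes, target):
    """True when the cubes in `sizes` can be shared out as `target` per family."""
    return sum(sizes) == target * len(sizes)

candidates = [
    [6, 6, 6, 6, 6, 6, 6, 6, 6],   # hypothetical: no variability at all
    [2, 3, 5, 6, 6, 7, 8, 8, 9],   # hypothetical: same mean, more spread
    [2, 3, 3, 4, 4, 5, 6, 7, 9],   # the Level A data (total 43, not 54)
]
for sizes in candidates:
    print(sizes, has_fair_share(sizes, 6))
```

Many different sets pass the check, which is the point of the question: the mean alone does not say what the families look like.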

Two Examples with Fair Share Value of 6. Which group is “closer” to being “fair?”

How might we measure “how close” a group of numeric data is to being fair?

Which group is “closer” to being “fair?” The blue group is closer to fair since it requires only one “step” to make it fair. The lower group requires two “steps.”

How do we define a “step?” • When a snap cube is removed from a stack higher than the fair share value and placed on a stack lower than the fair share value, we count a step. • “Fairness” ~ number of steps to make it fair • Fewer steps is closer to fair
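The step count can be sketched in Python. Every cube above the fair share value has to move exactly once, so the number of steps is the total excess above the fair share. The two groups below are hypothetical (the slide's cube stacks are not recoverable from the text), and the function assumes the fair share is a whole number.

```python
def steps_to_fair(stacks):
    """Count single-cube moves needed to level every stack at the fair share.

    Assumes the fair share value is a whole number of cubes.
    """
    total, n = sum(stacks), len(stacks)
    if total % n != 0:
        raise ValueError("fair share is not a whole number of cubes")
    fair = total // n
    # Each cube above the fair share moves once to a stack below it.
    return sum(s - fair for s in stacks if s > fair)

# Hypothetical groups, each with a fair share value of 6.
group_one = [5, 6, 6, 6, 6, 6, 6, 6, 7]
group_two = [5, 5, 6, 6, 6, 6, 6, 7, 7]
print(steps_to_fair(group_one), steps_to_fair(group_two))  # prints: 1 2
```

Under this definition the first group is closer to fair, since it needs one step and the second needs two.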

Number of Steps to Make Fair: 8 Number of Steps to Make Fair: 9

Students completing Level A understand: • the notion of “fair share” for a set of numeric data • the fair share value is also called the mean value • the algorithm for finding the mean • the notion of “number of steps” to make fair as a measure of variability about the mean • the fair share/mean value provides a basis for comparison between two groups of numerical data with different sizes (thus can’t use total)

Level B • Balance point • Developing measures of variation about the mean

Create different dot plots of nine families with a mean of 6.

[Blank dot plot axis: family size, 2 to 10]

[Two dot plot axes: family size, 2 to 10]

In which group do the data (family sizes) vary (differ) more from the mean value of 6?

[Two dot plots: family size, 2 to 10, each shown with a row of numbers] Numbers on the first plot: 1 2 4 2 1 0 1 2 3. Numbers on the second plot: 0 0 4 3 2 0 2 3 4.
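One way to answer the question is to total each group's distances from the mean. The sketch below assumes the nine numbers shown with each dot plot are each family's distance from the mean of 6. The slide text does not say so, and the names `sad` and `mad` (sum and mean of absolute deviations) are labels introduced here.

```python
# Numbers shown with the two dot plots, read here as distances from the
# mean of 6 (an assumption about the figure, not stated on the slide).
first_group = [1, 2, 4, 2, 1, 0, 1, 2, 3]
second_group = [0, 0, 4, 3, 2, 0, 2, 3, 4]

def sad(distances):
    """Sum of the absolute deviations from the mean."""
    return sum(distances)

def mad(distances):
    """Mean absolute deviation: the average distance from the mean."""
    return sum(distances) / len(distances)

for name, group in (("first", first_group), ("second", second_group)):
    print(name, sad(group), round(mad(group), 2))
```

Under that reading the totals are 16 and 18, so the second group varies more from the mean. Each total is also twice the earlier step counts of 8 and 9, which would be consistent if these are the same two groups: one step moves a cube from above the mean to below it and reduces the total distance by 2.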
