# Important Ideas in Data Analysis for PreK-12 Students, Teachers, and Teacher Educators



## Important Ideas in Data Analysis for PreK-12 Students, Teachers, and Teacher Educators

Denise S. Mewborn

University of Georgia

### Data analysis/statistics…

• helps us answer questions.

• helps us make better decisions.

• helps us describe and understand our world.

• helps us quantify variability.

### What questions can we ask?

• Where are you from?

• How did you get here?

• How long are you staying?/What day are you leaving?

• How many times have you been to TEAM?

• What is your day job?

• Collect data

• Make a graph

### Wait! There’s more!!!!

• Analyze and interpret data

• Make inferences

• Make predictions

• What other questions can we answer with this data display?

### Standards 2000

Instructional programs should enable all students to–

• formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them;

• select and use appropriate statistical methods to analyze data;

• develop and evaluate inferences and predictions that are based on data;

### GAISE

http://www.amstat.org/education/gaise/

### Statistical Problem Solving

• Formulate Questions

• clarify the problem at hand

• formulate question(s) that can be answered with data

• Collect Data

• design a plan to collect appropriate data

• employ the plan to collect the data

• Analyze Data

• select appropriate graphical or numerical methods

• use these methods to analyze the data

• Interpret Results

• interpret the analysis

• relate the interpretation to the original question

### Main Points

• We are not asking enough of students!!!

• We are not providing them with rich enough experiences in data analysis to enable them to move confidently into higher grades or to make sense of the world.

• Statistics is an opportunity to APPLY lots of other mathematical ideas in a context.

• We need to end the “mean-median-mode ad nauseam” pattern we’ve been using.

### Big ideas that need more attention

• Context

• Why do we want to know these things?

• Variability

• natural vs. induced

• Inference, prediction

| Process Component | Level A | Level B | Level C |
|---|---|---|---|
| Formulate Question | Beginning awareness of the statistics question distinction | Increased awareness of the statistics question distinction | Students can make the statistics question distinction |
| Collect Data | Do not yet design for differences | Awareness of design for differences | Students make designs for differences |
| Analyze Data | Use particular properties of distributions in context of specific example | Learn to use particular properties of distributions as tools of analysis | Understand and use distributions in analysis as a global concept |
| Interpret Results | Do not look beyond the data | Acknowledge that looking beyond the data is feasible | Able to look beyond the data in some contexts |

### THE GAISE FRAMEWORK MODEL

| | Level A | Level B | Level C |
|---|---|---|---|
| Nature of Variability | Measurement variability; Natural variability; Induced variability | Sampling variability | Chance variability |
| Focus on Variability | Variability within a group | Variability within a group and variability between groups; Co-variability | Variability in model fitting |

### THE FRAMEWORK MODEL

Most common and most appropriate type of data collection for PreK-5

Involves collecting and analyzing data about us/our classroom

Examples:

• Favorite ______

• Type of shoes

• Lunch count

• Weather

• Birthdays

• Bus riders/car riders/walkers

### Type of shoes we’re wearing

• What is the most popular type of shoe in our class today?

### Pushing to higher levels

• Formulate questions

• Allow children to generate questions from a context

• Tie shoes vs. not tie shoes

• Tie shoes, slip-on shoes, buckle shoes

• Shoe color

• Type of soles

• Material from which shoe is made

### Pushing…

• Collect data

• What data do we need in order to answer our question?

• How could we get this data?

• Use actual shoes

• Raise hands and count

• Use Unifix cubes to make towers

• Use sticky notes to make a graph

### Pushing…

• Analyze data

• Decide on an appropriate graphical representation

• Describe the shape of the distribution

• Locate individuals within group data

### Pushing…

• Interpret results

• Make inferences

• Why might so many people be wearing tie shoes today?

• Make predictions

• Would you expect the same results if we collected this data in December?

• Would we get the same results if we collected data from Ms. Murphy’s class?

• Would we get the same results if we went to <local business> and collected data?

### Pushing…

• Extending to new problems

• What other questions could we answer with this data?

• How many more people are wearing tie shoes than slip-on shoes?

• How many people are wearing tie shoes or buckle shoes?
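The questions on the shoe slides can be answered from a simple tally. The sketch below is only an illustration: the class data is made up, and the category names (`tie`, `slip-on`, `buckle`) are taken from the slide.

```python
from collections import Counter

# Hypothetical class data, one entry per student (not from the presentation).
shoes = ["tie", "tie", "slip-on", "buckle", "tie",
         "slip-on", "tie", "buckle", "tie"]

counts = Counter(shoes)

# What is the most popular type of shoe in our class today?
most_popular, _ = counts.most_common(1)[0]

# How many more people are wearing tie shoes than slip-on shoes?
tie_minus_slip_on = counts["tie"] - counts["slip-on"]

# How many people are wearing tie shoes or buckle shoes?
tie_or_buckle = counts["tie"] + counts["buckle"]

print(most_popular, tie_minus_slip_on, tie_or_buckle)
```

The same tally supports new questions without collecting new data, which is the point of the "extending" step.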

### Simple Experiment

• Science experiment

• Beans grown in dark or light

• Comparison of 2 existing items

• Sugar content in bubble gum vs. minty gum

### Simple experiment

• Formulate questions

• What things affect how well a bean plant grows? (light, soil, water, temperature)

• What does it mean that a bean “grows well?”

• Which condition are we most interested in investigating?

### Simple experiment

• Collect Data

• Plan the experiment

• Decide what data to collect (height of beans)

• How will we collect it? (ruler–inches vs. centimeters, Unifix cubes, string)

• When will we collect it?

• Conduct the experiment

### Simple experiment

• Analyze Data

• Dot plot

• Did all beans from one condition grow better than all beans from the other condition?
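One way to explore the dot plot question is to draw both groups on a shared scale and check whether they overlap. This is a rough sketch with invented heights (in centimeters); a height of 0 stands for a bean that did not sprout. The `dot_plot` helper is hypothetical, not part of any library.

```python
# Hypothetical bean heights in centimeters (illustrative only).
light = [0, 9, 11, 12, 14, 15]
dark = [5, 7, 8, 10, 12]


def dot_plot(values, low, high):
    """Return one text line per axis value, with one dot per observation."""
    return "\n".join(
        f"{v:>2} | " + "•" * values.count(v) for v in range(low, high + 1)
    )


# "All beans in one condition grew better" is true only if the groups
# do not overlap at all.
all_light_taller = min(light) > max(dark)
all_dark_taller = min(dark) > max(light)

print("Light:\n" + dot_plot(light, 0, 15))
print("Dark:\n" + dot_plot(dark, 0, 15))
print(all_light_taller, all_dark_taller)
```

With overlapping groups, both checks are false, which opens the discussion of variability within and between conditions.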

### Simple experiment

• Interpret Results

• Does this fit with what you know and observe about growing flowers, plants, and vegetables?

• Why didn’t some beans in the light sprout at all?

• Does this mean we can’t grow plants inside?

• Predict

• Does it matter what kind of seeds we use?

• Extend

• How much taller was the tallest bean than the shortest bean?

### Evolution of the mean

• Level A: fair share

• Level B: balance point of a distribution

• Level C: distribution of sample means

• The Family Size Problem: How large are families today?

### Level A

• 9 children each represent their family size with cubes

2 3 3 4 4 5 6 7 9

### How many people would be in each family if they were all the same size (i.e., no variability)?

All 43 Family Members

### Results

• Fair share value

• Leads to algorithm for the mean
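As a small worked example, the fair share idea can be applied to the nine family sizes from the Level A slide. Pooling all the cubes and dealing them out evenly is the algorithm for the mean; the variable names below are illustrative.

```python
# Family sizes from the Level A slide.
family_sizes = [2, 3, 3, 4, 4, 5, 6, 7, 9]

# Pool all the cubes, then deal them out evenly.
total = sum(family_sizes)
fair_share, leftover = divmod(total, len(family_sizes))

# The mean is the same idea without the whole-cube restriction.
mean = total / len(family_sizes)

print(total, fair_share, leftover, round(mean, 2))
```

Here the 43 cubes give each of the 9 families 4 cubes with 7 left over, so the mean falls between 4 and 5.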

### Upside down and backward

• What if the mean is 6?

• What could the 9 families look like?
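Working backward from the mean amounts to finding nine family sizes whose total is 9 × 6 = 54. The sketch below checks a few candidates; the first two are made-up examples, and the third is the Level A data, which does not qualify.

```python
def has_mean(sizes, target_mean, n):
    """True if there are n values and their total equals n * target_mean."""
    return len(sizes) == n and sum(sizes) == target_mean * n


candidates = {
    "no variability": [6] * 9,                         # hypothetical
    "some spread": [3, 4, 5, 6, 6, 6, 7, 8, 9],        # hypothetical
    "Level A data": [2, 3, 3, 4, 4, 5, 6, 7, 9],       # total is 43, not 54
}

results = {name: has_mean(sizes, 6, 9) for name, sizes in candidates.items()}
print(results)
```

Many different groups share the same mean, which motivates asking how far each group is from "fair."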

Two Examples with Fair Share Value of 6.

Which group is “closer” to being “fair?”

How might we measure “how close” a group of numeric data is to being fair?

Which group is “closer” to being “fair?”

The blue group is closer to fair since it requires only one “step” to make it fair. The lower group requires two “steps.”

How do we define a “step?”

When a snap cube is removed from a stack higher than the fair share value and placed on a stack lower than the fair share value, we count a step.

“Fairness” ~ number of steps to make it fair

Fewer steps is closer to fair.

Number of Steps to Make Fair: 8

Number of Steps to Make Fair: 9
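Counting steps can be sketched as adding up the cubes that sit above the fair share value, since each of those cubes must move exactly once. The two groups below are hypothetical stand-ins for the lost figures, chosen so that one needs a single step and the other needs two.

```python
def steps_to_fair(stacks):
    """Count cube moves needed to level all stacks at the fair share value.

    Assumes the fair share is a whole number of cubes.
    """
    total = sum(stacks)
    if total % len(stacks) != 0:
        raise ValueError("fair share is not a whole number of cubes")
    fair = total // len(stacks)
    # Every cube above the fair share value moves once to a shorter stack.
    return sum(height - fair for height in stacks if height > fair)


# Hypothetical groups of nine families, each with a fair share value of 6.
group_one = [5, 6, 6, 6, 6, 6, 6, 6, 7]
group_two = [4, 6, 6, 6, 6, 6, 6, 6, 8]

print(steps_to_fair(group_one), steps_to_fair(group_two))
```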

Students completing Level A understand:

• the notion of “fair share” for a set of numeric data

• the fair share value is also called the mean value

• the algorithm for finding the mean

• the notion of “number of steps” to make fair as a measure of variability about the mean

• the fair share/mean value provides a basis for comparison between two groups of numerical data with different sizes (thus can’t use total)

### Level B

• Balance point

• Developing measures of variation about the mean

### Create different dot plots of nine families with a mean of 6.

[Figure: three blank dot-plot axes for family size, each running from 2 to 10.]

In which group do the data (family sizes) vary (differ) more from the mean value of 6?

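[Figure: two dot plots of family size on an axis from 2 to 10, with each dot labeled by its distance from the mean of 6.]

Distribution 1 distances from the mean: 1, 2, 4, 2, 1, 0, 1, 2, 3

Distribution 2 distances from the mean: 0, 0, 4, 3, 2, 0, 2, 3, 4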

### In Distribution 1, the Total Distance from the Mean is 16. In Distribution 2, the Total Distance from the Mean is 18.

[Figure: dot plot of Distribution 1 on an axis from 2 to 10, with distances from the mean of 1, 2, 4, 2, 1, 0, 1, 2, 3.]

The total distance for the values below the mean of 6 is 8, the same as the total distance for the values above the mean. So, the distribution will “balance” at 6 (the mean).

The SAD is defined to be the Sum of the Absolute Deviations.

Relationship between SAD and Number of Steps to Fair from Level A: SAD = 2 × number of steps

Number of Steps to Make Fair: 8

Number of Steps to Make Fair: 9
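The relationship can be checked against the distances labeled on the two dot plots. Each step removes one unit of distance above the mean and one unit below it, so the number of steps is half the SAD. The variable names are illustrative.

```python
# Distances from the mean of 6, as labeled on the two dot plots.
dist1_abs_dev = [1, 2, 4, 2, 1, 0, 1, 2, 3]
dist2_abs_dev = [0, 0, 4, 3, 2, 0, 2, 3, 4]

# SAD: Sum of the Absolute Deviations.
sad1 = sum(dist1_abs_dev)
sad2 = sum(dist2_abs_dev)

# One step fixes one unit above the mean and one unit below it.
steps1 = sad1 // 2
steps2 = sad2 // 2

print(sad1, steps1)
print(sad2, steps2)
```

The results agree with the totals of 16 and 18 and with the step counts of 8 and 9 on the slides.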

### An Illustration where the SAD doesn’t work!

[Figure: two dot plots on an axis from 2 to 10. In the first, two values each lie at a distance of 4 from the mean. In the second, eight values each lie at a distance of 1 from the mean.]

### The SAD is 8 for each distribution, but in the first distribution the data vary more from the mean. Why doesn’t the SAD work?

Measuring Variation about the Mean

• SAD = Sum of Absolute Deviations

• MAD = Mean of Absolute Deviations

• Variance = Mean of Squared Deviations

• Standard Deviation = Square Root of Variance
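The four measures can be compared on the two distributions from the SAD illustration. The data values below are an assumption reconstructed from the labeled distances (two values at distance 4 from a mean of 6, and eight values at distance 1), and the function name is hypothetical. Variance is computed as the mean of squared deviations, following the definition on the slide.

```python
from math import sqrt


def variation_measures(data):
    """Return SAD, MAD, variance, and standard deviation about the mean."""
    n = len(data)
    mean = sum(data) / n
    sad = sum(abs(x - mean) for x in data)
    variance = sum((x - mean) ** 2 for x in data) / n  # divide by n, per slide
    return {"SAD": sad, "MAD": sad / n, "variance": variance, "SD": sqrt(variance)}


# Assumed data consistent with the labeled distances; both have a mean of 6.
few_far = [2, 10]
many_near = [5, 5, 5, 5, 7, 7, 7, 7]

print(variation_measures(few_far))
print(variation_measures(many_near))
```

Both distributions have a SAD of 8, but the MAD is 4 for the first and 1 for the second. Dividing by the number of values is what lets the measure compare groups of different sizes.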

Summary of Level B and Transitions to Level C

• Mean as the balance point of a distribution

• Mean as a “central” point

• Various measures of variation about the mean

### Level C

• Sampling distribution of the sample means

• Transition from descriptive to inferential statistics

Eighty Circles/What is the Mean Diameter?

Activity

• Choose 10 circles that you think have a diameter close to the mean. Find the mean diameter of your 10 circles.

vs.

• Select random samples of 10 circles and find the mean.
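The random-sampling half of the activity can be simulated. This is a sketch under stated assumptions: the 80 diameters below are invented (the actual circle sheet is not reproduced here), and the seed and the number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, pstdev, stdev

random.seed(0)  # arbitrary seed so the run is repeatable

# Hypothetical population of 80 circle diameters (many small, few large).
population = [1] * 32 + [2] * 24 + [3] * 16 + [5] * 8

# Draw many random samples of 10 circles and record each sample mean.
sample_means = [mean(random.sample(population, 10)) for _ in range(1000)]

print("population mean:", mean(population))
print("mean of sample means:", round(mean(sample_means), 2))
print("population SD:", round(pstdev(population), 2))
print("SD of sample means:", round(stdev(sample_means), 2))
```

The sample means cluster around the population mean and vary less than individual diameters do, which is the idea behind the sampling distribution of the sample mean.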

### Resources

• NCTM Principles and Standards

• GAISE Framework

• Quantitative Literacy series
