Histograms
Kevin Zhou, Joshua Zhong, and Daniel Falvo
What is a histogram?
Does this sound boring?
Is it like 1000 times less than a milligram?
No. It is a diagram that displays the frequency of a continuous independent variable across a series of intervals. It displays how many data points lie in each interval.
Number of events in a given interval
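As a rough sketch of what "how many data points lie in each interval" means, the Python below sorts a handful of arrival times into one-minute intervals and prints a text version of the bars. The arrival times, the interval size, and the helper name `count_by_interval` are all made up for illustration; they are not the data behind the slides.

```python
from collections import Counter

# Hypothetical arrival times, in minutes after 10:00 (made up for illustration).
arrivals = [0.2, 0.5, 0.9, 1.4, 1.8, 2.5, 3.1, 0.7, 4.6, 2.2]
interval_size = 1  # assumed: each interval is one minute wide


def count_by_interval(values, size):
    """Return {interval_start: number of values in [start, start + size)}."""
    counts = Counter(int(v // size) * size for v in values)
    return dict(sorted(counts.items()))


frequencies = count_by_interval(arrivals, interval_size)

# One row per interval; the row of '#' marks stands in for the bar.
for start, count in frequencies.items():
    print(f"{start}-{start + interval_size} min: {'#' * count} ({count})")
```

Each value lands in exactly one interval, so the bar heights always add up to the number of data points.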
But when should you use histograms instead of these?
When you have only one independent variable, you should use histograms.
If you have very large data sets, use histograms to summarize them graphically.
Unless there are outliers.
Are you my kind?
Symmetrical histograms are shaped like mounds.
Am I an outlier?
What type of data is necessary to make a histogram?
Example: How many people arrive at a certain time at the car loop on a two hour delay schedule?
Independent variable: When people arrive
Dependent variable: How many people arrived
Notice how the data is NOT categorical, even though each interval seems like a category! The intervals are made of numbers, and thus, the data is numerical.
Number of students
When kids arrive when there's a two hour delay
This histogram is meaningful because it shows that most people arrive at the car loop early in the case of a two hour delay, so the teachers should pay the most attention from 10:00 to 10:03.
All of the histograms that we’ve seen so far are called frequency histograms.
To create a histogram, first make a data table; then, just make the histogram as you would make a bar graph.
Remember, the difference between histograms and bar graphs is very slight.
Histograms have the rectangles next to each other; bar graphs have them separated.
Histograms have numerical independent variables. Bar graphs have categorical independent variables.
How can we make a relative frequency histogram using the information from the frequency histogram?
Both histograms look the same because the relative frequency histogram shows the proportion of data in each interval: every bar's frequency is divided by the same total, so the bars keep the same shape.
NOTICED HOW BOTH ARE 9 SIMPLE STEPS?
Rounded to the nearest hundredth due to space issues
“0.4” means that 40% of the people arrived between 10:00 and 10:01, so there is an estimated 40% chance that someone would arrive in that interval.
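The conversion from frequencies to relative frequencies can be sketched in a few lines of Python. The frequency table below is hypothetical (it is not the data from the slides); the counts were simply chosen so that the first interval comes out to 0.4, like the example above.

```python
# Hypothetical frequency table: interval -> number of arrivals (made up for illustration).
frequencies = {
    "10:00-10:01": 4,
    "10:01-10:02": 2,
    "10:02-10:03": 2,
    "10:03-10:04": 1,
    "10:04-10:05": 1,
}

total = sum(frequencies.values())

# Relative frequency = frequency / total, rounded to the nearest hundredth.
relative = {interval: round(count / total, 2) for interval, count in frequencies.items()}

for interval, proportion in relative.items():
    print(f"{interval}: {proportion}")
```

Because every count is divided by the same total, the bars keep their relative heights and the proportions add up to 1 (up to rounding).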
What is probability?
Probability is the relative frequency of an event in the long run.
This means that in the long run, the relative frequency histogram should look the same as a probability histogram!
(In a probability histogram, all of the bar heights add up to 1.)
Relative frequency is the proportion of times an event occurs; in the long run, it approaches the event's probability.
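One way to see the "long run" idea is a small simulation. The sketch below assumes, purely for illustration, that the true probability of arriving in the first minute is 0.4, and then checks how close the relative frequency gets as the number of simulated arrivals grows. The constant `TRUE_PROBABILITY` and the function `relative_frequency` are hypothetical names, not part of the slides.

```python
import random

TRUE_PROBABILITY = 0.4  # assumed value, chosen only for illustration


def relative_frequency(trials, rng):
    """Simulate `trials` arrivals and return the share that land in the first minute."""
    hits = sum(1 for _ in range(trials) if rng.random() < TRUE_PROBABILITY)
    return hits / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
for trials in (10, 100, 1000, 100000):
    print(trials, round(relative_frequency(trials, rng), 3))
```

With only a few trials the relative frequency can be far from 0.4; with many trials it should settle close to it.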
For example: the x-axis is arranged by arrival time.
When used in histograms, order just means how the graph is arranged in sequence.
The interval size is a mini-range. Any data point that falls between the minimum and maximum of an interval is counted in that interval's bar.
To select an interval size, it is suggested to find the range of the whole data set (maximum – minimum) and divide this number by 10.
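The rule of thumb above is simple arithmetic, sketched here in Python. The data set and the function name `suggested_interval_size` are hypothetical, and dividing by 10 is only the suggested starting point, not a fixed rule.

```python
def suggested_interval_size(values, target_intervals=10):
    """Rule of thumb: (maximum - minimum) divided by the number of intervals wanted."""
    return (max(values) - min(values)) / target_intervals


# Hypothetical data: minutes before the bell that each person arrived (made up).
arrival_minutes = [0, 1, 1, 2, 3, 5, 8, 12, 15, 20]

print(suggested_interval_size(arrival_minutes))  # (20 - 0) / 10 = 2.0
```

In practice the result is often rounded to a convenient number so the interval boundaries are easy to read.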
When you decrease the interval size, the histogram will look different. There will be more bars, and each bar will usually have a lower frequency because it covers a smaller range.
Interval size 5
Interval size 1
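To make the comparison concrete, the sketch below counts the same data twice, once with interval size 5 and once with interval size 1. The data values and the helper name `bar_heights` are made up for illustration and are not the data shown in the slides.

```python
from collections import Counter


def bar_heights(values, size):
    """Return {interval_start: count} for intervals [start, start + size)."""
    counts = Counter((v // size) * size for v in values)
    return dict(sorted(counts.items()))


# Hypothetical whole-number data (made up for illustration).
data = [0, 1, 1, 2, 3, 4, 5, 6, 6, 7, 8, 9]

wide = bar_heights(data, 5)    # interval size 5: few, tall bars
narrow = bar_heights(data, 1)  # interval size 1: many, short bars

print("Interval size 5:", wide)
print("Interval size 1:", narrow)
```

Both versions count every data point exactly once, so the totals match; only the number of bars and their heights change.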
Also known as: A pop quiz!
Some people would arrive right on the spot, while others would arrive 15 minutes early.
If you imagine that there were more data to the left, wouldn’t it look like a regular bell curve?
Where is the outlier on this histogram?
The answer is “After”. Notice the jump between 10:10 and After.
The reason “10:00-10:01” is not the correct answer is that if we extended the histogram like so...