Tools for Civil Society to Understand and Use Development Data: Improving MDG Policymaking and Monitoring
Module 8: Living with Error
What you will learn from this module:
What causes error in MDG indicators (MDGi’s)
The 3 types of error in MDGi’s, and how they differ
We can identify three types of error in MDG indicators (and other summary statistics):
Bias error is a systematic error that causes all measured values to deviate from the true value in a consistent direction, higher or lower
[Figure: measured values (x) plotted for three cases: 1. Bias (male), 2. Bias (female), 3. No bias; the sample mean (male) is marked X]
Dozenland is the world’s smallest country
Estimate the average income (in Dozenland dollars) per person
How shall we do this?
Using a census (true value)
Using a household sample of size 4
Using all possible household samples of any size
Since we know the true answers from the hypothetical census, we can see the exact error in our sample-based estimate
The error in the estimate of the mean is
5100 - 5466.7 = -366.7 Dozenland dollars (D$)
i.e. we have underestimated average income by about 7%
1. Use samples of different sizes (The easiest way to do so is to use a larger sample, making the sample more similar to the population from which it is drawn)
2. Rely on statistical theory, which tells us how to estimate the sampling error
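As a hedged illustration of the second option, the sketch below estimates the sampling error from one sample alone, using the textbook standard error of the mean for simple random sampling without replacement. It uses the module's four-household sample (4200, 4700, 4500, 7000) and the 12 households of Dozenland. The function name `estimated_standard_error` and the assumption of simple random sampling are ours, not taken from the module.

```python
import math
import statistics


def estimated_standard_error(sample, population_size):
    """Estimate the standard error of the sample mean.

    Assumes simple random sampling without replacement (an assumption
    made for this sketch, not something the module states).
    """
    n = len(sample)
    s = statistics.stdev(sample)              # sample SD (divisor n - 1)
    fpc = math.sqrt(1 - n / population_size)  # finite population correction
    return s / math.sqrt(n) * fpc


sample = [4200, 4700, 4500, 7000]  # the module's sample of four households
se = estimated_standard_error(sample, population_size=12)
print(f"Estimated standard error: {se:.1f} D$")
```

For this sample the estimate comes out at roughly 524 D$, which can be compared with the figure of 566.0 D$ that the all-possible-samples table gives for n = 4.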
Summary results from taking all possible samples
ALL possible samples of size n (ranging from 1 to 12) from the 12 households
 n     S     Mean   Variation
 1    12   5466.7      1327.5
 2    66   5466.7       895.0
 3   220   5466.7       693.3
 4   495   5466.7       566.0
 5   792   5466.7       473.6
 6   924   5466.7       400.3
 7   792   5466.7       338.3
 8   495   5466.7       283.0
 9   220   5466.7       231.1
10    66   5466.7       179.0
11    12   5466.7       120.7
12     1   5466.7
n = sample size; S = number of possible samples of size n; Mean = mean of the sample means (D$); Variation = standard deviation of the sample means (D$)
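The S and Variation columns can be reproduced from theory without listing every sample. The sketch below is a check under two assumptions of ours: that "Variation" is the standard deviation of the sample means, and that the n = 1 value (1327.5) is the population standard deviation of the 12 household incomes. The function names are illustrative.

```python
import math

N = 12          # households in Dozenland
SIGMA = 1327.5  # "Variation" for n = 1, read here as the population SD (assumption)


def number_of_samples(n, population_size=N):
    """Number of distinct samples of size n drawn without replacement."""
    return math.comb(population_size, n)


def sd_of_sample_means(n, sigma=SIGMA, population_size=N):
    """SD of the sample mean under simple random sampling without replacement."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return sigma / math.sqrt(n) * fpc


for n in range(1, N + 1):
    print(f"{n:>2} {number_of_samples(n):>4} {sd_of_sample_means(n):>8.1f}")
```

Under these assumptions the formula gives zero for n = 12, where the only possible sample is the whole population.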
Let us consider the sample of four households
The values in the sample are: 4200, 4700, 4500, and 7000. This yields a sample mean of (4200 + 4700 + 4500 + 7000) / 4 = 5100 D$
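The arithmetic for this sample can be checked with a few lines of Python. This is only a sketch: the true mean of 5466.7 D$ is the census value quoted in the module, and the variable names are our own.

```python
TRUE_MEAN = 5466.7  # census (true) mean income in D$, as quoted in the module

sample = [4200, 4700, 4500, 7000]  # incomes of the four sampled households

sample_mean = sum(sample) / len(sample)
error = sample_mean - TRUE_MEAN      # negative means an underestimate
relative_error = error / TRUE_MEAN

print(f"Sample mean:    {sample_mean:.1f} D$")
print(f"Error:          {error:.1f} D$")
print(f"Relative error: {relative_error:.1%}")
```

The signed error is kept rather than its absolute value, so that the direction of the mistake (here an underestimate) stays visible.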
In many cases, bias arises because we obtain data from a population that is not the one we should really be using. The population we should be using is called the target population
Example: vital registration
Target population: all deaths
Population used: urban areas
Whether or not bias error occurs depends upon the difference between the target population and the population used
Example: are infant deaths more common in rural than in urban areas?
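To see why the answer to this question matters, the sketch below works out the bias from using urban data only when the target is the whole country. All figures are hypothetical and chosen only for illustration; the function name `coverage_bias` is also our own.

```python
def coverage_bias(rate_covered, rate_excluded, share_excluded):
    """Bias from estimating the target-population rate with the covered group only.

    share_excluded is the fraction of the target population that is missing
    from the data (here, the rural share).
    """
    true_rate = (1 - share_excluded) * rate_covered + share_excluded * rate_excluded
    return rate_covered - true_rate


# Hypothetical figures, for illustration only (not from the module)
urban_rate = 40.0   # infant deaths per 1000 live births, urban areas
rural_rate = 70.0   # infant deaths per 1000 live births, rural areas
rural_share = 0.6   # share of live births that occur in rural areas

bias = coverage_bias(urban_rate, rural_rate, rural_share)
print(f"Bias of the urban-only estimate: {bias:.1f} per 1000")
```

With these made-up numbers the urban-only figure falls about 18 per 1000 below the national rate. If the urban and rural rates were equal, the bias would be zero, however many rural deaths went unregistered.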
Note that there is some overlap between these groupings
This is where some members of the target population have a greater chance of selection into the sample than do others
Example: household surveys of income
This is where the population has been incorrectly specified
Classic example: use of a telephone to question potential respondents
Sampling frames or administrative systems might be inadequate in that clusters of the population are missing and therefore could not be sampled.
On the other hand the frame might cover all broad sectors but may have some units omitted or some “foreign elements”. For example:
Some units in the population might appear twice or more.
Administrative data: a business that moves to a new location may be included in the register at both locations
The quality of administrative records can depend in part on the incentives for registration
Example: Casley and Lury (1981) describe a Caribbean finance department that offered fertilizer subsidies for every registered piece of land on an island
They later found that they were paying subsidies for an area greater than the entire island!
May be classified into three types:
Example 1: farmers might inflate their land holdings, by always rounding figures upwards, because they believe that the survey results will be used to allocate state aid, or...
Example 2: the farmers might deflate them, by rounding down, in the hope of minimizing taxation
Sometimes response bias is caused through leading questions such as, 'Do you agree that meat eating is barbaric?'
Most people like to please and/or will take the easy option of agreeing in the hope of avoiding further questions!
Many people do not want to appear uninformed.
On occasions the very appearance of the enumerator can cause bias
We have seen that sampling error will decrease as the sample size increases
Unfortunately the reverse is generally true about bias error: it tends to increase as sample size increases
RMSE
[Figure labels: Sampling error; Root Mean Square Error]
The total error, sampling and bias combined, is measured by the root mean square error (RMSE)
This is defined as
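The formula from the original slide is not reproduced here. The sketch below uses the standard definition, in which RMSE is the square root of the squared sampling error plus the squared bias. The sampling errors are taken from the all-possible-samples table earlier in the module; the bias of 300 D$ is hypothetical and is held fixed across sample sizes purely to keep the illustration simple.

```python
import math


def rmse(sampling_error, bias):
    """Root mean square error: sqrt(sampling_error**2 + bias**2)."""
    return math.sqrt(sampling_error ** 2 + bias ** 2)


ASSUMED_BIAS = 300.0  # hypothetical bias in D$, for illustration only

# Sampling error (SD of sample means, in D$) for selected sample sizes,
# taken from the module's all-possible-samples table
sampling_errors = {4: 566.0, 8: 283.0, 11: 120.7}

for n, se in sampling_errors.items():
    print(f"n = {n:>2}: sampling error = {se:6.1f}, RMSE = {rmse(se, ASSUMED_BIAS):6.1f}")
```

Under this assumption the RMSE can never fall below the size of the bias, however large the sample becomes.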
There are 3 types of error that may have affected an MDG indicator: