### Fundamentals of Sampling Method

### Non probabilistic samples

Week 4

Research Methods & Data Analysis

Tutorials

- Thursday 30th October, 9-11, AG GL 20 (M. Mazzocchi)
- Tuesday 4th November, 11-1pm (H. Neeliah)
- You may attend:
  - One (the most convenient for you)
  - Both (it may be very useful)
  - None (not really advised…)

Lecture outline

- Key notions of statistics
- Simple random sampling
- Sampling error
- Sample size
- Other sampling methods

Distributions

- The set of distinct values taken by a set of data, together with their:
  - Absolute frequencies
  - Relative frequencies (probabilities)

Distributions of random variables

- The distribution of possible values together with their probabilities (probability density function, p.d.f.)

The normal (Gaussian) distribution

- …is the distribution representing perfect randomness around a mean value
- In statistics, the normal distribution plays a key role in the theory of errors
- The central limit theorem implies that “averaging” almost always gives rise to an approximately normal distribution (the error on the average is random), provided that the number of observations is large (>40 as a rule of thumb)
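
The effect of averaging can be checked with a small simulation. The sketch below is only an illustration: the exponential population, the seed and the number of repetitions are assumptions, not part of the lecture. It draws many samples of 40 observations from a strongly skewed population and looks at how the sample means behave.

```python
import random
import statistics

N_OBS = 40  # observations per sample, matching the ">40" rule of thumb


def sample_means(n, draws, seed=0):
    """Return `draws` sample means, each computed on n exponential(1) values."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(draws)
    ]


means = sample_means(n=N_OBS, draws=5000)

# The exponential(1) population has mean 1 and standard deviation 1,
# so the sample mean should have standard deviation close to 1 / sqrt(n).
theoretical_se = 1 / N_OBS ** 0.5

# Share of sample means within 1.96 standard errors of the true mean;
# for a normal distribution this share is 95%.
share_within = sum(abs(m - 1.0) <= 1.96 * theoretical_se for m in means) / len(means)

print("mean of the sample means:", round(statistics.fmean(means), 3))
print("std. dev. of the sample means:", round(statistics.stdev(means), 3))
print("theoretical standard error:", round(theoretical_se, 3))
print("share within 1.96 standard errors:", round(share_within, 3))
```

Although the individual observations are far from normal, the sample means cluster symmetrically around the population mean, which is what the theorem predicts.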

The Student-t distribution

- When the variable is normally distributed in the population with unknown variance, the sample mean, standardized by its estimated standard error, follows a t distribution with n − 1 degrees of freedom
- The t-distribution is similar to the normal distribution, apart from having higher tail probabilities
- The larger the sample, the closer the t-distribution is to the normal distribution
- For samples with more than 30-40 units, the difference between the two distributions is negligible

$t_{\alpha/2}$ and $z_{\alpha/2}$: tabulated values
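
A few tabulated values can be reproduced numerically. The sketch below is one possible approach, not the method used to build the published tables: it integrates the Student-t density with Simpson's rule and finds the critical value by bisection, using only the Python standard library. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    # lgamma keeps the normalising constant stable for large df
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, via Simpson's rule on the interval [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Value t with P(T > t) = alpha / 2, found by bisection on [0, 50]."""
    lo, hi = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


z = NormalDist().inv_cdf(1 - 0.05 / 2)  # normal critical value for alpha = 0.05
for df in (5, 19, 30, 40, 120):
    print(f"df = {df:3d}  t = {t_critical(0.05, df):.3f}  z = {z:.3f}")
```

The t values shrink towards the z value as the degrees of freedom grow, which is the convergence described for the two distributions.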

Population parameters (in a population of N elements)

- Mean
- Variance
- Standard deviation
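
In the usual notation (assumed here: $x_i$ is the value of element $i$, $\mu$ the population mean and $\sigma$ the population standard deviation), these three parameters are commonly defined as:

$$
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
$$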

Sampling

- A sample is a subgroup of the population selected for the study
- Sample statistics allow inference about the population parameters, through estimation and hypothesis testing
- The sample space is the complete set of all possible results of the sampling procedure

Simple random sampling

- Each element of the population has a known and equal probability of selection
- Every element is selected independently of other elements
- The probability of selecting a given sample of n elements is computable (known)
- The Central Limit Theorem guarantees that, for simple random samples with a sufficiently large sample size n (>40), the sample mean approximately follows the normal distribution
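
A minimal sketch of drawing a simple random sample, assuming the sample is taken without replacement from a numbered list of the population. The list of 200 element ids and the seed are hypothetical. It also shows why the probability of any given sample is computable: all possible samples of size n are equally likely.

```python
import math
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


population = list(range(1, 201))  # hypothetical list of N = 200 element ids
sample = simple_random_sample(population, n=20, seed=1)

N, n = len(population), len(sample)
inclusion_probability = n / N           # identical for every element
n_possible_samples = math.comb(N, n)    # size of the sample space
probability_of_this_sample = 1 / n_possible_samples

print("selected elements:", sorted(sample))
print("inclusion probability of each element:", inclusion_probability)
print("number of possible samples:", n_possible_samples)
```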

Sample statistics

- Sample mean
- Sample variance
- Sample standard deviation

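Unbiasedness: the sample mean and the sample variance are unbiased estimators of the corresponding population parameters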
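
In the usual notation (assumed here), for a sample of $n$ observations $x_1, \dots, x_n$ the sample statistics are commonly written as:

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s = \sqrt{s^2}
$$

The divisor $n - 1$, rather than $n$, is what makes $s^2$ an unbiased estimator of the population variance.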

Standard deviation and standard error

- The standard deviation measures the variability of a given variable (e.g. X) within the population or sample
- The standard error refers to the accuracy (variability) of the sample statistics (e.g. mean), i.e. the error due to the fact that the statistic is computed on a sample rather than on the population (sampling error)
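
The difference between the two measures can be seen on a small sample. The eight prices below are hypothetical values chosen only for illustration.

```python
import math
import statistics

# Hypothetical sample of 8 lunch prices in pounds (illustrative values only)
prices = [2.50, 3.00, 3.20, 3.50, 3.80, 4.00, 4.50, 5.10]

sd = statistics.stdev(prices)       # variability of the individual prices
se = sd / math.sqrt(len(prices))    # variability of the sample mean

print(f"sample mean: {statistics.fmean(prices):.3f}")
print(f"standard deviation: {sd:.3f}")
print(f"standard error of the mean: {se:.3f}")
```

The standard error is smaller than the standard deviation by a factor of the square root of n, so it shrinks as the sample grows while the standard deviation does not.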

Basic SRS sample statistics (unknown pop. variance)

Computed for both the mean case and the proportion case (p):

- Sample standard deviation of X
- Standard error of the mean/proportion (the ACCURACY of sample estimates)
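
The standard textbook expressions for the two cases are sketched below. The notation is an assumption, and some texts use $n - 1$ instead of $n$ in the standard error of a proportion.

Mean case:

$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
\mathrm{SE}(\bar{x}) = \frac{s}{\sqrt{n}}
$$

Proportion case:

$$
s = \sqrt{p(1-p)}, \qquad
\mathrm{SE}(p) = \sqrt{\frac{p(1-p)}{n}}
$$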

Finite population correction factor

- For finite populations (…i.e. all populations in social research), the usual formula tends to overestimate the standard error of the sample mean (proportion) when the sample is large (more than 10% of N)
- To account for this, a finite population correction factor is applied to the standard error
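
One common textbook form of the correction is shown below. It is an assumption about which variant is intended, since other texts use the factor $\sqrt{1 - n/N}$ instead.

$$
\mathrm{SE}_{\mathrm{fpc}}(\bar{x}) = \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}
$$

The factor is close to 1 when $n$ is small relative to $N$, and it reduces the standard error to 0 when the whole population is sampled ($n = N$).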

Level of confidence (1 − α) and z parameter

The level of confidence 1 − α is the probability that the identified confidence interval contains the true population mean

For the normal distribution, given a value of α, the corresponding $z_{\alpha/2}$ value is tabulated; for example, α = 0.05 gives $z_{\alpha/2}$ = 1.96

[Figure: normal curve centred on x, with an area of α/2 in each tail; the central region is the confidence interval for x at a level of confidence 1 − α]
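
Instead of reading a table, the z value can be obtained from the inverse of the standard normal cumulative distribution. A minimal sketch using Python's standard library follows; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Return z such that each tail of the standard normal has area alpha / 2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}  ->  z = {z_critical(alpha):.3f}")
```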

Confidence intervals

- Calculate the sample mean
- Decide a level of confidence (usually 95% or 99%)
- Choose whether to use the Student-t distribution or the normal distribution
- Compute the sample standard error
- Define the lower and upper bound of the confidence interval
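
Put together, and assuming the usual notation ($\bar{x}$ sample mean, $s$ sample standard deviation, $n$ sample size), the steps above give an interval of the form:

$$
\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
\;\le\; \mu \;\le\;
\bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}
$$

For large samples, $t_{\alpha/2,\,n-1}$ can be replaced by $z_{\alpha/2}$.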

Exercise

- Suppose that you have interviewed 20 students out of 200 in the agricultural building, asking them how much they paid for lunch yesterday
- You get an average of £3.67
- The standard deviation is £1.25
- Compute the 95% confidence interval
- Compute the 99% confidence interval
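
One possible way to work through the exercise is sketched below. It assumes the Student-t distribution is used (small sample, unknown population variance) and takes the two-sided critical values for 19 degrees of freedom from a standard t table. The finite population correction is optional here, since the sample is exactly 10% of the population.

```python
import math

n, N = 20, 200          # sample size and population size from the exercise
mean, sd = 3.67, 1.25   # sample mean and sample standard deviation (pounds)

# Two-sided Student-t critical values for n - 1 = 19 degrees of freedom,
# read from a standard t table (assumption: t rather than z is appropriate).
T_CRITICAL = {0.95: 2.093, 0.99: 2.861}


def confidence_interval(mean, sd, n, t_value, N=None):
    """Return (lower, upper); applies a finite population correction if N is given."""
    se = sd / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    margin = t_value * se
    return mean - margin, mean + margin


for level, t_value in T_CRITICAL.items():
    low, high = confidence_interval(mean, sd, n, t_value)
    low_c, high_c = confidence_interval(mean, sd, n, t_value, N=N)
    print(f"{level:.0%} CI: £{low:.2f} to £{high:.2f} "
          f"(with correction: £{low_c:.2f} to £{high_c:.2f})")
```

The 99% interval is wider than the 95% interval: a higher level of confidence is paid for with less precision.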

Determining sample size

Factors influencing sample size (n):

- Size of the population (N)
- Variability of the population (σ)
- Desired level of accuracy (θ)
- Level of confidence (1 − α)
- Budget constraint

Simple random sampling: determining sample size

- Relative sampling error (r.s.e.)
- Determining sample size for a given r.s.e. (approximate formula)
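
A common approximate approach is sketched below. It is an assumption, not necessarily the formula on the slide: the sample size is chosen so that the half-width of the confidence interval equals a target margin, and is then adjusted for a finite population. For a relative sampling error r, the margin would be r times the expected mean. The function name and the example margin are hypothetical, while the standard deviation and population size reuse the exercise values.

```python
import math
from statistics import NormalDist


def sample_size(sd, margin, alpha=0.05, N=None):
    """Approximate n so that z * sd / sqrt(n) equals the desired margin.

    sd     : anticipated standard deviation of the variable
    margin : desired half-width of the confidence interval
    alpha  : 1 minus the level of confidence
    N      : population size; if given, a finite population adjustment is applied
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = (z * sd / margin) ** 2
    if N is not None:
        n0 = n0 / (1 + n0 / N)   # one common form of the finite population adjustment
    return math.ceil(n0)


# Standard deviation of £1.25 and population of 200 as in the exercise;
# the target margin of £0.25 is a hypothetical choice.
print("infinite population:", sample_size(sd=1.25, margin=0.25))
print("population of 200:  ", sample_size(sd=1.25, margin=0.25, N=200))
print("99% confidence:     ", sample_size(sd=1.25, margin=0.25, alpha=0.01, N=200))
```

The required sample size grows with the variability and the level of confidence, and falls as the accepted margin widens or the population gets smaller.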

The sampling design process

- Define the target population, its elements and the sampling units
- Determine the sampling frame (list)
- Select a sampling technique
  - Sampling with/without replacement
  - Probability/Nonprobability sampling
- Determine the sample size
  - Precision versus costs
  - The marginal value in terms of precision of additional sampling units is decreasing
- Execute the sampling process

The sampling techniques

- Probabilistic samples
  - Simple random sampling
  - Systematic sampling
  - Stratified sampling
  - Cluster sampling
  - Other sampling techniques
- Nonprobabilistic samples
  - Convenience sampling
  - Judgmental sampling
  - Quota sampling
  - Snowball sampling

Representativeness

- A sample can be considered as “representative” when it is expected to exhibit the average properties of the population

Selection bias

- Improper selection of sample units (ignoring a relevant “control variable” that generates bias), so that the values observed in the sample are biased and the sample is not representative.

Example:

A survey is conducted to measure goat milk consumption, but the interviewers select only people in urban areas, who on average drink less goat milk.

Simple random sampling

- Each element of the population has a known and equal probability of selection
- Every element is selected independently of other elements
- The probability of selecting a given sample of n elements is computable (known)

Advantages:

- Statistical inference is possible
- It is easily understood

Disadvantages:

- Representative samples are large and expensive
- Standard errors can be larger than in other probabilistic sampling techniques, such as stratified sampling
- Sometimes it is difficult to execute a truly random sampling

Systematic sampling

- A list of N elements in the population is compiled, ordered according to a specified variable, which may be:
  - Unrelated to the target variable (similar to SRS)
  - Related to the target variable (increased representativeness)
- A sample size n is chosen
- A systematic step of k=N/n is set
- A random number s between 1 and k is extracted and represents the first element to be included
- Then the other elements selected are s+k, s+2k, s+3k…
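
The procedure can be sketched in a few lines. This is an illustration under stated assumptions: the list of 200 ids is hypothetical, the step is rounded down to an integer, and the random start is drawn between 1 and k so that all n selections stay inside the list.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th element of an ordered list after a random start."""
    N = len(frame)
    k = N // n                    # sampling step, rounded down
    rng = random.Random(seed)
    s = rng.randint(1, k)         # random start between 1 and k (1-based position)
    # positions s, s + k, s + 2k, ... converted to 0-based list indices
    return [frame[s - 1 + i * k] for i in range(n)]


frame = list(range(1, 201))       # hypothetical ordered list of N = 200 ids
sample = systematic_sample(frame, n=20, seed=3)
print(sample)
```

Only one random number is needed, which is why the method is cheaper than drawing n independent random numbers.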

Advantages:

- Cheaper and easier than SRS
- More representative if the order is related to the variable of interest (monotone)
- Sampling frame not always necessary

Disadvantages:

- Less representative (biased) if the order is cyclical

Stratified sampling

- The population is partitioned into strata through control variables (stratification variables) closely related to the target variable, so that there is homogeneity within each stratum and heterogeneity between strata
- A simple random sample is drawn within each stratum of the population
  - Proportionate sampling: the size of the sample from each stratum is proportional to the relative size of the stratum in the total population
  - Disproportionate sampling: the size is also proportional to the standard deviation of the target variable in each stratum
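
The two allocation rules can be compared on a small example. The strata names, sizes and standard deviations below are hypothetical, and the disproportionate rule is implemented as allocation proportional to stratum size times stratum standard deviation, which is one common reading of the rule above.

```python
def proportionate_allocation(n, sizes):
    """Allocate n units to strata in proportion to their population sizes."""
    total = sum(sizes.values())
    return {h: n * size / total for h, size in sizes.items()}


def disproportionate_allocation(n, sizes, sds):
    """Allocate n units in proportion to stratum size times standard deviation."""
    weights = {h: sizes[h] * sds[h] for h in sizes}
    total = sum(weights.values())
    return {h: n * w / total for h, w in weights.items()}


# Hypothetical strata: population sizes and standard deviations of the target variable
sizes = {"urban": 600, "suburban": 300, "rural": 100}
sds = {"urban": 1.0, "suburban": 2.0, "rural": 4.0}

print(proportionate_allocation(100, sizes))
print(disproportionate_allocation(100, sizes, sds))
```

Under the second rule the small but highly variable stratum receives a larger share of the sample than its population share. In practice the allocations would still need rounding to whole units.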

Advantages:

- Gains in precision
- Includes all relevant subpopulations, even if small

Disadvantages:

- Stratification variables may not be easily identifiable
- Stratification can be expensive

Cluster sampling

- The population is partitioned into clusters
  - Elements within each cluster should be as heterogeneous as possible with respect to the variable of interest (e.g. area sampling)
- A random sample of clusters is extracted, through SRS or with probability proportional to the cluster size
  - Either all the elements of each selected cluster are included (one-stage cluster sampling)
  - Or a probabilistic sample is extracted from each selected cluster (two-stage cluster sampling)
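
A minimal sketch of one-stage and two-stage cluster sampling follows. The village names and household ids are hypothetical, and the clusters are selected by simple random sampling with equal probability rather than with probability proportional to size.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Select clusters by SRS, then keep all or a random subset of their elements.

    clusters      : dict mapping a cluster name to the list of its elements
    n_clusters    : number of clusters to select
    n_per_cluster : None for one-stage sampling, otherwise the second-stage size
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    selected = {}
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            selected[name] = list(members)                 # one-stage
        else:
            size = min(n_per_cluster, len(members))
            selected[name] = rng.sample(members, size)     # two-stage
    return selected


# Hypothetical population: 5 villages of 10 households each
clusters = {f"village_{v}": [f"v{v}_h{h}" for h in range(1, 11)] for v in range(1, 6)}

one_stage = cluster_sample(clusters, n_clusters=2, seed=7)
two_stage = cluster_sample(clusters, n_clusters=2, n_per_cluster=3, seed=7)
print(one_stage)
print(two_stage)
```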

Advantages:

- Reduced costs
- Higher feasibility

Disadvantages:

- Less precision
- Inference can be difficult

Convenience sampling

- Only “convenient” elements enter the sample

Advantages:

- Cheapest method
- Quickest method

Disadvantages:

- Selection bias
- Non-representativeness
- Inference is not possible

Judgmental sampling

- Selection based on the judgment of the researcher

Advantages:

- Low cost
- Quick

Disadvantages:

- Non-representativeness
- Inference is not possible
- Subjective

Quota sampling

- Define control categories (quotas) for the population elements, such as sex, age…
- Apply a “restricted judgmental sampling”, so that the quotas in the sample are the same as those in the population

Advantages:

- Cheapest method
- Quickest method

Disadvantages:

- There is no guarantee that the sample is representative (this depends on the relevance of the control characteristics chosen)
- Many sources of selection bias
- No assessment of sampling error

Snowball sampling

- A first small sample is selected randomly
- Respondents are asked to identify others who belong to the population of interest
- The referrals will have demographic and psychographic characteristics similar to the referrers

Advantages:

- Lower costs
- Low variability
- Useful for “rare” populations

Disadvantages:

- Inference is not possible
