
Chapter 9: Deciding on the Sampling Strategy



Presentation Transcript


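  1. Chapter 9: Deciding on the Sampling Strategy • Context diagram: Intervention or Policy, Evaluation Questions, Design, Methods, Data Collection, Sampling • Chapter outline: Introduction, Concepts, Types, Confidence/Precision?, Sample Size?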

  2. Introduction • Introduction to Sampling • Types of Samples: Random and Non-Random • How Confident and Precise Do You Need to Be? • How Large a Sample Do You Need? • Where to Find a Sampling Statistician?

  3. Sampling • Is it possible to collect data from the entire population? (census) • If so, we can talk about what is true for the entire population • Often we cannot (time/cost) • If not, we can use a smaller subset: a SAMPLE

  4. Concepts • Population: the total set of units • Sample: a subset of the population • Sampling Frame: the list from which you select your sample

  5. More Sampling Concepts • Sample Design: the method of sampling (probability or non-probability) • Parameter: a characteristic of the population • Statistic: a characteristic of a sample
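
The difference between a parameter and a statistic can be illustrated with a short Python sketch. The population below is synthetic (made-up household sizes generated with a fixed seed), so every name and number in it is an illustrative assumption, not data from the chapter.

```python
import random
import statistics

# Hypothetical population: household sizes for 10,000 units (synthetic data).
rng = random.Random(42)
population = [rng.randint(1, 8) for _ in range(10_000)]

# Parameter: a characteristic of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same characteristic computed from a sample only.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values will usually be close but not identical; that gap is the sampling error discussed later in the chapter.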

  6. Random Sample • A random sample allows us to make estimates about the larger population based on what we learn from the subset • It works like a lottery: every unit has an equal chance of being selected • Advantages: • eliminates selection bias • able to generalize to the population • cost-effective

  7. Types of Random Samples • Simple random sample • Random interval sample • Stratified random sample • Random cluster sample • Multi-stage random sample • Combination random sample

  8. Simple Random Sample • The simplest type • Establish a sample size, then randomly select units until the sample size is reached • Uses a random number table to select units
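
A minimal Python sketch of a simple random sample follows. It assumes the sampling frame is a list of unit IDs; `random.sample` stands in for the random number table, and the function name, frame, and seed are hypothetical choices made for illustration.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size distinct units from the frame, each equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, sample_size)

# Hypothetical sampling frame: ID numbers for 500 units.
frame = list(range(1, 501))
chosen = simple_random_sample(frame, 25, seed=1)
print(sorted(chosen))
```

Sampling is done without replacement, so no unit can be selected twice.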

  9. Random Interval Sample • Used when there is a sequential population that is not already enumerated and would be difficult or time consuming to enumerate • Uses a random number table to select intervals
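
One way to picture random interval sampling is the sketch below: walk through the sequence in order and, after each selection, skip ahead by a randomly drawn interval. The slide does not say how interval lengths are chosen, so drawing them uniformly between 1 and a hypothetical `max_interval` is an assumption, as are the function and variable names.

```python
import random
from itertools import islice

def random_interval_sample(units, max_interval, seed=None):
    """Walk through units in order, selecting one unit after each random gap."""
    rng = random.Random(seed)
    iterator = iter(units)  # the population never has to be enumerated up front
    selected = []
    while True:
        gap = rng.randint(1, max_interval)  # random interval length (assumed uniform)
        # Skip gap - 1 units, then take the next one.
        step = list(islice(iterator, gap - 1, gap))
        if not step:
            break  # reached the end of the sequence
        selected.append(step[0])
    return selected

# Hypothetical sequential population: 200 case files in a drawer.
picked = random_interval_sample(range(1, 201), max_interval=10, seed=3)
print(picked)
```

Because the function only consumes an iterator, it also works on a stream of units whose total count is unknown in advance.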

  10. Stratified Random Sample • Use when specific groups must be included that might otherwise be missed by a simple random sample • These groups usually make up a small proportion of the population

  11. Stratified Random Sample • [Diagram: the total population is divided into sub-populations, and a simple random sample is drawn from each sub-population]
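
The stratified procedure (split the population into sub-populations, then draw a simple random sample from each) can be sketched in Python as follows. The population, the stratum labels, and the equal allocation of 20 units per stratum are hypothetical; the slides do not prescribe how many units to take from each stratum.

```python
import random
from collections import defaultdict

def stratified_random_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        k = min(per_stratum, len(members))  # never ask for more than exist
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 950 urban residents and 50 rural residents.
population = [("urban", i) for i in range(950)] + [("rural", i) for i in range(50)]
result = stratified_random_sample(population, lambda u: u[0], per_stratum=20, seed=7)
print({name: len(members) for name, members in result.items()})
```

A simple random sample of 40 from this population would contain about two rural residents on average; stratifying guarantees 20.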

  12. Random Cluster Sample • Another form of random sampling • A cluster is any naturally occurring aggregate of the units to be sampled • Used when: • you do not have a complete list of everyone in the population of interest but do have a list of the clusters in which they occur, or • you have a complete list of everyone, but they are so widely distributed that it would be too time consuming and expensive to send data collectors out to a simple random sample
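
A minimal sketch of random cluster sampling: the only list needed is the list of clusters, and every unit inside a selected cluster is included. The village and household names are hypothetical.

```python
import random

def random_cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names only
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 12 villages with 30 households each.
villages = {
    f"village-{v:02d}": [f"v{v:02d}-hh{h:02d}" for h in range(30)]
    for v in range(12)
}
picked = random_cluster_sample(villages, 3, seed=5)
print(list(picked))
```

Data collectors then visit only the three selected villages rather than households scattered across all twelve.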

  13. Multi-stage Random Sample • Combines two or more forms of random sampling • Most commonly, it begins with random cluster sampling and then applies simple random sampling or stratified random sampling
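
The most common multi-stage design named on this slide (random cluster sampling followed by simple random sampling) can be sketched as below. The school and student identifiers and the stage sizes are hypothetical.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: random sample of clusters. Stage 2: random sample within each."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in stage_one
    }

# Hypothetical population: 20 schools with 100 students each.
schools = {
    f"school-{s}": [f"s{s}-student-{i}" for i in range(100)]
    for s in range(20)
}
sample = two_stage_sample(schools, n_clusters=4, n_per_cluster=10, seed=11)
print({name: len(students) for name, students in sample.items()})
```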

  14. Combination Random Samples • More than one random sampling technique is used

  15. Drawback of Random Cluster and Multi-stage Random Sampling • May not yield an accurate representation of the population

  16. Summary of Random Sampling Process

  17. Non-Random Samples • Can be more focused • Can make sure a small sample is representative • Cannot be used to make statistical inferences about a larger population

  18. Types of Non-random Samples • Snowball: ask people who else you should interview • Purposeful (judgment): set criteria to achieve a specific mix of participants • Convenience: whoever is easiest to contact or whatever is easiest to observe

  19. Forms of Purposeful Samples • Typical cases (median) • Maximum variation (heterogeneity) • Quota • Extreme case • Confirming and disconfirming cases

  20. Bias and Non-random Sampling • Were people selected in a biased way? • Are they substantially different from the rest of the population? • Collect some data to show that the people selected are fairly similar to the larger population (e.g. demographics)

  21. Combinations: Random and Non-Random • Example: • Non-randomly select two schools from the poorest communities and two from the wealthiest communities • Select a random sample of students from these four schools
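
The school example can be sketched in two steps: a purposeful (non-random) choice of schools, then a random draw of students inside each chosen school. The school names, income figures, enrolment numbers, and the sample of 15 students per school are all invented for illustration.

```python
import random

# Hypothetical data: (median community income, enrolment) for eight schools.
schools = {
    "A": (9_000, 320), "B": (11_500, 410), "C": (18_000, 290), "D": (24_000, 500),
    "E": (31_000, 450), "F": (38_000, 380), "G": (52_000, 610), "H": (67_000, 275),
}

# Step 1 (non-random, purposeful): two poorest and two wealthiest communities.
by_income = sorted(schools, key=lambda name: schools[name][0])
selected_schools = by_income[:2] + by_income[-2:]

# Step 2 (random): simple random sample of student numbers within each school.
rng = random.Random(2024)
students = {
    name: sorted(rng.sample(range(1, schools[name][1] + 1), 15))
    for name in selected_schools
}
print(selected_schools)
```

Results from such a design describe the students in these four schools; because the schools were not chosen at random, they do not generalize to all schools.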

  22. Possibility of Error • Could the sample differ from the population? • Statistics: data derived from random samples

  23. How confident do you wish to be? • Confidence level • E.g., 90%: you are 90% certain that your sample results are an accurate estimate of the population as a whole • The higher the confidence level, the larger the sample needed

  24. Confidence Standard • The standard is 95% • 19 of 20 samples would have found similar results • We are 95% certain that the population parameter lies somewhere between the lower and upper limits of the confidence interval calculated from the sample
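
The "19 of 20 samples" reading of a 95% confidence level can be checked with a small simulation. The slides give no formula, so the sketch assumes the common normal-approximation interval for a proportion (estimate ± 1.96 standard errors), a made-up true proportion of 0.4, and samples of 400 units.

```python
import math
import random

def interval_covers(true_p, n, rng, z=1.96):
    """Draw one sample of size n and report whether its interval contains true_p."""
    hits = sum(rng.random() < true_p for _ in range(n))
    p_hat = hits / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)  # assumed normal approximation
    return p_hat - margin <= true_p <= p_hat + margin

rng = random.Random(0)
trials = 2_000
covered = sum(interval_covers(0.4, 400, rng) for _ in range(trials))
coverage = covered / trials
print(f"{coverage:.1%} of intervals contain the true proportion")
```

The printed share should land close to 95%, that is, roughly 19 intervals in every 20.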

  25. Confidence Interval • Sometimes called sampling error, margin of error, or precision • Example: a poll reports 48% for and 52% against, with a margin of +/- 3% • This actually means 45% to 51% for and 49% to 55% against
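
The poll arithmetic can be written out directly. The helper name `interval` is a hypothetical choice; the numbers are the ones on the slide.

```python
def interval(estimate_pct, margin_pct):
    """Return the (low, high) range implied by an estimate and its margin of error."""
    return (estimate_pct - margin_pct, estimate_pct + margin_pct)

for_low, for_high = interval(48, 3)          # 45 to 51
against_low, against_high = interval(52, 3)  # 49 to 55

# The two ranges overlap when the top of "for" reaches the bottom of "against".
overlap = for_high >= against_low
print(f"for: {for_low}-{for_high}%, against: {against_low}-{against_high}%, overlap: {overlap}")
```

Because the two ranges overlap, this poll on its own does not settle which side is ahead.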

  26. Sample Size • By increasing the sample size, you increase accuracy and decrease the margin of error • The larger the margin of error, the less precise your results will be • The smaller the population, the smaller the sample size needed for a given confidence level and margin of error, but the larger the needed ratio of the sample size to the population size • The standard to aim for is a 95% confidence level and a margin of error of +/- 5%
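
The claim about small populations can be illustrated with the finite population correction, n = n0 / (1 + (n0 - 1) / N). The slides do not give this formula, so it is an assumption here, as is the starting value n0 = 384, the chapter's large-population figure for a 95% confidence level and a +/- 5% margin of error.

```python
import math

def adjusted_sample_size(population_size, large_population_n=384):
    """Apply the finite population correction and round up to a whole unit."""
    n0 = large_population_n
    return math.ceil(n0 / (1 + (n0 - 1) / population_size))

for size in (100, 1_000, 10_000, 100_000):
    n = adjusted_sample_size(size)
    print(f"population {size:>7,}: sample {n:>3} ({n / size:.1%} of the population)")
```

As the population shrinks, the required sample gets smaller in absolute terms while its share of the population grows, which is the pattern the slide describes.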

  27. Sample Sizes for Large Populations
      Precision              Confidence Level
      (margin of error)      99%        95%        90%
      +/- 1%                 16,576     9,604      6,765
      +/- 2%                  4,144     2,401      1,691
      +/- 3%                  1,848     1,067        752
      +/- 5%                    666       384        271
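
The figures for large populations are consistent with the usual formula for estimating a proportion, n = z² · p(1 − p) / e². The slides do not state a formula, so treat the following as assumptions: p = 0.5 (the most conservative case) and z-values of 1.96 and 1.645 for 95% and 90% confidence. Under those assumptions the sketch reproduces the 95% and 90% columns; the 99% column is left out because this formula does not reproduce all of its figures exactly.

```python
def large_population_sample_size(z, margin, p=0.5):
    """n = z^2 * p * (1 - p) / margin^2, rounded to the nearest whole unit."""
    return round(z ** 2 * p * (1 - p) / margin ** 2)

Z_VALUES = {"95%": 1.96, "90%": 1.645}  # assumed standard normal critical values
MARGINS = [0.01, 0.02, 0.03, 0.05]      # precision levels from the slide

table = {
    level: [large_population_sample_size(z, m) for m in MARGINS]
    for level, z in Z_VALUES.items()
}
for level, sizes in table.items():
    print(level, sizes)
```

Halving the margin of error roughly quadruples the sample size, because the margin enters the formula squared.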

  28. Summary of Sample Size • Accuracy and precision can be improved by increasing the sample size • The standard to aim for is a 95% confidence level and a margin of error of +/- 5% • The larger the margin of error, the less precise the results will be • The smaller the population, the larger the needed ratio of the sample size to the population size

  29. Where to Find a Sampling Statistician • American Statistical Association (ASA) directory of statistical consultants • http://www.amstat.org/consultantdirectory/index.cfm • Alliance of Statistics Consultants • http://www.statisticstutors.com/#statistical-analysis • HyperStat Online • http://davidmlane.com/hyperstat/consultants.html
