Psychology 242, Dr. McKirnan

Research Sampling • Defining your target population • Probability & Non-Probability sampling methods. Dr. David J. McKirnan, University of Illinois at Chicago, Psychology; mckirnanuic@gmail.com

The big picture: Research sampling • Define your target population • What group do you want to generalize to? • Who is / is not a member of the group? • What is your sampling frame?

Sampling: Who do you want to generalize to? • Any study assesses only a sample of the population. • There are many different ways we may collect a sample. • There are many different populations or sub-populations we may be interested in. • The size and breadth of a sample can affect the internal or external validity of the study. Week 6; Sampling

Define the target population: Who do you want to generalize to? From broadest to narrowest: Mammals; Humans; All Western people; All Americans; Young Americans; College students; UIC Students; This class. • Narrower populations (toward “This class”): greater specificity (and ease) of sampling frame; generally increases internal validity. • Broader populations (toward “Mammals”): greater breadth of population to sample from (i.e., size of sampling frame); represents increasing external validity.

Who do you want to generalize to? • Samples often consist of very targeted sub-populations • Demographic groups • Ethnicity • Socio-economic status • Geography, e.g., urban dwellers… • Behavioral groups • Registered voters • Home owners • Clinical or other groups • Medical or psychiatric patients… That specificity increases internal validity by decreasing the complexity of the sample.

Research samples & validity Example: Clinical drug trials illustrate the conflict between internal vs. external validity in sampling. • People with diverse symptoms and backgrounds see physicians for depression. • To enhance internal validity, drug researchers use exclusion criteria to select only participants who fit a specific definition of depression. • Zimmerman et al. suggest that too many exclusion criteria compromise the validity of this research area. Zimmerman, M., Mattia, J.I., & Posternak, M.A. (2002). Are Subjects in Pharmacological Treatment Trials of Depression Representative of Patients in Routine Clinical Practice? Am J Psychiatry, 159, 469–473.

Exclusion criteria & validity Example: The study begins with a large number of people self-referred for depression. They exclude those with serious mental illness, drug abuse, or personality disorder… …those whose symptoms are not severe enough, who are suicidal, or who have other affective disorders… …those whose symptoms are too recent OR too long-standing… …and end up with a small, carefully selected subset of patients (8.4% of general depression patients).

External vs. internal validity in sampling Example: • Applying rigorous study selection criteria for drug trials excludes the great majority of routine depression patients. • Rigorous participant selection for internal validity seriously compromises external validity in these studies. • This leaves the actual usefulness of anti-depressant (and other) medications for the general population in doubt. • To be useful, research must balance the need for careful subject selection with the need for representativeness.

Who is a group member? Do you use Facebook or other media 5 times a week or more? A = Yes B = No C = Not sure – lost count. Week 3; Experimental designs

Who is a group member? Are you a “Facebooker”? A = Yes B = No C = Not sure – let me facebook that.

Who is a group member? Are you a Latino? A = Yes B = No C = Maybe – I’m not sure

Who is a group member? Do you speak Spanish? A = Yes B = No C = Maybe – I’m not sure

Define the target population Who do you want to generalize to: who is in the group? • Once we have chosen our sampling group, we must decide on criteria for membership… • To sample “Facebook users”, do I use a … • Behavioral criterion (which behavior?) • Self-identification? • To sample “Latinos”… • Is it enough to call oneself “Latino”? • Is Spanish language necessary…? • Using a behavioral criterion (amount of Facebook use) may yield a different sample than self-identification. Clearer and narrower group criteria increase internal validity by making the sample more homogeneous.
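The point that a behavioral criterion and self-identification can select different people can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the four respondents, the field names, and the cutoff of 5 uses per week (borrowed from the earlier clicker question) are assumptions, not data from any study.

```python
# Hypothetical respondents; values are invented for illustration only.
respondents = [
    {"id": 1, "uses_per_week": 12, "self_identifies": True},
    {"id": 2, "uses_per_week": 7, "self_identifies": False},
    {"id": 3, "uses_per_week": 1, "self_identifies": True},
    {"id": 4, "uses_per_week": 0, "self_identifies": False},
]

# Behavioral criterion: uses the medium 5 or more times a week (assumed cutoff).
behavioral = {r["id"] for r in respondents if r["uses_per_week"] >= 5}

# Self-identification criterion: calls themselves a "Facebooker".
self_identified = {r["id"] for r in respondents if r["self_identifies"]}

print("Behavioral only:", sorted(behavioral - self_identified))
print("Self-identified only:", sorted(self_identified - behavioral))
print("Both criteria:", sorted(behavioral & self_identified))
```

In this toy data only one respondent meets both criteria, so the choice of criterion changes who ends up in the sample.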

Who do you want to generalize to?
Group | Demographic / Behavioral criterion | Self-Identification criterion
“Student” | # Hours registered | Occupational choice
“Gay / lesbian” | Sexual patterns | Self-label
“Drug User” | # of drugs used | Perceived dependence
“Latino” | Geographic origins, language | Ethnic identification
“Depression patient” | Specific profile of behaviors & symptoms | Self-referred for treatment
The criteria used to define the group will determine who specifically gets sampled.

Who do you want to generalize to: Your “Sampling Frame”. • What is known about your larger population? • Are there Census or survey data? • E.g., are there “population” data on depressed people? • Do we know the demographic profiles of Facebook users? • Data about your target population will help you determine how well your sample represents that population. • What are its size, sub-groups, location…? • Where / how can I best recruit members of the population? • Will different recruitment methods be biased in favor of some sub-groups? • E.g., internet surveys are biased against less computer-oriented people.
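One way to use population data to judge how well a sample represents the population is to compare each sub-group's share of the sample with its known share of the population. The sketch below is a minimal illustration under stated assumptions: the function name `representation_gaps`, the age bands, and all numbers are hypothetical.

```python
from collections import Counter

def representation_gaps(sample_groups, population_shares):
    """Return (sample share - population share) for each sub-group."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {group: counts[group] / n - share
            for group, share in population_shares.items()}

# Hypothetical population shares (e.g., from census-style data).
population_shares = {"18-29": 0.25, "30-49": 0.50, "50+": 0.25}

# Hypothetical sample of 100 recruited through an internet survey.
sample_groups = ["18-29"] * 50 + ["30-49"] * 40 + ["50+"] * 10

gaps = representation_gaps(sample_groups, population_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A positive gap means the sub-group is over-represented in the sample; a negative gap means it is under-represented.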

Research sampling • Defining your target population • Probability & Non-Probability sampling methods.

Major forms of sampling Probability (Random) Sampling • Recruit (or select) participants to maximize the representativeness of the sample to a known population. • Uses some form of random selection. • Requires that each member of the population has a known (often equal) probability of being selected. • Most externally valid approach to sampling general populations. Non-Probability Sampling • Use available samples for convenience, or targeted outreach to unusual or small populations. • Selection may be either systematic or haphazard, but is not random. • Often the most externally valid approach to unusual, small, or extreme groups, or groups where little is known. • When used only for convenience it is the least externally valid.

Probability / Random Sampling • Core feature: all members of the study population have an equal chance of being sampled • Procedure: Choose participants in a systematic, random fashion. • e.g., every nth name from a list, every nth number in a phone exchange, etc. • Advantages: eliminates obvious biases of convenience sampling • Limitations: • May under-sample unusual / hard to reach participants • Some may be unavailable in, e.g., telephone lists, computer files.
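The "every nth name from a list" procedure can be sketched in Python. This is a minimal illustration rather than a procedure taken from the lecture: the function name `systematic_sample`, the 20-name roster, and the interval of 5 are hypothetical. Starting at a random position within the first interval gives every list member the same 1-in-n chance of selection.

```python
import random

def systematic_sample(frame, interval, rng=None):
    """Select every `interval`-th entry of `frame`, starting at a random offset."""
    rng = rng or random.Random()
    start = rng.randrange(interval)  # random start position in 0..interval-1
    return frame[start::interval]

# Hypothetical sampling frame: a roster of 20 names.
roster = [f"person_{i:02d}" for i in range(20)]

# A seeded generator is used only so the sketch is repeatable.
sample = systematic_sample(roster, interval=5, rng=random.Random(242))
print(sample)
```

Anyone missing from the roster has no chance of selection, which is the limitation noted on the slide.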

Basic Forms of random sampling • Simple Random Sampling: Select a specific % of a target population; all members of population have about equal chance of selection. • Multi-Stage: Randomly select population units (census tracts, households, schools…), then randomly select individuals within unit. • Stratified: Random within population sub-blocks, e.g., gender (randomly select 50 women and randomly select 50 men), ethnicity, etc. • Cluster: Random within (potentially convenience) clusters, e.g., specific locations or “venues”, events, times of day, etc.

Simple Random sampling Objective: Attempts to truly represent the general population; absolute minimal selection bias. Procedure: Recruitment method where all members of the population have an approximately equal chance of being selected. Examples: • Gallup polls using random digit dialing surveys • “Long form” of the census sent to a small % of U.S. households Advantages: Most representative sampling frame for general (non-targeted) population. Disadvantages: Any recruitment method excludes some people (no telephone, no stable address, etc.).
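Simple random sampling of a fixed percentage of a sampling frame can be sketched with Python's standard `random` module. The function name `simple_random_sample`, the frame of 1,000 household IDs, and the 5% fraction are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, fraction, rng=None):
    """Draw a fixed fraction of the frame; every member is equally likely."""
    rng = rng or random.Random()
    k = round(len(frame) * fraction)
    return rng.sample(frame, k)  # sampling without replacement

# Hypothetical sampling frame of 1,000 household IDs.
households = list(range(1, 1001))

chosen = simple_random_sample(households, fraction=0.05, rng=random.Random(6))
print(len(chosen))  # 50 households
```

The equal-chance property holds only for households that appear in the frame; households with no listing are never drawn.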

Multi-Stage Random sampling Objective: Focused & efficient random sample. Procedure: Concentrate recruitment in specific locations or venues. Examples: NIDA household drug surveys: 1) Randomly select a moderate number of census tracts nationally; 2) Randomly select a small % of households within each tract; 3) Interview the 1st adult who answers the phone in each household. “CITY” HIV study among youth: 1) Randomly select bars, clubs, other venues across the city; 2) Randomly approach every 4th person who enters the venue to recruit for interview. Advantage: Much more efficient than simple random. Disadvantage: Same as simple random.
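The two-stage logic (randomly pick units, then randomly pick members within each picked unit) can be sketched as follows. This is an illustrative sketch only, not the procedure of any actual survey: the function name `multistage_sample`, the 10 tracts of 50 households, and the 10% within-tract fraction are hypothetical.

```python
import random

def multistage_sample(units, n_units, fraction, rng=None):
    """Stage 1: randomly pick units. Stage 2: randomly pick members within each."""
    rng = rng or random.Random()
    picked_units = rng.sample(sorted(units), n_units)
    sample = {}
    for unit in picked_units:
        members = units[unit]
        k = max(1, round(len(members) * fraction))  # at least one per unit
        sample[unit] = rng.sample(members, k)
    return sample

# Hypothetical frame: 10 census tracts with 50 households each.
tracts = {f"tract_{t}": [f"t{t}_hh{h}" for h in range(50)] for t in range(10)}

result = multistage_sample(tracts, n_units=3, fraction=0.10, rng=random.Random(43))
for tract, households in result.items():
    print(tract, households)
```

Recruitment effort is concentrated in 3 tracts instead of being spread across all 10, which is where the efficiency gain comes from.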

Stratified or cluster sampling Objective: Represent every key segment of the population. Procedure: • Decide which population segments are important (e.g., ethnic groups, census tracts, geographic areas...), • Randomly select from each segment. • Proportionate: Same sampling fraction from each segment; approximates overall population • (e.g., sample 1% of all African-Americans, 1% of all Latinos, etc…) • Disproportionate: Unequal sampling fraction across segments, to over-represent smaller groups • (e.g., select larger % of recent immigrants…)
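The difference between proportionate and disproportionate sampling fractions can be sketched in Python. The function name `stratified_sample`, the two segments, and the fractions are hypothetical values chosen for illustration.

```python
import random

def stratified_sample(strata, fractions, rng=None):
    """Randomly sample within each segment using that segment's sampling fraction."""
    rng = rng or random.Random()
    return {name: rng.sample(members, round(len(members) * fractions[name]))
            for name, members in strata.items()}

# Hypothetical population: one large segment (1,000) and one small segment (200).
strata = {"group_a": list(range(1000)), "group_b": list(range(1000, 1200))}

# Proportionate: the same 5% fraction from every segment.
proportionate = stratified_sample(
    strata, {"group_a": 0.05, "group_b": 0.05}, random.Random(1))

# Disproportionate: a larger fraction from the smaller segment.
disproportionate = stratified_sample(
    strata, {"group_a": 0.05, "group_b": 0.25}, random.Random(1))

print({k: len(v) for k, v in proportionate.items()})     # 50 and 10
print({k: len(v) for k, v in disproportionate.items()})  # 50 and 50
```

With the proportionate fractions the small segment contributes only 10 cases; over-sampling it gives 50 cases to analyze, but the combined sample no longer mirrors the population unless the segments are weighted back to their true shares.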

Non-Probability Sampling Useful for populations that: • Cannot be randomly sampled; “hidden” or difficult to reach • Have no sampling frame available, such as census data, describing their size, composition, etc. • Examples: drug users, gay men, homeless people, etc. Likely to misrepresent the population: • May be difficult or impossible to detect this misrepresentation • Often over-sensitive to incentives: paying participants attracts more poor people • “Respondent Driven” sampling (RDS) allows for “targeted” population estimates

Non-Probability methods (1) Haphazard Sampling; “Man on the street” • College psychology majors • Available medical / therapy clients • Volunteer samples Problem: No evidence for representativeness. Advantage: Availability of participants. Modal Instance Sampling; “Typical” case • Typical New Yorker describing trade tower tragedy • Typical voter. Problem: May not represent the modal group. Advantage: Describes a simple, “typical” case. Haphazard / Modal instance sampling is often used by journalists or in qualitative-descriptive studies; see NYT “down low” article.

Non-Probability methods, 2 Venue & time / space Sampling • Sample a specific, well-defined, often hard to reach group • Assume group members are well represented at specific locations or settings (“venues”). • Use “Intercept” methods for reaching participants • Use indigenous outreach workers from the population • Develop a standard recruitment script • Collect / distribute contact information for later participation • Time / Space randomization: • Lessen bias due to choice of venue: • Randomly approach different venues at different times • Randomly select participants within the venue (e.g., every 4th person…) • Strategy must be based on a clear epidemiological or theory question. Examples: Shopping mall intercepts, gay recruitment
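Time / space randomization can be sketched as drawing random venue-by-time units and then intercepting every 4th person within a scheduled unit. The venue names, time slots, number of shifts, and the helper `intercept` are all hypothetical.

```python
import itertools
import random

venues = ["venue_a", "venue_b", "venue_c"]      # hypothetical venues
time_slots = ["Fri 9pm", "Sat 9pm", "Sun 3pm"]  # hypothetical recruitment periods

rng = random.Random(51)  # seeded only so the sketch is repeatable

# Every venue-time combination is one possible outreach shift (9 in total).
all_slots = list(itertools.product(venues, time_slots))

# Randomly choose which shifts the outreach workers will actually staff.
scheduled = rng.sample(all_slots, 4)

def intercept(entrants, every=4):
    """Approach every `every`-th person who enters the venue."""
    return entrants[every - 1::every]

entrants = [f"entrant_{i}" for i in range(1, 21)]
approached = intercept(entrants)

print(scheduled)
print(approached)  # entrants 4, 8, 12, 16, 20
```

Randomizing the shifts reduces bias from always visiting the same venue at the same time, but the result is still a non-probability sample of the wider population, because people who never attend these venues cannot be reached.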

Outreach / venue sampling: examples of palm cards

Outreach lead sheet

Non-Probability methods, 3 Targeted Multi-Frame Sampling • Sample a specific, hard to reach group • No census or similar data for sampling frame. • Uses multiple (convenience) sampling “frames”: • Direct outreach to places where population members are available (venue sampling). • Newsletters, internet lists & chat rooms • Organizations or meeting places • Strategy must be based on a clear epidemiological or theory question. • Most common & valid convenience sample Examples: • “MTV” Market segments • Shoplifters • People who have risky sex • Homeless people…

Non-Probability methods, 4 Snowball / “Respondent Driven” Sampling (RDS) • Early participants are paid to recruit others, who recruit others, etc. • Form of targeted sampling: • Recruit network of “linked” people tracked by referrals Advantage: Access unusual or “hidden” people related by a common behavior. • With enough “generations” of links can well represent a target population. • With RDS can show “chain” of referrals / links. Problem: • Choice of seeds. • Eligibility criteria. • Sensitive to incentives! • Often part of multi-frame approach. • Useful for people who mistrust research or where personal contact is necessary for recruitment (HIV, drug use). • Portrays “chain” of influence or, e.g., infectious disease.
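The referral chain that snowball / RDS recruitment produces can be sketched as a wave-by-wave walk through a social network. This is a toy illustration of referral tracking only, and it does not implement the statistical weighting that RDS uses for population estimates. The network, the single seed, the 2-coupon limit, and the function name `referral_chain` are all hypothetical.

```python
import random
from collections import deque

def referral_chain(network, seeds, coupons, max_waves, rng=None):
    """Track who recruited whom, and in which wave, starting from the seeds."""
    rng = rng or random.Random()
    recruited = {seed: (None, 0) for seed in seeds}  # person -> (recruiter, wave)
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        _, wave = recruited[person]
        if wave >= max_waves:
            continue  # stop recruiting after the final wave
        # Only contacts not already in the study are eligible.
        eligible = [c for c in network.get(person, []) if c not in recruited]
        # Each participant can pass on at most `coupons` referrals.
        for contact in rng.sample(eligible, min(coupons, len(eligible))):
            recruited[contact] = (person, wave + 1)
            queue.append(contact)
    return recruited

# Hypothetical acquaintance network (who knows whom).
network = {
    "A": ["B", "C", "D"], "B": ["A", "E"], "C": ["A", "F", "G"],
    "D": ["A"], "E": ["B", "H"], "F": ["C"], "G": ["C"], "H": ["E"],
}

chain = referral_chain(network, seeds=["A"], coupons=2, max_waves=3,
                       rng=random.Random(59))
for person, (recruiter, wave) in chain.items():
    print(f"wave {wave}: {person} recruited by {recruiter}")
```

Changing the seed person or the coupon limit changes who is reached, which illustrates why the choice of seeds is listed as a problem.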

RDS coupon examples Heckathorn, D.D. & Magnani, R. (2004). Snowball and Respondent-Driven Sampling. In: Behavioral Surveillance Surveys: Guidelines for Repeated Behavioral Surveys in Populations at Risk of HIV.

RDS; chain description Heckathorn, D.D. & Magnani, R. (2004). Snowball and Respondent-Driven Sampling. In: Behavioral Surveillance Surveys: Guidelines for Repeated Behavioral Surveys in Populations at Risk of HIV.

Example of social network sampling: Bearman et al., Romantic ties among adolescents. A substantial majority of students are in an extended, linked chain of relationships, with a number of smaller chains, and a small % in 2 to 4 person chains. • From a sampling perspective, several “seeds” access most of the population. • Findings suggest a clear potential for STI transmission.

Non-Probability methods Quota Sampling Similar to cluster sampling, except you cannot randomly sample each population segment. • Select people non-randomly according to quotas • Must have clear theory / research question to pick relevant population characteristic(s). • Proportional quota sampling • Represent major characteristics of a population. If gender is important, and the proportion of women :: men in your population = 65% :: 35%, the sample must meet that quota. • Non-proportional quota sampling • Sample enough members of each group to test hypothesis, even if the sample is not proportional. (e.g., recruit 50 women & 50 men, even though the real proportion is 65::35). • Helps assure that you have good representation of smaller population groups.
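Proportional and non-proportional quotas can be sketched in Python using the 65% :: 35% example above. The helper names, the sample size of 20, and the stream of 60 volunteers are hypothetical. Volunteers are accepted in arrival order, which is what makes the selection non-random.

```python
def proportional_quotas(percentages, n):
    """Turn population percentages into per-group quotas for a sample of size n."""
    # Integer division rounds down, so totals can fall short of n for other inputs.
    return {group: pct * n // 100 for group, pct in percentages.items()}

def fill_quotas(arrivals, quotas):
    """Accept volunteers in arrival order (non-random) until each quota is full."""
    counts = {group: 0 for group in quotas}
    accepted = []
    for person, group in arrivals:
        if group in counts and counts[group] < quotas[group]:
            counts[group] += 1
            accepted.append(person)
    return accepted, counts

# Hypothetical stream of 60 volunteers: every third one is a man.
arrivals = [(f"volunteer_{i}", "men" if i % 3 == 0 else "women") for i in range(60)]

# Proportional: a sample of 20 that mirrors the 65% :: 35% population split.
quotas = proportional_quotas({"women": 65, "men": 35}, n=20)
accepted, counts = fill_quotas(arrivals, quotas)
print(quotas, counts)  # 13 women and 7 men

# Non-proportional: equal group sizes regardless of the population split.
equal_accepted, equal_counts = fill_quotas(arrivals, {"women": 10, "men": 10})
print(equal_counts)    # 10 women and 10 men
```

Because the first volunteers to arrive fill the quotas, the sample matches the population on gender but may differ from it on any characteristic that was not given a quota.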

Non-Probability methods Web sampling • Typically highly targeted samples • Gay / bisexual men… • Adolescents… • “Gamers”… • Typically access through existing venues: • Users of specific web sites • List-serves, e-mail lists • Active recruitment in “chat rooms” • Problem: Inherent bias in computer literacy (?) • Advantage: • Cheap, large national sample • Access unusual or “hidden” people who reach others via internet

Non-Probability methods; Heterogeneity Sampling • Sample every sector of a population -- at least several of everyone -- without worrying about proportions. • At least some members of each geographic area • …ethnic group • …behavioral group (voters & non-voters…) • Assume that a few people are a good proxy for the group. Examples: focus groups or qualitative interviews about products, social issues... Problem: Cannot be sure a few people really represent their sub-group. Advantage: At least some representation of all sub-groups.

Summary: Sampling overview • Who do you want to generalize to? • Who is the target population? • broad – external validity • narrow – internal validity • How do you decide who is a member? • demographic / behavioral criteria? • subjective / attitudinal? • What do you know about the population already – what is the “sampling frame”? • Is a probability or random sample possible? • “Hidden” population? • Socially undesirable research topic? • Easily available via telephone, door-to-door? • Sampling frame adequate to choose selection method?

Summary: Overview, 2 Types of Non-probability Samples • Haphazard • Modal instance • Venue – time / space • Multi-frame • Snowball / Respondent driven • Web • Quota • Heterogeneity

Summary: Overview, 3
Probability sampling • simple • multi-stage • cluster or stratified • Most externally valid • Assumes: • Clear sampling frame • Population is available • Less externally valid for hidden groups.
Non-probability sampling • targeted / multi-frame • snowball • quota, etc. • Less externally valid • High “convenience” • Best when: • No clear sampling frame • Hidden / avoidant population.
