New Data in the Federal Statistical Research Data Centers

Learn about the types of new data available in Federal Statistical Research Data Centers and how to access them.


Presentation Transcript


  1. New Data in the Federal Statistical Research Data Centers • Melissa Ruby Banzhaf, PhD • Administrator, ARDC • Center for Economic Studies, U.S. Census Bureau • October 9, 2015

  2. Overview • Background on Federal Statistical RDCs • Types of Data Available in the RDCs (Emphasis on New Data) • How to Obtain Access to these New Data (and other data) in the RDCs

  3. What are Federal Statistical Research Data Centers (RDCs)? • Secure computing labs where qualified researchers conduct approved statistical analysis on non-public data. • These data are collected by various government agencies (Census Bureau, NCHS, AHRQ, SSA, and more to come). • Established through an agreement between federal statistical agencies and a local research community. • Managed by the Census Bureau.

  4. Federal Statistical Research Data Center Locations

  5. The Atlanta Research Data Center • Located in the Federal Reserve Bank of Atlanta • corner of 10th & Peachtree • Consortium Members • Emory University • University of Georgia • Georgia State University • Clemson University • Federal Reserve Bank of Atlanta • University of Alabama at Birmingham • University of Tennessee – Knoxville • Florida State University • Georgia Institute of Technology

  6. Types of Restricted Data Available • Economic Data • Microdata on firms and establishments • Business Register data • Demographic Data • Survey data on individuals and households • Administrative data on individuals • Linked survey and administrative datasets • Employer-Employee Jobs Data (LEHD) • Data on employees linked with data on employers • Health Data • National Center for Health Statistics • Agency for Healthcare Research & Quality

  7. Advantages of Restricted Data • Vast number of business datasets that are not publicly available at the micro level • Census datasets can be linked together • Census datasets can be linked to external data • More detailed level of geographic identifiers • Very little top or bottom-coding

  8. Economic Datasets

  9. New Data – Management and Organizational Practices Survey • Supplement to the 2010 Annual Survey of Manufactures • Goal: Collect information on establishments' use of structured management practices • 36 questions: • 16 Management (monitoring, targets, and incentives) • 13 Organization (who makes decisions, use of data in decision-making) • 7 Background (number of managers/non-managers, union status) • Permits analysis of the relationship between management practices and key economic outcomes (e.g., productivity)

  10. Demographic Datasets - Survey • Decennial Surveys (1950-2010) • American Community Survey • Current Population Survey • Survey of Income and Program Participation • American Housing Survey • National Survey of College Graduates • National Crime Victimization Survey

  11. New Data - Decennial • 1950 – 1% PUMS sample • Geography: Census tract but lowest level is enumeration district (roughly 600 people) • 1960 – 25% sample (densest ever) • Geography: Census tract and other sub-county geographies (Census place) but lowest level is enumeration district (roughly 600 people) • Harmonized coding across 1950 and 1960

  12. New Data – Current Population Survey • CPS Basic Monthly Data (2000-2014) • CPS Food Security Supplement (2001-2012) • CPS Voting and Registration Supplement (2006, 2008, 2010, 2012) • CPS Fertility Supplement (1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012)

  13. New Data – Current Population Survey • Characteristics of Internal Files: • Geography: Census tract • March CPS is the only file that has PIKs (Protected Identification Keys) • Has a CPS identification key, so it may be possible to link across CPS surveys • Some limitations on the types of analysis permitted by BLS

  14. New Data – National Crime Victimization Survey • National survey of households (2006-2012) • Collects information on frequency, characteristics, and consequences of criminal victimization (sexual assault, robbery, burglary, motor vehicle theft, etc.) • New: Police-Public Contact Survey (2011) – Collects information on perceptions of police behavior and response during encounters

  15. New Data – National Survey of College Graduates • Biennial survey collects information (such as occupation, work activities, salary, relationship between degree field and occupation) on college-educated individuals with particular emphasis on those in science and engineering fields. • 2010 currently available • Geography at state level • Currently no PIKs

  16. Demographic Datasets - Administrative • Census Numident File (SSA) • Housing Datasets (HUD): • Public and Indian Housing Information Center (PIC) Dataset • Tenant Rental Assistance Certification System (TRACS) Dataset • Computerized Homes Underwriting Management System (CHUMS)

  17. Demographic - Administrative Continued • Medicare/Medicaid Datasets (CMS): • Medicare Enrollment Database • Medicaid Statistical Information System

  18. Administrative – Census Numident • Data derived from applications for Social Security Numbers • Contains data on: • Birthdate • Town or county of birth • Gender • Race • Citizenship • Date of death • PIKs

  19. Administrative - Housing • Public and Indian Housing Information Center (PIC) Dataset • Contains information on all members of a household with a participant in a covered program: • Housing Choice Voucher • Public Housing • Indian Housing • Includes age, race, sex, rent, household income, PIK • Geography: block level

  20. Administrative - Housing • Tenant Rental Assistance Certification System (TRACS) Dataset • Contains information on all members of a household with a participant in a covered program • These programs provide rental assistance for participants living in privately owned, subsidized housing • Includes age, race, sex, rent, household income, PIK • Geography: block level

  21. Administrative - Housing • Computerized Homes Underwriting Management System (CHUMS) • Contains records on approved mortgage applications insured by Federal Housing Administration (FHA) • Contains information on borrowers and co-borrowers including income, housing value, mortgage, demographic characteristics, PIKs • Geography: block level

  22. Administrative - CMS • Medicare Enrollment Database (1999-2014) • Information on all Medicare beneficiaries • Limited to information on people, not claims: eligibility dates and statuses, residence change dates, basic demographic information, PIKs • Geography: block level

  23. Administrative - CMS • Medicaid Statistical Information System (2000-2013) • Information on all Medicaid and CHIP enrollees in each month • Limited to information on people, not claims: eligibility dates and statuses, basic demographic information, PIKs • Geography: zip code level

  24. Demographic Datasets: Linked Survey-Administrative • Current Population Survey - SSA Earnings Files • Survey of Income and Program Participation – SSA Earnings Files • National Longitudinal Mortality Study

  25. Linked: SSA Files with CPS and SIPP • CPS and SIPP survey data matched to SSA earnings files by PIK • SSA records include: • Detailed Earnings Record – earnings from FICA, non-FICA, and self-employment income (1978+) from Master File • Summary Earnings Record – all earnings for each year from 1951 to present • Master Beneficiary Record – entitlement and payment data on Social Security recipients (including Disability) • 831 Disability File – records of medical eligibility determinations for Disability Insurance and SSI benefits

  26. Linked: National Longitudinal Mortality Study • Purpose of database: to study the effects of demographic and socio-economic characteristics on mortality • Survey data: March CPS, 1980 Decennial Census (sample) • Administrative data: Death Certificate information from National Death Index (through 2011) • Geography: county level

  27. LEHD (Longitudinal Employer-Household Dynamics) • “Tracks” a person based on their place of employment; essentially links employees with employers • Based on unemployment insurance administrative records • Available on a state-by-state basis • Quarterly data starting in 1990 – currently through 2011 • Can link employers to employer data in other Census datasets • Can link employees to data on individuals in other Census datasets • New Variables: Firm age and size, Firm ID that matches Business Register

  28. New Data – Innovation Measurement Initiative • Goal: Improve measurement of innovation resulting from research grants, a small but important sector of the economy. • How: Integrate university data on federally funded research grants with Census Bureau data on people and businesses. • Specifically link: • Employee, vendor, sub-award transactions to the Census Business Register and LEHD (employee-employer database). • Innovation outcomes: Job placements, start-up activity and business dynamics, vendor characteristics

  29. New Data – Innovation Measurement Initiative • Partnership between Census and Institute on Research in Innovation and Science (IRIS) at the University of Michigan • Member institutions of IRIS provide data to Census and in turn receive: • Individual and collective reports • Underlying tables and graphics for institution’s use • Access to aggregate data for researchers • Input on new product design

  30. New Data – IMI Opportunity • Census is asking for nominations of teams of 2-5 researchers (at least one member with Special Sworn Status, SSS) to assist in enhancing and documenting data for the IMI project. • What is in it for you? • Opportunity to do research on new data. • $25K in funding support for 1 graduate student. • Initial deadline for nominations: October 16

  31. Health Data in the ARDC • These data are collected by: • National Center for Health Statistics (NCHS) • Agency for Healthcare Research and Quality (AHRQ)

  32. What types of NCHS data?

  33. What types of NCHS data? Linked Data Sets • Linked mortality data: NHIS, NHANES, LSOA II, NNHS • Linked Medicare Enrollment and Claims data: NHIS, NHANES, LSOA II • Linked Social Security Administration data: NHIS, NHANES, LSOA II, NNHS • Linked EPA data

  34. What types of AHRQ Data? • Medical Expenditure Panel Survey (MEPS) files include: • Household Component • Provider Component • Insurance/Employer Component • Nursing Home Component (1996 only) • Area Resource File • Two-year two panel file • MEPS-NHIS linked data • Only Household Component and portions of Provider Component are publicly available

  35. How to Access the RDC • Develop proposal • Guidelines differ for Census data vs. NCHS/AHRQ data • Submit proposal for agency review • Census (and agency sponsors) • NCHS/AHRQ • Obtain Special Sworn Status (SSS) • Pay one-time fee for NCHS/AHRQ data

  36. Timeframe – “Patience is a Virtue” • Census Data • Plan on 6 to 9 months before working in lab • Census approval / other agency approval • NCHS/AHRQ Data • Timeframe dependent on agency approval process • Census approval NOT required • Special Sworn Status • 3 to 4 months for your security clearance

  37. Working in the ARDC lab • All analysis conducted in the ARDC lab • Data located on server in Maryland • Access data via thin client terminals • No internet access or personal computers allowed in lab • Statistical software available: SAS, Stata, R, MATLAB, GIS, SUDAAN, etc. • Agency reviews output before releasing • Penalty for disclosure (inadvertent or otherwise) is $250,000 and/or 5 years in prison

  38. Upcoming RDC-Related Events • Cornell University Course – INFO 7470 – Understanding Social and Economic Data • Can be connected via distance learning (and get course credit) • Intended for Ph.D. students and faculty who use large-scale restricted-access data from government suppliers • Emphasis on data accessible through the RDC network • Interested? Contact us for more information.

  39. Contact Information • People: • Melissa Ruby Banzhaf, ARDC Administrator melissa.r.banzhaf@census.gov, 404-498-7538 • Julie L. Hotchkiss, ARDC Executive Director Julie.l.hotchkiss@atl.frb.org, 404-498-8198 • Resources: • ARDC website: atlantardc.org • Quarterly ARDC Newsletter (email us to get on list)
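
Slide 25 describes CPS and SIPP survey records being matched to SSA earnings files by PIK. The sketch below illustrates what such a key-based match might look like in general. It is a toy example only: the records, the field names (`pik`, `reported_earnings`, `admin_earnings`), and the helper `link_by_pik` are all hypothetical and do not reflect the layout of any actual Census or SSA file.

```python
from collections import defaultdict

# Made-up survey records; "pik" stands in for a Protected Identification Key.
survey = [
    {"pik": "A001", "survey": "CPS", "reported_earnings": 41000},
    {"pik": "A002", "survey": "SIPP", "reported_earnings": 28500},
    {"pik": "A003", "survey": "CPS", "reported_earnings": 60250},
]

# Made-up administrative earnings records, one row per person-year.
admin = [
    {"pik": "A001", "year": 2010, "admin_earnings": 40120},
    {"pik": "A001", "year": 2011, "admin_earnings": 42310},
    {"pik": "A003", "year": 2011, "admin_earnings": 61000},
]


def link_by_pik(survey_rows, admin_rows):
    """Left-join survey rows to administrative rows on the shared key."""
    # Index the administrative rows by key so each lookup is cheap.
    by_pik = defaultdict(list)
    for row in admin_rows:
        by_pik[row["pik"]].append(row)

    linked = []
    for s in survey_rows:
        matches = by_pik.get(s["pik"], [])
        if not matches:
            # Keep unmatched survey respondents, with empty admin fields.
            linked.append({**s, "year": None, "admin_earnings": None})
        for m in matches:
            # One output row per matching person-year.
            linked.append({**s, **m})
    return linked


linked = link_by_pik(survey, admin)
for row in linked:
    print(row)
```

A left join is used here so that survey respondents without an administrative match stay in the result; whether unmatched records should be kept or dropped depends on the research design.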
