
Introduction to CSSCR Archive and Campus Data

Tina Tian

Data Archivist

txtian@u.washington.edu

Topics
  • Major Sources of CSSCR Data Archive
  • Finding Data Sets at CSSCR
  • Other Data Resources at CSSCR
  • Introduction to Decennial Censuses and American Community Survey
CSSCR Archive
  • The Center for Social Science Computation and Research (CSSCR) maintains a large electronic data archive related to social science research.
  • Data sets are available through a web viewer, a network server, or CDROM.
Major Sources of CSSCR Data Archive
  • Inter-University Consortium for Political and Social Research (ICPSR)
  • US Census Bureau
  • Bureau of Labor Statistics
  • Washington State Data Center
Major Sources of CSSCR Data Archive
  • Inter-University Consortium for Political and Social Research (ICPSR) http://www.icpsr.umich.edu
  • Membership-based organization founded in 1962.
  • Provides access to the world’s largest archive of computerized social science data.
  • Offers training facilities for the study of quantitative social analysis techniques (e.g. the ICPSR Summer Program in Quantitative Methods of Social Research).
Major Sources of CSSCR Data Archive
  • US Census Bureau http://www.census.gov
    • 1990, 2000 Decennial Census of Population & Housing
      • Summary Tape File/Summary File (STF/SF)
      • Public Use Microdata Sample (PUMS)
    • American Community Survey (ACS)
Major Sources of CSSCR Data Archive
  • Bureau of Labor Statistics www.bls.gov/nls
    • National Longitudinal Survey of Youth 79, 97 (NLSY79, NLSY97) public-use files (CDs are available at CSSCR, or the files can be downloaded free from the BLS website)
    • National Longitudinal Survey of Youth 79, 97 geocode data (confidential data)
      • Provides geographic variables for the data files
      • To protect the confidentiality of respondents, an agreement letter has to be signed with BLS.

Major Sources of CSSCR Data Archive
  • National Center for Education Statistics

http://nces.ed.gov/surveys/

http://nces.ed.gov/surveys/SurveyGroups.asp?Group=1

    • Education Longitudinal Study of 2002/06
      • The second follow-up data file of ELS2002
      • The restricted use data file
    • Room 106 Savery Hall is the secure room for using the restricted data file
Major Sources of CSSCR Data Archive
  • Washington State Data Center

http://www.ofm.wa.gov

    • WA State Vital Statistics
    • WA State Population Projections
    • WA State Population Surveys
    • Pregnancy & Abortion Data
Other Sources of CSSCR Data Archive
  • Data Access via DataFerrett http://dataferrett.census.gov
    • Current Population Survey http://cps.ipums.org/cps/
    • Survey of Income and Program Participation
  • iPOLL databank at The Roper Center for Public Opinion Research is available through UW library

http://roperweb.ropercenter.uconn.edu/cgi-bin/hsrun.exe/Roperweb/iPOLL/iPOLL.htx;start=HS_iPOLL_LoginSetup

  • National Center for Health Statistics

http://www.cdc.gov/nchs/express.htm

Finding Data Sets at CSSCR
  • Web Site
  • CDROM
  • Codebook

All these materials are available at 110 Savery Hall or on the CSSCR web site.

Finding Data Sets through CSSCR Web Viewers
  • A complete list of data sets at CSSCR is available on the CSSCR Web page.
  • Most online data sets at CSSCR can be accessed through a web browser.
  • The CSSCR archive website address is

http://julius.csscr.washington.edu

Finding Data Sets through CSSCR Web Viewers
  • The data sets on the CSSCR homepage are divided into several categories:
    • ICPSR data
    • CDROM data
    • Census 2000
    • ACS
    • Census 2010

Clicking on one of these five icons will bring you to “ICPSR Resource”, “CDROM list”, or “Census 2000, ACS Washington data”.

Finding Data Sets through CSSCR Web Viewers
  • In “ICPSR Resource”, click on “Archive Browser” to search the data for the files you want. Under each title, information such as data source, codename, abstract, and storage medium is displayed.
Types of File
  • Codebooks & Documentation
    • Dataset codebook: <filename>.cod
    • Data dictionary: <filename>.dic or <filename>.doc
    • File description: <filename>.des
    • Frequency listing: <filename>.fre
    • Dataset errata: <filename>.err
Types of File
  • Data Files
    • ASCII file: <filename>.dat
    • SPSS system file: <filename>.sav or <filename>.svf
    • SPSS portable file: <filename>.por or <filename>.exp
    • SPSS syntax file: <filename>.spss
    • SAS data file: <filename>.sas7bdat
    • SAS catalog file: <filename>.sas7bcat
    • SAS transport file: <filename>.xpt
    • SAS syntax file: <filename>.sas
    • STATA data file: <filename>.dta
    • STATA syntax file: <filename>.do
    • STATA dictionary file: <filename>.dct
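
The extension lists on the two "Types of File" slides can be turned into a small lookup for sorting a downloaded study's files. This is only a sketch: the function name `describe_file` and the sample file names are hypothetical, and the table contains nothing beyond the extensions listed above.

```python
from pathlib import Path

# Extension-to-type table transcribed from the "Types of File" slides.
FILE_TYPES = {
    ".cod": "Dataset codebook",
    ".dic": "Data dictionary",
    ".doc": "Data dictionary",
    ".des": "File description",
    ".fre": "Frequency listing",
    ".err": "Dataset errata",
    ".dat": "ASCII file",
    ".sav": "SPSS system file",
    ".svf": "SPSS system file",
    ".por": "SPSS portable file",
    ".exp": "SPSS portable file",
    ".spss": "SPSS syntax file",
    ".sas7bdat": "SAS data file",
    ".sas7bcat": "SAS catalog file",
    ".xpt": "SAS transport file",
    ".sas": "SAS syntax file",
    ".dta": "STATA data file",
    ".do": "STATA syntax file",
    ".dct": "STATA dictionary file",
}


def describe_file(name: str) -> str:
    """Return the archive file type for a file name, or "Unknown" if unlisted."""
    # Compare case-insensitively, since archive files may carry upper-case extensions.
    suffix = Path(name).suffix.lower()
    return FILE_TYPES.get(suffix, "Unknown")


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for example in ("study.cod", "study.sav", "STUDY.DTA", "notes.txt"):
        print(example, "->", describe_file(example))
```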
Economic Data at CSSCR
  • Economagic: Economic Time Series Page http://www.economagic.com/
    • Provides web browsing of U.S. business, economic, and trade information
  • DRI_WEFA Basic Economics Database
  • DataStream Database
DRI_WEFA Basic Economics Database
  • A national macroeconomics database that contains about 7,000 monthly, quarterly, and annual time series, dating back to 1946 where available and ending with the latest available observations.
  • Includes financial data, construction & housing data, industrial statistics, population counts & estimates, foreign trade & interest rates.
  • Accessible through E-Views in the CSSCR lab. A reference book is available in Room 110 Savery Hall.
DataStream Database
  • Provides access to various global economic and financial databases (e.g. National Government & OECD Series, International Monetary Fund, equities, bond indices, interest and exchange rates, company account definitions, etc.).
  • At CSSCR, DataStream is available only through the Archivist in Room 113 Savery Hall.
Seattle Data Viewer
  • A neighborhood information system.
  • Provides access to a comprehensive set of information about the city infrastructure and environment.
  • Lets users organize and print data and maps of the city.
  • Accessible in the CSSCR lab through “P:\Data\Seattle_Data_viewer”.

Seattle Data Viewer
  • Neighborhood statistics are grouped into the following units:
    • Base map
    • Crimes and public safety
    • Housing, health, education and civic locations
    • Land use, value and zoning
    • Landscape and environmental features
    • Municipal and district boundaries
    • Parks, recreation and open space
    • Population and demographics
    • Streets and transportation
    • Utilities

Available Census Data at CSSCR
  • 1980 census data: STF1, STF3 (raw data)
  • 1990 census data: STF1, STF2, STF3, STF4, 1% PUMS, 5% PUMS
  • 2000 census data: SF1, SF2, SF3, SF4, 1% PUMS, 5% PUMS
  • 2005-2008 1-Year ACS: ACS SF, 5% PUMS
  • 2005-2008 3-Year ACS: ACS SF, 5% PUMS

http://julius.csscr.washington.edu/Decennial%20Census.htm

http://julius.csscr.washington.edu/american_community_survey.htm

Census CDs (GeoLytics)
  • Census CD 1960 Long Form
  • Census CD 1970 Long Form
  • Census CD 1980 Long Form in 2000 Areas
  • Census CD 1980 Long Form
  • Census CD 1990 blocks & Long Form
  • Census CD 1990-2000
  • Census CD 2000 blocks & Long Form
  • Census 2000 Redistricting
  • NCDB – Neighborhood Change Database
  • StreetDVD 2007

http://julius.csscr.washington.edu/Census%20CD.htm

Available in Room 119 Savery Hall

Introduction to Decennial Censuses
  • Decennial Census of Population & Housing
    • Summary Tape File/Summary File (STF/SF)
    • Public Use Microdata Sample (PUMS)
Introduction to Decennial Censuses
  • What is a Summary Tape File/Summary File (STF/SF)?
    • The basic unit of analysis is a specific geographic area.
    • Reports counts of persons or housing units in particular categories.
    • Also called tabulated summary statistics.
Introduction to Decennial Censuses
  • The Types of STF/SF
    • STF/SF 1 and 2 present tabulated data from the Census short-form (100%) questionnaire.
    • STF/SF 3 and 4 present cross-tabulations of information from the long-form (sample) questionnaire.
    • Tables in STF/SF 2 and 4 are iterated for many detailed racial groups, as well as American Indian and Alaska Native tribes. In SF4, many data are also tabulated by detailed ancestry groups.
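
As a quick reference, the three rules above (files 1 and 2 from the short form, files 3 and 4 from the long form, files 2 and 4 iterated by group) can be written as a small lookup. This is a minimal sketch; the names `SummaryFileInfo` and `summary_file_info` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class SummaryFileInfo(NamedTuple):
    questionnaire: str       # which census questionnaire the tables come from
    iterated_by_group: bool  # True when tables repeat for detailed racial groups and tribes


def summary_file_info(number: int) -> SummaryFileInfo:
    """Describe STF/SF file 1, 2, 3, or 4 using the rules on the slide above."""
    if number not in (1, 2, 3, 4):
        raise ValueError("STF/SF number must be 1, 2, 3, or 4")
    # Files 1 and 2 tabulate the short form; files 3 and 4 tabulate the long form.
    questionnaire = "short form (100%)" if number in (1, 2) else "long form (sample)"
    # Files 2 and 4 are the ones iterated for detailed groups.
    return SummaryFileInfo(questionnaire, number in (2, 4))


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"STF/SF {n}:", summary_file_info(n))
```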
Introduction to Decennial Censuses
  • 2000 Census short-form questionnaire:
    • full population
    • six questions
      • Household relationship
      • Sex
      • Age
      • Hispanic or Latino origin
      • Race
      • Tenure (whether the home is owned or rented)
Introduction to Decennial Censuses
  • 2000 Census long-form questionnaire:
    • a sample covering 15.8%-17% of the full population
    • divided into two parts
      • Population
        • social and economic characteristics (14 areas)
      • Housing
        • physical and financial characteristics (11 areas)
Introduction to Decennial Censuses
  • In 1980 and 1990 census data (STF1A, STF2B, STF3C, STF4D…):
    • The letters A, B, C, and D indicate different levels of geographic area
      • A - block groups; B - blocks, zip codes;
      • C - place, county; D - Congressional district
  • In 2000 census data, no letters indicate the level of geographic area
  • Table indicators:
    • P - person; H - housing unit
    • PCT/PT - person, down to the census tract level
    • HCT/HT - occupied housing unit, down to the census tract level

http://julius.csscr.washington.edu/Decennial%20Census.htm
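
The table indicators can be read off a table identifier mechanically. The sketch below assumes identifiers consist of a letter prefix followed by a table number (for example "P1" or "PCT12"); that identifier shape and the function name `table_universe` are assumptions made for illustration, while the prefix meanings are the ones listed on the slide.

```python
import re

# Prefix meanings as listed under "Table indicators" on the slide.
TABLE_PREFIXES = {
    "P": "person",
    "H": "housing unit",
    "PCT": "person, down to census tract level",
    "PT": "person, down to census tract level",
    "HCT": "occupied housing unit, down to census tract level",
    "HT": "occupied housing unit, down to census tract level",
}


def table_universe(table_id: str) -> str:
    """Return what a table counts, based on the letter prefix of its identifier."""
    # Assumed identifier shape: letters followed by digits, e.g. "PCT12".
    match = re.fullmatch(r"([A-Za-z]+)\d+", table_id.strip())
    if match is None:
        raise ValueError(f"unrecognized table identifier: {table_id!r}")
    prefix = match.group(1).upper()
    if prefix not in TABLE_PREFIXES:
        raise ValueError(f"unknown table prefix: {prefix!r}")
    return TABLE_PREFIXES[prefix]


if __name__ == "__main__":
    # Illustrative identifiers only.
    for example in ("P1", "H3", "PCT12", "HCT2"):
        print(example, "->", table_universe(example))
```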

Introduction to Decennial Censuses
  • What is the Public Use Microdata Sample (PUMS)?
    • The basic unit of analysis is a housing unit or the persons who live in it, with identifiers (such as addresses, names, etc.) removed to protect individual confidentiality.
    • It is a stratified sample of the population, created by subsampling the full census sample that received the census long-form questionnaire.
Introduction to Decennial Censuses
  • The Types of PUMS
    • 5-percent sample file (PUMS-A file)
    • 1-percent sample file (PUMS-B file)
Introduction to Decennial Censuses
  • 5-percent sample file (PUMS-A file)
    • Provides records for over 14 million people and over 5 million housing units
    • The Public Use Microdata Area (PUMA) is the lowest level of geographic identifier, with a minimum population threshold of 100,000
    • This sample has been produced only since 1980
Introduction to Decennial Censuses
  • 1-percent sample file (PUMS-B file)
    • Provides a fuller range of detailed characteristics
    • Provides records for over 2.8 million people and over 1 million housing units
    • Each super-PUMA meets a minimum population of 400,000 and is composed of one or more PUMAs delineated on the 5-percent PUMS files
Introduction to Decennial Censuses
  • Integrated Public Use Microdata Series (IPUMS) http://www.ipums.umn.edu/
    • Consists of thirty-eight high-precision samples of the American population drawn from fifteen federal censuses (1850 – 2000) and from the American Community Surveys of 2000-2008
    • Is particularly useful for historical research because the data are comparable across time
What is the American Community Survey (ACS)?
  • A large, continuous demographic survey
  • Produces annual and multi-year estimates of the characteristics of the population and housing
  • Will replace the 2010 census long form by collecting detailed information throughout the decade
  • The short form remains in the 2010 decennial census
ACS Program Schedule
  • Testing and development: 1994-2004
  • Full implementation began in 2005
  • Group Quarters data collection began in 2006
What is a Group Quarters (GQ)?
  • Definition: A living quarter, other than the usual house, apartment, or mobile home, in which unrelated people live or stay.
  • Examples:
    • Institutional: Nursing homes, hospitals, prison wards
    • Non-institutional: College dorms, military barracks, shelters
Full Implementation
  • Annual national sample of approximately 3 million addresses, covering every county and American Indian and Alaska Native area in the United States
  • Provides profiles every year for communities of 65,000 population or more
  • Provides 3-year accumulations for communities of 20,000 population or more
  • Provides 5-year accumulations for all communities, down to the block group as the lowest geographic level
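
The population thresholds above determine which ACS estimates exist for a given community. The following sketch encodes them, using 65,000 or more for 1-year estimates and 20,000 or more for 3-year estimates as stated elsewhere in these slides; the function name `available_acs_estimates` and the sample populations are hypothetical.

```python
from typing import List

ONE_YEAR_MIN_POPULATION = 65_000    # 1-year profiles: 65,000 or more
THREE_YEAR_MIN_POPULATION = 20_000  # 3-year accumulations: 20,000 or more


def available_acs_estimates(population: int) -> List[str]:
    """List the ACS estimate types produced for a community of this size."""
    if population < 0:
        raise ValueError("population cannot be negative")
    estimates = []
    if population >= ONE_YEAR_MIN_POPULATION:
        estimates.append("1-year")
    if population >= THREE_YEAR_MIN_POPULATION:
        estimates.append("3-year")
    # 5-year accumulations are produced for all communities.
    estimates.append("5-year")
    return estimates


if __name__ == "__main__":
    # Hypothetical community sizes, for illustration only.
    for size in (500, 25_000, 70_000):
        print(size, "->", available_acs_estimates(size))
```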
ACS Data Release Schedule

Before the 2004 ACS, the population threshold was 250,000 or more.

ACS File Types
  • ACS Summary File (ACS SF)
  • ACS Public Use Microdata Sample
    • One-year PUMS (1-in-100, 1%, national random sample of the population, 1.3 million housing units & 3 million people)
    • Three-year PUMS (3-in-100, 3%, national random sample of the population, 3.9 million housing units & 9.1 million people)
    • Five-year PUMS (5-in-100, 5%, national random sample of the population)
    • The Public Use Microdata Area (PUMA) is the lowest level of geographic identifier, with a minimum population threshold of 100,000
Comparing ACS with the Decennial Census Long-Form Questionnaires
  • Sample rate/size & design
  • Data collection
  • Residence rules & reference periods
Sample Rate/Size & Design Comparison
  • Census sample estimates are based on about 18 million housing units; ACS 5-year estimates are based on about 11 million housing units, and 1-year estimates on about 3 million housing units
  • The ACS samples every year and spreads its sample over 12 months; the census samples once a decade and uses the entire sample at the same time

ACS estimates have higher sampling error than census long-form estimates; however, that error is shown as 90% confidence limits or margins of error in every table. Similar sampling error measures have not been provided for census long-form sample estimates.
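
A 90% margin of error can be turned into confidence limits and a standard error, and two estimates can be compared for a statistically significant difference. The sketch below assumes the margin corresponds to the standard normal critical value of 1.645 for a two-sided 90% interval and that the two estimates are independent; neither assumption is stated on the slide, and all function names and figures are hypothetical.

```python
import math

# Standard normal critical value for a two-sided 90% interval (an assumption
# about how the published margins of error were computed).
Z_90 = 1.645


def confidence_limits(estimate: float, moe: float) -> tuple:
    """Lower and upper 90% confidence limits from an estimate and its margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe: float) -> float:
    """Recover the standard error from a 90% margin of error."""
    return moe / Z_90


def differ_significantly(est_a: float, moe_a: float, est_b: float, moe_b: float) -> bool:
    """True if two independent estimates differ at the 90% confidence level."""
    se_diff = math.sqrt(standard_error(moe_a) ** 2 + standard_error(moe_b) ** 2)
    return abs(est_a - est_b) / se_diff > Z_90


if __name__ == "__main__":
    # Hypothetical estimates: 50,000 and 53,000, each with a margin of error of 2,000.
    print(confidence_limits(50_000, 2_000))
    print(differ_significantly(50_000, 2_000, 53_000, 2_000))
```

Under these assumptions the comparison amounts to checking whether the gap between the two estimates exceeds the square root of the sum of the squared margins of error.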

Data Collection Comparison
  • ACS nonresponse follow-up uses computer-assisted telephone and computer-assisted personal interviews; past censuses have used only paper questionnaires
  • ACS data are collected only from household members; census data are often collected from neighbors

The ACS has a higher level of overall response and individual item response, so there is less chance of nonresponse bias, which means lower potential nonsampling error.

Residence Rules Comparison
  • Decennial census is based on the concept of “usual residence”
    • The place where the person lives and sleeps most of the time
    • If a person had no usual residence, the person was to be counted where he or she was staying on Census Day
  • ACS uses a “two-month” rule
    • A person is a resident of an address if he or she
      • Lives there year round
      • Lives there more than 2 months but not year round
      • Is living there now with no other place to live
    • A person is not a resident of an address if he or she
      • Lives there 2 months or less and has another residence
      • Is away now for more than 2 months

Compare with Caution
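
The ACS "two-month" rule can be expressed as a short decision function. This is a simplified sketch: the parameter names are hypothetical, stays are measured in months, and an absence of more than 2 months is assumed to take precedence over the other conditions, which the slide does not state.

```python
def is_acs_resident(months_at_address: float,
                    has_other_residence: bool,
                    months_away_now: float = 0.0) -> bool:
    """Apply the ACS two-month rule to one person at one address."""
    # Away now for more than 2 months: not a resident (assumed to override the rest).
    if months_away_now > 2:
        return False
    # Lives there year round, or more than 2 months but not year round: resident.
    if months_at_address > 2:
        return True
    # Stay of 2 months or less: resident only when there is no other place to live.
    return not has_other_residence


if __name__ == "__main__":
    print(is_acs_resident(12, has_other_residence=False))   # year-round occupant
    print(is_acs_resident(1, has_other_residence=True))     # short stay, lives elsewhere
    print(is_acs_resident(1, has_other_residence=False))    # short stay, nowhere else to live
```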

Reference Periods Comparison
  • ACS uses the interview date as the single reference point, or as the end of a reference period, for all data collection
  • The decennial census always uses Census Day (April 1) as the reference point
  • Examples:
    • Income
      • ACS asks for income for the previous 12 months
      • Decennial census income data refer to the calendar year before Census Day (April 1)
    • School enrollment
      • ACS asks if a person attended school during the “last three months”
      • Census 2000 asks if a person attended school “any time since February 1”

Compare with Caution

Comparison
  • Comparing ACS Data to Census 2000 & Other Sources

http://www.census.gov/acs/www/guidance_for_data_users/comparing_data/

  • When to use 1-year, 3-year, or 5-year estimates

http://www.census.gov/acs/www/guidance_for_data_users/estimates/

  • Comparing 2008 ACS data

http://www.census.gov/acs/www/guidance_for_data_users/comparing_2008/

Available ACS Data
  • The 2005 single-year ACS covers the household population only, for areas with populations of 65,000 or more
  • The 2006, 2007 & 2008 single-year ACS cover the household population and the group quarters population for areas with populations of 65,000 or more
  • The 2005-2007 & 2006-2008 three-year ACS cover the household population and/or the group quarters population for areas with populations of 20,000 or more
ACS Data Release Schedule in 2010
  • The 2009 single-year ACS will be released by the end of September. These estimates will be available for areas with populations of 65,000 or more.
  • The 2005-2009 5-year ACS is planned for release in December. These estimates will be available for all areas regardless of population size, down to the census tract. Early in 2011, the 2005-2009 ACS Summary Files down to the block group will be released, as will the 2005-2009 PUMS files.
  • The 2007-2009 3-year ACS is planned for release in January 2011. These estimates will be available for all geographic areas with populations of 20,000 or more.