The Data Liberation Initiative Orientation Session

The Data Liberation Initiative Orientation Session. University of Alberta, December 5, 2001. Statistics Canada / Statistique Canada. Chuck Humphrey. Products and Services: Establishing Perspectives • statistical information • statistics and data • statistics & data sources

Presentation Transcript


  1. The Data Liberation Initiative Orientation Session University of Alberta December 5, 2001 Statistics Canada / Statistique Canada Chuck Humphrey

  2. Products and Services Establishing Perspectives • statistical information • statistics and data • statistics & data sources • national and international • continuum of access • DLI

  3. Statistical Information Statistics • numeric facts/figures • created from data, i.e., already processed • presentation-ready Data • numeric files organized for analysis • requires processing • not ready for display

  4. Statistical Information The lines are blurring ... • the past • if it was on paper, it was statistics • if it was digital, it was data • the present • dynamic tables retrievable from online databases • e-journal publications with tables

  5. Statistical Information Statistics ... and a map!

  6. Statistical Information: Product Implications • won’t have a ‘published’ product but will rather be forced to work with dynamically generated tables from databases • toward this end, will see more Web retrieval of statistics and processing of data • examples: STC Community Profiles and ICPSR Data Analysis System

  7. Statistical Information: Product Implications • may only see graphical displays of statistics or data without the numbers or data • example: Web map servers

  8. Statistical Information: Service Implications • spend less time providing technical services and more time doing extended reference and consulting • the move to disintermediate products, that is, make them self-serve

  9. Statistical Information: Service Implications • need to deal with an even wider variety of retrieval or software tools and possibly formats • may be more difficult to get at the actual statistics or data that are wanted (especially historical data)

  10. Financial & Stock Data Academic Research Data Statistics Canada Other Canadian Gov’t & Non-gov’t Sources Statistics & Data Sources

  11. Statistics Canada Other Governmental & Non-Governmental Academic Research Data Financial & Stock Data Statistics & Data Sources • Surveys • x-sect’l & longitudinal • Aggregate dbases • time-series & x-class • Geography files • Supporting documentation • SIC, SOC

  12. Statistics Canada Other Governmental & Non-Governmental Academic Research Data Financial & Stock Data Statistics & Data Sources • Health Canada • HBSC & Heart Health • CIC • LIDS & IMDB • CIHI • GDSourcing • Statistical Universe

  13. Statistics Canada Other Governmental & Non-Governmental Academic Research Data Financial & Stock Data Statistics & Data Sources • ICPSR • ISSP • World Values • Eurobarometers • ISR-York • CNES • Data Libraries • AAS

  14. Statistics Canada Other Governmental & Non-Governmental Academic Research Data Financial & Stock Data Statistics & Data Sources • Datastream • Financial Post Corporate Database • Compustat • CRSP • DRI Basic Economics

  15. Statistics Canada is an important source for statistics and data, but not the only source. Statistics & Data Sources

  16. Turning to Statistics Canada, access to statistics and data is through a variety of services and initiatives. Think of this as a continuum along which levels of access are provided. Continuum of Access

  17. Continuum of Access: Characteristics of this continuum are: • cost, which runs from free to expensive • restrictions, which run from open to very restricted • information, which runs from statistics to data

  18. Statistical Information Available through Statistics Canada. Across the different services, access runs from open to restricted, cost from free to expensive, and content from statistics to data.
      • Statistics Canada Website. Who is eligible & conditions: General Public; available on the Internet at www.statcan.ca. Products: The Daily; Canadian Statistics; Census; Statistical profiles of Canadian communities; Downloadable publications. Notes: Warning: some parts of the Website are fee-based.
      • Depository Service Program. Who is eligible & conditions: Designated DSP Libraries & their Users; available on site. Products: Paper publications; Electronic publications, which includes priced down-loadable publications & select CD ROMS. Notes: Some DSP libraries provide off-site access to authenticated users for selected titles.
      • Data Liberation Initiative. Who is eligible & conditions: Post-secondary Academic; restricted to teaching and research purposes. Products: Standard data products: aggregate data bases, microdata files and geography files. Notes: Interface to CANSIM I and Trade Analyzer available through CHASS (University of Toronto) by subscription.
      • Cu$tomized Tabulations & Pay per View. Who is eligible & conditions: Individuals; contract between STC and individual. Products: Tables from confidential files that are specially produced by Statistics Canada for a fee and access to specialized databases. Notes: Specialized databases include CANSIM II and the Trade Analyzer.
      • Remote Job Submission. Who is eligible & conditions: Approved Researchers; contract between STC and individual. Products: “Dummy” or synthetic files to build analysis setups that must then be submitted to Stats Can for processing. Notes: Remote job submission is most developed for NPHS.
      • Research Data Centres. Who is eligible & conditions: Approved Researchers; SSHRC peer review & deemed STC employee. Products: Confidential data files from the longitudinal surveys begun in the 1990’s. Notes: Applications can now be submitted through the SSHRC Web site.
      ACCESS Services available

  19. Products and Services Summary • statistical information • traditional ways of handling print statistics now challenged by online statistics and data • statistics & data sources • Statistics Canada is an important source but not the only source • continuum of access • Several points of access may be needed when dealing with Statistics Canada

  20. The DLI license provides post-secondary institutions with access to “standard data products”, which consist of public use microdata, aggregate databases, and geography files listed in the Statistics Canada Catalogue. Product Types

  21. Think of this as the stuff that is sold, excluding publications and services. • STC Online Catalogue • Medium Categories • Tape • CD-ROM • Diskette Product Types

  22. Think of this as the stuff that is sold, excluding publications. Product Types Tape CD-ROM Diskette

  23. Product Types: Aggregate data • statistics organized in databases or as data files • tabulations structured by time, geography, and social content

  24. Structure Time Geography Social Content Aggregate Data Example: CANSIM

  26. Structure Time Geography Social Content Aggregate Data Example: Census

  27. Structure Time Geography Social Content Aggregate Data Example: Small Area Statistics SABAL cancelled

  28. Structure Time Geography Social Content Aggregate Data Example: HID
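
The three-part structure of aggregate data (time, geography, social content) can be illustrated with a minimal sketch. The table, its values, and the function names `time_series` and `cross_section` are all invented for illustration; they do not come from CANSIM, the Census, or any Statistics Canada product.

```python
# Hypothetical aggregate table. Each cell is a statistic identified by
# time, geography, and social content; all values are invented.
cells = {
    (1999, "Alberta", "employed"): 1500,
    (2000, "Alberta", "employed"): 1550,
    (2001, "Alberta", "employed"): 1600,
    (2001, "Ontario", "employed"): 5900,
    (2001, "Alberta", "unemployed"): 80,
}


def time_series(table, geography, content):
    """Fix geography and social content, then read the statistic across time."""
    return {
        year: value
        for (year, place, topic), value in sorted(table.items())
        if place == geography and topic == content
    }


def cross_section(table, year, content):
    """Fix time and social content, then read the statistic across geography."""
    return {
        place: value
        for (when, place, topic), value in sorted(table.items())
        if when == year and topic == content
    }


print(time_series(cells, "Alberta", "employed"))
print(cross_section(cells, 2001, "employed"))
```

Holding two of the three dimensions fixed and varying the third is what yields a time series or a cross-classification from the same aggregate table.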

  29. Product Types: Microdata • raw data organized in a file where the records or lines in the file are observations of a specific unit of analysis and the information on each line consists of the values of variables • requires some form of processing or analysis to be used
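
As a rough illustration of why microdata require processing, the sketch below holds one record per respondent and only yields a statistic once the records are tabulated. The records, variable names, and the `tabulate` helper are hypothetical and invented for this example.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent (the unit of analysis),
# one value per variable. All names and values are invented.
microdata = [
    {"id": 1, "province": "AB", "age_group": "25-34", "employed": True},
    {"id": 2, "province": "AB", "age_group": "35-44", "employed": False},
    {"id": 3, "province": "BC", "age_group": "25-34", "employed": True},
    {"id": 4, "province": "AB", "age_group": "25-34", "employed": True},
]


def tabulate(records, variable):
    """Count how many records take each value of one variable."""
    return Counter(record[variable] for record in records)


# The raw file is not display-ready; the tabulation below is the "statistic".
province_counts = tabulate(microdata, "province")
print(dict(province_counts))
```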

  30. Public Use Microdata: Anonymized Microdata • these are microdata prepared to minimize the possibility of disclosing or identifying any of the cases or observations • the original data (or master file) are edited to create a public use microdata file

  31. Public Use Microdata: Steps in Anonymizing Microdata • removal of all personal identification information (names, addresses, etc.) • include only gross levels of geography • collapse detailed information into a smaller number of general categories • suppress the values of a variable
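
The four anonymizing steps can be sketched on a single record. This is only an illustration under stated assumptions: the field names, the age bands, and the income cut-off used for suppression are invented here and are not Statistics Canada's actual disclosure rules.

```python
def anonymize(record):
    """Apply the four steps to one hypothetical master-file record."""
    public = dict(record)  # work on a copy; the master record is left intact

    # Step 1: remove personal identification information.
    for field in ("name", "address"):
        public.pop(field, None)

    # Step 2: keep only a gross level of geography (province, not postal code).
    public.pop("postal_code", None)

    # Step 3: collapse detailed information into fewer general categories.
    age = public.pop("age")
    if age < 25:
        public["age_group"] = "under 25"
    elif age < 65:
        public["age_group"] = "25-64"
    else:
        public["age_group"] = "65+"

    # Step 4: suppress a value that could single out a case
    # (the 200,000 threshold is an assumption made for this sketch).
    if public["income"] is not None and public["income"] > 200_000:
        public["income"] = None

    return public


master_record = {
    "name": "A. Respondent",
    "address": "1 Example Street",
    "postal_code": "X0X 0X0",
    "province": "AB",
    "age": 37,
    "income": 250_000,
}
print(anonymize(master_record))
```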

  32. Public Use Microdata: Statistics Canada PUMFs • only available for select social surveys that undergo a review by the Data Release Committee, an internal Statistics Canada committee • no enterprise public use microdata

  33. Public Use Microdata: Statistics Canada PUMFs • almost all are cross-sectional, that is, represent data collected at one point in time • longitudinal data are difficult to anonymize and maintain useful information

  34. Statistics Canada PUMFs how do you recognize a PUMF? Statistics Canada calls them public use microdata files in the Daily. Public Use Microdata

  35. Other Microdata in Statistics Canada Master files: these are the confidential files from which public use microdata are created. They contain the fullness of the data captured about the unit of observation. Statistics Canada Microdata

  36. Other Microdata in Statistics Canada Share files: these are confidential files in which the respondents have signed a consent form permitting Statistics Canada to allow access for approved research to their information. Statistics Canada Microdata

  37. Product Types: Geography Files • Census digital boundary and cartographic files in two proprietary formats: ArcView and MapInfo • correspondence tables for linking between Postal Code geography and Census geography
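
A correspondence table is essentially a lookup from one geography to another. The sketch below shows the idea with invented placeholder codes; the postal codes, the `CSD-…` identifiers, and the helper name `attach_census_geography` are assumptions, not values or tools from a Statistics Canada file.

```python
# Hypothetical correspondence table: postal code -> census geography code.
# Every code below is an invented placeholder.
correspondence = {
    "X1A 1A1": "CSD-0001",
    "X1A 1A2": "CSD-0001",
    "X2B 2B2": "CSD-0002",
}

records = [
    {"id": 1, "postal_code": "X1A 1A1"},
    {"id": 2, "postal_code": "X2B 2B2"},
    {"id": 3, "postal_code": "X9Z 9Z9"},  # not present in the table
]


def attach_census_geography(rows, table):
    """Add a census geography code to each row by looking up its postal code."""
    linked = []
    for row in rows:
        code = table.get(row["postal_code"])  # None when there is no match
        linked.append({**row, "census_geography": code})
    return linked


for row in attach_census_geography(records, correspondence):
    print(row)
```

Unmatched postal codes are kept with a missing geography rather than dropped, so the user can see how many cases failed to link.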

  38. Product Types: Digital Copies of Standardized Code Lists and Concordances • Files containing standardized codes for industry, goods, and occupations • correspondence tables between versions of standardized codes for industry and occupations

  39. Data Service Models: Service models were presented as a continuum during the 1997 DLI workshop • Treat as a Collection and Provide Reference • Install Data and Provide Access • “Order & Pass-through” Service

  40. Data Service Models Choose a model that matches your staff and computing resources

  41. Acquisition Fill a Request  Locate data  Order data & documentation Collection Development Select & Locate data Order data & documentation Catalogue data & documentation Install & Store (data & documentation) Reference Search for data Interpret documentation Retrieve or download data Process data change formats subset cases or variables aggregate cases merge files analyze data

  42. Acquisition Fill a Request  Locate data  Order data & documentation Collection Development Select & Locate data Order data & documentation Catalogue data & documentation Install & Store (data & documentation) Reference Search for data Interpret documentation Retrieve or download data Process data change formats subset cases or variables aggregate cases merge files analyze data

  43. Acquisition Fill a Request  Locate data  Order data & documentation Collection Development Select & Locate data Order data & documentation Catalogue data & documentation Install & Store (data & documentation) Reference Search for data Interpret documentation Retrieve or download data Process data change formats subset cases or variables aggregate cases merge files analyze data

  44. Acquisition Fill a Request  Locate data  Order data & documentation Collection Development  Select & Locate data  Order data & documentation Catalogue data & documentation Install & Store (data & documentation) Reference Search for data Interpret documentation Retrieve or download data Process data  change formats  subset cases or variables aggregate cases merge files analyze data

  45. Find a referral partner on campus Acquisition Fill a Request  Locate data  Order data & documentation Collection Development  Select & Locate data  Order data & documentation Catalogue data & documentation Install & Store (data & documentation) Reference Search for data Interpret documentation Retrieve or download data Process data  change formats  subset cases or variables aggregate cases merge files analyze data
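
Several of the "Process data" tasks named in the service model slides (subset cases or variables, aggregate cases, change formats) can be sketched in a few lines. The records, variable names, and helper functions below are hypothetical; a real data service would more likely use a statistical package, which this sketch does not attempt to represent.

```python
import csv
import io
from collections import defaultdict

# Hypothetical cases; all values are invented for illustration.
cases = [
    {"province": "AB", "sex": "F", "hours": 40},
    {"province": "AB", "sex": "M", "hours": 35},
    {"province": "BC", "sex": "F", "hours": 20},
]


def subset_cases(records, predicate):
    """Keep only the cases for which the predicate is true."""
    return [record for record in records if predicate(record)]


def subset_variables(records, names):
    """Keep only the named variables in every case."""
    return [{name: record[name] for name in names} for record in records]


def aggregate_cases(records, by, value):
    """Sum one variable within each category of another."""
    totals = defaultdict(int)
    for record in records:
        totals[record[by]] += record[value]
    return dict(totals)


def to_csv(records):
    """Change format: write the cases out as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


alberta = subset_cases(cases, lambda record: record["province"] == "AB")
print(aggregate_cases(cases, by="province", value="hours"))
print(to_csv(subset_variables(alberta, ["province", "hours"])))
```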

  46. The Inventory Model In the traditional inventory model, roughly half of the support goes to putting items on the shelf, while the other half goes to finding and getting the items off the shelf. Source: Darlene Fichter

  47. The Access Model With the access model, support is split between getting information into a deliverable state and finding appropriate ways of retrieving and disseminating the information.

  48. Access Models The access models for data and statistics are not really that different from the models employed with bibliographic and full-text databases. • stand-alone workstation • local area network CD-server • campus network server • Internet server

  49. Examples of Access Models Let’s look at some technology-based examples of access models divided between: • statistics and aggregate data, and • microdata.

  50. Stand-alone Workstation Advantages • install once with usually fewer problems • usually fewer license issues Disadvantages • patron must come to the service • queues may develop to use the workstation
