
Confidentiality Measures for Licensing and Disseminating Restricted-Access Census Microdata Extracts

This presentation describes the confidentiality measures that IPUMS-International applies when licensing and disseminating restricted-access census microdata extracts: legal, administrative, and technical protections. It also stresses that inadequate use of microdata carries high costs, a point the presentation attributes to Len Cook (2003).

Presentation Transcript


  1. IPUMS-Europe: Confidentiality measures for licensing and disseminating restricted-access census microdata extracts. https://www.ipums.org/international • Robert McCaa, Minnesota Population Center (rmccaa@umn.edu) • Albert Esteve, Centre d’Estudis Demogràfics (aepalos@yahoo.es) • “Inadequate use of microdata has high costs” (Len Cook, 2003)

  2. Outline: IPUMS-International Confidentiality Measures • Introduction: What is IPUMSi? (5 slides) • Disseminating anonymized, integrated extracts (3 slides) • IPUMS-International confidentiality protections: • Legal (3 slides) • Administrative (3 slides) • Technical (6 slides) • Conclusions (2 slides)

  3. 1. Introduction: What is IPUMS-International? (7 slides)

  4. IPUMS-International is… a global collaboratory of National Statistical Institutes & Universities to: • 1. Inventory the world’s census microdata • 2. Preserve endangered microdata and documentation • 3. Integrate census microdata • a. use standards of UNSD, Eurostat, ISCO, ISCED, etc. • b. facilitate comparative research in time and space • 4. Anonymize census microdata to preserve statistical confidentiality, using the highest standards • 5. Disseminate restricted-access, custom extracts to approved researchers at no cost

  5. [Map, Mollweide projection] IPUMS-International, October 2005. Dark green = disseminating; green = harmonizing; lightest green = negotiating. 55 countries, 57% of the world's population.

  6. Available now (see Table 1): https://www.ipums.org/international

  7. IPUMS-Europe (see Table 2): by the 1st workshop in July 2005, 9 countries had entrusted 28 datasets to the project (shown in bold); 2nd workshop in 2006; first release in 2007

  8. 2. Disseminating Anonymized, Integrated Extracts (3 slides)

  9. IPUMSi integration principles • 1. Respect absolute anonymity • 2. Preserve all original data, except adjustments to assure confidentiality (top codes, blurring, masking, re-ordering, etc.) • 3. Harmonize codes for countries: occupation (ISCO, HISCO; detailed, general); education (ISCED; detailed, general); family (IPUMS, etc.; detailed, general) • 4. Enhance with constructed variables

  10. 6 steps using https://www.ipums.org/international: • 1. Log on with password • 2a. Study documentation • 2b. Design extract • 3. Receive email; log on with password • 4. Download extract (SSL encrypted) • 5. Unzip data • 6. Analyze (also SAS, STATA)

  11. Data Dissemination: web-based extraction system • Password protected: to make and retrieve extracts • Researcher selects: • Countries, • Censuses, • Cases/sub-populations, • Variables, and • Sample densities • Extract engine queues request, generates extract • Researcher retrieves extract via web with SSL 128-bit encryption • NO: CDs, original codes, or complete datasets
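The selections listed on this slide (countries, censuses, cases, variables, sample density) can be sketched in Python. This is an illustrative sketch only, not the IPUMS extract engine: `ExtractRequest`, `build_extract`, and the record field names are hypothetical, and simple random thinning stands in for whatever sampling the real system performs.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExtractRequest:
    """Hypothetical extract specification; all field names are assumptions."""
    countries: set
    census_years: set
    variables: list
    density: float = 1.0  # fraction of matching records to keep
    case_filter: Callable[[dict], bool] = lambda rec: True  # sub-population rule


def build_extract(records, request, seed=0):
    """Return only the requested variables for the requested cases."""
    rng = random.Random(seed)
    extract = []
    for rec in records:
        if rec["country"] not in request.countries:
            continue
        if rec["year"] not in request.census_years:
            continue
        if not request.case_filter(rec):
            continue
        # Thin the sample: keep each matching record with probability `density`.
        if rng.random() >= request.density:
            continue
        # Release only the variables the researcher asked for.
        extract.append({name: rec[name] for name in request.variables})
    return extract
```

The point of the design is that the researcher never receives a complete dataset: every record passes through the country, census, case, and variable filters before anything is released.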

  12. 3. Confidentiality Protections (15 slides) “There has been no known attempt at identification with the 1991 SARs [microdata samples of the UK] - nor in any other countries that disseminate samples of microdata” --Elliott and Dale, Journal of the Royal Statistical Society, 1999

  13. 3 kinds of confidentiality protections: • Legal: Dissemination agreement between University of Minnesota and each National Statistical Institute • Uniform 11-point Memorandum of Understanding regarding: ownership, use, authorization, restrictions, confidentiality, security, publication, violations, sharing, arbitration, and order of precedence • Administrative: conditional use license between the University of Minnesota and each researcher • Permission to use restricted-access microdata, 3 criteria: research need, research competence, and agreement to abide by the conditions of use license • Technical: data protection measures • Specific to each country

  14. Legal: OSI and U. Minnesota

  15. Legal: OSI and U. Minnesota (2001-4)

  16. Legal: OSI and U. Minnesota (2005+)

  17. 3 kinds of confidentiality protections: • Legal: Dissemination agreement between University of Minnesota and each National Statistical Institute • Uniform 11-point Memorandum of Understanding regarding: ownership, use, authorization, restrictions, confidentiality, security, publication, violations, sharing, arbitration, and order of precedence • Administrative: conditional use license between the University of Minnesota and each researcher • Permission to use restricted-access microdata, 3 criteria: research need, research competence, and agreement to abide by the conditions of use license • Technical: data protection measures • Specific to each country

  18. IPUMSi DISSEMINATES via a restricted-access, web-based system • Legally binding license agreement: • protects privacy and confidentiality • assures proper use • forces snoopers to violate law • Access limited to: • Bona fide researchers (credentials) • with a demonstrated scientific need • who agree to abide by license restrictions: • Confidentiality • No redistribution • Safely secured

  19. User Conditions of Use License

  20. Conditions of Use License (Appendix B)

  21. Conditions of Use License (Appendix B)

  22. 3 kinds of confidentiality protections: • Legal: Dissemination agreement between University of Minnesota and each National Statistical Institute • Uniform 11-point Memorandum of Understanding regarding: ownership, use, authorization, restrictions, confidentiality, security, publication, violations, sharing, arbitration, and order of precedence • Administrative: conditional use license between the University of Minnesota and each researcher • Permission to use restricted-access microdata, 3 criteria: research need, research competence, and agreement to abide by the conditions of use license • Technical: data protection measures • Specific to each country

  23. IPUMSi ANONYMIZES: technical measures are applied in addition to the legal & administrative protections • Suppress geographical detail • Blur/aggregate sensitive codes • Convert dates to ages (blur key vars.) • Swap cases between districts • Scramble records
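Four of the measures listed on this slide (suppressing geographic detail, converting dates to ages, swapping cases between districts, scrambling record order) can be sketched in Python. This is a minimal sketch under stated assumptions, not the IPUMS procedure: the function names, the record fields (`village`, `district`, `birth_date`, `sex`), and the choice to swap only the district code between randomly chosen pairs are all hypothetical.

```python
import random
from datetime import date


def age_on(census_day, birth_date):
    """Completed years of age on census day."""
    before_birthday = (census_day.month, census_day.day) < (birth_date.month, birth_date.day)
    return census_day.year - birth_date.year - (1 if before_birthday else 0)


def anonymize(records, census_day, n_swaps=1, seed=0):
    """Apply four illustrative protections to a list of person records."""
    rng = random.Random(seed)
    # Suppress fine geography (no "village") and replace birth date with age.
    out = [
        {
            "district": rec["district"],
            "sex": rec["sex"],
            "age": age_on(census_day, rec["birth_date"]),
        }
        for rec in records
    ]
    # Swap the district code between randomly chosen pairs of records.
    if len(out) >= 2:
        for _ in range(n_swaps):
            i, j = rng.sample(range(len(out)), 2)
            out[i]["district"], out[j]["district"] = out[j]["district"], out[i]["district"]
    # Scramble record order so file position reveals nothing.
    rng.shuffle(out)
    return out
```

Swapping leaves the district totals unchanged while making any single record's location uncertain, which is why it tends to cost little in aggregate analyses.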

  24. EUROSTAT statistical confidentiality standards (Thorogood, 1999) --all endorsed by IPUMS-International • 1. Restrict access to samples • 2. Limit geographical detail • 3. Re-code unique categories--top and bottom • 4. Sign non-disclosure agreement • 5. Prohibit redistribution to third parties • 6. Prohibit attempts to identify individuals or making any claim to that effect • 7. Require users to provide copies of publications

  25. EUROSTAT statistical confidentiality standards (Thorogood, 1999) --all endorsed by IPUMS-International • 8. Construct age from birthdate, if necessary • 9. Do not identify date of birth • 10. Do not identify precise place of birth • 11. Migration: timing/place not identified in detail • 12. Identify place of residence by major civil division (pop>20k, 60k, 100k, 1 million—i.e., national convention) • 13. Do sensitivity analysis • 14. Do confidentiality assessment
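Two of the standards above lend themselves to a short illustration: top and bottom coding (item 3) and a simple screening check of the kind a confidentiality assessment might include (item 14). The Python below is a hedged sketch; `top_bottom_code` and `count_sample_uniques` are hypothetical helpers, and counting records that are unique on a set of key variables is one common screening measure, not necessarily the one Eurostat or IPUMS-International uses.

```python
from collections import Counter


def top_bottom_code(value, bottom, top):
    """Clamp a numeric value into [bottom, top] so extreme values share a code."""
    return max(bottom, min(value, top))


def count_sample_uniques(records, keys):
    """Count records whose combination of key variables occurs exactly once."""
    combos = Counter(tuple(rec[k] for k in keys) for rec in records)
    return sum(1 for n in combos.values() if n == 1)
```

Comparing the unique count before and after recoding gives a rough sense of how much a given top code reduces the number of easily distinguished records.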

  26. Anonymization example: Kenya, 1989. Anonymization based on Unique Characteristics Threshold (50,000 for geographic variables; 10,000 for other variables). Each row gives: Type | Procedure | Variable Name
      • Sensitive | Aggregated, 10,000/1,000 minimum | Occupation, Employment Status
      • Key | Suppressed | Division, Location, Sublocation, Enumeration area, Tribe/Ethnicity
      • Key | Aggregated, 50,000 minimum | Province, District of Residence, Birth and Past Residence
      • Key | None | Sex, Marital Status, Relationship to Head, etc.
      • Transitory (information is considered too changeable to be used to identify individuals from microdata) | None | Age, Urban/Rural Residence, Literacy, Educational Status, Educational Level, Labor Activity, Children Everborn/Alive/Dead, Last Birth Year, Mortality variables
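The threshold rule in the Kenya example can be sketched as a recode that keeps a code only when the population behind it meets the minimum. This Python sketch is an assumption-laden illustration: `recode_by_threshold` is a hypothetical helper, the district names and populations in the usage note are invented, and pooling small units into a single residual code stands in for the actual aggregation into larger units, which the slide does not describe.

```python
def recode_by_threshold(population_by_code, threshold, residual="OTHER"):
    """Keep a code if its population meets the threshold; otherwise pool it."""
    return {
        code: (code if population >= threshold else residual)
        for code, population in population_by_code.items()
    }


def apply_recode(records, variable, recode):
    """Return copies of the records with one variable recoded."""
    return [{**rec, variable: recode[rec[variable]]} for rec in records]
```

For example, with invented populations `{"North": 120000, "Hill": 49999}` and a threshold of 50,000, `North` is kept and `Hill` is pooled. A real procedure would also need to check that the pooled group itself reaches the threshold.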

  27. IPUMS-International samples anonymized by: Census Agency (36 countries) or IPUMS (19 countries)

  28. Risk assessment of 1991 SARs: the risk is very low… • After taking into account errors in the data, coding variability and changing of personal characteristics in time • Dale and Elliott, JRSS-A (2003): “For a user of an outside database, attempting this sort of match with no opportunity for verification would prove fruitless. In the first place, the small degree of expected overlap would be a considerable deterrent to an intruder. However, if a match between the two files was attempted the large number of apparent matches would be highly confusing as an intruder would have no way of checking correct identification.”

  29. 4. Conclusion

  30. Joint integrated European census microdata projects: • IPUMS-Europe, 2004-2009 • Coordinate: CIECM, 2005 • Disseminate: DIECM, 2006-2008 • Enhance: EIECM, 2007-2009

  31. IPUMS-International strengths • Uniform legal authorization with national statistical authorities • Access restricted to academics with need who agree to abide by stringent confidentiality protections • Experienced integration teams • Proven web-based distribution system • High user satisfaction • Sustainable: NSF, NIH, FP-6 funded (Europe only)

  32. Thank you! https://www.ipums.org/international • Additional information at: www.hist.umn.edu/~rmccaa/ (click: ipums-europe) and www.ced.uab.es (click: IECM) • Contacts: rmccaa@umn.edu, aepalos@yahoo.es
