
Dr. Ann Cavoukian



Presentation Transcript


  1. 29e Conférence internationale des commissaires à la protection de la vie privée (29th International Conference of Data Protection and Privacy Commissioners)

  2. Dr. Ann Cavoukian, Information and Privacy Commissioner of Ontario

  3. De-identification Risk and Resolution. Bradley Malin, Ph.D., Assistant Professor, Vanderbilt University

  4. De-identified is not Anonymous (Sweeney 1998, 2000)
     [Diagram: two overlapping data sets]
     • Hospital Discharge Data: Ethnicity, Visit date, Diagnosis, Procedure, Medication, Total charge
     • Voter List: Name, Address, Date registered, Party affiliation, Date last voted
     • Fields in both: Zip, Birthdate, Sex
     87% of the United States is RE-IDENTIFIABLE
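
The linkage on this slide can be illustrated with a minimal sketch. The records below are invented toy data and the function name `link_records` is hypothetical; only the field names (Zip, Birthdate, Sex and the two data sets) come from the slide. A de-identified record counts as re-identified when exactly one voter shares its three fields.

```python
# Toy data: field names follow the slide, all values are invented.
hospital_discharge = [
    {"zip": "37203", "birthdate": "1965-02-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "37203", "birthdate": "1971-09-30", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "37212", "birthdate": "1980-06-01", "sex": "F", "diagnosis": "fracture"},
]
voter_list = [
    {"name": "Alice Example", "zip": "37203", "birthdate": "1965-02-14", "sex": "F"},
    {"name": "Bob Example", "zip": "37203", "birthdate": "1971-09-30", "sex": "M"},
    {"name": "Carol Example", "zip": "37212", "birthdate": "1980-06-01", "sex": "F"},
    {"name": "Dana Example", "zip": "37212", "birthdate": "1980-06-01", "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birthdate", "sex")


def link_records(deidentified, identified, keys=QUASI_IDENTIFIERS):
    """Return (record, name) pairs where exactly one identified record shares the keys."""
    index = {}
    for person in identified:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for record in deidentified:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match is treated as a re-identification
            matches.append((record, names[0]))
    return matches


reidentified = link_records(hospital_discharge, voter_list)
for record, name in reidentified:
    print(name, "->", record["diagnosis"])
```

In this toy data the third hospital record matches two voters, so it is not counted; that ambiguity is what the formal protection models later in the deck aim to guarantee.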

  5. DNA Re-identification
     • Many deployed genomic privacy technologies leave DNA susceptible to re-identification (Malin 2005)
     • DNA is re-identified by automated methods, such as:
        • Genotype – Phenotype Inference (Malin & Sweeney, 2000, 2002)

  6. Genealogy Re-identification (Malin 2006)
     • IdentiFamily:
        • Software that links de-identified pedigrees to named individuals
        • Uses publicly available information, such as obituaries, death records, and the Social Security Death Index database, to build genealogies

  7. Genealogy Re-identification (Malin 2006)

  8. System Susceptibility (Malin, JAMIA 2005)
     [Figure legend: Susceptible / Not Susceptible]

  9. Altering Data Does Not Guarantee Protection
     • Science Magazine (Lin et al, 2004)
        • < 100 “SNPs” make DNA unique
     • Proposed protection: perturb DNA
        • i.e., change A with T, etc.
        • aaaact → atacct
        • Increase perturbation, decrease internal correlations (see graph)
     • Conclusions
        • Too much perturbation needed to prevent linkage
        • Keep records under lock and key
     DISCLAIMER: Uniqueness Does Not Guarantee Privacy Will Be Compromised
     [Graph axes: Utility (Correlations) vs. Privacy (Perturbation)]
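
As a rough illustration of what "perturb DNA" means here, the sketch below randomly substitutes bases at a chosen rate. It is not the analysis from Lin et al.; the function names, the rate parameter, and the test sequence are assumptions made for the example.

```python
import random

BASES = "atgc"


def perturb(sequence, rate, rng):
    """Replace each base with a different random base with probability `rate`."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def fraction_changed(original, perturbed):
    """Share of positions where the two equal-length sequences differ."""
    return sum(a != b for a, b in zip(original, perturbed)) / len(original)


rng = random.Random(0)       # fixed seed so the sketch is repeatable
original = "aaaact" * 50     # invented sequence built from the slide's example
noisy = perturb(original, rate=0.2, rng=rng)
print(fraction_changed(original, noisy))
```

Raising `rate` changes more positions, which is the trade-off the slide's graph describes: more perturbation buys more privacy at the cost of the correlations analysts rely on.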

  10. Formal Re-identification Model
      [Diagram labels: De-identified Biobank Data; Identified Data; Already Public; Necessary Condition: UNIQUENESS; Necessary Condition: LINKAGE MODEL; 1. Make Data Non-unique; 2. Certify No Linkage Route]

  11. Formal Protection
      • k-Map (Sweeney, 2002)
         • Each shared record refers to at least k entities in the population
      • k-Anonymity (Sweeney, 2002)
         • Each shared record is equivalent to at least k-1 other records
      • k-Unlinkability (Malin 2006)
         • Each shared record links to at least k identities via its trail
         • Satisfies k-Map protection model
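
The first two definitions can be checked mechanically. The sketch below is a minimal reading of them, assuming records are dictionaries and that the shared and population records use the same (already generalized) values; the function names and toy data are hypothetical. k-Unlinkability is not covered because the slide does not define trails in enough detail.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(shared, quasi_identifiers, k):
    """Each shared record is equivalent to at least k-1 other shared records."""
    sizes = equivalence_class_sizes(shared, quasi_identifiers)
    return all(size >= k for size in sizes.values())


def is_k_map(shared, population, quasi_identifiers, k):
    """Each shared record refers to at least k entities in the population."""
    sizes = equivalence_class_sizes(population, quasi_identifiers)
    return all(sizes[tuple(r[q] for q in quasi_identifiers)] >= k for r in shared)


QI = ("zip", "sex")
# Invented toy data, already generalized to a zip prefix.
shared = [
    {"zip": "372**", "sex": "F"},
    {"zip": "372**", "sex": "F"},
    {"zip": "372**", "sex": "M"},
]
population = shared + [{"zip": "372**", "sex": "F"}, {"zip": "372**", "sex": "M"}]

print(is_k_anonymous(shared, QI, 2), is_k_map(shared, population, QI, 2))
```

The toy data shows why the two models differ: the single male record is unique within the shared data, yet it still corresponds to two people in the population.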

  12. Beyond Ad hoc Protections
      • Perturbation does not guarantee privacy
      • Alternative: Generalization of data
      (Lin et al 2004) (Malin 2005)
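
One way to picture generalization, as opposed to perturbation, is to replace positions where two sequences disagree with a code that covers both bases, so that both records share one truthful but less specific value. The sketch below uses the standard IUPAC two-base ambiguity codes for this; it is an illustration only, not the method of the cited papers, and `generalize_pair` is a hypothetical name.

```python
# IUPAC nucleotide codes for single bases and two-base ambiguities.
IUPAC = {
    frozenset("a"): "a", frozenset("c"): "c",
    frozenset("g"): "g", frozenset("t"): "t",
    frozenset("ag"): "r", frozenset("ct"): "y", frozenset("cg"): "s",
    frozenset("at"): "w", frozenset("gt"): "k", frozenset("ac"): "m",
}


def generalize_pair(seq_a, seq_b):
    """Return one sequence that covers both inputs position by position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return "".join(IUPAC[frozenset((x, y))] for x, y in zip(seq_a, seq_b))


# The two example strings from the perturbation slide.
print(generalize_pair("aaaact", "atacct"))
```

Unlike random substitution, nothing in the generalized value is false; the cost is lost precision at the positions that differ.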

  13. Learning Who You Are From Where You Have Been (“Trails”) (Malin & Sweeney, 2001, 2004; Malin & Airoldi 2006)

  14. Preventing Trails: Cystic Fibrosis Population (1149 samples)
      [Chart labels: BEFORE STRANON; AFTER STRANON; 100%; 0%; Samples In Repository; Samples k-Re-identified]

  15. Benefit: Quantified Risk
      • Change in re-identification risk
      • Shift burden of increased risk to requesting analyst
      • Ties together legal and computational models
      [Graph labels: Initial Setting; Forced Setting; Requested Quantity]

  16. Measuring and Managing Re-identification Risk, by Khaled El Emam, University of Ottawa

  17. Managing Re-id Risk - I
      • Before data is collected:
         • Scenarios
            • When preparing a protocol
            • For review by ethics boards
            • When formulating new policies and procedures
            • When writing data sharing agreements
         • Tools
            • Heuristics
            • Simulations

  18. Managing Re-id Risk - II
      • After data is collected:
         • Scenarios
            • Providing data to administrators, researchers or government departments
            • Responding to an access to information request
         • Tools
            • Masking
            • Risk-based anonymization

  19. Heuristics, Masking, Anon
      • The 20k rule, 70k rule, 100k rule …
      • Decision tools from matching experiments
      • Around 18 tools for masking on the market
      • Deciding on a risk threshold for anonymization

  20. Acceptable Re-id Risk
      • What databases does an attacker have access to for record linkage?
      • What does an attacker know beforehand?
      • What is the verification cost?
      • How do we account for privacy tradeoffs by the public?
      • What is the impact of the consent model?

  21. Databases
      • Public information and registries
      • Commercial but generally available databases
      • Confidential and proprietary databases

  22. Verification Cost
      • At some point the verification cost becomes too high compared to the benefit for the attacker
      • The proportion of data that is population unique is important
      • The extent of overall matching success is also important
      • You can control both through anonymization
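
The "proportion of data that is population unique" can be measured directly, and the effect of anonymization on it observed. The sketch below is an assumption-laden toy: the records, the 10-year age bands, and the function names are invented for illustration and are not taken from the talk.

```python
from collections import Counter


def proportion_unique(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


def band_age(record, width=10):
    """Anonymize by replacing an exact age with a band such as '30-39'."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Invented toy population.
population = [
    {"age": 31, "sex": "F"}, {"age": 34, "sex": "F"}, {"age": 38, "sex": "F"},
    {"age": 52, "sex": "M"}, {"age": 57, "sex": "M"}, {"age": 45, "sex": "M"},
]
QI = ("age", "sex")
before = proportion_unique(population, QI)
after = proportion_unique([band_age(r) for r in population], QI)
print(before, after)
```

Fewer unique records means an attacker must verify more candidate matches per target, which is how anonymization raises the verification cost.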

  23. Tradeoffs
      • The public is willing to trade their privacy for personal benefits/gains
      • What they tell us is not necessarily how they will behave
      • To what extent is the public willing to trade their privacy for societal gain?

  24. Consent Models
      • Is the impact on recruitment rates and bias a function of the consent model or how it is implemented?
      • There are many factors that influence consent – were all of these controlled for when comparing consent models?

  25. Workshop 4. Protecting Privacy Through De-Identification: Reality or Fallacy. Part 1: Discussion

  26. Dr. Debra Grant, Senior Health Privacy Specialist, Information and Privacy Commission of Ontario

  27. De-identification challenges raised by genetic and genomic data. William W. Lowrance, PhD (lowrance@iprolink.ch), September 26, 2007

  28. The physical basis of the challenges
      The human genome:
      − is extensive and very fine-grained
      − influences many personal attributes
      − is intrinsic to the body
      − doesn't change during the lifetime
      − is unique to the individual.
      The full genome is carried by the DNA in every cell of the body (except red blood cells).

  29. What genomic data look like
      ...tttccgtatgcgtagccagacttaccctcctagtag...
      − through 3,000,000,000 "data-cells," each carrying a/t/g/c.
      Altering or inserting just a few a/t/g/c can make a big difference, whether the genome is being considered:
      − as a dynamic program-tape, or
      − as an intrinsic "barcode."

  30. What genetic data look like
      • at sequence scale: │ctag...ctccca│
      • at gene scale: "Diabetes-factor gene SLC30A8"
      • at body scale: "red hair," "heritable renal dysplasia"
      • at family scale: pedigree, family health history, other indicators.

  31. The most useful construal of identifiability for genomic data, in my view
      "Identifiability" is the potential associability of data with persons.

  32. Paths through which genomic data can become identified
      (a) matching genotype to identifiable reference genotype data (such as police, military, or blood-relatives')
      (b) linking genomic + associated data (health, social, etc) with other data
      (c) profiling, i.e. probabilistically describing likely appearance, health factors, or other traits.

  33. Tactics for de-identifying genomic data
      (a) limiting the proportion of genome released
      (b) statistically degrading the data before releasing
      (c) irreversibly de-identifying
      (d) separating the identifiers and key-coding.

  34. Tactic (a): limiting the proportion of genome released
      • is done, and can protect
      • but often limits usefulness, because often it isn't known in advance which portions of genome are relevant
      • difficult to judge how much is "not too much" to release.

  35. Tactic (b): statistically degrading the data before releasing
      • can be done, such as by randomly substituting some a/t/g/c
      • almost always degrades usefulness, because most analyses depend on precise fine details.

  36. Tactic (c): irreversibly de-identifying
      • is occasionally done, such as when the purpose is to survey the background occurrence of some phenomenon, or to provide data for educational use.

  37. Tactic (d): separating the identifiers and key-coding
      • works well − if performed carefully, the key is properly safeguarded, and use of the key to reconnect is strictly controlled
      • is increasingly being used in activities such as health research.
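
Key-coding can be sketched in a few lines: identifiers are split off into a separate key table, and the research records carry only a random code. This is a minimal sketch under stated assumptions; the field names, record contents, and `key_code` function are hypothetical, and real deployments also need the safeguards and access controls the slide mentions.

```python
import secrets


def key_code(records, identifier_fields):
    """Split records into coded research data and a separately held key table."""
    coded, key_table = [], {}
    for record in records:
        code = secrets.token_hex(8)          # random code, not derived from the data
        while code in key_table:             # regenerate on the unlikely collision
            code = secrets.token_hex(8)
        key_table[code] = {f: record[f] for f in identifier_fields}
        research = {f: v for f, v in record.items() if f not in identifier_fields}
        coded.append({"code": code, **research})
    return coded, key_table


# Invented toy records.
patients = [
    {"name": "Alice Example", "health_number": "0001", "genotype": "ct"},
    {"name": "Bob Example", "health_number": "0002", "genotype": "tt"},
]
coded, key_table = key_code(patients, identifier_fields=("name", "health_number"))
print(coded[0].keys())
```

The code is drawn at random rather than hashed from the identifiers, so reconnecting a record to a person requires access to the key table, which can be held by a separate custodian.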

  38. To de-identify, or not?
      Whether and in what ways to de-identify genomic data depends on the:
      − character of the data
      − consent
      − intended uses
      − potential for linking to reference genotype or other data
      − protections.

  39. Alternatives and complements to de-identification
      • Provide access via controlled release (governed by contract, overseen by a stewardship committee, etc)
      • Sanction against misuse of the data (such as improper re-identifying) or abuse using the data (such as negative discrimination).

  40. Closing sermon
      De-identification is a crucial, practical protection − for both genomic and other kinds of data − and its use must be strongly encouraged!
      General ref: Lowrance and Collins, "Identifiability in genomic research," Science 317, 600−602 (August 3, 2007).

  41. Consent and Access to Personal Information for Health Research – public perspective
      Don Willison, Sc.D., Centre for Evaluation of Medicines, St. Joseph’s Healthcare; Dept of Clinical Epidemiology & Biostatistics, McMaster University; willison@mcmaster.ca

  42. Research team:
      • McMaster University
         • Don Willison (P.I. – privacy, policy, research methods)
         • Lisa Schwartz (philosophy, bioethics)
         • Julia Abelson (public engagement)
         • Cathy Charles (public engagement, qualitative methods)
         • Lehana Thabane (statistician, quantitative methods)
         • Marilyn Swinton (research coordinator, qualitative methods)
      • York University
         • David Northrup (survey methods)
      • Canadian Policy Research Networks
         • Mary Pat MacKinnon, Judy Watling (dialogue)
      • Funding: Canadian Institutes of Health Research
      • Publication: JAMIA – November 2007

  43. Context: Expanding Use of personal information for health research
      • Increase in scope and complexity of data use
      • Data linkage
         • administrative and clinical data
         • survey and genetic information
      • Single time-limited studies → registries and biobanks
      • EHR: expanded access to health information for:
         • population / public health research
         • pragmatic trials
      • Researchers need individual-level data
      • Challenge: masking of identity
      • Debate: treat data as identifiable?

  44. Issues Around Consent
      • Patient/public perspective:
         • How to obtain meaningful and valid consent?
      • Researcher’s perspective:
         • practicability of obtaining consent
         • potential selection biases in a consent-based system
      • If consent is waived, limitations:
         • Cannot contact patient / Who may screen charts?
      • General:
         • Must we be limited to the binary option of consent / no consent?

  45. Our survey:
      • Cross-Canada telephone survey, random-digit dialled
      • March-April 2005
      • n=1230 (58% response rate)
      • Structure:
         • General questions
            • Demographics, altruism
            • Placing health and privacy in context of other priorities
         • Questions in abstract
            • attitudes re: privacy and research
            • trust in institutions
            • use of medical records for different types of research
         • Specific scenarios. Role of consent in:
            • medical record research
            • electronic health record
            • record linkage

  46. WHAT DID WE FIND? Attitudes to privacy
      • High support for privacy in principle:
         • 97% felt protection of the privacy of their personal information was important
            • 74% very important / 23% somewhat important
         • 91% agreed that more effort needs to be made to protect our privacy
            • 59% strongly agreed / 32% somewhat agreed
         • 92% agreed that everyone benefits if the privacy of individuals is respected
            • 66% strongly agreed / 26% somewhat agreed

  47. Privacy vs. Research

  48.

  49. Research Scenarios
      • 4 scenarios:
         • Abstraction of information from health record for research
         • Use of electronic health information for research
         • Linkage of education with EHR
         • Linkage of income with electronic health record
      • Data have direct identifiers removed
         • Makes it difficult but not impossible to re-identify

  50. Opinion regarding consent and alternatives across scenarios
