
“Mortgages, Privacy, and Deidentified Data”

“Mortgages, Privacy, and Deidentified Data”. Professor Peter Swire, Ohio State University, Center for American Progress. Consumer Financial Protection Bureau Conference on “New Research on Sustainable Mortgages & Access to Credit”, October 6, 2011.


Presentation Transcript


  1. “Mortgages, Privacy, and Deidentified Data”
     Professor Peter Swire
     Ohio State University
     Center for American Progress
     Consumer Financial Protection Bureau Conference on “New Research on Sustainable Mortgages & Access to Credit”
     October 6, 2011

  2. Overview
     • Federal experience to date with deidentification (“DeID”)
     • Why DeID is technically harder over time
     • Technical & administrative measures to protect identity
     • Court records: public records and privacy
     • Conclusion: Technology alone often cannot succeed, so the choice becomes make public, keep private, or create effective data use agreements

  3. Federal DeID to Date
     • 2000 HIPAA rule
       • Recognized that reidentification (“ReID”) is possible
       • Can scrub 18 data fields, or an expert determines that the risk of ReID is “very small”
       • Current HHS study in progress on DeID – similar issues to financial data
     • Data.gov
       • Administration push for transparency
       • Privacy & DeID more challenging than many had hoped
     • Census data
       • History of census data sensitivity, required data collection
       • Suppress small cell sizes; technical limits on researchers’ access
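The census-style practice of suppressing small cells can be sketched roughly as follows. This is only an illustration: the function name `suppress_small_cells`, the threshold of 5, and the sample loan records are all assumptions made for the example, not anything stated in the slides or taken from actual Census Bureau procedure.

```python
from collections import Counter

def suppress_small_cells(records, keys, min_cell_size=5):
    """Tabulate records by `keys` and mask any cell smaller than min_cell_size.

    The threshold of 5 is a hypothetical choice for illustration only.
    Suppressed cells are reported as None instead of their true count.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {
        cell: (n if n >= min_cell_size else None)
        for cell, n in counts.items()
    }

# Hypothetical loan-level records (invented for this sketch).
loans = (
    [{"county": "A", "loan_type": "fixed"}] * 12
    + [{"county": "A", "loan_type": "arm"}] * 7
    + [{"county": "B", "loan_type": "fixed"}] * 2
)

table = suppress_small_cells(loans, ["county", "loan_type"], min_cell_size=5)
for cell, n in sorted(table.items()):
    print(cell, "suppressed" if n is None else n)
```

The two loans in county B fall below the threshold, so that cell is withheld rather than published, which is the basic trade: less detail in exchange for fewer near-unique rows.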

  4. Why DeID is Harder over Time
     • Two tech trends
       • Search vastly improved: Google incorporated in 1998
       • Increase in (almost) unique publicly available facts
     • Mortgages
       • Street View of each house: pictures
       • Public records and likely market values & date of sale of each house
       • Social networks, blogs, marketing information available for purchase:
         • “We got our new house today, and Bank X did a great/lousy job”
     • How hard is it for forensic, automated efforts to reID?
       • Sweeney’s “k-anonymity”: can shrink a “deID mortgage” to one or a few properties
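The k-anonymity idea mentioned on this slide can be illustrated with a short sketch: a data set is k-anonymous with respect to a set of quasi-identifiers if every combination of their values is shared by at least k records. The field names (`zip`, `sale_month`, `price_band`) and the records below are hypothetical, chosen only to resemble the kind of facts the slide says are publicly available.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that share
    the same values on all of the given quasi-identifier fields."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

# Hypothetical "deidentified" mortgage records (no names or addresses).
mortgages = [
    {"zip": "43210", "sale_month": "2011-03", "price_band": "200-250k"},
    {"zip": "43210", "sale_month": "2011-03", "price_band": "200-250k"},
    {"zip": "43210", "sale_month": "2011-04", "price_band": "300-350k"},
    {"zip": "43215", "sale_month": "2011-04", "price_band": "300-350k"},
]

k_fine = k_anonymity(mortgages, ["zip", "sale_month", "price_band"])
k_coarse = k_anonymity(mortgages, ["price_band"])
print(k_fine, k_coarse)
```

With all three fields, at least one record is unique (k = 1), so anyone who can match it against public sale records narrows it to a single property. Releasing only the price band raises k to 2 in this toy data, at the cost of analytic detail.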

  5. Technical Measures
     • Technical measures to DeID may:
       • Be subject to ReID (previous slide);
       • Introduce noise to data; or
       • Both
     • Add noise (or subtract signal)
       • Census approach
         • Public data set, suppress small cell size, lots of noise; or
         • Researchers can run regressions using somewhat better data
       • Cynthia Dwork’s “differential privacy” (Microsoft Research)
         • Limits queries into database based on tolerance for ReID
       • Agrawal and other IBM research
         • “Hippocratic Database” adds noise with goal of allowing analysis but minimizing risk of linkage
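One common way to realize the "add noise, limit queries" idea behind differential privacy is the Laplace mechanism combined with a privacy budget. The sketch below is a simplified illustration under stated assumptions, not a description of Dwork's full framework or of any production system: the class name `PrivateCounter`, the budget values, and the sample records are all hypothetical.

```python
import random

class PrivateCounter:
    """Answers counting queries with Laplace noise and refuses further
    queries once a total privacy budget (epsilon) has been spent."""

    def __init__(self, records, total_epsilon, seed=None):
        self._records = list(records)
        self._remaining = total_epsilon
        self._rng = random.Random(seed)

    def noisy_count(self, predicate, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise RuntimeError("privacy budget exhausted")
        self._remaining -= epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        # A count changes by at most 1 when one record is added or removed,
        # so the Laplace noise scale is 1 / epsilon.
        scale = 1.0 / epsilon
        # The difference of two exponential draws is Laplace-distributed.
        noise = (self._rng.expovariate(1.0 / scale)
                 - self._rng.expovariate(1.0 / scale))
        return true_count + noise

# Hypothetical records: 1 marks a delinquent loan, 0 a current one.
db = PrivateCounter([1, 0, 1, 1, 0, 0, 1], total_epsilon=1.0, seed=0)
first = db.noisy_count(lambda r: r == 1, epsilon=0.5)
second = db.noisy_count(lambda r: r == 0, epsilon=0.5)
try:
    db.noisy_count(lambda r: r == 1, epsilon=0.5)
    refused = False
except RuntimeError:
    refused = True
print(first, second, refused)
```

A smaller epsilon per query means more noise but allows more queries within the same budget, which is the tolerance-for-ReID trade-off the slide describes.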

  6. Administrative Measures
     • HIPAA data use agreements
       • Agreements apply to a “limited data set”, with obvious identifiers (name, address) stripped out
     • Data use agreement
       • Contractual guarantees to use data only for limited purposes, such as research
       • Promise to use appropriate safeguards on data
       • Promise not to reID the data
     • 2009 CDT conference report on DeID and health data emphasized importance of administrative safeguards

  7. Public Records & Privacy
     • Court records have been the subject of intense study on tradeoffs of public records and privacy
       • Strong reasons for public access
       • Privacy: juvenile court, financial account info, etc.
     • Annual Williamsburg conference, each November
     • Many state task forces on subject

  8. Conclusion
     • Some records are or should be public
     • Some records are or should be private
     • Ability to ReID is large and growing
     • Technical measures to mask exist but are limited in applicability
     • Administrative measures often essential for researchers to get meaningful results
     • Technology alone often cannot succeed, so the choice becomes make public, keep private, or create effective data use agreements
