
“Inside the MATRIX: Fair Information Practices in a World of Data Mining”

This talk discusses the hard privacy issues surrounding the MATRIX system, a data mining and link analysis system used for information sharing. It explores the effectiveness and lawfulness of MATRIX and raises questions about data quality, sensitive data, and the use of private sector data. The talk emphasizes the importance of addressing fair information practices in order to alleviate privacy concerns.


Presentation Transcript


  1. “Inside the MATRIX: Fair Information Practices in a World of Data Mining” Professor Peter Swire Ohio State University DePaul Symposium on Privacy and Identity October 15, 2004

  2. The Challenge Federal official, involved in funding information sharing systems, recently asked me: “What can we do to address the concerns of privacy proponents so that they will stop complaining about MATRIX and other needed systems?” • Today’s talk in that national security context. • This was a good-faith question from an honorable person. • He was sobered by my answer.

  3. Overview • Pattern analysis and link analysis • Current MATRIX as link analysis system • Open questions on effectiveness of MATRIX and overall lawfulness • This talk: the hard privacy issues that exist even if we assume MATRIX is effective and lawful

  4. Pattern & Link Analysis • Pattern analysis as “data mining” • Seek statistical correlations, then act • DeRosa/CSIS describes pattern analysis issues • Policy approaches: • Original MATRIX system: use data mining • Dempsey & Rosenzweig as D.C. policy compromise • ACLU and others oppose it entirely

  5. Pattern & Link Analysis • Link analysis: learn more about one suspect • More traditional police work • Warrants, subpoenas, and public records, depending on type of information • Current MATRIX system and focus of this talk
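As a rough illustration of the distinction drawn in slides 4 and 5, the sketch below treats link analysis as a bounded search outward from one named suspect over a graph of recorded associations. Everything in it is hypothetical: the `links` data, the `link_analysis` function, and the hop limit are assumptions for illustration and do not describe how MATRIX is actually implemented.

```python
from collections import deque

# Hypothetical association records (shared address, phone, vehicle, etc.).
# These names and links are invented for illustration only.
links = {
    "suspect": ["roommate", "employer"],
    "roommate": ["suspect", "cousin"],
    "employer": ["suspect"],
    "cousin": ["roommate", "stranger"],
    "stranger": ["cousin"],
}

def link_analysis(graph, start, max_hops):
    """Breadth-first search from one starting person, up to max_hops links away."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for associate in graph.get(person, []):
            if associate not in hops:
                hops[associate] = hops[person] + 1
                queue.append(associate)
    return hops

result = link_analysis(links, "suspect", max_hops=2)
print(result)
```

The search begins with an individual who is already under suspicion, which is what separates it from pattern analysis, where a whole population is scored first. Even with a small hop limit, associates of associates enter the result.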

  6. MATRIX • Multi-State Anti-Terrorism Information Exchange (MATRIX) • $12 million from DHS & DOJ • Project security and access in Florida • First proposed after 9/11 • At the peak, 12 states had agreed to participate • Currently FL, CT, MI, OH, PA are in program • States that have left or decided not to join after actively considering it: AL, CA, CO, GA, LA, KY, OR, SC, TX, UT, WV • Privacy and cost cited as reasons not to do it

  7. The Current MATRIX “Information accessible includes criminal history records, driver’s license data, vehicle registration records, and incarceration/corrections records, including digitized photographs, with significant amounts of public records data. This capability will save countless investigative hours and drastically improve the opportunity to successfully resolve investigations. The ultimate goal is to expand this capability to all states.” Official site: www.matrix-at.org

  8. 2 Early Objections • System was created and pushed by admitted drug smuggler, Hank Asher of Seisint • This is not relevant to how we should view the current system • It made it harder to say “Trust Us” on MATRIX • After 9/11, 120,000 names sent to law enforcement for “high terrorism factor” • This is data mining, without individualized suspicion, with no transparency or known checks against abuse • Today, “MATRIX is not a data mining application.”

  9. Jan. 2003 Seisint Documents • HTF (“high terrorism factor”) based on factors including: • Age, gender & ethnicity • “What they did with their driver’s licenses” • Pilots or associations to pilots • Proximity to “dirty addresses/phone numbers” • Investigational data • SSN anomalies • Credit histories

  10. Seisint Documents • “The associative links, historical residential information, and other information, such as an individual’s possible relatives and associates, are deeper and more comprehensive than other commercially available database systems presently on the market.”

  11. Answering the Federal Official • Privacy experts (not necessarily “advocates”) will have a list of questions: • About current configuration of system and its compliance with fair information practices • About system as designed (it had original, broader functions) • How system could easily evolve over time (mission creep)

  12. [Diagram: data flows into MATRIX from Florida and other states, more states supplying data, “public” records, and “private” records (?); data flows out of MATRIX to police and other state subscribers, intel (?), and feds (?)]

  13. The Inputs [Same diagram: inputs are Florida and other states, more states supplying data, “public” records, and “private” records (?); outputs are police and other state subscribers, intel (?), and feds (?)]

  14. Questions on Inputs: Data Quality • 2003 FBI announcement that NCIC data could no longer be subject to “accuracy” requirements of the Privacy Act • Are state criminal, prison, and similar records more accurate? • If records are fixed in one place, is that correction spread to all the other databases? [Diagram labels: Florida, Other States; More States Supply Data; “Public” Records; “Private” Records (?)]

  15. Questions on Inputs: Sensitive data • Sources of identity theft -- SSNs are listed in many public records; bank account records in bankruptcy “public” records • Known privacy concerns of American people on medical, financial, children’s, & other “sensitive” records [Diagram labels: Florida, Other States; More States Supply Data; “Public” Records; “Private” Records (?)]

  16. Questions on Inputs: Private sector data • Was there notice & consent for these uses? For medical, credit history, and other sensitive data? • Are these “secondary” uses appropriate? • Federal data under the Privacy Act, with public oversight. What similar checks and balances for how private data is gathered and used? [Diagram labels: Florida, Other States; More States Supply Data; “Public” Records; “Private” Records (?)]

  17. Questions on Outputs: • For secret/confidential data, assume good security in data center. • How many people have access to the outputs of MATRIX? • 800,000 uniformed police, for traffic stops, etc. • Non-uniformed? Firefighters? Others? [Diagram labels: Police & Other State Subscribers; Intel (?); Feds (?)]

  18. Questions on Outputs: • How to secure outputs to 1 million people? • Assume few/no secrets for what the million can see about the system – Swire paper on security/obscurity • Training • Audit trails • Anti-browsing laws & enforcement • But, what can terrorist or organized crime group learn by bribing one out of the million? [Diagram labels: Police & Other State Subscribers; Intel (?); Feds (?)]

  19. Questions on the Data Center/System: • A principle: the more important the decisions made, the more important it is to have due process and fair information practices. E.g., denied for mortgage or job, so have FCRA. • Decisions here might include: • Arrest the person (my student Greg Smith) • Deny ability to travel, enter secured spaces • Deny job, on a background check • Suspicion on a person’s “associates”? • Other uses over time?

  20. Questions on the Data Center/System: • Access and correction as key fair information practices. • Currently no access by individual to data held in MATRIX. Instead, individual told to go to every data source and get access there. • Problems include: • Burdensome to go to numerous sources • Data sources not all publicly listed. • Even if correct mistake once, it often reappears
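One way a corrected mistake can reappear, sketched under stated assumptions: if a central view is rebuilt from several sources and a flag is set whenever any source asserts it, a fix made at only one source does not clear the central record. The `source_a`, `source_b`, and `rebuild` names and the merge rule are hypothetical, and nothing here describes MATRIX's actual data flows.

```python
# Two hypothetical sources that both hold the same erroneous record.
source_a = {"jane_doe": {"felony_conviction": True}}
source_b = {"jane_doe": {"felony_conviction": True}}  # e.g. a copy made earlier

def rebuild(sources):
    """Rebuild the central view; a flag is set if ANY source still asserts it."""
    central = {}
    for src in sources:
        for person, record in src.items():
            merged = central.setdefault(person, {"felony_conviction": False})
            merged["felony_conviction"] = (
                merged["felony_conviction"] or record["felony_conviction"]
            )
    return central

# The individual gets the mistake corrected at source A only.
source_a["jane_doe"]["felony_conviction"] = False

central = rebuild([source_a, source_b])
still_flagged = central["jane_doe"]["felony_conviction"]
print(still_flagged)

# Only after every source is corrected does the central view clear.
source_b["jane_doe"]["felony_conviction"] = False
cleared = not rebuild([source_a, source_b])["jane_doe"]["felony_conviction"]
print(cleared)
```

Under this assumed merge rule, the individual must find and correct every source, including ones that are not publicly listed, before the central record changes.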

  21. Questions on the Data Center/System: • Transparency & Governance • No privacy policy posted until recently • No individual identified as CPO • Perhaps have outside experts or advisory board? • Most generally, how to provide public oversight, accountability, assurance?

  22. The Sobering List of Privacy Issues for the Federal Official • Inputs: data quality • Inputs: sensitive data • Inputs: private-sector data • Outputs: secrets when thousands or a million receive data • Outputs: anti-browsing and good security at the edges • Important decisions by government require due process • Access and correction (when secrecy unlikely to work) • Transparency and governance, to reduce mistakes and improve public acceptance

  23. Is It Worth Answering Those Questions? • To the Homeland Security official: • If the privacy homework assignment seems too burdensome, then temptation is to minimize or ignore privacy issues • But the privacy homework is good policy and good government • Markle report and the need to do the privacy homework or else watch public opposition undermine the potential benefits of a system • Transparent, good governance as the touchstone

  24. Conclusion • The official who questioned me was surprised and sobered by the number of significant and difficult privacy issues in MATRIX • Should be sobering to all of us how little the funders of MATRIX had worked through these issues • This conference, and ongoing vigilance, are needed on these issues

  25. Contact Information • Professor Peter Swire • Moritz College of Law of the Ohio State University • Phone: (240) 994-4142 • Email: peter@peterswire.net • Web: www.peterswire.net
