
Lessons learned in the interdisciplinary project SPION: Four years a SPY

This presentation explores the role of data mining in privacy, beyond just hiding information. It discusses the impact of data mining on our perception of reality and the need for privacy feedback and awareness tools. The presentation also examines case studies, such as the FreeBu tool for audience management on Facebook, and the relationship between privacy and discrimination/fairness.


Presentation Transcript


  1. Four years a SPY - Lessons learned in the interdisciplinary project SPION (Security and Privacy in Online Social Networks) Bettina Berendt Department of Computer Science, KU Leuven, Belgium www.berendt.de, www.spion.me

  2. Thanks to (in more or less chronological order) • Sarah Spiekermann • Seda Gürses • Sören Preibusch • Bo Gao • Ralf De Wolf • Brendan Van Alsenoy • Rula Sayaf • Thomas Peetz • Ellen Vanderhoven • my other SPION colleagues • and many other co-authors and collaborators! [All references for these slides are at the end of the slide set.]

  3. Overview: 1. What can data mining do for privacy? 2. Beyond privacy: discrimination/fairness, democracy. 3. Towards sustainable solutions

  4. 1. What can data mining do for privacy?

  5. The Siren (AD 2000)

  6.  1. DM can detect privacy phenomena

  7. 2. DM can cause privacy violations

  8. 3. DM can be modified to avoid privacy violations

  9. 3. DM can be modified to avoid privacy violations Is that sufficient?

  10. ... because: What is privacy? • Privacy is not only hiding information: • “dynamic boundary regulation processes […] a selective control of access to the self or to one's group” (Altman/Petronio) • Different research traditions are relevant to CS

  11. AND: Privacy vis-à-vis whom? Social privacy, institutional privacy, freedom from surveillance

  12. ... because: What is privacy? ... and what is data mining?

  13. Goal (AD ~2008): From the simple view ... towards a more comprehensive view

  14. 4. DM can affect our perception of reality

  15. 4. DM can affect our perception of reality – also enhancing awareness & reflection?! Privacy feedback and awareness tools

  16. Encrypted content, unobservable communication; selectivity by access control; offline communities: social identities, social requirements; identification of information flows; legal aspects; profiling; feedback & awareness tools; educational materials and communication design; cognitive biases and nudging interventions

  17. Complementary technical approaches in SPION: “Only these friends should see it” / “Nobody else should even know I communicated with them” / “Who are (groups of) recipients in this network anyway?” / “What happens with my data? What can I do about this?” DTAI is one of the technical partners (with COSIC and DistriNet): developing a software tool for Privacy Feedback and Awareness, and collaborating with other partners (general interdisciplinary questions, requirements, evaluation). What is Privacy Feedback and Awareness? Examples ...

  18. 1. What can data mining do for privacy? Case study FreeBu: a tool that uses community-detection algorithms to help users perform audience management on Facebook
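
(Aside, for readers who want to experiment: a minimal Python sketch of the kind of community detection FreeBu builds on, applied to a toy friend graph. The graph, the function name suggest_audiences, and the choice of networkx's greedy modularity maximisation are illustrative assumptions, not FreeBu's actual implementation.)

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def suggest_audiences(friend_graph: nx.Graph):
    """Partition a friend graph into candidate audience groups
    via modularity-based community detection."""
    return [sorted(c) for c in greedy_modularity_communities(friend_graph)]

# Toy example: two dense friend clusters joined by one weak bridge.
g = nx.Graph()
g.add_edges_from([("ann", "ben"), ("ben", "cat"), ("ann", "cat"),  # cluster 1
                  ("dan", "eva"), ("eva", "fay"), ("dan", "fay"),  # cluster 2
                  ("cat", "dan")])                                 # bridge
for i, group in enumerate(suggest_audiences(g), 1):
    print(f"suggested audience {i}: {group}")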

  19. An F&A tool for audience management

  20. FreeBu (1): circle

  21. FreeBu (2): circle

  22. FreeBu (3): map

  23. FreeBu (4): column

  24. FreeBu (5): rank

  25. FreeBu is interactive, but does it give a good starting point? Testing against 3 ground-truth groupings and finding “the best” community-detection algorithm
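
(For illustration, a hedged sketch of such a ground-truth comparison using the Adjusted Rand Index from scikit-learn; the labellings below are made up, and the study's actual agreement measure may have differed.)

from sklearn.metrics import adjusted_rand_score

# One group label per friend: the ground-truth grouping and the
# grouping a community-detection algorithm produced.
ground_truth = [0, 0, 0, 1, 1, 2, 2, 2]
detected     = [0, 0, 1, 1, 1, 2, 2, 2]

# ARI = 1.0 means perfect agreement; values near 0 mean chance level.
print(f"ARI = {adjusted_rand_score(ground_truth, detected):.3f}")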

  26. FreeBu: better than Facebook Smart Lists for access control • User experiment, n=16 • 2 groups, same interface (circle); algorithm: hierarchical modularity-maximisation vs. Facebook Smart Lists • Task: think of 3 posts that you wouldn’t want everybody to see, select from the given groups those who should see it • Result: [chart not transcribed]

  27. FreeBu: What do users think? • Two user studies with 12 and 147 participants, respectively • Method: exploratory, mixed methods (interview, questionnaire, log analysis) • Results: • Affordances: grouping for access control, reflection/overview, (unfriending) • Visual effects on attention, e.g. in the “map” and “rank” visualisations

  28. More observations • No relationship of tool appreciation with privacy concerns • “Don’t tell my friends I am using your tool to spy on them” • “Don’t give these data to your colleague” • “How can you show these photos [in an internal presentation] without getting your friends’ consent first?” • Trust in Facebook > trust in researchers & colleagues? • Or: machines / abstract people vs. concrete people? • Recognition of privacy interdependencies? (cf. the discussion of “choice” earlier today) • Feedback tools are themselves spying tools ...

  29. Lessons learned • Social privacy trumps institutional privacy • Change in attitudes or behaviour takes time • No graceful degradation w.r.t. usability: • Tools that are <100% usable are NOT used AT ALL. • What is GOOD? What is BETTER?

  30. 2. Beyond privacy: discrimination/fairness

  31. “Privacy is not the problem“ • Privacy, social justice, and democracy • View 1: Privacy is a problem (partly) because its violation may lead to discrimination.

  32.–33. “Data mining IS discrimination”

  34. “Privacy is not the problem“ • Privacy, social justice, and democracy • View 1: Privacy is a problem (partly) because its violation may lead to discrimination. • View 2: Privacy is one of a set of social issues.

  35. Discrimination-aware data mining (Pedreschi, Ruggieri, & Turini, 2008, + many since then) • PD and PND items: potentially (not) discriminatory • Goal: want to detect & block mined rules such as purpose=new_car & gender=female → credit=no • Measures of the discriminatory power of a rule include elift(B & A → C) = conf(B & A → C) / conf(B → C), where A is a PD item and B a PND item • Note: 2 uses/tasks of data mining here: Descriptive: “In the past, women who got a loan for a new car often defaulted on it.” Prescriptive: (Therefore) “Women who want a new car should not get a loan.”
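
(A worked example of the elift measure just defined, with made-up counts for the rule on this slide; the confidences simply restate the standard definition conf(X → Y) = supp(X & Y) / supp(X).)

# Rule: purpose=new_car & gender=female -> credit=no
# B = purpose=new_car (PND item), A = gender=female (PD item), C = credit=no
n_B     = 200   # applicants wanting a new car (hypothetical counts)
n_B_C   = 50    # ... of whom were denied credit
n_B_A   = 80    # new-car applicants who are female
n_B_A_C = 30    # ... of whom were denied credit

conf_BA_C = n_B_A_C / n_B_A   # conf(B & A -> C) = 0.375
conf_B_C  = n_B_C / n_B       # conf(B -> C)     = 0.25
elift = conf_BA_C / conf_B_C  # 1.5: adding the PD item raises the
print(f"elift = {elift:.2f}") # denial rate by a factor of 1.5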

  36. Limitations of classical DADM

  37. Exploratory DADM: DCUBE-GUI Left: rule count (size) vs. PD/non-PD (colour) Right: rule count (size) vs. AD-measure (rainbow-colours scale)
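
(A hedged matplotlib sketch of the right-hand view described above, with each point standing for a group of rules, sized by rule count and coloured by AD measure. All data is randomly generated for illustration; this is not DCUBE-GUI's actual code.)

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 40
x, y = rng.random(n), rng.random(n)          # layout positions
rule_count = rng.integers(10, 300, n)        # -> marker size
ad_measure = rng.uniform(0.5, 3.0, n)        # -> rainbow colour scale

sc = plt.scatter(x, y, s=rule_count, c=ad_measure, cmap="rainbow")
plt.colorbar(sc, label="AD measure (e.g. elift)")
plt.title("Exploratory DADM: rule groups by count and AD measure")
plt.show()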

  38. Evaluation: Comparing cDADM & eDADM • Constrained DADM (cDADM): “hiding bad patterns”, black box • Exploratory DADM (eDADM): “highlighting bad patterns”, white box

  39. Online experiment with 215 US mTurkers • Framing • Prevention: bank • Detection: agency • $6.00 show-up fee • Tasks • 3 exercise tasks • 6 assessed tasks • $0.25 performance bonus per assessed task • Questionnaire • Demographics • Quant/bank job • Experience with discrimination • Example vignette: “Dabiku is a Kenyan national. She is single and has no children. She has been employed as a manager for the past 10 years. She now asks for a loan of $10,000 for 24 months to set up her own business. She has $100 in her checking account and no other debts. There have been some delays in paying back past loans.”

  40. Decision-making scenario • Task structure • Vignette, describing applicant and application • Rules: positive/negative risks, flagged • Decision and motivation, optional comment • Required competencies • Discard discrimination-indexed rules • Aggregate rule certainties • Justify decision by categorising risk factors
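
(A minimal sketch of the first two competencies listed above: discarding discrimination-indexed rules, then aggregating the remaining rule certainties into a decision. The Rule structure, the signed certainties, and the simple sum aggregation are assumptions for illustration, not the study's exact task interface.)

from dataclasses import dataclass

@dataclass
class Rule:
    feature: str
    certainty: float   # signed: > 0 supports approval, < 0 supports denial
    flagged: bool      # True if the rule is discrimination-indexed

def decide(rules: list[Rule], threshold: float = 0.0) -> str:
    admissible = [r for r in rules if not r.flagged]  # discard flagged rules
    score = sum(r.certainty for r in admissible)      # aggregate certainties
    return "approve" if score > threshold else "deny"

rules = [Rule("savings", +0.40, False),
         Rule("past payment delays", -0.30, False),
         Rule("gender=female", -0.67, True)]          # must be discarded
print(decide(rules))  # -> approve (0.40 - 0.30 = +0.10 > 0)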

  41. Rule visualisation by treatment • (not DA)DM: neither flagged nor hidden • Constrained DADM: hide bad features; prevention scenario • Exploratory DADM: flag bad features; detection scenario • [Screenshots show example rule features: residence, savings, foreigner]

  42. Results: Actionability and decision quality • Decisions and Motivations • DM versus DADM • More correct decisions in DADM • More correct motivations in DADM • No performance impact • Relative merits • Constrained DADM better for prevention • Exploratory DADM better for detection • Biases • Discrimination persistent in cDADM • “I dropped the -.67 number a little bit because it included her being a female as a reason.” Berendt & Preibusch: Better decision support through exploratory discrimination-aware data mining. Artificial Intelligence and Law, 2014

  43. “Privacy is not the problem“ • Privacy, social justice, and democracy • View 1: Privacy is a problem (partly) because its violation may lead to discrimination. • View 2: Privacy is one of a set of social issues. • View 3: Heightened privacy concerns are just a symptom of something more general being wrong. (e.g. Discrimination – underlying definition of fairness – who gets to decide?)

  44. Discrimination-aware data mining (Pedreschi, Ruggieri, & Turini, 2008, + many since then) 2 uses/tasks of data mining: Descriptive “In the past, women who got a loan for a new car often defaulted on it.“ Prescriptive (Therefore) “Women who want a new car should not get a loan.“ Goal: detect the first AND/OR block the second (= push it below a threshold)

  45. What we did • an interactive tool, DCUBE-GUI • a conceptual analysis of • (anti-)discrimination as modelled in data mining (“DADM”) • unlawful discrimination as modelled in law • framework: constraint-oriented vs. exploratory DADM • two user studies (n=20 and n=215) with DADM as decision support that showed • DADM can help make better decisions & motivations • cDADM / eDADM better for different settings • Sanitized patterns are not sufficient to make sanitized minds


  47. Lessons learned: Privacy by design?! • A systems approach is needed • [Diagram of nested system layers:] • Algorithms (no people; “solutionism”): AI / data mining • Interactive systems (e.g. exploratory analysis): users; HCI • Information systems: experts; software development, IS science, law • “Multi-stakeholder information systems”: diverse stakeholders; value-sensitive design, sociology, politics, education

  48. 3. Towards sustainable solutions

  49. Effectiveness of “ethical apps“?

  50. Effectiveness of “ethical apps“? Hudson et al. (2013): • What makes people buy a fair-trade product? • Informational film shown before buying decision? • NO • Having to make the decision in public? • NO • Some prior familiarity with the goals and activities of fair-trade campaigns as well as broader understanding of national and global political issues that are only peripherally related to fair trade? • YES
