
The Economics of Community Detection and Hiding



Presentation Transcript


  1. The Economics of Community Detection and Hiding. Ross Anderson, Shishir Nagaraja

  2. Introduction • Conflicts turn on connectivity • Large-scale monitoring programs pursued by repressive governments • Privacy risks from the aggregation of social network information • The role of surveillance and traffic analysis in uncovering networks: covert networks are often embedded within larger networks such as social networks • Detecting hidden communities goes back a long way in the history of traffic analysis • Danezis and Wittneben surveillance model (Danezis and Wittneben, 2006) – the static case

  3. Detecting communities • The network is represented by a graph • Nodes represent people • Edges represent relations between people • [Figure label: Covert group]

  4. Surveillance model • Attacker objective: perform surveillance, to detect communities within a social network • Attacker model: global passive adversary, with a finite surveillance budget (count of node wiretaps) • Full knowledge • Partial knowledge: the use of anonymous communications
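The attacker model above can be illustrated with a minimal sketch. It assumes, as a simplification not stated on the slide, that wiretapping a node reveals every edge incident to that node; the function names `observed_edges` and `observed_nodes` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def observed_edges(edges: Iterable[Edge], tapped: Set[int]) -> Set[Edge]:
    """Edges visible to an adversary who wiretaps the nodes in `tapped`.

    Assumption: a wiretap on a node reveals all edges incident to it.
    """
    return {(u, v) for u, v in edges if u in tapped or v in tapped}


def observed_nodes(edges: Iterable[Edge], tapped: Set[int]) -> Set[int]:
    """Nodes the adversary learns about: tapped nodes plus their neighbours."""
    seen = set(tapped)
    for u, v in observed_edges(edges, tapped):
        seen.update((u, v))
    return seen


# A path 0-1-2-3 with a surveillance budget of one wiretap, placed on node 1.
path = [(0, 1), (1, 2), (2, 3)]
visible = observed_edges(path, {1})
```

With a budget of one wiretap on node 1, the adversary sees two of the three edges but never learns that node 3 exists.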

  5. Defence objective: minimise the exposure of the covert network – maximise false positive and false negative rates in community detection • Defence model: defences are based on local topology knowledge, limited by a budget on the number of edges each node may add • Questions raised

  6. Community detection techniques • Min-cuts • Intuition: communities are separated by a small number of edges • Error tolerance: linear increase in additional edges (noise) between the communities leads to ~exponential increase in false negative rate • Modularity • Intuition: the number of edges falling within communities is significantly greater than the expected number in an equivalent network with edges placed uniformly at random • Error tolerance: linear increase in noise leads to only a ~linear increase in the false negative rate!
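Both intuitions can be made concrete on a toy graph. The sketch below counts the edges crossing a given two-way split (the quantity a min-cut method tries to keep small) and computes modularity using Newman's standard normalised definition, Q = Σ_c [e_c/m − (d_c/2m)²], where e_c is the number of edges inside community c and d_c is the total degree of its nodes. The toy graph and the function names are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def cut_size(edges: List[Edge], membership: Dict[int, int]) -> int:
    """Number of edges whose endpoints lie in different communities."""
    return sum(1 for u, v in edges if membership[u] != membership[v])


def modularity(edges: List[Edge], membership: Dict[int, int]) -> float:
    """Newman modularity of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = defaultdict(int)    # e_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (2, 3).
triangles = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# "Noise": two extra edges between the communities.
noisy = triangles + [(0, 4), (1, 5)]
```

On the clean graph the cut has one edge and Q = 5/14; with the two noise edges the cut grows to three edges and Q drops to 1/6, which is the kind of degradation the error-tolerance bullets describe.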

  7. Modularity based detection • Modularity matrix: $B_{ij} = A_{ij} - \frac{k_i k_j}{2m}$, where $A_{ij}$ is the adjacency matrix, $k_i$ is the degree of node $i$ and $m$ is the total number of edges • For each pair of nodes: (the actual number of edges between them) – (the expected number of edges between them) • Group membership: $s_i \in \{+1, -1\}$ • Maximise “modularity”: $Q \sim s^T B s$, i.e. $Q$ = (number of edges within communities) – (expected number of such edges) • By: finding the leading eigenvector $u^{(1)}$ of $B$ • Group membership is then assigned according to $\operatorname{sign}(u_i^{(1)})$ • “Importance” to the community is $\sim |u_i^{(1)}|$
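The procedure on this slide can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it builds the modularity matrix from an edge list and approximates the leading eigenvector by power iteration on a shifted matrix (the shift, the fixed iteration count and the random starting vector are assumptions made here so the iteration converges to the most positive eigenvalue). It assumes nodes are numbered 0..n-1 and the graph has at least one edge.

```python
import math
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def modularity_matrix(n: int, edges: List[Edge]) -> List[List[float]]:
    """B[i][j] = A[i][j] - k_i * k_j / (2m) for an undirected graph."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] += 1.0
        A[v][u] += 1.0
    k = [sum(row) for row in A]
    two_m = sum(k)  # sum of degrees equals 2m
    return [[A[i][j] - k[i] * k[j] / two_m for j in range(n)]
            for i in range(n)]


def leading_eigenvector(B: List[List[float]], iterations: int = 500,
                        seed: int = 0) -> List[float]:
    """Power iteration on B + shift*I, which has the same eigenvectors as B.

    The shift (max absolute row sum) bounds the spectral radius of B, so all
    shifted eigenvalues are non-negative and the largest one dominates.
    """
    n = len(B)
    shift = max(sum(abs(x) for x in row) for row in B)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def bisect(n: int, edges: List[Edge]) -> List[int]:
    """Assign each node to +1 or -1 by the sign of its eigenvector entry."""
    u = leading_eigenvector(modularity_matrix(n, edges))
    return [1 if x >= 0 else -1 for x in u]


# Two triangles joined by a bridge: the split should separate the triangles.
s = bisect(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)])
```

The overall sign of an eigenvector is arbitrary, so only the grouping matters, not which side is labelled +1. For graphs of the size used in the talk, a sparse eigensolver would be a more practical choice than this dense loop.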

  8. Network dataset • Social network harvested from email exchanges between members of the University Rovira i Virgili (R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt and A. Arenas, Physical Review E, vol. 68, 065103(R), 2003). • We extracted the giant connected component, which was found to be scale-free (exponent = 2.02), with 1133 nodes and 10903 edges. • The dataset easily partitions into two scale-free communities of 831 nodes (exponent = 2.19, mean degree = 8.19 edges per node) and 302 nodes (exponent = 2.25, mean degree = 8.5 edges per node).
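Extracting a giant connected component from an edge list can be sketched with a breadth-first search. This is a generic illustration under the assumption that the network is treated as undirected; it is not the preprocessing code used for the dataset, and `giant_component` is a hypothetical name.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def giant_component(edges: Iterable[Edge]) -> Set[int]:
    """Return the node set of the largest connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen: Set[int] = set()
    best: Set[int] = set()
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        comp = {start}
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    queue.append(y)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


# A three-node path plus a separate two-node pair.
largest = giant_component([(0, 1), (1, 2), (3, 4)])
```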

  9. Effects of surveillance on community detection accuracy (Theoretical upper-bound)

  10. Counter-surveillance defences • Our defences are modelled on adding cover traffic: • Naïve strategy (RNDc–RNDm): add edges between a random covert node and a random main network node (uniform distribution) • Pure centrality based strategies: add edges between a high centrality covert node and a high centrality main network node: HBc–HBm (HB is high betweenness centrality) or HDc–HDm (HD is high degree centrality) • Hybrid centrality based strategies: RNDc–HBm and RNDc–HDm • (Freeman) betweenness centrality of a node is the number of shortest paths going through it
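The edge-addition strategies can be sketched as follows for the degree-based and random variants only (the betweenness variants are omitted to keep the example short). Several details are assumptions made for illustration: "high degree" is read as "the highest-degree node not already linked", degrees are taken from the original graph and not updated, and `budget` is the total number of edges added, not the per-node budget of the defence model. The function name `add_cover_edges` is hypothetical.

```python
import random
from collections import Counter
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def _order(nodes: Set[int], pick: str, deg: Counter,
           rng: random.Random) -> List[int]:
    """Candidate order: by descending degree ("HD") or uniformly at random."""
    if pick == "HD":
        return sorted(nodes, key=lambda x: (-deg[x], x))
    return rng.sample(sorted(nodes), len(nodes))


def add_cover_edges(edges: List[Edge], covert: Set[int], main: Set[int],
                    budget: int, covert_pick: str = "RND",
                    main_pick: str = "HD", seed: int = 0) -> List[Edge]:
    """Return up to `budget` new (covert, main) edges.

    covert_pick / main_pick: "RND" (uniform random) or "HD" (high degree).
    RND+RND is the naive strategy, HD+HD the pure one, RND+HD the hybrid.
    """
    rng = random.Random(seed)
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    existing = {frozenset(e) for e in edges}
    added: List[Edge] = []
    for _ in range(budget):
        c_order = _order(covert, covert_pick, deg, rng)
        m_order = _order(main, main_pick, deg, rng)
        # First candidate pair that is not already connected.
        pair = next(((c, m) for c in c_order for m in m_order
                     if frozenset((c, m)) not in existing), None)
        if pair is None:
            break  # every covert-main pair is already linked
        existing.add(frozenset(pair))
        added.append(pair)
    return added


# Covert triangle {3, 4, 5} attached to a main triangle {0, 1, 2} via (2, 3).
graph = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
hybrid = add_cover_edges(graph, {3, 4, 5}, {0, 1, 2}, budget=2)
```

A per-node budget such as the one in the defence model could be converted to a total by multiplying it by the number of covert nodes and rounding, which is again a choice made here and not something the slides specify.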

  11. The effect of applying defences

  12. Community detection accuracy under counter-surveillance defences

  13. Conclusions • Regardless of the use of anonymous channels, placing 8% of the network under direct surveillance reveals 45% of community membership. • We have analysed the dynamics of community detection and hiding: • Naïve strategies don’t work • Pure centrality based strategies are expensive • Hybrid centrality strategies are much more effective (a budget of 0.082 edges per covert node is enough to hide 70% of the covert network, even when 99% is under direct surveillance)
