China Mobile Leader’s Programme
Mobile Technology
Jon Crowcroft

+gmail, hotmail

+441223763633

+447733 231822

+linkedin, facebook, myspace

4 Areas
  • Mobile Social Networks
  • Data Collection
  • Energy
  • Programming
We meet, we connect, we communicate
  • We meet in real life in the real world
  • We use text messages, phones, IM
  • We make friends on facebook, Second Life
  • How are these related?
  • How do they affect each other?
  • How do they change with new technology?

Thank you but you are in the opposite direction!

I have 100M bytes of data, who can carry for me?

Give it to me, I have 1G bytes phone flash.

I can also carry for you!

Don’t give to me! I am running out of storage.

Reach an access point.

There is one in my pocket…


Search La Bohème.mp3 for me

Finally, it arrives…

Dunbar’s Number & Trust
  • Dunbar’s number: ~150 (for humans)
    • Size of simple communities of humans
    • Reflects ability to cope with group
    • Humans gossip rather than physical grooming
    • Language lets us abstract
    • We can reason up to 5 levels of intentionality
    • (Shakespeare does 6 :-)
  • T = 1 / [3.x^N]
    • T is trust metric
    • 3.x is a number between 3 and 4
    • N is distance in social net
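
A minimal sketch of the trust metric above, assuming the formula is read as T = 1 / (3.x)^N. The base value 3.5 is only an illustrative pick from the 3 to 4 range, and the function name is hypothetical.

```python
def trust(distance: int, base: float = 3.5) -> float:
    """Trust T = 1 / base**N for social distance N (hops in the social net).

    Assumption: the slide's "3.x^N" means (3.x) raised to the power N.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if not 3.0 <= base <= 4.0:
        raise ValueError("the slide puts the base between 3 and 4")
    return 1.0 / base ** distance


# Trust falls off steeply with each extra hop in the social graph.
for n in range(4):
    print(n, round(trust(n), 4))
```

Under this reading, trust is 1 at distance 0 and drops by a factor of 3 to 4 per hop.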
Conjecture on N?
  • N = 0 = Kin (sex)
  • N = 1 = friends (beer/drugs)
  • N = 2 or more = acquaintances(dancing/music/laughing at same jokes)
  • How does this help in facebook?
Conjecture on Online v. Real
  • We’re looking at co-lo networks
    • c.f. haggle, cityware - bluetooth etc
  • AND online social networks
    • Friendship graph on orkut,li,facebook
  • AND communication networks
    • Email address book, sms, phonecalls
  • Can use to infer real relationship
    • I.e. type of edge in graph (and value of N)
Conjectures on Trust
  • Trust in terms of revelation/disclosure
    • Or carrying data (in ferry net)
    • Or simple automated/default grouping for ACLs
  • Need to do some experiments
    • Figure out how ties are broken
    • Forgetting
    • How new tools/technology affect
    • Size and dynamics of social net…
EU Social Net Project Questions
  • What net/edge type is more likely to cause an edge in another net?
    • Does meeting someone dominate over online or vice versa -
    • i.e. how does new tech affect x (size of immediate gang) and
    • N (scope of gang/level of intentionality reasoning)?
    • Can you use this to detect dodgy behaviour (spam, bullying, etc)?
Ongoing studies
  • Data?
    • We have large datasets for single edge-type/modality
      • (6M phone call timeloc, 1M social net)
    • But only very small datasets for 2 or 3 modalities
      • 30 army base people -> retirement
      • 100 school leavers -> University
    • Very heavy-lifting
    • Not only lots of data processing, but worse:
      • Interview each user for context
  • Privacy?
    • Correlating (datamining) the different nets is massive breach of trust
  • Usefulness?
  • Improve privacy
    • As mentioned, could auto-default Fb settings and relate to phone/locn
    • Could also use as interest based filter
  • Fundamental understanding of social groups
    • How society/technology co-evolve
    • Social inclusion and accessibility (!)
  • Epidemiology (*)
  • Buzztraq
    • Use currency of local interest to
    • Fetch content…
  • Two projects -
    • Emulation (ESRC)
    • Run s/w on smart phone that mimics a disease
    • Has a “vector” and SIR(!) parameter per person
    • Run on “real society” based on meeting duration/proximity/frequency
  • Flubook (Horizon)
    • Panic button (“Not well”/“Feeling better”)
    • Uploads list of contacts in last week via free SMS
    • Puts anonymized data on google maps
    • Alerts trusted friendship group on facebook
  • Susceptibility, Infectiousness, Recovery
    • Given contact distribution,
      • Can compute progress of epidemic
      • Whether collapse (S, I low, R high)
      • Or go pandemic (S, I high)
    • As with relationship between online and RL behaviour for socialising,
      • Flubook might alter contact rate…
      • ….systematically for subset of population
      • …(social or geographic) with high S/I
    • Help prevent/collapse epidemic
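
The effect of a changed contact rate can be sketched with a standard discrete-time SIR compartment model. This is an illustrative simplification, not the project's model: it tracks counts of susceptible, infected and recovered people, and every parameter value below is hypothetical.

```python
def simulate_sir(population, initially_infected, contact_rate,
                 transmission_prob, recovery_rate, days):
    """Return a list of (S, I, R) tuples, one per day (day 0 included)."""
    s = float(population - initially_infected)
    i = float(initially_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # Each infected person meets contact_rate people per day; a fraction
        # s/population of them is susceptible, and each such meeting
        # transmits with probability transmission_prob.
        new_infections = min(s, contact_rate * transmission_prob * s * i / population)
        new_recoveries = recovery_rate * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    return max(i for _, i, _ in history)


# Hypothetical scenario: alerts reduce daily contacts from 10 to 3.
baseline = simulate_sir(1000, 10, contact_rate=10, transmission_prob=0.05,
                        recovery_rate=0.2, days=100)
with_alerts = simulate_sir(1000, 10, contact_rate=3, transmission_prob=0.05,
                           recovery_rate=0.2, days=100)
print(peak_infected(baseline) > peak_infected(with_alerts))
```

With the lower contact rate each infected person causes fewer than one new infection before recovering, so the outbreak dies out instead of growing.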
Thank you…
  • Questions? …
And another thing
  • Virtualising online social self
    • Floating it in the “cloud”
    • Crypt content, but allow cloud/fb to match interests (for advertising)
  • Migrate it to track user (and handset)
  • Performance gain
    • handset can be meagre cpu/memory
    • Latency reduced
    • Synchronisation/persistence assured
    • Don’t care if handset lost/stolen :-)
Threads of your life
  • Human level is activities & relationships
  • Nodal level is processing and storage
  • World level is location and context
Idea is…
  • To allow mobile (compact/portable) representation of your activities and relationships (owned by you)
  • Roam across arbitrary nodes in environment (embedded or handset owned by anyone)
  • While recording where you are and context (= other people)

2. Data Collection for Modelling Contact Networks

Eiko Yoneki and Jon Crowcroft

Systems Research Group

University of Cambridge Computer Laboratory

  • Purposes of Data Collection

 Modelling Human Contact Networks

  • Proximity Data Collection Methodology
  • Issues for Data Collection
  • Examples of Data Analysis
  • Extending to Collect/Correlate Online Data
  • Conclusion
Purpose of Data Collection


  • Building communication protocol based on proximity
    • EU FP6 Haggle Project
  • Inferring social interaction, opinion dynamics → apply results to networking and computer systems
    • EU FP7 Socialnets, EU FP7 Recognition
  • Network modelling for epidemiology
    • EPSRC Data Driven Network Modelling for Epidemiology
  • Understanding behaviour to infectious disease outbreak - social and economic influences
    • ESRC FluPhone Project

Haggle: Pocket Switched Networks

  • Networked distributed database over opportunistically connected devices (e.g. Mobile phones)

Legacy network

(e.g. the Internet)

Ex. Haggle Twitter

EU FP6 Haggle


FluPhone Project
  • Understanding behavioural responses to infectious disease outbreaks
  • Extending data collection to general public


Purpose of Data Collection

Modelling Contact Networks: Empirical Approach


Robust data collection from real world

Post-facto analysis and modelling yield insight into human interactions

Data is useful from building communication protocol to understanding disease spread

Proximity Data Collection
  • Sensor board (iMote), mobile phone
  • Proximity detection by Bluetooth, and/or GPS
  • Environmental information (e.g. in train, on road)





Proximity Detection by Bluetooth

Only ≈15% of devices have Bluetooth on

Scanning Interval

2 mins iMote (one week battery life)

5 mins phone (one day battery life)

or continuous scanning by station nodes

Bluetooth inquiry (e.g. 5.12 seconds) gives >90% chance of finding device

Complex discovery protocol

Two modes: discovery and being discovered

5-10 m discovery range

Can it produce reliable data (negligible noise)?



Sensor Board or Phone or ...

  • iMote needs disposable battery
    • Expensive
    • Third world experiment
  • Mobile phone
    • Rechargeable
    • Additional functions (messaging, tracing)
    • Smart phone: location assist applications
  • Provide device or software
  • Combine with online information (e.g. Twitter)



Phone Price vs Functionality

  • Under ~20 GBP
    • Single task (no phone call when application is running)
  • Over ~100 GBP
    • GPS capability
    • Multiple tasks – run application as a background job
  • Challenge to provide software for every mobile phone operating system



Location Data

  • Location data necessary?
    • Ethics approval gets tougher
    • Use of WiFi Access Points or Cell Towers
    • Use of GPS, but not inside buildings
  • Infer location using various information
    • Online Data (Social Network Services, Google)
    • Use of limited location information – Post localisation

Scanner Location in Bath



Target Population


  • Provide devices to limited population or target general public
    • For epidemiology study ~=100% coverage may be required
    • Fluphone project: participants will be general public
  • Or school as mixing centres

Experiment Parameters vs Data Quality

  • Battery life vs Granularity of detection interval
  • Duration of experiments
    • Day, week, month, or year?
    • Data rate
  • Data Storage
    • Contact /GPS data <50K per device per day (in compressed format)
    • Server data storage for receiving data from devices
    • Extend storage by larger memory card
  • Can data collected using different parameters or methods be aggregated?



Data Retrieval Methods

  • Retrieving collected data:
    • Tracking station
    • Online (3G, SMS)
    • Uploading via Web
    • via memory card
  • Incentives for participating in experiments
  • Collection cycle: real-time, day, or week?


Data Transformation for Analysis

Transform to discrete version of contact data

Deal with noise and missing data

Ex. transitive closure

Data analysis requires high performance computer and storage

Low volume - raw data in compact format

Transformation of raw data for analysis increases data volume
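
The discretisation step can be sketched as follows, assuming raw contacts are stored as (node_a, node_b, start_second, end_second) records. The record layout, function name and slot length are illustrative assumptions, not the project's actual format.

```python
from collections import defaultdict


def discretise(contacts, slot_seconds):
    """Map each time slot index to the set of node pairs in contact during it."""
    slots = defaultdict(set)
    for a, b, start, end in contacts:
        edge = tuple(sorted((a, b)))  # undirected: (a, b) equals (b, a)
        # A contact is recorded in every slot it touches, end slot included.
        for slot in range(start // slot_seconds, end // slot_seconds + 1):
            slots[slot].add(edge)
    return dict(slots)


raw = [("a", "b", 0, 250), ("b", "c", 130, 140)]
snapshots = discretise(raw, slot_seconds=120)
expanded_rows = sum(len(edges) for edges in snapshots.values())
print(len(raw), "raw records ->", expanded_rows, "slot entries")
```

One long contact becomes one entry per slot it spans, which is why the transformed data takes more space than the compact raw trace.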



Security and Privacy

  • Current method: Basic anonymisation of identities (MAC address)
  • FluPhone Project – use of HTTPS for data transmission via 3G
  • Anonymising identities may not be enough?
    • Simple anonymisation does not prevent the social graph from being recovered
  • Ethics approval is tough!
    • 40 pages of study protocol document for FluPhone project – took several months to get approval
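
A minimal sketch of basic identity anonymisation, assuming MAC addresses are replaced by a keyed hash (HMAC-SHA256 from the Python standard library). The key, helper names and MAC values are hypothetical. The degree check illustrates the caveat above: pseudonyms hide the identifiers but leave the structure of the contact graph unchanged.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymise(mac: str, secret_key: bytes) -> str:
    """Replace a MAC address with a stable keyed-hash pseudonym."""
    normalised = mac.lower().replace(":", "").replace("-", "")
    digest = hmac.new(secret_key, normalised.encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]


def degree_sequence(edges):
    """Sorted list of node degrees; unchanged by any relabelling of nodes."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return sorted(degrees.values())


KEY = b"example-study-key"  # hypothetical; a real study would keep this secret
raw_edges = [
    ("00:25:d0:e1:13:da", "00:25:d0:e1:13:db"),
    ("00:25:d0:e1:13:da", "00:25:d0:e1:13:dc"),
]
anon_edges = [(pseudonymise(a, KEY), pseudonymise(b, KEY)) for a, b in raw_edges]
print(degree_sequence(raw_edges) == degree_sequence(anon_edges))
```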


Human Connectivity Traces

Capture Human Interactions

..thus far not large scale

Crawdad DB

Contact: 025d04b2b3f 4650000025d0 5416492246711621549 5416492246711644527

Location: 0025d0e113da [lon: -3.384610278596745E125; lat: 1.3168305280597862E182] 5066619950170431763



Size of largest connected component shows network dynamics

Regularity of Network Activity

5 Days



Inter-Contact Time of Node Pairs
  • Power law distribution (+ exponential decay)
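
A small sketch of how inter-contact times could be extracted for one node pair, assuming its contacts are given as non-overlapping (start, end) intervals in seconds. The helper names are hypothetical. The empirical CCDF is the quantity usually plotted on log-log axes to look for a power law.

```python
def inter_contact_times(contacts):
    """Gaps between the end of one contact and the start of the next."""
    ordered = sorted(contacts)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:
            gaps.append(next_start - prev_end)
    return gaps


def ccdf(values):
    """Empirical P[X > x] at each distinct observed value x."""
    n = len(values)
    return [(x, sum(1 for v in values if v > x) / n) for x in sorted(set(values))]


gaps = inter_contact_times([(0, 10), (100, 130), (40, 50)])
print(gaps)
print(ccdf(gaps))
```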




Classification of Node Pairs

I: Community

High Frequency - Long Duration

II: Familiar Stranger

High Frequency - Short Duration

III: Stranger

Low Frequency - Short Duration

IV: Friend

Low Frequency - Long Duration
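
The four quadrants can be expressed as a small classifier. This is a sketch under assumptions: the thresholds are hypothetical, and the label "Stranger" for the low-frequency, short-duration quadrant is inferred from the pattern of the other three.

```python
def classify_pair(num_contacts, total_duration, freq_threshold, duration_threshold):
    """Place a node pair in one of the four frequency/duration quadrants."""
    high_frequency = num_contacts >= freq_threshold
    long_duration = total_duration >= duration_threshold
    if high_frequency and long_duration:
        return "I: Community"
    if high_frequency:
        return "II: Familiar Stranger"
    if long_duration:
        return "IV: Friend"
    return "III: Stranger"  # assumed label for the remaining quadrant


# Hypothetical thresholds: 10 contacts and 3600 seconds of total contact time.
print(classify_pair(25, 7200, freq_threshold=10, duration_threshold=3600))
print(classify_pair(2, 5400, freq_threshold=10, duration_threshold=3600))
```

In practice the thresholds would be derived from the trace itself, for example from the mean number of contacts and mean contact duration.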



Number of Contacts



Contact Duration



Betweenness Centrality

  • How often a node falls on the shortest path between two other nodes
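
Betweenness can be computed with Brandes' algorithm. The sketch below assumes an unweighted, undirected contact graph given as a dict that maps every node to the set of its neighbours. The slides do not say which algorithm or weighting was used.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                              # nodes in BFS visiting order
        preds = {v: [] for v in graph}          # shortest-path predecessors
        sigma = dict.fromkeys(graph, 0)         # number of shortest paths
        dist = dict.fromkeys(graph, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                 # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:      # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):               # accumulate from the farthest node back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}


star = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
print(betweenness(star)["hub"])
```

In the star graph the hub lies on the shortest path of all three leaf pairs, so it scores 3 while every leaf scores 0.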




Uncovering Community

Contact trace in form of weighted (multi) graphs

Contact Frequency and Duration

Use community detection algorithms from complex network studies

K-clique, Weighted network analysis, Betweenness, Modularity, Fiedler Clustering etc.

Fiedler Clustering




Visualisation of Community Dynamics


Extending Data Collection to OSN

  • Online Social Networks (e.g. Facebook, Twitter)
    • Potential to obtain data of dynamic behaviour
    • High volume of data
  • Does Facebook matter?
  • Over 190 M users
  • Growth rates for 2008 around the world
    • Italy: 2900%, Argentina: 2000%, Indonesia: 600%



Power Law Degree Distribution

  • Crawled original Stanford (15043 Nodes), Harvard (18273 nodes) networks
    • From the era when UIDs were assigned sequentially
  • Obtains friends of each user, and their affiliations
  • 2.1 million links, Maximum degree 911



Information Cascade thru Social Networks

  • Use Google geo-coding API - predict the geographical access patterns
  • Timeline from T0 to Tk







  • Real World Data is Powerful!
  • Analyse Network Structure of Social Systems to Model Dynamics → Emerging Research Area
    • Weighted networks
    • Modularity
    • Centrality (e.g. Degree)
    • Community evolution and dynamics
    • Network measurement metrics
    • Patterns of interactions
  • Decide the purpose of data collection first; it determines the data collection method
  • Resolve ethics issues/approval in advance
  • Combine data collection using device and available online data for efficiency and accuracy

Thank You!



3. Challenging Opportunities

Jon Crowcroft

History (personal:-)



Tschudin et al




Pocket Switched & Mobile Social




Choosing Adversity

Perverse, but valid research motive

Make the network really really bad

(like it was in 1970s)

And maybe neat new ideas will emerge

Which will work really, really well on a rock-solid network

Compete with Infrastructure

“They have the guns, we have the numbers”

But maybe opportunities give us information the infrastructure guys can’t or won’t get…


Hard to compute

Mostly assume rational selfish players

Recent market failures prove this is nonsense

What to do instead?

Use a priori social knowledge

Travel plans, SIM, Fb/Buzz data

Privacy and Risk Aversion

May be over sold

Known: younger people are more cavalier with their online presence than older (pre web) generation

But needs respect

at least informed choice (opt out) by user

Problem with id+loc: it is 2/3 of what you need to find out everything

(2 digits of postcode, age +gender)

There may be some trigger event which will change public view

Back to drawing board #0
  • Information theory and opportunities
  • What can we infer
    • popularity in meeting
    • Popularity in communicating
    • Hub/centrality
    • Clique/giant component
  • Predictive patterns of behaviour
    • Latest Barabási Science paper on location
    • Other?
Back to drawing board #1

Non rational players

Tools to measure & adapt to



Opinion dynamics

Back to drawing board #2

One small step at a time

Pair of nodes -

why share anything?

What’s useful

What does it cost

Micro-research agenda…

Share between just 1 pair of phones

Now a phone is much more than a computer

GPS, Camera, Microphone,

Compass, Accelerometer

several networks

Several (heterogeneous) cores in processor

We could share these

e.g. lots of people taking panoramic tiled photos,

or 1 GPS providing lots of people with location

Let's look at actual resource costs

Phone OS now about same as Desktop

Android == Linux

iPhone == OS X

Windows Mobile 6 (actually Windows 7!)

Etc etc

Software uses resources too

E.g. Java garbage collector surprise

Power/network aware applications…

Narseo’s results…

We’ve started looking at resource use in battery terms

Calibrate OS tools for battery charge reporting

By opening up phone and putting probe on battery:)

Then run experiment with lots of users…

Fooling the user

Buzz/Mobile Social

Driving License

Smart Badges:)

Back to Drawing Board #3

What business model fools user best?

What are the ethics?

Buzz was first “big bang” social mix

Take 1 network (gmail contacts, sorted by frequency of interaction)

And bootstrap another with it

How big a cognitive dissonance would this be to do on an opportunistic net?

Without informed consent, would cause major major headaches

Possibly illegal – viz healthcare workers


Thanks to MSR for a bunch of WiMo phones

Thanks to Google for a bunch of Android phones

Thanks to volunteers in Cambridge for abandoning almost all privacy :-)


Do we need both the guns and the numbers?

The truth is out there…

Eiko Yoneki, Ioannis Baltopoulos and Jon Crowcroft

University of Cambridge Computer Laboratory

Systems Research Group

D3N*4 Programming Distributed Computation in Pocket Switched Networks

*Data Driven Declarative Networking

Rise of Sparse Disconnected Networks

Haggle EU FP6: New communication paradigm using dynamic interconnectedness


By necessity or design


With enough mobility for some connectivity over time

Path existing over time

Data has to be delay tolerant

Opportunistic Forwarding instead of Routing



Pocket Switched Networks
  • Topology changes every time unit
  • Node 35 is a hub


Use of dynamic human connectivity

Haggle Node Architecture
  • Each node maintains a data store: its current view of global namespace
    • Persistence of search: delay tolerance and opportunism
  • Semantics of publish/subscribe and an event-driven + asynchronous operation
  • Multi-platform
    • (written in C++ and C)
    • Windows mobile
    • Mac OS X, iPhone
    • Linux
    • Android

Unified Metadata Namespace







D3N Data-Driven Declarative Networking

  • How to program distributed computation?
  • Use Declarative Networking ?

Declarative Networking

  • Declarative is a new idea in networking
    • e.g. Search: ‘what to look for’ rather than ‘how to look for’
    • Abstract complexity in networking/data processing
  • P2: Building overlay using Overlog
    • Network properties specified declaratively
  • LINQ: extend .NET with language integrated operations for query/store/transform data
  • DryadLINQ: extends LINQ similar to Google’s Map-Reduce
    • Automatic parallelization from sequential declarative code
  • Opis: Functional-reactive approach in OCaml

D3N Data-Driven Declarative Networking

  • How to program distributed computation?
  • Use Declarative Networking
    • Use of Functional Programming
      • Simple/clean semantics, expressive, inherent parallelism
    • Queries/filters etc. can be expressed as higher-order functions that are applied in a distributed setting
  • Runtime system provides the necessary native library functions that are specific to each device
    • Prototype: F# + .NET for mobile devices

D3N and Functional Programming I

  • Functions are first-class values
    • They can be both input and output of other functions
    • They can be shared between different nodes (code mobility)
    • Not only data but also functions flow
  • Language syntax does not have state
    • Variables are only ever assigned once; hence reasoning about programs becomes easier

(of course message passing and threads → encode states)

  • Strongly typed
    • Static assurance that the program does not ‘go wrong’ at runtime unlike script languages
  • Type inference
    • Types are not declared explicitly, hence programs are less verbose

D3N and Functional Programming II

  • Integrated features from query language
    • Assurance as in logical programming
  • Appropriate level of abstraction
    • Imperative languages closely specify the implementation details (how); declarative languages abstract too much (what)
    • Imperative – predictable result about performance
    • Declarative language – abstract away many implementation issues
Overview of D3N Architecture
  • Each node is responsible for storing, indexing, searching, and delivering data
  • Primitive functions associated with core D3N calculus syntax are part of the runtime system
  • Prototype on MS Mobile .NET


D3N Syntax and Semantics I


  • Very few primitives
    • Integer, strings, lists, floating point numbers and other primitives are recovered through constructor application
  • Standard FP features
    • Declaring and naming functions through let-bindings
    • Calling primitive and user-defined functions (function application)
    • Pattern matching (similar to switch statement)
    • Standard features as ordinary programming languages (e.g. ML or Haskell)
D3N Syntax and Semantics II


  • Advanced features
    • Concurrency (fork)
    • Communication (send/receive primitives)
    • Query expressions (local and distributed select)
Runtime System


  • Language relies on a small runtime system
    • Operations implemented in the runtime system written in F#
  • Each node is responsible for data:
    • Storing
    • Indexing
    • Searching
    • Delivering
    • Data has Time-To-Live (TTL)
    • Each node propagates data to the other nodes.
    • A search query w/TTL travels within the network until it expires
    • When the node has the matching data, it forwards the data
    • Each node gossips its own metadata when it meets other nodes
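
The TTL-limited search described above can be sketched as a bounded flood. The assumptions here are that TTL is counted in hops (the runtime may well use wall-clock time instead) and that the contact graph and data stores are plain dictionaries. All names are hypothetical.

```python
from collections import deque


def search(contacts, stores, origin, wanted, ttl):
    """Flood a query from origin; each hop uses up one unit of TTL.

    Returns the nodes holding the wanted item that the query reached.
    """
    seen = {origin}
    frontier = deque([(origin, ttl)])
    hits = []
    while frontier:
        node, remaining = frontier.popleft()
        if wanted in stores.get(node, ()):
            hits.append(node)          # this node would forward the data back
        if remaining == 0:
            continue                   # query expired, do not propagate further
        for neighbour in contacts.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, remaining - 1))
    return hits


chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
stores = {"d": {"song.mp3"}}
print(search(chain, stores, "a", "song.mp3", ttl=2))
print(search(chain, stores, "a", "song.mp3", ttl=3))
```

With a TTL of 2 the query expires at node c, one hop short of the data; with a TTL of 3 it reaches node d.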
Kernel Event Handler


  • Kernel maintains
    • An event queue (queue)
    • A list of functions for each event (fenc, fdep)
  • Kernel processes
    • It removes an event from the front of the queue (e)
    • Pattern matches against the event type
    • Calls all the registered functions for the particular event
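
The kernel loop above could look roughly like this. It is an illustrative Python sketch, not the F# runtime; the class, method and event names are hypothetical.

```python
from collections import deque


class Kernel:
    """Event queue plus a list of registered functions per event type."""

    def __init__(self):
        self.queue = deque()
        self.handlers = {}

    def register(self, event_type, handler):
        self.handlers.setdefault(event_type, []).append(handler)

    def post(self, event_type, payload):
        self.queue.append((event_type, payload))

    def step(self):
        """Process one event; return False when the queue is empty."""
        if not self.queue:
            return False
        event_type, payload = self.queue.popleft()       # front of the queue
        for handler in self.handlers.get(event_type, []):  # match on event type
            handler(payload)
        return True

    def run(self):
        while self.step():
            pass


kernel = Kernel()
log = []
kernel.register("meet", lambda peer: log.append(("gossip metadata", peer)))
kernel.register("meet", lambda peer: log.append(("forward data", peer)))
kernel.post("meet", "node35")
kernel.run()
print(log)
```

Handlers run in registration order, and events with no registered function are dropped silently in this sketch.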

Example: Query to Networks


select name from poll() where institute = “Computer Laboratory”

poll()

|> filter (fun r -> r.institute = "Computer Laboratory")

|> map (fun r -> r.name)







(code, nodeid, TTL, data)


  • Queries are part of source level syntax
    • Distributed execution (single node programmer model)
    • Familiar syntax
Example: Vote among Nodes


  • Voting application: implements a distributed voting protocol of choosing location for dinner
  • Rules
    • Each node votes once
    • A single node initiates the application
    • Ballots should not be counted twice
    • No infrastructure-base communication is available or it is too expensive
  • Top-level expression
    • Node A sends the code to all nodes
    • Nodes map in parallel (pmap) the function voteOfNode to their local data, and send back the result to A
    • Node A aggregates (reduce) the results from all nodes and produces a final tally
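
The voting rules above can be sketched as a map step followed by a reduce step. The sketch runs everything locally: a list comprehension stands in for the distributed pmap, and the record fields are hypothetical.

```python
from collections import Counter


def vote_of_node(local_data):
    """Map step: runs on each node against its own local data."""
    return local_data["node_id"], local_data["dinner_vote"]


def tally(ballots):
    """Reduce step on the initiator: count each node's ballot only once."""
    counted = {}
    for node_id, choice in ballots:
        counted.setdefault(node_id, choice)   # later duplicates are ignored
    return Counter(counted.values())


nodes = [
    {"node_id": "a", "dinner_vote": "curry"},
    {"node_id": "b", "dinner_vote": "pizza"},
    {"node_id": "c", "dinner_vote": "curry"},
]
ballots = [vote_of_node(n) for n in nodes]    # stands in for pmap
ballots.append(("b", "pizza"))                # a ballot delivered twice
print(tally(ballots).most_common(1))
```

Keying ballots by node identifier is one simple way to meet the rule that ballots are not counted twice when opportunistic forwarding delivers a duplicate.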
Sequential Map function (smap)


  • Inner working
    • It sends the code to execute on the remote node
    • It blocks waiting for a response from the node
    • Continues mapping the function to the rest of the nodes in a sequential fashion
    • An unavailable node blocks the entire computation
Parallel Map Function (pmap)










  • Inner working
    • Similar to the sequential case
    • The send/receive for each node happen in a separate thread
    • An unavailable node does not block the entire computation
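
A rough sketch of the parallel map idea using a thread per node. The `call_remote` transport is a hypothetical stand-in, and an unavailable node is modelled as one that raises `ConnectionError`. A node that hangs forever would additionally need a timeout, which this sketch omits.

```python
from concurrent.futures import ThreadPoolExecutor


def pmap(func, nodes, call_remote):
    """Apply func on every node in its own thread; unavailable nodes yield None."""
    def attempt(node):
        try:
            return node, call_remote(node, func)
        except ConnectionError:
            return node, None          # skip this node, keep the others

    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return dict(pool.map(attempt, nodes))


def fake_remote(node, func):
    """Hypothetical transport: node 'c' is out of range."""
    if node == "c":
        raise ConnectionError("node unreachable")
    return func(node)


results = pmap(str.upper, ["a", "b", "c"], fake_remote)
print(results)
```

Because each send/receive runs in its own thread, the failure of node c leaves the results from a and b intact.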
Reduce Function


  • Inner working
    • The reduce function aggregates the results from a map
    • The reduce gets executed on the initiator node
    • All results must have been received before the reduce can proceed
Cascaded Map Function








(a) Social Graph

(b) Nodes for Map at A

(c) Nodes for Map at B


  • Social Graph can be exploited for map function
    • Logical topology extracted from social networks
    • Construct a minimum spanning tree with node A
    • Use tree as navigation of task
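
One way to build the task-navigation tree is Prim's algorithm rooted at the initiator. The sketch assumes the social graph is a dict of dicts with a numeric edge weight, where a lower weight means a closer tie; the weights and their meaning are illustrative assumptions.

```python
import heapq


def spanning_tree(graph, root):
    """Prim's algorithm; returns {node: [children]} for the tree rooted at root."""
    children = {root: []}
    heap = [(weight, root, node) for node, weight in graph[root].items()]
    heapq.heapify(heap)
    while heap:
        weight, parent, node = heapq.heappop(heap)
        if node in children:
            continue                    # already attached via a cheaper edge
        children[node] = []
        children[parent].append(node)
        for nxt, w in graph[node].items():
            if nxt not in children:
                heapq.heappush(heap, (w, node, nxt))
    return children


social = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"C": 1},
}
print(spanning_tree(social, "A"))
```

Each node then forwards the map task only to its children in the tree, so A maps over B, B maps over C, and C maps over D in this example.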
Outlook and Future Work
  • Current reference implementation:
    • F# targeting .NET platform taking advantage of a vast collection of .NET libraries for implementing D3N primitives
  • Future work:
    • Security issues are currently out of scope for this paper, notably executable code migrating from node to node
    • Validate and verify the correctness of the design by implementing a compiler targeting various mobile devices
    • Disclose code in public domain