Published by Ann Powell. Modified over 9 years ago.
1
China Mobile Leader’s Programme Mobile Technology Jon Crowcroft http://www.cl.cam.ac.uk/~jac22 Jon.crowcroft@cl.cam.ac.uk +gmail, hotmail +441223763633 +447733 231822 +linkedin, facebook, myspace
2
4 Areas Mobile Social Networks Data Collection Energy Programming
3
1. Online Mobile Social Nets & Real Life
4
We meet, we connect, we communicate We meet in real life in the real world We use text messages, phones, IM We make friends on facebook, Second Life How are these related? How do they affect each other? How do they change with new technology?
5
Give it to me, I have 1 GB of phone flash. I have 100 MB of data, who can carry it for me? I can also carry for you! Thank you, but you are going in the opposite direction! Don’t give it to me! I am running out of storage. Reach an access point. Internet. Finally, it arrives… Search La Bonheme.mp3 for me. There is one in my pocket…
6
My facebook friendswheel
7
My email statistics!
8
Cliques and Communities
9
Dunbar’s Number & Trust. Dunbar’s number: 150 (for humans). Size of simple communities of humans. Reflects ability to cope with group. Humans gossip rather than physical grooming. Language lets us abstract. We can reason up to 5 levels of intentionality (Shakespeare does 6 :-). T = 1 / (3.x)^N, where T is the trust metric, 3.x is a number between 3 and 4, and N is distance in the social net.
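To make the trust formula concrete, here is a minimal Python sketch. It reads the slide's "3.x^N" as a single base between 3 and 4 raised to the power N; the default base of 3.5 and the function name `trust` are assumptions made for illustration only.

```python
def trust(distance, base=3.5):
    """Trust metric T = 1 / base**distance.

    `distance` is N, the hop count in the social graph. `base` stands in
    for the slide's "3.x"; 3.5 is an assumed value inside (3, 4).
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    if not 3.0 < base < 4.0:
        raise ValueError("base is expected to lie between 3 and 4")
    return 1.0 / (base ** distance)


# Trust falls off quickly with social distance N = 0, 1, 2, 3.
for n in range(4):
    print(n, round(trust(n), 4))
```

Under this reading, kin (N = 0) get full trust and each extra hop divides trust by the base.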
10
Conjecture on N? N = 0 = Kin (sex) N = 1 = friends (beer/drugs) N = 2 or more = acquaintances (dancing/music/laughing at same jokes) How does this help in facebook?
11
Conjecture on Online v. Real. We’re looking at co-location networks (cf. Haggle, Cityware: Bluetooth etc.) AND online social networks (friendship graph on Orkut, LinkedIn, Facebook) AND communication networks (email address book, SMS, phone calls). Can use these to infer the real relationship, i.e. the type of edge in the graph (and value of N).
12
Conjectures on Trust Trust in terms of revelation/disclosure Or carrying data (in ferry net) Or simple automated/default grouping for ACLs Need to do some experiments Figure out how ties are broken Forgetting How new tools/technology affect Size and dynamics of social net…
13
EU Social Net Project Questions What net/edge type is more likely to cause an edge in another net? Does meeting someone dominate over online or vice versa - i.e. how does new tech affect x (size of immediate gang) and N (scope of gang/level of intentionality reasoning?)? Can you use this to detect dodgy behaviour (spam, bullying, etc)?
14
Ongoing studies. Data? We have large datasets for a single edge-type/modality (6M phone call time/location records, 1M social net) but only very small datasets for 2 or 3 modalities: 30 army base people -> retirement; 100 school leavers -> University. Very heavy lifting: not only lots of data processing, but worse, interviewing each user for context. Privacy? Correlating (data mining) the different nets is a massive breach of trust. Usefulness?
15
1. Improve privacy
   1. As mentioned, could auto-default Fb settings and relate to phone/location
   2. Could also use as interest-based filter
2. Fundamental understanding of social groups
   1. How society/technology co-evolve
   2. Social inclusion and accessibility (!)
3. Epidemiology (*)
4. Buzztraq
   1. Use currency of local interest to
   2. Fetch content…
16
Epidemiology. Two projects. Emulation (ESRC): run s/w on smart phone that mimics a disease; has a “vector” and SIR(!) parameter per person; run on “real society” based on meeting duration/proximity/frequency. Flubook (Horizon): panic button (“Not well”/“Feeling better”); uploads list of contacts in last week via free SMS; puts anonymized data on Google Maps; alerts trusted friendship group on Facebook.
17
SIR: Susceptibility, Infectiousness, Recovery. Given contact distribution, can compute progress of epidemic: whether it collapses (S, I low, R high) or goes pandemic (S, I high). As with the relationship between online and RL behaviour for socialising, Flubook might alter contact rate… systematically for a subset of the population (social or geographic) with high S/I. Help prevent/collapse epidemic.
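To illustrate how a change in contact rate can decide between collapse and spread, here is a minimal Python sketch. It uses the textbook discrete-time SIR compartment model, which is a stand-in and not necessarily the per-person parameterisation used in the projects above. The names `beta` (effective contact/infection rate) and `gamma` (recovery rate), and all numeric values, are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, steps=200):
    """Discrete-time SIR model on population fractions.

    Returns a list of (susceptible, infected, recovered) tuples, one per step.
    """
    s, i, r = s0, i0, 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i   # contacts between S and I
        new_recoveries = gamma * i      # infected people recovering
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest infected fraction seen during the run."""
    return max(i for _, i, _ in history)


# Same disease (gamma), two contact rates: unchanged versus reduced.
spreading = simulate_sir(beta=0.5, gamma=0.2)
collapsing = simulate_sir(beta=0.15, gamma=0.2)
print("peak infected, unchanged contact rate:", round(peak_infected(spreading), 3))
print("peak infected, reduced contact rate:  ", round(peak_infected(collapsing), 3))
```

In this sketch the outbreak only grows while `beta * s` exceeds `gamma`, which is why lowering the contact rate for part of the population can tip the epidemic towards collapse.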
18
Thank you… Questions? …
19
And another thing Virtualising online social self Floating it in the “cloud” Crypt content, but allow cloud/fb to match interests (for advertising) Migrate it to track user (and handset) Performance gain handset can be meagre cpu/memory Latency reduced Synchronisation/persistence assured Don’t care if handset lost/stolen :-)
20
Snakes (and Ladders) on a Plane Human Node World
21
Threads of your life Human level is activities & relationships Nodal level is processing and storage World level is location and context
22
Idea is… To allow a mobile (compact/portable) representation of your activities and relationships (owned by you) to roam across arbitrary nodes in the environment (embedded or handset, owned by anyone) while recording where you are and context (= other people).
23
2. Data Collection for Modelling Contact Networks Eiko Yoneki and Jon Crowcroft eiko.yoneki@cl.cam.ac.uk Systems Research Group University of Cambridge Computer Laboratory
24
Outline Purposes of Data Collection Modelling Human Contact Networks Proximity Data Collection Methodology Issues for Data Collection Examples of Data Analysis Extending to Collect/Correlate Online Data Conclusion
25
Purpose of Data Collection Building communication protocol based on proximity EU FP6 Haggle Project Inferring social interaction, opinion dynamics Apply results to networking and computer systems EU FP7 Socialnets, EU FP7 Recognition Network modelling for epidemiology EPSRC Data Driven Network Modelling for Epidemiology Understanding behaviour to infectious disease outbreak - social and economic influences ESRC FluPhone Project
26
Haggle: Pocket Switched Networks. EU FP6 Haggle http://www.haggleproject.org Networked distributed database over opportunistically connected devices (e.g. mobile phones). [Figure labels: legacy network (e.g. the Internet); Ex. Haggle Twitter]
27
FluPhone Project Understanding behavioural responses to infectious disease outbreaks Extending data collection to general public https://www.fluphone.org
28
Purpose of Data Collection Robust data collection from real world Post-facto analysis and modelling yield insight into human interactions Data is useful from building communication protocol to understanding disease spread Modelling Contact Networks: Empirical Approach
29
Proximity Data Collection Sensor board (iMote), mobile phone Proximity detection by Bluetooth, and/or GPS Environmental information (e.g. in train, on road) [Figure labels: AroundYou, FluPhone, iMote]
30
Proximity Detection by Bluetooth. Only ~15% of devices have Bluetooth on. Scanning interval: 2 mins on iMote (one week battery life), 5 mins on phone (one day battery life), or continuous scanning by station nodes. Bluetooth inquiry (e.g. 5.12 seconds) gives >90% chance of finding a device. Complex discovery protocol; two modes: discovery and being discovered. 5~10 m discovery range. Can it produce reliable data (negligible noise)?
31
Sensor Board or Phone or... iMote needs disposable battery Expensive Third world experiment Mobile phone Rechargeable Additional functions (messaging, tracing) Smart phone: location assist applications Provide device or software Combine with online information (e.g. Twitter)
32
Phone Price vs Functionality. ~<20 GBP range: single task (no phone call when application is running). ~>100 GBP: GPS capability; multiple tasks, run application as a background job. Challenge to provide software for every mobile phone operating system.
33
Location Data. Location data necessary? Ethics approval gets tougher. Use of WiFi access points or cell towers. Use of GPS, but not inside buildings. Infer location using various information: online data (social network services, Google). Use of limited location information: post-localisation. [Figure: scanner location in Bath]
34
Target Population. Provide devices to limited population or target general public. For epidemiology study ~100% coverage may be required. FluPhone project: participants will be general public, or school as mixing centres.
35
Experiment Parameters vs Data Quality Battery life vs Granularity of detection interval Duration of experiments Day, week, month, or year? Data rate Data Storage Contact /GPS data <50K per device per day (in compressed format) Server data storage for receiving data from devices Extend storage by larger memory card Collected data using different parameters or methods aggregated?
36
Data Retrieval Methods. Retrieving collected data: tracking station; online (3G, SMS); uploading via Web; via memory card. Incentive for participating in experiments. Collection cycle: real-time, day, or week?
37
Data Transformation for Analysis. Transform to discrete version of contact data. Deal with noise and missing data. Ex. transitive closure. Data analysis requires high performance computer and storage. Low volume: raw data in compact format. Transformation of raw data for analysis increases data volume.
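The transformation to a discrete version of the contact data can be sketched as below. The record layout `(node_a, node_b, start, end)` in seconds, the slot width, and the function name are assumptions for illustration; real traces will differ.

```python
from collections import defaultdict


def discretise_contacts(contacts, slot_seconds):
    """Map (a, b, start, end) records to {slot index: set of undirected pairs}."""
    slots = defaultdict(set)
    for a, b, start, end in contacts:
        if end < start:
            continue  # drop malformed records: one simple form of noise handling
        pair = tuple(sorted((a, b)))  # A-B and B-A are the same contact
        first = int(start // slot_seconds)
        last = int(end // slot_seconds)
        for slot in range(first, last + 1):
            slots[slot].add(pair)
    return dict(slots)


raw = [
    ("A", "B", 0, 130),    # spans two 120-second slots
    ("B", "A", 100, 110),  # same pair seen from the other device
    ("B", "C", 300, 250),  # malformed: ends before it starts
    ("B", "C", 400, 410),
]
print(discretise_contacts(raw, slot_seconds=120))
```

One raw record can expand into several per-slot entries, which is one way the transformed data ends up larger than the compact raw format.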
38
Security and Privacy. Current method: basic anonymisation of identities (MAC address). FluPhone Project: use of HTTPS for data transmission via 3G. Anonymising identities may not be enough? Simple anonymisation does not prevent the social graph from being discovered. Ethics approval is tough! 40 pages of study protocol document for the FluPhone project; it took several months to get approval.
39
Consent
40
Human Connectivity Traces. Capture human interactions; thus far not large scale. Crawdad DB http://crawdad.cs.dartmouth.edu/ Contact: 025d04b2b3f 4650000025d0 5416492246711621549 5416492246711644527 Location: 0025d0e113da [lon: -3.384610278596745E125; lat: 1.3168305280597862E182] 5066619950170431763 HAGGLE
41
Regularity of Network Activity. Size of largest connected nodes shows network dynamics. [Figure labels: Tuesday, 5 Days]
42
Inter Contact Time of Pair Nodes. Power law distribution (+ exponential decay) cutoff. [Figure axis: Time]
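Inter-contact times for one pair of nodes can be derived from its list of contact intervals, as in the sketch below. The `(start, end)` interval format and the function name are assumptions for illustration.

```python
def inter_contact_times(intervals):
    """Gaps between successive contacts of one node pair.

    `intervals` is a list of (start, end) times. Overlapping or touching
    contacts are merged so that only real gaps are reported.
    """
    gaps = []
    current_end = None
    for start, end in sorted(intervals):
        if current_end is not None and start > current_end:
            gaps.append(start - current_end)
        current_end = end if current_end is None else max(current_end, end)
    return gaps


pair_contacts = [(0, 10), (30, 40), (35, 50), (120, 125)]
print(inter_contact_times(pair_contacts))
```

Collecting these gaps over all pairs gives the empirical distribution whose tail shape is discussed on this slide.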
43
Classification of Node Pairs
I: Community. High Frequency - Long Duration
II: Familiar Stranger. High Frequency - Short Duration
III: Stranger. Low Frequency - Short Duration
IV: Friend. Low Frequency - Long Duration
[Figure axes: Contact Duration vs Number of Contacts, quadrants I to IV]
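The four quadrants can be written as a small classifier. The thresholds separating "high" from "low" frequency and "long" from "short" duration are not given on the slide, so they are left as caller-supplied parameters; the values in the example are arbitrary.

```python
def classify_pair(num_contacts, total_duration, freq_threshold, duration_threshold):
    """Place a node pair in one of the four quadrants of the slide.

    Both thresholds are assumptions supplied by the caller.
    """
    high_frequency = num_contacts >= freq_threshold
    long_duration = total_duration >= duration_threshold
    if high_frequency and long_duration:
        return "I: Community"
    if high_frequency:
        return "II: Familiar Stranger"
    if long_duration:
        return "IV: Friend"
    return "III: Stranger"


# Arbitrary example thresholds: 10 contacts, 3600 seconds in total.
print(classify_pair(25, 7200, freq_threshold=10, duration_threshold=3600))
print(classify_pair(25, 300, freq_threshold=10, duration_threshold=3600))
```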
44
Betweenness Centrality. Frequency of a node that falls on the shortest path between two other nodes. [Figure panels: MIT, Cambridge]
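As a sketch of how this measure can be computed, the code below applies Brandes' algorithm to an unweighted, undirected graph. Treating the contact graph as unweighted is an assumption; the analysis behind the slide may have used weighted or time-dependent paths. The example graph is hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    `graph` maps each node to an iterable of neighbours. The result gives,
    per node, the summed fraction of shortest paths between other pairs
    that pass through it.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                       # nodes in non-decreasing distance
        preds = {v: [] for v in graph}   # shortest-path predecessors
        paths = {v: 0 for v in graph}    # number of shortest paths from source
        dist = {v: -1 for v in graph}
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        dependency = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1.0 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # every undirected pair was counted once from each endpoint
    return {v: s / 2.0 for v, s in score.items()}


star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
print(betweenness(star))
```

In the star example the hub lies on the shortest path of every leaf pair, so it carries all of the betweenness.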
45
Fiedler Clustering Uncovering Community Contact trace in form of weighted (multi) graphs Contact Frequency and Duration Use community detection algorithms from complex network studies K-clique, Weighted network analysis, Betweenness, Modularity, Fiedler Clustering etc.
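Of the algorithms listed, Fiedler clustering is the easiest to sketch without external libraries. It splits a graph in two by the sign of the Fiedler vector, the eigenvector for the second-smallest eigenvalue of the graph Laplacian. The version below approximates that vector by power iteration on a shifted Laplacian. It assumes a connected, unweighted graph with sortable node labels, and the fixed start vector and iteration count are illustrative choices, not tuned values.

```python
def fiedler_partition(graph, iterations=500):
    """Bisect a connected undirected graph by the sign of the Fiedler vector.

    Power iteration is run on (c*I - L), where L is the graph Laplacian and
    c = 2 * max degree, while projecting out the constant vector (the
    eigenvector of L with eigenvalue 0).
    """
    nodes = sorted(graph)
    n = len(nodes)
    index = {v: i for i, v in enumerate(nodes)}
    degree = [len(set(graph[v])) for v in nodes]
    shift = 2.0 * max(degree)

    x = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]   # remove the constant component
        y = []
        for v in nodes:
            i = index[v]
            neighbour_sum = sum(x[index[w]] for w in set(graph[v]))
            laplacian_term = degree[i] * x[i] - neighbour_sum  # (L x)_i
            y.append(shift * x[i] - laplacian_term)
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    mean = sum(x) / n
    x = [xi - mean for xi in x]
    negative = {v for v in nodes if x[index[v]] < 0}
    return negative, set(nodes) - negative


# Two triangles joined by a single edge (2-3): a hypothetical two-community graph.
barbell = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(fiedler_partition(barbell))
```

For contact traces, edge weights such as contact frequency or duration would normally enter the Laplacian; that extension is left out here to keep the sketch short.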
46
Visualisation of Community Dynamics
47
Extending Data Collection to OSN. Online Social Networks (e.g. Facebook, Twitter): potential to obtain data of dynamic behaviour; high volume of data. Does Facebook matter? Over 190 M users. Growth rates for 2008 around the world: Italy 2900%, Argentina 2000%, Indonesia 600%.
48
Power Law Degree Distribution. Crawled original Stanford (15043 nodes), Harvard (18273 nodes) networks, from the era when UIDs were assigned sequentially. Obtains friends of each user, and their affiliations. 2.1 million links, maximum degree 911.
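A degree distribution like the one plotted on this slide can be derived from a crawled friendship list as sketched below. The edge-list format and the toy data are assumptions; the crawled networks themselves are not reproduced here.

```python
from collections import Counter


def degree_distribution(edges):
    """Return ({node: degree}, {degree: number of nodes}) for undirected edges."""
    degree = Counter()
    unique_links = {tuple(sorted(edge)) for edge in edges}  # ignore duplicates
    for a, b in unique_links:
        if a == b:
            continue  # skip self-loops
        degree[a] += 1
        degree[b] += 1
    histogram = Counter(degree.values())
    return dict(degree), dict(histogram)


# Toy friendship list; ("b", "a") repeats ("a", "b") as a crawl might.
friendships = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "d"), ("c", "d")]
degrees, histogram = degree_distribution(friendships)
print("maximum degree:", max(degrees.values()))
print("nodes per degree:", histogram)
```

Plotting the histogram on log-log axes is the usual way to check whether the tail looks like a power law.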
49
Information Cascade thru Social Networks. Use Google geo-coding API - predict the geographical access patterns. [Figure: timeline T0 … Tk; Texas, Illinois, Florida]
50
Conclusions. Real World Data is Powerful! Analyse network structure of social systems to model dynamics. Emerging research area: weighted networks; modularity; centrality (e.g. degree); community evolution and dynamics; network measurement metrics; patterns of interactions. Plan the purpose of data collection first; that determines the data collection method. Resolve ethics issues/approval in advance. Combine data collection using devices with available online data for efficiency and accuracy.
51
Thank You!
52
3. Challenging Opportunities Jon Crowcroft, http://www.cl.cam.ac.uk/~jac22
53
History (personal:-) Manet Mobileman Tschudin et al Incredibles Dtn Interplanetary/Oceanographic Pocket Switched & Mobile Social Oppnet Drive-Thru Disaster
54
Choosing Adversity Perverse, but valid research motive Make the network really really bad (like it was in 1970s) And maybe neat new ideas will emerge Which will work really, really well on a rock-solid network
55
Compete with Infrastructure “They have the guns, we have the numbers” But maybe opportunities give us information the infrastructure guys can’t or won’t get…
56
Incentives Hard to compute Mostly assume rational selfish players Recent market failures prove this is nonsense What to do instead? Use a priori social knowledge Travel plans, SIM, Fb/Buzz data
57
Privacy and Risk Aversion. May be oversold. Known: younger people are more cavalier with their online presence than the older (pre-web) generation. But needs to respect at least informed choice (opt out) by user. Problem with id+loc is that it is 2/3 of what you need to find out everything (2 digits of postcode, age + gender). There may be some trigger event which will change public view.
58
Back to drawing board #0. Information theory and opportunities. What can we infer? Popularity in meeting; popularity in communicating; hub/centrality; clique/giant component; predictive patterns of behaviour; latest Barabási Science paper on location. Other?
59
Back to drawing board #1 Non rational players Tools to measure & adapt to Herding Cascading Opinion dynamics
60
Back to drawing board #2 One small step at a time Pair of nodes - why share anything? What’s useful What does it cost Micro-research agenda…
61
Share between just 1 pair of phones Now a phone is much more than a computer GPS, Camera, Mike, Compass, Accelerometer several networks Several (heterogeneous) cores in processor We could share these e.g. lots of people taking panoramic tiled photos, or 1 GPS providing lots of people with location
62
Let’s look at actual resource costs. Phone OS now about same as desktop: Android == Linux, iPhone == OS X, Windows Mobile 6 (actually Windows 7!), etc. Software uses resources too, e.g. Java garbage collector surprise. Power/network aware applications…
63
Narseo’s results… We’ve started looking at resource use in battery terms Calibrate OS tools for battery charge reporting By opening up phone and putting probe on battery:) Then run experiment with lots of users…
64
Principal components on b’s phone
65
Principal components on T’s phone
66
N’s phone charging correlogram
67
N’s cell location correlogram
68
N’s “screen on” correlogram
69
J’s interaction v. location
70
J’s net usage by location
71
PCA Analysis
72
Average principal components
73
Fooling the user Buzz/Mobile Social Driving License Smart Badges:)
74
Back to Drawing Board #3 What business model fools user best? What are the ethics? Buzz was first “big bang” social mix Take 1 network (gmail contacts, sorted by frequency of interaction) And bootstrap another with it How big a cognitive dissonance would this be to do on an opportunistic net? Without informed consent, would cause major major headaches Possibly illegal – viz healthcare workers
75
Acknowledgements Thanks to MSR for a bunch of WiMo phones Thanks to Google for a bunch of Android phones Thanks to volunteers in Cambridge for abandoning almost all privacy :-)
76
Questions… Do we need both the guns and the numbers? The truth is out there…
77
4. D³N*: Programming Distributed Computation in Pocket Switched Networks (* Data Driven Declarative Networking). Eiko Yoneki, Ioannis Baltopoulos and Jon Crowcroft, University of Cambridge Computer Laboratory, Systems Research Group
78
Rise of Sparse Disconnected Networks. Haggle EU FP6: new communication paradigm using dynamic interconnectedness http://www.haggleproject.org Disconnected: by necessity or design. Mobile: with enough mobility for some connectivity over time. Path existing over time: data has to be delay tolerant. Opportunistic: forwarding instead of routing.
79
Pocket Switched Networks Human-to-Human Use of dynamic human connectivity http://www.cl.cam.ac.uk/~ey204/Haggle/Vis/ Topology changes every time unit Node 35 is a hub
80
Haggle Node Architecture. Each node maintains a data store: its current view of global namespace. Persistence of search: delay tolerance and opportunism. Semantics of publish/subscribe and an event-driven + asynchronous operation. Multi-platform (written in C++ and C): Windows Mobile, Mac OS X, iPhone, Linux, Android. [Figure labels: Unified Metadata Namespace, node, data, Search, Append]
81
D³N: Data-Driven Declarative Networking. How to program distributed computation? Use Declarative Networking?
82
Declarative Networking. Declarative is a new idea in networking, e.g. search: ‘what to look for’ rather than ‘how to look for’. Abstract complexity in networking/data processing. P2: building overlay using Overlog; network properties specified declaratively. LINQ: extends .NET with language-integrated operations for query/store/transform data. DryadLINQ: extends LINQ, similar to Google’s Map-Reduce; automatic parallelization from sequential declarative code. Opis: functional-reactive approach in OCaml.
83
D³N: Data-Driven Declarative Networking. How to program distributed computation? Use Declarative Networking. Use of Functional Programming: simple/clean semantics, expressive, inherent parallelism. Queries/filters etc. can be expressed as higher-order functions that are applied in a distributed setting. Runtime system provides the necessary native library functions that are specific to each device. Prototype: F# + .NET for mobile devices.
84
D³N and Functional Programming I. Functions are first-class values: they can be both input and output of other functions; they can be shared between different nodes (code mobility); not only data but also functions flow. Language syntax does not have state: variables are only ever assigned once, hence reasoning about programs becomes easier (of course message passing and threads encode states). Strongly typed: static assurance that the program does not ‘go wrong’ at runtime, unlike script languages. Type inference: types are not declared explicitly, hence programs are less verbose.
85
D³N and Functional Programming II. Integrated features from query language: assurance as in logical programming. Appropriate level of abstraction: imperative languages closely specify the implementation details (how); declarative languages abstract too much (what). Imperative: predictable result about performance. Declarative language: abstract away many implementation issues.
86
Overview of D³N Architecture. Each node is responsible for storing, indexing, searching, and delivering data. Primitive functions associated with core D³N calculus syntax are part of the runtime system. Prototype on MS Mobile .NET.
87
D³N Syntax and Semantics I. Very few primitives: integer, strings, lists, floating point numbers and other primitives are recovered through constructor application. Standard FP features: declaring and naming functions through let-bindings; calling primitive and user-defined functions (function application); pattern matching (similar to switch statement). Standard features as in ordinary programming languages (e.g. ML or Haskell).
88
D³N Syntax and Semantics II. Advanced features: concurrency (fork); communication (send/receive primitives); query expressions (local and distributed select).
89
D³N Language (Core Calculus Syntax)
90
Runtime System. Language relies on a small runtime system. Operations implemented in the runtime system are written in F#. Each node is responsible for data: storing, indexing, searching, delivering. Data has Time-To-Live (TTL). Each node propagates data to the other nodes. A search query with TTL travels within the network until it expires. When a node has the matching data, it forwards the data. Each node gossips its own metadata when it meets other nodes.
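The TTL behaviour of a search query can be illustrated with the Python sketch below. It floods a query over a fixed neighbour map, which is a simplification: in a pocket switched network the "neighbours" are opportunistic encounters that change over time. The function name, data layout and example network are all assumptions, and the actual runtime is written in F#.

```python
from collections import deque


def flood_query(neighbours, store, origin, wanted, ttl):
    """Flood a query from `origin`; each hop uses up one unit of TTL.

    `neighbours` maps node -> nodes it meets, `store` maps node -> set of
    data items held. Returns the nodes that hold `wanted` and were reached
    before the TTL expired.
    """
    seen = {origin}
    hits = set()
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if wanted in store.get(node, set()):
            hits.add(node)   # this node would forward the matching data
        if remaining == 0:
            continue         # the query expires here
        for peer in neighbours.get(node, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, remaining - 1))
    return hits


chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
held = {"C": {"song.mp3"}, "D": {"song.mp3"}}
print(flood_query(chain, held, "A", "song.mp3", ttl=2))
print(flood_query(chain, held, "A", "song.mp3", ttl=3))
```

A larger TTL reaches more holders of the data at the cost of more transmissions, which is the trade-off the TTL is there to control.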
91
Kernel Event Handler. Kernel maintains: an event queue (queue); a list of functions for each event (fenc, fdep). Kernel processes: it removes an event from the front of the queue (e); pattern matches against the event type; calls all the registered functions for the particular event.
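The processing loop described above can be sketched in Python as follows. The class and method names are hypothetical, and the event type names "encounter" and "departure" are only a guess at what the slide's `fenc` and `fdep` lists refer to.

```python
from collections import defaultdict, deque


class Kernel:
    """Minimal event kernel: a FIFO queue plus handler lists per event type."""

    def __init__(self):
        self.queue = deque()
        self.handlers = defaultdict(list)

    def register(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def post(self, event_type, payload=None):
        self.queue.append((event_type, payload))

    def run(self):
        """Drain the queue, calling every handler registered for each event."""
        results = []
        while self.queue:
            event_type, payload = self.queue.popleft()        # front of the queue
            for handler in self.handlers.get(event_type, []):  # match on type
                results.append(handler(payload))
        return results


kernel = Kernel()
kernel.register("encounter", lambda peer: ("gossip metadata to", peer))
kernel.register("encounter", lambda peer: ("forward queries to", peer))
kernel.register("departure", lambda peer: ("forget", peer))
kernel.post("encounter", "B")
kernel.post("departure", "B")
print(kernel.run())
```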
92
Example: Query to Networks. Queries are part of source level syntax. Distributed execution (single node programmer model). Familiar syntax.
D³N: select name from poll() where institute = "Computer Laboratory"
F#: poll() |> filter (fun r -> r.institute = "Computer Laboratory") |> map (fun r -> r.name)
Message: (code, nodeid, TTL, data)
[Figure: nodes A, B, C, D, E]
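For readers less familiar with F#, the same filter-then-map query can be written in Python as below. This only mirrors the local evaluation on one node; the `Record` type, the sample records and the `select_names` helper are hypothetical, and no distribution of the query is modelled.

```python
from collections import namedtuple

# Hypothetical record shape for the data a node's poll() might return.
Record = namedtuple("Record", ["name", "institute"])


def select_names(records, institute):
    """select name from records where institute = <institute>"""
    matching = filter(lambda r: r.institute == institute, records)
    return list(map(lambda r: r.name, matching))


local_records = [
    Record("Alice", "Computer Laboratory"),
    Record("Bob", "Engineering"),
    Record("Carol", "Computer Laboratory"),
]
print(select_names(local_records, "Computer Laboratory"))
```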
93
Example: Vote among Nodes. Voting application: implements a distributed voting protocol for choosing a location for dinner. Rules: each node votes once; a single node initiates the application; ballots should not be counted twice; no infrastructure-based communication is available, or it is too expensive. Top-level expression: node A sends the code to all nodes; nodes map in parallel (pmap) the function voteOfNode to their local data, and send back the result to A; node A aggregates (reduce) the results from all nodes and produces a final tally.
94
Sequential Map Function (smap). Inner working: it sends the code to execute on the remote node; it blocks waiting for a response from the node; it continues mapping the function to the rest of the nodes in a sequential fashion. An unavailable node blocks the entire computation.
95
Parallel Map Function (pmap). Inner working: similar to the sequential case, but the send/receive for each node happens in a separate thread. An unavailable node does not block the entire computation. [Figure: pmap over nodes A, B, C, D, E, F, G]
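The difference between the sequential and parallel maps can be sketched with Python threads. This is an analogy, not the D³N runtime: remote execution is replaced by a local function call, and an unavailable node is modelled as a call that raises `ConnectionError` (a node that hangs forever would need timeouts, which are omitted here). All names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def smap(func, nodes):
    """Sequential map: one node at a time; a failing node aborts the whole run."""
    return {node: func(node) for node in nodes}


def pmap(func, nodes):
    """Parallel map: one thread per node; failing nodes are simply left out."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = {node: pool.submit(func, node) for node in nodes}
        for node, future in futures.items():
            try:
                results[node] = future.result()
            except ConnectionError:
                pass  # unavailable node: skip it, keep the other results
    return results


def ask(node):
    """Stand-in for running code on a remote node; node C is unreachable."""
    if node == "C":
        raise ConnectionError(node + " is out of range")
    return node.lower()


print(pmap(ask, ["A", "B", "C", "D"]))
```

Calling `smap(ask, ["A", "B", "C", "D"])` instead raises at node C, which is the sketch's counterpart of the whole computation being blocked.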
96
Reduce Function. Inner working: the reduce function aggregates the results from a map; the reduce gets executed on the initiator node; all results must have been received before the reduce can proceed.
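Putting map and reduce together for the dinner vote, a minimal Python sketch could look like this. It is not the application code from the talk: `vote_of_node` only borrows its name from the `voteOfNode` function mentioned earlier, and the ballot format and the duplicate check are assumptions about how "ballots should not be counted twice" might be enforced.

```python
from collections import Counter
from functools import reduce


def vote_of_node(node_data):
    """Map step: each node contributes one ballot as (node id, choice)."""
    return node_data["id"], node_data["choice"]


def tally(ballots):
    """Reduce step, run on the initiator once every result has arrived."""
    def add_ballot(state, ballot):
        seen, counts = state
        node_id, choice = ballot
        if node_id not in seen:   # a ballot is never counted twice
            seen.add(node_id)
            counts[choice] += 1
        return seen, counts

    _, counts = reduce(add_ballot, ballots, (set(), Counter()))
    return counts


nodes = [
    {"id": "A", "choice": "curry"},
    {"id": "B", "choice": "pizza"},
    {"id": "C", "choice": "curry"},
]
ballots = [vote_of_node(n) for n in nodes]
ballots.append(("B", "pizza"))  # the same ballot arriving via a second path
print(tally(ballots).most_common(1))
```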
97
Voting Application Code
98
Cascaded Map Function. Social graph can be exploited for map function. Logical topology extracted from social networks. Construct a minimum spanning tree with node A. Use tree as navigation of task. [Figure: (a) Social Graph: A B C D E F G; (b) Nodes for Map at A: A B C D E; (c) Nodes for Map at B: B D E F]
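A cascaded map along a tree rooted at the initiator can be sketched as follows. The sketch assumes an unweighted social graph, in which case every spanning tree is a minimum spanning tree, so a breadth-first tree is used in place of a weighted MST algorithm. The example graph is hypothetical and is not the one drawn in the figure.

```python
from collections import deque


def spanning_tree(graph, root):
    """Children map of a breadth-first spanning tree rooted at `root`."""
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(graph[node]):
            if peer not in children:
                children[peer] = []
                children[node].append(peer)
                queue.append(peer)
    return children


def cascaded_map(func, children, node):
    """Apply `func` at `node`, then let each child repeat this for its subtree."""
    results = {node: func(node)}
    for child in children[node]:
        results.update(cascaded_map(func, children, child))
    return results


social = {
    "A": ["B", "C"],
    "B": ["A", "D", "E"],
    "C": ["A"],
    "D": ["B"],
    "E": ["B"],
}
tree = spanning_tree(social, "A")
print(tree)
print(cascaded_map(str.lower, tree, "A"))
```

Here A only contacts B and C directly, and B takes over the map for D and E, which is the delegation the slide describes.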
99
Outlook and Future Work. Current reference implementation: F# targeting the .NET platform, taking advantage of a vast collection of .NET libraries for implementing D³N primitives. Future work: security issues are currently out of the scope of this paper (executable code migrating from node to node). Validate and verify the correctness of the design by implementing a compiler targeting various mobile devices. Disclose code in public domain. http://www.cl.cam.ac.uk/~ey204 Email: eiko.yoneki@cl.cam.ac.uk