
1 VUPoints: Collaborative Sensing and Video Recording through Mobile Phones
Xuan Bao and Romit Roy Choudhury, ACM MobiHeld 2009

2 Context
• Next-generation mobile phones will have a large number of sensors
• Cameras, microphones, accelerometers, GPS, compasses, health monitors, …

3 Context
• Each phone may be viewed as a micro-lens
• Exposing a micro-view of the physical world to the Internet

4 Context
• With 3 billion active phones in the world today (the fastest-growing computing platform …)
• Our vision is …

5 A Virtual Information Telescope
[Figure: the Internet as the viewing end of a virtual information telescope]

6 Our Prior Work
• Micro-Blog [MobiSys08]: instantiates this vision through
• Mobile blogs
• Sensor querying
• Participatory responses
[Figure: the virtual telescope; people → phones → cellular/WiFi → web service]

7 Motivating Current Work
• Several research questions …
• Of which, one arises frequently: which information is of interest?
• Humans already cope with a high noise level in everyday life
• Can such information be distilled out, based on a notion of “human interest”?
• Can this be done automatically, exploiting rich sensing, computing, and communication capabilities?

8 This Work: VUPoints
• An early effort toward information distillation in a restricted application space
• Asks the question: can phones identify “interesting” events at a social occasion and “record” them automatically …
• … creating a highlights video of the occasion

9 VUPoints
• Envisioning the end product
• Imagine a social party of the future; assume phones are wearable
• The goal is a 10-minute video highlights of the party, without human intervention
• The idea (sketched below):
• Mobile phones sense the ambience
• Collaboratively infer an “interesting event”
• Trigger a video recording on the phone with a good view
• Finally, stitch all the clips to form the highlights
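The end-to-end flow on this slide can be made concrete with a short sketch. This is a minimal, hypothetical rendering in Python: the function names, data layout, and thresholds are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the VUPoints pipeline: sense, detect a trigger,
# pick a recorder, stitch clips. All names and thresholds are illustrative.

def vupoints_highlights(sensor_logs, clip_len=10):
    """sensor_logs: {phone_id: [reading per second]}, where a reading is a
    dict like {'sound_db': float, 'light': float, 'heading_deg': float}."""
    duration = min(len(log) for log in sensor_logs.values())
    clips = []
    for t in range(duration):
        readings = {pid: log[t] for pid, log in sensor_logs.items()}
        if is_trigger(readings):                 # collaborative inference
            pid = best_view(readings)            # phone with the best view
            clips.append((t, t + clip_len, pid)) # record a short clip
    return sorted(clips)                         # stitched chronologically

def is_trigger(readings):
    # Placeholder trigger: a high group-average sound level.
    return sum(r['sound_db'] for r in readings.values()) / len(readings) > 70

def best_view(readings):
    # Placeholder view selection: brightest camera as a crude proxy.
    return max(readings, key=lambda pid: readings[pid]['light'])
```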

10 Event Coverage
• Several sensing opportunities to detect events
• The birthday cake arrives … everyone turns to the table: compass orientations from collocated phones suggest an event
• People dance to music: accelerometers and microphones observe simultaneous activity
• People laugh and clap at jokes: acoustic signatures match, triggering video recording … etc.

11 Another Perspective
• Video highlights = social event coverage
• Analogous to spatial coverage in sensor networks

12 Applications
• Personal travel blogging: the phone identifies moments and automatically takes photos/videos
• Smart, distributed surveillance: don’t miss your baby’s first crawl, laugh, talk …
• Get highlights of important events at the office
• Multi-view vision: watching a basketball game from multiple viewpoints

13 Architecture
• Identify multi-modal event triggers
• Video-record from the phone with the best view

14 Design Challenges
• Social grouping: which phones form the same social group? Not necessarily spatial
• Trigger detection: translating social interest into measurable triggers; a (collaborative) dictionary of sensor triggers
• View selection: which phone has the best view? What is “best”? … what is “view”?

15 Social Grouping
• Acoustic: high-frequency ring tones; phones grouped based on mutually audible ringtones
• Light: intensity sensed through the camera; 3 intensity bands: bright, regular, dark (a grouping sketch follows)
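A minimal sketch of the light-based grouping idea, assuming per-phone mean camera intensity on a 0–255 scale; the band thresholds below are illustrative assumptions, not values from the slides.

```python
# Hypothetical light-based grouping: phones whose cameras sense the same
# intensity band (bright / regular / dark) are candidates for one group.

def light_band(intensity):            # mean camera intensity in [0, 255]
    if intensity > 170:
        return 'bright'
    if intensity > 85:
        return 'regular'
    return 'dark'

def group_by_light(intensities):      # {phone_id: mean camera intensity}
    groups = {}
    for pid, val in intensities.items():
        groups.setdefault(light_band(val), []).append(pid)
    return groups

# e.g. group_by_light({'A': 200, 'B': 190, 'C': 40})
#      -> {'bright': ['A', 'B'], 'dark': ['C']}
```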

16 Social Grouping
• Similar views: multiple phones looking at the same object
• Exploiting spatiograms (the equation on the slide did not survive extraction; the standard definition is sketched below)
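For reference, the standard second-order spatiogram from the computer-vision literature (Birchfield and Rangarajan, CVPR 2005) is given below. The equation image on the original slide is not recoverable from this transcript, so the exact form the authors used may differ.

```latex
% Second-order spatiogram of image I: each histogram bin b keeps the pixel
% count, spatial mean, and spatial covariance of the pixels falling in it.
h_I^{(2)}(b) = \langle n_b, \mu_b, \Sigma_b \rangle , \qquad b = 1, \ldots, B
% A common similarity measure between two spatiograms h and h':
\rho(h, h') = \sum_{b=1}^{B} \psi_b \sqrt{n_b \, n'_b}
% where \psi_b discounts bins whose spatial statistics (\mu_b, \Sigma_b)
% disagree between the two images.
```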

17 Event Trigger Detection
• Simultaneous orientation changes: many people may turn toward the birthday cake, or toward a wedding speech … (a detection sketch follows)
• Ambience fluctuations: the noise floor might rise; light intensity might change; new signatures detected
• …
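A minimal sketch of the "simultaneous orientation change" trigger, assuming each phone reports a compass heading in degrees; the 45° turn threshold and 50% quorum are illustrative assumptions.

```python
# Hypothetical trigger: fire when a large fraction of the group's compasses
# turn sharply between two consecutive samples.

def heading_delta(a, b):
    """Smallest angular difference between two compass headings (degrees)."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def orientation_trigger(headings_now, headings_prev,
                        turn_deg=45, quorum=0.5):
    """headings_*: {phone_id: heading in degrees} at consecutive samples."""
    turned = sum(
        1 for pid in headings_now
        if heading_delta(headings_now[pid], headings_prev[pid]) >= turn_deg
    )
    return turned >= quorum * len(headings_now)
```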

18 Event Trigger Detection
• Acoustic signatures: laughter, clapping, whistles, screaming, singing …
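Slide 20 mentions basic SVM libraries for acoustic-signature classification. The sketch below shows one plausible shape of that step; scikit-learn and the crude band-energy features are assumptions, since the prototype's actual features and library are not specified.

```python
# Hypothetical acoustic-signature classifier: per-frame frequency-band
# energies fed to an SVM. Feature choice and library are assumptions.
import numpy as np
from sklearn.svm import SVC

def band_energies(frame, bands=16):
    """Crude feature: total energy in each of `bands` frequency bands."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    return np.array([chunk.sum() for chunk in np.array_split(spectrum, bands)])

def train_signature_classifier(frames, labels):
    """frames: list of 1-D audio arrays; labels: e.g. 'laughter', 'clapping'."""
    X = np.array([band_energies(f) for f in frames])
    return SVC(kernel='rbf').fit(X, labels)

def detect_signature(clf, frame):
    return clf.predict([band_energies(frame)])[0]
```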

19 View Selection
• The current prototype activates all phones in the same social group
• The “best view” is manually selected later

20 Building and Experiment Setting
• Nokia N95 phones + Nokia 6210
• Python + Symbian C++; MATLAB; basic SVM libraries (for acoustic-signature classification)
• Artificial social gathering of students: 5 students taped phones to their shirt pockets
• Gathered in a group: chatting, watching movies, playing video games

21 Methodology and Metric
• One dedicated phone makes a complete recording; the other phones run VUPoints
• Phones form groups and search for triggers (time-stamping them)
• Triggers are used to select “interesting” clips (offline); clips are stitched to form the highlights
• Metric (a sketch follows):
• Logical events are manually identified and labeled from the original video
• VUPoints identifies 10-second events
• Observe the overlap between the events, observe the detection delay, and compute the time-window overlap
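A minimal sketch of the time-window-overlap metric, assuming labeled events and detections are (start, end) intervals in seconds; the exact scoring the authors used is not specified on the slides.

```python
# Hypothetical evaluation: overlap between manually labeled events and
# VUPoints' 10-second detection windows, plus detection delay per event.

def overlap(a, b):
    """Length of the intersection of two (start, end) intervals, in seconds."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def evaluate(labeled_events, detections):
    """labeled_events, detections: lists of (start, end) tuples."""
    results = []
    for ev in labeled_events:
        hits = [d for d in detections if overlap(ev, d) > 0]
        if hits:
            first = min(hits)  # earliest overlapping detection
            results.append({
                'event': ev,
                'delay': max(0, first[0] - ev[0]),            # detection delay
                'overlap': sum(overlap(ev, d) for d in hits)  # coverage (s)
            })
        else:
            results.append({'event': ev, 'delay': None, 'overlap': 0})
    return results
```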

22 Results
• Trigger detection examined: time of detection and accuracy
• (A third person identified the interesting events … VUPoints was matched against these events)

23 Limitations
• VUPoints is at a very early stage
• The detected event space is limited; it is difficult to identify what humans perceive as interesting events
• Aggressive event detection → false positives
• Cameras may have a poor view of the event (even if wearable); occlusions in front
• Ongoing work: exploring a larger set of triggers; offloading the central server by moving some tasks to the phone; energy efficiency of phones; possibly combining with other devices (wall-mounted cameras, webcams, laptop microphones, …)

24 Conclusion
• Mobile phones are becoming capability-rich
• Exploiting them as a sensor network, different from existing mote-based networks: human-centric; complex, often subjective objectives; new kinds of problems
• Developing VUPoints: a collaborative framework for ambience sensing and video recording
• Many challenges … we are only scratching the surface

25 Thanks
Visit the Systems Networking Research Group (SyNRG) @ Duke University
Google “synrg duke”

26 Context
• Sensing, computing, and communication are converging on the mobile phone platform
• Combined with density and human presence, this forms a capability-rich platform
• The question is: what can we do with it? Where are the opportunities?
• Mobile phones are equipped with multiple sensors: camera, microphone, accelerometer, compass …
• Almost everyone carries a phone; almost 3 billion phones today

27 Drawing Parallels
• One way to view this is as a sensor network with more powerful capabilities
• But the more important distinctions are: human interfacing / participatory; human-scale density; human mobility; personal
• In view of this, we view the phone platform as a sensor network for human applications

28 Rough
• A stripped-down version: simplify the context as a starting step
• We motivate with a use case: imagine you go to a party; the goal is to get a 10-minute video highlights …

29 New Research Pastures
• Many new problems, now in the context of humans
• Classic sensor-network problems translate: localization, coverage, energy efficiency, security, privacy

30 Content Sharing
[Figure: the virtual telescope; people in physical space → phones → cellular/WiFi → visualization and web services]

31 Content Querying
[Figure: the virtual telescope; people in physical space → phones → cellular/WiFi → visualization and web services]
• Some queries are participatory: “Is beach parking available?”
• Others are not: “Is there WiFi at the beach café?”

32 SyNRG Demo Setting
The experiments involved 4 users pretending to be in different types of gatherings. Each user taped a Nokia N95 phone near their shirt pocket. The N95 has a 5-megapixel camera and a 3-axis accelerometer. Two of the users also carried a Nokia 6210 in their pockets; the 6210 has a compass, which the N95 lacks. The user-carried phones formed social groups and detected triggers throughout the occasion, which was also continuously video-recorded by a separate phone. At the end, all sensed and video-recorded data (from all the phones) were downloaded and processed in MATLAB. The triggers were identified and, using their time-stamps, a 20-second video clip per trigger was extracted from the continuous video file. The clips were then “stitched” in chronological order (this offline step is sketched below).
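A hypothetical re-creation of this offline step in Python, using ffmpeg for the cutting and concatenation; the original processing was done in MATLAB, so both the tooling and the file layout here are assumptions.

```python
# Hypothetical offline step: cut a 20-second clip per trigger timestamp from
# the continuous recording, then stitch the clips chronologically.
import subprocess

def cut_clips(video_path, trigger_times, clip_len=20):
    clips = []
    for i, t in enumerate(sorted(trigger_times)):   # chronological order
        out = f'clip_{i:03d}.mp4'
        subprocess.run(['ffmpeg', '-ss', str(t), '-t', str(clip_len),
                        '-i', video_path, '-c', 'copy', out], check=True)
        clips.append(out)
    return clips

def stitch(clips, out='highlights.mp4'):
    # ffmpeg's concat demuxer reads a list file of clip names.
    with open('clips.txt', 'w') as f:
        f.writelines(f"file '{c}'\n" for c in clips)
    subprocess.run(['ffmpeg', '-f', 'concat', '-i', 'clips.txt',
                    '-c', 'copy', out], check=True)
```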


34 VUPoints: An attempt to collaboratively sense and video record social events through mobile phones

