SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones. Hong Lu, Wei Pan, Nicholas D. Lane, Tanzeem Choudhury and Andrew T. Campbell.

Presentation transcript:

SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones. Hong Lu, Wei Pan, Nicholas D. Lane, Tanzeem Choudhury and Andrew T. Campbell. Department of Computer Science, Dartmouth College.

Motivation: Use the microphone sensor to detect personalized sound events. Sound captured by a mobile phone's microphone is a rich source of information about the surrounding environment, social setting, conversation, activity, location, diet, etc.

What is SoundSense? A scalable sound sensing framework capable of identifying the meaningful sound events in a user's daily life. It is implemented for a resource-limited device, the Apple iPhone. The system runs solely on the mobile phone.

Contribution: The first general-purpose sound event classification system designed for a large number of events. It can address the sound events that are significant in an individual user's environment. The whole system architecture and its algorithms are implemented on the Apple iPhone.

Design Consideration: Build a scalable sound classification system that can detect all types of sound events for different users. Privacy: recording and processing of audio data happen entirely on the mobile phone. Signal processing and sound classification must be lightweight.

Design Consideration: Phone context. RMS is a good approximation of volume. There is a 30% range of variation in volume across different phone positions (contexts).
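The RMS measure mentioned on this slide can be illustrated with a minimal sketch. The 8 kHz sample rate, the 440 Hz tone and the 0.7 attenuation factor are illustrative assumptions, not values taken from the slides.

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one audio frame (a sequence of samples)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

# Hypothetical example: the same tone at full volume and attenuated to 70%,
# standing in for a change of phone position (the factor is an assumption).
SAMPLE_RATE = 8000  # assumed sample rate
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(512)]
muffled = [0.7 * s for s in tone]

print(rms(tone), rms(muffled))
```

Because RMS scales linearly with the gain, the ratio of the two printed values equals the attenuation factor.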

SoundSense Architecture: Remove frames that are silent or hard to classify.
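A frame filter of this kind might look like the sketch below. The slide does not say how frames are judged; the rule used here (admit loud frames, and admit quiet frames only if their spectrum shows a strong pattern, measured by low spectral entropy) and both thresholds are assumptions made for illustration.

```python
import cmath
import math
import random

def rms(frame):
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def spectral_entropy(frame):
    """Entropy of the normalized magnitude spectrum (naive DFT, short frames only)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        # Treat pure silence as a perfectly flat spectrum (maximum entropy).
        return math.log(len(mags))
    return -sum((m / total) * math.log(m / total) for m in mags if m > 0)

def admit(frame, rms_min=0.01, entropy_max=2.0):
    """Hypothetical admission rule; both thresholds are illustrative assumptions."""
    if rms(frame) >= rms_min:
        return True
    # Quiet frame: keep it only if the spectrum is far from flat.
    return spectral_entropy(frame) < entropy_max

N = 64  # short frame so the naive DFT stays fast
rng = random.Random(0)
loud_tone = [0.5 * math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
quiet_tone = [0.005 * math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
quiet_noise = [0.001 * rng.uniform(-1, 1) for _ in range(N)]
silence = [0.0] * N

for name, frame in [("loud tone", loud_tone), ("quiet tone", quiet_tone),
                    ("quiet noise", quiet_noise), ("silence", silence)]:
    print(name, admit(frame))
```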

SoundSense Architecture:
1. Collect features that are insensitive to volume.
2. Detect the coarse-grained category of the sound: voice, music or ambient sound.
3. Multilevel classification: a decision tree classifier followed by a Markov model based classifier.
4. The two levels of classification smooth the output.
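As an illustration of a volume-insensitive feature, the sketch below computes the zero-crossing rate of a frame. The slides do not list the features used, so this particular feature is an assumed example; it is chosen because scaling a frame by any positive gain leaves it unchanged.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical frame: a 440 Hz tone at an assumed 8 kHz sample rate.
frame = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(512)]
quieter = [0.3 * s for s in frame]

# The feature is identical at both volumes.
print(zero_crossing_rate(frame) == zero_crossing_rate(quieter))
```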

SoundSense Architecture:
1. Use previously established audio signal processing techniques.
2. In this stage, speech recognition, speaker identification and music genre classification are applied.

SoundSense Architecture:
1. Handle only ambient sound (sound other than voice and music).
2. Unsupervised learning technique.
3. Detect meaningful ambient sounds (assumption: how often a sound occurs and how long it lasts indicate its importance).
4. Maintain a SoundRank: a ranking of the meaningful sounds based on their importance.
5. Prompt the user if a new sound exceeds the minimum SoundRank threshold.
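The ranking and prompting steps can be sketched as follows. The scoring formula (a weighted sum of occurrence count and total duration), the weights and the threshold are hypothetical; the slides only state that occurrence and duration indicate importance.

```python
from dataclasses import dataclass

@dataclass
class SoundBin:
    occurrences: int = 0
    total_seconds: float = 0.0

class SoundRank:
    """Hypothetical ranking of ambient sounds by occurrence and duration."""

    def __init__(self, prompt_threshold=5.0, weight_count=1.0, weight_seconds=0.1):
        self.prompt_threshold = prompt_threshold
        self.weight_count = weight_count
        self.weight_seconds = weight_seconds
        self.bins = {}
        self.prompted = set()

    def score(self, sound_id):
        b = self.bins[sound_id]
        return self.weight_count * b.occurrences + self.weight_seconds * b.total_seconds

    def observe(self, sound_id, seconds):
        """Record one occurrence; return True when the user should be asked for a label."""
        b = self.bins.setdefault(sound_id, SoundBin())
        b.occurrences += 1
        b.total_seconds += seconds
        if sound_id not in self.prompted and self.score(sound_id) > self.prompt_threshold:
            self.prompted.add(sound_id)
            return True
        return False

    def ranking(self):
        return sorted(self.bins, key=self.score, reverse=True)

rank = SoundRank()
prompts = [rank.observe("fan", 30), rank.observe("door", 1),
           rank.observe("fan", 30), rank.observe("fan", 30)]
print(prompts, rank.ranking())
```

In this sketch each sound triggers at most one prompt, so a sound that keeps recurring does not repeatedly interrupt the user.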

Implementation: Implemented in C, C++ and Objective-C. Developed for the Apple iPhone. Duty cycle of 0.64 seconds when no acoustic event is detected.
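One way to read the duty-cycle figure is sketched below: with 64 ms frames, an idle system inspects one frame every 0.64 seconds (every tenth frame) and switches to processing every frame once an event is seen. This interpretation, the function name and the switching rule are assumptions, not details given on the slide.

```python
FRAME_SECONDS = 0.064        # frame length quoted in the evaluation slide
IDLE_PERIOD_SECONDS = 0.64   # duty cycle quoted on this slide
IDLE_STRIDE = round(IDLE_PERIOD_SECONDS / FRAME_SECONDS)  # frames between idle checks

def frames_to_process(has_event, idle_stride=IDLE_STRIDE):
    """Return indices of frames that are processed under the assumed duty cycle.

    has_event[i] says whether frame i contains an acoustic event; the system
    only learns this for frames it actually processes.
    """
    processed = []
    active = False
    next_idle_check = 0
    for i, event in enumerate(has_event):
        if active or i >= next_idle_check:
            processed.append(i)
            active = event
            if not active:
                next_idle_check = i + idle_stride
    return processed

quiet = [False] * 25
burst = [False] * 10 + [True] * 3 + [False] * 12
print(frames_to_process(quiet))
print(frames_to_process(burst))
```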

Parameter Selection: Increasing the buffer size (sequence length) increases accuracy. However, it also increases the system's response delay, so responsiveness decreases. The chosen buffer size is 5. Pipeline: the decision tree classifier's output is buffered in a FIFO queue and then passed to the Markov model classifier.
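The buffering and smoothing pipeline can be sketched as below. The FIFO length of 5 comes from the slide; the per-class first-order Markov models and all of their probabilities are invented for illustration, and the smoothed label is simply the class whose model gives the buffered sequence the highest likelihood.

```python
import math
from collections import deque

LABELS = ("voice", "music", "ambient")

def initial_prob(true_class, label):
    # Hypothetical: the decision tree is right 80% of the time on the first frame.
    return 0.8 if label == true_class else 0.1

def transition_prob(true_class, prev, nxt):
    """Hypothetical transition probabilities of decision tree outputs for one class."""
    if prev == true_class:
        return 0.8 if nxt == true_class else 0.1
    if nxt == true_class:
        return 0.6   # recover from a misclassification
    if nxt == prev:
        return 0.3   # repeat the same misclassification
    return 0.1

def log_likelihood(true_class, sequence):
    total = math.log(initial_prob(true_class, sequence[0]))
    for prev, nxt in zip(sequence, sequence[1:]):
        total += math.log(transition_prob(true_class, prev, nxt))
    return total

class Smoother:
    def __init__(self, buffer_size=5):
        self.buffer = deque(maxlen=buffer_size)  # FIFO: oldest label drops out

    def push(self, label):
        """Add one decision tree output and return the smoothed label."""
        self.buffer.append(label)
        sequence = list(self.buffer)
        return max(LABELS, key=lambda c: log_likelihood(c, sequence))

smoother = Smoother()
raw = ["music", "music", "voice", "music", "music"]
smoothed = [smoother.push(label) for label in raw]
print(smoothed)
```

The isolated "voice" output in the raw stream is absorbed by the surrounding "music" frames, which is the smoothing effect the slide attributes to the second classification level. A longer buffer gives the models more evidence but delays the reaction to a genuine change of sound.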

Parameter Selection (MFCC frame length): Precision for a frame type is the number of frames correctly classified as that type divided by all frames classified as that type. Recall is defined as the number of recognized occurrences of a frame type divided by the overall number of occurrences of that frame type. The precision and recall plot is for ambient sound.
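The two metrics can be computed per frame type as in the sketch below. The label sequences are made-up data for illustration only.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Per-class precision and recall over aligned frame labels."""
    pairs = list(zip(true_labels, predicted_labels))
    true_positive = sum(1 for t, p in pairs if t == target and p == target)
    predicted = sum(1 for _, p in pairs if p == target)
    actual = sum(1 for t, _ in pairs if t == target)
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    return precision, recall

# Hypothetical frame labels: a = ambient, m = music, v = voice.
truth = ["a", "a", "a", "a", "m", "v", "v"]
guess = ["a", "a", "m", "m", "a", "v", "v"]
print(precision_recall(truth, guess, "a"))
```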

Evaluation:
1. When an acoustic event is detected, CPU usage increases to 25%. When idle, CPU usage is less than 5%.
2. The processing time for one frame (64 ms) is around 20-30 ms.

Evaluation: Decision tree classifier alone versus decision tree with the Markov model. With the Markov model, classification accuracy improved by 10% for music and speech and by 3% for ambient sound.

Evaluation: No reliable sound was found to represent bus riding.

Applications: Audio Daily Diary: logs everyday events for a user and supports queries such as how much time was spent in a certain event. Music Detector based on Participatory Sensing: gives users a way to discover events that are associated with the music being played.
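The diary query described above ("how much time was spent in a certain event") can be sketched as a simple aggregation. The log format, a list of (label, start, end) tuples in seconds, and the sample entries are assumptions for illustration.

```python
from collections import defaultdict

def time_per_event(log):
    """Total seconds per event label from (label, start_seconds, end_seconds) entries."""
    totals = defaultdict(float)
    for label, start, end in log:
        totals[label] += end - start
    return dict(totals)

# Hypothetical diary entries for part of one day.
diary = [
    ("driving", 0, 1200),
    ("conversation", 1200, 1500),
    ("music", 1500, 3300),
    ("driving", 3300, 4500),
]
print(time_per_event(diary))
```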

[Figure: panels for Friday and Saturday.] Some music and voice samples are incorrectly classified as ambient sound.

Conclusion: General sound classification that is lightweight and hierarchical. Flexible and scalable. All tasks are implemented on the mobile phone. Able to identify new sounds. Can be used in personalized contexts.

Thank you. Questions?