Florian Bacher & Christophe Sourisse [623.400] Seminar in Interactive Systems.

• Introduction
• Methodology
• Experiment Description
• Implementation
• Results
• Conclusion

• Smart homes have become a major field of research in information and communication technologies.
• One possible mode of interaction: voice commands.
• Goal of our experiment: evaluate whether voice commands initiated by hand claps can be recognized in a noisy environment.
• Secondary goal: gather a set of voice commands uttered by various speakers.

• Main reference: Lecouteux et al. [1]
  ◦ Deals with speech recognition in distress situations.
  ◦ Problem: no background noise was considered.
• Chosen methodology: adapt the protocol of Lecouteux et al., adding:
  ◦ Noisy settings.
  ◦ Recognition initiated by hand claps.

• Choice of the room setting
  ◦ Lecouteux et al. [1]: a whole flat.
  ◦ Vovos et al. [7]: a single room with a microphone array.
  ◦ Our choice: one room with two microphones.
• Choice of background noises
  ◦ Hirsch and Pearce [8]: NoiseX 92 database.
  ◦ Moncrieff et al. [5]: "Background noise is defined as consisting of typical regularly occurring sounds."
  ◦ Our choice: background noises of daily house life.

• The experiment was performed in a 3 m × 3 m room.
• Sounds were captured by two microphones hidden in the room.

• 20 participants (10 men, 10 women, aged 25.5 ± 11 years) took part in a two-phase experiment.
• 1st phase: recognize a word ("Jeeves") as a command.
  ◦ The system's attention is caught by a double clap.
  ◦ 4 scenarios.
  ◦ Background noises tested: footsteps, opening doors, moving chairs, a radio show.
• 2nd phase: gather a set of voice commands.
  ◦ List of 15 command words.
  ◦ A reference recording to control for pronunciation issues.
  ◦ Each word is uttered 10 times.

• Technologies used:
  ◦ C# library System.Speech.Recognition: interface to the speech recognition engine used by Windows.
  ◦ Microphones: two dynamic microphones with cardioid polar pattern (Sennheiser BF812/e8155).
  ◦ Line6 UX1 audio interface.
  ◦ Line6 Pod Farm 2.5.

• The signal is captured in real time.
• If exactly two signal peaks occur within a certain time frame, the software classifies them as a double clap.
• Once a double clap has been detected, the actual speech recognition engine is activated (i.e. the software waits for commands).
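The peak-counting logic above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation (which was written in C# around System.Speech.Recognition); the amplitude threshold and time windows are assumed values chosen only to show the idea.

```python
import numpy as np

def detect_double_clap(samples, rate, threshold=0.5,
                       merge_window=0.05, min_gap=0.1, max_gap=0.7):
    """Classify an audio buffer as a double clap iff exactly two
    amplitude peaks occur within a plausible time frame.

    All thresholds are illustrative assumptions, not the experiment's values.
    """
    loud = np.abs(np.asarray(samples, dtype=float)) > threshold
    # Rising edges: samples where the envelope crosses the threshold upward.
    edges = np.flatnonzero(np.diff(loud.astype(int)) == 1) / rate
    # Merge edges closer together than merge_window into one clap event.
    claps = []
    for t in edges:
        if not claps or t - claps[-1] > merge_window:
            claps.append(t)
    # Exactly two claps, separated by a gap in the allowed range.
    return len(claps) == 2 and min_gap <= claps[1] - claps[0] <= max_gap
```

A real-time version would apply the same test to a sliding window over the microphone stream, rather than to a complete buffer.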

Each trial is classified by two questions: did the participant attempt a command, and did the system recognize something? Performance follows from the four resulting outcomes:

                               Attempt of the participant?
                               Yes               No
  System recognized   Yes      True positive     False positive
  something?          No       False negative    True negative
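The four outcome counts combine into the usual performance figures. A minimal helper (the counts in the example are hypothetical, not the experiment's results):

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Derive standard performance rates from the four outcome counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Precision: of everything the system recognized, how much was a real attempt?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of all real attempts, how many did the system catch?
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical counts: evaluation_metrics(40, 30, 10, 20) -> (0.7, 0.8, 0.666...)
```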

• A new way to initiate speech recognition in human-computer interaction.
• An evaluation of the potential influence of a noisy environment.
• Results: encouraging, but not yet satisfying.
• Next step: perform this experiment in a real smart-home context.

[1] B. Lecouteux, M. Vacher and F. Portet. Distant speech recognition in a smart home: comparison of several multisource ASRs in realistic conditions. Interspeech.
[2] A. Fleury, N. Noury, M. Vacher, H. Glasson and J.-F. Serignat. Sound and speech detection and classification in a health smart home. 30th Annual International IEEE EMBS Conference, Vancouver, British Columbia, Canada.
[3] M. Vacher, N. Guirand, J.-F. Serignat and A. Fleury. Speech recognition in a smart home: some experiments for telemonitoring. Proceedings of the 5th Conference on Speech Technology and Human-Computer Dialogue, pages 1-10.
[4] J. Rouillard and J.-C. Tarby. How to communicate smartly with your house? Int. J. Ad Hoc and Ubiquitous Computing, 7(3).
[5] S. Moncrieff, S. Venkatesh, G. West and S. Greenhill. Incorporating contextual audio for an actively anxious smart home. Proceedings of the 2005 International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pages 373-378.
[6] M. Vacher, D. Istrate, F. Portet, T. Joubert, T. Chevalier, S. Smidtas, B. Meillon, B. Lecouteux, M. Sehili, P. Chahuara and S. Méniard. The Sweet-Home project: audio technology in smart homes to improve well-being and reliance. 33rd Annual International IEEE EMBS Conference, Boston, Massachusetts, USA.
[7] A. Vovos, B. Kladis and N. Fakotakis. Speech operated smart-home control system for users with special needs. Proc. Interspeech 2005, pages 193-196.
[8] H.-G. Hirsch and D. Pearce. The AURORA experimental framework for the performance evaluation of speech recognition systems under noisy conditions. ASR-2000, pages 181-188.