Academic Advisor: Dr. Yuval Elovici
Technical Advisor: Dr. Rami Puzis
Team Members: Yakir Dahan, Royi Freifeld, Vitali Sepetnitsky
Most of our everyday navigation depends heavily on the visual feedback we get from our environment. When the ability to see the surroundings is lost due to visual impairment, the ability to navigate is impaired as well.
Physical sense: White Cane, Guide Dog
Sensory substitution:
   Warning of obstacles (e.g. Optical Radar)
   Sonar-like image scanning (e.g. The vOICe)
Sightless navigation by sensory substitution:
Development of an application that allows a person to navigate relying primarily on the sense of hearing
Integration with a spatial auditory environment
Provision of a flexible environment for future research
A combination of visual information processing and 3D sound creation and positioning (sketched in code right after this list):
1. Taking a stream of frames from a web-camera
2. Processing the frames and retrieving visual information relevant to the user
3. Creating appropriate sounds according to the recognized information
4. Performing auditory spatialization of the sounds and informing the user about the locations of the detected information
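A minimal sketch of this frame-to-sound loop, assuming OpenCV for camera capture and OpenAL for playback; the brightest-pixel search stands in for the real analysis step, and waveform/buffer setup is omitted:

// Minimal pipeline sketch: webcam frame -> analysis -> spatialized sound.
// Assumes OpenCV (capture) and OpenAL (audio); error handling omitted.
#include <opencv2/opencv.hpp>
#include <AL/al.h>
#include <AL/alc.h>

int main() {
    cv::VideoCapture camera(0);                 // step 1: open the web-camera
    ALCdevice*  device  = alcOpenDevice(NULL);  // default audio device
    ALCcontext* context = alcCreateContext(device, NULL);
    alcMakeContextCurrent(context);

    ALuint source;
    alGenSources(1, &source);

    cv::Mat frame, gray;
    while (camera.read(frame)) {                // step 1: grab a frame
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);

        // step 2: retrieve visual information (here: the brightest pixel)
        double minVal, maxVal;
        cv::Point minLoc, maxLoc;
        cv::minMaxLoc(gray, &minVal, &maxVal, &minLoc, &maxLoc);

        // steps 3-4: position the source left/right according to where
        // the detected point lies in the frame, then (re)play it
        float x = 2.0f * maxLoc.x / gray.cols - 1.0f;   // maps to [-1, 1]
        alSource3f(source, AL_POSITION, x, 0.0f, -1.0f);
        alSourcePlay(source);                   // buffer attachment omitted
    }
    return 0;
}

The essential point is that each analysis result is reduced to a position, which the OpenAL implementation then renders spatially relative to the listener.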
OpenCV
OpenAL
MATLAB engine library
End Users:
   Visually impaired (or even blind) people who use the system in order to hear their physical environment
Configuration Users:
   Sighted users who perform the system installation and initial tuning, such as creating user profiles
Researchers:
   Cognitive science researchers who wish to conduct experiments regarding 3D sound
For all users (especially the researcher):
1. Support several types of computer vision and image processing algorithms for extraction of the following information (see the OpenCV sketch after this list):
   Feature points (points of interest)
   Contours
   BLOBs (regions that are either darker or brighter than their surroundings)
2. Provide a utility for adding new implementations of the above algorithms according to a predefined API
3. Support specific configurability options for each algorithm type
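A possible realization of the three extraction types with the OpenCV 2.x-era API (OpenCV 3+ replaces the SimpleBlobDetector constructor with SimpleBlobDetector::create); the function name extractAll and all parameter values are illustrative:

// Sketch of the three extraction types using OpenCV (2.x-era API).
#include <opencv2/opencv.hpp>
#include <vector>

void extractAll(const cv::Mat& gray) {
    // 1. Feature points (points of interest)
    std::vector<cv::Point2f> corners;
    cv::goodFeaturesToTrack(gray, corners, /*maxCorners=*/50,
                            /*qualityLevel=*/0.01, /*minDistance=*/10);

    // 2. Contours (on a binarized copy; findContours modifies its input)
    cv::Mat binary;
    cv::threshold(gray, binary, 128, 255, cv::THRESH_BINARY);
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL,
                     cv::CHAIN_APPROX_SIMPLE);

    // 3. BLOBs (regions darker or brighter than their surroundings)
    cv::SimpleBlobDetector::Params params;
    params.filterByArea = true;
    params.minArea = 100;
    cv::SimpleBlobDetector detector(params);   // OpenCV 2.x constructor
    std::vector<cv::KeyPoint> blobs;
    detector.detect(gray, blobs);
}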
For all users:
4. Create appropriate sounds according to the following features:
   Location
   Brightness
   Color
5. Support sound spatialization using OpenAL API implementations and HRTF datasets conforming to a predefined format (see the sketch after this list)
6. Allow installation of new HRTF datasets and OpenAL implementations, both for improving the quality of sound localization and for research purposes
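One way requirement 4 could map onto OpenAL calls; the concrete mapping (location to position, brightness to gain, hue to pitch) and the function name sonifyFeature are illustrative assumptions, not the project's final design:

// Sketch: mapping a detected feature to a spatialized OpenAL source.
#include <AL/al.h>

void sonifyFeature(ALuint source, float x, float y,
                   float brightness /*0..1*/, float hue /*0..1*/) {
    // Location: place the source in the listener's frontal half-space
    alSource3f(source, AL_POSITION, x, y, -1.0f);
    // Brightness: louder for brighter regions
    alSourcef(source, AL_GAIN, brightness);
    // Color: shift pitch with hue (0.5x .. 1.5x of the base rate)
    alSourcef(source, AL_PITCH, 0.5f + hue);
    alSourcePlay(source);
}

The HRTF convolution itself happens inside the OpenAL implementation (e.g. OpenAL Soft), so the application only has to set source positions and properties.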
For the configuration user:
1. Ability to install the system along with all the peripheral software and an initial set of HRTF datasets
2. User profile management (a possible profile representation is sketched after this list):
   Support creation of user profiles, which store the system settings optimized to the user's preferences
   Support viewing the settings stored in a user profile
   Support modifying and deleting profiles
   Supply a set of predefined (default) profiles used for initial system configuration
   Ability to initialize the system according to a given user profile and to switch between profiles
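A possible on-disk representation of such a profile, sketched as a plain key=value text file; the format and the helper names saveProfile/loadProfile are hypothetical:

// Sketch: user profile as a plain key=value text file, one setting per line.
#include <fstream>
#include <map>
#include <string>

typedef std::map<std::string, std::string> Profile;

bool saveProfile(const Profile& p, const std::string& path) {
    std::ofstream out(path.c_str());
    for (Profile::const_iterator it = p.begin(); it != p.end(); ++it)
        out << it->first << "=" << it->second << "\n";
    return out.good();
}

Profile loadProfile(const std::string& path) {
    Profile p;
    std::ifstream in(path.c_str());
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type eq = line.find('=');
        if (eq != std::string::npos)
            p[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return p;
}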
For the blind user:
1. Support an extensive training mechanism for:
   3D sound perception
   Environment understanding
2. Support the following training types (a random-shape generation sketch follows this list):
   Visualizing random shapes
   Visualizing pre-defined image files
   Fully immersive use of the system with emphasis on some feature
For the researcher:
   Support defining a training experiment task
   Support recording the task results and retrieving them later
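A sketch of the random-shape training type: a synthetic frame is generated and fed to the pipeline in place of a camera frame. The shape set, canvas size, and the name randomShapeFrame are illustrative choices:

// Sketch: generating a random-shape training image with OpenCV.
#include <opencv2/opencv.hpp>
#include <cstdlib>

cv::Mat randomShapeFrame(int width = 640, int height = 480) {
    cv::Mat canvas = cv::Mat::zeros(height, width, CV_8UC1);
    int cx = std::rand() % width;
    int cy = std::rand() % height;
    int size = 20 + std::rand() % 60;
    if (std::rand() % 2 == 0)
        cv::circle(canvas, cv::Point(cx, cy), size, cv::Scalar(255), -1);
    else
        cv::rectangle(canvas, cv::Point(cx - size, cy - size),
                      cv::Point(cx + size, cy + size), cv::Scalar(255), -1);
    return canvas;   // fed to the same pipeline instead of a camera frame
}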
Speed requirements:
1. Response time: The system will produce a 3D sound according to a frame taken by the camera within 0.1 seconds at most; we will strive for 0.03 seconds, i.e. about 30 fps (a measurement sketch follows this list)
2. Training speed:
   A simple training session for reaching 50% recognition accuracy should take a blind user no more than 30 minutes.
   A blind user should pass at least 80% of the accuracy tests after 2 days of extensive system usage.
   A regular user should pass at least 80% of the accuracy tests after 3 days of extensive system usage.
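A sketch of how the 0.1-second frame-to-sound budget could be checked per iteration, using std::chrono (C++11); processFrame is a hypothetical stand-in for one full pipeline pass:

// Sketch: verifying the 0.1 s frame-to-sound budget per iteration.
#include <chrono>
#include <iostream>

void timedIteration() {
    using namespace std::chrono;
    steady_clock::time_point start = steady_clock::now();

    // processFrame();   // hypothetical: capture + analysis + sonification

    double ms = duration_cast<duration<double, std::milli> >(
                    steady_clock::now() - start).count();
    if (ms > 100.0)   // requirement: at most 0.1 s per frame
        std::cerr << "frame-to-sound latency exceeded: " << ms << " ms\n";
}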
Portability requirements:
1. Currently the system is designed to be deployed on Microsoft Windows (XP / Vista / 7 and later) operating systems only
2. The system will be compatible with 32 / 64 bit machines that have a web-camera and audio drivers installed
Capacity requirements:
1. The system should work on machines with at least 1 GB of RAM
2. The system will support arbitrarily many OpenAL implementations and HRTF datasets; the only limit is the hard disk capacity
User Interface requirements:
1. The UI should be easy to use even for users who are not familiar with computer technology
2. The user interface will be in English
Documentation and Help:
1. Extensive documentation will be provided, along with an installation guide
2. Operations will be implemented as wizards
3. Error messages will be heard via headphones
1. The application core and the UI will be written in C++ using the .NET 3.5 Framework and the Visual Studio 10.0 IDE
2. MATLAB will be used as a computational engine (see the sketch below)
3. During the development stage a simple home-made device will be used (a PC web-camera strapped to the top of a pair of headphones)
4. For demo and testing purposes, a real device will be supplied by DT labs: spy-sunglasses (sunglasses with a tiny camera hidden in the nose bridge)
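A minimal sketch of driving MATLAB as a computational engine through its C Engine API (engine.h); the FFT call is only an illustrative computation, not the project's actual workload:

// Sketch: calling MATLAB as a computational engine via the C Engine API.
#include <engine.h>
#include <cstring>

int runMatlabExample() {
    Engine* ep = engOpen("");          // start a local MATLAB session
    if (!ep) return -1;

    double data[4] = {1.0, 2.0, 3.0, 4.0};
    mxArray* x = mxCreateDoubleMatrix(1, 4, mxREAL);
    std::memcpy(mxGetPr(x), data, sizeof(data));

    engPutVariable(ep, "x", x);        // push data into MATLAB
    engEvalString(ep, "y = abs(fft(x));");
    mxArray* y = engGetVariable(ep, "y");

    double first = y ? mxGetPr(y)[0] : 0.0;   // read a result back
    (void)first;

    mxDestroyArray(x);
    if (y) mxDestroyArray(y);
    engClose(ep);
    return 0;
}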
A blind user starts the visualization process
A blind user performs a training process
A blind user chooses an existing user profile in order to perform a training session or to use the system
The core of the visualization process