THE ASSISTIVE SYSTEM SHIFALI KUMAR BISHWO GURUNG JAMES CHOU SENIOR DESIGN II FINAL PRESENTATION
Project Goal To develop an object recognition system which utilizes audio input and output to assist visually impaired individuals.
Methods Tested
- Algorithms: SIFT, SURF, HOG
- Models: Bag of Features
- Classifiers: SVM, Random Forest
Methods Abandoned
- SIFT: no native function was available, so it would have needed to be built from scratch.
- SURF with Bag of Features: very accurate but slow (~2 minutes for 3 categories).
- Random Forest: easy to tune, but requires many trees for more accuracy. 50+ trees gave good accuracy but were still slow (~2 minutes for 6 categories).
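For reference, the Bag of Features model named above represents an image as a histogram of "visual words": each local descriptor is assigned to its nearest vocabulary centre, and the counts are normalised. The following is a minimal sketch of that quantisation step only, not the project's implementation. The function names, the 2-D toy descriptors, and the three-word vocabulary are all assumptions made for illustration (real SURF descriptors are typically 64-dimensional, and the vocabulary would normally come from clustering training descriptors).

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    best_index, best_dist = 0, math.inf
    for index, centre in enumerate(vocabulary):
        # Squared Euclidean distance is enough for finding the nearest centre.
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, centre))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bag_of_features(descriptors, vocabulary):
    """Normalised histogram of visual-word counts for one image."""
    if not descriptors:
        return [0.0] * len(vocabulary)
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total for i in range(len(vocabulary))]


# Hypothetical toy data: a 3-word vocabulary and 4 descriptors from one image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]

histogram = bag_of_features(descriptors, vocabulary)
print(histogram)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length histogram is what a classifier would be trained on. The nearest-centre search runs for every descriptor against every word, so a vocabulary of realistic size makes this step costly.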
HOG with SVM (~15 classes)
- Generally runs in about 30 seconds: feature extraction in 10 seconds, classifier training in 20 seconds.
- Fast and fairly accurate.
- SVM doesn’t require a large dataset.
- Doesn’t require much tuning.
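To illustrate what a HOG feature is, here is a simplified sketch of the core step: a gradient-orientation histogram for a single cell of a grayscale image. It is an assumption-laden illustration rather than the function used in the project. It omits the bilinear vote interpolation and block normalisation of a full HOG descriptor, and the function name, 9-bin setting, and toy 4x4 cell are hypothetical.

```python
import math


def hog_cell_histogram(cell, bins=9):
    """Unsigned-orientation gradient histogram for one grayscale cell.

    `cell` is a list of rows of pixel intensities. Gradients use central
    differences on interior pixels only. Each pixel votes with its gradient
    magnitude into the bin containing its orientation (0 to 180 degrees).
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    rows, cols = len(cell), len(cell[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the angle into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    # L2-normalise so the histogram does not depend on overall contrast.
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Hypothetical 4x4 cell containing a vertical edge (dark left, bright right).
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
edge_hist = hog_cell_histogram(vertical_edge)
print(edge_hist)  # all of the weight falls in bin 0 (horizontal gradient)
```

In a full pipeline, the histograms of all cells would be concatenated into one feature vector per image and passed to the SVM for training and prediction.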
User supervised learning
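One way to picture user-supervised learning is a loop in which the system makes a guess, the user confirms or corrects it, and the corrected example is folded back into the model. The sketch below is only an assumed illustration of that idea. It uses a simple nearest-centroid classifier as a stand-in for the SVM, and the class name, helper function, feature values, and labels ("cup", "phone") are all hypothetical.

```python
class FeedbackClassifier:
    """Nearest-centroid classifier that learns from user-confirmed labels."""

    def __init__(self):
        self.sums = {}    # label -> running sum of feature vectors
        self.counts = {}  # label -> number of confirmed examples

    def learn(self, features, label):
        """Add one example whose label the user has confirmed or corrected."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the label whose mean feature vector is closest, or None."""
        best_label, best_dist = None, float("inf")
        for label, total in self.sums.items():
            n = self.counts[label]
            dist = sum((f - s / n) ** 2 for f, s in zip(features, total))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


def classify_with_feedback(model, features, user_label):
    """Make a guess, then learn from the label the user supplies."""
    guess = model.predict(features)
    model.learn(features, user_label)
    return guess


model = FeedbackClassifier()
model.learn([0.0, 0.0], "cup")
model.learn([1.0, 1.0], "phone")

# The system guesses "phone"; the user corrects it to "cup".
first_guess = classify_with_feedback(model, [0.6, 0.6], "cup")
# After the correction, the same features are classified as "cup".
second_guess = model.predict([0.6, 0.6])
print(first_guess, second_guess)  # phone cup
```

With an SVM, the same loop would add the corrected example to the training set and retrain the classifier instead of updating a centroid.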
Text to Speech in Windows
Speech Recognition
- Made progress, but still requires development.
- Researched several sources and methods to find a way to incorporate speech to text.
- Completed the first few steps that will help us later, such as playing an audio file and recording voice input.
Audio Progress 1
Audio Progress 2
Accomplishments
- Local and global feature extraction from the images.
- Classifiers built and trained to predict the correct label of an image.
- Text-to-speech feature added on both Mac OS X and Windows.
- Machine learning supervised by the user for better predictions.
Future Plans
- Finishing the speech-to-text feature and having it work reliably.
- Working with classes that result in better time efficiency.
- Getting more accurate predictions from images captured by the camera.
- Adding more query options along with the corresponding results.
- Implementing neural network models.
Conclusion
- We learned a lot about computer vision, a field in which none of us had prior experience.
- We observed the performance of various algorithms and classifiers.
- We learned a lot about supervised machine learning.
- Speech recognition as a means for the user to interact with the system was a more difficult task than anticipated.
Questions & Concerns are welcome. Thank You!