Published by Stephany Lewis. Modified over 6 years ago.
1
INITIAL GOAL: Detecting personality based on interaction with Alexa
Krithika Ganesh CS2310 Multimedia Software Engineering
2
Detecting personality based on interaction with Alexa
Detect the user's personality by passing each voice command to a machine learning model that predicts an emotion quotient and an interest quotient for that command. At the end of the week, the system gives an analysis of the user's interest and emotion quotients.
Objectives:
Predict the emotion and interest quotient for each voice command.
At the end of the week, give an overall analysis of the person's mood and interests.
This project is an extension of Exercise 4.
3
Machine Learning Training Component: Building the Model
Training data: <sentence, emotion> pairs
Data scrubbing: produces clean data
Vectorization: produces sentence vectors
Embedding layer: produces word vectors
LSTM: produces predicted vectors
Dense layer: changes the vector to the output dimension
Softmax layer: produces a probability distribution vector
Max of the probability distribution vector: predicts the emotion
Loss function: categorical cross-entropy
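The last stages of this pipeline (dense-layer scores, softmax, taking the max, and the categorical cross-entropy loss) can be illustrated with a small standard-library sketch. It is not the author's model: the label set `EMOTIONS` and the example scores are made up for illustration, since the slides do not list the emotions used for training.

```python
import math

# Hypothetical label set; the slides do not say which emotions were used.
EMOTIONS = ["joy", "sadness", "anger"]


def softmax(scores):
    """Turn raw dense-layer scores into a probability distribution."""
    highest = max(scores)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def predict_emotion(probabilities):
    """Pick the label with the highest probability (the 'max' step)."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return EMOTIONS[best_index]


def categorical_cross_entropy(probabilities, true_index):
    """Loss for one example whose true label is EMOTIONS[true_index]."""
    return -math.log(probabilities[true_index])


scores = [2.0, 0.5, -1.0]  # made-up dense-layer output for one sentence
probs = softmax(scores)
print(predict_emotion(probs))
print(categorical_cross_entropy(probs, true_index=0))
```

The loss is small when the model puts most of the probability on the true label and grows as that probability shrinks, which is what training tries to minimize.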
4
Challenges - Solution
Initial setup: voice commands go to the Amazon Echo Dot and are analyzed by my LSTM model.
Challenge 1: With the Amazon Echo Dot there was no way to capture the voice commands for analysis.
Solution 1: Use the Google Speech Recognizer, which converts voice commands to text and lets me save the voice commands for analysis.
Challenge 2: My LSTM model's predictions were slow and not accurate enough, it needed better training data, and it could not capture frequency or loudness.
Solution 2: Use a state-of-the-art AI emotion predictor, which gives excellent predictions.
Component-based SE goal: focus on the components working together rather than on the algorithm itself.
5
PLAN B: Detecting personality based on interactions with Google Speech Recognizer
Krithika Ganesh CS2310 Multimedia Software Engineering
6
Architecture
Super component: Machine learning Trained component
Components: Project Remote component, Input processor, Uploader component, UI component
Servers: Flask server, SIS Server
7
System Design
Input processor: voice commands go to the Google Speech Recognizer.
Super component: the trained model predicts emotion (personality).
PRJ Remote: test voice samples.
Uploader: share the results.
8
Input processor
Voice commands are recorded with the laptop microphone and passed to the Google Speech Recognizer, which returns text. The Google Cloud Speech API enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API.
The voice file converter produces WAV PCM, 8 kHz, 16-bit, mono. The converted file is saved locally to be analyzed.
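As an illustration of the target file format (WAV PCM, 8 kHz, 16-bit, mono), the sketch below uses Python's standard `wave` module to write a clip in that format and to check that a saved file matches it. The helper names and the silent test clip are assumptions for the example; the project's actual converter is not shown in the slides.

```python
import os
import tempfile
import wave

# Target format quoted on the slide: WAV PCM, 8 kHz, 16-bit, mono.
SAMPLE_RATE_HZ = 8000
SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample
CHANNELS = 1            # mono


def write_silence(path, seconds):
    """Write a silent clip in the target format (stand-in for a recording)."""
    frame_count = int(SAMPLE_RATE_HZ * seconds)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(b"\x00" * (frame_count * SAMPLE_WIDTH_BYTES * CHANNELS))


def matches_target_format(path):
    """Return True if the file is uncompressed PCM at 8 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels() == CHANNELS
            and wav_file.getsampwidth() == SAMPLE_WIDTH_BYTES
            and wav_file.getframerate() == SAMPLE_RATE_HZ
            and wav_file.getcomptype() == "NONE"
        )


demo_path = os.path.join(tempfile.mkdtemp(), "command.wav")
write_silence(demo_path, seconds=1)
print(matches_target_format(demo_path))
```

A check like this could run before a file is handed to the analysis step, so that a recording in the wrong format is caught early.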
9
Super Component: Trained ML Component
Input: raw voice sample.
The Emotions Analytics engine measures the speaker’s current mood. It requires at least 13 seconds of continuous voice to render an emotional analysis.
Outputs: attitudes and emotion.
Attitudes:
Temper value: measures aggressiveness
Valence value: positive, negative, or neutral
Arousal value: measures degree of energy
The attitude values are numbers that a layperson cannot easily interpret, so for this project I decided to extract only the emotion data.
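Because the engine needs at least 13 seconds of voice, a simple pre-check on clip length can save a wasted request. The sketch below is an assumption-laden illustration: it measures total clip duration with the standard `wave` module as a rough proxy for "continuous voice" (it does not detect silence), and the helper names are hypothetical.

```python
import io
import wave

MIN_SECONDS = 13  # minimum amount of voice quoted on the slide


def duration_seconds(wav_source):
    """Return clip length in seconds; accepts a path or a binary file object."""
    with wave.open(wav_source, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def long_enough(wav_source):
    """True if the clip is at least MIN_SECONDS long (total length only)."""
    return duration_seconds(wav_source) >= MIN_SECONDS


def make_silent_clip(seconds, sample_rate=8000):
    """Build an in-memory 16-bit mono WAV of silence for the demo below."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buffer.seek(0)
    return buffer


print(long_enough(make_silent_clip(5)))
print(long_enough(make_silent_clip(13)))
```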
10
Emotions: Mood Groups
Mood groups are an indicator of a speaker’s emotional state during the analyzed voice section.
Aggressive / Confrontational Mood Groups: Supremacy and Arrogance; Hostility and Anger; Criticism and Cynicism
Self-Control Mood Group: Self-control and practicality
Embracive Mood Groups: Leadership and Charisma; Creativeness and Passion; Friendliness and Warmth; Love and Happiness
Depressive / Gloomy Mood Groups: Loneliness and Unfulfillment; Sadness and Sorrow; Defensiveness and Anxiety
11
PRJ Remote: Testing phase
User can upload any test voice sample and check the results.
12
Uploader: Share the emotion results
Mail results to user
Hostname: smtp.gmail.com
Port: 587
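A minimal sketch of mailing the results with Python's standard `smtplib` and `email` modules, using the host and port given above. Port 587 expects the connection to be upgraded with STARTTLS before login. The addresses, subject line, and function names are placeholders, and the slides do not show how the Uploader component is actually implemented.

```python
import smtplib
from email.message import EmailMessage

SMTP_HOST = "smtp.gmail.com"
SMTP_PORT = 587


def build_results_message(sender, recipient, emotion_summary):
    """Create the email that carries the emotion results."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Your emotion analysis results"  # placeholder subject
    message.set_content(emotion_summary)
    return message


def send_results(message, username, password):
    """Send over STARTTLS; the account credentials are supplied by the caller."""
    with smtplib.SMTP(SMTP_HOST, SMTP_PORT) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(message)


# Building the message needs no network access; sending it does.
message = build_results_message(
    "sender@example.com",
    "user@example.com",
    "Dominant mood group: Love and Happiness",
)
print(message["Subject"])
```

Calling `send_results(message, username, password)` would perform the actual delivery; it is left out of the demo because it needs real credentials.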
13
UI Component
Records the user's voice; the Google Speech Recognizer translates it to text and the voice sample is saved.
The user uploads the voice file, can listen to it, analyzes the voice sample, and checks the emotion results.
The user can share the emotion results via Gmail or Facebook.
14
Interaction among components and SIS Server
Start running the SIS server.
Start running the UI component.
Then the interaction between the UI component and the SIS server begins:
Record voice: the INPUT PROCESSOR runs (Record Voice component, File convertor component, Google Speech Recognizer), and the UI displays the voice as text.
Upload converted voice file: PRJ REMOTE runs, and the UI renders the voice file.
Click start analysis: the ML Trained Component runs, and the UI renders the analysis.
Click on share: the UPLOADER component runs.
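The interaction order can be sketched as a lookup from UI actions to the components that run in response. This is only an illustration of the sequence described on the slide: the action names, the `handle_ui_action` function, and the in-process log are hypothetical and do not represent the SIS server's real message protocol.

```python
# Hypothetical mapping from UI actions to the components the slide says run.
ACTION_TO_COMPONENTS = {
    "record_voice": ["Record Voice", "File convertor", "Google Speech Recognizer"],
    "upload_voice_file": ["PRJ Remote"],
    "start_analysis": ["ML Trained Component"],
    "share": ["Uploader"],
}


def handle_ui_action(action, log):
    """Stand-in for the server: record which components would run."""
    components = ACTION_TO_COMPONENTS.get(action)
    if components is None:
        raise ValueError(f"unknown UI action: {action}")
    for name in components:
        log.append(f"{name} runs")
    return log


log = []
for action in ["record_voice", "upload_voice_file", "start_analysis", "share"]:
    handle_ui_action(action, log)
print("\n".join(log))
```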
15
Screenshots of a Scenario
16
SIS Server Running
17
Google Speech Recognizer Running
Record voice here
19
Choose the recorded file
Play the recorded file PRJ Component Tester Running
20
After clicking on start
Results are displayed here More visual :D
21
Share on Facebook Mail Results
22
Demo link: https://www.youtube.com/watch?v=CzMNhcfvvgE
23
Future work
Improve on my LSTM model.
Share results directly on Facebook (feeling surprised, feeling blessed, …).
Maintain a history of previous results by saving them to a database.
Aggregate the emotion results and analyze them.
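One possible shape for the aggregation step is sketched below: count how often each emotion label appeared over a period and report the dominant one with each label's share. This is a hypothetical illustration of future work, not something the project implements; the function name and sample labels are assumptions.

```python
from collections import Counter


def summarize_emotions(emotion_results):
    """Return (dominant emotion, share of each emotion) for a list of labels."""
    if not emotion_results:
        return None, {}
    counts = Counter(emotion_results)
    total = sum(counts.values())
    shares = {emotion: count / total for emotion, count in counts.items()}
    dominant = counts.most_common(1)[0][0]
    return dominant, shares


# Made-up results for one week of analyzed voice samples.
week = [
    "Love and Happiness",
    "Sadness and Sorrow",
    "Love and Happiness",
    "Creativeness and Passion",
]
dominant, shares = summarize_emotions(week)
print(dominant)
print(shares)
```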
24
References
My Demo Video: https://www.youtube.com/watch?v=CzMNhcfvvgE
Input Processor
Trained model: Beyond Verbal
Uploader
Training data: <sentence, emotion>
Training model: LSTM
25
DEMO