Retrieval of audio testimonials via voice search

Presentation transcript:

Retrieval of audio testimonials via voice search. David Cyphert, CS 2310 – Software Engineering, Fall 2017.

Project Goals: Read in keywords specified by the user via speech recognition. Take the recognized keywords and search the stored audio files for them. Dictate each entire audio file and calculate its sentiment.

Components: Sentiment analysis (dictation) and keyword spotting. Web-based approach: HTML/CSS/JavaScript front-end, ASP.NET (C#) back-end with a SQL Server database. Client-side processing uses the Web Speech API (speech recognition) to get the search criteria for an audio testimonial. Server-side processing uses SpeechRecognitionEngine to analyze the audio file and spot keywords. Sentiment analysis determines whether the review was positive or negative: come up with an algorithm to decide if an audio testimonial stored on the server is good or bad, probably using a predefined set of “good” and “bad” descriptor words.
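The descriptor-word idea can be sketched as a simple count over the dictated transcript. This is a minimal illustration, not the project's code: the word lists and the function name `classify_testimonial` are hypothetical, since the slides do not give the actual descriptors, and the project's back-end is C# rather than Python.

```python
# Hypothetical descriptor lists; the slides do not list the actual words.
GOOD_WORDS = {"great", "excellent", "helpful", "friendly", "recommend"}
BAD_WORDS = {"terrible", "rude", "slow", "broken", "disappointed"}


def classify_testimonial(transcript: str) -> str:
    """Label a dictated transcript by counting good and bad descriptor words."""
    # Lowercase each token and strip surrounding punctuation before lookup.
    words = [w.strip(".,!?;:\"'").lower() for w in transcript.split()]
    good = sum(w in GOOD_WORDS for w in words)
    bad = sum(w in BAD_WORDS for w in words)
    if good > bad:
        return "good"
    if bad > good:
        return "bad"
    return "neutral"
```

A plain count like this ignores negation ("not good") and intensity ("very good"), which is the kind of limitation a rule-based sentiment lexicon is meant to address.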

Client-side processing: Web Speech API. Commonly grouped with the HTML5 family of web APIs, although it is specified separately from the HTML standard. A JavaScript API that enables web developers to incorporate speech recognition and synthesis into their web pages. Speech-to-text is used to get input from the user. AJAX requests carrying the search criteria are then sent to the server.

Server-side analysis of audio files: Microsoft’s Speech Recognition Engine is used for “keyword spotting”. “Grammars” are defined so that only certain utterances with particular semantic meaning (the spoken search criteria) are processed. Based on the confidence level calculated by the engine, the server determines whether a given word is spoken in an audio file, and returns the rows that are above the confidence threshold.
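The thresholding step can be sketched independently of the recognition engine. The sketch below assumes the engine's output has already been collected into simple records; the `Detection` type, its field names, and `files_matching` are hypothetical and are not part of Microsoft's API. It treats a confidence equal to the threshold as accepted, which is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    file_name: str     # audio file in which the keyword was detected
    keyword: str       # keyword taken from the spoken search criteria
    confidence: float  # engine confidence, assumed to lie in [0, 1]


def files_matching(detections, keyword, threshold):
    """Return the sorted names of files where `keyword` met the threshold."""
    return sorted({
        d.file_name
        for d in detections
        if d.keyword == keyword and d.confidence >= threshold
    })
```

Collecting file names into a set means a file is returned once even if the keyword is detected in it several times.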

Sentiment Analysis: Also known as opinion mining or emotion AI. Aims to determine the attitude of a speaker, writer, or other subject with respect to some topic. Examples of constructs that affect sentiment: typical negations (e.g., "not good"); use of contractions as negations (e.g., "wasn't very good"); degree modifiers that alter sentiment intensity (e.g., intensity boosters such as "very" and intensity dampeners such as "kind of"). VADER API: Valence Aware Dictionary and sEntiment Reasoner. The compound score is computed by summing the valence of each word in the lexicon, adjusted with rules, and then normalized to be generally between -1 (most extreme negative) and +1 (most extreme positive). It is a “normalized weighted composite score”.
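The normalization step can be illustrated with a short sketch. In the open-source vaderSentiment package the summed valence x is mapped to x / sqrt(x^2 + alpha) with alpha = 15; the function below reproduces that form as an illustration and is not the library's own code. The summed valence passed in is assumed to have been computed already (lexicon lookup plus rule adjustments).

```python
import math


def normalize(score: float, alpha: float = 15.0) -> float:
    """Map an unbounded summed valence into the interval [-1, 1]."""
    norm = score / math.sqrt(score * score + alpha)
    # Clamp defensively against floating-point values just outside [-1, 1].
    return max(-1.0, min(1.0, norm))
```

Because the denominator grows with the score, strongly worded reviews approach -1 or +1 without reaching them, while a summed valence of zero stays at zero.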

Problems: Keyword spotting in general turns out to be a hard problem. It is not very accurate for short words (few syllables): shorter words are easily confused and cause false positives. Microsoft’s Recognition Engine works for keyword spotting, but it is not 100% accurate. It works great for dictation of an entire file.

Improving accuracy: Lowering the amplitude of the audio. Not sure why this helps; possibly, when this library is used through the microphone, it programmatically reduces the volume as it's processing. Recognition is wildly inaccurate without doing this. Converting stereo to mono, 16-bit PCM (pulse-code modulation). This is a requirement of the library.
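The two preprocessing steps (downmixing to mono and lowering the amplitude) can be sketched on raw samples. This is an illustration under stated assumptions: the input is interleaved signed 16-bit stereo PCM, the function name is hypothetical, and the gain of 0.5 is a placeholder because the slides do not say how much the amplitude was lowered.

```python
from array import array


def stereo_to_quiet_mono(stereo: array, gain: float = 0.5) -> array:
    """Average interleaved 16-bit L/R samples into mono and scale by `gain`."""
    if stereo.typecode != "h" or len(stereo) % 2:
        raise ValueError("expected interleaved signed 16-bit stereo samples")
    mono = array("h")
    for i in range(0, len(stereo), 2):
        # Average the left and right samples, then attenuate.
        mixed = (stereo[i] + stereo[i + 1]) / 2 * gain
        # Clip to the signed 16-bit range before storing.
        mono.append(max(-32768, min(32767, int(round(mixed)))))
    return mono
```

The samples themselves could be read from and written back to a WAV file with Python's standard `wave` module; that part is omitted to keep the sketch short.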

Improving accuracy (cont.): Only accepting higher confidence values. This reduces false positives. Currently I’m only accepting detections with 80% confidence or higher. Problem with this: it could reject an accurate detection.
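The trade-off described here can be made concrete by counting outcomes at different thresholds. The sketch below is illustrative only: `threshold_tradeoff` is a hypothetical helper, and it assumes a hand-labeled list of (confidence, actually spoken) pairs, which the slides do not provide.

```python
def threshold_tradeoff(scored, threshold):
    """Count outcomes for (confidence, is_really_spoken) pairs at a threshold."""
    accepted_true = sum(1 for c, spoken in scored if c >= threshold and spoken)
    false_positives = sum(1 for c, spoken in scored if c >= threshold and not spoken)
    rejected_true = sum(1 for c, spoken in scored if c < threshold and spoken)
    return {
        "accepted_true": accepted_true,      # correct detections kept
        "false_positives": false_positives,  # wrong detections kept
        "rejected_true": rejected_true,      # accurate detections thrown away
    }
```

Running this over a labeled sample at several thresholds shows how many accurate detections are given up for each false positive removed, which is one way to choose a cut-off such as 80%.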

Improving accuracy (cont.): “Training” the Speech Recognition Engine.

DEMO