OpenWorld 2018 Audio Recognition Using Oracle Data Science Platform

Presentation transcript:

OpenWorld 2018 Audio Recognition Using Oracle Data Science Platform
Jean-René Gauthier, Lead Data Scientist, Oracle Data Science Platform
October 24, 2018
Confidential – Oracle Internal/Restricted/Highly Restricted
Welcome, and thank you for attending this hands-on lab. My name is Jean-René Gauthier and I’m a lead data scientist working on the Oracle Data Science Platform. I was part of DataScience.com, which was acquired by Oracle in June of this year.

Presentation Agenda
1 Quick Overview of the Upcoming Oracle Data Science Platform
2 Speech Recognition and Keyword Spotting
3 Overview of the Lab
4 Lab
The Lab GitHub Public Repository: https://github.com/datascienceinc/speech-commands-oow2018
The lab/demo we will go through today will be done on the legacy DataScience.com platform. The set of features available on the Oracle Data Science Platform will be similar to what you will use on the legacy product today. Ask a few questions: How many have coded or are familiar with Python? How many have used Jupyter notebooks in the past? How many have used a data science platform before?

Oracle Data Science Platform
What is It? The Oracle Data Science Platform enables data science teams to organize their work, easily access data and computing resources, and build, train, deploy, and manage models on the Oracle Cloud.
What’s the Value? The Oracle Data Science Platform makes data science teams more productive, and enables them to deploy more work faster to power their organizations with machine learning.
*Final name pending legal review and approval
We want to make data scientists more productive. How? By giving them self-service access to scalable compute resources to train and deploy their models, and a platform to manage those models after they are deployed, all on the latest and greatest Oracle Cloud Infrastructure hardware (GPUs). The value? Only 5% of data science projects make their way to production. We want to shorten the time to model deployment, and we want to limit the involvement of IT teams and DevOps engineers in the completion of data science projects.

Oracle Data Science Platform Core Capabilities
End-to-end platform for enterprise data scientists
Data science workflow: Collaboration for enterprise data science teams in projects
Model building and training*: Python development in Jupyter notebooks
Model deployment: Deploy models as APIs, serve predictions in real time
Version control: External Git provider required for files
Access to open source: Curated sets of packages for data science use cases
Access to compute: Self-service access to spin up containers on an OKE cluster of OCI VMs (CPU only)
Access to data: Oracle Object Store
* Model training in a single Jupyter container with reserved CPU/memory (not distributed over multiple containers)

Speech Recognition
“The task of speech recognition is to map an acoustic signal containing a spoken natural language utterance into the corresponding sequence of words intended by the speaker.” - Goodfellow et al. (2016), Deep Learning, MIT Press
Applications are endless:
Speech transcription
Text captions for movies or TV
Issue commands to your car while driving
Issue commands to personal assistants (e.g. Siri, Google Home, Alexa, etc.)
Etc.
https://www.youtube.com/watch?v=mpw_FB2QrjQ
High-level overview: use cases, task of the lab, dataset, waveform and sampling, FFT, machine learning model, lab.

Speech Recognition – Keyword spotting task
Problem Statement
Personal assistant devices use keyword spotting to start interactions (e.g. Siri, Alexa, Hey Google, etc.).
Most devices run a small keyword recognition (spotting) model locally.
The device listens and runs a model to spot keywords.
If a keyword is recognized, data transfer to the cloud starts.
Otherwise, the device keeps listening and calling the locally stored model for inferences.
Typically: one word or very short sentences like “Hey Siri”
https://twitter.com/gregorymancuso/status/959496561695670274
You need a local, fast model you can store on the device. You don’t want to continuously transfer data to the cloud, which is very costly and ineffective. You need a small model that runs fewer operations. That’s the difference here.
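The on-device flow described on this slide can be sketched as a simple loop. This is only an illustration under stated assumptions: `run_keyword_spotter`, `toy_model`, the 0.5 threshold, and the frame values are hypothetical names and numbers chosen for the example, not part of the lab or of any real device.

```python
from typing import Callable, Iterable, List, Sequence


def run_keyword_spotter(
    frames: Iterable[Sequence[float]],
    local_model: Callable[[Sequence[float]], float],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[int]:
    """Return the indices of the frames that would be sent to the cloud.

    Each frame is scored by the small local model. Nothing leaves the
    device until the model reports a keyword; only the audio that
    follows the keyword is forwarded.
    """
    sent = []
    keyword_heard = False
    for i, frame in enumerate(frames):
        if not keyword_heard:
            # Cheap local inference on every frame; no network traffic yet.
            keyword_heard = local_model(frame) >= threshold
            continue
        # The keyword was spotted on an earlier frame: forward this audio.
        sent.append(i)
    return sent


def toy_model(frame: Sequence[float]) -> float:
    # Hypothetical stand-in for a real classifier: mean absolute amplitude.
    return sum(abs(x) for x in frame) / len(frame)


frames = [[0.0, 0.1], [0.9, 0.8], [0.2, 0.1], [0.0, 0.0]]
sent = run_keyword_spotter(frames, toy_model)
print(sent)
```

In this toy run the second frame triggers the spotter, so only the two frames after it are marked for transfer; a stream with no keyword sends nothing.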

The Lab in One Slide
Notebook 1 (optional)
Notebook 2
Notebook 3
This is what we’re going to do: We are going to use Jupyter notebooks written in Python on the legacy DataScience.com platform to build a machine learning model (a deep learning model), using Keras, that can recognize audio keywords. We’re going to start with a collection of 1-second audio clips of someone saying a single word (cat, dog, left, right, etc.). Then we’re going to take that data and transform it into a 2-D map called a spectrogram. We give you a basic introduction to spectrograms, etc. in notebook 1. I don’t think we’ll have time to go through that notebook; it’s optional. In notebook 2, we’ll train a convolutional neural network using Keras to classify these keywords. This is a modeling technique that is generally applied to images. We’re going to deploy this model as a REST API endpoint, following what a real data scientist would do on this platform to move a model to a production environment. Lastly, I will run a notebook on my local laptop and call that web service to classify different audio clips. I’ll also create live clips that I will classify with the REST API endpoint.
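To make the "2-D map called a spectrogram" step concrete, here is a minimal NumPy sketch of a log-magnitude spectrogram computed with a short-time FFT. It is not the lab's notebook code: the function name `spectrogram`, the 25 ms frame length (400 samples), the 10 ms hop (160 samples), the Hann window, and the synthetic 1 kHz test tone are all assumptions made for illustration.

```python
import numpy as np


def spectrogram(signal, frame_len=400, hop=160):
    """Log-magnitude spectrogram: rows are time frames, columns are frequency bins."""
    signal = np.asarray(signal, dtype=np.float64)
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # FFT of each frame; rfft keeps only the non-negative frequencies.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Small constant avoids log(0) on silent frames.
    return np.log(magnitude + 1e-10)


sample_rate = 16000                      # matches the 16 kHz clips used in the lab
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)      # synthetic one-second 1 kHz tone (assumption)
spec = spectrogram(tone)
print(spec.shape)
```

With these settings each frequency bin is 16000 / 400 = 40 Hz wide, so the energy of the 1 kHz tone concentrates in bin 25 of every frame. The resulting 2-D array is what an image-style model such as a CNN takes as input.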

The Speech Commands Dataset (Warden 2018)
Standard training & evaluation dataset for simple speech recognition tasks
105k utterances in WAVE audio file format:
Single word spoken. 35 different words: dog, cat, bed, bird, up, wow, yes, etc.
One second or less
16 kHz sampling rate
16-bit single channel
2,618 speakers were recorded
Recorded with phone or computer mic in realistic settings
This is the dataset we’re going to use. It’s called the Speech Commands Dataset. English only.
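The file format listed on this slide (WAVE, 16 kHz, 16-bit, single channel, one second or less) can be checked with Python's standard `wave` module. The sketch below builds a silent one-second clip in memory instead of reading a real dataset file, so no file path is assumed; `describe_wav` is a hypothetical helper name introduced for this example.

```python
import io
import wave


def describe_wav(fileobj):
    """Return basic format facts for a WAVE file (path or file-like object)."""
    with wave.open(fileobj, "rb") as w:
        n_frames = w.getnframes()
        return {
            "channels": w.getnchannels(),
            "bits_per_sample": 8 * w.getsampwidth(),
            "sample_rate": w.getframerate(),
            "duration_s": n_frames / w.getframerate(),
        }


# Build a one-second silent clip in memory with the format stated on the slide.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)        # single channel
    w.setsampwidth(2)        # 2 bytes = 16-bit samples
    w.setframerate(16000)    # 16 kHz sampling rate
    w.writeframes(b"\x00\x00" * 16000)
buf.seek(0)

info = describe_wav(buf)
print(info)
```

The same helper accepts a path to a `.wav` file, which is a quick way to confirm that a clip matches the expected format before feeding it to a model.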

References
Warden, P. 2018, “Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition”
Sainath, T.N., Parada, C. 2015, “Convolutional Neural Networks for Small-footprint Keyword Spotting”, Interspeech 2015, ISCA

The Lab GitHub Public Repository
https://github.com/datascienceinc/speech-commands-oow2018
Any Questions? jr.gauthier@oracle.com
All the lab materials are freely available on github.com. Feel free to download the materials and go over all the notebooks in your free time.

CNN Model for Keyword Spotting Task
CNN model (2 convolutional layers + 2 max pools + 2 dropouts; 2 FC layers) (see also Sainath & Parada 2015)
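A Keras model with the layer counts named on this slide might look like the sketch below. Only the counts (2 convolutional layers, 2 max pools, 2 dropouts, 2 fully connected layers) come from the slide. The function name `build_keyword_cnn`, the input shape (an assumed spectrogram of 98 time frames by 40 frequency bands), the filter counts, kernel sizes, dropout rates, hidden layer width, and the choice of 35 output classes are all assumptions for illustration and may differ from the lab's notebook.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_keyword_cnn(input_shape=(98, 40, 1), n_classes=35):
    """Small CNN: 2 conv + 2 max-pool + 2 dropout layers, then 2 dense layers."""
    return keras.Sequential([
        keras.Input(shape=input_shape),            # assumed spectrogram shape
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),                      # assumed dropout rate
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),      # first fully connected layer
        layers.Dense(n_classes, activation="softmax"),  # one probability per keyword
    ])


model = build_keyword_cnn()
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
model.summary()
```

The softmax output gives one probability per keyword class, so training on integer labels with `sparse_categorical_crossentropy` is a natural fit for this kind of single-word classification.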