OpenWorld 2018 Audio Recognition Using Oracle Data Science Platform

Presentation transcript:

OpenWorld 2018: Audio Recognition Using Oracle Data Science Platform
Jean-René Gauthier, Lead Data Scientist, Oracle Data Science Platform
October 24, 2018
Confidential – Oracle Internal/Restricted/Highly Restricted

Thank you for attending this hands-on lab. My name is Jean-René Gauthier and I'm a lead data scientist working on the Oracle Data Science Platform. I was part of DataScience.com, which Oracle acquired in June of this year.

Presentation Agenda
1. Quick Overview of the Upcoming Oracle Data Science Platform
2. Speech Recognition and Keyword Spotting
3. Overview of the Lab
4. Lab

The lab/demo we will go through today runs on the legacy DataScience.com platform. The features available on the Oracle Data Science Platform will be similar to those you will use on the legacy product today.

A few questions for the audience:
- How many of you have coded in Python or are familiar with it?
- How many of you have used Jupyter notebooks in the past?
- How many of you have used a data science platform before?

The Lab GitHub Public Repository: https://github.com/datascienceinc/speech-commands-oow2018

Oracle Data Science Platform*

What is it? The Oracle Data Science Platform enables data science teams to organize their work, easily access data and computing resources, and build, train, deploy, and manage models on the Oracle Cloud.

What's the value? The Oracle Data Science Platform makes data science teams more productive and enables them to deploy more work faster to power their organizations with machine learning.

We want to make data scientists more productive. How?
- By giving them self-service access to scalable compute resources to train and deploy their models.
- By giving them a platform to manage these models after they are deployed.
- All on the latest and greatest Oracle Cloud Infrastructure hardware (GPUs).

The value? 5% of data science projects make their way to production. We want to shorten the time to model deployment, and we want to limit the involvement of IT teams and DevOps engineers in the completion of data science projects.

*Final name pending legal review and approval

Oracle Data Science Platform Core Capabilities
End-to-end platform for enterprise data scientists
- Data science workflow: Collaboration for enterprise data science teams in projects
- Model building and training*: Python development in Jupyter notebooks
- Model deployment: Deploy models as APIs, serve predictions in real time
- Version control: External Git provider required for files
- Access to open source: Curated sets of packages for data science use cases
- Access to compute: Self-service access to spin up containers on an OKE cluster of OCI VMs (CPU only)
- Access to data: Oracle Object Store

* Model training runs in a single Jupyter container with reserved CPU/memory (not distributed over multiple containers)

Speech Recognition: Applications Are Endless

“The task of speech recognition is to map an acoustic signal containing a spoken natural language utterance into the corresponding sequence of words intended by the speaker.” - Goodfellow et al. (2016), Deep Learning, MIT Press

High-level overview: use cases, task of the lab, dataset, waveform and sampling, FFT, machine learning model, lab.

Applications:
- Speech transcription
- Text captions for movies or TV
- Issuing commands to your car while driving
- Issuing commands to personal assistants (e.g. Siri, Google Home, Alexa)
- Etc.

https://www.youtube.com/watch?v=mpw_FB2QrjQ

Speech Recognition – Keyword Spotting Task

Problem statement:
- Personal assistant devices use keyword spotting to start interactions (e.g. Siri, Alexa, Hey Google).
- Most devices run a small keyword recognition (spotting) model locally.
- The device listens and runs the model to spot keywords.
- If a keyword is recognized, data transfer to the cloud starts. Otherwise, the device keeps listening and calling the locally stored model for inference.

You need a fast, local model that you can store on the device. You don't want to continuously transfer data to the cloud, which would be very costly and inefficient. You need a small model that runs fewer operations. That's the difference here. The keyword is typically one word or a very short phrase like “Hey Siri”.

https://twitter.com/gregorymancuso/status/959496561695670274
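To make the gating idea concrete, here is a minimal sketch of the loop a device might run. Everything in it is an assumption for illustration: the audio windows are stand-in strings rather than real audio, and `toy_model` is a hypothetical placeholder for the small on-device classifier. It only shows that nothing leaves the device until the local model spots the keyword.

```python
from typing import Callable, Iterable, List


def gate_stream(windows: Iterable[str],
                local_model: Callable[[str], bool]) -> List[str]:
    """Return only the audio windows that would be sent to the cloud."""
    sent = []
    awake = False
    for window in windows:
        if awake:
            # Keyword already heard: this window is streamed to the cloud.
            sent.append(window)
        elif local_model(window):
            # Keyword spotted locally: start streaming from the next window.
            awake = True
        # Otherwise keep listening; nothing leaves the device.
    return sent


def toy_model(window: str) -> bool:
    # Hypothetical stand-in for the small keyword-spotting model.
    return window == "hey siri"


stream = ["noise", "music", "hey siri", "what", "time", "is it"]
uploaded = gate_stream(stream, toy_model)
print(uploaded)
```

In a real device the local model would be called on short, overlapping audio windows, and the session would end after a period of silence; both details are left out here.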

The Lab in One Slide
Notebook 1 (optional), Notebook 2, Notebook 3

This is what we're going to do. We are going to use Jupyter notebooks written in Python on the legacy DataScience.com platform to build a machine learning model (a deep learning model) with Keras that can recognize audio keywords.
- We're going to start with a collection of 1-second audio clips of someone saying a single word (cat, dog, left, right, etc.).
- Then we're going to take that data and transform it into a 2-D map called a spectrogram. We give you a basic introduction to spectrograms in notebook 1. I don't think we'll have time to go through that notebook; it's optional.
- In notebook 2, we'll train a convolutional neural network using Keras to classify these keywords. This is a modeling technique that is generally applied to images.
- We're going to deploy this model as a REST API endpoint. We follow what a real data scientist would do on this platform and move the model to a production environment.
- Lastly, in notebook 3, I will run a notebook on my local laptop and call that web service to classify different audio clips. I'll also create live clips that I will classify with the REST API endpoint.
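As a rough illustration of the spectrogram step, the sketch below slices a 1-second signal into short overlapping frames and takes the FFT of each one with NumPy. It is not the code from the lab notebooks: the 25 ms frame (400 samples), the 10 ms hop (160 samples), the Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np


def spectrogram(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Log-magnitude spectrogram computed with a short-time FFT.

    Returns an array of shape (n_frames, frame_len // 2 + 1):
    one row per time frame, one column per frequency bin.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-10)  # small offset avoids log(0)


sample_rate = 16_000                      # samples per second
t = np.arange(sample_rate) / sample_rate  # one second of time stamps
tone = np.sin(2 * np.pi * 1000.0 * t)     # synthetic 1 kHz tone

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 400    # bin width = sample_rate / frame_len
print(spec.shape, peak_hz)
```

The resulting 2-D array (time along one axis, frequency along the other) is what lets an image-style model such as a convolutional network be applied to audio.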

The Speech Commands Dataset (Warden 2018)
Standard training and evaluation dataset for simple speech recognition tasks
- 105k utterances in WAVE audio file format
- A single word spoken per clip, from 35 different words: dog, cat, bed, bird, up, wow, yes, etc.
- One second or less per clip
- 16 kHz sampling rate
- 16-bit, single channel
- 2,618 speakers were recorded
- Recorded with a phone or computer microphone in realistic settings

This is the dataset we're going to use. It's called the Speech Commands Dataset. It is English only.
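To see what those format numbers mean in practice, here is a small sketch using Python's standard `wave` module. It does not read the real dataset: it synthesizes a 1-second tone with the stated parameters (16 kHz, 16-bit, single channel), writes it to an in-memory WAVE file, and reads the header back. The helper names `make_clip` and `describe_clip` are made up for this example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16_000  # Hz, as stated for the dataset
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # single channel


def make_clip(freq_hz: float = 440.0, seconds: float = 1.0) -> bytes:
    """Build an in-memory WAVE file containing a sine tone (synthetic data)."""
    n = int(SAMPLE_RATE * seconds)
    samples = [int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
               for i in range(n)]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{n}h", *samples))  # little-endian int16
    return buf.getvalue()


def describe_clip(wav_bytes: bytes) -> dict:
    """Read the WAVE header and report the clip's basic properties."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        n_frames = wav.getnframes()
        return {
            "channels": wav.getnchannels(),
            "bits": 8 * wav.getsampwidth(),
            "sample_rate": wav.getframerate(),
            "seconds": n_frames / wav.getframerate(),
            "pcm_bytes": n_frames * wav.getsampwidth() * wav.getnchannels(),
        }


info = describe_clip(make_clip())
print(info)
```

A full 1-second clip in this format holds 16,000 samples, or 32,000 bytes of audio data, which is why the whole dataset stays manageable on a laptop.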

References
Warden, P. 2018, “Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition”
Sainath, T.N., Parada, C. 2015, “Convolutional Neural Networks for Small-footprint Keyword Spotting”, Interspeech 2015, ISCA

The Lab GitHub Public Repository
https://github.com/datascienceinc/speech-commands-oow2018
Any questions? jr.gauthier@oracle.com

All the lab materials are freely available on github.com. Feel free to download the materials and go over all the notebooks in your free time.

CNN Model for Keyword Spotting Task
CNN model: 2 convolutional layers + 2 max-pooling layers + 2 dropout layers, followed by 2 fully connected (FC) layers (see also Sainath & Parada 2015)
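The slide gives only the layer types, so the sketch below is a back-of-the-envelope walk through such a stack in plain Python, tracking the tensor shape and the number of trainable parameters. All sizes are assumptions, not the lab's actual values: a 98 x 201 single-channel spectrogram as input, 3x3 kernels with 32 and 64 filters, 2x2 pooling, no padding, and a 128-unit hidden FC layer. Only the 35 output classes come from the dataset description.

```python
def conv2d_shape(h, w, kernel, filters):
    """Output shape of a stride-1, unpadded ('valid') 2-D convolution."""
    return h - kernel + 1, w - kernel + 1, filters


def maxpool_shape(h, w, c, pool=2):
    """Output shape of a non-overlapping max pool (partial windows dropped)."""
    return h // pool, w // pool, c


def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out


h, w, c = 98, 201, 1   # assumed spectrogram size (time, frequency, channels)
n_classes = 35         # one output per word in the dataset
total = 0

# Block 1: conv -> max pool -> dropout (dropout changes neither shape nor params)
total += conv2d_params(3, c, 32)
h, w, c = conv2d_shape(h, w, 3, 32)
h, w, c = maxpool_shape(h, w, c)

# Block 2: conv -> max pool -> dropout
total += conv2d_params(3, c, 64)
h, w, c = conv2d_shape(h, w, 3, 64)
h, w, c = maxpool_shape(h, w, c)

# Two fully connected layers on the flattened feature map
flat = h * w * c
total += dense_params(flat, 128)
total += dense_params(128, n_classes)

print((h, w, c), flat, total)
```

With these assumed sizes, almost all of the parameters sit in the first FC layer, so shrinking the feature map before flattening is the main lever for getting a small-footprint model.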