Android Speech Recognition


Android Speech Recognition
CS 5390 Mobile Application Development
04/27/2017
Presented by: Marco López

Outline
- What it is
- API
- Demo
- Resources

What it is
Speech recognition is when software uses a microphone to capture sound and tries to map it to words. In Android, speech recognition can be done through a built-in service or activity.

- Internet connectivity is required to interpret a language other than the device's configured language. Since it is the service that accesses the internet, the app being developed does not need to declare the INTERNET permission itself.
- Google performs this speech recognition with machine learning: an engine trained on large samples of people speaking learns which words are used more often and in what context.
- Because the service uses internet connectivity, it is not recommended to keep it constantly running, as it will consume large amounts of battery and bandwidth.

Interpreter challenges:
- Dialect (US vs. British)
- User enunciation
- Microphone quality
- Distance from the microphone
- Environment noise
- Accents

API
Android introduced speech recognition in API level 3 (Cupcake) through RecognizerIntent.
- This intent launches a new activity running the speech-recognition software, from which we can retrieve data.

API level 8 (Froyo) introduced SpeechRecognizer.
- This is an Android service.
- SpeechRecognizer requires the RECORD_AUDIO permission:

<uses-permission android:name="android.permission.RECORD_AUDIO"/>

These are the two basic speech recognition APIs.

API - RecognizerIntent
Intent actions:
- ACTION_RECOGNIZE_SPEECH: Starts an activity that will prompt the user for speech and send it through a speech recognizer.
- ACTION_VOICE_SEARCH_HANDS_FREE: Starts an activity that will prompt the user for speech without requiring the user's visual attention or touch input.
- ACTION_WEB_SEARCH: Starts an activity that will prompt the user for speech, send it through a speech recognizer, and either display a web search result or trigger another type of action based on the user's speech.

API – RecognizerIntent (cont.)
To start Android's voice recognition, an intent must be created; startActivityForResult is called in order to receive something back from the activity:

Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
try {
    startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
} catch (ActivityNotFoundException a) {
    Toast.makeText(getApplicationContext(),
            getString(R.string.speech_not_supported),
            Toast.LENGTH_SHORT).show();
}

- EXTRA_LANGUAGE_MODEL: Informs the recognizer which speech model to prefer when performing ACTION_RECOGNIZE_SPEECH. The recognizer uses this information to fine-tune the results. This extra is required. Activities implementing ACTION_RECOGNIZE_SPEECH may interpret the values as they see fit.
- LANGUAGE_MODEL_FREE_FORM: Use a language model based on free-form speech recognition. This is a value to use for EXTRA_LANGUAGE_MODEL.
- LANGUAGE_MODEL_WEB_SEARCH: Use a language model based on web search terms. This is a value to use for EXTRA_LANGUAGE_MODEL.

API - RecognizerIntent (cont.)
Different languages are supported when recognizing speech. The desired language must be specified in the intent.

The language format must be an IETF language tag (as defined by BCP 47), for example "en-US":

intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");

It is also possible to set the language to the device's locale:

intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());

The prompt text in the invoked activity can be changed with:

intent.putExtra(RecognizerIntent.EXTRA_PROMPT, getString(R.string.speech_prompt));

A Locale has 5 fields: language, script, country (region), variant, and extensions.
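The BCP 47 tags passed in EXTRA_LANGUAGE map directly onto java.util.Locale, so they can be built and inspected in plain Java. A small sketch (the class name LocaleTags is ours, not part of any API):

```java
import java.util.Locale;

public class LocaleTags {
    // Convert a Locale back into the BCP 47 tag RecognizerIntent expects.
    static String tagFor(Locale locale) {
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        Locale us = Locale.forLanguageTag("en-US");
        System.out.println(us.getLanguage()); // "en"  (language field)
        System.out.println(us.getCountry());  // "US"  (country/region field)
        System.out.println(tagFor(us));       // "en-US"
    }
}
```

On a device you would typically pass Locale.getDefault().toLanguageTag() rather than a hard-coded string, so the recognizer follows the user's settings.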

API - RecognizerIntent (cont.)

API - RecognizerIntent (cont.)
onActivityResult is called when the activity started by startActivityForResult finishes. The speech recognition activity returns an ArrayList<String> through the intent; this list contains possible interpretations of what the user said into the microphone.

/**
 * Receiving speech input
 */
@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);
    switch (requestCode) {
        case REQ_CODE_SPEECH_INPUT: {
            if (resultCode == RESULT_OK && null != data) {
                ArrayList<String> result = data
                        .getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
                txtSpeechInput.setText(result.get(0));
            }
            break;
        }
    }
}
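The intent can also be asked for per-hypothesis confidence scores via RecognizerIntent.EXTRA_CONFIDENCE_SCORES (API 14+), which arrive as a float[] parallel to the results list. A plain-Java sketch of choosing the best hypothesis; the helper name ResultPicker and the fallback behavior are ours, not part of the Android API:

```java
import java.util.Arrays;
import java.util.List;

public class ResultPicker {
    // Return the hypothesis with the highest confidence score; fall back to
    // the first entry (the recognizer's own best guess) when scores are
    // missing or don't line up with the results list.
    static String best(List<String> results, float[] scores) {
        if (results == null || results.isEmpty()) return null;
        if (scores == null || scores.length != results.size()) return results.get(0);
        int bestIdx = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[bestIdx]) bestIdx = i;
        }
        return results.get(bestIdx);
    }

    public static void main(String[] args) {
        List<String> hypotheses = Arrays.asList("wreck a nice beach", "recognize speech");
        float[] confidences = {0.41f, 0.87f};
        System.out.println(best(hypotheses, confidences)); // recognize speech
    }
}
```

In onActivityResult the scores would come from data.getFloatArrayExtra(RecognizerIntent.EXTRA_CONFIDENCE_SCORES).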

API - SpeechRecognizer
Using SpeechRecognizer requires a RecognitionListener.

RecognitionListener methods:
- onReadyForSpeech(Bundle): Called when the endpointer is ready for the user to start speaking.
- onBeginningOfSpeech(): The user has started to speak.
- onRmsChanged(float): The sound level in the audio stream has changed.
- onBufferReceived(byte[]): More sound has been received.
- onEndOfSpeech(): Called after the user stops speaking.
- onError(int): A network or recognition error occurred.
- onResults(Bundle): Called when recognition results are ready.
- onPartialResults(Bundle): Called when partial recognition results are available.
- onEvent(int, Bundle): Reserved for adding future events.
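onError(int) receives an integer error code. A plain-Java sketch of turning codes into readable messages; the numeric values mirror the SpeechRecognizer.ERROR_* constants (e.g. ERROR_NO_MATCH = 7), but the helper class and message strings are our own:

```java
public class RecognizerErrors {
    // Values mirror SpeechRecognizer.ERROR_NETWORK_TIMEOUT (1) through
    // ERROR_INSUFFICIENT_PERMISSIONS (9); messages are illustrative.
    static String describe(int error) {
        switch (error) {
            case 1: return "Network timeout";
            case 2: return "Network error";
            case 3: return "Audio recording error";
            case 4: return "Server error";
            case 5: return "Client-side error";
            case 6: return "No speech input (timeout)";
            case 7: return "No recognition match";
            case 8: return "Recognizer busy";
            case 9: return "Missing RECORD_AUDIO permission";
            default: return "Unknown error " + error;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(7)); // No recognition match
    }
}
```

In a real listener you would log or surface describe(error) inside onError, and possibly retry on the transient codes (busy, network timeout).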

API – SpeechRecognizer (cont.)
The onResults callback receives the recognition hypotheses in a Bundle:

public void onResults(Bundle results) {
    Log.d(TAG, "onResults " + results);
    ArrayList<String> data =
            results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    // Log every hypothesis; display the first (best) one.
    for (int i = 0; i < data.size(); i++) {
        Log.d(TAG, "result " + data.get(i));
    }
    mText.setText("results: " + data.get(0));
}

API - SpeechRecognizer (cont.)
Declare the SpeechRecognizer inside the activity and create the intent the same way as with RecognizerIntent, but this time call startListening on the SpeechRecognizer instead of starting a new activity:

sr = SpeechRecognizer.createSpeechRecognizer(this);
sr.setRecognitionListener(new RecognitionListener() { ... });

Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "es-MX");
intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
sr.startListening(intent);

Example

Resources
- https://developer.android.com/reference/java/util/Locale.html
- https://developer.android.com/reference/android/speech/RecognizerIntent.html
- https://developer.android.com/reference/android/speech/SpeechRecognizer.html