Android Sensor Programming Voice and Speech Recognition Wenbing Zhao Department of Electrical Engineering and Computer Science Cleveland State University w.zhao1@csuohio.edu 11/28/2018 Android Sensor Programming
Outline
- Android audio capture
- Speech recognition
CIS 470: Mobile App Development
Android Audio Capture
Source: https://www.tutorialspoint.com/android/android_audio_capture.htm
Android provides the MediaRecorder class to record audio or video. To use it, first create an instance:
    MediaRecorder myAudioRecorder = new MediaRecorder();
Next, set the audio source, output format, audio encoder, and output file:
    myAudioRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
    myAudioRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
    myAudioRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
    myAudioRecorder.setOutputFile(outputFile);
After specifying the audio source, format, and output file, call prepare() and then start() to begin recording:
    myAudioRecorder.prepare();
    myAudioRecorder.start();
To display or log the sound level, call myAudioRecorder.getMaxAmplitude().
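The value returned by getMaxAmplitude() is a raw amplitude, so it is often easier to read on a logarithmic scale. The following is a minimal sketch of such a conversion in plain Java. The class name AmplitudeMath, the method toDbfs, and the assumed full-scale value of 32767 (16-bit audio) are illustrative choices, not part of the Android API.

```java
/** Illustrative helper: converts MediaRecorder.getMaxAmplitude() readings to decibels. */
final class AmplitudeMath {
    // Assumption: readings stay within 0..32767, the 16-bit full-scale value.
    static final double FULL_SCALE = 32767.0;

    private AmplitudeMath() {}

    /**
     * Returns the level in dB relative to full scale:
     * 0 dB is the loudest possible reading, quieter readings are negative.
     * A reading of 0 (silence, or the first call after start) maps to negative infinity.
     */
    static double toDbfs(int maxAmplitude) {
        if (maxAmplitude <= 0) {
            return Double.NEGATIVE_INFINITY;
        }
        return 20.0 * Math.log10(maxAmplitude / FULL_SCALE);
    }
}
```

In an app, the reading would come from myAudioRecorder.getMaxAmplitude(), for example Log.d("Audio", "level = " + AmplitudeMath.toDbfs(myAudioRecorder.getMaxAmplitude())).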
Android Audio Capture
MediaRecorder API reference: https://developer.android.com/reference/android/media/MediaRecorder.html
Audio Capture
Create a new app and name it AudioCapture.
Add the following permissions to the manifest:
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
Audio Capture
Modify the activity_main.xml layout:
    <?xml version="1.0" encoding="utf-8"?>
    <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:paddingBottom="@dimen/activity_vertical_margin"
        android:paddingLeft="@dimen/activity_horizontal_margin"
        android:paddingRight="@dimen/activity_horizontal_margin"
        android:paddingTop="@dimen/activity_vertical_margin">
        <ImageView
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:id="@+id/imageView"
            android:layout_alignParentTop="true"
            android:layout_centerHorizontal="true"
            android:src="@drawable/csu" />
        <Button
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Record"
            android:id="@+id/button"
            android:layout_below="@+id/imageView"
            android:layout_alignParentLeft="true"
            android:layout_marginTop="37dp" />
        <Button
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="STOP"
            android:id="@+id/button2"
            android:layout_alignTop="@+id/button"
            android:layout_centerHorizontal="true" />
        <Button
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="Play"
            android:id="@+id/button3"
            android:layout_alignTop="@+id/button2"
            android:layout_alignParentRight="true"
            android:layout_alignParentEnd="true" />
        <Button
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:text="STOP PLAYING RECORDING"
            android:id="@+id/button4"
            android:layout_below="@+id/button2"
            android:layout_centerHorizontal="true"
            android:layout_marginTop="10dp" />
    </RelativeLayout>
Audio Capture
Modify MainActivity.java (download the entire Java file from the course webpage):
    buttonStart.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View view) {
            if (checkPermission()) {
                // Build a unique output path on external storage.
                AudioSavePathInDevice = Environment.getExternalStorageDirectory().getAbsolutePath()
                        + "/" + CreateRandomAudioFileName(5) + "AudioRecording.3gp";
                MediaRecorderReady();
                try {
                    mediaRecorder.prepare();
                    mediaRecorder.start();
                } catch (IllegalStateException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
                buttonStart.setEnabled(false);
                buttonStop.setEnabled(true);
                Toast.makeText(MainActivity.this, "Recording started", Toast.LENGTH_LONG).show();
            } else {
                requestPermission();
            }
        }
    });
    // Configures a new MediaRecorder: microphone source, 3GP container, AMR-NB encoder.
    public void MediaRecorderReady() {
        mediaRecorder = new MediaRecorder();
        mediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);
        mediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
        mediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
        mediaRecorder.setOutputFile(AudioSavePathInDevice);
        // If you only want to analyze the audio without saving it:
        // mediaRecorder.setOutputFile("/dev/null");
    }

    // Returns a random string of the given length, drawn from RandomAudioFileName.
    public String CreateRandomAudioFileName(int length) {
        StringBuilder stringBuilder = new StringBuilder(length);
        int i = 0;
        while (i < length) {
            stringBuilder.append(RandomAudioFileName.charAt(random.nextInt(RandomAudioFileName.length())));
            i++;
        }
        return stringBuilder.toString();
    }

    // Asks the user for the storage and microphone permissions at runtime.
    private void requestPermission() {
        ActivityCompat.requestPermissions(MainActivity.this,
                new String[]{WRITE_EXTERNAL_STORAGE, RECORD_AUDIO}, RequestPermissionCode);
    }
    @Override
    public void onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) {
        switch (requestCode) {
            case RequestPermissionCode:
                // Two permissions were requested, so two results are expected.
                if (grantResults.length >= 2) {
                    boolean StoragePermission = grantResults[0] == PackageManager.PERMISSION_GRANTED;
                    boolean RecordPermission = grantResults[1] == PackageManager.PERMISSION_GRANTED;
                    if (StoragePermission && RecordPermission) {
                        Toast.makeText(MainActivity.this, "Permission Granted", Toast.LENGTH_LONG).show();
                    } else {
                        Toast.makeText(MainActivity.this, "Permission Denied", Toast.LENGTH_LONG).show();
                    }
                }
                break;
        }
    }

    // Returns true only if both the storage and microphone permissions are already granted.
    public boolean checkPermission() {
        int result = ContextCompat.checkSelfPermission(getApplicationContext(), WRITE_EXTERNAL_STORAGE);
        int result1 = ContextCompat.checkSelfPermission(getApplicationContext(), RECORD_AUDIO);
        return result == PackageManager.PERMISSION_GRANTED && result1 == PackageManager.PERMISSION_GRANTED;
    }
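The grant check in onRequestPermissionsResult can be pulled out into a small helper that does not depend on how many permissions were requested. The sketch below is plain Java so it can be run outside Android. The class name PermissionResults and the method allGranted are hypothetical, and the constant mirrors PackageManager.PERMISSION_GRANTED, whose value is 0.

```java
/** Illustrative helper for checking the grantResults array of a permission callback. */
final class PermissionResults {
    // Mirrors PackageManager.PERMISSION_GRANTED (0) so the sketch runs without Android.
    static final int PERMISSION_GRANTED = 0;

    private PermissionResults() {}

    /**
     * Returns true only if at least expectedCount results were delivered
     * and every one of them is PERMISSION_GRANTED.
     * An empty array (for example, a cancelled request) counts as not granted.
     */
    static boolean allGranted(int[] grantResults, int expectedCount) {
        if (grantResults == null || grantResults.length < expectedCount) {
            return false;
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }
}
```

Inside the activity, the callback could then use PermissionResults.allGranted(grantResults, 2) in place of indexing grantResults[0] and grantResults[1] directly.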
Homework
Change the basic app to graphically display the maximum amplitude of the audio while recording. The new app should let the user set the sampling period of the amplitude. The display can be a line chart or a bar chart, with time on the horizontal axis and amplitude on the vertical axis.
Reference: https://developer.xamarin.com/api/property/Android.Media.MediaRecorder.MaxAmplitude/
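One possible starting point for the data side of the homework is a fixed-size window of recent amplitude readings that a chart view can draw from. This is only a sketch under stated assumptions: the class AmplitudeHistory and its methods are hypothetical, the full-scale value of 32767 assumes 16-bit audio, and the drawing and the sampling timer are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative sliding window of amplitude readings for plotting against time. */
final class AmplitudeHistory {
    // Assumption: readings stay within 0..32767, the 16-bit full-scale value.
    private static final float FULL_SCALE = 32767f;

    private final int capacity;
    private final Deque<Integer> samples = new ArrayDeque<>();

    AmplitudeHistory(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds one reading, discarding the oldest reading when the window is full. */
    void add(int maxAmplitude) {
        if (samples.size() == capacity) {
            samples.removeFirst();
        }
        samples.addLast(Math.max(0, maxAmplitude));
    }

    /** Returns the readings oldest-first, scaled to 0..1 for use as bar or line heights. */
    float[] normalized() {
        float[] heights = new float[samples.size()];
        int i = 0;
        for (int sample : samples) {
            heights[i++] = Math.min(1f, sample / FULL_SCALE);
        }
        return heights;
    }
}
```

In the app, a repeating task (for example, one scheduled with Handler.postDelayed at the user-selected period) could call add(mediaRecorder.getMaxAmplitude()) and then redraw the chart from normalized().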
Android Speech Recognition
With Android speech recognition, the speech input is streamed to a server, the server converts the voice to text, and the text is sent back to the app. An Internet connection is required for this server-based recognition.
Speech Recognition
Create a new app and name it MySpeechRecognizer.
Add the following permissions to the manifest:
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
    <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
Speech Recognition
Modify the activity_main.xml layout:
    <?xml version="1.0" encoding="utf-8"?>
    <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        android:orientation="vertical" >
        <ProgressBar
            android:id="@+id/progressBar1"
            style="?android:attr/progressBarStyleHorizontal"
            android:layout_width="match_parent"
            android:layout_height="wrap_content"
            android:layout_alignParentLeft="true"
            android:layout_below="@+id/toggleButton1"
            android:layout_marginTop="28dp"
            android:paddingLeft="10dp"
            android:paddingRight="10dp" />
        <TextView
            android:id="@+id/textView1"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_below="@+id/progressBar1"
            android:layout_centerHorizontal="true"
            android:layout_marginTop="47dp" />
        <ToggleButton
            android:id="@+id/toggleButton1"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_alignParentTop="true"
            android:layout_centerHorizontal="true"
            android:layout_marginTop="26dp"
            android:text="ToggleButton" />
    </RelativeLayout>
Speech Recognition
Modify MainActivity.java (download the entire Java file from the course webpage):
    public class MainActivity extends AppCompatActivity implements RecognitionListener {
        private static final int REQUEST_RECORD_PERMISSION = 100;
        private TextView returnedText;
        private ToggleButton toggleButton;
        private ProgressBar progressBar;
        private SpeechRecognizer speech = null;
        private Intent recognizerIntent;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);
            returnedText = (TextView) findViewById(R.id.textView1);
            progressBar = (ProgressBar) findViewById(R.id.progressBar1);
            toggleButton = (ToggleButton) findViewById(R.id.toggleButton1);
            ....
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            ....
            progressBar.setVisibility(View.VISIBLE);
            // Create the recognizer and register this activity as its listener.
            speech = SpeechRecognizer.createSpeechRecognizer(this);
            speech.setRecognitionListener(this);
            // Describe the recognition request: English, free-form speech, up to 3 results.
            recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_PREFERENCE, "en");
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 3);
            toggleButton.setChecked(true);
            speech.startListening(recognizerIntent);
            // The toggle button starts and stops listening.
            toggleButton.setOnCheckedChangeListener(new CompoundButton.OnCheckedChangeListener() {
                @Override
                public void onCheckedChanged(CompoundButton buttonView, boolean isChecked) {
                    if (isChecked) {
                        progressBar.setVisibility(View.VISIBLE);
                        progressBar.setIndeterminate(true);
                        speech.startListening(recognizerIntent);
                    } else {
                        progressBar.setIndeterminate(false);
                        progressBar.setVisibility(View.INVISIBLE);
                        speech.stopListening();
                    }
                }
            });
        }
        @Override
        public void onBeginningOfSpeech() {
            Log.i(LOG_TAG, "onBeginningOfSpeech");
            progressBar.setIndeterminate(false);
            progressBar.setMax(10);
        }

        @Override
        public void onBufferReceived(byte[] buffer) {
            Log.i(LOG_TAG, "onBufferReceived: " + buffer);
        }

        @Override
        public void onEndOfSpeech() {
            Log.i(LOG_TAG, "onEndOfSpeech");
        }

        @Override
        public void onError(int errorCode) {
            String errorMessage = getErrorText(errorCode);
            Log.d(LOG_TAG, "FAILED " + errorMessage);
            // Restart listening after an error.
            speech.cancel();
            speech.startListening(recognizerIntent);
        }

        @Override
        public void onEvent(int arg0, Bundle arg1) {
            Log.i(LOG_TAG, "onEvent");
        }

        @Override
        public void onPartialResults(Bundle arg0) {
            Log.i(LOG_TAG, "onPartialResults");
        }

        @Override
        public void onReadyForSpeech(Bundle arg0) {
            Log.i(LOG_TAG, "onReadyForSpeech");
            progressBar.setVisibility(View.VISIBLE);
        }

        @Override
        public void onResults(Bundle results) {
            Log.i(LOG_TAG, "onResults");
            ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            // Confidence scores are optional and may be missing.
            float[] confs = results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
            StringBuilder text = new StringBuilder();
            if (matches != null) {
                for (int i = 0; i < matches.size(); i++) {
                    if (confs == null || i >= confs.length) {
                        // No score available: show the result as-is.
                        text.append(matches.get(i)).append("\n");
                    } else if (confs[i] > 0.2) {
                        // Include text only if the confidence is above 20%.
                        text.append(confs[i]).append(": ").append(matches.get(i)).append("\n");
                    }
                }
            }
            returnedText.setText(text.toString());
            // Restart listening for the next utterance.
            speech.cancel();
            progressBar.setVisibility(View.INVISIBLE);
            speech.startListening(recognizerIntent);
        }

        @Override
        public void onRmsChanged(float rmsdB) {
            // Show the current sound level on the progress bar.
            progressBar.setProgress((int) rmsdB);
        }
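The confidence filtering done in onResults can be sketched as a standalone helper, which makes the threshold logic easy to try outside an Android project. The class name RecognitionResults and the method format are hypothetical. The sketch assumes that the confidence array, when present, lines up index-by-index with the list of matches, and it keeps a match unfiltered when no score is available for it.

```java
import java.util.List;

/** Illustrative helper: formats recognition matches, keeping only confident ones. */
final class RecognitionResults {
    private RecognitionResults() {}

    /**
     * Builds one line per accepted match in the form "confidence: text".
     * A match is accepted when its confidence is above the threshold.
     * When no score exists for a match, the match is kept and printed without a score.
     */
    static String format(List<String> matches, float[] confidences, float threshold) {
        if (matches == null) {
            return "";
        }
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < matches.size(); i++) {
            String match = matches.get(i);
            boolean hasScore = confidences != null && i < confidences.length;
            if (!hasScore) {
                text.append(match).append('\n');
            } else if (confidences[i] > threshold) {
                text.append(confidences[i]).append(": ").append(match).append('\n');
            }
        }
        return text.toString();
    }
}
```

With the slide's 20% cutoff, onResults could call returnedText.setText(RecognitionResults.format(matches, confs, 0.2f)).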
Offline Speech Recognition
Speech recognition can apparently also be done offline: https://stackoverflow.com/questions/17616994/offline-speech-recognition-in-android-jellybean
Following the steps in that answer, the user can select any desired languages to download. When the download is done, the user should disconnect from the network and then tap the "microphone" button on the keyboard.