Published by Scarlett Hancock. Modified over 9 years ago.
2
Language to Language Translation: A Way to Homogeneous India
Team effort of: Anasree Chatterjee & Diwa Arunashree
Mentor: Prof. K.T.Talele
3
Why the system?
What is language?
Need for proper communication
Hazards of miscommunication
Hence the need for our system
Key users of our system
4
Our system overview: input in English or Hindi, output in any of 8 languages. Enjoy the words!
5
Speak in one language & listen in another language in just 3 steps!
Input: speech in English or Hindi. Output: speech in 8 different languages.
1. Speech to Text: English or Hindi speech to English or Hindi text, e.g. English
2. Text to Text: English text to text of the selected output language, e.g. Bengali
3. Text to Speech: Bengali text to Bengali speech
6
Speech to Text Architecture!
Voice Input -> Analog to Digital -> Feature Extraction -> Speech Engine/Decoder -> Store Word in a File
The Speech Engine/Decoder draws on: Acoustic Model, Language Model, Phonetic Lexicon
7
1. Voice Input
2. Analog to Digital
3. Feature Extraction, Noise Filtering
8
Speech to Text Architecture: Acoustic Model
9
ACOUSTIC MODEL (Components of ASR, contd.)
Statistical representations of the sounds that make up each word
Uses: audio recording, text transcription, software
Model: Hidden Markov Model (HMM)
Tool: CMU SphinxTrain
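As a rough illustration of how an HMM assigns a probability to a sequence of acoustic observations, here is a minimal sketch of the forward algorithm. The two states, the symbols "x" and "y", and every probability below are made-up toy values, not parameters of any CMU Sphinx model.

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) under a discrete HMM (forward algorithm)."""
    # alpha[s]: probability of the observations seen so far, ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


# Toy model (assumed values): two phone-like states emitting two symbols.
STATES = ("s1", "s2")
START = {"s1": 0.6, "s2": 0.4}
TRANS = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
EMIT = {"s1": {"x": 0.9, "y": 0.1}, "s2": {"x": 0.2, "y": 0.8}}

likelihood = forward(["x", "y", "y"], STATES, START, TRANS, EMIT)
```

A real acoustic model works on continuous feature vectors rather than two discrete symbols, but the recursion over states has the same shape.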
10
Speech to Text Architecture: Phonetic Lexicon
11
PHONETIC LEXICON (Components of ASR, contd.)
Phonetic representation of every word in the vocabulary
Contains words + their phonetic representation
Phoneme: basic unit of speech sound
Valid words from output of acoustic model
Hindi: Itrans-3. English: phonetics
Phonetizer
12
Hindi Speech -> Sound Wave -> Phoneme (Itrans-3) -> Hindi Word (Hindi.dic: phoneme to Hindi word) -> IT3 to UTF8 -> Hindi Script / UTF8
In:d:iyaa: इंडिया / ইংডিযা / ఇండియా / ഇംഡിയാ / இடியா
Paanii: पानी / পানী / పానీ / പാനീ / பானீ
13
In:d:iyaa: इंडिया / ইংডিযা / ఇండియా / ഇംഡിയാ / இடியா
Paanii: पानी / পানী / పానీ / പാനീ / பானீ
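The IT3 to UTF8 step can be pictured as a longest-match table lookup from romanized tokens to Unicode characters. The sketch below is only a toy: its four-entry table is an assumption that covers just the word "Paanii" and is not the real Itrans-3 mapping. It also ignores rules such as independent vowels and the inherent "a".

```python
# Hypothetical mini-table: romanized token -> Devanagari code point.
TOY_TABLE = {
    "p": "\u092a",   # consonant PA
    "n": "\u0928",   # consonant NA
    "aa": "\u093e",  # vowel sign AA
    "ii": "\u0940",  # vowel sign II
}


def transliterate(word, table=TOY_TABLE):
    """Greedy longest-match conversion of a romanized word to UTF-8 text."""
    word = word.lower()
    keys = sorted(table, key=len, reverse=True)  # try longer tokens first
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            raise ValueError(f"no mapping for {word[i:]!r}")
    return "".join(out)


hindi_word = transliterate("Paanii")
```

Trying the longest tokens first matters because "aa" must be read as one long vowel rather than as two separate letters.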
14
English Speech -> Sound Wave -> Phoneme -> Word (Cmu07.dic: phoneme to word) -> English word
Tools: PocketSphinx, SphinxTrain
15
Speech to Text Architecture: Language Model
16
LANGUAGE MODEL (Components of ASR, contd.)
A statistical language model assigns a probability to a sequence of m words by means of a probability distribution.
Captures the underlying grammatical structure of the language.
Use: restrict the word search
Most common language model: n-gram LM
Tool: CMUCLMTK
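To make "assigns a probability to a sequence of m words" concrete, here is a minimal bigram (n = 2) sketch using plain maximum-likelihood estimates. The three-sentence corpus and the `<s>` / `</s>` boundary markers are illustrative assumptions. A toolkit such as CMUCLMTK additionally applies smoothing and back-off, which this sketch omits.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenized sentences."""
    histories, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        histories.update(words[:-1])            # every word that has a successor
        bigrams.update(zip(words, words[1:]))
    return histories, bigrams


def sentence_probability(sentence, histories, bigrams):
    """P(w1..wm) approximated as the product of P(w_i | w_{i-1})."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        if histories[prev] == 0:
            return 0.0                           # unseen history, no smoothing
        prob *= bigrams[(prev, word)] / histories[prev]
    return prob


CORPUS = ["india is great", "india is home", "water is life"]  # toy corpus
HIST, BI = train_bigram(CORPUS)
p_seen = sentence_probability("india is great", HIST, BI)
p_unseen = sentence_probability("great is india", HIST, BI)
```

Word orders that never occur in the corpus get probability zero here, which is how the language model restricts the decoder's word search.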
17
Steps of Language Model (CMU Cam LM Toolkit):
Create CORPUS.TXT -> Word frequencies -> Vocabulary file -> N-gram file -> Language Model in .ARPA format (CORPUS.ARPA)
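The intermediate artifacts in these steps can be mimicked in a few lines. This is a sketch of the idea only, not the toolkit's own code or file formats: the `min_count` cutoff, the `<UNK>` token and the two-line corpus are assumptions. The final step, estimating probabilities and writing the .ARPA file, is left to the toolkit.

```python
from collections import Counter


def word_frequencies(corpus_text):
    """Step 1: count how often each word occurs in the corpus."""
    return Counter(corpus_text.split())


def vocabulary(word_freq, min_count=1):
    """Step 2: keep words seen at least min_count times, sorted."""
    return sorted(w for w, c in word_freq.items() if c >= min_count)


def ngram_counts(corpus_text, vocab, n=2, unk="<UNK>"):
    """Step 3: count n-grams per line, mapping out-of-vocabulary words to unk."""
    known = set(vocab)
    counts = Counter()
    for line in corpus_text.splitlines():
        words = [w if w in known else unk for w in line.split()]
        counts.update(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts


corpus = "paanii is water\nindia is home"   # stand-in for CORPUS.TXT
freq = word_frequencies(corpus)
vocab = vocabulary(freq, min_count=2)
bigrams = ngram_counts(corpus, vocab, n=2)
```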
18
.ARPA File
19
Speech to Text Architecture: Speech Engine
20
SPEECH ENGINE / DECODER (Components of ASR, contd.)
Compares input speech data with the acoustic models
Uses a modified version of the DTW algorithm
Aspects of speech decoding: determine which part of the signal is speech and filter out silence durations
Tool: CMU Sphinx PocketSphinx
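For reference, the textbook form of dynamic time warping (DTW) is sketched below. It is the classic unmodified algorithm, not the modified version mentioned on the slide and not PocketSphinx code. The one-dimensional "features" and the two templates are assumed toy data standing in for real feature vectors.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two sequences of numbers."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def closest_template(features, templates):
    """Return the label whose template has the lowest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


TEMPLATES = {"india": [1, 3, 2], "paanii": [5, 5, 1]}   # toy templates
best = closest_template([1, 1, 3, 2], TEMPLATES)
```

Because DTW lets one sequence stretch against the other, the slower utterance `[1, 1, 3, 2]` still aligns with the `[1, 3, 2]` template at zero cost.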
21
Samples of PocketSphinx acting as a Decoder....
22
Text to Text Architecture
FIND: retrieve the stored word from file, e.g. India
Database
RETRIEVE: script of the word in the selected language, e.g. इंडिया / ইংডিযা / ఇండియా / ഇംഡിയാ / இடியா
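The find-and-retrieve step amounts to a keyed lookup. In the sketch below an in-memory dictionary stands in for the database. The five spellings are the ones shown on the slide, while the language labels attached to them are assumptions based on the script of each spelling.

```python
# Stand-in for the database: recognized word -> spelling per output language.
TRANSLATIONS = {
    "india": {
        "hindi": "इंडिया",
        "bengali": "ইংডিযা",
        "telugu": "ఇండియా",
        "malayalam": "ഇംഡിയാ",
        "tamil": "இடியா",
    },
}


def translate(word, language, table=TRANSLATIONS):
    """Return the stored spelling of word in language, or None if absent."""
    entry = table.get(word.strip().lower())
    if entry is None:
        return None
    return entry.get(language.lower())


hindi_text = translate("India", "Hindi")
```

Returning `None` for a missing word or language lets the caller decide how to handle vocabulary the database does not cover.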
23
Use & Creation of Database!
24
Text to Speech Architecture!
Input Text in UTF8 Encodings -> Text Parser -> Text to Phonetic Script Conversion -> Phonetic Synthesizer -> Speech Synthesizer
Text to Phonetic Script Conversion uses: Grapheme to Phoneme Rules
Speech Synthesizer uses: CV Pair Algorithm, Sound Concatenation, Speech Sound Database
25
Grapheme to Phoneme Conversion!
Phonetic description is syllable based. 8 kinds of sounds allowed (C: consonant, V: vowel, H: half sound):
V: a plain vowel
CV: a consonant followed by a vowel
VC: a vowel followed by a consonant
CVC: a consonant followed by a vowel followed by a consonant
HCV: a half consonant, followed by a CV
HCVC: a half consonant, followed by a CVC
0C: a consonant alone
G[0-9]*: a silence gap of the specified length (typical gaps
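The eight unit kinds listed above can be checked with a single regular expression. This is a sketch, assuming that unit labels arrive as plain strings such as "CV" or "G300". The slide does not say what unit the gap length is measured in, so the number is returned as-is.

```python
import re

# One alternative per unit kind from the list above; gap digits are optional.
UNIT_RE = re.compile(r"V|CV|VC|CVC|HCV|HCVC|0C|G(?P<gap>[0-9]*)")


def classify_unit(label):
    """Return ('sound', kind), ('gap', length or None), or None if invalid."""
    match = UNIT_RE.fullmatch(label)
    if match is None:
        return None
    if label.startswith("G"):
        digits = match.group("gap")
        return ("gap", int(digits) if digits else None)
    return ("sound", label)


units = [classify_unit(u) for u in ["CV", "HCVC", "G300", "CCV"]]
```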
26
CONSONANTS :- VOWELS :- Consonants & Vowels !
28
Text to Phonetic Script!
Unicode text -> common script
common script -> Speech Synthesizer
Examples
30
Sound Database!
Sound files are GSM compressed, i.e. ".gsm" format
Total size of db: 1 MB
Sound units stored in the database are:
CV pairs: 1..33 * 2 4 6 8 9 10 12 13 14 15
VC pairs: 2 4 6 8 9 10 12 13 14 15 * 1..34
V: 1..14
C: 1..34
Halfs: ky kr kl kll kv ksh khy khr khl khv gy gr gl gv gn ghy ghr ghv ghn chy chr chv jy jv ty tr tv thy thr dy dr dv dhy dhr dhv ny nr nv tty ttr ttv ddy ddr ddv py pr pl pll fr fl by br bl bhy bhr bhl my mr vy vr vl
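The inventory above can be tallied with a short script. It is a sketch that assumes "1..33 * 2 4 6 ..." means every listed consonant number paired with every listed vowel number; the slide does not spell that out.

```python
VOWEL_NUMBERS = [2, 4, 6, 8, 9, 10, 12, 13, 14, 15]
CV_CONSONANTS = range(1, 34)      # 1..33
VC_CONSONANTS = range(1, 35)      # 1..34
PLAIN_VOWELS = range(1, 15)       # 1..14
PLAIN_CONSONANTS = range(1, 35)   # 1..34
HALFS = (
    "ky kr kl kll kv ksh khy khr khl khv gy gr gl gv gn ghy ghr ghv ghn "
    "chy chr chv jy jv ty tr tv thy thr dy dr dv dhy dhr dhv ny nr nv "
    "tty ttr ttv ddy ddr ddv py pr pl pll fr fl by br bl bhy bhr bhl "
    "my mr vy vr vl"
).split()


def inventory_sizes():
    """Number of sound units per kind, under the pairing assumption above."""
    return {
        "CV": len(CV_CONSONANTS) * len(VOWEL_NUMBERS),
        "VC": len(VOWEL_NUMBERS) * len(VC_CONSONANTS),
        "V": len(PLAIN_VOWELS),
        "C": len(PLAIN_CONSONANTS),
        "Halfs": len(HALFS),
    }


sizes = inventory_sizes()
total_units = sum(sizes.values())
```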
31
Sound Concatenation
CV files: named x.y.gsm (x: consonant number, y: vowel number)
VC files: named x.y.gsm (x: vowel number, y: consonant number)
V files: named x.gsm (x: vowel number)
Halfs files: named x.y.gsm (2 consonants)
0C files: named x.gsm (x: consonant number)
4 more files: cvoffsets, vcoffsets, hoffsets, voffsets
33
Extended modules: S2T, T2S, T2T, File Reader, S2T Reporter
Constraints: Training is tedious. 2 input languages. Phone generation for all Indian languages is difficult.
Future scope: Can be trained for all Indian languages. Increase accuracy. Better quality of the text to speech synthesizer modules. A larger dictionary, approx. 2000-3000 words.
35
BOL INDIA BOL PRIVATE LIMITED
Masters of Computer Application, Sardar Patel Institute Of Technology, Andheri (West), Mumbai-58
Anasree Chatterjee (Director), Diwa Arunashree (Director), Prof. K.T.Talele (Joint Director), Shivani Nadkarni (Joint Director), Aditya Naravane (Joint Director)
"Language to Language Translator: A Way To Homogeneous India"
Languator: especially designed for the 3 Ts, that is Travelers, Tourists and, equally, the people who are victims of Transferable jobs. It will also serve, to a certain extent, the needs of S2T Reporters.