Language to Language Translation: A Way to Homogeneous India. Team effort of: Anasree Chatterjee & Diwa Arunashree. Mentor: Prof. K.T.Talele
Why the system? What is language? Need for proper communication. Hazards of miscommunication. Hence the need for our system. Key users of our system.
Our system overview.... Input: English or Hindi. Output: any of 8 languages. Enjoy the words!
Speak in one language & listen in another language in just 3 steps! Input: speech in English or Hindi. Output: speech in 8 different languages.
1. Speech to Text: English or Hindi speech to English or Hindi text, e.g. English
2. Text to Text: English text to text of the selected output language, e.g. Bengali
3. Text to Speech: Bengali text to Bengali speech
Speech to Text Architecture! Voice Input, Analog to Digital, Feature Extraction, Acoustic Model, Language Model, Phonetic Lexicon, Speech Engine/Decoder, Store Word in a File.
1. Voice Input 2. Analog to Digital 3. Feature Extraction, Noise Filtering
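The analog-to-digital step can be pictured with a minimal sketch: sample a continuous signal at a fixed rate and quantize each sample to an integer. The 16 kHz rate, the 16-bit depth and the name sample_and_quantize are assumptions made for illustration; the slides do not state the values used by the system.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz=16000, bits=16):
    """Sample signal(t) (expected range -1.0..1.0) and quantize to signed integers.

    sample_rate_hz and bits are illustrative defaults, not values from the slides.
    """
    max_level = 2 ** (bits - 1) - 1          # 32767 for 16-bit audio
    n_samples = int(round(duration_s * sample_rate_hz))
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz               # time of the i-th sample in seconds
        x = max(-1.0, min(1.0, signal(t)))   # clip to the valid amplitude range
        samples.append(int(round(x * max_level)))
    return samples

def tone(t):
    """A 440 Hz test tone standing in for the microphone signal."""
    return math.sin(2 * math.pi * 440 * t)

pcm = sample_and_quantize(tone, 0.01)        # 10 ms of audio
```

Feature extraction and noise filtering would then operate on the integer samples in pcm rather than on the analog waveform.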
Speech to Text Architecture! Highlighted component: Acoustic Model
ACOUSTIC MODEL: Statistical representations of the sounds that make up each word, created by software from audio recordings and their text transcriptions. Uses: Hidden Markov Model (HMM). Tool: CMU SphinxTrain. Components of ASR contd....
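To make the HMM idea concrete, here is a minimal sketch of the forward algorithm, which scores how likely an observation sequence is under a given model. The two-state model and its probabilities (START, TRANS, EMIT) are made-up toy values, not parameters produced by SphinxTrain, and real acoustic models emit continuous feature vectors rather than the symbols "a" and "b".

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM (non-empty sequence).

    start[s]     : probability of starting in state s
    trans[p][s]  : probability of moving from state p to state s
    emit[s][o]   : probability that state s emits symbol o
    """
    n_states = len(start)
    # Initialise with the first observation.
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    # Propagate through the remaining observations.
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return sum(alpha)

# Toy left-to-right model with two states (illustrative values only).
START = [1.0, 0.0]
TRANS = [[0.5, 0.5],
         [0.0, 1.0]]
EMIT = [{"a": 0.9, "b": 0.1},
        {"a": 0.2, "b": 0.8}]

likelihood = forward_likelihood(START, TRANS, EMIT, ["a", "b"])
```

A recognizer compares such likelihoods across candidate sound models and prefers the model that explains the observed features best.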
Speech to Text Architecture! Highlighted component: Phonetic Lexicon
PHONETIC LEXICON: Phonetic representation of every word in the vocabulary. Valid words from the output of the acoustic model. Phoneme: the basic unit of sound. Contains words + their phonetic representation. Hindi: Itrans-3; English: phonetics. Phonetizer. Components of ASR contd....
Hindi: Hindi Speech, Sound Wave, Phoneme (Itrans-3), Hindi.dic, Hindi Word, IT3 to UTF8, Hindi Script / UTF8. Examples:
In:d:iyaa इंडिया / ইংডিযা / ఇండియా / ഇംഡിയാ / இடியா
Paanii पानी / পানী / పానీ / പാനീ / பானீ
English: English Speech, Sound Wave, Phoneme, Cmu07.dic, English Word. Tools: PocketSphinx, SphinxTrain.
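A phonetic lexicon of this kind is essentially a word-to-phoneme table. The sketch below parses lines in the common Sphinx dictionary layout (a word followed by its phonemes, with alternate pronunciations marked as word(2)). The sample entries in SAMPLE_DIC and the function name load_lexicon are illustrative assumptions, not lines copied from Cmu07.dic or Hindi.dic.

```python
def load_lexicon(lines):
    """Map each word to a list of pronunciations (each a list of phonemes)."""
    lexicon = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue                                  # skip blank lines
        word = parts[0].split("(")[0].lower()         # "the(2)" -> "the"
        phonemes = parts[1:]
        lexicon.setdefault(word, []).append(phonemes)
    return lexicon

# Illustrative entries in "WORD PH1 PH2 ..." form.
SAMPLE_DIC = [
    "india IH N D IY AH",
    "the DH AH",
    "the(2) DH IY",
]

lexicon = load_lexicon(SAMPLE_DIC)
```

The decoder only proposes words that appear in this table, which is why every word in the vocabulary needs an entry.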
Speech to Text Architecture! Highlighted component: Language Model
LANGUAGE MODEL: A statistical language model assigns a probability to a sequence of m words by means of a probability distribution. Captures the underlying grammatical structure of the language. Use: restrict word search. Most common language models: n-gram LM. Tool: CMUCLMTK. Components of ASR contd....
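Written out, the probability of a sequence of m words factors by the chain rule, and an n-gram model approximates each factor by conditioning only on the previous n-1 words. This is the standard textbook formulation, given here for reference rather than taken from the slides:

```latex
P(w_1, \dots, w_m)
  = \prod_{i=1}^{m} P(w_i \mid w_1, \dots, w_{i-1})
  \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \dots, w_{i-1})
```

For a bigram model (n = 2), each word depends only on the word immediately before it.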
Steps of Language Model: create CORPUS.TXT (corpus), word frequencies, vocabulary file, N-gram file, language model in .ARPA format (CORPUS.ARPA). Tool: CMU-Cam LM Toolkit.
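The steps listed here can be mimicked in a few lines of Python to show what each intermediate file contains. This is a simplified sketch, not the toolkit itself: it stops at maximum-likelihood bigram probabilities and does not write the smoothed, log-scaled entries of a real .ARPA file. The two-sentence corpus and all function names are assumptions for illustration.

```python
from collections import Counter

def word_frequencies(sentences):
    """Step 1: count how often each word occurs in the corpus."""
    return Counter(word for s in sentences for word in s.split())

def vocabulary(freqs, min_count=1):
    """Step 2: keep the words that occur at least min_count times."""
    return sorted(w for w, c in freqs.items() if c >= min_count)

def bigram_counts(sentences):
    """Step 3: count adjacent word pairs, with sentence boundary markers."""
    counts = Counter()
    for s in sentences:
        words = ["<s>"] + s.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts

def bigram_probability(counts, prev, word):
    """Step 4: unsmoothed estimate of P(word | prev)."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0

corpus = ["india is great", "india is big"]   # stand-in for CORPUS.TXT
freqs = word_frequencies(corpus)
vocab = vocabulary(freqs)
bigrams = bigram_counts(corpus)
p_great_given_is = bigram_probability(bigrams, "is", "great")
```

Word pairs that never occur in the corpus get probability 0.0 here; a real toolkit applies smoothing so that unseen pairs are not ruled out entirely.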
.ARPA File
Speech to Text Architecture! Highlighted component: Speech Engine
SPEECH ENGINE / DECODER: Compares input speech data with the acoustic models. Uses: a modified version of the DTW algorithm. Aspects of speech decoding: determine which part of the signal is speech and filter out silence durations. Tool: CMU Sphinx (PocketSphinx). Components of ASR contd....
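For readers unfamiliar with DTW (dynamic time warping), the sketch below shows the plain textbook algorithm on one-dimensional sequences. It is not the modified version the slide refers to, whose details are not given, and real decoders compare multi-dimensional feature vectors rather than single numbers. The function name dtw_distance is an assumption.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimum cumulative |a[i] - b[j]| over all monotonic alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

same_word_slower = dtw_distance([1, 2, 3], [1, 2, 2, 3])
different_word = dtw_distance([1, 2, 3], [2, 3, 4])
```

The first pair differs only in speaking rate, so the warped distance is zero; the second pair differs in content and gets a positive distance.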
Samples of PocketSphinx acting as a Decoder....
Text to Text Architecture: Retrieve the stored word from the file, e.g. India. FIND it in the database and RETRIEVE the script of the word in the selected language, e.g. इंडिया / ইংডিযা / ఇండియా / ഇംഡിയാ / இடியா
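The find-and-retrieve step amounts to a keyed lookup. A minimal sketch, assuming an in-memory table in place of the actual database: the strings are the ones shown on the slide, while the language keys (inferred from the scripts), the table layout and the function name find_and_retrieve are assumptions.

```python
# Recognized word -> script in each output language (strings as on the slide).
DATABASE = {
    "india": {
        "hindi": "इंडिया",
        "bengali": "ইংডিযা",
        "telugu": "ఇండియా",
        "malayalam": "ഇംഡിയാ",
        "tamil": "இடியா",
    },
}

def find_and_retrieve(word, language):
    """Return the word written in the selected language, or None if not stored."""
    entry = DATABASE.get(word.strip().lower())
    if entry is None:
        return None
    return entry.get(language.lower())

bengali_text = find_and_retrieve("India", "Bengali")
```

Returning None for a missing word or language lets the caller decide how to handle vocabulary gaps.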
Use & Creation of Database!
Text to Speech Architecture! Input Text in UTF8 Encodings, Phonetic Synthesizer, Text Parser, Text to Phonetic Script Conversion, Grapheme to Phoneme Rules, Speech Synthesizer, CV Pair Algorithm, Sound Concatenation, Speech Sound Database.
Grapheme to Phoneme Conversion! The phonetic description is syllable based. 8 kinds of sounds are allowed (C: consonant, V: vowel, H: half sound):
V: a plain vowel
CV: a consonant followed by a vowel
VC: a vowel followed by a consonant
CVC: a consonant followed by a vowel followed by a consonant
HCV: a half consonant, followed by a CV
HCVC: a half consonant, followed by a CVC
0C: a consonant alone
G[0-9]*: a silence gap of the specified length (typical gaps ...)
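The eight sound kinds can be checked mechanically. The sketch below validates a sequence of unit labels against the list; the space-separated input format and the names is_valid_unit and invalid_units are assumptions for illustration, not the synthesizer's actual interface.

```python
import re

# The seven fixed syllable shapes from the list; gaps are matched separately.
ALLOWED_SHAPES = {"V", "CV", "VC", "CVC", "HCV", "HCVC", "0C"}
GAP_PATTERN = re.compile(r"G[0-9]*")   # "G" followed by an optional length

def is_valid_unit(label):
    """True if label is one of the 8 allowed kinds of sound."""
    return label in ALLOWED_SHAPES or GAP_PATTERN.fullmatch(label) is not None

def invalid_units(description):
    """Return the labels in a space-separated description that are not allowed."""
    return [label for label in description.split() if not is_valid_unit(label)]

rejected = invalid_units("CV G2 HCVC VV 0C")
```

Here only "VV" is rejected, since two plain vowels in a row are not one of the listed shapes.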
CONSONANTS :- VOWELS :- Consonants & Vowels !
Text to Phonetic Script! Unicode text is converted to a common script, and the Speech Synthesizer takes the common script as input. Examples
Sound Database! Sound files are gsm compressed, i.e. in ".gsm" format. Total size of db: ... MB. Sound units stored in the database are: CV pairs: *, VC pairs: *, V:, C:, Halfs:
ky kr kl kll kv ksh khy khr khl khv gy gr gl gv gn ghy ghr ghv ghn chy chr chv jy jv ty tr tv thy thr dy dr dv dhy dhr dhv ny nr nv tty ttr ttv ddy ddr ddv py pr pl pll fr fl by br bl bhy bhr bhl my mr vy vr vl
Sound Concatenation
CV files: named x.y.gsm (x = consonant number, y = vowel number)
VC files: named x.y.gsm (x = vowel number, y = consonant number)
V files: named x.gsm (x = vowel number)
0C files: named x.gsm (x = consonant number)
Halfs files: named x.y.gsm (x, y = the 2 consonants)
4 more files: cvoffsets, vcoffsets, hoffsets, voffsets
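The naming scheme for the sound files can be captured in a small helper that builds a file name from the kind of unit and its number or numbers. The function name sound_file_name and the lower-case kind labels are hypothetical, and no directory layout is assumed because the slides do not describe one.

```python
TWO_PART_KINDS = {"cv", "vc", "halfs"}   # named x.y.gsm
ONE_PART_KINDS = {"v", "0c"}             # named x.gsm

def sound_file_name(kind, x, y=None):
    """Build the .gsm file name for a sound unit of the given kind."""
    if kind in TWO_PART_KINDS:
        if y is None:
            raise ValueError(f"{kind} files need two identifiers")
        return f"{x}.{y}.gsm"
    if kind in ONE_PART_KINDS:
        if y is not None:
            raise ValueError(f"{kind} files take a single identifier")
        return f"{x}.gsm"
    raise ValueError(f"unknown sound kind: {kind}")

# A word is synthesized by concatenating the files for its units in order.
playlist = [sound_file_name("cv", 3, 1), sound_file_name("0c", 7)]
```

The numbers used in playlist are arbitrary placeholders; the real consonant and vowel numbering comes from the consonant and vowel tables of the system.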
Constraints: Training is tedious. 2 input languages. Phone generation of all Indian languages is difficult. Future scope: Can be trained for all Indian languages. Increase accuracy. Better quality of the text to speech synthesizer modules. A larger dictionary (approx. ... words). Extended modules: S2T, T2S, T2T, File Reader, S2T Reporter.
BOL INDIA BOL PRIVATE LIMITED. Masters of Computer Application, Sardar Patel Institute of Technology, Andheri (West), Mumbai-58. Anasree Chatterjee (Director), Diwa Arunashree (Director), Prof. K.T.Talele (Joint Director), Shivani Nadkarni (Joint Director), Aditya Naravane (Joint Director). "Language to Language Translator: A Way to Homogeneous India". Languator: especially designed for the 3 Ts, that is Travelers, Tourists and, on par with them, the people who are victims of Transferable jobs. It will also serve, to a certain extent, the needs of S2T Reporters.