Korea Maritime and Ocean University NLP Jung Tae LEE

Presentation transcript:

1. Introduction of NETtalk
- NETtalk is one method for converting text to speech (TTS).
- It is an automated learning procedure for a parallel network of deterministic processing units.
- The conventional approach instead converts text by applying phonological rules and handling exceptions with a look-up table.
- After training, NETtalk achieves good performance and generalizes to novel words.

Characteristics of TTS in English
- English is among the most difficult languages to read aloud.
- Speech sounds have exceptions that are often context-sensitive.
  - EX) the "a" in almost all words ending in "ave", such as "brave" and "gave", is a long vowel, but not in "have"; some words even vary in pronunciation with their syntactic role.
- Such exceptions are the central problem for the conventional rule-based approach.

DECtalk: a commercial product
- DECtalk uses two methods for converting text to phonemes:
  1. A word is first looked up in a pronunciation dictionary of common words; if it is not found there, a set of phonological rules is applied. (Novel words missing from the dictionary are often not pronounced correctly.)
  2. The alternative approach is based on massively parallel network models: knowledge in these models is distributed over many processing units, and decisions are made by exchanging information between the processing units.

In this paper:
- A network learning algorithm with three layers of units is applied to text-to-speech conversion.
- NETtalk can be trained on any dialect of any language.
- The paper demonstrates that a relatively small network can capture most of the significant regularities in English pronunciation, as well as absorb many of the irregularities.

2. Network Architecture
Processing Unit
- The network is composed of processing units that non-linearly transform their summed, continuous-valued inputs.
- The connection strength, or weight, linking one unit to another can be a positive or negative real value.

Processing Unit (cont'd)
- The output of the i-th unit is determined by first summing all of its inputs,

    E_i = \sum_j w_{ij} s_j

  and then transforming the sum through a sigmoid function,

    s_i = P(E_i) = \frac{1}{1 + e^{-E_i}}

  where w_{ij} is the weight from the j-th unit to the i-th, representing either an excitatory (positive) or an inhibitory (negative) influence of the first unit on the output of the second unit.
- NETtalk is hierarchically arranged into three layers of units: an input layer, a hidden layer, and an output layer.
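A minimal sketch of this computation in Python (the function name and numbers are illustrative, not from the paper):

```python
import numpy as np

def unit_output(weights, inputs):
    """Output of one processing unit: the weighted input sum E_i,
    squashed through the sigmoid P(E) = 1 / (1 + e^(-E))."""
    E = np.dot(weights, inputs)        # E_i = sum_j w_ij * s_j
    return 1.0 / (1.0 + np.exp(-E))    # s_i = P(E_i)

s_i = unit_output(np.array([0.5, -1.2, 0.8]), np.array([1.0, 0.0, 1.0]))
print(s_i)  # ~0.786: net input 1.3 gives an output well above the 0.5 midpoint
```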

Representations of Letters and Phonemes
- There are seven groups of units in the input layer.
  - Each input group encodes one letter of the input text.
  - Seven letters are presented to the input units at any one time.
- There is one group of units in each of the other two layers.
  - The desired output of the network is the correct phoneme, or contrastive speech sound, associated with the center (fourth) letter.
  - The letters other than the center letter provide a partial context for this decision.
  - The text is stepped through the window letter by letter.
- At each step, the network computes a phoneme, and after each word the weights are adjusted according to how closely the computed pronunciation matches the correct one.

Representations of Letters and Phonemes (cont'd)
- The letters are represented locally: each input group has one unit per letter of the alphabet, plus an additional 3 units to encode punctuation and word boundaries.
- The phonemes are represented in terms of 23 articulatory features, such as point of articulation, voicing, vowel height, and so on.
- Three additional output units encode stress and syllable boundaries.
- The goal of the learning algorithm is to adjust the weights between the units in the network in order to make the hidden units good feature detectors.
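As a rough illustration of this coding, the sketch below one-hot encodes a seven-letter window using 7 groups of 29 units each, following the counts above; the exact symbol set and the blank padding are our assumptions:

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
EXTRAS = [" ", ",", "."]            # 3 extra units for punctuation and word boundaries
SYMBOLS = list(ALPHABET) + EXTRAS   # 26 + 3 = 29 units per letter group
WINDOW = 7                          # seven letter groups in the input layer

def encode_window(text, center):
    """One-hot encode the seven-letter window centered on index `center`
    into a 7 * 29 = 203-dimensional input vector (one active unit per group)."""
    vec = np.zeros(WINDOW * len(SYMBOLS))
    for k in range(WINDOW):
        pos = center + k - WINDOW // 2
        ch = text[pos] if 0 <= pos < len(text) else " "   # pad beyond the text with blanks
        vec[k * len(SYMBOLS) + SYMBOLS.index(ch)] = 1.0
    return vec

x = encode_window("brave cats", center=2)   # window centered on the 'a' of "brave"
print(x.shape, int(x.sum()))                # (203,) 7 -- one active unit per group
```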

Learning Algorithm
- Two texts were used to train the network:
  - Phonetic transcriptions from informal, continuous speech of a child.
  - A 20,012-word corpus from a dictionary.
- A subset of 1,000 words was chosen from this dictionary, taken from the Brown corpus of the most common words in English.
- Letters and phonemes were aligned one-to-one by inserting a silent "continuation" symbol, like this: "phone" - /f-on-/.
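To make the alignment and the sliding window concrete, here is a small sketch that yields one training pair per letter of a pre-aligned word; the blank padding at word boundaries is our assumption:

```python
def training_pairs(letters, phonemes, window=7):
    """Yield (seven-letter window, target phoneme) pairs from a pre-aligned
    word; '-' marks a silent letter, and words are padded with blanks."""
    assert len(letters) == len(phonemes)
    pad = " " * (window // 2)
    padded = pad + letters + pad
    for i, target in enumerate(phonemes):
        yield padded[i:i + window], target

for win, ph in training_pairs("phone", "f-on-"):
    print(repr(win), "->", ph)
# '   phon' -> f, '  phone' -> -, ' phone ' -> o, 'phone  ' -> n, 'hone   ' -> -
```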

Learning Algorithm (cont'd)
- [The slide showed the weight-update equations of the standard back-propagation algorithm used for training.]
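Since the slide's equations are not recoverable, the following is only a sketch of a standard back-propagation step for a three-layer sigmoid network consistent with the description above; the 80-unit hidden layer matches the size used in the original simulations, while the weight-initialization range and learning rate are our guesses:

```python
import numpy as np

def sigmoid(e):
    return 1.0 / (1.0 + np.exp(-e))

class ThreeLayerNet:
    """Three-layer sigmoid network trained with plain back-propagation."""

    def __init__(self, n_in=203, n_hid=80, n_out=26, lr=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # small random initial weights; the range and learning rate are guesses
        self.W1 = rng.uniform(-0.3, 0.3, (n_hid, n_in))
        self.W2 = rng.uniform(-0.3, 0.3, (n_out, n_hid))
        self.lr = lr

    def forward(self, x):
        h = sigmoid(self.W1 @ x)   # hidden-layer activations
        y = sigmoid(self.W2 @ h)   # output layer: 26 feature units
        return h, y

    def train_step(self, x, target):
        """One gradient-descent step on the squared error for one window."""
        h, y = self.forward(x)
        d_out = (target - y) * y * (1.0 - y)           # output error terms
        d_hid = (self.W2.T @ d_out) * h * (1.0 - h)    # back-propagated to hidden layer
        self.W2 += self.lr * np.outer(d_out, h)
        self.W1 += self.lr * np.outer(d_hid, x)
        return y

net = ThreeLayerNet()
y = net.train_step(np.zeros(203), np.zeros(26))  # one illustrative update
```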

3. Performance
- Two measures of performance were computed:
  - Best guess: the phoneme making the smallest angle with the output vector is taken as the network's answer.
  - Perfect match: the value of each articulatory feature must be within a margin of 0.1 of its correct value.
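A sketch of how the two measures could be computed; reading "smallest angle" as the arccosine of the cosine similarity, and assuming a table mapping each phoneme to its target feature vector:

```python
import numpy as np

def best_guess(output, phoneme_vectors):
    """Return the phoneme whose target feature vector makes the smallest
    angle with the network's output vector (angle via the cosine)."""
    def angle(a, b):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        return np.arccos(np.clip(cos, -1.0, 1.0))
    return min(phoneme_vectors, key=lambda p: angle(output, phoneme_vectors[p]))

def perfect_match(output, target, margin=0.1):
    """True only if every articulatory feature is within `margin` of its
    correct value."""
    return bool(np.all(np.abs(output - target) <= margin))
```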

Continuous Informal Speech
- Learning curve: after training on 50,000 words, perfect matches reached 55%.

Continuous Informal Speech (cont'd)
- Examples of raw output from the simulator: the slide's figure listed the stresses, text, and phonemes, together with the network's output after 200 words, after 1 iteration, and after 25 iterations of training.

Continuous Informal Speech (cont'd)
- Graphical summary of the weights between the letter units and some of the hidden units; in the figure, negative weights are inhibitory and positive weights are excitatory.

Continuous Informal Speech (cont'd)
- Damage to the network and recovery from damage.

Dictionary
- Used the 1,000 most common words in English.
- [The slide's figure contrasted hard and soft pronunciations.]

4. Summary
- Seven groups of nodes in the input layer; strings of seven letters were thus presented to the input layer at any one time.
- The text was stepped through the window on a letter-by-letter basis.
- The network was trained with the standard back-propagation algorithm.
