A COMPARISON OF HAND-CRAFTED SEMANTIC GRAMMARS VERSUS STATISTICAL NATURAL LANGUAGE PARSING IN DOMAIN-SPECIFIC VOICE TRANSCRIPTION
Curry Guinn, Dave Crist, Haley Werth

Outline
- Probabilistic language models
  - N-grams
- The EPA project
- Experiments

Probabilistic Language Processing: What is it?
- Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen)
- NLP to the rescue...
  - "gub" is not a word
  - "gun", "gum", "Gus", and "gull" are words, but "gun" has a higher probability in the context of a bank
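A minimal sketch of that intuition, scoring each real-word candidate by a channel probability times a contextual language-model probability. All numbers below are invented placeholders for illustration, not values from this work.

    # Rank candidate corrections for the misspelling "gub":
    # score(word) = P("gub" | word) * P(word | bank-note context).
    # Every probability here is an assumed placeholder value.
    channel = {"gun": 0.010, "gum": 0.010, "gull": 0.001, "Gus": 0.005}  # P("gub" | word)
    context = {"gun": 0.60, "gum": 0.25, "gull": 0.10, "Gus": 0.05}      # P(word | context)

    scores = {w: channel[w] * context[w] for w in channel}
    best = max(scores, key=scores.get)
    print(best)  # -> "gun": the contextual probability breaks the tie with "gum"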

Real Word Spelling Errors
- They are leaving in about fifteen minuets to go to her house.
- The study was conducted mainly be John Black.
- Hopefully, all with continue smoothly in my absence.
- Can they lave him my messages?
- I need to notified the bank of...
- He is trying to fine out.

Letter-based Language Models
- Shannon's Game
- Guess the next letter, one character at a time:
  W → Wh → Wha → What → What d → What do → ... → What do you think the next letter is?
- Guess the next word, one word at a time:
  What → What do → What do you → What do you think → ... → What do you think the next word is?

Word-based Language Models
- A model that enables one to compute the probability, or likelihood, of a sentence S, P(S).
- Simple: every word follows every other word with equal probability (0-gram)
  - Assume |V| is the size of the vocabulary V
  - Likelihood of sentence S of length n is 1/|V| × 1/|V| × ... × 1/|V| = (1/|V|)^n
  - If English has 100,000 words, the probability of each next word is 1/100,000 = 0.00001

Word Prediction: Simple vs. Smart
- Smarter: the probability of each next word is its relative frequency in the corpus (unigram)
  - Likelihood of sentence S = P(w1) × P(w2) × ... × P(wn)
  - Assumes the probability of each word is independent of the other words.
- Even smarter: condition each word on the previous words (N-gram; bigram shown)
  - Likelihood of sentence S = P(w1) × P(w2|w1) × ... × P(wn|wn-1)
  - Assumes the probability of each word depends on the preceding word.
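A short sketch contrasting the two estimates above. The probability tables are toy values invented for illustration; in practice they would be estimated from a corpus.

    import math

    # Toy probability tables (illustrative values only).
    p_unigram = {"i": 0.02, "am": 0.01, "in": 0.03, "the": 0.06, "kitchen": 0.0005}
    p_bigram = {("<s>", "i"): 0.1, ("i", "am"): 0.3, ("am", "in"): 0.05,
                ("in", "the"): 0.4, ("the", "kitchen"): 0.002}

    def unigram_logprob(words):
        # Likelihood of S = P(w1) * P(w2) * ... * P(wn); log space avoids underflow.
        return sum(math.log(p_unigram[w]) for w in words)

    def bigram_logprob(words):
        # Likelihood of S = P(w1|<s>) * P(w2|w1) * ... * P(wn|wn-1).
        history = ["<s>"] + words[:-1]
        return sum(math.log(p_bigram[(h, w)]) for h, w in zip(history, words))

    sentence = "i am in the kitchen".split()
    print(unigram_logprob(sentence), bigram_logprob(sentence))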

Training and Testing
- Probabilities come from a training corpus, which is used to design the model.
  - Overly narrow corpus: probabilities don't generalize
  - Overly general corpus: probabilities don't reflect the task or domain
- A separate test corpus is used to evaluate the model, typically using standard metrics
  - Held-out test set

Simple N-Grams
- An N-gram model uses the previous N-1 words to predict the next one:
  - P(wn | wn-N+1 wn-N+2 ... wn-1)
- unigrams: P(dog)
- bigrams: P(dog | big)
- trigrams: P(dog | the big)
- quadrigrams: P(dog | chasing the big)
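A minimal sketch of estimating these conditional probabilities by maximum likelihood from counts over a training corpus. The three "diary-style" sentences are made up for illustration.

    from collections import Counter

    # Tiny illustrative training corpus (invented sentences).
    corpus = [
        "i am in the kitchen cooking spaghetti",
        "i am in the living room watching tv",
        "i am cooking dinner in the kitchen",
    ]

    unigrams, bigrams = Counter(), Counter()
    for line in corpus:
        words = ["<s>"] + line.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))

    def p_bigram(word, prev):
        # Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev).
        return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

    print(p_bigram("kitchen", "the"))  # 2 of the 3 occurrences of "the" precede "kitchen"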

The EPA Task
- Detailed diary of a single individual's daily activity and location
- Methods of collecting the data:
  - External observer
  - Camera
  - Self-reporting
    - Paper diary
    - Handheld menu-driven diary
    - Spoken diary

Spoken Diary
- From an utterance like "I am in the kitchen cooking spaghetti", map that utterance into
  - Activity(cooking)
  - Location(kitchen)
- Text abstraction
- Technique
  - Build a grammar
  - Example

Sample Semantic Grammar
ACTIVITY_LOCATION -> ACTIVITY' LOCATION' : CHAD(ACTIVITY',LOCATION').
ACTIVITY_LOCATION -> LOCATION' ACTIVITY' : CHAD(ACTIVITY',LOCATION').
ACTIVITY_LOCATION -> ACTIVITY' : CHAD(ACTIVITY', null).
ACTIVITY_LOCATION -> LOCATION' : CHAD(null,LOCATION').
LOCATION -> IAM LOCx' : LOCx'.
LOCATION -> LOCx' : LOCx'.
IAM -> IAM1.
IAM -> IAM1 just.
IAM -> IAM1 going to.
IAM -> IAM1 getting ready to.
IAM -> IAM1 still.
LOC2 -> HOUSE_LOC' : HOUSE_LOC'.
LOC2 -> OUTSIDE_LOC' : OUTSIDE_LOC'.
LOC2 -> WORK_LOC' : WORK_LOC'.
LOC2 -> OTHER_LOC' : OTHER_LOC'.
HOUSE_LOC -> kitchen : kitchen_code.
HOUSE_LOC -> bedroom : bedroom_code.
HOUSE_LOC -> living room : living_room_code.
HOUSE_LOC -> house : house_code.
HOUSE_LOC -> garage : garage_code.
HOUSE_LOC -> home : house_code.
HOUSE_LOC -> bathroom : bathroom_code.
HOUSE_LOC -> den : den_code.
HOUSE_LOC -> dining room : dining_room_code.
HOUSE_LOC -> basement : basement_code.
HOUSE_LOC -> attic : attic_code.
OUTSIDE_LOC -> yard : yard_code.
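A rough sketch of how rules like these can be applied to an utterance to produce a CHAD(activity, location) pair. The keyword tables and activity codes below are hypothetical simplifications for illustration; the project's actual parser applies the full grammar rather than simple substring matching.

    # Hypothetical simplification of the semantic grammar above: map keyword
    # phrases to location/activity codes, then emit a CHAD(activity, location) pair.
    LOCATION_RULES = {
        "living room": "living_room_code", "kitchen": "kitchen_code",
        "bedroom": "bedroom_code", "yard": "yard_code",
        "home": "house_code", "house": "house_code",
    }
    ACTIVITY_RULES = {  # assumed activity codes; the slide shows only location rules
        "cooking": "cooking_code", "sleeping": "sleeping_code",
        "watching tv": "watching_tv_code",
    }

    def parse_diary_entry(utterance):
        text = utterance.lower()
        location = next((code for phrase, code in LOCATION_RULES.items() if phrase in text), None)
        activity = next((code for phrase, code in ACTIVITY_RULES.items() if phrase in text), None)
        return ("CHAD", activity, location)

    print(parse_diary_entry("I am in the kitchen cooking spaghetti"))
    # -> ('CHAD', 'cooking_code', 'kitchen_code')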

Statistical Natural Language Parsing
- Use unigram, bigram, and trigram probabilities
- Use Bayes' rule to obtain these probabilities: P(A|B) = P(B|A) × P(A) / P(B)
- P("kitchen" | 30121 Kitchen) is computed by determining the percentage of times the word "kitchen" appears in diary entries that have been transcribed in the category Kitchen.
- P(30121 Kitchen) is the probability that a diary entry is of the semantic category Kitchen.
- P("kitchen") is the probability that "kitchen" appears in any diary entry.
- Bayes' rule can be extended to take into account each word in the input string.
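A minimal sketch of that word-by-word extension (a naive Bayes score over the words of an entry, with add-one smoothing). The "30121 Kitchen" code comes from the slide; the training entries and the "Bedroom" category label are invented for illustration.

    import math
    from collections import Counter, defaultdict

    # Invented training entries: (transcribed words, location category).
    training = [
        ("i am in the kitchen cooking", "30121 Kitchen"),
        ("making lunch in the kitchen", "30121 Kitchen"),
        ("i am in the bedroom reading", "Bedroom"),  # assumed second category
    ]

    prior = Counter(cat for _, cat in training)      # counts for P(category)
    word_counts = defaultdict(Counter)               # counts for P(word | category)
    for text, cat in training:
        word_counts[cat].update(text.split())

    def score(text, cat, vocab_size=1000):
        # log P(category) + sum over words of log P(word | category),
        # with add-one smoothing so unseen words do not zero out the product.
        total = sum(word_counts[cat].values())
        logp = math.log(prior[cat] / len(training))
        for w in text.split():
            logp += math.log((word_counts[cat][w] + 1) / (total + vocab_size))
        return logp

    entry = "i am cooking in the kitchen"
    print(max(prior, key=lambda c: score(entry, c)))  # -> 30121 Kitchen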

The Experiment
- Digital voice recorder + heart rate monitor
  - The heart rate monitor beeps if the rate changes by more than 15 beats per minute between measurements (taken every 2 minutes)

Subjects

ID  Sex  Occupation                  Age  Education
1   F    Manages Internet Company    52   Some College
2   F    Grocery Deli Worker         18   Some College
3   M    Construction Worker         35   High School
4   F    Database Coordinator        29   Graduate Degree
5   F    Coordinator for Non-profit  56   Some College
6   M    Unemployed                  50   High School
7   M    Retired                     76   High School
8   M    Disabled                    62   High School
9   M    Environment Technician      56   Graduate Degree

Recordings Per Day

Heart Rate Change Indicator Tones and Subject Compliance
(Table: per subject, the average number of tones per day and the percentage of tones for which the subject made a corresponding diary entry)

Per Word Speech Recognition
(Table: per-subject per-word recognition rate, %)

Semantic Grammar Location/Activity Encoding Precision and Recall
(Table: per-subject word recognition rate versus location precision/recall and activity precision/recall, with averages)

Word Recognition Accuracy’s Effect on Semantic Grammar Precision and Recall

Statistical Processing Accuracy

                          Activity Accuracy  Location Accuracy
Hand-transcribed          86.7%              87.5%
Using speech recognition  48.3%              49.0%

Word Recognition Affects Statistical Semantic Categorization
(Table: per-subject word recognition rate (%) versus location and activity categorization accuracy (%), with averages)

Per Word Recognition Rate Versus Statistical Semantic Encoding Accuracy

Time, Activity, Location, Exertion Data Gathering Platform

Research Topics
- Currently, guesses for the current activity and location are computed independently of each other
  - They are not independent!
- Currently, guesses are based only on the current utterance.
  - However, the current activity/location is not independent of previous activities/locations.
- How do we fuse data from other sources (GPS, beacons, heart rate monitor, etc.)?