Lecture 13: Corpus Linguistics I (CS 4705)


From Knowledge-Based to Corpus-Based Linguistics
A paradigm shift begins in the 1980s:
- Seeds planted in the 1950s (Harris, Firth), cut off by Chomsky
- Renewal due to interest in practical applications (ASR, MT, ...)
- Availability at major industrial labs of powerful machines and large amounts of storage
- Increasing availability of large online texts and speech data
- Crossover efforts with the ASR community, fostered by DARPA

For many practical tasks, statistical methods perform better than knowledge-based ones, and they require less hand-encoded linguistic knowledge from researchers.

Next Word Prediction
An ostensibly artificial task: predicting the next word in a sequence. From a NY Times story:
- Stocks plunged this ...
- Stocks plunged this morning, despite a cut in interest rates ...
- Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall ...
- Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began ...

- Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began trading for the first time since last ...
- Stocks plunged this morning, despite a cut in interest rates by the Federal Reserve, as Wall Street began trading for the first time since last Tuesday's terrorist attacks.
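A minimal sketch of how such a predictor could work, using bigram counts over a toy corpus (the corpus, counts, and function names here are invented for illustration; a real predictor would be trained on millions of words):

```python
from collections import defaultdict

# Toy corpus, invented for illustration.
corpus = ("stocks plunged this morning stocks plunged this week "
          "stocks rose this morning").split()

# Count bigrams: how often each word follows each context word.
bigram_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word that most often follows `word` in the corpus."""
    followers = bigram_counts[word]
    return max(followers, key=followers.get) if followers else None

print(predict_next("stocks"))   # → plunged
print(predict_next("plunged"))  # → this
```

This is the simplest possible version of the idea: prediction is just a lookup of the most frequent continuation observed in training data.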

Human Word Prediction
Clearly, at least some of us have the ability to predict future words in an utterance. How?
- Domain knowledge
- Syntactic knowledge
- Lexical knowledge

Claim
A useful part of the knowledge needed for word prediction (guessing the next word) can be captured using simple statistical techniques. In particular, we'll rely on the notion of the probability of a sequence (e.g., a sentence) and the likelihood of words co-occurring.
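The "probability of a sequence" is typically approximated by the chain rule with a bigram assumption: P(w1..wn) ≈ ∏ P(w_i | w_{i-1}). A toy sketch with maximum-likelihood estimates (the corpus and sentence-boundary markers `<s>`/`</s>` are invented for illustration):

```python
from collections import Counter

# Toy training data with sentence-boundary markers, invented for illustration.
tokens = "<s> the cat sat </s> <s> the dog sat </s> <s> the cat ran </s>".split()

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def sentence_prob(sentence):
    """P(w1..wn) ≈ product of P(w_i | w_{i-1}), estimated by relative frequency."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigrams[(prev, cur)] / unigrams[prev]
    return p

print(sentence_prob("the cat sat"))  # frequent word pairs → higher probability
print(sentence_prob("the dog ran"))  # unseen bigram (dog, ran) → probability 0
```

Note that any unseen bigram drives the whole product to zero; this is exactly the problem that smoothing techniques (covered later in the course) address.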

Why would we want to do this?
Why would anyone want to predict a word? If you can predict the next word, you can rank the likelihood of sequences containing various alternative words, i.e., alternative hypotheses: you can assess the likelihood (goodness) of a hypothesis.

Many NLP problems can be modeled as mapping from one string of symbols to another. In statistical language applications, knowledge of the source (e.g., a statistical model of word sequences) is referred to as a Language Model or a Grammar.

Why is this useful?
Example applications that employ language models:
- Speech recognition
- Handwriting recognition
- Spelling correction
- Machine translation
- Optical character recognition

Real-Word Spelling Errors
- They are leaving in about fifteen minuets to go to her house.
- The study was conducted mainly be John Black.
- The design an construction of the system will take more than a year.
- Hopefully, all with continue smoothly in my absence.
- Can they lave him my messages?
- I need to notified the bank of ...
- He is trying to fine out.

Handwriting Recognition
Assume a note is given to a bank teller, which the teller reads as "I have a gub." (cf. Woody Allen). NLP to the rescue:
- "gub" is not a word
- gun, gum, Gus, and gull are words, but gun has a higher probability in the context of a bank

For Spell Checkers
- Collect a list of commonly substituted words, e.g., piece/peace, whether/weather, their/there ...
- Whenever you encounter one of these words in a sentence, construct the alternative sentence(s) as well
- Assess the goodness of each and choose the word that yields the more likely sentence, e.g.:
  On Tuesday, the whether ...
  On Tuesday, the weather ...
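The steps above can be sketched in a few lines. The confusion sets and bigram scores below are invented stand-ins; a real checker would estimate the scores from a large corpus:

```python
# Commonly confused word sets (illustrative, not exhaustive).
CONFUSION_SETS = [{"whether", "weather"}, {"their", "there"}, {"piece", "peace"}]

# Stand-in language model: invented bigram scores a real system would learn.
BIGRAM_SCORE = {("the", "weather"): 0.01, ("the", "whether"): 0.0001}

def best_variant(sentence):
    """For each confusable word, try its alternatives and keep the likeliest one."""
    words = sentence.lower().split()
    for i, w in enumerate(words):
        for cs in CONFUSION_SETS:
            if w in cs:
                prev = words[i - 1] if i > 0 else "<s>"
                # Score each alternative in context via the preceding bigram.
                words[i] = max(cs, key=lambda alt: BIGRAM_SCORE.get((prev, alt), 1e-6))
    return " ".join(words)

print(best_variant("On Tuesday the whether"))  # → on tuesday the weather
```

The key design point is that the checker never needs to know the *intended* word: it only compares the relative likelihood of the candidate sentences.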

The Noisy Channel Model
A probabilistic model developed by Claude Shannon to model communication (e.g., over a phone line): an input I passes through a noisy channel and emerges as the observed output O. Decoding recovers the most likely input:

    Î = argmax_I Pr(I | O) = argmax_I Pr(I) Pr(O | I)

(by Bayes' rule, dropping the denominator Pr(O), which is constant over I)
- Î: the most likely input given the output
- Pr(I): the prior probability of the input
- Pr(I | O): the probability of I given O (the posterior)
- Pr(O | I): the probability that O is the output if I is the input (the channel model)
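The argmax can be computed directly once we have the two ingredients. A sketch for the bank-teller example, where all priors and channel probabilities are invented for illustration (the prior for "gun" is boosted to reflect the bank-note context):

```python
# Candidate inputs I for the observed output O = "gub"; all numbers invented.
prior = {"gun": 0.002, "gum": 0.001, "Gus": 0.0001, "gull": 0.0002}  # Pr(I)
channel = {"gun": 0.3, "gum": 0.3, "Gus": 0.2, "gull": 0.1}          # Pr(O="gub" | I)

def decode(candidates):
    """Noisy-channel decoding: argmax over I of Pr(I) * Pr(O | I)."""
    return max(candidates, key=lambda i: prior[i] * channel[i])

print(decode(prior))  # → gun
```

The same two-factor structure (language model times channel model) reappears in speech recognition, spelling correction, and machine translation; only the channel model changes.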

Review: Basic Probability
Prior probability (or unconditional probability): P(A), where A is some event.
- Possible events: it raining; the next person you see being Scandinavian; a child getting the measles; the word 'warlord' occurring in the newspaper
Conditional probability: P(A | B), the probability of A given that we know B.
- E.g., it raining, given that we know it's October; the next person you see being Scandinavian, given that you're in Sweden; the word 'warlord' occurring in a story about Afghanistan

Example
A population of ten people, six Finns (F) and four others (I); five of the ten ski, four of whom are Finns:
P(Finn) = .6
P(skier) = .5
P(skier | Finn) = 4/6 ≈ .67
P(Finn | skier) = 4/5 = .8
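Bayes' rule ties these numbers together: P(Finn | skier) = P(skier | Finn) P(Finn) / P(skier). A quick check using the counts behind the example (the population list below reconstructs those counts for illustration):

```python
# Ten people as (nationality, skis?) pairs: 6 Finns, 4 of whom ski; 4 others, 1 of whom skis.
people = [("F", True)] * 4 + [("F", False)] * 2 + [("I", True)] * 1 + [("I", False)] * 3

p_finn = sum(1 for n, _ in people if n == "F") / len(people)   # prior: 0.6
p_skier = sum(1 for _, s in people if s) / len(people)         # prior: 0.5
p_skier_given_finn = 4 / 6                                     # likelihood: ≈ .67
p_finn_given_skier = 4 / 5                                     # posterior: 0.8

# Bayes' rule recovers the posterior from the likelihood and the two priors.
assert abs(p_skier_given_finn * p_finn / p_skier - p_finn_given_skier) < 1e-9
print("Bayes' rule checks out:", p_skier_given_finn * p_finn / p_skier)
```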

Next Class
- Midterm
- Reading: Hindle & Rooth 1993
- Begin studying semantics, Ch. 14