Language and Statistics


11-761 Language and Statistics
Spring 2014, Roni Rosenfeld
http://www.cs.cmu.edu/~roni/11761-s14/

Course Goals and Style
- Teaching statistical techniques for language technologies
- Plugging gaping holes in LTI/CS grad student education in probability, statistics and information theory
13 January 2014 © Roni Rosenfeld, 2014

Course philosophy
- Socratic method: highly interactive, highly adaptable
- Participation strongly encouraged (please state your name)
- Pace adapts based on how fast we move
- Lots of probability, statistics and information theory, not in the abstract but as the need arises
- Lectures emphasize intuition, not rigor or detail; the background reading will have the rigor and detail

Course Mechanics
- Highly recommended: learn and use a text-processing language such as Perl or Python
- Prerequisite check: can you derive Bayes' equation in your sleep?
- New this year: 11-661 (masters level) has no final project
- Hand in assignments via Blackboard
- Vigorous enforcement of the collaboration and disclosure policy
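As a quick self-test for the Bayes question above, here is a minimal sketch; the diagnostic-test scenario and all of its numbers are hypothetical, chosen only to illustrate the computation:

```python
def bayes(p_e_given_h, p_h, p_e_given_not_h):
    """Bayes' rule: P(H|E) = P(E|H) P(H) / P(E), expanding P(E) by total probability."""
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e

# Hypothetical example: a 1% base rate, a 90%-sensitive test, 5% false positives.
posterior = bayes(0.9, 0.01, 0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the low prior keeps the posterior well below one half, which is exactly the kind of intuition the course expects students to have on tap.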

Background Material
- No single book covers all of the course material.
- "Foundations of Statistical NLP", Manning & Schutze (computational linguistics perspective)
- "Statistical Methods in Speech Recognition", Jelinek
- "Text Compression", Bell, Cleary & Witten (first 4 chapters; the rest is mostly text compression)
- "Probability and Statistics", DeGroot
- "All of Statistics" and "All of Nonparametric Statistics", Wasserman
- Lots of individual articles

Syllabus (subject to change)
- Overview and grand thoughts: what is all this good for? The source-channel formulation
- Words, words, words: type vs. token, Zipf, Mandelbrot, the heterogeneity of language
- Modeling word distributions, the unigram: estimators, ML, zero frequency, smoothing, shrinkage, Good-Turing
- N-grams: deleted interpolation model, backoff, toolkit
- Measuring success: perplexity (entropy, KL divergence, mutual information), the entropy of English, alternatives
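To make the unigram and perplexity items concrete, here is a minimal sketch of maximum-likelihood estimation with add-alpha smoothing, evaluated by perplexity on held-out text. The toy corpus, function names, and the choice of add-alpha (rather than the course's full menu of smoothing methods) are my own, for illustration only:

```python
import math
from collections import Counter

def unigram_probs(tokens, vocab, alpha=1.0):
    """ML unigram estimate with add-alpha smoothing, so unseen words get nonzero mass."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def perplexity(probs, test_tokens):
    """2 ** cross-entropy (bits per token) on held-out text."""
    ce = -sum(math.log2(probs[w]) for w in test_tokens) / len(test_tokens)
    return 2 ** ce

train = "the cat sat on the mat".split()
vocab = set(train) | {"dog"}          # include a word unseen in training
probs = unigram_probs(train, vocab)
print(perplexity(probs, "the dog sat".split()))
```

Without smoothing, the unseen word "dog" would get probability zero and the perplexity would be infinite; that is the zero-frequency problem the syllabus refers to.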

Syllabus (continued)
- Clustering: class-based N-grams, hierarchical clustering, hard and soft clustering
- Latent variable models, EM
- Hidden Markov Models; revisiting interpolated and class n-grams
- Part-of-speech tagging, word sense disambiguation
- Decision and regression trees, particularly as applied to language
- Stochastic grammars (SCFG, inside-outside algorithm, Link Grammar)
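As a preview of the HMM unit, here is a minimal forward-algorithm sketch computing the total probability of an observation sequence. The toy weather model and every number in it are hypothetical, not from the course:

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Total probability of an observation sequence under an HMM (forward algorithm)."""
    # Initialize with the first observation.
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    # Each step sums over all predecessor states, then emits the next symbol.
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][o]
                 for s in states}
    return sum(alpha.values())

# Hypothetical two-state weather HMM, for illustration only.
states = ("Rainy", "Sunny")
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(forward(["walk", "shop"], states, start, trans, emit))
```

Swapping the sum over predecessors for a max (plus backpointers) turns this into the Viterbi algorithm used for tagging, which is why the two usually travel together.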

Syllabus (continued)
- Maximum entropy modeling: exponential models, the ME principle, feature induction...
- Language model adaptation: caches, backoff
- Dimensionality reduction: latent semantic analysis
- Syntactic language models
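The cache-adaptation item can be sketched as linear interpolation between a static unigram model and counts drawn from recent history. The function, its parameter names, and the toy vocabulary below are my own, a sketch of the general idea rather than the course's exact formulation:

```python
from collections import Counter

def cache_interpolate(static_p, recent, vocab, lam=0.3):
    """Cache LM: P_adapted(w) = (1 - lam) * P_static(w) + lam * count(w) / |recent|."""
    counts = Counter(recent)
    n = len(recent)
    return {w: (1 - lam) * static_p[w] + lam * counts[w] / n for w in vocab}

# Toy example: a uniform static model over a 4-word vocabulary,
# adapted toward a recent history that keeps repeating "cache".
vocab = ["the", "cache", "model", "adapts"]
static = {w: 0.25 for w in vocab}
adapted = cache_interpolate(static, ["cache", "cache", "the"], vocab)
print(adapted["cache"])  # boosted above its static 0.25
```

Because both components are proper distributions over the vocabulary, the interpolated model still sums to one, while recently seen words get a temporary probability boost.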