

Introduction to CL Session 1: 7/08/2011

What is computational linguistics?
Processing natural language text by computers
  – for practical applications
  – ... or linguistic research
Among practical applications
  – Sometimes the computer only needs to classify or transform the text
  – ... but sometimes it needs to “understand”
  – Ex: Watson, winner of ‘Jeopardy’
CL vs. NLP (natural language processing)

NLP applications
Automatic speech recognition (ASR): speech → text
Machine translation (MT): L1 → L2
Information retrieval (IR): query + documents → a subset of the documents
Information extraction (IE): document → “database”

NLP applications (cont)
Question answering (QA): question + documents → answer
Summarization: documents → summary
Natural language generation (NLG): representation → text

Other applications
Call center
Spam filter
Spell checker
Sentiment analysis: product reviews
Bio-NLP: processing clinical data
…

Basic NLP tasks: Shallow processing
Tokenization:
  – He visited New York in 2003 .
Morphological analysis:
  – visited → visit + -ed
Part-of-speech tagging
  – He/Pron visited/V New/?? York/N in/Prep 2003/CD
Named-entity tagging
  – He visited [LOCATION New York] in [YEAR 2003]
Chunking
  – [NP He] [V visited] [NP New York] in [NP 2003]
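
The first two shallow steps can be sketched in a few lines of Python. This is a minimal illustration, not a method from the slides: the function names `tokenize` and `analyze_morphology` are hypothetical, the tokenizer is a plain regular expression, and the morphological rule only handles a regular "-ed" suffix.

```python
import re


def tokenize(text):
    """Split text into word/number tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def analyze_morphology(word):
    """Toy analysis: split off a regular past-tense "-ed" suffix.

    Returns (stem, suffix); suffix is None when the rule does not apply.
    Irregular forms and spelling changes (e.g. "stopped") are not handled.
    """
    if word.endswith("ed") and len(word) > 4:
        return (word[:-2], "-ed")
    return (word, None)


tokens = tokenize("He visited New York in 2003.")
analyses = [analyze_morphology(t) for t in tokens]
```

Note that the tokenizer leaves "New" and "York" as separate tokens; grouping them is the job of the later named-entity tagging or chunking step.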

Basic NLP tasks: Deep processing
Parsing
  – (S (NP (PRON he)) (VP (V visited) ….)
Semantic analysis
  – Semantic tagging: [AGENT He] visited [DEST New York] ….
  – Meaning: visit (he, New-York)
Discourse
  – Co-reference: “He” refers to “John”
  – Discourse structure
Dialogue
Generation

Ambiguity
Phonological ambiguity: (ASR)
  – “too”, “two”, “to”
  – “ice cream” vs. “I scream”
  – “ta” in Mandarin: he, she, or it
Morphological ambiguity: (morphological analysis)
  – unlockable: [[un-lock]-able] vs. [un-[lock-able]]
Syntactic ambiguity: (parsing)
  – John saw a man with a telescope.
  – Time flies like an arrow.
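
The syntactic ambiguity in "John saw a man with a telescope" can be made concrete by counting parses. The sketch below assumes a toy grammar in Chomsky normal form and a six-word lexicon; `LEXICON`, `RULES`, and `count_parses` are illustrative names, not part of the slides. It fills a CKY-style chart with the number of derivations for each span and category.

```python
from collections import defaultdict

# Toy lexicon: word -> possible categories (an assumption for illustration).
LEXICON = {
    "John": ["NP"], "saw": ["V"], "a": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}

# Toy binary rules: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]


def count_parses(tokens, root="S"):
    """Return the number of distinct parse trees rooted in `root`."""
    n = len(tokens)
    chart = defaultdict(int)  # (start, end, category) -> number of derivations
    for i, tok in enumerate(tokens):
        for cat in LEXICON.get(tok, []):
            chart[(i, i + 1, cat)] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in RULES:
                    l = chart.get((start, mid, left), 0)
                    r = chart.get((mid, end, right), 0)
                    if l and r:
                        chart[(start, end, parent)] += l * r
    return chart.get((0, n, root), 0)


ambiguous = count_parses("John saw a man with a telescope".split())  # 2
```

Under this grammar the sentence has two parses: one where the prepositional phrase modifies "saw" and one where it modifies "a man".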

Ambiguity (cont)
Lexical ambiguity: (WSD)
  – Ex: “bank”, “saw”, “run”
Semantic ambiguity: (semantic representation)
  – Ex: every boy loves his mother
  – Ex: John and Mary bought a house
Discourse ambiguity:
  – Susan called Mary. She was sick. (coreference resolution)
  – It is pretty hot here. (intention resolution)
Machine translation:
  – “brother”, “cousin”, “uncle”, etc.

Ambiguity resolution
Rule-based or knowledge-based:
  – Parsing:
      I saw a man with a hat
      I saw a man with a telescope (in my hand)
  – WSD: “bank”
  – MT: “brother”, “cousin”, “uncle”
Statistical approach:
  – Requires training data
  – Build a statistical model
  – Knowledge and rules can be incorporated into the model as features etc.
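
As a rough illustration of the statistical approach applied to WSD for "bank", the sketch below trains a Naive Bayes model with add-one smoothing on four made-up sentences. The training data, the sense labels "finance" and "river", and the function names `train` and `classify` are all assumptions for illustration; a real system would need far more data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled examples: (sense of "bank", context sentence).
TRAIN = [
    ("finance", "deposit money in the bank account"),
    ("finance", "the bank approved the loan"),
    ("river", "fishing on the river bank"),
    ("river", "the muddy bank of the stream"),
]


def train(examples):
    """Collect sense counts, per-sense word counts, and the vocabulary."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, sentence in examples:
        sense_counts[sense] += 1
        for word in sentence.split():
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def classify(sentence, model):
    """Return the sense with the highest log-probability under the model."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in sentence.split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train(TRAIN)
sense = classify("a loan from the bank", model)  # "finance"
```

The context word "loan" appears only with the "finance" sense in the training data, so it tips the decision; "the" and "bank" occur equally often with both senses and contribute nothing.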

Major approaches to NLP
Rule-based approach
Statistical approach
  – Supervised learning
  – Semi-supervised learning
  – Unsupervised learning

Supervised learning algorithms
Hidden Markov Model (HMM)
Decision tree
Decision list
Naïve Bayes
Transformation-based Learning (TBL)
Maximum Entropy (MaxEnt)
Support Vector Machine (SVM)
Conditional Random Field (CRF)
…

Data
Raw text:
  – Monolingual: English/Chinese/Arabic Gigawords
  – Parallel data: UN data, EuroParl
Treebank:
  – Syntactic treebanks: a set of parse trees
  – Proposition Bank
  – Discourse Treebank
Dictionaries
WordNet
FrameNet
…

[Diagram: Applications; tasks Task1, Task2, …, Task_i; ML methods ML1, ML2, …, ML_m; data sets D1, D2, …, D_n]

The role of linguistic knowledge in NLP
Suppose an NLP system is language-independent. Good or bad?
  – Good: it can be ported to many languages without any changes.
  – Bad: it cannot take advantage of properties of certain languages.
How to incorporate (linguistic) knowledge in statistical systems?
  – in the design of models
  – as features
  – as filters
  – …
Building a treebank is an effective way to do so.
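
One way to picture linguistic knowledge entering a statistical system "as features" is a feature function for a tagger. The sketch below is an assumption for illustration: the function name `token_features` and the particular cues (suffix, capitalization, digits, previous word) are hypothetical and reflect properties of English orthography, so they would not transfer unchanged to every language.

```python
def token_features(tokens, i):
    """Return a dict of hand-designed features for the token at position i."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "suffix2": word[-2:].lower(),           # e.g. "ed" hints at a past-tense verb
        "is_capitalized": word[:1].isupper(),   # hints at a proper noun in English
        "is_digit": word.isdigit(),             # hints at a number or year
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
    }


tokens = ["He", "visited", "New", "York", "in", "2003"]
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Each dictionary can then be handed to any of the supervised learners that accept feature vectors; the learner stays generic while the features carry the language-specific knowledge.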