CS 4705 Natural Language Processing Fall 2010 What is Natural Language Processing? Designing software to recognize, analyze and generate text and speech.


2 CS 4705 Natural Language Processing Fall 2010

3 What is Natural Language Processing?
Designing software to recognize, analyze and generate text and speech
Current real world applications
–Searching very large text and speech corpora: e.g. the Web, Facebook, online news sources, telephone calls
–Translating from one language to another: e.g. Arabic/English
–Summarizing very large amounts of text or speech: e.g. your email, the news
–Building spoken dialogue systems: e.g. Amtrak’s ‘Julie’

4 Open Problems in NLP
If you want to find all references to union activities in New York, what keywords do you specify?
–Union…and…Unions? United? Uniform? Onion?
–Activities…and…Activity? Active? Actor? Action?
Morphology: how words are composed of smaller units of meaning – which words are related?
What’s the same about these sentences? Different?
–John hit Bill
–Bill was hit by John
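To make the keyword question concrete, here is a minimal Python sketch, not part of the original slides. It compares two naive matching strategies on the slide's own word list. The function names `prefix_match`, `strip_plural` and `normalized_match` are hypothetical, and the "drop a trailing s" rule is a deliberately crude stand-in for real morphological analysis.

```python
def prefix_match(query, vocabulary, prefix_len=3):
    """Return words sharing the first prefix_len letters with the query."""
    stem = query.lower()[:prefix_len]
    return [w for w in vocabulary if w.lower().startswith(stem)]


def strip_plural(word):
    """Crude normalization (an assumption for illustration):
    lowercase the word and drop a single trailing 's'."""
    w = word.lower()
    return w[:-1] if w.endswith("s") and len(w) > 3 else w


def normalized_match(query, vocabulary):
    """Return words whose crude normal form equals the query's."""
    target = strip_plural(query)
    return [w for w in vocabulary if strip_plural(w) == target]


vocabulary = ["Union", "Unions", "United", "Uniform", "Onion"]

# Prefix matching over-generates: it also returns United and Uniform.
print(prefix_match("union", vocabulary))

# Plural stripping relates Union and Unions only.
print(normalized_match("union", vocabulary))
```

The same crude rule fails on the second example: it reduces "Activities" to "activitie", which does not equal "activity". Relating such forms needs knowledge of how words are built from smaller units, which is the morphology problem the slide describes.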

5
–Bill, John hit
–Who John hit was Bill
Syntax: the way words are grouped together into larger constituents and phrases and the way these phrases can be ordered – how sentences are related
Semantics: the context-independent ‘meaning’ of utterances (the similar part)
Pragmatics: the context-dependent ‘meaning’ of utterances (some of the different part)
If you want to find travel information about Nice, France, why might you get documents on Nice views in Cleveland?
–Word Sense Disambiguation: how to distinguish the different meanings of words spelled the same
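One simple way to approach the Nice example is to count how many context words overlap with a hand-written "signature" for each sense, in the spirit of the Lesk algorithm. The sketch below is an illustration only and is not from the slides. The sense labels and signature word sets in `SENSE_SIGNATURES` are made up for this example, and `disambiguate` is a hypothetical helper.

```python
# Hypothetical sense signatures: words assumed to co-occur with each sense.
SENSE_SIGNATURES = {
    "Nice (city in France)": {"france", "travel", "riviera", "hotel", "flight", "beach"},
    "nice (pleasant)": {"views", "view", "weather", "people", "day", "pleasant"},
}


def disambiguate(context, signatures=SENSE_SIGNATURES):
    """Pick the sense whose signature shares the most words with the context.

    Returns (best_sense, scores). Ties go to the sense listed first.
    """
    words = {w.strip(".,?!").lower() for w in context.split()}
    scores = {sense: len(words & sig) for sense, sig in signatures.items()}
    best = max(scores, key=scores.get)
    return best, scores


print(disambiguate("travel information about Nice, France"))
print(disambiguate("Nice views in Cleveland"))
```

With these assumed signatures, the first query overlaps with the city sense on "travel" and "france", while the second overlaps with the adjective sense on "views". A keyword search that ignores context treats both occurrences of the string identically, which is why the Cleveland document can be returned for a travel query.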

6 Course Focus: NLP for Text and Speech
Morphology, syntax, semantics, pragmatics/discourse
Human language phenomena
Techniques and algorithms for computational language processing
–Parsing, information extraction/retrieval, statistical and machine learning approaches (corpus linguistics)
Applications: Language generation and summarization, machine translation, dialogue systems and spoken language processing
Next term: CS 4706 focuses on spoken NLP

7 Instructor
Julia Hirschberg
–Computational Linguist in CS
–Focus: Spoken Language Processing
–Lab: The Speech Lab, CEPSR 7LW3-A
–Research:
  Deceptive speech
  Charismatic speech
  Emotional speech: anger, uncertainty
  Speech summarization: Broadcast News
  Spoken Dialogue Systems: Games Corpus
  ‘Translating Prosody’: English – Mandarin
–Course Details

8 Is She Lying?


11 Bureaucracy
Instructor: Julia Hirschberg
–(julia@cs.columbia.edu)
–Office and hours: CEPSR 705, TBA
Teaching Assistant: Frank Enos
–(frank@cs.columbia.edu)
–Office and hours: CEPSR 726, TBA
Syllabus available at http://www1.cs.columbia.edu/~julia/cs4705/syllabus07.html

12 Text: Daniel Jurafsky and James H. Martin, Speech and Language Processing, Prentice-Hall, 2000 (available at CU Bookstore)
–Note errata available on website; please check before reading each chapter
–Check courseworks
Assignments:
–3 homework assignments
–Midterm and final exams
–Four ‘free’ late days for homework assignments
–You must get a CS account
Evaluation: 50% homework + 50% exams

13 Academic Integrity
Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is forbidden, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done.
Your grade should reflect your own work.
If you are going to have trouble completing an assignment, please talk to the instructor or TA in advance of the due date.
Everyone: Read/write protect your homework files at all times.

14 For Next Class
Look at syllabus
Read Chapters 1-2 of J&M
Questions?

