Lecture 1: Introduction

Kai-Wei Chang, University of Virginia. Course webpage: CS6501 – Natural Language Processing

Announcements Waiting list: Start attending the first few meetings of the class as if you are registered. Given that some students will drop the class, some space will free up. We will use Piazza as an online discussion platform. Please enroll.

Staff
Instructor: Kai-Wei Chang. Office: R412 Rice Hall. Office hour: 2:00 – 3:00, Tue (after class). Additional office hour: 3:00 – 4:00, Thu.
TA: Wasi Ahmad. Office: R432 Rice Hall. Office hour: 4:00 – 5:00, Mon.

This lecture
Course Overview
  What is NLP? Why is it important?
  What will you learn from this course?
Course Information
What are the challenges?
Key NLP components

What is NLP Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

Go beyond keyword matching
Identify the structure and meaning of words, sentences, texts, and conversations
Deep understanding of broad language
NLP is all around us
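To make the limits of keyword matching concrete, here is a minimal sketch. The two documents and the `keyword_match` helper are made up for illustration and are not from the lecture. A query for "bank" returns both senses of the word, while a paraphrase such as "financial institution" returns nothing.

```python
import re

def keyword_match(query, documents):
    """Return the documents that contain every query word (case-insensitive)."""
    query_words = set(re.findall(r"\w+", query.lower()))
    hits = []
    for doc in documents:
        doc_words = set(re.findall(r"\w+", doc.lower()))
        if query_words <= doc_words:
            hits.append(doc)
    return hits

# Hypothetical mini-collection: both mention "bank", only one is about money.
documents = [
    "She deposited the check at the bank.",
    "They had a picnic on the bank of the river.",
]

hits = keyword_match("bank", documents)                            # matches both documents
paraphrase_hits = keyword_match("financial institution", documents)  # matches neither
```

Matching on surface strings can neither separate the two meanings of "bank" nor connect a paraphrase to the document it describes, which is the gap that structure and meaning are supposed to close.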

Machine translation Facebook translation, image credit:

Image credit: Julia Hockenmaier, Intro to NLP

Dialog Systems

Sentiment/Opinion Analysis

Text Classification Other applications?

Question answering credit: 'Watson' computer wins at 'Jeopardy'

Question answering Go beyond search

Natural language instruction
Digital personal assistant
More on natural language instruction
Semantic parsing – understand tasks
Entity linking – "my wife" = "Kellie" in the phone book
credit:

Unstructured text to database entries Yoav Artzi: Natural language processing

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
Q: Who wrote Winnie the Pooh?
Q: Where did Chris live?

What will you learn from this course
The NLP pipeline
Key components for understanding text
NLP systems/applications
Current techniques & limitations
Build realistic NLP tools

What’s not covered by this course
Speech recognition – no signal processing
Natural language generation
Details of ML algorithms / theory
Text mining / information retrieval

Overview
New course, first time being offered; comments are welcome
Aimed at first- or second-year PhD students
Lecture + seminar
No course prerequisites, but I assume:
  programming experience (for the final project)
  basics of probability, calculus, and linear algebra (HW0)

Grading
No exam & HW -- hooray
Lectures & forum: participate in discussion (additional credits)
Review quizzes (25%): 3 quizzes
Critical review report (10%)
Paper presentation (15%)
Final project (50%)

Quizzes
Format: multiple-choice questions, fill-in-the-blank, short-answer questions
Each quiz: ~20 min in class
Schedule: see course website
Closed book, closed notes, closed laptop

Critical review report
1 page maximum
Pick one paper from the suggested list
Summarize the paper (use your own words)
Provide detailed comments:
  What can be improved
  Potential future directions
  Other related work
Some students will be selected to present their critical reviews

Paper presentation
Each group has 2~3 students
Pick one paper from the suggested readings, or your favorite paper
  Cannot be the same as the critical review report
  Can be related to your final project
  Register your choice early
15-min presentation + 2-min Q&A
Will be graded by the instructor, the TA, and other students

Final Project
Work in groups (2~3 students)
Project proposal: written report, 2 pages maximum
Project report (35%): < 8 pages, ACL format; due 2 days before the final presentation
Project presentation (15%): 5-min in-class presentation (tentative)

Late Policy
Credit of 48 hours for all the assignments, including the proposal and final project
No accumulation
No more grace period
No make-up exam except in an emergency

Cheating/Plagiarism
No. Ask if you have concerns.
UVA Honor Code:

Lectures and office hours
Participation is highly appreciated!
Ask questions if you are still confused
Feedback is welcome
Lead the discussion in this class
Enroll in Piazza

Topics of this class
Fundamental NLP problems
Machine learning & statistical approaches for NLP
NLP applications
Recent trends in NLP

What to Read?
Natural Language Processing: ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL
Machine learning: ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ
Artificial Intelligence: AAAI, IJCAI, UAI, JAIR

Questions?

Challenges – ambiguity
Word sense ambiguity

Challenges – ambiguity
Word sense / meaning ambiguity Credit:

Challenges – ambiguity
PP attachment ambiguity Credit: Mark Liberman,
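PP attachment ambiguity can be made concrete by counting parse trees. The sketch below is an illustration under stated assumptions: the toy grammar, the lexicon, and the textbook sentence "I saw the man with the telescope" are not from the slide. It counts parses with a CKY-style chart over a grammar in Chomsky normal form.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}

def count_parses(words, root="S"):
    """Count the distinct parse trees rooted in `root` that cover all words."""
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for tag in LEXICON[word]:
            chart[(i, i + 1)][tag] += 1
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]

ambiguous = count_parses("I saw the man with the telescope".split())  # 2 parses
unambiguous = count_parses("I saw the man".split())                   # 1 parse
```

The two parses correspond to the two readings: the telescope is either the instrument of seeing (PP under VP) or something the man has (PP under NP). The grammar alone cannot choose between them.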

Challenges – ambiguity
Ambiguous headlines:
Include your children when baking cookies
Local High School Dropouts Cut in Half
Hospitals are Sued by 7 Foot Doctors
Iraqi Head Seeks Arms
Safety Experts Say School Bus Passengers Should Be Belted
Teacher Strikes Idle Kids

Challenges – ambiguity
Pronoun reference ambiguity Credit:

Challenges – language is not static
Language grows and changes, e.g., cyber lingo:
LOL: Laugh out loud
G2G: Got to go
BFN, B4N: Bye for now
IDK: I don't know
FWIW: For what it's worth
LUWAMH: Love you with all my heart
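A fixed lookup table illustrates why changing language is a problem for NLP systems. The sketch below uses the abbreviations listed on the slide; the `expand_lingo` helper and the unseen abbreviation "brb" are assumptions added for illustration.

```python
# Abbreviations and expansions taken from the slide.
LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}

def expand_lingo(text, table=LINGO):
    """Replace each known abbreviation with its expansion; leave other tokens unchanged."""
    return " ".join(table.get(token.lower(), token) for token in text.split())

expanded = expand_lingo("idk g2g brb")
# "brb" is not in the table, so it passes through unexpanded.
```

Any static resource like this table goes stale as new expressions appear, so the system silently fails on input it has never seen.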

Challenges – language is compositional
Carefully Slide

Challenges – language is compositional
小心: Carefully, Careful, Take Care, Caution
地滑: Slide, Landslip, Wet Floor, Smooth
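The candidate glosses above show how word-by-word translation goes wrong. As a rough sketch (the `word_by_word_translations` helper is hypothetical), enumerating every combination of per-word glosses yields 16 candidate outputs, including the mistranslated "Carefully Slide" alongside the intended "Caution Wet Floor".

```python
from itertools import product

# Candidate English glosses for each source word, as listed on the slide.
CANDIDATES = {
    "小心": ["Carefully", "Careful", "Take Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}

def word_by_word_translations(source_words, candidates=CANDIDATES):
    """Enumerate every output obtained by translating each word independently."""
    options = [candidates[word] for word in source_words]
    return [" ".join(choice) for choice in product(*options)]

outputs = word_by_word_translations(["小心", "地滑"])
```

Picking the right output requires looking at how the words combine, not at each word in isolation, and the number of candidates grows multiplicatively with sentence length.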

Challenges – scale
Examples:
Bible (King James version): ~700K words
Penn Treebank: ~1M words from the Wall Street Journal
Newswire collection: 500M+ words
Wikipedia: 2.9 billion words (English)
Web: several billion words

Part of speech tagging
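As a point of reference for part-of-speech tagging, here is a minimal most-frequent-tag baseline. This is a sketch, not the method taught in the course: the tiny training set, the simplified tag names, and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical hand-tagged training sentences with a simplified tagset.
TRAIN = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("was", "VERB"), ("long", "ADJ")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB"), ("fast", "ADV")],
]

def train_most_frequent_tag(tagged_sentences):
    """Map each word to the tag it received most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(words, model, default="NOUN"):
    """Tag each word independently; unknown words get the default tag."""
    return [(w, model.get(w.lower(), default)) for w in words]

model = train_most_frequent_tag(TRAIN)
tagged = tag(["the", "run", "was", "fast"], model)
```

The baseline labels "run" as a verb in "the run was fast" because it ignores context, even though the preceding determiner points to a noun. Fixing that kind of error is what makes tagging a sequence problem rather than a dictionary lookup.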

Syntactic (Constituency) parsing

Syntactic structure => meaning
Image credit: Julia Hockenmaier, Intro to NLP

Dependency Parsing
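A dependency parse is commonly stored as one head index per token. The sketch below assumes a made-up example sentence with relation labels in the style of Universal Dependencies; none of it is from the slide. It shows that representation along with a check that the heads form a single-rooted tree.

```python
# Tokens are numbered from 1; head 0 denotes the artificial ROOT node.
TOKENS = ["She", "ate", "the", "cake"]
HEADS = [2, 0, 4, 2]                    # She -> ate, ate -> ROOT, the -> cake, cake -> ate
LABELS = ["nsubj", "root", "det", "obj"]

def dependents(head_index, heads=HEADS):
    """Return the 1-based positions of the tokens whose head is `head_index`."""
    return [i + 1 for i, h in enumerate(heads) if h == head_index]

def is_tree(heads):
    """Check that there is exactly one root and every token reaches it without a cycle."""
    if heads.count(0) != 1:
        return False
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False  # followed the heads back to a token already visited
            seen.add(node)
            node = heads[node - 1]
    return True

verb_dependents = dependents(2)  # the subject "She" and the object "cake"
```

Unlike a constituency tree, this structure has no phrase nodes: every edge links two words directly.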

Semantic analysis
Word sense disambiguation
Semantic role labeling
Credit: Ivan Titov

Q: [Chris] = [Mr. Robin]?
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
Slide modified from Dan Roth

Co-reference Resolution
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
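The passage shows why co-reference cannot be resolved from surface strings alone. The sketch below is a deliberately naive heuristic, not a real co-reference system; the function names and the prefix rule are assumptions. It links two mentions whenever a name token of one is a prefix of a name token of the other.

```python
TITLES = {"mr.", "mrs.", "ms.", "dr."}

def name_tokens(mention):
    """Lowercased tokens of a mention, with titles removed."""
    return [t.lower() for t in mention.split() if t.lower() not in TITLES]

def looks_coreferent(a, b):
    """Naive surface test: a token of one mention is a prefix of a token of the other."""
    return any(x.startswith(y) or y.startswith(x)
               for x in name_tokens(a) for y in name_tokens(b))

def cluster(mentions):
    """Greedily merge each mention into every existing cluster it links to."""
    clusters = []
    for mention in mentions:
        linked = [c for c in clusters if any(looks_coreferent(mention, m) for m in c)]
        merged = [m for c in linked for m in c] + [mention]
        clusters = [c for c in clusters if c not in linked] + [merged]
    return clusters

MENTIONS = ["Christopher Robin", "Chris", "Mr. Robin"]
clusters = cluster(MENTIONS)
```

The heuristic puts all three mentions in one cluster. A reader of the passage would most likely take "Mr. Robin" to be Chris's father, the person who wrote the poem and then the book, so resolving the mention correctly requires understanding the discourse rather than matching names.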

Questions?

