Published by Meredith Green. Modified over 6 years ago.
1
Introduction to NLP What is Natural Language Processing?
Many slides reused from Dan Jurafsky/Christopher Manning (Stanford)
2
Question Answering: IBM’s Watson
Won Jeopardy on February 16, 2011!
Clue: WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDOVIA” INSPIRED THIS AUTHOR’S MOST FAMOUS NOVEL
Answer: Bram Stoker
3
Information Extraction
Email:
  Subject: curriculum meeting
  Date: Aug. 22, 2017
  To: Susan Gauch
  Hi Susan, we’ve now scheduled the curriculum meeting. It will be in JBHT 532 tomorrow from 10:00-11:30. -Dave
Create new Calendar entry:
  Event: Curriculum mtg
  Date: Aug. 23, 2017
  Start: 10:00am
  End: 11:30am
  Where: JBHT 532
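The email-to-calendar example can be approximated with a few hand-written patterns. The sketch below is only an illustration under stated assumptions: the function name `extract_event`, the regular expressions, and the output field names are all hypothetical, and a real extractor would need to handle far more variation in how dates, times, and places are written.

```python
import re
from datetime import datetime, timedelta

def extract_event(sent_date: str, body: str) -> dict:
    """Pull calendar fields out of a short email with hand-written patterns.

    Hypothetical helper for illustration; the patterns only cover the
    phrasing used in the slide's example email.
    """
    sent = datetime.strptime(sent_date, "%b. %d, %Y")
    # Resolve the relative expression "tomorrow" against the send date.
    has_tomorrow = re.search(r"\btomorrow\b", body, re.IGNORECASE)
    day = sent + timedelta(days=1) if has_tomorrow else sent
    # A time range such as "10:00-11:30".
    times = re.search(r"(\d{1,2}:\d{2})\s*-\s*(\d{1,2}:\d{2})", body)
    # A room written as an upper-case building code plus a 3-digit number.
    room = re.search(r"\b([A-Z]{2,})\s+(\d{3})\b", body)
    return {
        "date": day.strftime("%Y-%m-%d"),
        "start": times.group(1) if times else None,
        "end": times.group(2) if times else None,
        "where": f"{room.group(1)} {room.group(2)}" if room else None,
    }

email_body = (
    "Hi Susan, we've now scheduled the curriculum meeting. "
    "It will be in JBHT 532 tomorrow from 10:00-11:30. -Dave"
)
event = extract_event("Aug. 22, 2017", email_body)
print(event)
```

Note that the date field comes from world knowledge plus arithmetic ("tomorrow" relative to Aug. 22), not from any string in the message body.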
4
Information Extraction & Sentiment Analysis
Attributes: zoom, affordability, size and weight, flash, ease of use
Size and weight:
  ✓ nice and compact to carry!
  ✓ since the camera is small and light, I won't need to carry around those heavy, bulky professional cameras either!
  ✗ the camera feels flimsy, is plastic and very light in weight you have to be very delicate in the handling of this camera
5
Machine Translation
Fully automatic
Helping human translators
Enter Source Text: 这 不过 是 一 个 时间 的 问题 .
Translation from Stanford’s Phrasal: This is only a matter of time.
6
Making progress on this problem…
The task is difficult! What tools do we need?
  Knowledge about language
  Knowledge about the world
  A way to combine knowledge sources
How we generally do this: probabilistic models built from language data
  P(“maison” → “house”) high
  P(“L’avocat général” → “the general avocado”) low
Luckily, rough text features can often do half the job.
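One way to picture "probabilistic models built from language data" is relative-frequency estimation over aligned word pairs. The sketch below is a toy under stated assumptions: the aligned pairs are invented for illustration (a real system would learn alignments from a large parallel corpus), and `translation_probs` is a hypothetical helper, not part of any particular toolkit.

```python
from collections import Counter, defaultdict

# Hypothetical aligned (French, English) word pairs, invented for illustration.
aligned_pairs = [
    ("maison", "house"), ("maison", "house"), ("maison", "house"),
    ("maison", "home"),
    ("avocat", "lawyer"), ("avocat", "lawyer"), ("avocat", "lawyer"),
    ("avocat", "avocado"),
]

def translation_probs(pairs):
    """Estimate P(target | source) as count(source, target) / count(source)."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        counts[src][tgt] += 1
    return {
        src: {tgt: c / sum(tgts.values()) for tgt, c in tgts.items()}
        for src, tgts in counts.items()
    }

probs = translation_probs(aligned_pairs)
print(probs["maison"]["house"])    # 0.75: the frequent translation scores high
print(probs["avocat"]["avocado"])  # 0.25: the rare reading scores low
```

The model knows nothing about law or fruit; the preference for "lawyer" falls out of the counts alone.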
7
Ambiguity makes NLP hard: “Crash blossoms”
100% REAL headlines:
  Violinist Linked to JAL Crash Blossoms
  Teacher Strikes Idle Kids
  Red Tape Holds Up New Bridges
  Hospitals Are Sued by 7 Foot Doctors
  Juvenile Court to Try Shooting Defendant
  Local High School Dropouts Cut in Half
8
A few of the 83+ parses for: The post office will hold out discounts and service concessions as incentives. [Shortened WSJ sentence.]
[S [NP The post office] [Aux will] [VP [V hold out] [NP [NP discounts] [Conj and] [NP [N service] [N concessions] [PP as incentives]]]]]
9
[S [NP The post office] [Aux will] [VP [V hold out] [NP [NP discounts] [Conj and] [NP service concessions]] [PP as incentives]]]
10
[S [NP The post office] [Aux will] [VP [VP [V hold out] [NP discounts]] [Conj and] [VP [V service] [NP concessions] [PP as incentives]]]]
11
[S [NP The post office] [Aux will] [VP [VP [V hold out] [NP discounts]] [Conj and] [VP [V service] [NP concessions] [PP as incentives]]]]
12
[S [NP The post office] [Aux will] [VP [V hold] [PP [P out] [NP [NP discounts] [Conj and] [NP service concessions]]] [PP as incentives]]]
13
[S [NP The post office will] [VP [VP [V hold] [NP out discounts]] [Conj and] [VP [V service] [NP concessions] [PP as incentives]]]]
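The way parses multiply can be made concrete by counting trees with a CKY-style chart. The sketch below is a minimal illustration under stated assumptions: the grammar and lexicon are a tiny hypothetical toy in Chomsky normal form (not the grammar behind the 83+ parses), and the test sentences are simplified variants of the post-office example.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form: (parent, left, right).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "they": ["NP"], "offer": ["V"], "discounts": ["NP"],
    "as": ["P"], "incentives": ["NP"], "in": ["P"], "stores": ["NP"],
}

def count_parses(words, start="S"):
    """Count distinct parse trees for `words` using a CKY chart of counts."""
    n = len(words)
    # chart[i][j][X] = number of trees rooted in X that span words[i:j]
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag in LEXICON.get(word, []):
            chart[i][i + 1][tag] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY_RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][start]

one_pp = count_parses("they offer discounts as incentives".split())
two_pp = count_parses("they offer discounts as incentives in stores".split())
print(one_pp, two_pp)  # 2 5
```

One prepositional phrase already gives two attachments; adding a second raises the count to five, which hints at why a full-coverage grammar produces dozens of parses for an ordinary newspaper sentence.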
14
Famous Ambiguity Examples
Time flies like an arrow
Translation English -> Russian -> English:
  “The spirit is willing but the flesh is weak” became “The vodka is good but the meat is rotten”
Punctuation matters: Czarina saved a man’s life by changing
  “Pardon impossible, to be sent to Siberia” to “Pardon, impossible to be sent to Siberia”
15
Where do problems come in?
Syntax
  Part of speech ambiguities
  Attachment ambiguities
Semantics
  Word sense ambiguities
  (Semantic interpretation and scope ambiguities)
16
Why else is natural language understanding difficult?
non-standard English: Great Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥
segmentation issues: the New York-New Haven Railroad
idioms: dark horse, get cold feet, lose face, throw in the towel
neologisms: unfriend, Retweet, bromance
world knowledge: Mary and Sue are sisters. Mary and Sue are mothers.
tricky entity names: Where is A Bug’s Life playing … / Let It Be was recorded … / … a mutation on the for gene …
But that’s what makes it fun!
17
Language Technology
mostly solved
  Spam detection: ✓ Let’s go to Agra! ✗ Buy V1AGRA …
  Part-of-speech (POS) tagging: Colorless green ideas sleep furiously. (ADJ ADJ NOUN VERB ADV)
  Named entity recognition (NER): Einstein met with UN officials in Princeton (PERSON ORG LOC)
making good progress
  Sentiment analysis: Best roast chicken in San Francisco! / The waiter ignored us for 20 minutes.
  Coreference resolution: Carter told Mubarak he shouldn’t run again.
  Word sense disambiguation (WSD): I need new batteries for my mouse.
  Parsing: I can see Alcatraz from the window!
  Machine translation (MT): 第13届上海国际电影节开幕… -> The 13th Shanghai International Film Festival…
  Information extraction (IE): You’re invited to our dinner party, Friday May 27 at 8:30 -> Party, May 27, add
still really hard
  Question answering (QA): Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?
  Paraphrase: XYZ acquired ABC yesterday / ABC has been taken over by XYZ
  Summarization: The Dow Jones is up; The S&P500 jumped; Housing prices rose -> Economy is good
  Dialog: Where is Citizen Kane playing in SF? / Castro Theatre at 7:30. Do you want a ticket?
18
Statistical NLP methods
Involve deriving numerical data from text
Are usually but not always probabilistic
Many techniques are used:
  n-grams, history-based models, decision trees / decision lists, memory-based learning, loglinear models, HMMs, neural networks, vector spaces, graphical models, …
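As one concrete instance of "deriving numerical data from text", an n-gram model with n = 2 can be estimated by counting adjacent word pairs. This is a minimal sketch under stated assumptions: the three-sentence corpus is invented, the estimates are unsmoothed maximum-likelihood values, and the helper names `train_bigram` and `bigram_prob` are hypothetical.

```python
from collections import Counter

# Tiny invented corpus; real models are trained on millions of sentences.
corpus = [
    "the post office will hold out discounts",
    "the post office will close",
    "the office will open",
]

def train_bigram(sentences):
    """Count context words and adjacent word pairs, with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent pairs
    return contexts, bigrams

def bigram_prob(prev, word, contexts, bigrams):
    """Unsmoothed estimate P(word | prev) = count(prev, word) / count(prev)."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]

contexts, bigrams = train_bigram(corpus)
p_post = bigram_prob("the", "post", contexts, bigrams)     # 2 of 3 "the" contexts
p_will = bigram_prob("office", "will", contexts, bigrams)  # 3 of 3 "office" contexts
print(p_post, p_will)
```

Unseen pairs get probability zero here, which is why practical n-gram models add smoothing.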
19
This class
Teaches key theory and methods for statistical NLP:
  Probability and information theory
  Classifiers
  N-gram language modeling
  Statistical parsing
  Inverted index, tf-idf, vector models of meaning
For practical, robust real-world applications:
  Information extraction
  Spelling correction
  Information retrieval
  Sentiment analysis
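To give a feel for tf-idf and vector models of meaning, the sketch below weights each term by raw term frequency times log(N / document frequency) and compares documents by cosine similarity. It is one common variant among several, shown under stated assumptions: the three documents are invented, and `tfidf_vectors` and `cosine` are hypothetical helpers.

```python
import math
from collections import Counter

# Three invented documents; "the" occurs in all of them.
docs = {
    "d1": "the camera is small and light",
    "d2": "the camera feels flimsy",
    "d3": "the meeting starts tomorrow",
}

def tfidf_vectors(documents):
    """Build sparse tf-idf vectors: weight = tf * log(N / df)."""
    tokenized = {name: text.split() for name, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[name] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = tfidf_vectors(docs)
sim_cameras = cosine(vectors["d1"], vectors["d2"])    # share the term "camera"
sim_unrelated = cosine(vectors["d1"], vectors["d3"])  # share only "the"
print(sim_cameras > sim_unrelated)
```

Because "the" appears in every document its weight is zero, so the two camera reviews come out similar while the meeting note shares nothing informative with either.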
20
Skills you’ll need
  Simple linear algebra (vectors, matrices)
  Basic probability theory
  C++, Java, or Python programming
  Knowledge of data structures
  User-level knowledge of Linux
21
What is Natural Language Processing?
Introduction to NLP What is Natural Language Processing?