Learning to Laugh. By: Danielle Tabashi. Based on the article: "Learning to Laugh (Automatically): Computational Models for Humor Recognition" by Rada Mihalcea and Carlo Strapparava.
What makes us laugh???
What makes us laugh? Human-centric vocabulary. Freud & Minsky: "laughter is often provoked by feelings of frustration caused by our own, sometime awkward, behavior". ~25% of the jokes in the collection include the word "you"; ~15% include the word "I".
What makes us laugh? Negation and negative orientation. ~20% of the jokes in the collection contain a negation: can't, don't, isn't and so on. Others contain words with negative connotations: bad, failure. For example: "Money can't buy you friends, but you do get a better class of enemy".
What makes us laugh? Professional communities. Many jokes seem to target professional communities. For example: "It was so cold last winter that I saw a lawyer with his hands in his own pockets". These are jokes aimed at communities of professionals; for us, programmer jokes are probably the most relevant example.
What makes us laugh? Human “weakness”- Events or entities that are associated with “weak” human moments. For example: “If you can’t drink and drive, then why do bars have parking lots?”
What is a one-liner? A short sentence with a comic effect and an interesting linguistic structure. An interesting linguistic structure means: simple syntax, use of rhetorical devices, and frequent use of creative language constructions meant to attract the reader's attention. Example: Take my advice; I don't use it anyway. The investigation focuses on the type of humor found in one-liners. Rhetoric is the art of speech; one example is the repetition of sounds and roots, as in the alliteration we will see later.
So… how can we teach the computer to recognize humor?!
The main idea: train the computational models on humorous and non-humorous examples.
Overview: Humorous and non-humorous data-sets. Automatic humor recognition. Experimental results.
Humorous and non-humorous data-sets
Step 1: Build the humorous data-set
What is bootstrapping? A process which expands a data set. We use it when one data set is much larger than the other; otherwise, the learning would be biased. Input: a few examples given by the user. Output: a bigger set of examples. Why would we want to enlarge one of the sets at all, and why should they be roughly equal in size? Because if I have 99 positive examples and one negative example, and I build a trivial model that always answers "yes", then my model is supposedly 99% accurate; but of course that is not really true, since we never answer "no". It is a bad model.
Web bootstrapping: start with a small group of manually identified one-liners. Use a web search engine to find web pages that include at least one of these one-liners. Parse the HTML of those pages to find more one-liners, add them to the group, and repeat the search with the enlarged group.
Constraints. Problem: the process can add noisy examples. Solution: two constraints: a thematic constraint (joke-related keywords must appear in the URL) and a structural constraint (candidates must occur in an enumeration, i.e. a list or adjacent paragraphs). A sketch of the loop with both constraints follows.
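Below is a minimal Python sketch of the bootstrapping loop with both constraints applied. The web_search and extract_candidates helpers are hypothetical placeholders for the search-engine query and the HTML parsing of lists and adjacent paragraphs; the URL keywords are illustrative.

```python
# A minimal sketch of the web bootstrapping loop with both constraints.
# web_search and extract_candidates are hypothetical placeholders for the
# search-engine query and the HTML parsing of lists / adjacent paragraphs.

def web_search(query):
    """Hypothetical: return pages (dicts with 'url' and 'html') containing the query."""
    return []

def extract_candidates(page):
    """Hypothetical: parse enumerations (lists, adjacent paragraphs) into sentences."""
    return []

def url_is_thematic(page):
    """Thematic constraint: a joke-related keyword must appear in the URL."""
    return any(k in page["url"].lower() for k in ("joke", "humor", "funny", "oneliner"))

def bootstrap(seed_one_liners, rounds=3):
    collected = set(seed_one_liners)
    for _ in range(rounds):
        new_candidates = set()
        for one_liner in collected:
            for page in web_search(one_liner):
                if not url_is_thematic(page):            # thematic constraint
                    continue
                # the structural constraint is applied inside extract_candidates
                new_candidates.update(extract_candidates(page))
        collected |= new_candidates
    return collected

print(len(bootstrap({"Take my advice; I don't use it anyway"})))
```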
Step 2: Build the non-humorous data-set
The problem: we want non-humorous examples whose structure and vocabulary are similar to those of the humorous examples, so that the classifier learns humor-specific features rather than superficial differences. Let's look at four different sets of negative examples.
4 sets of negative examples: Reuters titles – headlines from news articles: short sentences phrased to catch the reader's attention.
4 sets of negative examples: Proverbs – short sentences that transmit important facts or true experience. For example: Beauty is in the eye of the beholder.
4 sets of negative examples: British National Corpus (BNC) sentences – sentences selected to be similar in content to the one-liners. For example: I wonder if there are some contradiction here. "I" and "contradiction" are markers of the humor vocabulary we mentioned at the beginning, yet there is nothing actually humorous here.
4 sets of negative examples: Open Mind Common Sense (OMCS) sentences – explanations and assertions. The comic effect of jokes is based on statements that break our commonsensical understanding of the world. For example: A file is used for keeping documents. These sentences are very important for the learning, because humor is based on breaking our understanding of the world, so we need to teach the model to distinguish sentences that are genuinely true from sentences in which a small change creates a humorous effect.
Build the data-set: summary. Positive examples: a few one-liners, expanded by web-based bootstrapping into a large set of one-liners. Negative examples: Reuters titles, proverbs, BNC sentences, OMCS sentences.
Overview: Humorous and non-humorous data-sets. Automatic humor recognition. Experimental results.
Automatic classification. Humor-specific stylistic features: alliteration, antonymy, adult slang. Content-based learning: Naïve Bayes, support vector machine. Our goal is to build something that classifies every sentence as humorous or not. The classifier uses two main kinds of tools: one based on the structure and style of the sentence, and one based on its content. Antonymy means opposite meanings.
Humor-specific stylistic features. Linguistic theories of humor have suggested many stylistic features that characterize humorous text, such as alliteration, antonymy and adult slang. We now focus on the tools based on structure and style.
Alliteration. For example: Infants don't enjoy infancy like adults do adultery. The algorithm steps: (1) Convert the sentence to its phonetic transcription using the CMU pronunciation dictionary; in our example, infants - infancy is converted to IH1 N F AH0 N T S - IH1 N F AH0 N S IY0. (2) Find the longest matching phonetic chains. (3) Count the matches. Alliteration means this kind of sound repetition; see the sketch below.
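A minimal Python sketch of the alliteration feature, using the CMU Pronouncing Dictionary through NLTK (this assumes nltk is installed and the 'cmudict' corpus has been downloaded). Counting matching phonetic prefixes between word pairs is a simplification of the chain-finding described above.

```python
# A minimal sketch of the alliteration feature using the CMU Pronouncing
# Dictionary via NLTK. Assumes nltk.download('cmudict') was run beforehand.
from nltk.corpus import cmudict

PRONUNCIATIONS = cmudict.dict()   # word -> list of phoneme sequences

def phonetic(word):
    """Return the first listed pronunciation of a word, or None if unknown."""
    prons = PRONUNCIATIONS.get(word.lower())
    return prons[0] if prons else None

def alliteration_score(sentence, min_shared=3):
    """Count word pairs whose pronunciations share a phonetic prefix."""
    words = [w.strip(".,!?;:'\"").lower() for w in sentence.split()]
    chains = [p for p in (phonetic(w) for w in words) if p]
    score = 0
    for i in range(len(chains)):
        for j in range(i + 1, len(chains)):
            shared = 0
            for a, b in zip(chains[i], chains[j]):
                if a != b:
                    break
                shared += 1
            if shared >= min_shared:
                score += 1
    return score

print(alliteration_score("Infants don't enjoy infancy like adults do adultery"))
```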
Antonymy. Always try to be modest, and be proud of it! Use the WordNet resource to recognize antonyms in a sentence; specifically, use the antonymy relation among nouns, verbs, adjectives and adverbs. Antonymy means opposite meanings, as in the example: modest versus proud. WordNet is an online lexical resource for analyzing word meanings; it provides synonyms, antonyms, and so on.
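A minimal Python sketch of the antonymy feature using WordNet through NLTK (assumes the 'wordnet' corpus has been downloaded). It checks only the direct antonymy relation, so pairs such as modest/proud, which are opposed only through related senses, may be missed by this simple version.

```python
# A minimal sketch of the antonymy feature with WordNet via NLTK.
# Only the direct antonymy relation is checked here.
from nltk.corpus import wordnet as wn

def antonyms_of(word):
    """All direct WordNet antonyms of a word, across its senses."""
    result = set()
    for synset in wn.synsets(word):
        for lemma in synset.lemmas():
            for ant in lemma.antonyms():
                result.add(ant.name().lower())
    return result

def has_antonym_pair(sentence):
    """True if any two words in the sentence are direct WordNet antonyms."""
    words = {w.strip(".,!?;:'\"").lower() for w in sentence.split()}
    return any(antonyms_of(w) & words for w in words)

print(antonyms_of("modest"))                       # direct antonyms only
print(has_antonym_pair("It was a good bad joke"))  # True: good <-> bad
```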
Adult slang. Search the sentence for sexually oriented lexicon. Use WordNet Domains: extract all the synonym sets labeled with the domain SEXUALITY. I did not include an example here because I find it a bit crude; in general this slang is explicit because of its sexual elements, so I preferred to avoid it.
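A minimal sketch of the adult-slang feature. WordNet Domains is a separate resource not bundled with NLTK, so this sketch assumes the lemmas of all synsets labeled SEXUALITY were extracted beforehand into a plain text file; the file name is a placeholder.

```python
# A minimal sketch of the adult-slang feature, assuming a pre-extracted
# lexicon of SEXUALITY-domain lemmas saved one per line in a text file.
def load_sexuality_lexicon(path="sexuality_lemmas.txt"):
    """Load the pre-extracted WordNet Domains SEXUALITY lemmas."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def adult_slang_count(sentence, lexicon):
    """Count the words of the sentence that appear in the lexicon."""
    words = [w.strip(".,!?;:'\"").lower() for w in sentence.split()]
    return sum(1 for w in words if w in lexicon)
```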
Automatic classification. Humor-specific stylistic features: alliteration, antonymy, adult slang. Content-based learning: Naïve Bayes, support vector machine.
Content-based learning. Another way to recognize humor is to use traditional text classification. We give the algorithm labeled examples so it can learn to classify unlabeled examples. In this case, we use two algorithms: Naïve Bayes and support vector machines.
Naïve Bayes. The main idea is to estimate the probability of a category given a document using joint probabilities of words and documents. It assumes word independence. Build a probability table from the training set, then predict the category of a document according to the probability table.
Example – Naïve Bayes. Training sentences: 1. "I love you" (+); 2. "I hate you" (-); 3. "I love Ben". Estimated probabilities: P(+) = 2/3, P(-) = 1/3, P(i|+) = …, P(i|-) = …, P(Ben|+) = …, P(Ben|-) = … Word-occurrence table: for each sentence, which of the words I / love / you / hate / Ben it contains, together with its label (sentence 1: +, sentence 2: -, sentence 3).
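A minimal sketch of this toy example with scikit-learn (the library choice is an assumption; the paper does not prescribe an implementation). MultinomialNB estimates P(label) and P(word|label) from the word counts and predicts the most probable label for a new sentence.

```python
# A minimal sketch of the toy Naive Bayes example with scikit-learn,
# treating sentence 3 as the one to classify.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts  = ["I love you", "I hate you"]
labels = ["+", "-"]

vec = CountVectorizer()
X = vec.fit_transform(texts)            # bag-of-words counts per sentence
clf = MultinomialNB().fit(X, labels)    # estimates P(label) and P(word|label)

print(clf.predict(vec.transform(["I love Ben"])))  # predicted label for sentence 3
```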
Support vector machine. Each data point is a vector in a p-dimensional space. It is a binary classifier: it finds the hyperplane that best separates the set of positive examples from the set of negative examples.
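A minimal sketch of a linear SVM text classifier with scikit-learn; the library and the two toy training sentences are illustrative assumptions, not the paper's actual setup.

```python
# A minimal sketch of a linear SVM over bag-of-words features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(
    ["Take my advice; I don't use it anyway",   # humorous one-liner
     "Oil prices rise on supply concerns"],     # Reuters-style title
    ["humor", "non-humor"],
)
print(model.predict(["Money can't buy you friends, but you do get a better class of enemy"]))
```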
Overview: Humorous and non-humorous data-sets. Automatic humor recognition. Experimental results.
Experimental Results Heuristic using humor-specific features Text classification with content features Combining stylistic and content features.
Heuristic using humor-specific features. We use thresholds that are learned automatically with a decision tree on a small subset of the data. These thresholds are the only parameters required to classify a statement as humorous or non-humorous. We then evaluate the model. Reminder: the stylistic humor-specific features are alliteration, antonymy and adult slang. An example of a decision tree is on the next slide.
Decision tree. Build a decision tree based on a small training set. Example tree: the root asks "Rainy?"; each branch then tests a temperature threshold (Temp > 25 on one side, Temp > 20 on the other); the leaves are the labels winter and spring.
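A minimal sketch of learning the thresholds with a decision tree over the three stylistic features, using scikit-learn; the feature values and labels below are invented purely for illustration.

```python
# A minimal sketch of threshold learning with a decision tree over the
# three stylistic features (values and labels are made up).
from sklearn.tree import DecisionTreeClassifier

# each row: [alliteration score, antonymy count, adult-slang count]
X = [[3, 1, 0], [0, 0, 0], [2, 0, 1], [0, 1, 0], [1, 0, 0], [4, 0, 0]]
y = [1, 0, 1, 0, 0, 1]   # 1 = humorous, 0 = non-humorous

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(tree.predict([[2, 1, 0]]))   # classify a new sentence's feature vector
```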
Heuristic using humor-specific features- Cont. The style of Reuters titles is the most different with respect to one-liners. The style of Proverbs is the most similar. The alliteration feature is the most useful indicator of humor.
Experimental Results Heuristic using humor-specific features Text classification with content features Combining stylistic and content features.
Text classification with content features The BNC sentences are the most similar data. The Reuters are the most different with respect to one-liners.
Experimental Results Heuristic using humor-specific features Text classification with content features Combining stylistic and content features.
Combining stylistic and content features. Build a new learning model: very intuitively, each example becomes a vector with a coordinate for the text classifier's output and coordinates for the three humor-specific stylistic features; these vectors are the training set for the learning machine (see the sketch below). The results: no improvement for Proverbs and OMCS, because, as we have just seen, these statements cannot be clearly differentiated from one-liners using stylistic features.
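A minimal sketch of the combined model under these assumptions: each sentence becomes a vector holding the content classifier's score plus the three stylistic feature values, and a second learner is trained on those vectors. The tiny data set and the stubbed stylistic scores are placeholders.

```python
# A minimal sketch of the combined (stylistic + content) model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

texts  = ["Take my advice; I don't use it anyway",
          "Oil prices rise on supply concerns"]
labels = [1, 0]   # 1 = one-liner, 0 = non-humorous

vec = CountVectorizer().fit(texts)
content_clf = MultinomialNB().fit(vec.transform(texts), labels)

def stylistic_features(sentence):
    """Placeholder: plug in the alliteration / antonymy / adult-slang sketches."""
    return [0, 0, 0]

def combined_vector(sentence):
    """Content classifier score followed by the three stylistic features."""
    content_score = content_clf.predict_proba(vec.transform([sentence]))[0][1]
    return [content_score] + stylistic_features(sentence)

X_meta = [combined_vector(s) for s in texts]
meta_clf = DecisionTreeClassifier().fit(X_meta, labels)
print(meta_clf.predict([combined_vector("Money can't buy you friends")]))
```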
Difficulties: word similarities in different semantic spaces; where computers fail. We will present a few of the main difficulties in this research.
Word similarities in different semantic spaces. We can see that the BNC seems to be the most "neutral" in its suggestions. The issue here is the general difficulty of using content analysis to recognize humor: each corpus interprets things differently, and their semantic spaces are completely different. As background on semantic similarity: similarity between words differs across semantic spaces. The words most similar to "beautiful" in the BNC's semantic space are what we would naturally expect, but in other spaces we find completely different things.
Word similarities in different semantic spaces. The one-liners do not seem to recognize real beauty, in agreement with theories of humor that suggest incongruity and opposition as sources of laughter. That is, following what we discussed at the beginning about what makes us laugh, we can see that opposites and incongruity are expressed quite well here.
Word similarities in different semantic spaces. The proverbs suggest that beauty vanishes soon, which can reflect the educational purpose of proverbial sayings. "Beauty vanishes soon" in the spirit of a proverb; for example, the similar words include "fading" and so on.
Word similarities in different semantic spaces The OMCS gives words that are related to feminine beauty.
Word similarities in different semantic spaces. Reuters suggests that beauty is related to achieving important economic targets.
Word similarities in different semantic spaces. The one-liners are very similar to the commonsense (OMCS) and BNC sentences and very different from Reuters. This agrees with the content-based classification results – HUMOR TENDS TO BE SIMILAR TO REGULAR TEXT.
Discussion: word similarities in different semantic spaces; where computers fail.
Where computers fail: some markers of humor that the computer does not recognize. Irony – responsible for the humorous effect in more than 50% of the sample. The irony is targeted at the speaker, the dialogue partner, or an entire professional community.
Where computers fail. Ambiguity – responsible for the humorous effect in more than 20% of the sample: word ambiguity and the corresponding potential misinterpretation. For example: "Change is inevitable, except from a vending machine." In the first part of the sentence we read "change" as the act of changing something; the second part refers to it as money. The surprise creates the humorous effect.
Where computers fail. Incongruity – for example: "A diplomat is someone who can tell you to go to hell in such a way that you will look forward to the trip". The comic effect cannot be recognized via WordNet and similar resources; we need corpus-based approaches to incongruity detection. The problem here is that WordNet cannot help, because it will not detect the mismatch; we need something that can find incongruity within a sentence even when it is not a clear opposition of antonyms.
Where computers fail. Idiomatic expressions – for example: "I used to have an open mind, but my brains kept falling out". The idiom 'open mind' means receptive to new ideas, while the joke uses 'open' in its literal sense of physically exposed.
Where computers fail. Commonsense knowledge – for example: "I like kids, but I don't think I could eat a whole one". This one-liner is based on the commonsense knowledge that one does not eat kids. Compare it with "I like chickens, but I don't think I could eat a whole one", which is a perfectly reasonable sentence with no humor at all.