Learning Multilingual Subjective Language via Cross-Lingual Projections Mihalcea, Banea, and Wiebe ACL 2007 NLG Lab Seminar 4/11/2008.

Presentation transcript:

Learning Multilingual Subjective Language via Cross-Lingual Projections. Mihalcea, Banea, and Wiebe, ACL 2007. NLG Lab Seminar, 4/11/2008.

Objective: Build a Romanian sentence subjectivity classifier from English resources, using no language-specific information.

First Approach: English Lexicon → (translate) → Romanian Lexicon → (train) → Romanian Sentence Subjectivity Classifier

English Lexicon: from OpinionFinder (Wiebe); manual annotation; 6,856 entries (990 multi-word entries); each entry marked strong vs. weak.

Translation and Challenges: dictionary-based translation. Challenges: lemmas, word-sense disambiguation, multi-word translation.
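The dictionary-based step and two of its challenges can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: `TOY_DICT`, `translate_entry`, and `translate_lexicon` are hypothetical names, the dictionary entries are toy data, and taking the first listed translation (no word-sense disambiguation) and translating multi-word entries word by word are simplifications, not necessarily the authors' exact procedure.

```python
from typing import Dict, List, Optional

# Hypothetical toy bilingual dictionary: English lemma -> Romanian candidates.
# The entries are illustrative only, not taken from the slides.
TOY_DICT: Dict[str, List[str]] = {
    "beautiful": ["frumos"],
    "hate": ["ură", "a urî"],
    "one": ["unu"],
    "sided": ["lateral"],
}


def translate_entry(entry: str, dictionary: Dict[str, List[str]]) -> Optional[str]:
    """Translate one lexicon entry; return None when no translation is found."""
    key = entry.lower()
    if key in dictionary:
        # No word-sense disambiguation: simply take the first listed candidate.
        return dictionary[key][0]
    words = key.split()
    if len(words) > 1 and all(w in dictionary for w in words):
        # Multi-word fallback: translate word by word (often a poor translation).
        return " ".join(dictionary[w][0] for w in words)
    return None


def translate_lexicon(
    lexicon: Dict[str, str], dictionary: Dict[str, List[str]]
) -> Dict[str, str]:
    """Carry each entry's strong/weak label over to its translation, dropping misses."""
    translated: Dict[str, str] = {}
    for entry, strength in lexicon.items():
        target = translate_entry(entry, dictionary)
        if target is not None:
            translated[target] = strength
    return translated
```

The word-by-word fallback shows why multi-word entries are listed as a challenge: a literal translation of each word rarely yields a valid expression in the target language.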

Sample Translation

Translation Quality: 2 Romanian annotators; accuracy: 94/150 (63%)

Romanian Sentence Subjectivity Classifier Algorithm: if the sentence contains a strong expression → subjective; else if it contains at most 2 weak expressions → objective; else → unknown
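A rule of this kind could be sketched in Python as follows. The thresholds are an assumed reading of the slide (at least one strong expression means subjective; otherwise at most two weak expressions means objective), and `classify_sentence`, `max_weak`, whitespace tokenization, and single-word lexicon lookup are all illustrative assumptions.

```python
from typing import Dict


def classify_sentence(sentence: str, lexicon: Dict[str, str], max_weak: int = 2) -> str:
    """Label a sentence using a lexicon that maps each word to 'strong' or 'weak'.

    Assumed reading of the rule: any strong clue -> subjective;
    otherwise at most `max_weak` weak clues -> objective; otherwise unknown.
    """
    tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
    strong = sum(1 for t in tokens if lexicon.get(t) == "strong")
    weak = sum(1 for t in tokens if lexicon.get(t) == "weak")
    if strong >= 1:
        return "subjective"
    if weak <= max_weak:
        return "objective"
    return "unknown"
```

With a placeholder lexicon such as `{"s1": "strong", "w1": "weak", "w2": "weak", "w3": "weak"}`, the sentence `"w1 w2 w3"` has three weak clues and no strong one, so it falls through to `"unknown"`.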

Romanian Classifier Performance

Problem with First Approach: English Lexicon → (translate) → Romanian Lexicon → (train) → Romanian Sentence Subjectivity Classifier

Second Approach: Annotated English Sentence Subjectivity Corpus → (train) → OpinionFinder Classifier → (tag) → Parallel Corpus: English → (project) → Parallel Corpus: Romanian → (train) → Romanian Classifier
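The tag-and-project steps of this pipeline can be sketched in Python as below. This is a minimal illustration under stated assumptions: `project_labels` is a hypothetical helper, `tag_english` stands in for the English classifier (it is not OpinionFinder's actual interface), the parallel corpus is assumed to be sentence-aligned, and discarding pairs whose English side is not labeled subjective or objective is an assumption.

```python
from typing import Callable, List, Tuple


def project_labels(
    parallel_pairs: List[Tuple[str, str]],
    tag_english: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Tag the English side of each aligned pair and copy the label to the Romanian side.

    `parallel_pairs` holds (english_sentence, romanian_sentence) tuples.
    Pairs whose English sentence gets any other label (e.g. 'unknown') are dropped.
    """
    projected: List[Tuple[str, str]] = []
    for english, romanian in parallel_pairs:
        label = tag_english(english)
        if label in ("subjective", "objective"):
            projected.append((romanian, label))
    return projected
```

The result is a list of labeled Romanian sentences that can serve as training data for the Romanian classifier, with no manual Romanian annotation required.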

OpinionFinder Sentence Subjectivity Classifier Performance

Romanian Sentence Subjectivity Classifier Algorithm: Naïve Bayes with word features
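A Naïve Bayes classifier over word features can be sketched with the Python standard library alone. This is a generic multinomial Naïve Bayes with add-one smoothing, offered as an illustration rather than the authors' exact setup; the class name `NaiveBayes`, the smoothing parameter `alpha`, whitespace tokenization, and skipping words unseen in training are all assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, Optional, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated word features."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha                       # additive (Laplace) smoothing
        self.class_counts: Counter = Counter()   # sentences per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab: set = set()

    def fit(self, data: Iterable[Tuple[str, str]]) -> "NaiveBayes":
        """Count label and word frequencies from (sentence, label) pairs."""
        for text, label in data:
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text: str) -> Optional[str]:
        """Return the label with the highest log posterior (None if untrained)."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:  # words unseen in training are skipped
                    score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In the setting described here, the training pairs would be Romanian sentences with labels projected from the English side of the parallel corpus, for example `NaiveBayes().fit(projected_pairs).predict(sentence)`.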

Romanian Classifier Performance

First Approach: Language 1 Corpus → (translate) → Language 2 Corpus → (train) → Language 2 Classifier

Second Approach: Annotated Language 1 Corpus → (train) → Language 1 Classifier → (tag) → Parallel Corpus: Language 1 → (project) → Parallel Corpus: Language 2 → (train) → Language 2 Classifier