Learning Multilingual Subjective Language via Cross-Lingual Projections. Mihalcea, Banea, and Wiebe, ACL 2007. NLG Lab Seminar, 4/11/2008.
Objective: Build a Romanian sentence subjectivity classifier from English resources; no language-specific information.
First Approach: English Lexicon -> (translate) -> Romanian Lexicon -> (train) -> Romanian Sentence Subjectivity Classifier
English Lexicon: from OpinionFinder (Wiebe); manual annotation; 6856 entries (990 multi-word entries); strong vs. weak.
Translation and Challenges: dictionary-based; lemma; word-sense disambiguation; multi-word translation.
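The dictionary-based translation step can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy dictionary is made up, entries are assumed to be lemmas already, taking the first listed translation stands in for word-sense disambiguation, and the word-by-word fallback for multi-word entries is a naive placeholder, not necessarily what the authors did. The name `translate_entry` is hypothetical.

```python
from typing import Dict, List, Optional

# Toy bilingual dictionary, made up for illustration. Each English lemma maps
# to candidate Romanian translations; the first one is treated as preferred.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "happy": ["fericit", "bucuros"],
    "hate": ["urî"],
    "good": ["bun"],
    "luck": ["noroc"],
}


def translate_entry(entry: str, dictionary: Dict[str, List[str]]) -> Optional[str]:
    """Translate one lexicon entry, or return None if it cannot be translated.

    Assumption: `entry` is already a lemma, and the first dictionary sense is
    good enough (a crude stand-in for word-sense disambiguation).
    """
    key = entry.lower()
    if key in dictionary:
        return dictionary[key][0]

    # Naive fallback for multi-word entries: translate word by word.
    words = key.split()
    if len(words) > 1:
        parts = []
        for word in words:
            if word not in dictionary:
                return None  # give up if any word is missing
            parts.append(dictionary[word][0])
        return " ".join(parts)

    return None
```

Calling `translate_entry("good luck", TOY_DICTIONARY)` falls through to the word-by-word path, which keeps the English word order; that is exactly the kind of multi-word problem the slide lists as a challenge.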
Sample Translation
Translation Quality: 2 Romanian annotators; accuracy: 94/150 (63%).
Romanian Sentence Subjectivity Classifier, Algorithm: if strong expression, subjective; else if 2 weak expressions, objective; else unknown.
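A minimal sketch of the rule above, under assumptions the slide does not spell out: "strong expression" is read as at least one strong lexicon match, "2 weak expressions" is read as at most two weak matches, and matching is done on single tokens only. The function name, the `max_weak` parameter, and the lexicon sets are hypothetical.

```python
from typing import Iterable, Set


def classify_sentence(
    tokens: Iterable[str],
    strong: Set[str],
    weak: Set[str],
    max_weak: int = 2,  # assumed reading of "2 weak expressions"
) -> str:
    """Label a tokenized sentence as subjective, objective, or unknown."""
    tokens = list(tokens)
    n_strong = sum(1 for t in tokens if t in strong)
    n_weak = sum(1 for t in tokens if t in weak)

    if n_strong >= 1:
        return "subjective"
    if n_weak <= max_weak:
        return "objective"
    return "unknown"
```

Under this reading, a sentence ends up "unknown" only when it has no strong match but more than `max_weak` weak matches.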
Romanian Classifier Performance
Problem with First Approach: English Lexicon -> (translate) -> Romanian Lexicon -> (train) -> Romanian Sentence Subjectivity Classifier
Second Approach: Annotated English Sentence Subjectivity Corpus -> (train) -> OpinionFinder Classifier -> (tag) -> Parallel Corpus: English -> (project) -> Parallel Corpus: Romanian -> (train) -> Romanian Classifier
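The tag-and-project steps of the second approach might look like the sketch below. It assumes a sentence-aligned parallel corpus given as (English, Romanian) pairs and some English tagger passed in as a function; `project_labels` and `toy_english_tagger` are hypothetical names, and the toy tagger is only a stand-in for OpinionFinder, not its real interface.

```python
from typing import Callable, List, Tuple


def project_labels(
    parallel_pairs: List[Tuple[str, str]],
    tag_english: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Tag each English sentence and copy its label to the aligned Romanian one.

    Returns (romanian_sentence, label) pairs usable as training data.
    """
    projected = []
    for english, romanian in parallel_pairs:
        label = tag_english(english)         # tag the English side
        projected.append((romanian, label))  # project onto the Romanian side
    return projected


def toy_english_tagger(sentence: str) -> str:
    """Made-up stand-in for an English subjectivity classifier."""
    return "subjective" if "love" in sentence.lower() else "objective"


# Toy sentence-aligned corpus, invented for illustration.
toy_corpus = [
    ("I love this book.", "Iubesc această carte."),
    ("The meeting is on Monday.", "Ședința este luni."),
]
romanian_training_data = project_labels(toy_corpus, toy_english_tagger)
```

The resulting pairs carry labels only as reliable as the English tagger, so any tagging error is projected along with the label.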
OpinionFinder Sentence Subjectivity Classifier Performance
Romanian Sentence Subjectivity Classifier, Algorithm: Naïve Bayes; word features.
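For reference, a Naïve Bayes classifier over word features can be sketched in a few lines. This is a generic multinomial variant with add-one smoothing and whitespace tokenization, both of which are assumptions; the slide does not say which variant, smoothing, or tokenizer was used. The class name and the toy training sentences are made up.

```python
import math
from collections import Counter, defaultdict
from typing import Iterable, Optional, Tuple


class WordNaiveBayes:
    """Multinomial Naive Bayes over lowercase whitespace tokens (a sketch)."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha                       # smoothing constant (assumed)
        self.class_counts: Counter = Counter()   # sentences per label
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab: set = set()

    def fit(self, data: Iterable[Tuple[str, str]]) -> "WordNaiveBayes":
        for text, label in data:
            self.class_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text: str) -> Optional[str]:
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][word]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration.
toy_training = [
    ("minunat film", "subjective"),
    ("film groaznic", "subjective"),
    ("filmul durează două ore", "objective"),
    ("sala are zece rânduri", "objective"),
]
model = WordNaiveBayes().fit(toy_training)
```

In the second approach, the training pairs would come from the projected Romanian side of the parallel corpus instead of a hand-written list.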
Romanian Classifier Performance
First Approach: Language 1 Corpus -> (translate) -> Language 2 Corpus -> (train) -> Language 2 Classifier
Second Approach: Annotated Language 1 Corpus -> (train) -> Language 1 Classifier -> (tag) -> Parallel Corpus: Language 1 -> (project) -> Parallel Corpus: Language 2 -> (train) -> Language 2 Classifier