Published by Rodger Hunter. Modified over 9 years ago.
1
Multilingual Opinion Holder Identification Using Author and Authority Viewpoints
Yohei Seki, Noriko Kando, Masaki Aono
Toyohashi University of Technology and National Institute of Informatics, Japan
Journal of Information Processing and Management 2009
Reporter: Chia-Ying Lee
Advisor: Prof. Hsin-Hsi Chen
2
2009/04/27 Cicilia Chia-ying Lee
Outline
1. Problem Definition
2. Corpus: NTCIR-6 pilot
3. Approach in NTCIR-6
4. Revised Approach after NTCIR-6
5. Comparison and Discussion
6. Conclusion
3
Problem Definition (1/2)
Identify the opinion holder in an opinion sentence
This is important because news articles contain many opinions from different opinion holders
Opinion holder:
1. Explicit noun phrases in the sentence
2. Implicit noun phrases (e.g., anaphors)
3. Exophoric elements (e.g., the author)
4
Problem Definition (2/2)
Author: the writer of the document
Authority: third parties
Focus on differences in writing style
Differences in syntactic constructs or term usage
5
Corpus
NTCIR-6 Opinion Analysis Pilot Task
Evaluation method
6
Approach in NTCIR-6
Evaluation results in NTCIR-6
7
Author and Authority Opinion Extraction (1/4)
Three opinion types (Wiebe et al., 2005)
1. Explicit mentions of private states by a person, nation, or organization
2. Speech events expressing private states by an agent
3. Expressive subjective elements (author view)
8
Author and Authority Opinion Extraction (2/4)
Japanese
Training set: NTCIR-6, 4 training topics
Features: syntactic pairs of grammatical subjects and predicates such as pronouns
  Subjects: named entities, semantic primitives, and key terms
  Predicates: semantic primitives from a thesaurus
Parser: CaboCha
9
Author and Authority Opinion Extraction (3/4)
English
Training set: MPQA Corpus
Author view: the "nested source" attribute was "w" (writer) and not nested
Feature: syntactic pairs of syntactic patterns such as nouns and adjectives/verbs
Parser: Minipar
10
Author and Authority Opinion Extraction (4/4)
11
Rule-based Holder Identification
(1) Bracketed elements of PER, ORG, LOC in the sentence.
(2) Grammatical subject elements of PER, ORG, LOC in the sentence.
(3) Grammatical subject elements of PER, ORG, LOC in the previous sentences.
(4) PER, ORG, LOC in the sentence other than those classified by (1) or (2).
Named entity extractor: NExT
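The four rules above can be read as a priority cascade. The following is a minimal Python sketch of that reading, not the authors' implementation. It assumes the rules are tried in order (1) to (4), that rule (3) searches the nearest previous sentence first, and that named entities arrive already tagged. The `Entity` record and `pick_holder` function are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional

# Entity types the slide accepts as holder candidates.
HOLDER_TYPES = {"PER", "ORG", "LOC"}


@dataclass
class Entity:
    """Hypothetical record for one named entity found in a sentence."""
    text: str
    ne_type: str                 # e.g. "PER", "ORG", "LOC"
    is_subject: bool = False     # grammatical subject of the sentence
    is_bracketed: bool = False   # appears inside brackets


def pick_holder(sentence: List[Entity],
                previous: List[List[Entity]]) -> Optional[str]:
    """Return the first candidate matched by rules (1)-(4), tried in order."""
    candidates = [e for e in sentence if e.ne_type in HOLDER_TYPES]

    # Rule (1): bracketed PER/ORG/LOC in the sentence.
    for e in candidates:
        if e.is_bracketed:
            return e.text

    # Rule (2): grammatical-subject PER/ORG/LOC in the sentence.
    for e in candidates:
        if e.is_subject:
            return e.text

    # Rule (3): grammatical-subject PER/ORG/LOC in previous sentences
    # (assumption: the nearest previous sentence is checked first).
    for prev in reversed(previous):
        for e in prev:
            if e.ne_type in HOLDER_TYPES and e.is_subject:
                return e.text

    # Rule (4): any remaining PER/ORG/LOC in the sentence.
    if candidates:
        return candidates[0].text
    return None
```

A sentence with no usable entity of its own falls through to rule (3), so `pick_holder([Entity("Tokyo", "LOC")], [[Entity("UN", "ORG", is_subject=True)]])` selects the subject of the previous sentence.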
12
Evaluation results in NTCIR-6 (1/3)
13
Evaluation results in NTCIR-6 (2/3)
Opinion holder extraction
(1) Extraction using term sequences (Cornell, GATE)
(2) Lexicon-based heuristics (IIT)
(3) Named entity extraction approach (TUT and others)
Identifying the author
(1) Utilizing author-related clues such as verbs (ICU-IR)
(2) Detecting author opinion holders when there were no holder candidates surrounding the opinionated sentences (EHBN, Cornell)
14
Evaluation results in NTCIR-6 (3/3)
English: author-opinionated sentences appeared more often
15
Outline
1. Problem Definition
2. Corpus: NTCIR-6 pilot
3. Approach in NTCIR-6
4. Revised Approach after NTCIR-6
   1. More features
   2. Direct-subjective Classifier
5. Comparison and Discussion
6. Conclusion
16
More Features (1/3)
Extension based on the ICU-IR approach
  Phrase governed by “say”, “by”
  NP followed by “according to”, “by”
  Subjects governed by opinion verbs
Grammatical syntactic patterns
  Grammatical subject & verb
  Auxiliary verb & verb
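The cues listed above are syntactic relations, which the presented system obtains from a parser. As a rough illustration only, the sketch below approximates two of them ("according to" followed by a name, and a name followed by a form of "say") with surface regular expressions over capitalized word sequences. This is an assumed simplification, not the ICU-IR or authors' method, and the pattern names and `surface_cues` function are hypothetical.

```python
import re
from typing import Dict, List

# A run of capitalized tokens, used here as a crude stand-in for a noun phrase.
_NAME = r"(?:[A-Z][\w.]*)(?: [A-Z][\w.]*)*"

# Hypothetical surface patterns; a real system would use parser output instead.
PATTERNS = {
    "according_to": re.compile(r"\b[Aa]ccording to (" + _NAME + r")"),
    "say_subject": re.compile(r"(" + _NAME + r") (?:says|said|say)\b"),
}


def surface_cues(sentence: str) -> Dict[str, List[str]]:
    """Return, per pattern, the capitalized spans that match in the sentence."""
    return {
        name: [m.group(1) for m in pattern.finditer(sentence)]
        for name, pattern in PATTERNS.items()
    }
```

For example, `surface_cues("Bush said the talks were useful.")` yields `"Bush"` under `say_subject` and nothing under `according_to`. Lowercase noun phrases such as "the minister" are missed, which is one reason a parser-based relation is preferable.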
17
More Features (2/3)
18
More Features (3/3)
Features selected based on χ-square tests on the MPQA corpus
Three count features: cntopnoun, cntopadj, and cntopadv in the subjective lexicon (Wilson et al.)
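The slide does not say how the χ-square test was configured. The sketch below shows the standard 2x2 χ-square statistic for one feature against a binary label (for instance, opinionated vs. non-opinionated sentences) and a simple threshold filter. The threshold of 3.84 (p < 0.05 at one degree of freedom) and the function names are assumptions for illustration, not values taken from the paper.

```python
from typing import Dict, List, Tuple


def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic for a 2x2 contingency table.

    a: positive sentences containing the feature
    b: negative sentences containing the feature
    c: positive sentences without the feature
    d: negative sentences without the feature
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A row or column is empty, so the feature carries no evidence.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(counts: Dict[str, Tuple[int, int, int, int]],
                    threshold: float = 3.84) -> List[str]:
    """Keep features whose statistic reaches the (assumed) threshold."""
    return sorted(
        name for name, cells in counts.items()
        if chi_square(*cells) >= threshold
    )
```

A feature that occurs equally often in both classes, such as counts `(10, 10, 10, 10)`, scores 0 and is dropped, while a feature that perfectly separates the classes, such as `(20, 0, 0, 20)`, scores 40 and is kept.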
19
Direct-subjective Classifier (1/2)
Goal: filter the author-opinionated sentences
Method: combine opinion types 1 and 2
Training set: MPQA
Classifier: SVM-light
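One way to picture the training setup is as a binary labeling in which opinion types 1 and 2 form the positive (direct-subjective) class. The sketch below writes one training example in the SVM-light input format, `<target> <index>:<value> ...`, with feature indices in increasing order. Treating every other sentence as the negative class, and the feature numbering itself, are assumptions made for illustration; `to_svmlight_line` is a hypothetical helper, not part of SVM-light.

```python
from typing import Dict


def to_svmlight_line(opinion_type: int, features: Dict[int, float]) -> str:
    """Format one example as an SVM-light training line.

    opinion_type: 1 or 2 is labeled +1 (direct-subjective); anything else
                  is labeled -1 (assumed negative class).
    features:     maps a feature index (1 or greater) to its value.
    """
    label = "+1" if opinion_type in (1, 2) else "-1"
    # SVM-light expects indices in increasing order; zero values are omitted.
    parts = [
        f"{index}:{value:g}"
        for index, value in sorted(features.items())
        if value != 0
    ]
    return " ".join([label] + parts)
```

For instance, `to_svmlight_line(2, {3: 1.0, 1: 0.5})` produces `+1 1:0.5 3:1`.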
20
Direct-subjective Classifier (2/2)
↗ 0.1 ↗ 0.08
21
Comparison and Discussion
Baseline: the algorithm from authority opinion
Features selected based on χ-square tests on the MPQA corpus for opinionated sentence extraction
The 7 topics containing more than 30% author-opinionated sentences attained higher F-values
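The slides do not define the F-value. Assuming it is the usual balanced F-measure (the harmonic mean of precision and recall), it can be computed from counts of correct, spurious, and missed holders as in the sketch below. The function name and its arguments are illustrative.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F-measure (F1) from raw counts; returns 0.0 when undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)
```

With 8 correct holders, 2 spurious ones, and 8 missed ones, precision is 0.8, recall is 0.5, and the F-value is about 0.615.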
22
Conclusion
Proposed an opinion holder identification system for both Japanese and English
Features selected based on χ-square tests and the direct-subjective classifier improved the results in English
Future work:
  Public opinion
  Multilingual blogs