Progress update Lin Ziheng

System overview

Components – Connective classifier
Features from Pitler and Nenkova (2009):
– Connective: because
– Self category: IN
– Parent category: SBAR
– Left sibling category: none
– Right sibling category: S
– Right sibling contains a VP: yes

Components – Connective classifier
New features:
– Conn POS
– Prev word + conn: even though, particularly since
– Prev word POS
– Prev word POS + conn POS
– Conn + next word
– Next word POS
– Conn POS + next word POS
– All lemmatized verbs in the sentence containing conn
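The lexical and POS context features listed above can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the function name, the feature keys, and the "<none>" padding value are all hypothetical, the connective is assumed to be a single token, and the lemmatized-verb feature is left out because it needs an external lemmatizer.

```python
def connective_context_features(tokens, pos_tags, conn_index):
    """Build context features for a single-token connective.

    tokens and pos_tags are parallel lists; conn_index is the 0-based
    position of the connective. Feature keys are illustrative only.
    """
    def at(seq, i):
        # Pad with a placeholder when the neighbour falls outside the sentence.
        return seq[i] if 0 <= i < len(seq) else "<none>"

    conn = tokens[conn_index].lower()
    conn_pos = pos_tags[conn_index]
    prev_word = at(tokens, conn_index - 1).lower()
    next_word = at(tokens, conn_index + 1).lower()
    prev_pos = at(pos_tags, conn_index - 1)
    next_pos = at(pos_tags, conn_index + 1)

    return {
        "conn_pos": conn_pos,
        "prev_word+conn": f"{prev_word}_{conn}",
        "prev_pos": prev_pos,
        "prev_pos+conn_pos": f"{prev_pos}_{conn_pos}",
        "conn+next_word": f"{conn}_{next_word}",
        "next_pos": next_pos,
        "conn_pos+next_pos": f"{conn_pos}_{next_pos}",
    }


tokens = ["He", "left", ",", "even", "though", "it", "rained"]
pos_tags = ["PRP", "VBD", ",", "RB", "IN", "PRP", "VBD"]
features = connective_context_features(tokens, pos_tags, conn_index=4)
```

For the connective "though" in this sentence, the "prev word + conn" feature captures "even though", which is the kind of context the slide gives as an example.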

Components – Argument labeler

Argument labeler – Argument position classifier
Relative positions of Arg1:
– Arg1 and Arg2 in the same sentence: SS (60.9%)
– Arg1 in the immediately previous sentence: IPS (30.1%)
– Arg1 in some non-adjacent previous sentence: NAPS (9.0%)
– Arg1 in some following sentence: FS (0%, only 8 instances)
FS ignored

Argument labeler – Argument position classifier
Features:
– Connective string
– Conn POS
– Conn position in the sentence: first, second, third, third last, second last, or last
– Prev word
– Prev word POS
– Prev word + conn
– Prev word POS + conn POS
– Second prev word
– Second prev word POS
– Second prev word + conn
– Second prev word POS + conn POS
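The "conn position in the sentence" feature could be computed along these lines. This is a sketch under assumptions the slides do not settle: the "other" label for connectives in the middle of a sentence and the rule that start-based labels win in very short sentences are both hypothetical choices.

```python
def conn_position_feature(conn_index, sentence_length):
    """Map a 0-based token index to one of the six positional labels.

    Returns "other" (an assumed fallback) when the connective is neither
    among the first three nor among the last three tokens.
    """
    from_start = {0: "first", 1: "second", 2: "third"}
    from_end = {1: "last", 2: "second_last", 3: "third_last"}

    # Assumption: in short sentences the start-based label takes priority.
    if conn_index in from_start:
        return from_start[conn_index]
    distance_from_end = sentence_length - conn_index
    return from_end.get(distance_from_end, "other")
```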

Argument labeler – Argument extractor
SS cases: handcrafted a set of syntactically motivated rules to extract Arg1 and Arg2

Argument labeler – Argument extractor
An example: [figure not reproduced in the transcript]

Argument labeler – Argument extractor
IPS cases: label the sentence containing the connective as Arg2 and the immediately previous sentence as Arg1
NAPS cases:
– Arg1 is located in the second previous sentence in 45.8% of the NAPS cases
– Use the majority decision and assume Arg1 is always in the second previous sentence
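The IPS and NAPS heuristics amount to picking whole sentences by offset, which can be sketched as below. The function name and the decision to return None when the document has no such earlier sentence are assumptions made for illustration.

```python
def extract_args_across_sentences(sentences, conn_sent_index, position_class):
    """Return (arg1, arg2) as whole sentences for IPS and NAPS cases.

    sentences is a list of sentence strings; conn_sent_index is the index
    of the sentence containing the connective.
    """
    if position_class == "IPS":
        offset = 1  # Arg1 is the immediately previous sentence
    elif position_class == "NAPS":
        offset = 2  # majority decision: Arg1 is the second previous sentence
    else:
        raise ValueError(f"unsupported position class: {position_class}")

    arg2 = sentences[conn_sent_index]
    arg1_index = conn_sent_index - offset
    # Assumption: no Arg1 is returned if the document starts too late.
    arg1 = sentences[arg1_index] if arg1_index >= 0 else None
    return arg1, arg2
```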

Components – Explicit classifier
Prasad et al. (2008) reported human agreement of 94% on Level 1 classes and 84% on Level 2 types
A baseline using only connectives as features gives 95.7% and 86% on Sec. 23
– Difficult to improve accuracy on the test section
3 types of features:
– Connective string
– Conn POS
– Conn + prev word
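One simple way to picture a connective-only baseline is a lookup table that predicts, for each connective, the sense it most often carries in training. The slides do not say which learner the baseline used, so the sketch below is an assumption; the function names and the tiny training list are invented for illustration.

```python
from collections import Counter, defaultdict


def train_connective_baseline(examples):
    """Learn the most frequent sense per connective.

    examples is a list of (connective, sense) pairs. Returns the lookup
    table plus a fallback sense for connectives never seen in training.
    """
    counts = defaultdict(Counter)
    for conn, sense in examples:
        counts[conn.lower()][sense] += 1
    overall = Counter(sense for _, sense in examples)
    fallback = overall.most_common(1)[0][0]
    table = {conn: c.most_common(1)[0][0] for conn, c in counts.items()}
    return table, fallback


def predict_sense(table, fallback, conn):
    """Predict the majority sense of conn, or the overall majority sense."""
    return table.get(conn.lower(), fallback)


# Toy data, invented for illustration only.
toy_examples = [
    ("because", "Contingency.Cause"),
    ("since", "Temporal.Asynchronous"),
    ("since", "Contingency.Cause"),
    ("since", "Contingency.Cause"),
    ("but", "Comparison.Contrast"),
]
table, fallback = train_connective_baseline(toy_examples)
```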

Components – Non-explicit classifier
Non-explicit: Implicit, AltLex, EntRel, NoRel
– 11 Level 2 types for Implicit/AltLex, plus EntRel and NoRel → 13 types
4 feature sets from Lin et al. (2009):
– Contextual features
– Constituent parse features
– Dependency parse features
– Word-pair features
3 features to capture AltLex: Arg2_word1, Arg2_word2, Arg2_word3
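Two of these feature groups are easy to sketch. The AltLex features take the first three words of Arg2; the word-pair features are assumed here to be the cross product of Arg1 and Arg2 tokens, which is a common definition but is not spelled out on the slide. Function names, lowercasing, and the "<none>" padding are hypothetical.

```python
from itertools import product


def altlex_features(arg2_tokens):
    """Return Arg2_word1..Arg2_word3, the first three words of Arg2."""
    feats = {}
    for i in range(3):
        # Pad with a placeholder when Arg2 has fewer than three tokens.
        word = arg2_tokens[i].lower() if i < len(arg2_tokens) else "<none>"
        feats[f"Arg2_word{i + 1}"] = word
    return feats


def word_pair_features(arg1_tokens, arg2_tokens):
    """Return the set of (Arg1 word, Arg2 word) pairs (assumed cross product)."""
    return {
        f"{w1.lower()}|{w2.lower()}"
        for w1, w2 in product(arg1_tokens, arg2_tokens)
    }
```

The first-words features fit AltLex because an alternative lexicalization such as "That is because" typically opens Arg2.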

Components – Attribution span labeler
Two steps: split the text into clauses, and decide which clauses are attribution spans
Rule-based clause splitter:
– first split a sentence into clauses by punctuation
– for each clause, further split it if one of the following production links is found: VP → SBAR, S → SINV, S → S, SINV → S, S → SBAR, VP → S
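The second step of the clause splitter can be sketched as a search for parent-child edges in a constituent tree. This is only an illustration: the nested-tuple tree encoding, the function names, and the reading of each production link as a direct parent-to-child edge are assumptions, and the punctuation-based first step is omitted.

```python
# Parent -> child links that trigger a further split (from the slide).
SPLIT_LINKS = {
    ("VP", "SBAR"), ("S", "SINV"), ("S", "S"),
    ("SINV", "S"), ("S", "SBAR"), ("VP", "S"),
}


def leaves(node):
    """Return the words under a node; a node is (label, children_or_word)."""
    _, children = node
    if isinstance(children, str):
        return [children]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def find_split_points(node):
    """List (parent_label, child_label, child_words) for every matching link."""
    label, children = node
    if isinstance(children, str):
        return []
    found = []
    for child in children:
        if (label, child[0]) in SPLIT_LINKS:
            found.append((label, child[0], leaves(child)))
        found.extend(find_split_points(child))
    return found


# "He said that it rained" as a hand-built toy parse.
tree = ("S", [
    ("NP", [("PRP", "He")]),
    ("VP", [
        ("VBD", "said"),
        ("SBAR", [
            ("IN", "that"),
            ("S", [("NP", [("PRP", "it")]), ("VP", [("VBD", "rained")])]),
        ]),
    ]),
])
split_points = find_split_points(tree)
```

Here the VP → SBAR link separates "that it rained" from "He said", which is the kind of boundary an attribution span needs.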

Components – Attribution span labeler
Attr span classifier features (curr, prev and next clauses):
– Unigrams of curr
– Lowercased and lemmatized verbs in curr
– The first and last terms of curr
– The last term of prev
– The first term of next
– The last term of prev + the first term of curr
– The last term of curr + the first term of next
– The position of curr in the sentence
– Punctuation rules extracted from curr

Evaluation
Train: 02-21, dev: 22, test: 23
Each component is tested:
– without and with error propagation (EP) from the previous component
– with gold standard (GS) parse trees and sentence boundaries, and with automatic (Auto) parser and sentence splitter

Evaluation – Connective classifier
GS: increased acc and F1 by 2.05% and 3.05%
Auto: increased acc and F1 by 1.71% and 2.54%
Contextual info is helpful

Evaluation – Argument position classifier
Able to accurately label SS
But performs badly on the NAPS class
– Due to the similarity between the IPS and NAPS classes

Evaluation – Argument extractor
Human agreement on partial and exact matches: 94.5% and 90.2%
Exact F1 much lower than partial F1
– Due to small portions of text deleted

Evaluation – Explicit classifier
Baseline: using only connective strings
– 86%
GS + no EP: F1 increased by 0.44%

Evaluation – Non-explicit classifier
Majority baseline: all classified as EntRel
Adding EP degrades F1 by ~13%, but still outperforms baseline by ~6%

Evaluation – Attribution span labeler
When EP added: the decrease in F1 is largely due to the drop in precision
When Auto added: the decrease in F1 is largely due to the drop in recall

Evaluation – The whole pipeline
Definition: a relation is correct if its relation type is classified correctly, and both Arg1 and Arg2 are partially or exactly matched
GS + EP:
– Partial: 46.38% F1
– Exact: 31.72% F1
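The correctness definition above can be sketched as a small scoring routine. Exact match is span identity; the slides do not define partial match, so the token-overlap test used here is purely an assumption, as are the function names and the dictionary layout of a relation.

```python
def spans_match(pred_span, gold_span, mode):
    """Compare two argument spans given as sets of token indices."""
    if mode == "exact":
        return pred_span == gold_span
    if mode == "partial":
        # ASSUMPTION: any shared token counts as a partial match.
        return bool(pred_span & gold_span)
    raise ValueError(f"unknown mode: {mode}")


def relation_correct(pred, gold, mode):
    """A relation is correct if its type and both arguments match."""
    return (
        pred["type"] == gold["type"]
        and spans_match(pred["arg1"], gold["arg1"], mode)
        and spans_match(pred["arg2"], gold["arg2"], mode)
    )


def f1_score(num_correct, num_predicted, num_gold):
    """Harmonic mean of precision and recall over relations."""
    if num_correct == 0:
        return 0.0
    precision = num_correct / num_predicted
    recall = num_correct / num_gold
    return 2 * precision * recall / (precision + recall)
```

Under this reading, a prediction that drops or adds a few tokens still counts in the partial setting but not in the exact one, which is consistent with exact F1 being lower than partial F1.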

On-going changes
Joint learning
Change rule-based argument extractor to a machine learning approach
