1
Two-Phase Semantic Role Labeling based on Support Vector Machines
Kyung-Mi Park, Young-Sook Hwang, Hae-Chang Rim
NLP Lab., Korea Univ.
2
Contents
- Introduction
- Two-phase semantic role labeling based on SVMs
  - Semantic argument boundary identification phase
  - Semantic role classification phase
- Experiments
- Conclusion
3
Introduction (1)
- Advantages of using SVMs
  - High generalization performance in high-dimensional feature spaces
  - Polynomial kernel functions make it possible to learn from combinations of multiple features
- Semantic role labeling (SRL) is a multiclass classification task
  - Since an SVM is a binary classifier, SVMs have to be extended to multiclass classification
  - Multiclass classification tasks often suffer from an unbalanced class distribution
4
Introduction (2)
- Applying SVMs to the SRL task requires a way to resolve the unbalanced class distribution problem
- We propose a two-phase SRL method: boundary identification phase + role classification phase
- The two-phase method alleviates the unbalanced class distribution problem
  - In the identification phase, only three SVM classifiers are required, one each for B-ARG, I-ARG, and O, which decreases the number of negative examples
  - In the classification phase, non-argument constituents can be ignored
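The effect on class balance can be illustrated with a small sketch. The label counts below are hypothetical toy numbers, not figures from the slides, and `one_vs_rest_balance` is an illustrative helper name. It only counts how many positive and negative examples each one-vs-rest classifier would see.

```python
from collections import Counter

def one_vs_rest_balance(labels):
    """Return {class: (positives, negatives)} for one-vs-rest training."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, total - n) for cls, n in counts.items()}

# Hypothetical toy data: most constituents are non-arguments ("O").
single_phase = ["O"] * 80 + ["A0"] * 8 + ["A1"] * 9 + ["AM-TMP"] * 3

# In the classification phase of a two-phase setup, "O" items are already
# filtered out, so a rare role competes only against other arguments.
classification_phase = [label for label in single_phase if label != "O"]

print(one_vs_rest_balance(single_phase)["AM-TMP"])          # (3, 97)
print(one_vs_rest_balance(classification_phase)["AM-TMP"])  # (3, 17)
```

In this toy setting the rare role faces 97 negatives in a single-phase scheme but only 17 once non-arguments are removed, which is the kind of reduction the two-phase design aims for.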
5
Two-phase Semantic Role Labeling (1)
- First phase: semantic argument identification phase
  - Identify the boundaries of semantic arguments
  - First, segment the sentence into syntactic constituents (c), using chunks or subclauses as the unit
  - Second, classify each syntactic constituent as B-ARG, I-ARG, or O
- Second phase: semantic role classification phase
  - Assign appropriate semantic roles to the identified semantic arguments
6
Two-phase Semantic Role Labeling (2)
Input
  Sentence:       Under the existing contract, Rockwell said, it has already delivered 793 of the shipsets to Boeing.
  POS tags:       IN DT VBG NN , NNP VBD , PRP VBZ RB VBN CD IN DT NNS TO NNP .
  Chunks:         B-PP B-NP I-NP O B-NP B-VP O B-NP B-VP I-VP B-NP B-PP B-NP I-NP B-PP B-NP O
  Clauses:        (S* * (S* *S) * *S)
  Named entities: O B-ORG O B-MISC O
  Target verbs:   - exist - deliver -
  Propositions:   * (V*V) (A1*A1) * (AM-LOC* * *AM-LOC) * (A0*A0) * (AM-TMP*AM-TMP) (V*V) (A1* * *A1) * (A2*A2) *
Output
  Constituents:         Under | the existing contract | , | Rockwell | said | , | it | has already delivered | 793 | of | the shipsets | to | Boeing
  Constituent type:     C C C C C C C P C C C C C
  Identification phase: B-ARG I-ARG O O O B-ARG P I-ARG O B-ARG ARG P
  Classification phase: AM-LOC A0 P A1 A2
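The step from per-constituent B-ARG / I-ARG / O labels to argument spans can be sketched as follows. This is a minimal illustration, not the authors' code: `decode_arguments` is a hypothetical helper, and the label sequence in the example is our own assumption, chosen to be consistent with the final roles on the slide (AM-LOC, A0, A1, A2), since the slide's label row did not survive extraction intact.

```python
def decode_arguments(labels):
    """Group B-ARG/I-ARG labels into (start, end) constituent spans.

    Any other label ("O", or "P" for the predicate) closes an open span.
    A stray I-ARG with no preceding B-ARG is treated as starting a span.
    """
    spans = []
    start = None
    for i, label in enumerate(labels):
        if label == "B-ARG":
            if start is not None:
                spans.append((start, i - 1))
            start = i
        elif label == "I-ARG":
            if start is None:
                start = i
        else:
            if start is not None:
                spans.append((start, i - 1))
                start = None
    if start is not None:
        spans.append((start, len(labels) - 1))
    return spans

constituents = ["Under", "the existing contract", ",", "Rockwell", "said", ",",
                "it", "has already delivered", "793", "of", "the shipsets",
                "to", "Boeing"]
# Assumed labels (not copied from the slide).
labels = ["B-ARG", "I-ARG", "O", "O", "O", "O", "B-ARG", "P",
          "B-ARG", "I-ARG", "I-ARG", "B-ARG", "I-ARG"]

for start, end in decode_arguments(labels):
    print(" ".join(constituents[start:end + 1]))
```

Each printed span would then be handed to the second phase, which assigns it a role.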
7
Semantic Argument Boundary Identification (1)
- Restrict the search space in terms of constituents
  - The left search boundary is set to the left boundary of the second upper clause
  - The right search boundary is set to the right boundary of the immediate clause
- Use features that identify syntactic constituents dependent on the predicate
  - Semantic arguments are dependent on the predicate
  - Features for finding dependency relations are implicitly represented
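One possible reading of the search-space restriction is sketched below. The slides do not define "second upper clause" precisely. This sketch assumes it means the clause directly enclosing the immediate clause, falling back to the immediate clause itself when no such clause exists. The function name `search_window` and the clause spans in the example are hypothetical.

```python
def search_window(clauses, predicate_index):
    """Return (left, right) token bounds for candidate constituents.

    clauses: list of (start, end) token spans, assumed properly nested.
    Assumption: "second upper clause" = the clause directly enclosing
    the immediate (innermost) clause of the predicate.
    """
    enclosing = [c for c in clauses if c[0] <= predicate_index <= c[1]]
    if not enclosing:
        raise ValueError("predicate is not inside any clause")
    # With proper nesting, sorting by length puts the innermost clause first.
    enclosing.sort(key=lambda c: c[1] - c[0])
    immediate = enclosing[0]
    upper = enclosing[1] if len(enclosing) > 1 else immediate
    return upper[0], immediate[1]

# Hypothetical clause spans: a main clause with one embedded clause.
clauses = [(0, 18), (8, 17)]
print(search_window(clauses, 11))  # predicate inside the embedded clause
print(search_window(clauses, 6))   # predicate only inside the main clause
```

Under this reading, constituents to the left may come from the enclosing clause, while constituents to the right are limited to the predicate's own clause.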
8
Semantic Argument Boundary Identification (2)
29 features represent syntactic and semantic information related to the dependency relationships between syntactic constituents and the predicate.

  Features                                           Values
  predicate-constituent (intervening features)
    position                                         -2, -1, 1
    distance                                         0, 1, 2…
    # of VP, NP, SBAR                                0, 1, 2…
    # of POS [CC] [,] [:]                            0, 1, 2…
    POS[“] & POS[”]                                  -1, 0, 1
    path                                             VP-PP-NP, …
  predicate itself & context
    headword, headword’s POS, chunk type
    beginning word’s POS                             MD, TO, VBZ, …
    context-1: headword, headword’s POS, chunk type
  constituent itself & context
    headword, headword’s POS, chunk type
    context-2: headword, headword’s POS, chunk type
    context-1: headword, headword’s POS, chunk type
    context+1: headword, headword’s POS, chunk type
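A few of the intervening features can be sketched as follows. The exact definitions are not given on the slide, so this sketch makes assumptions: distance is the number of constituents strictly between the predicate and the candidate, the counts are taken over those intervening constituents, and the path is the sequence of chunk types from the predicate to the candidate. `intervening_features` is a hypothetical helper name.

```python
from collections import Counter

def intervening_features(chunk_types, pred_idx, const_idx):
    """Features describing what lies between a predicate and a constituent.

    chunk_types: chunk type of each constituent, in sentence order.
    All feature definitions here are assumptions for illustration.
    """
    lo, hi = sorted((pred_idx, const_idx))
    between = chunk_types[lo + 1:hi]
    counts = Counter(between)
    # Walk from the predicate towards the constituent, in either direction.
    step = 1 if const_idx >= pred_idx else -1
    path = [chunk_types[i] for i in range(pred_idx, const_idx + step, step)]
    return {
        "distance": len(between),
        "n_VP": counts["VP"],
        "n_NP": counts["NP"],
        "n_SBAR": counts["SBAR"],
        "path": "-".join(path),
    }

chunks = ["NP", "VP", "PP", "NP"]
print(intervening_features(chunks, pred_idx=1, const_idx=3))
# {'distance': 1, 'n_VP': 0, 'n_NP': 0, 'n_SBAR': 0, 'path': 'VP-PP-NP'}
```

Categorical values such as the path string would still need to be mapped to binary indicator features before being passed to an SVM.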
9
Semantic Role Classification (1)
- We consider only 18 semantic roles, selected by frequency in the training data
  - AM-MOD and AM-NEG are post-processed with hand-crafted rules
  - We do not consider the 19 semantic roles that appear fewer than 36 times in the training data
    - A5, AM-PRD, AM-REC, AA
    - R-A3, R-AA, R-AM-TMP, R-AM-LOC, R-AM-MNR, R-AM-ADV, R-AM-PNC
    - C-A0, C-A2, C-A3, C-AM-MNR, C-AM-ADV, C-AM-EXT, C-AM-DIS, C-AM-CAU
- The 18 semantic roles
  - A0, A1, A2, A3, A4, R-A0, R-A1, R-A2, C-A1
  - AM-TMP, AM-ADV, AM-MNR, AM-LOC, AM-DIS
  - AM-PNC, AM-CAU, AM-DIR, AM-EXT
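The frequency cutoff can be sketched in a few lines. The threshold of 36 comes from the slide. The helper name `frequent_roles` and the toy label counts are hypothetical.

```python
from collections import Counter

def frequent_roles(role_labels, min_count=36):
    """Keep roles that occur at least min_count times in the training data."""
    counts = Counter(role_labels)
    return sorted(role for role, n in counts.items() if n >= min_count)

# Hypothetical toy training labels.
training_roles = ["A0"] * 40 + ["AM-TMP"] * 36 + ["A5"] * 3
print(frequent_roles(training_roles))  # ['A0', 'AM-TMP']
```

Roles below the cutoff get no classifier of their own, so the second phase can never predict them.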
10
Semantic Role Classification (2)
- This phase uses all features from the identification phase except # of POS[:] and POS[“] & POS[”]
- In addition, we use a voice feature
  - A binary feature indicating whether the target phrase is active or passive
- Named-entity (NE) information is not used
  - Performance decreases when NE information is included
11
Experiments (1)
- We used the SVM light package (http://svm-light.joachims.org/)
- In both phases, we used a polynomial kernel (degree 2) with the one-vs-rest classification method
- Results on the development set (closed challenge)

                  Precision  Recall   F-measure  Accuracy
  Overall         67.27%     64.36%   65.78%     -
  Identification  75.96%     72.30%   74.08%     -
  Classification  -          -        -          85.45%
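The two ingredients named above, a degree-2 polynomial kernel and one-vs-rest classification, can be sketched in plain Python. This is only an illustration, not SVM light itself. The slide gives only the degree, so the constant term `coef0=1.0` is an assumption, and the decision scores in the example are made up.

```python
def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; coef0 is assumed."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def one_vs_rest_predict(scores):
    """Pick the class whose binary classifier gives the highest score."""
    return max(scores, key=scores.get)

print(poly_kernel([1, 2], [3, 4]))  # (3 + 8 + 1) ** 2 = 144.0

# Hypothetical decision values from three binary classifiers.
scores = {"B-ARG": -0.3, "I-ARG": 0.8, "O": 0.1}
print(one_vs_rest_predict(scores))  # I-ARG
```

Expanding the squared term shows that a degree-2 kernel implicitly scores every pair of features, which is one way to read the earlier point about learning from combinations of features.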
12
Experiments (2)
Results on the test set (closed challenge)

              prec.    rec.     F1
  Overall     65.63    62.43    63.99
  A0          78.24    74.60    76.38
  A1          65.83    66.46    66.14
  A2          49.84    43.70    46.57
  A3          56.04    34.00    42.32
  A4          62.86    44.00    51.76
  A5          0.00
  AM-ADV      45.18    44.30    44.74
  AM-CAU      36.67    22.45    27.85
  AM-DIR      20.00
  AM-DIS      56.62    58.22    57.41
  AM-EXT      61.54    57.14    59.26
  AM-LOC      26.01    31.14    28.34
  AM-MNR      43.54    35.69    39.22
  AM-MOD      97.46    91.10    94.17
  AM-NEG      94.92    88.19    91.43
  AM-PNC      40.00    28.24    33.10
  AM-PRD      0.00
  AM-TMP      51.83    45.38    48.39
  R-A0        80.49    83.02    81.73
  R-A1        75.00    51.43    61.02
  R-A2        100.00   33.33    50.00
  R-A3        0.00
  R-AM-LOC    0.00
  R-AM-MNR    0.00
  R-AM-PNC    0.00
  R-AM-TMP    0.00
  V           96.66

Rows with a single figure list only the one value shown for that role.
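The F1 column can be checked against precision and recall. The slides do not state the formula, so this sketch assumes the usual harmonic mean, F1 = 2PR / (P + R), which reproduces the reported overall and A0 figures after rounding to two decimals.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(f1(65.63, 62.43), 2))  # 63.99 (Overall, test set)
print(round(f1(78.24, 74.60), 2))  # 76.38 (A0, test set)
```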
13
Conclusion
- We proposed a two-phase semantic role labeling method based on support vector machines
- Applying the two-phase method alleviates the unbalanced class distribution problem caused by negative examples
- Our system obtains an F-measure of 63.99% on the test set and 65.78% on the development set