School of Computer Science and Technology. Chinese Semantic Role Labeling with Dependency-driven Constituent Parse Tree Structure. Hongling Wang, Bukang Wang, Guodong Zhou. NLP Lab, School of Computer Science & Technology, Soochow University
Outline: Motivation; Related work; Tree kernel-based nominal SRL on D-CPT; Experimentation and results; Conclusion
Motivation (1): There are two kinds of methods for SRL, feature-based and tree kernel-based. The latter has the potential to better capture the structured knowledge in the parse tree. Only a few studies employ tree kernel-based methods for SRL, and most of them focus on the CPT structure.
Motivation (2): A DPT (Dependency Parse Tree) captures the dependency relationships between individual words, while a CPT (Constituent Parse Tree) is concerned with the phrase structure of a sentence. CPT and DPT may behave quite differently in capturing different aspects of syntactic phenomena. We explore a tree kernel-based method for Chinese nominal SRL that uses a new syntactic parse tree structure, called the dependency-driven constituent parse tree (D-CPT).
Related work: Some studies use tree kernel-based methods for verbal SRL (Moschitti et al., Che et al., Zhang et al.). A few related studies in other NLP tasks, such as semantic relation extraction and co-reference resolution, employ DPT in tree kernel-based methods and achieve performance comparable to that of CPT-based methods. To our knowledge, there are no reported studies on tree kernel-based methods for SRL from the DPT structure perspective.
D-CPT structure: The new D-CPT structure benefits from the advantages of both DPT and CPT, since it not only keeps the dependency relationship information of DPT but also retains the basic structure of CPT. This is done by transforming the DPT structure into a new CPT-style structure that uses dependency types in place of the phrase labels of the traditional CPT structure.
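The transformation described above can be illustrated with a minimal Python sketch. It is not the authors' exact procedure: it assumes that each word becomes a node labeled with its dependency type, with the word itself and the subtrees of its dependents as children in surface order. The function names (`dpt_to_dcpt`, `bracket`), the toy English sentence, and the relation labels are all hypothetical.

```python
# Sketch only: one plausible way to rewrite a dependency tree as a
# CPT-style bracketed tree whose internal labels are dependency types.
# The exact D-CPT construction in the talk may differ (assumption).

def dpt_to_dcpt(words, heads, deprels):
    """Convert a dependency tree into a nested-tuple tree.

    words   -- list of tokens
    heads   -- heads[i] is the index of the head of token i, or -1 for the root
    deprels -- deprels[i] is the dependency type linking token i to its head
    Returns (label, child, child, ...) where a child is a word or a subtree.
    """
    children = [[] for _ in words]
    root = None
    for i, h in enumerate(heads):
        if h < 0:
            root = i
        else:
            children[h].append(i)

    def build(i):
        # Place the word and its dependents' subtrees in surface order.
        parts = []
        for j in sorted(children[i] + [i]):
            parts.append(words[i] if j == i else build(j))
        return (deprels[i], *parts)

    return build(root)


def bracket(tree):
    """Render a nested-tuple tree in bracketed notation."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join([tree[0]] + [bracket(c) for c in tree[1:]]) + ")"


# Hypothetical toy input, not taken from Chinese NomBank.
example = dpt_to_dcpt(
    ["police", "investigate", "accident"],
    [1, -1, 1],
    ["SBJ", "ROOT", "OBJ"],
)
print(bracket(example))
```

The result is an ordinary bracketed tree, so any tree kernel written for constituent trees can consume it unchanged.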
Example of DPT with nominal predicate and its related arguments annotated
Example of achieving D-CPT structure from DPT structure
Extraction schemes: 1) Shortest path tree (SPT): this scheme includes only the nodes on the shortest path connecting the predicate and the argument candidate, via their nearest common governing node. 2) SV-SPT: this scheme adds the support verb information used in nominal SRL. 3) H-SV-SPT: this scheme includes both the head argument and the support verb information.
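The SPT scheme can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: trees are nested tuples of the form (label, child, ...), the predicate and the argument candidate are identified by leaf position, and the helper names are hypothetical. The SV-SPT and H-SV-SPT variants are not covered.

```python
# Sketch only: prune a tree down to the paths that connect two leaves
# through their nearest common governing node.

def leaf_paths(tree, path=()):
    """Yield the child-index path of every leaf, left to right."""
    if isinstance(tree, str):
        yield path
    else:
        for k, child in enumerate(tree[1:]):
            yield from leaf_paths(child, path + (k,))


def prune(tree, paths):
    """Keep only the nodes that lie on one of the given paths."""
    if isinstance(tree, str):
        return tree
    kept = []
    for k, child in enumerate(tree[1:]):
        # Paths that continue through child k, with the first step removed.
        sub = [p[1:] for p in paths if p and p[0] == k]
        if sub:
            kept.append(prune(child, sub))
    return (tree[0], *kept)


def shortest_path_tree(tree, pred_leaf, arg_leaf):
    """Return the subtree spanning the predicate and argument leaves."""
    paths = list(leaf_paths(tree))
    p, a = paths[pred_leaf], paths[arg_leaf]
    # The common prefix of the two paths leads to the nearest common node.
    d = 0
    while d < min(len(p), len(a)) and p[d] == a[d]:
        d += 1
    node = tree
    for k in p[:d]:
        node = node[1 + k]
    return prune(node, [p[d:], a[d:]])


# Hypothetical toy tree; leaves are police(0) investigate(1) traffic(2) accident(3).
toy = ("ROOT", ("SBJ", "police"), "investigate",
       ("OBJ", ("NMOD", "traffic"), "accident"))
print(shortest_path_tree(toy, 1, 3))
```

In this toy case the subject and the modifier are dropped, leaving only the predicate, the argument head, and the labels that connect them.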
Extraction schemes
Kernels: To exploit the complementary nature of feature-based methods and tree kernel-based methods, we combine them via a composite kernel. The composite kernel linearly interpolates a convolution tree kernel K_T over a parse tree structure and a feature-based linear kernel K_L as follows:
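A linear interpolation of two kernels has the general form K_C = γ·K_T + (1 − γ)·K_L with 0 ≤ γ ≤ 1, where the symbol γ is an assumed name for the interpolation weight. The Python sketch below illustrates this with a subset-tree convolution kernel in the style of Collins and Duffy and a dot-product linear kernel. It is a small stand-in for illustration, not the SVM-light-TK implementation: the function names are hypothetical, trees are nested tuples of the form (label, child, ...), and no kernel normalization is applied.

```python
# Sketch only: convolution tree kernel plus a linear kernel, combined
# by linear interpolation. lam is the tree kernel decay factor.

def production(node):
    """The label of a node together with the labels or words of its children."""
    kids = tuple(("w", c) if isinstance(c, str) else ("n", c[0]) for c in node[1:])
    return (node[0], kids)


def internal_nodes(tree):
    """All non-leaf nodes of a nested-tuple tree."""
    if isinstance(tree, str):
        return []
    out = [tree]
    for c in tree[1:]:
        out.extend(internal_nodes(c))
    return out


def delta(n1, n2, lam):
    """Decay-weighted count of common tree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # word leaves contribute a factor of 1
            score *= 1.0 + delta(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=0.5):
    """K_T: sum of delta over all pairs of internal nodes."""
    return sum(delta(a, b, lam)
               for a in internal_nodes(t1) for b in internal_nodes(t2))


def linear_kernel(x1, x2):
    """K_L: dot product of two flat feature vectors."""
    return sum(u * v for u, v in zip(x1, x2))


def composite_kernel(t1, t2, x1, x2, gamma, lam=0.5):
    """K_C = gamma * K_T + (1 - gamma) * K_L."""
    return gamma * tree_kernel(t1, t2, lam) + (1.0 - gamma) * linear_kernel(x1, x2)


# Hypothetical toy instance: one small tree and a two-dimensional feature vector.
t = ("NP", ("D", "the"), ("N", "dog"))
print(composite_kernel(t, t, [1.0, 0.0], [1.0, 0.0], gamma=0.5))
```

Because a convex combination of two valid kernels is itself a valid kernel, the composite value can be passed to any kernel-based classifier; the weight would normally be tuned on development data.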
Experiment setting: Corpus: Chinese NomBank. Training data: 648 files (chtb_081.fid to chtb_899.fid). Test data: 72 files (chtb_001.fid to chtb_040.fid and chtb_900.fid to chtb_931.fid). Development data: 40 files (chtb_041.fid to chtb_080.fid). Classifier: the SVM-light toolkit with the convolution tree kernel function (SVM-light-TK). The SVM parameter C and the tree kernel decay factor λ are fine-tuned to 4.0 and 0.5, respectively.
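The data split above can be reproduced with a small helper like the following Python sketch. The function name `split_of` and the file name pattern `chtb_<number>.fid` are assumptions based on the ranges quoted on the slide; the numbering has gaps, which is why the training range yields 648 files.

```python
import re

# Sketch only: map a treebank file name to its split, using the file
# number ranges given in the experiment setting.

def split_of(filename):
    """Return 'train', 'dev' or 'test' for a name such as 'chtb_081.fid'."""
    m = re.fullmatch(r"chtb_(\d+)\.fid", filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename}")
    n = int(m.group(1))
    if 1 <= n <= 40 or 900 <= n <= 931:
        return "test"
    if 41 <= n <= 80:
        return "dev"
    if 81 <= n <= 899:
        return "train"
    raise ValueError(f"file number outside the documented ranges: {n}")


for name in ["chtb_001.fid", "chtb_041.fid", "chtb_081.fid", "chtb_931.fid"]:
    print(name, split_of(name))
```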
Results on golden parse trees (1)
Results on golden parse trees (2)
Results on automatic parse trees
Results of comparison experiments (1)
Results of comparison experiments (2)
Results on the CoNLL-2009 Chinese corpus
Conclusion: This paper systematically explores a tree kernel-based method on a novel D-CPT structure. We propose a simple strategy that transforms the DPT structure into a CPT-style structure. Several extraction schemes are designed to extract the various kinds of information needed for nominal SRL and verbal SRL. Evaluation shows the effectiveness of D-CPT on both the Chinese NomBank and the CoNLL-2009 corpus.
Thanks! Q & C?