Published by Nancy Bryant. Modified over 9 years ago.
1
Convolution Kernels on Constituent, Dependency and Sequential Structures for Relation Extraction
Truc-Vien T. Nguyen, Alessandro Moschitti and Giuseppe Riccardi
Department of Information Engineering and Computer Science, University of Trento, Italy
From EMNLP 2009
Advisor: Hsin-Hsi Chen
Reporter: Chi-Hsin Yu
Date: 2009.08.16
2
Outline
Introduction
SVM and Kernel Methods
Kernels for Relation Extraction
Experiments
Discussion and Conclusion
3
Introduction
Relation extraction
▫Defined in ACE (Automatic Content Extraction)
  Recognition of relations
  The relation detection and characterization (RDC) task
4
Introduction (Cont.) There are ▫five general types of relations, ▫some of which are further sub-divided, yielding a total of 24 types/subtypes of relations
5
SVM and Kernel Methods
The (integral) transform of f(t) by the kernel k(v,t)
Time domain → frequency domain
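Written out, an integral transform with kernel k(v,t) has the general form below. The Fourier kernel is given only as an assumed instance of the time-domain to frequency-domain example, since the slide text does not name a specific transform.

```latex
F(v) = \int k(v, t)\, f(t)\, dt,
\qquad
\text{e.g. Fourier: } k(v, t) = e^{-2\pi i v t}
```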
6
SVM and Kernel Methods (cont.)
Nonlinear separating hyperplane
▫Nonlinear mapping
▫Kernel trick
  Ex: 256 dim. → billion dim.
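A minimal sketch of the kernel trick, assuming a homogeneous degree-2 polynomial kernel (the slide does not say which kernel its 256-dimension example used). The kernel value computed in the input space matches the inner product of the explicitly mapped feature vectors, so the high-dimensional mapping never has to be built. The function names are illustrative.

```python
from itertools import product
from math import isclose


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def explicit_features(x):
    """Explicit feature map: all ordered degree-2 monomials x_i * x_j (d^2 features)."""
    return [a * b for a, b in product(x, repeat=2)]


x = [1.0, 2.0, 3.0]
z = [0.5, -1.0, 2.0]

implicit = poly2_kernel(x, z)                               # works in 3 dimensions
explicit = dot(explicit_features(x), explicit_features(z))  # works in 9 dimensions

print(implicit, explicit, isclose(implicit, explicit))
```

The explicit map grows as d^2 here and faster for higher degrees, while the kernel evaluation stays linear in d.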
7
Kernel Matrix
Terminologies
▫Gram matrix K (kernel matrix): K_ij = k(x_i, x_j)
▫psd: positive semi-definite, v^T K v ≥ 0 for all v
  Inner product space in a Hilbert space
▫PSD is closed under linear combination, polynomial expansion, and normalization (Schölkopf and Smola, 2001)
Valid kernel function
▫Gram matrix K is psd iff function k(.,.) is a kernel.
▫Ref: Kernel Methods for Pattern Analysis (Theorem 3.11)
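The definitions above can be sketched in a few lines: build the Gram matrix for a small sample and spot-check the quadratic form v^T K v on random vectors. This is only a numerical sanity check under an assumed linear kernel and made-up sample points, not a proof of positive semi-definiteness.

```python
import random


def linear_kernel(x, z):
    """k(x, z) = x . z, the simplest valid kernel."""
    return sum(a * b for a, b in zip(x, z))


def gram_matrix(samples, kernel):
    """K[i][j] = k(x_i, x_j) for every pair of samples."""
    return [[kernel(xi, xj) for xj in samples] for xi in samples]


def quadratic_form(K, v):
    """Compute v^T K v."""
    n = len(v)
    return sum(v[i] * K[i][j] * v[j] for i in range(n) for j in range(n))


# Illustrative sample points (not from the slides).
samples = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
K = gram_matrix(samples, linear_kernel)

rng = random.Random(0)
checks = [
    quadratic_form(K, [rng.uniform(-1.0, 1.0) for _ in samples])
    for _ in range(1000)
]
# Allow a tiny negative tolerance for floating-point rounding.
looks_psd = all(q >= -1e-12 for q in checks)

print(K)
print("no violation of v^T K v >= 0 found:", looks_psd)
```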
8
Kernel Methods
Kernel methods
▫Used in different algorithms, such as SVM, PCA, LDA, ...
▫Implicitly operate on a feature space of much higher dimension
▫Involve only inner products of input samples
Feature-based approach: sample x_1 = (x_11, x_12, …, x_1d), n samples
Kernel methods approach:
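To illustrate a learner that touches the samples only through kernel values, here is a small kernel perceptron. It is a deliberately simpler stand-in for an SVM, not the method used in the paper; the XOR data and the inhomogeneous degree-2 polynomial kernel are assumptions chosen so the example is separable only in the implicit feature space.

```python
def poly_kernel(x, z, degree=2):
    """Inhomogeneous polynomial kernel: k(x, z) = (1 + x . z)^degree."""
    return (1.0 + sum(a * b for a, b in zip(x, z))) ** degree


def train_kernel_perceptron(X, y, kernel, max_epochs=1000):
    """Return per-sample mistake counts alpha; the model is sum_j alpha_j y_j k(x_j, .)."""
    n = len(X)
    K = [[kernel(a, b) for b in X] for a in X]  # Gram matrix, computed once
    alpha = [0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * y[j] * K[j][i] for j in range(n))
            if y[i] * score <= 0:  # misclassified (or undecided): update
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: converged
            break
    return alpha


def predict(x, X, y, alpha, kernel):
    """Classify x using only kernel evaluations against the training samples."""
    score = sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
    return 1 if score > 0 else -1


# XOR: not linearly separable in the 2-dimensional input space.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]

alpha = train_kernel_perceptron(X, y, poly_kernel)
predictions = [predict(x, X, y, alpha, poly_kernel) for x in X]
print(predictions)
```

Swapping `poly_kernel` for a tree or sequence kernel changes nothing else in the training loop, which is the property the slide points at.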
9
Kernel Methods (cont.)
x_1 = (x_11, x_12, …, x_1d), x_2 = (x_21, x_22, …, x_2d)
SVM (Quadratic Programming) solver
Q
10
Kernel Methods in NLP
Discrete structure
▫Tree
▫Word sequence
▫Haussler: Convolution Kernels on Discrete Structures (1999)
Feature-based approach: sample x_1 = (x_11, x_12, …, x_1d), n samples
11
Kernels for Relation Extraction – Tree Structures Constituent tree and dependency tree PET: path-enclosed tree
12
Kernels for Relation Extraction – Tree Structures (cont.)
DW: dependency word tree
GR: grammatical relation tree
GRW: grammatical relation and words
13
Kernels for Relation Extraction – Sequential Structures
SK1
▫T2-LOC Washington, U.S. T1-PER officials
SK2
▫T2-LOC NN, NNP T1-PER NNS
SK3
▫T2-LOC pobj, nn T1-PER nsubj
SK4
▫Washington T2-LOC In working T1-PER officials GPE U.S.
SK5
▫pobj T2-LOC prep ROOT T1-PER nsubj GPE nn
SK6
▫NN T2-LOC IN VBP T1-PER NNS GPE NNP
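One common way to compare token sequences like these is a gap-weighted subsequence kernel. The sketch below is a brute-force version for short sequences, assuming subsequence length n = 2 and decay factor lam = 0.5; it is not the paper's implementation. The SK1 tokens come from the slide, while the second sequence is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(seq, n, lam):
    """Map each length-n (possibly gapped) subsequence to the sum of lam**span
    over its occurrences, where span counts tokens from first to last match."""
    feats = defaultdict(float)
    for idx in combinations(range(len(seq)), n):
        span = idx[-1] - idx[0] + 1
        feats[tuple(seq[i] for i in idx)] += lam ** span
    return feats


def subsequence_kernel(s, t, n=2, lam=0.5):
    """Inner product of the two gap-weighted subsequence feature vectors."""
    fs = subsequence_features(s, n, lam)
    ft = subsequence_features(t, n, lam)
    return sum(w * ft[u] for u, w in fs.items() if u in ft)


# SK1 from the slide, tokenized on whitespace and the comma.
sk1 = ["T2-LOC", "Washington", ",", "U.S.", "T1-PER", "officials"]
# Hypothetical second sequence, for illustration only.
other = ["T2-LOC", "London", "T1-PER", "officials"]

score = subsequence_kernel(sk1, other)
print(score)
```

Enumerating all index combinations is exponential in n; practical implementations use dynamic programming instead, which is omitted here to keep the definition visible.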
14
Kernel Matrix Computation
A: Tree similarity, K(T_i, T_j)
▫ST kernel: sub-tree kernel
▫SST kernel: subset tree kernel
▫PT kernel: partial tree kernel
B: Sequence similarity, K(S_i, S_j)
▫WSK kernel: word sequence kernel (Cancedda et al., JMLR 2003)
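As a rough illustration of tree similarity, the sketch below counts shared tree fragments in the style of the subset tree (SST) kernel, using the standard recursive formulation with a decay factor. The nested-tuple tree encoding, the toy trees, and the function names are assumptions for this example; the ST and PT kernels differ in which fragments they allow and are not shown.

```python
def production(node):
    """Grammar production at a node, e.g. ('NP', (('label', 'D'), ('label', 'N')))."""
    label, *children = node
    rhs = tuple(
        ("word", c) if isinstance(c, str) else ("label", c[0]) for c in children
    )
    return (label, rhs)


def internal_nodes(tree):
    """Yield every non-leaf node; leaves are plain strings (words)."""
    yield tree
    for child in tree[1:]:
        if not isinstance(child, str):
            yield from internal_nodes(child)


def delta(n1, n2, lam):
    """Weighted count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # recurse only into non-leaf children
            score *= 1.0 + delta(c1, c2, lam)
    return score


def sst_kernel(t1, t2, lam=1.0):
    """Sum delta over all pairs of internal nodes of the two trees."""
    return sum(
        delta(a, b, lam) for a in internal_nodes(t1) for b in internal_nodes(t2)
    )


# Toy trees encoded as (label, child, child, ...); not taken from the slides.
the_dog = ("NP", ("D", "the"), ("N", "dog"))
a_dog = ("NP", ("D", "a"), ("N", "dog"))

print(sst_kernel(the_dog, the_dog))  # 6.0
print(sst_kernel(the_dog, a_dog))    # 3.0
```

With lam = 1.0 the value is a plain count of common fragments: the two toy trees share [N dog], [NP D N], and [NP D [N dog]].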
15
Kernel Matrix Computation (cont.)
Sample
▫R_i = (E_1, E_2)
▫R_1: (Washington, U.S.)
▫R_2: (Washington, Officials)
C: Polynomial kernel
Features
▫entity headword, entity type, entity subtype, mention, type, and LDC2 mention type
Feature-based approach: sample x_1 = (x_11, x_12, …, x_1d), n samples
16
Kernel Matrix Combinations
Sequential structures + tree structures
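Because positive semi-definiteness is preserved by normalization and by linear combination with non-negative weights, Gram matrices from different structures can be normalized and added. The sketch below assumes a simple weighted sum with a hypothetical weight alpha and made-up 2x2 matrices; the exact combinations used in the paper are not recoverable from the slide text.

```python
from math import sqrt


def normalize(K):
    """Normalized kernel: K'[i][j] = K[i][j] / sqrt(K[i][i] * K[j][j])."""
    n = len(K)
    return [[K[i][j] / sqrt(K[i][i] * K[j][j]) for j in range(n)] for i in range(n)]


def combine(K_a, K_b, alpha=0.5):
    """Weighted sum alpha * K_a + (1 - alpha) * K_b, with 0 <= alpha <= 1."""
    n = len(K_a)
    return [
        [alpha * K_a[i][j] + (1.0 - alpha) * K_b[i][j] for j in range(n)]
        for i in range(n)
    ]


# Made-up Gram matrices over the same two samples, for illustration only.
K_tree = [[6.0, 3.0], [3.0, 6.0]]
K_seq = [[4.0, 0.5], [0.5, 1.0]]

K_combined = combine(normalize(K_tree), normalize(K_seq), alpha=0.5)
print(K_combined)
```

Normalizing first keeps one kernel from dominating the sum merely because its raw values are larger.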
17
Experiments
Setup
▫Corpus: ACE 2004
  348 documents, 4400 relation instances
  Seven entity types and seven relation types
  Relation types: Physical, Person/Social, Employment/Membership/Subsidiary, Agent-Artifact, PER/ORG Affiliation, GPE Affiliation, and Discourse
▫Generates 38696 negative examples
▫One vs. all
▫5-fold cross-validation
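The evaluation protocol can be sketched as follows, assuming instance-level folds and a hypothetical "NONE" label for negative examples; the slides do not say how the folds were drawn, so this only shows the mechanics of one-vs-all labeling and 5-fold splitting.

```python
# Relation types as listed on the slide.
RELATION_TYPES = [
    "Physical",
    "Person/Social",
    "Employment/Membership/Subsidiary",
    "Agent-Artifact",
    "PER/ORG Affiliation",
    "GPE Affiliation",
    "Discourse",
]


def one_vs_all_labels(labels, positive_type):
    """Binary labels for one classifier: +1 for positive_type, -1 for everything else."""
    return [1 if label == positive_type else -1 for label in labels]


def k_fold_indices(n_items, k=5):
    """Yield (train_indices, test_indices) pairs; each item is held out exactly once."""
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# 4400 relation instances plus 38696 generated negative examples.
n_instances = 4400 + 38696
splits = list(k_fold_indices(n_instances, k=5))

# "NONE" is a hypothetical label for negative examples.
toy_labels = ["Physical", "NONE", "Discourse"]
binary = {t: one_vs_all_labels(toy_labels, t) for t in RELATION_TYPES}
print(len(splits), binary["Physical"])
```

One binary classifier is trained per relation type on each training split, and the held-out fold is scored by taking the most confident classifier.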
18
Experiments Results
19
Experiments Results (cont.)
20
Discussion and Conclusion
Conclusion
▫Investigated many kernel methods for RE
Future work
▫Designing new kernels
▫Incorporating other knowledge into kernels, such as ontologies
21
Thanks!!