
INSTITUTE OF COMPUTING TECHNOLOGY Forest-based Semantic Role Labeling Hao Xiong, Haitao Mi, Yang Liu and Qun Liu Institute of Computing Technology Academy.



Presentation transcript:

1 INSTITUTE OF COMPUTING TECHNOLOGY. Forest-based Semantic Role Labeling. Hao Xiong, Haitao Mi, Yang Liu and Qun Liu, Institute of Computing Technology, Chinese Academy of Sciences. AAAI 2010, Atlanta, 7/15/10

2 Semantic Role Labeling. Given a sentence and its verbs, identify the arguments of the verbs and assign semantic labels (the roles they play). Example: "This company last year sold 1000 cars in the U.S.", with labels Agent, Patient, ArgMod-TeMPoral and ArgMod-LOCation. Labels follow PropBank (Kingsbury and Palmer 2002).

3 One Conventional Approach. Example: "the role of Celimene is played by Kim Cattrall", with labels Patient and Agent.

4 One Conventional Approach. Example: "the role of Celimene is played by Kim Cattrall", with labels Patient and Agent. (Figure: parse tree over the sentence with nodes S, NP, PP, VP, AUX, VBN.)

5 One Conventional Approach. Example: "the role of Celimene is played by Kim Cattrall", with labels Patient and Agent. (Figure: the same parse tree with one NP node replaced by "?", annotated "more than 15%".)

6 Solution: k-best parses (1, 2, 3, …, k). Limited scope: k. Too much redundancy: 2^5 < 50 < 2^6. (Figure: alternative parse trees of the example sentence.)
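The inequality on this slide can be checked with a short calculation. The sketch below assumes, purely for illustration, that each ambiguity in a sentence is an independent binary choice; `ambiguities_covered` is a hypothetical helper name, not something from the talk.

```python
def ambiguities_covered(k: int) -> int:
    """Largest n with 2**n <= k.

    ASSUMPTION: ambiguities are independent binary choices, so a k-best
    list can hold every combination of at most n of them.
    """
    return k.bit_length() - 1


# The bound quoted on the slide.
assert 2**5 < 50 < 2**6

print(ambiguities_covered(50))  # prints 5
```

Under that assumption, a 50-best list exhausts the combinations of only five binary decisions, which is the "limited scope" the slide refers to.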

7 Our Solution: Forest. A compact representation of many parses, obtained by sharing common sub-derivations. Polynomial-space encoding of an exponentially large set. (Figure: a packed forest unpacked into individual parse trees.)

8 Our Solution: Forest. A compact representation of many parses, obtained by sharing common sub-derivations. Polynomial-space encoding of an exponentially large set. (Figure: the packed forest.)
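To make the "polynomial space, exponentially many parses" claim concrete, here is a minimal sketch that counts the trees packed into a toy forest. The forest layout (a dict mapping each node to its list of hyperedges, each hyperedge being a tuple of child nodes) and the names `build_toy_forest` and `count_trees` are assumptions made for this illustration, not the authors' data structures.

```python
from functools import lru_cache
from math import prod


def build_toy_forest(n):
    """Toy forest with n shared binary ambiguities (hypothetical example).

    Node X{i} can be built in two ways, (A{i}, X{i+1}) or (B{i}, X{i+1}),
    and both alternatives share the same sub-forest X{i+1}.
    """
    forest = {}
    for i in range(n):
        nxt = f"X{i + 1}"
        forest[f"X{i}"] = [(f"A{i}", nxt), (f"B{i}", nxt)]
        forest[f"A{i}"] = []  # leaf: no hyperedges
        forest[f"B{i}"] = []
    forest[f"X{n}"] = []
    return forest


def count_trees(forest, root):
    """Number of distinct trees encoded below `root`."""

    @lru_cache(maxsize=None)
    def count(node):
        edges = forest[node]
        if not edges:
            return 1
        # Sum over alternative hyperedges, product over their children.
        return sum(prod(count(child) for child in tails) for tails in edges)

    return count(root)


forest = build_toy_forest(10)
print(len(forest), count_trees(forest, "X0"))  # prints 31 1024
```

The node count grows linearly with the number of ambiguities while the number of encoded trees doubles with each one.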

9 Outline. Tree-based Semantic Role Labeling: parsing, selecting candidates, extracting features, classifying. Forest-based Semantic Role Labeling. Experiments. Conclusion.

10 Parsing. (Figure: parse tree of "This company last year sold 1000 cars in the U.S." with nodes S, NP, VP, PP and POS tags DT, NN, JJ, VBD, CD, NNS, IN, NNP.)

11 Selecting Candidates. (Figure: the same parse tree, with the predicate "sold" and the candidate constituents.)

12 Extracting Features. Path to the predicate: NP↑S↓VP↓VBN. (Figure: parse tree of "This company last year sold 1000 cars in the U.S.")

13 Extracting Features. Position: left. Features so far: NP↑S↓VP↓VBN, left.

14 Extracting Features. Head word: company. Features so far: NP↑S↓VP↓VBN, left, company.

15 Extracting Features. Head POS tag: NN. Features so far: NP↑S↓VP↓VBN, left, company, NN, …
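The four features built up on the preceding slides can be sketched on a small tree. Everything below is an illustrative assumption: the `Node` class, the shortened toy tree (tagged VBD, so the path ends in VBD rather than the slide's VBN), and in particular the head rule, which simply takes the rightmost word of the constituent instead of real head-finding rules.

```python
class Node:
    """Minimal parse-tree node (hypothetical helper for this sketch)."""

    def __init__(self, label, children=(), word=None):
        self.label = label
        self.children = list(children)
        self.word = word
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def leaves(node):
    """Preterminal nodes under `node`, left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def path_to_predicate(cand, pred):
    """Labels going up from the candidate to the lowest common ancestor,
    then down to the predicate."""
    up, down = ancestors(cand), ancestors(pred)
    down_ids = {id(n) for n in down}
    top = next(i for i, n in enumerate(up) if id(n) in down_ids)
    turn = next(i for i, n in enumerate(down) if n is up[top])
    up_part = "↑".join(n.label for n in up[: top + 1])
    down_part = "".join("↓" + n.label for n in reversed(down[:turn]))
    return up_part + down_part


def extract_features(cand, pred, root):
    index = {id(leaf): i for i, leaf in enumerate(leaves(root))}
    head = leaves(cand)[-1]  # ASSUMPTION: rightmost word is the head
    position = "left" if index[id(head)] < index[id(pred)] else "right"
    return {
        "path": path_to_predicate(cand, pred),
        "position": position,
        "head_word": head.word,
        "head_pos": head.label,
    }


# Toy tree for "This company sold 1000 cars" (shortened example sentence).
subj = Node("NP", [Node("DT", word="This"), Node("NN", word="company")])
sold = Node("VBD", word="sold")
obj = Node("NP", [Node("CD", word="1000"), Node("NNS", word="cars")])
root = Node("S", [subj, Node("VP", [sold, obj])])

print(extract_features(subj, sold, root))
```

For the subject NP this yields the path NP↑S↓VP↓VBD, position left, head word "company" and head POS tag NN.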

16 Classifying. Computing scores using a trained classifier. Scores per candidate constituent: S(Agent)=0.1, S(Patient)=0.1, S(None)=0.5, …; S(AM-TMP)=0.9, S(Patient)=0.1, S(None)=0.1, …; S(Agent)=0.2, S(Patient)=0.8, S(None)=0.1, …; S(Agent)=0.8, S(Patient)=0.1, S(None)=0.1, …; S(AM-LOC)=0.9, S(Agent)=0.1, S(None)=0.1, …

17 Classifying. Best score for each constituent: S(Agent)=0.8, …; S(AM-LOC)=0.9, …; S(None)=0.5, …; S(AM-TMP)=0.9, …; S(Patient)=0.8, … Simply sort them and choose the best label sequence.

18 Classifying. Resulting labels for "This company last year sold 1000 cars in the U.S.": Agent, AM-TMP, V, Patient, AM-LOC.
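The "best score, sort, choose" procedure on these slides might look roughly like the sketch below. The score values are the ones shown on the slides, but which constituent each score group belongs to is an assumption, as are the token spans, the function name `label_arguments`, and the rule that a lower-scored candidate is dropped when it overlaps an already chosen one (the slides only say to sort and pick the best sequence).

```python
def label_arguments(candidates):
    """Greedy labeling: best label per span, sorted by score.

    `candidates` maps a half-open token span (start, end) to a dict of
    label scores. ASSUMPTION: overlapping arguments are not allowed.
    """
    best = []
    for span, scores in candidates.items():
        label = max(scores, key=scores.get)
        best.append((scores[label], span, label))
    best.sort(key=lambda item: item[0], reverse=True)

    chosen = []
    for _score, (start, end), label in best:
        if label == "None":
            continue  # classifier says this constituent is not an argument
        overlaps = any(start < e and s < end for (s, e), _ in chosen)
        if not overlaps:
            chosen.append(((start, end), label))
    return sorted(chosen)


# Tokens: This(0) company(1) last(2) year(3) sold(4) 1000(5) cars(6)
#         in(7) the(8) U.S.(9). Span assignment below is hypothetical.
candidates = {
    (0, 2): {"Agent": 0.8, "Patient": 0.1, "None": 0.1},
    (2, 4): {"AM-TMP": 0.9, "Patient": 0.1, "None": 0.1},
    (0, 4): {"Agent": 0.1, "Patient": 0.1, "None": 0.5},
    (5, 7): {"Agent": 0.2, "Patient": 0.8, "None": 0.1},
    (7, 10): {"AM-LOC": 0.9, "Agent": 0.1, "None": 0.1},
}

for span, label in label_arguments(candidates):
    print(span, label)
```

With these assumed spans the output reads Agent, AM-TMP, Patient, AM-LOC in sentence order, with the predicate "sold" sitting between AM-TMP and Patient.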

19 Outline. Tree-based Semantic Role Labeling. Forest-based Semantic Role Labeling: parsing into a forest, selecting candidates, extracting features on the forest, classifying. Experiments. Conclusion.

20 Forest. A hyper-graph made of nodes and hyper-edges. (Figure: packed forest over "the role of Celimene is played by Kim Cattrall" with nodes S, NP, PP, VP, AUX, VBN.)

21 Selecting Candidates. (Figure: packed forest over "the role of Celimene is played by Kim Cattrall" with the candidate nodes.)

22 Extracting features. Path to the predicate: NP↑NP↑S↓VP↓VP↓VBN. (Figure: packed forest over "the role of Celimene is played by Kim Cattrall".)

23 Extracting features. Path to the predicate. Candidate paths in the forest: NP↑S↓VP↓VP↓VBN and NP↑NP↑S↓VP↓VP↓VBN; the shortest is used.

24 Extracting features. Parent label. Path shown: NP↑S↓VP↓VP↓VBN. (Figure: packed forest over "the role of Celimene is played by Kim Cattrall".)

25 Extracting features. Parent label: taken in the shortest path, NP↑S↓VP↓VP↓VBN.
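One way to find the shortest path in a forest is a breadth-first search that first climbs through parents and then descends through children. The sketch below is an assumption about how this could be done, not the authors' algorithm: the node identifiers, the toy hyperedges and the name `shortest_path` are hypothetical, and for brevity it does not check that the step up to the top node and the step down from it belong to the same hyperedge.

```python
from collections import deque


def shortest_path(hyperedges, labels, cand, pred):
    """Shortest up-then-down label path from `cand` to `pred`."""
    parents, children = {}, {}
    for head, tails in hyperedges:
        for tail in tails:
            parents.setdefault(tail, set()).add(head)
            children.setdefault(head, set()).add(tail)

    start = (cand, "up")
    queue = deque([(start, labels[cand])])
    seen = {start}
    while queue:
        (node, phase), path = queue.popleft()
        if node == pred:
            return path
        # Going down is always allowed; going up only before the first descent.
        moves = [(c, "down", "↓") for c in sorted(children.get(node, ()))]
        if phase == "up":
            moves += [(p, "up", "↑") for p in sorted(parents.get(node, ()))]
        for nxt, nxt_phase, arrow in moves:
            state = (nxt, nxt_phase)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + arrow + labels[nxt]))
    return None


# Toy forest for "the role of Celimene is played by Kim Cattrall".
# Spans in the identifiers are token offsets; the structure is assumed.
hyperedges = [
    ("S[0,9]", ("NP[0,4]", "VP[4,9]")),             # PP attached inside the NP
    ("S[0,9]", ("NP[0,2]", "PP[2,4]", "VP[4,9]")),  # PP attached at S
    ("NP[0,4]", ("NP[0,2]", "PP[2,4]")),
    ("VP[4,9]", ("AUX[4,5]", "VP[5,9]")),
    ("VP[5,9]", ("VBN[5,6]", "PP[6,9]")),
]
labels = {node: node.split("[")[0]
          for head, tails in hyperedges for node in (head, *tails)}

print(shortest_path(hyperedges, labels, "NP[0,2]", "VBN[5,6]"))
# prints NP↑S↓VP↓VP↓VBN
```

Starting from the larger NP instead gives the same path, since that node attaches directly to S.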

26 New Features. Parsing score: fractional value (Mi et al., 2008); inside-outside; marginal probability. Shown on the slide: f(NP_3) and the path NP↑S↓VP↓VP↓VBN. (Figure: packed forest over "the role of Celimene is played by Kim Cattrall".)
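A node marginal of the kind this slide mentions can be computed with the inside-outside recursion on the hypergraph. The sketch below is a generic version under stated assumptions: the forest maps each node to a list of `(weight, children)` hyperedges, the weights 0.6 and 0.4 are invented for the example, and the exact definition of the fractional value f(·) used in the talk is not given in the transcript, so the result should be read as a plain posterior marginal.

```python
from math import prod


def node_marginals(forest, root):
    """Posterior marginal of every node: inside * outside / inside(root)."""
    order, seen = [], set()

    def visit(node):  # post-order: children before parents
        if node in seen:
            return
        seen.add(node)
        for _weight, tails in forest.get(node, []):
            for tail in tails:
                visit(tail)
        order.append(node)

    visit(root)

    inside = {}
    for node in order:
        edges = forest.get(node, [])
        inside[node] = 1.0 if not edges else sum(
            w * prod(inside[t] for t in tails) for w, tails in edges)

    outside = {node: 0.0 for node in order}
    outside[root] = 1.0
    for node in reversed(order):  # parents before children
        for w, tails in forest.get(node, []):
            for i, tail in enumerate(tails):
                rest = prod(inside[u] for j, u in enumerate(tails) if j != i)
                outside[tail] += outside[node] * w * rest

    return {n: inside[n] * outside[n] / inside[root] for n in order}


# Hypothetical weights: the parser prefers attaching the PP inside the NP.
forest = {
    "S": [(0.6, ("NP[0,4]", "VP[4,9]")),
          (0.4, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(1.0, ("NP[0,2]", "PP[2,4]"))],
    "VP[4,9]": [(1.0, ("AUX", "VP[5,9]"))],
    "VP[5,9]": [(1.0, ("VBN", "PP[6,9]"))],
}

marginals = node_marginals(forest, "S")
print(round(marginals["NP[0,4]"], 3), round(marginals["NP[0,2]"], 3))
```

In this toy forest the larger NP appears only in the 0.6-weight analysis, while the smaller NP appears in both, so their marginals are 0.6 and 1.0 respectively.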

27 Classifying. Scores per candidate node in the forest: S(Patient)=0.8, S(Agent)=0.1, S(None)=0.2, …; S(Patient)=0.5, S(Agent)=0.1, S(None)=0.3, …; S(Agent)=0.8, S(Patient)=0.1, S(None)=0.2, …

28 Classifying. Best scores: S(Patient)=0.8, …; S(Agent)=0.8, … Resulting labels for "the role of Celimene is played by Kim Cattrall": Patient, Agent.

29 Outline. Tree-based Semantic Role Labeling. Forest-based Semantic Role Labeling. Experiments. Conclusion.

30 Experiments. Corpus: CoNLL-2005 shared task. Sections 02-21 of PropBank for training, Section 24 for the development set, Section 23 for the test set. Total: 43,594 sentences and 262,281 arguments.

31 Experiments. Training sentences: parse into 1-best and forest; prune the forest using the inside-outside algorithm; train classifiers. Decoding sentences: parse into 1-best and forest; prune the forest using the inside-outside algorithm; use the classifiers.
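The slides do not spell out the pruning criterion, so the sketch below is an assumption: it uses a merit-based threshold over negative log-probability costs, where a hyperedge is dropped when the best tree passing through it is worse than the overall best tree by more than the threshold. The name `prune_forest`, the toy costs, and the thresholds 3 and 5 used in the example are hypothetical choices for illustration.

```python
import math


def prune_forest(forest, root, threshold):
    """Keep hyperedges whose best tree is within `threshold` of the best.

    `forest` maps node -> list of (cost, children); costs are assumed to be
    negative log-probabilities, so lower is better.
    """
    order, seen = [], set()

    def visit(node):  # post-order: children before parents
        if node in seen:
            return
        seen.add(node)
        for _cost, tails in forest.get(node, []):
            for tail in tails:
                visit(tail)
        order.append(node)

    visit(root)

    # Viterbi inside cost: cheapest sub-tree rooted at each node.
    beta = {}
    for node in order:
        edges = forest.get(node, [])
        beta[node] = 0.0 if not edges else min(
            cost + sum(beta[t] for t in tails) for cost, tails in edges)

    # Viterbi outside cost: cheapest context around each node.
    alpha = {node: math.inf for node in order}
    alpha[root] = 0.0
    for node in reversed(order):
        for cost, tails in forest.get(node, []):
            through = alpha[node] + cost + sum(beta[t] for t in tails)
            for tail in tails:
                alpha[tail] = min(alpha[tail], through - beta[tail])

    pruned = {}
    for node in order:
        kept = [(cost, tails) for cost, tails in forest.get(node, [])
                if alpha[node] + cost + sum(beta[t] for t in tails)
                - beta[root] <= threshold]
        if kept:
            pruned[node] = kept
    return pruned


# Toy forest with invented costs: the second S hyperedge is 3.5 worse.
forest = {
    "S": [(0.5, ("NP[0,4]", "VP[4,9]")),
          (4.0, ("NP[0,2]", "PP[2,4]", "VP[4,9]"))],
    "NP[0,4]": [(0.0, ("NP[0,2]", "PP[2,4]"))],
    "VP[4,9]": [(0.0, ("AUX", "VP[5,9]"))],
    "VP[5,9]": [(0.0, ("VBN", "PP[6,9]"))],
}

print(len(prune_forest(forest, "S", 3.0)["S"]))  # prints 1
print(len(prune_forest(forest, "S", 5.0)["S"]))  # prints 2
```

A larger threshold keeps more alternatives, which is the trade-off between forest size and coverage that the experiments explore.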

32 Features. Predicate lemma; path to predicate; path length; partial path; position; voice; head word/POS tag; …

33 Results on Dev Set. Columns: precision, recall, F. Rows: 1-best; 50-best; forest (p3), 9.63×10^5; forest (p5), 5.78×10^6.
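For readers unfamiliar with the three columns, the sketch below shows how precision, recall and F could be computed for one sentence, counting a predicted argument as correct only when both its span and its label match the gold annotation. The function name and the example arguments are invented for illustration and are not results from the talk.

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall and F1 over sets of (span, label) arguments."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: the last argument has the wrong right boundary.
gold = {((0, 2), "Agent"), ((2, 4), "AM-TMP"),
        ((5, 7), "Patient"), ((7, 10), "AM-LOC")}
predicted = {((0, 2), "Agent"), ((2, 4), "AM-TMP"),
             ((5, 7), "Patient"), ((7, 9), "AM-LOC")}

print(precision_recall_f1(predicted, gold))  # prints (0.75, 0.75, 0.75)
```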

34 Results on Test Set.

35 Outline. Tree-based Semantic Role Labeling. Forest-based Semantic Role Labeling. Experiments. Conclusion.

36 Conclusion. Forest: compactly encodes exponentially many parses; enlarges the candidate space; allows richer features; improves labeling quality significantly. It is not necessary to use a very large forest. k-best lists can NOT simulate a forest. Future work: features on the forest.

37 Thank you! Patient

