1
Modelling Human Thematic Fit Judgments
IGK Colloquium, 3/2/2005
Ulrike Padó
2
Overview
- (Very) quick introduction to my framework
- Testing the Semantic Module:
  - Different input corpora
  - Smoothing
- Comparing the Semantic Module to standard selectional preference methods
3
Modelling Semantic Processing
General idea: build a probabilistic, large-scale, broad-coverage model of syntactic and semantic sentence processing.
4
Semantic Processing
Assign thematic roles on the basis of co-occurrence statistics from semantically annotated corpora.
Corpus-based frequency estimates of:
- Semantic subcategorisation: the probability of seeing the role with the verb
- Selectional preferences: the probability of seeing the argument head in a role, given the verb frame
(A sketch of how the two estimates combine follows below.)
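A minimal sketch of how these two estimates could combine into a thematic fit score. The count tables, the unsmoothed maximum-likelihood estimates, and all names here are illustrative assumptions, not the module's actual implementation:

    from collections import defaultdict

    # Hypothetical count tables; in the model they would be extracted
    # from a semantically annotated corpus (PropBank or FrameNet).
    verb_counts = defaultdict(int)        # verb -> frequency
    role_counts = defaultdict(int)        # (verb, role) -> frequency
    frame_role_counts = defaultdict(int)  # (verb, frame, role) -> frequency
    head_counts = defaultdict(int)        # (verb, frame, role, head) -> frequency

    def thematic_fit(verb, frame, role, head):
        """P(role | verb) * P(head | verb, frame, role), both as raw MLEs."""
        p_role = role_counts[(verb, role)] / max(verb_counts[verb], 1)
        seen = frame_role_counts[(verb, frame, role)]
        p_head = head_counts[(verb, frame, role, head)] / max(seen, 1)
        return p_role * p_head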
5
Testing the Semantic Module
Evaluate just the thematic fit of verbs and argument phrases.
Evaluation (both sketched below):
1. Correlate predictions with human judgments
2. Role labelling (prefer the correct role)
Try:
- Different input corpora
- Smoothing
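A sketch of the two evaluations under stated assumptions: the slide does not say which correlation coefficient is used, so Spearman's rank correlation stands in here, and the labelling criterion (the gold role must receive the higher fit score) is one reading of "prefer correct role":

    from scipy.stats import spearmanr

    def correlate(model_scores, human_ratings):
        # Evaluation 1: correlate model predictions with human judgments.
        rho, p = spearmanr(model_scores, human_ratings)
        return rho, p

    def labelling_accuracy(items):
        # Evaluation 2: items are (agent_fit, patient_fit, gold_role) triples;
        # the model labels correctly if the gold role gets the higher score.
        correct = sum((a > p) == (gold == "agent") for a, p, gold in items)
        return correct / len(items)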
6
Training Data
Frequency counts from:
- PropBank (ca. 3,000 verb types)
  - Very specific domain
  - Relatively flat, syntax-based annotation
- FrameNet (ca. 1,500 verb types)
  - Deep semantic annotation: frames encode situations and group verbs that describe similar events, together with their arguments
  - Extracted from a balanced corpus
  - Skewed sample due to frame-wise annotation
7
Development/Test Data
Development: 60 verb-argument pairs from McRae et al. (1998)
- Two judgments for each pair: Agent and Patient
- Used to determine the optimal clustering parameters (number of clusters, smoothing)
Test: 50 verb-argument pairs, i.e. 100 data points
8
Sparse Data
Raw frequencies are sparse:
- 1 (Dev) / 2 (Test) pairs seen in PropBank
- 0 (Dev) / 2 (Test) pairs seen in FrameNet
Use semantic classes as a level of abstraction: class-based smoothing.
9
Smoothing
Reconstruct probabilities for unseen data.
Smoothing by verb and noun classes: count class members instead of word tokens (see the sketch below).
Compare two alternatives:
- Hand-constructed classes
- Induced verb classes (clustering)
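A sketch of the class-based idea, with hypothetical names throughout: the count of a specific head word in a context (e.g. a verb-frame-role triple) is replaced by the pooled count of its semantic class. Spreading the pooled mass uniformly over class members is one simple choice, not necessarily the one used in the talk:

    def class_smoothed_count(head, context, counts, word_class, class_members):
        # word_class: word -> class (hand-built or induced);
        # class_members: class -> list of member words.
        cls = word_class[head]
        pooled = sum(counts.get((context, m), 0) for m in class_members[cls])
        return pooled / len(class_members[cls])

    # Toy example: "cake" is unseen, but its class-mate "apple" was seen
    # 5 times, so "cake" gets the smoothed count 5 / 2 = 2.5 instead of 0.
    counts = {(("eat", "Ingestion", "Patient"), "apple"): 5}
    word_class = {"apple": "food", "cake": "food"}
    class_members = {"food": ["apple", "cake"]}
    print(class_smoothed_count("cake", ("eat", "Ingestion", "Patient"),
                               counts, word_class, class_members))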
10
Hand-constructed Verb and Noun Classes
- WordNet: use the top-level ontology and synsets as noun classes
- VerbNet: use the top-level classes for verbs
- Presumably correct and reliable
Result: no significant correlations with human data for either training corpus.
11
Induced Verb Classes
Automatically cluster verbs:
- Group by similarity of argument heads, paths from argument to verb, frames, and role labels
- Determine the optimal number of clusters and the parameters of the clustering algorithm on the development set
(A sketch of the induction step follows below.)
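A sketch of what the induction step could look like. The slide does not name the clustering algorithm, so k-means over bag-of-feature vectors is an assumption here, and the feature values are toy data:

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction import DictVectorizer

    # Hypothetical per-verb feature counts: argument heads, argument-to-verb
    # paths, and role labels, as listed above.
    verb_features = {
        "frighten": {"head=child": 3, "path=NP^S": 2, "role=Experiencer": 4},
        "scare":    {"head=child": 2, "path=NP^S": 3, "role=Experiencer": 3},
        "eat":      {"head=apple": 5, "path=NP^VP": 4, "role=Patient": 5},
        "devour":   {"head=prey": 2,  "path=NP^VP": 2, "role=Patient": 2},
    }

    verbs = sorted(verb_features)
    X = DictVectorizer().fit_transform([verb_features[v] for v in verbs])

    # The number of clusters is tuned on the development set, as the slide says.
    for k in (2, 3):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        print(k, dict(zip(verbs, labels)))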
12
Induced Classes (PB/FN)
(data points covered, correlation / significance)

                   PB         FN
    Raw data       2, -/-     2, -/-
    All arguments  59, ns     12, 0.55 / p < 0.05
    Just NPs       48, ns     16, 0.56 / p < 0.05
13
Results
- Hand-built classes do not work (with this amount of data)
- The module achieves reliable correlations with the FN data: an important result for the overall feasibility of my model
14
Adding Noun Classes (PB/FN)
(data points covered, correlation / significance)

                                  Covered   Corr. / significance
    Raw data (PB)                 2         -/-
    Raw data (FN)                 2         -/-
    PB, all args, noun classes    4         1.0 / p < 0.01
    FN, just NPs, noun classes    18        0.63 / p < 0.01
15
Results
- Hand-built classes do not work (with this amount of data)
- The module achieves reliable correlations with the FN data
- Adding noun classes helps a little more
16
Comparison with Selectional Preference Methods
We have established that our system reliably predicts human data.
How does it compare to standard computational-linguistics methods?
17
Selectional Preference Methods
- Clark & Weir (2002): add data points by finding the topmost class in WordNet that still reliably mirrors the target word's frequency
- Resnik (1996): quantify the contribution of a WordNet class to the overall preference strength of the verb (sketched below)
Both rely on WordNet noun classes; neither smooths over verb classes.
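For concreteness, a sketch of Resnik's measures, assuming the class distributions have already been estimated from the corpus: the verb's preference strength S(v) is the KL divergence between P(c|v) and the prior P(c), and the selectional association of a class is its share of S(v):

    import math

    def preference_strength(p_class_given_verb, p_class):
        # S(v) = sum_c P(c|v) * log(P(c|v) / P(c))
        return sum(p * math.log(p / p_class[c])
                   for c, p in p_class_given_verb.items() if p > 0)

    def selectional_association(c, p_class_given_verb, p_class):
        # A(v, c): class c's contribution to S(v), normalised by S(v).
        s = preference_strength(p_class_given_verb, p_class)
        p = p_class_given_verb.get(c, 0.0)
        return p * math.log(p / p_class[c]) / s if p > 0 and s > 0 else 0.0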
18
Selectional Preference Methods (PB/FN)
(data points covered, correlation / significance, role labelling coverage/accuracy)

                         Covered   Corr. / significance   Labelling (Cov/Acc)
    Sem. Module (1)      18        0.63 / p < 0.01        38% / 47.4%
    Sem. Module (2)      16        0.56 / p < 0.05        30% / 60%
    Clark & Weir (PB)    72        ns                     84% / 50%
    Clark & Weir (FN)    23        ns                     36% / 50%
    Resnik (PB)          75        ns                     74% / 48.6%
    Resnik (FN)          46        ns                     50% / 48%
19
Results
Too little input data:
- No significant results for the selectional preference models
- Small coverage for the Semantic Module
The Semantic Module manages to make predictions all the same: it relies on verb clusters, and verbs are less sparse than nouns in small corpora.
Next step: annotate a larger corpus with FN roles.
20
Annotating the BNC
Annotate a large, balanced corpus, the BNC:
- More data points for verbs covered in FN
- More verb coverage (though only purely syntactic annotation for unknown verbs)
Results:
- Annotation is relatively sensible and reliable for non-FN verbs
- Frame-wise annotation in FN causes problems for FN verbs