
1 Two Discourse Driven Language Models for Semantics
Haoruo Peng and Dan Roth, ACL 2016

2 A Typical "Semantic Sequence" Problem
First, let's start with a motivating example:
[Kevin] was robbed by [Robert], and [he] was arrested by the police.
Co-reference: whom does "he" refer to?
QA: Who was arrested? / Was Kevin arrested by the police?
Event prediction: What happened to Robert after the robbery?
"Understanding what could come next" is central to multiple natural language understanding tasks.

3 Two Key Questions
How do we model the sequential nature of natural language at a semantic level?
- What do we mean by "a semantic level"?
- Outcome: Semantic Language Models (SemLMs), plus a quality evaluation of SemLMs
How do we use the models to better support NLU tasks?
- Application to co-reference
- Application to shallow discourse parsing

4 Frame-Based Semantic Sequence
For the modelling problem, we choose the frame-based definition [Chambers and Jurafsky, 2008; Bejan, 2008; Jans et al., 2012; Granroth-Wilding et al., 2015; Pichotta and Mooney, 2016].
[Kevin] was robbed by [Robert], and [he] was arrested by the police.
Predicate "rob": subject = Robert, object = Kevin. Predicate "arrest": subject = the police, object = he.

5 What We Do Differently: Infuse Discourse Information
[Kevin] was robbed by [Robert], but the police mistakenly arrested [him].
Beyond the predicate frames and their arguments, we bring the discourse connective ("but") into the mix, between the frames. This discourse information is important.

6 Two Different Sequences
[Kevin] was robbed by [Robert], but the police mistakenly arrested [him].
Frame-Chain (FC) sequences, built with SRL and shallow discourse parsing, follow the chain of frames:
  rob.01  but  arrest.01  EOS
Entity-Centered (EC) sequences, built with SRL and co-reference, follow one entity (here, Kevin) through its argument roles:
  rob.01-obj  but  arrest.01-obj
Two models for different usages; a sketch of how both sequences are assembled follows.
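A minimal sketch of assembling the two sequences, assuming simplified SRL, discourse-parse, and co-reference output; the data structures and field names are our assumptions, not the authors' code.

```python
# Sketch: turning one sentence's annotations into FC and EC token sequences.
# The input format (frames, connectives, coref map) is a simplifying assumption.

frames = [  # from SRL, in textual order: (predicate sense, {role: mention})
    ("rob.01",    {"sub": "Robert",     "obj": "Kevin"}),
    ("arrest.01", {"sub": "the police", "obj": "him"}),
]
connectives = {1: "but"}                     # connective preceding frame i (discourse parse)
coref = {"Kevin": "Kevin", "him": "Kevin"}   # mention -> canonical entity (co-reference)

def fc_sequence(frames, connectives):
    """Frame-Chain sequence: predicates with connectives in between, closed by EOS."""
    seq = []
    for i, (pred, _) in enumerate(frames):
        if i in connectives:
            seq.append(connectives[i])
        seq.append(pred)
    return seq + ["EOS"]

def ec_sequence(frames, connectives, coref, entity):
    """Entity-Centered sequence: predicate-role tokens for one entity's mentions."""
    seq = []
    for i, (pred, args) in enumerate(frames):
        for role, mention in args.items():
            if coref.get(mention) == entity:
                if seq and i in connectives:
                    seq.append(connectives[i])
                seq.append(f"{pred}-{role}")
    return seq

print(fc_sequence(frames, connectives))                 # ['rob.01', 'but', 'arrest.01', 'EOS']
print(ec_sequence(frames, connectives, coref, "Kevin")) # ['rob.01-obj', 'but', 'arrest.01-obj']
```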

7 Semantic Language Model (SemLM)
SemLM units:
- Disambiguated predicates (rob.01), or disambiguated predicates with an argument role label (rob.01-obj)
- Discourse markers, including connectives ("but") and the sequence-end marker (EOS)
A SemLM applies language-model techniques to these semantic sequences, capturing semantic knowledge.

8 Language Model Implementations
- N-gram: unigram (UNI), bigram (BG), trigram (TRI)
- Skip-Gram (SG)
- Continuous Bag-of-Words (CBOW)
- Log-bilinear (LB): an extension of CBOW
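A minimal sketch of two of these implementations over SemLM token sequences. The toy corpus, the add-one smoothing choice, and the hyperparameters are our assumptions (the paper's settings may differ); gensim >= 4.0 is assumed for the skip-gram/CBOW side.

```python
# Sketch: trigram and skip-gram/CBOW SemLMs over semantic unit sequences.
from collections import Counter
from gensim.models import Word2Vec  # gensim >= 4.0

corpus = [  # one semantic sequence per document (toy example)
    ["rob.01", "but", "arrest.01", "EOS"],
    ["rob.01", "and", "indict.01", "convict.01", "EOS"],
]

# (1) Trigram model with add-one smoothing. Bigram counts are taken only at
# positions that have a continuation, so they are valid trigram-history counts.
vocab = {u for seq in corpus for u in seq}
trigrams, bigrams = Counter(), Counter()
for seq in corpus:
    for i in range(len(seq) - 2):
        trigrams[tuple(seq[i:i+3])] += 1
        bigrams[tuple(seq[i:i+2])] += 1

def p_trigram(w, u, v):
    """P(w | u, v) with add-one smoothing over the unit vocabulary."""
    return (trigrams[(u, v, w)] + 1) / (bigrams[(u, v)] + len(vocab))

# (2) Skip-gram (sg=1) or CBOW (sg=0) embeddings over the same sequences.
sg_model = Word2Vec(corpus, vector_size=100, window=3, min_count=1, sg=1)

print(p_trigram("arrest.01", "rob.01", "but"))
print(sg_model.wv.most_similar("arrest.01", topn=3))
```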

9 Building SemLMs from Scratch
Semantic Role Labeling Shallow Discourse Parsing End-to-End Co-reference A Large Collection of Text Documents 20 Years NYT 1.8M Documents Annotated Documents Illinois NLP Packages FrameNet Mapping Augment to Verb Phrases Augment to Compound Verbs Filter Rare Units ( >=20 times) Add “UNK” SemLM Units SemLM Vocabulary

10 Building SemLMs from Scratch
Augment to verb phrases:
- + preposition: "take over" -> take.03(over)
- + negation: "not like" -> like.01(not)
- be + adjective: "be happy" -> be.02(happy)
Augment to compound verbs:
- "eat and drink" -> eat.01-drink.01
- "decide to buy" -> decide.01-buy.01
Together, these give the actual vocabulary, e.g. "He doesn't want to give up." -> want.01(not)-give.08(up); see the sketch below.
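A small sketch of the unit-string conventions above. The helper names are hypothetical, and detecting the triggers (negation, particles, verb chains) would come from parsing, which is elided here.

```python
# Sketch: composing augmented SemLM unit strings.

def verb_phrase_unit(pred_sense, particle=None, negated=False):
    """e.g. verb_phrase_unit('take.03', particle='over') -> 'take.03(over)'."""
    mods = [m for m in (particle, "not" if negated else None) if m]
    return f"{pred_sense}({','.join(mods)})" if mods else pred_sense

def compound_verb_unit(*pred_senses):
    """e.g. compound_verb_unit('decide.01', 'buy.01') -> 'decide.01-buy.01'."""
    return "-".join(pred_senses)

# "He doesn't want to give up." -> want.01(not)-give.08(up)
unit = compound_verb_unit(verb_phrase_unit("want.01", negated=True),
                          verb_phrase_unit("give.08", particle="up"))
print(unit)
```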

11 Building SemLMs from Scratch
The same pipeline (20 years of NYT -> Illinois NLP annotation -> FrameNet mapping and augmentation -> filtered vocabulary with "UNK") feeds SemLM training.
SemLM training uses similar settings as standard language models: two models (FC / EC) x four implementations = 8 different SemLMs.

12 Design Choices
Generate a probabilistic model:
- vs. script learning, e.g. narrative schemas [Chambers and Jurafsky, 2009], which is too sparse.
- Preprocessing (SRL, co-reference, etc.) is noisy, but large data brings robustness (shown in the quality evaluation).
Entity modeling in semantic sequences: the right level of abstraction is hard to determine.
- [The doctor] told [Susan] that [she] had been busy -> "Person" suffices
- [The doctor] told [Susan] that [she] had cancer -> "Doctor/Patient" is needed
- [Mary] told [Susan] that [she] had cancer -> ?

13 Quality of SemLMs
Two standard tests: the perplexity test and the narrative cloze test.
Test corpora:
- NYT hold-out data (10% of the NYT corpus) + automatic annotation
- Gold PropBank data with frame chains + gold annotation
- Gold OntoNotes data with co-reference chains + gold annotation
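A minimal sketch of the perplexity computation on held-out sequences, assuming a model callback prob(history, unit) -> P(unit | history); the history-truncation convention is our assumption.

```python
# Sketch: perplexity = 2 ** (-average log2 probability per unit).
import math

def perplexity(sequences, prob, order=3):
    """prob(history, unit) -> P(unit | history); history is the previous
    (order - 1) units, truncated at the start of each sequence."""
    log_sum, n = 0.0, 0
    for seq in sequences:
        for i, unit in enumerate(seq):
            history = tuple(seq[max(0, i - order + 1):i])
            log_sum += math.log2(prob(history, unit))
            n += 1
    return 2 ** (-log_sum / n)

# Usage (hypothetical): ppl = perplexity(heldout_sequences, my_semlm_prob)
```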

14 Quality of SemLMs: Perplexity Test

15 Quality of SemLMs: Narrative Cloze Test (MRR; full results in the paper)
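A minimal sketch of the mean reciprocal rank (MRR) metric used for the narrative cloze test; the test-case format and scoring interface are our assumptions.

```python
# Sketch: MRR over cloze cases where the model ranks candidate units
# for a held-out position in each semantic sequence.

def mrr(test_cases, score):
    """test_cases: iterable of (history, gold_unit, candidates), where gold_unit
    is among candidates; score(history, unit) ranks candidates high-to-low."""
    total = 0.0
    for history, gold, candidates in test_cases:
        ranked = sorted(candidates, key=lambda u: score(history, u), reverse=True)
        total += 1.0 / (ranked.index(gold) + 1)
    return total / len(test_cases)
```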

16 Application to NLP Tasks
Co-reference resolution with a mention-pair model:
- Add SemLM conditional probabilities as additional features.
- Wiseman's system achieves better results, but it is orthogonal to ours.
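One plausible shape for these features, using the EC model: score how well the candidate mention's predicate-role token follows the antecedent's. The function and feature names are hypothetical; the paper's exact feature set differs.

```python
# Sketch: EC-SemLM conditional-probability features for one mention pair.

def semlm_pair_features(antecedent_unit, mention_unit, connective, prob):
    """antecedent_unit / mention_unit are predicate-role tokens such as
    'rob.01-obj' and 'arrest.01-obj'; prob(history, unit) is an EC SemLM."""
    return {
        "semlm_cond":      prob((antecedent_unit,), mention_unit),
        "semlm_cond_conn": prob((antecedent_unit, connective), mention_unit),
    }
```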

17 Application to NLP Tasks
Shallow discourse parsing, in the CoNLL shared-task setting (connective sense classification):
- Again, add SemLM conditional probabilities as additional features.
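Analogously, with the FC model one can score how well a connective fits between the two frames surrounding it; again a hypothetical sketch, not the paper's exact feature set.

```python
# Sketch: FC-SemLM features for connective sense classification.

def connective_features(prev_pred, connective, next_pred, prob):
    """prev_pred / next_pred are frame tokens such as 'rob.01' and
    'arrest.01'; prob(history, unit) is an FC SemLM."""
    return {
        "p_conn_given_prev": prob((prev_pred,), connective),
        "p_next_given_conn": prob((prev_pred, connective), next_pred),
    }
```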

18 Conclusion
How do we model the sequential nature of NL at a semantic level?
- SemLMs: discourse driven; two models, four implementations; high quality.
How do we use the modelling to better support NLU tasks?
- Two tasks, which utilize the two models separately.
- SemLM conditional probabilities as features; trained embeddings.
SemLMs available: email me at …
Thank You!

19 Example Output
FC-LB:
P(convict.01 | arrest.01, indict.01) = 8.2×10⁻³
P(rescue.01 | arrest.01, indict.01) = 6.7×10⁻⁵
EC-LB:
P(arrest.01-obj | rob.01-obj, but) = 2.1×10⁻³
P(arrest.01-obj | rob.01-sub, but) = 1.7×10⁻⁴
P(arrest.01-obj | rob.01-sub, and) = 7.3×10⁻³

