Slide 1: Farrar on Ontologies for NLP
Fourth workshop on multimodal semantic representation, Tilburg, 10-11 Jan 2005
Slide 2: How I understand the paper
Activity 4, sometimes titled
– Logico-semantic relations / entities
– Semantic Data Categories
Farrar’s paper discusses
– Ontologies for NLP (cf. referential descriptors)
– Defense of Bateman 1992, “The theoretical status of ontologies in NLP”
Main claim: ontologies for NLP need to be shaped by linguistic concerns
Slide 3: The ontology debate (Hobbs 1985, quoted by Bateman)
“Semantics is the attempted specification of the relation between language and the world. (..) There is a spectrum of choices one can make (..). At one end of the spectrum (..), one can adopt the “correct” theory of the world, the one given by quantum mechanics and the other sciences. (..) At the (other) end, one can assume a theory of the world that is isomorphic to the way we talk about it.”
Slide 4: The standard response to Hobbs’s question
“Have your cake and eat it”: multilevel semantics (e.g., various systems at Philips and BBN; early theories of underspecification):
– a `deep’ semantics that’s as close to the scientific facts as you like
– a `shallow’ semantics that’s as close to linguistic structure as you like
– any finite number of levels in between
– a computable mapping between adjacent levels
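A minimal sketch of what such a pipeline could look like, assuming a toy string-substitution mapping between adjacent levels; the constant names and expansions below are invented purely for illustration:

```python
# Toy illustration of multilevel semantics: each level is a formula string,
# and a computable mapping rewrites shallow constants into deeper expressions.
# The constant names and expansions below are invented for illustration.

SHALLOW_TO_DEEP = {
    # shallow constant -> one possible deep paraphrase (as a λ-expression)
    "AM": "(λx. country(headq(x)) = USA)",
    "POS": "(λx. λy. part_of(x, y))",
}

def deepen(shallow_formula: str, lexicon: dict[str, str]) -> str:
    """Map a formula to the next (deeper) level by expanding shallow constants."""
    deep = shallow_formula
    for constant, expansion in lexicon.items():
        deep = deep.replace(constant, expansion)
    return deep

print(deepen("exists x (company(x) & AM(x))", SHALLOW_TO_DEEP))
# -> exists x (company(x) & (λx. country(headq(x)) = USA)(x))
```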
Slide 5: Farrar & Bateman
– Defend the Generalised Upper Model (Bateman et al.)
– Appear to use multilevel semantics: “separation between linguistic and nonlinguistic levels of knowledge”; a conceptual/semantic distinction
– Appear to argue against classic denotational (e.g., Montague-style) semantics (e.g., this seemed to be the gist of the example school = λx[purpose(x)=learning], where x is either an institution or a building)
Slide 6: To find out how the Upper Model compares with denotational multilevel semantics, let’s look at some examples.
Slide 7: Example 1: `American’ in TENDUM (Bunt et al., Philips/IPO)
Shallow semantics:
– `American company’: λx(company(x) & AM(x))
– `American passenger’: λx(passenger(x) & AM(x))
– `American airplane’: λx(airplane(x) & AM(x))
Deep semantics: replace `AM’ by one of
– λx(country(headq(x))=USA) [company]
– λx(nationality(x)=USA) [passenger]
– λx(country(headq(carrier(x)))=USA) [airplane]
– λx(country(headq(builder(x)))=USA) [airplane]
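A minimal sketch of this disambiguation step, not taken from TENDUM itself: the deep expansions are the ones listed on this slide, but the sort-based lookup below is an invented illustration:

```python
# Illustrative only: expand the shallow constant AM into a deep expression,
# choosing the expansion by the sort of the head noun it modifies.
# The expansions come from the slide; the lookup mechanism is invented,
# not TENDUM's actual machinery.

DEEP_AM = {
    "company":   ["λx(country(headq(x))=USA)"],
    "passenger": ["λx(nationality(x)=USA)"],
    "airplane":  ["λx(country(headq(carrier(x)))=USA)",
                  "λx(country(headq(builder(x)))=USA)"],
}

def deep_readings(head_noun: str) -> list[str]:
    """Return the deep expansion(s) of AM licensed by the head noun's sort."""
    return DEEP_AM.get(head_noun, [])

print(deep_readings("airplane"))  # two readings: carrier-based and builder-based
```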
Slide 8: The idea behind this trick
– Shallow semantics: one shallow constant `AM’ used for many different properties, matching English usage
– Deep semantics: AM replaced by an expression that has a straightforward denotation in the world
Slide 9: Generalised Upper Model (GUM)
– The GUM appears to opt for shallow constants
– Often, these appear to cover cases that are semantically very different
– It is not always clear how they are linked with deep (denotational) expressions (cf. Bunt & Romary 2004: “Do concepts have model-theoretic semantics?”)
– This may result in formulas whose meaning we don’t really understand
Slide 10: Example 2: Generalised possession (after Bateman 1992)
– The handle of the door, the door’s handle, .. [part-of relation]
– The desk of John, John’s desk, .. [ownership relation]
– The son of Abraham, Abraham’s son, .. [“son of” relation]
– The father of Isaac, Isaac’s father, .. [“father of” relation]
Slide 11: Shallow relation constant POS (part of, owned by, …)
– The handle of the door: λx(POS(x,door) & handle(x))
– The desk of John: λx(POS(x,John) & desk(x))
But:
– Abraham’s son: λx(POS(x,Abraham) & son(x)) ???
– Isaac’s father: λx(POS(x,Isaac) & father(x)) ???
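The worry can be made concrete with a small sketch (not from Bateman or the GUM; the deep relation names below are invented): expanding POS requires consulting the head noun, and in the kinship cases the only sensible expansion is the kinship relation itself, which the shallow constant merely obscures:

```python
# Illustrative mapping from the shallow POS constant to deep relations,
# keyed by the head noun of the possessed entity. The relation names
# (part_of, owned_by, child_of, parent_of) are invented for illustration.

DEEP_POS = {
    "handle": "part_of",    # the handle of the door
    "desk":   "owned_by",   # the desk of John
    "son":    "child_of",   # Abraham's son: 'possession' is just the kinship relation
    "father": "parent_of",  # Isaac's father: likewise
}

def expand_pos(head_noun: str, possessor: str) -> str:
    """Rewrite POS(x, possessor) as a deep relation chosen by the head noun."""
    relation = DEEP_POS.get(head_noun)
    if relation is None:
        raise ValueError(f"no deep reading of POS known for {head_noun!r}")
    return f"{relation}(x, {possessor})"

print(expand_pos("handle", "door"))   # part_of(x, door)
print(expand_pos("son", "Abraham"))   # child_of(x, Abraham): POS itself did no real work
```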
Slide 12: The issue illustrated by possession
– Is a shared form sufficient for postulating a shared (shallow) representation (e.g., POS in the case of generalised possession)?
– This issue can also be illustrated by focussing on the definite determiner.
Slide 13: Example 3: Identifiability
The Generalised Upper Model works with abstract notions like definiteness (identifiability):
– The pope lives in Italy
– He is the son of a rich banker
– A man collapsed. (..) The man died
– He is the best left-footer in Scotland
All of these could be generated from something like “.. IDENT pope ..”, “.. IDENT son ..”, etc.
Slide 14: Identifiability (ctd.)
– Using one constant IDENT does not tell us what the different usages of the definite article have in common.
– In NLG, it could put an unreasonable burden on previous modules (which decide whether to generate IDENT or not).
– Perhaps the right question is not “How close to NL should an ontology be?”, but “How do we link the different levels of meaning?”
Slide 15: Example 4: The weather
The deepest question in this area: How deep should deep representations be? (Quantum mechanics, cf. Hobbs??)
`The wind blows (fiercely), and it’s snowing too’
What we want:
– Generate this from numerical weather data
– Interpret it and draw inferences
– Note: `it’ ≠ `the wind’
Slide 16: `The wind blows (fiercely)’
Suppose
– shallow rep. = blow(w)
– deep rep. = speed(w) > 50mph
Mapping from shallow to deep: blow = λx(speed(x) > 50mph)
The wind blows
=shallow= blow(w)
=deep= λx(speed(x) > 50mph)(w) = speed(w) > 50mph
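The same derivation restated as a β-reduction in amsmath notation (nothing added beyond what the slide says):

```latex
% A typeset restatement of the derivation on this slide (requires amsmath).
\begin{align*}
\text{The wind blows}
  &\;\Rightarrow_{\text{shallow}}\; \mathit{blow}(w) \\
  &\;\Rightarrow_{\text{deep}}\; \bigl(\lambda x.\, \mathit{speed}(x) > 50\,\mathrm{mph}\bigr)(w) \\
  &\;=\; \mathit{speed}(w) > 50\,\mathrm{mph}
    && \text{($\beta$-reduction)}
\end{align*}
```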
Slide 17: `It is snowing’
– What’s a suitable shallow representation? `it’ does not refer.
– Maybe just an atomic proposition `Snow’
– Possible mapping from shallow to deep: Snow = ∃x(precipitation(x) & type(x)=snow & quantity(x)>10mm p/h)
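A minimal sketch of the generation direction mentioned on slide 15 (from numerical weather data to the two shallow propositions); the thresholds come from these slides, while the data fields and function names are invented for illustration:

```python
# Toy generator: map numerical weather data onto the shallow propositions
# 'blow(w)' and 'Snow', using the thresholds from the slides (50 mph, 10 mm/h).
# Data fields and function names are invented for illustration.

def shallow_weather_report(wind_speed_mph: float,
                           snowfall_mm_per_h: float) -> list[str]:
    """Return the shallow propositions licensed by the numerical data."""
    propositions = []
    if wind_speed_mph > 50:        # deep: speed(w) > 50mph
        propositions.append("blow(w)")
    if snowfall_mm_per_h > 10:     # deep: ∃x(precipitation(x) & type(x)=snow & quantity(x)>10mm p/h)
        propositions.append("Snow")
    return propositions

def realise(propositions: list[str]) -> str:
    """Very rough surface realisation of the two shallow propositions."""
    clauses = []
    if "blow(w)" in propositions:
        clauses.append("the wind blows")
    if "Snow" in propositions:
        clauses.append("it's snowing")
    if not clauses:
        return "Nothing to report."
    return ", and ".join(clauses).capitalize() + "."

print(realise(shallow_weather_report(wind_speed_mph=55, snowfall_mm_per_h=12)))
# -> "The wind blows, and it's snowing."
```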
Slide 18: Questions
– Are these the kinds of mismatches between NL and reality that you see as the main challenges for building ontologies that are useful to NLP?
– Does the proposed (classical multilevel semantics) approach look reasonable to you?
– How does this approach compare to the Generalised Upper Model?