QuASI: Question Answering using Statistics, Semantics, and Inference Marti Hearst, Jerry Feldman, Chris Manning, Srini Narayanan Univ. of California-Berkeley.


1 QuASI: Question Answering using Statistics, Semantics, and Inference
Marti Hearst, Jerry Feldman, Chris Manning, Srini Narayanan
Univ. of California-Berkeley / ICSI / Stanford Univ.

2 Outline
- Project Overview
- Motivating Example
- Research Approaches
  - Text-based Analysis
  - Concept-based Analysis
- Summary

3 Main Goals
Support Question-Answering and NLP in general by:
- Deepening our understanding of concepts that underlie all languages
- Creating empirical approaches to identifying semantic relations from free text
- Developing probabilistic inferencing algorithms

4 Two Main Thrusts
Text-based:
- Use empirical corpus-based techniques to extract simple semantic relations
- Combine these relations to perform simple inferences
  - “statistical semantic grammar”
Concept-based:
- Determine language-universal conceptual principles
- Determine how inferences are made among these

5 Principal Project Personnel
Text-based:
- Prof. Marti Hearst
- Prof. Chris Manning
Concept-based:
- Prof. Jerry Feldman
- Dr. Srini Narayanan

6 Target System: Overview
[Diagram. Labels: Input, Training Corpora, Cognitive Linguistics, Statistical Semantic Parser, Semantic Relations, Universal Schemas, Probabilistic Knowledge, Inference Algorithms, Answers, Other Applications, Phase Two]

7 Motivating Example
Anthrax Scare Continues to Paralyze the Federal Government
The inhalation anthrax scare threatens to cripple several Federal Government activities. Legislation in Congress continues to remain at a standstill while the Senate is conducting business at a reduced pace. The Postal service confirms that manual inspections for anthrax spores have reduced mail processing and delivery to a slow crawl.

8 Motivating Example: Labeling Conceptual Relations
Anthrax Scare[NN] Continues to[Aspect] Paralyze[Force Dynamic] the Federal Government[NN]
The inhalation anthrax scare[NN] threatens to cripple[Force Dynamic] several Federal Government activities[NN]. Legislation in Congress continues to[Aspect] remain at a standstill[ES Map] while the Senate is conducting[Aspect] business at a reduced pace[ES Map]. The Postal service[NN] confirms that manual inspections[NN] for anthrax spores[NN] have reduced[Aspect, Scale] mail processing and delivery[NN] to a slow crawl[ES Map].

9 Text-based Analysis

10 Main Tasks
- Modify probabilistic parsing algorithms to:
  - Take semantic relations into account
  - Better support ambiguity resolution
  - Support co-reference resolution
- Automate identification of semantic relations via machine learning by:
  - Leveraging lexical ontologies
  - Building large training sets via bootstrapping techniques

11 Towards better statistical parsers: Head Corner Based Derivation Process
- Start with a known goal category, which is the start symbol of the grammar
- First find a head for that constituent, and then parse outward from that head
- This approach:
  - Better exploits the headed structure of natural language
  - Lets us work outward from “islands of certainty”

12 Towards better statistical parsers: Head Corner Based Derivation Process
- Once we have a head, we decide:
  - what kind of phrase it heads
  - what kinds of arguments the head is likely to have
- Then recursively apply this procedure to each argument
- This gives us a generative probabilistic model that assigns probabilities to sentences
- Crucially, we always have governing and less oblique heads available, thus supporting disambiguation
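
The head-outward derivation on slides 11 and 12 can be sketched as a toy generative model. This is only an illustration under stated assumptions, not the project's parser: the `Node` class, the `tree_log_prob` function, the three probability tables, and every number in them are hypothetical and invented for the example. For simplicity, the sketch conditions each argument's category on its governing head but picks the argument's own head from its category alone.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    category: str                               # phrase category, e.g. "S" or "NP"
    head: str                                   # head word of the phrase
    args: list = field(default_factory=list)    # argument subtrees


# Hypothetical toy tables; all values are invented for illustration.
P_HEAD = {("S", "paralyze"): 0.5, ("NP", "scare"): 0.4, ("NP", "government"): 0.3}
P_ARG = {("paralyze", "NP"): 0.45}              # P(argument category | governing head)
P_STOP = {"paralyze": 0.2, "scare": 0.9, "government": 0.9}  # P(no further argument | head)


def tree_log_prob(node):
    """Log-probability of generating `node` from its head outward."""
    # 1. Choose a head for the goal category.
    lp = math.log(P_HEAD[(node.category, node.head)])
    for arg in node.args:
        # 2. Decide to generate another argument and pick its category,
        #    conditioned on the governing head.
        lp += math.log(1.0 - P_STOP[node.head])
        lp += math.log(P_ARG[(node.head, arg.category)])
        # 3. Recursively apply the same procedure to the argument.
        lp += tree_log_prob(arg)
    # 4. Stop generating arguments for this head.
    lp += math.log(P_STOP[node.head])
    return lp


tree = Node("S", "paralyze", [Node("NP", "scare"), Node("NP", "government")])
print(tree_log_prob(tree))
```

Because each step conditions on a head that has already been generated, a richer version of these tables could score competing attachments differently, which is the disambiguation benefit the slide points to.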

13 Semantic Role Analysis
- Semantic roles provide a limited level of semantics that nevertheless allows reasoning across lexicalization patterns
- The goal is to explore bootstrapping knowledge of semantic roles from limited lexical resources

14 Semantic Role Labeling Example on NN Compounds
Example: inhalation anthrax scare
- anthrax scare -> caused-by relation
  caused-by(PublicConcern, InfectiousDisease)
- inhalation anthrax -> type-of relation, or more specifically, contracted-by relation
  contracted-by(Disease, ExposureType) -> InfectiousDisease
- inhalation anthrax scare
  caused-by(PublicConcern, contracted-by(Disease, ExposureType))
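
The nested analysis of "inhalation anthrax scare" can be written down as a small recursive data structure. A minimal sketch: the relation and type names are taken from the slide, while the `Relation` class and the `render` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Relation:
    name: str
    left: Union["Relation", str]     # a semantic type or a nested relation
    right: Union["Relation", str]


def render(term):
    """Print a relation in the slide's functional notation."""
    if isinstance(term, str):
        return term
    return f"{term.name}({render(term.left)}, {render(term.right)})"


# "inhalation anthrax": the disease is contracted by an exposure type.
inhalation_anthrax = Relation("contracted-by", "Disease", "ExposureType")

# "inhalation anthrax scare": the inner relation fills the slot that
# InfectiousDisease occupies in caused-by(PublicConcern, InfectiousDisease).
compound = Relation("caused-by", "PublicConcern", inhalation_anthrax)

print(render(compound))
# caused-by(PublicConcern, contracted-by(Disease, ExposureType))
```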

15 Semantic Role Labeling Example on NN Compounds
- Approach: Train a model based on labeled data and a lexical hierarchy
- Preliminary results: ~60% accuracy on an 18-way classification with a small training set
- Next step: Create a larger training set via bootstrapping
  - Find lexico-syntactic patterns that unambiguously indicate the relation of interest
  - Use these to label new instances
  - Use these + lexical ontology to create a probability model of which subtrees, when combined, yield which relations
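
The first two bootstrapping steps, finding unambiguous patterns and using them to label new instances, might look roughly like the sketch below. The two regular-expression patterns, the `harvest` function, and the example sentences are all assumptions made up for illustration; the slides do not say which patterns the project used.

```python
import re
from collections import Counter

# Hypothetical lexico-syntactic patterns. Each one is a paraphrase that
# spells out the relation hidden inside a noun-noun compound, e.g.
# "scare caused by anthrax" for the compound "anthrax scare".
PATTERNS = [
    (re.compile(r"\b(\w+) caused by (\w+)\b"), "caused-by"),
    (re.compile(r"\b(\w+) contracted by (\w+)\b"), "contracted-by"),
]


def harvest(sentences):
    """Count (modifier, head, relation) triples matched by the patterns."""
    labeled = Counter()
    for sentence in sentences:
        for pattern, relation in PATTERNS:
            for head, modifier in pattern.findall(sentence.lower()):
                labeled[(modifier, head, relation)] += 1
    return labeled


corpus = [
    "The scare caused by anthrax continues.",
    "Anthrax contracted by inhalation is rare.",
]
for (modifier, head, relation), count in harvest(corpus).items():
    print(f"{modifier} {head}: {relation} (seen {count}x)")
```

The harvested triples would then serve as extra training instances; combining them with a lexical ontology to build the probability model is the third step and is not shown here.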

16 Concept-based Analysis

17 Inference and Conceptual Schemas
Hypothesis: Linguistic input is converted into a mental simulation based on bodily-grounded structures.
Components:
- Semantic schemas: image schemas and executing schemas are abstractions over neurally grounded perceptual and motor representations
- Linguistic units: lexical and phrasal construction representations invoke schemas
- Inference: links these structures and provides parameters for a simulation engine

18 Concept-based Analysis
Main Tasks:
- Formalize Image Schemas
- Identify Cross-lingual Conceptual Schemas
- Apply Probabilistic Relational Models to Inferencing over Conceptual Schemas

19 Conceptual Schemas
- Much is known about conceptual schemas, particularly image schemas
  - However, this understanding has not yet been formalized
  - We will develop such a formalism
- They have also not been checked extensively against other languages
  - We will examine Chinese, Russian, and German, in addition to English

20 Extending Inferential Capabilities
- Given the formalization of the conceptual schemas, how can they be used for inferencing?
- Earlier pilot systems
  - Used Bayesian belief networks
  - Successfully construed certain inferences
  - But don’t scale
- New approach
  - Probabilistic relational models
  - Support an open ontology
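
For readers unfamiliar with the belief-network inference that the pilot systems relied on, here is a deliberately tiny sketch. It is not taken from those systems: the two-variable network (Obstruction causes Slowdown), the function name, and all probabilities are hypothetical and chosen only to show inference by enumeration.

```python
# Hypothetical two-node belief network: Obstruction -> Slowdown.
# All numbers are invented for illustration.
P_OBSTRUCTION = 0.3                              # prior P(Obstruction = True)
P_SLOWDOWN_GIVEN = {True: 0.9, False: 0.1}       # P(Slowdown = True | Obstruction)


def posterior_obstruction(slowdown_observed: bool) -> float:
    """P(Obstruction = True | Slowdown = slowdown_observed) by enumeration."""

    def joint(obstruction: bool) -> float:
        prior = P_OBSTRUCTION if obstruction else 1.0 - P_OBSTRUCTION
        likelihood = P_SLOWDOWN_GIVEN[obstruction]
        if not slowdown_observed:
            likelihood = 1.0 - likelihood
        return prior * likelihood

    # Normalize the joint over both values of the hidden variable.
    return joint(True) / (joint(True) + joint(False))


print(posterior_obstruction(True))
```

Enumeration sums over every assignment of the hidden variables, so its cost grows exponentially with network size. Exact inference in belief networks is intractable in general, which is one aspect of the scaling concern the slide raises.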

21 A Common Representation
- Representation should support
  - Uncertainty, probability
  - Conflicts, contradictions
- Current plan
  - Probabilistic Relational Models (Koller et al.)
  - DAML+OIL

22 An Open Ontology for Conceptual Relations
- Build a formal markup language for conceptual schemas
  - We propose to use DAML+OIL as the base
- Advantages of the approach
  - Common framework for extension and reuse
  - Closer ties to other efforts within AQUAINT as well as the larger research community on the Semantic Web
- Some issues
  - Expressiveness of DAML+OIL
  - Representing probabilistic information

23 DAML-I: An Image Schema Markup Language
A basic type of schema:
<daml:subPropertyOf rdf:resource="&conc-rel;#role"/>

24 Putting it all Together
- We have proposed two different types of semantics
  - Universal conceptual schemas
  - Semantic relations
- In Phase I they will remain separate
  - However, we are exploring using PRMs as a common representational format
- In later Phases they will be combined

25 Summary
Goal: Deep Semantic Interpretation of Text
- Build a foundation for deep, yet robust and scalable, semantic analysis of human language, with applications to question answering from huge text collections.
- Use semantic schemas, probabilistic language processing and knowledge representation, machine learning and bootstrapping.

