Published by Neil Miller. Modified over 9 years ago.
1
USC INFORMATION SCIENCES INSTITUTE | Yolanda Gil | IKRAFT
IKRAFT: Interactive Knowledge Representation and Acquisition from Text
Yolanda Gil, Varun Ratnakar
USC/Information Sciences Institute
gil@isi.edu
www.isi.edu/expect/projects/trellis
trellis.semanticweb.org
2
Motivation: How KBs Are Built Today
[Diagram labels: Domain Expert; Knowledge Engineer; Knowledge Acquisition Tools; KB. Activities: read/ask/study/listen..., ...analyze/group/index..., ...structure/relate/fit..., ...reason/deduce/solve]
3
Motivation: The Aftermath of Knowledge Base Development
[Diagram labels: Domain Expert; Knowledge Engineer; Knowledge Acquisition Tools; KB; TRASH. Activities: read/ask/study/listen..., ...analyze/group/index..., ...structure/relate/fit..., ...reason/deduce/solve]
4
Motivation: Capturing the Design of Knowledge Bases
[Diagram: a spectrum of representations. One end: richer representations, more ambiguous, more versatile. Other end: more formal, more concrete, more introspectible, e.g., (defconcept bridge ...). Additional label: WWW]
Levels shown:
- Introductory texts, expert hints, explanations, dialogues, comments, examples, exceptions,...
- Info. extraction templates, dialogue segments and pegs, filled-out forms, high-level connections,...
- Alternative formalizations (KIF, MELD, RDF,…), alternative views of the same notion (e.g., what is a threat)
- Descriptions augmented with prototypical examples & exceptions, problem-solving steps and substeps,...
5
Claims
- Knowledge can be reused at any level of (in)formality
- Knowledge can be extended more easily
  - Additional documents and semi-formal structures readily available
- Knowledge can be translated and integrated at any level to facilitate interoperability
  - KR languages can be a straitjacket for some kinds of knowledge
- Intelligent systems will provide better justifications
  - Many users want to know where axioms came from before they trust the system's reasoning
- Content providers will not need to be sophisticated programmers/knowledge engineers
  - May be easier for end users to organize knowledge rather than formalize it
  - Good symbiosis of sophisticated and unsophisticated users
6
An Example: Building a Knowledge Base from a Textbook (DARPA Rapid Knowledge Formation -- RKF)
“…The first step a cell takes in reading out part of its genetic instructions is to copy the required portion of the nucleotide sequence of DNA – the gene – into a nucleotide sequence of RNA. The process is called transcription because the information, though copied into another chemical form, is still written in essentially the same language – the language of nucleotides. Like DNA, RNA is a linear polymer made of four different types of nucleotide subunits linked together by phosphodiester bonds. It differs from DNA chemically in two respects: (1) the nucleotides in RNA are ribonucleotides – that is, they contain the sugar ribose (hence the name ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the bases adenine (A), guanine (G), and cytosine (C), it contains uracil (U) instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen-bonding with A, the base-pairing properties described for DNA also apply to RNA…”
-- Essential Cell Biology, Alberts et al. 1992
7
Protein Synthesis in RKF’s SHAKEN, Authored by a Biologist [Chaudri et al 2001]
8
Step 1: Selecting Relevant Knowledge Fragments
[The textbook passage from slide 6 (Essential Cell Biology, Alberts et al. 1992) is repeated on this slide.]
9
Step 2: Composing Stylized Knowledge Fragments
- ribose
  - it is a kind of sugar, like deoxyribose
  - it is contained in the nucleotides of RNA
- uracil
  - it is a kind of nucleotide, like adenine and guanine
  - it can base-pair with adenine
- RNA
  - it is a kind of nucleic acid, like DNA
  - it contains uracil instead of thymine
  - it is single-stranded
  - it folds in complex 3-D shapes
  - nucleotides are linked with phosphodiester bonds, like DNA
  - there are many types of RNA
  - RNA is the template for synthesizing protein
  - its nucleotides contain the sugar ribose (DNA has deoxyribose)
- gene
  - subsequence of DNA that can be used as a template to create protein
- protein synthesis
  - non-destructive creation process: RNA and protein created from DNA
  - its speed is regulated by the cell
  - substeps (ordered in sequence):
    1) RNA transcription
       - a DNA fragment (a gene) is copied, just like DNA is copied during DNA synthesis
       - the result is an RNA chain
    2) protein translation
       - RNA is used as a template
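Because the fragments in Step 2 are stylized, some of them can be picked apart with very simple patterns. The following is a minimal Python sketch of that idea, not IKRAFT's actual parser: the `Fragment` class, the `KIND_OF` pattern, and `parent_of` are hypothetical names introduced here, and the pattern covers only the "it is a kind of X, like Y" form that appears on the slide.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical pattern for one stylized form seen on the slide:
# "it is a kind of <parent>, like <sibling> [and <sibling>]"
KIND_OF = re.compile(
    r"^it is a kind of (?P<parent>[^,]+?)(?:, like (?P<siblings>.+))?$"
)


@dataclass
class Fragment:
    """A term plus the short stylized statements written about it."""
    term: str
    statements: List[str] = field(default_factory=list)


def parent_of(fragment: Fragment) -> Optional[Tuple[str, List[str]]]:
    """Return (parent, siblings) from the first 'kind of' statement, else None."""
    for statement in fragment.statements:
        match = KIND_OF.match(statement.strip())
        if match:
            raw = match.group("siblings")
            # Split "adenine and guanine" or "a, b and c" into separate names.
            siblings = re.split(r"\s*(?:,|\band\b)\s*", raw) if raw else []
            return match.group("parent").strip(), [s for s in siblings if s]
    return None


ribose = Fragment("ribose", [
    "it is a kind of sugar, like deoxyribose",
    "it is contained in the nucleotides of RNA",
])
uracil = Fragment("uracil", [
    "it is a kind of nucleotide, like adenine and guanine",
    "it can base-pair with adenine",
])

print(parent_of(ribose))
print(parent_of(uracil))
```

Statements that do not match the pattern are simply skipped, which mirrors the point of the slides that some knowledge stays informal while other parts move toward a formal representation.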
10
Step 3: Creating Knowledge Base Items
…
(defconcept uracil
  :is-primitive nucleotide
  :constraints (:the base-pair adenine))
(defconcept RNA
  :is (:and nucleic-acid (:some contains uracil)))
…
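The two definitions on this slide follow a regular shape: a name, a parent concept, and role restrictions. As a rough illustration of how a structured record could be turned into that notation, here is a small Python sketch. The function `render_defconcept` and its parameters are assumptions made for this example; it only reproduces the two forms shown on the slide and is not a general generator for any KR language.

```python
from typing import Iterable, Tuple

Restriction = Tuple[str, str]  # (role, filler), e.g. ("contains", "uracil")


def render_defconcept(name: str, parent: str, primitive: bool = False,
                      the: Iterable[Restriction] = (),
                      some: Iterable[Restriction] = ()) -> str:
    """Render a concept in the notation used on the slide (two forms only)."""
    restrictions = [f"(:the {role} {filler})" for role, filler in the]
    restrictions += [f"(:some {role} {filler})" for role, filler in some]

    if primitive:
        # Form 1: (defconcept X :is-primitive P :constraints R)
        if len(restrictions) > 1:
            # The slide shows a single constraint; anything more is out of scope.
            raise NotImplementedError("only one constraint is supported here")
        text = f"(defconcept {name} :is-primitive {parent}"
        if restrictions:
            text += f" :constraints {restrictions[0]}"
        return text + ")"

    # Form 2: (defconcept X :is (:and P R...)), or just the parent if no restrictions
    if restrictions:
        body = f"(:and {parent} {' '.join(restrictions)})"
    else:
        body = parent
    return f"(defconcept {name} :is {body})"


uracil = render_defconcept("uracil", "nucleotide", primitive=True,
                           the=[("base-pair", "adenine")])
rna = render_defconcept("RNA", "nucleic-acid", some=[("contains", "uracil")])
print(uracil)
print(rna)
```

Keeping the record (name, parent, restrictions) separate from its rendering is one way to support the alternative formalizations mentioned earlier in the talk, since the same record could be rendered in another notation.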
11
IKRAFT: Interactive Knowledge Representation and Acquisition from Text
- User starts with documents, extracts a small amount of information from them
  - Text contains significant portions for context/reference/recall
- IKRAFT allows users to annotate text with statements, expressed in natural language
  - Highlight portions of original text, annotate statement
  - Statements tend to be stylized
- Statements are parsed, system generates summary of:
  - Objects
  - Events/actions
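The annotation workflow described above links three things: highlighted portions of a source text, a natural-language statement, and (optionally) a formal expression. A minimal Python sketch of such a record follows. All names here (`Highlight`, `Annotation`, `highlight`, `justification`, the document id `"alberts"`) are hypothetical and chosen for illustration; the slides do not describe IKRAFT's internal data model, and the parsing step that summarizes objects and events is not attempted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Highlight:
    """A highlighted span of characters in one source document."""
    source: str  # document identifier
    start: int
    end: int


@dataclass
class Annotation:
    """A concise NL statement tied to highlights, optionally formalized."""
    statement: str
    highlights: List[Highlight]
    formal: Optional[str] = None  # None means the statement is not formalized


def highlight(source: str, text: str, passage: str) -> Highlight:
    """Build a Highlight for the first occurrence of passage in text."""
    start = text.index(passage)  # raises ValueError if the passage is absent
    return Highlight(source, start, start + len(passage))


def justification(annotation: Annotation, documents: Dict[str, str]) -> List[str]:
    """Return the source passages that back an annotation's statement."""
    return [documents[h.source][h.start:h.end] for h in annotation.highlights]


# Excerpt taken from the textbook passage quoted earlier in the talk.
documents = {
    "alberts": "it contains uracil (U) instead of the thymine (T) in DNA.",
}
passage = "uracil (U) instead of the thymine (T)"
note = Annotation(
    statement="it contains uracil instead of thymine",
    highlights=[highlight("alberts", documents["alberts"], passage)],
    formal="(:some contains uracil)",
)
print(justification(note, documents))
```

With records like this, a formal expression can be traced back through its statement to the original text, which is the kind of grounding the later slides rely on.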
12
IKRAFT: Annotating Manual Information Extraction
13
IKRAFT: Extracting Statements from Complementary/Contradictory Text Sources
14
IKRAFT: Documenting Seismic Hazard in Southern California
15
Seismic Hazard Analysis (SHA) for Southern California Earthquake Center (SCEC)
16
DOCKER: Scientist Publishes SHA Models
User specifies:
- Types of model parameters
- Format of input messages
- Documentation
- Constraints
[Diagram labels: Web Browser; User Interface; DOCKER; Model Specification; Constraint Acquisition; Wrapper Generation (WSDL, PWL); SCEC ontologies; AS97 ontology; AS97 msg types; constrs; docs; AS97]
17
Documenting the Model with IKRAFT
18
Documenting Each Constraint
19
Formalizing Simple Constraints
20
Documentation of Constraints (Some Are Formalized, Some Are Not)
21
DOCKER: Engineer Uses SHA Model
User can:
- Browse through SHA models
- Invoke SHA models
- Get help in selecting appropriate model
[Diagram labels: Web Browser; User Interface; DOCKER; Pathway Elicitation; Model Reasoning; Constraint Reasoning; KR&R (Powerloom); Shared ontologies; AS97 ontology; AS97 msg types; constrs; docs; AS97]
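To make the constraint-reasoning step concrete, here is a minimal Python sketch of checking an engineer's inputs against documented constraints, some formalized and some not. It is an illustration under stated assumptions, not DOCKER's implementation (which the slides say uses Powerloom): the `Constraint` class, the `review` function, and in particular the parameter names and numeric bounds are made up and are not the actual AS97 constraints.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Constraint:
    parameter: str
    rationale: str  # concise NL statement taken from the documentation
    check: Optional[Callable[[float], bool]] = None  # None: not formalized


def review(inputs: Dict[str, float],
           constraints: List[Constraint]) -> Tuple[List[Constraint], List[Constraint]]:
    """Return (violated formal constraints, informal constraints to read by hand)."""
    violated = [
        c for c in constraints
        if c.check is not None
        and c.parameter in inputs
        and not c.check(inputs[c.parameter])
    ]
    informal = [c for c in constraints if c.check is None]
    return violated, informal


# Hypothetical constraints: names and bounds are invented for this example.
constraints = [
    Constraint(
        "magnitude",
        "Model was developed for a limited magnitude range.",
        check=lambda m: 4.0 <= m <= 8.0,
    ),
    Constraint(
        "site_type",
        "Applicability to unusual site conditions is described only in prose.",
    ),
]

violated, informal = review({"magnitude": 9.1}, constraints)
for c in violated:
    print(f"Violation on {c.parameter}: {c.rationale}")
for c in informal:
    print(f"Not formalized, check manually ({c.parameter}): {c.rationale}")
```

Carrying the rationale alongside each check means a reported violation can point the engineer to the documented reason, and the constraints that were never formalized are still surfaced for manual review.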
22
DOCKER Detects Constraint Violations
23
Should Engineer Override Constraint Specified by Model Developer?
24
Engineer Brings Up IKRAFT to Find Reasons for the Constraint
25
Engineer Can Check Additional Model Constraints (Not Formalized)
26
Constraints Grounded on Model Documentation
27
Engineer Makes an Informed Decision on Whether to Override the Constraint
28
Discussion
- Overhead in capturing the rationale?
  - Related to motivation and payoff
  - Rationale here is captured in a very simple process
- Related work:
  - Documenting design rationale [Shum 96]
  - Methodologies for knowledge base development [Schreiber et al 00]
  - Higher-level languages, e.g., KARL [Fensel et al 98]
29
Conclusions and Future Work
- IKRAFT helps users document formal expressions
  - Each formal expression is backed up by a concise NL statement that is linked back to one or more sources
  - Users can understand the justification for the system’s reasoning (e.g., SHA)
- Future work:
  - NLP techniques to extract terms from user’s concise statements
  - Controlled grammar for formulation of statements
  - Other documentation: e.g., tables, forms, exceptions
- High payoff in capturing the rationale of knowledge bases
30
Speculation: Will the (Semantic) Web End Up Looking Like This?
[Diagram: a spectrum of representations. One end: richer representations, more ambiguous, more versatile. Other end: more formal, more concrete, more introspectible, e.g., (defconcept bridge ...)]
Levels shown:
- Introductory texts, expert hints, explanations, dialogues, comments, examples, exceptions,...
- Info. extraction templates, dialogue segments and pegs, filled-out forms, high-level connections,...
- Alternative formalizations (KIF, MELD, RDF,…), alternative views of the same notion (e.g., what is a threat)
- Descriptions augmented with prototypical examples & exceptions, problem-solving steps and substeps,...