Reading to Learn Q2 review (10/17/05)


1 Reading to Learn Q2 review (10/17/05)
Peter Clark Phil Harrison Tom Jenkins John Thompson Rick Wojcik (Boeing Phantom Works) David Israel (SRI)

2 Foreword This is an abbreviated, edited, and annotated version of the full review presentation, designed to highlight issues of interest to the Mobius project. The full presentation is available on request. Peter Clark

3 Index
Introduction
Converting controlled-language Chemistry Text to logic (KM): discusses what we’d like the text to say vs. what the text actually says
Knowledge Integration: how easy is it to integrate the new knowledge into the existing KB, and what are the problems and solutions?
Information extraction: a short, separate exploration into acquiring chemistry knowledge by scanning a large corpus (Schubert-style) to extract “tuples”
The Bigger Picture: roles of controlled language in Mobius, and 9 desirables for the Mobius KB

4 Part 1: Introduction Peter Clark

5 SRI-Boeing’s Reading to Learn Seedling
Goal: study issues in learning through reading by working with a reduced version of the problem, namely working with controlled, rather than unrestricted, natural language.
The NLP task is factored into two: full NL → CL, and CL → logic (this project).
Rationale: by sidestepping some of the linguistic issues of full NLP, we can focus on knowledge integration issues; methods for full NL → CL can be studied separately.

6 SRI-Boeing’s Reading to Learn Seedling
Approach:
Rewrite 5 pages of chemistry text into our controlled language, CPL
Extend and use our CPL interpreter to generate logic
Integrate this new knowledge with an existing chemistry knowledge base (from the Halo Pilot), from which the equivalent knowledge has been surgically deleted
Evaluate the performance of the CPL-extended KB against the original
Report on the problems encountered and solutions developed

7 Part 2: Converting Chemistry Text to CPL, then to KM
John Thompson Oct. 17, 2005

8 Foreword The 5 pages of text we are using concern acid-base reactions and the relative strengths of acids. One would hope that the target knowledge we want is explicitly mentioned in the text, so we can then transcribe it to CPL (our controlled language). However, this is not always the case. In this section, we review the knowledge we need (a brief tutorial on the chemistry is provided), then perform some detective work to look for it in the text. We report the results and lessons learned. Bottom line: the knowledge we want is only a small fraction of the text, and is rarely stated in the nice explicit form that we would like.

9 Outline 2.1 New Insights about the Text
2.2 Short Tutorial on Chemistry 16.1 & 16.2: acid-base reactions
2.3 Examples of Converting Text to CPL, and CPL to KM: conjugate acid-base pairs; relative strengths of acids & bases; solving equilibrium reactions
2.4 Lessons Learned

10 2.1 New Insights about the Text
Almost all of the text is fluff and examples, not useful for solving chemistry questions The human reader learns by examples, but the computer needs the general rules There are only a few key sentences (usually italicized) that are critical The sample exercises (previously ignored) are also key to the problem solving methods

11 2.2 Tutorial on Acids & Bases
Arrhenius definitions of acid and base “Acids are substances that, when dissolved in water, increase the concentration of H+ ions. Likewise, bases are substances that, when dissolved in water, increase the concentration of OH- ions.” based only on how a chemical reacts with water Bronsted-Lowry definitions of acid and base “According to their definition, an acid is a substance (molecule or ion) that can donate a proton to another substance. Likewise, a base is a substance that can accept a proton.” In this sense, being an acid or base is just a role a chemical plays amphoteric: these chemicals can act as an acid or a base This information is nice but isn’t directly used in problems

12 Tutorial on Acids & Bases
Conjugate acid-base pairs add or subtract a proton (H+) from a formula given an acid, get its conjugate base by removing one H and one charge: acid = H2O, its conjugate base = OH- given a base, get its conjugate acid by adding one H and one charge: base = H2O, its conjugate acid = H3O+ sample exercise: find conjugate base of HCO3- answer: CO3 2-
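The add-or-subtract-a-proton rule on this slide can be sketched in a few lines of Python. This is only an illustration: the formula representation (a dictionary of element counts plus an integer charge) and the function names `conjugate_base` and `conjugate_acid` are assumptions made here, not part of CPL, KM, or the Halo KB.

```python
from collections import Counter

def conjugate_base(atoms, charge):
    """Remove one H and one unit of charge, i.e. remove a proton (H+)."""
    if atoms.get("H", 0) < 1:
        raise ValueError("formula has no H to remove")
    result = Counter(atoms)
    result["H"] -= 1
    result = +result  # unary plus drops elements whose count fell to zero
    return dict(result), charge - 1

def conjugate_acid(atoms, charge):
    """Add one H and one unit of charge, i.e. add a proton (H+)."""
    result = Counter(atoms)
    result["H"] += 1
    return dict(result), charge + 1

# The slide's examples: H2O as an acid, H2O as a base, and HCO3-.
print(conjugate_base({"H": 2, "O": 1}, 0))
print(conjugate_acid({"H": 2, "O": 1}, 0))
print(conjugate_base({"H": 1, "C": 1, "O": 3}, -1))
```

Treating the formula as structured data rather than a string is what makes "minus one H and one charge" a one-line operation; the later slides show how much harder this is when the formula has to be reached through substance and molecule objects first.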

13 Tutorial on Acids & Bases
Relative strengths of acids & bases (given in table)
Figure 16.4 lists strong, weak, & negligible acids, in order
Also lists their conjugate bases in 2nd column
The stronger the acid, the weaker its conjugate base, and vice-versa
ACID              BASE
HCl (strongest)   Cl- (weakest)
H2SO4             HSO4-
HNO3              NO3-
H3O+              H2O
etc.              etc.
Sample exercise: which base is stronger, H2O or HSO4- ?
answer: H2O (it’s lower in the BASE column)
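The table lookup in the sample exercise can be sketched as follows. The rows are only the four shown on the slide (the full Figure 16.4 has more), and the name `stronger_base` is a hypothetical helper introduced for this illustration.

```python
# Rows from the slide's excerpt of Figure 16.4, strongest acid first.
# The real table is longer; these four rows are all the slide shows.
ACID_BASE_TABLE = [
    ("HCl", "Cl-"),
    ("H2SO4", "HSO4-"),
    ("HNO3", "NO3-"),
    ("H3O+", "H2O"),
]

def stronger_base(base_a, base_b, table=ACID_BASE_TABLE):
    """Return whichever base sits lower in the BASE column (the stronger one)."""
    bases = [base for _, base in table]
    return base_a if bases.index(base_a) > bases.index(base_b) else base_b

print(stronger_base("H2O", "HSO4-"))  # H2O
```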

14 Tutorial on Acids & Bases
Equilibrium in an acid-base reaction
given a reaction equation with an acid + base on the left side and the resulting conjugate acid + conjugate base on the right side:
HCl (acid) + H2O (base) ↔ H3O+ (conj. acid) + Cl- (conj. base)
the reaction might go left-to-right or right-to-left
it moves from the side with the stronger base to the side with the weaker base
H2O is a stronger base than Cl- (see the table), so this reaction moves from left to right
so the equilibrium lies to the right (where the weaker base is)
sample exercise: HSO4- + CO3 2- ↔ SO4 2- + HCO3-
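As a rough sketch of the rule on this slide, the position of the equilibrium can be read off from a ranking of base strengths. The numeric ranks below simply encode the order of the four bases shown in the slide's table excerpt; the names `BASE_RANK` and `equilibrium_side` are assumptions made for this example.

```python
# Higher rank = stronger base, following the order of the slide's table excerpt.
BASE_RANK = {"Cl-": 0, "HSO4-": 1, "NO3-": 2, "H2O": 3}

def equilibrium_side(left_base, right_base, rank=BASE_RANK):
    """The proton goes to the stronger base, so equilibrium lies with the weaker base."""
    if rank[left_base] > rank[right_base]:
        return "right"
    if rank[left_base] < rank[right_base]:
        return "left"
    return "neither"

# HCl + H2O <-> H3O+ + Cl-: the base on the left is H2O, on the right Cl-.
print(equilibrium_side("H2O", "Cl-"))  # right
```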

15 CPL for Conjugate Acid-Base Pairs – at the chemical level
Textbook: “Every acid has a conjugate base, formed by removing a proton from the acid. ... Similarly, every base has associated with it a conjugate acid, formed by adding a proton to the base.” Restated in CPL: “A proton is transferring from an acid molecule IF-AND-ONLY-IF the acid molecule is converting to the conjugate base molecule of the acid molecule.” “A proton is transferring to a base molecule IF-AND-ONLY-IF the base molecule is converting to the conjugate acid molecule of the base molecule.” Note: do we have to say “molecule” or can it be deduced via metonymy? Note: do we have to say “Bronsted-Lowry” everywhere (or can we deduce it from the context)? But this is not the useful level for problem solving

16 2.3 Going from text to CPL The following examples illustrate the difficulty of going from the original text to a useful controlled language (CPL) rendition. In particular: the English text mixes chemicals as substances, molecules, and formulae freely. This has to be disentangled in the CPL. the text concisely refers to algebraic notions (“add a hydrogen to the formula”) which are complex, and need to be spelt out in CPL (or already known in the KB) for the reasoner to perform such manipulations. some declarative knowledge is expressed procedurally in text. It needs to be rewritten in a declarative form to acquire it in the form we would like for reasoning. problem-solving methods needed for answering questions are not stated explicitly, or only shown through examples. Without these, we have to rely on general problem-solving techniques (e.g., backward chaining) to work.

17 CPL for Conjugate Acid-Base Pairs – at the formula level
Textbook: “Every acid has a conjugate base, formed by removing a proton from the acid. ... Similarly, every base has associated with it a conjugate acid, formed by adding a proton to the base.” Restated in CPL, but at the formula level: “Y is the conjugate base of acid X IF-AND-ONLY-IF the formula of Y equals the formula of X minus one H and one charge.” “X is the conjugate acid of base Y IF-AND-ONLY-IF the formula of X equals the formula of Y plus one H and one charge.” Note: the formula-level CPL (“minus one H”) is difficult to express Will we need a built-in subroutine to do the formula manipulation?

18 KM for Conjugate Acid-Base Pairs – at the formula level
CPL: “Y is the conjugate base of acid X IF-AND-ONLY-IF the formula of Y equals the formula of X minus one H and one charge.”
KM:
a Acid-Molecule
  chemical-formula: X
  conjugate-base: a Base-Molecule with chemical-formula: Y
IFF <==>:
Formula-Subtraction [is this function built into the ontology?]
  base: a Chemical-Formula with represents: X
  remove: a H-letter, a Charge-Unit
  result: a Chemical-Formula with represents: Y
Note: differences between the CPL words and the ontology

19 Conjugate Acid-Base Questions
Find the conjugate base of CHO2-
a Acid-Molecule
  chemical-formula: CHO2-
  conjugate-base: a Base-Molecule with chemical-formula: Y
IFF <==>:
Formula-Subtraction
  base: a Chemical-Formula with represents: CHO2-
  remove: a H-letter, a Charge-Unit
  result: a Chemical-Formula with represents: Y
If a built-in routine can do the formula subtraction, we get: result = CO2 2- (represents Y, the conjugate base molecule)

20 Conjugate Acid-Base Reactions
Some problems involve finding the conjugate acid-base pairs in a reaction:
CHO2- + H3O+ ↔ HCHO2 + H2O
May need a problem-solving method:
for each molecule’s formula on the LHS, compute its conjugate acid and try to match it on the RHS
if not found, then compute its conjugate base and try to match it on the RHS
save each conjugate acid-base pair
this is not explicit in the text, more like common sense
But in this case the rules can be designed to try all pairings automatically, via backward chaining
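The matching procedure described on this slide might look like the following in Python. This is a sketch under stated assumptions: species are represented as (element-count dictionary, charge) pairs, the helper names `add_proton` and `find_conjugate_pairs` are invented here, and the species used are the formate/hydronium ones that appear in the multiple-choice question later in the deck.

```python
def add_proton(species):
    """Return the species with one more H and one more unit of charge."""
    atoms, charge = species
    new_atoms = dict(atoms)
    new_atoms["H"] = new_atoms.get("H", 0) + 1
    return new_atoms, charge + 1

def find_conjugate_pairs(lhs, rhs):
    """Return (acid, conjugate base) name pairs that link the two sides."""
    pairs = []
    for left_name, left in lhs.items():
        for right_name, right in rhs.items():
            if add_proton(left) == right:      # right is the conjugate acid of left
                pairs.append((right_name, left_name))
            elif add_proton(right) == left:    # right is the conjugate base of left
                pairs.append((left_name, right_name))
    return pairs

lhs = {"CHO2-": ({"C": 1, "H": 1, "O": 2}, -1), "H3O+": ({"H": 3, "O": 1}, 1)}
rhs = {"HCHO2": ({"C": 1, "H": 2, "O": 2}, 0), "H2O": ({"H": 2, "O": 1}, 0)}
print(find_conjugate_pairs(lhs, rhs))
```

Because the check is a pure relation between two formulas, trying every left/right pairing is enough; no ordering of steps has to be spelled out, which is the point the slide makes about backward chaining.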

21 CPL for Strengths of Acids & Bases
Textbook: “In other words, the stronger an acid, the weaker is its conjugate base; the stronger a base, the weaker is its conjugate acid.” Restated in CPL: “Acid X is stronger than acid Y IF-AND-ONLY-IF the conjugate base of X is weaker than the conjugate base of Y.” “Base X is stronger than base Y IF-AND-ONLY-IF the conjugate acid of X is weaker than the conjugate acid of Y.” Note: these are true statements, but what we need to solve the textbook problems are all the contents of the table in Figure 16.4.

22 Table of Acid & Base Strengths
Figure 16.4’s table generates many CPL statements, such as: “HCl is a strong acid.” “The conjugate base of HCl is Cl-minus.” “Cl-minus is a negligible base.” “HCl is a stronger acid than H2SO4.” “H2SO4 is a strong acid.” “The conjugate base of H2SO4 is HSO4-minus.” “HSO4-minus is a negligible base.” “HSO4-minus is a stronger base than Cl-minus.” “H2SO4 is a stronger acid than HNO3.” ... etc.

23 KM for Acid & Base Strengths
Example CPL: “HCl is a strong acid.” “The conjugate base of HCl is Cl-minus.” “Cl-minus is a negligible base.” “HCl is a stronger acid than H2SO4.”
Corresponding KM:
HCl-Molecule
  acid-base-strength: Strong-Acid
  conjugate-base: Cl-Minus-Ion
  stronger-acid-than: H2SO4-Molecule
Cl-Minus-Ion
  acid-base-strength: Negligible-Base
Note: assume stronger-acid-than is transitive in KM
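The transitivity assumption in the note can be illustrated outside KM with a small transitive-closure computation. This is not how KM implements transitive slots; it is a Python sketch, with the hypothetical name `stronger_than_closure`, of what the assumption buys: only adjacent rows of the table need to be stated.

```python
def stronger_than_closure(direct):
    """direct maps each acid to the set of acids it is stated to be stronger than."""
    closure = {acid: set(weaker) for acid, weaker in direct.items()}
    changed = True
    while changed:  # repeat until no new facts can be derived
        changed = False
        for acid, weaker in closure.items():
            for w in list(weaker):
                extra = closure.get(w, set()) - weaker
                if extra:
                    weaker |= extra
                    changed = True
    return closure

# Two directly stated facts from the CPL generated from Figure 16.4.
facts = {"HCl": {"H2SO4"}, "H2SO4": {"HNO3"}}
print("HNO3" in stronger_than_closure(facts)["HCl"])  # True
```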

24 CPL for Acid-Base Reactions
Textbook: “From these examples we conclude that in every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” Restated in CPL for problem solving: “IF the base represented by a formula on the left side of an equation of an acid-base reaction is weaker than the base represented by a formula on the right side of the equation, THEN a proton transfers to the stronger base [irrelevant] AND the equilibrium of the equation is on the left side of the equation.” [and a similar rule for when the equilibrium is on the right side] Note: most of this was not explicit in the text, we had to elaborate! We need rules that will solve problems for us

25 KM for Acid-Base Reactions
CPL: “IF the base represented by a formula on the left side of an equation of an acid-base reaction is weaker than the base represented by a formula on the right side of the equation, THEN a proton transfers to the stronger base [irrelevant, skip below] AND the equilibrium of the equation is on the left side of the equation.”
KM:
Acid-Base-Reaction
  equation: a Chemical-Equation with
    left-side: a Chemical-Formula (represents: Base-1 weaker-base-than: Base-2)
    right-side: a Chemical-Formula (represents: Base-2)
Implies =>
the Chemical-Equation
  equilibrium-side: the left-side of self

26 2.4 Lessons Learned - 1 Need to study the sample exercises and sample tests to identify the key knowledge can an automatic reading system do this? Need to focus on the textbook sentences needed to solve test questions skip all the fluff and the examples Even the key sentences may not spell out all the knowledge needed to solve problems need to reword and expand the text, to generate the needed logic

27 Lessons Learned - 2 The text is full of ambiguity and metonymy
mixes together the gross substance level and molecular level mixes together molecular events and equation descriptions uses context to avoid saying “Bronsted-Lowry” acid & base talks loosely about things that are precise in the ontology The text generates disconnected knowledge, but problem-solving requires carefully crafted rules that chain together can an automatic reading system do this?

28 Lessons Learned - 3 Sometimes we need to invoke lower-level functions
such as for formula manipulation: adding an H and a charge Sometimes we may need Problem Solving Methods algorithms for doing things that are hard to do declaratively can these be read into the system? or must they be hand-crafted as background knowledge? Will the new knowledge integrate with the background knowledge? Peter Clark will present the issues

29 Part 3: Knowledge Integration
Peter Clark

30 Foreword We want to integrate our CPL-generated knowledge with the existing chemistry knowledge base (with the equivalent knowledge deleted). Ideally, our CPL-generated knowledge would look similar to the equivalent hand-built knowledge in the KB, so we can just remove the latter and insert the former. Unfortunately, this is not the case; the hand-built knowledge is highly complex and intertwined. In this section we study the previous encoding of this chemistry knowledge, look at why it is complex and why this makes knowledge integration hard, and then reflect on how we would like that original knowledge to have looked so that knowledge integration would be easier. (Note this KB was never intended to support knowledge integration, so this is not a criticism of the KB from that point of view.) There are two key bottom lines: Like software, a knowledge base needs to be designed for reuse/extension. We attempt to identify some principles for doing this. Critical to this is not just the design of the KB, but the design of the reasoning algorithm. A large amount of the KB complexity could be avoided by using a “smart” reasoning algorithm which tolerates and corrects representational short-hands (“loosespeak”), rather than requiring the KE to spell out everything in full.

31 Knowledge-Based Chemistry…

32 Key Questions we will Discuss:
3.1 How does HaloKB encode these pages? 2 case studies: Conjugate acid calculations Relative acid strengths How difficult is it to integrate with this encoding? 3.2 What would make integration easier? alternative form of the KB? different/extra reasoning methods?

33 3.1 Case Study 1: Conjugate Acid Calculations
CHO2- + H3O+ ↔ HCHO2 + H2O
Which is the correct acid/conjugate base pair in the above reaction?
(a) CHO2- / H2O (b) H3O+ / H2O (c) H3O+ / HCHO2 (d) HCHO2 / H2O (e) CHO2- / HCHO2
Answer: (b) is the right answer, as H3O+ is the conjugate acid of (i.e., one proton more than) H2O.

34 How does HaloKB do it? This is what the Halo KB does:
Take each answer in turn, e.g., (a) CHO2- / H2O, then: Compute conjugate acid of H2O Does the result = CHO2- ? Repeat for (b) – (e) ;;; Method: Add an H+ Compute-Conjugate-Acid: input: a Chemical output: a Chemical (the conjugate acid of the input)

35 Conjugate Acid Calculation
How does the Halo KB compute the conjugate acid of a base? e.g., given H2O, return H3O+ See next slide…

36 (every Compute-Conjugate-Acid has
  (input ((a Chemical with (plays ((a Base-Role))))))
  (parent_formula ((the term of (the nested-atomic-chemical-formula of (the has-basic-structural-unit of (the input of Self))))))
  (target-unit ((if (the parent_formula of Self) then (:set (#'(LAMBDA () (GET-CONJUGATE-ACID-ATOMIC-FORMULA-BACK (KM0 '(|the| |parent_formula| |of| |Self|)))))))))
  (output ((if (oneof (the input of Self) where (It isa H2O-Substance))
            then (a H3O-Plus-Substance)
            else ((forall (allof2 (the target-unit of Self) where ((not (It2 = (the parent_formula of Self)))))
                    (the output of (a Identify-Chemical with
                      (input ((a Chemical with
                        (has-basic-structural-unit ((the output of (a Identify-Chemical-Entity with
                          ((a Chemical-Entity with
                            (nested-atomic-chemical-formula ((a Chemical-Formula with (term (It))))))))))))))))))))))

37 Comments The vast majority of this code is converting between substances, molecules, and formulae and back again (see next slides). One can view this as the inability of the reasoner to handle metonymy: if the “input” to a computation is a substance, but the computation is performed on a formula, then the KR has to laboriously convert the substance to a formula. This single conceptual distinction in chemistry, and the requirement that our logics spell out everything with absolute precision (rather than tolerate and correct some representational short-hand/loosespeak/metonymy), is responsible for perhaps 50% of the complexity of the chemistry KB!!! Let’s look at this further…

38 The HaloKB Chemical Ontology
4 key conceptual notions, related by binary relations (slots):
Chemical --has-basic-structural-unit--> Molecule --has-nested-atomic-formula--> FormulaObject --term--> Formula
Example labels on the slide: “H2O”, 2 H + 1 O

39 Compute-Conjugate-Acid
input: Chemical
parent_formula: Chemical (input) → Molecule → FormulaObject → Formula
target-unit: LISP: Formula (parent_formula) → Formula (conjugate)
output: Formula (target-unit) → FormulaObject → Molecule → ClassifiedMolecule → Chemical → ClassifiedChemical
This is a schematic summary of the original KM representation, showing how the conversions happen in the KM code.
Example values along the chain: “H2O”, 2H+O, 2H+O, 3H+O, 3H+O, “H3O” (result)

40 Note the icons, annotating where the conversions happen in the KM
(every Compute-Conjugate-Acid has
  (input ((a Chemical with (plays ((a Base-Role))))))
  (parent_formula ((the term of (the nested-atomic-chemical-formula of (the has-basic-structural-unit of (the input of Self))))))
  (target-unit ((if (the parent_formula of Self) then (:set (#'(LAMBDA () (GET-CONJUGATE-ACID-ATOMIC-FORMULA-BACK (KM0 '(|the| |parent_formula| |of| |Self|)))))))))
  (output ((if (oneof (the input of Self) where (It isa H2O-Substance))
            then (a H3O-Plus-Substance)
            else ((forall (allof2 (the target-unit of Self) where ((not (It2 = (the parent_formula of Self)))))
                    (the output of (a Identify-Chemical with
                      (input ((a Chemical with
                        (has-basic-structural-unit ((the output of (a Identify-Chemical-Entity with
                          ((a Chemical-Entity with
                            (nested-atomic-chemical-formula ((a Chemical-Formula with (term (It))))))))))))))))))))))
Icon labels on the slide: “H2O”, 2H+O, “H3O”, 3H+O

41 Identify-Chemical In addition, the KM involves some procedural-like calls to check chemical classification. Really, these should be performed automatically by the reasoner rather than appear in the KR itself.
Input: a generic chemical with a basic-structural-unit
Output: a specific chemical
Really should be done by automatic classification
Identify-Chemical (a method): Substance with basic-structural-unit: HCl → HCl-Substance

42 Actually manipulating the formula…
All the previous KM is simply doing type conversion, to then pass data to a procedural attachment to get from formula H2O to H3O+ ! However, in an earlier version of the KB, the formula conversion was performed directly in KM. Let’s have a look at what that looks like…

43 Compute-Conjugate-Acid Method
Add a unit of charge… The actual statement where the charge in the formula is increased by one! Note again all the type conversions. Method returns: (a Chemical-Entity with (charge ((a Charge-Value with (value ((:pair ((the1 of (the value of (the charge of (the has-basic-structural-unit of (the input of Self))))) + 1) *electronic-charge)))))) (nested-atomic-chemical-formula ((a Chemical-Formula with (term (… …and a hydrogen…

44 The actual statement where the number of H’s in the formula is either
…. (nested-atomic-chemical-formula ((a Chemical-Formula with (term ( (if (oneof (the elements of (the term of (the nested-atomic-chemical-formula of (the has-basic-structural-unit of (the input of Self))))) where (((the2 of It) = H-Plus) or ((the2 of It) = H))) then (forall-seq (the input of Self)))) (if (((the2 of It) = H-Plus) or ((the2 of It) = H)) then (:pair ((the1 of It) + 1) (the2 of It)) else It )) else (the append of (:seq (:seq (:pair 1 H)) (the input of Self)))))))))))) The actual statement where the number of H’s in the formula is either - increased by one - or one added (if none there) Algebraic manipulation is complex! Also again note all the type conversions.

45 Case Study 2: Relative Strengths of Acids
The textbook presents a rank-order of relative strengths. However, the KB encodes these with three qualitative values: *negligible, *weak, *strong

46 Relative Strengths of Acids in KM
(every Acid-Role has (intensity ( (a Intensity-Value with (value ( (:pair ;; Case statement for Acids. (if ((the played-by of Self) isa Ionic-Compound-Substance) then (if (((the played-by of Self) isa HCl-Substance) or ((the played-by of Self) isa HBr-Substance) or ((the played-by of Self) isa HI-Substance) or ((the played-by of Self) isa HClO3-Substance) or ((the played-by of Self) isa HClO4-Substance) or ((the played-by of Self) isa H2SO4-Substance) or ((the played-by of Self) isa HNO3-Substance)) *strong else (if (((the played-by of Self) isa H3PO4-Substance) or ((the played-by of Self) isa HF-Substance) or ((the played-by of Self) isa HC2H3O2-Substance) or ((the played-by of Self) isa H2CO3-Substance) or (Not designed for easy extension/integration!)

47 Computation: Comparing Qualitative Strengths
The KM for comparing acids embeds qualitative magnitude comparison within a chemistry-specific frame. It is also hard-wired to 3-value comparisons only. We would rather disentangle the comparison method from the chemistry, and generalize the method.
(every Compare-Relative-Strengths-of-Acids has (output ((if (((the1 of (the value of (the intensity of (the Acid-Role plays of (the first of (the input of Self)))))) = *strong) and ((the1 of (the value of (the intensity of (the second of (the input of Self)))))) /= *strong)) then (the first of (the input of Self)) [Compare-Relative-Strengths-of-Acids-output-1] )))
Method: #1, #2 → Strongest
S, notS → #1
notS, S → #2
W, Neg → #1
Neg, W → #2
(S = strong, W = weak, Neg = negligible)
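One way to picture the disentangling asked for here is a comparison routine that knows nothing about chemistry and takes the qualitative scale as data. The sketch below is an assumption-laden illustration in Python, not a proposal for the KM encoding; the names `SCALE` and `qualitatively_largest` are invented, and the strength values are the ones the deck assigns to HCl, H2CO3, and HF.

```python
# Ascending qualitative scale; adding a fourth value means editing only this list.
SCALE = ["negligible", "weak", "strong"]

def qualitatively_largest(items, value_of, scale=SCALE):
    """Return the items whose qualitative value ranks highest on the scale."""
    ranked = [(scale.index(value_of[item]), item) for item in items]
    top = max(rank for rank, _ in ranked)
    return [item for rank, item in ranked if rank == top]

# Chemistry-specific facts live in a separate table from the comparison method.
acid_strength = {"HCl": "strong", "H2CO3": "weak", "HF": "weak"}
print(qualitatively_largest(["HCl", "H2CO3"], acid_strength))  # ['HCl']
```

The same function would compare any attribute with an ordered set of qualitative values, and ties are returned as a list rather than forcing a winner.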

48 3.2 Where to? (every Reaction has ;; Encodes L.22(C). (direction (
(if (not (the direction of Self)) then (a Direction-Value with (value ((if (the output of (a Compute-Equilibrium-Position with (input (Self)))) (if ((the output of = (the raw-material of Self)) *left else *right))))) [Reaction-direction-1] )))

49 Observations
The complex and precarious HaloKB constructs are unlikely to be generated automatically from text
HaloKB mixes procedural and declarative knowledge (undesirable)
Some HaloKB constructs are solely implementational (reasoner-specific)
HaloKB is not elaboration tolerant in places
In places, the HaloKB ontology projects a less linguistic view of the world (e.g., Compute methods) than that in the textbook (a problem for integration)
KM is error-intolerant

50 Five Desirables for an Extensible KB 1. Need Metonymy-Tolerant Repns
The precision that logic requires of our written representations is a fundamental barrier to robustness IF “the acid on the left” is stronger than “the acid on the right” THEN the reaction direction is “to the right” “the acid denoted by the formula on the left side of the equation of the reaction” Alternative: Preserve metonymy in the KB Have it resolved at reasoning time

51 1. Metonymy-Tolerant Repns (cont)
(every Compare-Relative-Strengths-of-Acids has (output ((if (((the1 of (the value of (the intensity of (the Acid-Role plays of (the first of (the input of Self)))))) = *strong) and ((the1 of (the value of (the intensity of (the second of (the input of Self)))))) /= *strong)) then (the first of (the input of Self))))) if we had a metonymy-tolerant reasoner, we could instead write… (every Compare-Relative-Strengths-of-Acids has (output ((if ((the intensity of (the first of (the Chemicals)) = *strong) and ((the intensity of (the second of (the Chemicals)) /= *strong) then (the strongest of (the Chemicals)) = (the first of (the Chemicals)))))

52 1. Metonymy-tolerance: Need Background Knowledge!
Mixing chemical, molecular, and formula views Need background K to untangle the mess Note the fluidity of reference in written English!!! “HC2H3O2(aq)+…C2H3O2-” basic-unit formula

53 2. Need to Separate Declarative and Procedural Knowledge
Procedural: (Conjugate-Acid calculation) input: a Base-Chemical output: convert Chemical → Molecule → Formula, append “H”, then → Molecule’ → Acid-Chemical Declarative: Acid-Chemical = Base-Chemical + H + constraint reasoner to solve constraints

54 2. Need to Separate Declarative and Procedural Knowledge (cont)
The English text doesn’t help! often procedural descriptions are given reflects what goes on in real life we mentally draw declaratives from this HX (aq) + H2O (aq) ↔ X- (aq) + H3O+ (aq) “In the forward reaction, HX donates a proton to H2O. In the reverse reaction, the H3O+ ion donates a proton to the X- ion.”

55 2. Need to Separate Declarative and Procedural Knowledge (cont)
“Every acid has a conjugate base, formed by removing a proton from the acid. ... Similarly, every base has associated with it a conjugate acid, formed by adding a proton to the base.” Acid-Chemical = Base-Chemical + H

56 2. Need to Separate Declarative and Procedural Knowledge (cont)
Mixed Declarative (every Compare-Relative-Strengths-of-Acids has (output ((if (((the1 of (the value of (the intensity of (the Acid-Role plays of (the first of (the input of Self)))))) = *strong) and ((the1 of (the value of (the intensity of (the second of (the input of Self)))))) /= *strong)) then (the first of (the input of Self)) [Compare-Relative-Strengths-of-Acids-output-1] ))) HCl *strong H2CO3 *weak … … + *strong > *weak > … + Procedural (PSM) Find object(s) with qualitatively largest attribute value “Theory of magnitudes”

57 3. Syntactic Organization Matters!
Elaboration tolerance: Add/modify knowledge (semantics) by (only) adding formulae (syntactics) (every Acid-Role has (intensity ( (a Intensity-Value with (value ( (:pair ;; Case statement for Acids. (if ((the played-by of Self) isa Ionic-Compound-Substance) then (if (((the played-by of Self) isa HCl-Substance) or ((the played-by of Self) isa HBr-Substance) or ((the played-by of Self) isa HI-Substance) or ((the played-by of Self) isa HClO3-Substance) or ((the played-by of Self) isa HClO4-Substance) or ((the played-by of Self) isa H2SO4-Substance) or ((the played-by of Self) isa HNO3-Substance)) then *strong else Not elaboration-tolerant

58 3. Syntactic Organization Matters!
Elaboration-tolerant Better…. intensity(HCl-Substance, *strong) intensity(HBr-Substance, *strong) intensity(HI-Substance, *strong) intensity(HClO3-Substance, *strong) intensity(HClO4-Substance, *strong) intensity(H2SO4-Substance, *strong) intensity(HNO3-Substance, *strong) intensity(HF-Substance, *weak) intensity(HC2H3O2-Substance, *weak) intensity(H2CO3-Substance, *weak)

59 4. Challenges aligning to the ontology
Key: mapping from English words/phrases to knowledge-base concepts
Easy: HCl-Substance ↔ “HCl”
*strong/*weak/*negligible ↔ “HCl is stronger than H2O”
Hard! Compute-Equilibrium-Position (a reified computational method) ↔ “equilibrium position” (a property of a reaction)

60 5. Need Error-Tolerant Reasoning
KM can go belly-up with a contradiction Rather need to detect and correct contradictions Detect: explore (ruminate), not just myopic backchaining richer background knowledge Correct: reasoner supports suspension of assumptions/rules (TMS?) search mechanism to control this

61 The Ideal Background KB
Simple structures and metonymy-tolerant Declarative and procedural knowledge separated include a constraint reasoner Elaboration-tolerant organization Linguistically rooted ontology Error-tolerant reasoner

62 Part 4: Tuple generation and analysis for KB scale-up
Phil Harrison Boeing Phantom Works

63 Foreword This segment turns to a different topic. As a side-line, we have been exploring automatic extraction of general knowledge from text, following Schubert’s approach. We ran our “tuple” generator over some chemistry text, and here report on the general method, challenges, and ways forward. Bottom line: This information extraction approach may be useful for rapidly acquiring certain types of knowledge from a text corpus.

64 The Problem Uncontrolled (non CPL) text is difficult for computers to analyze. Advances are needed in all areas of NLP: grammar, WSD, semantic representation, and discourse processing. A method is needed for extracting as much knowledge as possible from text.

65 Tuple extraction The parse trees are a source of “head-complement” or “head-modifier” tuples:
From “The heavy man bought an expensive book” → (S “man” “buy” “book”) (AN “heavy” “man”) (AN “expensive” “book”)
These yield the generic sentences: “Books can be bought”, “Men can be heavy”, “Books can be expensive”
Even incorrect parses can generate some valid tuples.
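The step from tuples to generic sentences can be sketched as below. This is not the Boeing tuple generator; it is a minimal Python illustration that assumes tuples arrive as plain Python tuples, with two tiny hand-written irregular-form tables (`IRREGULAR_PLURALS`, `IRREGULAR_PARTICIPLES`) standing in for real morphology.

```python
# Hypothetical, hand-written exception tables; everything else gets a naive suffix.
IRREGULAR_PLURALS = {"man": "men"}
IRREGULAR_PARTICIPLES = {"buy": "bought"}

def plural(noun):
    return IRREGULAR_PLURALS.get(noun, noun + "s")

def participle(verb):
    return IRREGULAR_PARTICIPLES.get(verb, verb + "ed")

def generic_sentences(tuples):
    """Turn (S subj verb obj) and (AN adj noun) tuples into generic sentences."""
    sentences = []
    for t in tuples:
        if t[0] == "AN":
            _, adj, noun = t
            sentences.append(f"{plural(noun).capitalize()} can be {adj}.")
        elif t[0] == "S":
            _, subj, verb, obj = t
            sentences.append(f"{plural(obj).capitalize()} can be {participle(verb)}.")
    return sentences

tuples = [("S", "man", "buy", "book"), ("AN", "heavy", "man"), ("AN", "expensive", "book")]
print(generic_sentences(tuples))
```

With empty exception tables the naive suffixing would produce forms like "buyed", which is the kind of artifact visible in the unfiltered output later in the deck.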

66 Selected examples from chemistry
Acids can be strong or weak. Acids can be vitamins. Acids can release ions. Bases can turn litmus blue. Carbonates can form CO2. Chlorides can be ions. Concentrations can be measured.

67 More examples (unfiltered)…
Acids can react. Acids can release ions. Acids can separate stepwises. Acids can taste. Acids can turn litmuss. Addeds can be water. Alkalines can be NaHCO3. Alkalines can be material. Alkalis can have concentration. Alkalis can have dissociation. Amazinglys can be natural. Ammonias can be bases. Amounts can be different. Amounts can be dissolveed. Amounts can be great. Amounts can be makeed. Amounts can be measured. Amounts can be similar. Amounts can be small. Amounts can be specific. Amounts can depend. Analysiss can be dimensional. Anions can have combination. …. Substances can form ions. Substances can gain electronss. Substances can have amount. Substances can have mixture. Substances can have solubility. Substances can lose electronss. Substances can react. Substances can react ions. Sulfates can be weighed. Sulfates can dissolve. Sulfides can react. Sulfidess can have reaction. Sums can equal charges. Surroundingss can have pH. Systems can be biological. Tables can show relationships. Tastes can be sour. Tastings can be permited. Teachers can be high. Teachers can be less. Teachers can be useful. Temperatures can be. Temperatures can be given. ….

68 Uses of tuples Extracted tuples can guide parsing of subsequent sentences. Tuples can aid new word sense detection and WSD. Iteration using recent tuples allows more knowledge extraction from a corpus. The generic sentences derived from tuples can be integrated into a KB.

69 Assigning senses to tuples
Start with a tuple such as: (VN “eat” “metal”) To assign a sense to “eat”, find tuples with “eat” replaced by another word If alternates like “corrode” share a Synset with “eat”, select the sense of “eat” as that Synset. If no Synset, hypothesize a new sense.
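A toy version of this sense-assignment heuristic is sketched below in Python. The synset inventory is a hypothetical hand-made dictionary, not real WordNet data, and `assign_sense` is a name invented for the example; a real system would query a lexical database instead.

```python
# Hypothetical toy sense inventory (word -> set of synset labels), not real WordNet data.
SYNSETS = {
    "eat": {"consume.v.01", "corrode.v.01"},
    "corrode": {"corrode.v.01"},
    "devour": {"consume.v.01"},
}

def assign_sense(target, corpus_tuples, synsets=SYNSETS):
    """Pick a sense for the head word using alternates seen with the same argument."""
    kind, word, arg = target
    alternates = {w for k, w, a in corpus_tuples if k == kind and a == arg and w != word}
    candidates = set()
    for alt in alternates:
        candidates |= synsets.get(word, set()) & synsets.get(alt, set())
    if candidates:
        return sorted(candidates)[0]
    return None  # no shared synset: the caller would hypothesize a new sense

corpus = [("VN", "eat", "metal"), ("VN", "corrode", "metal"), ("VN", "devour", "food")]
print(assign_sense(("VN", "eat", "metal"), corpus))  # corrode.v.01
```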

70 Building concept hierarchies
Tuple sets can provide hypernym information Compare sets of concepts derived from e.g. (VN x “fruit”) (VN x “apple”) (VN x “orange”) Everything that can be done to fruit can be done to apples and oranges, but not vice versa.
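The subset test behind this idea can be written down directly. The following Python sketch assumes (VN verb noun) tuples and uses invented names (`verbs_for`, `is_hypernym_candidate`) and made-up example tuples; it only proposes candidates, which would still need review.

```python
def verbs_for(noun, tuples):
    """Collect the verbs observed with this noun in (VN verb noun) tuples."""
    return {verb for kind, verb, obj in tuples if kind == "VN" and obj == noun}

def is_hypernym_candidate(general, specific, tuples):
    """True if everything done to `general` is also done to `specific`, but not vice versa."""
    g, s = verbs_for(general, tuples), verbs_for(specific, tuples)
    return bool(g) and g < s  # proper subset

# Made-up tuples for illustration only.
tuples = [
    ("VN", "eat", "fruit"),
    ("VN", "eat", "apple"), ("VN", "peel", "apple"),
    ("VN", "eat", "orange"), ("VN", "squeeze", "orange"),
]
print(is_hypernym_candidate("fruit", "apple", tuples))  # True
print(is_hypernym_candidate("apple", "fruit", tuples))  # False
```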

71 Tuple Knowledge vs CPL Knowledge
Tuples: generic encoding of possible actions, relations (including causality), parts, and properties with possible values. No script-like knowledge. CPL includes these plus the ability to express temporal and causal sequences that are expected in typical situations and roles of participants in situations.

72 Building on Tuple Knowledge
Possible states, actions, causes, relations, parts, etc. need integration with situational expectations, functional purpose, and knowledge of the evolution of typical situations. Rules to generate hypotheses about this integration are needed. Such hypotheses could be reviewed manually.

73 Part 5: The Bigger Picture
including Mobius and beyond

74 Roles for Controlled Language in Mobius
A knowledge acquisition tool for pump-priming Restricted reading from a corpus: scan corpus for sentences/fragments in CPL “translators” convert full English to CPL A tool for interpreting statistically-derived fragments A dialog tool for question-asking browsing/interacting with the KB

75 9 Key Steps for Success 1. Acquire/build a large, linguistically-based ontology rapidly expand the Component Library fix and merge in WordNet 2. Enrich the ontology with simple, core knowledge purpose, has-part, causes, prevents, etc. collocations (“rain falls”, etc.) Possible sources include: OpenMind (MIT), Learner (ISI), KnowItAll (UW) from volunteers on the Web information extraction from a corpus 3. Solve the sense disambiguation problem Perhaps can bootstrap with sufficient background K

76 9 Key Steps for Success 4. Create models of textual discourse, e.g.,
teach by example use of metaphor and analogy definitions/general principles introductory “fluff” + software for identifying and applying the models 5. Develop new reasoning tools robust reasoning with contradictions reasoning with uncertainty constraint-based reasoning

77 9 Key Steps to Success 6. Knowledge integration:
Solve the pervasive metonymy problem Requires large background KB to start with plus matching algorithm “HC2H3O2(aq)+…C2H3O2-” basic-unit formula

78 9 Key Steps to Success 7. Develop new active introspection methods
Determine areas of coherence “Ruminate” to identify conflict Use of metadata to label quality of knowledge 8. Develop core problem-solving theories/methods: Reasoning about time, change Setting up and solving constraints Solve by elimination etc. 9. Deal with alternative viewpoints/levels of detail - support partitions/contexts in the KB (rather than a monolithic whole)

