Published by Erik Parrish. Modified over 9 years ago.
1
Reading to Learn: Q3 Review
Peter Clark, John Thompson, Tom Jenkins, Phil Harrison, Bill Murray
2
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
3
SRI-Boeing’s Reading to Learn Seedling
Goal:
–study issues in learning through reading by working with a reduced version of the problem, namely working with controlled, rather than unrestricted, natural language.
The NLP task is factored into two: full NL → CL, and CL → logic (this project)
Rationale:
–by sidestepping some of the linguistic issues of full NLP, we can focus on knowledge integration issues
–methods for full NL → CL can be studied separately
4
SRI-Boeing’s Reading to Learn Seedling
Approach:
–Rewrite 5 pages of chemistry text into our controlled language, CPL
–Extend and use our CPL interpreter to generate logic
–Integrate this new knowledge with an existing chemistry knowledge base (from the Halo Pilot), which has the new knowledge surgically deleted from it
–Compare the performance of the CPL-extended KB with that of the original
–Report on the problems encountered and solutions developed
5
This Seedling in Mobius
[Diagram of the Mobius components: Knowledge Integration, Introspection, Natural Language Processing, Test Generation; a “This seedling” label marks where this project sits.]
6
Summary
Q3:
–Completed coding of key sentences in CPL
–Demonstration of inference with that knowledge
–Study of cues for identifying important text
–Assembly of key lessons learned
–Interaction with ISI
–Exploration of shallow knowledge extraction
Q4:
–Finish interpretation of additional sentences
–Assemble qualitative and quantitative evaluations
–Continue interaction with ISI: side-by-side study
–Final report
7
Main Results and Messages With some hand-holding, part of the “Mobius loop” can be done –But: chemistry is a formidable domain Contributions: –10 key lessons learned for a larger project –Qualitative and quantitative evaluation data
8
10 Key Lessons Much of the text is irrelevant (“fluff”) Much important knowledge is conveyed by examples & diagrams General principles are rarely spelt out clearly Text is full of ambiguity, metaphor, and metonymy/“loosespeak” Declarative knowledge may be hidden in procedural descriptions Text creates disconnected knowledge, which may not chain well Discourse structure is important Generic sentences are ubiquitous Many sentences pose major representational challenges Traditional KR structures are difficult to extend
9
Two Reformulations into CPL…
Reformulation of the whole 5 pages into CPL
–approximately 250 sentences
–syntactic conversion + pseudo-logic
–generally not inference capable, esp. generics
–re-reformulation of first subsection into explicit if-thens: inference capable, but at a greater distance from the source text
Reformulation of key pieces into CPL
–approximately 10 if-then rules
–inference capable
–barely recognizable from the original source text
10
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
11
Some CPL Rules
IF a substance is an acid THEN the substance tastes sour.
IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye.
IF a substance is a base THEN the substance tastes bitter.
IF a substance is a base THEN the substance feels slippery.
IF a substance is an acid THEN the substance contains hydrogen.
IF a thing is a base THEN the thing is a substance.
IF an Arrhenius base contacts water THEN the base emits OH-minus ions in the water.
IF an Arrhenius acid is dissolving in water THEN the dissolving is increasing the concentration of H-plus ions in the water.
IF an Arrhenius base is dissolving in water THEN the dissolving is increasing the concentration of OH-minus ions in the water.
IF a substance is a HCl substance THEN the substance is an Arrhenius acid.
IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water.
IF hydrogen chloride gas is in water THEN the gas reacts with the water.
12
Reformulation of the 5 pages… Note: introductory material, flowery language, fluff, complex sentences, parentheticals.
13
IF a substance is an acid THEN the substance tastes sour. IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye. IF a substance is a base THEN the substance tastes bitter. IF a substance is a base THEN the substance feels slippery.
14
IF a substance is a HCl substance THEN the substance is an Arrhenius acid.
IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water.
IF hydrogen chloride gas is in water THEN the gas reacts with the water.
HCl is the chemical symbol for hydrogen chloride.
IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid.
IF a substance is concentrated hydrochloric acid THEN 37 percent of the mass of the substance is HCl.
IF a substance is concentrated hydrochloric acid THEN the concentration of HCl in the substance is 12 M. ← (Implied but not explicit)
15
CPL:
IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid.
Halo KB style:
(every Hydrochloric-Acid has-definition
  (instance-of (Aqueous-Solution))
  (has-solute ((a HCl-Substance))))
Surface logical form:
the'(e1,x1,e2) & aqueous'(e3,x1) & solution'(e2,x1) & of'(e4,x1,x2) & hcl'(e5,x2) & know'(e6,z1,x1,x3) & as'(e7,e6,x3) & hydrochloric'(e8,x3) & acid'(e9,x3)
16
Summary of Interpretation Challenges
Interpreting generics.
–"Acids cause some dyes to change color."
How to handle negation.
–"Some substances containing hydrogen are not acids."
–"The transfer leaves no undissociated acid molecules"
Vague attributes ("properties", "due to")
–"Properties of aqueous solutions of Arrhenius acids are due to H-plus ions"
Coreference with nominalizations ("react"/"reaction")
–"Hydrogen chloride reacts... The reaction produces..."
Naming: how to represent both the name and the symbol for a chemical.
–"An aqueous solution of HCl is called hydrochloric acid."
How to get new technical vocabulary + meanings into the system.
–"NaOH dissociates in water."
–"H2O abstracts the proton from HX"
How to represent definitions.
–"Arrhenius acids are defined..."
How to state that one category is more general than another.
–"Bronsted-Lowry acids are more general than Arrhenius acids."
17
Summary of Interpretation Challenges (cont)
How to represent "sometimes".
–"An H3O-plus ion sometimes reacts with an H2O molecule."
How to represent modals/tendencies like "can".
–"A molecule of a Bronsted-Lowry acid can donate a proton..."
How to represent an argument (proof), and generalize from it.
–"Therefore, the H2O molecule acts as a Bronsted-Lowry base."
–"Substances with negligible acidity contain hydrogen, but the substances do not behave as acids in water."
Vagueness ("is mostly", "nearby", "some")
–"The NH4Cl is mostly solid particles."
–"Some acids are better proton donors than other acids."
–"A weak acid partly transfers the acid's protons to the water."
–"Proton-transfer reactions are governed by the relative strengths of the bases"
–"The solution has a negligible concentration of HCl molecules."
–"An aqueous solution of acetic acid consists mainly of HC2H3O2 molecules"
–"The aqueous solution has relatively few H3O-plus ions"
Metonymy
–"The H2O molecule in Equation 16.5 donates a proton"
–"In Equation 16.9 HX dissolves in water."
–"Equation 16.9 describes the behavior of a strong acid in water."
18
Summary of Interpretation Challenges (cont)
Definitions with negation.
–"An H-plus ion is a proton with no valence electron."
Presuppositions
–"Acids cause some dyes to change color."
–"A Bronsted-Lowry acid always reacts with a nearby Bronsted-Lowry base."
Generalized formulae and equations
–"In Equation 16.6 the symbol HX denotes an acid."
How to compute and represent differences
–"An acid and a base differing only in a proton are called a conjugate pair"
How to handle definite references ("the" base) that haven't been introduced.
–"Removing a proton from the acid produces the conjugate base."
Change over time
–"The HNO2 molecule becomes the NO2-minus ion."
–"The H2O molecule changes into the hydronium ion"
–"Acids cause some dyes to change color."
Semi-malformed sentences
–"A stronger acid has a weaker conjugate base."
How to state and represent hypothetical situations.
–"Assume that H2O is a stronger base than X-minus in Equation 16.9."
19
Summary of Interpretation Challenges (cont) Generalization from examples –“In any reaction we can identify two sets of conjugate acid-base pairs. For example, consider the reaction…” Information in tables and diagrams
20
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
21
Recall from Last Time … Most of the textbook sentences are “fluff” and examples –and are not needed to solve test questions A few key sentences (and a table) are the heart of this section of the textbook –and are often given in italics These key sentences are not worded as precisely as needed for automatic translation into axioms that can chain together to solve a problem –in fact, some parts are not stated at all –students look at diagrams and examples and figure it out
22
Overview 4 key pieces of knowledge in the Section: –Computing the direction of the reaction Rewriting in CPL Compare to UT’s KM encoding Compare to ISI’s shallow logical form –Identifying the acids/bases in a reaction –Computing the conjugate of an acid/base –Comparing the strengths of two acids/bases
24
A Key Sentence in Our Textbook Let’s look at one example of a key sentence: “From these examples we conclude that in every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” Restated in Sample Exercise 16.3: “Thus, the equilibrium favors the direction in which the proton moves from the stronger acid and becomes bonded to the stronger base.” “In other words, the reaction favors consumption of the stronger acid and stronger base and formation of the weaker acid and weaker base.”
26
Rewriting a Sentence into CPL
Textbook:
“In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.”
Naïve Encoding 1:
IF there is a reaction AND one base in the reaction is stronger than the other base in the reaction THEN the direction of the reaction is away from the stronger base. [“favors transfer to” → “direction is away from”]
Naïve Encoding 2:
IF there is a reaction AND there is a base on the left side of the reaction AND there is a base on the right side of the reaction AND the first base is stronger than the second base THEN the direction of the reaction is to the right.
27
Further Refinement of the CPL
Naïve Encoding 2:
IF there is a reaction AND there is a base on the left side of the reaction AND there is a base on the right side of the reaction AND the first base is stronger than the second base THEN the direction of the reaction is to the right.
What “a base on the left side of the reaction” has to be spelled out as: “The chemical entity whose formula is on the left side of the equation of the reaction and which plays a base role”
28
Final CPL Rule That Worked!
IF there is an equation of a reaction
AND a first chemical entity has a chemical formula
AND the first chemical formula is part of the left side of the equation
AND the first chemical entity is playing a base role [“the base on the LHS”]
AND a second chemical entity has a second chemical formula
AND the second chemical formula is part of the right side of the equation
AND the second chemical entity is playing a base role [“the base on the RHS”]
AND the first chemical entity is stronger than the second chemical entity [means “stronger base than”]
THEN the direction of the reaction is right [to the right]
AND the equilibrium side of the reaction is right. [lies on the right]
(UT’s rep. uses Reaction, but should use Equation)
29
Compare Sentence to Final CPL
Sentence:
In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.
Final CPL:
IF there is an equation of a reaction
AND a first chemical entity has a chemical formula
AND a second chemical entity has a second chemical formula
AND the first chemical formula is part of the left side of the equation
AND the second chemical formula is part of the right side of the equation
AND the first chemical entity is playing a base role
AND the second chemical entity is playing a base role
AND the first chemical entity is stronger than the second chemical entity
THEN the direction of the reaction is right
AND the equilibrium side of the reaction is right.
(There’s a 2nd rule like this that concludes the direction is left)
(Slide annotation: “not actually used!”)
30
KM Generated from CPL
IF
–(_Equation7461 equation-of _Reaction7462)
chem. on LHS:
–(|_Chemical Entity7468| has-chemical-formula |_Chemical Formula7469|)
–(|_Chemical Formula7469| equal _Part7485)
–(_Part7485 is-part-of |_Left Side7483|)
–(|_Left Side7483| is-region-of _Equation7461)
chem. on RHS:
–(|_Chemical Entity7475| has-chemical-formula |_Chemical Formula7476|)
–(|_Chemical Formula7476| equal _Part7494)
–(_Part7494 is-part-of |_Right Side7492|)
–(|_Right Side7492| is-region-of _Equation7461)
–(|_Chemical Entity7468| plays |_Base Role7501|)
–(|_Chemical Entity7475| plays |_Base Role7508|)
–(|_Chemical Entity7468| stronger-base-than |_Chemical Entity7475|)
THEN
–(_Direction7518 value *right)
–(_Direction7518 direction-of _Reaction7462)
–(|_Equilibrium Side7524| property *right)
–(|_Equilibrium Side7524| equilibrium-side-of _Reaction7462)
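The IF part of a rule in this triple form can be checked by pattern matching against a set of ground triples. The following Python sketch shows one simple way to do that; the `match` function, the `?x` variable convention, and the example triples are assumptions for illustration and do not reflect how KM itself evaluates rules.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding under which all patterns occur in triples.

    A pattern is a 3-tuple of strings; strings starting with "?" are variables.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for p, t in zip(first, triple):
            if p.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Hypothetical ground facts, loosely modelled on the triples above.
TRIPLES = [
    ("Eq1", "equation-of", "Rxn1"),
    ("CO3^2-", "plays", "Base-Role"),
    ("SO4^2-", "plays", "Base-Role"),
    ("CO3^2-", "stronger-base-than", "SO4^2-"),
]
PATTERNS = [
    ("?x", "plays", "Base-Role"),
    ("?y", "plays", "Base-Role"),
    ("?x", "stronger-base-than", "?y"),
]
results = list(match(PATTERNS, TRIPLES))
```

Because each antecedent is a flat triple, conditions can be added or removed one at a time without restructuring the rest of the rule.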
31
Structure of the CPL Axioms
1. Find equilibrium side (or direction) of equation
2. Find out if a chemical is playing a base role in the equation
3. Find out if a chemical is the conjugate base of another chemical
3a. Look in Table, or …
3b. Check whether one formula differs from another in an H+
4. Check whether one base is stronger than another base
4a. Look in Table (not in CPL)
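The chain of axioms above can be sketched end to end in Python. This is only an illustration under stated assumptions: the two lookup tables stand in for the textbook table (steps 3a and 4a), the example equation and species strings are chosen for the example, and the function names `bases_in` and `direction` are hypothetical.

```python
# Assumed lookup tables standing in for the textbook table (steps 3a and 4a).
CONJUGATE_BASE = {"HSO4-": "SO4^2-", "HCO3-": "CO3^2-"}  # acid -> conjugate base
BASE_STRENGTH = {"SO4^2-": 1, "CO3^2-": 2}               # larger = stronger base


def bases_in(equation):
    """Steps 2 and 3: a species plays a base role if it is the conjugate
    base of some species on the other side of the equation."""
    left, right = equation
    left_bases = [s for s in left
                  if any(CONJUGATE_BASE.get(a) == s for a in right)]
    right_bases = [s for s in right
                   if any(CONJUGATE_BASE.get(a) == s for a in left)]
    return left_bases, right_bases


def direction(equation):
    """Steps 1 and 4: the reaction proceeds away from the stronger base."""
    left_bases, right_bases = bases_in(equation)
    left_base, right_base = left_bases[0], right_bases[0]
    if BASE_STRENGTH[left_base] > BASE_STRENGTH[right_base]:
        return "right"
    if BASE_STRENGTH[left_base] < BASE_STRENGTH[right_base]:
        return "left"
    return None  # equal strength: this sketch draws no conclusion


# HSO4- + CO3^2- <=> SO4^2- + HCO3-
equation = (["HSO4-", "CO3^2-"], ["SO4^2-", "HCO3-"])
```

Here `direction(equation)` evaluates to `"right"`, because the base on the left (CO3^2-) is the stronger of the two; swapping the two sides gives `"left"`, which corresponds to the second, mirrored CPL rule.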
32
Notes on our CPL Rule The wording is way different from the original text! The literal sentence translation would not have produced anything that could solve a problem, given an equation “In every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.” –this would create a Favoring event –the position of the equilibrium is the agent –the transfer of the proton is the object –what does this mean?
33
Overview 4 key pieces of knowledge in the Section: –Computing the direction of the reaction Rewriting in CPL Compare to UT’s KM encoding Compare to ISI’s shallow logical form –Identifying the acids/bases in a reaction –Computing the conjugate of an acid/base –Comparing the strengths of two acids/bases
34
How UT Encoded This
"In acid/base equilibrium reactions, the reaction proceeds in the direction of the side where equilibrium lies" [their comment for use in explanations]
(every Reaction has …
  (direction (
    (if (not (the direction of Self))
     then (a Direction-Value with
       (value ((if (the output of (a Compute-Equilibrium-Position with (input (Self))))
                then (if ((the output of (a Compute-Equilibrium-Position with (input (Self)))) = (the raw-material of Self))
                      then *left
                      else *right)))))
Paraphrase: To find the direction of a reaction, compute the equilibrium position. If the chemicals match the raw materials, then the direction is left, else right.
35
UT’s Compute-Equilibrium-Position
(every Compute-Equilibrium-Position has
  (input ((a Reaction)))
  (output (
    ;; See if both the strong acid and base are on the LHS.
    (if (;; Check the acids.
         ((the output of (a Compare-Relative-Strengths-of-Acids with
            (input (
              (oneof (the raw-material of (the input of Self)) where (the Acid-Role plays of It))
              (oneof (the result of (the input of Self)) where (the Acid-Role plays of It))))))
          = (oneof (the raw-material of (the input of Self)) where (the Acid-Role plays of It)))
         and
         ;; Check the bases.
         ((the output of (a Compare-Relative-Strengths-of-Bases with
            (input (
              (oneof (the raw-material of (the input of Self)) where (the Base-Role plays of It))
              (oneof (the result of (the input of Self)) where (the Base-Role plays of It))))))
          = (oneof (the raw-material of (the input of Self)) where (the Base-Role plays of It))))
     then (the result of (the input of Self))
     else (the raw-material of (the input of Self))))))
Paraphrase: If the stronger of the raw-material acid and the result acid is the raw-material acid (same for bases), then equilibrium is on the result side, else on the raw-material side.
36
Notes on UT’s Encoding Very procedural! Various procedural methods are encoded –both qualitative and quantitative Nothing like the textbook sentences Their representation does not match the natural conceptual model we expected –see the next slide
37
Mismatches between UT and CPL UT put a “direction” slot on a Reaction, we expected it to be on an Equation UT has no model of the left and right sides of an Equation, only the “raw-materials” and “result” slots of a Reaction UT has a Conjugate-Acid-Base-Pair concept, but lacks the conjugate-base & conjugate-acid relations we expected UT has no slot for the “equilibrium-side” of an Equation, only the “direction” of a reaction
38
More Mismatches between UT and CPL UT gives us no primitives to use for formula manipulation (adding an H+), it’s buried within their Compute-Conjugate-Acid UT’s model of Formula does not include a “charge” slot, they’ve only attached it to the Chemical itself UT has no notion of “stronger-base-than,” they only label a chemical with “intensity” = strong or weak. So, it would help if the conceptual model were closer to natural language!
39
Overview 4 key pieces of knowledge in the Section: –Computing the direction of the reaction Rewriting in CPL Compare to UT’s KM encoding Compare to ISI’s shallow logical form –Identifying the acids/bases in a reaction –Computing the conjugate of an acid/base –Comparing the strengths of two acids/bases
40
ISI’s Shallow Logical Form for our Sentence
“From these examples we conclude that in every acid-base reaction the position of the equilibrium favors transfer of the proton to the stronger base.”
from'(e1,e2,x1) & these'(e3,s1,e4) & example'(e4,x1) & plural'(e7,x1,s1) & we'(e8,x2) & plural'(e9,x2,s2) & conclude'(e2,x2,x3,z1) & that'(e10,e2,e11) & in'(e12,e11,x4) & every'(e13,x4,e14) & acid-base'(e15,x4) & reaction'(e14,x4) & the'(e16,x5,e17) & position'(e17,x5) & of'(e18,x5,x6) & the'(e19,x6,e20) & equilibrium'(e20,x6) & favor'(e11,x5,x7,z2) & transfer'(e21,x7) & of'(e22,x7,x8) & the'(e23,x8,e24) & proton'(e24,x8) & to'(e25,x7,x9) & the'(e26,x9,e27) & strong'(e28,x9) & base'(e27,x9)
41
Graph of ISI’s Shallow Logical Form
[Graph, flattened to text; “?” marks are as shown on the slide]
z1 = conclude(x2, x3); x2 = we; x3 = [missing!]
x1 = example; from(x1) ?; these; that
x4 = reaction; every(x4); acid-base(x4); in(x4) ? ?
z2 = favor(x5, x7)
x5 = position; of(x5, x6); x6 = equilibrium
x7 = transfer; of(x7, x8); x8 = proton; to(x7, x9); x9 = base; strong(x9)
42
Notes on ISI’s Shallow Logical Form Not far removed from a syntactic parse They plan to do much more development of this Will probably produce a literal translation –there will be a Favoring event, with agent & object As with the naïve CPL sentence, a literal translation would not help solve a Chemistry problem
43
Overview 4 key pieces of knowledge in the Section: –Computing the direction of the reaction Rewriting in CPL Compare to UT’s KM encoding Compare to ISI’s shallow logical form –Identifying the acids/bases in a reaction –Computing the conjugate of an acid/base –Comparing the strengths of two acids/bases
45
CPL for 2nd Key Sentence
“In any acid-base (proton transfer) reaction we can identify two sets of conjugate acid-base pairs.”
IF there is an equation of a reaction
AND a first chemical entity has a chemical formula
AND a second chemical entity has a second chemical formula
AND the first chemical formula is part of the left side of the equation
AND the second chemical formula is part of the right side of the equation
AND the first chemical entity is the conjugate base of the second chemical entity
THEN the first chemical entity is playing a base role
AND the second chemical entity is playing an acid role.
(There’s a 2nd rule like this with first & second reversed)
46
UT Code for 2nd Key Sentence
(every Chemical has
  (plays (
    (if ((the term of (the atomic-chemical-formula of (the has-basic-structural-unit of Self)))
         and (not (the Base-Role plays of Self)))
     then (if ((has-value
                (oneof (the result of (the Reaction raw-material-of of Self))
                 where (((the elements of (the term of (the atomic-chemical-formula of (the has-basic-structural-unit of It))))
                         = (forall2 (the elements of (the term of (the atomic-chemical-formula of (the has-basic-structural-unit of Self))))
                             (if ((the2 of It2) = H) then (:pair ((the1 of It2) + 1) H) else It2)))
                        or...
           then (a Base-Role)
Annotations: “IF one of the chemicals on the other side of the reaction…” (jump to the other side of the equation!) “… has an extra H” “…THEN this chemical’s a base”
[Diagram: a Reaction linked to a Chemical through the raw-material and result slots.]
47
Overview 4 key pieces of knowledge in the Section: –Computing the direction of the reaction Rewriting in CPL Compare to UT’s KM encoding Compare to ISI’s shallow logical form –Identifying the acids/bases in a reaction –Computing the conjugate of an acid/base –Comparing the strengths of two acids/bases These last two items are presented in a table
48
Conjugate Acid-Base Pairs
[The slide shows the textbook table next to its CPL encoding.]
CPL:
IF there is an HCl and a Cl-Minus THEN the conjugate base of the HCl is the Cl-Minus.
IF there is an H3O-Plus and an H2O THEN the conjugate base of the H3O-Plus is the H2O.
Etc.
49
Relative Strengths of Bases
[The slide shows the textbook table next to its CPL encoding.]
CPL:
IF there is a Cl-Minus and an HSO4-Minus THEN the HSO4-Minus is a stronger base than the Cl-Minus.
IF there is an HSO4-Minus and an NO3-Minus THEN the NO3-Minus is a stronger base than the HSO4-Minus.
IF there is an NO3-Minus and an H2O THEN the H2O is a stronger base than the NO3-Minus.
Etc.
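Pairwise facts like these only cover adjacent rows of the table. A reasoner would also need to chain them, for example to conclude that H2O is a stronger base than Cl-Minus. The Python sketch below assumes that "stronger base than" is transitive, which the slides do not state explicitly; the function name and the fact encoding are hypothetical.

```python
# (stronger, weaker) pairs taken from the three CPL rules above.
STRONGER_BASE = {
    ("HSO4-Minus", "Cl-Minus"),
    ("NO3-Minus", "HSO4-Minus"),
    ("H2O", "NO3-Minus"),
}


def stronger_base_than(a, b, facts=STRONGER_BASE):
    """True if a is a stronger base than b, directly or by chaining facts.

    Assumes the relation is transitive (an assumption of this sketch).
    """
    frontier, seen = [a], set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for stronger, weaker in facts:
            if stronger == current:
                if weaker == b:
                    return True
                frontier.append(weaker)
    return False
```

With only the three stated facts, `stronger_base_than("H2O", "Cl-Minus")` holds by chaining, while the reverse query does not.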
50
Lessons from Key Sentences - 1 The key sentences did not translate literally into useful logic –they had to be carefully rewritten in CPL –and knowledge was added from studying diagrams and examples –and they were tested with each other to chain together It was difficult to make use of the UT representations –they were very procedural –their representations were further removed from the English –so, we should use more natural representations ISI’s shallow logical forms may produce literal translations –again, not useful for solving problems
51
Lessons from Key Sentences - 2 Reading knowledge directly from a Chemistry text would be very challenging –the knowledge has to be written precisely enough for a computer (with little common sense) to encode –knowledge in tables and diagrams may be critical –the knowledge has to chain together to solve difficult exam problems –we need text that is written much more dryly and precisely –we need a domain that doesn’t have such difficult exam problems
52
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
53
Are Other Chemistry Texts Better? We looked at Web explanations and at ‘Chemistry Made Simple’ types of books Discovered that each teacher explains it differently Most jump right into quantitative formulas for computing where a reaction’s equilibrium lies –but our textbook teaches it qualitatively first, which is rare Other sources are not any easier to process
54
Examples of Other Sources
“Think of a Bronsted acid-base reaction as a competition between the 2 bases in the system for protons. The stronger base ‘wins’ and forces the equilibrium in the direction of the weaker acid and base.” (Web)
[some books say that an acid is a proton donor] “… the acid molecule does not ‘give’ or ‘donate’ the proton, it has it taken away. In the same sense, you do not donate your wallet to the pickpocket, you have it removed from you.” (another website)
“The base is a molecule with a built-in ‘drive’ to collect protons. As soon as the base approaches the acid, it will (if it is strong enough) rip the proton off the acid molecule and add it to itself.”
55
More Examples from Other Sources “You see, some bases are stronger than others, meaning some have a large ‘desire’ for protons, while other bases have a weaker drive. It’s the same way with acids, some have very weak bonds and the proton is easy to pick off, while other acids have stronger bonds, making it harder to ‘get the proton’.” “Remember that an acid-base reaction is a competition between two bases (think about it!) for a proton. If the stronger of the two acids and the stronger of the two bases are reactants (appear on the left side of the equation), the reaction is said to proceed to a large extent.” Note the heavy use of metaphors in these qualitative explanations! The more readable by humans, the less readable by computers!
56
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
58
Review Earlier analysis: –Much of the textbook is “irrelevant” for the purposes of computer-based reading motivational material, illustrative material, humor –Other sentences/parts are critical Questions: –Can a computer automatically find the critical items? –What cues might indicate the important material?
59
This brief analysis…
Here, just consider two categories:
–important vs. unimportant material
Categories of surface cues:
–linguistic
–context
–layout
–typography (e.g., font changes)
Looked at several textbooks:
–B&L, Chemistry Made Simple, Cliffs Notes
60
Cues for Importance/Unimportance Verb tense: past tense suggests irrelevance –chemical facts are generally presented in the present tense; past tense usually signals a historical digression; but biological facts include evolutionary facts, which require past tense. Cue phrases for important generalizations –“for example” and (less so) “thus” precede examples but follow important generalizations.
61
Cues for Importance/Unimportance
Long sentences (>20 words) suggest irrelevance
–Average sentence length for chemistry is about 15 words; for biology, ca. 24 words.
–15 words seems to allow a good balance of simplicity and complexity for stepping through explanations. CPL should target this number.
–Summaries tend to have long complex sentences that are harder to process. This is also true for sentences in review texts: Cliffs Notes, Instant Notes.
62
Cues for Importance/Unimportance
Everyday words suggest applications.
Nominalized verbs suggest irrelevance
–exception: basic chemical changes (e.g., reaction, combustion, evaporation)
Keywords:
–“if”, “when”, “because”, “for” indicate important sentences
–“For example” precedes an illustration, and also indicates that the preceding material is an important generality
–“although”: typically part of a fluffy sentence
63
Cues for Importance/Unimportance Definitional patterns: important! – “x is substance y”, “x is a y that does z”, “x is called y” First and last sentences in a paragraph tend to be important (unless transitional) –set the topic of the paragraph Text in bold or italics is often important Repetition: could this be exploited?
64
Summary Many surface cues exist Could identify important material by –surface cues –“deeper” model of the document structure e.g. Motivation → General principle → Example → Reinforce general principle Could the document automatically be turned into a labeled, networked structure like this? How document-specific are these patterns?
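A few of the surface cues listed above can be combined into a crude scoring heuristic. The Python sketch below is purely illustrative: the choice of cues, the equal +1/-1 weights, and the function name `importance_score` are assumptions, not results from this study, and cues such as verb tense, typography, and paragraph position are left out.

```python
import re

LONG_SENTENCE_WORDS = 20                    # ">20 words suggests irrelevance"
IMPORTANT_KEYWORDS = {"if", "when", "because"}
FLUFF_KEYWORDS = {"although"}
DEFINITION_PHRASES = ("is called",)         # one of the definitional patterns


def importance_score(sentence):
    """Return a rough score: positive suggests important, negative fluff."""
    lowered = sentence.lower()
    words = re.findall(r"[a-z0-9'-]+", lowered)
    score = 0
    if len(words) > LONG_SENTENCE_WORDS:
        score -= 1
    if any(word in IMPORTANT_KEYWORDS for word in words):
        score += 1
    if any(word in FLUFF_KEYWORDS for word in words):
        score -= 1
    if any(phrase in lowered for phrase in DEFINITION_PHRASES):
        score += 1
    return score
```

For instance, "An aqueous solution of HCl is called hydrochloric acid." scores 1 through the definitional pattern alone. How well such weights transfer across textbooks is exactly the open question raised above.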
65
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
66
Principles for an Extensible KB
Elaboration Tolerance: “A formalism is elaboration tolerant to the extent that it is convenient to modify a set of facts expressed in the formalism to take into account new phenomena or changed circumstances.” [John McCarthy]
e.g.: add/modify knowledge (semantics) by (only) adding formulae (syntactics)
Three Key Desirables for this:
–Syntactic simplicity
–Metonymy-tolerant reasoning
–Separate procedural and declarative knowledge
67
Syntactic Simplicity
Many syntactically large and complex structures in the original Halo KB, e.g.,
(every Acid-Role has
  (intensity (
    (a Intensity-Value with
      (value (
        (:pair
          ;; Case statement for Acids.
          (if ((the played-by of Self) isa Ionic-Compound-Substance)
           then (if (((the played-by of Self) isa HCl-Substance) or
                     ((the played-by of Self) isa HBr-Substance) or
                     ((the played-by of Self) isa HI-Substance) or
                     ((the played-by of Self) isa HClO3-Substance) or
                     ((the played-by of Self) isa HClO4-Substance) or
                     ((the played-by of Self) isa H2SO4-Substance) or
                     ((the played-by of Self) isa HNO3-Substance))
                 then *strong
                 else
Not elaboration-tolerant
68
Syntactic Simplicity
Better would be to factor them into smaller units, e.g.,
intensity(HCl-Substance, *strong)
intensity(HBr-Substance, *strong)
intensity(HI-Substance, *strong)
intensity(HClO3-Substance, *strong)
intensity(HClO4-Substance, *strong)
intensity(H2SO4-Substance, *strong)
intensity(HNO3-Substance, *strong)
…
intensity(HF-Substance, *weak)
intensity(HC2H3O2-Substance, *weak)
intensity(H2CO3-Substance, *weak)
…
Elaboration-tolerant
69
CPL Produces Syntactically Simple Structures…
“Traditional” KM:
(every Compare-Relative-Strengths-of-Acids has
  (output ((if ((the intensity of (the first of (the Chemicals)) = *strong)
            and
            ((the intensity of (the second of (the Chemicals)) = *weak)
   then (the strongest of (the Chemicals)) = (the first of (the Chemicals)))))
CPL triples:
IF
(_Intensity9 instance-of Intensity-Value)
(_Chemical8 instance-of Chemical)
(_Intensity5 instance-of Intensity-Value)
(_Chemical4 instance-of Chemical)
(_Intensity5 property *strong)
(_Intensity5 intensity-of _Chemical4)
(_Intensity9 property *weak)
(_Intensity9 intensity-of _Chemical8)
THEN
(_Chemical4 stronger-than _Chemical8)
70
Metonymy/Loosespeak Metonymy: One word substitutes for a closely related word Loosespeak: More generally, the “literal” interpretation is wrong Examples: –“The kettle is boiling.” –“I’m just going to change the washing machine.” –“It’s your turn to clean out the rabbit.” –“Remove a proton from the acid” –“The acid on the left of the equation” –“The reaction moves to the right” –“NaCl dissolves in water”
71
Handling Metonymy/Loosespeak 1. Detect inconsistencies / “unusualities” Need extensive world knowledge for this 2. If found, create and evaluate alternative interpretations –Metonymic transformation rules (e.g., Lakoff, Fass) PART for WHOLE (“Get your butt over here”) PLACE for INSTITUTION (“The White House isn’t saying anything”) PLACE for EVENT (“Remember the Alamo”) SUBSTANCE for MOLECULE (“NaCl dissolves”) FORMULA for SUBSTANCE (“NaCl is on the left of the eqn”)
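Step 2 can be pictured as a type-driven coercion: when a slot expects one kind of object and is handed a closely related one, a transformation rule says which relation to follow. The Python sketch below is a toy version under stated assumptions. The entity names, the `TYPES` and `KB` tables, and the `resolve` function are all hypothetical; only the relation names `has-chemical-formula` and `has-basic-structural-unit` are borrowed from the KM snippets elsewhere in these slides. Detecting the mismatch in the first place (step 1) is assumed away by passing the expected type in directly.

```python
# (given type, expected type) -> relation to follow; illustrative rules only.
COERCIONS = {
    ("Substance", "Formula"): "has-chemical-formula",
    ("Substance", "Molecule"): "has-basic-structural-unit",
}

# Hypothetical miniature knowledge base.
TYPES = {
    "NaCl-Substance": "Substance",
    "NaCl-Formula": "Formula",
    "NaCl-Unit": "Molecule",
}
KB = {
    ("NaCl-Substance", "has-chemical-formula"): "NaCl-Formula",
    ("NaCl-Substance", "has-basic-structural-unit"): "NaCl-Unit",
}


def resolve(entity, expected_type):
    """Return the entity itself, or the related entity of the expected type."""
    given_type = TYPES[entity]
    if given_type == expected_type:
        return entity
    relation = COERCIONS.get((given_type, expected_type))
    if relation is None:
        raise TypeError(f"no coercion from {given_type} to {expected_type}")
    return KB[(entity, relation)]
```

So "NaCl is on the left of the equation", where a formula is expected, would resolve `NaCl-Substance` to `NaCl-Formula` rather than forcing the author to write the conversion by hand.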
72
Metonymy Tolerance (“Loosespeak”)
Could greatly reduce syntactic complexity
–~50% of HaloKB is doing type conversions
Example of extensive metonymy: “HC2H3O2(aq) + … C2H3O2-” (diagram labels: basic-unit, formula)
73
Metonymy Tolerance
(every Compare-Relative-Strengths-of-Acids has (output ((if (((the1 of (the value of (the intensity of (the Acid-Role plays of (the first of (the input of Self)))))) = *strong) and ((the1 of (the value of (the intensity of (the Acid-Role plays of (the second of (the input of Self)))))) /= *strong)) then (the first of (the input of Self)))))
If we had a metonymy-tolerant reasoner, we could instead write…
(every Compare-Relative-Strengths-of-Acids has (output ((if ((the intensity of (the first of (the Chemicals)) = *strong) and ((the intensity of (the second of (the Chemicals)) /= *strong) then (the strongest of (the Chemicals)) = (the first of (the Chemicals)))))
74
Separating Procedural and Declarative Knowledge
Procedural descriptions are uni-directional, and difficult to introspect on
Better: domain-specific, declarative knowledge + general-purpose procedural algorithms
“Every acid has a conjugate base, formed by removing a proton from the acid.... Similarly, every base has associated with it a conjugate acid, formed by adding a proton to the base.”
Acid-Chemical = Base-Chemical + H+
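To make the point about directionality concrete, here is a minimal sketch in which the relation Acid-Chemical = Base-Chemical + H+ is stated once and used in both directions by one general-purpose function. The species representation (a `Counter` of atoms plus a charge) and the names `PROTON`, `combine`, `conjugate_base` and `conjugate_acid` are assumptions for illustration only.

```python
from collections import Counter

# The single declarative fact: a proton is one H atom with charge +1.
PROTON = (Counter({"H": 1}), 1)


def combine(species, other, sign):
    """General-purpose step: add (sign=+1) or remove (sign=-1) `other`."""
    atoms, charge = species
    other_atoms, other_charge = other
    result = Counter(atoms)
    for element, count in other_atoms.items():
        result[element] += sign * count
    if any(n < 0 for n in result.values()):
        raise ValueError("cannot remove atoms that are not present")
    # Drop elements whose count fell to zero so equal species compare equal.
    result = Counter({e: n for e, n in result.items() if n > 0})
    return result, charge + sign * other_charge


def conjugate_base(acid):
    """Acid -> base: remove a proton."""
    return combine(acid, PROTON, -1)


def conjugate_acid(base):
    """Base -> acid: add a proton."""
    return combine(base, PROTON, +1)


HF = (Counter({"H": 1, "F": 1}), 0)
F_MINUS = (Counter({"F": 1}), -1)
```

Both directions share the same fact and the same procedure, so nothing has to be encoded twice.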
75
Separating Procedural and Declarative Knowledge
Mixed:
(every Compare-Relative-Strengths-of-Acids has (output ((if (((the1 of (the value of (the intensity of (the Acid-Role plays of (the first of (the input of Self)))))) = *strong) and ((the1 of (the value of (the intensity of (the Acid-Role plays of (the second of (the input of Self)))))) /= *strong)) then (the first of (the input of Self)) [Compare-Relative-Strengths-of-Acids-output-1] )))
Declarative: HCl *strong, H2CO3 *weak, …
+ Procedural (PSM): Find object(s) with qualitatively largest attribute value
+ “Theory of magnitudes”: *strong > *weak > …
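The separation on this slide can be sketched as three independent pieces: domain facts, an ordering over qualitative values, and a procedure that knows nothing about acids. This is a minimal sketch; the names `MAGNITUDE_ORDER`, `INTENSITY` and `largest` are hypothetical, and only the two values the slide spells out are included in the ordering.

```python
# "Theory of magnitudes": qualitative values ranked from largest to smallest.
# The slide gives *strong > *weak > ...; further values are elided in the source.
MAGNITUDE_ORDER = ["*strong", "*weak"]

# Declarative domain facts (from the slide).
INTENSITY = {"HCl": "*strong", "H2CO3": "*weak"}


def largest(objects, attribute, order):
    """General-purpose procedure: objects with the qualitatively largest value."""
    rank = {value: i for i, value in enumerate(order)}
    best = min(rank[attribute[o]] for o in objects)
    return [o for o in objects if rank[attribute[o]] == best]
```

The same `largest` procedure would work for any attribute with a qualitative ordering, which is the sense in which the procedural part is general-purpose.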
76
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
77
Possible Quantitative Metrics
Behavioral:
–Ablation study: Question-answering performance
Analytic:
–Complexity of CPL vs Halo KB encodings
–Amount of domain knowledge added by Boeing in writing CPL
–% of Halo KB that would be simplified if metonymy handled
–% of original text encodable in CPL
–Time taken to encode the KBs
–% of source text which is important (vs. fluff)
–Bar graph of textual phenomena vs. frequency of occurrence, e.g., metaphor, examples, metonymy, diagrams
–Measure of redundancy in the textbook
78
Behavioral Evaluation: Ablation Methodology
Approach:
a. Create set of questions
b. Send questions to Halo KB, measure % correct
c. Ablate the Halo KB, add in ours
d. Send questions to new KB, measure % correct
Issues:
How to ensure a fair comparison?
–defining the space of questions to look at
How to ablate the UT KB?
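Steps b to d amount to scoring the same question set twice, before and after the swap. The sketch below shows only that bookkeeping, under a strong simplifying assumption: each KB is treated as a plain dictionary from question to answer. The functions `percent_correct`, `ablate` and `ablation_study` are hypothetical, and nothing here reflects how the Halo KB is actually queried or ablated.

```python
def percent_correct(answer, questions):
    """Share of (question, expected) pairs that the `answer` callable gets right."""
    if not questions:
        raise ValueError("need at least one question")
    right = sum(1 for q, expected in questions if answer(q) == expected)
    return 100.0 * right / len(questions)


def ablate(kb, removed_keys):
    """Copy of a dict-based KB without the entries named in removed_keys."""
    return {k: v for k, v in kb.items() if k not in removed_keys}


def ablation_study(base_kb, removed_keys, replacement, questions):
    """Steps b-d: score the original KB, then the ablated KB plus the replacement."""
    before = percent_correct(base_kb.get, questions)
    new_kb = {**ablate(base_kb, removed_keys), **replacement}
    after = percent_correct(new_kb.get, questions)
    return before, after
```

The open issues on the slide (a fair question space, and how to ablate at all) are exactly the parts this sketch takes as given inputs.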
79
Behavioral Evaluation: Relevant AP Questions from the Halo Pilot
Questions from Halo Pilot Syllabus & Sample Qns
Q10. Given an equilibrium reaction, which species in the reaction act as bases?
Q33. Each of the following can act as both a Bronsted acid and a Bronsted base EXCEPT...
Questions from Challenge Exam, Project Halo
Q18. Given an equilibrium reaction, the species that act as acids include which of the following?
Q19. Given an equilibrium reaction, the correct acid/conjugate base pair is...
Q37. Which of the following species forms an acid when added to water?
Q38. Which of the following (lists of chemicals) is in correct order of increasing acidity?
80
Behavioral Evaluation: Variations on a Theme… Four main question patterns: –What is the conjugate base/acid of X? –Is X stronger/weaker acid/base than Y? –Find the conjugate acid-base pairs in equation E –What is the direction of the equilibrium?
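One way to build a question set from these patterns is to instantiate them over a list of species. The sketch below covers only the first two patterns, since the other two need equations as input; the function names and the exact English wording of the generated questions are assumptions.

```python
from itertools import permutations


def conjugate_questions(species):
    """Pattern 1: What is the conjugate base/acid of X?"""
    return [
        f"What is the conjugate {kind} of {x}?"
        for x in species
        for kind in ("base", "acid")
    ]


def comparison_questions(species):
    """Pattern 2: Is X stronger/weaker acid/base than Y?"""
    return [
        f"Is {x} a {relation} {kind} than {y}?"
        for x, y in permutations(species, 2)
        for relation in ("stronger", "weaker")
        for kind in ("acid", "base")
    ]
```

The number of comparison questions grows quadratically with the species list, so a real question set would likely be sampled rather than enumerated.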
81
Core Knowledge Encodings
Task | Halo KB | CPL
Conjugate pairs | Giant KM procedure for formula manipulation | Lookup table
Relative strengths | Qualitative absolute strengths (strong/weak/negligible) + qualitative comparison | Relative strength assertions
Labelling acid/bases in a reaction | Giant KM procedure for reaction manipulation | if-then rule using conjugate pairs
Computing direction of the reaction | KM rule | if-then rule
82
Core Knowledge Encodings
Task | Halo KB | CPL
Conjugate pairs | Giant KM procedure for formula manipulation | Lookup table
Relative strengths | Qualitative absolute strengths (strong/weak/negligible) + qualitative comparison | Relative strength assertions
Labelling acid/bases in a reaction | Giant KM procedure for reaction manipulation | if-then rule using conjugate pairs
Computing direction of the reaction | KM rule | if-then rule
Annotations: More general, ≈, ≈ (equivalent)
83
Behavioral Evaluation: Discussion Points We can predict the outcome of any evaluation –can see the internals of each system So what is a fair sample set? –Generate instantiations of the 4 templates? –AP exam questions? –Extend to cover other knowledge in the 5 pages? none of it contained in Halo KB
84
Analytic Evaluation
Possible metrics include:
–Complexity of CPL vs Halo KB encodings
–Amount of domain knowledge added by us in writing CPL
–% of KB simplified if metonymy handled
–% of original text encodable in CPL
–Time taken to encode the KBs
–% of source text which is important (vs. fluff)
–Bar graph of textual phenomena vs. frequency, e.g., metaphor, examples, metonymy, diagrams
–Measure of redundancy in the textbook
85
Agenda This Seedling and Mobius –Major lessons learned Reformulations in CPL –Whole 5 pages –Key Sentences How do other texts compare? Generics How to identify “important” text Principles for an extensible KB Evaluation discussion Tuples as another source of knowledge
86
Knowledge Mining
Schubert’s Conjecture: There is a largely untapped source of general knowledge in texts, lying at a level beneath the explicit assertional content, and which can be harnessed.
“The camouflaged helicopter landed near the embassy.”
–helicopters can land
–helicopters can be camouflaged
Our attempt: “lightweight” LFs generated from Reuters
LF forms:
(S subject verb object (prep noun) (prep noun) …)
(NN noun … noun)
(AN adj noun)
87
Knowledge Mining
Newswire Article:
HUTCHINSON SEES HIGHER PAYOUT. HONG KONG. Mar 2. Li said Hong Kong’s property market remains strong while its economy is performing better than forecast. Hong Kong Electric reorganized and will spin off its non-electricity related activities. Hongkong Electric shareholders will receive one share in the new subsidiary for every owned share in the sold company. Li said the decision to spin off …
Implicit, tacit knowledge:
–Shareholders may receive shares.
–Companies may be sold.
–Shares may be owned.
88
Knowledge Mining – our attempt
Fragment of the raw data (Brown & Lemay)
;; Atoms can combine
(S "atom" "combine")
;; For example, combustion reactions are redox reactions because elemental oxygen is converted to compounds of oxygen (Section 3.2).
(S "reaction" "be" "reaction")
(S-ADJ "oxygen" "converted" ("to" "compound"))
(AN "elemental" "oxygen")
;; Plan: Metals react with acids to form salts and gas.
(S "metal" "react" (PP "with" "acid"))
;; Extensive oxidation can lead to the failure of metal machinery parts or the deterioration of metal structures.
(S "oxidation" "lead" (PP "to" "failure"))
(S "oxidation" "lead" (PP "to" "deterioration"))
(AN "extensive" "oxidation")
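To show how tuples like these might be read back as statements of possibility (in the spirit of "helicopters can land"), here is a minimal sketch. It represents each LF as a Python tuple and handles only the S and AN forms; the `render` function, the tuple encoding and the unpluralized English output are assumptions for illustration, not the project's extraction code.

```python
def render(lf):
    """Turn a lightweight LF tuple into a rough English statement of possibility."""
    tag, *args = lf
    if tag == "S":
        subject, verb, *rest = args
        words = [subject, "can", verb]
        for item in rest:
            if isinstance(item, tuple) and item[0] == "PP":
                words += [item[1], item[2]]  # preposition, noun
            else:
                words.append(item)  # plain object noun
        return " ".join(words)
    if tag == "AN":
        adjective, noun = args
        return f"{noun} can be {adjective}"
    # Other forms (e.g. NN, S-ADJ) are left out of this sketch.
    raise ValueError(f"unsupported LF tag: {tag}")
```

For example, `render(("S", "metal", "react", ("PP", "with", "acid")))` gives `"metal can react with acid"`.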
90
Summary Q3: –Completed coding of key sentences in CPL –Demonstration of inference with that knowledge –Study of cues for identifying important text –Assembly of key lessons learned –Interaction with ISI –Exploration of shallow knowledge extraction Q4: –Finish interpretation of additional sentences –Assemble qualitative and quantitative evaluations –Continue interaction with ISI: Side-by-side study –Final report