Learning to Transform Natural to Formal Languages
Rohit J. Kate, Yuk Wah Wong, and Raymond J. Mooney
Presented by Ping Zhang
May 13th, Overview: Background; SILT; CLang and GeoQuery; Semantic Parsing using Transformation Rules; String-based learning; Tree-based learning; Experiments; Future work; Conclusion
Natural Language Processing (NLP) Natural language: human language, e.g., English. Why process NL: to provide a much more user-friendly interface. Problems: NL is too complex, and NL has many ambiguities. To date, NL cannot be used to program a computer.
Classification of Language Traditional classification (Chomsky hierarchy): regular grammar; context-free grammar (formal language); context-sensitive grammar; unrestricted grammar (natural language). All current programming languages are no more expressive than context-sensitive languages; for example, C++ is a restricted context-sensitive language.
An Approach to Process NL Map a natural language to a formal query or command language, so that NL interfaces to complex computing and AI systems can be developed more easily: English → (map) → formal language → compiler/interpreter.
Grammar Terms Grammar: G = (N, T, S, P). N: finite set of non-terminal symbols. T: finite set of terminal symbols. S: starting non-terminal symbol, S ∈ N. P: finite set of productions. A production has the form x → y, for example: Noun → “computer”; AssignmentStatement → i := 10;; Statements → Statement ; Statements
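The 4-tuple above can be made concrete in a few lines. This is a generic sketch using the slide's Noun → “computer” production; the class names `Production` and `Grammar` are my own, not from the paper.

```python
from dataclasses import dataclass

# A grammar as the 4-tuple G = (N, T, S, P); names are illustrative.
@dataclass(frozen=True)
class Production:
    lhs: str          # a non-terminal, e.g. "Noun"
    rhs: tuple        # terminals and non-terminals, e.g. ("computer",)

@dataclass
class Grammar:
    nonterminals: set   # N
    terminals: set      # T
    start: str          # S, with S in N
    productions: list   # P

# The slide's example production: Noun -> "computer"
g = Grammar(
    nonterminals={"Noun"},
    terminals={"computer"},
    start="Noun",
    productions=[Production("Noun", ("computer",))],
)
```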
SILT: Semantic Interpretation by Learning Transformations. Transformation rules map substrings of NL sentences, or subtrees of their syntactic parse trees, to subtrees of the formal-language parse tree. SILT learns transformation rules from training data: pairs of NL sentences and manually translated formal-language statements. Two target formal languages: CLang and GeoQuery.
CLang A formal language used for coaching robotic soccer in the RoboCup Coach Competition. The CLang grammar consists of 37 non-terminals and 133 productions. All tactics and behaviors are expressed as if-then rules. An example: ( (bpos (penalty-area our) ) (do (player-except our {4} ) (pos (half our) ) ) ) “If the ball is in our penalty area, all our players except player 4 should stay in our half.”
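A CLang rule is a nested s-expression. This minimal reader is a generic sketch (not part of SILT) that makes the structure of the rule above explicit:

```python
# A minimal s-expression reader (a generic sketch, not part of SILT)
# showing the nested structure of the CLang rule above.
def parse_sexp(text):
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    def read(i):
        if tokens[i] == "(":
            node, i = [], i + 1
            while tokens[i] != ")":
                child, i = read(i)
                node.append(child)
            return node, i + 1
        return tokens[i], i + 1
    tree, _ = read(0)
    return tree

rule = parse_sexp(
    "((bpos (penalty-area our)) "
    "(do (player-except our {4}) (pos (half our))))"
)
# rule[0] is the condition, rule[1] the directive
```

The nested list mirrors the if-then structure: the first element is the condition, the second the directive.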
GeoQuery A database query language for a small database of U.S. geography containing about 800 facts. Based on Prolog, augmented with meta-predicates. An example: answer(A, count(B, (city(B), loc(B, C), const(C, countryid(usa) ) ), A) ) “How many cities are there in the US?”
Two Methods String-based transformation learning: directly maps strings of the NL sentences to the parse tree of the formal language. Tree-based transformation learning: maps subtrees to subtrees between the two languages; assumes a syntactic parser for the NL sentences is provided.
Semantic Parsing Pattern matching: patterns found in NL, with templates based on productions, map NL phrases to formal expressions. Rule representation for the two methods: the string pattern “TEAM UNUM has the ball” and the tree pattern (S (NP TEAM UNUM) (VP (VBZ has) (NP (DT the) (NN ball)))) both map to CONDITION → (bowner TEAM {UNUM}).
Examples of Parsing
1. “If our player 4 has the ball, our player 4 should shoot.”
2. “If TEAM UNUM has the ball, TEAM UNUM should ACTION.”  (TEAM = our, UNUM = 4, twice; ACTION = (shoot))
3. “If CONDITION, TEAM UNUM should ACTION.”  (CONDITION = (bowner our {4}); TEAM = our, UNUM = 4; ACTION = (shoot))
4. “If CONDITION, DIRECTIVE.”  (CONDITION = (bowner our {4}); DIRECTIVE = (do our {4} (shoot)))
5. RULE: ( (bowner our {4}) (do our {4} (shoot) ) )
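One rewriting step in a trace like the one above can be sketched as follows. `apply_rule`, the UPPERCASE-slot convention, and the regex matching are illustrative simplifications of mine (word skipping, for instance, is omitted), not SILT's actual implementation:

```python
import re

# A toy illustration of one bottom-up rewriting step: a pattern with
# non-terminal slots is matched, the match is replaced by the rule's
# non-terminal, and the slot fillers instantiate the formal template.
def apply_rule(sentence, pattern, label, template):
    """Match pattern (slots are UPPERCASE words) against sentence; on
    success, replace the matched span with label and fill template."""
    regex = re.sub(r"[A-Z]+", lambda m: f"(?P<{m.group(0)}>\\S+)", pattern)
    m = re.search(regex, sentence)
    if m is None:
        return sentence, None
    formal = template
    for slot, value in m.groupdict().items():
        formal = formal.replace(slot, value)
    return sentence[:m.start()] + label + sentence[m.end():], formal

# Rewriting "our 4 has the ball" into CONDITION with its formal output
s, f = apply_rule("our 4 has the ball",
                  "TEAM UNUM has the ball",
                  "CONDITION",
                  "(bowner TEAM {UNUM})")
# s == "CONDITION", f == "(bowner our {4})"
```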
Variations of Rule Representation SILT allows patterns to skip some words or nodes, e.g. an intervening “then” in “if CONDITION, DIRECTIVE.”, to deal with non-compositionality. SILT allows applying constraints: “in REGION” matches “CONDITION → (bpos REGION)” only if “in REGION” follows “the ball”. SILT allows templates with multiple productions: “TEAM player UNUM has the ball in REGION” maps to CONDITION → (and (bowner TEAM UNUM) (bpos REGION)).
Learning Transformation Rules
Input: a training set T of NL sentences paired with formal representations; the set Π of productions in the formal grammar.
Output: a learned rule base L.
Algorithm: Parse all formal representations in T. Collect positive examples P_π and negative examples N_π for each π ∈ Π. L = ∅. Until all positive examples are covered, or no more good rules can be found for any π ∈ Π, do: R′ = FindBestRules(Π, P, N); L = L ∪ R′; apply the rules in L to the sentences in T.
Given an NL sentence S: if π is used in the formal expression of S, then S is a positive example for π (S ∈ P_π); if π is not used, then S is a negative example for π (S ∈ N_π).
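The covering loop above can be sketched as below. This is a simplified, hypothetical rendering: `Rule`, `covers()`, and the stubbed `find_best_rule` stand in for the real components, and negative examples are omitted for brevity.

```python
# A schematic sketch of the covering loop; all names are illustrative
# stand-ins, not SILT's actual code.
class Rule:
    def __init__(self, word):
        self.word = word            # a one-word pattern, for illustration
    def covers(self, sentence):
        return self.word in sentence.split()

def learn_rules(productions, positives, find_best_rule):
    """positives maps each production pi to its uncovered positive
    sentences; returns the learned rule base L."""
    learned = []
    while any(positives[pi] for pi in productions):
        best = find_best_rule(positives)
        if best is None:            # no more good rules can be found
            break
        learned.append(best)
        for pi in productions:      # drop positives the new rule covers
            positives[pi] = [s for s in positives[pi]
                             if not best.covers(s)]
    return learned
```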
Issues in SILT Learning Non-compositionality. Rule cooperation: rules are learned in order, so an over-general ancestor leads to a group of over-general child rules, and no rule can cooperate with such rules. Two approaches can address this: 1. Find the single best rule over all competing productions in each iteration. 2. Over-generate rules, then find a subset that cooperates.
FindBestRule() for String-based Learning
Input: the set Π of productions in the formal grammar; sets of positive examples P_π and negative examples N_π for each π ∈ Π.
Output: the best rule BR.
Algorithm: R = ∅. For each production π ∈ Π: let R_π be the maximally specific rules derived from P_π; repeat k = 1000 times: choose r1, r2 ∈ R_π at random, g = GENERALIZE(r1, r2, π), add g to R_π; then R = R ∪ R_π. BR = argmax_{r ∈ R} goodness(r). Remove the positive examples covered by BR from P.
FindBestRule() (cont.) goodness(r): a score for candidate rules. GENERALIZE(r1, r2): generalizes two transformation rules based on the same production. For example, for π: REGION → (penalty-area TEAM), pattern 1 “TEAM 's penalty box” and pattern 2 “TEAM penalty area” generalize to “TEAM penalty”.
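One plausible reading of GENERALIZE over string patterns is keeping the common token subsequence. The sketch below is my simplification, not the paper's exact operator, but it reproduces the slide's example:

```python
from difflib import SequenceMatcher

# Generalize two string patterns by keeping their common token
# subsequence (a simplification of SILT's GENERALIZE, shown only to
# reproduce the slide's example).
def generalize(p1, p2):
    t1, t2 = p1.split(), p2.split()
    blocks = SequenceMatcher(a=t1, b=t2).get_matching_blocks()
    common = [tok for b in blocks for tok in t1[b.a:b.a + b.size]]
    return " ".join(common)

generalize("TEAM 's penalty box", "TEAM penalty area")  # -> "TEAM penalty"
```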
Tree-based Learning A similar FindBestRules() algorithm, but GENERALIZE finds the largest common subgraph of the two rules' tree patterns. For example, for π: REGION → (penalty-area TEAM), the tree patterns for “TEAM 's penalty box” and “TEAM penalty area” generalize to a tree fragment containing NP, TEAM, and (NN penalty). [Figure: the two tree patterns and their generalization.]
Experiments For CLang: 300 instructions selected randomly from log files of the 2003 RoboCup Coach Competition; each formal instruction was translated into English by a human. Average length of an NL sentence: … words. For GeoQuery: 250 questions collected from undergraduate students; all English queries were translated manually. Average length of an NL sentence: 6.87 words.
Results for CLang
Results for CLang (cont.)
Results for GeoQuery
Results for GeoQuery (cont.)
Time Consumption Time consumed, in minutes.
Future Work Though improved, SILT still lacks the robustness of statistical parsing: its hard-matching symbolic rules are sometimes too brittle. A more unified implementation of tree-based SILT would allow directly comparing and evaluating the benefit of using initial syntactic parsers.
Conclusion SILT, a novel approach, learns transformation rules that map NL sentences into a formal language. It shows better overall performance than previous approaches. NLP: still a long way to go.
Thank you! Questions or comments?