Published by Bruce McDowell. Modified over 9 years ago.
1
Unsupervised Learning of Probabilistic Context-Free Grammar Using Iterative Biclustering
Kewei Tu and Vasant Honavar
Artificial Intelligence Research Laboratory, Department of Computer Science, Iowa State University
Center for Computational Intelligence, Learning, and Discovery (CCILD)
Talk presented at ICGI 2008, St Malo, France, September 2008.
www.cs.iastate.edu/~honavar/aigroup.html
www.cild.iastate.edu
2
Unsupervised Learning of Probabilistic Context-Free Grammar
Greedy search to maximize the posterior of the grammar given the corpus
Iterative (distributional) biclustering
Competitive experimental results
3
Outline
Introduction
Probabilistic Context-Free Grammars (PCFG)
The Algorithm based on Iterative Biclustering (PCFG-BCL)
Experimental results
4
Outline
Introduction
Probabilistic Context-Free Grammars (PCFG)
The Algorithm based on Iterative Biclustering (PCFG-BCL)
Experimental results
5
Motivation
Probabilistic context-free grammars (PCFGs) find applications in many areas, including:
  Natural Language Processing
  Bioinformatics
It is important to learn PCFGs from data (a training corpus)
A labeled corpus is not always available
Hence the need for unsupervised learning
6
Task
Unsupervised learning of a PCFG from a positive corpus
Corpus:
  a square is above the triangle
  the square rolls
  a triangle rolls
  the square rolls
  a triangle is above the square
  a circle touches a square
  the triangle covers the circle
  ……
Grammar:
  S → NP VP
  NP → Det N
  VP → Vt NP (0.3) | Vi PP (0.2) | rolls (0.2) | bounces (0.1)
  ……
7
Outline
Introduction
Probabilistic Context-Free Grammars (PCFG)
The Algorithm based on Iterative Biclustering (PCFG-BCL)
Experimental results
8
PCFG
Context-free grammar (CFG): G = (N, Σ, R, S)
  N: non-terminals
  Σ: terminals
  R: rules
  S ∈ N: the start symbol
Probabilistic CFG: probabilities on grammar rules
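A minimal sketch of how such a grammar might be held in memory, assuming Python and a hypothetical dictionary layout. The names `RULES`, `is_normalized`, and `sample` and the toy probabilities are illustrative assumptions, not taken from the talk. Each non-terminal maps to its alternative right-hand sides, and the probabilities of the alternatives for one non-terminal sum to 1.

```python
import random

# Hypothetical toy grammar: non-terminal -> list of (right-hand side, probability).
RULES = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("Det", "N"), 1.0)],
    "VP": [(("rolls",), 0.5), (("bounces",), 0.5)],
    "Det": [(("a",), 0.5), (("the",), 0.5)],
    "N": [(("circle",), 0.5), (("square",), 0.5)],
}

def is_normalized(rules, tol=1e-9):
    """Check that the rule probabilities of every non-terminal sum to 1."""
    return all(abs(sum(p for _, p in alts) - 1.0) <= tol for alts in rules.values())

def sample(symbol, rules, rng):
    """Expand `symbol` top-down and return the generated list of terminals."""
    if symbol not in rules:  # anything without rules is treated as a terminal
        return [symbol]
    options = [rhs for rhs, _ in rules[symbol]]
    weights = [p for _, p in rules[symbol]]
    rhs = rng.choices(options, weights=weights, k=1)[0]
    words = []
    for child in rhs:
        words.extend(sample(child, rules, rng))
    return words
```

Calling `sample("S", RULES, random.Random(0))` draws one three-word sentence from the toy grammar.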
9
P-CNF
Probabilistic Chomsky normal form (P-CNF)
Two types of rules:
  A → BC
  A → a
10
The AND-OR form
P-CNF in the AND-OR form
Two types of non-terminals: AND, OR
  AND → OR1 OR2
  OR → A1 | A2 | a1 | a2 | …… (with probabilities)
11
The AND-OR form
P-CNF in the AND-OR form
12
The AND-OR form
P-CNF in the AND-OR form can be divided into two parts
Start rules: S → …
A set of AND-OR groups
  Each group: AND → OR1 OR2
  Bijection between ANDs and groups
  An OR may appear in multiple groups
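To make the shape of one AND-OR group concrete, here is a small sketch under assumed names and numbers (`expand_group`, `or1`, `or2` are hypothetical). It expands AND → OR1 OR2 into the symbol pairs the group can generate. Because the two ORs are expanded independently, each pair's probability is the product of its two rule probabilities.

```python
from itertools import product

def expand_group(or1, or2):
    """Return the probability of each pair (x, y) generated by AND -> OR1 OR2.

    `or1` and `or2` map each alternative of an OR symbol to its rule probability.
    """
    return {(x, y): p * q for (x, p), (y, q) in product(or1.items(), or2.items())}

# Illustrative probabilities only.
or1 = {"a": 0.4, "the": 0.6}
or2 = {"circle": 0.5, "square": 0.5}
pairs = expand_group(or1, or2)
```

The resulting table of pair probabilities has rank one: every row is a constant multiple of every other row.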
13
The AND-OR form
P-CNF in the AND-OR form can be divided into two parts
14
Outline
Introduction
Probabilistic Context-Free Grammars (PCFG)
The Algorithm based on Iterative Biclustering (PCFG-BCL)
Experimental results
15
PCFG-BCL: Outline
Start with only the terminals
Repeat the two steps
  Learn a new AND-OR group by biclustering
  Attach the new AND to existing ORs
Post-processing: add start rules
In principle, these steps are sufficient for learning any CNF grammar
16
PCFG-BCL: Outline
Find new rules that yield the greatest increase in the posterior of the grammar given the corpus
  Local search, with the posterior as the objective function
  Use a prior that favors simpler grammars to avoid overfitting
17
PCFG-BCL
Repeat the two steps
  Learn a new AND-OR group by biclustering
  Attach the new AND to existing ORs
Post-processing: add start rules
18
Intuition
Construct a table T
  Index the rows and columns by symbols appearing in the corpus
  The cell at row x and column y records the number of times the pair xy appears in the corpus
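A minimal sketch of building the table T, assuming Python and that "the pair xy" means two adjacent symbols in a sentence. The function name `pair_table` and the sparse `Counter` representation are illustrative choices, not the authors' implementation.

```python
from collections import Counter

def pair_table(corpus):
    """Count adjacent symbol pairs (x, y) over all sentences.

    `corpus` is a list of sentences, each a list of symbols. The result is a
    sparse table: missing keys count as zero.
    """
    table = Counter()
    for sentence in corpus:
        for x, y in zip(sentence, sentence[1:]):
            table[(x, y)] += 1
    return table

corpus = [s.split() for s in [
    "the square rolls",
    "a triangle rolls",
    "the circle touches a square",
]]
T = pair_table(corpus)
```

Here `T[("the", "square")]` is the cell at row "the" and column "square".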
19
An AND-OR group corresponds to a bicluster
20
The bicluster is multiplicatively coherent for any two rows i, j and two columns k, l
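The slide's formula did not survive extraction. A common definition of multiplicative coherence, assumed here, is that a_ik / a_jk = a_il / a_jl for any two rows i, j and two columns k, l. The sketch below checks the equivalent cross-product form a_ik * a_jl = a_il * a_jk, which avoids division by zero. The function name and the optional `rel_tol` parameter are hypothetical. Real counts are noisy, so a tolerance (or a likelihood-based score) would be needed in practice.

```python
from itertools import combinations

def is_multiplicatively_coherent(matrix, rel_tol=0.0):
    """Check a[i][k] * a[j][l] == a[i][l] * a[j][k] for all row and column pairs.

    `matrix` is a list of equal-length rows. With rel_tol > 0 the two products
    may differ by that fraction of the larger one.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for i, j in combinations(range(n_rows), 2):
        for k, l in combinations(range(n_cols), 2):
            lhs = matrix[i][k] * matrix[j][l]
            rhs = matrix[i][l] * matrix[j][k]
            if abs(lhs - rhs) > rel_tol * max(abs(lhs), abs(rhs)):
                return False
    return True
```

For example, `[[2, 4], [3, 6]]` passes because the second row is 1.5 times the first, while `[[2, 4], [3, 7]]` fails the exact check.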
21
Expression-context matrix of a bicluster
Each row: a symbol pair contained in the bicluster
Each column: a context in which the symbol pairs appear in the corpus
It’s also multiplicatively coherent.
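A sketch of collecting the entries of an expression-context matrix. The slide does not define "context", so this assumes it is the pair of symbols immediately to the left and right of the expression, with a hypothetical boundary marker `#` at sentence edges. The function name and the sparse layout are likewise assumptions.

```python
from collections import Counter

def expression_context_counts(corpus, pairs, boundary="#"):
    """Count how often each symbol pair in `pairs` occurs in each context.

    Assumed context: (symbol before the pair, symbol after the pair).
    Returns a sparse matrix keyed by (pair, context).
    """
    counts = Counter()
    for sentence in corpus:
        padded = [boundary] + list(sentence) + [boundary]
        # i and i + 1 index the pair; i - 1 and i + 2 index its context.
        for i in range(1, len(padded) - 2):
            pair = (padded[i], padded[i + 1])
            if pair in pairs:
                context = (padded[i - 1], padded[i + 2])
                counts[(pair, context)] += 1
    return counts

corpus = [
    ["the", "circle", "rolls"],
    ["a", "square", "touches", "the", "circle"],
]
ec = expression_context_counts(corpus, {("the", "circle"), ("a", "square")})
```

Rows of the matrix correspond to the distinct pairs and columns to the distinct contexts among the keys of `ec`.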
22
Intuition
If there is a bicluster that is multiplicatively coherent and has a multiplicatively coherent expression-context matrix, then an AND-OR group can be learned from the bicluster
23
Probabilistic Justification
Change in likelihood as a result of adding an AND-OR group to a PCFG
  Bicluster multiplicative coherence
  Expression-context matrix multiplicative coherence
24
Prior
To prevent overfitting, use a prior that favors simpler grammars
  $P(G) \propto 2^{-\mathrm{DL}(G)}$
  DL(G) is the description length of the grammar
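A short worked derivation of how such a prior enters the search objective, assuming the prior has the form $P(G) \propto 2^{-\mathrm{DL}(G)}$ and writing $D$ for the corpus, $G$ for the current grammar, and $G'$ for the grammar after a candidate change. These symbols are introduced here for illustration. By Bayes' rule, the normalizing constant of the prior and $P(D)$ cancel in the ratio:

```latex
\begin{align*}
\log_2 P(G \mid D) &= \log_2 P(D \mid G) + \log_2 P(G) - \log_2 P(D) \\
\log_2 \frac{P(G' \mid D)}{P(G \mid D)}
  &= \underbrace{\log_2 \frac{P(D \mid G')}{P(D \mid G)}}_{\text{likelihood gain}}
   - \underbrace{\bigl(\mathrm{DL}(G') - \mathrm{DL}(G)\bigr)}_{\text{growth in description length}}
\end{align*}
```

Under this reading, a change is worth making only when its log-likelihood gain (in bits) exceeds the number of bits it adds to the grammar's description.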
25
Learning a new AND-OR group by biclustering
Find in the table T a bicluster that leads to the maximal posterior gain
Create a new AND-OR group from the bicluster
Reduce the corpus using the new rules
  E.g., “the circle” is rewritten to the new AND symbol
Update T
  A new row and column are added for the new AND symbol
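The corpus-reduction step can be sketched as follows, assuming a simple left-to-right, non-overlapping rewrite. The talk does not specify the scan order, and the names `reduce_corpus` and `AND1` are hypothetical. The bicluster is given by its row symbols and column symbols, and any adjacent pair (x, y) with x among the rows and y among the columns is replaced by the new AND symbol.

```python
def reduce_corpus(corpus, rows, cols, and_symbol):
    """Rewrite every adjacent pair (x, y), x in rows and y in cols, to and_symbol.

    Scans each sentence left to right and never reuses a symbol in two pairs.
    """
    reduced = []
    for sentence in corpus:
        out = []
        i = 0
        while i < len(sentence):
            if (i + 1 < len(sentence)
                    and sentence[i] in rows
                    and sentence[i + 1] in cols):
                out.append(and_symbol)
                i += 2  # consume both symbols of the pair
            else:
                out.append(sentence[i])
                i += 1
        reduced.append(out)
    return reduced

corpus = [
    ["the", "circle", "rolls"],
    ["a", "square", "touches", "the", "circle"],
]
reduced = reduce_corpus(corpus, {"a", "the"}, {"circle", "square"}, "AND1")
```

After the rewrite, recounting adjacent pairs over `reduced` yields the updated table, including the new row and column for `AND1`.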
26
PCFG-BCL
Repeat the two steps
  Learn a new AND-OR group by biclustering
  Attach the new AND to existing ORs
Post-processing: add start rules
27
Attaching the new AND under existing ORs
For the new AND symbol N, there may exist OR symbols O in the learned grammar such that O → N is in the target grammar
Such rules can't be learned in the biclustering step
  When learning O, N doesn’t exist
  When learning N, only N → AB is learned
We need an additional step to find such rules
Recursion is learned in this step
28
Intuition
Adding rule O → N = adding a new row/column to the bicluster
If O → N holds, then
  the expanded bicluster is multiplicatively coherent
  the expanded expression-context matrix is multiplicatively coherent
If we find an OR symbol such that the expanded bicluster has this property, then a new rule O → N can be added to the grammar
29
Probabilistic Justification
Likelihood gain is an approximation of the expanded bicluster
To prevent overfitting, the prior is also considered
30
Attaching the new AND under existing ORs
Try to find OR symbols that lead to large posterior gain
When found:
  add the new rule O → N to the grammar
  do a maximal reduction of the corpus
  update the table T
31
PCFG-BCL
Repeat the two steps
  Learn a new AND-OR group by biclustering
  Attach the new AND to existing ORs
Post-processing: add start rules
32
Postprocessing
For each sentence in the corpus:
  If it’s fully reduced to a single symbol x, then add S → x
  If not, a few options…
Return the grammar
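A sketch of the first branch only (fully reduced sentences). The slide leaves the handling of partially reduced sentences open, so they are simply skipped here. Setting each start rule's probability to its relative frequency is an assumption, and `start_rules` is a hypothetical name.

```python
from collections import Counter

def start_rules(reduced_corpus, start="S"):
    """Create S -> x for every sentence reduced to a single symbol x.

    Returns {(start, x): probability}, with probabilities estimated as
    relative frequencies among the fully reduced sentences (an assumption).
    Sentences that are not fully reduced are ignored in this sketch.
    """
    heads = Counter(s[0] for s in reduced_corpus if len(s) == 1)
    total = sum(heads.values())
    if total == 0:
        return {}
    return {(start, x): n / total for x, n in heads.items()}

rules = start_rules([["AND3"], ["AND3"], ["AND5"], ["AND1", "rolls"]])
```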
33
Outline
Introduction
Probabilistic Context-Free Grammars (PCFG)
The Algorithm based on Iterative Biclustering (PCFG-BCL)
Experimental results
34
Experiments
Measurements
  weak generative capacity
  precision, recall, F-score
Test data
  artificial, English-like CFGs
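A sketch of how these measurements are commonly computed for weak generative capacity, assuming the usual convention: precision is the fraction of sentences sampled from the learned grammar that the target grammar accepts, and recall is the fraction of sentences sampled from the target grammar that the learned grammar accepts. The talk does not spell out its exact protocol, and the function names and the `accepts` callback are hypothetical.

```python
def acceptance_rate(sentences, accepts):
    """Fraction of `sentences` for which the predicate `accepts` returns True."""
    sentences = list(sentences)
    if not sentences:
        raise ValueError("need at least one sentence")
    return sum(1 for s in sentences if accepts(s)) / len(sentences)

def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With a parser for each grammar supplying `accepts`, precision would be `acceptance_rate(samples_from_learned, target_accepts)` and recall `acceptance_rate(samples_from_target, learned_accepts)`.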
35
Experiment results
P = Precision, R = Recall, F = F-score
Number in parentheses: standard deviation
PCFG-BCL outperforms EMILE [Adriaans et al., 2000] and ADIOS [Solan et al., 2005], with lower standard deviations
36
Summary
An unsupervised PCFG-learning algorithm
  It acquires new grammar rules by iterative biclustering on a table of symbol pairs
  In each step it tries to maximize the increase of the posterior of the grammar
  Competitive experimental results
37
Work in progress
Alternative strategies for optimizing the objective function
Evaluation on and adaptation to real-world applications (e.g., natural language), with respect to both weak and strong generative capacity
38
Thank you
39
Backup…
40
Step 1
Posterior gain:
  Likelihood gain
    Bicluster multiplicative coherence
    E-C matrix multiplicative coherence
  Prior gain (bias towards large biclusters)
41
Step 2 Intuition
Remember that O is learned by extracting a bicluster
Adding rule O → N = adding a new row/column to the bicluster
42
Expanding the bicluster
The expanded bicluster should still be multiplicatively coherent
43
Step 2 Intuition
Expression-context matrix
Adding rule O → N = adding a set of new rows to the E-C matrix
44
Expanding the expression-context matrix
The expanded expression-context matrix should still be multiplicatively coherent.
45
Step 2
Likelihood gain: uses the expected numbers of appearances of the symbol pairs obtained by applying the current grammar to expand the current partially reduced corpus.
46
Grammar selection/averaging
Run the algorithm multiple times to get multiple grammars
Use the posterior of the grammars to do model selection/averaging
Experimental results:
  Improved the performance
  Decreased the standard deviations
47
Time Complexity
N: number of ANDs
k: average number of rules headed by an OR
c: average number of columns of the expression-context matrix
h: average number of ORs that produce an AND or terminal
d: a recursion depth limit
ω: number of sentences in the corpus
m: average sentence length
48
Biclustering vs. distributional clustering
V1 → makes | likes
V2 → likes | is
Figure from [Adriaans et al., 2000]
49
Biclustering vs. substitutability heuristic
N1 → tea | coffee
N2 → eating
Figure from [Adriaans et al., 2000]
51
A set of multiplicatively coherent biclusters, which represent a set of AND-OR groups in the grammar.
52
Related work
Unsupervised CFG learning
  EMILE [Adriaans et al., 2000]
  ABL [Zaanen, 2000]
  [Clark, 2001; 2007]
  ADIOS [Solan et al., 2005]
Main difference
  Distributional biclustering
  A unified method for different types of rules
53
Related work
Unsupervised PCFG learning
  Inside-outside
  [Stolcke & Omohundro, 1994]
  [Chen, 1995]
  [Kurihara & Sato, 2004; 2006]
  [Liang et al., 2007]
Main difference
  Different prior
  Structure search method
54
Related work
Unsupervised parsing (not CFG)
  [Klein & Manning, 2002; 2004]
  U-DOP [Bod, 2006]