Slide 1: THE LEARNING AND USE OF GRAPHICAL MODELS FOR IMAGE INTERPRETATION
Thesis for the degree of Master of Science
By Leonid Karlinsky
Under the supervision of Professor Shimon Ullman
Slide 2: Introduction
Slide 4: Part I: MaxMI Training
Slide 5: Classification
Goal: classify C on new examples, with minimum error, using a subset F of "trained" features.
Training tasks: find the best F, the best parameters, and an efficient model. Here "best" = maximal MI. More…
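As a minimal formalization of the "best = maximal MI" criterion above (C and F are from the slide; the explicit objective is a standard reconstruction, not copied from the thesis):

```latex
% Pick the feature subset F carrying maximal mutual information about C:
F^{*} \;=\; \arg\max_{F \subseteq \mathcal{F}} I(C;F),
\qquad
I(C;F) \;=\; \sum_{c,\,f} P(c,f)\,\log\frac{P(c,f)}{P(c)\,P(f)} .
```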
Slide 6: MaxMI Training – The Past
Model: a simple "flat" structure with NCC thresholds.
Training: features and thresholds selected one by one.
Conditional independence given C; an increased MI upper bound. More…
Slide 7: MaxMI Training – Our Approach
Learn the model and all its parameters together, maximizing the MI objective.
Slide 8: MaxMI Training – Learning MaxMI
Decompose the MI; efficiently learn the parameters using GDL; maximize over all of them together. More…
Slide 9: MaxMI Training – Assumptions
1. TAN model structure – Tree Augmented Naïve Bayes [Friedman, 97].
2. Feature Tree (FT) – C can be removed while preserving the feature tree.
Slide 10: MaxMI Training – TAN and Parameters
1. The TAN structure is unknown.
2. Learn the parameters and the TAN structure so that the MI is maximized.
Properties: asymptotic correctness, FT holds, efficiency.
Slide 11: MaxMI Training – The MaxMI Hybrid
Slide 12: More…
[Chow & Liu, 68]. MaxMI: [Friedman, 97].
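The [Chow & Liu, 68] step referenced here finds the best feature tree as a maximum-weight spanning tree over pairwise mutual information ([Friedman, 97] uses the same construction with class-conditional MI to get the best TAN). A minimal sketch of the Chow–Liu step; the helper names and the use of NumPy/SciPy are my assumptions:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def pairwise_mi(samples, n_values):
    """Empirical mutual information I(F_i; F_j) for every feature pair."""
    n, d = samples.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            joint = np.zeros((n_values, n_values))
            for s in samples:                      # count co-occurrences
                joint[s[i], s[j]] += 1
            joint /= n
            pi, pj = joint.sum(1), joint.sum(0)    # marginals
            nz = joint > 0
            mi[i, j] = mi[j, i] = (
                joint[nz] * np.log(joint[nz] / np.outer(pi, pj)[nz])).sum()
    return mi

def chow_liu_tree(samples, n_values):
    """Best feature tree = max-weight spanning tree over pairwise MI."""
    mi = pairwise_mi(samples, n_values)
    mst = minimum_spanning_tree(-mi)   # negate: SciPy builds a *minimum* tree
    rows, cols = mst.nonzero()
    return list(zip(rows, cols))       # edges of the feature tree

# toy usage: 200 samples of 4 correlated ternary features
rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=(200, 1))
samples = np.hstack([x, (x + rng.integers(0, 2, (200, 3))) % 3])
print(chow_liu_tree(samples, 3))
```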
Slide 13: MaxMI Training – The MaxMI Hybrid
A convergent algorithm (alternating updates with TAN restructure). More…
Slide 14: MaxMI Training – Empirical Results (figures). More…
Slide 15: MaxMI Training – Empirical Results (figures). More…
Slide 16: MaxMI Training – Generalizations
Train any parameters; handle any low-treewidth structure; applicable even without the assumptions.
Slide 17: Back to the Goals
Slide 18: Part II: Loopy MAP Approximation
Slide 19: Loopy Network Example
We want to solve MAP: NP-hard in general! [Cooper 90, Shimony 94]
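The problem in question, in its standard form (the slide's own formula is an image, so this is a reconstruction):

```latex
% MAP assignment over the network variables x, given evidence e:
\mathbf{x}^{*} \;=\; \arg\max_{\mathbf{x}} P(\mathbf{x} \mid e) .
```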
Slide 20: Our Approach – Opening Loops
Open the loops by duplicating variables; now we can maximize over the resulting loop-free network. The assignment is legal for the loopy problem if the duplicated copies agree.
Slide 21: Our Approach – Opening Loops
Legally maximizing is hard; we can maximize unrestricted, but the result is usually not legal. Our solution – slow connections.
Slide 22: Our Approach – Slow Connections
Maximize (loop-free, use GDL) with the slow variables fixed: z = Z. Then legalize and return to step one. Iterate until convergence. This is the Maximize-and-Legalize algorithm.
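A toy sketch of the Maximize-and-Legalize loop on a 3-variable cycle: the loop is opened by duplicating x0 into a slow copy z, the resulting loop-free chain is maximized exactly with z clamped, and z is then legalized to the new value of x0. The graph, the potentials, and the brute-force maximization (standing in for a GDL pass, which is exact on the opened tree) are all illustrative assumptions:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
K = 3                                    # values per variable
# pairwise potentials on the 3-cycle (0,1), (1,2), (2,0)
psi = {e: rng.random((K, K)) for e in [(0, 1), (1, 2), (2, 0)]}

def maximize_with_slow_copy(z):
    """Exact MAP of the opened chain, with the duplicated copy of x0
    clamped to the slow value z (brute force stands in for GDL)."""
    best, best_x = -np.inf, None
    for x in itertools.product(range(K), repeat=3):
        score = (np.log(psi[(0, 1)][x[0], x[1]]) +
                 np.log(psi[(1, 2)][x[1], x[2]]) +
                 np.log(psi[(2, 0)][x[2], z]))    # loop edge sees the copy z
        if score > best:
            best, best_x = score, x
    return best_x

z = 0                       # initial slow value
for step in range(20):      # Maximize-and-Legalize until convergence
    x = maximize_with_slow_copy(z)
    if x[0] == z:           # legal: the copy agrees with the original x0
        break
    z = x[0]                # legalize and return to the maximize step
print("assignment:", x, "legal:", x[0] == z)
```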
Slide 23: Our Approach – Slow Connections
When will this work? The intuition: the z-minor.
Strong z-minor: global maximum, in a single step.
Weak z-minor: local optimum, in several steps.
Slide 24: Making the Assumptions True – Selecting z-Variables
The intuition: recursive z-selection.
Recursive strong z-minor: single step, global maximum!
Recursive weak z-minor: iterations, local maximum.
Different / same speed. The Remove – Contract – Split algorithm. More…
Slide 25: Making the Assumptions True – Approximating the Function
The intuition: recursively "chip away" small parts of the function. More…
Slide 26: Existing Approximation Algorithms
Clustering: triangulation [Pearl, 88].
Loopy Belief Revision [McEliece, 98].
Bethe-Kikuchi Free-Energy: CCCP [Yuille, 02].
Tree Re-Parametrization (TRP) [Wainwright, 03].
Slide 27: Experimental Results (figures). More…
Slide 28: Experimental Results (figures). More…
Slide 29: More… Maximum MI vs. Minimum P_E. More…
Slide 32: Classification Specifics
How do we classify a new example? MAP.
What are "the best" features and parameters? Maximize MI.
Why maximize MI? Tightly related to P_E. More reasons – if time permits. Back…
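The two expressions referred to on the slide are images; a standard reconstruction (θ denoting the trained parameters is my notation):

```latex
% Classify a new example f by MAP:
\text{MAP: } \hat{c} \;=\; \arg\max_{c} P(c \mid f)
\qquad
% Train by picking the features and parameters of maximal MI:
\text{Maximize MI: } (F^{*}, \theta^{*}) \;=\; \arg\max_{F,\,\theta} I(C;F) .
```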
Slide 33: MaxMI Training – The Past – Reasons
Why did it work? Conditional independence given C yields an increased MI upper bound. But conditional independence given C was assumed!
What was missing? Maximizing the "whole" MI, and learning the model structure. Back…
Slide 34: MaxMI Training – JT
The junction tree (JT) structure = the TAN structure. GDL is exponential in the treewidth; here the treewidth = 2. Back…
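To illustrate the treewidth dependence: a GDL/max-product pass on a treewidth-1 chain costs O(nK^2), and on a treewidth-w junction tree the K^2 factor becomes K^(w+1). A minimal Viterbi-style sketch with assumed random potentials:

```python
import numpy as np

def chain_max_product(phi, psi):
    """Exact MAP on a chain x1-..-xn by max-product (an instance of GDL).
    phi: n unary potentials of shape (K,); psi: n-1 pairwise, shape (K, K).
    Each message costs K^2; treewidth w would raise this to K**(w+1)."""
    n, K = len(phi), len(phi[0])
    msg = np.zeros((n, K))            # rightward log-messages
    back = np.zeros((n, K), int)      # argmax backpointers
    for i in range(1, n):
        scores = (msg[i-1] + np.log(phi[i-1]))[:, None] + np.log(psi[i-1])
        msg[i] = scores.max(0)
        back[i] = scores.argmax(0)
    x = np.zeros(n, int)
    x[-1] = (msg[-1] + np.log(phi[-1])).argmax()
    for i in range(n - 1, 0, -1):     # backtrack the maximizing assignment
        x[i-1] = back[i, x[i]]
    return x

K, n = 4, 5
rng = np.random.default_rng(2)
phi = [rng.random(K) for _ in range(n)]
psi = [rng.random((K, K)) for _ in range(n - 1)]
print(chain_max_product(phi, psi))
```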
Slide 35: MaxMI Training – EM
Why not train the CPTs with EM? EM assumes static training data [Redner, Walker, 84], and that is not true in our scenario! Back…
Slide 36: MaxMI Training – The MaxMI Hybrid Solution
[Chow, Liu 68]: "best" feature tree.
[Friedman et al., 97]: "best" TAN.
[We, 2004]: maximal MI. Back…
Slide 37: MaxMI Training – The MaxMI Hybrid Solution
Increase: ICR. Non-decrease: TAN. Asymptotic correctness. Back…
Slide 38: MaxMI Training – The MaxMI Hybrid. Back…
Slide 39: MaxMI Training – Empirical Results
Before training / after training (figures). Back…
Slide 40: MaxMI Training – Empirical Results (figures). Back…
Slide 41: MaxMI Training – Empirical Results
Face Parts Model (training DB size: 767, test DB size: 2257; class entropy on the training DB: 0.792690834):

| Training method | Error on training DB | Error on test DB | MI, model to class (training DB) |
| MaxMI Training | 25 | 135 | 0.758242464 |
| Original Training | 35 | 136 | 0.722429352 |
| MaxMI Training with constrained TAN restructure | Miss=15, FA=3 | Miss=62, FA=36 | 0.756855168 |
| MaxMI Training with greedy TAN restructure | Miss=16, FA=3 | Miss=30, FA=44 | 0.746516913 |
| Alternative MaxMI Training with TAN restructure | N/A | Miss=33, FA=109 | 0.74711484 |
| Threshold-only training (without restructure) | Miss=30, FA=5 | Miss=84, FA=46 | 0.738676981 |
| Observed & un-observed model training, constructed from the all-observed model and soft EM | N/A | 67 | N/A |

Back…
Slide 42: MaxMI Training – Empirical Results
Cow Parts Model (training DB size: 612, test DB size: 256; class entropy on the training DB: 0.465356639; MI to class: N/A):

| Training method | Error on training DB | Error on test DB |
| Original Training | Miss=36, FA=16 | Miss=84, FA=64 |
| MaxMI Training | Miss=25, FA=17 | Miss=53, FA=42 |
| MaxMI Training with constrained TAN restructure | Miss=17, FA=12 | Miss=32, FA=48 |
| MaxMI Training with greedy TAN restructure | Miss=23, FA=16 | Miss=59, FA=30 |
| Observed & un-observed model training, constructed from the all-observed model and trained using soft EM | N/A | 89 |

Back…
Slide 43: Remove – Contract – Split. Back…
Slide 44: Making the Assumptions True – Approximating the Function
Strong z-minor – Challenge: selecting proper Z constants. Benefit: single-step convergence.
Weak z-minor – Drawback: exponential in the number of "chips". Benefit: less restrictive. Back…
Slide 45: The Clique Tree. Back…
Slide 46: Experimental Results

A2 (same "slow" speed):
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | 50.31% | 15-16 | 94.11% |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 63.70% | 11-12 | 94.55% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 84.60% | 4-5 | 97.16% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | 93.62% | 1-2 | 98.34% |

A2 (different "slow" speed):
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | 65.22% | 10-11 | 98.26% |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 74.51% | 7-8 | 98.08% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 88.62% | 3-4 | 98.55% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | 86.14% | 3-4 | 97.85% |
Slide 47: Experimental Results

Random Slow Connections:
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | 34.58% | 20-21 | 82.70% |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 45.48% | 16-17 | 81.52% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 62.23% | 11-12 | 79.37% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | N/A | N/A | N/A |

Loopy Belief Revision (50 messages per node):
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | N/A | N/A | N/A |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 55.31% | 13-14 | 89.17% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 72.80% | 8-9 | 88.73% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | 87.73% | 3-4 | 93.34% |
Slide 48: Experimental Results

Loopy Belief Revision (10 messages per node):
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | 41.95% | 17-18 | 87.65% |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 54.02% | 14-15 | 86.74% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 71.80% | 8-9 | 85.78% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | N/A | N/A | N/A |

Ignore Sibling Loopy Links:
| Sample count | Value count | Node count | Model size | Avg. match (%) | Avg. mismatch | Avg. approximation |
| 1000 | 4 | 31 | Depth=3, Branching=5 | 29.25% | 21-22 | 74.04% |
| 1000 | 3 | 31 | Depth=3, Branching=5 | 38.56% | 19-20 | 71.89% |
| 1000 | 2 | 31 | Depth=3, Branching=5 | 56.09% | 13-14 | 69.38% |
| ~2000 | 2 | 25 | Natural feature trees, 4 cliques of size 7 | 63.88% | 9-10 | 73.45% |

Back…
Slide 50: MaxMI Training – Extensions
Observed and unobserved model: MaxMI augmented to support O&U; training observed-only + an EM heuristic.
Complete training: constrained and greedy TAN restructure.
MaxMI vs. MinP_E in the ideal scenario – characterization and comparison.
Future research directions.
Slide 51: MaxMI vs. MinP_E
MinP_E: minimize the probability of classification error P_E. MaxMI: maximize I(C;F). The two are linked by the Fano and inverse Fano inequalities (binary C). Back…
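The standard Fano bound behind the "tightly related to P_E" claim (a reconstruction; the slide's formulas are images, and H_b denotes binary entropy):

```latex
% Fano's inequality, for any estimator of C from F:
H(C \mid F) \;\le\; H_b(P_E) + P_E \log\bigl(|\mathcal{C}|-1\bigr),
% and for binary C the log term vanishes, so
I(C;F) \;=\; H(C) - H(C \mid F) \;\ge\; H(C) - H_b(P_E),
% i.e., raising I(C;F) pushes down the Fano lower bound on P_E.
```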
Slide 52: MaxMI vs. MinP_E – Ideal Scenario
Setting: n-valued C, k-valued F.
MinP_E – Arrange: … Select F: …
MaxMI – Divide: … Select F: …
Back…
Slide 53: MaxMI vs. MinP_E – Ideal Scenario
In general, MaxMI and MinP_E differ; in special cases they coincide.
With an increase in the number of guesses: … Implications: … Back…