Published by Bonnie Hall. Modified over 9 years ago.
1
Building a Visual Hierarchy
Andrew Smith, University of California, San Diego
10.14.09
© 2009 Robert Hecht-Nielsen. All rights reserved.
2
Outline
Building a visual hierarchy
Learning layer by layer
Inference: filling in a missing segment of an image
Examples
Applications/products and future work
3
Choosing an appropriate problem
We want to:
– Model human visual processes.
– Understand vision in terms of Confabulation Theory.
– Build practical applications.
– Lay the basis for much deeper research.
Answer: build an image modeling system.
– Represent images in terms of textural components (low statistical order).
– Represent images as symbolic (discrete) tuples.
4
Machine Vision vs. Biological Vision
Machine vision:
– Pixels: a local representation.
– Orthogonal.
Biological vision:
– Filter/feature responses.
– Massively overcomplete, non-orthogonal.
5
Confabulation & vision (Pixels → Modules & Symbols)
Features (symbols) develop in a layer of the hierarchy as commonly seen patterns among their inputs.
Knowledge links are simple conditional probabilities between symbols: $p(\psi \mid \lambda)$, where $\psi$ and $\lambda$ are symbols in connected modules.
All knowledge can therefore be learned by simple co-occurrence counting: $p(\psi \mid \lambda) = C(\psi, \lambda) / C(\lambda)$.
Confabulation operation: given evidence $\alpha, \beta, \gamma, \delta$, find the answer $\varepsilon$ that maximizes $p(\alpha \mid \varepsilon)\, p(\beta \mid \varepsilon)\, p(\gamma \mid \varepsilon)\, p(\delta \mid \varepsilon)$.
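The counting rule and the maximization on this slide can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function names, the toy symbols "edge" and "flat", and the `floor` probability for unseen pairs are all assumptions made for the example.

```python
from collections import Counter
from math import log


def learn_link(observations):
    """Estimate p(source | target) = C(source, target) / C(target) for one knowledge link.

    observations: list of (source_symbol, target_symbol) co-occurrences.
    """
    pair_counts = Counter(observations)
    target_counts = Counter(target for _, target in observations)
    return {
        (source, target): count / target_counts[target]
        for (source, target), count in pair_counts.items()
    }


def confabulate(evidence, candidates, floor=1e-9):
    """Pick the candidate that maximizes the product of p(evidence_symbol | candidate).

    evidence: list of (link_table, evidence_symbol), one entry per knowledge link.
    Logs are summed instead of multiplying; unseen pairs get the assumed `floor`.
    """
    def score(candidate):
        return sum(log(link.get((symbol, candidate), floor)) for link, symbol in evidence)

    return max(candidates, key=score)


# Hypothetical toy data: (neighbor symbol, center symbol) pairs for two links.
left_link = learn_link([("edge", "edge"), ("edge", "edge"), ("flat", "flat"), ("edge", "flat")])
above_link = learn_link([("flat", "edge"), ("edge", "edge"), ("flat", "flat"), ("flat", "flat")])

answer = confabulate([(left_link, "edge"), (above_link, "edge")], ["edge", "flat"])
print(answer)
```

Summing logarithms gives the same winner as multiplying the probabilities and avoids underflow when many links contribute evidence.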
6
Building a vision hierarchy
Can no longer use SSE to evaluate the model [SSE maximizes p( | , , )].
Instead, make use of the generative model:
– Always be able to generate a plausible image.
7
Data set
4,300 natural images (1.5 Mpix each, BW)
8
Vision Hierarchy – level “0”
We know the first transformation from neuroscience research: simple cells approximate Gabor filters.
5 scales, 16 orientations (odd + even).
Parameters picked to closely resemble feline simple cells.
The same approach is used elsewhere in the lab [Minnett, et al.].
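A filter bank of this shape might be generated as follows. This is only a sketch under stated assumptions: the slide does not give the wavelengths, envelope widths, or kernel sizes, so the values below are placeholders, and "16 orientations (odd + even)" is read here as 8 orientation angles times 2 phases, which matches the 16 dimensions per scale mentioned on a later slide. The other reading (16 angles, each with an odd and an even filter) is equally possible.

```python
import math


def gabor_kernel(wavelength, theta, phase, sigma):
    """Return a square 2-D Gabor kernel as a list of rows.

    phase = 0 gives the even (cosine) filter, phase = -pi/2 the odd (sine) filter.
    """
    half = math.ceil(2 * sigma)  # truncate the Gaussian envelope at 2 sigma (assumed)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            along = x * math.cos(theta) + y * math.sin(theta)  # coordinate along the wave
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * along / wavelength + phase))
        kernel.append(row)
    return kernel


def gabor_bank(wavelengths, n_orientations=8):
    """Build {(scale, orientation, parity): kernel} for every combination."""
    bank = {}
    for scale, wavelength in enumerate(wavelengths):
        for k in range(n_orientations):
            theta = math.pi * k / n_orientations
            for parity, phase in (("even", 0.0), ("odd", -math.pi / 2)):
                # sigma = half a wavelength is an assumed envelope width.
                bank[(scale, k, parity)] = gabor_kernel(wavelength, theta, phase, 0.5 * wavelength)
    return bank


# Assumed wavelengths in pixels; the slides do not state the real parameters.
bank = gabor_bank([4.0, 6.0, 9.0, 13.5, 20.25])
print(len(bank))
```

With 5 scales, 8 angles, and 2 phases the bank holds 80 kernels, 16 per scale.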
9
Vision Hierarchy – level “0”
Does the full convolution preserve the information in images (inverted by least squares, LS)?
Very closely.
10
Vision Hierarchy – level “0”
We can do even better by super-sampling an image before encoding:
11
Vision Hierarchy – level “0”
Supersampling RMSE:
1x: 0.0202
2x: 0.0081
3x: 0.0051
4x: 0.0044
5x: 0.0038
12
Inverting Gabor Representations
Studied by Daugman.
Simple cells (found in the 1950s) re-represent “pixel” data; they were first characterized as Gabor logons by Daugman in the 1980s.
Attempted to answer “How much information is lost?”
– “Not much!” Able to completely reconstruct images (i.e. what we've just seen in the previous few slides).
Frame analysis can show:
– When complete inversion is possible (this can be proven mathematically).
– The optimal linear inverse.
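The frame-analysis statement can be written compactly in standard frame-theory terms. The notation is an assumption introduced here, not taken from the slides: $F$ is the linear analysis operator that maps an image $x$ to its filter responses $c = Fx$, and $A$, $B$ are the frame bounds.

```latex
\begin{align}
  A\,\|x\|^2 \;\le\; \|Fx\|^2 \;\le\; B\,\|x\|^2,
    \qquad 0 < A \le B < \infty \\
  \hat{x} \;=\; F^{+}c \;=\; (F^{*}F)^{-1}F^{*}c
    \;=\; \arg\min_{x}\,\|Fx - c\|^2
\end{align}
```

When the lower bound $A$ is strictly positive, $F^{*}F$ is invertible and $\hat{x} = x$ for noise-free responses, so complete inversion is possible. The pseudo-inverse $F^{+}$ is then the least-squares reconstruction, which is one standard sense in which a linear inverse is optimal.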
13
Vision Hierarchy – level 1
We now have a simple-cell-like representation.
How do we create a symbolic representation (“complex cells”)?
Apply the principle of Confabulation Theory: collect common sets of inputs from simple cells, similar to a vector quantizer.
Keep the 5 scales separate
– (quantize 16 dimensions, not 80)
14
Vision Hierarchy – level 1
To create actual symbols, we use a vector quantizer.
– Trade-offs (threshold of quantizer): number of symbols, preservation of information, probability accuracy.
Solution: use an angular distance metric (dot product).
– Keep only symbols that occurred in the training set more than 200 times, to get accurate probabilities.
– After training, ~95% of samples should be within threshold of at least one symbol.
– Pick a threshold so images can be plausibly generated.
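The quantizer described here could look roughly like the sketch below. The slides do not say how the codebook was trained, so simple leader clustering is swapped in as an assumption; the function names and the toy 2-D samples are also hypothetical (the real inputs would be 16-dimensional filter responses, and the count threshold would be 200).

```python
import math


def unit(vector):
    """Scale a non-zero vector to unit length, so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def angle(u, v):
    """Angle in radians between two unit vectors, from their dot product."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))


def nearest(u, codebook):
    """Return (index, angle) of the closest code vector, or (None, inf) if empty."""
    best_index, best_angle = None, math.inf
    for index, code in enumerate(codebook):
        a = angle(u, code)
        if a < best_angle:
            best_index, best_angle = index, a
    return best_index, best_angle


def train(samples, max_angle, min_count):
    """Leader clustering (assumed): a sample outside every threshold founds a new symbol."""
    codebook, counts = [], []
    for sample in samples:
        u = unit(sample)
        index, a = nearest(u, codebook)
        if index is not None and a <= max_angle:
            counts[index] += 1
        else:
            codebook.append(u)
            counts.append(1)
    # Keep only symbols seen more than min_count times.
    return [code for code, n in zip(codebook, counts) if n > min_count]


def quantize(sample, codebook, max_angle):
    """Return the symbol index for a sample, or None if no symbol is within threshold."""
    index, a = nearest(unit(sample), codebook)
    return index if index is not None and a <= max_angle else None


samples = [[1.0, 0.0]] * 3 + [[0.0, 2.0]] * 3 + [[1.0, 1.0]]
codebook = train(samples, max_angle=0.2, min_count=2)
coverage = sum(quantize(s, codebook, 0.2) is not None for s in samples) / len(samples)
print(len(codebook), round(coverage, 2))
```

The `coverage` value corresponds to the "~95% of samples within threshold" check. Because every vector is normalized, this sketch discards magnitude entirely.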
15
Vision Hierarchy – level 1
Oops! Ignoring wavelet magnitude makes all “texture features” equally prominent.
16
Vision Hierarchy – level 1
The symbolic representation can generate plausible images:
A theory of animal vision that actually demonstrates that animals can see!
17
Vision Hierarchy – level 1
~8,000 symbols are learned for each of the 5 scales.
Complex local features develop (unlike PCA re-representations & ICA representations).
18
Vision Hierarchy – level 1
Now the image is re-represented as 5 “planes” of symbols:
19
Knowledge links
Learn which symbols may be next to which symbols (conditional probabilities).
Learn which symbols may be over/under which symbols.
Go out to ‘radius’ 7.
Consistent with cortical representation of knowledge.
Very large (10s of GB) set of knowledge.
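One way to picture this learning step is one conditional-probability table per spatial offset, filled by counting over a plane of symbols. The sketch below makes several assumptions: a square window for the ‘radius’, a single plane (links across scales are left out), and a hypothetical function name and toy data.

```python
from collections import Counter, defaultdict


def learn_spatial_links(plane, radius):
    """Return {offset: {(neighbor, center): p(neighbor | center)}} for one symbol plane.

    plane: 2-D list of symbol ids. offset = (dy, dx) is the neighbor position
    relative to the center; a square window of the given radius is assumed.
    """
    pair_counts = defaultdict(Counter)    # offset -> counts of (neighbor, center)
    center_counts = defaultdict(Counter)  # offset -> counts of center
    rows, cols = len(plane), len(plane[0])
    for y in range(rows):
        for x in range(cols):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if (dy, dx) == (0, 0) or not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    pair_counts[(dy, dx)][(plane[ny][nx], plane[y][x])] += 1
                    center_counts[(dy, dx)][plane[y][x]] += 1
    return {
        offset: {
            (neighbor, center): n / center_counts[offset][center]
            for (neighbor, center), n in counts.items()
        }
        for offset, counts in pair_counts.items()
    }


# Toy plane of vertical stripes: symbol 0 is always followed by symbol 1 to its right.
plane = [[0, 1, 0, 1],
         [0, 1, 0, 1]]
links = learn_spatial_links(plane, radius=1)
print(links[(0, 1)][(1, 0)])
```

The number of tables grows with the square of the radius and each table with the square of the symbol count, which is in line with the slide's note that the learned knowledge is very large.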
20
Texture modeling – (inference)
What if a portion of our image symbol representation is damaged?
– Blind spot
– CCD defect
– Brain lesion
We can use confabulation (generation) to infer a plausible replacement.
21
Texture modeling – Inference 1
Fill in the missing region by confabulating from lateral and different-scale neighbors (radius 5).
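A deliberately simplified, one-dimensional sketch of this fill-in step is given below. It is an illustration under assumptions, not the method used in the talk: it uses a single row of symbols instead of 2-D planes at several scales, fills positions greedily from left to right, and assigns an arbitrary `floor` probability to unseen pairs. All names and data are hypothetical.

```python
from collections import Counter
from math import log


def learn_links(rows, radius):
    """Return {(offset, neighbor, center): p(neighbor | center)} from training rows."""
    pair_counts, center_counts = Counter(), Counter()
    for row in rows:
        for i, center in enumerate(row):
            for d in range(-radius, radius + 1):
                j = i + d
                if d != 0 and 0 <= j < len(row):
                    pair_counts[(d, row[j], center)] += 1
                    center_counts[(d, center)] += 1
    return {key: n / center_counts[(key[0], key[2])] for key, n in pair_counts.items()}


def fill_missing(row, links, symbols, radius, floor=1e-6):
    """Replace each None with the symbol best supported by its known neighbors."""
    row = list(row)
    for i in range(len(row)):
        if row[i] is not None:
            continue

        def score(candidate):
            # Sum of log p(neighbor | candidate) over all known neighbors in range.
            total = 0.0
            for d in range(-radius, radius + 1):
                j = i + d
                if d != 0 and 0 <= j < len(row) and row[j] is not None:
                    total += log(links.get((d, row[j], candidate), floor))
            return total

        row[i] = max(symbols, key=score)  # already-filled positions act as evidence too
    return row


training = [list("ababab"), list("bababa")]
links = learn_links(training, radius=2)
damaged = ["a", "b", None, None, "a", "b"]
restored = fill_missing(damaged, links, symbols=["a", "b"], radius=2)
print(restored)
```

On this toy alternating texture the gap is restored to the pattern implied by its surroundings. A 2-D version would gather evidence from neighbors in every direction and from the other scale planes.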
22
Texture modeling
23
Texture modeling
24
Texture modeling
25
More Examples 1/7 (find the replacements)
26
More Examples 1/7 (replacement locations)
27
More Examples 2/7 (find the replacements)
28
More Examples 2/7 (replacement locations)
29
More Examples 3/7 (find the replacements)
30
More Examples 3/7 (replacement locations)
31
More Examples 4/7 (find the replacements)
32
More Examples 4/7 (replacement locations)
33
More Examples 5/7 (find the replacements)
34
More Examples 5/7 (replacement locations)
35
More Examples 6/7 (find the replacements)
36
More Examples 6/7 (replacement locations)
37
Texture modeling: conclusions
This visual hierarchy does an excellent job of capturing an image up to a certain order of complexity.
Given this visual hierarchy and its learned knowledge links, missing regions could plausibly be filled in.
This could be a reasonable explanation for what animals do.
Preparing for publication (IEEE Transactions on Image Processing), with help from Professor Serge Belongie (CSE).
Last hurdle to graduation!
38
Texture modeling – Inference 2
Super-resolution: if we have a low-resolution image, can we confabulate (generate) a high-resolution version?
“Space out” the symbols, and confabulate values for the new neighbors.
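The “space out” step might be pictured as follows. This is a minimal sketch with a hypothetical function name: known symbols are placed on a coarser lattice of a larger grid, and the new positions are marked `None` so that a later confabulation pass can fill them in. The upsampling factor and the use of `None` as the "unknown" marker are assumptions.

```python
def space_out(plane, factor):
    """Spread a 2-D symbol plane onto a grid `factor` times larger in each direction.

    Original symbols land at multiples of `factor`; all other cells are None,
    meaning "unknown, to be confabulated from neighbors".
    """
    rows, cols = len(plane), len(plane[0])
    spaced = [[None] * (cols * factor) for _ in range(rows * factor)]
    for y in range(rows):
        for x in range(cols):
            spaced[y * factor][x * factor] = plane[y][x]
    return spaced


low_res = [[1, 2],
           [3, 4]]
high_res = space_out(low_res, factor=2)
for row in high_res:
    print(row)
```

After this step three quarters of the cells are unknown at factor 2, so the quality of the result depends on how well the knowledge links constrain the gaps.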
39
Texture modeling
40
Texture modeling
41
Texture modeling
Super-resolution: conclusions
Having learned the statistics of natural images, this hierarchy can use its generative properties to confabulate (generate) plausible high-resolution versions of its input.
42
References
43
Applications
DVD HD “upconversion” (exists in current DVD players):
– Intelligent Pixel Creation (super-resolution)
– Intelligent Frame Interpolation (increasing frame rates)
Imagine an online ONR service available to all US govt. agencies...
– Generating high-resolution images from damaged, low-resolution ones (in specific contexts).
– Analyzing surveillance data.
Low-resolution video
High-resolution image
44
The next level…
Level 2 symbol hierarchy
– Collect commonly recurring regions of level 1 symbols.
– Symbols at Level 2 will fit together like puzzle pieces.
Thank you!