Primal Sketch: Integrating Structure and Texture. Ying Nian Wu, UCLA Department of Statistics. Keck Meeting, April 28, 2006. Guo, Zhu, Wu (ICCV, 2003; GMBV, 2004; CVIU, 2006)
A Generative Model for Natural Images. Figure panels: input image; sketch graph; texture regions; sketchable image + synthesized textures = synthesized image.
Outline: sparse coding; Markov random fields; primal sketch model; sketch pursuit algorithm.
Sparse Coding Olshausen and Field (1996)
Matching Pursuit, Mallat and Zhang (1993). Figure panels: input image; matching pursuit with 500 bases; matching pursuit with 800 bases.
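A minimal sketch of the greedy selection behind matching pursuit may help here. It assumes unit-norm atoms and a plain inner product; the function names (`matching_pursuit`, `normalize`) and the toy dictionary are hypothetical illustrations, not the bases used in the talk.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def matching_pursuit(signal, dictionary, n_bases):
    """Greedy matching pursuit over unit-norm atoms.

    Returns (selected, residual); selected lists (atom_index, coefficient)
    pairs in the order they were picked.
    """
    residual = list(signal)
    selected = []
    for _ in range(n_bases):
        # Pick the atom with the largest absolute inner product with the residual.
        coeffs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(coeffs[k]))
        c = coeffs[best]
        if abs(c) < 1e-12:
            break  # nothing left to explain
        selected.append((best, c))
        # Remove the chosen atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    return selected, residual

# Toy, hypothetical dictionary: standard basis of R^4 plus one step-edge atom.
dictionary = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    normalize([1.0, 1.0, -1.0, -1.0]),
]
signal = [2.0, 2.0, -2.0, -2.0]
selected, residual = matching_pursuit(signal, dictionary, n_bases=2)
print(selected, residual)
```

In this toy case the single edge atom explains the whole signal, which is the sense in which a well-matched dictionary gives a sparse code.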
Figure panels: symbolic representation of 300 bases; reconstructed image; primal sketch.
Markov Random Fields: Besag (1974); Cross and Jain (1983); Geman and Geman (1984). Markov property: MRF model = Gibbs distribution. Figure: one example of a neighborhood.
MRF Model & Image Ensemble. Image ensemble (Wu, Zhu & Liu, 2000); MRF model (Zhu, Wu & Mumford, 1997).
Feature statistics: histograms of filter responses (Heeger and Bergen, 1995). Filtering: convolution of the original image I with a set of filters F. Figure panels: original image I; filter responses J of the “dy” filter.
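The feature statistic above can be sketched in a few lines. This example assumes a simple vertical finite difference as a stand-in for the “dy” filter and an arbitrary bin range; `filter_dy`, `histogram`, and the toy image are hypothetical and not taken from the talk.

```python
def filter_dy(image):
    """Vertical finite difference: J[y][x] = I[y+1][x] - I[y][x] (assumed stand-in for "dy")."""
    return [
        [image[y + 1][x] - image[y][x] for x in range(len(image[0]))]
        for y in range(len(image) - 1)
    ]

def histogram(responses, n_bins, lo, hi):
    """Normalized histogram of filter responses over [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    total = 0
    for row in responses:
        for r in row:
            k = int((r - lo) / width)
            k = min(max(k, 0), n_bins - 1)  # clamp out-of-range responses
            counts[k] += 1
            total += 1
    return [c / total for c in counts]

# Toy image I with one horizontal step edge.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [4, 4, 4, 4],
    [4, 4, 4, 4],
]
J = filter_dy(image)                      # filter responses
h = histogram(J, n_bins=4, lo=-8, hi=8)   # marginal histogram of J
print(J, h)
```

The histogram discards where the responses occur and keeps only how often each value occurs, which is what makes it a texture statistic rather than a structural description.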
Histogram of Filter Responses
Average histogram error
Figure panels: 800 bases; 50*70 patch; a sample of the image ensemble with 5*13 = 65 parameters.
Figure panels: observed image; sampled image from the image ensemble; primal sketch.
Sparse Coding vs. MRF. MRF models target high-complexity patterns. Sparse coding models target low-complexity patterns. Notation: const is related to the dictionary; p* is the fitted MRF; q is any distribution.
Primal Sketch Model. Image pixels = sketchable & non-sketchable. Sketchable: sparse coding using image primitives. Non-sketchable: feature statistics / Markov random fields. Integration: the non-sketchable part interpolates the sketchable part; the non-sketchable part recycles failed sketch detections.
Sketches Elder and Zucker, 1998
Sketch Graph. Vertices: 1, 2, 3 are corners; 4, 5, 6, 7 are end points; 8, 9, 10 are junctions, etc.
Image Primitives a) Geometric b) Photometric
Sketch Graph Model. Figure labels: geometric; photometric; sketch image.
Sketches = Gabor clusters. Alignment across spatial and frequency domains.
Integrating structure and texture. Diagram labels: Gabor filters; alignment: sketches, sketch graph; non-alignment: pool marginal histograms, textures.
Model fitting. First: sketch pursuit aided by Gabors. Second: non-sketchable texturing. Sketchability test.
Sketch pursuit objective Approximated model
Sketch Pursuit Phase I. Figure panels: input image; edge/ridge map; edge/ridge strength. Proposals: a set of sketches as candidates. Select the sketches in order of likelihood gain.
Sketch Pursuit Phase II: Refinement. Evolve the sketch graph by graph operators. Figure panels: initialization; refinement.
Graph Operators (operator pair: graph change)
G1, G1': create / remove a stroke
G2, G2': grow / shrink a stroke
G3, G3': connect / disconnect vertices
G4, G4': extend one stroke and cross / disconnect and combine
G5, G5': extend two strokes and cross / disconnect and combine
G6, G6': combine two strokes / break a stroke
G7, G7': combine two parallel strokes / split one into two parallel strokes
G8, G8': merge two vertices / split a vertex
G9, G9': create / remove a blob
G10, G10': switch between a stroke(s) and a blob
Graph Editing. Figure panels: Phase I; Phase II. Figure labels: A, B, AB.
Phase II Algorithm
1. Randomly choose a local sub-graph (S0).
2. Try all 10 pairs of graph operators, for 1 to 5 steps, to generate a set of new graph candidates (S4, S2, S3).
3. Compare all new graph candidates.
4. Select the one with the largest posterior gain (e.g. S4) and accept the new graph. If no candidate has positive gain, make no update.
5. Repeat steps 1 to 4 until no update occurs.
Figure labels: S0, S1, S2, S3, S4; G1, G3, G4.
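The accept-the-best-positive-gain loop of the steps above can be sketched as follows. This is a simplified illustration under stated assumptions: it proposes moves on the whole graph rather than on a randomly chosen sub-graph, applies one operator step at a time, and uses a toy score in place of the posterior. The names `refine`, `propose`, `gain`, `UNIVERSE`, and `TARGET` are hypothetical.

```python
def refine(graph, propose, gain):
    """Greedy refinement: accept the best positive-gain candidate until none exists.

    propose(graph) -> iterable of candidate graphs (hypothetical operator moves)
    gain(old, new) -> float, a stand-in for the posterior gain
    """
    while True:
        candidates = list(propose(graph))
        if not candidates:
            return graph
        best = max(candidates, key=lambda g: gain(graph, g))
        if gain(graph, best) <= 0:
            return graph  # no positive gain, no update
        graph = best

# Toy setup: a graph is a set of stroke names; the only operator pair is
# create / remove a stroke (in the spirit of G1, G1').
UNIVERSE = ["a", "b", "c", "d"]
TARGET = frozenset({"a", "c"})  # hypothetical "best" graph for the toy score

def propose(graph):
    # Toggle one stroke at a time: create it if absent, remove it if present.
    for s in UNIVERSE:
        yield graph ^ frozenset({s})

def score(graph):
    # Toy score: fewer mismatches with TARGET is better.
    return -len(graph ^ TARGET)

def gain(old, new):
    return score(new) - score(old)

result = refine(frozenset({"b"}), propose, gain)
print(sorted(result))
```

Because every accepted move strictly increases the score, the loop terminates; like any greedy search it may stop at a local optimum.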
Figure panels: synthesized textures; texture regions. K-means clustering of histograms in a 7x7 window: 7 filters x 7 bins.
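A minimal sketch of K-means on histogram vectors, assuming plain Lloyd iterations with squared Euclidean distance and caller-supplied initial centers. In the setting above each vector would have 7 x 7 = 49 entries; the toy vectors here have 3 entries, and the name `kmeans` and the data are hypothetical.

```python
def kmeans(points, centers, n_iter=10):
    """Plain Lloyd iterations on equal-length feature vectors."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for each histogram vector.
        labels = [
            min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
            for p in points
        ]
        # Update step: mean of the members of each cluster.
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy local histograms from two visually different regions.
points = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.0, 0.2, 0.8],
]
labels, centers = kmeans(points, centers=[points[0], points[2]])
print(labels)
```

Pixels whose local histograms fall in the same cluster would then be grouped into one texture region.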
Primal Sketch Model Result. Figure panels: input image; reconstructed image; sketch graph; sketchable image.
Figure panels: input image; sketch graph; reconstructed image; sketchable image.
Figure panels: input image; reconstructed image; sketch graph.
Figure panels: input image; sketch graph; reconstructed image.
Lossy Image Coding
Figure panels: sketchable image; sketch graph.
Codes for the vertices: 152*2*9 = 2,736 bits
Codes for the strokes: 275*2*4.7 = 2,585 bits
Codes for the profiles: 275*(2.4*8+1.4*2) = 6,050 bits
Total codes for structures (18,185 pixels): 11,371 bits = 1,421 bytes
Figure panels: synthesized textures; texture regions.
Codes for the region boundaries: 3659*3 = 10,977 bits
Codes for the texture histograms: 7*5*13*4.5 = 2,048 bits
Total codes for textures (41,815 pixels): 13,025 bits = 1,628 bytes
Total codes for whole image (72,000 pixels): 3,049 bytes
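The coding-length arithmetic above can be checked with a short script. The numbers are the ones quoted on the slides; the rounding conventions (round half up for bits, round down to whole bytes) are assumptions chosen because they reproduce the quoted totals, and the variable names are hypothetical.

```python
def round_half_up(x):
    """Round a positive number to the nearest integer, halves going up."""
    return int(x + 0.5)

structure_bits = {
    "vertices": round_half_up(152 * 2 * 9),
    "strokes": round_half_up(275 * 2 * 4.7),
    "profiles": round_half_up(275 * (2.4 * 8 + 1.4 * 2)),
}
texture_bits = {
    "region boundaries": round_half_up(3659 * 3),
    "texture histograms": round_half_up(7 * 5 * 13 * 4.5),
}

structure_total = sum(structure_bits.values())
texture_total = sum(texture_bits.values())

# Whole bytes, rounded down (assumed convention).
structure_bytes = structure_total // 8
texture_bytes = texture_total // 8
total_bytes = structure_bytes + texture_bytes

print(structure_bits, structure_total, structure_bytes)
print(texture_bits, texture_total, texture_bytes)
print(total_bytes)
```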
Scaling
Wu, Zhu, Bahrami, Li (2006)