Image Analogies
Aaron Hertzmann (1,2), Charles E. Jacobs (2), Nuria Oliver (2), Brian Curless (3), David H. Salesin (2,3)
(1) New York University, (2) Microsoft Research, (3) University of Washington

Introduction
a·nal·o·gy: a systematic comparison between structures that uses properties of and relations between objects of a source structure to infer properties of and relations between objects of a target structure.
Problem statement: given a pair of images A and A′ (the unfiltered and filtered source images, respectively), along with some additional unfiltered target image B, synthesize a new filtered target image B′ such that A : A′ :: B : B′.

Introduction
We use an autoregression algorithm, based primarily on recent work in texture synthesis by Wei and Levoy and by Ashikhmin. Indeed, our approach can be thought of as a combination of these two approaches, generalized to corresponding pairs of images rather than single textures. In order to allow statistics from an image A to be applied to an image B with completely different colors, we sometimes operate in a preprocessed luminance space.

Introduction
Applications: traditional texture synthesis, improved texture synthesis, super-resolution, texture transfer, artistic filters, and texture-by-numbers.

Related work
Machine learning for graphics, texture synthesis, non-photorealistic rendering, and example-based NPR.

Algorithm
As input, our algorithm takes a set of three images: the unfiltered source image A, the filtered source image A′, and the unfiltered target image B. It produces the filtered target image B′ as output. Our approach assumes that the two source images are registered: the colors at and around any given pixel p in A correspond to the colors at and around that same pixel p in A′. We are trying to learn the image filter that maps A to A′.

Algorithm
We use A(p) (or A′(p)) to denote the complete feature vector of A (or A′) at pixel p. Similarly, we use B(q) (or B′(q)) to denote the complete feature vector of B (or B′) at pixel q. We also need to keep track of the position p of the source pixel that was copied to pixel q of the target, so we store this in an additional data structure s(·); for example, s(q) = p. We actually use a multiscale representation of all five of these quantities in our algorithm, indexing each of these arrays by multiscale level l using subscripts: for example, if A_l represents the source image A at a given resolution, then A_{l−1} represents a corresponding lower-resolution image at the next coarser level.

Algorithm
First, in an initialization phase, multiscale (Gaussian pyramid) representations of A, A′, and B are constructed, along with their feature vectors and some additional indices used to speed up the matching process. We use L to denote the maximum (highest-resolution) level. At each level l, statistics pertaining to each pixel q in the target pair are compared against statistics for every pixel p in the source pair, and the "best" match is found. The pixel that matched best is recorded in s_l(q).

Algorithm [slide figure, presumably the synthesis pseudocode, not captured in the transcript]
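The pseudocode that accompanied this slide did not survive the transcript. As a rough illustration of the loop the surrounding text describes, here is a minimal, single-scale Python sketch under simplifying assumptions: it brute-force matches each B neighborhood against every A neighborhood and copies the corresponding A′ pixel, recording the source position in s. All names are ours, not the paper's; the real algorithm is multiscale and uses the richer matching described on the following slides.

```python
import numpy as np

def image_analogy_single_scale(A, Ap, B, patch=5):
    """Deliberately simplified, brute-force sketch of the synthesis loop.

    A, Ap, B are 2-D float arrays (grayscale); Ap stands in for A'.
    Omitted relative to the paper: Gaussian pyramids, A'/B' neighborhood
    features, the coherence search, and fast (ANN/TSVQ) matching.
    """
    r = patch // 2
    Apad = np.pad(A, r, mode='reflect')
    Bpad = np.pad(B, r, mode='reflect')
    # Precompute every source neighborhood of A as a flat vector.
    src = np.array([Apad[y:y + patch, x:x + patch].ravel()
                    for y in range(A.shape[0])
                    for x in range(A.shape[1])])
    Bp = np.zeros_like(B)
    s = np.zeros(B.shape + (2,), dtype=int)      # source-pixel map s(q) = p
    for y in range(B.shape[0]):
        for x in range(B.shape[1]):
            q = Bpad[y:y + patch, x:x + patch].ravel()
            i = int(np.argmin(((src - q) ** 2).sum(axis=1)))  # best match p
            py, px = divmod(i, A.shape[1])
            Bp[y, x] = Ap[py, px]                # copy the filtered pixel
            s[y, x] = (py, px)                   # remember where it came from
    return Bp, s
```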

Algorithm
The heart of the image analogies algorithm is the BestMatch subroutine. This routine finds the pixel p in the source pair that best matches the pixel being synthesized, using two different approaches: an approximate search, which attempts to efficiently find the closest-matching pixel according to the feature vectors of p, q, and their neighborhoods; and a coherence search, based on Ashikhmin's approach, which attempts to preserve coherence with the neighboring synthesized pixels.

Algorithm [slide figure not captured in the transcript]

Algorithm
Since the L2 norm is an imperfect measure of perceptual similarity, coherent pixels will often look better than the best match under L2. The coherent candidate is therefore chosen whenever d_coh ≤ d_app (1 + 2^{l−L} κ), where d_app and d_coh are the feature distances of the candidates returned by the approximate and coherence searches, respectively. Thus, the larger the value of κ, the more coherence is favored over accuracy in the synthesized image. In order to keep the coherence term consistent at different scales, we attenuate it by the factor 2^{l−L}, since pixel locations at coarser scales are spaced further apart than at finer scales. We typically use different values of κ for color non-photorealistic filters, line art filters, and texture synthesis.
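A minimal sketch of the selection rule just described, assuming the threshold form reconstructed above; d_app and d_coh would come from the approximate and coherence searches inside BestMatch, and all names are ours:

```python
def choose_match(p_app, p_coh, d_app, d_coh, l, L, kappa):
    """Return the coherence candidate when its (typically larger) distance
    is within a factor (1 + 2**(l - L) * kappa) of the approximate match;
    the 2**(l - L) term attenuates coherence at coarser pyramid levels."""
    if d_coh <= d_app * (1 + 2 ** (l - L) * kappa):
        return p_coh
    return p_app
```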

Algorithm
We use F_l(p) to denote the concatenation of all the feature vectors within some neighborhood N(p) of both source images A and A′, at both the current resolution level l and the coarser resolution level l−1. The norm is computed as a weighted distance over the feature vectors F(p) and F(q), using a Gaussian kernel, so that differences in the feature vectors of pixels farther from p and q have a smaller weight relative to the differences at p and q.
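A sketch of that Gaussian-weighted norm, assuming the neighborhood features are laid out as (patch, patch, channels) blocks; the kernel width sigma is an illustrative choice of ours, not a setting from the paper:

```python
import numpy as np

def gaussian_weighted_distance(Fp, Fq, sigma=1.0):
    """Weighted L2 distance between two neighborhood feature blocks of
    shape (patch, patch, channels); differences far from the center pixel
    receive smaller weight."""
    patch = Fp.shape[0]
    ax = np.arange(patch) - patch // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()                                  # normalize the kernel
    per_pixel = ((Fp - Fq) ** 2).sum(axis=-1)     # squared feature difference
    return float((g * per_pixel).sum())
```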

Algorithm
For the BestApproximateMatch procedure, we have tried using both approximate-nearest-neighbor (ANN) search and tree-structured vector quantization (TSVQ), with the same norm over the feature vectors. The BestCoherenceMatch procedure simply returns s(r*) + (q − r*), where r* = argmin_{r ∈ N(q)} ‖F_l(s(r) + (q − r)) − F_l(q)‖ and N(q) is the neighborhood of already-synthesized pixels adjacent to q in B′_l. This formula essentially returns the source pixel that is coherent with some already-synthesized portion of B′_l adjacent to q, which is the key insight of Ashikhmin's method.
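A sketch of the coherence search under the formula above; the argument names and the (x, y)-tuple pixel representation are our assumptions:

```python
def best_coherence_match(s, F_source, F_target, q, synthesized_neighbors, in_bounds):
    """For each already-synthesized neighbor r of q, the candidate source
    pixel s(r) + (q - r) continues the source patch copied at r; return
    the candidate whose features best match the target features at q."""
    def candidate(r):
        return (s[r][0] + q[0] - r[0], s[r][1] + q[1] - r[1])

    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    valid = [r for r in synthesized_neighbors if in_bounds(candidate(r))]
    if not valid:
        return None  # caller falls back to the approximate match
    r_star = min(valid, key=lambda r: dist2(F_source(candidate(r)), F_target(q)))
    return candidate(r_star)
```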

Algorithm [slide figure not captured in the transcript]

Algorithm
However, for some filters, we found that our source pairs did not contain enough data to match the target pair well using RGB color. An alternative, which we used to generate many of the results shown in this paper, is to compute and store the luminance at each pixel and use it in place of RGB in the distance metric. Luminance can be computed in a number of ways; we use the Y channel from the YIQ color space, where the I and Q channels are "color difference" components. After processing in luminance space, we can recover the color simply by copying the I and Q channels of the input B image into the synthesized B′ image, followed by a conversion back to RGB. Our approach is to apply a linear map that matches the means and variances of the luminance distributions:

Y(p) ← (σ_B / σ_A) (Y(p) − μ_A) + μ_B

where μ_A and μ_B are the mean luminances, and σ_A and σ_B are the standard deviations of the luminances, taken with respect to the luminance distributions in A and B, respectively.
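A sketch of this luminance pipeline, assuming RGB values in [0, 1] and the standard NTSC YIQ transform (the transcript does not give the exact matrix, so the coefficients here are the conventional ones):

```python
import numpy as np

# NTSC RGB <-> YIQ transform; Y is luminance, I and Q are color-difference.
RGB2YIQ = np.array([[0.299,  0.587,  0.114],
                    [0.596, -0.274, -0.322],
                    [0.211, -0.523,  0.312]])
YIQ2RGB = np.linalg.inv(RGB2YIQ)

def remap_luminance(Y_A, Y_B):
    """Linear map matching A's luminance statistics to B's:
    Y(p) <- (sigma_B / sigma_A) * (Y(p) - mu_A) + mu_B."""
    mu_A, mu_B = Y_A.mean(), Y_B.mean()
    sd_A, sd_B = Y_A.std(), Y_B.std()
    return (sd_B / sd_A) * (Y_A - mu_A) + mu_B

def recover_color(Bp_luminance, B_rgb):
    """Copy B's I and Q channels onto the synthesized luminance, then
    convert back to RGB."""
    B_yiq = B_rgb @ RGB2YIQ.T
    out_yiq = np.dstack([Bp_luminance, B_yiq[..., 1], B_yiq[..., 2]])
    return np.clip(out_yiq @ YIQ2RGB.T, 0.0, 1.0)
```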

Application: traditional image filters

Application: improved texture synthesis
The algorithm we have described, when used in this way for texture synthesis, combines the advantages of the weighted L2 norm and Ashikhmin's algorithm. For example, the synthesized textures shown in the next figure have a quality similar to those of Ashikhmin's algorithm, without the edge discontinuities.

Improved texture synthesis [results figure not captured in the transcript]

Application: super-resolution
Image analogies can be used to effectively "hallucinate" more detail in low-resolution images, given some low- and high-resolution pairs (used as A and A′) for small portions of the image. The training data specify a "super-resolution" filter that is applied to a blurred version of the full image to recover an approximation to the higher-resolution original.
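One plausible way to construct such a training pair, as a hedged sketch: the blur model and sigma below are our assumptions for illustration, not settings from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_superres_pair(high_res_crop, blur_sigma=1.0):
    """A' is a high-resolution crop; A is a blurred version of it, so the
    learned 'filter' maps blurry detail to sharp detail. Assumes a
    single-channel image."""
    Ap = np.asarray(high_res_crop, dtype=float)
    A = gaussian_filter(Ap, sigma=blur_sigma)   # simulate the low-res input
    return A, Ap
```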

Super-resolution [results figure not captured in the transcript]

Application: texture transfer
We filter an image B so that it has the texture of a given example texture A′. We can trade off the appearance between that of the unfiltered image B and that of the texture by introducing a weight into the distance metric that emphasizes similarity of the (A, B) pair over that of the (A′, B′) pair, as sketched below. For better results, we also modify the neighborhood matching by using single-scale 1×1 neighborhoods in the A and B images.
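A minimal sketch of that trade-off; the exact placement of the weight in the paper's metric may differ, so treat this as the idea rather than the implementation:

```python
def texture_transfer_distance(d_unfiltered, d_filtered, w):
    """Blend the neighborhood distance of the (A, B) pair (d_unfiltered)
    with that of the (A', B') pair (d_filtered). Larger w preserves the
    appearance of the input image B; smaller w favors the texture A'."""
    return w * d_unfiltered + (1.0 - w) * d_filtered
```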

Texture transfer [results figure not captured in the transcript]

Application: artistic filters
For the color artistic filters in this paper, we performed synthesis in luminance space, using the preprocessing described above. For line art filters, using steerable filter responses in the feature vectors leads to a significant improvement; we suspect that this is because line art depends significantly on gradient directions in the input images.

Application: texture-by-numbers
Texture-by-numbers allows new imagery to be synthesized by applying the statistics of a labeled example image to a new labeling image B. A major advantage of texture-by-numbers is that it allows us to synthesize from images for which ordinary texture synthesis would produce poor results.

The End