Style/Content separation Evgeniy Bart, Dan Levi April 13, 2003.

Artistic styles Photograph

Artistic styles Impressionist

Artistic styles Expressionist

Artistic styles Pointillist

Photographic styles *Pictures by Aya Aner-Wolf

Fonts [figure: the letters ABCDE rendered in four fonts; style = font, content = letter]

Faces *Images from FERET database

Tasks: Extrapolation of a familiar style to new content

Tasks: Extrapolation of familiar content to a new style

Tasks Translation:

Task specification by analogy: A : A′ :: B : ? (Image analogies, Hertzmann et al.)

Wei&Levoy Ashikhmin

Region growing Somewhat similar to quilting *Picture from presentation by Tal and Zeev

Region growing

Combining the two …

… hierarchically

Selecting best match

Arbitrating [flowchart: a yes/no test chooses between using the Wei & Levoy value and using the Ashikhmin value]
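The test used for arbitration is not preserved in this transcript. A minimal sketch, assuming the coherence rule described in the Image analogies paper: the Ashikhmin (coherence) candidate is kept when its distance is not much worse than the Wei & Levoy (approximate nearest neighbour) candidate, with a tolerance that grows towards the finest pyramid level. The function name `arbitrate`, its arguments, and the string return values are illustrative assumptions.

```python
def arbitrate(d_app, d_coh, level, num_levels, kappa):
    """Choose between the two candidate matches for one pixel.

    d_app: distance of the approximate (Wei & Levoy style) match
    d_coh: distance of the coherence (Ashikhmin style) match
    level: current pyramid level, 1 = coarsest, num_levels = finest
    kappa: coherence weight (assumed parameter; larger favours coherence)
    """
    # Assumed rule: accept the coherent match if
    # d_coh <= d_app * (1 + 2^(level - num_levels) * kappa).
    threshold = d_app * (1.0 + 2.0 ** (level - num_levels) * kappa)
    return "ashikhmin" if d_coh <= threshold else "wei_levoy"


choice_fine = arbitrate(d_app=1.0, d_coh=1.5, level=3, num_levels=3, kappa=1.0)
choice_coarse = arbitrate(d_app=1.0, d_coh=1.5, level=1, num_levels=3, kappa=1.0)
```

With `kappa = 0` the rule reduces to picking whichever candidate has the smaller distance.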

Results: artistic filters

Super-resolution: Training 1

Super-resolution: Results

A lonely pine is standing In the North where high winds blow. He sleeps; and the whitest blanket wraps him in ice and snow. He dreams - dreams of a palm-tree that far in an Orient land Languishes, lonely and drooping, Upon the burning sand. H. Heine, translated by L. Untermeyer

Texture by numbers

Parameters
annEpsilon: [float] = 1.000000
ashLastLevel: [bool] = false
biasPenalty: [float] = 0.000000
cheesyBoundaries: [bool] = true
coherenceEps: [float] = 5.000000
coherencePow: [float] = 2.000000
createSrcLocHisto: [bool] = false
decayWeight: [double] = 0.000000
filterColorspace: [enum] = {Lab, Luv, RGB, XYZ}
filterMM: [string] = (none!)
filterModeMask: [string] = (none!)
filterProcedure: [enum] = {Copy, Synthesize}
filteredFeatureType: [enum] = {Difference, Raw}
filteredPyramidType: [enum] = {Gaussian, Laplacian}
finalSourceFac: [float] = -1.000000
gainPenalty: [float] = 0.000000
heurAnnEpsilon: [float] = 1.000000
heurMaxTSVQDepth: [int] = 7
histogramEq: [bool] = false
levelWeighting: [float] = 1.000000
matchBtoA: [bool] = false
matchGrayHistogram: [bool] = false
matchMeanVariance: [bool] = false
maxTSVQDepth: [int] = 20
maxTSVQError: [float] = 0.000000
modeMaskWeight: [float] = 0.010000
neighborhoodWidth: [int] = 5
pyramidType: [enum] = {Gaussian, Laplacian, Steerable}
samplerEpsilon: [float] = 0.100000
searchMethod: [enum] = {ANN, Ash, HeurANN, HeurTSVQ, Image, MLP, TSVQ, TSVQR, Vector}
sourceColorspace: [enum] = {Lab, Luv, RGB, XYZ}
srcWeight: [float] = 1.000000
targetMM: [string] = (none!)
targetModeMask: [string] = (none!)
useBias: [bool] = false
useFilter: [bool] = true
useFilterModeMask: [bool] = false
useGain: [bool] = false
useInterface: [bool] = true
useRandomStart: [bool] = true
useSigmoidalDecay: [bool] = false
useSplineWeights: [bool] = true
useTargetModeMask: [bool] = false
useYIQ: [bool] = false
numHiddenNeurons: [int] = 20
numLevels: [int] = 2
numPasses: [int] = 1
numTSVQBacktracks: [int] = 8
onePixelSource: [bool] = false
oneway: [bool] = false
pyramidHeight: [int] = 4

3D rotation

What went wrong? There is some structure, but no simple correspondence. Need more knowledge about the objects.

Rectangular parallelepipeds (cuboids)

Representation by 3D point coordinates (linear classes, Vetter & Poggio)

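May combine linearly [figure: cuboid + cuboid = cuboid]

Only 3 dimensions: any cuboid $X$ can be written as $X = \alpha_1 X_1 + \alpha_2 X_2 + \alpha_3 X_3$ for three basis cuboids $X_1, X_2, X_3$. Call it a linear class.

Linear operators: a linear operator $L$ satisfies $L\left(\sum_i \alpha_i X_i\right) = \sum_i \alpha_i L X_i$

Example: rotation [figure: rotated cuboids combine with the same coefficients]

Rotation: if $X = \sum_i \alpha_i X_i$, then $RX = \sum_i \alpha_i R X_i$

Example: projection [figure: projected cuboids combine with the same coefficients]

Projection: if $X = \sum_i \alpha_i X_i$, then $PX = \sum_i \alpha_i P X_i$

Example: projection + rotation [figure: rotated and projected cuboids combine with the same coefficients]

Rotation + projection: if $X = \sum_i \alpha_i X_i$, then $PRX = \sum_i \alpha_i P R X_i$. But also $PX = \sum_i \alpha_i P X_i$, so the coefficients can be found from 2D views alone. We may work entirely in the 2D domain!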
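The linear-class argument can be checked numerically. The sketch below is an illustration under stated assumptions, not the presenters' code: cuboids are axis-aligned boxes given by their 8 corners, projection is orthographic, and the two poses, the basis cuboids, and all function names are made up for the example. The coefficients are estimated from one 2D view only and then reused to predict a second 2D view.

```python
import numpy as np


def cuboid(w, h, d):
    """Corners of an axis-aligned cuboid as a 3 x 8 array (assumed representation)."""
    pts = [[x, y, z] for x in (0, w) for y in (0, h) for z in (0, d)]
    return np.array(pts, dtype=float).T


def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])


def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])


P = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # orthographic projection
pose_1 = rot_x(np.deg2rad(25)) @ rot_y(np.deg2rad(20))   # known view
pose_2 = rot_x(np.deg2rad(25)) @ rot_y(np.deg2rad(60))   # view to predict

basis = [cuboid(1, 2, 3), cuboid(2, 1, 1), cuboid(1, 1, 2)]
novel = cuboid(2, 3, 1)

# 2D views of the basis objects in both poses, and of the novel object in pose 1.
basis_view_1 = [P @ pose_1 @ b for b in basis]
basis_view_2 = [P @ pose_2 @ b for b in basis]
novel_view_1 = P @ pose_1 @ novel

# Solve for the coefficients using 2D data only.
M = np.stack([v.ravel() for v in basis_view_1], axis=1)      # 16 x 3
alpha, *_ = np.linalg.lstsq(M, novel_view_1.ravel(), rcond=None)

# Reuse the same coefficients in the second view.
predicted_view_2 = sum(a * v for a, v in zip(alpha, basis_view_2))
true_view_2 = P @ pose_2 @ novel
```

The first pose is deliberately generic: if the known view hides one of the three cuboid dimensions, the coefficients are no longer determined by that view.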

Working in 2D [figure: 2D views combined with the same coefficients]

Results

Can we apply this idea to faces? Linear class assumption: an object may be represented as a linear combination of other (similar) objects. If raw images are used as the basis, reconstruction quality will be poor. PCA reconstruction is much better. Can we use PCA?

Using linear classes [figure: face images combined with the same coefficients]

Eigenfaces do not correspond: cannot use the same coefficients

So far: solving specific tasks (image analogies, linear classes). Goal: a general style/content framework that learns the interaction and generalizes to new examples.

Motivation: Linear Models. Face images form a linear subspace: an eigenvector basis (eigenfaces; Turk, Pentland '91) reconstructs faces. Illumination variations (of the same face) can be modeled by a low-dimensional linear space (Hallinan '94). Model: linear in style and in content.

Bilinear Models (Tenenbaum, Freeman 1997, 2000): $f(x, y)$ is bilinear if it is linear in $x$ when $y$ is held constant and linear in $y$ when $x$ is held constant

Bilinear Forms: Example: for $x, y \in \mathbb{R}$, $f(x, y) = xy$. Bilinear forms are used to model style and content. Two models: symmetric and asymmetric.

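Symmetric Bilinear Model: $y_k^{sc} = \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ijk}\, a_i^s\, b_j^c$, where $y^{sc}$ is the $K$-dimensional image in style $s$ and content $c$, $a^s$ is the $I$-dimensional style vector, $b^c$ is the $J$-dimensional content vector, and $W_k = (w_{ijk})$ is the $I \times J$ interaction matrix

Symmetric Bilinear Model, pixel by pixel: $y_k^{sc} = \sum_{j=1}^{J} a_{jk}^s\, b_j^c$ with $a_{jk}^s = \sum_{i=1}^{I} w_{ijk}\, a_i^s$; the $a_{jk}^s$ form the $J \times K$ style matrix $A^s$, and $b^c$ is the $J$-dimensional content vector

Symmetric Bilinear Model: basis vectors. $y^{sc} = \sum_{i,j} a_i^s\, b_j^c\, w_{ij}$, where each $w_{ij}$ is a $K$-dimensional basis vector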
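A minimal NumPy sketch of the symmetric model, assuming the interaction terms are stored as an array `W` of shape (I, J, K) so that `W[:, :, k]` plays the role of $W_k$ and `W[i, j, :]` is one basis vector. The sizes and random values are arbitrary; the point is only to show that the rendered image is linear in the style vector and linear in the content vector.

```python
import numpy as np


def render_symmetric(a, b, W):
    """y_k = sum_i sum_j W[i, j, k] * a[i] * b[j]."""
    return np.einsum("i,ijk,j->k", a, W, b)


rng = np.random.default_rng(0)
I, J, K = 2, 3, 5                      # style dim, content dim, pixels (arbitrary)
W = rng.normal(size=(I, J, K))
a1, a2 = rng.normal(size=I), rng.normal(size=I)
b = rng.normal(size=J)

# Linearity in the style vector with the content vector held fixed.
mixed_style = render_symmetric(2 * a1 + 3 * a2, b, W)
sum_of_renders = 2 * render_symmetric(a1, b, W) + 3 * render_symmetric(a2, b, W)

# The same image written as a weighted sum of the I*J basis vectors W[i, j, :].
as_basis_sum = sum(a1[i] * b[j] * W[i, j] for i in range(I) for j in range(J))
```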

Toy Example: Symmetric Model. Style: color; content: shape. Images: (0,0,1,1), (2,2,2,0), (0,3,0,3). Content: (0,0,1,1), (1,0,1,0), (0,1,1,1). Style: 1, 2, 3, 1, 2, 3

Face Example - Symmetric Style: Pose, Content: Person

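Asymmetric Bilinear Model: start from $y_k^{sc} = \sum_{i=1}^{I} \sum_{j=1}^{J} w_{ijk}\, a_i^s\, b_j^c$ (image $y^{sc}$: $K$; style vector $a^s$: $I$; content vector $b^c$: $J$; interaction matrix $W_k$: $I \times J$) and group the style-dependent terms, $a_{jk}^s = \sum_{i=1}^{I} w_{ijk}\, a_i^s$

Asymmetric Bilinear Model: $y_k^{sc} = \sum_{j=1}^{J} a_{jk}^s\, b_j^c$, with image $y^{sc}$ ($K$ pixels), content vector $b^c$ ($J$) and style matrix $A^s$ ($J \times K$). $A^s$: Content → Images

Asymmetric Bilinear Model [figure: the $K$-dimensional image in style $s$ and content $c$ and the $J$-dimensional content vector]

Image in style $s$ and content $c$ ($K$ pixels): a style-specific basis of $J$ vectors, mixed by the content coefficients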
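The relation between the two models can be sketched as follows, again assuming an interaction array `W` of shape (I, J, K) with arbitrary random values. Folding the style vector into `W` gives a style-specific basis; in this sketch it is stored as a K x J array (an implementation choice) so that an image is simply the matrix-vector product with the content vector.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K = 3, 4, 6                      # arbitrary sizes for illustration
W = rng.normal(size=(I, J, K))
a_s = rng.normal(size=I)               # style vector
b_c = rng.normal(size=J)               # content vector

# Style-specific basis: column j is the basis image sum_i a_s[i] * W[i, j, :].
A_s = np.einsum("i,ijk->kj", a_s, W)   # shape (K, J)

# Asymmetric rendering: basis images mixed by the content coefficients.
y_asymmetric = A_s @ b_c

# Symmetric rendering of the same style/content pair, for comparison.
y_symmetric = np.einsum("i,ijk,j->k", a_s, W, b_c)
```

In the asymmetric model the entries of `A_s` are fitted freely for each style rather than derived from a shared `W`, which is what makes it more flexible.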

Toy Example: Asymmetric Model. Content: (0,0,1,1), (1,0,1,0), (0,1,1,1). Style: [figure]

Face Example - Asymmetric Style: Pose, Content: Person

Training. Image matrix indexed by style (illumination) and content (person)

Training: Model Fitting. Problem: given $\{y^{sc}(t)\}$, $t = 1..T$, find the model parameters by error minimization. Asymmetric: closed-form SVD solution or quasi-Newton methods → $A^s$, $b^c$. Free parameter: $J$, the content vector dimension. Symmetric: iterative solution using SVD → $a^s$, $W_k$, $b^c$
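A minimal sketch of the closed-form asymmetric fit, assuming exactly one training image per style/content pair (or the mean image per pair). The images are stacked into an (S·K) x C matrix with styles stacked vertically, and a rank-J truncated SVD splits it into style bases and content vectors. The stacking order, the function name `fit_asymmetric` and the synthetic data are assumptions for illustration.

```python
import numpy as np


def fit_asymmetric(Y, S, K, J):
    """Y: (S*K, C) stacked images, rows s*K..(s+1)*K-1 belong to style s.

    Returns A with shape (S, K, J) and B with shape (J, C) such that
    A[s] @ B[:, c] approximates the image in style s and content c.
    """
    U, sing, Vt = np.linalg.svd(Y, full_matrices=False)
    A = (U[:, :J] * sing[:J]).reshape(S, K, J)   # style-specific bases
    B = Vt[:J]                                   # content vectors as columns
    return A, B


# Synthetic data that follows the model exactly (rank J), for checking.
rng = np.random.default_rng(2)
S, K, C, J = 3, 5, 6, 2
A_true = rng.normal(size=(S, K, J))
B_true = rng.normal(size=(J, C))
Y = A_true.reshape(S * K, J) @ B_true

A_fit, B_fit = fit_asymmetric(Y, S, K, J)
Y_rebuilt = np.einsum("skj,jc->skc", A_fit, B_fit).reshape(S * K, C)
```

The factors are only determined up to an invertible J x J transform, so `A_fit` and `B_fit` need not equal `A_true` and `B_true` even when the reconstruction is exact.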

Translation: Faces. Content: person; style: illumination. The asymmetric model cannot handle translation!

Translation: Symmetric Model. Problem: $C = 23$ (faces), $S = 3$ (illuminations). Training: fit a symmetric model using iterative SVD with $I = S$, $J = C$ → $a^s$, $W_k$, $b^c$. Generalization: find $a^{s'}$, $b^{c'}$ that minimize $E^* = \sum_k \left| y_k^{s'c'} - a^{s'} W_k\, b^{c'} \right|^2$ (alternating iterative linear solution). Translation: produce $a^{s'} W_k\, b^{c}$ and $a^{s} W_k\, b^{c'}$ for each $s$ and $c$.
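The generalization step can be sketched as alternating least squares: with the style vector fixed the image is linear in the content vector, and vice versa, so each half-step is an ordinary linear solve. This is an illustration under assumptions (a single observed image, an interaction array `W` of shape (I, J, K), a random starting point, a fixed iteration count), not the presenters' implementation.

```python
import numpy as np


def fit_new_style_and_content(y, W, iters=20, seed=0):
    """Alternately solve for the style vector a and content vector b of image y."""
    I, J, K = W.shape
    a = np.random.default_rng(seed).normal(size=I)   # arbitrary starting style
    errors = []
    for _ in range(iters):
        # a fixed: y_k = sum_j (sum_i a_i W[i, j, k]) b_j is linear in b.
        M_a = np.einsum("i,ijk->kj", a, W)
        b = np.linalg.lstsq(M_a, y, rcond=None)[0]
        # b fixed: y_k = sum_i (sum_j W[i, j, k] b_j) a_i is linear in a.
        M_b = np.einsum("ijk,j->ki", W, b)
        a = np.linalg.lstsq(M_b, y, rcond=None)[0]
        errors.append(float(np.sum((y - M_b @ a) ** 2)))
    return a, b, errors


rng = np.random.default_rng(1)
I, J, K = 3, 4, 40
W = rng.normal(size=(I, J, K))                       # stands in for a trained model
y_new = np.einsum("i,ijk,j->k", rng.normal(size=I), W, rng.normal(size=J))

a_new, b_new, errors = fit_new_style_and_content(y_new, W)

# Translation: render a known content vector in the newly estimated style.
b_known = rng.normal(size=J)
translated = np.einsum("i,ijk,j->k", a_new, W, b_known)
```

Each half-step minimizes the error exactly over one factor, so the recorded errors never increase; convergence to the global minimum is not guaranteed.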

Translation - Results

Extrapolation. Content: letter; style: font. Main problem: image representation. Linear combinations of letters should look like a letter.

A Displacement Vector Warp Map

Coulomb Warp Map. For a unique mapping: a physical model of electrostatic forces. A linear combination of letters looks like a letter.

Extrapolation Scheme. Fit an asymmetric bilinear model: $S = 5$ training fonts (styles), $C = 62$ characters (content), $K = 2888$ data dimensions; closed-form SVD → $A^s$, $b^c$. Given $C = \{c_1, \dots, c_M\}$ letters in a new style $s'$, find the best-fitting $A^{s'}$. Minimize $E^* = \sum_{c \in C} \left\| y^{s'c} - A^{s'} b^c \right\|$ by setting $\partial E^* / \partial A^{s'} = 0$. With $J$ set high (~60): overfitting on test data (173,280 degrees of freedom!)

Constraint: Close To Symmetric. $A^{OLC} = \sum_s \alpha_s A^s$ (optimal linear combination), where $\alpha_s$ are the style parameters (symmetric). Minimize $E^* = \sum_{c \in C} \left\| y^{s'c} - A^{s'} b^c \right\| + \lambda \left\| A^{s'} - A^{OLC} \right\|$ by setting $\partial E^* / \partial A^{s'} = 0$. Extrapolate missing letters by $y^{s'c} = A^{s'} b^c$
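A minimal sketch of the regularized fit, assuming squared norms in both terms (the exponents are not visible in the transcript). Under that assumption, setting the derivative to zero gives the closed form $A^{s'} = \left(Y B^T + \lambda A^{OLC}\right)\left(B B^T + \lambda I\right)^{-1}$, where the columns of $Y$ are the observed letters and the columns of $B$ their content vectors. All sizes, the weights `alpha` and the function name are illustrative.

```python
import numpy as np


def fit_style_matrix(Y, B, A_olc, lam):
    """Minimize ||Y - A B||_F^2 + lam * ||A - A_olc||_F^2 over A.

    Y: (K, M) observed letters in the new style, one per column
    B: (J, M) content vectors of those letters
    A_olc: (K, J) linear combination of the training style matrices
    """
    J = B.shape[0]
    G = B @ B.T + lam * np.eye(J)          # symmetric, positive definite for lam > 0
    rhs = Y @ B.T + lam * A_olc
    return np.linalg.solve(G, rhs.T).T     # equals rhs @ inv(G)


rng = np.random.default_rng(3)
K, J, M, S = 8, 5, 3, 2                    # fewer observed letters than J
A_train = rng.normal(size=(S, K, J))       # stand-ins for fitted A^s
alpha = np.array([0.7, 0.3])               # assumed style parameters
A_olc = np.einsum("s,skj->kj", alpha, A_train)
B_seen = rng.normal(size=(J, M))
Y_seen = rng.normal(size=(K, M))
lam = 0.5

A_new = fit_style_matrix(Y_seen, B_seen, A_olc, lam)
gradient = -2 * (Y_seen - A_new @ B_seen) @ B_seen.T + 2 * lam * (A_new - A_olc)

# Extrapolate an unseen letter from its content vector.
b_missing = rng.normal(size=J)
y_missing = A_new @ b_missing
```

Without the penalty the system is underdetermined whenever fewer than J letters are observed; the penalty term makes it well posed and pulls the solution towards the combination of known styles.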

Results

Symmetric vs. Asymmetric
Symmetric: can reduce the dimensionality of factors; learns the structure of factor interactions, so it handles translation.
Asymmetric: more flexible, but too flexible (overfitting); cannot handle translation.
Can be overcome by combining both.

Bilinear Models (Tenenbaum, Freeman 1997, 2000)
Pros: general framework for two-factor problems; explicit parameterized representations of each factor and their interaction; natural generalization for extrapolation and translation tasks; fast algorithms (SVD).
Cons: assumes linearity in each factor (find clever input representations; decompose into sub-problems).

Example-Based Style Synthesis (Ido Drori, Hezi Yeshurun, Daniel Cohen-Or '03)

Algorithm Outline
1. Divide image into overlapping tiles
2. Find best match in each scene
3. Synthesize tiles by bilinear model
4. Image quilting
5. Image analogies
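Step 1 of the outline can be sketched as below. The tile size, the overlap and the dictionary keyed by top-left corner are assumptions made for illustration; the transcript does not state the values used.

```python
import numpy as np


def overlapping_tiles(img, tile, overlap):
    """Return {(row, col): tile} for square tiles that share `overlap` pixels."""
    step = tile - overlap
    tiles = {}
    for r in range(0, img.shape[0] - tile + 1, step):
        for c in range(0, img.shape[1] - tile + 1, step):
            tiles[(r, c)] = img[r:r + tile, c:c + tile]
    return tiles


image = np.arange(100, dtype=float).reshape(10, 10)
tiles = overlapping_tiles(image, tile=4, overlap=2)
```

This sketch covers the whole image only when the image size minus the tile size is a multiple of the step; otherwise a border strip is left out and would need an extra row and column of tiles.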

Create Gaussian pyramids for the example and input images. Apply the algorithm to each level, from coarse to fine.
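A minimal Gaussian pyramid in NumPy, assuming a 5-tap binomial kernel, reflective padding and subsampling by 2; these are common choices, not ones stated in the slides. The per-level processing is left as a placeholder comment.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial approximation of a Gaussian


def blur(img):
    """Separable 5x5 blur that keeps the image size."""
    padded = np.pad(img, 2, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, KERNEL, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, mode="valid"), 0, rows)


def gaussian_pyramid(img, levels):
    """Return [finest, ..., coarsest]; each level is blurred and halved."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid


image = np.random.default_rng(4).random((16, 16))
pyramid = gaussian_pyramid(image, levels=3)

for level in reversed(pyramid):        # coarse to fine
    pass                               # the per-level synthesis step would go here
```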

Finding the best matching fragment: similar geometry, agreeing boundaries. $V_{search}$: a vector of luminance, gradient and Laplacian features. For each training scene: create $V$ in every position and orientation, then search for the nearest neighbor.
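The search can be sketched as a brute-force nearest-neighbour scan. The exact composition of the search vector is not recoverable from the transcript, so the feature vector below (raw luminance, gradient magnitude and a finite-difference Laplacian, concatenated) is an assumption, as is the restriction to four 90-degree orientations.

```python
import numpy as np


def features(patch):
    """Assumed search vector: luminance, gradient magnitude and Laplacian."""
    gy, gx = np.gradient(patch)
    grad_mag = np.hypot(gx, gy)
    laplacian = np.gradient(gy, axis=0) + np.gradient(gx, axis=1)
    return np.concatenate([patch.ravel(), grad_mag.ravel(), laplacian.ravel()])


def best_match(query, scene):
    """Scan every position and 90-degree rotation of a square window in `scene`."""
    size = query.shape[0]
    target = features(query)
    best_dist, best_place = np.inf, None
    for r in range(scene.shape[0] - size + 1):
        for c in range(scene.shape[1] - size + 1):
            window = scene[r:r + size, c:c + size]
            for k in range(4):
                d = float(np.sum((features(np.rot90(window, k)) - target) ** 2))
                if d < best_dist:
                    best_dist, best_place = d, (r, c, k)
    return best_dist, best_place


scene = np.random.default_rng(5).random((8, 8))
query = np.rot90(scene[2:6, 3:7], 1)           # a rotated fragment of the scene
dist, (row, col, turns) = best_match(query, scene)
```

A practical implementation would replace the exhaustive scan with an approximate nearest-neighbour structure; the brute-force loop is used here only to keep the example short.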
