1
Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis Speaker: ZHAO Jian Homepage: Affiliation: Learning and Vision Group, ECE, NUS
2
(Figure: source profile faces (S) at 90°, 75°, and 45° with synthesized frontal views and ground truth (GT).)
3
$$L_{tv} = \sum_{i,j} \left( (x_{i,j+1} - x_{i,j})^2 + (x_{i+1,j} - x_{i,j})^2 \right)^{\beta/2}$$
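The total variation loss above can be computed directly from the double summation; here is a minimal NumPy sketch (the function name is mine, and the sum is restricted to interior pixels where both finite differences exist):

```python
import numpy as np

def tv_loss(x, beta=1.0):
    """Total variation loss for a 2-D image array x."""
    dh = x[1:, :] - x[:-1, :]   # vertical differences:   x[i+1, j] - x[i, j]
    dw = x[:, 1:] - x[:, :-1]   # horizontal differences: x[i, j+1] - x[i, j]
    # sum over pixels where both differences are defined
    return np.sum((dw[:-1, :] ** 2 + dh[:, :-1] ** 2) ** (beta / 2.0))
```

With β = 1 (the value used on the next slide) this reduces to the isotropic TV norm, which encourages spatial smoothness in the synthesized face without over-penalizing sharp edges.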
4
Implementation Details
LR = , BATCH_SIZE = 10, INPUT_SIZE = 128×128×3, BETA = 1, ALPHA = 0.001, LAMBDA_1 = 0.3, LAMBDA_2 = 0.001, LAMBDA_3 = 0.003, LAMBDA_4 =
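The constants above suggest a weighted sum of loss terms. A hypothetical aggregation sketch follows; the mapping of each LAMBDA to a specific loss term is my assumption (the slide does not state it), and LAMBDA_4 is left as a placeholder because its value is not given:

```python
# Loss weights as listed on the slide; the assignment of each weight
# to a particular loss term below is an assumption for illustration.
LAMBDA_1, LAMBDA_2, LAMBDA_3 = 0.3, 0.001, 0.003

def total_loss(l_pixel, l_sym, l_adv, l_ip, l_tv, lambda_4=0.0):
    """Weighted sum of loss terms; lambda_4 is a placeholder
    because its value is not given on the slide."""
    return (l_pixel                  # pixel loss, implicit weight 1.0
            + LAMBDA_1 * l_sym       # symmetry loss (assumed mapping)
            + LAMBDA_2 * l_adv       # adversarial loss (assumed mapping)
            + LAMBDA_3 * l_ip        # identity-preserving loss (assumed mapping)
            + lambda_4 * l_tv)       # total variation loss (assumed mapping)
```

The dominance of the unit-weight pixel term over the milli-scale weights is exactly the imbalance criticized on the later "Underlying Problems" slide.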
5
(Figure: synthesis results for source poses 90°, 75°, 60°, 45°, 30°, and 15°, with ground truth (GT).)
7
Quantitative Results
8
Underlying Problems for Re-implementation
The "Template Landmark Location" used for patch position aggregation is not given. How, then, should the feature maps of the 4 Landmark Located Patch Networks be fused (estimate the template from the frontal GT)?
What if the 4 patches cannot be detected (e.g., left eye <-> right eye confusion)?
The network architecture of the discriminator is not given; it might be the same as the encoder of the generator's global network.
Training details are not given. Re-implementation is tricky, time-consuming, and not guaranteed to work properly.
**The weight of the pixel loss is too large, while the weights of the other losses are too small. The other losses therefore act as little more than a gimmick, which is unreasonable for a GAN-based framework and leads to a severe overfitting problem.
Still, we can try several improvements (e.g., acceleration, adaptive aggregation of losses, a Siamese structure, learning to learn, more training data).
Q & A
9
Re-implementation results (preliminary)
Dataset: Multi-PIE (4,324 images, 250 subjects, 0-90°)
LR = , BATCH_SIZE = 10, INPUT_SIZE = 128×128×3, BETA = 1, ALPHA = 0.001, LAMBDA_1 = 0.3, LAMBDA_2 = 0.001, LAMBDA_3 = 0.003, LAMBDA_4 =
Iterations: ~100k
Training time: ~1 day; testing time: 20 ms / image
10
Re-implementation results (preliminary)
Problems: poor generalization capacity -> overfitting? / sub-optimal hyperparameters? / unreasonable loss weights? (Results shown w/o the pixel loss.)
11
Re-implementation results (preliminary)
Problems: poor generalization capacity -> overfitting? / sub-optimal hyperparameters? / unreasonable loss weights? (TP-GAN applied to other data.)
12
Present results
13
Possible solutions
Incorporate all available Multi-PIE data for training TP-GAN.
Improve and modify the pixel loss (l1 loss): this supervision signal is too strong and makes the network overfit quickly. In the original version of TP-GAN, the discriminator seems not to contribute much to the optimization of the generator, which means the authors rely on the pixel loss (with a large weight of 1.0) to make the generator memorize each frontal ground truth!
Improvement on the generator: Domain-Adversarial Training -> a global generator & 4 local patch generators.
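For reference, the l1 pixel loss criticized above is simply the mean absolute error between the synthesized frontal face and the ground truth; a minimal NumPy sketch (function name is mine):

```python
import numpy as np

def pixel_l1_loss(pred, gt):
    """Mean absolute error between a synthesized frontal face
    and its ground-truth frontal image (same shape)."""
    return np.mean(np.abs(pred - gt))
```

Because every pixel of the ground truth is supervised directly, a large weight on this term pushes the generator toward memorizing the training frontals rather than learning a pose-invariant mapping, which is the overfitting mechanism described above.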
14
Possible solutions
Improvement on the discriminator: adopt a Siamese-like discriminator and inject dynamic convolution to capture more information via domain transfer learning (use a pre-trained LightCNN to predict the dynamic kernel weights of the discriminator, replacing the ip loss), so as to optimize the generator for photorealistic frontal face synthesis.
Tune relevant parameters.
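The dynamic-convolution idea can be sketched minimally: instead of learning the discriminator's kernel directly, an external network (here, the pre-trained LightCNN mentioned above) would predict it per input. The sketch below shows only the core operation, a 1×1 convolution whose kernel is supplied externally; the function name and shapes are my assumptions:

```python
import numpy as np

def dynamic_conv1x1(feat, kernel):
    """Apply a 1x1 convolution whose kernel is predicted by an
    external network rather than learned as a fixed parameter.

    feat:   feature map of shape (C_in, H, W)
    kernel: predicted weights of shape (C_out, C_in)
    """
    c_in, h, w = feat.shape
    # a 1x1 conv is a channel-wise linear map at every spatial location
    out = kernel @ feat.reshape(c_in, h * w)
    return out.reshape(kernel.shape[0], h, w)
```

In the proposed setup, `kernel` would be a function of the LightCNN identity features, so the discriminator's response adapts to the identity of each input pair, which is what lets it stand in for an explicit identity-preserving loss.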
15
Thank you!