Self-Supervised Cross-View Action Synthesis

1 Self-Supervised Cross-View Action Synthesis
Kara Schatz
Advisor: Dr. Yogesh Rawat
UCF CRCV – REU, Summer 2019

2 Project Goal
Synthesize a video from an unseen view. The goal of this project is to synthesize a video from an unseen view.

3 Project Goal
Synthesize a video from an unseen view. Given: a video of the same scene from a different viewpoint, and appearance conditioning from the desired viewpoint. To achieve this, our approach uses a video of the same scene from a different viewpoint as well as appearance conditioning from the desired viewpoint.

4 Approach
This diagram shows the approach that we are using to accomplish our goal. The overall idea is to use one network to learn the appearance of the desired view and another network to learn a representation of the 3D pose from a different view of the video. We then feed both into a video generator that reconstructs the video from the desired view. For training, we run the network on two different views and reconstruct both viewpoints. Once trained, the model only needs one view of the video and one frame of the desired view.
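The data flow described on this slide can be sketched roughly as follows. This is a minimal, framework-free illustration and not the project's actual code: the names `synthesize_view`, `cross_view_step`, `appearance_net`, `pose_net`, and `generator` are hypothetical, the networks are passed in as plain callables, and using the first frame of the target view as the appearance conditioning is an assumption.

```python
from typing import Any, Callable, List, Tuple

Frame = Any          # placeholder for an image tensor
Video = List[Frame]  # a clip is a list of frames


def synthesize_view(
    source_video: Video,
    target_frame: Frame,
    appearance_net: Callable[[Frame], Any],
    pose_net: Callable[[Video], Any],
    generator: Callable[[Any, Any], Any],
) -> Any:
    """Generate the target-view video from a source-view video and one target frame."""
    appearance = appearance_net(target_frame)  # appearance of the desired view
    pose = pose_net(source_video)              # pose representation from the other view
    return generator(appearance, pose)


def cross_view_step(
    video_a: Video,
    video_b: Video,
    appearance_net: Callable[[Frame], Any],
    pose_net: Callable[[Video], Any],
    generator: Callable[[Any, Any], Any],
) -> Tuple[Any, Any]:
    """Training-time pass: reconstruct both viewpoints, each from the other one."""
    # Assumption: the first frame of each view serves as its appearance conditioning.
    recon_b = synthesize_view(video_a, video_b[0], appearance_net, pose_net, generator)
    recon_a = synthesize_view(video_b, video_a[0], appearance_net, pose_net, generator)
    return recon_a, recon_b
```

At inference time only `synthesize_view` would be needed, which matches the statement that a trained model requires one view of the video and one frame of the desired view.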

5 Dataset: NTU RGB+D
56K+ videos; 3 camera angles: -45°, 0°, +45°. The dataset that we will use is NTU RGB+D, which contains over 56 thousand videos taken from 3 different camera angles, which give us the different viewpoints. The videos will be resized and randomly cropped to 112x112 before being passed to the network.

6 Dataset: NTU RGB+D
13K+ training videos; 5K+ testing videos; 3 camera angles: -45°, 0°, +45°. Training inputs: each sample uses 2 randomly chosen views; resize and randomly crop to 112x112; 8/16 frames. For training inputs this week, I have been using 2 randomly chosen views out of the 3 available views. I resized and randomly cropped to 112x112 and used 8 frames.
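The input strategy above (two random views, a random 112x112 crop, a fixed number of frames) could be sketched like this. It is only an illustration under stated assumptions: the helper names `pick_views`, `frame_indices`, and `crop_box` are hypothetical, "skip rate" is assumed to mean taking every `skip_rate`-th frame, and the size the frames are resized to before cropping is not given in the slides, so it is left as a parameter.

```python
import random
from typing import List, Tuple

VIEWS = (-45, 0, 45)  # camera angles in degrees, as listed on the slide
CROP = 112            # crop size from the slide


def pick_views(rng: random.Random) -> Tuple[int, int]:
    """Choose 2 distinct views out of the 3 available ones."""
    a, b = rng.sample(VIEWS, 2)
    return a, b


def frame_indices(num_frames: int, clip_len: int, skip_rate: int,
                  rng: random.Random) -> List[int]:
    """Pick clip_len frame indices spaced skip_rate apart (assumed meaning of skip rate)."""
    span = (clip_len - 1) * skip_rate + 1
    if num_frames < span:
        raise ValueError("video too short for this clip length and skip rate")
    start = rng.randint(0, num_frames - span)
    return [start + i * skip_rate for i in range(clip_len)]


def crop_box(height: int, width: int, rng: random.Random,
             size: int = CROP) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of a random size x size crop."""
    if height < size or width < size:
        raise ValueError("frame smaller than crop size")
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return top, left, top + size, left + size
```

For a sample to stay aligned across viewpoints, the same frame indices would presumably be used for both chosen views; the slides do not say whether the crop is also shared.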

7 Total Loss vs. Epochs
[Plots: skip rate = 2 and skip rate = 3]

8 Losses
But you'll remember that the overall loss is actually composed of 3 separate losses.
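A rough sketch of how three such terms might be combined is shown below. The slides name a consistency loss and reconstruction losses but do not define them, so everything here is an assumption: one reconstruction term per view, a consistency term comparing the representations obtained from the two views, mean squared error for each term, and equal weights. The names `mse` and `total_loss` are hypothetical.

```python
from typing import Sequence, Tuple


def mse(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def total_loss(
    recon_a: Sequence[float], target_a: Sequence[float],
    recon_b: Sequence[float], target_b: Sequence[float],
    rep_a: Sequence[float], rep_b: Sequence[float],
    w_recon: float = 1.0, w_cons: float = 1.0,
) -> Tuple[float, float, float, float]:
    """Return (total, recon_loss_a, recon_loss_b, consistency)."""
    recon_loss_a = mse(recon_a, target_a)  # assumed: generated vs. real video, view A
    recon_loss_b = mse(recon_b, target_b)  # assumed: generated vs. real video, view B
    consistency = mse(rep_a, rep_b)        # assumed: agreement of the two representations
    total = w_recon * (recon_loss_a + recon_loss_b) + w_cons * consistency
    return total, recon_loss_a, recon_loss_b, consistency
```

Returning the individual terms alongside the total makes it straightforward to plot each loss against epochs separately.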

9 Consistency Loss vs. Epochs
[Plots: skip rate = 2 and skip rate = 3]

10 Reconstruction Losses vs. Epochs
[Plots: skip rate = 2 and skip rate = 3, shown twice]

11 Total Loss vs. Epochs
[Plots: no precropping and precropping]

12 Total Loss vs. Epochs
[Plots: skip rate = 2 and skip rate = 3]

13 Output Frames

14 Output Frames

15 Next Steps
After that, I can start making changes to hopefully improve the model. I can make changes to the network, the loss function I am using, and the data input strategies to see how those impact performance.

16 Next Steps
Use Panoptic Dataset: provides more viewpoints

17 Next Steps Modify Network

18 Next Steps Modify Network

19 Next Steps Modify Network
[Diagram labels: transformation, viewpoint]

20 Next Steps Modify Network
[Diagram labels: transformation, key-points, viewpoint]
