Self-Supervised Cross-View Action Synthesis


Self-Supervised Cross-View Action Synthesis
Kara Schatz
Advisor: Dr. Yogesh Rawat
UCF CRCV – REU, Summer 2019

Project Goal
Synthesize a video from an unseen view.

Given:
  a video of the same scene from a different viewpoint
  appearance conditioning from the desired viewpoint

The goal of this project is to synthesize a video from an unseen view. To achieve this, our approach uses a video of the same scene from a different viewpoint as well as appearance conditioning from the desired viewpoint.

Approach
This diagram shows the approach that we are using to accomplish our goal. The overall idea is to use one network to learn the appearance of the desired view and another network to learn a representation of the 3D pose from a different view of the video. We then feed both into a video generator that reconstructs the video from the desired view. For training, we run the network on two different views and reconstruct both viewpoints. Once trained, the model only needs one view of the video and one frame of the desired view.
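The data flow described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's implementation: the three networks are passed in as ordinary callables, the first frame of a view is assumed to serve as the appearance conditioning, and mean squared error summed over both views is an assumed stand-in for the total loss, which the slides do not specify. All function names here are hypothetical.

```python
from typing import Callable, List, Sequence

Frame = List[float]   # a flattened frame, standing in for an image tensor
Video = List[Frame]   # a sequence of frames


def synthesize_view(
    source_video: Video,
    target_frame: Frame,
    appearance_net: Callable[[Frame], Sequence[float]],
    pose_net: Callable[[Video], Sequence[float]],
    generator: Callable[[Sequence[float], Sequence[float]], Video],
) -> Video:
    """Generate the target-view video from a source-view video and one target frame."""
    appearance = appearance_net(target_frame)  # appearance of the desired view
    pose = pose_net(source_video)              # pose representation from the other view
    return generator(appearance, pose)


def mse(a: Video, b: Video) -> float:
    """Mean squared error over all values (assumed loss, not from the slides)."""
    diffs = [(x - y) ** 2 for fa, fb in zip(a, b) for x, y in zip(fa, fb)]
    return sum(diffs) / len(diffs)


def training_loss(view_a, view_b, appearance_net, pose_net, generator) -> float:
    """Reconstruct both viewpoints from each other and sum the two errors."""
    # Assumption: the first frame of each view is used as appearance conditioning.
    recon_b = synthesize_view(view_a, view_b[0], appearance_net, pose_net, generator)
    recon_a = synthesize_view(view_b, view_a[0], appearance_net, pose_net, generator)
    return mse(recon_a, view_a) + mse(recon_b, view_b)
```

At test time only `synthesize_view` is needed, which matches the statement that the trained model takes one view of the video and one frame of the desired view.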

Datasets

NTU
  13K+ training videos
  5K+ testing videos
  3 camera angles: -45°, 0°, +45°

Panoptic
  3800 training samples
  500 testing samples
  100 cameras

Panoptic has far fewer samples but many more viewpoints, so its training set is more diverse.
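To make the viewpoint comparison concrete, the sketch below counts the ordered (source view, target view) pairs available per recorded sample, using the camera counts from the slide. It assumes every camera captures every sample and that a pair is ordered, neither of which the slides confirm; `ordered_view_pairs` is a hypothetical helper.

```python
from itertools import permutations


def ordered_view_pairs(num_cameras: int) -> int:
    """Count (source, target) camera pairs with source != target."""
    return sum(1 for _ in permutations(range(num_cameras), 2))


ntu_pairs = ordered_view_pairs(3)          # 3 * 2 = 6
panoptic_pairs = ordered_view_pairs(100)   # 100 * 99 = 9900
print(ntu_pairs, panoptic_pairs)
```

Under these assumptions each Panoptic sample offers far more cross-view training pairs than an NTU sample, even though NTU has more videos overall.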

Total Loss vs. Epochs
[Plots: NTU and Panoptic]
Batch size = 20, frame count = 16, skip rate = 2
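The frame count and skip rate settings can be read as a clip-sampling rule. The sketch below assumes that a skip rate of 2 means taking every second frame; the slides do not define the term, so this reading and the helper name `sample_frame_indices` are assumptions.

```python
def sample_frame_indices(start: int, frame_count: int = 16, skip_rate: int = 2) -> list:
    """Return the video frame indices that make up one training clip."""
    return [start + i * skip_rate for i in range(frame_count)]


indices = sample_frame_indices(start=0)
print(indices[0], indices[-1], len(indices))  # 0 30 16
```

With these settings one 16-frame clip covers 31 consecutive frames of the original video.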

Output Frames
[Sample output frames: NTU and Panoptic]
I noticed that the people get cropped out a lot in Panoptic. I think the difference is that the colors in Panoptic are so close to each other that it is hard to differentiate them.

Modified Network
[Diagram labels: Key Point Extraction, Key-points, Transformation, viewpoint, Estimated Keypoints]
After that, I can start making changes to hopefully improve the model. I can make changes to the network, the loss function I am using, and the data input strategies to see how those impact performance.
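The diagram appears to pair key-points with a viewpoint input to produce estimated keypoints. As a purely geometric illustration of that idea, the sketch below rotates 3D keypoints about the vertical axis by a viewpoint angle. This is an assumption made for clarity: the Transformation block in the slides is most likely learned, the keypoints may not be 3D, and `rotate_keypoints_yaw` is a hypothetical name.

```python
import math


def rotate_keypoints_yaw(keypoints, angle_deg):
    """Rotate (x, y, z) keypoints about the vertical (y) axis by angle_deg degrees."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Standard rotation about the y axis: y is unchanged, x and z mix.
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in keypoints]


# Example: move keypoints seen at 0 degrees to the +45 degree camera angle.
pose_at_0 = [(0.0, 1.7, 0.0), (0.3, 1.2, -0.5)]
pose_at_45 = rotate_keypoints_yaw(pose_at_0, 45.0)
```

Rotating by +45° and then by -45° returns the original keypoints, which is the kind of consistency a learned transformation between the NTU camera angles would also be expected to respect.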

Total Loss vs. Epochs
[Plot: new network vs. old network]
Dataset = NTU, batch size = 20, frame count = 16, skip rate = 2

Total Loss vs. Epochs
[Plot: new network vs. old network]
Dataset = Panoptic, batch size = 20, frame count = 16, skip rate = 2

Next Steps
  Reconstruction with new network
  Fix dataset issues
    Missing data
    Cropping people out
  Using close cameras
  Modify network design