Textual Video Prediction

REU Student: Emily Cosgrove
Graduate Student: Amir Mazaheri
Professor: Dr. Shah

Preliminary Overview
Deep Learning, CNNs, and RNNs
Computer Vision and Natural Language Processing
Generative Adversarial Networks (GANs)
Video Prediction
Missing Idea?

Problem Description
Goal: Use NLP and textual information for video prediction
Possible contribution: Enhanced or different video prediction

Problem Description
Current video prediction systems: Input Frames -> GAN -> Predicted Frames
Our system: Input Frames + Input Sentence -> GAN -> Predicted Frames
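
To make the difference between the two setups concrete, here is a minimal sketch of what one training sample might look like in each case. It is only an illustration: the function `make_samples`, the parameters `n_in` and `n_out`, the dictionary keys, and the example sentence are all hypothetical, and frames are represented by placeholder integers rather than images. The slides do not specify how clips are cut or how the sentence is fed to the GAN.

```python
from typing import Dict, List, Optional, Sequence


def make_samples(
    frames: Sequence[int],
    n_in: int,
    n_out: int,
    sentence: Optional[str] = None,
) -> List[Dict[str, object]]:
    """Slide a window over a clip and pair input frames with target frames.

    With sentence=None this mirrors the frames-only setup. Passing a
    sentence attaches it to every sample as an extra input.
    All names here are illustrative assumptions, not taken from the slides.
    """
    samples: List[Dict[str, object]] = []
    window = n_in + n_out
    for start in range(len(frames) - window + 1):
        sample: Dict[str, object] = {
            "input_frames": list(frames[start:start + n_in]),
            "target_frames": list(frames[start + n_in:start + window]),
        }
        if sentence is not None:
            sample["input_sentence"] = sentence
        samples.append(sample)
    return samples


clip = list(range(6))  # six placeholder frames: 0..5
baseline = make_samples(clip, n_in=4, n_out=1)
ours = make_samples(clip, n_in=4, n_out=1, sentence="a person waves")
```

The only change between `baseline` and `ours` is the extra `input_sentence` field, which matches the extra "Input Sentence" box in the second diagram.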

Tasks
Step 1: Study current methods to predict videos
  Learn how to set up and run the code for current methods
Step 2: Study datasets that have been used for video prediction so far
  Possibly provide textual annotations for some of them
Step 3: Prepare our measurements
  How do we evaluate our results?
  Which other methods can we compare with?
Step 4: Formulate our solution
  Discuss ideas to solve the problem
Step 5: Implementation
  We will use Keras or TensorFlow to implement our ideas
Step 6: Baseline experiments
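
For Step 3, one measurement that could be considered is peak signal-to-noise ratio (PSNR), which Mathieu et al. (listed in the references) report for predicted frames. The slides do not commit to any metric, so the sketch below is an assumption about what an evaluation helper might look like. The function name `psnr` and the flat list-of-pixels representation are illustrative choices.

```python
import math
from typing import Sequence


def psnr(
    predicted: Sequence[float],
    target: Sequence[float],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio (in dB) between two flattened frames.

    Higher is better. Identical frames give infinity.
    max_value is the largest possible pixel value (255 for 8-bit images).
    """
    if len(predicted) != len(target) or len(target) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A helper like this would be applied to each predicted frame and its ground-truth frame, then averaged over a test set so that different methods can be compared on the same footing.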

Weekly Progress
Introductory meetings with mentor
Read papers related to the topic:
  Generative Adversarial Networks (Goodfellow et al.)
  Decomposing Motion and Content for Natural Video Sequence Prediction (Villegas et al.)
Began Step 1
Model we are currently working with

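Research Paper: Generative Adversarial Networks
Author: Goodfellow et al.
Generator vs. Discriminator
Input: Random noise
Loss functions (minibatch gradients over m samples):
Discriminator: $\nabla_{\theta_d} \frac{1}{m} \sum_{i=1}^{m} \left[ \log D\left(x^{(i)}\right) + \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right) \right]$
Generator: $\nabla_{\theta_g} \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D\left(G\left(z^{(i)}\right)\right)\right)$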
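
As a rough numerical illustration of the two minibatch objectives, the sketch below evaluates the quantities inside the gradients, given discriminator outputs that are already computed. It does not train anything or compute gradients. The function names and the example probabilities are made up for illustration, and a real implementation would use a framework such as Keras or TensorFlow.

```python
import math
from typing import Sequence


def discriminator_objective(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """(1/m) * sum(log D(x_i) + log(1 - D(G(z_i)))).

    d_real holds D(x_i) for real samples and d_fake holds D(G(z_i)) for
    generated samples. The discriminator tries to make this value larger.
    """
    if len(d_real) != len(d_fake) or len(d_real) == 0:
        raise ValueError("both batches must be non-empty and the same size")
    m = len(d_real)
    return sum(math.log(r) + math.log(1.0 - f) for r, f in zip(d_real, d_fake)) / m


def generator_objective(d_fake: Sequence[float]) -> float:
    """(1/m) * sum(log(1 - D(G(z_i)))).

    The generator tries to make this value smaller, which means pushing
    D(G(z_i)) toward 1 so that generated samples look real.
    """
    if len(d_fake) == 0:
        raise ValueError("batch must be non-empty")
    return sum(math.log(1.0 - f) for f in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probabilities of "real") for m = 2.
confident = discriminator_objective(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
confused = discriminator_objective(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
```

Here `confident` is larger than `confused`, reflecting that a discriminator which separates real from generated samples well scores higher on its own objective.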

Next Week
Continue Step 1
  Study the code for current methods
  Read and study the paper related to the code
Preprocess the movie dataset

References
Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems.
Mathieu, Michael, Camille Couprie, and Yann LeCun. "Deep multi-scale video prediction beyond mean square error." arXiv preprint arXiv: (2015).
Villegas, Ruben, Jimei Yang, Seunghoon Hong, Xunyu Lin, and Honglak Lee. "Decomposing Motion and Content for Natural Video Sequence Prediction." ICLR (2017).
