Textual Video Prediction

REU Student: Emily Cosgrove
Graduate Student: Amir Mazaheri
Professor: Dr. Shah

PSNR
PSNR (peak signal-to-noise ratio)
Most common measurement for video prediction
PSNR = 10 \log_{10} \left( \frac{MAX_I^2}{MSE} \right)
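As a rough illustration of the PSNR formula, here is a minimal sketch in plain Python. It assumes 8-bit grayscale frames stored as lists of rows, so the MAX term defaults to 255, and it takes MSE to be the mean squared error between the predicted and ground-truth frames. The function names `mse` and `psnr` are illustrative and are not taken from the project's code.

```python
import math

def mse(pred, target):
    """Mean squared error between two equally sized frames (lists of rows)."""
    diffs = [
        (p - t) ** 2
        for row_p, row_t in zip(pred, target)
        for p, t in zip(row_p, row_t)
    ]
    return sum(diffs) / len(diffs)

def psnr(pred, target, max_i=255.0):
    """PSNR in decibels; max_i is the assumed maximum pixel value."""
    err = mse(pred, target)
    if err == 0:
        # Identical frames: the ratio is unbounded.
        return math.inf
    return 10.0 * math.log10(max_i ** 2 / err)

# Toy 2x2 frames, purely for illustration.
target = [[0, 255], [128, 64]]
pred = [[0, 250], [130, 64]]
score = psnr(pred, target)
```

A higher PSNR means the predicted frame is closer to the ground truth, which is why larger values in a comparison table indicate better predictions.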

PredNet
We trained and tested PredNet on our data (movie dataset)
It was originally trained on the KITTI dataset
It predicts one frame
We are working on changing the code to predict multiple frames
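One generic way to get multiple frames out of a one-frame predictor is to feed each prediction back in as input. The sketch below shows that idea only. It is an assumption about how such an extension could look, not the change being made to the PredNet code. `rollout`, `predict_next`, and `toy_predictor` are hypothetical names.

```python
def rollout(predict_next, context, n_future):
    """Predict n_future frames by feeding each prediction back as input.

    predict_next: any callable mapping a list of past frames to one new frame
                  (a hypothetical stand-in for a trained one-step model).
    context:      list of observed frames.
    """
    frames = list(context)      # copy so the caller's list is not modified
    predictions = []
    for _ in range(n_future):
        nxt = predict_next(frames)
        predictions.append(nxt)
        frames.append(nxt)      # the prediction becomes part of the history
    return predictions

# Toy predictor for illustration only: brightens the last frame by 1.
def toy_predictor(frames):
    return [[px + 1 for px in row] for row in frames[-1]]

future = rollout(toy_predictor, [[[0, 0], [0, 0]]], 3)
```

With this kind of rollout, errors in early predictions can compound in later frames, so per-frame PSNR is usually worth reporting alongside the average.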

PSNR Table
Method Name                      PSNR     Details
PredNet                          17.58    Predicts just one frame
PredNet                          N/A      Predicts multiple frames (working on the code)
ConvLSTM                         21.3     Predicts multiple frames
STN (prediction of tx and ty)    31.235
ConvLSTM + Text

Next steps
Compute the spatial attention
Possible usage of text
Copy pixels outside the attention area
Predict pixels inside the attention area
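The copy/predict split in these next steps can be sketched as a simple mask-based composite. The example assumes a binary attention mask (1 inside the attention area, 0 outside) and frames stored as lists of rows. The name `composite` and the toy values are hypothetical, and how the mask and the predicted pixels are produced is left open here.

```python
def composite(last_frame, predicted, mask):
    """Build the next frame: copy pixels outside the attention area from the
    last observed frame and take predicted pixels inside it."""
    return [
        [p if m else c for c, p, m in zip(row_c, row_p, row_m)]
        for row_c, row_p, row_m in zip(last_frame, predicted, mask)
    ]

# Toy 2x3 example: 10 = copied background, 99 = predicted content.
last_frame = [[10, 10, 10], [10, 10, 10]]
predicted = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 0], [0, 1, 1]]   # 1 marks the assumed attention area
next_frame = composite(last_frame, predicted, mask)
```

Restricting prediction to the attention area means the model only has to account for the pixels that are expected to change, while the static background is reused as-is.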

[Diagram with components labeled: Video, Spatial Attention, Background, LSTM, Text, Generated]
