1
Textual Video Prediction
REU Student: Emily Cosgrove
Graduate Student: Amir Mazaheri
Professor: Dr. Shah
2
PSNR (peak signal-to-noise ratio)
Most common measurement for video prediction
PSNR = 10 \log_{10} ( MAX_I^2 / MSE )
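As a rough illustration of the metric, the sketch below computes PSNR between a predicted frame and the ground-truth frame with NumPy, using PSNR = 10 log10(MAX^2 / MSE). The function name `psnr` and the default `max_value=255.0` (8-bit pixels) are assumptions made for this example and are not taken from the presentation.

```python
import numpy as np


def psnr(predicted, target, max_value=255.0):
    """Peak signal-to-noise ratio, in dB, between two frames of equal shape.

    max_value is the largest possible pixel value (assumed 255 for 8-bit images).
    """
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if predicted.shape != target.shape:
        raise ValueError("frames must have the same shape")

    # Mean squared error over all pixels and channels.
    mse = np.mean((predicted - target) ** 2)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values mean the prediction is closer to the real frame. For a multi-frame prediction, one option is to average the per-frame values.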
3
PredNet
We trained and tested PredNet on our movie dataset.
It was originally trained on the KITTI dataset.
It predicts one frame.
We are working on changing the code to predict multiple frames.
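One generic way to get several future frames out of a one-frame predictor is to feed each prediction back in as input. The sketch below shows only that idea. It is not the PredNet code and not necessarily the approach taken in this project: `rollout` and the `predict_next` callable are hypothetical names introduced for illustration.

```python
import numpy as np


def rollout(predict_next, context_frames, num_future):
    """Predict num_future frames by repeatedly applying a one-step predictor.

    predict_next: hypothetical callable that maps an array of shape
        (window, H, W[, C]) to a single next frame of shape (H, W[, C]).
    context_frames: sequence of observed frames used to start the rollout.
    """
    if num_future < 1:
        raise ValueError("num_future must be at least 1")

    frames = [np.asarray(f, dtype=np.float64) for f in context_frames]
    window = len(frames)  # keep the input length fixed during the rollout
    predictions = []
    for _ in range(num_future):
        # Use the most recent frames (observed or predicted) as input.
        next_frame = np.asarray(predict_next(np.stack(frames[-window:])))
        predictions.append(next_frame)
        frames.append(next_frame)
    return np.stack(predictions)
```

Because each step consumes earlier predictions, errors can accumulate over the rollout.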
4
PSNR Table
Method Name                     PSNR     Details
PredNet                         17.58    Predicts just one frame
                                N/A      Predicts multiple frames (working on the code)
ConvLSTM                        21.3     Predicts multiple frames
STN (prediction of tx and ty)   31.235
ConvLSTM + Text
5
Next steps
Compute the spatial attention
Possible usage of text
Copy pixels outside the attention area
Predict pixels inside the attention area
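The copy/predict split in the steps above can be sketched as a mask-weighted blend of two frames. This is a minimal illustration under stated assumptions: the attention map is taken to be an array with values in [0, 1] (1 inside the attention area), and `composite_with_attention` is a hypothetical helper, not code from the project.

```python
import numpy as np


def composite_with_attention(last_frame, predicted_frame, attention_mask):
    """Blend a predicted frame with the last observed frame.

    Inside the attention area (mask near 1) the predicted pixels are used;
    outside it (mask near 0) pixels are copied from the last observed frame.
    """
    last = np.asarray(last_frame, dtype=np.float64)
    pred = np.asarray(predicted_frame, dtype=np.float64)
    mask = np.asarray(attention_mask, dtype=np.float64)

    # Let an (H, W) mask broadcast over the channels of (H, W, C) frames.
    if mask.ndim == last.ndim - 1:
        mask = mask[..., np.newaxis]

    return mask * pred + (1.0 - mask) * last
```

A soft mask (values strictly between 0 and 1) mixes the two sources, which avoids a hard seam at the edge of the attention area.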
6
Video Spatial Attention
[Diagram. Labels: Generated Text, LSTM, Background, Video, Video Spatial Attention]