
Sequence-to-Segments Networks for Segment Detection
Zijun Wei1, Boyu Wang1, Minh Hoai1, Jianming Zhang2, Xiaohui Shen3, Zhe Lin2, Radomír Měch2, Dimitris Samaras1
1Stony Brook University, 2Adobe Research, 3ByteDance AI Lab

Detecting Segments of Interest
Task: Given an input sequence, find the segments of interest.
Applications: video summarization; video action proposals in untrimmed videos.
Challenges:
- Global dependency: the interestingness of a segment also depends on the whole sequence.
- Interdependency: segments are not independent of one another.
- Efficiency: the segment search space increases exponentially.

Our contributions
- Sequence-to-Segments Network (S2N): an end-to-end network architecture for detecting segments in a sequence.
- Hungarian matching: customized for matching multiple predictions with the ground truth.
- Earth Mover's Distance: models the segment localization loss.
- State-of-the-art performance on both video summarization and video action proposal tasks.

S2N Model Overview
Overall:
- Global dependency: the encoding stage represents the whole sequence.
- Interdependency: segments are decoded sequentially.
- Efficiency: the model points directly to the starting and ending positions (vs. sliding window).
SDU (Segment Detection Unit):
- GRU for state update
- Pointer Network modules for boundary localization
- MLP regression/classification for confidence prediction
Boundary localization: $b_j / d_j = \arg\max_i g(h_j, e_i)$, where $g(h_j, e_i) = v^\top \tanh(W_1 e_i + W_2 h_j)$.

Problem Formulation
The problem is defined over an input sequence, a set of output segments (each with a start, an end, and a score), and a set of ground truth segments.

Training S2N
S2N is trained end-to-end.
- Hungarian matching: matches G (ground truth) and S (proposals). Figure legend: A, B: ground truth; 1, 2, 3, 4: predicted segments; o ∈ {0, 1}: assignment; n: order; l: localization.
- Localization loss: Earth Mover's Distance (figure panels: Cross Entropy, EMD).

Experimental Results: Action Proposals
Quantitative Results (THUMOS14). *Frequency: proposals per second.

Experimental Results: Video Summarization
F1 score on the SumMe dataset.

Qualitative Visualization

Future Directions
- Modify GRU to record longer sequences (e.g. IndRNN)
- Explore other applications (EEG, NLP, etc.)
- Base S2N on a fully convolutional encoder-decoder (seqCNN)
- Apply S2N to action detection in untrimmed videos

Acknowledgements: This project was partially supported by NSF-CNS, NSF-IIS, NSF-IIS, the Partner University Fund, the SUNY2020 Infrastructure Transportation Security Center, and a gift from Adobe.
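
The pointer formula from the model overview, g(hj, ei) = vT tanh(W1 ei + W2 hj) followed by an argmax over i, can be illustrated with a small sketch. This is not the authors' implementation: it assumes hj is a decoder state and ei are encoder states, and the names `pointer_scores` and `point`, along with the toy weights, are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def pointer_scores(h_j, encoder_states, W1, W2, v):
    """Score every position i with g(h_j, e_i) = v^T tanh(W1 e_i + W2 h_j)."""
    w2h = matvec(W2, h_j)  # shared across all positions
    scores = []
    for e_i in encoder_states:
        w1e = matvec(W1, e_i)
        hidden = [math.tanh(a + b) for a, b in zip(w1e, w2h)]
        scores.append(sum(vk * hk for vk, hk in zip(v, hidden)))
    return scores

def point(scores):
    """Return the index with the highest score (the argmax over i)."""
    return max(range(len(scores)), key=scores.__getitem__)

# Toy values (assumptions, not learned parameters).
W1 = [[1.0, 0.0], [0.0, 1.0]]   # identity
W2 = [[0.0, 0.0], [0.0, 0.0]]   # ignores the decoder state in this toy case
v = [1.0, 0.0]
h_j = [0.0, 0.0]
encoder_states = [[0.1, 0.0], [2.0, 0.0], [-1.0, 0.0]]

scores = pointer_scores(h_j, encoder_states, W1, W2, v)
boundary = point(scores)
```

Because the output is an index into the input sequence, a boundary is read off in one step rather than by scoring every candidate window, which is the efficiency argument made on the poster.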
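
The matching of ground truth G to proposals S can be sketched as a minimum-cost one-to-one assignment. The sketch below makes two assumptions that are not from the poster: the cost is 1 minus temporal IoU, and the optimum is found by brute force over permutations instead of the Hungarian algorithm, which returns the same minimum-cost assignment but scales to realistic sizes. The function names and segment values are hypothetical, loosely mirroring the figure legend (two ground truth segments A and B, four predictions).

```python
from itertools import permutations

def tiou(a, b):
    """Temporal intersection-over-union of two (start, end) segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def match(ground_truth, predictions):
    """Assign each ground truth segment to a distinct prediction at minimum cost.

    Brute force for illustration only; assumes len(predictions) >= len(ground_truth).
    """
    best_cost, best = float("inf"), None
    for perm in permutations(range(len(predictions)), len(ground_truth)):
        cost = sum(1.0 - tiou(g, predictions[p])
                   for g, p in zip(ground_truth, perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return dict(enumerate(best))  # ground truth index -> prediction index

ground_truth = [(10, 20), (40, 60)]                     # A, B
predictions = [(0, 5), (12, 22), (38, 55), (70, 80)]    # 1, 2, 3, 4

assignment = match(ground_truth, predictions)
matched = set(assignment.values())
# Binary indicator per prediction: 1 if matched to some ground truth, else 0.
o = [1 if i in matched else 0 for i in range(len(predictions))]
```

Here A is matched to the second prediction and B to the third; the unmatched predictions receive an indicator of 0, which is one plausible reading of the o ∈ {0, 1} assignment in the legend.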
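
The contrast between Cross Entropy and EMD as a localization loss can be illustrated with a toy example. The sketch uses one common form of the Earth Mover's Distance for one-dimensional distributions, the sum of absolute differences between cumulative sums; whether S2N uses this exact form or a variant such as the squared version is not stated in the transcript, so the formula, function names, and numbers below are assumptions.

```python
import math
from itertools import accumulate

def emd_1d(p, q):
    """EMD between two distributions over the same ordered positions,
    computed as the sum of absolute differences of their cumulative sums."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))

def cross_entropy(p, target_index):
    """Cross entropy of prediction p against a one-hot target."""
    return -math.log(p[target_index])

target_index = 2
target = [0.0, 0.0, 1.0, 0.0]          # true boundary at position 2

near = [0.1, 0.6, 0.2, 0.1]            # most mass one step from the target
far = [0.6, 0.1, 0.2, 0.1]             # most mass two steps from the target

ce_near, ce_far = cross_entropy(near, target_index), cross_entropy(far, target_index)
emd_near, emd_far = emd_1d(near, target), emd_1d(far, target)
```

Both predictions put the same probability on the true position, so their cross entropy is identical, while the EMD is larger for the prediction whose mass sits farther from the true boundary. That sensitivity to distance is the usual motivation for an EMD-style localization loss.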

