Automatic Generation of Personalized Music Sports Video ACM MM’2005


1 Automatic Generation of Personalized Music Sports Video ACM MM’2005
Jinjun Wang, Changsheng Xu, Engsiong Chng, Lingyu Duan, Kongwah Wan, Qi Tian 2018/12/9 by pj

2 Outline
Introduction
System framework
Video content selection
3.1 Video/Audio analysis
3.2 Text analysis
3.3 Align the text event with A/V stream
Automatic video composition
Experiment results

3 Introduction
Sports broadcasting is increasingly popular.
One major advantage of digital broadcasting is the possibility of delivering customized and interactive TV programs.

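4 Introduction
However, current production of music videos is very labor-intensive and inflexible.
They cannot be generated automatically; a dedicated editor has to cut them by hand.
They cannot match the preferences of different users, e.g. viewers who like to watch a particular player or team.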

5 Introduction
The two challenges for automatic generation of personalized MSV are:
(1) Semantic sports video content selection by
“event”: goal, injury, card, …
“player/team”: Ronaldo, Germany, …
“topic”: the happiness of teams when winning, …
(2) Automatic video composition

6 Introduction
The contributions of this paper:
The use of TWB (text web broadcast) improves event detection.
It enables sports video content selection by player/team.
It aligns TWB text events with video events.
It proposes video-centric and music-centric schemes to automatically generate MSV (music sports video).

7 System framework

8 3.1 Video/Audio analysis
Shot boundary selection (F1)
The shot is the basic analysis unit.
Using M2-Edit Pro software.

9 3.1 Video/Audio analysis
Semantic shot classification (F2)
The transitions between shots reveal the state of the game.
Shot classes: far view, in-field medium view, in-field close-up view, out-field medium view, out-field close-up view.
Reference: “Soccer replay detection using scene transition structure analysis”, ICASSP, March 2005 (J. Wang)

10 3.1 Video/Audio analysis
Replay detection (F3)
The director launches a replay for an interesting event.
Detect the flying logo and slow motion.
Nowadays, more than 95% of broadcast sports videos use a flying logo to launch replays.

11 3.1 Video/Audio analysis
Camera motion (F4)
The camera motion provides a useful cue to represent the activity of the game.
Features: “average motion magnitude”, “motion entropy”, “dominant motion direction”, “camera pan/tilt/zoom factor”.
Reference: “Automatic replay generation for soccer video broadcasting”, ACM MM’04 (J. Wang)
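The slide names the camera-motion features without defining them. A minimal sketch of how three of them could be computed from one frame's motion vectors is given below; the definitions (mean vector length, Shannon entropy over eight direction sectors, most populated sector) and the function name `motion_features` are assumptions for illustration, not the formulas of the cited paper.

```python
import math
from collections import Counter

def motion_features(vectors, n_bins=8):
    """Summarize one frame's motion vectors [(dx, dy), ...] (assumed definitions)."""
    if not vectors:
        raise ValueError("need at least one motion vector")
    magnitudes = [math.hypot(dx, dy) for dx, dy in vectors]
    avg_magnitude = sum(magnitudes) / len(magnitudes)

    # Quantize each non-zero vector's direction into n_bins equal angular sectors.
    bin_width = 2 * math.pi / n_bins
    bins = Counter()
    for (dx, dy), mag in zip(vectors, magnitudes):
        if mag == 0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        bins[int(angle / bin_width) % n_bins] += 1

    # Shannon entropy of the direction histogram: 0 when all vectors agree.
    moving = sum(bins.values())
    entropy = 0.0
    for count in bins.values():
        p = count / moving
        entropy -= p * math.log2(p)

    dominant = bins.most_common(1)[0][0] if moving else None
    return {
        "avg_magnitude": avg_magnitude,
        "motion_entropy": entropy,
        "dominant_direction_bin": dominant,
    }
```

Under these assumed definitions, a steady pan gives near-zero entropy and a single dominant sector, while chaotic motion spreads the vectors across sectors and raises the entropy.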

12 3.1 Video/Audio analysis
Audio keyword (F5)
There are some significant game-specific sounds that have strong relationships to the actions of players, referees, commentators and audience in sports videos.
Audio keywords: “whistle”, “acclaim”, “noise”.
Reference: “Automatic replay generation for soccer video broadcasting”, ACM MM’04 (J. Wang)

13 3.2 Text analysis
Text analysis can increase the accuracy of video event detection.
It can detect “red/yellow card”, “player”, …
Fields: event, player, time, team

14 3.2 Text analysis
Keyword definition

15 3.2 Text analysis
Text event detection
A keyword may appear in different forms: “goal”, “g-o-a-l”, “gooooaaaaal”.
The software dtSearch supports:
fuzzy: tolerates missing or extra letters, e.g. gooal vs. goal
stemming: grammatical variations, e.g. foul vs. fouling
phonic: words that sound alike, e.g. smith vs. smithe
Player/team extraction
A database must be built in advance and is then used for string matching.
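The keyword variants listed on this slide can be illustrated with a small standard-library sketch. It does not use or imitate dtSearch; the keyword list, the 0.8 threshold and the helper names `normalize` and `match_keyword` are hypothetical, and the prefix test is only a crude stand-in for stemming. Phonic matching is not covered.

```python
import re
from difflib import SequenceMatcher

KEYWORDS = ["goal", "foul", "card"]  # hypothetical keyword list

def normalize(token):
    """Lower-case, drop non-letters, and collapse runs of a repeated letter."""
    letters = re.sub(r"[^a-z]", "", token.lower())
    return re.sub(r"(.)\1+", r"\1", letters)

def match_keyword(token, keywords=KEYWORDS, threshold=0.8):
    """Return the best-matching keyword for token, or None if nothing is close."""
    norm = normalize(token)
    best, best_score = None, 0.0
    for kw in keywords:
        target = normalize(kw)
        if norm.startswith(target):
            # Crude stemming stand-in: "fouling" starts with "foul".
            score = 1.0
        else:
            # Fuzzy similarity in [0, 1] on the normalized strings.
            score = SequenceMatcher(None, norm, target).ratio()
        if score > best_score:
            best, best_score = kw, score
    return best if best_score >= threshold else None
```

With these assumptions, “g-o-a-l”, “gooooaaaaal” and “gooal” all normalize to “goal”. The prefix rule also accepts unrelated words such as “goalkeeper”, so a real system would need a proper stemmer or a stop list.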

16 3.3 Align the text event with A/V stream
Q: The time-stamps in TWB are inaccurate by about 2-3 minutes.
S:

17 3.3 Align the text event with A/V stream
HMM
Maximize the evaluation function
Weights: w_n = 0.2, w_e = 0.8
G(M): shot count for different events.
By training

18 Automatic video composition
4.1 Video-Centric MSV
In our implementation of this scheme, personalized video contents are first selected from the prepared video content selection pool in chronological order and multiplexed with music clips.
For the video-centric case, there is no need to align the video shot boundaries with the music structure boundaries.
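The video-centric selection step described above can be sketched as follows. The `Event` record, its fields, and the `max_length` cut-off are assumptions made for illustration; the slide only states that matching contents are taken from the pool in chronological order, with no alignment to music boundaries.

```python
from dataclasses import dataclass

@dataclass
class Event:
    start: float     # seconds from the start of the match video
    duration: float  # seconds
    kind: str        # e.g. "goal", "card"
    player: str
    team: str

def select_video_centric(pool, max_length, kinds=None, players=None, teams=None):
    """Pick events matching the user's filters, in chronological order."""
    def wanted(ev):
        # An empty or missing filter accepts everything.
        return ((not kinds or ev.kind in kinds)
                and (not players or ev.player in players)
                and (not teams or ev.team in teams))

    timeline, used = [], 0.0
    for ev in sorted(filter(wanted, pool), key=lambda e: e.start):
        if used + ev.duration > max_length:
            break  # assumption: stop once the next event would overrun the budget
        timeline.append(ev)
        used += ev.duration
    return timeline
```

Because no boundary alignment is required in this scheme, the function only orders and truncates the clips; the music track would be laid underneath afterwards.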

19 Automatic video composition
4.2 Music-Centric MSV
Analyzing the semantic music structure
Semantic regions: Intro, Verse, Chorus, Ending, Bridge (transition music)
Reference: “Content-based music structure analysis with applications to music semantics understanding”, ACM MM’04

20 Automatic video composition
4.2 Music-Centric MSV
Content matching: far view for “Intro”, close-up view for “Chorus”, …; the rules are user-defined.
Tempo matching: the tempo matching module aligns the shot boundaries with the music structure boundaries.

21 Automatic video composition
4.2 Music-Centric MSV
Select events that:
satisfy the user-defined rules
have a duration and motion close to those of the music
Evaluation function
T’ = [duration motion], so the smaller the difference, the larger v
Matches the rule? or 1?
Event i has k shots
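The evaluation function itself did not survive in this transcript, so the sketch below only illustrates the stated idea: an event scores higher when its [duration, motion] vector is close to that of the music segment, and scores zero when it breaks a user rule. The form 1 / (1 + distance), the greedy assignment, and the names `score` and `pick_events` are assumptions, not the paper's formula.

```python
import math

def score(event_feat, music_feat, rule_ok=True):
    """Higher when the event's [duration, motion] is close to the music segment's."""
    if not rule_ok:
        return 0.0  # assumption: a rule violation rules the event out entirely
    # Features are assumed to be scaled to comparable ranges beforehand.
    return 1.0 / (1.0 + math.dist(event_feat, music_feat))

def pick_events(segments, events, rule):
    """Greedily assign one unused event to each music segment.

    segments: list of (label, (duration, tempo_or_motion))
    events:   dict name -> (duration, motion)
    rule:     callable(label, event_name) -> bool, the user-defined rules
    """
    remaining = dict(events)
    plan = []
    for label, seg_feat in segments:
        best, best_score = None, 0.0
        for name, feat in remaining.items():
            s = score(feat, seg_feat, rule(label, name))
            if s > best_score:
                best, best_score = name, s
        plan.append((label, best))  # best is None when no event qualifies
        if best is not None:
            del remaining[best]
    return plan
```

A greedy pass is the simplest choice and can be suboptimal; a global assignment would give better matches at the cost of more computation.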

22 Experiment results
Dataset: 7 World-Cup 2002, 4 Euro-Cup 2004; about 16 hours
Accuracy of A/V and text alignment
Boundary decision accuracy
Reasons for the poor accuracy:
(1) The error of the A/V features.
(2) Inaccuracy of the TWB timestamps.

23 Experiment results
Is the theme clear?
Is it concise enough?
Can it represent the original video?
This is mainly because our current system is unable to identify whether every single shot in an event is related to the required player/team or not.
How well do the video and the music match?

24 Experiment results
Reasons for the low scores:
(1) Our music-centric MSV contains several event types, which makes it difficult to understand and thus lowers the Clarity score.
(2) Because of the requirement to match the music boundaries, shots within an event are sometimes discarded.

