Synchronization for Multi-Perspective Videos in the Wild

Similar presentations
Generation of Multimedia TV News Contents for WWW Hsin Chia Fu, Yeong Yuh Xu, and Cheng Lung Tseng, Department of Computer Science, National Chiao-Tung.
How did you use media technologies in the construction and research, planning and evaluation stages?
Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
SETTING UP A PROJECT Adobe Premiere Pro CS6. Getting started… Adobe Premiere Pro project file stores links to all the video and sound files-aka…clips.
Lesson 01: The Digital Experience. Transition from traditional devices to multipurpose digital devices. Wired phones move to cell phones and now smart.
Multimedia Database Systems
Kien A. Hua Division of Computer Science University of Central Florida.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
ADVISE: Advanced Digital Video Information Segmentation Engine
The Horizon Report 2009 K-12 Edition Technologies on the Horizon
CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007.
Kyle Heath, Natasha Gelfand, Maks Ovsjanikov, Mridul Aanjaneya, Leo Guibas Image Webs Computing and Exploiting Connectivity in Image Collections.
Web 2.0 tools for your teaching and learning programme.
Project 2 SIFT Matching by Hierarchical K-means Quantization
Evaluation question four: How did you use media technologies in the construction and research, planning and evaluation stages?
Streaming Predictions of User Behavior in Real-Time Ethan Dereszynski (Webtrends), Eric Butler (Cedexis) OSCON 2014.
Business Software What is database software? p. 145 Allows you to create, access, and manage data Add, change, delete, sort, and retrieve data Next.
Mining Cross-network Association for YouTube Video Promotion Ming Yan, Jitao Sang, Changsheng Xu*. 1 Institute of Automation, Chinese Academy of Sciences,
Experimental Results ■ Observations: Overall detection accuracy increases as the length of the observation window increases. An observation window of 100.
Online Kinect Handwritten Digit Recognition Based on Dynamic Time Warping and Support Vector Machine Journal of Information & Computational Science, 2015.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
Sara Davila http:// Webinar Etiquette Thank you for joining us today. Please note the following. Use your telephone or computer for.
Using Cross-Media Correlation for Scene Detection in Travel Videos.
DANIELA KOLAROVA INSTITUTE OF INFORMATION TECHNOLOGIES, BAS Multimedia Semantics and the Semantic Web.
Refined Online Citation Matching and Adaptive Canonical Metadata Construction CSE 598B Course Project Report Huajing Li.
Similarity Measurement and Detection of Video Sequences Chu-Hong HOI Supervisor: Prof. Michael R. LYU Marker: Prof. Yiu Sang MOON 25 April, 2003 Dept.
Chapter 12 Technology in Social Studies Instruction John Magee John Magee Andrew Colpitts Andrew Colpitts.
TRECVID IES Lab. Intelligent E-commerce Systems Lab. 1 Presented by: Thay Setha 05-Jul-2012.
DM-Group Meeting, Liangzhe Chen, Oct. Papers to be presented: RSC: Mining and Modeling Temporal Activity in Social Media (KDD’15), A. F. Costa,
Search Engine Optimization
Experience Report: System Log Analysis for Anomaly Detection
Blended Learning Hilda Cedillo
Data Platform and Analytics Foundational Training
Free PowerPoint Templates Android Assignment Help BookMyEssay.
Digital Video Library - Jacky Ma.
XProtect 2017 R3 New features Mobile access control DLNA out AAC audio
Technologies: for Enhancing Broadcast Programmes with Bridgets
Detecting Semantic Concepts In Consumer Videos Using Audio Junwei Liang, Qin Jin, Xixi He, Gang Yang, Jieping Xu, Xirong Li Multimedia Computing Lab,
Chapter 4: Application Software
What is the Internet? © EIT, Author Gay Robertson, 2016.
Getting Started with Power Query
Capturing, Processing and Experiencing Indian Monuments
Saliency-guided Video Classification via Adaptively weighted learning
Inquiry, Pedagogy, & Technology: Automated Textual Analysis of 30 Refereed Journal Articles David A. Thomas Mathematics Center, University of Great Falls,
Big-Data Fundamentals
Traditional Media and New Media Timeline
My Media Timeline Fernanda R. De Vera ABM- Isaiah.
Automatic Extraction of Conceptual Maps from Design Team Documents
Personalized Social Image Recommendation
Collective Network Linkage across Heterogeneous Social Platforms
Web Data Extraction Based on Partial Tree Alignment
Multimedia Information Retrieval
Social Knowledge Mining
Bo Wang1, Yingfei Xiong2, Zhenjiang Hu3, Haiyan Zhao1,
#VisualHashtags Visual Summarization of Social Media Events using Mid-Level Visual Elements Sonal Goel (IIIT-Delhi), Sarthak Ahuja (IBM Research, India),
A User Attention Based Visible Watermarking Scheme
Securing the Internet of Things: Key Insights and Best Practices Across the Industry Theresa Bui Revon IoT Cloud Strategy.
Weihong Li, Hao Tang and Zhigang Zhu
Citizen Journalism News Literacy.
Title of poster... M. Author1, D. Author2, M. Author3
Web Mining Department of Computer Science and Engg.
Deep Cross-media Knowledge Transfer
Ying Dai Faculty of software and information science,
Integrating Deep Learning with Cyber Forensics
Discovery of Blog Communities based on Mutual Awareness
Paper Reading Dalong Du April.08, 2011.
Title of ePoster... M. Author1, D. Author2, M. Author3
Presentation transcript:

Synchronization for Multi-Perspective Videos in the Wild
Junwei Liang, Poyao Huang, Jia Chen and Alexander Hauptmann
Carnegie Mellon University
{junweil, poyaoh, jiac, alex}@cs.cmu.edu

Abstract

In the era of social media, a large number of user-generated videos are uploaded to the Internet every day, capturing events all over the world. Reconstructing the event truth from the information mined in these videos is an emerging and challenging task. Temporal alignment of videos "in the wild", which capture different moments at different positions from different perspectives, is the critical step. In this paper, we propose a hierarchical approach to synchronize videos. Our system utilizes clustered audio signatures to align video pairs. Global alignment for all videos is then achieved by forming alignable video groups with self-paced learning. Experiments on the Boston Marathon dataset show that the proposed method achieves excellent precision and robustness.

Introduction

With the growing worldwide connectivity and popularity of camera-embedded smart devices, events around the world can now be captured and rapidly shared via social media. When an event happens, especially one with a large crowd of people, different videos record different moments of the same event at different positions and from different perspectives. For example, New Year's Eve in NYC, Carnival in Brazil, and the Boston Marathon bombing all had hundreds or even thousands of attendees upload videos of the event. The collection of these user-generated recordings not only enables new applications such as free-view video and 3D reconstruction, but may also help our society achieve a more unbiased understanding of the event truth. Such information is particularly important for conflict or violent events, in which the truth of what happened is critical to the general public and for law enforcement to take action.

Video in the Wild. Unlike videos captured by fixed, calibrated surveillance cameras, consumer videos are captured "in the wild", i.e., at varying times and locations, from varying perspectives, and with different devices such as smartphones, watches, or camcorders. These videos are noisy and sometimes of low quality. In an unexpected violent event, people are often scared, and the videos may be too blurry or shaky to see clearly. Useful information about the event may be spread across different time segments of different videos. Therefore, to properly process and analyze such a video collection, one main problem that must be solved is to synchronize these videos and place them on a global timeline.

The Boston Dataset

We collected a real-world event synchronization dataset, "Boston Marathon 2013", covering the event in which two consecutive explosions happened on the sidewalk near the finish line of a traditional city marathon in Boston in 2013. The event received widespread international media attention, and synchronizing videos of such an event is very useful for event reconstruction and analysis. We constructed the queries "Boston marathon 2013 explosion", "Boston marathon 2013 bomb", "Boston marathon 2013 after explosion", and "Boston marathon 2013 after bomb" to crawl videos from YouTube and Dailymotion, two of the most popular video sharing websites. We crawled the top 500 search results for each query on YouTube, and all of the search results from Dailymotion. We then manually verified the relevance of all crawled search results by removing irrelevant videos, resulting in 347 relevant videos; a relevant video is defined as an on-site video of the "Boston Marathon 2013" event.

Dataset link: http://aladdin1.inf.cs.cmu.edu:8081/boston/
Technical report: http://www.cmu.edu/chrs/documents/Videos-Boston-Marathon.pdf

Video Synchronization

Pre-processing and Feature Extraction. Since many user-generated videos are edited before being uploaded to social media, our system first chunks videos into time-continuous segments based on shot boundary detection. In an unexpected violent event, people are scared and the video quality may be low and too blurry to provide useful visual evidence. Therefore, this system focuses on the audio modality for synchronization. We extract low-level audio features from the audio tracks, leaving out videos with no sound.

Audio Temporal Signature Extraction. Most user-generated videos are very noisy. To extract a useful audio signature at each time frame, our system first performs unsupervised clustering to obtain an audio signature dictionary, and then assigns each time frame of each video segment to the closest k centers.

Pairwise Matching Matrix Computation. After assigning each time frame, our system computes the matching matrix $m_{ij}$ for each pair of video segments $v_i$ and $v_j$. Each element of the matching matrix is calculated as

$$ m^{st}_{ij} = p \sum_{a=1}^{k} \sum_{b=1}^{k} \mathbb{1}\bigl[ v^s_i(a) = v^t_j(b) \bigr], \qquad s \in [1, c_i], \; t \in [1, c_j], $$

where $s$ and $t$ give the time frame numbers of video segments $v_i$ and $v_j$, and $v^s_i$ is the vector of center numbers for the $s$-th frame of $v_i$. That is, for each time frame in the video segment pair, our system checks through the center number vectors and adds a matching score $p$ to $m^{st}_{ij}$ whenever the center numbers are the same. Minimal code sketches of these three steps follow below.
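First, a sketch of the pre-processing step. The poster does not say which shot-boundary detector the system uses; the color-histogram heuristic below (OpenCV) is a stand-in of ours, and the function name and threshold are placeholders.

```python
import cv2

def chunk_video(path, threshold=0.4):
    """Split a video into time-continuous segments at abrupt shot changes,
    detected here by a simple color-histogram difference between
    consecutive frames (a stand-in for the actual detector)."""
    cap = cv2.VideoCapture(path)
    boundaries, prev_hist, idx = [0], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        # A sharp drop in histogram correlation marks a shot boundary.
        if prev_hist is not None and \
           cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < 1 - threshold:
            boundaries.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return boundaries  # frame indices where new segments start
```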
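Next, a minimal sketch of the signature-extraction step. It assumes MFCCs as the low-level audio features and mini-batch k-means for the unsupervised clustering; the poster names neither choice, and the function names, dictionary size, and value of k below are assumptions of ours.

```python
import numpy as np
import librosa
from sklearn.cluster import MiniBatchKMeans

def extract_frames(audio_path, sr=16000, n_mfcc=20):
    """One low-level feature vector (here: MFCCs) per time frame of a
    segment's audio track (extracted beforehand, e.g., with ffmpeg)."""
    audio, _ = librosa.load(audio_path, sr=sr)
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc).T  # (n_frames, n_mfcc)

def build_dictionary(frames_of_all_segments, n_centers=1024):
    """Unsupervised clustering over frames pooled from every segment,
    yielding the shared audio signature dictionary."""
    pooled = np.vstack(frames_of_all_segments)
    return MiniBatchKMeans(n_clusters=n_centers, random_state=0).fit(pooled)

def assign_signatures(frames, dictionary, k=5):
    """Assign each time frame to its k closest centers.
    Returns an (n_frames, k) array of center numbers."""
    dists = dictionary.transform(frames)      # distance to every center
    return np.argsort(dists, axis=1)[:, :k]   # indices of the k nearest centers
```

Pooling frames from all segments before clustering yields one shared dictionary, so the center numbers assigned to different videos are directly comparable.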
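Finally, a sketch of the pairwise matching-matrix computation following the formula above. The `best_offset` helper at the end is purely our assumption about how a pairwise alignment might be read out of $m_{ij}$; the transcript does not describe that step.

```python
import numpy as np

def matching_matrix(sig_i, sig_j, p=1.0):
    """m[s, t] = p * (number of center numbers shared by frame s of v_i and
    frame t of v_j). Since each frame's k nearest centers are distinct,
    counting shared center numbers equals the double sum in the formula."""
    m = np.zeros((len(sig_i), len(sig_j)))
    for s, centers_s in enumerate(sig_i):
        shared = set(centers_s)
        for t, centers_t in enumerate(sig_j):
            m[s, t] = p * len(shared & set(centers_t))
    return m

def best_offset(m):
    """Hypothetical read-out (our assumption): score each candidate time
    offset between the two segments by the mean matching score along the
    corresponding diagonal of m, and return the best one."""
    offsets = range(-(m.shape[0] - 1), m.shape[1])
    return max(offsets, key=lambda o: np.diagonal(m, offset=o).mean())
```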
Experiments

Video Synchronization Demo System

Acknowledgement

This project was conducted in partnership with Carnegie Mellon's Center for Human Rights Science (http://www.cmu.edu/chrs). The authors would like to thank the MacArthur Foundation, Oak Foundation, and Humanity United for their generous support of this collaboration.