An Introduction to Video Fingerprinting

Slides:

Advertisements

Similar presentations

Distinctive Image Features from Scale-Invariant Keypoints

Advertisements

DONG XU, MEMBER, IEEE, AND SHIH-FU CHANG, FELLOW, IEEE Video Event Recognition Using Kernel Methods with Multilevel Temporal Alignment.

Presented by Xinyu Chang

TP14 - Local features: detection and description Computer Vision, FCUP, 2014 Miguel Coimbra Slides by Prof. Kristen Grauman.

Object Recognition using Invariant Local Features Applications l Mobile robots, driver assistance l Cell phone location or object recognition l Panoramas,

An Introduction to Sparse Coding, Sparse Sensing, and Optimization Speaker: Wei-Lun Chao Date: Nov. 23, 2011 DISP Lab, Graduate Institute of Communication.

Image Indexing and Retrieval using Moment Invariants Imran Ahmad School of Computer Science University of Windsor – Canada.

Robust video fingerprinting system Daniel Luis

A Novel Scheme for Video Similarity Detection Chu-Hong Hoi, Steven March 5, 2003.

Object Recognition with Invariant Features n Definition: Identify objects or scenes and determine their pose and model parameters n Applications l Industrial.

CMPT-884 Jan 18, 2010 Video Copy Detection using Hadoop Presented by: Cameron Harvey Naghmeh Khodabakhshi CMPT 820 December 2, 2010.

Event prediction CS 590v. Applications Video search Surveillance – Detecting suspicious activities – Illegally parked cars – Abandoned bags Intelligent.

Modeling Pixel Process with Scale Invariant Local Patterns for Background Subtraction in Complex Scenes (CVPR’10) Shengcai Liao, Guoying Zhao, Vili Kellokumpu,

ADVISE: Advanced Digital Video Information Segmentation Engine

A Study of Approaches for Object Recognition

Object Recognition with Invariant Features n Definition: Identify objects or scenes and determine their pose and model parameters n Applications l Industrial.

CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007.

Automatic Image Alignment (feature-based) : Computational Photography Alexei Efros, CMU, Fall 2005 with a lot of slides stolen from Steve Seitz and.

Object Recognition Using Distinctive Image Feature From Scale-Invariant Key point D. Lowe, IJCV 2004 Presenting – Anat Kaspi.

Scale Invariant Feature Transform (SIFT)

Video Fingerprinting: Features for Duplicate and Similar Video Detection and Query- based Video Retrieval Anindya Sarkar, Pratim Ghosh, Emily Moxley and.

1 Invariant Local Feature for Object Recognition Presented by Wyman 2/05/2006.

Automatic Image Alignment (feature-based) : Computational Photography Alexei Efros, CMU, Fall 2006 with a lot of slides stolen from Steve Seitz and.

Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.

Lecture 6: Feature matching and alignment CS4670: Computer Vision Noah Snavely.

DVMM Lab, Columbia UniversityVideo Event Recognition Video Event Recognition: Multilevel Pyramid Matching Dong Xu and Shih-Fu Chang Digital Video and Multimedia.

Large-Scale Content-Based Image Retrieval Project Presentation CMPT 880: Large Scale Multimedia Systems and Cloud Computing Under supervision of Dr. Mohamed.

Overview Introduction to local features

Distinctive Image Features from Scale-Invariant Keypoints By David G. Lowe, University of British Columbia Presented by: Tim Havinga, Joël van Neerbos.

Computer vision.

1 Faculty of Information Technology Generic Fourier Descriptor for Shape-based Image Retrieval Dengsheng Zhang, Guojun Lu Gippsland School of Comp. & Info.

School of Information Technology & Electrical Engineering Multiple Feature Hashing for Real-time Large Scale Near-duplicate Video Retrieval Jingkuan Song*,

Robust Mesh-based Hashing for Copy Detection and Tracing of Images Chun-Shien Lu, Chao-Yong Hsu, Shih-Wei Sun, and Pao-Chi Chang Proc. IEEE Int. Conf.

Characterizing activity in video shots based on salient points Nicolas Moënne-Loccoz Viper group Computer vision & multimedia laboratory University of.

Image Retrieval Part I (Introduction). 2 Image Understanding Functions Image indexing similarity matching image retrieval (content-based method)

Introduction to Visible Watermarking IPR Course: TA Lecture 2002/12/18 NTU CSIE R105.

Overview Harris interest points Comparing interest points (SSD, ZNCC, SIFT) Scale & affine invariant interest points Evaluation and comparison of different.

Local invariant features Cordelia Schmid INRIA, Grenoble.

Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.

Lecture 4: Feature matching CS4670 / 5670: Computer Vision Noah Snavely.

Copyright Protection of Images Based on Large-Scale Image Recognition Koichi Kise, Satoshi Yokota, Akira Shiozaki Osaka Prefecture University.

A Statistical Approach to Speed Up Ranking/Re-Ranking Hong-Ming Chen Advisor: Professor Shih-Fu Chang.

Features-based Object Recognition P. Moreels, P. Perona California Institute of Technology.

Online Kinect Handwritten Digit Recognition Based on Dynamic Time Warping and Support Vector Machine Journal of Information & Computational Science, 2015.

Character Identification in Feature-Length Films Using Global Face-Name Matching IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 11, NO. 7, NOVEMBER 2009 Yi-Fan.

Dengsheng Zhang and Melissa Chen Yi Lim

Advances in digital image compression techniques Guojun Lu, Computer Communications, Vol. 16, No. 4, Apr, 1993, pp

Event retrieval in large video collections with circulant temporal encoding CVPR 2013 Oral.

by Mitchell D. Swanson, Bin Zhu, and Ahmed H. Tewfik

Kylie Gorman WEEK 1-2 REVIEW. CONVERTING AN IMAGE FROM RGB TO HSV AND DISPLAY CHANNELS.

Chittampally Vasanth Raja 10IT05F vasanthexperiments.wordpress.com.

Chittampally Vasanth Raja vasanthexperiments.wordpress.com.

Overview Introduction to local features Harris interest points + SSD, ZNCC, SIFT Scale & affine invariant interest point detectors Evaluation and comparison.

CS848 Similarity Search in Multimedia Databases Dr. Gisli Hjaltason Content-based Retrieval Using Local Descriptors: Problems and Issues from Databases.

CSE 185 Introduction to Computer Vision Feature Matching.

Digital Video Library Network Supervisor: Prof. Michael Lyu Student: Ma Chak Kei, Jacky.

Line Matching Jonghee Park GIST CV-Lab..  Lines –Fundamental feature in many computer vision fields 3D reconstruction, SLAM, motion estimation –Useful.

On Using SIFT Descriptors for Image Parameter Evaluation Authors: Patrick M. McInerney 1, Juan M. Banda 1, and Rafal A. Angryk 2 1 Montana State University,

Computer Vision Group Department of Computer Science University of Illinois at Urbana-Champaign.

Similarity Measurement and Detection of Video Sequences Chu-Hong HOI Supervisor: Prof. Michael R. LYU Marker: Prof. Yiu Sang MOON 25 April, 2003 Dept.

MMC LAB Secure Spread Spectrum Watermarking for Multimedia KAIST MMC LAB Seung jin Ryu 1MMC LAB.

Digital Video Library - Jacky Ma.

SIFT Scale-Invariant Feature Transform David Lowe

A review of audio fingerprinting (Cano et al. 2005)

TP12 - Local features: detection and description

Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science

Aim of the project Take your image Submit it to the search engine

Multimedia Information Retrieval

Feature descriptors and matching

Presented by Xu Miao April 20, 2005

Presentation transcript:

An Introduction to Video Fingerprinting Speaker : Wei–lun Chao Advisor : Prof. Jian-jiun Ding DISP Lab Graduate Institute of Communication Engineering National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan Abstract A video copy detection technique A new way of thinking against watermarking Techniques of it are not too complicated until now Still lots of space to be improved National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan Outline & Content 1. Introduction 2. Challenge 3. Basic Structure 4. Existed Techniques 5. Performance Evaluation 6. Our Works 7. Conclusion & Future works 8. Acknowledge 9. Reference National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 1 1. Introduction What is video copy detection and its goal? For database and copyright management Copy V.S Similarity Two main direction: Content-based video copy detection (CBCD) Watermarking National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 1 Watermarking Watermarking relies on inserting a distinct pattern into the video stream It can be simply classified into visible and invisible watermarking A widely included research National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 1 CBCD Pertinent features are extracted as “fingerprints” or “signatures” of the video Comparing the signatures to determine a copy or not ”The media itself is the watermark” National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 1 Comparison Difference\Technique Watermarking CBCD Embed something Yes No Main protecting process Before propagation After propagation Additional memory Yes (Key) Yes (Feature database) Matching step Detection + Extraction + Comparison Extraction + Comparison Techniques Advantages Disadvantages Watermarking Multi-functional Compare the embedded pattern, Afraid of attacking CBCD Fast search, no insertion, HVS preserved Additional data memory, Database standard National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 2 2. Challenge People feel the same, while computers don’t ! Distortions Luminance Geometrical modification Compression & different formats Post-processing Malicious attack CBCD V.S CBVR (content-based video retrieval) National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 3 3. Basic Structure There are three important properties a video fingerprinting technique should have! Robustness: distortion tolerance Pair-wise independent: identification Database search efficiency: practical National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 3 Diagram of CBCD system National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 3 System structure Three parts of the system: Database operation (off-line) Query operation (on-line) Matching (on-line) Functional blocks: Shot detection Key-frame extraction Feature extraction Matching & decision The most important one: Feature extraction + arrangement National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 3 Feature extraction Difference between images and videos: Time 3 dimensions to explore information (feature): Color Spatial Temporal 2 kinds of descriptor: Global Local Arrangement: features ->a signature National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 3 Matching & Decision Three steps: Searching Voting algorithm: best match Threshold: Is the best match a real match? Local description case is more complicated to deal with at this step: Temporal and spatial registration for each point National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Is there any similarity? 3 Is there any similarity? Asymmetric relation!!! National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4. Existed Techniques Four functional blocks: Shot detection Key frame extraction Feature extraction Matching & decision Feature extraction: global/local descriptors color/spatial/temporal image based/ sequence based National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4.1 Key-frame extraction Key-frames forms a compact expression of a video clip Found techniques: Motion energy model Spatial temporal color distribution National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4.2 Global features Shot boundary Color Different color space histogram, moment, dominant color Changes in or between frames Motion, gradient Ordinal feature Spatial, temporal ordinal Transform-based National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 YCbCr histogram The signature for a shot: “A 125 x 8 matrix” National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Motion direction histogram 4 Motion direction histogram EX: 15x15 blocks per frame , i=0,…… 4 National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Centroids of gradient orientation 4 Centroids of gradient orientation The signature for each frame: “MN CGO scalars” National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 Ordinal feature Spatial ordinal Temporal ordinal For each frame: S (t) = (𝒓𝟏,𝒓𝟐,……,𝒓𝑵) For each video clip: A 72-bin histogram National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Compact Fourier-Mellin transform 4 Compact Fourier-Mellin transform Original formula: AFMT: RST-invariant property: (,𝒚)=𝒇(𝜶(𝒙𝒄𝒐𝒔𝜷+𝒚𝒔𝒊𝒏𝜷),𝜶(−𝒙𝒔𝒊𝒏𝜷+𝒚𝒄𝒐𝒔𝜷)) Fast AFMT approximation: National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Compact Fourier-Mellin transform 4 Compact Fourier-Mellin transform National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Compact Fourier-Mellin transform 4 Compact Fourier-Mellin transform PCA (principal component analysis) Each frame: d-dim vector National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

Comparison of global descriptors 4 Comparison of global descriptors Recall: What we want to solve, page 8 General problem: Post-product processing Technique Advantage Disadvantage Shot boundary Compact, only frame index Algorithm, only ok for whole movies Color Easy to implement High noise sensitivity (inter video not intra video) Spatial/temporal change More information included Computational complexity Ordinal feature Immune to global color change Sensitive to post-processing Transform-based RST-invariant Computational complexity, high feature dimension National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4.3 Local features Additional work: Points-of-interest extraction Ex: Harris point, SIFT Method: A. Joly method Space time interest points Video Copy tracking (ViCopT) Scale invariant feature transform (SIFT) National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 A. Joly Method Harris point detector Key-frame based Signature for each point is a 20-deimensional vector in [0,255]D=20 : at four nearby locations National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 ViCopT Try to find the trajectory of points of interest Signal description Trajectory description Labeling: Background/Motion National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 ViCopT Labeling: label Background: motionless and persistent points along frames label Motion: moving and persistent points Background Motion National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 ViCopT Asymmetric Technique: Off-line: Trajectories On-line: Points of interest in key frames Latter steps: Similarity searching Spatio-temporal registration Combination of labels National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 ViCopT framework National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

SIFT – Scale invariant feature transform 4 SIFT – Scale invariant feature transform Using the SIFT-point detector (128-dim vector) Computational complexity due to pair-wise comparison Point detection Orientation Descriptor computation National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 SIFT With location, scale, and orientation National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 SIFT 16 regions, each with 8 orientations, totally 128-dim vector National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 SIFT In order to simplify the matching step, the quantization (codebook) method is used for each point and the signature for each frame is a d-bin histogram Each descriptor is quantized into a codeword based on the trained codebook K-means clustering National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4.4 Matching idea Matching type: Sequential matching Vector matching Local point matching: intersection idea Matching method: L1, L2 distance Pattern recognition or machine learning idea National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 4.5 Complete comparison Techniques Advantage Disadvantage Global descriptor: (color, ordinal…) Simple computation Easy matching Not robust for logo insertion, cropping, and post-product processing Local descriptor: (ViCopT, SIFT…) Robust to post-product processing…, and can be used for CBVR Complex matching and decision (pair-wise, intersection…) National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 4 Complete comparison More dimensions (color, spatial, temporal), more robust. Local descriptor is stronger than global descriptor especially for logo and word insertion Ordinal recording is better than actual value recording It’s better to record the direction of an image by some algorithms in order to deal with image rotation National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

5. Performance Evaluation Precision-recall curve (P-R curve) F1 score Recall = 𝑻𝒓𝒖𝒆𝑷𝒐𝒔𝒊𝒕𝒊𝒗𝒆 (pa)/ 𝑵𝒍𝒍𝑻𝒖𝒓𝒆 Precision = 𝑻𝒓𝒖𝒆𝑷𝒐𝒔𝒊𝒕𝒊𝒗𝒆 (pa)/ 𝑵𝑨𝒍𝒍𝑷𝒐𝒔𝒕𝒊(pa) 𝑭=𝟐×𝑷×𝑹 / 𝑷+𝑹 National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 6 6. Our Works Database: 2600 scenes, 100 for duplication. 63 duplicated videos are created , then totally we have 6300+2600 = 8900 videos (frame drop, blurring, AWGN, JPEG, Gamma correction) Shot detection Software key-frame detection Spatial temporal color distribution Feature extraction YCbCr histogram, ordinal, CGO, CFMT, and SIFT Matching & detection L1, L2 distance National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 6 Our work result National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

7. Conclusion & Future works Is there any practical case of CBVD? Youtube website Another comparison The importance of feature selection: V.S CBIR Techniques Characteristics Watermarking Malicious attack, for example, put fake patterns inside! Need some knowledge of cryptography and coding Video fingerprinting Could perform kinds of method on one video!!! Could use lots of image and video ideas National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 7 Future works New features Higher speed Put into practice Deal with more complicated cases While, there are practical cares! National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

National Taiwan University, Taipei, Taiwan 8 8. Acknowledge I’d like to thank for my advisor, Prof. Pierre Moulin, and group members, Julien Dubois and Ryan Rogowski, in UIUC!! National Taiwan University, Taipei, Taiwan DISP Lab @ MD531

9 9. Reference [1]A. Hampapur, K. Hyun, and R. M. Bolle. Comparison of sequence matching techniques for video copy detection. volume 4676, pages 194_201. SPIE, 2001. [2] J. Law-To, O. Buisson, V. Gouet-Brunet, and N. Boujemaa. Video copy detection on the Internet: the challenges of copyright and multiplicity. In ICME'07: Proceedings of the IEEE International Conference on Multimedia and Expo, pages 2082_2085, 2007. [3] Sunil Lee and Chang D Yoo. Video fingerprinting based on centroids of gradients orientations. In ICASSP '06: Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 401_404, Washington, DC, USA, 2006. IEEE Computer Society. [4] J. Law-To, L. Chen, A. Joly, I. Laptev, O. Buisson, V. Gouet-Brunet, N. Boujemaa, and F. Stentiford. Video copy detection: a comparative study. In CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval, pages 371_378, New York, NY, USA, 2007. ACM. [5] A. Sarkar, P. Ghosh, E. Moxley, and B. S. Manjunath. Video fingerprinting: Features for duplicate and similar video detection and query-based video retrieval. In SPIE – Multimedia Content Access: Algorithms and Systems II, Jan 2008.