TEMPORAL EVENT CLUSTERING FOR DIGITAL PHOTO COLLECTIONS Matthew Cooper, Jonathan Foote, Andreas Girgensohn, and Lynn Wilcox ACM Multimedia ACM Transactions.

TEMPORAL EVENT CLUSTERING FOR DIGITAL PHOTO COLLECTIONS
Matthew Cooper, Jonathan Foote, Andreas Girgensohn, and Lynn Wilcox
ACM Multimedia; ACM Transactions on Multimedia Computing, Communications, and Applications

OUTLINE
Introduction
Feature extraction
Clustering techniques
  Supervised event clustering
  Unsupervised event clustering
Clustering goodness criteria
Experimental results
Conclusion

INTRODUCTION
Users navigate their photos by temporal order and by visual content.
Both time and content are associated with the notion of a specific “event”.
Photos associated with an event often exhibit little coherence in terms of either low-level image features or visual similarity.
However, photographs from the same event are taken in relatively close proximity in time.

BASIC CONCEPTS --- EVENT
Events are naturally associated with specific times and places.
Examples: birthday party, vacation, wedding.

BASIC CONCEPTS --- EXIF & CBIR
Exchangeable Image File Format (EXIF) metadata: time, location, focal length, flash, etc. => season, place, weather, indoor/outdoor, etc.
Content-based Image Retrieval (CBIR): color, texture, shape, etc. => face and fingerprint recognition, etc.

FEATURE EXTRACTION
EXIF headers are processed to extract the timestamp of each photo.
The N photos in the collection are then ordered in time so that the resulting timestamps, {t_n : n = 1, ..., N}, satisfy t_1 ≤ t_2 ≤ … ≤ t_N.
The time difference between consecutive indices (photos) is nonuniform.
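The ordering step can be sketched as follows. This is a minimal illustration, assuming the timestamps have already been read from the EXIF headers as strings in the usual EXIF "YYYY:MM:DD HH:MM:SS" layout. The function names and the choice of minutes since the first photo as the time unit are assumptions, not taken from the slides.

```python
from datetime import datetime

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # layout of EXIF date/time strings

def sorted_timestamps(exif_strings):
    """Parse EXIF date strings and return them sorted, in minutes since the first photo."""
    times = sorted(datetime.strptime(s, EXIF_FORMAT) for s in exif_strings)
    if not times:
        return []
    first = times[0]
    # Unit (minutes) is an assumption made for this sketch.
    return [(t - first).total_seconds() / 60.0 for t in times]

def gaps(timestamps):
    """Time differences between consecutive photos; these are generally nonuniform."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]
```

Two photos taken ten minutes apart followed by one taken two days later give gaps of very different sizes, which is the nonuniformity the slide refers to.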

FEATURE EXTRACTION
Computing similarity matrices: S_K is the temporal similarity matrix computed at time scale K.
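The formula for S_K did not survive in this transcript. As a hedged sketch, the code below assumes an exponentially decaying similarity, S_K(i, j) = exp(-|t_i - t_j| / K), which is my recollection of the published paper and should be checked against it. The function name is hypothetical.

```python
import math

def temporal_similarity(timestamps, K):
    """Return the N x N matrix S_K with S_K[i][j] = exp(-|t_i - t_j| / K).

    The exponential form is an assumption of this sketch; K is the time
    scale, expressed in the same unit as the timestamps.
    """
    return [[math.exp(-abs(ti - tj) / K) for tj in timestamps]
            for ti in timestamps]
```

Under this assumption, larger K makes photos that are far apart in time look more similar, so large K exposes coarse event structure and small K exposes fine structure.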

FEATURE EXTRACTION
Computing the content-based similarity matrix: low-frequency discrete cosine transform (DCT) coefficients are extracted from each photo and compared using the cosine distance measure.
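A minimal sketch of the comparison step, assuming each photo has already been reduced to a feature vector of low-frequency DCT coefficients. The DCT extraction itself is not shown, and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def content_similarity(features):
    """Content-based similarity matrix over a list of per-photo feature vectors."""
    return [[cosine_similarity(u, v) for v in features] for u in features]
```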

FEATURE EXTRACTION
Computing novelty scores: the scores are plotted for several scales, including K = 1000 and K = 10000.
Peaks in the novelty scores correspond to cluster boundaries between contiguous groups of similar photos.
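The novelty computation can be illustrated by sliding a checkerboard kernel along the main diagonal of a similarity matrix. The sketch below uses a plain, untapered kernel of half-width `w` and treats entries outside the matrix as zero. Both choices, and the function name, are simplifying assumptions rather than the authors' exact kernel.

```python
def novelty_scores(S, w):
    """Correlate a 2w x 2w checkerboard kernel along the diagonal of S.

    The kernel is +1 on the two within-group quadrants and -1 on the two
    cross-group quadrants, so the score at index i is high when photos
    before i and photos from i onward are self-similar but mutually
    dissimilar. Index i then marks the first photo of a new group.
    """
    n = len(S)
    scores = []
    for i in range(n):
        total = 0.0
        for l in range(-w, w):
            for m in range(-w, w):
                r, c = i + l, i + m
                if 0 <= r < n and 0 <= c < n:
                    sign = 1.0 if (l < 0) == (m < 0) else -1.0
                    total += sign * S[r][c]
        scores.append(total)
    return scores
```

For a block-diagonal similarity matrix with two groups of three photos, the score peaks at index 3, the first photo of the second group.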

CLUSTERING TECHNIQUES
Supervised event clustering: based on LVQ
Unsupervised event clustering:
  Scale-space analysis of the raw timestamp data
  Temporal similarity analysis
  Combining time and content-based similarity

Supervised event clustering
Let K take M values: K ≡ {K_1, ..., K_M}.
Define the M × N matrix N(j, i) = ν_{K_j}(i), where ν_{K_j}(i) is the novelty score of photo i at scale K_j. Column N_i is the novelty feature vector of photo i.
The method is based on LVQ (Learning Vector Quantization) [Kohonen 1989].
The LVQ codebook discriminates between the two classes “event boundary” and “event interior.”
The codebook vectors for each class are used for nearest-neighbor classification of the novelty features for each photo in the test set.

Supervised event clustering
In the training phase, a codebook is calculated using an iterative procedure.
At each step, the codebook vector M_c nearest to a training sample N_x is determined and then shifted:
  if N_x and M_c are in the same class, M_c is shifted toward the training sample;
  if N_x and M_c are not in the same class, M_c is shifted away from the training sample.
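The update formulas on this slide were lost in the transcript. The sketch below uses the standard LVQ1 rule, M_c ← M_c ± α (N_x − M_c), which matches the toward/away description. The learning rate `alpha`, the squared Euclidean distance, and all function names are assumptions of this illustration.

```python
def nearest_index(codebook, x):
    """Index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda c: sum((m - a) ** 2 for m, a in zip(codebook[c], x)))

def lvq1_step(codebook, labels, x, x_label, alpha):
    """One LVQ1 update: move the nearest codebook vector toward x if the
    classes agree, away from x otherwise. Modifies codebook in place."""
    c = nearest_index(codebook, x)
    sign = 1.0 if labels[c] == x_label else -1.0
    codebook[c] = [m + sign * alpha * (a - m) for m, a in zip(codebook[c], x)]
    return c

def classify(codebook, labels, x):
    """Nearest-neighbor rule: return the class of the closest codebook vector."""
    return labels[nearest_index(codebook, x)]
```

In the setting of the slides, `x` would be a novelty feature vector N_i and the labels would be the two classes "boundary" and "interior".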

Supervised event clustering
ALGORITHM 1 (LVQ-BASED PHOTO CLUSTERING).
(1) Calculate novelty features from labeled, sorted training data. For each scale K:
  (i) compute the similarity matrix S_K
  (ii) compute the novelty score ν_K
(2) Train LVQ using the iterative procedure.
(3) Calculate novelty features for the testing data. For each K:
  (i) compute the similarity matrix S_K
  (ii) compute the novelty score ν_K
(4) Classify each test sample’s novelty features N_i using the LVQ codebook and the nearest-neighbor rule.

UNSUPERVISED EVENT CLUSTERING
Scale-space analysis operates on the raw timestamps T_0 = [t_1, ..., t_N], so that T_0(i) = t_i.
ALGORITHM 2 (SCALE-SPACE PHOTO CLUSTERING).
(1) Extract timestamp data from the photo collection: {t_1, ..., t_N}.
(2) For each σ in descending order:
  (i) compute T_σ
  (ii) detect peaks in T_σ, tracing peaks from larger to smaller scales (decreasing σ).

UNSUPERVISED EVENT CLUSTERING
Temporal Similarity Analysis
Peaks are located at each scale by analyzing the first difference of each novelty score ν_K, proceeding from coarse scale to fine (decreasing K).
To build a hierarchical set of event boundaries, boundaries detected at coarse scales are included in the boundary lists for all finer scales.
Figure: checkerboard kernel used to compute the novelty features.
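The coarse-to-fine peak collection can be sketched as follows. Peaks are taken to be interior indices where the first difference changes from positive to non-positive, with no thresholding. That criterion, the input layout (a list of `(K, scores)` pairs), and the function names are assumptions of this illustration.

```python
def detect_peaks(scores):
    """Interior indices where the first difference goes from > 0 to <= 0."""
    peaks = []
    for i in range(1, len(scores) - 1):
        rising = scores[i] - scores[i - 1] > 0
        not_rising_after = scores[i + 1] - scores[i] <= 0
        if rising and not_rising_after:
            peaks.append(i)
    return peaks

def hierarchical_boundaries(novelty_by_scale):
    """Map each scale K to its boundary list, processing K from coarse to fine.

    Boundaries found at a coarser scale are carried into every finer scale,
    so the lists are nested.
    """
    result = {}
    carried = set()
    for K, scores in sorted(novelty_by_scale, key=lambda p: p[0], reverse=True):
        carried |= set(detect_peaks(scores))
        result[K] = sorted(carried)
    return result
```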

UNSUPERVISED EVENT CLUSTERING
Combining Time and Content-Based Similarity
A content-based matrix S_C is constructed using low-frequency DCT features and the cosine distance.
The joint similarity is defined piecewise: one case applies if |t_i − t_j| > 48 h, and the other case applies otherwise.

CLUSTERING GOODNESS CRITERIA
Peak detection at each scale K results in a hierarchical set of candidate boundaries.
A subset must be selected to define the final event clusters.
Three different automatic approaches:
  Similarity-Based Confidence Score
  Boundary Selection via Dynamic Programming
  BIC-Based Boundary Selection

Similarity-Based Confidence Score
The boundaries detected at each level K are B_K = {b_1, ..., b_{n_K}}, indexed by photo: B_K ⊂ {1, ..., N}.
The score combines two terms:
  the average intracluster similarity between the photos within each cluster
  the average intercluster similarity between photos in adjacent clusters
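The exact score formula is missing from the transcript. One plausible reading of the two terms, used here purely as an assumption, is the sum of average intracluster similarities minus the sum of average similarities between adjacent clusters. Function names and the zero-based boundary convention (a boundary index is the first photo of a new cluster) are hypothetical.

```python
def mean_block(S, rows, cols):
    """Average of S over the given row and column index ranges."""
    total = sum(S[i][j] for i in rows for j in cols)
    return total / (len(rows) * len(cols))

def confidence_score(S, boundaries):
    """Intracluster minus adjacent-intercluster average similarity.

    boundaries: distinct interior indices (0 < b < N) where a new cluster
    starts. The specific combination of terms is an assumption of this
    sketch, not a formula recovered from the slides.
    """
    edges = [0] + sorted(boundaries) + [len(S)]
    clusters = [range(a, b) for a, b in zip(edges, edges[1:])]
    intra = sum(mean_block(S, c, c) for c in clusters)
    inter = sum(mean_block(S, a, b) for a, b in zip(clusters, clusters[1:]))
    return intra - inter
```

With such a score, one would evaluate the boundary list at each level K and keep the level that scores highest.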

Boundary Selection via Dynamic Programming
Dynamic programming reduces the complexity of the search.
Begin with the set of peaks detected from the novelty features at all scales.
A cost is defined for the cluster between photos b_i and b_j.

Boundary Selection via Dynamic Programming
Optimal partitions with m boundaries are built from the optimal partitions with m−1 boundaries.
First, optimal partitions are computed with two clusters.
E_F(j, m) is the optimal partition of the photos with cardinality m.

Boundary Selection via Dynamic Programming
As the number of clusters increases, the total cost of the partition decreases monotonically.
The optimal number of clusters, M∗, is selected based on the total partition cost.

BIC-Based Boundary Selection
This method is based on the Bayes information criterion (BIC) [Schwarz 1978].
Assumption: timestamps within an event are distributed normally around the event mean.
Each candidate boundary is tested by comparing the log-likelihood of the two-segment model against the log-likelihood of the single-segment model plus a penalty term.
The penalty parameter λ is 2, since each segment is described by its sample mean μ and variance σ^2.
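The inequality itself did not survive in the transcript. The sketch below uses a standard BIC comparison under a Gaussian model: keep a boundary when the two-segment log-likelihood exceeds the single-segment log-likelihood plus (λ/2) log n. The exact form of the penalty, the variance floor, and the function names are assumptions of this illustration.

```python
import math

def gaussian_loglik(xs, var_floor=1e-6):
    """Maximized Gaussian log-likelihood of xs using the sample mean and variance.

    var_floor avoids log(0) when all timestamps coincide; it is an
    assumption of this sketch.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, var_floor)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def bic_keeps_boundary(timestamps, start, b, end, lam=2):
    """Test candidate boundary b inside the segment timestamps[start:end].

    Requires start < b < end so that both sub-segments are non-empty.
    """
    left, right = timestamps[start:b], timestamps[b:end]
    whole = timestamps[start:end]
    two_segment = gaussian_loglik(left) + gaussian_loglik(right)
    one_segment = gaussian_loglik(whole) + 0.5 * lam * math.log(len(whole))
    return two_segment > one_segment
```

Two bursts of photos separated by a long gap pass the test, while a run of identical timestamps does not, because splitting it gains no likelihood and still pays the penalty.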

BIC-Based Boundary Selection
The hierarchical coarse-to-fine approach is employed.
At each scale, only the newly detected boundaries (those undetected at coarser scales) are tested.
Boundaries for which the left side of the BIC test exceeds the right side are added.

ALGORITHM 3 (SIMILARITY-BASED PHOTO CLUSTERING).
(1) Extract and sort photo timestamps, {t_1, ..., t_N}.
(2) For each K in decreasing order:
  (i) compute the similarity matrix S_K
  (ii) compute the novelty score ν_K
  (iii) detect peaks in ν_K
  (iv) form the event boundary list using event boundaries from previous iterations and newly detected peaks
(3) Determine a final boundary subset from the boundaries collected over all scales considered, according to one of the following methods:
  (a) the confidence score
  (b) the DP boundary selection approach
  (c) the BIC boundary selection approach

EXPERIMENTAL RESULT
Run times for different size photo collections (times are in seconds):
  No Conf.: times for Steps 1 and 2 only
  BIC: BIC peak selection
  DP: dynamic programming peak selection
  Conf.: similarity-based peak selection
When the number of photos (N) doubles, the time for the segmentation step (No Conf.) increases linearly, while including the confidence measure (Conf.) incurs a polynomial cost.

EXPERIMENTAL RESULT
The event clustering performance of eleven systems is compared on two separate photo collections.
Collection I consists of 1036 photos taken over 15 months.
Collection II consists of 413 photos taken over 13 months.
The first four algorithms in the table are “hand-tuned” to maximize performance.
The remaining algorithms are fully automatic.

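EXPERIMENTAL RESULT
Precision measures the proportion of detected boundaries that are correct: precision = (correctly detected boundaries) / (total detected boundaries).
Recall measures the proportion of true boundaries detected: recall = (correctly detected boundaries) / (total ground-truth boundaries).
The F-score is a composite of precision and recall: F = 2 × precision × recall / (precision + recall).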
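The three measures can be computed from sets of boundary indices using the standard definitions. The sketch assumes a detected boundary counts as correct only when it matches a ground-truth index exactly. That matching rule and the function name are assumptions.

```python
def boundary_scores(detected, truth):
    """Return (precision, recall, F-score) for detected vs. ground-truth boundaries."""
    detected, truth = set(detected), set(truth)
    correct = len(detected & truth)
    precision = correct / len(detected) if detected else 0.0
    recall = correct / len(truth) if truth else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score
```

A detector that outputs many boundaries tends to score high recall and low precision, which is the pattern reported for the adaptive-thresholding algorithms.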

EXPERIMENTAL RESULT
The adaptive-thresholding algorithms exhibited high recall and low precision on both test sets, even with manual tuning.
Scale-space and the two similarity-based approaches demonstrated more consistent performance and traded off precision and recall more evenly.

CONCLUSION
The authors employed the automatic temporal similarity-based method, which does not rely on preset thresholds or restrictive assumptions.
As photo collections with location information become available, they hope to extend the system to combine temporal similarity, content-based similarity, and location-based similarity.
The automatic methods’ performance exceeded that of manually tuned alternatives in testing, and the methods have been well received by users of the authors’ photo management application.