Information Extraction from Multimedia Content on the Social Web Stefan Siersdorfer L3S Research Centre, Hannover, Germany.


Meta Data and Visual Data on the Social Web. Meta data: tags, titles, descriptions, timestamps, geo-tags, comments, numerical ratings, users and social links. Visual data: photos, videos. How can combined information from visual data and meta data be exploited?

Example 1: Photos in Flickr

Example 2: Videos in YouTube

Social Web Environments as Graph Structure. [Figure: example graph with nodes User 1, User 2, User 3, Video 1, Video 2, Video 3, tag1, tag2, tag3, and Group 2.] Entities (nodes): resources (videos, photos), users, tags, groups. Relationships (edges): user-user (contacts, friendship); user-resource (ownership, favorite assignment, rating); user-group (membership); resource-resource (visual similarity, meta data similarity).

User Feedback on the Social Web: numeric ratings and favorite assignments; comments; clicks/views; contacts and friendships; community tagging; blog entries; upload of content. How can we exploit the community feedback?

Outline. Part 1: Photos on the Social Web: 1.1) Photo Attractiveness, 1.2) Generating Photo Maps, 1.3) Sentiment in Photos. Part 2: Videos on the Social Web: Video Tagging.

Part 1: Photos on the Social Web

1.1) Photo Attractiveness *
* Stefan Siersdorfer, Jose San Pedro: Ranking and Classifying Attractiveness of Photos in Folksonomies. 18th International World Wide Web Conference, WWW 2009, Madrid, Spain.

Attractiveness of Images. Examples: landscape, portrait, flower. Which factors influence the human perception of attractiveness?

Attractiveness: Visual Features. Human visual perception is mainly influenced by color distribution and coarseness. These are complex concepts that convey multiple orthogonal aspects, so different low-level features need to be considered.

Attractiveness: Visual Features. Color features: brightness and contrast (luminance, RGB); colorfulness; naturalness; saturation (mean, variance), i.e. the intensity of the colors. Saturation is 0 for grey scale images.
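
A few of the color features listed above can be sketched in a handful of lines. This is only an illustration under stated assumptions, not the feature extractor used in the paper: brightness is taken as mean luminance with Rec. 601 weights, contrast as the standard deviation of luminance, and saturation as the S channel of HSV. The function name `color_features` and the pixel-list input format are hypothetical.

```python
import colorsys
from statistics import mean, pvariance


def color_features(pixels):
    """Compute simple color statistics for an image.

    pixels: list of (r, g, b) tuples with channels in 0..255 (assumed format).
    """
    luminance = []
    saturation = []
    for r, g, b in pixels:
        rf, gf, bf = r / 255, g / 255, b / 255
        # Assumption: luminance via Rec. 601 luma weights.
        luminance.append(0.299 * rf + 0.587 * gf + 0.114 * bf)
        # Assumption: saturation is the S channel of the HSV color space.
        _, s, _ = colorsys.rgb_to_hsv(rf, gf, bf)
        saturation.append(s)
    return {
        "brightness": mean(luminance),
        "contrast": pvariance(luminance) ** 0.5,  # RMS contrast of luminance
        "saturation_mean": mean(saturation),
        "saturation_var": pvariance(saturation),
    }
```

For a grey scale input such as `color_features([(128, 128, 128), (64, 64, 64)])`, the mean saturation is 0, which matches the remark on the slide.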

Visual Features: Coarseness. Resolution + acutance. Sharpness: of critical importance for the final appearance of photos [Savakis 2000].

Textual Features. We consider user-generated meta data. Correlation of topics with image appeal (ground truth: favorite assignments). Tags seem appropriate to capture this information.

Attractiveness of Photos. Community-based models for classifying/ranking images according to their appeal [WWW'09]. [Figure: the Flickr photo stream provides the generator inputs: content (visual features), metadata (textual features, e.g. the tags cat, fence, house), and community feedback (a photo's interestingness: #views, #comments, #favorites, ...). Classification & regression yield the attractiveness models.]

Classification & Regression Models

Experiments

1.2) Generating Photo Maps *
* Work and illustrations from David Crandall, Lars Backstrom, Dan Huttenlocher, Jon Kleinberg: Mapping the World's Photos. 18th International World Wide Web Conference, WWW 2009, Madrid, Spain.

Outline: Photo Maps. Use geo-location, tags, and visual features of photos to: identify popular locations and landmarks; find out the location of photos; estimate representative images.

Spatial Clustering. Each data point corresponds to the (longitude, latitude) of an image. Mean shift clustering is applied to get a hierarchical structure. The most distinctive popular tags are used as labels (score: # photos with tag in cluster / # photos with tag in overall set). Example labels: london, paris, eiffel, louvre, trafalgarsquare, tatemodern.
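
The labeling ratio on this slide (photos with the tag in the cluster, divided by photos with the tag overall) can be sketched as follows. The function name `distinctive_tags`, the representation of a photo as a set of tags, and the tie-breaking by in-cluster count are assumptions made for illustration; the clustering step itself is not shown.

```python
from collections import Counter


def distinctive_tags(cluster_photos, all_photos, top_k=3):
    """Rank tags by (# photos with tag in cluster) / (# photos with tag overall).

    Each photo is given as a set of tags (assumed format).
    """
    in_cluster = Counter(t for tags in cluster_photos for t in set(tags))
    overall = Counter(t for tags in all_photos for t in set(tags))
    scores = {t: in_cluster[t] / overall[t] for t in in_cluster if overall[t]}
    # Assumption: ties are broken by popularity inside the cluster, then by name.
    return sorted(scores, key=lambda t: (-scores[t], -in_cluster[t], t))[:top_k]
```

A tag such as "uk" that also appears frequently outside the cluster gets a lower ratio than a tag such as "london" that occurs almost only inside it.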

Estimating Location of Photos without geo-tags. Train SVMs on clusters. Positive examples: photos in the cluster. Negative examples: photos outside the cluster. Feature representation: tags, visual features (SIFT). Best performance for the combination of tags and SIFT features.

Finding Representative Images. Construct a weighted graph, with weights based on the visual similarity of images (using SIFT features). Use graph clustering (e.g. spectral clustering) to identify tightly connected components. Choose an image from such a connected component.
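
A much simplified sketch of this idea is given below. It deliberately swaps in a cruder technique: instead of spectral clustering, edges below a similarity threshold are dropped and plain connected components are computed; the image with the highest total similarity inside the largest component is then returned. The function name, the threshold value, and the selection rule are all assumptions, since the slide does not say how the image is chosen.

```python
from collections import defaultdict


def representative_image(similarity, threshold=0.5):
    """Pick a representative image from pairwise visual similarities.

    similarity: dict mapping (img_a, img_b) -> weight in [0, 1] (assumed format).
    """
    graph = defaultdict(dict)
    for (a, b), w in similarity.items():
        if w >= threshold:  # keep only strong edges (simplification)
            graph[a][b] = w
            graph[b][a] = w
    if not graph:
        return None
    seen, best_component = set(), []
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(component) > len(best_component):
            best_component = component
    members = set(best_component)
    # Assumption: the most "central" image has the largest summed similarity.
    return max(
        sorted(members),
        key=lambda n: sum(w for m, w in graph[n].items() if m in members),
    )
```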

Example 1: Europe

Example 2: New York

1.3) Sentiment in Photos *
* Stefan Siersdorfer, Jonathon Hare, Enrico Minack, Fan Deng: Analyzing and Predicting Sentiment of Images on the Social Web. 18th ACM Multimedia Conference (MM 2010), Florence, Italy.

Sentiment Analysis of Images. Data: more than 500,000 Flickr photos. Image features: global color histogram (a color is present in the image); local color histogram (a color is present at a particular location); SIFT visual terms (b/w patterns, rotated and scaled). Image sentiment: SentiWordNet provides sentiment values for terms, e.g. (pos, neg, obj) = (0.875, 0.0, 0.125) for the term "good"; it is used for obtaining sentiment categories, which serve as training set and ground truth for the experiments.

Which are the most discriminative visual terms? Use the Mutual Information measure to determine these features. Probabilities (estimated through counting in the image corpus): P(t): probability that visual term t occurs in an image; P(c): probability that an image has sentiment category c ("pos" or "neg"); P(t,c): probability that an image is in category c and has visual term t. Intuition: "Terms that have high co-occurrence with a category are more characteristic for that category."
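
The formula itself is not preserved in the transcript. The sketch below uses the standard mutual information between two binary events (term t present or absent, image in category c or not), with all probabilities estimated by counting as described above. Whether the slide used this full four-cell form or a pointwise variant is an assumption; the function name and argument names are hypothetical.

```python
from math import log2


def mutual_information(n_tc, n_t, n_c, n):
    """Mutual information between term presence and category membership.

    n_tc: images in category c that contain term t
    n_t:  images that contain term t
    n_c:  images in category c
    n:    total number of images
    """
    cells = [
        # (joint count, marginal count for t-event, marginal count for c-event)
        (n_tc, n_t, n_c),                          # t present, in c
        (n_t - n_tc, n_t, n - n_c),                # t present, not in c
        (n_c - n_tc, n - n_t, n_c),                # t absent, in c
        (n - n_t - n_c + n_tc, n - n_t, n - n_c),  # t absent, not in c
    ]
    mi = 0.0
    for joint, marg_t, marg_c in cells:
        if joint > 0:  # empty cells contribute nothing
            mi += (joint / n) * log2(joint * n / (marg_t * marg_c))
    return mi
```

If term and category are independent (for example 25 co-occurrences with n_t = n_c = 50 out of 100 images), the value is 0; if the term occurs exactly in the images of the category, the value reaches 1 bit for a balanced corpus.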

Most Discriminative Features. Most discriminative visual features, extracted using the Mutual Information measure [ACM MM'10].

Part 2: Videos on the Social Web *
* Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Content Redundancy in YouTube and its Application to Video Tagging. ACM Transactions on Information Systems (TOIS), 2011.
* Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Automatic Video Tagging using Content Redundancy. 32nd ACM SIGIR Conference, Boston, USA, 2009.

Near-duplicate Video Content. YouTube: most important video sharing environment [SIGCOM’07]: 85 M videos, 65 k videos/day, 100 M downloads per day; traffic to/from YouTube = 10% / 20% of the Web total. Redundancy: 25% of the videos are near duplicates. Can we use redundancy to obtain richer video annotations? → Automatic tagging.

Automatic Tagging: What is it good for? Additional information → better user experience. Richer feature vectors for: automatic data organization (classification and clustering); video search; knowledge extraction (→ creating ontologies).

Overlap Graph. [Figure: overlap graph over Video 1, Video 2, Video 3, Video 4, and Video 5.]

Neighbor-based Tagging (1): Idea. Video 4 contains the original tags A, B; tags F, E are obtained from neighbors. Criteria for automatic tagging: prefer tags used by many neighbors; prefer tags from neighbors with a strong link. [Figure: Video 1 (tags A, B, C), Video 2 (tags A, E), and Video 3 (tags B, E, F) are neighbors of Video 4 (tags A, B, F, E, where F and E are automatically generated).]

Neighbor-based Tagging (2): Formal. [Formula not preserved in the transcript. Its annotations: weights correspond to overlap; indicator function; sum over all neighbors.]
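
Going by the three annotations on this slide (overlap weights, an indicator function, and a sum over all neighbors), the relevance of a tag t for a video can be read as the sum, over the video's neighbors in the overlap graph, of the link weight times an indicator of whether that neighbor carries t. The sketch below implements this reading; the exact formula of the paper is not available here, and the function name, input format, and weights in the usage note are made up for illustration.

```python
def neighbor_tag_relevance(neighbors):
    """Score tags for one target video from its neighbors in the overlap graph.

    neighbors: list of (overlap_weight, set_of_tags) pairs (assumed format).
    Returns a dict tag -> sum of the weights of the neighbors that use the tag.
    """
    relevance = {}
    for weight, tags in neighbors:
        for tag in tags:
            # The indicator function is implicit: only tags the neighbor
            # actually carries receive its weight.
            relevance[tag] = relevance.get(tag, 0.0) + weight
    return relevance
```

With hypothetical weights such as `[(0.2, {"A", "B", "C"}), (0.5, {"A", "E"}), (0.6, {"B", "E", "F"})]`, tag E scores highest because two strongly linked neighbors use it, while C, used only by a weakly linked neighbor, scores lowest. This reflects both criteria from the previous slide.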

Neighbor-based Tagging (3). Apply additional smoothing for redundant regions. [Formula not preserved in the transcript. Its annotations: number of neighbors with tag t; subsets of neighbors; smoothing factor; overlap region.]

TagRank. Also takes transitive relationships into account. PageRank-like weight propagation.

Applications of Extended Tag Representation. Use relevancies rel(t, vi) for constructing enriched feature vectors for videos: combine original tags with new tags weighted by relevance values. Automatic annotation: use thresholding to select the most relevant tags for a given video; manual assessment of the tags shows their relevance. Data organization: clustering and classification experiments (ground truth: YouTube categories of videos); improved performance through the enriched feature representation.
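
How relevance values rel(t, vi) might be turned into an enriched tag vector can be sketched as follows. The weight of 1.0 for original tags, the default threshold, and the function name `enrich_tags` are assumptions chosen for illustration; the slide does not specify them.

```python
def enrich_tags(original_tags, relevance, threshold=0.5):
    """Combine original tags with propagated tags above a relevance threshold.

    original_tags: iterable of tags assigned by the uploader
    relevance:     dict tag -> rel(t, v) for the same video
    """
    # Assumption: original tags get full weight 1.0.
    vector = {tag: 1.0 for tag in original_tags}
    for tag, score in relevance.items():
        # New tags are kept only if relevant enough, weighted by their score.
        if tag not in vector and score >= threshold:
            vector[tag] = score
    return vector
```

The resulting dictionary can serve directly as a sparse feature vector for clustering or classification, and its keys as the automatic annotation of the video.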

Summary. The Social Web contains visual information (photos, videos) and meta data (tags, time stamps, social links, spatial information, ...). A large variety of users provide explicit and implicit feedback in social web environments (ratings, views, favorite assignments, comments, content of uploaded material). Visual information and annotations can be combined to obtain enhanced feature representations. Visual information can help to establish links between resources such as videos (application: information propagation). Feature representations in combination with community feedback can be used for machine learning (application: classification, mapping).

References
Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Content Redundancy in YouTube and its Application to Video Tagging. ACM Transactions on Information Systems (TOIS), 2011.
Stefan Siersdorfer, Jonathon Hare, Enrico Minack, Fan Deng: Analyzing and Predicting Sentiment of Images on the Social Web. 18th ACM Multimedia Conference (MM 2010), Florence, Italy.
Stefan Siersdorfer, Jose San Pedro, Mark Sanderson: Automatic Video Tagging using Content Redundancy. 32nd ACM SIGIR Conference, Boston, USA, 2009.
Stefan Siersdorfer, Jose San Pedro: Ranking and Classifying Attractiveness of Photos in Folksonomies. 18th International World Wide Web Conference, WWW 2009, Madrid, Spain.
David Crandall, Lars Backstrom, Dan Huttenlocher, Jon Kleinberg: Mapping the World's Photos. 18th International World Wide Web Conference, WWW 2009, Madrid, Spain.