Generating Summaries and Visualization for Large Collections of Geo-referenced Photographs Alexander Jaffe*, Mor Naaman*, Tamir Tassa †, Marc Davis $ *Yahoo!

Presentation transcript:

Generating Summaries and Visualization for Large Collections of Geo-referenced Photographs
Alexander Jaffe*, Mor Naaman*, Tamir Tassa†, Marc Davis$
* Yahoo! Research Berkeley
† Open University of Israel
$ Yahoo! Research

Generating Summaries - Mor Naaman
Slide 2: Attraction Map of Paris
Stanley Milgram, Psychological Maps of Paris

Slide 3: Attraction Map of London
Jaffe et al., 2006

Slide 4: Information Overload?
Flickr “geotagged”

Slide 5: Overview
– Problem definition
– Intuition for solution
– Algorithm for summarization
– Visualizing the dataset
– Evaluation
– Demo?

Slide 6: Problem Definition
Dataset:
– (photo_id, user_id, latitude, longitude)
– (photo_id, tag)
Result:
– (photo_id, rank)
Given all photos from a geographic region, find a “representative” summary set.
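The two input relations on this slide can be held in memory as one record per photo. The sketch below is only an illustration under assumptions: the `Photo` class, the `build_photos` helper, and the sample rows are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    """One geo-referenced photo (hypothetical record type)."""
    photo_id: str
    user_id: str
    latitude: float
    longitude: float
    tags: set = field(default_factory=set)


def build_photos(locations, tag_rows):
    """Join (photo_id, user_id, latitude, longitude) rows with (photo_id, tag) rows."""
    photos = {pid: Photo(pid, uid, lat, lon) for pid, uid, lat, lon in locations}
    for pid, tag in tag_rows:
        # Ignore tag rows that refer to photos with no location row.
        if pid in photos:
            photos[pid].tags.add(tag)
    return photos


# Made-up sample rows, for illustration only.
locations = [("p1", "u1", 37.8199, -122.4783), ("p2", "u2", 37.7952, -122.4028)]
tag_rows = [("p1", "golden gate bridge"), ("p1", "california"), ("p2", "transamerica")]
photos = build_photos(locations, tag_rows)
```

The desired result, (photo_id, rank), is then just an ordered list of photo ids.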

Slide 7: Issues to Tackle
– Noisy data (example tags: Whatever, color, city, spectrum, santa barbara, california, usa, Lookatme, Herbert Bayer Chromatic Gate)
– Photographer biases
  – In locations
  – In tags
– Wrong data

Slide 8: Intuition
– More “activity” in a certain location indicates the importance of that location
– Tags that are unique to a certain location can suggest the importance of that location

Slide 9: (Very) Simple Example

Slide 10: Algorithm Overview
1. Hierarchical clustering of the location data
2. For each cluster, generate a cluster score
3. Recursively generate an ordering of all photos in each cluster, based on subcluster score and ordering
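Step 3 can be sketched as a recursion over an already-built cluster tree. This is a rough illustration, not the method from the paper: the `Cluster` class and the interleaving rule (favour subclusters in proportion to their score) are assumptions, the tree and scores from steps 1 and 2 are taken as given, and all scores are assumed to be positive.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Node of a cluster hierarchy (hypothetical structure)."""
    score: float                                  # step 2 output, assumed > 0
    photos: list = field(default_factory=list)    # leaf: photo ids, already ordered
    children: list = field(default_factory=list)  # internal node: subclusters


def rank_photos(cluster):
    """Step 3: recursively order all photos in a cluster."""
    if not cluster.children:
        return list(cluster.photos)
    # Order each subcluster first, then interleave those orderings.
    queues = [(child.score, rank_photos(child)) for child in cluster.children]
    taken = [0] * len(queues)
    ranking = []
    while True:
        candidates = [i for i, (_, q) in enumerate(queues) if taken[i] < len(q)]
        if not candidates:
            break
        # Assumed rule: pick the subcluster whose next photo is "due" earliest
        # given its score; on ties, prefer the one that contributed fewer photos.
        best = min(candidates, key=lambda i: ((taken[i] + 1) / queues[i][0], taken[i]))
        ranking.append(queues[best][1][taken[best]])
        taken[best] += 1
    return ranking


# Toy tree with made-up scores: subcluster "a" scores twice as high as "b".
root = Cluster(score=3.0, children=[
    Cluster(2.0, photos=["a1", "a2", "a3"]),
    Cluster(1.0, photos=["b1", "b2"]),
])
ranking = rank_photos(root)  # ['a1', 'b1', 'a2', 'a3', 'b2']
```

With this rule every subcluster gets a photo near the top of the ranking, while higher-scoring subclusters contribute more of the early positions.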

Slide 11: The Clustered Return of the (Very) Simple Example!
[Figure labels: 4, 6, 5; 8, 7; 4, 8, 6, 5, 7; 2010]

Slide 12: Generating a Summary
– A complete ranking is produced for all photos in the dataset
– An n-photo summary is simply the first n photos in this ranking
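In code, taking a summary is a prefix slice of the ranking. The function name and the sample ranking below are hypothetical.

```python
def summarize(ranking, n):
    """Return the n-photo summary: the first n photos of a complete ranking."""
    return ranking[:n]


ranking = ["p4", "p8", "p6", "p5", "p7"]  # made-up complete ranking
top2 = summarize(ranking, 2)
top3 = summarize(ranking, 3)
```

One consequence of this design is that summaries are nested: the 2-photo summary is always the start of the 3-photo summary, so growing a summary never removes a photo already shown.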

Slide 13: Generating Cluster Scores
Main factors:
– Number of photos
– Relevance (bias) factors
– “Tag Distinguishability”
– “Photographer Distinguishability”

Slide 14: Tag Distinguishability
A measure of uniqueness of concepts represented in the cluster (“document”)
TF/IDF based:
– Compute frequency of each tag (TF)
– Compute (inverse) frequency of tag in the rest of the dataset (IDF)
– Aggregate TF/IDF over all tags in cluster using L2 norm
Or, if you like formulas: Read the damn paper!
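For readers who do like formulas, the three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formula: TF is taken as the raw count of photos in the cluster carrying the tag, and IDF as the dataset size divided by the number of photos carrying the tag in the whole dataset (the slide says "the rest of the dataset"; the whole dataset is used here to avoid division by zero). The function name and sample data are hypothetical.

```python
import math
from collections import Counter


def tag_distinguishability(cluster_tags, all_tags):
    """L2 norm of per-tag TF * IDF values for one cluster.

    cluster_tags: list of tag sets, one per photo in the cluster.
    all_tags: list of tag sets, one per photo in the whole dataset
              (must include the cluster's photos).
    """
    tf = Counter(tag for tags in cluster_tags for tag in tags)  # count in cluster
    df = Counter(tag for tags in all_tags for tag in tags)      # count in dataset
    n = len(all_tags)
    return math.sqrt(sum((tf[t] * n / df[t]) ** 2 for t in tf))


# Made-up data: "bridge" appears only in cluster A, "california" everywhere.
cluster_a = [{"bridge", "california"}, {"bridge", "california"}]
cluster_b = [{"california"}, {"california"}]
dataset = cluster_a + cluster_b

score_a = tag_distinguishability(cluster_a, dataset)
score_b = tag_distinguishability(cluster_b, dataset)
```

Cluster A scores higher than cluster B because it carries a tag that is rare elsewhere, which matches the intuition that location-specific tags signal importance.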

Slide 15: Summary of San Francisco
[Photo captions: Golden Gate Bridge, TransAmerica, AT&T Baseball Park, Golden Gate, Twin Peaks, Golden Gate, Bay Bridge, Ocean Beach, Chinatown]

Slide 16: Progress Bar (almost done)
– Problem definition
– Intuition for solution
– Algorithm for summarization
– Visualizing the dataset
– Evaluation
– Demo?

Slide 17: Tag Maps
Observation:
– The algorithm identifies “representative” locations
– The algorithm identifies unique, important tags
Can be used to visualize the dataset!

Slide 18: Tag Maps

Slide 19: Tag Maps

Slide 20: Ok, how do we evaluate this?
Direct human evaluation of algorithmic results
– Evaluated Tag Maps with various weighting options
– Compared summaries to 3 base conditions
Compared chosen locations to top 15 locations selected by humans (Milgram-style)

Slide 21: Maybe we have time for a demo

Slide 22: Maybe we have time for Q’s
(applied in prototype cameraphone app)
(more on this and other topics)