Finding Wormholes with Flickr Geotags. Maarten Clements, Marcel Reinders, Arjen de Vries, Pavel Serdyukov. December 3rd, 2009. GIS
Maarten Clements. PhD: personalized retrieval in social media. Faculty of EEMCS, ICT group. Supervisors: º Marcel Reinders, Prof. Bioinformatics (and more) º Arjen de Vries, CWI, Prof. MM Dataspaces
Location prediction. Predict relevant locations: º Location → Location º User → Location. Why? (Photos: Flickr: MarsW; Flickr: msokal.)
Flickr. Photo sharing website. º Billions of photos º Active community: tags, geotags, favorites, comments… 91.4M geotags in Flickr.
Flickr. Using the Flickr API to collect data. º Strategy to find people who geotag: first collected the top cities: 1. 'New York, NY, United States' 2. 'London, England, United Kingdom' 3. 'San Francisco, California, United States' 4. 'Paris, Ile-de-France, France' 5. … Lo Verdes, Canary Islands, Spain
Flickr. Repeat: º Select a city based on the full distribution º Get a (geotagged) photo at this location º Select the user who made the photo º Get all of this user's photos. (Figure axis: City.)
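The sampling loop on this slide can be sketched roughly as follows. This is a minimal illustration, not the crawler used for the talk: the dictionaries `CITY_PHOTOS` and `USER_PHOTOS` are hypothetical in-memory stand-ins for Flickr API calls, and the assumption that "full distribution" means "proportional to the number of photos per city" is ours.

```python
import random

# Hypothetical stand-in data: city -> list of (photo id, owner).
CITY_PHOTOS = {
    "New York": [("p1", "alice"), ("p2", "bob")],
    "London": [("p3", "carol")],
}
# Hypothetical stand-in for "get all of this user's photos".
USER_PHOTOS = {
    "alice": ["p1", "p7", "p8"],
    "bob": ["p2"],
    "carol": ["p3", "p9"],
}

def sample_users(n_rounds, rng):
    """Repeat: pick a city by its share of photos, pick a photo there,
    then collect everything the photo's owner uploaded."""
    cities = list(CITY_PHOTOS)
    # Assumed city distribution: weight proportional to photo count.
    weights = [len(CITY_PHOTOS[c]) for c in cities]
    collected = {}
    for _ in range(n_rounds):
        city = rng.choices(cities, weights=weights, k=1)[0]
        _photo_id, owner = rng.choice(CITY_PHOTOS[city])
        # Drawing the same user twice simply overwrites the same entry.
        collected[owner] = USER_PHOTOS[owner]
    return collected

crawl = sample_users(n_rounds=20, rng=random.Random(0))
```

Sampling cities by their share of photos favours users from heavily photographed places, which is worth keeping in mind when interpreting the resulting user set.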
Flickr. Users: 36,264. Photos: 52,425,279. Geotags: 22,710,496.
Flickr: Tags, Titles, Time stamps, Social network, Descriptions, Groups.
Wormholes. Places that are similar but not necessarily spatially close. Use user travel patterns to detect these places. Assumptions: º Users have a certain travel preference º Users take photos at places they like
Wormholes. Given a target location, find relevant users: weight each user's Euclidean distance to the target with a normal distribution.
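As a rough illustration of the weighting step, the sketch below gives each user an unnormalized Gaussian weight based on distance to the target. Two things are assumptions rather than details from the slides: a user's distance is taken from their closest geotag, and coordinates are treated as planar kilometres. The names `user_weight` and `users` are hypothetical.

```python
import math

def user_weight(target, geotags, sigma):
    """Gaussian weight of one user for a target location.
    Assumption: the user's distance is that of their closest geotag."""
    d = min(math.dist(target, g) for g in geotags)
    return math.exp(-d * d / (2.0 * sigma * sigma))

# Hypothetical planar coordinates in km.
target = (0.0, 0.0)
users = {
    "near": [(0.0, 0.0), (500.0, 20.0)],  # one photo exactly at the target
    "mid": [(3.0, 4.0)],                  # 5 km away
    "far": [(300.0, 400.0)],              # 500 km away
}
weights = {u: user_weight(target, tags, sigma=5.0) for u, tags in users.items()}
```

With σ = 5 km, the user at the target gets weight 1, the user one σ away gets exp(-0.5), and the distant user's weight is effectively zero.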
Wormholes. Given a target location, find relevant users: weight each user's Euclidean distance to the target with a normal distribution. Aggregate data over all users, using the computed weights. º 2000x4000 histogram; example 4x8 (panels: User 1, User 2, User 1+2).
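The aggregation step could look something like the following on the 4x8 example grid. The mapping from latitude/longitude to cells (rows span 180° of latitude, columns span 360° of longitude) and the choice to add the raw user weight once per geotag are assumptions made for this sketch; `cell_of` and `aggregate` are hypothetical names.

```python
def cell_of(lat, lon, rows, cols):
    """Map latitude/longitude in degrees to a (row, col) grid cell."""
    r = min(int((90.0 - lat) / 180.0 * rows), rows - 1)
    c = min(int((lon + 180.0) / 360.0 * cols), cols - 1)
    return r, c

def aggregate(user_geotags, user_weights, rows=4, cols=8):
    """Sum every user's geotags into one grid, scaled by that user's weight."""
    grid = [[0.0] * cols for _ in range(rows)]
    for user, tags in user_geotags.items():
        w = user_weights[user]
        for lat, lon in tags:
            r, c = cell_of(lat, lon, rows, cols)
            grid[r][c] += w
    return grid

# Hypothetical example data.
geotags = {
    "user1": [(48.86, 2.29), (48.85, 2.35)],    # two photos in Paris
    "user2": [(48.86, 2.29), (40.71, -74.01)],  # Paris and New York
}
weights = {"user1": 1.0, "user2": 0.5}
hist = aggregate(geotags, weights)
```

Here the Paris cell receives 1.0 + 1.0 + 0.5 = 2.5 and the New York cell receives 0.5, so highly weighted users dominate the combined histogram.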
Wormholes. Given a target location, find relevant users: weight each user's Euclidean distance to the target with a normal distribution. Aggregate data over all users, using the computed weights. Compute the convolution with a Gaussian kernel. Compute the difference with the expected geotag distribution.
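A small pure-Python sketch of the last two steps is given below. It smooths both grids with a separable Gaussian kernel and subtracts them. Several choices are assumptions made for illustration: cells outside the grid count as zero (so longitude does not wrap around), both grids are normalized to sum to 1 before the subtraction, and the expected distribution is smoothed with the same kernel. All function names and the example grids are hypothetical.

```python
import math

def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def conv_1d(row, kernel):
    """Convolve one row with the kernel; out-of-range cells count as zero."""
    radius = len(kernel) // 2
    n = len(row)
    return [sum(kernel[j + radius] * row[i + j]
                for j in range(-radius, radius + 1) if 0 <= i + j < n)
            for i in range(n)]

def smooth(grid, kernel):
    """2-D Gaussian smoothing as a horizontal pass followed by a vertical pass."""
    horiz = [conv_1d(row, kernel) for row in grid]
    vert = [conv_1d(list(col), kernel) for col in zip(*horiz)]
    return [list(row) for row in zip(*vert)]  # transpose back

def normalize(grid):
    total = sum(map(sum, grid))
    return [[v / total for v in row] for row in grid]

def wormhole_score(weighted, expected, kernel):
    """Smoothed, normalized weighted histogram minus the expected one."""
    a = normalize(smooth(weighted, kernel))
    b = normalize(smooth(expected, kernel))
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical 4x8 grids: weighted histogram for one target, and the
# expected (unweighted) geotag distribution.
weighted = [[0.0] * 8 for _ in range(4)]
weighted[0][4], weighted[1][2] = 2.5, 0.5
expected = [[0.0] * 8 for _ in range(4)]
expected[0][4], expected[1][2], expected[2][6] = 2.0, 1.0, 3.0

score = wormhole_score(weighted, expected, gaussian_kernel(radius=1, sigma=1.0))
```

Positive cells are places that the relevant users visit more than the population as a whole; cells that are merely popular with everyone end up near zero or negative.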
Wormholes: Result
Wormholes. Sigma determines how many users we call relevant. Large σ: many relevant users. Small σ: few relevant users.
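To make the effect of σ concrete, the sketch below counts how many users receive a non-negligible Gaussian weight for a few values of σ. The cut-off `threshold=0.01` for calling a user "relevant" and the example distances are assumptions introduced here; the slides do not state a hard threshold.

```python
import math

def relevant_count(distances_km, sigma_km, threshold=0.01):
    """Number of users whose Gaussian weight reaches the (assumed) threshold."""
    return sum(
        1 for d in distances_km
        if math.exp(-d * d / (2.0 * sigma_km ** 2)) >= threshold
    )

# Hypothetical distances from each user's closest geotag to the target.
distances = [0.5, 2.0, 15.0, 120.0, 900.0]
counts = {s: relevant_count(distances, s) for s in (1.0, 10.0, 100.0)}
```

For these distances the count grows from 2 users at σ = 1 km to 3 at σ = 10 km and 4 at σ = 100 km.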
Evaluation. Find ground truth data: Wikipedia, GeoNames.
Evaluation. Rank predicted peaks and compute precision: is there a mountain within a range of 3 cells around the predicted peak? (Plot: average precision versus σ (km).) So… does it work?
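One way to compute such a score is sketched below. It is an illustration under assumptions: "within a range of 3 cells" is read as Chebyshev distance on the grid, and average precision is taken as the mean of precision@k over the ranks at which a hit occurs. The function names and example cells are hypothetical.

```python
def is_hit(peak, truths, radius=3):
    """True if any ground-truth cell lies within `radius` cells of the peak
    (Chebyshev distance; an assumption about what 'range' means)."""
    return any(
        max(abs(peak[0] - t[0]), abs(peak[1] - t[1])) <= radius
        for t in truths
    )

def average_precision(ranked_peaks, truths, radius=3):
    """Mean of precision@k over the ranks k at which a hit occurs."""
    hits = 0
    precisions = []
    for k, peak in enumerate(ranked_peaks, start=1):
        if is_hit(peak, truths, radius):
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Hypothetical grid cells: ground-truth mountains and ranked predicted peaks.
mountains = [(10, 10), (50, 80)]
peaks = [(11, 12), (30, 30), (52, 79), (0, 0)]
ap = average_precision(peaks, mountains)
```

In this example the peaks at ranks 1 and 3 are hits, giving precisions 1/1 and 2/3 and an average precision of 5/6.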
Evaluation (manual). σ = 100 km
Evaluation (manual). Target: Tour Eiffel, σ = 20 m
Evaluation (manual). Target: Tour Eiffel, σ = 80 m
Evaluation (manual). Target: Tour Eiffel, σ = 300 m
Evaluation (manual). Target: Pere Lachaise, σ = 60 m
What next? User → Location: the query consists of multiple points (instead of 1). Get rid of grid-based prediction: º Compute kernel convolution peaks directly from continuous geotag data.
Conclusions. We have proposed a new method to predict similar locations based on geotags. The scale parameter can be used to predict relevant locations at different scales. ECIR’10: Comparing different user aggregation methods.