Mor Naaman, Yee Jiun Song, Andreas Paepcke, Hector Garcia-Molina. Digital Library Project, Database Group, Stanford University. Automatic Organization for Digital Photographs with Geographic Coordinates
JCDL Geo-Referenced Photos April 8th, :20:02pm Latitude: N Longitude: W
JCDL Geo-Photography Technology + 1)2)
JCDL Personal Photo Libraries Searching/browsing very difficult Little discernible structure to photo collections
JCDL Managing Personal Photos Content-based retrieval –Basic, primitive (far from semantic) Manual labeling –Improved, yet cumbersome Visual methods for fast scanning (Zoom) –Don’t scale well
JCDL Our Approach Absolutely no human effort required Utilize time and location –Automatically captured –Easy to get
JCDL Automatic Organization
JCDL Outline Requirements and challenges The algorithms Sample output Experiment results
JCDL Browsing by Location/Time Use a map/calendar –wwmx.org from MSR: Map issues –Lots of screen space –Sparse –Limited interaction? –Not intuitive for some
Using Hierarchies Location: United States; Around: San Francisco, Berkeley, Sonoma CA; San Francisco, Golden Gate Park, CA; Berkeley, Oakland CA; Yosemite N.P, Yosemite Valley, CA; Seattle, WA; … Time: : Yosemite N.P. (2 Days) : San Francisco (1 hour)
JCDL Challenges Locations should be intuitive Events are tricky –3-day trip to NYC –The kid’s soccer game, followed by a birthday party Good names are important
JCDL Outline Requirements and challenges The algorithms Sample output Experiment results
JCDL Process Diagram
JCDL Discovering Structure Location Hierarchy Initial Event Segmentation Location Clustering Final Event Segmentation Event Hierarchy Initial Event Segmentation Automatic Organization
JCDL Initial Event Segmentation Photos occur in bursts Identify bursts: semantically “connected”
JCDL Initial Event Segmentation Stream of photos More details: Graham et al, JCDL 2002 Tomorrow Proceedings
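As a rough illustration of burst detection, the sketch below splits a time-sorted photo stream wherever the gap between consecutive photos exceeds a fixed threshold. The one-hour gap, the function name `split_into_bursts`, and the sample timestamps are assumptions made for illustration; the segmentation actually used is the one described in Graham et al., JCDL 2002.

```python
from datetime import datetime, timedelta

def split_into_bursts(timestamps, max_gap=timedelta(hours=1)):
    """Group timestamps into bursts separated by gaps longer than max_gap.

    max_gap is a hypothetical fixed threshold, not a value from the talk.
    """
    bursts = []
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][-1] <= max_gap:
            bursts[-1].append(t)   # close enough to the previous photo
        else:
            bursts.append([t])     # large gap: start a new burst
    return bursts

# Hypothetical photo stream: three afternoon photos, two evening photos.
photos = [
    datetime(2004, 6, 7, 16, 20),
    datetime(2004, 6, 7, 16, 25),
    datetime(2004, 6, 7, 16, 50),
    datetime(2004, 6, 7, 21, 5),
    datetime(2004, 6, 7, 21, 10),
]
bursts = split_into_bursts(photos)
print([len(b) for b in bursts])  # [3, 2]
```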
JCDL Discovering Structure Location Hierarchy Initial Event Segmentation Final Event Segmentation Event Hierarchy Location Clustering Automatic Organization
JCDL Location Clusters Cluster the bursts into locations A. Gionis and H. Mannila. Finding recurrent sources in sequences. In Proceedings, Computational molecular biology –Minimize: number of clusters –Minimize: error (distance to cluster centers)
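The two competing objectives above (few clusters, small error) can be sketched as a single penalized cost. The example below is not the Gionis and Mannila algorithm: it swaps in plain k-means and picks the cluster count that minimizes `error + cluster_penalty * k`. The penalty value, the sample coordinates, and the treatment of latitude/longitude as flat plane coordinates are all simplifying assumptions.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means on 2-D points with a deterministic initialization."""
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            if g else centers[i]          # keep the old center if a group is empty
            for i, g in enumerate(groups)
        ]
    # Error: total distance from each point to its nearest center.
    error = sum(min(math.dist(p, c) for c in centers) for p in points)
    return centers, error

def choose_clustering(points, max_k, cluster_penalty):
    """Pick the k that minimizes error + cluster_penalty * k (hypothetical cost)."""
    best = None
    for k in range(1, min(max_k, len(points)) + 1):
        centers, error = kmeans(points, k)
        cost = error + cluster_penalty * k
        if best is None or cost < best[0]:
            best = (cost, centers)
    return best[1]

# Hypothetical burst centers as (latitude, longitude): two distinct areas.
burst_centers = [
    (37.40, -122.20), (37.45, -122.15), (37.42, -122.18),
    (47.60, -122.30), (47.62, -122.33), (47.61, -122.31),
]
centers = choose_clustering(burst_centers, max_k=4, cluster_penalty=1.0)
print(len(centers))  # 2
```

A larger `cluster_penalty` favors fewer, coarser locations; a smaller one allows more, tighter clusters.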
Photo location Location Clusters: 2-D View
2-D View: with Bursts
JCDL Location Clusters (figure rows: Location 1, Location 2, Location 3, Location 4)
Location Clusters (breakdown) (figure rows: Location 1, Location 2, Location 3, Location 4; label: San Francisco) Some clusters may be overloaded: –Many bursts / picture-taking days in one location
JCDL Discovering Structure Location Hierarchy Initial Event Segmentation Location Clustering Event Hierarchy Final Event Segmentation Automatic Organization
JCDL Final Event Segmentation Again scan sequence, new events detected: –Whenever location context changes –In the same location, use adaptive time threshold
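The scan described above can be sketched as follows. How the time threshold adapts is not stated on the slide, so this sketch assumes a hypothetical per-location lookup (`gap_for_location`) supplied by the caller; the names, thresholds, and sample bursts are illustrative only.

```python
from datetime import datetime, timedelta

def segment_events(bursts, gap_for_location, default_gap=timedelta(hours=12)):
    """Merge time-sorted bursts into events.

    bursts: list of (start, end, location_id) tuples, sorted by start time.
    gap_for_location: hypothetical map from location_id to the largest time
    gap that still counts as the same event at that location.
    """
    events = []
    for burst in bursts:
        start, _end, loc = burst
        if events:
            _prev_start, prev_end, prev_loc = events[-1][-1]
            threshold = gap_for_location.get(loc, default_gap)
            # Same event only if the location is unchanged AND the gap is small.
            if loc == prev_loc and start - prev_end <= threshold:
                events[-1].append(burst)
                continue
        events.append([burst])  # location changed or gap too long: new event
    return events

# Hypothetical thresholds: short for a frequently visited city, long for a trip.
gaps = {"San Francisco": timedelta(hours=6), "Yosemite": timedelta(hours=48)}
bursts = [
    (datetime(2004, 6, 7, 10), datetime(2004, 6, 7, 11), "San Francisco"),
    (datetime(2004, 6, 7, 19), datetime(2004, 6, 7, 20), "San Francisco"),
    (datetime(2004, 6, 12, 9), datetime(2004, 6, 12, 12), "Yosemite"),
    (datetime(2004, 6, 13, 10), datetime(2004, 6, 13, 11), "Yosemite"),
]
events = segment_events(bursts, gaps)
print([len(e) for e in events])  # [1, 1, 2]
```

With these assumed thresholds, the two San Francisco bursts eight hours apart become separate events, while the two Yosemite bursts on consecutive days are merged into one overnight trip.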
JCDL 2004 Final Event Segmentation Overnight trip to Yosemite Soccer game and dinner
JCDL Next - names Detected location and event structure Need to choose names for each node
Assigning Names Photo location: Stanford; Palo Alto City Park; Palo Alto; Butano State Park. Stanford 42, Palo Alto 30, Butano 10, P.A. park 8
Assigning Names – Nearby? San Jose, 20 miles San Francisco, 30 miles What if photos occur sparsely within cities or parks?
JCDL 2004 Assigning Names - Nearby Which city has stronger “gravity”?
JCDL 2004 Assigning Names - Nearby San Jose is Closer
JCDL 2004 Assigning Names - Nearby San Jose is bigger* *larger population
JCDL 2004 Assigning Names - Nearby But San Fran is more important!* *greater Google count Final name for location cluster: “Stanford, 30 miles South of SF”
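Once a reference city has been chosen, a name of the form "X, N miles S of Y" only needs a distance and a compass direction. The sketch below shows one way to compute both with the haversine formula and the initial great-circle bearing. It does not model how the reference city is picked (distance, population, and Google count are weighed in a way the slides do not spell out), and the function names, 8-point compass, and sample coordinates are assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def direction_from(lat1, lon1, lat2, lon2):
    """8-point compass direction of point 2 as seen from point 1."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    x = math.sin(dlmb) * math.cos(p2)
    y = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    bearing = math.degrees(math.atan2(x, y)) % 360   # 0 = north, clockwise
    return COMPASS[round(bearing / 45) % 8]

def describe(place, place_pos, city, city_pos):
    """Format a name such as 'Place, 35 miles S of City'."""
    miles = round(distance_miles(*city_pos, *place_pos))
    heading = direction_from(*city_pos, *place_pos)
    return f"{place}, {miles} miles {heading} of {city}"

# Hypothetical coordinates: the place lies half a degree due south of the city.
label = describe("Place", (37.5, -122.0), "City", (38.0, -122.0))
print(label)  # Place, 35 miles S of City
```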
JCDL Assigning Names - Alexandria Using polygon-based dataset of administrative areas Alexandria gazetteer can be used for other prominent geographic features
JCDL Outline The requirement and challenges of automatic organization The algorithms Sample output Experiment results
JCDL Location Hierarchy Photoshop Album (at least 4 man-hours) Our system (about 0 man-seconds)
Location Hierarchy (US)
+San Francisco, Berkeley, Sonoma, CA
-Stanford, Mountain View, Monterey, CA
   Monterey (58 miles S of San Jose)
   Mountain View (4 miles NW of San Jose)
   Stanford
-Colorado (219 miles W of Denver)
-Long Beach (35 miles S of Los Angeles, CA)
-Philadelphia, PA
-Seattle, WA
-Sequoia N.P. (153 miles E of Fresno, CA)
-South Lake Tahoe; Bear Valley, CA
-Yosemite N.P.; Yosemite Valley, CA
Events Our system (about 0 man-seconds): : Long Beach, CA (3 days) : San Francisco, CA (3 hours) : Colorado (3 days) : San Francisco, CA (1 hour) : Mountain View, CA (5 hours) : San Francisco, CA (1 hour) : Philadelphia, PA (1 hour) : Sequoia NP (3 days) ... Photoshop Album (at least 4 man-hours)
JCDL Event Names LOCALE: share automatically Check personal calendar Event Gazetteer Easy interface
JCDL Experiment Tested on 3 real-world geo-referenced photo collections Our system automatically generated the structure and names Tested with the owners
JCDL Experiment - Locations Owners accepted the automatic hierarchy Only minor edits requested –Merge/split a few of the locations
JCDL Experiment - Events Compared to events as annotated by users 80-85% in both recall and precision Other metrics proposed (see paper)
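For readers unfamiliar with the two metrics, the sketch below scores detected event boundaries against user-annotated ones. It assumes boundaries are identified by photo index and must match exactly, which may differ from the matching rule used in the paper; the sample boundary sets are hypothetical and are not the experiment's data.

```python
def boundary_precision_recall(detected, annotated):
    """Precision and recall of detected event boundaries.

    detected, annotated: sets of photo indices at which a new event starts.
    Exact index matching is an assumption made for this sketch.
    """
    hits = len(detected & annotated)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall

# Hypothetical boundaries: 3 of 5 detections are correct, 3 of 4 true ones found.
annotated = {3, 7, 12, 20}
detected = {3, 7, 13, 20, 25}
precision, recall = boundary_precision_recall(detected, annotated)
print(precision, recall)  # 0.6 0.75
```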
JCDL Experiment - Naming Naming location clusters –For 76% of clusters, system and users pick at least one name in common –For the rest, “automatic” name was useful
Not yet published: Paid 13 participants to “geo-reference” their photos Loaded to WWMX and our browser –Most liked the map better, but… –Performed the same for search/browse tasks –Event notion helps overcome location handicap –Organization “made sense” P.S. Some didn’t touch the map, yet used our location hierarchy. P.S.2 This was on a BIG screen!
JCDL Thank You! More details: Proceedings Google: Mor Naaman
JCDL Future Work User interface PDA Integrate with map Global photo libraries
JCDL Enhancing Personal Collections Browse/search for photos by location Detect events
JCDL Our System: PhotoCompas Location hierarchy Event hierarchy Our algorithm: –Creates them simultaneously Such that they inform each other –Assigns geo-names using gazetteers and Google
48 hours in Seattle: One trip. != 48 hours in San Francisco: Countless events. Monday 7am Monday 4pm Tuesday 8am Tuesday 1pm Monday 11am Monday 1pm Monday 7pm …
JCDL Why not use a simple location hierarchy? What’s wrong with country, state, city? State divisions are usually, but not always, recognized by users (in the US) City list likely to be too long Many countries do not have “states”
JCDL Initial Event Segmentation
JCDL Remember The Bursts?
JCDL Location Clusters