Alexandria Digital Library Project
Goals and Challenges in Georeferenced Digital Libraries
Greg Janée
Goals
o Digital library: “an integrated set of services for capturing, cataloging, storing, searching, protecting, and retrieving information”
o ADL: a lightweight, distributed digital library for heterogeneous, georeferenced information
  a system and an infrastructure
  – supports personal collections... institutions
  – provides interoperability across spatial data providers
Adjectives
o Heterogeneous
  remotely-sensed imagery; textual documents
  multimedia instructional materials; executable models
  gazetteer placenames
o Georeferenced
  generalizes to “scientific data”: any highly-structured, metadata-rich information
o Distributed
  for scalability
o Lightweight
  accommodate small, cheap (i.e., free) implementations
  include non-traditional spatial data sources
Where we are today
o Downloadable server software, two clients
o In operational use by MIL
o Other (potential) users:
  Bren/ESSW
  Scripps
  DLESE
  Norwegian National Library
  Auckland University of Technology
Challenges
o Discovery
o Gazetteers
o Ranking
o Scalability
o Context
o Client integration
o More at
Challenge 1: discovery
o Can’t beat word search when it works
  I want a map of Boulder
  “Downtown street map of Boulder, Colorado”
o But there are so many names for a place...
  Boulder, Arapahoe County, Colorado
  Chautauqua, Mapleton Hill, Pearl Street Mall
  Area code 303, ZIP code 80305, UTM grid 13S
  Flatirons, Rocky Mountains, Front Range
  Landers earthquake, hurricane Hugo
If you’re still not convinced...
o Remote-sensing imagery is nameless
  “AVHRR NOAA :33 UTC”
o Challenge: exactly which two words will find a USGS map of the Flatirons in the Rocky Mountains behind Boulder, Arapahoe County, Colorado?
  Eldorado Springs
ADL approach
o Coordinate-based representation and discovery
  generic lat/lon coordinates
  rich geometry
  – polygons, polylines
  spatial operators
  – overlaps, contains
o Gazetteer defines representation of places
  maps placenames to coordinates
  [Diagram: client, gazetteer, library; labels: placenames, coordinates]
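The coordinate-based approach can be sketched in a few lines. This is only an illustration, not ADL's implementation: it assumes simple lat/lon bounding boxes rather than polygons or polylines, and the `Box` class, the `GAZETTEER` and `CATALOG` tables, and all footprint values are hypothetical. The client resolves a placename through the gazetteer, then queries the library by coordinates using the overlaps operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned lat/lon bounding box in degrees; ignores date-line wrap."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other: "Box") -> bool:
        # True when the two boxes share at least a boundary point.
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other: "Box") -> bool:
        # True when `other` lies entirely inside this box.
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Hypothetical gazetteer: placename -> footprint (illustrative values only).
GAZETTEER = {
    "boulder": Box(-105.30, 39.96, -105.18, 40.09),
}

# Hypothetical library catalog: item title -> footprint (illustrative values).
CATALOG = {
    "Eldorado Springs quadrangle": Box(-105.375, 39.875, -105.25, 40.0),
    "World Map": Box(-180.0, -90.0, 180.0, 90.0),
    "Santa Barbara street map": Box(-119.80, 34.38, -119.63, 34.47),
}


def search(placename: str) -> list[str]:
    """Resolve a placename to coordinates, then query the catalog by overlap."""
    region = GAZETTEER[placename.lower()]
    return sorted(title for title, fp in CATALOG.items() if fp.overlaps(region))


print(search("Boulder"))
```

Note that the item is found without its title ever mentioning Boulder, and that the World Map matches as well because it overlaps every query region.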
Challenge 2: gazetteers: necessary evil
o Few (public) sources of gazetteer data
o Lousy quality
  digitized from maps
o Difficult problems
  conflation
  classification
  boundary determination
  change over time
o Conclusion
  gazetteer-based spatial reasoning seems unlikely
  interaction will likely remain client-centric
Final thoughts on discovery
o Coordinate-based approach is costly
  burden on users and catalogers
  limits potential collections
  relies on gazetteer’s weakest aspect: footprints
  continuous coordinate space adds complexity
o Gazetteer improvements
  federated gazetteers
  new gazetteer models: topological as opposed to metric
o Other coordinate spaces, grids, etc.
Challenge 3: ranking
o Observed phenomenon: World Map is first result of every query
o Idea: rank by spatial similarity to query region
  [Figure: query region]
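One possible reading of "spatial similarity" is sketched below. The slide does not say which measure is meant; this sketch assumes intersection area divided by union area of bounding boxes, which is a common choice but not necessarily the one ADL uses. Boxes are `(west, south, east, north)` tuples, and all titles and footprint values are hypothetical.

```python
def area(box):
    """Area of a (west, south, east, north) box in square degrees.

    Square degrees are not true ground area; they suffice for a relative score.
    """
    west, south, east, north = box
    return max(0.0, east - west) * max(0.0, north - south)


def intersection(a, b):
    """Overlapping part of two boxes; has zero area when they are disjoint."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def similarity(query, footprint):
    """Intersection area over union area: 1.0 for identical boxes."""
    inter = area(intersection(query, footprint))
    union = area(query) + area(footprint) - inter
    return inter / union if union > 0 else 0.0


def rank(query, catalog):
    """Titles of overlapping items, most similar footprint first."""
    scored = [(similarity(query, fp), title) for title, fp in catalog.items()]
    return [title for score, title in sorted(scored, reverse=True) if score > 0]


# Hypothetical query region and catalog (illustrative values only).
query = (-105.30, 39.96, -105.18, 40.09)
catalog = {
    "World Map": (-180.0, -90.0, 180.0, 90.0),
    "Eldorado Springs quadrangle": (-105.375, 39.875, -105.25, 40.0),
    "Boulder street map": (-105.29, 39.99, -105.24, 40.04),
}
print(rank(query, catalog))
```

Under this measure a footprint far larger than the query region scores close to zero, so the World Map still matches but drops to the bottom of the list.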
Challenge 4: scalability
o Easy to accumulate lots of data
  satellites image continuously
  1 m resolution, Earth’s surface area = 5 × 10^14 m^2
o Support for scalability
  text: amazingly good
  spatial: not so good
  – indexing becomes unwieldy at 10^6 items
  combining spatial with other constraint types is difficult
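A back-of-the-envelope calculation shows why the data accumulates so quickly. It assumes Earth's total surface area is roughly 5.1 × 10^14 m^2 and, purely for illustration, one uncompressed byte per pixel; the bytes-per-pixel figure is an assumption, not something stated on the slide.

```python
EARTH_SURFACE_M2 = 5.1e14   # approximate total surface area of Earth
RESOLUTION_M = 1.0          # ground resolution quoted on the slide
BYTES_PER_PIXEL = 1         # assumption: one 8-bit band, uncompressed

# One pixel covers RESOLUTION_M x RESOLUTION_M square meters.
pixels = EARTH_SURFACE_M2 / RESOLUTION_M ** 2
terabytes = pixels * BYTES_PER_PIXEL / 1e12

print(f"{pixels:.1e} pixels per global coverage")
print(f"{terabytes:.0f} TB per coverage at {BYTES_PER_PIXEL} byte/pixel")
```

Under these assumptions a single global coverage comes to several hundred terabytes, before multiple bands or repeat passes are counted.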
ADL approach
o Partition and distribute the problem
o Multiple levels of discovery
  find relevant collections
  search just those collections
o Support multiple implementation strategies
  spatial engine
  relational database
  home-grown
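The two levels of discovery might look like the following minimal sketch. It assumes each collection advertises a single bounding-box footprint covering all of its items; the `COLLECTIONS` structure, the collection names, and the footprint values are hypothetical, and a real deployment would query remote collection servers rather than an in-memory dictionary.

```python
def overlaps(a, b):
    """True when two (west, south, east, north) boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical collections, each with a collection-level footprint.
COLLECTIONS = {
    "colorado_maps": {
        "footprint": (-109.05, 37.0, -102.05, 41.0),
        "items": {
            "Eldorado Springs quadrangle": (-105.375, 39.875, -105.25, 40.0),
            "Denver street map": (-105.11, 39.61, -104.60, 39.91),
        },
    },
    "california_imagery": {
        "footprint": (-124.4, 32.5, -114.1, 42.0),
        "items": {
            "Santa Barbara aerial photo": (-119.80, 34.38, -119.63, 34.47),
        },
    },
}


def discover(query):
    """Level 1: pick relevant collections. Level 2: search only those."""
    searched, hits = [], []
    for name, coll in COLLECTIONS.items():
        if not overlaps(query, coll["footprint"]):
            continue  # skip the whole collection without touching its items
        searched.append(name)
        hits += [t for t, fp in coll["items"].items() if overlaps(query, fp)]
    return searched, hits


searched, hits = discover((-105.30, 39.96, -105.18, 40.09))
print(searched, hits)
```

The per-item search runs only inside collections whose footprint overlaps the query, so each collection's index can stay small and can use whichever implementation strategy suits it.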
Challenge 5: context
o Context is critical for evaluation
o Textual context:
  poem
  software
Geospatial context
o Does this answer your question?
  [Map labels: Flatirons 1-5, Flagstaff Rd., Green Mountain]
Challenge 6: client integration
o “Click here” approach places large burden on users
  navigate
  interpret
  evaluate
  download
o Service-based access will become predominant
  just as the WWW replaced FTP
o Needed:
  description/access standards, protocols
  integration with search constraints