3/20/2000 Principles of Information Retrieval: Digital Libraries – Issues & Geographic Information Retrieval. University of California, Berkeley School of Information Management and Systems.

Presentation transcript:

3/20/2000 Principles of Information Retrieval
Digital Libraries – Issues & Geographic Information Retrieval
University of California, Berkeley
School of Information Management and Systems
SIMS 240: Principles of Information Retrieval

Mini-TREC
Proposed Schedule
– February 27 – Database and previous queries
– March 6 – Report on system acquisition and setup
– March 18 – New queries for testing…
– April 29 – Results due
– May 1 – Results and system rankings
– May 6 & 8 – Group reports and discussion

Review
Application of IR to Digital Library Environments
Image Retrieval using Blobworld
Derived from a paper presented at the 1999 ASIS Annual Meeting

Today
More on Digital Libraries
– Demo of DL search and features
Geographic Information Retrieval
– Parts of this lecture were presented at the invitational conference “The ‘I’ in Geographic Information Science”, Manchester, U.K., July 2001.

User Interface Paradigms: Multivalent Documents
An approach to new document types and their authoring.
Supports active, distributed, composable transformations of multimedia documents.
Enables sophisticated annotations, intelligent result handling, user-modifiable interface, composite documents.

Multivalent Documents
[Figure: a scanned page image overlaid with separate layers labeled Cheshire Layer, OCR Layer, OCR Mapping Layer, GIS Layer and Table Layer, connected to Network Protocols & Resources.]
Valence: 2: The relative capacity to unite, react, or interact (as with antigens or a biological substrate). Webster’s 7th Collegiate Dictionary

Image Retrieval Research
Finding “Stuff” vs “Things”
BlobWorld

Overview of Cheshire II
The Cheshire II system is intended to provide an easy-to-use, standards-compliant system capable of retrieving any type of information in a wide variety of settings.

Cheshire II Searching
[Figure labels: Z39.50, Internet, Images, Scanned Text, Local, Remote]

GIS in the MVD Framework
Layers are georeferenced data sets.
Behaviors are
– display semi-transparently
– pan
– zoom
– issue query
– display context
– “spatial hyperlinks”
– annotations
Written in Java

GIS Viewer Example

Geographic Information Retrieval and Spatial Browsing
Ray R. Larson
School of Library and Information Studies
University of California, Berkeley

Concerns for Digital Libraries
Excellent summary in Distributed Geolibraries from NRC.
– Distributed resources
– Distributed users
– Distributed services
Access for a broad population is critical for many Digital Libraries

Concerns for Digital Libraries
Georeferenced Information (geoinformation) provides one organizational perspective
Other common perspectives include Topical Classification schemes, Temporal/Historical organization (ECAI)
DLs can provide multiple views of the same information

Concerns for Digital Libraries
Most DLs are intended for a broad user base:
– varying levels of expertise in the contents
– varying requirements for access methods
– simple expressions of interest in natural language should be supported
– mapping NL to controlled vocabularies (including Digital Gazetteers)

Digital Library Needs
Geographic and Spatial Querying
Spatial Browsing
Geographic and Spatial Indexing
(Berkeley DL contents and examples)

Overview
What is Geographic Information Retrieval?
Geographic and Spatial Querying and Browsing.
Geographic and Spatial Indexing.
Examples of GIR Systems and Geographically Indexed Information.

Introduction
What is Geographic Information Retrieval?
– GIR is concerned with providing access to georeferenced information sources. It includes all of the areas of traditional IR research with the addition of spatially and geographically oriented indexing and retrieval.
– It combines aspects of DBMS research, User Interface research, GIS research, and Information Retrieval research.

Introduction
The need for Geographic and Spatial Information Retrieval.
– Digital Libraries
    Sequoia 2000
    UC Berkeley NSF/NASA/ARPA Digital Library Project
    UC Santa Barbara Alexandria Project
    NSDI - National Spatial Data Infrastructure
– Next-Generation Online Catalogs
    Cheshire II

Geographic and Spatial Querying
Both imply querying on relationships within a particular coordinate system
Spatial querying is the more general term
Can be defined as queries about the spatial relationships (intersection, containment, boundary, adjacency, proximity) of entities geometrically defined and located in space
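Three of the relationships named above (intersection, containment, adjacency) can be illustrated with axis-aligned bounding boxes. This is only a minimal sketch: the `Box` type and the function names are assumptions made for illustration, and a real GIS would test arbitrary geometries rather than rectangles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box in some planar coordinate system."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def intersects(a: Box, b: Box) -> bool:
    """True if the boxes share at least one point (boundaries included)."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely within outer (boundaries included)."""
    return (outer.min_x <= inner.min_x and outer.min_y <= inner.min_y
            and outer.max_x >= inner.max_x and outer.max_y >= inner.max_y)


def adjacent(a: Box, b: Box) -> bool:
    """True if the boxes touch along a boundary but their interiors do not overlap."""
    if not intersects(a, b):
        return False
    overlap_w = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    overlap_h = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    return overlap_w == 0 or overlap_h == 0
```

Bounding boxes are a common first-pass filter because these tests are cheap; exact geometry is then checked only for the candidates that pass.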

Geographic and Spatial Querying
Geographical coordinates are geometric relationships (distance and direction can be measured on a continuous scale)
– E.g.: “5.21 miles north of Champaign”
Spatial relations may be both geometric and topological (spatially related but without measurable distance or absolute direction)
– E.g.: “inside the city limits”
– “left side of Beckman Institute”

Geographic and Spatial Querying
Types of spatial queries
– Point-in-polygon: “What do we have at this X,Y point?”
– Region Queries: “What do we have in this region?”
    Which point-encoded items lie within the region
    What lines (borders, etc.) lie within or cross the region
    What areas overlap the region area
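A point-in-polygon test and the simplest kind of region query (which point-encoded items lie within a region) might be sketched as follows. The ray-casting method is one standard technique and is an assumption here, not something the slides specify; the function names and the sample item identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: cast a ray in the +x direction and count edge crossings.

    An odd number of crossings means the point is inside. Points lying
    exactly on the boundary are not handled specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_in_region(items: Dict[str, Point], region: List[Point]) -> List[str]:
    """Region query over point-encoded items: names of items inside the region."""
    return sorted(name for name, pt in items.items()
                  if point_in_polygon(pt, region))


# Hypothetical data: a square region and two point-encoded items.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
items = {"doc-a": (2.0, 2.0), "doc-b": (5.0, 2.0)}
print(points_in_region(items, square))
```

Line and area items would need segment-intersection and polygon-overlap tests, which are omitted here.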

Geographic and Spatial Querying
Types of spatial queries, cont.
– Distance and Buffer Zone Queries
    What cities lie within 40 miles of the border of Northern and Southern Ireland?
    What wetlands lie within 50 miles of London?
– Path Queries
    What is the shortest route from San Francisco to Los Angeles?
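A distance query of the "within N miles of a place" kind can be sketched with the haversine great-circle formula. This is a simplified illustration: it measures point-to-point distance only, so a buffer around a border (a line) would additionally need point-to-line distance. The function names, the Earth radius constant and the sample coordinates are assumptions.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_distance(center: Tuple[float, float],
                    places: Dict[str, Tuple[float, float]],
                    max_miles: float) -> List[str]:
    """Names of places whose point location is within max_miles of center."""
    lat0, lon0 = center
    return sorted(name for name, (lat, lon) in places.items()
                  if haversine_miles(lat0, lon0, lat, lon) <= max_miles)


# Hypothetical point locations, chosen only to exercise the query.
sample_places = {"wetland-a": (0.5, 0.0), "wetland-b": (2.0, 0.0)}
print(within_distance((0.0, 0.0), sample_places, 50.0))
```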

Geographic and Spatial Querying
Types of spatial queries, cont.
– Multimedia Queries: Use non-map georeferenced information.
    What are the names of farmers affected by flooding in Monterey and Santa Cruz Counties?

Spatial Browsing
Combines ad hoc spatial querying with interactive displays
HyperMap concept
Pseudo-HyperMaps

Spatial Browsing
Advantages:
– May not need the accuracy of a full GIS
– Comprehensible searching metaphor for many materials
Problems:
– Clutter and differing scales
– Requires good (and preferably accurate) geographical indexing
– Assumes that the user knows some geography

Geographic and Spatial Indexing
Traditional geographic indexing involves using place names from LCSH and name authorities. These have some problems:
– Names are not unique
– The places referred to change size, shape and names over time
– Spelling variations
– Some places are temporary conventions (study areas, etc.)

Digital Gazetteers
Geographic names are and will remain the primary Entry Vocabulary for DL spatial queries
– The gazetteer must support as many variant forms of the name as possible
    Including temporal ranges for particular names
– Querying must support spatial reasoning based on gazetteer and other geographic and temporal information in the system or accessible by network access
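One way to picture a gazetteer entry with variant names and temporal ranges is sketched below. The record layout, the `lookup` function and the entry identifier are hypothetical, and the sample years are simplified, approximate values included only to illustrate the idea; this is not the schema of any actual gazetteer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class NameVariant:
    name: str
    start_year: Optional[int]  # None means open-ended
    end_year: Optional[int]    # None means open-ended


@dataclass
class GazetteerEntry:
    entry_id: str                 # hypothetical identifier
    coords: Tuple[float, float]   # (lat, lon) of a representative point
    variants: List[NameVariant]


def lookup(gazetteer: List[GazetteerEntry], name: str,
           year: Optional[int] = None) -> List[GazetteerEntry]:
    """Return entries with a variant matching name (case-insensitive).

    If year is given, the variant must have been in use in that year.
    """
    key = name.casefold()
    hits = []
    for entry in gazetteer:
        for v in entry.variants:
            if v.name.casefold() != key:
                continue
            if year is not None:
                if v.start_year is not None and year < v.start_year:
                    continue
                if v.end_year is not None and year > v.end_year:
                    continue
            hits.append(entry)
            break  # one matching variant is enough for this entry
    return hits


# Illustrative entry; dates are simplified and approximate.
GAZETTEER = [
    GazetteerEntry(
        entry_id="place-001",
        coords=(41.01, 28.98),
        variants=[
            NameVariant("Byzantium", None, 330),
            NameVariant("Constantinople", 330, 1930),
            NameVariant("Istanbul", 1930, None),
        ],
    ),
]
```

With this layout, a query for "Constantinople" in the year 1000 resolves to the same coordinates as a present-day query for "Istanbul", which is the behavior the slide asks of a gazetteer.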


Geographic and Spatial Indexing
Geographic coordinates have some advantages over names:
– They are persistent regardless of name, political boundary or other changes
– They can be simply connected to spatial browsing interfaces and GIS data
– They provide a consistent framework for GIR applications and spatial queries
However, the geographic extents and boundaries of entities also change over time
– This may be the primary interest of historical scholarship

Geographic and Spatial Indexing
GIPSY: Automatic georeferencing of texts (Geographic Info Processing System)
– The work of Allison Woodruff and Christian Plaunt; later DBMS-based version by Jolly Chen; new version planned
– Designed to operate on the full text of documents
– Extracts geographic terms and attempts to identify the coordinates of the places discussed in the text using a combination of evidence

Geographic and Spatial Indexing
GIPSY cont.
– Used the USGS Geographic Names Information System (GNIS) and Geographic Information Retrieval and Analysis System (GIRAS) to associate names with coordinates of named places, geographic features and land use characteristics.

Geographic and Spatial Indexing
GIPSY cont.
– Identified places are added as “elevations”, with each place adding a weight based on its frequency in the text and database characteristics
– The resulting map is analysed to identify the most likely locations, and coordinates for those locations are extracted
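The "elevation" idea can be loosely illustrated on a coarse grid: each place found in the text raises the cells of its footprint by a weight, and the highest cells suggest the most likely location. This is a rough sketch under stated assumptions, not GIPSY's actual implementation: the grid representation, the use of raw term frequency as the only weight, the rectangular footprints and all names are hypothetical.

```python
from typing import Dict, List, Tuple

# Footprint as an inclusive rectangle of grid cells: (row0, col0, row1, col1).
Footprint = Tuple[int, int, int, int]


def build_elevation_grid(term_counts: Dict[str, int],
                         footprints: Dict[str, List[Footprint]],
                         rows: int, cols: int) -> List[List[float]]:
    """Stack each place's footprint onto the grid, weighted by term frequency."""
    grid = [[0.0] * cols for _ in range(rows)]
    for term, count in term_counts.items():
        for r0, c0, r1, c1 in footprints.get(term, []):
            for r in range(r0, r1 + 1):
                for c in range(c0, c1 + 1):
                    grid[r][c] += count
    return grid


def peak_cells(grid: List[List[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the maximum elevation and the cells that reach it."""
    top = max(max(row) for row in grid)
    if top == 0:
        return 0.0, []  # no place evidence at all
    cells = [(r, c) for r, row in enumerate(grid)
             for c, v in enumerate(row) if v == top]
    return top, cells


# Hypothetical counts and footprints on a 3x3 grid.
counts = {"santa barbara county": 2, "san luis obispo": 1}
shapes = {"santa barbara county": [(0, 0, 1, 1)],
          "san luis obispo": [(1, 1, 2, 2)]}
grid = build_elevation_grid(counts, shapes, rows=3, cols=3)
print(peak_cells(grid))
```

In this toy case the peak falls where the two footprints overlap, which mirrors the intuition that places mentioned together reinforce one another.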

Geographic and Spatial Indexing
GIPSY Map Overlay
[Figure: map overlay generated from the following text.]
“The proposed project is the construction of a new State Water Project facility, the coastal branch... by water purveyors of northern Santa Barbara County... delivering water to San Luis Obispo...”

Geographic and Spatial Indexing
To be useful for the range of cultural and humanities materials being collected in digital libraries, the GIPSY gazetteer must
– Support many different time ranges, location and boundary changes
– Support synonymous and variant names with differing locations for the same entity
– Support names in multiple languages, scripts and usages

ECAI
The Electronic Cultural Atlas Initiative is a collaboration between IT professionals and humanities scholars
ECAI is developing a globally distributed spatio-temporal library of cultural and historical resources with a centralized metadata catalogue and a GIS viewer
Currently the ECAI consortium includes over 250 projects

ECAI
Projects range from small works by individual scholars to large nationally and internationally funded efforts. E.g.:
– geography of Greco-Roman culture (Perseus project)
– toponym locations for over 300,000 images of Buddhist art and architecture
– seals of the Sasanian Empire
– historical trade routes of Eurasia
– the map of Hideyoshi’s invasion of Korea
– historical GIS projects for China, Great Britain, the United States, the Black Sea and Tibet

Perseus

The Sasanian Empire

Opening shot of the Sasanian Empire ECAI project, showing a map with diverse resources, a timeline, and a menu of available map layers.

Users may zoom in to see resources that are only visible at a higher level of detail.

Spatial objects on the map are linked to a table of attributes, which may include any information about the objects. Note that this is a scholarly tool. By creating a “name quality” field, the author has noted that there is disagreement about the locations and names of places in the Sasanian Empire.

Sites on the map may be linked to resources elsewhere on the internet. In this case, important archaeological sites on the map are linked to web-based tours.

The map interface may be used to show change over time. The “Sasanian Empire ca. 270s” resource is highlighted, and the “Sasanian Empire ca. 570s” is greyed out. If a user slides the timeline bar, the new boundary of the empire will appear.

In a different time range, not only do the boundaries of the empire appear different, but the sites that were active during the earlier era (the red dots) have moved as well.

TimeMap is a user authoring tool, not merely a viewer. Users can control the look of the icons, the map layers that comprise a project, and, as shown here, the map scale at which different layers will become visible.

This screen displays the metadata for a part of the Sasanian Empire project. The metadata includes functional (tm.) metadata to enable connection to the map interface in addition to cataloguing (dc. and ecai.) metadata. Using the menu on the left, users may choose to map individual map layers or packaged projects.

Historic Sydney

The Mongol Empire