Alexandria Digital Library Project University of California, Santa Barbara.


Textual-Geospatial Integration Project
NSF National Science Digital Library Project
Aerial photos | Maps | Data

Project Goals
Extend NSDL infrastructure by enabling
o geographic queries
   - for text and non-text items across heterogeneous digital libraries
o geographic referencing
   - of arbitrary texts without explicit geographic cataloging

Participants
University of California, Santa Barbara
o James Frew, PI
o Terence Smith
o Michael Bueno
o Linda Hill
Information Retrieval Lab, Illinois Institute of Technology
o Ophir Frieder
o David Grossman
o Eric Jensen
The American Geological Institute (AGI) has permitted us to use a set of their GeoRef records for system training.

Geospatially-
o What’s here?
   - Find library objects associated with a given location:
      – Place name(s)
      – “Footprint” (geographic extent)
o Where’s this?
   - Find the location(s) associated with a given library object

Augmented Search Examples
Queries from TREC-9
o Find documents that contain residential real estate listings within New Jersey.
o Find reports on automobile traffic in the Washington, DC metropolitan area.
o What forms of entertainment are available in Newport Beach, California?

The stages
Oral histories
   -> geo-parsing
   -> georeferenced facts: placenames; IN, ENVIRONS, PIECE OF; feature types
   -> lookup in gazetteer
   -> gazetteer entries: names, footprints
   -> spatial analysis
   -> identify best footprint

The evaluation
Test Settings
o geoparser: IIT, version 2
o geoparser: give partial matches a value of 0.25
o geofact rule: include geological terms
o gazetteer: ADL Gazetteer, protocol interface
o gaz lookup settings: operator = “equals”
o clustering settings: basic clustering
Manual Analysis
o word count in document
o identify unique geofacts in document
o identify geoparsing output as valid, partial, or invalid
o identify valid matches in ADL gazetteer
Metrics
o geoparser: recall and precision
o gazlookup: recall and precision
o clustering bounding box: recall and precision
o clustering bounding box: spatial similarity to reference bounding box

Example Text
title: Stress-induced borehole elongation; a comparison between the four-arm dipmeter and the borehole televiewer in the Auburn geothermal well
keys: applications | Auburn | borehole breakouts | boreholes | caliper logging | Cayuga County New York | deformation | dipmeter logging | elongation | field studies | fractures | geophysical surveys | instruments | New York | patterns | preferred orientation | rock mechanics | spallations | stress | structural analysis | surveys | televiewers | United States | well-logging
abstract: The nature and origin of borehole elongation recorded by the four-arm dipmeter calipers is studied utilizing information obtained from hydraulic fracturing stress measurements and borehole televiewer data taken in a well located in Auburn, New York. A preferred orientation N10 degrees W-S10 degrees E, + or -10 degrees and a less prominent E-W orientation of borehole elongation, was observed on two runs of the dipmeter. Comparisons of borehole geometry determined using the televiewer and the dipmeter show that both tools give the same orientation of borehole elongation provided that the zone of elongation is longer than 30 cm. Comparisons of dipmeter caliper data with orientation of in situ stress and natural fractures, obtained from hydrofracturing tests and televiewer data show that the N10 degrees W-S10 degrees E borehole elongations (1) are axisymmetric, (2) are aligned with the minimum horizontal stress S (sub h) and (3) are not associated with natural fractures intersecting the well. These elongations are interpreted as stress-induced well bore breakouts. The E-W elongation direction is characterized by an asymmetric borehole cross section in thinly bedded rocks and is not caused by breakouts. This asymmetric geometry can be discriminated from breakouts using the oriented electric measurements provided by the dipmeter. This study demonstrates that the dipmeter can be used to determine the orientation of S (sub h) confirming the results of earlier less detailed studies, and provides a firm basis for mapping regional stress patterns using existing dipmeter data.--Modified journal abstract
GeoRef bibliographic record from the TGI test set of 7523 records

Manual Analysis
Geofacts
o Auburn IN New York
o Cayuga County IN New York
o IN Auburn
o Auburn
o New York
o United States
Gaz entries
o adlgaz (Auburn, New York)
o adlgaz d (Cayuga County, New York)
o adlgaz (New York)
o adlgaz (United States)

Geoparsing
Geoparsing performance
o parser recall = 4.25/6 = 0.71
o parser precision = 4.25/8 = 0.53
Geoparsing output
   (Auburn,,,, 1, K)
   (Auburn,,, (in, ), 1, T)
   (New York,,,, 1, K)
   (United States,,,, 1, K)
   (Cayuga County,,,, 1, K)
   (Auburn New,,, (in,), 1, B)
   (County New,,, (in, York), 1, K)
   (York,,,, 1, B)
fact ::= (name?, type?, footprint?, related-fact?, certainty, importance)
Geoparsing scoring
o valid fact = 1
o partially valid fact = 0.25
o invalid fact = 0
Color key on the slide: blue = valid fact, green = partially valid fact, red = invalid fact
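The weighted scoring above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the project's evaluation code: the function name `geoparse_scores` is hypothetical, and because the slide's color coding did not survive, the split of the 8 output facts into 4 valid, 1 partial, and 3 invalid is an assumption chosen only because it sums to the reported 4.25.

```python
# Scoring weights from the slide: valid = 1, partially valid = 0.25, invalid = 0.
WEIGHTS = {"valid": 1.0, "partial": 0.25, "invalid": 0.0}

def geoparse_scores(judgments, n_reference_facts):
    """Return (recall, precision) for one document.

    judgments: one manual label per fact emitted by the geoparser.
    n_reference_facts: number of geofacts found by manual analysis.
    """
    credit = sum(WEIGHTS[label] for label in judgments)
    recall = credit / n_reference_facts
    precision = credit / len(judgments)
    return recall, precision

# ASSUMPTION: hypothetical breakdown of the 8 output facts, totalling 4.25 credit.
judgments = ["valid"] * 4 + ["partial"] * 1 + ["invalid"] * 3
recall, precision = geoparse_scores(judgments, n_reference_facts=6)
print(round(recall, 2), round(precision, 2))
```

With 6 manually identified geofacts and 8 parser outputs, this reproduces the slide's 4.25/6 and 4.25/8.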

Gazlookup
o operator = “equals” (exact match)
   - “auburn” .. 37 entries
   - “new york” .. 18 entries
   - “united states” .. 1 entry
   - “cayuga county” .. 1 entry
   - “auburn new” .. 0
   - “county new” .. 0
   - “york” .. 50 entries
   - TOTAL = 105
Gazlookup performance
o lookup recall = 3/4 = 0.75
o lookup precision = 3/105 = 0.03
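A minimal sketch of an exact-match ("equals") lookup and its recall and precision follows. It does not use the ADL Gazetteer protocol; the in-memory index, the entry ids (`g1` and so on), and the function names are all hypothetical stand-ins meant only to show why exact matching on ambiguous names returns many entries and drives precision down.

```python
from collections import defaultdict

def build_index(entries):
    """Map lowercased names to lists of gazetteer entry ids."""
    index = defaultdict(list)
    for entry_id, name in entries:
        index[name.lower()].append(entry_id)
    return index

def lookup_equals(index, name):
    """Exact-match lookup: the whole name must equal an entry name."""
    return index.get(name.lower(), [])

def lookup_scores(index, names, reference_ids):
    """Recall and precision of all returned entries against the manual matches."""
    returned = [eid for name in names for eid in lookup_equals(index, name)]
    hits = len(set(returned) & set(reference_ids))
    recall = hits / len(reference_ids)
    precision = hits / len(returned) if returned else 0.0
    return recall, precision

# ASSUMPTION: a toy gazetteer with made-up ids; two places share the name "Auburn".
toy_entries = [("g1", "Auburn"), ("g2", "Auburn"), ("g3", "New York"),
               ("g4", "United States"), ("g5", "York")]
index = build_index(toy_entries)
parsed_names = ["auburn", "new york", "united states", "auburn new", "york"]
reference_ids = ["g1", "g3", "g4", "g9"]  # "g9" stands for a match the lookup misses
recall, precision = lookup_scores(index, parsed_names, reference_ids)
print(recall, precision)
```

In the toy data, three of four reference entries are found, but two of the five returned entries are unrelated places with matching names, the same effect the slide shows at a larger scale.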

Scatter of points
[Map: scatter of 105 points from “equals” Gazlookup]
[Map: clustered points (67) in the US and Canada]
Baseline clustering

Derived footprint
Footprint for “equals” lookup data and simple clustering, compared to GeoRef footprint
[Map labels: GeoRef footprint; derived footprint from points]
Very low spatial similarity between TGI box and reference box from GeoRef
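The comparison of a derived box with a reference box can be sketched as below. The slides do not say which similarity measure TGI used, so intersection-over-union of the two boxes is an assumption, as are the function names and all coordinates in the example. Boxes are (west, south, east, north) in degrees, and boxes crossing the antimeridian are not handled.

```python
def bounding_box(points):
    """Smallest (west, south, east, north) box containing (lon, lat) points."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))

def box_area(box):
    """Area in square degrees; zero if the box is empty or inverted."""
    west, south, east, north = box
    return max(0.0, east - west) * max(0.0, north - south)

def box_similarity(a, b):
    """ASSUMED measure: intersection-over-union (0 = disjoint, 1 = identical)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    overlap_area = box_area(overlap)
    union_area = box_area(a) + box_area(b) - overlap_area
    return overlap_area / union_area if union_area > 0 else 0.0

# Hypothetical clustered points spread across a continent, and a small reference box.
clustered_points = [(-76.6, 42.9), (-85.5, 32.6), (-122.2, 47.3)]
derived_box = bounding_box(clustered_points)
reference_box = (-76.7, 42.8, -76.4, 43.0)
similarity = box_similarity(derived_box, reference_box)
print(derived_box, similarity)
```

A box drawn around widely scattered same-name matches dwarfs a local reference box, so the ratio falls close to zero even when the two boxes overlap.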

Statistics redux
Based on comparison of automated processes to manual analysis and GeoRef box for one sample record:
o Geoparsing
   - Recall …
   - Precision …
o Gazlookup
   - Recall …
   - Precision …
o TGI bounding box
   - Recall …
   - Precision …
   - Similarity to reference … 0

Next steps
Geoparser | Gazlookup | Clustering
o Set new conditions
o Find settings that give good results for 10 test records
o Run 7,524 GeoRef test records through TGI
o Calculate similarity of TGI boxes to GeoRef boxes
o Choose 10 new test records for manual analysis from best & worst results
o Reset conditions
o Repeat