1 CS 430 / INFO 430 Information Retrieval Lecture 22 Non-Textual Materials 1.

2 Course Administration Thursday, November 11: No office hours. Tuesday, November 16: No class or office hours. Wednesday, November 17: Discussion class requires you to read three short papers. Wednesday, December 1: Discussion class requires you to search for and read materials on a specified topic.

3 Course Administration Discussion classes Attend! Speak!

4 The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System." 19th ACM Symposium on Operating Systems Principles, October ghemawat.pdf "Component failures are the norm rather than the exception.... The quantity and quality of the components virtually guarantee that some are not functional at any given time and some will not recover from their current failures. We have seen problems caused by application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking, and power supplies...."

5 Examples of Non-textual Materials
Content                  Attribute
maps                     lat. and long., content
photograph               subject, date and place
bird songs and images    field mark, bird song
software                 task, algorithm
data set                 survey characteristics
video                    subject, date, etc.

6 Possible Approaches to Information Discovery for Non-text Materials Human indexing: manually created metadata records. Automated information retrieval: automatically created metadata records (e.g., image recognition); context: associated text, links, etc. (e.g., Google image search); multimodal: combine information from several sources. User expertise: browsing (user interface design).

7 Example 1: Blobworld

10 Surrogates Surrogates for searching: catalog records, finding aids, classification schemes. Surrogates for browsing: summaries (thumbnails, titles, skims, etc.)

11 Catalog Records for Non-Textual Materials General metadata standards, such as Dublin Core and MARC, can be used to create a textual catalog record of non-textual items. Subject-based metadata standards apply to specific categories of materials, e.g., FGDC for geospatial materials. Text-based searching methods can be used to search these catalog records.
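
A minimal sketch of the idea on this slide: a non-textual item gets a small textual record, and ordinary text matching is run over that record. The field names follow Dublin Core element names; the record values and the `matches` helper are hypothetical and only for illustration.

```python
# Hypothetical catalog records keyed by Dublin Core element names.
records = [
    {
        "title": "Threshing crew, North Dakota",
        "type": "StillImage",
        "subject": "agriculture; threshing",
        "coverage": "North Dakota",
    },
    {
        "title": "Topographic map of Santa Barbara",
        "type": "Image",
        "subject": "topography; maps",
        "coverage": "Santa Barbara, California",
    },
]


def matches(record, query):
    """Return True if every query word appears somewhere in the record's text."""
    text = " ".join(record.values()).lower()
    return all(word in text for word in query.lower().split())


hits = [r["title"] for r in records if matches(r, "santa barbara maps")]
print(hits)  # ['Topographic map of Santa Barbara']
```

The search never looks at the image itself; it only sees the words a cataloguer chose, which is both the strength and the limit of this approach.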

12 Automated Creation of Metadata Records Sometimes it is possible to generate metadata automatically from the content of a digital object. The effectiveness varies from field to field. Examples Images -- characteristics of color, texture, shape, etc. (crude) Music -- optical recognition of score (good) Bird song -- spectral analysis of sounds (good) Fingerprints (good)

13 Collections: Finding Aids and the EAD Finding aid A list, inventory, index or other textual document created by an archive, library or museum to describe holdings. May provide fuller information than is normally contained within a catalog record or be less specific. Does not necessarily have a detailed record for every item. The Encoded Archival Description (EAD) A format (XML DTD) used to encode electronic versions of finding aids. Heavily structured -- much of the information is derived from hierarchical relationships.
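
To illustrate how information in an EAD finding aid is derived from hierarchical relationships, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element names (`archdesc`, `dsc`, `c01`, `c02`, `did`, `unittitle`) are EAD elements; the sample content and the `describe` helper are made up for this example and the fragment is not a complete, valid EAD document.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical finding aid using EAD element names.
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters, box 1</unittitle></did>
        </c02>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def describe(element, context=()):
    """Yield (level, path) for each component, inheriting ancestor titles."""
    title = element.findtext("did/unittitle")
    if title is not None:
        context = context + (title,)
        yield element.get("level"), " / ".join(context)
    for child in element:
        if child.tag != "did":
            yield from describe(child, context)


root = ET.fromstring(EAD_SAMPLE)
for level, path in describe(root):
    print(level, "|", path)
```

The file-level entry carries only the title "Letters, box 1"; the collection and series it belongs to are recovered from its position in the tree.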

14 Collection-Level Metadata Collection-level metadata is used to describe a group of items. For example, one record might describe all the images in a photographic collection. Note: There are proposals to add collection-level metadata records to Dublin Core. However, a collection is not a document-like object.

15 Collection-Level Metadata

16 Example 2: Photographs Photographs in the Library of Congress's American Memory collections. In American Memory, each photograph is described by a MARC record. The photographs are grouped into collections, e.g., The Northern Great Plains, : Photographs from the Fred Hultstrand and F.A. Pazandak Photograph Collections. Information discovery is by: searching the catalog records; browsing the collections.

20 Photographs: Cataloguing Difficulties Automatic: image recognition methods are very primitive. Manual: photographic collections can be very large; many photographs may show the same subject; photographs have little or no internal metadata (no title page); the subject of a photograph may not be known (Who are the people in a picture? Where is the location?)

21 Photographs: Difficulties for Users Searching: often difficult to narrow the selection down by searching -- browsing is required; criteria may be different from those in catalog (e.g., graphical characteristics). Browsing: Offline: handling many photographs is tedious, and photographs can be damaged by repeated handling. Online: viewing many images can be tedious, and screen quality may be inadequate.

22 Example 3: Geospatial Information Example: Alexandria Digital Library at the University of California, Santa Barbara Funded by the NSF Digital Libraries Initiative since Collections include any data referenced by a geographical footprint. terrestrial maps, aerial and satellite photographs, astronomical maps, databases, related textual information Program of research with practical implementation at the university's map library

23 Alexandria User Interface

24 Alexandria: Computer Systems and User Interfaces Computer systems Digitized maps and geospatial information -- large files Wavelets provide multi-level decomposition of image -> first level is a small coarse image -> extra levels provide greater detail User interfaces Small size of computer displays Slow performance of Internet in delivering large files -> retain state throughout a session
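
The multi-level decomposition can be sketched with the Haar wavelet, the simplest wavelet there is. The slide does not say which wavelet Alexandria used, so Haar is an assumption made here for illustration, and a single row of made-up pixel values stands in for an image.

```python
def haar_step(values):
    """Split an even-length sequence into pairwise averages and differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def decompose(values, levels):
    """Return the coarsest approximation plus detail layers, coarsest first."""
    layers = []
    for _ in range(levels):
        values, details = haar_step(values)
        layers.append(details)
    return values, layers[::-1]


def refine(coarse, details):
    """Add one detail layer, doubling the resolution."""
    out = []
    for avg, diff in zip(coarse, details):
        out.extend([avg + diff, avg - diff])
    return out


row = [8, 6, 2, 4, 5, 5, 9, 1]      # one row of pixel intensities (made up)
coarse, layers = decompose(row, levels=2)
print(coarse)                        # [5.0, 5.0]: the small coarse version, sent first

image = coarse
for layer in layers:                 # each extra layer restores more detail
    image = refine(image, layer)
print(image == row)                  # True: all layers together rebuild the row
```

A client on a slow connection can display `coarse` immediately and request detail layers only for the region the user zooms into.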

25 Alexandria: Information Discovery Metadata for information discovery Coverage: geographical area covered, such as the city of Santa Barbara or the Pacific Ocean. Scope: varieties of information, such as topographical features, political boundaries, or population density. Latitude and longitude provide basic metadata for maps and for geographical features.

26 Gazetteer Gazetteer: database and a set of procedures that translate representations of geospatial references: place names, geographic features, coordinates, postal codes, census tracts. Search engine tailored to peculiarities of searching for place names. Research is making steady progress at feature extraction, using automatic programs to identify objects in aerial photographs or printed maps -- topic for long-term research.

27 Gazetteers The Alexandria Digital Library (ADL): geolibrary at University of California at Santa Barbara where a primary attribute of objects is location on Earth (e.g., map, satellite photograph). Geographic footprint: latitude and longitude values that represent a point, a bounding box, a linear feature, or a complete polygonal boundary. Gazetteer: list of geographic names, with geographic locations and other descriptive information. Geographic name: proper name for a geographic place or feature (e.g., Santa Barbara County, Mount Washington, St. Francis Hospital, and Southern California)

28 Use of a Gazetteer Answers the "Where is" question; for example, "Where is Santa Barbara?" Translates between geographic names and locations. A user can find objects by matching the footprint of a geographic name to the footprints of the collection objects. Locates particular types of geographic features in a designated area. For example, a user can draw a box around an area on a map and find the schools, hospitals, lakes, or volcanoes in the area.
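
The three uses listed on this slide can be sketched with bounding-box footprints. Everything named here (the `Box` class, the gazetteer and collection entries, and their rough coordinates) is hypothetical and for illustration only; it is not ADL data or the ADL interface, and the overlap test ignores boxes that cross the 180° meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding-box footprint in decimal degrees (west/east are longitudes)."""
    south: float
    west: float
    north: float
    east: float

    def overlaps(self, other):
        return (self.south <= other.north and other.south <= self.north
                and self.west <= other.east and other.west <= self.east)


# Hypothetical gazetteer: name -> (feature type, footprint). Rough values only.
gazetteer = {
    "Santa Barbara": ("populated place", Box(34.3, -119.9, 34.5, -119.6)),
    "Lake Cachuma": ("lake", Box(34.56, -120.0, 34.60, -119.9)),
}

# Hypothetical collection objects, each with its own footprint.
collection = {
    "aerial photo 17": Box(34.35, -119.75, 34.45, -119.65),
    "map of Nevada": Box(35.0, -120.0, 42.0, -114.0),
}


def where_is(name):
    """Answer the "Where is" question by translating a name to a footprint."""
    return gazetteer[name][1]


def objects_for(name):
    """Find collection objects whose footprint overlaps the named place."""
    target = where_is(name)
    return [obj for obj, box in collection.items() if box.overlaps(target)]


def features_in(area, feature_type):
    """Find gazetteer features of one type inside a user-drawn box."""
    return [name for name, (ftype, box) in gazetteer.items()
            if ftype == feature_type and area.overlaps(box)]


print(objects_for("Santa Barbara"))
print(features_in(Box(34.0, -120.5, 35.0, -119.0), "lake"))
```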

29 Alexandria Gazetteer: Example from a search on "Tulsa"
Feature name                              State  County  Type     Latitude  Longitude
Tulsa                                     OK     Tulsa   pop pl   360914N   W
Tulsa Country Club                        OK     Osage   locale   360958N   W
Tulsa County                              OK     Tulsa   civil    360600N   W
Tulsa Helicopters Incorporated Heliport   OK     Tulsa   airport  360500N   W
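
The latitude values in this example, such as 360914N, look like packed degrees, minutes, and seconds followed by a hemisphere letter. Assuming that reading is right, a sketch of the conversion to decimal degrees follows. The function name `dms_to_decimal` and the longitude string used in the last line are invented for illustration; the longitude is not one of the values from the slide.

```python
def dms_to_decimal(packed):
    """Convert packed DDMMSS[N|S] or DDDMMSS[E|W] to signed decimal degrees."""
    digits, hemisphere = packed[:-1], packed[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unrecognized hemisphere in {packed!r}")
    if not digits.isdigit() or len(digits) not in (6, 7):
        raise ValueError(f"unrecognized coordinate: {packed!r}")
    seconds = int(digits[-2:])
    minutes = int(digits[-4:-2])
    degrees = int(digits[:-4])
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


print(round(dms_to_decimal("360914N"), 4))   # 36.1539
print(round(dms_to_decimal("360600N"), 4))   # 36.1
print(round(dms_to_decimal("1194200W"), 4))  # -119.7 (made-up longitude)
```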

30 Challenges for the Alexandria Gazetteer Content standard: A standard conceptual schema for gazetteer information. Feature types: A type scheme to categorize individual features that is rich in term variants and extensible. Temporal aspects: Geographic names and attributes change through time. "Fuzzy" footprints: Extent of a geographic feature is often approximate or ill-defined (e.g., Southern California).

31 Challenges for the Alexandria Gazetteer (continued) Quality aspects: (a) Indicate the accuracy of latitude and longitude data. (b) Ensure that the reported coordinates agree with the other elements of the description. Spatial extents: (a) Points do not represent the extent of the geographic locations and are therefore only minimally useful. (b) Bounding boxes often include too much territory (e.g., the bounding box for California also includes Nevada).
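
The bounding-box problem can be shown with a toy shape: a point may fall inside a region's bounding box while lying outside the region itself. The triangle below is an arbitrary stand-in for a state boundary, and the ray-casting test is one common point-in-polygon method chosen for this sketch, not something the slides specify.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_box(point, box):
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Ray casting: count edges crossed by a ray running right from the point."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


region = [(0, 0), (4, 0), (0, 4)]    # toy triangular "state"
box = bounding_box(region)
neighbor_point = (3, 3)              # in the box, outside the triangle
interior_point = (1, 1)

print(in_box(neighbor_point, box), in_polygon(neighbor_point, region))  # True False
print(in_box(interior_point, box), in_polygon(interior_point, region))  # True True
```

A gazetteer that stores only the box would report the neighbor point as a match, which is the California and Nevada situation in miniature.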

32 Alexandria Gazetteer Alexandria Digital Library Linda L. Hill, James Frew, and Qi Zheng, Geographic Names: The Implementation of a Gazetteer in a Georeferenced Digital Library. D-Lib Magazine, 5: 1, January

33 Alexandria Thesaurus: Example canals A feature type category for places such as the Erie Canal. Used for: The category canals is used instead of any of the following: canal bends, canalized streams, ditch mouths, ditches, drainage canals, drainage ditches... more... Broader Terms: Canals is a sub-type of hydrographic structures.

34 Alexandria Thesaurus: Example (continued) canals (continued) Related Terms: The following is a list of other categories related to canals (non-hierarchical relationships): channels, locks, transportation features, tunnels. Scope Note: Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power. » Definition of canals.
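
The canals entry can be used to sketch how a thesaurus normalizes a user's term and broadens a search. The terms are taken from the two slides above (the "used for" list is truncated there, so only the terms shown are included); the dictionary layout and the `normalize` and `expand` helpers are assumptions made for this example.

```python
# Entry taken from the slides; the dictionary layout is an assumption.
thesaurus = {
    "canals": {
        "used_for": ["canal bends", "canalized streams", "ditch mouths",
                     "ditches", "drainage canals", "drainage ditches"],
        "broader": ["hydrographic structures"],
        "related": ["channels", "locks", "transportation features", "tunnels"],
    },
}

# Invert the "used for" lists so any variant leads to its preferred term.
preferred = {}
for term, entry in thesaurus.items():
    preferred[term] = term
    for variant in entry["used_for"]:
        preferred[variant] = term


def normalize(term):
    """Map a user's term to the preferred category, or None if unknown."""
    return preferred.get(term.strip().lower())


def expand(term):
    """Return the preferred term plus its broader and related categories."""
    key = normalize(term)
    if key is None:
        return []
    entry = thesaurus[key]
    return [key] + entry["broader"] + entry["related"]


print(normalize("Drainage Ditches"))  # canals
print(expand("ditches"))
```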