Metadata Crosswalking/Transforming and Federated Searching in Ex Libris Products
Anthony Dellureficio
Library Systems Manager
The New School University
Goals of this Presentation
- Define the role of crosswalks in federated searching
- Identify the problems created by crosswalks
- Explore possible solutions to these problems
- Draw conclusions about how to structure metadata and improve federated searches
What is a “Crosswalk”?
- Federated searches can incorporate many different types of metadata
- Crosswalks map descriptive metadata into a uniform schema
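A minimal sketch of a crosswalk as a lookup table, assuming a small illustrative set of MARC-to-Dublin-Core mappings. The mapping table, the `(tag, value)` record layout, and the `crosswalk` function are hypothetical and chosen for the example; they are not an official crosswalk or an Ex Libris API.

```python
# Hypothetical, simplified crosswalk table: source MARC tag -> target Dublin Core element.
# The pairs below are illustrative assumptions, not an official mapping.
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "700": "contributor",
    "650": "subject",
    "260": "publisher",
}


def crosswalk(marc_record):
    """Map a list of (tag, value) pairs into a dict of DC element -> list of values.

    Fields with no entry in the table are returned separately so the loss is visible.
    """
    dc_record = {}
    unmapped = []
    for tag, value in marc_record:
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append((tag, value))
            continue
        dc_record.setdefault(element, []).append(value)
    return dc_record, unmapped


record = [
    ("245", "Metadata fundamentals"),
    ("100", "Doe, Jane"),
    ("650", "Metadata"),
    ("650", "Cataloging"),
    ("300", "xii, 200 p."),  # physical description: no target in this table
]
dc, dropped = crosswalk(record)
print(dc)
print(dropped)
```

Returning the unmapped fields alongside the result is a design choice for the sketch: it makes it easy to see which source data has no home in the uniform schema.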
The Issue with Federated Searching
- Increased recall by adding collections
- Decreased precision due to mismatched data fields
- Crosswalks offer a good solution, but they are not ideal and they need tweaking
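The recall and precision trade-off can be made concrete with the standard definitions (precision = relevant retrieved / total retrieved, recall = relevant retrieved / total relevant). The counts below are invented purely for illustration and do not come from any measured search.

```python
def precision(relevant_retrieved, retrieved):
    """Share of retrieved items that are relevant."""
    return relevant_retrieved / retrieved


def recall(relevant_retrieved, relevant_total):
    """Share of all relevant items that were retrieved."""
    return relevant_retrieved / relevant_total


# Invented numbers: 100 relevant items exist across all collections.
RELEVANT_TOTAL = 100

# Searching a single collection: few results, mostly on target.
single = {"retrieved": 50, "relevant_retrieved": 40}

# Federated search over many collections with mismatched fields:
# far more results, more relevant hits, but much more noise.
federated = {"retrieved": 400, "relevant_retrieved": 90}

for name, run in (("single", single), ("federated", federated)):
    p = precision(run["relevant_retrieved"], run["retrieved"])
    r = recall(run["relevant_retrieved"], RELEVANT_TOTAL)
    print(f"{name}: precision={p:.3f} recall={r:.3f}")
```

With these assumed counts, recall rises when collections are added while precision drops, which is the pattern the slide describes.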
Related Products
- Ex Libris products which gather data from multiple sources (Primo, DigiTool, SFX, MetaLib, URM)
- Products which supply data (Aleph)
- Integrated products which gather Ex Libris data (Xerxes, Umlaut)
- Any other products that contribute or gather data
What metadata is crosswalked?
- Descriptive metadata
- Schemata: MARC, DC, EAD, etc.
- Sources: vendor data, locally created descriptive data, harvested data (OAI), public databases
Crosswalks in Ex Libris
- Table which defines all fields
- Table of indexed fields
- Table of search fields
- Table of display fields
- Map between field codes and standard fields
Why are crosswalks inadequate?
- Not all metadata has an equivalent in another schema
- Differing levels of specificity
- Lumping metadata
- Many standards
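The lumping problem can be sketched as follows, assuming a hypothetical crosswalk that sends three distinct MARC subject fields (600 personal name, 650 topical term, 651 geographic name) to a single `subject` element. The table and function names are illustrative only.

```python
# Hypothetical crosswalk that lumps three specific source fields into one target element.
LUMPING_CROSSWALK = {
    "600": "subject",  # personal name as subject
    "650": "subject",  # topical term
    "651": "subject",  # geographic name
}


def to_target(fields):
    """Apply the crosswalk to (tag, value) pairs; the source tag is discarded."""
    target = {}
    for tag, value in fields:
        target.setdefault(LUMPING_CROSSWALK[tag], []).append(value)
    return target


def candidate_source_tags(element):
    """List every source tag that could have produced a given target element."""
    return sorted(tag for tag, el in LUMPING_CROSSWALK.items() if el == element)


source = [
    ("600", "Lincoln, Abraham"),
    ("650", "Presidents"),
    ("651", "Illinois"),
]
lumped = to_target(source)
print(lumped)
print(candidate_source_tags("subject"))
```

After the transform, nothing in the target record says which value was a person, a topic, or a place, so a search limited to geographic subjects is no longer possible and the mapping cannot be reversed without guessing.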
Options
Alter different aspects of the structure:
- Data structure
- Search structure
- Metadata structure
- Interpretation of data
Data Structure
- Ex. Original cataloging
- Pros: total institutional control of data
- Cons: conform to standards?, time-consuming
Search Structure
- Ex. Adding database discovery pages
- Pros: options for more sophisticated researchers, able to search more specific data
- Cons: messy website, confusing to have multiple search pages
Metadata Structure
- Ex. Parse and lump crosswalk fields
- Pros: adds more access points, customized to specific collection
- Cons: conforms to standards?, hierarchy problems, slow searches
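One way to picture "parse and lump" is sketched below. It assumes subject headings that use `--` between subdivisions and a record stored as a dict of field name to list of values; both conventions, and the `parse_subject` and `lump` helpers, are hypothetical and not taken from any Ex Libris configuration.

```python
def parse_subject(heading, separator="--"):
    """Parse one compound heading into separate access points."""
    return [part.strip() for part in heading.split(separator) if part.strip()]


def lump(record, source_fields, target_field):
    """Return a copy of the record with values from several fields gathered into one.

    The source fields are left in place, so the more specific data is not lost.
    """
    combined = []
    for name in source_fields:
        combined.extend(record.get(name, []))
    result = dict(record)
    result[target_field] = combined
    return result


# Parsing: one heading becomes three searchable terms.
terms = parse_subject("United States -- History -- Civil War, 1861-1865")
print(terms)

# Lumping: creator and contributor are gathered into a single "name" search field.
record = {"creator": ["Doe, Jane"], "contributor": ["Roe, Richard"]}
searchable = lump(record, ["creator", "contributor"], "name")
print(searchable)
```

Keeping the original fields next to the lumped one trades index size for specificity, which relates to the "slow searches" concern listed among the cons.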
Interpretation of Data
- Better search paradigm
- Pros: more human, addresses actual problem of data interpretation, not a “work-around”
- Cons: requires programming and special knowledge
Remaining Problems
- Metadata MUST be good!
- Access to table files (may need dev box)
- Staff and time to alter metadata
- May be constrained by old system/structure/data
Conclusions
- Part of an overall metadata strategy
- No single solution
- Each institution must know its patrons and how they search
- Increased transforming results in increased data loss