Extensible Library Catalog Name Access Control Module
Matthew Horoszowski, Rob Busack, Anthony Lyo, Ben Greenwood, Dean Rzonca
Faculty advisor: Robert Bubacz
Sponsored by University of Rochester River Campus Library
Overview: Project overview; Features; Future development; Demo; Questions
Project: Two types of records. A bibliographic record represents a book and is linked to multiple authority records. An authority record represents a single author or subject.
Goals: Bibliographic Record: Author field (Name; Date of Birth, Death). Authority Record: Authorized Form; Alternate Forms (Alternate form 1, Alternate form 2, …); See Also (references to other authority records).
Project: Name matching. Names are entered differently, and the same person may use multiple pen names. Finding matching records is easy when an authority record for the author already exists; when none exists, a new authority record is created. Importing different record formats.
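The two record types described above can be pictured as a pair of plain Java classes. This is only a minimal sketch: the class names, field names, and the hasForm helper are hypothetical and are not taken from the module's actual schema.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an authority record: one author or subject.
class AuthorityRecord {
    final String authorizedForm;                              // the preferred heading
    final List<String> alternateForms = new ArrayList<>();    // other spellings, pen names
    final List<AuthorityRecord> seeAlso = new ArrayList<>();  // related authority records

    AuthorityRecord(String authorizedForm) {
        this.authorizedForm = authorizedForm;
    }

    // True if the name is the authorized form or one of the alternate forms.
    boolean hasForm(String name) {
        return authorizedForm.equals(name) || alternateForms.contains(name);
    }
}

// Hypothetical model of a bibliographic record: one book.
class BibliographicRecord {
    final String authorName;   // author field: name as entered
    final String authorDates;  // author field: birth/death dates, may be null
    final List<AuthorityRecord> linkedAuthorities = new ArrayList<>();

    BibliographicRecord(String authorName, String authorDates) {
        this.authorName = authorName;
        this.authorDates = authorDates;
    }
}
```

A pen name would then be stored as an alternate form on the author's single authority record, and every book by that author links to the same record.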
Technologies Used: Java; XML; MySQL; Hibernate; Ant; MARC4J
Supported Record Types: MARC authority records; MARC bibliographic records; Dublin Core records
Features: Persistent data storage; Import records; Match records; A functional API; A prototype GUI
Importing: Identifies the record format. Imports MARC XML and Dublin Core XML. Uses MARC4J to convert raw MARC data to MARC XML. Detects duplicates. Updates records with new information.
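One plausible way to identify the format of an incoming XML file is to look at the namespaces of its elements. The sketch below uses only the JDK's built-in XML parser; the FormatDetector class and its detect method are hypothetical, and namespace sniffing is an assumed technique rather than the importer's documented behavior. The two namespace URIs are the standard ones for MARCXML and the Dublin Core element set.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: guesses the record format from XML namespaces.
class FormatDetector {
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Returns "MARCXML", "DUBLIN_CORE", or "UNKNOWN".
    static String detect(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for namespace lookups
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            if (doc.getElementsByTagNameNS(MARC_NS, "record").getLength() > 0) {
                return "MARCXML";
            }
            if (doc.getElementsByTagNameNS(DC_NS, "*").getLength() > 0) {
                return "DUBLIN_CORE";
            }
            return "UNKNOWN";
        } catch (Exception e) {
            // Unparseable input is treated as an unknown format in this sketch.
            return "UNKNOWN";
        }
    }
}
```

Raw (binary) MARC input is not handled here; in the workflow described above it would first be converted to MARC XML.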
Matching Phases
Matching: Loops through all unmatched records. Tries various strategies and string transformations in order of confidence. If a match is found, a link is created along with the evidence for it. If no match is found, a new authority record is created from the information in the bibliographic record.
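The matching loop can be sketched roughly as follows. Everything here is an illustrative assumption: the MatcherSketch and Link names, the use of plain strings in place of full records, and the two stand-in strategies ("exact" and "case-insensitive") are not the module's real strategies. Strategies are assumed to be tried from most to least confident.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of the matching loop; names are plain strings for brevity.
class MatcherSketch {
    // A link from a bibliographic author name to an authority name, plus evidence.
    static class Link {
        final String bibName;
        final String authorityName;
        final String evidence;

        Link(String bibName, String authorityName, String evidence) {
            this.bibName = bibName;
            this.authorityName = authorityName;
            this.evidence = evidence;
        }
    }

    // Stand-in strategies, listed from most to least confident (insertion order is kept).
    static Map<String, UnaryOperator<String>> strategies() {
        Map<String, UnaryOperator<String>> m = new LinkedHashMap<>();
        m.put("exact", s -> s);
        m.put("case-insensitive", s -> s.toLowerCase(Locale.ROOT));
        return m;
    }

    // authorities must be a mutable list: unmatched names are added to it.
    static List<Link> matchAll(List<String> unmatched, List<String> authorities) {
        List<Link> links = new ArrayList<>();
        for (String bibName : unmatched) {
            Link found = null;
            for (Map.Entry<String, UnaryOperator<String>> e : strategies().entrySet()) {
                String key = e.getValue().apply(bibName);
                for (String auth : authorities) {
                    if (e.getValue().apply(auth).equals(key)) {
                        // Record which strategy produced the match as evidence.
                        found = new Link(bibName, auth, e.getKey());
                        break;
                    }
                }
                if (found != null) {
                    break; // stop at the most confident strategy that matches
                }
            }
            if (found == null) {
                // No match: create a new authority entry from the bibliographic name.
                authorities.add(bibName);
                found = new Link(bibName, bibName, "new authority record");
            }
            links.add(found);
        }
        return links;
    }
}
```

Keeping the strategy name on each link is one simple way to preserve evidence, so that a reviewer can later see why two records were joined.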
Name Transformations: Names are transformed to get better matches. For example:
Homer Simpson → Simpson, Homer
Smith, Elizabeth ($q Ann Elizabeth) → Smith, Ann Elizabeth
De la Mare, Walter → Mare, Walter De la
Vaughan Williams, Ralph → Williams, Ralph Vaughan
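The example transformations on this slide can be reproduced with a few string operations. The sketch below is a guess at how such transformations might be written; the NameTransforms class and its method names are hypothetical. Note that the "De la Mare" and "Vaughan Williams" examples turn out to be the same operation: keep the last word of the surname in front and move the rest behind the forename.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helpers reproducing the slide's example transformations.
class NameTransforms {
    // "Homer Simpson" -> "Simpson, Homer". Names already containing a comma are returned trimmed.
    static String invert(String name) {
        String n = name.trim();
        int i = n.lastIndexOf(' ');
        if (n.contains(",") || i < 0) {
            return n;
        }
        return n.substring(i + 1) + ", " + n.substring(0, i);
    }

    // Matches "Surname, Forename ($q Fuller Form)".
    private static final Pattern FULLER =
        Pattern.compile("^([^,]+),\\s*.*\\(\\$q\\s+([^)]+)\\)\\s*$");

    // "Smith, Elizabeth ($q Ann Elizabeth)" -> "Smith, Ann Elizabeth".
    static String useFullerForm(String name) {
        Matcher m = FULLER.matcher(name);
        return m.matches() ? m.group(1).trim() + ", " + m.group(2).trim() : name;
    }

    // "De la Mare, Walter" -> "Mare, Walter De la";
    // "Vaughan Williams, Ralph" -> "Williams, Ralph Vaughan".
    static String rotateSurname(String name) {
        int comma = name.indexOf(',');
        if (comma < 0) {
            return name;
        }
        String surname = name.substring(0, comma).trim();
        String forename = name.substring(comma + 1).trim();
        int i = surname.lastIndexOf(' ');
        if (i < 0) {
            return name; // single-word surname: nothing to rotate
        }
        return surname.substring(i + 1) + ", " + forename + " " + surname.substring(0, i);
    }
}
```

Each method returns its input unchanged when the pattern does not apply, so the transformations can be tried one after another without special-casing.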
Discriminators: Adjust the confidence in a match based on a discrimination criterion. For example: common names; publication dates.
Graphical User Interface: Schedules jobs. Filters and sorts results. Views records and matches. Supports manual matching of records.
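A discriminator can be thought of as a function that takes a match's current confidence and returns an adjusted value. The sketch below illustrates the two criteria named on the slide. The Discriminators class, the short list of common surnames, and the penalty factors (0.5 and 0.1) are all made-up assumptions for illustration, not values used by the module.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical discriminators; the surname list and penalty factors are invented for illustration.
class Discriminators {
    static final Set<String> COMMON_SURNAMES =
        new HashSet<>(Arrays.asList("smith", "jones", "williams"));

    // Halves the confidence when the surname (text before the first comma) is common,
    // since a name-only match on a common surname is weaker evidence.
    static double commonName(double confidence, String authorizedName) {
        String surname = authorizedName.split(",")[0].trim().toLowerCase(Locale.ROOT);
        return COMMON_SURNAMES.contains(surname) ? confidence * 0.5 : confidence;
    }

    // Sharply lowers the confidence when the work was published before the author was born.
    static double publicationDate(double confidence, int publicationYear, int birthYear) {
        return publicationYear < birthYear ? confidence * 0.1 : confidence;
    }
}
```

Because each discriminator maps a confidence to a confidence, several can be chained, for example `publicationDate(commonName(c, name), pubYear, birthYear)`.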
Metrics: Effort by type of activity; Test metrics (JUnit), 45 tests; Defects by type
Defects by Type
Effort by type of activity
Status: R1 - 2/28/07; R2 - 3/11/07; R3 - 4/3/07; R4 - 5/8/07
Future Possibilities: Support for new metadata formats; A web-based interface; Searching (backend to an OPAC); GUI improvements
Questions and Comments?