Build Better Data: Best Practices for Catalog Cleanup CT Library Association, April 23, 2018 Diane Napert, Interim Director Monographic Processing Services,


1 Build Better Data: Best Practices for Catalog Cleanup CT Library Association, April 23, 2018
Diane Napert, Interim Director Monographic Processing Services, Yale University Library

2 The Numbers
Yale has three holdings symbols in OCLC, mainly due to Interlibrary Loan
10,430,569 active bib records, 674,819 suppressed records
An estimated 3,534,654 authority records
We imported over 300,000 records in 2017, including batch loads for e-resources and print material. That figure does not include the law library, which uses Millennium (Innovative).

3 The Yale Library Tale
From Voyager 8.1 to 10 over the holiday break 2017/2018
Hardware configuration began in August/September
The underlying system also moved to Linux (was Oracle)
Oracle to Workday summer 2017

4 Upgrade Planning

5 Complications after upgrade
64 issues on the problem report

6 Tools
SQL
Oracle SQL Developer – free from website: html
Excel – Power Pivot, VLOOKUP function
Access, MySQL (relational databases)
MarcEdit - Voyager Global Headings Change
Cataloger’s Toolkit works with Voyager (authority clean-up in bib records) (Gary Strawn)
Voyager Global Data Change – more robust starting with Voyager 9
PyMARC (Python report-writer for bibliographic records using MARC 21)
Digital Information Research Specialist – Library GitHub
LINQPad
OpenRefine - Backlog searching
Checking links in bib records
BaseX XML data clean-up
Python, Perl, PHP Scripting
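As an illustration of the kind of SQL report these tools support, here is a minimal sketch that lists bib records with no holdings attached. The `bib` and `holding` tables, their columns, and the sample rows are hypothetical and simplified; the real Voyager schema is different, and SQLite stands in for Oracle only so the example can run anywhere.

```python
import sqlite3

# Hypothetical, simplified schema: real ILS table and column names will differ.
catalog = sqlite3.connect(":memory:")
catalog.executescript("""
    CREATE TABLE bib (bib_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE holding (holding_id INTEGER PRIMARY KEY, bib_id INTEGER);
    INSERT INTO bib VALUES
        (1, 'Held title'),
        (2, 'Temporary ILL record'),
        (3, 'Another held title');
    INSERT INTO holding VALUES (10, 1), (11, 3), (12, 3);
""")


def bibs_without_holdings(connection):
    """Return (bib_id, title) rows for bibs that have no holdings attached."""
    query = """
        SELECT b.bib_id, b.title
        FROM bib AS b
        LEFT JOIN holding AS h ON h.bib_id = b.bib_id
        WHERE h.holding_id IS NULL   -- keep only bibs with no matching holding
        ORDER BY b.bib_id
    """
    return connection.execute(query).fetchall()


orphans = bibs_without_holdings(catalog)
print(orphans)
```

The left join followed by an `IS NULL` filter is one common way to write this; a `NOT EXISTS` subquery would serve equally well.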

7 Reports
Bibs without holdings report is run periodically; it clears out temporary ILL bib records which have no holdings
Sub-divisions in records – hard to do programmatically, so write reports and correct manually
Yale original catalogers also correct these as they encounter them, as time permits
Yale de-dupes bibliographic records during the load process for vendors which send bib records
System also de-dupes
Language codes missing, dates missing, in fixed fields
Discovery Metadata Librarian wrote these in PyMARC and there were thousands to correct manually (before QuickSearch, Discovery Interface)
Ran a report which listed On Order records; there were old ones, records which never got overlaid when a book or item came in. This also points to a training issue.
Empty sub-fields report, Discovery Metadata Librarian report
Reports listing vouchers pending older than a certain date
Old POs remaining unpaid
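The fixed-field and empty sub-field checks above can be sketched in plain Python. This is not the Discovery Metadata Librarian's actual PyMARC report; it is a stand-alone illustration that uses a hypothetical dictionary layout for records instead of a MARC library. The only facts relied on are the MARC 21 bibliographic 008 positions: Date 1 at 07-10 and language code at 35-37. The helper names and sample records are made up.

```python
def make_008(date1, language):
    """Build a 40-character sample 008 field (illustrative values only)."""
    return "180423" + "s" + date1 + "    " + "ctu" + " " * 17 + language + " " + "d"


def check_008(field_008):
    """Return a list of problems found in a bibliographic 008 field."""
    if len(field_008) != 40:
        return ["008 is not 40 characters long"]
    problems = []
    date1 = field_008[7:11]      # positions 07-10: Date 1
    language = field_008[35:38]  # positions 35-37: language code
    # Blanks or fill characters ("|") mean nothing was coded.
    if date1.strip(" |") == "":
        problems.append("missing Date 1")
    if language.strip(" |") == "":
        problems.append("missing language code")
    return problems


def empty_subfields(fields):
    """Return (tag, code) pairs whose subfield value is empty or whitespace."""
    return [
        (tag, code)
        for tag, subfields in fields
        for code, value in subfields
        if not value.strip()
    ]


# Hypothetical record layout: an id, the 008 string, and (tag, subfields) pairs.
records = [
    {"id": "rec1", "008": make_008("2018", "eng"),
     "fields": [("245", [("a", "Title"), ("b", "")])]},
    {"id": "rec2", "008": make_008("    ", "   "),
     "fields": [("245", [("a", "Untitled")])]},
]

report = {}
for record in records:
    issues = check_008(record["008"])
    issues += [f"empty subfield {tag}${code}"
               for tag, code in empty_subfields(record["fields"])]
    report[record["id"]] = issues

print(report)
```

A blank Date 1 is legitimate for some date types, so a report like this produces a review list for catalogers and not a list of confirmed errors.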

8 Considerations
Timing - Some of these are run at the end of the year to get all invoices paid
Might not want to migrate some of this data into a new system
Perhaps take into account the type of orders, firm vs ongoing/subscriptions
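A report that respects these considerations might filter old unpaid orders by order type and by a cutoff date. The sketch below assumes a hypothetical `purchase_order` table with made-up columns, status values, and rows; it runs on SQLite only so that it is self-contained, and it is not Yale's actual report.

```python
import sqlite3

# Hypothetical acquisitions table; real ILS schemas and status values differ.
acquisitions = sqlite3.connect(":memory:")
acquisitions.executescript("""
    CREATE TABLE purchase_order (
        po_id TEXT PRIMARY KEY,
        order_type TEXT,   -- 'firm' or 'ongoing'
        status TEXT,       -- 'pending' or 'paid'
        order_date TEXT    -- ISO 8601 date, so string comparison sorts correctly
    );
    INSERT INTO purchase_order VALUES
        ('PO-1', 'firm',    'pending', '2015-03-01'),
        ('PO-2', 'ongoing', 'pending', '2012-01-15'),
        ('PO-3', 'firm',    'paid',    '2014-06-30'),
        ('PO-4', 'firm',    'pending', '2017-11-20');
""")


def stale_firm_orders(connection, cutoff):
    """Return unpaid firm orders placed before the cutoff date, oldest first."""
    query = """
        SELECT po_id, order_date
        FROM purchase_order
        WHERE order_type = 'firm'
          AND status = 'pending'
          AND order_date < ?
        ORDER BY order_date
    """
    return connection.execute(query, (cutoff,)).fetchall()


stale = stale_firm_orders(acquisitions, "2017-01-01")
print(stale)
```

Ongoing orders and subscriptions are excluded here because an old order date is expected for them; whether that is the right rule is a local policy decision.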

9 Future
E-book records – improving the quality of current e-book records – pre-processing in MarcEdit
Automating more reports
Backlog by language – would like it to run automatically and compare month to month
Statistics – now use a report which requires manual input
Productivity Reports
Linked Data
Casalini SHARE-VDE Project - Sent 10,217,644 bib records, 3, authority records to Casalini for conversion to BIBFRAME ,400,000 Triples! (185 gigabytes of uncompressed data)
Data Cleanup – Data which is not machine actionable, omission of data (empty fields), fields converted to wrong level, local fields don’t convert (690 field, Beinecke)
LD4P – Linked Data for Production Project (Stanford Lead, Mellon Grant)
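The month-to-month backlog comparison could be automated along these lines. This is only a sketch: the function names and the sample language codes are invented for illustration, and a real report would read the codes from the catalog on a schedule.

```python
from collections import Counter


def backlog_by_language(language_codes):
    """Count backlog items per language code."""
    return Counter(language_codes)


def month_over_month(previous, current):
    """Return {language: change} for every language seen in either month."""
    languages = sorted(set(previous) | set(current))
    return {
        language: current.get(language, 0) - previous.get(language, 0)
        for language in languages
    }


# Made-up sample data: one language code per backlog item.
march = backlog_by_language(["ger", "ger", "ita", "rus"])
april = backlog_by_language(["ger", "ita", "ita", "ara"])

changes = month_over_month(march, april)
for language, change in changes.items():
    print(f"{language}: {change:+d}")
```

A positive number means the backlog for that language grew; a negative number means it shrank.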

10 Why?
Productivity
Informational
Staffing
Workflows
Deleting data and enhancing record quality
Future Upgrades
All for the users!!!

11 A little help from my friends
Thanks to:
Éva Bolkovac, Assistant Catalog Management Librarian
Debbie Falvey, Collection Procurement Librarian
Lynette Robinson-Johnson, Acquisitions Assistant
Angela Sidman, Director, E-Resources and Serials Management
Steelsen Smith, Technical Lead

