Published by Berniece Stokes. Modified over 8 years ago.
1
Using OpenRefine in Digital Collections: the Spencer Sheet Music Project
Bruce J. Evans, Cataloging & Metadata Unit Leader/Music and Fine Arts Catalog Librarian, Baylor University
Kara Long, Metadata & Catalog Librarian, Baylor University
2
Frances G. Spencer Collection of American Sheet Music
3
Cataloging & Metadata Overview
8
Card Catalog
MARC Record
Dublin Core Metadata & digital object
9
OpenRefine
Interactive Data Transformation tool (IDT)
-Interactive like a spreadsheet – but more powerful
-Programmable like a database – but more exploratory
-Open source
-Runs locally in your browser
But what can it do?
-Import and export data
-Facet data
-Transform data
-Reconcile data to outside data sets
10
Importing the data and creating a new project…
-MARC fields re-named and re-ordered
-Join fields where data is separated
-Separate fields where data is joined
-Re-format dates
-Remove unnecessary punctuation
-Add fields that are required in the digital collection
11
Renaming Columns
Columns are the primary units of interaction. The drop-down menu of functions at the column level allows us to rename, reorder, or transform columns. Column names must exactly match our CONTENTdm (CDM) field names in order to upload the metadata.
Example: MARC 100 → Composer
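The renaming step can be sketched outside OpenRefine as a simple lookup. This is a Python illustration, not GREL and not the presenters' workflow. Only the MARC 100 to Composer pair comes from the slide; the other mapping entry, the function name, and the sample row are hypothetical.

```python
# Hypothetical mapping from MARC-derived column names to CONTENTdm field names.
# Only "MARC 100" -> "Composer" appears on the slide; "MARC 245" is illustrative.
FIELD_NAMES = {
    "MARC 100": "Composer",
    "MARC 245": "Title",
}


def rename_columns(row, mapping):
    """Return a copy of row with keys renamed; unmapped keys are kept unchanged."""
    return {mapping.get(name, name): value for name, value in row.items()}


# Placeholder values, not data from the Spencer collection.
row = {"MARC 100": "Example Composer", "MARC 245": "Example Title"}
renamed = rename_columns(row, FIELD_NAMES)
```

Keeping the mapping in one table makes it easy to check each target name against the CONTENTdm field list before upload.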
12
Re-ordering Columns
Columns must also exactly match the order in which the corresponding fields appear in our CONTENTdm collection. Once all the fields have been re-named, they can be re-ordered under the All columns menu.
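As a rough illustration of the ordering requirement, the following Python sketch rebuilds each row in a fixed field order and fails loudly if a field is missing. The field list `CDM_ORDER` is a hypothetical stand-in; the real order would come from the CONTENTdm collection itself.

```python
# Hypothetical field order; the real one is defined by the CONTENTdm collection.
CDM_ORDER = ["Title", "Composer", "Date"]


def reorder_columns(row, order):
    """Return a new dict whose keys follow `order` exactly."""
    missing = [name for name in order if name not in row]
    if missing:
        raise KeyError(f"row is missing expected fields: {missing}")
    return {name: row[name] for name in order}


# Placeholder values for illustration only.
row = {"Composer": "Example Composer", "Date": "1905", "Title": "Example Title"}
reordered = reorder_columns(row, CDM_ORDER)
```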
13
Joining Values
Joining the 245$a and 245$b to create a Title field
Transform data with Google Refine Expression Language (GREL)
Expected value
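The join itself is done in OpenRefine with a GREL expression that the transcript does not preserve. A minimal Python analogue of the same idea, assuming the two subfields are joined with a single space and that 245$b may be empty, might look like this (the function name and sample values are hypothetical):

```python
def join_title(sub_a, sub_b):
    """Join 245$a and 245$b with one space, skipping empty or missing parts."""
    parts = [p.strip() for p in (sub_a, sub_b) if p and p.strip()]
    return " ".join(parts)


# Placeholder values for illustration only.
title = join_title("Example title :", "an example subtitle")
```

Skipping empty parts avoids a trailing space in records that have no 245$b.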
14
Adding a New Column
Adding a column based on an existing column. Values from the 260$c populate the new Date Search field, with unnecessary data removed.
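The slide does not say exactly which data counts as unnecessary. One plausible reading, sketched here in Python under that assumption, is that Date Search keeps only the first four-digit year found in the 260$c value and drops brackets, copyright marks, and punctuation. The function name and sample inputs are hypothetical.

```python
import re

# A run of exactly four digits that is not part of a longer number.
YEAR = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def date_search(value):
    """Return the first four-digit year in a 260$c value, or "" if none is found."""
    match = YEAR.search(value or "")
    return match.group(1) if match else ""


# Hypothetical 260$c values of the kind found in MARC records.
examples = ["c1905.", "[1898?]", "n.d."]
years = [date_search(v) for v in examples]
```

Values with no recoverable year come back empty so they can be reviewed by hand rather than guessed.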
15
Splitting Values
246 must be split into two or three separate fields:
-Alternative Title
-First line of verse
-First line of chorus
16
Splitting Values
The value in the First Line of Verse field always begins with the same phrase, “First line of text.” To create a new column with this portion only, split the value on the semicolon, then filter the resulting values by the leading phrase. The same method will also isolate the First Line of Chorus values. Using “not” as a Boolean operator will isolate the Alternative Title values.
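The split-then-filter logic described above can be sketched in Python as follows. The verse prefix "First line of text" is taken from the slide; the chorus prefix, the function name, and the sample value are assumptions made for illustration.

```python
def split_246(value,
              verse_prefix="First line of text",
              chorus_prefix="First line of chorus"):  # chorus prefix is assumed
    """Split a 246 value on semicolons and sort the parts into three fields."""
    parts = [p.strip() for p in value.split(";") if p.strip()]
    verse = [p for p in parts if p.startswith(verse_prefix)]
    chorus = [p for p in parts if p.startswith(chorus_prefix)]
    # The "not" filter: whatever matches neither prefix is an alternative title.
    alternative = [p for p in parts
                   if not p.startswith((verse_prefix, chorus_prefix))]
    return {
        "Alternative Title": alternative,
        "First line of verse": verse,
        "First line of chorus": chorus,
    }


# Placeholder value for illustration only.
sample = ("Example alternative title; First line of text: Example verse; "
          "First line of chorus: Example chorus")
fields = split_246(sample)
```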
17
Isn’t it all a little tedious?
Extract the Operation History to automate your data transformation and clean-up. Extract and save! Apply to new data sets that need the same kind of clean-up.
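OpenRefine saves an extracted history as JSON, and in practice it is re-applied inside OpenRefine itself. To show why a saved history is reusable, the Python sketch below replays one kind of step, a column rename, against a new batch of rows. The JSON shown is a hand-written example in the shape OpenRefine uses for `core/column-rename`; the function name and sample data are hypothetical, and every other operation type is skipped.

```python
import json

# Hand-written example of an extracted history containing one rename step.
HISTORY_JSON = """
[
  {
    "op": "core/column-rename",
    "description": "Rename column MARC 100 to Composer",
    "oldColumnName": "MARC 100",
    "newColumnName": "Composer"
  }
]
"""


def apply_renames(rows, history):
    """Replay only the column-rename steps of a history on a list of row dicts."""
    for step in history:
        if step.get("op") != "core/column-rename":
            continue  # other operation types are out of scope for this sketch
        old, new = step["oldColumnName"], step["newColumnName"]
        rows = [{(new if key == old else key): value for key, value in row.items()}
                for row in rows]
    return rows


history = json.loads(HISTORY_JSON)
new_batch = [{"MARC 100": "Example Composer"}]  # placeholder data
result = apply_renames(new_batch, history)
```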
18
Invaluable resources
http://openrefine.org/
http://freeyourmetadata.org/
https://github.com/OpenRefine/OpenRefine/wiki/GREL-Functions
Verborgh, Ruben, and Max De Wilde. Using OpenRefine. Birmingham: PACKT Publishing, 2013.
Van Hooland, Seth, and Ruben Verborgh. Linked Data for Libraries, Archives, and Museums: How to Clean, Link, and Publish Your Metadata. Chicago: Neal-Schuman, 2014.