Published by Kathryn Freeman. Modified over 8 years ago.

Light on ETD’s: Out From the Shadows
Wendy Robertson, Rebecca Routh
The University of Iowa Libraries
ILA/ACRL Spring Conference 23 April 2010

Overview
Background
Benefits
Workflow
Related Issues
Questions/Discussion

Background

Honors Projects
While we will be discussing theses and dissertations, many of these issues and workflow possibilities could also apply to honors projects, to highlight the best work of your undergraduates.

“In short, dissertations are high in quality and low in accessibility. In fact, I’d say they constitute the most invisible form of useful literature and the most useful form of invisible literature. Because of their high quality, the access problem is worth solving.”
Peter Suber

Ancient History
Virginia Tech, a leader in ETDs, required them in 1997.
Iowa accepted them from 1999.
– 1999-2002: experimental; XML files with non-textual content.
– 2003 onward: students could submit electronically to ProQuest, and we received PDF and individual XML files for each ETD.

Total U Iowa ETDs per Year
[Chart, with an annotation marking when ETDs became required]

Old ETD Workflow
IT staff unzipped and renamed files.
A basic HTML listing was made by our webmaster.
An email with basic information was sent to cataloging.
ETDs were handled as exceptions, outside the normal flow.

New ETD Home
Iowa Research Online (U Iowa’s institutional repository) launched in January 2009.
ETDs would move to the repository.
Future ones would be added in an unspecified but hopefully efficient manner.

New Workflow Introduced
Joanna Lee and Shawn Averkamp developed a method to use ProQuest XML data to load ETDs.
They also developed a model to convert the XML into MARC for cataloging.

Required ETDs Announced
In winter 2009 the Graduate College announced that, beginning with the fall 2009 graduation, all dissertations and theses, with the exception of the MFAs, would be submitted electronically.
Library staff began determining what this would mean for department workloads and workflows.

A Note On Embargos
The Graduate College allows T-Ds to be embargoed for up to 24 months. This is no different from print.
The library does not receive an embargoed title until after the embargo has passed.
We have also allowed a very small number to be restricted to campus, based on the graduate's request and the approval of the Graduate College.

Benefits

Increased Access
Remote access to ETDs is free, easy and instant.
UI print T-Ds are rarely checked out (a circulation rate of 24 times in 10 years is considered exceptional).
Usage data for UI ETDs show a very different picture.

Global Use of UI ETDs
[Figure not included]

Other Advantages
Enriched content (color images, video, audio, raw data)
Shelf space freed up in Archives
Savings on paper and printing
Increased visibility of the institution
Wider vetting of the candidate encourages higher quality

Workflow

Students Submit to ProQuest
Students fill out a form.
ProQuest designs and controls the form.
Some fields are specific to us, based on Graduate College input.
The Libraries have no control over the fields.

Local Submission
If we didn’t rely on ProQuest, we would hopefully have more control over the form, and students could even submit directly to the repository.
If not submitted to the repository, the form data could still be provided in XML.

Graduate College Requirements
Thesis title page:
– The title appears in all capital letters.
– The author’s name is the name of the student as it appears in the Registrar’s records at the time of graduation.
An abstract is required only for PhD and DMA candidates, but the abstract field is required on the ProQuest form.

First XML Steps
Unzip files from ProQuest.
Combine multiple XML documents into one.
Transform the XML file into the format for upload.
– Relies on the quality of student input data.
– The XML transformation converts titles in all capitals to titles with every word capitalized.
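
The combine-and-recapitalize steps above can be sketched roughly as follows. This is only an illustration, not the transformation the presenters used: the `<record>`, `<title>` and `<documents>` element names are assumptions, since the slides do not show the ProQuest schema.

```python
import xml.etree.ElementTree as ET


def title_case_if_all_caps(title):
    """Capitalize each word of an ALL-CAPS title; leave other titles alone."""
    if title and title.isupper():
        return " ".join(word.capitalize() for word in title.split())
    return title


def combine_records(xml_strings, title_tag="title"):
    """Wrap several single-record XML documents in one <documents> root.

    The element names are hypothetical stand-ins for the real schema.
    """
    root = ET.Element("documents")
    for text in xml_strings:
        record = ET.fromstring(text)
        for title in record.iter(title_tag):
            title.text = title_case_if_all_caps(title.text)
        root.append(record)
    return root


samples = [
    "<record><title>LIGHT ON ETDS</title></record>",
    "<record><title>Already Mixed Case</title></record>",
]
combined = combine_records(samples)
print(ET.tostring(combined, encoding="unicode"))
```

Note that a naive conversion like this lowercases acronyms ("ETDS" becomes "Etds"), which is consistent with the later cataloging step of fixing capitalization by hand.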

[Figure: Original ProQuest data; XML after transformation]

Metadata Interoperability
Data are reformatted and standard values added according to standard metadata practices.
First formatted based on the Networked Digital Library of Theses and Dissertations ETD-MS scheme ( v1.00-rev2.html).
Fields for MARC records were added later.

Quality Control of Abstract Field
[Figure panels: Original ProQuest data; After transformation; After correction]

Quality Control By Spreadsheet
Some irregularities are easier to see in a spreadsheet, so we transform the XML to a tab-delimited text file for review.
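
One way to produce such a tab-delimited file is sketched below. The field names and the `<record>` element are hypothetical, and the presenters may well have used a different tool (for example an XSLT stylesheet) for this step.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical field names; the real upload schema is not shown in the slides.
FIELDS = ["title", "author", "abstract"]


def records_to_tsv(root, fields=FIELDS):
    """Return one tab-delimited row per <record>, preceded by a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for record in root.iter("record"):
        row = []
        for name in fields:
            value = record.findtext(name, default="")
            # Collapse tabs and line breaks so each record stays on one row.
            row.append(" ".join(value.split()))
        writer.writerow(row)
    return buffer.getvalue()


root = ET.fromstring(
    "<documents><record><title>A Study</title><author>Doe, Jane</author>"
    "<abstract>Line one\nline two</abstract></record></documents>"
)
tsv = records_to_tsv(root)
print(tsv)
```

Collapsing whitespace inside each value keeps multi-paragraph abstracts from breaking the row layout when the file is opened in a spreadsheet.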

Quality Control By HTML
Some problems are more obvious in HTML.
Entities with encoding problems are particularly obvious.
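
As a complement to reviewing by eye, a short script could flag text that looks like encoding damage. The sketch below is an assumption about what such a check might look for (the Unicode replacement character, common UTF-8-read-as-Latin-1 sequences, and double-escaped entities); the slides do not describe an automated check.

```python
import re

# Patterns that often indicate encoding damage in raw, still-escaped text:
#   \ufffd           the Unicode replacement character
#   \u00c3 + 1 char  UTF-8 accented letter misread as Latin-1 (e.g. "Ã©")
#   \u00e2\u20ac     start of a misread curly quote or dash ("â€")
#   &amp;name;       an entity that was escaped twice
SUSPECT = re.compile(
    r"\ufffd|\u00c3.|\u00e2\u20ac|&amp;(?:#\d+|#x[0-9a-fA-F]+|[A-Za-z]+);"
)


def find_suspect_text(fields):
    """Return (field name, matched fragment) pairs that look damaged."""
    hits = []
    for name, value in fields.items():
        for match in SUSPECT.finditer(value):
            hits.append((name, match.group(0)))
    return hits


# Hypothetical sample values, written with escapes so the damage is explicit.
sample = {
    "title": "Caf\u00c3\u00a9 culture",
    "abstract": "The author\u00e2\u20ac\u2122s method uses &amp;alpha; particles",
}
hits = find_suspect_text(sample)
for field, fragment in hits:
    print(field, repr(fragment))
```

A check like this only narrows down where to look; a person still has to decide what the intended character was.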

Files to Cataloging
After the quality control corrections have been made, the XML file formatted for upload is sent to cataloging staff.
A post-correction HTML file is made.
The HTML file includes a temporary link to the ETDs.

Processing the XML file
The XML file is divided into sections and distributed among staff.
Catalogers add some new data and check for character display.
They also edit the data to conform to AACR2 rules, as it will later be converted to MARC for a traditional catalog record.
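
Dividing one file into sections, and later putting the sections back together, might look something like this. It is a minimal sketch that assumes every child of the root element is one ETD record; the slides do not say how the file was actually divided.

```python
import xml.etree.ElementTree as ET


def split_records(root, parts):
    """Split the root's child records into at most `parts` smaller trees."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    records = list(root)
    if not records:
        return []
    size = -(-len(records) // parts)  # ceiling division
    chunks = []
    for start in range(0, len(records), size):
        chunk = ET.Element(root.tag)
        chunk.extend(records[start:start + size])
        chunks.append(chunk)
    return chunks


def reassemble(chunks):
    """Merge the edited sections back into a single tree, in order."""
    merged = ET.Element(chunks[0].tag)
    for chunk in chunks:
        merged.extend(list(chunk))
    return merged


root = ET.fromstring(
    "<documents>"
    + "".join(f'<record id="{n}"/>' for n in range(1, 6))
    + "</documents>"
)
chunks = split_records(root, 2)
merged = reassemble(chunks)
print([len(chunk) for chunk in chunks], len(merged))
```

Keeping the sections in their original order means the reassembled file can be compared against the pre-cataloging file record by record.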

Edits and Checks
Fix capitalization: acronyms, words in title
Fix pagination to reflect the actual numbering assigned by the candidate
Add page numbers of bibliographic references
Check that the candidate’s name matches that on the title page
Check character display

Software options
Plain text editors (e.g. Notepad, Textpad) show the XML code but are hard on the eyes
XML editors (e.g. oXygen) display code with syntax color highlighting
Graphical XML editors (e.g. XML Notepad) hide the code and feature an editing workform graphic, useful for those not fluent in XML

[Slide image: Textpad]

[Slide image: XML Editor]

[Slide image: XML Notepad]

[Slide image: Checking Character Display]

Final steps in XML editing
Check that the data are complete by viewing them as XML code.
IE’s XML Editor displays the code with syntax highlighting, which makes it easy to see if any elements are incomplete.
Catalogers send the completed XML file to Digital Library Services, with notes about any character problems.
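
The completeness check described above was done by eye; a scripted version could serve as a second pass. In this sketch the list of required elements and the `<record>` element name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of elements every record is expected to carry.
REQUIRED = ["title", "author", "pages"]


def find_incomplete(root, required=REQUIRED):
    """Return (record number, element name) for missing or empty elements."""
    problems = []
    for index, record in enumerate(root.iter("record"), start=1):
        for name in required:
            text = record.findtext(name)
            if text is None or not text.strip():
                problems.append((index, name))
    return problems


root = ET.fromstring(
    "<documents>"
    "<record><title>A</title><author>B</author><pages>120</pages></record>"
    "<record><title>C</title><author> </author></record>"
    "</documents>"
)
problems = find_incomplete(root)
print(problems)
```

Here the second record is reported twice: its author element contains only whitespace and its pages element is absent.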

XML Files From Cataloging
Individual XML files are reassembled as one file.
Corrections and problems are reviewed.
The XML file and PDFs are uploaded to the repository.

ETD Metadata in OAIster
Metadata exposed through OAI-PMH can be harvested by larger collections, whether focused by topic or format or simply broad in scope.
Many repositories and digital collections are harvested by OAIster.
As of fall 2009, OAIster data is in WorldCat.

[Figure: Fully cataloged MARC record and OAIster record both in WorldCat]

Additional metadata displays if you search oaister.worldcat.org.

OAI is Also XML
OAI is a specific XML format.
The software maps to OAI for us.
In the future we will have more control over how this is done.
You can see OAI data using a standard syntax: &metadataPrefix=oai_dc& etd-1436
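
The request syntax referred to here is the OAI-PMH `GetRecord` verb, which takes `verb`, `metadataPrefix` and `identifier` parameters. The sketch below builds such a URL. The base URL and the full identifier are hypothetical placeholders: only `oai_dc` and `etd-1436` appear on the slide, and the repository's real address is not recoverable from it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode(
        {
            "verb": "GetRecord",
            "metadataPrefix": metadata_prefix,
            "identifier": identifier,
        }
    )
    return f"{base_url}?{query}"


# Hypothetical values for illustration; not the repository's real endpoint.
BASE_URL = "https://repository.example.edu/oai"
IDENTIFIER = "oai:repository.example.edu:etd-1436"

url = get_record_url(BASE_URL, IDENTIFIER)
print(url)
```

Opening a URL of this shape in a browser returns the record as XML, with the Dublin Core fields wrapped in the OAI-PMH response envelope.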

XML Transformed to MARC
The metadata for uploading is transformed into MARC XML.
The file is split into broad subject areas for catalogers.
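
To give a sense of what the MARC XML target looks like, here is a minimal sketch that builds a partial record in the MARC21 slim namespace. It is not the presenters' conversion model: the choice of fields (100, 245, 520) and the indicator values are assumptions, and the leader and control fields that a complete record needs are omitted.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def datafield(record, tag, ind1, ind2, subfields):
    """Append one MARC data field with its subfields to the record."""
    field = ET.SubElement(
        record,
        f"{{{MARC_NS}}}datafield",
        {"tag": tag, "ind1": ind1, "ind2": ind2},
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def to_marcxml(author, title, abstract):
    """Build a partial MARC XML record (no leader or control fields)."""
    record = ET.Element(f"{{{MARC_NS}}}record")
    datafield(record, "100", "1", " ", [("a", author)])    # main entry, surname first
    datafield(record, "245", "1", "0", [("a", title)])     # title statement
    datafield(record, "520", "3", " ", [("a", abstract)])  # first indicator 3 = abstract
    return record


record = to_marcxml("Doe, Jane", "A study of something", "An abstract.")
print(ET.tostring(record, encoding="unicode"))
```

The second indicator of field 245 counts nonfiling characters, so a real conversion would need to set it according to any leading article in the title rather than hard-coding 0.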

MarcEdit
Free downloadable software
Use the MarcMaker function to convert MARC-XML to MARC
Choice of UTF-8 or MARC-8

[Figure: File import to OCLC Connexion]

MARC Cataloging
Full-level description as per AACR2
LCSH subject headings
Adopted some of the ETD metadata best practice guidelines

Workflow Complete!
ETD metadata searchable in:
WorldCat
OAIster
UI institutional repository
UI OPAC

Related Issues

Google is Primary Entry to ETDs
Traditional library access is still used.

Preservation
We are not printing the ETDs.
– We are relying on multiple copies of the PDF.
– We are relying on multiple servers (ProQuest, bepress, Iowa) and CD-ROMs.
– We are relying on the ubiquity of PDF to ensure long-term access in some format.
The Graduate College requires a signed certificate of approval and the title page and abstract in print.

FERPA
Since the Graduate College handles permissions from students, this is not a concern for our ETDs, but it would be for other student projects.
Ramirez and McMillan suggest including this text in permissions: “Students making submissions to this repository agree to share their work and waive any privacy rights granted by FERPA or any other law, policy or regulation, with respect to this work, for the purpose of publication.”

Adding Older T-Ds
We will need to get the author’s permission.
We have added two older dissertations with the author’s permission.
We would like to include brittle T-Ds.
We would like to add higher-use T-Ds.

ETDs are NOT a Prior Publication
Concerns sometimes arise that making a dissertation available online constitutes prior publication.
Publishers will still publish an OA dissertation because they know the book will be substantially different, aiming for a different audience.

XML vs. MARC
Libraries and ILS systems use MARC for data exchange
The rest of the world uses XML (publishers, vendors, e-businesses of all types)
Conversion back and forth is cumbersome
XML is expected to replace MARC in future generations of ILS systems

Cataloging as an Iterative Process
The new model for cooperative cataloging is iterative
Data grows and morphs as it travels through the supply chain
For ETDs: graduate candidate -> ProQuest -> Digital Library Services -> OAIster -> the library cataloger
Redundancy is avoided
Libraries can’t afford to wait until a book reaches their door to create the perfect bib record from scratch

Data Supply-chain
Includes publishers, vendors, libraries, e-content providers, users
o Bibliographic data
o BISAC subject headings, sales data
o LCSH subject headings and classification numbers
o Cover art, table of contents, user reviews
OCLC has developed a crosswalk to convert publisher metadata (ONIX) to MARC for data

New Roles for Catalogers
Gain familiarity with XML encoding (the likely successor to MARC)
Experience with new tools
– XML Notepad
– MarcEdit for data conversion
Increased confidence
Collaboration outside traditional roles

References
Averkamp, S. and Lee, J. Repurposing ProQuest Metadata for Batch Ingesting ETDs into an Institutional Repository.
Bailey, C.W. Electronic Theses and Dissertations Bibliography.
Boock, M. and Kunda, S. Electronic thesis and dissertation metadata workflow at Oregon State University Libraries.
Fyffe, R. and Welburn, W. “ETDs, scholarly communication, and campus collaboration: Opportunities for libraries.” /2008/mar/etdsschcommcampucollab.cfm
McCutcheon, S., et al. Morphing metadata: maximizing access to electronic theses and dissertations.
Ramirez, M. and McMillan, G. FERPA and Student Work: Considerations for Electronic Theses and Dissertations.
Register, Renee. Mix and match: mashups of bibliographic data (ALCTS Midwinter Forum, January 18, 2010).
Suber, P. Open access to electronic theses and dissertations (ETDs). 02-06.htm#etds

Questions?

This presentation by Wendy Robertson and Rebecca Routh is licensed under a Creative Commons Attribution 3.0 United States License.