Megan Dirickson, Kristin Law, and Nora Winslow
INF 392K, Spring 2013
Previous Work
Determining Scope
Gathering & Assessing Records
Appraisal & Arrangement
o Creating the DSpace Collections
o Privacy Processing
o Descriptive Metadata Spreadsheet
o Creation of the SIPs
o Batch Ingest
o Shell Scripting
o Batch Metadata Editing
Twitter
Future Work
Self-Archiving Guidelines
In 2011, Wendy Hagenmaier and Rachel Appel digitized SAA paper records for the Survey of Digitization class. They digitized 221 objects and set up a basic schema in DSpace, which we used as a jumping-off point.
Community: School of Information Student Organizations
Sub-community: Society of American Archivists, UT Chapter
Collections: Administrative Records, Archives Week, Correspondence, Events, Financial Records, Marketing, Meeting Minutes, Website
Archive all the existing born-digital records, especially those from the past year. More importantly, set up a self-archiving workflow that would allow future SAA members to easily archive their own records in DSpace.
We wanted to gain intellectual control over the materials. We asked: “What exists, and where is it? What should be included for the future?” We drew on Megan and Kristin’s expertise as previous officers and on Rachel and Wendy’s previous documentation.
We asked previous SAA board members to send us anything they had, and gleaned materials from the SAA’s two websites: the general website and the Archives Week website.
Images, documents, recordings, presentations, and spreadsheets
Files that made up the websites, mostly HTML and CSS
Twitter and Facebook accounts
Listserv emails
Over 600 discrete files. We experimented with archiving Twitter and Facebook, with mixed results, and looked into previous attempts to archive the listserv emails. Facebook and the emails proved too complicated and time-consuming for the scope of this project.
Appraisal consisted mainly of weeding out duplicates, of which there were many. Kristin managed the files sent to us by previous members, Megan gleaned the general SAA website, and Nora worked with the Archives Week website. In total: over 900 files.
SAA Website Structure
A large number of files spanning more than ten years. We wanted to maintain the original arrangement, but the existing structure was too restrictive.
o We moved everything up a level in order to create collections for each year
Community
o School of Information Student Organizations
Sub-community
o Society of American Archivists: UT Student Chapter
Sub-sub-communities
o Administrative Records
o Archives Week
o Correspondence
o Events
o Financial Records
o Marketing
o Meeting Minutes
o Website and Social Media
Collections
o Calendar year
All Financial Records collections have been set to private. These collections contain budgets, potential account information, and information about donations to Archives Week that donors may wish to keep private. All financial documents from Archives Week planning have intentionally been filed under Financial Records in order to keep the Archives Week collections open to the public. The most recent years (2010-2013) of Administrative Records are currently closed; sensitive documents in these collections include membership rosters (with emails) and mentorship program information. EIDs have been redacted from the 2010-2011 membership rosters.
EIDs are not to be kept in the digital archive and documents should be reviewed to be sure that they are not included. Other sensitive information may be included in the archive, but kept in a private collection. All sensitive documents have been included in only Financial Records and Administrative Records, allowing the remaining collections to be open. Titles of private items will be viewable to the public, but the contents of the items will not be. It is up to the discretion of the future board to determine when the closed collections may be made publicly available. The Treasurer is responsible for reviewing current and previously deposited records for privacy issues, as the Treasurer will be most cognizant of sensitive information contained in financial and membership records.
Kept an archival copy of the records safe on a flash drive
Made separate ‘processing’ copies for determining content and gathering metadata
Created a spreadsheet for entering descriptive metadata
This is also when we determined the intellectual arrangement of the records and spotted duplicates
Create an extracted-metadata XML file using the National Library of New Zealand’s Metadata Extraction Tool
Perl script to create dublin_core-formatted XML from the extracted XML, and to create a new directory for each item
Manually add the original bitstream to each directory
Perl script to create the ‘contents’ text file
Perl script to rename the directories to item_001, item_002, etc.
This had to be done separately for each collection (about 30 collections)
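The SIP-assembly steps above can be sketched as follows. This is a minimal illustration in Python rather than the Perl scripts the project actually used; the `build_sips` function name and the record dictionaries are ours, and real items carried far more Dublin Core fields than the single title shown here.

```python
import shutil
from pathlib import Path

def build_sips(staging_dir, records):
    """Assemble DSpace Simple Archive Format packages: one item_NNN
    directory per record, each holding the original bitstream, a
    dublin_core.xml, and a 'contents' file listing the bitstreams."""
    staging = Path(staging_dir)
    for n, rec in enumerate(records, start=1):
        item_dir = staging / f"item_{n:03d}"    # item_001, item_002, ...
        item_dir.mkdir(parents=True)
        src = Path(rec["file"])
        shutil.copy(src, item_dir / src.name)   # the original bitstream
        # Minimal dublin_core.xml; the real records had many more fields
        (item_dir / "dublin_core.xml").write_text(
            '<dublin_core schema="dc">\n'
            f'  <dcvalue element="title" qualifier="none">{rec["title"]}</dcvalue>\n'
            '</dublin_core>\n'
        )
        # 'contents' lists each bitstream filename, one per line
        (item_dir / "contents").write_text(src.name + "\n")
```

As in the project, this has to run once per collection, producing one staging directory of item_NNN folders per collection.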
Staged the SIPs on Vauxhall in a structure mirroring the DSpace structure, and wrote the batch ingest command lines before meeting with Sam. Change in command line:
o /opt/dspace/bin/dspace import org.dspace.itemimport.ItemImport --add --eperson=msdirickson@gmail.com --collection=2081/29160 --
Problems with the dublin_core files: junk!
Since we had so many collections, we bundled the command lines into shell scripts to execute them. The idea was to save time… but…
o The script didn’t leave time to check for errors before moving on to the next collection
Added between ingests: echo, sleep 5
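The batching idea can be sketched like this. Python stands in here for the project's shell scripts; the collection handles, staging paths, and the `batch_ingest` helper are illustrative, not the actual script. The `runner` parameter exists only so the loop can be exercised without a live DSpace instance.

```python
import subprocess
import time

# Illustrative (collection handle -> SIP staging directory) pairs;
# the real project had about 30 collections staged on Vauxhall
COLLECTIONS = {
    "2081/29160": "/opt/batch_ingests/financial_2012",
    "2081/29161": "/opt/batch_ingests/financial_2013",
}

def batch_ingest(collections, runner=subprocess.run, pause=5):
    """Run one `dspace import` per collection, pausing between runs so
    errors can be spotted before the next ingest starts."""
    for handle, source in collections.items():
        cmd = [
            "/opt/dspace/bin/dspace", "import", "--add",
            "--eperson=msdirickson@gmail.com",
            f"--collection={handle}",
            f"--source={source}",
            f"--mapfile={source}.map",
        ]
        print("ingesting", handle)   # the 'echo': mark which collection is running
        runner(cmd, check=True)
        time.sleep(pause)            # the 'sleep 5': leave time to spot errors
```

The pause is the whole point: without it, a failed ingest scrolled past before anyone could react.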
Exported metadata from each sub-community:
id, collection, dc.contributor.author, dc.date.created, dc.date.issued, dc.identifier.uri, dc.language.iso, dc.publisher, dc.subject, dc.title
Merged with our descriptive metadata files by matching on id numbers, adding and changing Dublin Core fields and data:
o id
o collection
o dc.contributor.author: SAA-UT
o dc.date.created: changed from ingest date to date of creation/use of document
o dc.date.issued
o dc.identifier.uri
o dc.language.iso
o dc.publisher
o dc.subject
o dc.title.alternative: moved filename here
o dc.contributor: if an individual author was known
o dc.title: changed from filename to descriptive title
o dc.coverage.spatial
o dc.description
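The merge step can be sketched as a join on the DSpace item id. This is an assumption-laden sketch: the `merge_metadata` function and its row shapes are ours, standing in for the spreadsheet editing the project did by hand, but the field moves mirror the ones listed above (filename from dc.title into dc.title.alternative, author set to SAA-UT, descriptive values overwriting exports).

```python
def merge_metadata(exported_rows, descriptive_by_id):
    """Merge DSpace-exported metadata rows with a descriptive
    spreadsheet, matching on the DSpace item id."""
    merged = []
    for row in exported_rows:
        desc = descriptive_by_id.get(row["id"], {})
        out = dict(row)
        # Preserve the original filename before dc.title is replaced
        out["dc.title.alternative"] = row.get("dc.title", "")
        # Descriptive values (title, creation date, subjects...) win
        out.update(desc)
        out["dc.contributor.author"] = "SAA-UT"
        merged.append(out)
    return merged
```

Saved back to CSV, such rows are what `dspace metadata-import` consumes.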
Once the spreadsheets were completely edited, we saved them as CSVs and met with Sam again to import the metadata. Each sub-community had to be imported individually (much faster than each collection!). Command line:
o /opt/dspace/bin/dspace metadata-import -f /opt/batch_ingests/2081-29125.csv
Weird things happened with the ingest date…
Yay, Metadata!!!
Twitter provides a simple means of downloading tweets. We felt the tweets, especially from 2012, were valuable records: the Archives Week lectures were live-tweeted, providing rich documentation of the events. The DSpace bundle includes:
o A zip file containing a CSV of tweets (with time/date stamps)
o A screenshot for added visual context
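Packaging the export is simple enough to sketch. The `bundle_tweets` helper below is illustrative (not the project's actual tooling): it wraps the tweet CSV in a zip, with the screenshot deposited alongside as a second bitstream, matching the bundle described above.

```python
import zipfile
from pathlib import Path

def bundle_tweets(csv_path, out_zip):
    """Wrap the tweet CSV export (with time/date stamps) in a zip for
    deposit; a screenshot goes into the item as a separate bitstream."""
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store only the filename, not the full source path
        zf.write(csv_path, Path(csv_path).name)
    return out_zip
```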
Follow the workflow and continue archiving records!
Website: too complicated for a simple ingest
Listserv emails
Facebook
Continued digitization
Naming Conventions and Standards
Roles & Responsibilities
Basic workflow for importing items individually into DSpace, including adding descriptive metadata
Security/Access and Privacy Issues
Community and Collection structure; arrangement guidelines for consistency
Appraisal/Selection Policies and record priorities