Free for all: opening collections & supporting multi-institutional efforts w/ Internet Archive. Patrick R. Wallace, digital projects & archives librarian, Middlebury College Special Collections.
the basics. Founded 1996. 501(c)(3) non-profit. San Francisco. 150 billion+ Web pages. Millions of objects. Cool objects. We’re mainly talking about objects today.
the (very) good It’s free. Really free. The stuff in it is free. Long history, no signs of going away. Dedication to providing public access to knowledge. Dedicated to preserving a historical record, especially re: everyday life and digital culture. Transcoding, streaming, OCR, storage -- for free!
the (maybe) bad Tendency to act in an un-librarylike fashion. 1,300 Public Domain dictionaries, still don’t know the word “deaccession”. Bucket system. No quality assurance. No access control. Once free, always free (kind of).
the (really pretty) ugly Messy collections. Arbitrary metadata. Lots and lots of junk. Limited UI, hard to find materials. Serious collection management means Linux, command lines, and scripting. Key management tasks require IA staff/admin intervention.
a serious tool for serious libraries We’re in this together. Shared professional ethic and ethical praxis. “Universal access to all knowledge”. Lots of users. API. Weaknesses are also strengths. Migration is not really that bad.
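The "API" point can be made concrete. What follows is a minimal sketch, assuming the public read-only metadata endpoint (`https://archive.org/metadata/<identifier>`) and the advanced search endpoint; the function names and the `collection:example` query are hypothetical, chosen only for illustration. It uses the Python standard library only, and nothing touches the network until `fetch_json` is called.

```python
import json
import urllib.parse
import urllib.request

METADATA_ENDPOINT = "https://archive.org/metadata/"
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def metadata_url(identifier: str) -> str:
    """Build the URL that returns one item's metadata as JSON."""
    return METADATA_ENDPOINT + urllib.parse.quote(identifier, safe="")


def search_url(query: str, fields=("identifier", "title"), rows: int = 50) -> str:
    """Build an advanced search URL that asks for JSON output."""
    params = [("q", query)]
    params += [("fl[]", field) for field in fields]  # one fl[] per returned field
    params += [("rows", str(rows)), ("output", "json")]
    return SEARCH_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Fetch a URL and decode the JSON body (this is the only network call)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical query; substitute a real collection identifier.
    print(search_url("collection:example", fields=("identifier",), rows=5))
```

Keeping URL construction separate from fetching makes it easy to check a query by eye before running it against the live service.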
extra special collections Use case #1: Midd Special Collections & Archives Sharing everything we may, because we can. 5,000+ original items added since Jan 2016. Collaboration to automate DLA uploads. Dramatic increase in item views over CONTENTdm. Loss of some metadata. “Much kludge, very wow.”
no budget? no problem. Use case #2: one big union. Green Mountain Digital Archive (DPLA @ VT). Bringing the smallest institutions on board. Scheduled metadata scraping. Dependent on training, style guide compliance, normalization. Community involvement. When all you have is a hammer, at least you have a hammer.
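The normalization step that scheduled scraping depends on might look something like the sketch below. The rules here (collapsing whitespace, ISO dates, semicolon-separated subjects) are assumptions made for illustration and are not the Green Mountain Digital Archive's actual style guide; all function names are hypothetical.

```python
import re
from datetime import datetime

# Date layouts we assume contributors might use; extend as needed.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%Y")


def clean_text(value: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def normalize_date(value) -> str:
    """Return an ISO date (or bare year) when the input matches a known layout."""
    text = clean_text(str(value))
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        return f"{parsed.year:04d}" if fmt == "%Y" else parsed.date().isoformat()
    return text  # unrecognized dates are left as-is for human review


def normalize_subjects(value) -> list:
    """Split on semicolons, clean each term, drop case-insensitive duplicates."""
    raw = value if isinstance(value, (list, tuple)) else [value]
    seen, subjects = set(), []
    for entry in raw:
        for part in str(entry).split(";"):
            subject = clean_text(part)
            if subject and subject.lower() not in seen:
                seen.add(subject.lower())
                subjects.append(subject)
    return subjects


def normalize_record(record: dict) -> dict:
    """Apply the rules above to one scraped metadata record."""
    normalized = {}
    for key, value in record.items():
        if key == "subject":
            normalized[key] = normalize_subjects(value)
        elif key == "date":
            normalized[key] = normalize_date(value)
        elif isinstance(value, str):
            normalized[key] = clean_text(value)
        else:
            normalized[key] = value
    return normalized
```

Leaving unparseable dates untouched, rather than guessing, keeps the script from silently inventing data; those records can be flagged for follow-up with the contributing institution.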
hack it, work it Web interface. internetarchive Python library. Amazon S3 API. Standalone CLI tool. Lots of custom scripts. Don’t be afraid. Backlogs are good practice.
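As one example of what a custom script can look like, here is a minimal sketch of an upload using the `internetarchive` Python library's `upload()` function. It assumes the library is installed and that credentials have already been set up with `ia configure`; the identifier, file name, and collection name are hypothetical, and the `dry_run` switch is an addition for illustration, not a library feature.

```python
from pathlib import Path


def build_metadata(title: str, collection: str, mediatype: str = "texts", **extra) -> dict:
    """Assemble item metadata, skipping any optional fields left empty."""
    metadata = {"title": title, "collection": collection, "mediatype": mediatype}
    metadata.update({k: v for k, v in extra.items() if v not in (None, "", [])})
    return metadata


def upload_item(identifier: str, files, metadata: dict, dry_run: bool = True):
    """Upload files to an item, or just report the plan when dry_run is True."""
    paths = [str(Path(f)) for f in files]
    if dry_run:
        # Nothing is sent; return what would be uploaded so it can be reviewed.
        return {"identifier": identifier, "files": paths, "metadata": metadata}
    import internetarchive  # third-party: pip install internetarchive
    return internetarchive.upload(identifier, files=paths, metadata=metadata)


if __name__ == "__main__":
    # Hypothetical values throughout; replace before a real run.
    meta = build_metadata("Town report, 1901", "example_collection", date="1901")
    print(upload_item("example-town-report-1901", ["report.pdf"], meta))
```

Defaulting to a dry run is a deliberate choice given the "once free, always free" caveat: it is easier to review a plan than to ask staff to remove an item after the fact.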
overloaded? happy to help. [me] [not_me]