1
ETD Preservation Survey Results
Gail McMillan
Digital Library and Archives, Virginia Tech
11th International ETD Symposium, Robert Gordon University
Background
MetaArchive Cooperative
NDLTD Board of Directors
More than 93% of the world’s information today originates as digital files, not print documents. Essentially all theses and dissertations created today are born-digital, and universities worldwide are increasingly accepting electronic theses and dissertations (ETDs) in addition to or in place of print versions. How we care for these new digital resources is important in light of possible catastrophic events such as fires and hurricanes, as well as the more prevalent hardware, software, and human failures that all institutions encounter. We must be proactive in providing long-term digital preservation strategies to protect the research and scholarship that make up this important component of our institutional histories.
ETDpreservPaper.ppt
2
What is Digital Preservation?
Systematic management of digital works over an indefinite period of time
Processes and activities that ensure continued access to works in digital formats
Requires ongoing attention: constant input of effort, time, and money
Technological and organizational change are obstacles to preserving beyond a few years
What is digital preservation? It is the systematic management of computerized information over an indefinite period of time. It demands continual attention, and this constant input of effort, time, and money to handle technological and organizational change is the main stumbling block to preserving digital information beyond a few years. I believe that the most effective preservation succeeds by replicating copies of digital content in secure, distributed locations over time: security reduces the likelihood that any single cache will be compromised, and distribution reduces the likelihood that the loss of any single cache will lead to a loss of the preserved content. A single organization is unlikely to have the capability to operate several geographically dispersed and securely maintained servers. Inter-institutional agreements must be put in place, or there will be no commitment to act in concert over time.
Halbert, Martin. MetaArchive presentation to SCHEV LAC, March 28, 2008.
3
The NDLTD and the MetaArchive Cooperative share a goal
Help higher education institutions provide long-term open access to ETDs
NDLTD members can achieve this goal by becoming part of the ETD Archive
Join the DDPN (Distributed Digital Preservation Network)
The Networked Digital Library of Theses and Dissertations (NDLTD) and another service organization, the MetaArchive Cooperative, share the goal of helping higher education institutions provide long-term open access to ETDs. The MetaArchive Cooperative and the NDLTD joined forces in 2008 to offer preservation services for ETD collections by implementing an ETD Archive using the technological approach called a distributed digital preservation network (DDPN). Participants in this new ETD Archive make their collections available for harvesting into the network, and they may also participate in the Cooperative more actively by hosting a LOCKSS-based secure networked server.
4
Distributed Digital Preservation Network Web Crawls and Audits Content Integrity
Typically LOCKSS programmatically collects content from publishers and distributes copies among partner libraries’ servers, where it is preserved using inexpensive computers and open-source software that audits and repairs content as needed from the publisher or the partners. It allows content to be disseminated only to the appropriate users. The host library’s clientele see the content from the publisher’s site unless it is not available from there; in that case a partner provides the content. Otherwise the partners’ copies are used only to audit and repair the digital content. Many libraries are familiar with this simple but robust, low-maintenance, and low-cost distributed digital preservation system.
In 2004 six American university libraries received funding from the Library of Congress to create a similar network of trusted partners. They adapted the LOCKSS software by disconnecting access from preservation so that the partners’ servers became a networked, secure dark archive. This partnership became the MetaArchive Cooperative, and four years later it was ready to expand the network to welcome members from the NDLTD who may also seek a tested and effective preservation strategy familiar to libraries. Collections of born-digital and digitized theses and dissertations from NDLTD institutions will be ingested into the ETD Archive by the MetaArchive system and copied, distributed, and stored on secure servers at multiple NDLTD partner institutions. The MetaArchive Cooperative will not provide access; that service remains with each institutional member hosting an ETD collection.
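The audit-and-repair idea in the notes above can be illustrated with a minimal sketch. It is not the LOCKSS protocol itself, which uses its own polling mechanism among peers; here a plain majority vote over SHA-256 digests stands in for it, and the cache names, the sample bytes, and the function `audit_and_repair` are all hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint of one cached copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(caches: dict) -> list:
    """Compare copies across caches and repair the ones that disagree.

    `caches` maps a (hypothetical) cache name to the bytes it holds for
    one preserved file. Copies whose digest differs from the majority
    digest are overwritten with a majority copy. Returns the names of
    the caches that were repaired.
    """
    digests = {name: digest(data) for name, data in caches.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(caches) / 2:
        # Without a strict majority there is no trustworthy copy to repair from.
        raise ValueError("no majority agreement; manual review needed")
    good_copy = next(data for name, data in caches.items()
                     if digests[name] == winner)
    repaired = []
    for name in list(caches):
        if digests[name] != winner:
            caches[name] = good_copy
            repaired.append(name)
    return repaired


# Three partner caches; one copy has been silently corrupted.
caches = {
    "cache_a": b"thesis-bytes",
    "cache_b": b"thesis-bytes",
    "cache_c": b"thesis-bXtes",
}
repaired = audit_and_repair(caches)
```

The sketch also shows why distribution matters: with a single cache there is nothing to compare against, while with several independent caches a damaged copy can be detected and restored from its peers.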
5
95 Responses to ETD Preservation Survey
15% Assoc. of Research Libraries
11% Assoc. of SE Research Libraries
9% Council of Graduate Schools
10% Digital Library Federation
32% NDLTD, ETD listserv
23% Others
While the MetaArchive Cooperative felt there would be interest among universities in an ETD Archive, its Steering Committee (of which I am a part) decided to conduct a survey. Because a significant portion of the survey respondents were members of the NDLTD and ETD listservs, and because this presentation was prepared for this conference, I grouped their responses together and refer to them as ETDL. When noteworthy, I highlight them or contrast them with the non-ETDL, that is, responses from the other listservs. At least half of the ETDL are current members of the NDLTD. Of these, nearly one-third of the respondents were at international universities, a little more than half were at American universities, and less than one-fourth were at undesignated institutions.
6
Over three-fourths of the survey responses came from universities that accept ETDs. Over one-third of those respondents also reported that their institutions accept just the electronic formats, while just over half of them also maintain print copies. It was expected that institutions with active ETD initiatives would be keeping up with relevant issues through the NDLTD or ETD listservs, so it is not surprising that more of those universities accept ETDs. However, a smaller percentage of those institutions accept only electronic versions.
7
ETD File Formats
85% PDF
30% JPG
27% WAV
24% GIF
23% HTML, MOV
21% AVI, MP3
MetaArchive Conspectus Database
The survey sought to determine the file formats that institutions accept with ETDs because this is an important element of preservation planning and of future format migration considerations. The MetaArchive is format agnostic, ingesting all file formats into its DDPN. The range of file formats that comprise ETDs at the survey respondents’ universities matches those the MetaArchive has already ingested and those that were corrupted and rebuilt during extensive network testing. Information such as file formats, along with other metadata, is tracked through the MetaArchive Conspectus Database, which contains each institution’s detailed descriptions of the collections in the DDPN. The metadata is used for a variety of purposes such as administration of the network (management, harvesting, maintenance, and recovery), public understanding of the collections, and future activities such as format migration.
8
Platforms, Institutional Repositories with ETDs
26% DSpace
13% ETD_db
3% Fedora
1% Eprints
29% Locally developed systems
29% Others
As anticipated, ETDs are hosted by a variety of platforms and repositories. Others mentioned were CONTENTdm, DigitalCommons, DigiTool, and ProQuest.
9
Structure of ETD Collections
25% Subject-like categories
21% Everything-in-one
21% Year*
9% Accessibility
7% Degree
5% Author
* Recommended by MetaArchive
There is a little less variety in the way collections of ETDs are structured. Most frequently (25% of the responses), ETDs were organized by subject-like categories such as departments, colleges, or disciplines. Tied for the second most common collection organization, at 21% each, were everything-in-one-collection and collections based on the year the degree was granted. The three other categories chosen were accessibility (9%), degree (7%), and author (5%). (12% of the responses did not address the question.)
10
The most surprising response to any question in the ETD Preservation Survey was the response to “Does your institution have a formalized preservation plan for its ETDs?” Only about one-quarter of the universities responding indicated that they have a preservation plan for their ETDs, leaving nearly three-fourths of the universities that accept ETDs without formal preservation plans! Correlating responses for universities that accept ETDs and have formal preservation plans reveals that less than one-fifth (18%) of the ETDL that accept ETDs also have formal plans. Two-thirds of the ETDL accept ETDs without having formalized preservation plans. Only one institution has a formal plan but does not accept ETDs.
11
Would your institution be interested in participating in an ETD-specific LOCKSS-based collaborative distributed digital archive sponsored by the NDLTD?
92% Yes or Maybe as of last week.
If even a portion of these Yeses join the MetaArchive Cooperative, it would be a significant increase in membership. It would also significantly increase the number of ETDs that will be available for long-term access.
12
If your institution is interested in the ETD Archive, is there a preference for
When the NDLTD Board of Directors was considering the sponsorship of the ETD preservation survey during its January 2008 meeting, several members stressed the importance of one of its founding goals: open and unimpeded access to ETDs. The MetaArchive also believes that ETDs should be openly accessible, but that access should come directly from the authors’ home institutions rather than from the secure and dark ETD Archive. While considering future development of the MetaArchive Cooperative, the Steering Committee acknowledged that some changes would be necessary in order to meet the needs of potential members. If it is to attract new members from among the majority of universities that responded to the ETD preservation survey, the MetaArchive may have to reconsider its stand on the separation of access from preservation. More than three-fourths of the survey respondents preferred accessible preservation archives if the NDLTD sponsored an ETD-specific LOCKSS-based collaborative preservation strategy, as reported by survey respondents in the table below.
13
Participating in the MetaArchive
The survey sought to determine whether there was interest not only in preserving ETD collections, but also in having a role in the preservation activities as participating members of the MetaArchive Cooperative. There are three membership categories in the Cooperative, with institutional participation ranging from minimal to considerable.
The MetaArchive prescribes how Contributing Members organize their ETD collections to facilitate harvesting and ingest into the ETD Archive. Contributing Members are allocated five gigabytes for their ETD collections, though they may purchase …
Preservation Members archive their ETDs in the DDPN and run a secure server for the network that harvests and caches ETDs for other NDLTD members. Preservation Membership comes with 20 GB of storage …
Along with the responsibilities of Preservation Members, Sustaining Members also develop and test software, networking, and transmission standards, and they research and deploy the work of the Cooperative, contributing staff and resources. Sustaining Members each receive 40 GB of archiving space in the DDPN though they…
Nearly one-fourth of the survey respondents indicated they wanted to join the MetaArchive as Sustaining Members. While nearly one-third of the ETDL selected this category, it was of less interest to the non-ETDL.
14
Katherine Skinner’s slide from SODA Dallas, May 2008
15
Need information about
55% Costs
23% Human resources
18% Hardware
14% Responsibilities; technical
12% Access
11% Policies
9% Procedures
Comments
20% A welcome opportunity
15% Concerns about functionality
From 65 narrative responses, I developed 35 categories.
Among the ETDL responses, 50% commented on, from most to least, costs, human resources, and access to ETDs: 30% mentioned costs, 13% mentioned human resources, and 7% questioned ETD access. The remainder was a varied list of 23 comments, each mentioned by 2-4%. These include hardware/platform, national networks, requirements, work to prepare archives, documentation, liability, policies and procedures, security, support, and sustainability. Comments that were made by the ETDL but not by the non-ETDL included documentation and national networks.
The most frequent comments from the non-ETDL, made by nearly half of them, concerned, from most to least, cost, hardware, technical issues, and human resources. Their most-made comment was also about costs, made by 20%. Technical issues were raised nearly as often, by respondents who specifically mentioned hardware, platform, and general technical issues. Only 8% of the non-ETDL mentioned human resources/staff/skills, while 7% wanted to know about general responsibilities and requirements. 11% of the comments were about policies and procedures. Non-ETDL comments that were not mirrored by the ETDL included metadata, migrations, scope, and standards. It may be that the non-ETDL are less familiar with the information that can be garnered through the ETD and NDLTD listservs.
The contrast between the two groups is sometimes quite striking. The non-ETDL mentioned technical issues more than twice as often: 19% versus 8%. The ETDL more often mentioned human resources: 13% versus 8%.
16
Conclusion
The survey responses indicated that ETD collections exist at 80% of the respondents’ universities. However, a shocking 74% of those universities lack formal preservation plans for their ETD collections. This chasm demonstrates the dire need for a preservation strategy such as that offered by the MetaArchive Cooperative. Any preservation strategy will have to accommodate a broad range of file formats, repository systems, and ETDs in varied organizational structures. An ETD-specific LOCKSS-based collaborative DDPN sponsored by the NDLTD is of interest to the majority of survey respondents. A distributed digital preservation network may be a particularly good fit for an international organization such as the NDLTD, especially since two-thirds of the responding universities already have experience with LOCKSS.
Not only did this survey demonstrate the need for and interest in a formal preservation strategy for ETDs, but the majority of survey respondents also want to participate in the preservation activities, not just off-load their ETDs into a secure archive maintained by others. If this interest in preservation network participation becomes linked with the respondents’ interest in an open or dim archive, it will cause the MetaArchive to reexamine one of its founding principles: the separation of access from preservation in its distributed digital archive. But the NDLTD is now in a position to offer guidelines for long-term preservation of ETDs.
17
There’s still time to participate in the ETD preservation survey.
Select Digital Preservation of ETDs
The results are immediately available
If you haven’t already and you’d like to participate in the survey, just go to