1
Data Desiccation: Facilitating Long-term Access, Use, and Reuse of ETDs
Daniel Gelaw Alemneh and Mark Edward Phillips
14th International Symposium on Electronic Theses and Dissertations (ETD-2011)
Sept. 2011, Cape Town, South Africa
2
UNT’s ETDs
-General Background
-Libraries Role
3
Background
The University of North Texas (UNT) began accepting theses and dissertations in electronic format in 1999. UNT was an early adopter of what was to become the ETD movement in higher education, and one of the first three American universities to require ETDs for graduation.
4
UNT Libraries Role
The UNT Libraries play an active role in facilitating access to UNT’s ETDs.
In 2007 the Digital Projects Unit took on a stewardship role:
-Develop appropriate metadata
-Integrate value-added services into the ETDs
In 2010 we started retrospective conversion projects:
-Digital retro-conversion (in-house project) for pre-[...] theses and dissertations previously available only in paper or microform
-Digital retro-conversion for ETDs (1999 to 2009) previously available only in PDF file format
5
What makes up UNT’s ETDs?
-UNT ETDs Size
-By Access Level
-By Degree Level
6
UNT’s ETDs 51% vs. 49%
7
Access Levels of UNT’s ETDs
1. Public: These ETDs are open; there are no restrictions on these resources.
2. Restricted:
2.1 UNT-Community: These ETDs are restricted to users associated with UNT. Users are normally required to log in using their EUID if they are located outside the UNT campus. Restricted ETDs from after 2007 carry a delay (2-5 years), after which they will be moved to "Public".
2.2 UNT-Strict: These ETDs are restricted to the UNT community. This is strictly enforced, and users are always required to log in using their EUIDs, regardless of their location.
8
UNT ETDs Size By Access Level in 2010
In 2010, we had 2,836 ETDs open (Public) and 759 ETDs restricted to the UNT community. A 21% restricted share is still large, and our goal is to reduce restrictions to the minimum possible (0%).
9
UNT ETDs Size By Access Level in 2011
80% vs. 20%
As of 2011, we have 3,086 ETDs (80%) open (Public) and 768 ETDs (20%) restricted to the UNT community. The number of restricted ETDs will gradually decrease further. For example, all ETDs after 2007 that are embargoed (for 2 to 5 years) will be moved to "Public" once the embargo period has passed. All of the digitized items will be open.
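The percentages on the two access-level slides follow directly from the reported counts. The sketch below recomputes them; the function name `access_shares` is hypothetical and introduced only for illustration, and the counts are the ones quoted on the slides.

```python
def access_shares(public: int, restricted: int) -> dict[str, float]:
    """Return the percentage of public and restricted ETDs (hypothetical helper)."""
    total = public + restricted
    if total == 0:
        raise ValueError("no ETDs to compare")
    return {
        "public": 100 * public / total,
        "restricted": 100 * restricted / total,
    }


# Counts as reported on the 2010 and 2011 slides: (public, restricted)
counts = {2010: (2836, 759), 2011: (3086, 768)}

for year, (public, restricted) in counts.items():
    shares = access_shares(public, restricted)
    print(f"{year}: {shares['public']:.1f}% public, "
          f"{shares['restricted']:.1f}% restricted")
```

Rounded to whole percentages, this gives the 21% restricted share quoted for 2010 and the 80% vs. 20% split quoted for 2011.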
11
Data Desiccation
-Overview
-Magick Numbering
-Multiple Data Formats
-Submission Information Package (SIP)
12
Data Desiccation
In the context of the UNT ETDs, data desiccation first involves converting the deposited PDF into a series of image files that serve as the primary access point to the documents online. High-quality JPEG is used as the image format. Magick numbering combines two running sequences of numbers into an eight-digit filename.
13
Table 1. Magick Numbering
Sequence  Pagination      Filename
1         Title Page      000100tp.jpg
2         Copyright page  [...].jpg
3         Abstract        [...].jpg
4         ii              000400ii.jpg
5         iii             00050iii.jpg
6         iv              000600iv.jpg
7         [...]           [...].jpg
8         [...]           [...].jpg
9         [...]           [...].jpg
…
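The intact rows of Table 1 suggest a pattern: a four-digit running sequence number followed by the printed pagination, left-padded with zeros to four characters. The sketch below reproduces those rows under that assumption. The function name `magick_filename`, the `tp` label for the title page (taken from the first row), and the padding rule itself are inferred from the table, not taken from a documented UNT specification.

```python
def magick_filename(sequence: int, pagination: str, ext: str = "jpg") -> str:
    """Build an eight-character page filename (pattern inferred from Table 1).

    sequence   -- running position of the page image in the document
    pagination -- the page label as printed, e.g. "tp", "ii", "iii"
    """
    if not 1 <= sequence <= 9999:
        raise ValueError("sequence must fit in four digits")
    if not 1 <= len(pagination) <= 4:
        raise ValueError("pagination label must be 1 to 4 characters")
    # Four-digit sequence, then the pagination label zero-padded to four characters
    return f"{sequence:04d}{pagination.rjust(4, '0')}.{ext}"


# The rows of Table 1 whose filenames survive intact
for seq, label in [(1, "tp"), (4, "ii"), (5, "iii"), (6, "iv")]:
    print(seq, label, magick_filename(seq, label))
```

Sorting such filenames alphabetically preserves the physical page order, while the second half keeps the printed page label visible in the name.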
14
Multiple Data Formats
PDF: the originally deposited version.
A series of derivatives converted from the original PDF:
-jpg: serves as the primary access point to the documents online
-pro: the proprietary format from the PrimeOCR engine
-xml: a UNT-specific word bounding box file
-txt: ASCII text file converted from the pro format
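As a rough illustration, the four derivative types could be tracked per page as follows. This is a sketch under a stated assumption: that each derivative shares the filename stem of its page image. The slides do not state the naming scheme, and `DERIVATIVE_ROLES` and `derivative_filenames` are hypothetical names.

```python
# Derivative extensions and their roles, as listed on the slide
DERIVATIVE_ROLES = {
    "jpg": "page image, primary online access point",
    "pro": "proprietary PrimeOCR output",
    "xml": "UNT-specific word bounding box file",
    "txt": "ASCII text converted from the pro file",
}


def derivative_filenames(page_stem: str) -> dict[str, str]:
    """Map each derivative extension to a filename for one page.

    Assumes (not confirmed by the slides) that all derivatives of a page
    share the same stem and differ only in extension.
    """
    return {ext: f"{page_stem}.{ext}" for ext in DERIVATIVE_ROLES}


for ext, name in derivative_filenames("000100tp").items():
    print(f"{name}: {DERIVATIVE_ROLES[ext]}")
```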
15
Submission Information Package (SIP)
16
Enhancing UNT’s ETDs Access/Use via Desiccation
-Multiple Formats Access Strategy
-Access by Degree Level
-Access by Country
-Access via Mobile Devices
17
Multiple Formats Access Strategy
In addition to the originally deposited PDF format, the data desiccation process provides and facilitates additional methods of access by:
-exposing the page-level OCR text to an increasing number of search engines
-allowing page-turning interfaces or other interfaces designed for emerging mobile devices
18
Multiple Formats Access …
Longitudinal data will be collected to see if desiccated ETDs receive more use than the older, single-format PDF versions. We are already witnessing an overall increase in access to the ETDs in the UNT Digital Library.
19
Access By Degree Level
In [...], ETDs were accessed 282,240 times, of which [...] were doctoral dissertations and the remainder were master’s theses.
20
Visits from 216 Countries
21
6,000+ visits from Africa
22
Access via Mobile Device in 2009-10
23
Access via Mobile Device in 2010-11
Compared with last year, access via mobile devices almost doubled this year.
24
Number of Pages Viewed
Most visitors (61%) view just one page, and overall more than 90% of users view fewer than 7 pages. Most ETDs have about 200 pages, so downloading an entire ETD may not serve the vast majority of users well.
25
Duration of Visit
Most visits (more than 85%) lasted less than 3 minutes.
26
Visitor Loyalty
About 85% of visitors visited only once.
27
Traffic Sources Overview
28
Referring Sites
Wikipedia is the highest referring site.
29
Search Engines Out of the total of 23 different search engines or sources, Google sent almost 90% of the search
30
Summary References
31
Summary
Given the pressure of reading more in less time, today’s users demand access to various formats regardless of temporal and spatial restrictions and the types of devices used.
Based on the data, users:
-Increasingly use mobile devices
-Come from different countries (with varied bandwidth)
-View one or a few pages
-Visit just once
Understanding user communities, their information needs, and their use behavior will help to move content into the users’ space and facilitate access to and use of ETDs.
32
Summary
The successful management of ETDs requires a multifaceted effort across the entire life-cycle to ensure that ETDs are managed, preserved, and made accessible in the manner that today’s users expect. Over the past year, the UNT Libraries have put great effort into making digital collections more accessible and useful in research processes. Data desiccation, that is, providing multiple format options, facilitates both enhanced and long-term access to the contents of ETDs.
33
References
-The University of North Texas (UNT) ETD-Progress:
-UNT Metadata:
-UNT Theses and Dissertations:
34
Questions? and/or