This presentation discusses the University of North Texas's (UNT) initiatives in enhancing the accessibility and usability of Electronic Theses and Dissertations (ETDs) through a process called data desiccation. Beginning in 1999, UNT became a pioneer in the ETD movement, requiring electronic submissions for graduate work. The Libraries' Digital Projects Unit plays a key role in improving access by converting ETDs into multiple formats, facilitating online access, and enabling mobile device compatibility. Recent statistics show increased usage and access from diverse countries, highlighting the importance of understanding user behavior.
Data Desiccation: Facilitating Long-term Access, Use, and Reuse of ETDs • Daniel Gelaw Alemneh and Mark Edward Phillips • 14th International Symposium on Electronic Theses and Dissertations (ETD-2011), 13-17 Sept. 2011, Cape Town, South Africa
UNT’s ETDs - General Background - Libraries’ Role
Background • The University of North Texas (UNT) began accepting theses and dissertations in electronic format in 1999. • UNT was an early adopter of what became the ETD movement in higher education. • It was one of the first three American universities to require ETDs for graduation.
UNT Libraries’ Role • The UNT Libraries play an active role in facilitating access to UNT’s ETDs. • In 2007 the Digital Projects Unit took on a stewardship role: • Develop appropriate metadata. • Integrate value-added services into the ETDs. • In 2010 we started two retrospective conversion projects: • Digital retro-conversion (in-house project) of pre-1999 theses and dissertations previously available only in paper or microform. • Digital retro-conversion of ETDs (1999 to 2009) previously available only in PDF format.
What makes up UNT’s ETDs? - UNT ETDs Size - By Access Level - By Degree Level
Access Levels of UNT’s ETDs • 1. Public: These ETDs are open; there are no restrictions on these resources. • 2. Restricted: • 2.1 UNT-Community: These ETDs are restricted to users associated with UNT. Users are normally required to log in using their EUID if they are located outside the UNT campus. Restricted ETDs after 2007 have a delay (2-5 years), after which they are moved to "Public". • 2.2 UNT-Strict: These ETDs are restricted to the UNT community. The restriction is strictly enforced: users are always required to log in using their EUIDs, regardless of their location.
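The login rules on this slide can be summarised in a few lines of code. This is a minimal sketch, not UNT's actual implementation: the function name `requires_login` and the lowercase level strings are hypothetical labels chosen for illustration, and the embargo period (2-5 years) is not modelled.

```python
def requires_login(access_level: str, on_campus: bool) -> bool:
    """Return True if a user must log in with an EUID to view an ETD.

    The level names mirror the slide; the function itself is an
    illustrative assumption, not UNT's code.
    """
    if access_level == "public":
        return False          # open to everyone, no restrictions
    if access_level == "unt-community":
        return not on_campus  # login normally needed only off campus
    if access_level == "unt-strict":
        return True           # login always required, regardless of location
    raise ValueError(f"unknown access level: {access_level!r}")
```

For example, `requires_login("unt-community", on_campus=True)` returns `False`, while the same call with `"unt-strict"` returns `True`.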
Data Desiccation - Overview - Magick Numbering - Multiple Data Formats - Submission Information Package (SIP)
Data Desiccation • In the context of the UNT ETDs, data desiccation first involves converting the deposited PDF into a series of image files that serve as the primary access point to the documents online. • High-quality JPEG is used as the image format. • Magick numbering names each file with two running sequences of numbers, which together form an eight-digit filename.
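The slide does not say how the eight digits are split or what each sequence counts. As a rough illustration only, the sketch below assumes two four-digit, zero-padded counters joined into one filename; the function name `magick_name` and the 4+4 split are assumptions, not the documented UNT scheme.

```python
def magick_name(first_seq: int, second_seq: int, ext: str = "jpg") -> str:
    """Build an eight-digit filename from two running sequence numbers.

    Assumption: each sequence occupies four zero-padded digits. The
    source only states that two sequences form an eight-digit name.
    """
    for value in (first_seq, second_seq):
        if not 0 <= value <= 9999:
            raise ValueError("each sequence must fit in four digits")
    return f"{first_seq:04d}{second_seq:04d}.{ext}"
```

Under these assumptions, `magick_name(1, 12)` yields `00010012.jpg`. Zero-padding keeps the files in the intended order when sorted by name.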
Multiple Data Formats • PDF: the originally deposited version. • JPGs: a series of derivatives converted from the original PDF: • jpg: serves as the primary access point to the documents online • pro: the proprietary format from the PrimeOCR engine • xml: a UNT-specific word bounding-box file • txt: an ASCII text file converted from the pro format.
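Because each page is expected to exist in four derivative formats, a simple completeness check is one way to confirm that a conversion run produced everything. The sketch below is hypothetical: the function name `missing_derivatives` and the sample filenames are assumptions, and only the four extensions come from the slide.

```python
from collections import defaultdict
from pathlib import Path

# The four per-page derivative formats listed on the slide.
DERIVATIVE_EXTENSIONS = ("jpg", "pro", "xml", "txt")


def missing_derivatives(filenames):
    """Map each page stem to the derivative extensions it lacks.

    Pages that have all four derivatives are left out of the result.
    """
    found = defaultdict(set)
    for name in filenames:
        path = Path(name)
        ext = path.suffix.lstrip(".").lower()
        if ext in DERIVATIVE_EXTENSIONS:
            found[path.stem].add(ext)
    return {
        stem: [e for e in DERIVATIVE_EXTENSIONS if e not in exts]
        for stem, exts in found.items()
        if len(exts) < len(DERIVATIVE_EXTENSIONS)
    }
```

Given a listing in which one page has only its JPEG, the function reports that page together with the `pro`, `xml`, and `txt` files still to be generated.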
Enhancing UNT’s ETDs Access/Use via Desiccation - Multiple Formats Access Strategy - Access by Degree Level - Access by Country - Access via Mobile Devices
Multiple Formats Access Strategy • In addition to the originally deposited PDF, the data desiccation process opens up further methods of access by: • exposing the page-level OCR text to an increasing number of search engines • allowing page-turning interfaces and other interfaces designed for emerging mobile devices.
Multiple Formats Access … • Longitudinal data will be collected to see if desiccated ETDs receive more use than the older, single-format PDF versions. • We are already witnessing an overall increase in access to the ETDs in the UNT Digital Library.
Summary - References
Summary • Under pressure to read more in less time, today’s users demand access to various formats regardless of time, place, or the type of device used. • Based on the data, users: • increasingly use mobile devices • come from different countries (with varied bandwidth) • view one or a few pages • visit just once. • Understanding user communities, their information needs, and their use behavior will help to move content into the users’ space and facilitate access to and use of ETDs.
Summary • The successful management of ETDs requires a multifaceted effort across the entire life-cycle to ensure that ETDs are managed, preserved, and made accessible in the manner that today’s users expect. • Over the past year, the UNT Libraries have put great effort into making digital collections more accessible and more useful in research processes. • Data desiccation, that is, providing multiple format options, facilitates both enhanced and long-term access to the contents of ETDs.
References • UNT ETD Progress: http://www.library.unt.edu/digitalprojects/procedures/etd/etd-progress • UNT Metadata: http://www.library.unt.edu/digitalprojects/metadata • UNT Theses and Dissertations: http://digital.library.unt.edu/explore/collections/UNTETD/browse/
Questions? Mark.Phillips@unt.edu and/or Daniel.Alemneh@unt.edu