Analysis of URL References in ETDs: A Case Study at the University of North Texas Mark E. Phillips Assistant Dean for Digital Libraries.

Presentation transcript:

Analysis of URL References in ETDs: A Case Study at the University of North Texas
Mark E. Phillips, Assistant Dean for Digital Libraries
Daniel G. Alemneh, Digital Curation Coordinator for Digital Libraries
Brenda Reyes Ayala, Graduate Assistant, UNT Web Archiving Team

Outline
- Background: ETD at UNT
- Curating cited URLs: URL references, linking patterns
- Methods: URL extraction and indexing
- Findings
- Summary

Background: ETD at UNT

UNT & ETD
- The University of North Texas (UNT) began accepting theses and dissertations in electronic format in […].
  - UNT is one of the early adopters of what was to become the ETD movement in higher education.
  - It was one of the first three American universities to require ETDs for graduation.

UNT & ETD
- The UNT Libraries play an active role in facilitating access to UNT’s ETDs.
- The Digital Projects Unit took on a stewardship role:
  - Develop appropriate metadata.
  - Integrate value-added services into the ETDs:
    - Multiple formats (PDF, JPG, […])
    - Related content (datasets, videos, audio such as recitals)
  - Started retrospective conversion projects: an in-house digital retro-conversion of pre-1999 theses and dissertations previously available only in paper or microform.

Visits from 200+ Countries :

Curating Cited URLs

Why a Case Study
The UNT Libraries carried out this research to better understand how the shift to the Web affected the use of Web resources as the research focus, or primary citation target, of theses and dissertations. To answer this question, the authors analyzed how widely ETDs reference Web resources, how referencing differs between academic degree levels, and how it has changed over the past twelve years at UNT.

Degree Level | Total # of Documents | # of Documents without URLs | # of ETDs that contain URLs | % of ETDs that contain URLs | Average URLs per item
Doctoral | 2,[…] | […] | […] | […]% | […]
Master’s | 1,[…] | […] | […] | […]% | […]
Total | 4,335 | 1,622 | 2,[…] | […]% | 12.83

UNT ETDs breakdown of several ranges of URLs per document
URL Range | # of ETDs | % of ETDs | Cumulative %
0 | 1,[…] | […]% | […]
[…] | […] | […]% | 46.62%
2-9 | 1,[…] | […]% | 76.42%
[…] | […] | […]% | 94.16%
[…] | […] | […]% | 100.00%

UNT ETDs breakdown of URLs per document
The average number of URLs per document in the overall UNT ETD dataset is 8.03, with a standard deviation of […]. These numbers represent documents that contained from 0 to 809 URLs each. Removing the documents that did not contain URLs and recomputing the average changed it to […] URLs per document, with a standard deviation of […].
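The relationship between the two averages can be checked with a little arithmetic. The sketch below uses only figures quoted on the slides (4,335 documents, 1,622 without URLs, a mean of 8.03) and assumes that every document falls into exactly one of the two groups. Because documents with no URLs contribute nothing to the URL total, dropping them changes only the denominator. The variable names are illustrative, and 8.03 is a rounded figure, so the result is approximate.

```python
# Figures quoted on the slides.
total_docs = 4335          # all ETDs in the dataset
docs_without_urls = 1622   # ETDs that contain no URLs
mean_all = 8.03            # mean URLs per document over all ETDs (rounded)

# Assumption: each ETD either contains URLs or does not.
docs_with_urls = total_docs - docs_without_urls

# Zero-URL documents add nothing to the total number of URLs,
# so removing them only shrinks the denominator of the mean.
approx_total_urls = mean_all * total_docs
mean_with_urls = approx_total_urls / docs_with_urls
share_with_urls = docs_with_urls / total_docs

print(docs_with_urls)                    # documents that contain URLs
print(round(mean_with_urls, 2))          # mean URLs per URL-bearing document
print(round(100 * share_with_urls, 1))   # percent of ETDs that contain URLs
```

Under these assumptions the recomputed mean comes out at about 12.83, which agrees with the "Average URLs per item" value in the Total row of the degree-level table, and the share of documents with URLs is in line with the roughly 62% reported in the summary.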

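Top-Level Domain Reference
Top-Level Domain | # of Documents | % of Documents with URLs
com | 1,[…] | […]%
org | 1,[…] | […]%
edu | 1,[…] | […]%
gov | 1,[…] | […]%
net | […] | […]%
us | […] | […]%
uk | […] | […]%
ca | […] | […]%
au | […] | […]%
de | […] | […]%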

Ten Most Referenced Second-Level Sub-Domains
Second-Level Sub-Domain | # of Documents | % of Documents with URLs
ed.gov | […] | […]%
state.tx.us | […] | […]%
unt.edu | […] | […]%
census.gov | […] | […]%
cdc.gov | […] | […]%
wikipedia.org | 96 | 3.54%
nih.gov | 84 | 3.10%
utexas.edu | 80 | 2.95%
microsoft.com | 75 | 2.76%
nytimes.com | 71 | 2.62%
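The slides do not describe how the domain tallies were produced. As a minimal sketch, assuming each ETD has already been reduced to a list of well-formed URLs, per-document domain counts could be computed as follows. The function names and sample data are hypothetical, and taking the last two host labels is a naive stand-in for a real second-level-domain rule.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_of(url):
    """Return the lowercase host name of a URL, or None if it has none."""
    host = urlsplit(url).hostname
    return host.lower() if host else None


def top_level_domain(host):
    """Last label of the host, e.g. 'gov' for 'www.census.gov'."""
    return host.rsplit(".", 1)[-1]


def second_level_domain(host):
    """Naive rule: the last two labels of the host."""
    return ".".join(host.split(".")[-2:])


def documents_per_domain(doc_urls, key):
    """Count how many documents reference each domain at least once."""
    counts = Counter()
    for urls in doc_urls.values():
        hosts = {host_of(u) for u in urls}
        hosts.discard(None)
        # A set ensures each document counts once per domain.
        counts.update({key(h) for h in hosts})
    return counts


# Hypothetical sample data: document id -> URLs found in that document.
docs = {
    "etd-1": ["http://www.census.gov/data", "https://nces.ed.gov/pubs",
              "http://www.census.gov/other"],
    "etd-2": ["http://en.wikipedia.org/wiki/Thesis", "http://www.unt.edu/"],
    "etd-3": ["http://www.cdc.gov/nchs"],
}

tld_counts = documents_per_domain(docs, top_level_domain)
sld_counts = documents_per_domain(docs, second_level_domain)
print(tld_counts.most_common())
print(sld_counts.most_common())
```

Counting documents rather than raw URL occurrences matches the "# of Documents" columns in the tables. The two-label rule would reduce a host under state.tx.us to tx.us, so reproducing that row would need a suffix-aware rule, for example one based on the Public Suffix List.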

No. of ETDs with URLs by domain names
Year | # of ETDs | # of ETDs with URLs | % of ETDs with URLs
[…] (14 rows; the only value that survives is *42.44%* in the eleventh row)

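Longitudinal Data For ETDs with URLs
Year | # of ETDs with URLs | .com # (%) | .org # (%) | .edu # (%) | .gov # (%) | .net # (%)
[…] | […] | […] | 9 (32.1%) | 11 (39.3%) | 4 (14.3%) | […]
[…] | […] | […] | 70 (55.1%) | 53 (41.7%) | 41 (32.3%) | 19 (15.0%)
[…] | […] | […] | 67 (57.8%) | 57 (49.1%) | 39 (33.6%) | 20 (17.2%)
[…] | […] | […] | 73 (50.0%) | 62 (42.5%) | 47 (32.2%) | 18 (12.3%)
[…] | […] | […] | 96 (48.5%) | 84 (42.4%) | 70 (35.4%) | 24 (12.1%)
[…] | […] | […] | 89 (49.2%) | 84 (46.4%) | 66 (36.5%) | 23 (12.7%)
[…] | […] | […] | […] | […] | 83 (41.7%) | 43 (21.6%)
[…] | […] | […] | […] | […] | 98 (41.7%) | 40 (17.0%)
[…] | […] | […] | […] | […] | […] | 35 (16.6%)
[…] | […] | […] | […] | 99 (41.0%) | 91 (37.6%) | 40 (16.5%)
[…] | […] | […] | 83 (62.9%) | 69 (52.3%) | 50 (37.9%) | 25 (19.0%)
[…] | […] | […] | […] | […] | […] | 50 (17.5%)
[…] | […] | […] | […] | […] | […] | 66 (20.0%)
[…] | […] | […] | […] | 94 (40.9%) | […] | 38 (16.5%)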

Summary
- 62% of the publications analyzed in this work included URLs.
  - Doctoral-level publications: 68.2%
  - Master’s-level publications: 55.8%
- The percentage of ETDs that include URLs increased consistently, from 23% in 1999 to almost 80% in 2012.

Summary (continued)
- Across the years, more doctoral dissertations than master’s theses referenced URLs.
- The .gov domain is the fourth most referenced top-level domain; however, it accounts for nearly half of the top ten most referenced domains.
- Further investigation at the domain or subdomain level could reveal additional patterns and more content-based information about the URL references.

Looking Ahead

Future Works
- The URLs referenced in a large corpus of ETDs may offer interesting insight into the subjects, disciplines, and patterns in these documents, which warrants further investigation.
- This research provides a preliminary framework of technical methods for future analysis of the data. A deeper investigation into the scope of the target URLs across an entire ETD corpus could provide a better understanding of content-based URL linking patterns.
- An investigation into how specific disciplines or subject areas reference URLs in their ETDs would help identify areas with particularly high or low levels of URL linking.
- An analysis of URL inclusion in ETDs across institutions, and even nations, would be a logical follow-on investigation that could show whether higher-level trends exist in ETDs.

Future Works (continued)
- Finally, further investigation into URL extraction from text would benefit the ETD community in several ways:
  - It would allow libraries to extract URLs not only from born-digital ETDs but also from theses and dissertations that are being retrospectively digitized in institutions that have not had longstanding ETD policies.
  - It would allow for investigation into ways of normalizing or completing malformed URLs, which may provide for better analysis of the content referenced and its availability in Web archives.
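To make the extraction and normalization ideas concrete, here is a minimal sketch of pulling URLs out of plain text and applying two simple repairs. It is not the method used in the study, which the slides do not detail; the regular expression, the punctuation set, and the function name are all assumptions chosen for illustration.

```python
import re

# Assumed pattern: an http(s) URL or a bare "www." host, running up to the
# next whitespace, angle bracket, or double quote.
URL_PATTERN = re.compile(r'(?:https?://|www\.)[^\s<>"]+', re.IGNORECASE)

# Characters that usually belong to the surrounding sentence, not the URL.
TRAILING_PUNCTUATION = ".,;:!?)]}'\""


def extract_urls(text):
    """Return the URLs found in text, lightly normalized."""
    urls = []
    for match in URL_PATTERN.findall(text):
        # Repair 1: drop sentence punctuation stuck to the end of the URL.
        url = match.rstrip(TRAILING_PUNCTUATION)
        # Repair 2: complete scheme-less URLs so they can be parsed later.
        if url.lower().startswith("www."):
            url = "http://" + url
        urls.append(url)
    return urls


sample = ('See http://www.census.gov/data. Also (www.unt.edu/etd), '
          'and "https://www.cdc.gov/nchs/".')
print(extract_urls(sample))
```

This sketch has known gaps: stripping a trailing parenthesis breaks URLs that legitimately end in one, and URLs split across lines by OCR or hyphenation in retrospectively digitized documents are not rejoined. Both are the kind of malformed-URL handling the slide proposes to investigate.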

Thank You! Ameseginalehu! Gracias!