Improving the ETD Landscape. ETD 2014: 17th Int’l Symposium on ETDs, Leicester, England. Edward A. Fox, Executive Director, NDLTD

Presentation transcript:

Improving the ETD Landscape
ETD 2014: 17th Int’l Symposium on ETDs, Leicester, England
Edward A. Fox, Executive Director, NDLTD
Virginia Tech, Blacksburg, VA, USA

Outline
Acknowledgments
Why, what, who, how
Improving, quality
Related technical contributions
DLs and DL curriculum

Acknowledgments
Family, mentors, teachers, students
Dissertations: Sung Hee Park, Venkat Srinivasan, Seungwon Yang
NSF: IIS , ,
All those working with ETDs
NDLTD, including its Members, Board, Committees, and Working Groups

Why, What, Who?
Why?
  – enhance graduate education
  – expand global research collaboration
What?
  – help students communicate more effectively
  – get ETDs for all TDs: next goal 5 million
  – help make ETDs open, accessible, preserved
Who?
  – levels: students, faculty, staff, (grad) administrators
  – professions: CS, IT, LIS, librarians, archivists

How?
Authoring systems, tools, methods
Data and auxiliary information management aids
Metadata creation software and techniques
Submission, approval, refinement workflows
Local access and information management
Sharing, disseminating, discovering
  – OAI, data providers, harvesting
  – Regional/national, global institutions
Services: access, preservation, adding value
Add back files
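The "OAI, data providers, harvesting" item can be illustrated with a minimal sketch of an OAI-PMH harvester step. It assumes a data provider that exposes unqualified Dublin Core (`oai_dc`); the base URL in the usage note, the helper names, and the `SAMPLE` response are hypothetical and only show the shape of the exchange. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (helper name is hypothetical)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb only, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (list of dc:title values, resumption token or None)."""
    root = ET.fromstring(xml_text)
    titles = [el.text for el in root.iter(DC_NS + "title")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return titles, token


# Hand-written sample response, not output from any real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""
```

A harvester would call `list_records_url("https://etd.example.edu/oai")` (a placeholder address), parse the response, and repeat with the returned token until none is left.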

Improving – 1 of 2
Context: Quality frameworks, references on quality
Guidelines and documentation for all of this
Works
  – XML + PDF + raw/original representations
  – Multimedia, software, simulations, websites, dynamic content
Data, auxiliary information, references/bibliographies
  – Reproducibility
Metadata
  – Completeness: subject classification, faculty by role
  – Authority info

Improving – 2 of 2
Local services
  – Training, assistance
  – IR, archives, archival consortia
Global services
  – Browse, faceted search, full-text search
  – Recommend, CLIR, CBIR, summaries, topics
  – Linked data, hyperlinks, citation linking
  – Alerts, notifications, RSS feeds, filtering

Borgman et al. Information Life Cycle (adapted)
Authoring, Modifying, Classifying, Tagging, Recommending, Indexing, Storing, Retrieving, Distributing, Networking, Retention / Mining, Filtering, Using, Downloading, Citing, Discovering

Quality and the Information Life Cycle

Quality Dimensions

Digital Library Service Taxonomy

Improve related movements
Make related efforts work for graduate researchers, ETDs, and university ETD activities:
Open access, institutional repositories
Sharing references and citations: Zotero, …
Sharing data, datasets, workflows; reproducible science: reproducibleresearch.net, …
Building author profiles: ORCID, ISNI, …
Digital libraries and DL education (DL2014)

Related technical contributions
Broadly: new/better systems, user/usage studies, added services, improved practices
Automatically assign topics or categories to ETDs or to portions (e.g., chapters) to aid browsing and (faceted) searching
Build a union reference collection: by aiding authors (e.g., Hiberlink) and/or by automatic ETD text mining
Enhanced information retrieval: cross-language IR, content-based IR (image/video/music) …

Topic determination
Given a document, extract or generate a generalized description of its topics
Statistical approaches, e.g., LDA
Knowledge-based approaches, e.g., Xpantrac
  – Take a webpage or document
  – Use portions of it to build queries to a knowledge source (Web, Wikipedia, or an ETD collection)
  – Combine, analyze, and summarize the results
  – Seungwon Yang, "Automatic Identification of Topic Tags from Texts Based on Expansion-Extraction Approach", Jan. 2014, Ph.D. dissertation
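The three knowledge-based steps above (build queries from portions of the text, query a knowledge source, combine and summarize the results) can be sketched roughly as follows. This is not Xpantrac itself: the in-memory `knowledge_source` list stands in for the Web or Wikipedia, and the chunking, overlap scoring, stopword list, and every function name are illustrative assumptions.

```python
from collections import Counter
import re

# Tiny illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with", "are", "by"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_queries(text, size=3):
    """Expansion, step 1: split the token stream into chunks, one query per chunk."""
    tokens = tokenize(text)
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def search(query, knowledge_source, top_k=2):
    """Expansion, step 2: rank entries by the number of query terms they contain."""
    scored = []
    for doc in knowledge_source:
        overlap = len(set(query) & set(tokenize(doc)))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def topic_tags(text, knowledge_source, n_tags=3):
    """Extraction: count terms across all retrieved entries and keep the most frequent."""
    counts = Counter()
    for query in build_queries(text):
        for hit in search(query, knowledge_source):
            counts.update(set(tokenize(hit)))  # each term counted once per retrieved hit
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:n_tags]]


KNOWLEDGE_SOURCE = [
    "Digital libraries provide metadata and search services",
    "Metadata harvesting with OAI supports digital libraries",
    "Cooking pasta requires boiling water",
]
```

With this toy source, `topic_tags("Harvesting metadata for digital theses", KNOWLEDGE_SOURCE)` yields tags drawn from the retrieved entries, so a tag such as "libraries" can appear even though it is absent from the input text, which is the point of the expansion step.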

ETD Classification: Venkat Srinivasan
Enhance metadata by adding subject categories
Hierarchical classification of ETDs (and chapters thereof) using Library of Congress categories
Training data
  – OCLC’s WorldCat: records from 1M books have good labels but little metadata; labels on ETDs not usable
  – Results coming from queries each designed to describe a category
  – Need to balance negative and positive examples throughout the LoC taxonomy

ETD Classification: Algorithm Pipeline
Diagram components: Category Tree; Google; Document Sets; Training Sets; Naïve Bayes Classifiers; ETD Collection; Categorized ETDs; Web Interface
Diagram annotations:
  – Category label for each node used as query
  – Top 50 webpages (for each node in the tree)
  – Cleanup (stemming, stopword removal, etc.)
  – Training
  – Level-wise categorization
  – ETD metadata used for categorization
  – ETDs categorized into a node of the category tree (after classification)
  – Browsing
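As a rough sketch of level-wise categorization with Naïve Bayes classifiers, the code below trains one small multinomial classifier per internal node of a category tree and walks an ETD's text down the tree. The toy categories, the training strings, the add-one smoothing, and all names are assumptions for illustration; the pipeline on the slide trains on retrieved webpages after cleanup, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of training docs
        self.vocab = set()

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            total_words = sum(self.word_counts[label].values())
            # Log prior plus smoothed log likelihood of every word.
            score = math.log(self.doc_counts[label] / total_docs)
            for w in words:
                count = self.word_counts[label][w]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify_levelwise(text, classifiers, root="ROOT"):
    """Walk down the tree: at each node, pick the best child, then descend."""
    path, node = [], root
    while node in classifiers:  # leaf nodes have no classifier
        node = classifiers[node].classify(text)
        path.append(node)
    return path


# Toy two-level tree (hypothetical labels, not Library of Congress categories).
root_clf = NaiveBayes()
root_clf.train("Science", "physics chemistry biology experiment")
root_clf.train("Technology", "software engineering computer network")
science_clf = NaiveBayes()
science_clf.train("Physics", "quantum particle energy")
science_clf.train("Biology", "cell gene protein")
CLASSIFIERS = {"ROOT": root_clf, "Science": science_clf}
```

Calling `classify_levelwise("gene expression in cell biology experiment", CLASSIFIERS)` returns the path from the top level down to the chosen leaf. One design consequence of the level-wise approach is that an error at an upper level cannot be corrected further down.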

Reference Extraction and Databasing
1. How can we implement a metadata schema for bibliographic information?
2. What machine learning methods are effective for extracting reference sections, including footnotes and chapter references?
Sung Hee Park, "Discipline-Independent Text Information Extraction from Heterogeneous Styled References Using Knowledge from the Web", June 2013, VT CS Ph.D. dissertation

Dataflow of Reference Section Extraction
Diagram components: ETD in PDF; Pdf2txt; Feature Extraction (shown twice); Training data; Tagged data; Learning; Reference Section Extraction
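The feature-extraction stage in this dataflow might look something like the sketch below, which computes a few surface features per text line. The features themselves are guesses at what could be useful, not the ones used in the dissertation, and the fixed threshold in `looks_like_reference` is only a stand-in for the learned model that the diagram trains from tagged data.

```python
import re


def line_features(line):
    """Compute simple boolean surface features for one line of extracted text."""
    return {
        # A four-digit year such as 1998 or 2013.
        "has_year": bool(re.search(r"\b(19|20)\d{2}\b", line)),
        # A leading citation marker such as "[12] " or "3. ".
        "starts_with_marker": bool(re.match(r"\s*(\[\d+\]|\d+\.)\s", line)),
        # A page indication such as "p. 5" or "pp. 10-20".
        "has_pages": bool(re.search(r"\bpp?\.\s*\d+", line)),
        # An author initial such as "E. ".
        "has_initials": bool(re.search(r"\b[A-Z]\.\s", line)),
    }


def looks_like_reference(line, threshold=2):
    """Stand-in for a trained classifier: count how many features fire."""
    return sum(line_features(line).values()) >= threshold


def extract_reference_lines(text):
    """Keep only the lines that the stand-in classifier accepts."""
    return [ln for ln in text.splitlines() if looks_like_reference(ln)]


SAMPLE_TEXT = """This chapter reviews prior work on digital libraries.
[1] E. A. Fox. Digital Libraries. 2012, pp. 1-10.
Results are discussed in the next section.
[2] S. H. Park. Reference Extraction. 2013."""
```

In a learned version, the dictionaries returned by `line_features` for the tagged training lines would be fed to a classifier, and the same function would be applied to new ETD text at extraction time, which matches the two Feature Extraction boxes in the diagram.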

ETD References: System Architecture
Diagram components: ETD Repository; Users; Web App (e.g., ETD-db); Metadata with References; Searching, Browsing, Manipulating; Extracting Reference Sections; Union ETD References ?

Discovery, Search Engines, Info. Retrieval (to be extended for images, etc.)
Diagram components: Documents (D); Query (Q); Search; Ranking; Results
Best matches (Q with D) selected
Quality of many systems is low, with recall and precision each only around 0.5, as opposed to the ideal of precision 1 at recall 1.
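For readers unfamiliar with the two measures, here is a minimal sketch of how precision and recall are computed for a single query. The document identifiers are made up, and the function name is an assumption.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# A system returns four documents; two of the four relevant ones are among them.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5", "d6"])
```

Here both values come out at 0.5: half of what was returned is relevant, and half of what is relevant was returned.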

Search Module Detail (features can be about text, images, …)
Diagram components: Query Q; Feature vector Q; Document D1; Feature vectors D1; Similarity Function; S = Sim(Q,D1)
In CBIR (Content Based Image Retrieval), search is based on visual content of images
  – Color
  – Shape
  – Texture
  …
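The slide leaves the similarity function unspecified. One common choice is cosine similarity over feature vectors, sketched here with word-count vectors for text; for CBIR the same `sim` function could be applied to color, shape, or texture features instead. All names are illustrative.

```python
import math
from collections import Counter


def feature_vector(text):
    """Represent text as a sparse vector of word counts."""
    return Counter(text.lower().split())


def sim(q, d):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(q[t] * d[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def rank(query, documents):
    """Score every document against the query and sort best match first."""
    qv = feature_vector(query)
    scored = [(sim(qv, feature_vector(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


DOCS = ["electronic theses and dissertations", "image texture features"]
RESULTS = rank("electronic theses", DOCS)
```

`RESULTS` lists each document with its score S = Sim(Q, D), so the first entry is the best match for the query.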

DL Definitions: Informal 5S
DLs are complex systems that
  – help satisfy info needs of users (societies)
  – provide info services (scenarios)
  – organize info in usable ways (structures)
  – present info in usable ways (spaces)
  – communicate info with users (streams)
Use this as: checklist, design guidelines, basis for formal description, specification for software implementation; e.g., Spaces help re GIS, VR

Digital Library Books
Edward A. Fox and Jonathan P. Leidig, eds. Digital Library Applications: CBIR, Education, Social Networks, eScience/Simulation, and GIS. Morgan & Claypool Publishers, 2014, 175 p.
Edward A. Fox and Ricardo da Silva Torres, eds. Digital Library Technologies: Complex Objects, Annotation, Ontologies, Classification, Extraction, and Security. Morgan & Claypool, 2014, 205 p.
Rao Shen, Marcos Andre Goncalves, and Edward A. Fox. Key Issues Regarding Digital Libraries: Evaluation and Integration. Morgan & Claypool, 2013, 110 p.
Edward A. Fox, Marcos Andre Goncalves, and Rao Shen. Theoretical Foundations for Digital Libraries: The 5S (Societies, Scenarios, Spaces, Structures, Streams) Approach. Morgan & Claypool, 2012, 180 p., supplementary website

DL Curriculum Project
NSF awards to VT and UNC-CH: CS and LIS
Project server:
Wikiversity: ital_Libraries
Table 1: Core DL Curriculum
Table 2: Information Retrieval Packages
Table 3: LucidWorks Big Data Software
Table 4: Multimedia Software

DL Curriculum Module Template
1. Module name
2. Scope
3. Learning objectives
4. 5S characteristics of the module (streams, structures, spaces, scenarios, society)
5. Level of effort required (in-class and out-of-class time required for students)
6. Relationships with other modules (flow between modules)
7. Prerequisite knowledge/skills required (what the students need to know prior to beginning the module; completion optional; complete only if prerequisite knowledge/skills are not included in other modules)
8. Introductory remedial instruction (the body of knowledge to be taught for the prerequisite knowledge/skills required; completion optional)
9. Body of knowledge (theory + practice; an outline that could be used as the basis for class lectures)
10. Resources (required readings for students; additional suggested readings for instructor and students)
11. Exercises / Learning activities
12. Evaluation of learning objective achievement (graded exercises or assignments)
13. Glossary
14. Additional useful links
15. Contributors (authors of module, reviewers of module)

DL Curriculum Framework

DL Curriculum Modules - examples
Module 1-b: History of digital libraries and library automation
Module 2-c: File Formats, Transformation, and Migration
Module 3-b: Digitization
Module 4-b: Metadata
Module 5-a: Architecture overviews
…

Summary Scene

Conclusion: Improving together
Who will help?
What can we do?
What knowledge and education is needed?
What connections, integrations, collaborations can help with ETDs?
Please comment and share! – Ed Fox