Published by Alban Morgan. Modified over 9 years ago.
1
1 Symposium: Open Access to Information
Panel 2: Open Access & Institutional Repositories
24 August 2006, Brasilia
Digital Libraries, Electronic Theses and Dissertations (ETDs), and NDLTD
http://fox.cs.vt.edu/talks/2006/20060824IBICTp2
Edward A. Fox, fox@vt.edu
Executive Director, NDLTD
Chair, IEEE-CS Tech. Committee on Digital Libraries
Professor, Department of Computer Science
Director, Digital Library Research Laboratory
Virginia Tech, Blacksburg, VA 24061 USA
2
2 Outline Key Ideas Acknowledgements Digital Libraries DLs & Scholarly Communication Institutional Repositories NDLTD Summary DL Futures
3
3 Key Ideas - Overview Theorem 1: Supporters of Open Access should support NDLTD. Theorem 2: 5S can guide us to better support of Open Access.
4
4 Acknowledgements Students Faculty, Staff Collaborators Support Mentors
5
5 Acknowledgements: Students Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …
6
6 Acknowledgements: Faculty, Staff Lillian Cassel, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Douglas Knight, Deborah Knox, Aaron Krowne, Alberto Laender, Gail McMillan, Claudia Medeiros, Manuel Perez, Naren Ramakrishnan, Layne Watson, …
7
7 Other Collaborators (Selected) Brazil: FUA, IBICT, UFMG, UNICAMP, USP Case Western Reserve University Emory, Notre Dame, Oregon State Germany: Humboldt U., U. Oldenburg Mexico: UDLA (Puebla), Monterrey College of NJ, Hofstra, Penn State, Villanova University of Arizona University of Florida, Univ. of Illinois University of Virginia VTLS (slides on digital repositories, NDLTD)
8
Acknowledgements: Support Course: UNESCO, CETREDE, IFLA-LAC, AUGM, CLEI, UFC Sponsors: ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579, 0535057; ITR-0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS
9
9 Acknowledgements - Mentors JCR Licklider – undergrad advisor (1969-71) –Author in 1965 of “Libraries of the Future” –Before, at ARPA, funded start of Internet Michael Kessler – BS thesis advisor –Project TIP (technical information project) –Defined bibliographic coupling Gerard Salton – graduate advisor (1978-83) –“Father of Information Retrieval”
10
10 Digital Libraries Definitions DL Manifesto – Reference Model Book in process (Fox & Gonçalves), 5S DL Curriculum Project
11
11 DL Definitions - 1 “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.” Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003
12
12 DL Definitions - 2 “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities.” Waters, D.J. CLIR Issues, July/August 1998 www.clir.org/pubs/issues/issues04.html
13
13 DL Definitions - 3 Issues and Spectra –Collection vs. Institution –Content vs. System –Access vs. Preservation –“Free” vs. Quality –Managed vs. Comprehensive –Centralized vs. Distributed
14
14 DL Definitions - 4 NOT a “digitized library” NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library IS a new way to deal with knowledge –Authoring, Self-archiving, Collecting, –Organizing, Preserving, –Accessing, Propagating, Re-using
16
16 DL Manifesto - 1 DL Reference Model In support of the future European Digital Library Developed by team connected with DELOS (Candela, Castelli, Ioannidis, Koutrika, Meghini, Pagano, Ross, Schek, Schuldt) Draft 2.2 presented in Frascati, near Rome, June 2006 – 79 pages Could be integrated with work of DLF, JISC, etc.
17
17 DL Manifesto – 2: 3 Tiers
18
18 DL Manifesto – 3: Main Concepts
19
19 DL Manifesto – 4: Actor Roles
20
20 Fox & Gonçalves DL Book Parts Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” Part 2 – Higher DL Constructs Part 3 – Advanced Topics Appendix
21
21 Book Parts and Chapters - 1 Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” –Ch. 2: Streams –Ch. 3: Structures –Ch. 4: Spaces –Ch. 5: Scenarios –Ch. 6: Societies
22
22 Informal 5S & DL Definitions DLs are complex systems that help satisfy info needs of users (societies) provide info services (scenarios) organize info in usable ways (structures) present info in usable ways (spaces) communicate info with users (streams)
23
23 A Minimal DL in the 5S Framework
[Figure: a Minimal DL composed of a Repository, Collection, Digital Object, Metadata Catalog, Descriptive Metadata Specification, Structural Metadata Specification, and Structured Stream, together with services (indexing, browsing, searching, hypertext), laid out under the five Ss: Streams, Structures, Spaces, Scenarios, Societies]
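A minimal sketch of how the parts named on this slide might fit together, assuming simple in-memory Python types. Every class, field, and method name here (DigitalObject, MinimalDL, add, search, browse) is a hypothetical illustration and not the formal 5S definition from the book appendix.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical digital object: named streams plus structural metadata."""
    handle: str
    streams: dict                                   # stream name -> raw content
    structure: dict = field(default_factory=dict)   # e.g. chapter -> sections


@dataclass
class MinimalDL:
    """Illustrative minimal DL: a collection, a metadata catalog, and services."""
    collection: dict = field(default_factory=dict)  # handle -> DigitalObject
    catalog: dict = field(default_factory=dict)     # handle -> descriptive metadata
    index: dict = field(default_factory=dict)       # term -> set of handles

    def add(self, obj, metadata):
        """Store the object, catalog its metadata, and index the metadata terms."""
        self.collection[obj.handle] = obj
        self.catalog[obj.handle] = metadata
        for value in metadata.values():
            for term in value.lower().split():
                self.index.setdefault(term, set()).add(obj.handle)

    def search(self, query):
        """Searching service: handles whose metadata contain every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        return sorted(hits)

    def browse(self, field_name):
        """Browsing service: group handles by the value of one metadata field."""
        groups = {}
        for handle, metadata in sorted(self.catalog.items()):
            groups.setdefault(metadata.get(field_name, ""), []).append(handle)
        return groups
```

In this sketch the streams and structure live on the object, the index and browse groups stand in for spaces, the three methods are scenarios, and the callers are the society being served.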
24
24 Book Parts and Chapters - 2 Part 2 – Higher DL Constructs –Ch. 7: Collections –Ch. 8: Catalogs –Ch. 9: Repositories and Archives –Ch. 10: Services –Ch. 11: Systems –Ch. 12: Case Studies
25
25 Book Parts and Chapters - 3 Part 3 – Advanced Topics –Ch. 13: Quality –Ch. 14: Integration –Ch. 15: How to build a digital library –Ch. 16: Research Challenges, Future Perspectives Appendix –A: Mathematical preliminaries –B: Formal Definitions: Ss –C: Formal Definitions: DL terms, Minimal DL –D: Formal Definitions: Archeological DL –E: Glossary of terms, mappings
26
26 DL Curriculum Framework
27
27 Project Teams/NSF Grant Project Team at VT (IIS-0535057): –PI: Dr. Edward A. Fox (fox@vt.edu) –GRA: Seungwon Yang (seungwon@vt.edu) Project Team at UNC-CH (IIS-0535060): –Co-PI: Dr. Barbara Wildemuth (wildem@ils.unc.edu) –Co-PI: Dr. Jeffrey Pomerantz (pomerantz@unc.edu) –GRA: Sanghee Oh (shoh@email.unc.edu)
28
28 DLs & Scholarly Communication Asynch Information Life Cycle Flattening Author skills, toward Semantic Web Crossing the Chasm OAI
29
29 Asynchronous, Digital Library Mediated Scholarly Communication Different time and/or place
30
30 Information Life Cycle Authoring Modifying Organizing Indexing Storing Retrieving Distributing Networking Retention / Mining Accessing Filtering Using Creating
31
31 Digital Libraries Shorten the Chain from: Editor, Publisher, A&I, Consolidator, Library, Reviewer
32
32 DLs Shorten the Chain to: Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian
33
33 Important skills for authors Authoring (Word Processing -> e-pub) Rendering, presenting Tagging, Markup (XML, SGML) “Semi-structured information” Dual-publishing, eBooks Styles (XSL, XSLT) Structured queries
38
38 OAI – Repository Perspective Required: Protocol DO MDO
39
39 OAI – Black Box Perspective [Figure: seven open archives, labeled OA 1 to OA 7]
40
40 The World According to OAI
Service Providers: Discovery, Current Awareness, Preservation
Metadata harvesting
Data Providers
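A hedged sketch of the service-provider side of metadata harvesting, using only the Python standard library. It builds an OAI-PMH ListRecords request and parses a response; the base URL, set name, and function names are hypothetical, and no network call is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"   # OAI-PMH 2.0 namespace
DC = "{http://purl.org/dc/elements/1.1/}"        # Dublin Core element namespace


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL for a data provider."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        records.append({
            "identifier": header.findtext(OAI + "identifier"),
            "titles": [t.text for t in rec.iter(DC + "title")],
        })
    token = root.findtext(f".//{OAI}resumptionToken")
    return records, (token or None)
```

A harvester would fetch the first URL, parse the response, and repeat with the returned token until none is left; services such as discovery or current awareness would then be built on the harvested metadata.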
41
41 Institutional Repositories Definitions, Goals Eprints DSpace Fedora, VITAL Comparisons ODL + 5S Suite (not shown)
42
42 Institutional Repositories - 1 “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.” Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA www.arl.org/sparc/IR/IR_Guide_v1.pdf
43
43 Institutional Repositories - 2 “A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.” Lynch, C.A. In ARL Bimonthly Report 226, pp. 1-7, Feb. 2003, www.arl.org/newsltr/226/ir.html
44
44 What is a Digital Object Repository? Also called: digital rep., digital asset rep., institutional repository Stores and maintains digital objects (assets) Provides external interface for Digital Objects Creation, Modification, Access Enforces access policies Provides for content type disseminations Adapted from Slide by V. Chachra, VTLS
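The responsibilities listed on this slide (store objects, expose creation, modification, and access, enforce access policies) can be sketched as a small interface. This is an illustrative assumption, not the design of any named product; ObjectRepository, AccessDenied, and the sample policy are hypothetical names.

```python
class AccessDenied(Exception):
    """Raised when the access policy rejects a request."""


class ObjectRepository:
    """Hypothetical in-memory repository with a pluggable access policy."""

    def __init__(self, policy):
        self._objects = {}
        self._policy = policy  # callable(user, action, object_id) -> bool

    def _check(self, user, action, object_id):
        if not self._policy(user, action, object_id):
            raise AccessDenied(f"{user} may not {action} {object_id}")

    def create(self, user, object_id, content):
        self._check(user, "create", object_id)
        if object_id in self._objects:
            raise KeyError(f"{object_id} already exists")
        self._objects[object_id] = content

    def modify(self, user, object_id, content):
        self._check(user, "modify", object_id)
        if object_id not in self._objects:
            raise KeyError(object_id)
        self._objects[object_id] = content

    def access(self, user, object_id):
        self._check(user, "access", object_id)
        return self._objects[object_id]


def staff_write_public_read(user, action, object_id):
    """Sample policy: anyone may read; only the 'staff' user may write."""
    return action == "access" or user == "staff"
```

Keeping the policy as a separate callable means the same repository code can serve an open-access collection or a restricted one.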
45
45 Goals of Institutional Repositories (by Stevan Harnad, U. Southampton) Self Archiving of Institutional Research Theses and Dissertations (VTLS NDLTD Project) Article preprints and post prints Internal documents and maps Management of digital collections Preservation of materials – decentralized approach Housing of teaching materials Electronic Publishing of journals, books, posters, maps, audio, video and other multimedia objects Adapted from Slide by V. Chachra, VTLS
54
54 What is Fedora™? Flexible Extensible Digital Object Repository Architecture. Slides courtesy Vinod Chachra of VTLS
55
55 History of Fedora™
1997-Present
–DARPA and NSF-funded research project at Cornell (conceptual framework developed by Sandra Payette and Carl Lagoze)
–Reference implementation developed at Cornell
1999-2001
–University of Virginia digital library prototype (Thornton Staples and Ross Wayland)
2002-Present
–Andrew W. Mellon Foundation granted Virginia and Cornell $1 million to develop a production-quality Fedora system
–Fedora 1.0 released in May 2003 as Open Source under the Mozilla Public License
56
56 Fedora™ Terms Metadata Digital Objects (data) Complex Objects (objects consisting of many objects in a complex/hierarchical relationship) Content (Data and Metadata together) Data-streams (are content for dissemination) Disseminators (are services) – A dissemination is defined as a stream of data that manifests a view of the digital object's content.
57
57 Digital Object w. multiple datastreams [Figure: a Digital Object whose datastreams include DC, EAD, and Admin Metadata]
58
58 Example Disseminators
Digital object: Persistent ID (PID), System Metadata, Datastreams
Default Disseminators: Get Profile, List Items, Get Item, List Methods, Get DC Record
Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh
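To make the datastream and disseminator terms concrete, here is a small sketch in plain Python. It is not the Fedora API: the class, the method names, and the datastream IDs are all hypothetical, chosen only to mirror the labels on the slide.

```python
class DigitalObject:
    """Hypothetical object with a PID, datastreams, and disseminator methods."""

    def __init__(self, pid, datastreams):
        self.pid = pid
        self.datastreams = dict(datastreams)  # datastream ID -> content
        # Default disseminator methods every object gets in this sketch.
        self._methods = {
            "list_items": lambda obj: sorted(obj.datastreams),
            "get_item": lambda obj, item_id: obj.datastreams[item_id],
            "get_dc_record": lambda obj: obj.datastreams.get("DC"),
        }

    def attach(self, name, method):
        """Add a content-specific disseminator method (e.g. for images)."""
        self._methods[name] = method

    def list_methods(self):
        return sorted(self._methods)

    def disseminate(self, name, *args):
        """Run a method and return the resulting view of the object's content."""
        return self._methods[name](self, *args)


def attach_simple_image(obj):
    """Hypothetical 'Simple Image' disseminator: one method per resolution."""
    for size in ("thumbnail", "medium", "high"):
        stream_id = size.upper()
        obj.attach(f"get_{size}", lambda o, sid=stream_id: o.datastreams[sid])
```

The point of the pattern is that callers ask for a view (a dissemination) by method name and never need to know how the underlying datastreams are stored.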
59
59 Fedora™ Repository Web Service Exposure Layer Adapted from Slide by V. Chachra, VTLS
60
60 Fedora Advantage Extensible digital object model Repository exposed by Web services APIs –Management (Creation, Deletion, Maintenance, Validation) –Access (Search, Disseminations) Scalable, persistent storage for content and metadata Content can be local and/or remote Content versioning Open source solution
61
61 Comparison of DSpace and Fedora
DSpace is a standalone product in a box, whereas Fedora can be standalone or integrated with an ILS.
In Fedora the metadata and the content are treated the same way, as data-streams; in DSpace the metadata and content get separate treatments.
Fedora can define complex objects more easily.
DSpace is not as extensible as Fedora, as it deals with both the repositories and workflows. Fedora focuses only on the data model.
Fedora uses the Mozilla licensing model and DSpace uses GNU license. The Mozilla model makes it easier for software companies to provide extensions to the model.
62
62 VITAL / Fedora Relationship
63
63 Prospero: Summary of features of the three software packages compared
Columns: DSpace | E-prints | Fedora
What you get: A package with front-end web interface directly linked to a database | A repository database, with internal database.
Server requirements: Unix environment, Java, Apache Ant, Apache Tomcat, PostgreSQL or Oracle | Unix environment, Perl, Apache+mod-perl, MySQL | Unix or Windows, Java (optional: MySQL or Oracle)
Subject classification: Yes
Community groups: Yes | No | Possible but … (see below)
Where from? MIT and Hewlett-Packard | Southampton University, outcome of a JISC project | Cornell University and the University of Virginia Library
68
68 NDLTD DL case study Goals How, Workflow Union Catalog Services atop the Union Catalog Sustainability and Impact UK related report (Aug. 2006)
69
A Digital Library Case Study Domain: graduate education, research Genre: ETDs = electronic theses & dissertations Submission: http://etd.vt.edu Collection: http://www.theses.org Project: Networked Digital Library of Theses & Dissertations (NDLTD) http://www.ndltd.org
70
70 NDLTD Goals For Students: –Gain knowledge and skills for the Information Age, especially about Digital Libraries –Richer communication (digital information, multimedia, …) For Universities: –Easy way to enter the digital library field and benefit thereby For the World: –Global digital library – large, useful, many services
71
NDLTD: How can a university get involved? Select planning/implementation team –Graduate School –Library –Computing / Information Technology –Institutional Research / Educ. Tech. Join online, give us contact names –www.ndltd.org/join Adapt Virginia Tech or other proven approach –Build interest and consensus –Start trial / allow optional submission
72
Student Gets Committee Signatures and Submits ETD Signed Grad School
73
Library Catalogs ETD, Access is Opened to the New Research WWW NDLTD
74
74 Union catalog: OCLC OCLC will expand OAI data provider on TDs. Is getting data from WorldCat (so, from many sites!). Will harvest from all others who contact them. Need DC and either ETD-MS or MARC. Has a set for ETDs.
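The requirement "DC and either ETD-MS or MARC" can be checked against a data provider's OAI-PMH ListMetadataFormats response. The sketch below assumes the prefix names in ETD_OR_MARC, which are common conventions rather than anything stated on the slide; only oai_dc is fixed by the protocol itself.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"   # OAI-PMH 2.0 namespace

# Assumed prefix names for ETD-MS and MARC; sites may use different ones.
ETD_OR_MARC = {"oai_etdms", "etdms", "marc21", "oai_marc"}


def metadata_prefixes(xml_text):
    """Collect the metadataPrefix values from a ListMetadataFormats response."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter(OAI + "metadataPrefix") if el.text}


def meets_union_catalog_requirement(prefixes):
    """True if the site offers Dublin Core plus ETD-MS or MARC."""
    prefixes = set(prefixes)
    return "oai_dc" in prefixes and bool(ETD_OR_MARC & prefixes)
```

A site that exposes only oai_dc would fail this check and would need to add a richer format before being harvested into the union catalog.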
77
77 ETD Union Search Mirror Site in China (CALIS) (http://ndltd.calis.edu.cn – popular site!)
79
79 VTLS Union Catalog Content Languages The VTLS NDLTD Union Catalog has data in 6 different languages. These are: English German Greek Korean Portuguese Spanish Examples follow
80
80 Full-text Services Running since Sept 2005: Scirus In beta test: Google Scholar Challenges: –Data quality problems –Inconsistency in way to get from metadata to the full-text file(s) –Broadening the coverage since OAI use has not spread as widely as we would like
82
What are we doing? Aiding universities to enhance graduate education, publishing and IPR efforts Helping improve the availability and content of theses and dissertations Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive) -> support Open Access
83
83 UK Report of Aug. 2006 EVALUATION OF OPTIONS FOR A UK ELECTRONIC THESIS SERVICE Study report edited by Alma Swan Key Perspectives Ltd & UCL Library Services EThOS project (Electronic Theses Online Service) - commissioned to develop a model for a workable, sustainable and acceptable national service for the provision of open access to electronic doctoral theses.
84
84 EThOS: Stakeholders Academic registrars University administrators (graduate schools) Librarians Repository managers (3; 2) Authors (or potential authors) of theses and dissertations
85
85 Assessment of the organisational models
Columns: Distributed model | Centralised model | Mixed architecture model
Viability: Dependent upon individual institutions’ capabilities and resources, which are highly variable | Good, providing service provider selects correct business model and satisfies HEI concerns on rights, liabilities, etc.
Disadvantages: Dependent upon individual institutions’ capabilities and resources, which are highly variable. This would lead to a service of patchy quality for at least a decade. Potentially chaotic with respect to standards and consistency levels | HEIs lose control to an extent and may lose some benefits in terms of PR and other institutional-purpose benefits that accrue with local service provision | Offers potential for inconsistencies unless well-managed by hub provider
Advantages: Self-organising, cheap, simple | HEIs need only to provide access to e-theses: central service provider does the rest; standards applied across the board; guaranteed consistent access; scope for added-value services; one interface; a true national collection as well as a national gateway; easy to hook up to other national or international services | Gives the greatest flexibility to HEIs to select the most appropriate options; HEIs can retain control of selected elements; standards applied across the board; guaranteed consistent access; scope for added-value services; one interface (multiple sites of supply); national gateway; easy to hook up to other national or international services
HEI community views: Strong feeling against this option | Second most popular option | Highest level of support for this option
Comments: No support in the HEI community | Strong support within HEI community | Very strong support within HEI community
86
86 EThOS Survey: familiar with IPR issues related to e-theses 8% know very little 30% not very familiar 51% familiar 11% very familiar
87
87 EThOS Survey: my institution’s handling of PhD e-theses 83% not yet 11% from some students 5% from most students 1% from all students
88
88 EThOS Survey: my institution’s policy position on PhD e-theses 55% no policies yet 34% currently planning policies 11% has a policy
89
89 EThOS: Benefits Hugely increased visibility of UK doctoral research output Resulting in increased usage and impact of UK doctoral research output The opportunities for resulting new research efforts and collaborations
90
90 Summary: Key Ideas Theorem 1: Supporters of Open Access should support NDLTD. Theorem 2: 5S can guide us to better support of Open Access.
91
91 Theorem 1: Supporters of Open Access should support NDLTD - 1 DLs will lead to enormous benefit at all levels, from personal to global. An IR is a type of DL, in the middle of the levels (requiring support from below, and providing support for above levels). Having a DL at every university (i.e., IR) greatly encourages Open Access.
92
92 Theorem 1: Supporters of Open Access should support NDLTD - 2 The easiest way to launch an IR at a university is with ETDs. NDLTD is the lead world organization promoting ETD activities. NDLTD’s goals are all in support of Open Access and IRs.
93
93 Theorem 2: 5S can guide us to better support of Open Access - 1 5S helps us think formally about Open Access, hence clearly, hence to find focus. 5S helps us design and build DLs, hence IRs. Societies –Individuals: members of institution, discipline –Social influence can promote DL (re)use. –Economic and political and social issues lead us to a distributed architecture.
94
94 Theorem 2: 5S can guide us to better support of Open Access - 2 Distributed infrastructure + services lead us to harvesting (vs. federation, gathering). 5S helps make harvesting a success: –Streams of content flow from individuals. –Structures: ETD-MS, (browsing) classification –Spaces: indexes, interfaces –Scenarios: submission, workflow, harvesting –Societies (see above) More collaboration (social networks) Prestige is more widely spread. Access is more open.
95
95 DL Futures History People, Content, Tools Sustainable Infrastructure Future Work Links For More Information
99
99 People Digital librarians DL system developers DL system administrators DL managers DL collection development staff DL evaluators DL users