1 Symposium: Open Access to Information Panel 2: Open Access & Institutional Repositories 24 August 2006, Brasilia Digital Libraries, Electronic Theses.

1 1 Symposium: Open Access to Information. Panel 2: Open Access & Institutional Repositories, 24 August 2006, Brasilia. Digital Libraries, Electronic Theses and Dissertations (ETDs), and NDLTD. http://fox.cs.vt.edu/talks/2006/20060824IBICTp2 Edward A. Fox, fox@vt.edu; Executive Director, NDLTD; Chair, IEEE-CS Tech. Committee on Digital Libraries; Professor, Department of Computer Science; Director, Digital Library Research Laboratory; Virginia Tech, Blacksburg, VA 24061 USA

2 2 Outline Key Ideas Acknowledgements Digital Libraries DLs & Scholarly Communication Institutional Repositories NDLTD Summary DL Futures

3 3 Key Ideas - Overview Theorem 1: Supporters of Open Access should support NDLTD. Theorem 2: 5S can guide us to better support of Open Access.

4 4 Acknowledgements Students Faculty, Staff Collaborators Support Mentors

5 5 Acknowledgements: Students Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …

6 6 Acknowledgements: Faculty, Staff Lillian Cassel, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Rohit Kelapure, Neill Kipp, Douglas Knight, Deborah Knox, Aaron Krowne, Alberto Laender, Gail McMillan, Claudia Medeiros, Manuel Perez, Naren Ramakrishnan, Layne Watson, …

7 7 Other Collaborators (Selected) Brazil: FUA, IBICT, UFMG, UNICAMP, USP Case Western Reserve University Emory, Notre Dame, Oregon State Germany: Humboldt U., U. Oldenburg Mexico: UDLA (Puebla), Monterrey College of NJ, Hofstra, Penn State, Villanova University of Arizona University of Florida, Univ. of Illinois University of Virginia VTLS (slides on digital repositories, NDLTD)

8 Acknowledgements: Support Course: UNESCO, CETREDE, IFLA- LAC, AUGM, CLEI, UFC Sponsors: ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0086227, 0080748, 0325579, 0535057; ITR-0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS

9 9 Acknowledgements - Mentors JCR Licklider – undergrad advisor (1969-71) –Author in 1965 of “Libraries of the Future” –Before, at ARPA, funded start of Internet Michael Kessler – BS thesis advisor –Project TIP (technical information project) –Defined bibliographic coupling Gerard Salton – graduate advisor (1978-83) –“Father of Information Retrieval”

10 10 Digital Libraries Definitions DL Manifesto – Reference Model Book in process (Fox & Gonçalves), 5S DL Curriculum Project

11 11 DL Definitions - 1 “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.” Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003

12 12 DL Definitions - 2 “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities” Waters,D.J. CLIR Issues, July/August 1998 www.clir.org/pubs/issues/issues04.html

13 13 DL Definitions - 3 Issues and Spectra –Collection vs. Institution –Content vs. System –Access vs. Preservation –“Free” vs. Quality –Managed vs. Comprehensive –Centralized vs. Distributed

14 14 DL Definitions - 4 NOT a “digitized library” NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library IS a new way to deal with knowledge –Authoring, Self-archiving, Collecting, –Organizing, Preserving, –Accessing, Propagating, Re-using

16 16 DL Manifesto - 1 DL Reference Model In support of the future European Digital Library Developed by team connected with DELOS (Candela, Castelli, Ioannidis, Koutrika, Meghini, Pagano, Ross, Schek, Schuldt) Draft 2.2 presented in Frascati, near Rome, June 2006 – 79 pages Could be integrated with work of DLF, JISC, etc.

17 17 DL Manifesto – 2: 3 Tiers

18 18 DL Manifesto – 3: Main Concepts

19 19 DL Manifesto – 4: Actor Roles

20 20 Fox & Gonçalves DL Book Parts Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” Part 2 – Higher DL Constructs Part 3 – Advanced Topics Appendix

21 21 Book Parts and Chapters - 1 Ch. 1. Introduction (Motivation, Synopsis) Part 1 – The “Ss” –Ch. 2: Streams –Ch. 3: Structures –Ch. 4: Spaces –Ch. 5: Scenarios –Ch. 6: Societies

22 22 Informal 5S & DL Definitions DLs are complex systems that help satisfy info needs of users (societies) provide info services (scenarios) organize info in usable ways (structures) present info in usable ways (spaces) communicate info with users (streams)

23 23 A Minimal DL in the 5S Framework [Figure labels: Streams, Structures, Spaces, Scenarios, Societies; Structured Stream; Structural Metadata Specification; Descriptive Metadata Specification; Digital Object; Metadata Catalog; Collection; Repository; services (indexing, browsing, searching); hypertext; Minimal DL]
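The minimal DL named on this slide can be illustrated with a small Python sketch. It is not the formal 5S definition from the book; the class names, field names, and the simple word index are assumptions chosen only to show how a collection, a metadata catalog, and the indexing, searching, and browsing services fit together.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical digital object: a handle plus named streams of content."""
    handle: str
    streams: dict[str, str]


@dataclass
class MinimalDL:
    """Illustrative minimal DL: a collection, a metadata catalog, and services."""
    collection: dict[str, DigitalObject] = field(default_factory=dict)
    catalog: dict[str, dict[str, str]] = field(default_factory=dict)
    index: dict[str, set[str]] = field(default_factory=dict)

    def add(self, obj: DigitalObject, metadata: dict[str, str]) -> None:
        """Store the object, record its descriptive metadata, and index it."""
        self.collection[obj.handle] = obj
        self.catalog[obj.handle] = metadata
        # Indexing service: map every lower-cased metadata word to the handle.
        for value in metadata.values():
            for term in value.lower().split():
                self.index.setdefault(term, set()).add(obj.handle)

    def search(self, term: str) -> list[str]:
        """Searching service: handles whose metadata contains the term."""
        return sorted(self.index.get(term.lower(), set()))

    def browse(self, field_name: str) -> list[str]:
        """Browsing service: all handles ordered by one metadata field."""
        return sorted(self.catalog, key=lambda h: self.catalog[h].get(field_name, ""))


dl = MinimalDL()
dl.add(DigitalObject("etd-1", {"text": "full text here"}),
       {"title": "Digital Libraries", "creator": "Zhang"})
dl.add(DigitalObject("etd-2", {"text": "full text here"}),
       {"title": "Open Access", "creator": "Chen"})
```

Here `dl.search("digital")` looks up the word index, and `dl.browse("creator")` orders the catalog by creator.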

24 24 Book Parts and Chapters - 2 Part 2 – Higher DL Constructs –Ch. 7: Collections –Ch. 8: Catalogs –Ch. 9: Repositories and Archives –Ch. 10: Services –Ch. 11: Systems –Ch. 12: Case Studies

25 25 Book Parts and Chapters - 3 Part 3 – Advanced Topics –Ch. 13: Quality –Ch. 14: Integration –Ch. 15: How to build a digital library –Ch. 16: Research Challenges, Future Perspectives Appendix –A: Mathematical preliminaries –B: Formal Definitions: Ss –C: Formal Definitions: DL terms, Minimal DL –D: Formal Definitions: Archeological DL –E: Glossary of terms, mappings

26 26 DL Curriculum Framework

27 27 Project Teams/NSF Grant Project Team at VT (IIS-0535057): –PI: Dr. Edward A. Fox (fox@vt.edu) –GRA: Seungwon Yang (seungwon@vt.edu) Project Team at UNC-CH (IIS-0535060): –Co-PI: Dr. Barbara Wildemuth (wildem@ils.unc.edu) –Co-PI: Dr. Jeffrey Pomerantz (pomerantz@unc.edu) –GRA: Sanghee Oh (shoh@email.unc.edu)

28 28 DLs & Scholarly Communication Asynch Information Life Cycle Flattening Author skills, toward Semantic Web Crossing the Chasm OAI

29 29 Asynchronous, Digital Library Mediated Scholarly Communication Different time and/or place

30 30 Information Life Cycle Authoring Modifying Organizing Indexing Storing Retrieving Distributing Networking Retention / Mining Accessing Filtering Using Creating

31 31 Digital Libraries Shorten the Chain from Editor Publisher A&I Consolidator Library Reviewer

32 32 DLs Shorten the Chain to Author Reader Digital Library Editor Reviewer Teacher Learner Librarian

33 33 Important skills for authors Authoring (Word Processing ->e-pub) Rendering, presenting Tagging, Markup (XML, SGML) “Semi-structured information” Dual-publishing, eBooks Styles (XSL, XSLT) Structured queries

38 38 OAI – Repository Perspective Required: Protocol DO MDO

39 39 OAI – Black Box Perspective [Figure labels: OA 1, OA 2, OA 3, OA 4, OA 5, OA 6, OA 7]

40 40 The World According to OAI [Figure labels: Data Providers; Metadata harvesting; Service Providers; Discovery; Current Awareness; Preservation]
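The harvesting step between data providers and service providers can be sketched in Python. This is a minimal illustration, assuming a hypothetical base URL (`http://example.org/oai`), a hypothetical set name, and a hand-written sample response; only the `ListRecords` verb, the `oai_dc` metadata prefix, and the two XML namespaces come from the OAI-PMH protocol itself. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a service provider would request from a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        records.append((identifier, title))
    return records


# Hand-written sample response (an assumption, not real harvested data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

harvest_url = list_records_url("http://example.org/oai", set_spec="etd")
harvested = parse_records(SAMPLE_RESPONSE)
```

A real harvester would also follow resumption tokens and handle error responses, which this sketch leaves out.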

41 41 Institutional Repositories Definitions, Goals Eprints DSpace Fedora, VITAL Comparisons ODL + 5S Suite (not shown)

42 42 Institutional Repositories - 1 “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.” Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA www.arl.org/sparc/IR/IR_Guide_v1.pdf

43 43 Institutional Repositories - 2 “A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.” Lynch, C.A. In ARL Bimonthly Report 226, pp. 1-7, Feb. 2003, www.arl.org/newsltr/226/ir.html

44 44 What is a Digital Object Repository? Also called: digital rep., digital asset rep., institutional repository; Stores and maintains digital objects (assets); Provides external interface for digital objects: creation, modification, access; Enforces access policies; Provides for content type disseminations. Adapted from slide by V. Chachra, VTLS

45 45 Goals of Institutional Repositories (by Stevan Harnad, U. Southampton) Self-archiving of institutional research: theses and dissertations (VTLS NDLTD Project); article preprints and postprints; internal documents and maps. Management of digital collections. Preservation of materials (decentralized approach). Housing of teaching materials. Electronic publishing of journals, books, posters, maps, audio, video and other multimedia objects. Adapted from slide by V. Chachra, VTLS

54 54 What is Fedora™? Slides courtesy Vinod Chachra of VTLS Flexible Extensible Digital Object Repository Architecture

55 55 History of Fedora™ 1997-Present –DARPA and NSF-funded research project at Cornell (Conceptual framework developed by Sandra Payette and Carl Lagoze) –Reference implementation developed at Cornell 1999-2001 –University of Virginia digital library prototype (Thornton Staples and Ross Wayland) 2002-Present –Andrew W. Mellon Foundation granted Virginia and Cornell $1 million to develop a production-quality Fedora system –Fedora 1.0 released in May 2003 as Open Source under the Mozilla public license.

56 56 Fedora™ Terms: Metadata; Digital Objects (data); Complex Objects (objects consisting of many objects in a complex/hierarchical relationship); Content (data and metadata together); Datastreams (content for dissemination); Disseminators (services). A dissemination is defined as a stream of data that manifests a view of the digital object's content.

57 57 Digital Object w. multiple datastreams [Figure labels: Digital Object; Datastreams: DC, EAD, Admin Metadata]

58 58 Example Disseminators [Figure labels: Persistent ID (PID); System Metadata; Datastreams; Default Disseminators: Get Profile, List Items, Get Item, List Methods, Get DC Record; Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh]
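The terms on the preceding slides (a persistent ID, datastreams, and disseminators that produce views of an object's content) can be modelled in a few lines of Python. This is a conceptual sketch only, not Fedora's API: the class `RepositoryObject`, its methods, and the sample data are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RepositoryObject:
    """Hypothetical digital object: a PID, datastreams, and disseminators."""
    pid: str
    datastreams: dict[str, bytes] = field(default_factory=dict)
    disseminators: dict[str, Callable[["RepositoryObject"], bytes]] = field(
        default_factory=dict
    )

    def list_items(self) -> list[str]:
        """Names of the datastreams held by this object."""
        return sorted(self.datastreams)

    def list_methods(self) -> list[str]:
        """Names of the disseminations this object can produce."""
        return sorted(self.disseminators)

    def disseminate(self, method: str) -> bytes:
        """Run one disseminator and return the resulting stream of data."""
        return self.disseminators[method](self)


# Sample object with a Dublin Core datastream and a stand-in image datastream.
image = RepositoryObject(
    pid="demo:1",
    datastreams={"DC": b"<dc>Sample image</dc>", "HIGH": b"0123456789"},
)
# Each disseminator is a service that manifests a view of the object's content.
image.disseminators["get_dc_record"] = lambda obj: obj.datastreams["DC"]
# Truncating the bytes stands in for real image scaling.
image.disseminators["get_thumbnail"] = lambda obj: obj.datastreams["HIGH"][:4]
```

Keeping disseminators as callables attached to the object mirrors the slide's point that metadata and content are both datastreams, while the services that present them are separate.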

59 59 Fedora™ Repository Web Service Exposure Layer Adapted from Slide by V. Chachra, VTLS

60 60 Fedora Advantage Extensible digital object model Repository exposed by Web services APIs –Management (Creation, Deletion, Maintenance, Validation) –Access (Search, Disseminations) Scalable, persistent storage for content and metadata Content can be local and/or remote Content versioning Open source solution

61 61 Comparison of DSpace and Fedora: DSpace is a standalone product in a box, whereas Fedora can be standalone or integrated with an ILS. In Fedora, the metadata and the content are treated the same way, as datastreams; in DSpace, the metadata and content get separate treatment. Fedora can define complex objects more easily. DSpace is not as extensible as Fedora, as it deals with both repositories and workflows; Fedora focuses only on the data model. Fedora uses the Mozilla licensing model and DSpace uses the GNU license. It makes it easier for software companies to provide extensions to the model.

62 62 VITAL / Fedora Relationship

63 63 Prospero: Summary of features of the three software packages compared (DSpace, E-prints, Fedora)
What you get: A package with front-end web interface directly linked to a database; A repository database, with internal database.
Server requirements: DSpace: Unix environment, Java, Apache Ant, Apache Tomcat, PostgreSQL or Oracle. E-prints: Unix environment, Perl, Apache + mod_perl, MySQL. Fedora: Unix or Windows, Java (optional: MySQL or Oracle).
Subject classification: Yes
Community groups: DSpace: Yes. E-prints: No. Fedora: Possible but … (see below)
Where from? DSpace: MIT and Hewlett-Packard. E-prints: Southampton University, outcome of a JISC project. Fedora: Cornell University and the University of Virginia Library.

68 68 NDLTD DL case study Goals How, Workflow Union Catalog Services atop the Union Catalog Sustainability and Impact UK related report (Aug. 2006)

69 A Digital Library Case Study Domain: graduate education, research Genre: ETDs = electronic theses & dissertations Submission: http://etd.vt.edu Collection: http://www.theses.org Project: Networked Digital Library of Theses & Dissertations (NDLTD) http://www.ndltd.org

70 70 NDLTD Goals For Students: –Gain knowledge and skills for the Information Age, especially about Digital Libraries –Richer communication (digital information, multimedia, …) For Universities: –Easy way to enter the digital library field and benefit thereby For the World: –Global digital library – large, useful, many services

71 NDLTD: How can a university get involved? Select planning/implementation team –Graduate School –Library –Computing / Information Technology –Institutional Research / Educ. Tech. Join online, give us contact names –www.ndltd.org/join Adapt Virginia Tech or other proven approach –Build interest and consensus –Start trial / allow optional submission

72 Student Gets Committee Signatures and Submits ETD Signed Grad School

73 Library Catalogs ETD, Access is Opened to the New Research WWW NDLTD

74 74 Union catalog: OCLC OCLC will expand OAI data provider on TDs. Is getting data from WorldCat (so, from many sites!). Will harvest from all others who contact them. Need DC and either ETD-MS or MARC. Has a set for ETDs.
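To make the Dublin Core requirement concrete, the sketch below builds an `oai_dc` record for one thesis with Python's standard library. The field values are invented for illustration, and the function name `etd_to_oai_dc` is hypothetical; only the two namespace URIs are standard. ETD-MS and MARC output are not attempted here.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Register prefixes so the serialized XML uses oai_dc: and dc: names.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def etd_to_oai_dc(fields):
    """Serialize a dict of Dublin Core element names and values as oai_dc XML."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample metadata for a single thesis.
record_xml = etd_to_oai_dc({
    "title": "A Sample Thesis",
    "creator": "Doe, Jane",
    "type": "Electronic Thesis or Dissertation",
    "identifier": "http://example.org/etd-1",
})
```

A data provider would wrap a record like this in the response to a harvesting request, which is what lets a union catalog collect it.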

77 77 ETD Union Search Mirror Site in China (CALIS) (http://ndltd.calis.edu.cn – popular site!)

79 79 VTLS Union Catalog Content Languages The VTLS NDLTD Union Catalog has data in 6 different languages. These are: English German Greek Korean Portuguese Spanish Examples follow

80 80 Full-text Services Running since Sept 2005: Scirus In beta test: Google Scholar Challenges: –Data quality problems –Inconsistency in way to get from metadata to the full-text file(s) –Broadening the coverage since OAI use has not spread as widely as we would like

82 What are we doing? Aiding universities to enhance graduate education, publishing and IPR efforts. Helping improve the availability and content of theses and dissertations. Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive) -> support Open Access

83 83 UK Report of Aug. 2006 EVALUATION OF OPTIONS FOR A UK ELECTRONIC THESIS SERVICE Study report edited by Alma Swan Key Perspectives Ltd & UCL Library Services EThOS project (Electronic Theses Online Service) - commissioned to develop a model for a workable, sustainable and acceptable national service for the provision of open access to electronic doctoral theses.

84 84 EThOS: Stakeholders Academic registrars University administrators (graduate schools) Librarians Repository managers (3; 2) Authors (or potential authors) of theses and dissertations

85 85 Assessment of the organisational models (Distributed model, Centralised model, Mixed architecture model)
Viability: Dependent upon individual institutions’ capabilities and resources, which are highly variable; Good, providing service provider selects correct business model and satisfies HEI concerns on rights, liabilities, etc.
Disadvantages: Distributed: Dependent upon individual institutions’ capabilities and resources, which are highly variable. This would lead to a service of patchy quality for at least a decade. Potentially chaotic with respect to standards and consistency levels. Centralised: HEIs lose control to an extent and may lose some benefits in terms of PR and other institutional-purpose benefits that accrue with local service provision. Mixed: Offers potential for inconsistencies unless well-managed by hub provider.
Advantages: Distributed: Self-organising, cheap, simple. Centralised: HEIs need only to provide access to e-theses: central service provider does the rest: Standards applied across the board: Guaranteed consistent access: Scope for added-value services: One interface; a true national collection as well as a national gateway: Easy to hook up to other national or international services. Mixed: Gives the greatest flexibility to HEIs to select the most appropriate options; HEIs can retain control of selected elements: Standards applied across the board: Guaranteed consistent access: Scope for added-value services: One interface (multiple sites of supply): National gateway: Easy to hook up to other national or international services.
HEI community views: Distributed: Strong feeling against this option. Centralised: Second most popular option. Mixed: Highest level of support for this option.
Comments: Distributed: No support in the HEI community. Centralised: Strong support within HEI community. Mixed: Very strong support within HEI community.

86 86 EThOS Survey: familiar with IPR issues related to e-theses 8% know very little 30% not very familiar 51% familiar 11% very familiar

87 87 EThOS Survey: my institution’s handling of PhD e-theses 83% not yet 11% from some students 5% from most students 1% from all students

88 88 EThOS Survey: my institution’s policy position on PhD e-theses 55% no policies yet 34% currently planning policies 11% has a policy

89 89 EThOS: Benefits Hugely increased visibility of UK doctoral research output Resulting in increased usage and impact of UK doctoral research output The opportunities for resulting new research efforts and collaborations

90 90 Summary: Key Ideas Theorem 1: Supporters of Open Access should support NDLTD. Theorem 2: 5S can guide us to better support of Open Access.

91 91 Theorem 1: Supporters of Open Access should support NDLTD - 1 DLs will lead to enormous benefit at all levels, from personal to global. An IR is a type of DL, in the middle of the levels (requiring support from below, and providing support for above levels). Having a DL at every university (i.e., IR) greatly encourages Open Access.

92 92 Theorem 1: Supporters of Open Access should support NDLTD - 2 The easiest way to launch an IR at a university is with ETDs. NDLTD is the lead world organization promoting ETD activities. NDLTD’s goals are all in support of Open Access and IRs.

93 93 Theorem 2: 5S can guide us to better support of Open Access - 1 5S helps us think formally about Open Access, hence clearly, hence to find focus. 5S helps us design and build DLs, hence IRs. Societies –Individuals: members of institution, discipline –Social influence can promote DL (re)use. –Economic and political and social issues lead us to a distributed architecture.

94 94 Theorem 2: 5S can guide us to better support of Open Access - 2 Distributed infrastructure + services lead us to harvesting (vs. federation, gathering). 5S helps make harvesting a success: –Streams of content flow from individuals. –Structures: ETD-MS, (browsing) classification –Spaces: indexes, interfaces –Scenarios: submission, workflow, harvesting –Societies (see above) More collaboration (social networks) Prestige is more widely spread. Access is more open.

95 95 DL Futures History People, Content, Tools Sustainable Infrastructure Future Work Links For More Information

99 99 People Digital librarians DL system developers DL system administrators DL managers DL collection development staff DL evaluators DL users

