Foundations of Excellence DSpace vs Fedora: Or what I do on my summer vacation.

1 Foundations of Excellence DSpace vs Fedora: Or what I do on my summer vacation

2 TRLN: Staff Enrichment Series: 8 Nov, 2007 Objectives
- Background: Why we even considered a digital repository
- FOE – version 1
- DSpace & Fedora: 50,000 foot view
- FOE – version 2
- FOE – version 3
- Where to from here?

3 Background

4 75th Anniversary
- Duke University School of Medicine established in 1930
- 2005 – year-long celebration
- New published history
- Articles, videos, speeches
- Alumni weekend gala event
- Josiah C. Trent Foundation Grant

5 Digitization Project
- 500 images documenting the first 3 decades of the School of Medicine and Hospital
- Image groups:
  - Buildings
  - Education
  - Events
  - Clinical
  - People
  - Technology

6 Digitization Project (cont.)
- Selection – Whole staff
- Digitization – Outsourced to University Photography
- Description – Technical services and Reference coordinators
- Subject terms – Technical services coordinator; Head, Cataloging services
- Controlled vocabulary – Notetab templates and libraries

7 FOE1.0 XML, XSLT, and PostgreSQL

8 FOE1.0
- 600 images = 600 XML files = 2 XSLT stylesheets
- XML = EAD2002
- XSLT = 1) convert XML to HTML; 2) convert XML to SQL statements
- PostgreSQL database used only for search
- Result: http://archives.mc.duke.edu/projects/bld/bld00012.html
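
The second stylesheet's job, turning each XML record into an SQL statement for the search database, can be sketched in a few lines. The original work used XSLT; this is only a Python illustration of the same idea. The sample record, the `images` table, and its column names are all assumptions made up for the example; only the EAD 2002 element names (`did`, `unitid`, `unittitle`, `unitdate`, `scopecontent`) and the identifier `bld00012` come from the standard and the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical EAD2002-style record. The real FOE records are not shown in
# the slides; the title and description below are placeholders.
SAMPLE = """<c level="item">
  <did>
    <unitid>bld00012</unitid>
    <unittitle>Example building photograph</unittitle>
    <unitdate>1930</unitdate>
  </did>
  <scopecontent><p>A placeholder description of the image's subject.</p></scopecontent>
</c>"""


def sql_literal(value):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def record_to_insert(xml_text, table="images"):
    """Build one INSERT statement from one record (table and columns are assumed)."""
    root = ET.fromstring(xml_text)
    fields = {
        "id": root.findtext("did/unitid", default=""),
        "title": root.findtext("did/unittitle", default=""),
        "item_date": root.findtext("did/unitdate", default=""),
        "description": root.findtext("scopecontent/p", default=""),
    }
    columns = ", ".join(fields)
    values = ", ".join(sql_literal(v.strip()) for v in fields.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


print(record_to_insert(SAMPLE))
```

Generating SQL text this way mirrors what a stylesheet would emit; a program talking to the database directly would normally use parameterized queries instead.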

9 Issues
- SQL search statements worked…not
- No indexing by search engines
- JDBC
- I am not a programmer
- Definite need for improvements

10 DSpace & Fedora: A Bird's-eye View

11 Need for a Digital Repository
- DSpace
  - First released in 2002. Developed by MIT Libraries and Hewlett-Packard (USA Today)
  - Current version (download)
  - Optimal performance in a *nix environment, but should operate in any environment
  - Written in Java
  - VERY active listservs
  - Manakin – a “front end” created at TAMU that makes UI localization easier

12 Need for a Digital Repository (cont.)
- FEDORA (Flexible Extensible Digital Object and Repository Architecture)
  - Began as a DARPA and NSF-funded research project at Cornell in 1997
  - 2001, UVA and Cornell: $1M Mellon grant
  - 1.0 released 2003
  - Current version 2.2.1 (download)
  - Optimal performance in a *nix environment, but will run on Windows-based systems
  - Written in Java
  - Several front-end tools developed (more in a moment)

13 Side by side testing
- Testing environment:
  - Lenovo T60, 120 GB hard drive, 2 GB memory, Fedora 7 (the Linux distribution), 2.6.23 kernel, Java 1.5

14 Requirements
- DSpace
  - Java 1.4+
  - Apache Ant 1.6.2+
  - PostgreSQL 7.3+ (or Oracle 9+)
  - Jakarta Tomcat 4.x/5.x (I used 6.x)
  - Can also run on Jetty or Caucho Resin
- Fedora
  - JDK 1.5+
  - Optional
    - MySQL
    - PostgreSQL
    - Oracle 9
    - Jakarta Tomcat
    - Ant 1.6.5+ if building from source code

15 File Size & Download times
- DSpace
  - 16 MB
  - 1:43 over a T1 line
  - 1:13 on a T line
- Fedora
  - 72 MB
  - 7:49 over a T1 line
  - 1:53 over a T line
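
As a rough cross-check of these figures, the effective throughput of each download can be worked out from size and elapsed time. This is a sketch only, and it assumes that "mb" on the slide means megabytes and that one megabyte is 8 megabits; the helper name is made up for the example. For reference, a T1 line's nominal rate is 1.544 Mbit/s, and both T1 downloads come out at roughly 1.2 Mbit/s.

```python
def effective_mbit_per_s(size_mb, elapsed):
    """Effective throughput in Mbit/s for size_mb megabytes in 'm:ss' elapsed time."""
    minutes, seconds = elapsed.split(":")
    total_seconds = int(minutes) * 60 + int(seconds)
    return size_mb * 8 / total_seconds


# (package, line label as written on the slide, size in MB, elapsed time)
MEASUREMENTS = [
    ("DSpace", "T1", 16, "1:43"),
    ("DSpace", "T line", 16, "1:13"),
    ("Fedora", "T1", 72, "7:49"),
    ("Fedora", "T line", 72, "1:53"),
]

for package, line, size_mb, elapsed in MEASUREMENTS:
    rate = effective_mbit_per_s(size_mb, elapsed)
    print(f"{package:6} {line:6} {rate:.2f} Mbit/s")
```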

16 Installation time
- DSpace
  - PostgreSQL installation and set up: 8 minutes
  - Ant build and configuration: 8 minutes
  - DSpace/Tomcat configuration and deployment: 8 minutes
  - Total time to live: 24 minutes
- Fedora
  - PostgreSQL installation and set up: 8 minutes
  - Fedora install: 5 minutes
  - Total time to live: 13 minutes

17 Initial Live View
- DSpace
  - Front Page
- Fedora
  - Front Page

18 FOE2.0 Choosing our Digital Repository

19 Deciding Factors
- DSpace
  - Off-the-shelf view
  - Workflow process
  - Individual submitters, one project admin
  - Item submission form (link here)
  - Bulk load script (dc, item, mapfile)
  - Searchbot harvestable
  - OAI harvestable
- Fedora
  - Off-the-shelf view
  - One submitter
  - Item submission not intuitive (link)
  - Bulk load script (foxml)
  - Content Models (will return)
  - Disseminators
  - Behavior Definitions
  - Would require extensive programming
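
The DSpace bulk load mentioned above reads items from a directory tree: as far as I understand DSpace's simple archive format, each item gets its own directory holding a `dublin_core.xml` file, a `contents` file naming the item's files, and the files themselves, while the mapfile is written by the importer to record which directory became which item. The sketch below prepares one such item directory. The function name, the metadata values, and the placeholder image bytes are assumptions for illustration, not the project's actual script.

```python
from pathlib import Path
import tempfile
import xml.etree.ElementTree as ET


def write_item(archive_dir, item_name, dc_fields, bitstreams):
    """Create one item directory with dublin_core.xml, contents, and the files.

    dc_fields is a list of (element, qualifier, value) tuples;
    bitstreams maps file names to their bytes.
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True)

    # Metadata: one <dcvalue> per Dublin Core field.
    root = ET.Element("dublin_core")
    for element, qualifier, value in dc_fields:
        dcvalue = ET.SubElement(
            root, "dcvalue", {"element": element, "qualifier": qualifier}
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # Files, plus the contents file that lists them one per line.
    names = []
    for name, data in bitstreams.items():
        (item_dir / name).write_bytes(data)
        names.append(name)
    (item_dir / "contents").write_text("\n".join(names) + "\n")
    return item_dir


with tempfile.TemporaryDirectory() as tmp:
    item = write_item(
        tmp,
        "item_000",
        [("title", "none", "Example image"), ("date", "issued", "2005-11-10")],
        {"bld00012.jpg": b"placeholder bytes, not a real image"},
    )
    print(sorted(p.name for p in item.iterdir()))
```

Repeating `write_item` for every image gives a tree that can then be handed to the DSpace importer; the importer's command-line options are left out here because the slides do not show them.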

20 FOE2.0 = DSpace Cup is Half Full
- March 2006
- Foundations' new home
- Data submission form
- Item View bld00012
- Item Update
- Access Restrictions
- Handle server

21 FOE2.0 = DSpace Cup is Half Empty
- Object is entered as one item
- DSpace is self-contained
- No real way to show complex relationships
- All or nothing metadata
- Access Restrictions
- Handle server
- Searchbot indexing:
  DSpace@DukeMed: Item 2193/77 Title:, A. Jack Tannenbaum. Issue Date:, 10-Nov-2005... Abstract:, A. Jack Tannenbaum received his medical degree from Duke University in 1935....

22 FOE3.0 “Our goal is to never be satisfied”

23 Content Models
- Reusing datastreams (next 2 slides borrowed from EDUCAUSE 2004 presentation by Grizzle, Wayland, and Wilper)

24 Atomistic Model

25 Compound Model

26 An old favorite blanket
- 2005-2007 Fedora minimally utilized
- Primarily used for archiving Library Administrative documents (Council and Management Team minutes, and Policies and procedures)
- Use of XACML policies to restrict access (156\.16\.\d{1,3}\.\d{1,3} lock down)
- Began looking at front-end GUIs
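
To see what the lock-down expression on this slide admits, it can be tried against a few addresses. This sketch only tests the regular expression itself; how the XACML policy applied it inside Fedora is not shown in the slides, and the sample addresses and the `is_allowed` helper are made up for the example.

```python
import re

# Pattern copied from the slide. fullmatch() is used so that the whole
# address must match, not just a substring of it.
PATTERN = re.compile(r"156\.16\.\d{1,3}\.\d{1,3}")


def is_allowed(ip):
    """Return True if the address string matches the lock-down pattern."""
    return PATTERN.fullmatch(ip) is not None


for ip in ["156.16.0.1", "156.16.255.255", "156.160.1.1", "10.0.0.1", "156.16.999.1"]:
    print(ip, is_allowed(ip))
```

The last address shows a limit of the pattern: `\d{1,3}` accepts any one to three digits, so it does not check that each octet is at most 255. For restricting real traffic to one network block that is harmless, since such addresses never occur.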

27 Front End tools
- Fez – A web front-end management system for Fedora that is developed in PHP. Fez functionality includes: Web-based browsing and searching; Semi-advanced searching; Complex security; Basic image handling; Dublin Core. http://espace.library.uq.edu.au/documentation/
- Elated – ELATED is a lightweight, general-purpose application for managing digital files. ELATED is built on top of the Fedora Repository system, and can be used as a digital assets management system, an institutional repository, or to meet other collection archiving, publishing and searching needs. Dublin Core metadata entry and search; Custom metadata by collection; Automatic previews for images; Collections with simple editorial workflow; Indexing and searching of content; User feedback, enabled by collection; Select and import existing Fedora objects. http://elated.sourceforge.net/
- Both require extensive programming for localization

28 External Forces at play
- In Fall 2006 we began a project to digitize 10,000+ cytopathology slides.
- Images converted to JPEG2000 to improve the user experience (example)
- Archives purchased Aware JPEG2000 Image Server
- History of Medicine image database, Historical Images in Medicine (HIM), needed a new platform

29 Call out of the blue
- VTLS – Vital
- Open Repositories

30 FOE3.0 = Fedora/Vital Cup is Half Full
- June 2007
- Foundations' new home (link)
- Data submission (3 ways to enter items)
- Item View bld00012
- Object is entered as many datastreams (fedora view)
- Vital/Fedora/Aware…interoperability
- Complex relationships
- Multiple metadata streams
- Handle server
- Searchbot indexing:
  A. Jack Tannenbaum. | MeDSpace Description: A. Jack Tannenbaum received his medical degree from Duke University in 1935.... per00165, A. Jack Tannenbaum. 302.3 kB, JPEG 2000 Image...

31 FOE3.0 = Fedora/Vital Cup is Half Empty
- Fedora is open source, Vital is not
- Customization possible with programming knowledge
- No way at this time to implement XACML policies (workarounds exist)
- Vital upgrades require full software installation
- Local customization can cause breaks in certain functions

32 Conclusions and obligatory links

33 Selected Links
- DSpace – http://dspace.org
- Manakin – http://di.tamu.edu/projects/xmlui/install
- Fedora – http://www.fedora-commons.org/
- Elated – http://elated.sourceforge.net/
- Fez – http://espace.library.uq.edu.au/documentation/
- Vital – http://vtls.com
- DSpace@DukeMed – http://dspace.mclibrary.duke.edu
- MeDSpace – http://medspace.mc.duke.edu/vital/access/manager/Index

