Published by Rosaline Sherman. Modified over 9 years ago.
1
Open Source Software for Digital Libraries
  Jon Dunn, Associate Director for Technology
  John A. Walsh, Manager of Electronic Text Technologies
  Indiana University Digital Library Program
  IU Digital Library Brown Bag Series
  Bloomington, IN, 09 April 2004
2
Outline
  Open Source Introduction
  Categories of Open Source Software for Libraries
  Open Source Digital Library Systems
  Open Source XML Tools and Systems
3
What is open source software?
  In the phrase "open source," "source" refers to source code, the human-readable computer code that is the origin, or source, of a computer application. "Open" refers to the terms of access to that source code. So open source software is software for which the source code is freely available. But this is a very general and incomplete definition. A detailed definition of open source software is maintained by the Open Source Initiative.
4
Advantages and Disadvantages
  Advantages
    Access to source code and ability and right to modify it
    Right to redistribute modifications to benefit wider community
    Free
    Excellent support networks
    Large and enthusiastic user base
  Disadvantages
    Limited or no accountability
    Informal and unaccountable support channels
5
Categories of Open Source Software
  Operating Systems
    Linux
  Programming Languages
    Perl, PHP, Python
  Applications
    Apache, Tomcat, emacs, grep, MySQL, sendmail, ssh
6
Different Open Source Licenses
  GNU GPL ("General Public License")
  GNU Lesser GPL
  BSD License
  Mozilla Public License
  IU Open Source License
  And more...
7
Open Source Software in the DLP
  Linux, Apache, Tomcat, PHP, Perl, DLXS, ImageMagick, EPrints, MySQL, Darwin Streaming Server, emacs, CVS, Webalizer, LibXML, LibXSLT, Saxon, and more!
8
Open Source Resources
  Open Source Initiative
  GNU
  SourceForge
9
Some categories of open source library software
  Library-oriented search engines
    Cheshire, Pears
  Z39.50 toolkits
    ZetaPerl (Perl), JAFER (Java), YAZ (C/C++)
  MARC parsers
    MARC.pm (Perl), MARC4J (Java)
  Image processing
    ImageMagick, tiffinfo/tiffdump
10
Some categories of open source library software
  Portals
    MyLibrary
  OAI service providers and data providers
    PHP OAI Data Provider
    Lots! See www.openarchives.org
  METS tools
    Page turners, toolkits, more: see www.loc.gov/mets/
  Digital object repositories
    Fedora
11
A Good Starting Point
  oss4lib: Open Source Systems for Libraries
    www.oss4lib.org
12
Complete DL Systems
  DSpace
  EPrints
  Greenstone
13
DSpace
  “DSpace is a groundbreaking digital institutional repository that captures, stores, indexes, preserves, and redistributes the intellectual output of a university’s research faculty in digital formats.”
  Developed jointly by MIT Libraries and Hewlett-Packard
  Licensed under BSD distribution license
  www.dspace.org
14
DSpace
  Supports submission of, management of, and access to digital content
    Formats: text, images, audio, video
  Organized based on organizational needs of a large university
    Communities and collections
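The "communities and collections" organization can be pictured as a simple containment hierarchy. The following is a minimal Python sketch, not DSpace's actual Java API: the class names, the `all_items` helper, and the sample community are all hypothetical, and the `metadata` field only gestures at the qualified Dublin Core descriptions covered on the next slide.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One deposited work (hypothetical stand-in for a repository item)."""
    title: str
    # Assumed layout: "element.qualifier" keys, e.g. "date.issued".
    metadata: dict = field(default_factory=dict)


@dataclass
class Collection:
    """A named group of items inside a community."""
    name: str
    items: list = field(default_factory=list)


@dataclass
class Community:
    """An organizational unit (e.g. a department) that owns collections."""
    name: str
    collections: list = field(default_factory=list)

    def all_items(self):
        """Flatten every item from every collection in this community."""
        return [item for coll in self.collections for item in coll.items]


# Hypothetical sample data for illustration only.
physics = Community("Physics", [
    Collection("Preprints", [Item("Paper A", {"date.issued": "2004"})]),
    Collection("Theses", [Item("Thesis B"), Item("Thesis C")]),
])
titles = [item.title for item in physics.all_items()]
```

Modeling the hierarchy as plain containers keeps the sketch short; a real repository would also track bitstreams, permissions, and persistent identifiers.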
15
DSpace Features
  Digital preservation
    Persistent IDs, support levels for different file formats
  Access control
  Versioning
  Search and retrieval
    Based on qualified Dublin Core metadata
  OAI-PMH data provider
    To support metadata harvesters
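To make the "OAI-PMH data provider" feature concrete: a metadata harvester talks to a data provider through plain HTTP requests that carry a `verb` argument. The sketch below only builds such a request URL. `ListRecords` and `metadataPrefix=oai_dc` come from the OAI-PMH protocol itself, while the base URL and the `oai_request_url` helper are assumptions made up for illustration.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: base URL plus verb and its arguments."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


# Hypothetical repository endpoint; real base URLs vary by installation.
BASE_URL = "https://repository.example.edu/oai/request"

# Ask for all records in unqualified Dublin Core (the "oai_dc" format).
url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would fetch this URL and parse the XML response; that step is left out here so the sketch stays independent of any live server.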
16
DSpace Technology
  OS: Unix or Linux
  Written in Java
  PostgreSQL relational database
  Provides a complete Web user interface; Java APIs are also available
17
DSpace Data Model
18
DSpace Architecture
19
DSpace Demonstration
  MIT DSpace
    dspace.mit.edu
20
EPrints
  “free software which creates online archives”
  Developed by University of Southampton, UK
  Supports self-archiving of e-prints
  Can be configured as institutional repository or otherwise, e.g. repository focused on particular research area or discipline
  Licensed under GNU General Public License
  software.eprints.org
21
EPrints
  Supports submission of, management of, and access to digital content
  Can support multiple archives on one server
  Moderated or unmoderated archives
  Search and retrieval
    Based on metadata
    Metadata can be customized for different archives and document types
  No access control
  OAI-PMH data provider
22
EPrints Technology
  OS: Unix or Linux
  Written in Perl
  Requirements:
    Apache web server
    MySQL relational database
23
EPrints Demonstration
  Digital Library of the Commons
    dlc.dlib.indiana.edu
24
Greenstone
  “Suite of software for building and distributing digital library collections”
  Developed by University of Waikato, New Zealand
    Developed in cooperation with UNESCO and the Human Info NGO
  Licensed under GNU General Public License
  www.greenstone.org
25
Greenstone Features
  Supports creation and management of collections by administrator(s)
  Web interface for search and retrieval
    Customizable metadata
    Supports full text search of content
  Extensive document filters
    Word, Excel, PowerPoint, PDF, ...
    Can extract metadata from documents
  Many ways to build a collection, including:
    Local files
    Retrieve web sites
    Retrieve objects via OAI-PMH
26
Greenstone Features
  Focus on:
    Ease of installation
    Ease of use
    Internationalization
      Full support for English, French, Spanish, Russian, and Kazakh
      Support for many other languages
    Low barriers to use
      Minimal system requirements
      Creation of CD-ROMs
27
Greenstone Technology
  Runs on Windows (back to 3.1), Linux, Mac OS X, Unix
  Written in C++, Perl, and Java
  Uses MG/MG++ search engine
  Several different Web and Java/Swing user interfaces for various functions
  Web interface for user access
28
Greenstone Demonstration
  Examples at www.greenstone.org
29
Open Source XML Tools and Systems
  Utilities
    Xalan, Xerces, libxml, libxslt, saxon
  Editors
    emacs / nxml-mode
  Database / Search Engines
    Apache Xindice
    Berkeley DB XML
    eXist
  Publishing / Web Application Frameworks
    AxKit
    Cocoon
30
XML Databases & Search Engines
  Apache Xindice
  Berkeley DB XML
  eXist
31
Apache Xindice
  http://xml.apache.org/xindice/
  Technology: Java
  Optimized for large numbers of small XML files. Does not work well on large files.
32
Berkeley DB XML
  http://www.sleepycat.com/products/xml.shtml
  Technology: C
  C++ and Java APIs
33
eXist
  http://exist.sourceforge.net/
  Technology: Java
34
XML Publishing / Web Application Frameworks
  XML publishing (or web application) frameworks provide systems for publishing XML data in a variety of formats, such as HTML, WAP/WML, and PDF. Both AxKit and Cocoon use a "pipeline" paradigm to route incoming requests through different processing routines.
  Apache AxKit
  Apache Cocoon
35
Apache AxKit
  http://axkit.org/
  Technology: Perl
  AxKit is an XML Application Server for Apache. It provides on-the-fly conversion from XML to any format, such as HTML, WAP, or text, using either W3C standard techniques or flexible custom code. AxKit also uses a built-in Perl interpreter to provide some amazingly powerful techniques for XML transformation.
36
Apache Cocoon
  http://cocoon.apache.org/
  Technology: Java
  "Apache Cocoon is a web development framework built around the concepts of separation of concerns and component-based web development."
37
Cocoon: Key Concepts
  publishing framework
  XML and XSLT
  "pipelined SAX processing"
  separation of:
    content
    logic
    style
  centralized configuration
  sophisticated caching
38
Cocoon: Problems to Be Solved
  Separation of content, style, logic, and management functions in an XML content based web site:
39
Cocoon: Problems to Be Solved (cont.)
  Data mapping:
40
Cocoon: Basic mechanisms for processing XML documents
  Dispatching based on Matchers
  Generation of XML documents (from content, logic, relational DB, objects, or any combination) through Generators
  Transformation (to another XML, objects, or any combination) of XML documents through Transformers
  Aggregation of XML documents through Aggregators
  Rendering XML through Serializers
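The generator, transformer, and serializer roles can be illustrated with a toy pipeline. This is a minimal Python sketch of the idea only, not Cocoon code: Cocoon is written in Java and streams SAX events between components, whereas this sketch passes whole in-memory element trees for brevity. All function names and the `<page>` vocabulary are hypothetical.

```python
import xml.etree.ElementTree as ET


def generate(title):
    """Generator: produce an XML document from some source (here, a string)."""
    page = ET.Element("page")
    ET.SubElement(page, "title").text = title
    return page


def transform(page):
    """Transformer: map the source XML to another XML vocabulary."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = page.findtext("title")
    return html


def serialize(doc):
    """Serializer: render the final tree as text for the response."""
    return ET.tostring(doc, encoding="unicode")


def run_pipeline(source, generator, transformers, serializer):
    """Run one generator, zero or more transformers, then one serializer."""
    doc = generator(source)
    for step in transformers:
        doc = step(doc)
    return serializer(doc)


result = run_pipeline("Hello", generate, [transform], serialize)
print(result)
```

Because each stage only sees the output of the previous one, a different serializer (plain text, for example) could be swapped in without touching the generator or transformer.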
41
Cocoon: Basic mechanisms for processing XML documents
42
Cocoon: The Pipeline
  Sequence of interactions:
43
Cocoon: The Pipeline
44
Generators, Transformers, & Serializers
  Generators
  Transformers
  Serializers
45
Cocoon: Configuration: The Sitemap
  <map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
    <map:components>...</map:components>
    <map:views>...</map:views>
    <map:pipelines>
      <map:pipeline>
        <map:match>...</map:match>
        ...
      </map:pipeline>
      ...
    </map:pipelines>
    ...
  </map:sitemap>
46
Cocoon: Configuration: A Pipeline
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="...">
        ...
        <map:serialize/>
      </map:match>
      ...
      <map:match pattern="...">
        <map:read mime-type="text/css" src="technochat/resources/styles/{1}.css"/>
      </map:match>
      ...
    </map:pipeline>
  </map:pipelines>
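The `{1}` in the `src` attribute above stands for whatever the first wildcard in the enclosing match pattern captured. The following Python sketch imitates that substitution. It is an illustration under stated assumptions, not Cocoon's matcher: the request pattern `styles/*.css` is hypothetical (the slide's own patterns did not survive), the helper names are made up, and only a single `*` (one path segment) is handled, not `**`.

```python
import re


def compile_pattern(pattern):
    """Turn a wildcard pattern into a regex; each '*' captures one path segment."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "([^/]*)".join(literal_parts) + "$")


def resolve(uri, pattern, src_template):
    """Return the source path for a matching URI, or None if it does not match."""
    match = compile_pattern(pattern).match(uri)
    if match is None:
        return None
    src = src_template
    # Replace {1}, {2}, ... with the text captured by each wildcard, in order.
    for index, value in enumerate(match.groups(), start=1):
        src = src.replace("{%d}" % index, value)
    return src


# Hypothetical request pattern; the src template is the one from the slide.
PATTERN = "styles/*.css"
TEMPLATE = "technochat/resources/styles/{1}.css"

css_path = resolve("styles/main.css", PATTERN, TEMPLATE)
no_match = resolve("images/logo.png", PATTERN, TEMPLATE)
```

Here a request for `styles/main.css` resolves to `technochat/resources/styles/main.css`, while a request that does not fit the pattern falls through so that a later matcher can handle it.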