Published by Conrad Edwin Carter. Modified over 9 years ago.
1
DSpace: Technical Basics
Iryna Kuchma, Open Access Programme Manager
www.eifl.net
Attribution 3.0 Unported
4
Application Architecture
–The DSpace system is organised into three layers, each consisting of a number of components
–Each layer only invokes the layer below it, i.e. the application layer may not use the storage layer directly
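The layering rule can be illustrated with a minimal Python sketch. DSpace itself is written in Java; the class and method names below are hypothetical and are not DSpace APIs. The only point is that the application layer holds a reference to the business logic layer and never to the storage layer.

```python
class StorageLayer:
    """Hypothetical stand-in for the storage layer (database and content store)."""

    def __init__(self):
        self._items = {}

    def save(self, item_id, metadata):
        self._items[item_id] = dict(metadata)

    def load(self, item_id):
        return self._items[item_id]


class BusinessLogicLayer:
    """Hypothetical business logic layer: the only caller of the storage layer."""

    def __init__(self, storage):
        self._storage = storage

    def archive_item(self, item_id, title):
        # Validation lives here, not in the UI and not in storage.
        if not title:
            raise ValueError("an item needs a title")
        self._storage.save(item_id, {"title": title})

    def item_title(self, item_id):
        return self._storage.load(item_id)["title"]


class ApplicationLayer:
    """Hypothetical application layer: talks only to the business logic layer."""

    def __init__(self, logic):
        self._logic = logic

    def render_item_page(self, item_id):
        return f"<h1>{self._logic.item_title(item_id)}</h1>"
```

Because `ApplicationLayer` is never handed a `StorageLayer`, swapping the storage implementation would not require any change to the user interface code.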
5
The Storage Layer
–The storage layer is responsible for physical storage of metadata and content
–DSpace uses a relational database to store all information about the organization of content, metadata about the content, information about e-people and authorization, and the state of currently-running workflows
6
The Business Logic Layer
–The business logic layer deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow
7
The Application Layer
–The application layer contains components that communicate with the world outside of the individual DSpace installation, for example the Web user interface and the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) service
–The DSpace Web UI is the largest and most-used component in the application layer
–Two versions:
1. JSPUI: Built on Java Servlet and JavaServer Pages technology
2. XMLUI (Manakin): Built on XML and Cocoon technology
8
Server Architecture
[Diagram: Web Application Server, User Interface]
–These systems may reside on a single server or be hosted separately on dedicated servers
9
Structural Overview
DSpace is split into three directory trees:
–Source Directory: [dspace-src]
  As the name suggests, this is where the source code resides
–Install Directory: [dspace]
  Populated during install and during normal operation
  Contains:
  –Configuration files
  –Command line tools
  –Libraries
  –DSpace archive (depending on configuration)
–Web Deployment Directory: [tomcat]/webapps/dspace
  Contains the JSPs and Java classes and libraries necessary to run DSpace
10
Persistent Identifiers
–The use of location-based identifiers such as the Uniform Resource Locator (URL) often leads to problems accessing resources over time
–Often when accessing a resource via a hyperlink users receive a “404 - page not found” error
–Persistent identifiers are an attempt at solving the issues surrounding resource identification and long-term preservation
–A persistent identifier allows the resource to be uniquely identified in a way that will not change if the resource is renamed or relocated
11
Persistent Identifiers
–This means that a resource can be reliably referenced for future access by humans and software
–Caveat: Persistence is heavily dependent on organisation policy, i.e. persistence of an object is only effective if an organisation maintains and manages this persistence
–Different systems in use for persistent identifiers:
  –Persistent Uniform Resource Locators (PURLs)
  –Digital Object Identifiers (DOI)
  –Handle – Used by DSpace
12
The Handle
–In a handle system, a resource is identified by a unique handle assigned by a common registration service
Registration Service     Handle Prefix   Local Identifier
http://hdl.handle.net    2160            568
–Combined: http://hdl.handle.net/2160/568
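The three parts of a handle address can be separated mechanically. The following Python sketch splits the example address from this slide; the `Handle` type and the `parse_handle_url` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Handle(NamedTuple):
    resolver: str   # registration/resolution service, e.g. http://hdl.handle.net
    prefix: str     # handle prefix assigned to the institution
    local_id: str   # identifier assigned locally by the repository


def parse_handle_url(url):
    """Split a handle URL into resolver, prefix and local identifier."""
    parts = urlparse(url)
    prefix, sep, local_id = parts.path.lstrip("/").partition("/")
    if not (parts.scheme and parts.netloc and prefix and sep and local_id):
        raise ValueError(f"not a handle URL: {url!r}")
    return Handle(f"{parts.scheme}://{parts.netloc}", prefix, local_id)


handle = parse_handle_url("http://hdl.handle.net/2160/568")
print(handle.resolver, handle.prefix, handle.local_id)
```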
13
Practical: Using a Handle
–Navigate to Aberystwyth’s DSpace repository – Cadair
–Select an item from a collection and note the handle address
–Open this address in a new browser window
–The handle will resolve and redirect back to your original item
14
Configuring the Handles service
–Out of the box, a DSpace installation will use the handle prefix: hdl:123456789
–These aren't really Handles, since the global Handle system doesn't actually know about them
–There are 3 steps to handle configuration
15
Configuring the Handles service
–In order to use handles in DSpace, registration for a prefix with the Corporation for National Research Initiatives (CNRI) is required
–How to register with CNRI?
  –Complete the registration form on the CNRI website
  –Create and upload the sitebndl.zip to CNRI
  –Pay a small annual fee
http://www.handle.net/service_agreement.html
16
Generating the sitebndl.zip
–The Site Bundle is an archive which contains information about your DSpace installation and is used to generate your handle
–To generate the sitebndl.zip run the command:
[dspace]/bin/dsrun net.handle.server.SimpleSetup [dspace]/handle-server
–You will be required to answer a series of questions
–Once completed the sitebndl.zip can be found at:
[dspace]/handle-server/sitebndl.zip
–Complete the registration and upload the sitebndl.zip
17
Configuring the Handle Server
–Once registration is complete, a handle prefix should be returned from CNRI
–Edit [dspace]/handle-server/config.dct to include these lines in the "server_config" clause:
"storage_type" = "CUSTOM"
"storage_class" = "org.dspace.handle.HandlePlugin"
–Update all references to YOUR_NAMING_AUTHORITY to your assigned handle prefix:
300:0.NA/YOUR_NAMING_AUTHORITY -> 300:0.NA/2097
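The placeholder substitution can also be scripted rather than done by hand. This is a minimal Python sketch that works on the text of the file only; the function name `set_naming_authority` is hypothetical, and the one-line sample stands in for a real config.dct, whose full contents are not shown here.

```python
PLACEHOLDER = "YOUR_NAMING_AUTHORITY"


def set_naming_authority(config_text, prefix):
    """Replace every YOUR_NAMING_AUTHORITY placeholder with the assigned prefix."""
    if not prefix:
        raise ValueError("an assigned handle prefix is required")
    if PLACEHOLDER not in config_text:
        # Nothing to do, or the file was already edited; make that visible.
        raise ValueError("placeholder not found in configuration text")
    return config_text.replace(PLACEHOLDER, prefix)


# Sample line taken from the slide; 2097 is the example prefix used there.
sample = "300:0.NA/YOUR_NAMING_AUTHORITY"
print(set_naming_authority(sample, "2097"))
```

Keeping a backup copy of the original file before writing the result back is a sensible precaution.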
18
Updating the Handle Prefix
–Edit [dspace]/config/dspace.cfg and update the handle prefix
–A restart of Tomcat will be required
–If items have already been deposited into DSpace their handles will need updating:
[dspace]/bin/update-handle-prefix 123456789 YourHandle
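Conceptually, updating the prefix means rewriting each existing handle from the default prefix to the assigned one while keeping its local identifier. The Python sketch below illustrates that idea on a plain list of strings. It is not the implementation of the update-handle-prefix script, and the function name is hypothetical.

```python
def update_handle_prefix(handles, old_prefix, new_prefix):
    """Return handles with old_prefix swapped for new_prefix; others unchanged."""
    updated = []
    for handle in handles:
        prefix, sep, local_id = handle.partition("/")
        if sep and prefix == old_prefix:
            # Keep the local identifier, change only the prefix.
            updated.append(f"{new_prefix}/{local_id}")
        else:
            updated.append(handle)
    return updated


existing = ["123456789/1", "123456789/42", "2160/568"]
print(update_handle_prefix(existing, "123456789", "2097"))
```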
19
Starting the Handle Server
–Finally start the handle server:
[dspace]/bin/start-handle-server
–A script will be required to automate the starting of the handle server upon a server boot
–Once configured the handles should resolve as the practical demonstrated earlier in this module
20
Workflow scenarios
Scenario 1: Head of research
I want to be able to see everything my researchers deposit for quality control purposes
21
Workflow scenarios
Scenario 2: Repository manager
I want to approve everything that goes into the repository to make sure there are no copyright issues or bad metadata
22
Workflow scenarios
Scenario 3: Cataloguer
I want to be able to see everything my researchers deposit for quality control purposes
23
The three workflows
–DSpace has three workflow steps:
1. Accept/Reject Step
2. Accept/Reject/Edit Metadata Step
3. Edit Metadata Step
–You can use any combination of the three
–Steps are worked through in order
–Which might be used in each of the previous scenarios?
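The two rules on this slide (any combination of steps, always worked through in order) can be sketched in a few lines of Python. The step names come from the slide; the function and the dictionary are hypothetical and do not reflect how DSpace stores its workflow configuration.

```python
WORKFLOW_STEPS = {
    1: "Accept/Reject Step",
    2: "Accept/Reject/Edit Metadata Step",
    3: "Edit Metadata Step",
}


def steps_for_collection(enabled):
    """Return the enabled workflow steps in the order they are worked through."""
    unknown = set(enabled) - set(WORKFLOW_STEPS)
    if unknown:
        raise ValueError(f"unknown workflow steps: {sorted(unknown)}")
    # Any combination is allowed, but steps always run in ascending order.
    return [WORKFLOW_STEPS[number] for number in sorted(set(enabled))]


print(steps_for_collection([3, 1]))
```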
24
RSS feeds
–Site level (all new items)
–Community level (new items in all contained collections)
–Collection level (new items in that collection)
Can be read in modern web browsers
Can be subscribed to in news reader software
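Software can read a feed as easily as a news reader can. The Python sketch below lists the new items in an RSS 2.0 document using only the standard library. The sample feed is invented for illustration (it is not output captured from a DSpace instance), and `new_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Invented sample feed, assuming plain RSS 2.0 without namespaces.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example collection</title>
    <item>
      <title>First deposited item</title>
      <link>http://hdl.handle.net/123456789/1</link>
    </item>
    <item>
      <title>Second deposited item</title>
      <link>http://hdl.handle.net/123456789/2</link>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in new_items(SAMPLE_FEED):
    print(title, link)
```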
25
Alerts
–Created by users
–Created for a collection
–Emails sent each day for new items
–Script must run daily:
[dspace]/bin/sub-daily
26
DSpace statistics
–Collated from DSpace log files
–Reports generated daily (daily and monthly reports)
–http://dspace.example.com/dspace/statistics
  Or via the Administer menu
–Can be private (must be logged in) or public
  In dspace.cfg:
  –report.public = [true|false]
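The report.public setting is an ordinary key = value line, so its value can be checked with a short script. This Python sketch parses configuration text held in a string; the helper names are hypothetical, and treating a missing key as private is an assumption made here, not a statement about the DSpace default.

```python
def read_property(cfg_text, key, default=None):
    """Return the value of a 'key = value' line, ignoring blanks and # comments."""
    for line in cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return default


def reports_are_public(cfg_text):
    # Assumption: an absent key is treated as private (false).
    return read_property(cfg_text, "report.public", "false").lower() == "true"


sample_cfg = """
# statistics reports
report.public = true
"""
print(reports_are_public(sample_cfg))
```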
27
Statistics collected
The following statistics are collected:
–General overview (e.g. number of items archived / number of item views / user logins)
–Archive information (numbers of each type of item)
–Item view counts
–Actions performed
–Search terms used
28
Google Analytics
Google Analytics allows a richer and more detailed suite of statistics:
–Time visitors spent on the site
–Where they came from
–Terms they used in search engines to find items
–The geographic location of visitors
–How many pages they looked at
–Which pages they started and ended their visit on
JSPUI requires a small code change; Manakin has a configurable option.
29
Credits
These slides have been produced re-using The DSpace Course by:
–Stuart Lewis & Chris Yates
–Repository Support Project http://www.rsp.ac.uk/
–Part of the RepositoryNet
–Funded by JISC http://www.jisc.ac.uk/
30
Thank you! Questions?