Published by Rebecca Murphy. Modified over 9 years ago.
1
Building an Institutional Research Repository from the Ground Up: The ARROW Experience
Dr Andrew Treloar
Project Manager, Strategic Information Initiatives & ARROW Technical Architect
Status Snapshot as of September 2004 (pre-Bandicoot)
2
Vacant Lot
3
Context – Global
Increasing focus on content as an institutional asset
An increasing proportion of this content is now born-digital or re-born digital
Wide uptake of software such as DSpace and eprints.org
Open Access scholarship movement gathering strength worldwide
Recent UK House of Commons STC report calling for establishment of institutional research repositories and mandated deposit
4
Context – Australian
Higher Education Information Infrastructure Advisory Committee (HEIIAC) report in Nov 2002 identified need for Research Information Infrastructure
DEST arranged Digital Object Repository Management meeting in Sydney in May 2003
DEST called for RII bids in June 2003
Four successful:
  Australian Digital Theses (ADT)
  Australian Partnership for Sustainable Repositories (APSR)
  Meta Access Management System (MAMS)
  Australian Research Repositories Online to the World (ARROW)
5
Design Brief
6
Requirements – Content Streams
E-Prints
  Pre-prints, postprints, working papers, etc.
Digital theses
  Masters and PhD
Electronic Publishing
  Open-access ejournals
DEST Returns
  Actually, database behind the returns
Non-University Research
  'Scholar in the Garden Shed'
7
Requirements – Content Types
Based on DSpace philosophy:
  Lots of digital material is already lost
  Most digital material is at risk
  Preserving bits is better than nothing
  It is important to capture as much information as possible
  It will be necessary to evaluate cost/benefit trade-offs over time
Decided to divide content into three types:
  Supported
  Known
  Unsupported
Long list of actual types in referenced paper (URL at end)
8
Architectural Drawings
9
Architecture Considerations
Common Repository, because boundaries between Research and Teaching/Learning are very fluid
Series of Content Workflow and Management layers to handle ingest/management of content
Exposure of content in a variety of ways to maximise access
10
ARROW OLAD
11
Building Materials - Foundation
12
Repository
Repository decision determines a number of other aspects of project
  Functionality
  Type of application development
Lots of options available (refer http://www.soros.org/openaccess/software/)
  Version 3 of this report due out soon
Careful examination of alternatives narrowed quickly to focus on DSpace & FEDORA
13
Repository – DSpace
Joint activity between MIT Libraries and Hewlett-Packard to develop a software system that enables institutions to:
  Capture and describe digital works using customized workflow processes
  Provide access to an institution's digital works so users can search and retrieve items in the collection
  Preserve digital works over the long term
Being made available under the BSD open source license to other groups to run as-is, or to modify and extend as needed
Can best be thought of as a general-purpose repository application, with a series of both hard-wired and preferred behaviours
Designed to provide the stable long-term storage needed to house the digital products of MIT faculty and researchers
14
Repository – FEDORA
Not the RedHat FEDORA...
Flexible Extensible Digital Object and Repository Architecture
Joint venture between UVA Library and Cornell CS
Both a software platform and an architecture
Open source, digital object repository system using public APIs exposed as web services
Best thought of as services-mediation infrastructure, rather than an off-the-shelf application
Underlying object-based model
15
Repository – Decision
After lots of due diligence, decided to go with FEDORA:
  better/cleaner underlying architecture (flexible not hierarchical)
  easier to build on top of (APIs exposed as web services)
  designed from ground up as services provider and mediator (not packaged application)
  powerful idea of objects and disseminators (content behaviours)
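The "objects and disseminators" idea on the slide above can be illustrated with a minimal sketch. This is not FEDORA's data model or API: the class names, field names, and the `getAbstract` behaviour are all hypothetical, chosen only to show a digital object that bundles content streams with named behaviours that render them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Datastream:
    """One piece of content held by an object (hypothetical structure)."""
    ds_id: str
    mime_type: str
    content: bytes


@dataclass
class DigitalObject:
    """A digital object: an identifier, its datastreams, and its behaviours."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Each disseminator maps a behaviour name to code that renders the object.
    disseminators: Dict[str, Callable[["DigitalObject"], str]] = field(default_factory=dict)

    def add_datastream(self, datastream: Datastream) -> None:
        self.datastreams[datastream.ds_id] = datastream

    def disseminate(self, behaviour: str) -> str:
        """Run a named behaviour against this object's content."""
        try:
            method = self.disseminators[behaviour]
        except KeyError:
            raise LookupError(f"{self.pid} has no behaviour {behaviour!r}") from None
        return method(self)


def plain_text_abstract(obj: DigitalObject) -> str:
    """Hypothetical behaviour: return the ABSTRACT datastream as text."""
    return obj.datastreams["ABSTRACT"].content.decode("utf-8")


# Illustrative object; the PID and content are made up for this sketch.
thesis = DigitalObject(pid="demo:1")
thesis.add_datastream(Datastream("ABSTRACT", "text/plain", b"A study of repositories."))
thesis.disseminators["getAbstract"] = plain_text_abstract
```

Calling `thesis.disseminate("getAbstract")` runs the bound behaviour rather than handing back raw bytes, which is the sense in which behaviours travel with the content.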
16
Construction Strategy: Sub-Contract or DIY?
Original bid assumed that project would hire and manage development team
ARROW Project Manager (Geoff Payne) realised we could do much better by sub-contracting development work to a company already familiar with FEDORA:
  outsource risk
  save time by avoiding initial learning curve
  partner in way that met ARROW and company needs
  increase attractiveness of FEDORA
  build a sustainable support and enhancement model
17
VTLS the Builder
ARROW entered into contract with VTLS (Blacksburg, VA) to
  acquire VITAL 1.0 (and successor versions)
  extend the functionality of FEDORA, either by contributing back to the core FEDORA code or by writing a series of ARROW-commissioned modules
ARROW-commissioned modules to be open-sourced using the same license as the FEDORA code
VTLS will be able to build products on top of these new ARROW-commissioned modules, but so will anyone else
18
Open-Access Publishing
VTLS won't be writing all the modules
Need module to provide simple OA ejournal publishing
Have decided to use Open Journal Systems (http://www.pkp.ubc.ca/ojs/) from the Public Knowledge Project at UBC
Provides high level of devolved functionality
Still deciding how best to integrate this with rest of ARROW
19
Building Materials - Frame
20
Application Framework
ARROW-commissioned modules will
  call FEDORA API-A (Access) and API-M (Management) web services
  expose themselves as Web Services
Possible that combination of ARROW modules and FEDORA will lead to refactoring of existing APIs into:
  API-A (Access)
  API-S (Search)
  API-M (Management)
  API-W (Workflow)
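The layering described on the slide above, in which a module calls the access and management services and in turn offers a service of its own, might be sketched as follows. Everything here is an assumption made for illustration: `read_object`, `store_object`, `InMemoryRepository`, and `DepositModule` are hypothetical names, not operations of FEDORA's API-A or API-M, and an in-memory dictionary stands in for the web services.

```python
from typing import Dict, Protocol


class AccessAPI(Protocol):
    """Stand-in for an access service; the method name is hypothetical."""

    def read_object(self, pid: str) -> bytes: ...


class ManagementAPI(Protocol):
    """Stand-in for a management service; the method name is hypothetical."""

    def store_object(self, pid: str, content: bytes) -> None: ...


class InMemoryRepository:
    """Toy repository that satisfies both interfaces, for local experiments."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def read_object(self, pid: str) -> bytes:
        return self._objects[pid]

    def store_object(self, pid: str, content: bytes) -> None:
        self._objects[pid] = content


class DepositModule:
    """Hypothetical module: uses both interfaces and exposes one operation."""

    def __init__(self, access: AccessAPI, management: ManagementAPI) -> None:
        self._access = access
        self._management = management

    def deposit(self, pid: str, content: bytes) -> dict:
        # Write through the management interface, then read back through
        # the access interface to confirm what was stored.
        self._management.store_object(pid, content)
        stored = self._access.read_object(pid)
        return {"pid": pid, "bytes_stored": len(stored), "verified": stored == content}
```

Because the module depends only on the two interfaces, the same code could sit in front of a remote service client instead of the toy repository, which is one reason to keep access and management as separate contracts.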
21
FEDORA Development Consortium
Announced at same time as ARROW-VTLS deal
Joint activity of FEDORA, VTLS, ARROW, and others
  partners selected on ability to contribute and resources to make it happen
Rest of 2004 will be spent working out how this might function
Work towards API-W will be used as process testbed
22
Building Materials - Doors and Windows
23
Search and Exposure
Exposure of metadata for OAI-PMH harvesting
  Open Archives Initiative - Protocol for Metadata Harvesting
  Each repository will be an OAI Data Provider
Support for direct searching via SRU/SRW
  Simpler version of Z39.50
Exposure of full text (including derived full text) for spidering by Google and other search engines
Local search gateways at each ARROW site
  http://search.arrow.monash.edu.au/
National Resource Discovery Service offered by NLA
  http://search.arrow.edu.au/
  NLA acting as OAI Service Provider (as well as Data Provider with their non-uni research repository)
Possible RSS feeds later
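As a rough illustration of the two exposure routes on the slide above, the sketch below builds request URLs of the kind a harvester or search client would send. The `verb`, `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the OAI-PMH specification, and `operation`, `version`, `query`, `maximumRecords`, and `startRecord` from SRU. The base URLs and helper function names are placeholders: the slides do not give the actual endpoints of any ARROW repository.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoints, assumed for illustration only.
OAI_BASE_URL = "http://repository.example.edu/oai"
SRU_BASE_URL = "http://repository.example.edu/sru"


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def oai_resume_url(base_url, resumption_token):
    """Build the follow-up request for the next chunk of a long list.

    resumptionToken is an exclusive argument, so it is sent with the verb only.
    """
    params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    return f"{base_url}?{urlencode(params)}"


def sru_search_url(base_url, cql_query, maximum_records=10, start_record=1):
    """Build an SRU searchRetrieve request carrying a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
        "startRecord": start_record,
    }
    return f"{base_url}?{urlencode(params)}"


def query_args(url):
    """Decode a request URL back into its arguments, for inspection."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}
```

A service provider such as the NLA discovery service would issue the ListRecords request repeatedly, following resumption tokens until the list is complete, while an SRU client sends one query per search.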
24
ARROW Branded Services Profile (diagram labels)
ARROW Web Site: Project Information
National Library of Australia, Swinburne, UNSW, Monash
ARROW Repository: Digital Object Storage using Fedora & VITAL
Members only area: Meeting Minutes etc
National Library of Australia
ARROW Resource Discovery Service: Using TeraText to index metadata harvested by OAI-PMH
ARROW Open Access Journal Publishing System: Using OJS from Public Knowledge Project
Internet Search Engines: Capture text exposed by ARROW Repositories
Internet
25
Building Site
26
State of Development
Funding commenced in February
  A$3.66 × 10^6 over 3 years
Project Manager appointed in February
Contract with VTLS signed in June
FEDORA Phase 2 funding secured in June
  US$1.4 × 10^6 over 3 years
Anticipated delivery of ARROW Phase 1 (Bandicoot) functionality in September
Anticipated delivery of ARROW Phase 2 (Bilby) functionality in February 2005
27
Phased Deliverables DEST Metadata Collections Copyright support Object validation Search engine support Still Images PDF RTF XHTML SRU/SRW Web-based XML Editor SMIL Audio Video DEST Reporting Multiple Object Viewing and Editing
28
Open House?
29
What we've learned already
All IT projects involve People, Processes and Technology. In addition, this one has a heavy focus on Content.
These proportions are going to change over time:

Component    2004   2005   2006
People         5%    20%    35%
Processes     10%    20%    10%
Technology    75%    20%     5%
Content       10%    40%    50%
30
ARROW Availability
ARROW partners (NLA, Monash, UNSW, Swinburne) will be testing and refining beta software this year and early next year
Hope to be able to offer ARROW more broadly around mid-2005
http://arrow.edu.au/ will be regularly updated with news and more information
31
Questions?
Geoff.Payne@lib.monash.edu.au
  Project Manager
Andrew.Treloar@its.monash.edu.au
  Technical Architect
http://arrow.edu.au/
  Project web site
http://andrew.treloar.net/research/publications/ausweb04/
  Link to updated version of AusWeb04 paper about development of ARROW architecture