An Institutional Repository for the University of Pretoria. Ina Smith, DSpace Platform Manager, University of Pretoria. 20 May 2005, DITCHE Conference & Workshop, Nelson Mandela Metropolitan University
Institutional Repository Set of services Management & dissemination of digital materials Organisational commitment Stewardship Long-term preservation Organisation, access, distribution
Approaches Digitally born resources Digitisation
“Champions” Africana & Special Collections Architecture Scholarly Publications Veterinary Science
Evaluation of software EPrints Greenstone DSpace Fedora MyCoRe CDSware ARNO Innopac Millennium WebCT
Why Open Source? Promote collaboration Promote knowledge sharing Benefit all – not only the vendor Belongs to all – lots of support No marketing
Why Open Source? Flexible Openness and creative thinking Open for scrutiny Searchable & retrievable via the WWW Used by institutions with minimal resources Uses world standards and open standards
Why DSpace? MIT & Hewlett Packard Archiving & permanent (long-term) preservation Easy retrieval Scalability Supports full text Good separation of data and metadata Supports OAI
Why DSpace? Availability of code Integration into portal Well defined workflow Lucene Search Engine Taxonomic Structure UP IT Architecture Open source GAP Authentication
DSpace Architecture Three layers with components
–Storage layer: physical storage of metadata and content
–Business logic layer: manages content, e-people, authorisation, workflow
–Application layer: communicates with the outside, e.g. Web User Interface, Metadata Harvesting
National & International DSpace Support Informative & active web-site 100+ instances running Active mailing lists Enthusiastic community of developers DSpace wiki – sense of community, visibility
DSpace Open Source Software License Open Software License Open Source Initiative (OSI): “a non-profit corporation dedicated to managing and promoting the Open Source Definition for the good of the community, specifically through the OSI Certified Open Source Software certification mark and program.”
DSpace for Digital Archiving Institutional Repositories Learning Object Repositories E-theses Electronic Records Management Digital Preservation Publications
System Requirements Linux Sun Java Compiler Tomcat (Java servlet engine) Ant (Java build tool: generates .jar or .war files) PostgreSQL (Relational Database Management System) Mail Server
About DSpace DSpace Stable May 2005 Version.major.minor Minor releases: bug fixes, no migration of data Major releases: changes to database, migration needed Versions: restructuring of architecture, significant data migration
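A minimal sketch of how the Version.major.minor scheme above might be read when planning an upgrade. The function names and the version strings in the usage lines are hypothetical illustrations, not part of DSpace.

```python
def parse_version(text):
    """Split a 'version.major.minor' string into three integers."""
    version, major, minor = (int(part) for part in text.split("."))
    return version, major, minor


def upgrade_impact(current, target):
    """Classify an upgrade using the release rules listed on the slide."""
    cur = parse_version(current)
    new = parse_version(target)
    if new[0] != cur[0]:
        return "restructured architecture, significant data migration"
    if new[1] != cur[1]:
        return "database changes, migration needed"
    if new[2] != cur[2]:
        return "bug fixes only, no data migration"
    return "same release"


# Hypothetical version numbers, for illustration only
print(upgrade_impact("1.2.1", "1.2.2"))
print(upgrade_impact("1.2.2", "1.3.0"))
```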
Default Look & Feel
UPSpace Look & Feel
Information Model Top-level Community (e.g. Faculty) Sub-Community (e.g. Department) Collections (e.g. Pretoriana, Freedom Struggle)
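The hierarchy above can be sketched as a small data structure. This is an illustration only, not DSpace's own classes; the names "Example Faculty" and "Example Department" are hypothetical, while the two collection names come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Collection:
    name: str
    items: List[str] = field(default_factory=list)  # item titles, for illustration


@dataclass
class Community:
    name: str
    sub_communities: List["Community"] = field(default_factory=list)
    collections: List[Collection] = field(default_factory=list)

    def all_collections(self):
        """Return every collection in this community and its sub-communities."""
        found = list(self.collections)
        for sub in self.sub_communities:
            found.extend(sub.all_collections())
        return found


# Top-level community (e.g. a faculty) holding a sub-community (e.g. a department)
faculty = Community("Example Faculty")
department = Community("Example Department")
department.collections.append(Collection("Pretoriana"))
faculty.sub_communities.append(department)
faculty.collections.append(Collection("Freedom Struggle"))
print([c.name for c in faculty.all_collections()])
```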
Top-level Communities
Top-level Community
Collections within Top-level Community
Items in DSpace Articles (preprints & postprints) Technical reports Working papers Conference reports E-theses E-books
Items in DSpace Datasets: statistical, geospatial, etc. Images: visual, scientific, etc. Audio files Video files Learning objects Reformatted digital library collections
An Item in DSpace Metadata Files (Bitstreams) with digital content Many formats are accepted Largest file 500 MB video (Infosys) File formats e.g. xml, tiff, jpeg, wav, mov, qt
Thumbnails
Roles in DSpace DSpace Administrator Collection Administrators Submitters Reviewers Metadata Editors Approvers
Permission policies in DSpace Set permission policies for various collections (read, remove, add, write) E-person can be removed from Collection Items in Collections can only be viewed by authenticated persons
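A minimal sketch of per-collection permission policies as described above, assuming a simple mapping from e-person to permitted actions. The class and method names are hypothetical and do not reflect DSpace's internal authorisation API.

```python
ACTIONS = {"read", "remove", "add", "write"}  # actions named on the slide


class CollectionPolicy:
    """Hypothetical per-collection policy table."""

    def __init__(self, name):
        self.name = name
        self._grants = {}  # e-person -> set of permitted actions

    def grant(self, eperson, action):
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self._grants.setdefault(eperson, set()).add(action)

    def remove_eperson(self, eperson):
        """Remove an e-person from the collection, dropping all their grants."""
        self._grants.pop(eperson, None)

    def is_allowed(self, eperson, action):
        # An unauthenticated visitor is represented as None and is never allowed
        if eperson is None:
            return False
        return action in self._grants.get(eperson, set())
```

With this sketch, items in a restricted collection are visible only when `is_allowed(eperson, "read")` is true for an authenticated e-person.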
Workflow Submit Accept Reject Edit Metadata Commit DSpace
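The workflow steps above can be sketched as a small state machine. The order of the steps is an assumption read from the slide (submit, then accept or reject, then optional metadata editing, then commit to DSpace); the state and action names are hypothetical.

```python
# (current state, action) -> next state; an assumed reading of the slide
TRANSITIONS = {
    ("submitted", "accept"): "accepted",
    ("submitted", "reject"): "rejected",
    ("accepted", "edit_metadata"): "metadata_edited",
    ("accepted", "commit"): "archived",
    ("metadata_edited", "commit"): "archived",
}


def advance(state, action):
    """Return the next workflow state, or raise if the step is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} an item that is {state}") from None


state = "submitted"
for action in ("accept", "edit_metadata", "commit"):
    state = advance(state, action)
print(state)  # archived
```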
Submission Default Submission Interface Community/ Collection Specific Interface Mandatory fields: Title, Language Distributed input
About Metadata Descriptive information about an object or resource E.g. title, subjects, keywords, author/s, date
Metadata in DSpace Qualified DC Metadata 15 Elements + Qualifiers Edit DC Registry Add / delete elements/qualifiers (not recommended) Mandatory / Optional fields
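A minimal sketch of qualified Dublin Core field names in the form dc.element.qualifier, with a check for the mandatory Title and Language fields mentioned on the Submission slide. The helper names and the sample record values are hypothetical; the 15 element names are those of the Dublin Core element set.

```python
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}
MANDATORY = {"title", "language"}  # mandatory fields per the Submission slide


def split_field(name):
    """Split 'dc.element' or 'dc.element.qualifier' into (element, qualifier)."""
    parts = name.split(".")
    if parts[0] != "dc" or len(parts) not in (2, 3):
        raise ValueError(f"not a Dublin Core field name: {name}")
    element = parts[1]
    if element not in DC_ELEMENTS:
        raise ValueError(f"unknown Dublin Core element: {element}")
    qualifier = parts[2] if len(parts) == 3 else None
    return element, qualifier


def missing_mandatory(record):
    """List mandatory elements that have no non-empty value in the record."""
    present = {split_field(name)[0] for name, value in record.items() if value}
    return sorted(MANDATORY - present)


# Hypothetical record, for illustration only
record = {
    "dc.title": "Example title",
    "dc.contributor.author": "Example, Author",
    "dc.date.issued": "2005",
}
print(missing_mandatory(record))  # ['language']
```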
Harvesting Metadata OAISTER Search Engines e.g. Google
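Harvesters of this kind typically collect metadata over OAI-PMH. A minimal sketch of building a ListRecords request follows; the base URL is a hypothetical placeholder, not the UPSpace address, while `verb`, `metadataPrefix`, `set` and the `oai_dc` prefix are standard OAI-PMH names.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optionally restrict harvesting to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, for illustration only
print(list_records_url("https://repository.example.org/oai/request"))
```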
Bitstreams Bitstream Registry
Persistent Identifiers CNRI Handle System Valid citations Unique handle for each item A handle has two parts separated by a slash: a prefix = naming authority assigned by the Handle System to e.g. UP; a suffix, e.g. 171 = unique local name assigned to an item in the repository
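A minimal sketch of splitting a handle into its naming authority and local name, and of forming a citable link through the public Handle System proxy at hdl.handle.net. The prefix 12345 is a hypothetical placeholder, not UP's assigned naming authority; 171 is the local name from the slide.

```python
def split_handle(handle):
    """Split 'prefix/local-name' into (naming authority, local name)."""
    prefix, separator, local_name = handle.partition("/")
    if not separator or not prefix or not local_name:
        raise ValueError(f"not a handle: {handle}")
    return prefix, local_name


def resolver_url(handle):
    """Return a link that resolves the handle via the Handle System proxy."""
    split_handle(handle)  # validate before building the link
    return "http://hdl.handle.net/" + handle


# Placeholder prefix, for illustration only
print(split_handle("12345/171"))
print(resolver_url("12345/171"))
```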
Lucene Search Engine Simple, high performance, powerful search engine Open source Features
–Boolean searches
–Phrase and proximity searching
–Relevancy ranking
–Field searching
–Advanced and basic search
–Browse by Community/Collection, Author, Title, Date
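The Boolean, phrase, proximity and field searches listed above can be illustrated with Lucene's classic query syntax. The sketch below only builds query strings; the helper names and the field names `title` and `author` are hypothetical, and the fields actually indexed depend on the installation.

```python
def phrase(text):
    """Exact phrase search: "institutional repository"."""
    return f'"{text}"'


def proximity(text, distance):
    """Words within `distance` positions of each other: "a b"~5."""
    return f'"{text}"~{distance}'


def field(name, term):
    """Restrict a term or phrase to one indexed field: title:dspace."""
    return f"{name}:{term}"


def boolean(operator, *clauses):
    """Combine clauses with AND, OR or NOT, grouped in parentheses."""
    return "(" + f" {operator} ".join(clauses) + ")"


query = boolean(
    "AND",
    field("title", phrase("institutional repository")),
    field("author", "smith"),
)
print(query)  # (title:"institutional repository" AND author:smith)
print(proximity("digital preservation", 5))  # "digital preservation"~5
```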
Managing the Project Evaluation Proposal Needs Analysis Design Development Implementation Evaluation
DSpace on UPDev DSpace on UP Q&A DSpace on Portal Server Evaluation Instrument Program Usability Evaluation Instrument Design
Reflection & Communication DSpace Listserv DSpace web Update documents & policies Meetings Community of Practice
Marketing Meetings UP Library Community Faculties & Departments UP Dept. of Marketing UPSpace E-Newsletter New staff orientation program Online brochure Statistics to Faculties Support from UP Management
Training Information Specialists Cataloguers End-users
User Support Help landline 8/5 Online help 24/7 –Viewlets e.g. Macromedia Captivate, PPT, Open Office Tutorial User policy Pamphlet
Policy Issues Submission Communities and Collections Responsibilities & rights Licensing, copyright, privacy, intellectual property Preservation support Withdrawals Workflow System back-up and availability
Copyright, Rights & Licensing License stored with each item Submitter “Grants License” Default DSpace License Community/ Collection License Copyright note when creating Community DC Element Rights DSpace StyleSheet: © University of Pretoria
IT Support & Back-up Dept. of Information Technology Back-up report DSpace Master .tiff images
Limitations Limited resources e.g. scanner Java expertise
The way forward … Federated Search Engine Emerging IR software Long-term budget Migration of digital objects Repository size Bandwidth Collaboration e.g. DISA Collaboration between institutions
The way forward … Register with
–Open Archives Initiative
–DSpace Wiki
–OAISTER
Conclusions Great “digital” libraries will last Comfortable with change, open source Traditional library activities Curatorship Collection Development Quality Assurance etc. The WWW can be a library …
Success … “Courage to explore Knowledge to exceed Technology to excel”
Thank you! Tel.: Many thanks to TENET for this opportunity!