DSpace Basic Tutorial Stuart Lewis & Chris Yates
Information
Details:
–Requires tutorial CD
–DSpace CD includes DSpace 1.5 alpha
CD and workbook created by:
–Chris Yates
–Stuart Lewis
Tutorial created by:
–Repositories Support Project
Funded by:
–JISC as part of the RepositoryNet
Contents 1.Introduction to DSpace 2.The tutorial CD 3.DSpace technical architecture 4.Users and groups 5.Item structure 6.Metadata and item input, workflows 7.Search and browse 8.Import / export / harvest
Introduction to DSpace “DSpace captures your data in any format in text, video, audio, and data. It distributes it over the web. It indexes your work, so users can search and retrieve your items. It preserves your digital work over the long term. DSpace provides a way to manage your research materials and publications in a professionally maintained repository to give them greater visibility and accessibility over time.” –
*Introduction to the tutorial CD
Intended to be used with the tutorial DSpace version (includes 1.5 alpha)
Shouldn’t affect your PC
Installs no software
Can be reused
DOES NOT SAVE DATA!
Disclaimer…
DSpace Technical Architecture
Written in Java
–Can be run on any platform that supports Java
Most installations are on Unix (Linux* / Solaris)
Runs on Windows / Mac OS X
–Sun JDK (not GNU)
–1.4 for <= version
–1.5* for >= version 1.5
DSpace Technical Architecture Database: –Same machine, or database server Postgres* Oracle Web application server –Tomcat* –Jetty –Other
DSpace file layout Download –[dspace-src] Edit config/dspace.cfg Build Installed –[dspace] (often /dspace/) [dspace]/assetstore/ [dspace]/upload/ [dspace]/logs/ [dspace]/bin/ [dspace]/search/
*Create the database Create a database user –Who will own the database –Called dspace Create the database –Called dspace –UNICODE encoding –Owned by the dspace database user
Create the database
Create a database user
–Double click on the ‘Terminal’ icon
–‘su - postgres’ (password is ‘postgres’)
–‘createuser -U postgres -d -P dspace’
–password is ‘postgres’
Create the database
–‘createdb -U dspace -E UNICODE dspace’
*Build DSpace DSpace needs to be compiled Uses ‘ant’ build system Inserts default data into the database –Table structures –Dublin Core metadata schema –Bitstream formats Builds package for the web server Configuration can be changed
Build DSpace ‘cd /dspace142-src/’ ‘gedit config/dspace.cfg’ –Change dspace.name to your name –Save and quit ‘ant fresh_install’ ‘chmod 777 /dspace142/upload’ ‘chmod 777 /dspace142/assetstore’
*Deploy to the web server
.war files are packaged applications
Tomcat is a Java web application server
–Tomcat automatically unpacks .war files
Two web applications
–DSpace & DSpace OAI interface
Copy .war files to Tomcat directory
Start Tomcat
Deploy to the web server ‘cp build/*.war /var/lib/tomcat5.5/’ ‘sudo /etc/init.d/tomcat5.5 start’ Load Firefox –Go to
DSpace users and groups Administrator –bin/create-administrator Create more users: –Web user interface –Administrator Other authentication methods: –LDAP (LDAP or Active Directory) –Pluggable and stackable authentication
DSpace users and groups Groups –Members can be users or other groups –E.g. Dept group made up of research-group groups User defined or automatically generated for collections
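Nested group membership can be pictured with a small sketch. The group and user names below are hypothetical, and this is not DSpace's own API; it only illustrates how a department group built from research-group groups might resolve to a set of users.

```python
# Hypothetical data: each group lists its direct members, which may be
# users (plain names) or other groups (names that are keys of this dict).
GROUPS = {
    "dept-computing": ["group-ai", "group-hci", "carol"],
    "group-ai": ["alice"],
    "group-hci": ["bob", "alice"],
}


def resolve_users(group, groups=GROUPS, _seen=None):
    """Return the set of users in `group`, directly or via sub-groups."""
    seen = set() if _seen is None else _seen
    if group in seen:  # guard against groups that contain each other
        return set()
    seen.add(group)
    users = set()
    for member in groups.get(group, []):
        if member in groups:
            users |= resolve_users(member, groups, seen)
        else:
            users.add(member)
    return users


print(sorted(resolve_users("dept-computing")))  # ['alice', 'bob', 'carol']
```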
*Create DSpace users Create an administrator Log in Log out Create a normal user Modify a group
Create DSpace users Create first administrator –‘bin/create-administrator’ –Answer questions Create another user –Administrator pages, ‘E-People’ –‘Add EPerson’ Promote new user to administrator –Administrator pages, ‘Groups’ –Edit ‘Administrator’ group
*Communities & collections Communities –Often used to represent organisational units –Can have sub-communities and collections –Can be branded (logo) and have own policies Collections –Hold items
Communities & collections ‘Communities / Collections’ link ‘Create Top-Level Community’ –Enter name, short description –Press ‘Create’ button Create collection –Enter name, short description –Add the Administrator group to submitters
DSpace items Metadata –One or more metadata schemas –User-entered –System-generated (e.g. accessioned date) Files –In bundles –Special bundles (e.g. extracted text, licences)
DSpace Items Items can be mapped across collections –E.g. appear in central and departmental e-theses collections –Similar to a file system symbolic link Submissions controlled by input forms –config/input-forms.xml –Input forms Controlled vocabularies
DSpace Items input-forms.xml
input-forms.xml … … …
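input-forms.xml (structure of a field)
–schema
–element
–qualifier
–repeatable: true/false
–label: Text label
–input type: name/onebox/date/twobox/textarea/dropdown/qualdrop_value
–hint: Expanded hint
–required: Warning to show if not entered
input-forms.xml (value pairs)
–English: en
–Welsh: cy
input-forms.xml (author field)
–schema: dc
–element: contributor
–qualifier: author
–repeatable: true
–label: Authors
–input type: name
–hint: Enter the names of the authors of this item below.
input-forms.xml (title field)
–schema: dc
–element: title
–repeatable: false
–label: Title
–input type: name
–hint: Enter the main title of the item.
–required: You must enter a main title for this item.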
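As a rough illustration of how one field definition fits together, the sketch below builds the author field with Python's standard XML library. The tag names (`field`, `dc-schema`, `dc-element`, `dc-qualifier`, `repeatable`, `label`, `input-type`, `hint`, `required`) are not visible in the slide text and are assumed here from the usual layout of input-forms.xml, so check them against the file shipped with your DSpace version.

```python
import xml.etree.ElementTree as ET


def make_field(schema, element, qualifier, repeatable,
               label, input_type, hint, required=""):
    """Build one <field> element (tag names are assumptions, see above)."""
    field = ET.Element("field")
    values = [
        ("dc-schema", schema),
        ("dc-element", element),
        ("dc-qualifier", qualifier),
        ("repeatable", "true" if repeatable else "false"),
        ("label", label),
        ("input-type", input_type),
        ("hint", hint),
        ("required", required),  # warning text; empty means optional
    ]
    for tag, text in values:
        ET.SubElement(field, tag).text = text
    return field


# Values taken from the author field on the slide.
author = make_field(
    "dc", "contributor", "author", True, "Authors", "name",
    "Enter the names of the authors of this item below.",
)
print(ET.tostring(author, encoding="unicode"))
```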
*Deposit an item Choose collection Enter metadata Upload file Confirm details Agree to the licence
Deposit an item Enter the collection you created –Tick ‘The item has been published or publicly distributed before’ - asks extra questions about the publishing (i.e. date / publisher) –Enter metadata –Upload file (‘/dspace-docs/RSP.pdf’) –Agree to licence –Submit item
*Create a workflow
Three workflow steps
–Accept/reject step (e.g. Head of research: “Should item be included in the repository?”)
–Accept/reject/edit metadata step (e.g. Repository manager)
–Edit metadata step (e.g. Librarian)
Create a workflow Create new collection –Tick ‘This submission will include an accept/reject/edit metadata step’ –Enter name and short description –Add ‘Administrator’ group to workflow Submit to the new collection Go to ‘My DSpace’ to enter the workflow –‘Edit Metadata’ –‘Approve’
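The workflow steps can be modelled as a tiny state machine. This is a simplified sketch, not DSpace code: the step numbers, action names and the `run_workflow` function are hypothetical, and it assumes an item is archived once every enabled step has been passed without a rejection.

```python
# Actions permitted at each of the three workflow steps.
WORKFLOW_STEPS = {
    1: {"accept", "reject"},                   # e.g. head of research
    2: {"accept", "reject", "edit_metadata"},  # e.g. repository manager
    3: {"edit_metadata"},                      # e.g. librarian
}


def run_workflow(enabled_steps, decisions):
    """Walk an item through the enabled steps in order.

    `decisions` maps a step number to the action its reviewer took.
    Returns "rejected" as soon as a step rejects, else "archived".
    """
    for step in sorted(enabled_steps):
        action = decisions[step]
        if action not in WORKFLOW_STEPS[step]:
            raise ValueError(f"action {action!r} not allowed at step {step}")
        if action == "reject":
            return "rejected"
    return "archived"


# A collection with only the accept/reject/edit metadata step enabled.
print(run_workflow({2}, {2: "accept"}))  # archived
```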
Search and browse Browse –By: Author / title / date –Database driven –Always up to date Search –Lucene search engine –Define fields to index in dspace.cfg –Full texts –Not always up to date
*Search system initialisation Build indexes –Index metadata Extract from database –Index full-texts Extract from PDF/Doc files Extra MediaFilters can be written
Search system initialisation Search ‘Aberystwyth’ –No results Run: –‘bin/filter-media’ Extract full texts (create thumbnails) Build indexes Search ‘Aberystwyth’ –See results!
*Scheduled background jobs
filter-media
–Extract texts and build indexes
sub-daily
–subscriptions
checker
–checks bitstream checksums
stat-*
–statistics
*RSS feeds and thumbnails Configured in dspace.cfg RSS feeds: –webui.feed.enable = [false|true] –webui.feed.localresolve = [false|true] Thumbnails: –webui.item.thumbnail.show = [true|false] –webui.browse.thumbnail.show = [false|true]
RSS feeds and thumbnails ‘gedit /dspace142/config/dspace.cfg’ –Set webui.feed.enable to true –Set webui.feed.localresolve to true ‘sudo /etc/init.d/tomcat5.5 restart’ Upload new item with PNG file –Upload png from /home/dspace/examples/ –‘/dspace142/bin/filter-media’ –See thumbnail
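If you prefer to script the change rather than edit the file by hand, something along these lines could work. It is a minimal sketch that treats dspace.cfg as simple `key = value` lines; the `set_property` helper is hypothetical and operates on a string, so nothing here touches a real configuration file.

```python
import re


def set_property(cfg_text, key, value):
    """Return cfg_text with `key = value` set.

    Replaces the first uncommented line for `key`, or appends a new line
    if the key is not present.
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key} = {value}"
    if pattern.search(cfg_text):
        # A function replacement avoids backslash handling in `value`.
        return pattern.sub(lambda match: new_line, cfg_text, count=1)
    if cfg_text and not cfg_text.endswith("\n"):
        cfg_text += "\n"
    return cfg_text + new_line + "\n"


sample = "webui.feed.enable = false\nwebui.feed.localresolve = false\n"
updated = set_property(sample, "webui.feed.enable", "true")
updated = set_property(updated, "webui.feed.localresolve", "true")
print(updated)
```

As on the slide, the web application still needs a restart before the new values take effect.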
Import / export See docs Bulk import command line tool –Imports one item per directory –Multiple files / metadata file / contents file Bulk exporter –Writes same file format –Adds file containing handle (for re-import)
Import / export
archive_directory/
  item_000/
    dublin_core.xml -- qualified DC metadata
    contents -- one line per filename
    file_1.doc -- files to be added
    file_2.pdf
[dspace]/bin/dsrun org.dspace.app.itemimport.ItemImport --add --eperson= --collection=collectionID --source=items_dir --mapfile=mapfile
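The directory layout above can be generated with a short script. This is a sketch under assumptions: the `write_item` helper and the sample metadata are hypothetical, and the `dcvalue` element with `element` and `qualifier` attributes (using `none` for an unqualified field) is assumed from the usual dublin_core.xml format, so compare the output with an exported item before importing.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_item(archive_dir, item_name, metadata, files):
    """Create one item directory for the bulk importer.

    metadata: list of (element, qualifier, value) tuples
    files:    dict mapping a filename to its content as bytes
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml: one <dcvalue> per metadata value.
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(
            root, "dcvalue", element=element, qualifier=qualifier
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"),
        encoding="utf-8",
        xml_declaration=True,
    )

    # The files themselves, plus `contents` with one filename per line.
    for name, data in files.items():
        (item_dir / name).write_bytes(data)
    listing = "".join(f"{name}\n" for name in files)
    (item_dir / "contents").write_text(listing, encoding="utf-8")
    return item_dir


with tempfile.TemporaryDirectory() as tmp:
    item = write_item(
        tmp,
        "item_000",
        [("title", "none", "An example item"),
         ("contributor", "author", "Example, Author")],
        {"file_1.doc": b"placeholder", "file_2.pdf": b"placeholder"},
    )
    print(sorted(path.name for path in item.iterdir()))
```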
Harvesting / OAI-PMH
OAI-PMH interface
–Separate web application
–/dspace-oai/
–/dspace-oai/request?verb=
–/dspace-oai/request?verb=Identify
–/dspace-oai/request?verb=ListSets
–/dspace-oai/request?verb=GetRecord
–/dspace-oai/request?verb=ListIdentifiers
–/dspace-oai/request?verb=ListMetadataFormats
–/dspace-oai/request?verb=ListRecords
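The request URLs can be assembled programmatically. In this sketch the base URL `http://localhost:8080` and the `oai_request_url` helper are assumptions for illustration; `metadataPrefix=oai_dc` is the Dublin Core format that the OAI-PMH protocol requires every repository to support.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH, as listed on the slide.
OAI_VERBS = {
    "Identify", "ListSets", "GetRecord",
    "ListIdentifiers", "ListMetadataFormats", "ListRecords",
}


def oai_request_url(base_url, verb, **params):
    """Build a request URL for the DSpace OAI-PMH web application."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    query = urlencode({"verb": verb, **params})
    return f"{base_url.rstrip('/')}/dspace-oai/request?{query}"


# Hypothetical base URL; substitute your own server.
print(oai_request_url("http://localhost:8080", "Identify"))
print(oai_request_url("http://localhost:8080", "ListRecords",
                      metadataPrefix="oai_dc"))
```

Note that `ListRecords`, `ListIdentifiers` and `GetRecord` need a `metadataPrefix` argument, and `GetRecord` also needs an `identifier`.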
The end… Incomplete –Lots, lots more! –dspace-tech list Advanced tutorial this afternoon –Or surgery / open discussion / demos