Published by Erika Ford. Modified over 9 years ago.
1
Data Wrangling at Rice University
Denis Galvin, Rice University
MetaArchive Annual Membership Meeting, Houston, Texas
2
ETDs at Rice
DSpace
Collection in a database driven by programming
42,581 G
Brief and Full records
3
ETD Structure
Brief: http://scholarship.rice.edu/handle/1911/13401
Full: http://scholarship.rice.edu/handle/1911/13401?show=full
PDFs: http://scholarship.rice.edu/bitstream/handle/1911/13401/1338793.PDF?sequence=1
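The three URL forms on this slide follow a regular pattern, which can be sketched in Python. This is only an illustration: the function name `etd_urls` and its parameters are made up here, and the pattern is inferred from the single example handle shown on the slide rather than from any DSpace documentation.

```python
from urllib.parse import urlencode

# Host taken from the slide; everything else in this sketch is illustrative.
BASE = "http://scholarship.rice.edu"


def etd_urls(handle, bitstream_name, sequence=1):
    """Return the brief, full, and PDF URLs for one ETD handle.

    `handle` looks like "1911/13401"; `bitstream_name` is the PDF file name.
    """
    brief = f"{BASE}/handle/{handle}"
    full = f"{brief}?{urlencode({'show': 'full'})}"
    pdf = (
        f"{BASE}/bitstream/handle/{handle}/{bitstream_name}"
        f"?{urlencode({'sequence': sequence})}"
    )
    return {"brief": brief, "full": full, "pdf": pdf}


urls = etd_urls("1911/13401", "1338793.PDF")
for kind, url in urls.items():
    print(kind, url)
```

Called with the handle and file name from the slide, the sketch reproduces the three URLs listed above.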
4
Testing
All testing done on CentOS using VMware
Plugin tool testing
Run One Daemon
Copying other sites' plugins
5
Manifest Page
6
Dublin Core
request?verb=ListRecords&metadataPrefix=oai_dc&set=hdl_1911_8299
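A `ListRecords` request with `metadataPrefix=oai_dc` returns Dublin Core records wrapped in an OAI-PMH envelope. The sketch below shows one way to read such a response with Python's standard library. The sample XML is a made-up placeholder record, not data from the Rice repository, and `parse_records` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core namespace URIs.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Placeholder response: identifier, title, and creator are invented.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1911/00000</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Placeholder thesis title</dc:title>
          <dc:creator>Placeholder, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract identifier, titles, and creators from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in rec.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in rec.iterfind(".//dc:creator", NS)],
        })
    return records


records = parse_records(SAMPLE)
print(records)
```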
7
Sub-Manifest Page
Links to ETDs within DSpace
8
Plugin Configuration parameters:
Base URL
For the sub-manifest pages: Part (integer)
9
Crawl Rules
10
Crawl rules explained
Include master manifest page:
Include sub-manifest page:
Include items under /bitstream
Include OAI-PMH link
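The include rules listed on this slide amount to an ordered list of URL patterns. The Python sketch below illustrates that idea only; it is not LOCKSS plugin syntax. The manifest and sub-manifest paths (`manifest.html`, `manifest-part-<n>.html`) are hypothetical, since the slide does not show them, while the `/bitstream` and `dspace-oai/request` prefixes come from the URLs in this deck.

```python
import re

BASE_URL = "http://scholarship.rice.edu/"
_B = re.escape(BASE_URL)

# (pattern, include?) pairs, checked in order; the first match wins.
RULES = [
    (re.compile(r"^" + _B + r"manifest\.html$"), True),            # hypothetical master manifest path
    (re.compile(r"^" + _B + r"manifest-part-\d+\.html$"), True),   # hypothetical sub-manifest path
    (re.compile(r"^" + _B + r"bitstream/"), True),                 # items under /bitstream
    (re.compile(r"^" + _B + r"dspace-oai/request\?"), True),       # OAI-PMH link
]


def should_include(url):
    """Return True if the first matching rule includes the URL; exclude by default."""
    for pattern, include in RULES:
        if pattern.search(url):
            return include
    return False


print(should_include(BASE_URL + "bitstream/handle/1911/13401/1338793.PDF?sequence=1"))
print(should_include(BASE_URL + "browse"))
```

Excluding by default keeps the crawl confined to the manifest pages and the content they link to.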
11
Crawl rules explained
Include full record
OAI-PMH link on manifest master
Pulls in Dublin Core
http://scholarship.rice.edu/dspace-oai/request?verb=ListRecords&metadataPrefix=oai_dc&set=hdl_1911_8299
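The OAI-PMH link is an ordinary `ListRecords` request with three query parameters. A minimal sketch of assembling it in Python follows; the helper name `list_records_url` is invented for this example, and the endpoint and set name are the ones shown on the slide.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for one set."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return f"{base_url}?{query}"


url = list_records_url(
    "http://scholarship.rice.edu/dspace-oai/request",  # endpoint from the slide
    "hdl_1911_8299",                                   # set name from the slide
)
print(url)
```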
13
Collection Sizes
Recommended AU size between 1G and 10G
5 AUs between 7G and 10G
Create new AUs as the collection grows
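One way to keep AUs within the recommended size is to fill each AU in order until the next item would push it over the limit, then start a new one. The sketch below assumes "G" means GiB and uses invented item names and sizes; it is not how the Rice AUs were actually divided, which the slides do not describe.

```python
GIB = 1024 ** 3
MAX_AU_BYTES = 10 * GIB  # upper end of the recommended range (assumes G = GiB)


def split_into_aus(item_sizes, max_bytes=MAX_AU_BYTES):
    """Group (item_id, size_in_bytes) pairs, in order, into AUs of at most max_bytes.

    An item larger than max_bytes ends up alone in its own AU.
    """
    aus, current, current_size = [], [], 0
    for item_id, size in item_sizes:
        # Close the current AU if adding this item would exceed the limit.
        if current and current_size + size > max_bytes:
            aus.append(current)
            current, current_size = [], 0
        current.append(item_id)
        current_size += size
    if current:
        aus.append(current)
    return aus


# Invented sizes, for illustration only.
items = [("a", 6 * GIB), ("b", 3 * GIB), ("c", 4 * GIB), ("d", 5 * GIB)]
aus = split_into_aus(items)
print(aus)
```

Because items are appended in order, new material added as the collection grows only ever extends the last AU or opens a new one, leaving earlier AUs unchanged.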
14
Tips
Don’t trust testing with the plugin tool
Read the documentation
Test with Run One Daemon
Test on the caches
Use expert mode to write the plugin
15
Questions?