Data Wrangling at Rice University Denis Galvin Rice University MetaArchive Annual Membership Meeting Houston Texas.


1 Data Wrangling at Rice University Denis Galvin Rice University MetaArchive Annual Membership Meeting Houston Texas

2 ETDs at Rice DSpace Collection in a database driven by programming 42,581 G Brief and Full records

3 ETD Structure Brief: http://scholarship.rice.edu/handle/1911/13401 Full: http://scholarship.rice.edu/handle/1911/13401?show=full PDFs: http://scholarship.rice.edu/bitstream/handle/1911/13401/1338793.PDF?sequence=1
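The three URL shapes on this slide can be derived from a single handle. The following is a minimal Python sketch, not part of the original slides; the function name `etd_urls` and its parameters are hypothetical, and only the host, the handle 1911/13401, and the PDF file name come from the slide.

```python
from urllib.parse import urlencode

BASE_URL = "http://scholarship.rice.edu"  # host as shown on the slide


def etd_urls(handle, bitstream_name, sequence=1):
    """Return the brief, full, and PDF URLs for one ETD handle.

    The function and parameter names are illustrative assumptions;
    only the URL layout is taken from the slide.
    """
    brief = f"{BASE_URL}/handle/{handle}"
    full = f"{brief}?{urlencode({'show': 'full'})}"
    pdf = (
        f"{BASE_URL}/bitstream/handle/{handle}/{bitstream_name}"
        f"?{urlencode({'sequence': sequence})}"
    )
    return {"brief": brief, "full": full, "pdf": pdf}


urls = etd_urls("1911/13401", "1338793.PDF")
for kind, url in urls.items():
    print(kind, url)
```

The brief and full records share one path and differ only in the `show=full` query string, which is why a crawl rule has to take the query part into account to tell them apart.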

4 Testing All testing done on CentOS using VMware Plugin tool testing Run One Daemon Copying other sites' plugins

5 Manifest Page

6 Dublin Core request?verb=ListRecords&metadataPrefix=oai_dc&set=hdl_1911_8299
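The request on this slide is a standard OAI-PMH ListRecords call. As a rough sketch (not from the slides), the query string could be assembled in Python as below. The endpoint path is the one given on the later crawl rules slide; the function name `list_records_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Endpoint as it appears on the crawl rules slide of this deck.
OAI_ENDPOINT = "http://scholarship.rice.edu/dspace-oai/request"


def list_records_url(set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL for one set (hypothetical helper)."""
    query = urlencode(
        {
            "verb": "ListRecords",
            "metadataPrefix": metadata_prefix,
            "set": set_spec,
        }
    )
    return f"{OAI_ENDPOINT}?{query}"


url = list_records_url("hdl_1911_8299")
print(url)
```

The set name `hdl_1911_8299` is the value shown on the slide; other collections would use their own set names.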

7 Sub-Manifest Page Links to ETDs within DSpace

8 Plugin Configuration parameters: Base URL For the sub-manifest pages: Part (integer)

9 Crawl Rules

10 Crawl rules explained Include master manifest page: Include sub-manifest page: Include items under /bitstream Include OAI-PMH link
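The four include rules listed on this slide can be pictured as an ordered list of URL patterns with a default of "exclude". The sketch below is plain Python, not LOCKSS plugin syntax. The manifest and sub-manifest paths are hypothetical placeholders, since the transcript does not preserve the actual URLs, and `PART` stands in for the plugin's integer parameter.

```python
import re

BASE_URL = "http://scholarship.rice.edu"  # the plugin's Base URL parameter
PART = 1                                  # the plugin's Part (integer) parameter

# Hypothetical paths: the real manifest URLs are not in the transcript.
MANIFEST_PATH = "/manifest/index.html"
SUB_MANIFEST_PATH = f"/manifest/part{PART}.html"

base = re.escape(BASE_URL)
RULES = [
    ("include", re.compile(rf"^{base}{re.escape(MANIFEST_PATH)}$")),
    ("include", re.compile(rf"^{base}{re.escape(SUB_MANIFEST_PATH)}$")),
    ("include", re.compile(rf"^{base}/bitstream/")),
    ("include", re.compile(rf"^{base}/dspace-oai/request\?verb=ListRecords&")),
]


def should_crawl(url):
    """Return True if the first matching rule includes the URL."""
    for action, pattern in RULES:
        if pattern.search(url):
            return action == "include"
    return False  # anything not matched by a rule is left out


pdf_ok = should_crawl(
    "http://scholarship.rice.edu/bitstream/handle/1911/13401/1338793.PDF?sequence=1"
)
oai_ok = should_crawl(
    "http://scholarship.rice.edu/dspace-oai/request"
    "?verb=ListRecords&metadataPrefix=oai_dc&set=hdl_1911_8299"
)
other_host = should_crawl("http://example.org/bitstream/x")
print(pdf_ok, oai_ok, other_host)
```

Only the four rules named on this slide are modeled here, so record pages under /handle are not matched by this sketch.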

11 Crawl rules explained Include full record OAI-PMH link on manifest master Pulls in Dublin Core http://scholarship.rice.edu/dspace-oai/request?verb=ListRecords&metadataPrefix=oai_dc&set=hdl_1911_8299
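To illustrate what "pulls in Dublin Core" yields, here is a small Python sketch that reads the oai_dc fields out of a ListRecords response. The XML in `SAMPLE` is a hand-written placeholder, not a real response from the Rice server, and `dc_records` is a hypothetical helper; the namespace URIs are the standard OAI-PMH and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written placeholder record, for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Placeholder thesis title</dc:title>
          <dc:identifier>http://scholarship.rice.edu/handle/1911/13401</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def dc_records(xml_text):
    """Return one dict of Dublin Core fields per record in the response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # e.g. a deleted record with no metadata
            continue
        fields = {}
        for element in dc:
            name = element.tag.rsplit("}", 1)[-1]  # drop the namespace
            fields.setdefault(name, []).append((element.text or "").strip())
        records.append(fields)
    return records


records = dc_records(SAMPLE)
print(records)
```

Each field maps to a list because Dublin Core elements may repeat within one record.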

12

13 Collection Sizes Recommended AU size between 1G and 10G 5 AUs between 7 and 10G Create new AUs as collection grows
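One way to picture "create new AUs as collection grows" is to fill an AU in ingest order and open a new one before the 10G ceiling would be exceeded. The Python sketch below assumes "G" means gigabytes; the item names and sizes are made up for illustration, and `split_into_aus` is a hypothetical helper rather than anything from the LOCKSS software.

```python
GB = 1024 ** 3
MAX_AU_BYTES = 10 * GB  # upper bound from the slide, assuming G = gigabytes


def split_into_aus(item_sizes, max_bytes=MAX_AU_BYTES):
    """Group (item_id, size_in_bytes) pairs, in order, into AUs under max_bytes."""
    aus, current, current_size = [], [], 0
    for item_id, size in item_sizes:
        # Close the current AU if this item would push it past the limit.
        if current and current_size + size > max_bytes:
            aus.append(current)
            current, current_size = [], 0
        current.append(item_id)
        current_size += size
    if current:
        aus.append(current)
    return aus


# Made-up sizes, for illustration only.
items = [("a", 4 * GB), ("b", 5 * GB), ("c", 3 * GB), ("d", 8 * GB)]
aus = split_into_aus(items)
print(aus)
```

Because items are taken in order, existing AUs never change when new material arrives; only the last AU grows or a new one is started.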

14 Tips Don’t trust testing with the plugin tool Read documentation Test with Run One Daemon Test on the caches Use expert mode to write plugin

15 Questions?

