1 Micah Altman, Associate Director, Harvard-MIT Data Center, Institute for Quantitative Social Science, Harvard University
Bryan Beecher, Director of Computing and Network Services, Inter-university Consortium for Political and Social Research, University of Michigan
Marc Maynard, Director of Technical Services, The Roper Center for Public Opinion Research, University of Connecticut
Jonathan Crabtree, Assistant Director for Archives and Information Technology, HW Odum Institute for Research in Social Science, University of North Carolina
CNI 2008 Fall Task Force Meeting

2 Our Story
Who are you guys?
What problem are you trying to solve?
What have you done?
Why do we care?

3 Data-PASS
Partnership devoted to identifying, acquiring and preserving data at risk of being lost to the social science research community
Partners
– ICPSR
– Odum Institute
– Harvard-MIT Data Center
– Roper Center
– National Archives
http://flickr.com/photos/phauly/35555985/

4 Data-PASS

5 Data-PASS
Lots of little files (social science data)
ASCII data files
PDF technical documentation (codebooks)
Millions of ‘em
Archival storage
Was tape
Now disk

6 Before

7 After

8 Archival storage?
http://failblog.org/2008/02/08/floppy-fail/

9 Archival storage?
Remote disks
Grids
Clouds
With partners?

10 Why roll your own?
Policy-driven
Auditable
Asymmetric
Independence of each location

11 Syndicated Storage Platform (SSP)
Start with LOCKSS
Lots of Copies Keep Stuff Safe
But used in a closed network
Private LOCKSS Network (PLN)
A few of them out there
MetaArchive perhaps the best known
Biggest selling point was independence of each node in the PLN

12 PLNs
LOCKSS is really easy to set up
PLNs are more difficult
Other differences between traditional PLN and our needs
Our content isn’t harvestable via HTTP
Our PLN nodes are different sizes
Our trust model requirement prevents a centralized authority controlling the network

13 SSP = Stone Soup Platform?
ICPSR and Odum set up a small PLN
HMDC provided a harvester and designed the schema
Odum built the Comparator
Roper is building the Invitor

14 PLN

15 Schema
Nodes
– IP address
– Storage commitment
AUs
– Max size
– # in the PLN
Lots more
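
The schema fields listed on this slide could be modeled roughly as follows. This is a minimal sketch, not the Data-PASS schema itself: the class names, field names, gigabyte units, and the `is_feasible` check are all assumptions made for illustration, and the slide notes that the real schema holds "lots more".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Node:
    """One PLN node (hypothetical field names)."""
    ip_address: str
    storage_commitment_gb: int  # storage the partner has committed


@dataclass(frozen=True)
class ArchivalUnit:
    """One AU (hypothetical field names)."""
    name: str
    max_size_gb: int
    copies_in_pln: int  # how many copies should exist in the PLN


@dataclass
class Schema:
    nodes: List[Node] = field(default_factory=list)
    aus: List[ArchivalUnit] = field(default_factory=list)

    def committed_gb(self) -> int:
        """Total storage committed across all nodes."""
        return sum(n.storage_commitment_gb for n in self.nodes)

    def required_gb(self) -> int:
        """Worst-case storage needed to hold every copy of every AU."""
        return sum(a.max_size_gb * a.copies_in_pln for a in self.aus)

    def is_feasible(self) -> bool:
        """Necessary (not sufficient) check: enough total storage, and
        no AU asks for more copies than there are nodes."""
        enough_space = self.required_gb() <= self.committed_gb()
        enough_nodes = all(a.copies_in_pln <= len(self.nodes) for a in self.aus)
        return enough_space and enough_nodes
```

Because nodes can be different sizes, a check like this only rules out clearly impossible configurations; deciding which node holds which AU would need a separate placement step.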

16 Comparator
diff for our SSP
Compares
– Contents of the LOCKSS Cache Manager [sic]
– Schema
Produces
– List of differences between “what is” and “what should be”
– Feeds into another tool for “fixing the PLN”
Machine-actionable output (XML)
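
The comparison described on this slide can be sketched as a set difference per node, serialized to XML. This is an illustration under stated assumptions, not the Odum Comparator: the dictionary inputs, the `missing`/`unexpected` labels, and the XML element and attribute names are all hypothetical, and nothing here talks to a real LOCKSS daemon.

```python
import xml.etree.ElementTree as ET


def compare(expected, actual):
    """Diff "what should be" (schema) against "what is" (cache contents).

    Both arguments map a node name to the set of AU names on that node.
    Returns a sorted list of (node, kind, au) tuples, where kind is
    "missing" (in the schema, not on the node) or "unexpected"
    (on the node, not in the schema).
    """
    diffs = []
    for node in sorted(set(expected) | set(actual)):
        want = expected.get(node, set())
        have = actual.get(node, set())
        for au in sorted(want - have):
            diffs.append((node, "missing", au))
        for au in sorted(have - want):
            diffs.append((node, "unexpected", au))
    return diffs


def to_xml(diffs):
    """Serialize the differences as machine-actionable XML (made-up format)."""
    root = ET.Element("differences")
    for node, kind, au in diffs:
        ET.SubElement(root, "difference", node=node, kind=kind, au=au)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    expected = {"node-a": {"au-1", "au-2"}, "node-b": {"au-1"}}
    actual = {"node-a": {"au-1"}, "node-b": {"au-1", "au-3"}}
    print(to_xml(compare(expected, actual)))
```

Sorting the nodes and AU names keeps the report deterministic, which makes successive reports easy to compare with ordinary text tools.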

17 Invitor
Reads the report from the Comparator
Issues requests to PLN nodes to ADD or DROP an AU
– Expectation is that PLN nodes always accept an ADD if they can
An offer they cannot refuse
Requests may be reviewed/approved by a human administrator (or not)
USENET news technology?
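
The Invitor's flow (read a report, turn each difference into an ADD or DROP request, optionally pass it through a reviewer) might look like the sketch below. The slides say the Invitor was still being built, so everything here is an assumption: the XML report format, the `missing`/`unexpected` labels, the function name, and the `approve` callback standing in for a human administrator. Sending the requests to nodes is left out entirely.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a reported difference to the request that fixes it.
ACTION_FOR = {"missing": "ADD", "unexpected": "DROP"}


def plan_requests(report_xml, approve=None):
    """Turn a difference report into (node, action, au) requests.

    report_xml: XML text with <difference node=".." kind=".." au=".."/>
                elements (an assumed format, not a LOCKSS one).
    approve:    optional callable taking a request tuple and returning
                True to keep it; None means no review step.
    """
    root = ET.fromstring(report_xml)
    requests = []
    for diff in root.findall("difference"):
        request = (diff.get("node"), ACTION_FOR[diff.get("kind")], diff.get("au"))
        if approve is None or approve(request):
            requests.append(request)
    return requests


if __name__ == "__main__":
    report = (
        "<differences>"
        '<difference node="node-a" kind="missing" au="au-2"/>'
        '<difference node="node-b" kind="unexpected" au="au-3"/>'
        "</differences>"
    )
    # Example reviewer that only lets ADD requests through.
    print(plan_requests(report, approve=lambda r: r[1] == "ADD"))
```

Keeping the approval step as a plain callback means the same planning code can run unattended or behind a manual review queue.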

18 Summary
Data-PASS is a group of archives committed to preserving social science data
Exploring various technology options
One avenue is a custom LOCKSS deployment
Network schema
OAI data harvester
Comparison tool
Network update tool

