CWSpace: Archiving MIT OpenCourseWare in DSpace
DSpace Federation 2nd User Group Meeting, University of Cambridge, July 7 - 8, 2005
Presented July 8, 2005 (v. 20050704_2215)
CWSpace, an MIT iCampus project

DSpace: wide adoption as Institutional Repository
FEATURES
  Safe: professionally archived; persistent, citable URL; preserved over archival timeframes; policies re: removal, etc.
  Findable: search (metadata; full-text); browse; notification e-mails; disseminated metadata (OAI-PMH); Google and DSpace ("Scholar")
CONTENT
  Scholarly materials; Research or Education oriented
AUDIENCE
  Humans; SysAdmins; Spiders; Harvesters

DSpace: new territory as LOR? (Learning Objects Repository)
CONTENT
  Educational content: Teaching, learning, instruction, assessment
  Courseware materials; Learning Objects; Compound Digital Objects, Websites…
FEATURES
  Safe; Findable
  Emphasis on: Sharing, Re-Use; Aggregation; Evaluation
AUDIENCE
  Humans; SysAdmins; Spiders; Harvesters
  System-to-System: CLEs; Image gallery tools

Project Goal: InterOperability
"To harvest and digitally archive OCW learning objects, and make them available to learning management systems (LMSs) by using Web Services interfaces on top of DSpace." http://icampus.mit.edu/projects/DSpace.shtml
  re: "harvest" - Yes. OCW has built a "Content Exporter".
  re: "archive" - Yes. DSpace has a new IMS-CP ingest module, plus course rendering.
  re: "learning objects" - No, but courseware, Yes. We discovered that what comes from OCW will be Courses, not LOs.
  re: "Web Services"... to "LMSs" - Yes. Initial version of WS with simple clients; also a prototype for an LMS (SloanSpace).

InterOperability: the main goal
1st Level of INTEROPERABILITY: key components to standardize on:
  What you send. (PACKAGE)
  How you send it. (PROTOCOL)
(2nd Level of InterOperability: standardize on Descriptive Metadata (profiles, crosswalks, etc.).)
(3rd Level of InterOperability: beyond the crosswalk: RDF and Semantic Web…)

CWSpace: Standards for Packages & Protocols
Protocols, APIs: SOAP & WSDL; WebDAV; JSR 170 (JCR); RESTful (XML, HTTP); XML-RPC
Packages: IMS Content Package; METS; MPEG21-DIDL; XFDU

Work Activity To Date
Schematic showing various areas of development: metadata specifications, Web Services, export and import programming.

Metadata for Content Packaging
  METS: Libraries…
  IMS-CP: Education…
  MPEG-21 DIDL: Commercial…
  XFDU: Aerospace…
Re: Venn diagram: the premise is that this interaction (Education with Libraries) is on the rise.
1st Level of INTEROPERABILITY
(2nd Level of InterOperability: standardize on Descriptive Metadata (profiles, crosswalks, etc.).)
(3rd Level of InterOperability: beyond the crosswalk: RDF and Semantic Web…)

Package Interchange File (PIF)
IMS-CP uses a .ZIP file containing a Manifest XML file and all content files:
http://www.imsglobal.org/content/packaging/cpv1p1p4/imscp_bestv1p1p4.html

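As a minimal sketch of the package layout described on this slide, the Python below builds a tiny PIF in memory and checks that the manifest sits at the top level of the .ZIP. The helper names (build_pif, has_root_manifest), the placeholder manifest, and the sample content path are illustrative assumptions; they are not part of the OCW Content Exporter or of DSpace.

```python
import io
import zipfile

MANIFEST_NAME = "imsmanifest.xml"  # manifest file name used by IMS-CP


def build_pif(manifest_xml, content_files):
    """Pack a manifest plus content files into an in-memory .ZIP (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(MANIFEST_NAME, manifest_xml)
        for path, data in content_files.items():
            archive.writestr(path, data)
    return buffer.getvalue()


def has_root_manifest(pif_bytes):
    """Return True when the archive holds imsmanifest.xml at its top level."""
    with zipfile.ZipFile(io.BytesIO(pif_bytes)) as archive:
        return MANIFEST_NAME in archive.namelist()


# Placeholder manifest and one made-up content file, for illustration only.
pif = build_pif("<manifest/>", {"CourseHome/index.htm": b"<html></html>"})
```

A receiving system could run a check like has_root_manifest before attempting ingest, rejecting archives that are plain ZIP files rather than content packages.
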
IMS-CP and OCW Object Model
IMS-CP imsmanifest.xml

METS & IMS-CP Manifests Diagrammed
METS: mets.xml
IMS-CP: imsmanifest.xml
Red arrow: Logical organization to Physical (href)
Blue arrow (METS only): Descriptive metadata is separated from Logical or Physical
http://cwspace.mit.edu/docs/ProjectMgt/Reports/SPARC-IR-Workshop/sparc-poster.html
Notes:
  IMS-CP “Resources” with Files and Dependencies, vs. METS “Div”s with nested Divs and Fptrs and Pars and Areas.
  IMS-CP SubManifests (not used with OCW courses).
  Multiple “Organization”s (IMS-CP) and “StructMap”s (METS) possible.

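The "logical to physical" arrow on the IMS-CP side can be sketched in a few lines of Python: each item in an organization points, through its identifierref attribute, at a resource whose href names the physical file. The sample manifest below is invented for illustration (it is not an OCW manifest), and the function name logical_to_physical is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# IMS Content Packaging v1.1 namespace
NS = {"cp": "http://www.imsglobal.org/xsd/imscp_v1p1"}

# Invented sample manifest: one logical item pointing at one physical resource.
SAMPLE_MANIFEST = """<manifest xmlns="http://www.imsglobal.org/xsd/imscp_v1p1" identifier="M1">
  <organizations>
    <organization identifier="ORG1">
      <item identifier="I1" identifierref="R1"><title>Syllabus</title></item>
    </organization>
  </organizations>
  <resources>
    <resource identifier="R1" type="webcontent" href="Syllabus/index.htm">
      <file href="Syllabus/index.htm"/>
    </resource>
  </resources>
</manifest>"""


def logical_to_physical(manifest_xml):
    """Map each item title (logical) to the href of its resource (physical)."""
    root = ET.fromstring(manifest_xml)
    hrefs = {
        resource.get("identifier"): resource.get("href")
        for resource in root.iterfind("cp:resources/cp:resource", NS)
    }
    mapping = {}
    for item in root.iterfind(".//cp:item", NS):
        title = item.findtext("cp:title", default="", namespaces=NS)
        mapping[title] = hrefs.get(item.get("identifierref"))
    return mapping


links = logical_to_physical(SAMPLE_MANIFEST)
```

In METS the equivalent walk would go from a structMap div through an fptr to a file in the fileSec, with descriptive metadata held in a separate section.
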
XML Elements: METS & IMS-CP
For those who like the pointy angle brackets…

METS SIP Profile for DSpace Work-in-progress MIT CSAIL Publications Archive Anticipates DSpace 2.0 possible METS as AIP Preservation and Technical Metadata Licenses as METS metadata (Deposit; Creative Commons) StructMap raises questions for flat DSpace file storage (Bundles & fileGrps) (Export re-creation) Option for new PluginManager to manage DIP work ahead as well; may differ
CWSpace: Archiving MIT OpenCourseWare in DSpace
DEMO (SCREENSHOTS)

OCW Course Rendered in Three Systems
OCW; DSpace; SloanSpace

Year One Demo: OCW to DSpace

OCW Course Website
  Static website
  HTML, PDF, JPG, XLS
  Akamai: Multimedia
  LOM XML
  Search; Feedback; Tracking
  Copyright cleared
http://ocw.mit.edu/OcwWeb/Sloan-School-of-Management/15-040Spring2004/CourseHome/index.htm

OCW Content Exporter (CE)
  CE generates the entire course website (rewriting links)
  CE writes imsmanifest.xml
  CE publishes the .ZIP to a web page
  CE can publish a whole department
  CE is also used for: course copies for professors; translation partners; education partners
Notes:
  New OCW functionality: "Download course zip" off the OCW site.
  Increased interoperability: OCW is using an educational technologies standard (IMS-CP).
  Spur to upgrade LOM to recent IEEE advancements.
  Content Export improvements are also related to improvements in the OCW publishing process per se.

Client: IMS-CP DSpace Import Driver
  OCW CE is not yet a WS client
  Simple CGI driver to the DSpace session(), auth/auth(), upload(), and ingest() WS
  4 params (w-i-p)
RESULTS:
  Success - Created DSpace item 123456789/75
  Using collections='123456789/2'
  Started session with token='2-1039ad464ee-7665e5a950ff7ff2'
  upload response ::
  <?xml version="1.0" encoding="UTF-8"?>
  <uploadservice><URI>http://rotarran.mit.edu:8080/dspace-ws/upload/package-21802.zip</URI><size>8309770</size>
  </uploadservice>
  Ingesting IMSCP package in package-21802.zip
  ingest response :: 123456789/75
  Session ended

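A client that follows the upload step needs to pull the package URI and size out of the upload response before calling ingest. The sketch below parses the exact response text shown on this slide; the function name parse_upload_response is a hypothetical helper, and nothing here calls the (work-in-progress) Web Services themselves.

```python
import xml.etree.ElementTree as ET

# Upload response copied from the slide's RESULTS log.
UPLOAD_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<uploadservice><URI>http://rotarran.mit.edu:8080/dspace-ws/upload/package-21802.zip</URI><size>8309770</size>
</uploadservice>"""


def parse_upload_response(xml_bytes):
    """Return (package URI, size in bytes) from an <uploadservice> response."""
    root = ET.fromstring(xml_bytes)
    if root.tag != "uploadservice":
        raise ValueError("unexpected response element: " + root.tag)
    return root.findtext("URI"), int(root.findtext("size"))


package_uri, package_size = parse_upload_response(UPLOAD_RESPONSE)
```

The returned URI is what the log's next step ("Ingesting IMSCP package in package-21802.zip") operates on.
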
DSpace Item Record for OCW Course
  DSpace info model maps “Item” to OCW “Course”
  Files (all types) are “Bitstreams”
  Metadata: basic LOM-2-DC
  Year 2: Further DSpace dev re: websites, LOs

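To illustrate what a basic LOM-to-Dublin-Core crosswalk involves, the sketch below maps a handful of LOM elements onto DC fields. The mapping table, the flattened record format (dotted LOM paths to lists of values), and the sample values are assumptions made for illustration; they are not the crosswalk used by the CWSpace project.

```python
# Illustrative mapping only; the project's actual LOM-2-DC crosswalk is not shown in the slides.
LOM_TO_DC = {
    "general.title": "dc.title",
    "general.description": "dc.description",
    "general.keyword": "dc.subject",
    "general.language": "dc.language",
    "rights.description": "dc.rights",
}


def crosswalk(lom_record):
    """Convert a flattened LOM record into Dublin Core fields.

    LOM elements with no entry in LOM_TO_DC are dropped, which is the
    lossy step any basic crosswalk has to accept.
    """
    dc_record = {}
    for lom_path, values in lom_record.items():
        target = LOM_TO_DC.get(lom_path)
        if target is None:
            continue
        dc_record.setdefault(target, []).extend(values)
    return dc_record


# Made-up sample record for illustration.
sample_lom = {
    "general.title": ["Sample Course Title"],
    "general.keyword": ["management", "strategy"],
    "educational.context": ["higher education"],  # no DC target in this sketch
}
dc = crosswalk(sample_lom)
```

The dropped educational.context value shows why richer, structured metadata may need to travel inside the package rather than through flat DC alone.
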
DSpace Serving Full OCW Course
  Static website
  HTML, PDF, JPG, XLS
  Akamai: Multimedia
  DSpace search (Lucene)
  LOM XML to Dublin Core (w-i-p)
  OAI metadata distribution
  Copyright cleared, DSpace, CC licenses

Year Two Demo: Other CLE (early preview!)

SloanSpace Searching, Uploading OCW Course
PROTOTYPE
  SloanSpace (dotLRN, OpenACS) portal functionality
  Search DSpace (SRW)
  Retrieve via (provisional) DSpace WS
  Directly “Add Course” to dotLRN module for “Learning Object Repository System” (LORS)

SloanSpace LORS Navigation for OCW Course
PROTOTYPE
  dotLRN module, for IMS-CP standard
  LOM XML
  (Use of SCORM permits tracking, etc.)

SloanSpace LORS Serving Full OCW Course
PROTOTYPE
  HTML, PDF, JPG, XLS
  Akamai: Multimedia
  Lifecycle issues in Year Two work

Year Two Deliverables
  OCW Production: Operationalize to scale for OCW archiving (100 courses Fall 2005…)
  InterOperability with other CLE/LMSs
    Package: “CWSpace IMS-CP Profile” for SloanSpace, for Stellar, for Sakai
    Protocol: Web Services use cases, clients for same
  Archived websites: Improved contextual presentation
  Terminology Extraction Tool (CSAIL): Explore integration into DSpace, OCW
Notes:
  WEB SERVICES: From Year One of the project: CARET (Univ. Cambridge) WS with SOAP/WSDL; sophisticated Packaging; large file upload/download. (XSLT; Spring; Maven)
  ARCHIVED WEBSITES: To include options such as use of framesets; left-pane navigation; OCW HTML with and without “C-clamp” navigation; use of the IMS Manifest for selection of learning object materials; etc. Consult similar research elsewhere to benefit from best practices (Internet Archive; New York University; Stanford; Library of Congress; et al.).

CWSpace: Planned Work
Protocols
  Lightweight Network Interface (LNI): SOAP & WSDL; WebDAV
DSpace Platform
  Plugin Manager: Packagers (I/O); Crosswalks
  Structured Metadata
  Stackable Authentication
Notes:
  For this DSpace User Group audience, the more pertinent planned work concerns the DSpace platform and Web Services.
  The Plugin Manager can also manage Media Filters and other things.
  The Structured Metadata problem (going beyond flat Qualified DC) is not solved, but it gets a partial solution with the introduction of packages: use METS or IMS-CP to import XML for any type of descriptive metadata, and at a minimum store it as a DSpace Bitstream.
  Stackable Authentication is more a DSpace item than a CWSpace item, but the introduction of a new interface to the platform (the Networked Interface) on the CWSpace project is the right time to reconsider this important piece. This work decouples authentication from the Web U/I and prepares a more manageable approach to applying security options in the variety of settings in which DSpace implementations are found. The resulting “stack” can be used for the Web U/I, for both types of LNI (SOAP; WebDAV), and for the command line.

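The idea of a "stack" of authentication methods, shared by every interface, can be sketched as an ordered list of functions tried in turn. Everything below is hypothetical: the method names, the credential dictionary, and the sample account are invented for illustration and do not describe the DSpace implementation.

```python
from typing import Callable, Optional, Sequence

# An authentication method takes credentials and returns a user name, or None to pass.
Authenticator = Callable[[dict], Optional[str]]

KNOWN_PASSWORDS = {"alice": "s3cret"}  # made-up account for the sketch


def password_method(credentials: dict) -> Optional[str]:
    """Accept a user/password pair that matches the made-up account table."""
    user = credentials.get("user")
    if user is not None and KNOWN_PASSWORDS.get(user) == credentials.get("password"):
        return user
    return None


def certificate_method(credentials: dict) -> Optional[str]:
    """Accept a (pretend) already-verified client certificate subject."""
    return credentials.get("certificate_subject")


def authenticate(stack: Sequence[Authenticator], credentials: dict) -> Optional[str]:
    """Try each method in order; the first one that recognizes the caller wins."""
    for method in stack:
        user = method(credentials)
        if user is not None:
            return user
    return None


# The same stack could be handed to the Web U/I, the LNI, and the command line.
STACK = [certificate_method, password_method]
```

Because the stack is plain data, a site can reorder or replace methods without touching the interfaces that call authenticate.
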
Planned Work on CWSpace

Existing Interfaces to DSpace
  DSpace Web U/I
  SRU/SRW Search
  OAI-PMH
  Command-line: DSpace Batch Importer; Media Filters…
http://wiki.dspace.org/NetworkInterfaces

Lightweight Network Interface
New Proposed Interface(s): DSpace “Web Services”, for CWSpace…
  SOAP & WSDL
  WebDAV (extension to the HTTP protocol)
http://wiki.dspace.org/LightweightNetworkInterface

A Few (Quick) Thoughts on Web Services Design
The next few slides provide a (very) brief tour of the topics we are investigating with our Lightweight Network Interface (LNI):
  Standards-based vs. Custom model
  Abstract vs. Strongly modeled specification
  Technology Approaches (SOAP; REST…)

Approaches To Expose Your Object Model, 1
In selecting an API, protocol, or approach for exposing your object model through a network interface, there are two axes to consider.
‘X’ axis: Degree of match to Standards
  Custom
    PRO: More control for service
    CON: Harder to interoperate for consumer
  Standards-based
    CON: Compromises for service
    PRO: Easier interoperation for consumer
Notes:
  In exposing a new networked interface to DSpace’s content model and its functionality, a few different approaches may be taken. The custom or “native” approach represents the DSpace object model (Community, Collection, Item, (Bundle), Bitstream) in the network interface (a set of Web Services) in a manner akin to RPC, or almost as an API. This approach can reflect the DSpace model closely, but is then bound to it, and client developers could not expect to reuse the code they write to talk to these DSpace services.
  Alternatively, a standards-based approach requires some mapping of the DSpace model to the standard model, but the gain is easier subsequent interoperability with more clients.

Approaches To Expose Your Object Model, 2
‘Y’ axis: Degree of precision in model abstraction
  Abstractly Modeled
    PRO: Open to extension, interpretation, wide application
    CON: You have a specification, but not an implementation guide
  Strongly Modeled
    CON: Compromises for service; line is drawn in the sand (!)
    PRO: Implementable directions; line is drawn in the sand (!)
Notes:
  On the ‘Y’ axis, we see that more abstractly modeled “standards” exist, but these specifications are not implementable per se. The Java Community Process JSR 170 for Repository access is an example of a proposed API that by definition is going to be more strongly modeled: it takes a position on certain design decisions. The WebDAV protocol extension to HTTP takes a fairly strongly modeled position as well, while remaining very flexible, with its concept of Resources and Properties.

Matrix: Expose Object Model
This initial proposal for a new network interface for DSpace is settling on an adaptation of WebDAV. We are not intending at this time to take on implementing the JSR 170 API for the DSpace model. Note that this matrix does not concern itself with implementation approach (e.g. SOAP vs. RESTful, etc.).

Technology Approaches
Having chosen where to sit on the two axes, there are some technology choices to consider for the style of the services your Object Model will provide.
  SOAP, WSDL
    Enterprise developers
    Contains hints re: objects, methods
  RESTful (“Representational State Transfer”)
    Developers comfortable with XML markup, HTTP
    Straightforward XML messages over HTTP
  XML-RPC
    Early, simplified spin-off from SOAP (ca. 1999)
  WebDAV (Protocol: extension to HTTP)
    “Resources” and “Properties” work well with a Repository

Resources, Properties: WebDAV (Resources)
  DSpace (root): /mylib/dspace-ni/
  Community: /mylib/dspace-ni/dso_1721.1%2F46
  Collection: /mylib/dspace-ni/dso_1721.1%2F3549
  Item: /mylib/dspace-ni/dso_1721.1%2F5543
  Bitstream: /mylib/dspace-ni/dso_1721.1%2F5543/bitstream_13
  Workflow: /mylib/dspace-ni/workflow/wf_23
http://wiki.dspace.org/LightweightNetworkInterface
Notes:
  WebDAV is a protocol (additional “verbs” for HTTP), and with its Resources and Properties it provides a quite different model from the RPC-like message exchange of XML over HTTP (a la SOAP, XML-RPC, or even RESTful). Nevertheless, the DSpace object model components still need to be mapped to WebDAV Resources, as this slide shows.
  The initial thinking was to name these WebDAV Resources after the DSpace types (Community, Collection, Item…), but this was dropped in favor of the more generic “dso_” (DSpace Object).
  The "dso_" path elements can be cascaded as you descend a hierarchy. The server only pays attention to the last element in a path of DSOs, e.g. http://myserver/DAV/dso_123456789%2F1/dso_123456789%2F4/dso_123456789%2F13 is the same as http://myserver/DAV/dso_123456789%2F13

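The naming scheme on this slide has two mechanical parts: a handle is percent-encoded into a single "dso_" path element, and only the last "dso_" element in a path identifies the object. A minimal sketch in Python follows; the function names dso_segment and effective_handle are hypothetical, and the code is not taken from the LNI implementation.

```python
from urllib.parse import quote, unquote

DSO_PREFIX = "dso_"


def dso_segment(handle):
    """Encode a handle such as 1721.1/5543 as one path element (the slash becomes %2F)."""
    return DSO_PREFIX + quote(handle, safe="")


def effective_handle(path):
    """Return the handle named by the last dso_ element in a path, or None if there is none."""
    segments = [part for part in path.split("/") if part.startswith(DSO_PREFIX)]
    if not segments:
        return None
    return unquote(segments[-1][len(DSO_PREFIX):])


cascaded = "/DAV/dso_123456789%2F1/dso_123456789%2F4/dso_123456789%2F13"
direct = "/DAV/dso_123456789%2F13"
```

Both cascaded and direct resolve to the same handle, which mirrors the equivalence stated in the notes above.
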
Resources, Properties: WebDAV (Properties)
  Collection objects: logo; short_description; introductory_text; sidebar; copyright; default_license; provenance
  Item objects: submitter; owning_collection; license; cc_license; cc_license_rdf; DAV:getlastmodified
  Bitstream objects: DAV:getcontentlength; DAV:getcontenttype; source; description; format-id; format-description; checksum; checksum-algorithm; sequence-id
http://wiki.dspace.org/LightweightNetworkInterface
Initial proposals. Comments would be welcome.

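In WebDAV, a client asks for properties with a PROPFIND request whose XML body names the properties wanted. The sketch below builds such a body for the two standard DAV: properties listed for Bitstreams. It deliberately leaves out the DSpace-specific properties (source, checksum, and so on), because the slide does not say which XML namespace they live in; the function name propfind_body is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # the standard WebDAV namespace
ET.register_namespace("D", DAV_NS)


def propfind_body(property_names):
    """Build a PROPFIND request body asking for the named DAV: properties."""
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in property_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="unicode")


body = propfind_body(["getcontentlength", "getcontenttype"])
```

The resulting body would be sent with the PROPFIND method to a Bitstream resource; the server answers with a multistatus document carrying the property values.
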
Example GET URIs
  /mylib/dspace-ni/dso_1721.1%2F5543?session=123xyzzy456&package=org.dspace.METS
  /mylib/dspace-ni/dso_1721.1%2F5543/bitstream/13?session=123xyzzy456
http://wiki.dspace.org/LightweightNetworkInterface

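The first example URI can be assembled from its parts: a base path, a percent-encoded handle, and the session and package query parameters. This is a minimal sketch; lni_get_uri is a hypothetical helper name, and the parameter names are taken from the example URIs on this slide rather than from a published specification.

```python
from urllib.parse import quote, urlencode


def lni_get_uri(base, handle, session, package=None):
    """Build a GET URI for an object, optionally asking for a package format."""
    params = {"session": session}
    if package is not None:
        params["package"] = package
    return f"{base}dso_{quote(handle, safe='')}?{urlencode(params)}"


uri = lni_get_uri("/mylib/dspace-ni/", "1721.1/5543", "123xyzzy456", "org.dspace.METS")
```

Omitting the package argument yields a URI carrying only the session parameter.
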
Example PUT URI
To add a new item to the collection at handle 1721.1/3549:
  PUT /mylib/dspace-ni/dso_1721.1%2F3549?session=123xyzzy456&package=OCW-IMSCP
  ....package contents...
The response gives the location of the new item:
  HTTP/1.1 201 Created
  Location: /mylib/dspace-ni/dso_1721.1%2F5549
  ....other headers....
http://wiki.dspace.org/LightweightNetworkInterface

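A client-side sketch of this exchange, using only the Python standard library: one helper prepares the PUT request (without sending it), and another reads the new item's handle out of the Location header of the response. The host name http://myserver, the application/zip content type, and both function names are assumptions made for illustration.

```python
import urllib.request
from urllib.parse import quote, unquote, urlencode

HOST = "http://myserver"  # hypothetical host
BASE_PATH = "/mylib/dspace-ni/"


def build_put(collection_handle, session, package, payload):
    """Prepare (but do not send) a PUT that submits a package to a collection."""
    query = urlencode({"session": session, "package": package})
    url = f"{HOST}{BASE_PATH}dso_{quote(collection_handle, safe='')}?{query}"
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/zip"},  # assumed content type
    )


def handle_from_location(location):
    """Extract the new item's handle from a Location header value."""
    last = location.rstrip("/").rsplit("/", 1)[-1]
    if not last.startswith("dso_"):
        raise ValueError("Location does not end in a dso_ element: " + location)
    return unquote(last[len("dso_"):])


request = build_put("1721.1/3549", "123xyzzy456", "OCW-IMSCP", b"...package contents...")
new_handle = handle_from_location("/mylib/dspace-ni/dso_1721.1%2F5549")
```

Passing request to urllib.request.urlopen would perform the upload against a real server; that step is left out here because no public endpoint is given in the slides.
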
Features Matrix
Comparing the Interfaces to DSpace with the Features they provide.
Notes:
  The Web U/I permits Submission of files; the various Web Services permit (require) Submission of Packages.
  CARET’s approach has a Dissemination of Packages that is more sophisticated (non-contiguous Items and Bitstreams, even Collections). The initial LNI approaches can Disseminate a single Item as a Package. (This will grow in complexity as well, for some CWSpace “LO” requirements.)
  Note that things like Administration and Browse are not currently high priorities for putting into Web Services.

Three Things to Take Away from this talk on the iCampus Project: CWSpace
  We’re working on Packaging Metadata
  We’re working on Web Services
  We’re also working on Archiving Websites
... help us refine the thinking …

CWSpace: Archiving MIT OpenCourseWare in DSpace
Thank You. Questions, Comments…
http://cwspace.mit.edu
  William Reilly, Larry Stone, MacKenzie Smith: MIT Libraries’ Digital Library Research Group (DLRG)
  Rob Wolfe: MIT Libraries’ Metadata Services Unit
  Cec d’Oliveira: MIT OpenCourseWare, Technology