The Fedora Project March 10, 2003


1 The Fedora Project March 10, 2003
Sandy Payette Cornell Information Science

2 Motivation
The Problem of Complex Content

3 Digital Library Content not just documents ...
Some familiar objects
Complex, compound, dynamic objects

4 Key Research Questions
How can clients interact with heterogeneous collections of complex objects in a simple and interoperable manner?
How can complex objects be designed to be both generic and genre-specific at the same time?
How can we associate services and tools with objects to provide different presentations or transformations of the object content?
How can we associate specialized, fine-grained access control policies with specific objects, or with groups of objects?
How can we facilitate the long-term management and preservation of complex objects with dependencies on distributed content and services?

5 Shortcomings of commercial digital library products
Narrow focus on specific media formats (e.g. image databases, document management)
Fail to effectively address interrelationships among digital entities
Fail to address interoperability: no open interfaces to facilitate sharing of services; no standard protocols for cross-system interoperability
Fail to provide facilities for managing programs and tools that are integral to delivering digital content
Not extensible; do not enable easy integration of new tools and services
Do not address fine-grained access control and preservation issues

6 The Flexible Extensible Digital Object Repository Architecture (FEDORA)
DARPA and NSF-funded research at Cornell (1997-present)
  CORBA-based reference implementation (Payette/Lagoze)
  Extensive interoperability testing (with Arms/Blanchi/Overly)
  Policy Enforcement (Payette/Schneider)
Interpreted and re-implemented at U of Virginia (1999-)
  Simple web-oriented implementation, focused on access to collections
  Java servlet and relational db
  Testbed of 10,000,000 objects with performance metrics ( )
Mellon-Funded FEDORA Software (2002-)
  University of Virginia and Cornell - joint development
  Open source
  Web services and XML
  Mediation of distributed services
  Preservation focus

7 The Fedora Architecture
Digital Object Model
The Repository
Web Services

8 FEDORA Basic Object Architecture
Digital Object Model
  Container to aggregate digital content of any type
    Data or metadata
    Local or distributed
  “Behavior” definitions (like abstract interfaces)
  Hooks to external services
  Enables multiple “disseminations” of content

9 Digital Object Model Functional View
dynamic Application services

10 Digital Object Model Architectural View
Persistent ID (PID): globally unique persistent id
Disseminators: public view; access methods for obtaining “disseminations” of digital object content
System Metadata: internal view; metadata necessary to manage the object
Datastreams: protected view; content that makes up the “basis” of the object
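
The four parts of the architectural view can be pictured as a small container type. The following is only a minimal sketch under stated assumptions, not the Fedora implementation: the class names, field names, PIDs, and the example URL are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datastream:
    """Protected view: one piece of content (data or metadata)."""
    ds_id: str
    mime_type: str
    location: str  # local path or remote URL; content may be distributed


@dataclass
class Disseminator:
    """Public view: access methods for obtaining disseminations."""
    behavior_def_pid: str  # hypothetical link to a behavior definition
    methods: List[str]


@dataclass
class DigitalObject:
    pid: str  # globally unique persistent id
    system_metadata: Dict[str, str] = field(default_factory=dict)  # internal view
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    disseminators: List[Disseminator] = field(default_factory=list)

    def public_methods(self) -> List[str]:
        """List every access method exposed through the disseminators."""
        return [m for d in self.disseminators for m in d.methods]


# Hypothetical example object; none of these identifiers come from the slides.
obj = DigitalObject(pid="demo:1", system_metadata={"state": "active"})
obj.datastreams["DS1"] = Datastream("DS1", "image/jpeg", "http://example.org/img.jpg")
obj.disseminators.append(Disseminator("demo:bdef-image", ["getThumbnail", "getFullSize"]))
print(obj.public_methods())  # ['getThumbnail', 'getFullSize']
```

A client in this sketch sees only the PID and the method names; the datastreams stay behind the disseminators, which mirrors the public/protected split on the slide.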

11 Digital Object Model Service Relationships
Behavior Definition Object
  Persistent ID (PID)
  Service Definition Metadata (WSDL)
  System Metadata
  Datastreams
Data Object
  Persistent ID (PID)
  System Metadata
  Datastreams
  Disseminators
Behavior Mechanism Object
  Persistent ID (PID)
  Service Binding Metadata (WSDL)
  System Metadata
  Datastreams
External Service
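
One way to read the relationships among the three object types: a behavior definition names abstract methods, a behavior mechanism binds those methods to a concrete service, and a data object's disseminator pairs the two. The sketch below is an assumption-laden illustration of that reading, not Fedora code; every class, function, and PID is hypothetical, and a plain Python callable stands in for the external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BehaviorDefinition:
    """Abstract interface: names the methods, implements nothing."""
    pid: str
    method_names: List[str]


@dataclass
class BehaviorMechanism:
    """Binds the methods of one behavior definition to callable services."""
    pid: str
    bdef_pid: str
    bindings: Dict[str, Callable[[str], str]]


@dataclass
class DataObject:
    pid: str
    datastreams: Dict[str, str]    # datastream id -> content
    disseminators: Dict[str, str]  # behavior definition pid -> mechanism pid


def disseminate(obj, bdef, bmech, method, ds_id):
    """Check the definition/mechanism pairing, then call the bound service."""
    if obj.disseminators.get(bdef.pid) != bmech.pid:
        raise LookupError("object has no disseminator for this pairing")
    if bmech.bdef_pid != bdef.pid or method not in bdef.method_names:
        raise LookupError(f"{method} is not defined by {bdef.pid}")
    service = bmech.bindings[method]  # stand-in for the external service
    return service(obj.datastreams[ds_id])


# Hypothetical objects; str.upper plays the role of an external service.
bdef = BehaviorDefinition("demo:bdef-text", ["getShouted"])
bmech = BehaviorMechanism("demo:bmech-text", "demo:bdef-text", {"getShouted": str.upper})
data = DataObject("demo:2", {"DS1": "hello"}, {"demo:bdef-text": "demo:bmech-text"})
result = disseminate(data, bdef, bmech, "getShouted", "DS1")
print(result)  # HELLO
```

Because the data object refers to the definition and mechanism only by PID, the service behind a method could be swapped in this sketch without touching the data object itself.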

12 FEDORA Basic Repository Architecture
Repository System
  Object Management
    Lifecycle (Ingest/create → Store → Delete → Approve → Purge)
    Validation
    PID Generation
    Version management
    Access Control
    Preservation support
  Object Access
    Object Dissemination
    Object Reflection
    Service Mediation

13 Fedora: A Programmer’s View
Understanding the system implementation
Web Services
Server Design

14 What is a Web Service?
A distributed application that runs over the internet.
An addressable network endpoint that receives structured messages and returns structured responses.
A web application that publishes an open interface through which clients can send requests and receive responses.

15 How is this different from plain old web applications?
Formally defined API (application programming interface) defines a set of abstract operations for a web service
Published bindings for client to run operations
Standard protocol for invoking operations on the service
XML as standard means of encoding service requests and responses

16 Why are Web Services important?
Interoperability
  Web applications can interact and build upon each other
  Data is transferred in an interoperable manner (HTTP)
  Data is encoded in an interoperable format (XML)
  Works in a decentralized, distributed, operating-system independent environment
Standards-oriented
  Means to expose complex operations with rich data typing (via XML Schema language typing)
  Ease of integrating distributed systems via the Web
  W3C effort to develop this service architecture

17 How are Web Services Implemented?
Simple Object Access Protocol (SOAP)
  SOAP is a messaging protocol that can run over different transport protocols (e.g., HTTP, SMTP)
  Operation oriented (send a request to an endpoint)
  Like CORBA, RMI, DCOM… but for the Web and simpler
  Application APIs can be defined and published using the Web Services Description Language (WSDL)
  Requests and responses sent as XML messages
  Supports simple and complex data typing in requests and responses
  Supports transmission of binary data within request or response packages

18 How are Web Services Implemented?
REST (Representational State Transfer)
  URI + HTTP + XML
  URI/resource driven; message built into a URL
  HTTP GET or POST
  Response is XML data
Issues:
  Not a standard, but a style of doing web apps; arguably it just gives a fancy name to how lots of people do applications on the web by default. Nothing really new here; it just argues for doing things the way we have been, maybe a little more standardized by using XML.
  Fragile service definition: URLs change
  No data typing on requests
  Limited ability to transmit complex requests on a URL
  W3C is behind SOAP; one strong voice out there for REST (Prescod).
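
The "message built into a URL" idea and the "no data typing on requests" issue can both be seen in a few lines. This is a minimal sketch using only the Python standard library; the base URL, the PID, and the method name are hypothetical and do not describe any real Fedora endpoint.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_rest_request(base_url, pid, method, **params):
    """Encode an operation and its arguments entirely in a URL."""
    path = f"{base_url.rstrip('/')}/{quote(pid, safe=':')}/{method}"
    return f"{path}?{urlencode(params)}" if params else path


# Hypothetical endpoint and identifiers, for illustration only.
url = build_rest_request("http://repo.example.org/get", "demo:1", "getThumbnail", width=100)
print(url)  # http://repo.example.org/get/demo:1/getThumbnail?width=100

# On the receiving side every parameter arrives as an untyped string.
received = parse_qs(urlsplit(url).query)
print(received)  # {'width': ['100']}
```

The integer 100 goes in, and the string '100' comes out: any typing has to be agreed on outside the request itself, which is the limitation the slide points at.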

19 Example of Web Service using SOAP
My Application sends a SOAP Request (XML) over SOAP/HTTP to the Google Web Service: doSpellingSuggestion(payet)
The Google Web Service returns a SOAP Response (XML) over SOAP/HTTP: payette

20 XML SOAP Request
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="..." xmlns:xsi="..." xmlns:xsd="...">
  <SOAP-ENV:Body>
    <m:doSpellingSuggestion xmlns:m="urn:GoogleSearch">
      <key>/e325JlNPASJu</key>
      <phrase>payet</phrase>
    </m:doSpellingSuggestion>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

21 XML SOAP Response
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="..." xmlns:xsi="..." xmlns:xsd="...">
  <SOAP-ENV:Body>
    <ns1:doSpellingSuggestionResponse xmlns:ns1="urn:GoogleSearch" SOAP-ENV:encodingStyle="...">
      <return xsi:type="xsd:string">payette</return>
    </ns1:doSpellingSuggestionResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
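
The request/response exchange on slides 19 to 21 can be sketched with the Python standard library. This is an illustration, not a working Google client: nothing is sent over the network, the key is a placeholder, and the helper names are hypothetical. The namespace URIs were lost from the transcript, so the ones below are assumptions: the SOAP 1.1 envelope namespace and the 2001 XML Schema namespaces, which may differ from what the original slides showed.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (not recoverable from the transcript).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSD = "http://www.w3.org/2001/XMLSchema"


def build_request(operation, op_namespace, **params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{op_namespace}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_return(response_xml):
    """Pull the text of the <return> element out of a SOAP response body."""
    root = ET.fromstring(response_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    ret = body.find(".//return") if body is not None else None
    if ret is None:
        raise ValueError("no <return> element in SOAP body")
    return ret.text


# Request mirroring slide 20; the key value here is a placeholder.
request_xml = build_request(
    "doSpellingSuggestion", "urn:GoogleSearch", key="placeholder-key", phrase="payet"
)

# Canned response mirroring slide 21, with the assumed namespaces filled in.
RESPONSE = f"""<SOAP-ENV:Envelope xmlns:SOAP-ENV="{SOAP_ENV}" xmlns:xsi="{XSI}" xmlns:xsd="{XSD}">
  <SOAP-ENV:Body>
    <ns1:doSpellingSuggestionResponse xmlns:ns1="urn:GoogleSearch">
      <return xsi:type="xsd:string">payette</return>
    </ns1:doSpellingSuggestionResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

suggestion = parse_return(RESPONSE)
print(suggestion)  # payette
```

In a real client the request string would be POSTed to the service endpoint over HTTP; in practice a toolkit generated from the service's WSDL usually hides this envelope handling.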

22 Fedora and Web Services
Fedora Repository system exposed as two related Web services
  Access (API-A) and Management (API-M)
  Both described using WSDL
  Both have SOAP and HTTP bindings
Back-end services
  Digital object behaviors implemented as linkages to other distributed web services
  Service binding metadata (WSDL) stored in special Fedora objects
  Fedora Repository system acts as a mediator to these services

23 Fedora: Web Services View

24 Fedora Server Design
3-Tiered Architecture
Modular & Extensible
System Diagram

25 Server Design: 3 Layers
Interface
  Service Exposure: API-A, API-M, pure HTTP and SOAP via HTTP.
Application Logic
  Implements requests in terms of the Fedora object model.
Storage
  Database, Filesystem, Object serializations and cache(s).
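
The three layers can be sketched as three small classes, each talking only to the one beneath it. This is a toy illustration of the layering idea, not the Fedora server: the class names, the operation names, and the in-memory dictionary standing in for the database and filesystem are all assumptions.

```python
class Storage:
    """Storage layer: an in-memory stand-in for database and filesystem."""

    def __init__(self):
        self._objects = {}

    def save(self, pid, content):
        self._objects[pid] = content

    def load(self, pid):
        return self._objects[pid]


class ApplicationLogic:
    """Application logic layer: carries out requests against stored objects."""

    def __init__(self, storage):
        self.storage = storage

    def ingest(self, pid, content):
        if not pid:
            raise ValueError("an object needs a PID")
        self.storage.save(pid, content)
        return pid

    def get_dissemination(self, pid):
        return self.storage.load(pid)


class Interface:
    """Interface layer: routes (API, operation) pairs to the logic layer."""

    def __init__(self, logic):
        # Hypothetical operation names, one per API for brevity.
        self._routes = {
            ("API-M", "ingest"): logic.ingest,
            ("API-A", "getDissemination"): logic.get_dissemination,
        }

    def handle(self, api, operation, **kwargs):
        try:
            handler = self._routes[(api, operation)]
        except KeyError:
            raise LookupError(f"{api} has no operation {operation}") from None
        return handler(**kwargs)


server = Interface(ApplicationLogic(Storage()))
server.handle("API-M", "ingest", pid="demo:1", content="<record/>")
print(server.handle("API-A", "getDissemination", pid="demo:1"))  # <record/>
```

Because the interface layer never touches storage directly, the dictionary could be replaced by a database-backed class in this sketch without changing how requests are routed.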

26 System Diagram

27 Fedora: Implementation Technologies
Fedora Web Services Layer
  Apache Axis for SOAP over HTTP
  Apache Tomcat 4.1
Core Repository System
  Sun Java J2SDK1.4
  Xerces for XML parsing and validation
  Saxon 6.5 for XSLT transformation
  Schematron 1.5 for validation
  MySQL and Mckoi relational databases
Deployment Platforms
  Windows 2000, NT, XP
  Solaris
  Linux

28 DEMO Local Repository

29 Deployment Partners
Los Alamos National Laboratory: Research Library
Library of Congress: Motion Picture and Recorded Sound Division
Indiana University: Digital Library group
King's College London: Humanities Computing
NYU: Humanities Computing
Northwestern University: Academic Computing
Oxford: Oxford Digital Library and The Refugee Studies Center
Tufts: Digital Collections and Archives Department

