The Fedora Project March 19, 2003 ISTEC Symposium, Brazil Sandy Payette Cornell Information Science.


1 The Fedora Project March 19, 2003 ISTEC Symposium, Brazil Sandy Payette Cornell Information Science

2 Motivation The Problem of Complex Content

3 Digital Library Content: not just documents... Some familiar objects; complex, compound, dynamic objects

4 Research Questions
- How can clients interact with heterogeneous collections of complex objects in a simple and interoperable manner?
- How can complex objects be designed to be both generic and genre-specific at the same time?
- How can we associate services and tools with objects to provide different presentations or transformations of the object content?
- How can we associate fine-grained access control policies with specific objects, or with groups of objects?
- How can we facilitate the long-term management and preservation of complex objects that have dependencies on distributed content and services?

5 The Flexible Extensible Digital Object Repository Architecture (FEDORA)
- DARPA and NSF-funded research at Cornell (1997-present)
  - CORBA-based reference implementation (Payette/Lagoze)
  - Extensive interoperability testing (with Arms/Blanchi/Overly)
  - Policy Enforcement (Payette/Schneider)
- Interpreted and re-implemented at U of Virginia (1999-)
  - Simple web-oriented implementation, focused on access to collections
  - Java servlet and relational db
  - Testbed of 10,000,000 objects with performance metrics (1999-2001)
- Mellon-funded FEDORA Software (2002-)
  - University of Virginia and Cornell: joint development
  - Open source
  - Web services and XML
  - Mediation of distributed services
  - Preservation focus

6 Fedora: Key Features
- Open System: public APIs, exposed as web services
- Flexible Digital Object Model
  - XML submission and storage (METS Schema)
  - Local and distributed content
  - Data (any type) and metadata (any schema: DC, other)
  - Supports inter-relationships among objects
  - Behavior “contracts” for objects
    - Associate services with objects
    - Objects can provide launch-pad or tool to use object content
- Repository System
  - Management Service: manage digital resources, metadata, as well as computer programs, services and tools that support them
  - Access Service: repository search and object disseminations
  - Mediation: interacts with other distributed web services for content transformation and presentation
  - OAI Provider
  - Access Control
  - Preservation service (future release)
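
To illustrate the "OAI Provider" feature: assuming it refers to the OAI Protocol for Metadata Harvesting (OAI-PMH), a harvester talks to a provider with plain HTTP requests that carry a `verb` argument. The sketch below only builds such a request URL; the base URL is a placeholder, not a real Fedora endpoint.

```python
from urllib.parse import urlencode


def oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: base URL plus verb and its arguments."""
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


# Hypothetical provider address; oai_dc is the Dublin Core format
# that every OAI-PMH provider is required to support.
url = oai_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would issue an HTTP GET on this URL and receive an XML list of Dublin Core records.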

7 Requirements: Heterogeneous Digital Collections
Books, Rare Books, Multimedia, Music, E-texts, Maps, Photographs, Statistics, Video, Art, Manuscripts, Data, Images, 3-D Objects, Journals, Sound Effects

8 Shortcomings of commercial digital library products
- Narrow focus on specific media formats (e.g. image databases, document management)
- Fail to effectively address interrelationships among digital entities
- Fail to address interoperability: no open interfaces to facilitate sharing of services, no standard protocols for cross-system interoperability
- Fail to provide facilities for managing programs and tools that are integral to delivering digital content
- Not extensible: do not enable easy integration of new tools and services
- Do not address fine-grained access control and preservation issues

9 The Fedora Architecture Digital Object Model The Repository Web Services

10 FEDORA Basic Object Architecture: Digital Object Model
- Container to aggregate digital content of any type
  - Data or metadata
  - Local or distributed
- Behavior “contracts”
  - Definitions of abstract operations
  - Fulfillment via bindings to external services
  - Enables multiple “disseminations” of content

11 Application Digital Object Model Functional View Dynamic data services

12 Digital Object Model: Architectural View
- Persistent ID (PID): globally unique persistent id
- Disseminators: public view, access methods for obtaining “disseminations” of digital object content
- System Metadata: internal view, metadata necessary to manage the object
- Datastreams: protected view, content that makes up the “basis” of the object
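
The four parts of the architectural view can be sketched as a small data structure. This is an illustration only, not Fedora's actual classes: the field names, the `Datastream` type and the `example:1` identifier are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One content item: data or metadata, local or distributed."""
    ds_id: str
    mime_type: str
    location: str  # hypothetical field: local path or remote URL


@dataclass
class DigitalObject:
    pid: str                                             # globally unique persistent id
    disseminators: dict = field(default_factory=dict)    # public view: method name -> callable
    system_metadata: dict = field(default_factory=dict)  # internal view
    datastreams: dict = field(default_factory=dict)      # protected view: ds_id -> Datastream

    def add_datastream(self, ds):
        self.datastreams[ds.ds_id] = ds

    def list_methods(self):
        return sorted(self.disseminators)

    def disseminate(self, method):
        # Clients obtain content through a disseminator, not from the datastream itself.
        return self.disseminators[method](self)


obj = DigitalObject(pid="example:1")
obj.add_datastream(Datastream("THUMB", "image/jpeg", "http://example.org/thumb.jpg"))
obj.disseminators["get_thumbnail"] = lambda o: o.datastreams["THUMB"].location
print(obj.disseminate("get_thumbnail"))
```

The point of the split is that the datastreams stay behind the public access methods, so the stored content can change without changing what clients call.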

13 Digital Object Model: Example
An object with Persistent ID (PID), Disseminators, System Metadata and Datastreams, carrying two disseminators:
- Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
- Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh

14 Behavior Contracts
[Diagram relating four entities, with links labeled "behavior contract", "behavior subscription" and "data contract"]
- Data Object: Persistent ID (PID), Disseminators, Datastreams, System Metadata
- Behavior Definition Object: Persistent ID (PID), Behavior Definition Metadata, System Metadata, Datastreams
- Behavior Mechanism Object: Persistent ID (PID), Service Binding Metadata (WSDL), System Metadata, Datastreams
- Web Service
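
One way to read the behavior contract idea is that a behavior definition names abstract operations and a behavior mechanism must bind every one of them to a concrete service. The sketch below assumes that reading. The class names, the identifiers, and the `scale` function, which stands in for a remote web service, are hypothetical and not taken from the Fedora code base.

```python
class BehaviorDefinition:
    """Abstract operations only: the contract a mechanism must fulfil."""
    def __init__(self, pid, methods):
        self.pid = pid
        self.methods = frozenset(methods)


class BehaviorMechanism:
    """Binds each abstract operation of a definition to a concrete service."""
    def __init__(self, pid, bdef, bindings):
        missing = bdef.methods - set(bindings)
        if missing:
            raise ValueError(f"mechanism {pid} does not fulfil: {sorted(missing)}")
        self.pid = pid
        self.bdef = bdef
        self.bindings = bindings


def disseminate(mechanism, method, datastreams):
    """Run one contracted operation against an object's datastreams."""
    if method not in mechanism.bdef.methods:
        raise KeyError(method)
    return mechanism.bindings[method](datastreams)


def scale(size):
    # Stand-in for a call to a remote image-scaling web service.
    return lambda ds: f"{ds['IMAGE']}?size={size}"


image_bdef = BehaviorDefinition("example:bdef-image", ["get_thumbnail", "get_medium"])
image_bmech = BehaviorMechanism(
    "example:bmech-image",
    image_bdef,
    {"get_thumbnail": scale(100), "get_medium": scale(400)},
)

datastreams = {"IMAGE": "http://example.org/img/42.tif"}
print(disseminate(image_bmech, "get_thumbnail", datastreams))
```

Because the contract is checked when the mechanism is constructed, a data object that subscribes to the definition can rely on every listed operation being available.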

15 FEDORA Basic Repository Architecture: Repository System
- Object Management
  - Lifecycle (Ingest/create, Store, Delete, Approve, Purge)
  - Validation
  - PID Generation
  - Version management
  - Access Control
  - Preservation support
- Object Access
  - Object Dissemination
  - Object Reflection
  - Service Mediation

16 Fedora Implementation Understanding the system implementation Web Services Server Design

17 What is a Web Service?
- A distributed application that runs over the internet.
- A web application that publishes an open interface through which clients can send requests and receive responses.
- Standards
  - Transport protocol: HTTP, others
  - Messaging protocol: SOAP, HTTP GET/POST
  - Message encoding: XML
  - Service description: WSDL
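
As a concrete picture of "SOAP as messaging protocol, XML as message encoding", the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The operation name `getThumbnail`, the `pid` parameter and the service namespace are invented for illustration; a real service would define them in its WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/repository"            # hypothetical service namespace


def build_request(operation, params):
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


message = build_request("getThumbnail", {"pid": "example:1"})
print(message)
```

A client would send this message as the body of an HTTP POST to the service endpoint and parse the XML envelope that comes back.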

18 Fedora and Web Services
- Fedora Repository system is a web service
  - Access/Search (API-A) and Management (API-M)
  - Service descriptions published using WSDL
  - Both SOAP and HTTP bindings
- Back-end services
  - Digital object behaviors implemented as linkages to other distributed web services
  - Service binding metadata (WSDL) stored in special Fedora Behavior Mechanism objects
  - Fedora acts as mediator to these services

19 Fedora Repository System: Client and Web Service Interactions
[Diagram: on the front end, users reach the Fedora Repository System through a web browser or client applications; the repository, itself a web service, dispatches requests to back-end web services such as content transform services]

20 Fedora Server Design 3-Tiered Architecture Modular & Extensible System Diagram

21 Server Design: 3 Layers
- Interface: Service Exposure. API-A, API-M, pure HTTP and SOAP via HTTP.
- Application Logic: Implements requests in terms of the Fedora object model.
- Storage: Database, File system, Object serializations and cache(s).
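
The three layers can be pictured as objects that each talk only to the layer below. This is a toy sketch under that assumption: the class names, the `ingest` and `list_datastreams` operations and the in-memory dictionary are illustrative and do not reflect Fedora's real server modules.

```python
class Storage:
    """Storage layer: stands in for database, file system and cache."""
    def __init__(self):
        self._objects = {}

    def save(self, pid, obj):
        self._objects[pid] = obj

    def load(self, pid):
        return self._objects[pid]


class ApplicationLogic:
    """Implements requests in terms of the object model."""
    def __init__(self, storage):
        self._storage = storage

    def ingest(self, pid, datastreams):
        self._storage.save(pid, {"pid": pid, "datastreams": dict(datastreams)})

    def list_datastreams(self, pid):
        return sorted(self._storage.load(pid)["datastreams"])


class Interface:
    """Service exposure: routes API-M and API-A style requests to the logic layer."""
    def __init__(self, logic):
        self._logic = logic

    def handle(self, api, operation, **args):
        routes = {
            ("API-M", "ingest"): self._logic.ingest,
            ("API-A", "list_datastreams"): self._logic.list_datastreams,
        }
        return routes[(api, operation)](**args)


server = Interface(ApplicationLogic(Storage()))
server.handle("API-M", "ingest", pid="example:1",
              datastreams={"DC": "<dc/>", "IMAGE": "img.tif"})
print(server.handle("API-A", "list_datastreams", pid="example:1"))
```

Keeping the storage object behind the logic layer is what would let the database or file system be swapped without touching the exposed APIs.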

22 Fedora System Diagram

23 Open Source Fedora: Implementation Technologies
- Fedora Web Services Layer
  - Apache Axis for SOAP over HTTP
  - Apache Tomcat 4.1
- Core Repository System
  - Sun Java J2SDK1.4
  - Xerces 2-2.0.2 for XML parsing and validation
  - Saxon 6.5 for XSLT transformation
  - Schematron 1.5 for validation
  - MySQL-2.23.52 and Mckoi relational database
- Deployment Platforms
  - Windows 2000, NT, XP
  - Solaris
  - Linux

24 DEMO: Use Cases Connect to Repository www.fedora.info

25 Release Plan
- Phase 1: Fedora 1.0 (May 1, 2003 public)
- Phase 2/3 (2003-2005)
  - Advanced Access Control
  - Preservation Service
  - R2R Repository Federation
  - Reliability
  - Fault tolerance
  - Mirroring and replication
  - Performance tuning
  - Caching
  - Load balancing
  - Storage scalability

26 Deployment Partners
- Los Alamos National Laboratory: Research Library
- Library of Congress: Motion Picture and Recorded Sound Division
- Indiana University: Digital Library group
- King's College London: Humanities Computing
- NYU: Humanities Computing
- Northwestern University: Academic Computing
- Oxford: Oxford Digital Library and The Refugee Studies Center
- Tufts: Digital Collections and Archives Department

27 More Information www.fedora.info

