1
The Fedora Project
March 19, 2003
ISTEC Symposium, Brazil
Sandy Payette
Cornell Information Science
2
Motivation: The Problem of Complex Content
3
Digital Library Content: not just documents...
- Some familiar objects
- Complex, compound, dynamic objects
4
Research Questions
- How can clients interact with heterogeneous collections of complex objects in a simple and interoperable manner?
- How can complex objects be designed to be both generic and genre-specific at the same time?
- How can we associate services and tools with objects to provide different presentations or transformations of the object content?
- How can we associate fine-grained access control policies with specific objects, or with groups of objects?
- How can we facilitate the long-term management and preservation of complex objects that have dependencies on distributed content and services?
5
The Flexible Extensible Digital Object Repository Architecture (FEDORA)
- DARPA and NSF-funded research at Cornell (1997-present)
  - CORBA-based reference implementation (Payette/Lagoze)
  - Extensive interoperability testing (with Arms/Blanchi/Overly)
  - Policy Enforcement (Payette/Schneider)
- Interpreted and re-implemented at U of Virginia (1999-)
  - Simple web-oriented implementation, focused on access to collections
  - Java servlet and relational db
  - Testbed of 10,000,000 objects with performance metrics (1999-2001)
- Mellon-Funded FEDORA Software (2002-)
  - University of Virginia and Cornell - joint development
  - Open source
  - Web services and XML
  - Mediation of distributed services
  - Preservation focus
6
Fedora: Key Features
- Open System: public APIs, exposed as web services
- Flexible Digital Object Model
  - XML submission and storage (METS Schema)
  - Local and distributed content
  - Data (any type) and metadata (any schema: DC, other)
  - Supports inter-relationships among objects
  - Behavior “contracts” for objects
    - Associate services with objects
    - Objects can provide launch-pad or tool to use object content
- Repository System
  - Management Service: manage digital resources, metadata, as well as computer programs, services and tools that support them
  - Access Service: repository search and object disseminations
  - Mediation: interacts with other distributed web services for content transformation and presentation
  - OAI Provider
  - Access Control
  - Preservation service (future release)
7
Requirements: Heterogeneous Digital Collections
Books, Rare Books, Multimedia, Music, E-texts, Maps, Photographs, Statistics, Video, Art, Manuscripts, Data, Images, 3-D Objects, Journals, Sound Effects
8
Shortcomings of commercial digital library products
- Narrow focus on specific media formats (e.g. image databases, document management)
- Fail to effectively address interrelationships among digital entities
- Fail to address interoperability: no open interfaces to facilitate sharing of services; no standard protocols for cross-system interoperability
- Fail to provide facilities for managing programs and tools that are integral to delivering digital content
- Not extensible; do not enable easy integration of new tools and services
- Do not address fine-grained access control and preservation issues
9
The Fedora Architecture
- Digital Object Model
- The Repository
- Web Services
10
FEDORA Basic Object Architecture: Digital Object Model
- Container to aggregate digital content of any type
  - Data or metadata
  - Local or distributed
- Behavior “contracts”
  - Definitions of abstract operations
  - Fulfillment via bindings to external services
  - Enables multiple “disseminations” of content
11
Digital Object Model: Functional View
[Diagram labels: Application; Dynamic data services]
12
Digital Object Model: Architectural View
- Persistent ID (PID): globally unique persistent id
- Disseminators: public view; access methods for obtaining “disseminations” of digital object content
- System Metadata: internal view; metadata necessary to manage the object
- Datastreams: protected view; content that makes up the “basis” of the object
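The four parts of the architectural view can be sketched as a small data structure. This is an illustrative sketch only, not Fedora's actual implementation: the class names, field names, the "demo:1" PID, and the example.org URL are all assumptions made for the example, and a disseminator is modeled as a plain Python function rather than a bound service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Datastream:
    """Protected view: one piece of content behind the object (hypothetical shape)."""
    datastream_id: str
    mime_type: str
    location: str  # a local path, or a URL when the content is distributed


@dataclass
class DigitalObject:
    pid: str  # globally unique persistent id
    # Internal view: metadata needed to manage the object.
    system_metadata: Dict[str, str] = field(default_factory=dict)
    # Protected view: the content that forms the basis of the object.
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Public view: method name -> function that produces a dissemination.
    disseminators: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def list_methods(self) -> List[str]:
        """Return the names of the access methods this object exposes."""
        return sorted(self.disseminators)

    def disseminate(self, method: str) -> str:
        """Run one access method against this object's content."""
        if method not in self.disseminators:
            raise KeyError(f"{self.pid} has no method {method!r}")
        return self.disseminators[method](self)


def get_thumbnail(obj: DigitalObject) -> str:
    # Hypothetical disseminator: hands back the location of the thumbnail datastream.
    return obj.datastreams["THUMB"].location


image = DigitalObject(
    pid="demo:1",
    system_metadata={"state": "active"},
    datastreams={"THUMB": Datastream("THUMB", "image/jpeg", "http://example.org/thumb.jpg")},
    disseminators={"get_thumbnail": get_thumbnail},
)
print(image.list_methods())
print(image.disseminate("get_thumbnail"))
```

Clients in this sketch only touch `list_methods` and `disseminate`; the datastreams stay behind the public view, which mirrors the public/protected split described on the slide.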
13
Digital Object Model: Example
[Diagram: a digital object with Persistent ID (PID), Disseminators, System Metadata, and Datastreams]
Disseminators:
- Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
- Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh
14
Object Behavior Contracts
[Diagram: three kinds of objects and a web service, linked by contracts]
- Data Object: Persistent ID (PID), Disseminators, System Metadata, Datastreams
- Behavior Definition Object: Persistent ID (PID), Behavior Definition Metadata, System Metadata, Datastreams
- Behavior Mechanism Object: Persistent ID (PID), Service Binding Metadata (WSDL), System Metadata, Datastreams
- Web Service
- Relationship labels: behavior contract, behavior subscription, data contract
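One way to read the contract idea is as follows: a behavior definition names abstract operations, a behavior mechanism promises to fulfil every one of them and states which datastreams it needs, and a data object's content is run through the mechanism on request. The sketch below is a loose interpretation under those assumptions. All class and function names are hypothetical, and ordinary Python callables stand in for the external web services that a real mechanism would bind to through WSDL.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class BehaviorDefinition:
    """Abstract operations only; no implementation (hypothetical shape)."""
    pid: str
    operations: FrozenSet[str]


@dataclass
class BehaviorMechanism:
    """Binds each abstract operation to a service (here: a local callable)."""
    pid: str
    definition: BehaviorDefinition
    required_datastreams: FrozenSet[str]  # what the mechanism needs from a data object
    bindings: Dict[str, Callable[[Dict[str, str]], str]]

    def __post_init__(self) -> None:
        # Behavior contract: every defined operation must have a binding.
        missing = self.definition.operations - set(self.bindings)
        if missing:
            raise ValueError(f"unbound operations: {sorted(missing)}")


def disseminate(datastreams: Dict[str, str], mechanism: BehaviorMechanism, operation: str) -> str:
    """Run one operation for a data object, checking both sides of the agreement."""
    if operation not in mechanism.definition.operations:
        raise KeyError(f"operation {operation!r} is not in the behavior definition")
    # Data contract: the object must supply the datastreams the mechanism needs.
    missing = mechanism.required_datastreams - set(datastreams)
    if missing:
        raise ValueError(f"object lacks datastreams: {sorted(missing)}")
    return mechanism.bindings[operation](datastreams)


image_bdef = BehaviorDefinition("demo:bdef-image", frozenset({"get_thumbnail", "get_medium"}))
image_bmech = BehaviorMechanism(
    pid="demo:bmech-image",
    definition=image_bdef,
    required_datastreams=frozenset({"IMAGE"}),
    bindings={
        "get_thumbnail": lambda ds: f"scaled({ds['IMAGE']}, 100)",
        "get_medium": lambda ds: f"scaled({ds['IMAGE']}, 400)",
    },
)
print(disseminate({"IMAGE": "photo.tif"}, image_bmech, "get_thumbnail"))
```

Because the definition and the mechanism are separate values, a second mechanism could be swapped in for the same definition without changing how the data object is addressed.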
15
FEDORA Basic Repository Architecture: Repository System
- Object Management
  - Lifecycle (Ingest/create, Store, Delete, Approve, Purge)
  - Validation
  - PID Generation
  - Version management
  - Access Control
  - Preservation support
- Object Access
  - Object Dissemination
  - Object Reflection
- Service Mediation
16
Fedora Implementation: Understanding the system implementation
- Web Services
- Server Design
17
What is a Web Service?
- A distributed application that runs over the internet.
- A web application that publishes an open interface through which clients can send requests and receive responses.
- Standards
  - Transport protocol: HTTP, others
  - Messaging protocol: SOAP, HTTP GET/POST
  - Message encoding: XML
  - Service description: WSDL
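To make the standards stack concrete, here is a minimal sketch of the XML message a client might build before sending it over HTTP. Only the SOAP 1.1 envelope namespace is a real, fixed value; the `urn:example:repository` namespace, the `describeObject` operation, and the `pid` parameter are made up for illustration and are not taken from any Fedora service description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
EXAMPLE_NS = "urn:example:repository"  # hypothetical service namespace


def build_soap_request(operation: str, params: dict) -> str:
    """Wrap one operation call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{EXAMPLE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_soap_request("describeObject", {"pid": "demo:1"})
print(request_xml)
```

In a real exchange this string would be the body of an HTTP POST, and the operation names and parameter types would come from the service's WSDL rather than being hard-coded.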
18
Fedora and Web Services
- Fedora Repository system is a web service
  - Access/Search (API-A) and Management (API-M)
  - Service descriptions published using WSDL
  - Both SOAP and HTTP bindings
- Back-end services
  - Digital object behaviors implemented as linkages to other distributed web services
  - Service binding metadata (WSDL) stored in special Fedora Behavior Mechanism objects
  - Fedora acts as mediator to these services
19
Fedora Repository System: Client and Web Service Interactions
[Diagram labels. Frontend: user, web browser, client application. Fedora Repository System: Web Service Dispatch. Backend: Web Service, Content Transform Service.]
20
Fedora Server Design
- 3-Tiered Architecture
- Modular & Extensible
- System Diagram
21
Server Design: 3 Layers
- Interface: Service exposure. API-A, API-M, pure HTTP and SOAP via HTTP.
- Application Logic: Implements requests in terms of the Fedora object model.
- Storage: Database, file system, object serializations and cache(s).
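The layering can be illustrated with a toy example in which each layer talks only to the one beneath it. Everything here is an assumption made for the sketch: the class names, the in-memory dictionary standing in for the database and file system, and the `/access/<pid>/<datastream>` path format, which is invented and is not Fedora's URL syntax.

```python
class InMemoryStorage:
    """Storage layer: stands in for the database and file system."""

    def __init__(self):
        self._objects = {}

    def save(self, pid, datastreams):
        self._objects[pid] = dict(datastreams)

    def load(self, pid):
        return self._objects[pid]  # raises KeyError for an unknown pid


class RepositoryLogic:
    """Application logic layer: expresses requests in terms of objects."""

    def __init__(self, storage):
        self._storage = storage

    def ingest(self, pid, datastreams):
        if not pid:
            raise ValueError("an object needs a persistent id")
        self._storage.save(pid, datastreams)

    def get_datastream(self, pid, datastream_id):
        return self._storage.load(pid)[datastream_id]


class HttpStyleInterface:
    """Interface layer: turns a request path into a call on the logic layer."""

    def __init__(self, logic):
        self._logic = logic

    def handle(self, path):
        parts = path.strip("/").split("/")
        if len(parts) != 3 or parts[0] != "access":
            return 400, "bad request"
        _, pid, datastream_id = parts
        try:
            return 200, self._logic.get_datastream(pid, datastream_id)
        except KeyError:
            return 404, "not found"


logic = RepositoryLogic(InMemoryStorage())
logic.ingest("demo:1", {"TITLE": "Hello"})
interface = HttpStyleInterface(logic)
print(interface.handle("/access/demo:1/TITLE"))
```

The point of the separation is that a SOAP front end could reuse `RepositoryLogic` unchanged, and the storage class could be replaced by a database-backed one without touching the interface.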
22
Fedora System Diagram
23
Open Source Fedora: Implementation Technologies
- Fedora Web Services Layer
  - Apache Axis for SOAP over HTTP
  - Apache Tomcat 4.1
- Core Repository System
  - Sun Java J2SDK1.4
  - Xerces 2-2.0.2 for XML parsing and validation
  - Saxon 6.5 for XSLT transformation
  - Schematron 1.5 for validation
  - MySQL 3.23.52 and Mckoi relational databases
- Deployment Platforms
  - Windows 2000, NT, XP
  - Solaris
  - Linux
24
DEMO: Use Cases
Connect to Repository: www.fedora.info
25
Release Plan
- Phase 1: Fedora 1.0 (May 1, 2003 public)
- Phase 2/3 (2003-2005)
  - Advanced Access Control
  - Preservation Service
  - R2R Repository Federation
  - Reliability
  - Fault tolerance
  - Mirroring and replication
  - Performance tuning
  - Caching
  - Load balancing
  - Storage scalability
26
Deployment Partners
- Los Alamos National Laboratory: Research Library
- Library of Congress: Motion Picture and Recorded Sound Division
- Indiana University: Digital Library group
- King's College London: Humanities Computing
- NYU: Humanities Computing
- Northwestern University: Academic Computing
- Oxford: Oxford Digital Library and The Refugee Studies Center
- Tufts: Digital Collections and Archives Department
27
More Information: www.fedora.info