Download presentation
Presentation is loading. Please wait.
Published byJanis Mason Modified over 8 years ago
1
Fedora An Architecture for Complex Objects and their Relationships Old Dominion University, VA April 7, 2005 Sandy Payette Cornell University
2
The Fedora Project Fedora –Flexible –Extensible –Digital –Object –Repository –Architecture Open source software –Not Red Hat ! –Mozilla Public License
3
The Digital Problem Space: Manage, Publish, Preserve… Conventional digital objects Complex, compound, dynamic objects
4
Fedora History Cornell Research (1997-present) –DARPA and NSF-funded research –First reference implementation developed –Distributed, Interoperable Repositories (experiments with CNRI) –Policy Enforcement First Application (1999-2001) –University of Virginia digital library prototype –Technical implementation: adapted to web; RDBMS storage –Scale/stress testing for 10,000,000 objects Open Source Software (2002-present) –Andrew W. Mellon Foundation grants –Technical implementation: XML and web services –Fedora 1.0 (May 2003) –Fedora 2.0 (Jan 2005)
5
Why Fedora? (1) Digital Object Model –Abstraction for heterogeneous digital resources –Flexibility for different “content models” –Container for content and metadata (both local and remote) –Behaviors (extensible service attachments) Service-Oriented Architecture –Core Repository web service (SOAP and REST) –Service Framework for collaborating and supporting services Feature-worthy for archiving and preservation –XML object serialization for ingest, storage, and export –Content versioning –Event history –Manage dependencies (e.g., object-to-object relationships) –New service framework for plug-in of preservation services
6
Why Fedora? (2) Relationships –RDF-based index –Define and query “the graph” of objects in repository Content repurposing –Reuse digital content in different contexts –Re-purpose content via mechanisms for dynamically transforming content to fit new requirements Web Services –SOAP and REST bindings –WSDL to define interfaces –XML transmission Easy integration with other apps and systems –Does not assume any particular workflow or end-user application –Generic repository service as substrate
7
Fedora Use Cases Digital Library Collections Institutional Repository Educational Software Information Network Overlay Digital archives and preservation Digital Asset Management Content Management System Scholarly publishing
8
Selected Fedora Adopters University of Virginia (image, EAD, e-texts)imageEADe-texts VTLS Tufts University OhioLink Northwestern: Library and Academic Technologies National Science Digital Library (NSDL): Core Integration ARROW: National Library of Australia and Monash University Royal Library Denmark, National Library, and DTU Rutgers University Indiana University American Geophysical Union Library of Congress: I Hear America Singing University of Delaware Hamilton College Cornell CIT Tibetan Buddhist Resource Center Yale University DISA – South Africa, History of Apartheid resistance
9
Fedora Digital Object Model
10
Digital object identifier Reserved Datastreams Key object metadata Disseminators Pointers to service definitions to provide service-mediated views Datastreams Set of content or metadata items Fedora Digital Object Model Component View Persistent ID (PID) Dublin Core (DC) Datastream Audit Trail (AUDIT) Relations (RELS-EXT) Disseminator Default Disseminator
11
Managed Content Fedora stores and manages the content bytestream Fedora stores a reference (URL) to the content Fedora stores a reference (URL) to the content, but will not mediate access to content. Fedora stores a name-spaced block of XML content within the Fedora digital object XML file. The Datastream Component External Referenced External Redirected Inline XML 4 Classifications for Datastreams
12
implements Behavior Definition (BDEF Object) External Service The Disseminator Component Library Resource (Data Object) Persistent ID (PID) Disseminator Default Disseminator Persistent ID (PID) Service Binding Metadata (WSDL) Persistent ID (PID) Service Definition Metadata Behavior Mechanism (BMECH Object) Encapsulates references to service definition objects
13
Example Digital Object Get Object Profile Get DC Get THUMB Get MRSID Get Medium Quality Get High Quality Get MARC Record External Service External Service Default Disseminations Custom Disseminations Persistent ID (PID) DC (text/xml) AUDIT (text/xml) RELS-EXT (text/xml) Image Disseminator Default Disseminator Metadata Disseminator MRSID (image/x-mrsid) THUMB (image/gif)
14
Fedora – XML for digital objects FOXML (Fedora Object XML) –Simple XML format directly expresses Fedora object model –Easily adapts to Fedora new and planned features –Easily translated to other well-known formats –Internal storage format for objects in repository XML-based Ingest/Export of objects –FOXML, METS (Fedora extension) –Extensible to accommodate new XML formats –Planned: METS 1.4, MPEG21 DIDL
15
FOXML – Object Properties 2
16
FOXML – Datastream (type ‘E’) 2
17
FOXML – Relationships Datastream 2
18
FOXML – Disseminator 2
19
Fedora Repository Service
20
RDF files rdbms
21
Fedora 2.0 Repository Modules Management and Validation (via API-M) –Object Ingest and Export –Object Validation –Object Maintenance (create-modify-delete-purge-etc.) –Object Versioning –Initiates incremental object Indexing Access and Dissemination (via API-A and API-A-Lite) –Get object profile –Get Datastream Disseminations –Get Service-Mediated Disseminations Storage –Default file system implementation –Stores XML object wrappers and datastream byte stream content –Relational database registry
22
Fedora 2.0 Repository Modules Basic Search –Search object properties and DC record of each object –Relational database Resource Index Search –Search “graph” of objects with object properties, object relationships, DC –Kowari triple-store (RDF-based index of repository) Authorization –XACML policy enforcement –Repository policies and object-specific policies –Sun XACML Engine OAI-PMH
23
Fedora Web Service APIs in a Nutshell Management Service (API-M) –Ingest Object –Export Object –Get Object XML –Purge Object –Modify Object –Get Next PID –Get Datastream(s) –Get DatastreamHistory –Get DisseminatorHistory –Get Disseminator(s) –Add/modify/purge Datastream –Add/modify/purge Disseminator –Set State
24
Fedora Web Service APIs in a Nutshell Access Service (API-A and API-A-LITE) –Describe Repository –Get Object Profile –Get Object History –Get Datastream –Get Dissemination –Find Objects –Resume Find Objects
25
Fedora Web Service APIs in a Nutshell API-A-Lite –Repository-level operations: fedora/describe - Describe Repository fedora/search – methods to locate objects via the default repository index –Object-level operations: fedora/get - method to get object profile fedora/get/.. – method to “disseminate” a view of an object’s content Fedora/getMethods – methods get information about all disseminations available on object OAI-PMH Provider Service –All OAI-PMH methods to harvest OAI-DC from each object
26
Fedora 2.1 (May 2005) Authentication plug-ins –HTTP basic authentication and SSL –Plug-in #1 : Tomcat user/password file/db –Plug-in #2 : LDAP tie-in –Plug-in #3 : Radius Authentication Authorization module –XML-based policies using XACML –Fine-grained policy enforcement (API actions X subject attrs X object attrs) –Repository-wide policies –Object-specific policies
27
Fedora 2.0 - Clients Fedora Administrator (via SOAP interfaces) –Java Swing client –Ingest/Export objects –Batch and single object creation and modification –Wizards for creating BDEF/BMECH objects Web Browser (via REST interfaces) –API-A-LITE: Access, Search, –OAI –RISearch: Resource Index Search –API-M-LITE: Selected management operations Command Line Utilities –Ingest, export, purge –Migration Policy Builder (available in 2.1) –Graphical user interface to assist in authoring XACML policies
28
Fedora Service Framework (as of Fedora 2.1)
29
Phase 2 – Preservation Services Enhanced ingest/export (for archive transfers) –Focus on exchange of self-contained archives among different systems –OAI Harvesting of complex objects (e.g., MPEG21-DIDL) Preservation Integrity Service –Check Datastreams on ingest and modification –Validate byte stream format integrity (e.g., via JHOVE) –Checksums Preservation Monitoring Service –Dependency checking –Publication of repository events significant to preservation
30
Fedora Service Framework (Year 1 - starting v2.1)
31
Fedora Service Framework (Year 2)
32
Fedora Service Framework (Year 3)
33
Fedora Resource Index: Practical Use of Semantic Web Technologies
34
Fedora Digital Objects Resource Index View
35
Fedora Digital Objects Service Relationships – Object to BDef/BMech
36
Fedora 2.0 and RDF Object-to-object Relationships –Ontology of common relationships (RDF schema) –Relationships stored in special datastream (RELS-EXT) Resource Index (RI) –RDF-based index of repository (Kowari triple-store) –Graph-based index includes: Object properties and Dublin Core Object Relationships Object Disseminations RI Search –Powerful querying of graph of inter-related objects –REST-based query interface (using RDQL or ITQL) –Results in different formats (triples, tuples, sparql)
37
Uses of Object Relationships Define collections (e.g., collection objects) Assert critical relationships among object for management purposes Enable network overlay –Surrogate objects referring to external entities –Assert relationships among them –Assert other relationships (e.g., annotations) Enable navigation of repository (as tree or graph)
38
Fedora Relationship Ontology (RDFS) isPartOf / hasPart isMemberOf / hasMember isDescriptionOf / hasDescription hasEquivalent … others
39
Demo: Collection – Member Relationships Collection Object [smiley]smiley –Datastream containing a query to Resource Index for all members of collection Image Objects [brush]brush –Use RELS-EXT datastream to assert relationship to collection object
40
Example: UVA’s Scholarly Text Collections
41
UVa TEI Book Content Model
42
TEI Book Example link
43
Example: UVA’s Quantitative Data Collections
44
Example: NSDL Network Overlay Architecture Fedora
45
Phase 2 - Development Plans New Services (as depicted in Framework) Federated repositories Performance – goal of 10 million objects Web services security and Shibboleth Event Notification – pub/sub Code Refactoring New Client Applications –Fedora-Web-IR (for institutional repository use) –Advanced Scholarly workbench –Workflow –Tools for RDF browse and graph traversal
46
Fedora Software Distribution Open Source (Mozilla Public License) 100% Java (Sun Java J2SDK1.4) Supporting Technologies –Apache Tomcat –Apache Axis (SOAP) –Xerces for XML parsing and validation –Saxon for XSLT transformation –Schematron for validation –RDBMS: MySQL, Mckoi, Oracle 9i support –Kowari triple store –Sun XACML Engine for policy enforcement Deployment Platforms –Windows 2000, NT, XP –Solaris –Linux –Mac OSX
47
Fedora Development Consortium Advisory Board –University of Virginia –Tufts –VTLS –ARROW (Monash University and Nat’l Lib Australia) –Harris Corp. –Danish Royal Library and DTU –Northwestern University –NSDL – Core Integration Mission –Requirements Definition, Specifications. Joint Development –Commission of Working Groups Content Modeling Outreach and Education Workflow and Service-Oriented Processes –Recommendation for Long-Term sustainability model Governance and Funding Set Fedora Free – full open source model (e.g., public SourceForge) Code Maintenance (UVA until 2012; plan for beyond)
48
Recent News Downloads ~20K; 52 countries Growth – lots of new interest Fedora Users Conference (May 13-14, Rutgers) Interesting new adopters –OhioLink –DISA (South Africa history) Interesting new proposals –Company X finalist for large government contract –Cornell Lab of Ornithology (data + tools + documents) Recent Article –XML CoverPages http://xml.coverpages.org/ni2005-03-18-a.html http://xml.coverpages.org/ni2005-03-18-a.html
49
Finally, a new Fedora Web Site! www.fedora.info
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.