Overview: Fedora Architecture and Software Features
Sandy Payette, Executive Director
UK Fedora Training, London
January 22-23, 2009
Kudos to Mark Leggott of UPEI for this great slide!
Digital Content: familiar content types; complex, compound, dynamic content types
Support for inter-connected digital content
[Diagram labels: Documents, Text, Data, Simulations, Images, Video, Computations, Automated Analyses]
Fedora Repository – Key Features
Digital Object Model
  - Aggregate content “datastreams” in an object (any type of content)
  - Intermix both local content and external content
  - Relationships among digital objects (via RDF)
  - Register “content models” for known object patterns
Repository Service
  - Modular Web service interfaces (REST/SOAP)
  - Versioning
  - Dynamic service binding based on object content model types
  - File-centric (all essential characteristics in XML files)
  - RDF-based indexing (semantic triplestore index with query)
  - Security with pluggable authentication and XACML policies
  - Journaling (replay all events to create replicas of repository)
Basic Building Block: Fedora Digital Object Model
[Diagram: a digital object made up of a Persistent ID; Reserved Datastreams (DC, RELS-EXT for relationships and properties, Audit Trail, Policy); and Datastreams 1 to n (any type, any number)]
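The object model on this slide can be sketched as a small data structure. This is a minimal illustration only: the class and field names are hypothetical, the reserved IDs "AUDIT" and "POLICY" are assumed stand-ins for the slide's "Audit Trail" and "Policy", and none of this is Fedora's actual FOXML schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed IDs for the reserved datastreams named on the slide.
RESERVED_IDS = ("DC", "RELS-EXT", "AUDIT", "POLICY")


@dataclass
class Datastream:
    ds_id: str      # identifier within the object, e.g. "DC" or "IMAGE"
    mime_type: str  # any content type is allowed
    location: str   # local path or external URL (content can be intermixed)


@dataclass
class DigitalObject:
    pid: str  # persistent identifier
    datastreams: Dict[str, Datastream] = field(default_factory=dict)

    def add(self, ds: Datastream) -> None:
        """Attach a datastream, replacing any existing one with the same ID."""
        self.datastreams[ds.ds_id] = ds

    def reserved(self) -> List[Datastream]:
        """Datastreams whose IDs are in the reserved set."""
        return [d for d in self.datastreams.values() if d.ds_id in RESERVED_IDS]

    def content(self) -> List[Datastream]:
        """All other datastreams (any type, any number)."""
        return [d for d in self.datastreams.values() if d.ds_id not in RESERVED_IDS]


# Hypothetical object with two reserved datastreams and one external image.
obj = DigitalObject(pid="demo:1")
obj.add(Datastream("DC", "text/xml", "dc.xml"))
obj.add(Datastream("RELS-EXT", "application/rdf+xml", "rels-ext.xml"))
obj.add(Datastream("IMAGE", "image/jpeg", "http://example.org/smiley.jpg"))
```

The point of the sketch is the shape: one persistent ID, a handful of reserved datastreams with fixed roles, and an open-ended set of content datastreams.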
Forming Collections… with relationships
[Diagram labels: PID 1, PID 2, PID 3, PID 5; isMemberOfCollection relationships; “Smiley Stuff” Collection Object; Query]
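As a rough sketch of how a collection can be formed purely from relationships, the example below stores RDF-style triples in a list and finds a collection's members by filtering on the isMemberOfCollection predicate. The PIDs are made up, the predicate URI is assumed to follow Fedora's relations ontology namespace, and a real repository would answer this with a triplestore query rather than a Python loop.

```python
# Assumed predicate URI; PIDs and the second predicate are hypothetical.
IS_MEMBER = "info:fedora/fedora-system:def/relations-external#isMemberOfCollection"
HAS_PART = "info:fedora/fedora-system:def/relations-external#hasPart"

COLLECTION = "info:fedora/demo:smiley-stuff"

# Each triple is (subject, predicate, object).
triples = [
    ("info:fedora/demo:5", IS_MEMBER, COLLECTION),
    ("info:fedora/demo:1", IS_MEMBER, COLLECTION),
    ("info:fedora/demo:3", IS_MEMBER, COLLECTION),
    ("info:fedora/demo:1", HAS_PART, "info:fedora/demo:4"),  # unrelated relationship
]


def members_of(collection, triples):
    """Return the sorted subjects that point at `collection` via isMemberOfCollection."""
    return sorted(s for s, p, o in triples if p == IS_MEMBER and o == collection)


members = members_of(COLLECTION, triples)
```

Note that the collection object itself holds no member list; membership is asserted on each member and discovered by query.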
Digital Objects... with compositional relationships
Digital Objects… information network
[Diagram labels: scholarly object; PID 1 to PID 5 linked by hasPart, annotates, and providesContext relationships; Amazon Web Service; library content; external commercial web content]
Fedora Repository Service
For creating, managing, and providing access to various and sundry objects
[Diagram: Fedora APIs (SOAP and REST): Manage API, Access API, Registry Search, RDF Query. Modules: Ingest, Manage, Access, Validate, Policy, CMABind, Store, Registry, RDF Index. Storage: File system (Objects), RDBMS (Registry), Triplestore]
Better integration with web and workflows
New formats: Atom (2008), OAI-ORE (2009)
Additional APIs: SWORD (2008), APP (2009), ? WEBDAV (2009)
[Diagram: Fedora APIs (SOAP and REST): Manage API, Access API, Registry Search, RDF Query. Modules: Ingest, Manage, Access, Validate, Policy, CMABind, Store, Registry, RDF Index. Storage: File system (Objects), RDBMS (Registry), Triplestore]
Fedora Core Repository Service (Mapping to OAIS Perspective)
Preservation Enabling Features
  - XML-based Digital Object Storage
  - XML-based Ingest and Export (METS, FOXML, extensible to other formats)
  - Automatic Versioning of content datastreams
  - Audit Trail of all modifications to objects
  - Recovery via Repository Rebuild
      - Reconstitutes the repository by crawling persistent XML object store
      - Rebuilds object registry, search index, resource index
  - Fedora Journaling for Replication
      - Captures all API-M transactions
      - Replay to one or more “following” repositories (replication)
  - Preservation Support Services (upcoming with community)
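To make the versioning and audit-trail features above concrete, here is a minimal in-memory sketch. It assumes a simplified model in which every modification appends a new version of a datastream and writes one audit record; the class, method, and action names are hypothetical and do not reproduce Fedora's API-M operations or its stored XML.

```python
from datetime import datetime, timezone


class VersionedObject:
    """Toy object that never overwrites content: each change adds a version."""

    def __init__(self, pid):
        self.pid = pid
        self.versions = {}  # ds_id -> list of content versions, oldest first
        self.audit = []     # append-only list of (timestamp, action, ds_id)

    def modify_datastream(self, ds_id, content):
        """Store `content` as the newest version and record the change."""
        self.versions.setdefault(ds_id, []).append(content)
        self.audit.append((datetime.now(timezone.utc), "modifyDatastream", ds_id))

    def get(self, ds_id, version=-1):
        """Return a version by list index; the default -1 is the latest."""
        return self.versions[ds_id][version]


# Hypothetical usage: two edits to the same datastream.
vobj = VersionedObject("demo:1")
vobj.modify_datastream("DS1", b"first draft")
vobj.modify_datastream("DS1", b"second draft")
```

Because earlier versions stay in place and the audit list only grows, the history of an object can be inspected after the fact, which is the property the slide's preservation features rely on.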
Simple Replication of Repositories
Replica repositories, each with a different underlying storage system; useful for failover, redundancy, archiving
  - Now: Fedora Journaling http://fedora.info/download/2.2.1/userdocs/server/journal/index.html
  - Future: Journal Event Messaging via Fedora JMS
  - Can configure multiple “followers”
[Diagram labels: Leader Repository; Follower Repository; Journal Event Log; API events; Sun Honeycomb]
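The leader/follower idea can be illustrated with a small simulation: the leader applies each event and appends it to a journal, and replaying that journal brings any number of followers to the same state. This is a sketch under stated assumptions, not Fedora's journaling implementation; the event shape, the "ingest" and "purge" operation names, and the classes are invented for the example.

```python
import json


class Repository:
    """Toy repository: a dict of pid -> content, changed only through events."""

    def __init__(self):
        self.objects = {}

    def apply(self, event):
        if event["op"] == "ingest":
            self.objects[event["pid"]] = event["content"]
        elif event["op"] == "purge":
            self.objects.pop(event["pid"], None)
        else:
            raise ValueError(f"unknown operation: {event['op']}")


class Leader(Repository):
    """Repository that also records every applied event in a journal."""

    def __init__(self):
        super().__init__()
        self.journal = []  # serialized events, in the order they were applied

    def apply(self, event):
        super().apply(event)
        self.journal.append(json.dumps(event))


def replay(journal, follower):
    """Apply journaled events to a follower in their original order."""
    for line in journal:
        follower.apply(json.loads(line))


# Hypothetical run: one leader, two followers.
leader = Leader()
leader.apply({"op": "ingest", "pid": "demo:1", "content": "<object 1/>"})
leader.apply({"op": "ingest", "pid": "demo:2", "content": "<object 2/>"})
leader.apply({"op": "purge", "pid": "demo:1"})

followers = [Repository(), Repository()]
for f in followers:
    replay(leader.journal, f)
```

Since followers only need to understand the event stream, each one is free to use a different storage backend, which matches the slide's point about replicas with different underlying storage systems.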
Performance/Scalability Measurement
Fedora - Software Features
http://fedora-commons.org/documentation/3.0/userdocs/index.html
Questions and Discussion