Toward a Digital Asset Management Ecosystem at Texas A&M University Libraries An update on developments in document workflows, data modeling, media service,


1 Toward a Digital Asset Management Ecosystem at Texas A&M University Libraries
An update on developments in document workflows, data modeling, media service, and exhibitions
William Welling, James Creel, Jeremy Huff, Jason Savell, Douglas Hahn, Sarah Potvin, Sean Buckner, Michael Bolton
Welcome to Managing Digital Assets from Curation to Exhibition.
Background: Magpie was presented last year at TCDL. The software is built using the Weaver framework, a Texas A&M University Libraries in-house framework. The original use case for Magpie, known at the time as the Metadata Assignment Tool, was to facilitate annotation of legacy dissertations from their scanned and OCR'd form. There was also a requirement to prepopulate specific metadata fields from Voyager's MARC records for the legacy dissertations. Since then the use case has evolved, and more use cases emerged as the software's potential was recognized.

2 Talk Outline
Introducing the DAME
  DAMS Task Force
  Needs Assessment
  Piloting and Evaluation
Components and Technologies
  Document Workflows
  IR/Storage/Data Modeling
  Media Servers
  Discovery
  Exhibition
Demonstration of Current Developments
accuracy: ensure metadata is correct
efficiency: annotate large collections in a reasonable amount of time
normalization: avoid repeating metadata and multiple ambiguous terms; use controlled vocabularies and name authorities
synchronization: ensure metadata is consistent between repositories, exhibits, preservation, etc.
Digital assets can come from many collections, some homogeneous and some not. Metadata can come from many sources, such as catalogs, repositories, and exports. The annotated digital asset can have many destinations, such as multiple IRs, multiple exhibits, and/or preservation.
A good exhibition solution needs to exhibit multiple collections with digital assets from varying sources. It should allow exhibit-level annotations. It should be discoverable and aesthetically pleasing. Ideally, the exhibit layer would not duplicate the digital asset but reference it and make use of linked data.

3 Introducing the DAME
Needs assessment revealed a diverse set of requirements not met by any single system
Different exhibitions and collection types require different workflows
  Legacy dissertations
  Agricultural Bulletin
  Image Collections
  Newspapers and articles
  Existing Collections
  Etc.
Legacy dissertations: This is an ongoing project to digitize, annotate, and preserve legacy dissertations. As already mentioned, this was the original use case, but the requirements have since changed. The annotation will now be done with existing commercial tools, which still leaves the need to preserve the digital assets. The process requires that the MARC record from Voyager be packaged with the documents and preserved in Archivematica. No user interaction is required for this, so we introduced a headless mode to Magpie. When the scanned documents are finished, the headless process gathers the MARC record from Voyager, then automatically packages the documents and pushes them to Archivematica for preservation.
Agricultural Bulletin: Dr. Robert McGeachin, Agriculture and Digital Services Librarian, came to us with a project to semi-automatically index the Agricultural Bulletin, which he has been diligently annotating and marshaling into DSpace. The project called for a UI that provides a list of suggestions, discovered by finding all occurrences of any term from the National Agricultural Library Thesaurus and applying a set of use-for rules.
Image Collections: We have had requests for digital exhibits of photos taken from physical exhibits.
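The headless flow for legacy dissertations can be sketched roughly as below. This is only an illustration of the steps named in the notes (gather the MARC record, package it with the scanned files, push the package for preservation). The names `PreservationPackage`, `run_headless`, `fetch_marc`, and `push_package` are hypothetical and are not taken from Magpie, Voyager, or Archivematica; the two callables stand in for whatever catalog lookup and preservation hand-off a real deployment uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreservationPackage:
    """Hypothetical bundle of scanned files plus their catalog record."""
    document_id: str
    files: List[str]
    marc_record: str


def run_headless(
    document_id: str,
    files: List[str],
    fetch_marc: Callable[[str], str],
    push_package: Callable[[PreservationPackage], None],
) -> PreservationPackage:
    """Package a finished scan with its MARC record and hand it off.

    No user interaction happens here: the catalog lookup and the push to
    preservation storage are injected as plain callables (assumptions).
    """
    marc = fetch_marc(document_id)
    if not marc:
        # Without a catalog record the package would be incomplete.
        raise LookupError(f"no MARC record found for {document_id}")
    package = PreservationPackage(document_id, list(files), marc)
    push_package(package)
    return package
```

Passing the lookup and the push in as functions keeps the sketch free of any claim about how the real services are called.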

4 Ecosystem
Components: MAGPIE Service and UI, Pelican Service, Authorization Service, Scanning workstations, Abbyy OCR, DSpace, Fedora, IIIF Manifest Generator, Archivematica, Cantaloupe, Spotlight, Mirador
Magpie: the application being presented
Pelican: a web service for name disambiguation, modified to provide subject suggestions by counting the frequency of thesaurus words within a document and applying use-for rules
Authorization Service: used to authenticate and authorize applications, either by basic login or through a service provider such as Shibboleth
Scanning workstations: scanners for scanning physical documents
Abbyy OCR: software run as a service to perform optical character recognition on scanned documents
DSpace: institutional repository
Fedora: institutional repository
Archivematica: preservation software
Spotlight: exhibition software
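The subject suggestion idea described for Pelican (count thesaurus words in a document, then apply use-for rules) might look something like the following minimal sketch. It is not Pelican's code. The tiny thesaurus, the `USE_FOR` mapping, and the function name `suggest_subjects` are made up for illustration, and a real thesaurus would be far larger and loaded from a file.

```python
import re
from collections import Counter

# Hypothetical miniature thesaurus: preferred terms, plus "use-for" rules
# that map a non-preferred variant onto its preferred term.
PREFERRED_TERMS = {"maize", "soil erosion", "irrigation"}
USE_FOR = {"corn": "maize", "erosion of soil": "soil erosion"}


def suggest_subjects(text, top_n=5):
    """Return (preferred term, count) pairs, most frequent first."""
    # Lowercase and collapse whitespace so multi-word terms match reliably.
    normalized = re.sub(r"\s+", " ", text.lower())
    counts = Counter()
    for term in PREFERRED_TERMS | set(USE_FOR):
        # Whole-word match only, so "corn" does not match "cornfield".
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            # Apply the use-for rule: credit the preferred term.
            counts[USE_FOR.get(term, term)] += hits
    return counts.most_common(top_n)
```

For example, in a text that mentions "corn" twice and "maize" once, all three occurrences are credited to the preferred term "maize".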

5 Document Workflow Components
Projects Observers Authorities Automatic Suggestions Exporters Repositories Exhibits

6 IIIF Manifest Generator
[Figure: architecture diagram. Labels: Filesystem; DSpace; CSV Metadata Spreadsheet; Legacy Dissertations; Agricultural Bulletins; Primeros Libros (SAF); WW I Postcards (SAF); MAGPIE; Observer (x4); Authority (x2); Voyager (OPAC); Archivematica; Repository (DSpace REST); Repository (Archivematica REST); Fedora; Repository (Fedora REST w/ PCDM) (x2); Suggestor; Pelican (NLP); Exporter (x2); DSpace SAF Import Archive; Spotlight Import CSV Spreadsheet; IIIF Manifest Generator; Mirador; Spotlight]

7 Project Configuration
Defined by JSON Configurable Metadata Fields Ingest mode Authorities Suggestors Repositories Exporters
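As a rough illustration of what a JSON-defined project could look like, the sketch below loads and checks a small configuration covering the items listed on this slide. Every key name and value here (`name`, `ingestMode`, `metadataFields`, `"headless"`, `"voyager"`, and so on) is an assumption chosen for the example; the actual Magpie schema is not shown in the slides.

```python
import json

# Hypothetical schema: these key names are illustrative, not Magpie's.
REQUIRED_KEYS = {"name", "ingestMode", "metadataFields"}
OPTIONAL_LIST_KEYS = ("authorities", "suggestors", "repositories", "exporters")

EXAMPLE_PROJECT = """
{
  "name": "legacy-dissertations",
  "ingestMode": "headless",
  "metadataFields": [
    {"label": "dc.title", "repeatable": false},
    {"label": "dc.subject", "repeatable": true}
  ],
  "authorities": ["voyager"],
  "repositories": ["archivematica"]
}
"""


def load_project(raw):
    """Parse a project definition and fill in empty optional sections."""
    project = json.loads(raw)
    missing = REQUIRED_KEYS - set(project)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    for key in OPTIONAL_LIST_KEYS:
        # A project need not use every component type.
        project.setdefault(key, [])
    return project
```

Defaulting the optional sections to empty lists lets a headless project omit suggestors and exporters entirely.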

8 Data Modeling DSpace Fedora IIIF Flat metadata PCDM Collections
Presentations Images

9 Discovery and Exhibition
Solr Fuseki Exhibition IIIF Manifests Cantaloupe Image Server Spotlight Mirador
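To show how IIIF manifests tie an image server to a viewer such as Mirador, here is a minimal sketch that builds a manifest as a Python dictionary. It assumes the IIIF Presentation API 2.x layout and an Image API 2 endpoint; the slides do not say which versions are in use. The function name `build_manifest`, its parameters, and the URLs in the usage note are hypothetical, and this is not the code of the IIIF Manifest Generator named in the slides.

```python
from urllib.parse import quote


def build_manifest(manifest_id, label, pages, image_server):
    """Build a IIIF Presentation 2.x style manifest as a dict.

    `pages` is a list of (identifier, width, height) tuples and
    `image_server` is the base URL of an Image API 2 endpoint (assumed).
    """
    canvases = []
    for index, (identifier, width, height) in enumerate(pages, start=1):
        canvas_id = f"{manifest_id}/canvas/{index}"
        # Image API identifiers must be URL-encoded inside the path.
        service_id = f"{image_server}/{quote(identifier, safe='')}"
        canvases.append({
            "@id": canvas_id,
            "@type": "sc:Canvas",
            "label": f"Page {index}",
            "width": width,
            "height": height,
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": canvas_id,
                "resource": {
                    "@id": f"{service_id}/full/full/0/default.jpg",
                    "@type": "dctypes:Image",
                    "format": "image/jpeg",
                    "width": width,
                    "height": height,
                    "service": {
                        "@context": "http://iiif.io/api/image/2/context.json",
                        "@id": service_id,
                        "profile": "http://iiif.io/api/image/2/level2.json",
                    },
                },
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": f"{manifest_id}/manifest",
        "@type": "sc:Manifest",
        "label": label,
        "sequences": [{"@type": "sc:Sequence", "canvases": canvases}],
    }
```

Calling `build_manifest("https://example.org/iiif/book1", "Book 1", [("page 1.tif", 1000, 1500)], "https://example.org/iiif/2")` and serializing the result with `json.dumps` yields a document that references the image rather than copying it, which matches the goal of an exhibit layer that does not duplicate the digital asset.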

10 Legacy Scanned Dissertations
[Figure: the architecture diagram from slide 6, repeated with the same labels.]

11 Texas Agricultural Experiment Station Publications
[Figure: the architecture diagram from slide 6, repeated with the same labels.]

12 IIIF Manifest Generator
[Figure: the architecture diagram from slide 6, repeated with the same labels.]

13 Future Direction
Full Production Deployment
UI managed projects
UI annotation to support PCDM
A/V media support
Heuristics to guess metadata
Open source code of MAGPIE

14 Thank You. Any Questions?

