Published by Primrose Byrd. Modified over 7 years ago.
1
Metadata Encoding and Transmission Standard (METS): overview and case study
Markus Enders, SUB Göttingen
2
METS overview
- METS was derived from the "Making of America" format, generalized so that it can be used for other media types.
- Funded by the Digital Library Federation (DLF).
- An Editorial Board steers the development and holds "METS Opening Days".
3
METS overview: structMap
<mets:div TYPE="Monograph" LABEL="From Hamburg to San Francisco" ORDER="1" ID="DMD1">
- The structMap is the central object and is mandatory.
- Nested <div> elements store the structure.
- Multiple structures are possible; the TYPE attribute can be "logical", "physical", etc.
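The nested <div> structure described above can be walked with a few lines of Python. This is only a sketch using the standard library; the sample document, the chapter labels and the function name `outline` are made up for illustration and are not part of the presentation.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Hypothetical sample: one logical structMap with nested <div> elements.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="LOGICAL">
    <mets:div TYPE="Monograph" LABEL="From Hamburg to San Francisco" ORDER="1">
      <mets:div TYPE="Chapter" LABEL="Chapter 1" ORDER="1"/>
      <mets:div TYPE="Chapter" LABEL="Chapter 2" ORDER="2"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""

def outline(xml_text, map_type):
    """Return (depth, TYPE, LABEL) for every <div> in the structMap of the given TYPE."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(div, depth):
        rows.append((depth, div.get("TYPE"), div.get("LABEL")))
        for child in div.findall(f"{METS}div"):
            walk(child, depth + 1)

    for struct_map in root.findall(f"{METS}structMap"):
        if struct_map.get("TYPE") == map_type:
            for top in struct_map.findall(f"{METS}div"):
                walk(top, 0)
    return rows

if __name__ == "__main__":
    for depth, div_type, label in outline(SAMPLE, "LOGICAL"):
        print("  " * depth + f"{div_type}: {label}")
```

Selecting the structMap by its TYPE attribute is what allows several structures (logical, physical) to live side by side in one file.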
4
METS overview: structLink
<mets:structLink>
  <mets:smLink xlink:from="div1" xlink:to="div2"/>
</mets:structLink>
- structLink stores links between two <div> elements.
- The two <div> elements may belong to different <structMap> sections.
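To make the structLink idea concrete, the following sketch resolves each <smLink> to the two <div> elements it connects. It assumes the xlink:from and xlink:to attributes shown on the slide; the sample document and the function name `resolve_links` are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
XLINK = "{http://www.w3.org/1999/xlink}"

# Hypothetical sample: a logical and a physical structMap joined by one smLink.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:structMap TYPE="LOGICAL">
    <mets:div ID="div1" TYPE="Chapter"/>
  </mets:structMap>
  <mets:structMap TYPE="PHYSICAL">
    <mets:div ID="div2" TYPE="Page"/>
  </mets:structMap>
  <mets:structLink>
    <mets:smLink xlink:from="div1" xlink:to="div2"/>
  </mets:structLink>
</mets:mets>
"""

def resolve_links(xml_text):
    """Return (TYPE of source div, TYPE of target div) for every smLink."""
    root = ET.fromstring(xml_text)
    # Index all <div> elements by their XML ID, regardless of structMap.
    divs = {d.get("ID"): d for d in root.iter(f"{METS}div") if d.get("ID")}
    pairs = []
    for link in root.iter(f"{METS}smLink"):
        source = divs[link.get(f"{XLINK}from")]
        target = divs[link.get(f"{XLINK}to")]
        pairs.append((source.get("TYPE"), target.get("TYPE")))
    return pairs
```

Because the index covers every <div> in the file, the link resolves even though its two ends sit in different structMap sections.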
5
METS overview: fileSec
- The file section (<fileSec>) contains file groups (<fileGrp>). File groups can be nested; files are always contained in file groups.
- Files carry basic technical metadata as attributes (checksum, size, MIME type); further technical metadata can be stored in a metadata section.
- <fptr> (file pointer) links a <div> to one or more files.
- <par> or <seq>: several file pointers per <div> are possible; files can be parallel or sequential.
- A pointer can link into a file: images (HTML coordinates), byte offsets, XML IDs, time codes (streaming media).
- Links to streams are possible.
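The file pointer mechanism above can be sketched as a lookup from each <div> to the <file> entries its <fptr> children name. The IDs, the USE values on the file groups and the function name `files_per_div` are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Hypothetical sample: two file groups and one page that points at both files.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="archive">
      <mets:file ID="tif0001" MIMETYPE="image/tiff"/>
    </mets:fileGrp>
    <mets:fileGrp USE="presentation">
      <mets:file ID="jpg0001" MIMETYPE="image/jpeg"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="PHYSICAL">
    <mets:div ID="page1" TYPE="page">
      <mets:fptr FILEID="tif0001"/>
      <mets:fptr FILEID="jpg0001"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""

def files_per_div(xml_text):
    """Map each div ID to the (file ID, MIME type) pairs its fptr elements reference."""
    root = ET.fromstring(xml_text)
    files = {f.get("ID"): f for f in root.iter(f"{METS}file")}
    result = {}
    for div in root.iter(f"{METS}div"):
        pointers = div.findall(f"{METS}fptr")  # direct children only
        if pointers:
            result[div.get("ID")] = [
                (p.get("FILEID"), files[p.get("FILEID")].get("MIMETYPE"))
                for p in pointers
            ]
    return result
```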
6
METS overview: descriptive metadata vs. administrative metadata
- Metadata can be embedded or referenced, and can be XML or binary.
- METS does not come with its own metadata schema, but allows different extension schemas to be plugged in. Extension schemas in use: MODS, DC, PREMIS, etc.
- There is an m:n relationship between metadata sections and <div> or <file> elements.
- Administrative metadata has separate sections: technical metadata (techMD), digital provenance metadata (digiprovMD), rights metadata (rightsMD) and source metadata (sourceMD).
7
METS overview: metadata sections
[Diagram: structLink, structMap (<div>), fileSec (<fileGrp>, <file>), descriptive metadata (extension schema), administrative metadata (extension schema; techMD, digiprovMD, rightsMD, sourceMD)]
- Several metadata sections are possible for each <div> or <file>.
- A single metadata section can be used by several <div> or <file> objects.
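The m:n relationship can be illustrated by inverting the references: for each metadata section, list the elements that use it. This is a minimal sketch; the empty <dmdSec> elements (their content is omitted here), the IDs and the function name `users_of_metadata` are made up for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: DMD1 is shared by two divs, and log1 uses two sections.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1"/>
  <mets:dmdSec ID="DMD2"/>
  <mets:structMap TYPE="LOGICAL">
    <mets:div ID="log1" TYPE="Monograph" DMDID="DMD1 DMD2">
      <mets:div ID="log2" TYPE="Chapter" DMDID="DMD1"/>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""

def users_of_metadata(xml_text):
    """Map each referenced metadata section ID to the IDs of the elements using it."""
    root = ET.fromstring(xml_text)
    users = defaultdict(list)
    for element in root.iter():
        # DMDID and ADMID both hold space-separated ID references.
        refs = (element.get("DMDID") or "").split()
        refs += (element.get("ADMID") or "").split()
        for ref in refs:
            users[ref].append(element.get("ID"))
    return dict(users)
```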
8
METS overview: METS header
[Diagram: METS header, structLink, structMap (<div>), fileSec (<fileGrp>, <file>), descriptive metadata (extension schema), administrative metadata (extension schema; techMD, digiprovMD, rightsMD, sourceMD)]
- The METS header contains information about the METS object (the METS file itself), NOT about the content.
9
METS overview: how does the linking work (in XML)?
- XML IDs are used: each target must have a unique ID, e.g. <mets:dmdSec ID="DMD1">
- IDs need only be locally unique (within the same file).
- Metadata: DMDID and ADMID are of the type IDREFS (space-separated pointers), e.g. <mets:div DMDID="DMD1 DMD2">
- IDREFS may point anywhere in the file, even from a DMDID to a <file>, and the file will still validate. This is not a problem of the METS data model but of its XML representation.
- File pointer: <mets:fptr FILEID="FN10081"/>
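Since schema validation does not catch a DMDID that points at the wrong kind of element, an application can add its own check. The sketch below is one possible way to do that, not part of METS itself; the sample document and the function name `check_dmdid_targets` are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Hypothetical sample: the DMDID on log1 wrongly includes the ID of a <file>.
SAMPLE = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:dmdSec ID="DMD1"/>
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FILE1" MIMETYPE="image/tiff"/>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="LOGICAL">
    <mets:div ID="log1" TYPE="Monograph" DMDID="DMD1 FILE1"/>
  </mets:structMap>
</mets:mets>
"""

def check_dmdid_targets(xml_text):
    """Return (element ID, reference, problem) for every DMDID entry
    that does not resolve to a <dmdSec>."""
    root = ET.fromstring(xml_text)
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    problems = []
    for element in root.iter():
        for ref in (element.get("DMDID") or "").split():
            target = by_id.get(ref)
            if target is None:
                problems.append((element.get("ID"), ref, "missing"))
            elif target.tag != f"{METS}dmdSec":
                # Report the local name of the element that was hit instead.
                problems.append((element.get("ID"), ref, target.tag.split("}")[1]))
    return problems
```

The same pattern would apply to ADMID and FILEID with their own expected target elements.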
10
METS example (1): Digitization Centre
Simple document model (single structure):
- several content files per document (a single TIFF image per page)
- bibliographic metadata
- logical structure for the document (table of contents)
- direct relationships between logical structure entities and content files
This model was developed in the mid-1990s and stored in XML with a proprietary metadata set.
11
METS example (1): Digitization Centre, simple logical document model
[Diagram: logical structure (<structMap>: a Monograph containing Chapters) linked to content files (<fileSec>: TIFF files)]
At most one file per page; the naming convention determines the order.
12
METS example (1): Digitization Centre, simple logical document model
[Diagram: logical structure (<structMap>: a Monograph containing Chapters) with metadata attached, linked to content files (<fileSec>: TIFF files)]
A file can belong to several document structure entities.
13
METS example (1): Digitization Centre, simple logical document model
Logical structure:
<METS:structMap TYPE="LOGICAL">
  <METS:div TYPE="Monograph" DMDID="dmdlog0001">
    <METS:div TYPE="TitlePage" ID="log0002">
      <METS:fptr FILEID="bitonal0001"/>
    </METS:div>
    <METS:div TYPE="Dedication" ID="log0003">
      <METS:fptr FILEID="bitonal0002"/>
    </METS:div>
    ......
  </METS:div>
</METS:structMap>
A file can belong to several document structure entities.
14
METS example (1): Digitization Centre, simple logical document model
Metadata:
<METS:dmdSec ID="dmdlog0001">
  <METS:mdWrap MDTYPE="MODS">
    <METS:xmlData>
      <MODS:mods> ...... </MODS:mods>
    </METS:xmlData>
  </METS:mdWrap>
</METS:dmdSec>
The MODS metadata is embedded in METS.
15
METS example (1): Digitization Centre, simple logical document model
Content files:
<METS:fileSec>
  <METS:fileGrp>
    <METS:file ID="bitonal0001" MIMETYPE="image/tiff">
      <METS:FLocat LOCTYPE="URL" xlink:href="file://./ tif"/>
    </METS:file>
  </METS:fileGrp>
</METS:fileSec>
Files are only referenced. There is no metadata section for files; basic technical metadata is included as attributes: size, MIME type and checksum...
16
METS example (2): Digitization Centre
Document model with two structures:
- logical structure (TOC)
- physical structure (bound book, page)
- relationships between the structures
- every structure entity has its own metadata section
- content files are linked to physical structure entities
17
METS example (2): Digitization Centre, document model with two structures
[Diagram: logical structure (Monograph, Chapters) linked to physical structure (Bound Book, Pages, page area: column) linked to content files (TIFF files, HiRes01.jpg, Fulltext.xml)]
18
METS example (2): Digitization Centre, document model with two structures
Map two structures:
<METS:structMap TYPE="LOGICAL">
  <METS:div TYPE="Monograph" ID="log0001" DMDID="dmdlog0001"/>
</METS:structMap>
<METS:structMap TYPE="PHYSICAL">
  <METS:div TYPE="BoundBook" ID="phys0001" DMDID="dmdphys0001">
    <METS:div TYPE="page" ID="phys0002" DMDID="dmdphys0002"/>
  </METS:div>
</METS:structMap>
19
METS example (2): Digitization Centre, document model with two structures
Map two structures, with links from logical to physical entities (pages):
<METS:structLink TYPE="xxx">
  <!-- Monograph -->
  <METS:smLink xlink:from="log0001" xlink:to="phys0001"/>
  <!-- title page -->
  <METS:smLink xlink:from="log0002" xlink:to="phys0002"/>
</METS:structLink>
20
METS example (2): Digitization Centre, document model with two structures
Link to several files:
<METS:div TYPE="page" ID="phys0002" DMDID="dmdphys0002">
  <METS:fptr FILEID="bitonal0001"/>
  <METS:fptr FILEID="hires0001"/>
</METS:div>
The files are neither sequential nor parallel, but alternative versions.
Link to a page area:
<METS:div TYPE="column" ID="phys0003" DMDID="dmdphys0002">
  <METS:fptr>
    <METS:area FILEID="bitonal " COORDS="40x40x150x250"/>
  </METS:fptr>
</METS:div>
The COORDS attribute records where the column is located on the page.
21
METS example (2): Digitization Centre, document model with two structures
[Diagram: logical structure (Monograph, Chapters) linked to physical structure (Bound Book, Pages, page area) linked to content files (TIFF files, HiRes01.jpg, Fulltext.xml)]
Link to full text: a single fulltext file (TEI) for the whole monograph.
22
METS example (2): Digitization Centre, document model with two structures
Link to fulltext (TEI):
<METS:div TYPE="page">
  <METS:fptr>
    <METS:area FILEID="teixml01" BEGIN="xx02" END="xx02" BETYPE="IDREF"/>
  </METS:fptr>
</METS:div>
<METS:div TYPE="page">
  <METS:fptr>
    <METS:area FILEID="teixml01" BEGIN="xx02" END="xx02" BETYPE="IDREF"/>
  </METS:fptr>
</METS:div>
The referenced TEI file:
<TEI:p>
  <TEI:q id="xx01">....</TEI:q>
  <TEI:q id="xx02">....</TEI:q>
  <TEI:pb n="13"/>
  <TEI:q id="xx03">...</TEI:q>
</TEI:p>
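How an application might follow such an <area> pointer into the TEI file can be sketched as follows. This is an illustration under stated assumptions: the div ID `phys0012`, the BEGIN/END values, the quoted text and both function names are invented, and the range logic assumes flat (non-nested) elements carrying a plain `id` attribute as on the slide.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# Hypothetical METS fragment: one page pointing at a range of IDs in the TEI file.
METS_DOC = """
<METS:mets xmlns:METS="http://www.loc.gov/METS/">
  <METS:structMap TYPE="PHYSICAL">
    <METS:div TYPE="page" ID="phys0012">
      <METS:fptr>
        <METS:area FILEID="teixml01" BEGIN="xx01" END="xx02" BETYPE="IDREF"/>
      </METS:fptr>
    </METS:div>
  </METS:structMap>
</METS:mets>
"""

# Hypothetical TEI fragment, shaped like the one on the slide.
TEI_DOC = """
<TEI:p xmlns:TEI="http://www.tei-c.org/ns/1.0">
  <TEI:q id="xx01">first</TEI:q>
  <TEI:q id="xx02">second</TEI:q>
  <TEI:pb n="13"/>
  <TEI:q id="xx03">third</TEI:q>
</TEI:p>
"""

def id_range(root, begin, end):
    """Collect the text of id-bearing elements from BEGIN to END, inclusive,
    in document order."""
    texts, inside = [], False
    for element in root.iter():
        element_id = element.get("id")
        if element_id == begin:
            inside = True
        if inside and element_id is not None:
            texts.append("".join(element.itertext()))
        if inside and element_id == end:
            break
    return texts

def page_text(mets_text, tei_text):
    """Map each div ID to the TEI text its IDREF-type <area> pointers select."""
    tei_root = ET.fromstring(tei_text)
    result = {}
    for div in ET.fromstring(mets_text).iter(f"{METS}div"):
        for area in div.findall(f"{METS}fptr/{METS}area"):
            if area.get("BETYPE") == "IDREF":
                result[div.get("ID")] = id_range(
                    tei_root, area.get("BEGIN"), area.get("END")
                )
    return result
```

A real implementation would also use FILEID to locate the TEI file through the fileSec; here the file is passed in directly to keep the sketch short.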
23
METS example (2): Digitization Centre, document model with two structures
- Fulltext is referenced, not embedded in the METS file, because of the file sizes: the METS file is about 2 to 3 MB, the fulltext about 20 MB.
- MODS is used for the descriptive metadata of logical structure entities.
- A separate, in-house descriptive metadata schema is used for physical structure entities; it stores page numbers.
24
METS example (2): Digitization Centre
Why did the GDZ choose METS?
- easily extendable: one may start with image digitization and add fulltext later
- a complex structure needs to be stored
- the fulltext format is not flexible enough: (1) TEI knows only one kind of structure (logical) and does not know pages (just page breaks); (2) it has no extensive metadata model --> fulltext files need to be linked to a METS file
25
METS creation:
- by hand in an XML editor (structMap is the only required object)
- special tools for certain purposes, e.g. conversion tools for web archiving, ...
- at GDZ: the GOOBI workflow management tool
- to do: a general METS API that implements the data model
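Because structMap is the only required object, a minimal METS document can also be generated programmatically. The sketch below uses Python's standard library and is not the GOOBI tool or the METS API mentioned above; the function name `minimal_mets` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Ask ElementTree to serialize the METS namespace with the "mets" prefix.
ET.register_namespace("mets", METS_NS)

def minimal_mets(div_type, label):
    """Build a METS document that contains only a structMap with one <div>."""
    root = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap", {"TYPE": "LOGICAL"})
    ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"TYPE": div_type, "LABEL": label})
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(minimal_mets("Monograph", "Example title"))
```

Metadata sections, a fileSec and a structLink can be added to the same tree later, which matches the idea of starting simple and enriching the document over time.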
26
METS presentation: depends on your METS file
- simple XSLT transformations
- repository systems (ContentDM, Fedora etc.)
- some page turners are available (for digitized content)
27
METS-Profile Documentation: documentation is necessary
Describe the objects and relationships in your document model:
- What objects are available?
- What metadata are attached to those objects?
- How are objects related to each other (trees)?
- How is an unambiguous order stored?
- Are there non-hierarchical relationships between objects?
- Which content files are available?
- What is the access granularity?
28
METS-Profile Documentation
Documentation should not describe a format in general, but the precise usage of a packaging format.
Example: how are relationships inherited between two structure trees?
[Diagram: a Chapter linked to Pages; one Page contains a Column]
Does the column need to be linked to the chapter directly, or is an indirect link sufficient?
29
METS-Profile Documentation
Documentation should not describe a format in general, but the precise usage of a packaging format. Examples:
- How to link into fulltext files? Usage of the BEGIN and END attributes.
- How to store the order of <div> elements?
- What kinds of <div> elements are available?
Developing and sharing documentation encourages the usage of "complex document formats" even for simple documents; documents can then be enriched with additional information later on.
30
METS-Profile Documentation
A METS profile describes the usage of METS for a special scenario:
- what extension schemas are used?
- what authority files?
- usage of attributes and elements
A METS-profile schema is available; a profile is an XML file, which is not machine readable. A "registry" is available on the METS website.