How to express MARC in XML
ELAG Workshop 10 Report
Participants: Liv Aasa Holm, JBI-HIO, Norway; Christer Larsson, The Royal Library, LIBRIS Department, Sweden; Dan Matei, CIMEC - Institute for Cultural Memory, Romania; Anne Munkebyaune, BIBSYS, Norway; Mona-Lise Pedersen, BIBSYS, Norway; Nils Pharo, Oslo University College, Norway.
Why XML?
Is XML (really) useful, or is it (just) fashionable? Useful!
– more flexible syntax, i.e. it has more “expressive power”;
– it allows more (and finer) syntactic constraints;
– it allows (unrestricted) hierarchies in a record;
– it is here to stay (?);
– a lot of tools are available.
Problems with MARCs
– not too flexible (too flexible! [Ole]);
– only 2 (or 3?) hierarchical levels;
– some tags express two things: the nature of the related entity and the kind of relationship with the record;
– the 1:1 principle is not observed (i.e. the records are not “normalized”), but they are “self-contained”!
Aim: to devise an XML-based bibliographic format
Approach A: “mechanically” express MARC in XML. Already done (by LC): MARCXML.
Approach B: consider the synthesis of authority, bibliographic and holdings MARCs in a new format (i.e. a “bibliographic MARC-up language”?); let’s code-name it MARCX (not Karl..., not... Brothers!).
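As an illustration of Approach A, here is a minimal Python sketch of a “mechanical” mapping from a MARC-like record to MARCXML. The element and attribute names (record, leader, datafield, subfield, tag, ind1, ind2, code) and the namespace are those of LC's MARCXML schema; the function name, the input layout, the leader value and the author heading are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # namespace of LC's MARCXML schema


def marc_to_marcxml(leader, fields):
    """Map a MARC-like record to a MARCXML <record> element, field by field.

    `fields` is a list of (tag, ind1, ind2, [(subfield_code, value), ...]).
    This input layout is hypothetical; it is not a parser for ISO 2709 files.
    """
    def q(name):
        # Qualify a local element name with the MARCXML namespace.
        return "{%s}%s" % (MARC_NS, name)

    record = ET.Element(q("record"))
    ET.SubElement(record, q("leader")).text = leader
    for tag, ind1, ind2, subfields in fields:
        datafield = ET.SubElement(
            record, q("datafield"), {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, value in subfields:
            ET.SubElement(datafield, q("subfield"), {"code": code}).text = value
    return record


# Serialize with MARCXML as the default namespace (no "ns0:" prefixes).
ET.register_namespace("", MARC_NS)

# Placeholder 24-character leader and two illustrative fields.
example = marc_to_marcxml(
    "00000nam a2200000 a 4500",
    [
        ("100", "1", " ", [("a", "Shakespeare, William")]),
        ("245", "1", "0", [("a", "Romeo and Juliet")]),
    ],
)
print(ET.tostring(example, encoding="unicode"))
```

The mapping keeps MARC's flat tag/indicator/subfield structure unchanged, which is why it does not address the hierarchy and normalization problems that motivate Approach B.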
Use cases
A. internal (database) format;
B. transportation (serialization) format, with these “transport scenarios”:
– to/from union catalogues: “normalized” files, i.e. records of instances of “base” entities + records for their relationships;
– for presentation (i.e. display): un-normalized, self-contained (MARC-like) records;
– for... something else (?): records with “FRBR families” of bibliographic objects, e.g. works with their expressions.
“Integrating” framework: FRBR
Schema (and/or DTD) including types for:
– works;
– expressions;
– manifestations;
– items;
– persons;
– ...
– concepts;
– subject headings;
– relationships.
Relationships: identifiers
– need for unique identifiers within a file;
– need for globally unique identifiers: large amounts of them are needed, i.e. automatic generation;
– options: URIs, GUIDs [Globally Unique Identifiers].
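A minimal Python sketch of the two kinds of identifier, assuming UUIDs are used as the GUIDs and written as URIs in the `urn:uuid:` form. The function names and the `r` prefix for file-local identifiers are hypothetical.

```python
import itertools
import uuid


def new_global_id():
    """Return a random (version 4) UUID written as a URI, e.g. 'urn:uuid:...'."""
    return uuid.uuid4().urn


def local_id_generator(prefix="r"):
    """Yield identifiers that are unique only within one file: r1, r2, r3, ..."""
    for n in itertools.count(1):
        yield "%s%d" % (prefix, n)


local_ids = local_id_generator()
first_local = next(local_ids)
second_local = next(local_ids)
global_ids = [new_global_id() for _ in range(1000)]
print(first_local, second_local, global_ids[0])
```

Random UUIDs need no central registry, which suits automatic generation in large amounts; a URI scheme under a domain controlled by the issuing library would be an alternative with the same role.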
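Relationships: options
– reified (Topic Maps like): a separate relationship record naming the source identifier (id-s) and the target identifier (id-t);
– within the source: ... id-t ..., i.e. the target identifier appears inside the source record.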
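The two options can be sketched in Python as follows. Every element and attribute name here (relationship, source, target, record, id) is a hypothetical placeholder chosen for the example; none of them is a MARCX definition.

```python
import xml.etree.ElementTree as ET


def reified_relationship(rel_id, source_id, target_id):
    """Option 1: the relationship is a record of its own, pointing at both ends."""
    rel = ET.Element("relationship", {"id": rel_id})
    ET.SubElement(rel, "source").text = source_id
    ET.SubElement(rel, "target").text = target_id
    return rel


def embedded_relationship(source_id, target_id):
    """Option 2: the source record itself carries the target identifier."""
    record = ET.Element("record", {"id": source_id})
    ET.SubElement(record, "relationship").text = target_id
    return record


reified_xml = ET.tostring(reified_relationship("rel-1", "id-s", "id-t"), encoding="unicode")
embedded_xml = ET.tostring(embedded_relationship("id-s", "id-t"), encoding="unicode")
print(reified_xml)
print(embedded_xml)
```

The reified form matches the “normalized” transport scenario (base entities plus separate relationship records); the embedded form is closer to a self-contained, MARC-like record.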
Relationships: the “type problem”
– the type as attribute: id
– vs. the type as element: author id
Which is more convenient for “ontology controlled” types?
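A small Python sketch of the two encodings, with one possible way of checking an attribute-valued type against a controlled list. The element name `relationship`, the attribute name `type`, and the contents of `ALLOWED_TYPES` are assumptions for illustration, and the sketch does not settle which encoding is preferable.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary of relationship types (e.g. from an ontology).
ALLOWED_TYPES = {"author", "translator", "subject"}


def typed_by_attribute(rel_type, target_id):
    """One generic element; the relationship type is an attribute value."""
    rel = ET.Element("relationship", {"type": rel_type})
    rel.text = target_id
    return rel


def typed_by_element(rel_type, target_id):
    """The relationship type is the element name itself."""
    rel = ET.Element(rel_type)
    rel.text = target_id
    return rel


def type_is_allowed(rel):
    """Check an attribute-typed relationship against the controlled list."""
    return rel.get("type") in ALLOWED_TYPES


by_attribute = typed_by_attribute("author", "id-t")
by_element = typed_by_element("author", "id-t")
print(ET.tostring(by_attribute, encoding="unicode"))
print(ET.tostring(by_element, encoding="unicode"))
```

With the attribute form, new types can be added to the external list without touching the schema; with the element form, each type would need its own declaration in the schema or DTD.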
Types/elements: inner structure
– to conserve the MARC blocks? No!
– to re-group data elements by their nature, e.g. ‘title’ and ‘notes on title’;
– to use as many hierarchical levels as necessary (but not more).
Types/elements: general pattern [... ]
“Language independence” (1)
For multilingual records, an element: “localized text” (ltext), with attributes:
– language;
– script;
– transliteration standard.
E.g. What the hell is going on?
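A minimal Python sketch of such an element. Only the element name `ltext` and its three kinds of attribute come from the slide; the attribute spellings (`language`, `script`, `transliteration`) and the code values (`en`, `Latn`) are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def ltext(text, language, script=None, transliteration=None):
    """Build a “localized text” element; script and transliteration are optional."""
    attrs = {"language": language}
    if script is not None:
        attrs["script"] = script
    if transliteration is not None:
        attrs["transliteration"] = transliteration
    element = ET.Element("ltext", attrs)
    element.text = text
    return element


note = ltext("What the hell is going on?", language="en", script="Latn")
print(ET.tostring(note, encoding="unicode"))
```

A multilingual record could then repeat `ltext` inside one parent element, once per language, instead of repeating the whole field.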
“Language independence” (2)
Cataloguing rule: areas in the language of the material:
– Romeo and Juliet
– Romeo et Juliette
– Romeo und Julietta
Conclusions?
To tag or not to tag? To tag!
In MARCX:
– finer (and more controllable) granularity;
– less redundancy;
– more compact records;
– more human-readable records;
– lots of ready-made tools.
Another lingua franca? “INTERMARC” redivivus!