Published by Innozenz Schmitt. Modified over 6 years ago.
1
MARCXTM: Topic Maps Modeling of MARC Bibliographic Information
Hyun-Sil Lee, Yang-Seung Jeon, Sung-Kook Han Semantic Web Services Research Group Won Kwang University, Korea
2
Agenda
Description of Bibliographic Information
  MARC21
  MARCXML
  MODS
Topic Maps Modeling of MARC21
  Requirements of MARC Modeling
  UML Model
  MARCXTM Implementations
Conclusions
3
Overview: MARC
MARC (Machine-Readable Cataloging): standards for representing bibliographic and related information about books and other library materials in machine-readable form, and for communicating that information to and from other computers.
All MARC standards conform to ISO 2709:1996, Information and documentation - Format for Information Exchange.
MARC was originally designed in the late 1960s to aid the transfer of bibliographic data onto magnetic tape and to replace printed catalog cards with electronic forms.
There are a number of implementations of MARC, including USMARC in the US, CAN/MARC in Canada, and UKMARC in Britain.
After discussions and minor changes to USMARC and CAN/MARC, MARC21 was developed to harmonize the two formats and to cover diverse types of resources, including digital materials and Internet resources.
MARC accommodates extensive data elements describing all forms of material susceptible to bibliographic description, as well as related information.
4
Family of MARC Formats
Bibliographic: a carrier for bibliographic information about printed and manuscript textual materials, computer files, maps, music, serials, visual materials, and mixed materials.
Authorities: a carrier for information concerning the authorized forms of names, [titles,] subjects, and subject subdivisions to be used in constructing access points in MARC records; the forms of these names, subjects, and subdivisions that should be used as references to the authorized form; and the relationships among these forms.
Holdings: a carrier for holdings information for three types of bibliographic items (single-part, multipart, serial). It may include copy-specific information; information peculiar to the holding institution; information needed for local processing, maintenance, or preservation; and version information.
Classification: a carrier for information about classification numbers and the captions associated with them, formulated according to a specified authoritative classification scheme.
Community Information: a carrier for descriptions of non-bibliographic resources that fulfil the information needs of a community.
5
Supporting Documentation of MARC
MARC 21 Specification for Record Structure, Character Sets, and Exchange Media
Character sets
  MARC-8 (8-bit encoding)
  UCS/Unicode UTF-8 (variable-length encoding, one to four bytes per character)
  Repertoire of 15,000+ characters: Latin; Cyrillic; Hebrew; Arabic; CJK
Code lists
  Countries; Geographical areas; Languages; Sources; Relators
6
MARC Record Format
Leader
  The first 24 characters of the record, defining parameters for processing the record.
  Its data elements contain coded values and are identified by relative character position.
Directory
  A series of entries, each containing the tag, the starting location, and the length of one variable field within the record.
  It is constructed by computer from the bibliographic record and can be reconstructed in the same way if any of the cataloging information is altered.
Variable Fields
  Control fields: the 00X fields in the MARC 21 formats are variable control fields. Each contains either a single data element or a series of fixed-length data elements identified by relative character position.
  Data fields:
    Indicators: the first two characters of the field, which interpret or supplement the data found in the field.
    Subfield codes: two characters that precede each data element within a field that requires separate manipulation.
7
MARC Record Format: Example
[Figure: an example MARC record with its leader, directory, variable control fields, and variable data fields labeled]
8
MARC Record Format: Example
Sign Post
9
Formalization of MARC
<MARC21Record> ::= <Leader> <Directory> <VariableField>
<Directory> ::= <DirectoryElement>*
<DirectoryElement> ::= <Tag> <Length> <Position>
<VariableField> ::= <ControlField> <DataField>*
<ControlField> ::= <ControlNumber> <ControlFieldElement>
<DataField> ::= <Tag> <Indicator> <SubField>*
<Indicator> ::= <FirstIndicator> <SecondIndicator>
<SubField> ::= <SubFieldCode> <SubFieldValue>
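The leader, directory, and variable fields can be walked with a few lines of code. The following is a minimal sketch, not a full ISO 2709 implementation: it assumes ASCII-only data (so character counts equal byte counts), and the helper names `build_record` and `parse_record` as well as the sample field values are hypothetical and chosen only for illustration.

```python
# Control characters defined by ISO 2709 / MARC 21.
FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"
SUBFIELD_DELIMITER = "\x1f"


def build_record(fields):
    """Assemble a minimal record from (tag, body) pairs (ASCII only, illustrative)."""
    directory, data, offset = "", "", 0
    for tag, body in fields:
        body += FIELD_TERMINATOR
        # Directory entry: 3-character tag, 4-digit length, 5-digit start position.
        directory += f"{tag}{len(body):04d}{offset:05d}"
        data += body
        offset += len(body)
    base = 24 + len(directory) + 1      # leader + directory + directory terminator
    total = base + len(data) + 1        # plus the record terminator
    # 24-character leader: record length (0-4) and base address of data (12-16).
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR


def parse_record(raw):
    """Split a record into its leader and a list of control and data fields."""
    leader = raw[:24]
    base = int(leader[12:17])
    directory = raw[24:base - 1]        # exclude the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        body = raw[base + start:base + start + length - 1]  # drop field terminator
        if tag.startswith("00"):
            # Control field: no indicators, no subfields.
            fields.append((tag, body))
        else:
            # Data field: two indicators, then delimiter + code + value per subfield.
            chunks = [c for c in body[2:].split(SUBFIELD_DELIMITER) if c]
            fields.append((tag, body[:2], [(c[0], c[1:]) for c in chunks]))
    return leader, fields


sample = build_record([
    ("001", "12345"),
    ("100", "1 " + SUBFIELD_DELIMITER + "aLee, Hyun-Sil"),
])
leader, fields = parse_record(sample)
```

Real records count lengths and positions in bytes, so a production parser would operate on `bytes` and decode MARC-8 or UTF-8 field by field.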
10
Problems with MARC
Lack of expandability due to rigid record formats, since MARC was originally intended for the production of printed catalogue cards in the 1960s
Difficulties in representing bibliographic relationships
Ambiguities in describing MARC records
Incompatibilities among MARC formats, since various library systems have invented their own non-standard peculiarities to handle local bibliographic materials
Weaknesses in describing the bibliographic attributes of digitized resources
11
MARCXML
[Diagram: MARC21 (2709) records are converted to MARC21 (XML) records by tagging transformations and character set conversion. The MARC21 (XML) records in turn feed Dublin Core records, MODS records, other XML formats, HTML output, and MARC validation.]
12
MARCXML
MARCXML: a framework for working with MARC data in an XML environment
Design Considerations and Features
  Simple and flexible MARC XML Schema
    Represents a complete MARC record in XML
    Supports all MARC-encoded data regardless of format
  Lossless conversion of MARC to XML
  Round-trip ability from XML back to MARC
  Data presentation and data conversion
  Extensibility
    A component-oriented, extensible architecture that allows users to plug and play different software pieces to build custom solutions
13
MARCXML: Example
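A small MARCXML record can be produced with the Python standard library. This is a sketch under assumptions: the element and attribute names (`record`, `leader`, `controlfield`, `datafield`, `subfield`, `tag`, `ind1`, `ind2`, `code`) and the namespace follow the MARCXML schema, while the function name `to_marcxml` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARCXML_NS)


def q(name):
    """Qualify an element name with the MARCXML namespace."""
    return f"{{{MARCXML_NS}}}{name}"


def to_marcxml(leader, control_fields, data_fields):
    """Build a MARCXML <record> element from already-parsed MARC parts."""
    record = ET.Element(q("record"))
    ET.SubElement(record, q("leader")).text = leader
    for tag, value in control_fields:
        ET.SubElement(record, q("controlfield"), {"tag": tag}).text = value
    for tag, ind1, ind2, subfields in data_fields:
        datafield = ET.SubElement(
            record, q("datafield"), {"tag": tag, "ind1": ind1, "ind2": ind2}
        )
        for code, value in subfields:
            ET.SubElement(datafield, q("subfield"), {"code": code}).text = value
    return record


# Illustrative values only; not taken from a real catalog record.
record = to_marcxml(
    "00000nam a2200000 a 4500",
    [("001", "12345")],
    [("100", "1", " ", [("a", "Lee, Hyun-Sil")])],
)
xml_text = ET.tostring(record, encoding="unicode")
```

Because the numeric tags, indicators, and subfield codes are carried over as attribute values, the conversion keeps everything needed to rebuild the original record.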
14
MODS
MODS: Metadata Object Description Schema
An XML-based descriptive metadata standard that includes a subset of data elements derived from MARC21
Features
  MODS is intended to complement other metadata formats.
  MODS provides a richer bibliographic element set than Dublin Core.
  MODS has a high level of compatibility with MARC records because it inherits the semantics of the equivalent data elements in the MARC21 bibliographic format.
  In MODS, some elements that appear in various fields in MARC have been repackaged into one. As a result, MODS defines 19 top-level metadata elements.
  MODS takes advantage of the XML environment. It uses language-based tags rather than the numeric tags traditional to MARC.
  MODS also has flexible linking mechanisms: all top-level elements carry attributes such as xlink and ID.
  MODS accommodates special requirements for digital resources.
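The repackaging of several MARC fields into one MODS element can be illustrated with a lookup table. The sketch below covers only a handful of tags and is not the complete MARC-to-MODS mapping; the names `MARC_TO_MODS` and `group_by_mods_element` are hypothetical.

```python
# Partial, illustrative mapping from MARC 21 tags to MODS top-level elements.
MARC_TO_MODS = {
    "100": "name",        # main entry, personal name
    "700": "name",        # added entry, personal name
    "245": "titleInfo",   # title statement
    "246": "titleInfo",   # varying form of title
    "250": "originInfo",  # edition statement
    "260": "originInfo",  # publication, distribution
}


def group_by_mods_element(tags):
    """Group MARC tags under the MODS element they map to; skip unmapped tags."""
    grouped = {}
    for tag in tags:
        element = MARC_TO_MODS.get(tag)
        if element is not None:
            grouped.setdefault(element, []).append(tag)
    return grouped


grouped = group_by_mods_element(["100", "245", "260", "700", "999"])
```

Here the two name fields end up under a single `name` element, which is the kind of consolidation the slide describes.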
15
MODS: Example
16
Topic Maps Modeling of MARC 21
Requirements for MARC Modeling
The model should support the full set of data elements in MARC21 to achieve seamless compatibility with MARC formats. This is a practical requirement for accommodating current practice, even though it is awkward.
It should have the same expressive power as metadata. This implies that the model should be realized with semantic descriptors for use in an XML environment instead of obsolete alphanumeric codes. The use of attributes should be minimized to maintain consistency and increase readability.
It should be able to maintain the structure of the MARC record format.
The model is not intended to be a new bibliographic metadata system based on MARC.
The model should be usable without expertise in MARC.
The model should be simple and lightweight, for ease of system implementation and harmonization with other models.
17
UML diagram of MARC Modeling
DataField
  TagCode: String
  DataFieldName: String
  Repeatability: {NR, R}
  Description: String
  Parts: FirstIndicator, SecondIndicator, SubField
IndicatorItem
  IndicatorCode: {Integer, '#'}
  IndicatorName: String
SubFieldItem
  SubFieldCode: String
  SubFieldValue: String
  SubFieldName: String
Multiplicities shown in the diagram: 1, 1..*, 0..*
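The classes in the diagram translate naturally into data classes. The following sketch is one possible reading of the model: the attribute names are taken from the diagram, but the assignment of multiplicities (each indicator and the subfield list held as Python lists) is an assumption, since the diagram's associations are not recoverable from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndicatorItem:
    code: str   # a digit or '#' (undefined / blank)
    name: str


@dataclass
class SubFieldItem:
    code: str
    name: str
    value: str = ""


@dataclass
class DataField:
    tag_code: str
    name: str
    repeatability: str                      # "NR" (not repeatable) or "R" (repeatable)
    description: str = ""
    first_indicator: List[IndicatorItem] = field(default_factory=list)
    second_indicator: List[IndicatorItem] = field(default_factory=list)
    subfields: List[SubFieldItem] = field(default_factory=list)

    def __post_init__(self):
        # Enforce the {NR, R} enumeration from the diagram.
        if self.repeatability not in ("NR", "R"):
            raise ValueError(f"repeatability must be NR or R, got {self.repeatability!r}")


# Field 100 (Main Entry - Personal Name) from the MARC 21 bibliographic format.
field_100 = DataField(
    tag_code="100",
    name="Main Entry - Personal Name",
    repeatability="NR",
    first_indicator=[
        IndicatorItem("0", "Forename"),
        IndicatorItem("1", "Surname"),
        IndicatorItem("3", "Family name"),
    ],
    second_indicator=[IndicatorItem("#", "Undefined")],
    subfields=[SubFieldItem("a", "Personal name")],
)
```

Keeping the descriptive names next to the codes is what lets the model be read without MARC expertise, as the requirements ask.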
18
MARCXTM Implementation
Librarians/Users MARCXTM for MARC Specification MARC Records XTM Representation of MARC Records
19
XTM Realization of MARC Specification
DataField: an <association> of the data item, its indicators, and its subfield codes
<association id="data100">
  <instanceOf> <topicRef xlink:href="#DataField"/> </instanceOf>
  <member>
    <roleSpec> <topicRef xlink:href="#Field"/> </roleSpec>
    <topicRef xlink:href="#Field100"/>
  </member>
  <member>
    <roleSpec> <topicRef xlink:href="#FirstIndicator"/> </roleSpec>
    <topicRef xlink:href="#TypeOfPersonalNameEntryElement"/>
  </member>
  <member>
    <roleSpec> <topicRef xlink:href="#SecondIndicator"/> </roleSpec>
    <topicRef xlink:href="#Undefined"/>
  </member>
  <member>
    <roleSpec> <topicRef xlink:href="#SubField"/> </roleSpec>
    <topicRef xlink:href="#a100"/>
    <topicRef xlink:href="#b100"/>
    <topicRef xlink:href="#c100"/>
    <topicRef xlink:href="#d100"/>
    …………………………………….
    <topicRef xlink:href="#q100"/>
    <topicRef xlink:href="#t100"/>
    <topicRef xlink:href="#u100"/>
    <topicRef xlink:href="#four100"/>
    <topicRef xlink:href="#six100"/>
    <topicRef xlink:href="#eight100"/>
  </member>
</association>
20
XTM Realization of MARC Specification
Hiding the real data value by topic abstraction
<topic id="TypeOfPersonalNameEntryElement">
  <baseName>
    <baseNameString> Type of personal name entry element </baseNameString>
  </baseName>
  <occurrence>
    <instanceOf> <topicRef xlink:href="#Forename"/> </instanceOf>
    <resourceData> 0 </resourceData>
  </occurrence>
  <occurrence>
    <instanceOf> <topicRef xlink:href="#Surname"/> </instanceOf>
    <resourceData> 1 </resourceData>
  </occurrence>
  <occurrence>
    <instanceOf> <topicRef xlink:href="#FamilyName"/> </instanceOf>
    <resourceData> 3 </resourceData>
  </occurrence>
</topic>
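The topic abstraction can be sketched in code: the coded value stays inside an occurrence, and users see only the named topic. This is an illustration rather than the MARCXTM implementation itself. The element names follow the XTM fragment on this slide, the XTM namespace declaration is omitted for brevity, and the functions `indicator_topic` and `decode` are hypothetical.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)
HREF = f"{{{XLINK_NS}}}href"


def indicator_topic(topic_id, name, values):
    """Build a <topic> whose occurrences map named topics to MARC indicator codes."""
    topic = ET.Element("topic", {"id": topic_id})
    base_name = ET.SubElement(topic, "baseName")
    ET.SubElement(base_name, "baseNameString").text = name
    for value_topic, code in values.items():
        occurrence = ET.SubElement(topic, "occurrence")
        instance_of = ET.SubElement(occurrence, "instanceOf")
        ET.SubElement(instance_of, "topicRef", {HREF: f"#{value_topic}"})
        ET.SubElement(occurrence, "resourceData").text = code
    return topic


def decode(topic, code):
    """Return the topic id that stands for a raw indicator code, or None."""
    for occurrence in topic.findall("occurrence"):
        if occurrence.findtext("resourceData").strip() == code:
            href = occurrence.find("instanceOf/topicRef").get(HREF)
            return href.lstrip("#")
    return None


topic = indicator_topic(
    "TypeOfPersonalNameEntryElement",
    "Type of personal name entry element",
    {"Forename": "0", "Surname": "1", "FamilyName": "3"},
)
```

With this lookup, a record carrying first indicator `1` in field 100 can be presented as "Surname" without the reader knowing the code.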
21
MARCXTM for MARC Specification
22
XTM Realization of MARC Records
Complex to maintain the MARC structure because of the idiosyncratic dependencies between indicators and subfield codes
  Seamless compatibility with MARC records is difficult to realize.
  The repeatability of each subfield element is defined individually in the MARC specification.
XTM support for MARC modeling
  XTM does not provide multiple instances for <occurrence>.
  It is difficult to define a record schema with <association>.
23
XTM Realization of MARC Records
24
MARCXTM for MARC Records
25
Conclusions
MARCXTM: Topic Maps-based implementation of MARC 21
  MARCXTM for MARC Specification
  MARCXTM for MARC Records
Application of the Topic Maps paradigm to bibliographic information systems
  Seamless compatibility with MARC 21
  Expressive power as metadata
XTM is inappropriate for representing the MARC format because of MARC's idiosyncratic structure and the dependencies between its data elements.
Metadata models similar to Dublin Core or MODS are necessary for XTM modeling of MARC.
The FRBR (Functional Requirements for Bibliographic Records) framework is an attractive model for XTM modeling of bibliographic information systems.
26
Thank you!!!