Study report on SMM process
Tae-Hoon Lim and Tae-Sul Seo
ISO/IEC JTC1/SC32 WG2 Interim Meeting, Seoul, Korea

Background
Following the resolution of the SC32 New York meeting (SC32N1604a), a study on Semantic Harmonization of Metadata was performed.
Reference: SC32N1658

Summary
- The title was changed.
- The procedure was modified:
  - The name of each step was changed.
  - The 2nd and 3rd steps can be performed in either order.
- A description system for mapping was established.

Title change
From: "semantic harmonization of metadata"
To: "semantic metadata mapping (SMM) process"
The latter is a more specific expression than the former.

Procedure modification
Previous procedure:
  1st: Surveying metadata sets
  2nd: Constructing common DECs based on ...
  3rd: Grouping data elements by the DECs
  4th: Completing crosswalks
Modified procedure:
  1st: Collecting metadata schema
  2nd: Grouping attributes
  3rd: Finding common DECs
  4th: Mapping into a table

Semantic Metadata Mapping Process: Overall process
  1st: Collecting metadata schema
  2nd: Grouping attributes
  3rd: Finding common DECs
  4th: Mapping into a table
The 2nd and 3rd steps can be performed in either order.

Semantic Metadata Mapping Process: 1st. Collecting metadata schema
Survey and identify candidate metadata schemas in a domain.
The survey form includes:
- Domain name, service DB name, or another equivalent name
- Number of fields
- Sample data
- Value domains
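
The survey form could be kept as one record per candidate schema. The following is a minimal sketch, not part of the study itself: the class name SchemaSurvey and its field names are assumptions made for illustration, and the values are taken from the e-Book survey table later in this report.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaSurvey:
    """One survey-form entry for a candidate metadata schema (hypothetical structure)."""
    schema_name: str
    domain_name: str          # domain name, service DB name, or an equivalent name
    number_of_fields: str     # kept as text because counts may be approximate
    has_sample_data: bool
    value_domains: List[str] = field(default_factory=list)  # left empty here; not given in the report


# Values copied from the e-Book survey table in this report.
surveys = [
    SchemaSurvey("OpenEBPS", "Description of Electronic Book", "15", True),
    SchemaSurvey("MODS", "Description of Library resources", "About 60 (top level: 20)", False),
    SchemaSurvey("TEI header", "Encoding methods for machine-readable texts", "Over 20", True),
]

# Example query: which candidate schemas come with sample data?
with_samples = [s.schema_name for s in surveys if s.has_sample_data]
print(with_samples)
```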

Semantic Metadata Mapping Process: 2nd. Grouping attributes
- Select one metadata set as the primary metadata set. The simplest or the highest-level metadata set is preferable as the primary one.
- For all available metadata schemas, group the attributes under the attributes of the primary metadata set.
- Some attributes may not fit any attribute of the primary set. Those that are not important may be removed; the remaining ones are grouped separately.
- Metadata experts should perform this work together with domain experts.
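
The grouping step can be sketched in code as follows. This is only an illustration under stated assumptions: the expert judgement is represented as a hand-written dictionary, the function name group_attributes is hypothetical, and the entry "exampleUnmatched" is an invented placeholder for an attribute that fits no primary attribute. The other attribute names come from the e-Book example in this report.

```python
from collections import defaultdict

# Expert decisions: (schema, attribute) -> attribute of the primary set (OpenEBPS),
# or None when the attribute fits none of them.
assignments = {
    ("MODS", "titleInfo:title"): "title",
    ("MODS", "titleInfo:subTitle"): "title",
    ("TEI", "fileDesc:titleStmt:title"): "title",
    ("MODS", "name:namePart"): "creator(file-as)",
    ("TEI", "fileDesc:titleStmt:author"): "creator(file-as)",
    ("MODS", "exampleUnmatched"): None,  # hypothetical attribute, for illustration only
}


def group_attributes(assignments):
    """Aggregate attributes under the primary set; collect the rest separately."""
    groups = defaultdict(list)
    ungrouped = []
    for (schema, attribute), primary in assignments.items():
        if primary is None:
            ungrouped.append((schema, attribute))
        else:
            groups[primary].append((schema, attribute))
    return dict(groups), ungrouped


groups, ungrouped = group_attributes(assignments)
for primary, members in groups.items():
    print(primary, "<-", members)
print("grouped separately:", ungrouped)
```

Attributes left in the separate group are the ones the experts would either drop as unimportant or keep as their own group.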

Semantic Metadata Mapping Process: 3rd. Finding common DECs
- Analyze each attribute of the primary metadata set to find the object class and the property that are implicit in, and related to, the attribute.
- Construct common data element concepts (DECs) based on ISO/IEC 11179, using the object classes and the properties.
- If an attribute does not fit any of the DECs, a new DEC may be constructed for it.

Semantic Metadata Mapping Process: 4th. Mapping into a table
Finally, arrange all attributes into a table by the common DECs.
Comments on the type of mapping can be included in the table, as below:
- Same, no difference: no description
- Level difference: upper/lower terms
- Domain difference: generic/specific (book, technical report, article, …)
- Term difference: synonym, antonym or preferred term
- Naming rule difference: order or representation rules
A recommended set of metadata can be provided to guide future standardization.

Application to e-Book: (1st) Collecting metadata schema
Domain: e-Book
Available metadata sets: OpenEBPS, MODS and TEI
Primary metadata set: OpenEBPS

Survey item       OpenEBPS                        MODS                              TEI header
Domain name       Description of Electronic Book  Description of Library resources  Encoding methods for machine-readable texts
Number of fields  15                              About 60 (top level: 20)          Over 20
Sample data       yes                             no                                yes

Application to e-Book: (2nd) Grouping attributes

OpenEBPS          MODS                    TEI
title             titleInfo:title         fileDesc:titleStmt:title
                  titleInfo:subTitle      fileDesc:seriesStmt:title
                  titleInfo:partNumber    fileDesc:seriesStmt:idno
                  titleInfo:partName
                  titleInfo:nonSort
creator(role)     name:role
creator(file-as)  name:namePart           fileDesc:titleStmt:author
                  name:displayForm
                  name:affiliation
                  name:description
subject           subject:topic           profileDesc:textClass:keyword
                  classification          profileDesc:textClass:classCode
                  subject:cartographics   profileDesc:textClass:catRef
                  subject:occupation

Application to e-Book: (3rd) Constructing common DECs based on 11179
Object class: e-Book
Properties: title, author, subject, abstract, publisher, distributor, authority, contributor, publication-date, genre, format, extent, identifier, language, coverage-geographic, coverage-temporal, right, location, edition
DECs: ebookTitle, ebookAuthor, ebookSubject, ebookAbstract, ebookPublisher, ebookDistributor, ebookAuthority, ebookContributor, ebookPublication-date, ebookGenre, ebookFormat, ebookExtent, ebookIdentifier, ebookLanguage, ebookCoverage-geographic, ebookCoverage-temporal, ebookRight, ebookLocation, ebookEdition
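
The DEC names listed above follow a regular pattern: the object class, lowercased without its hyphen, followed by the property with its first letter capitalized. A short sketch of that pattern is given below. The helper name dec_name is hypothetical, and the naming rule is inferred from the listed names rather than stated in the study.

```python
OBJECT_CLASS = "e-Book"
PROPERTIES = [
    "title", "author", "subject", "abstract", "publisher", "distributor",
    "authority", "contributor", "publication-date", "genre", "format",
    "extent", "identifier", "language", "coverage-geographic",
    "coverage-temporal", "right", "location", "edition",
]


def dec_name(object_class, prop):
    """Build a DEC name as <object class><Property>, e.g. 'ebookTitle'.

    The rule is inferred from the names in this report, not taken from a standard.
    """
    prefix = object_class.replace("-", "").lower()  # "e-Book" -> "ebook"
    return prefix + prop[0].upper() + prop[1:]      # "title" -> "Title"


decs = [dec_name(OBJECT_CLASS, p) for p in PROPERTIES]
print(decs)
```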

Application to e-Book: (4th) Mapping into a table

DEC           OpenEBPS                  MODS                    TEI                          Recommended DE
ebookTitle    title                     titleInfo:title         titleStmt:title              ebookTitle
                                        titleInfo:subTitle      seriesStmt:title [T:pre]     ebookSubtitle
ebookAuthor   creator(role)             name:role
              creator(file-as) [T:pre]  name:namePart [D:gen]   titleStmt:author [N:rep]     ebookAuthorName
ebookSubject  subject [N:rep]           subject:topic [T:pre]   textClass:keyword [N:rep]    ebookSubjectWord
                                        classification [N:rep]  textClass:classCode [N:rep]  ebookSubject-classCode
                                                                textClass:catRef [T:pre]

Mapping comments (shown in brackets after an attribute):
L (level): up = upper term, lo = lower term
D (domain): gen = generic, …
T (term): syn = synonym, ant = antonym, pre = preferred term
N (naming rule): ord = order, rep = representation
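
The mapping comments can be expanded mechanically into the difference types defined for the 4th step. The sketch below assumes each comment has the form category:kind (for example T:pre). The names MAPPING_CODES and describe are hypothetical, and the pairing of comments with attributes in the sample row is a reading of the slide layout, not a confirmed result of the study.

```python
# Category letter -> (difference type, known kind abbreviations).
# Only "gen" is listed for D because the slide elides the remaining kinds.
MAPPING_CODES = {
    "L": ("Level difference", {"up": "upper term", "lo": "lower term"}),
    "D": ("Domain difference", {"gen": "generic"}),
    "T": ("Term difference", {"syn": "synonym", "ant": "antonym", "pre": "preferred term"}),
    "N": ("Naming rule difference", {"ord": "order", "rep": "representation"}),
}


def describe(code):
    """Expand a mapping comment such as 'T:pre'; an empty comment means no difference."""
    if not code:
        return "Same, no difference"
    category, kind = code.split(":")
    label, kinds = MAPPING_CODES[category]
    return f"{label}: {kinds[kind]}"


# One row of the mapping table (ebookAuthor), as (attribute, comment) per schema.
row = {
    "OpenEBPS": ("creator(file-as)", "T:pre"),
    "MODS": ("name:namePart", "D:gen"),
    "TEI": ("titleStmt:author", "N:rep"),
}
for schema, (attribute, code) in row.items():
    print(f"{schema}: {attribute} -> {describe(code)}")
```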

Future plan
The SMM process will be elaborated further so that it can be proposed as a new work item in ISO/IEC JTC1/SC32 next year.

Thank you!