EDUG 2012 Symposium 26 April 2012 Boston Spa, UK Michael Panzer Assistant Editor, DDC OCLC DDC metadata
Types of DDC data -Usually, Dewey numbers provide metadata for describing other resources -DDC as value vocabulary for metadata element sets -Instead, the following focuses on cases where Dewey numbers and DDC editions are the resources described -Two levels of DDC metadata -Number-level metadata (focus on bibliographic records) -Edition-level metadata (focus on classification records)
DDC metadata Metadata about -Dewey numbers (082, 083, 085 fields in MARC Bibliographic) -Provenance of machine-generated classification data -Dewey number components in linked 085 fields -Dewey editions (084, 686 fields in MARC Classification) -Interplay between class- and edition-level metadata rendered in MARC Classification format
Agenda Scenario 1.Provenance of machine-generated data 2.Edition-level metadata 3.Metadata about Dewey number components Context -Proposal for MARBI; (metadata) provenance initiatives at W3C / DCMI -Relationship between translations and other “versions” -Enhancing Dewey numbers for retrieval
MARBI proposal -Drafted over the last two months in cooperation with colleagues from DNB and LC -To be presented at MARBI meeting at ALA Annual Conference Two options -Option 1: Addresses the immediate needs of documenting information about machine generation of classification data -Defines additional subfields in 082, 083, 084 -Option 2: Proposes a more general way of dealing with metadata provenance -Applicable to all MARC variable fields (in principle) -Heeds the distinction between provenance in general and metadata provenance in particular
Option 1 Defined for 082, 083, 084 $i - Method of assignment designator Fully machine-generated (m) Not fully machine-generated (x) $u - Process of assignment May contain a URI, a process name, or some other description of process designated in $i $1 - Confidence value Confidence of the assigning agency in relation to the process described in $u. Contains value from the interval [0,1] $q – Assigning agency (already defined)
Examples DDC 23 number assigned by LC using AutoDewey. The AutoDewey process involves machine assistance followed by intellectual review: $a829/.3$223$ix$uautodewey$11 Fictitious example of DDC 22 number assigned by OCLC in a fully automated way using information in Classify: $a394.12$222$im$uclassify$10.5$qOCoLC
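The two Option 1 examples can be checked mechanically. The following is a minimal sketch, assuming the compact `$`-delimited notation used on this slide; the function names `parse_subfields` and `check_option1` are hypothetical and not part of the proposal. It only tests the two constraints stated for Option 1: `$i` is `m` or `x`, and `$1` lies in the interval [0,1].

```python
from typing import Dict, List


def parse_subfields(compact: str) -> Dict[str, str]:
    """Split a compact string such as '$a394.12$222' into {code: value}.

    Simplification: a repeated subfield code overwrites the earlier value.
    """
    subfields: Dict[str, str] = {}
    for chunk in compact.split("$")[1:]:
        if chunk:
            # First character is the subfield code, the rest is its value.
            subfields[chunk[0]] = chunk[1:].strip()
    return subfields


def check_option1(subfields: Dict[str, str]) -> List[str]:
    """Return a list of problems with the Option 1 subfields $i and $1."""
    problems: List[str] = []
    method = subfields.get("i")
    if method is not None and method not in {"m", "x"}:
        problems.append("$i must be 'm' or 'x', got %r" % method)
    confidence = subfields.get("1")
    if confidence is not None:
        try:
            value = float(confidence)
        except ValueError:
            problems.append("$1 is not a number: %r" % confidence)
        else:
            if not 0.0 <= value <= 1.0:
                problems.append("$1 must lie in [0,1], got %r" % value)
    return problems


lc_example = parse_subfields("$a829/.3$223$ix$uautodewey$11")
oclc_example = parse_subfields("$a394.12$222$im$uclassify$10.5$qOCoLC")
print(lc_example, check_option1(lc_example))
print(oclc_example, check_option1(oclc_example))
```

Splitting on `$` is enough for these examples; a real record would be read with a MARC library rather than from compact strings.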
Option 2 Data provenance (R) First Indicator: Method of assignment # - No information provided 0 - Fully machine-generated 1 - Not fully machine-generated $d - Date on which the linked field was generated $u - Process used to generate linked field $q - Agency using the process/activity to generate the linked field $1 - Confidence value $x - Ending date of validity $0 - Authority record control number or standard number $8 - Field link and sequence number (with new field link type “p – Data provenance”)
Examples $81\p$a829/.3$ # $81\p$uautodewey$d $qDLC$ $81\p$a394.12$222$qOCoLC 8830# $81\p$uclassify$d $qOCoLC$10.5
Examples (2) $81\p$a004$222/ger$qNO-OsNB 8830# $81\p$udeweyclassifierv0.1$d $x $qNO-OsNB$10.25 $0(DE-101) $81\p$a004$222/ger$qDE # $81\p$uparallelrecordcopy$d $x $qNO-OsNB
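Under Option 2, a provenance field is tied to the field it describes through a shared `$8` value with link type `p`. The pairing can be sketched as follows. This is an illustration under assumptions, not the proposal's definition: fields are modelled as plain `(tag, subfields)` tuples, the tag `883` is read off the examples above, the tag `082` for the described field is assumed, and any sequence number inside `$8` is ignored. The sample values come from the OCLC example; the date is left out because it is not given.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Field = Tuple[str, Dict[str, str]]

PROVENANCE_TAG = "883"  # assumption: tag as it appears in the slide examples


def parse_link(value: str) -> Tuple[str, str]:
    """Split a $8 value such as '1\\p' into (link number, link type)."""
    number, _, link_type = value.partition("\\")
    return number, link_type


def provenance_links(fields: List[Field]) -> List[Tuple[Field, Dict[str, str]]]:
    """Pair each field carrying a 'p' link with its provenance subfields."""
    described: Dict[str, List[Field]] = defaultdict(list)
    provenance: Dict[str, Dict[str, str]] = {}
    for tag, subfields in fields:
        link = subfields.get("8")
        if link is None:
            continue
        number, link_type = parse_link(link)
        if link_type != "p":
            continue  # other link types are outside this sketch
        if tag == PROVENANCE_TAG:
            provenance[number] = subfields
        else:
            described[number].append((tag, subfields))
    return [
        (field, provenance[number])
        for number, group in described.items()
        if number in provenance
        for field in group
    ]


record: List[Field] = [
    ("082", {"8": "1\\p", "a": "394.12", "2": "22", "q": "OCoLC"}),
    ("883", {"8": "1\\p", "u": "classify", "q": "OCoLC", "1": "0.5"}),
]
for (tag, subfields), prov in provenance_links(record):
    print(tag, subfields["a"], "generated by", prov["u"], "confidence", prov["1"])
```

Keeping provenance in its own linked field is what makes Option 2 applicable, in principle, to any variable field rather than only to 082, 083, and 084.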
Edition-level metadata -Edition registry: capturing information about editions and translations in a centralized manner outside of MARC records -Storing additional metadata about editions/translations in MARC records -Better management of translation data and other versions -MARC does not offer edition-level records -Edition-level information has to be carried in individual records, even when it applies to the whole edition -Relevant fields: 084 Classification Scheme and Edition; 686 Relationship to Source Note
DDC translations: Anatomy of an edition [Diagram. Labels shown: French, German, Italian, and Swedish Mixed DDC 22; Italian, Vietnamese, French, and Spanish A14; Hebrew A; Religion Class; Guide (French); DDC Sach-Gruppen (German); DDC Summaries in English, French, Italian, Rhaeto-Romansch, Afrikaans, Arabic, Chinese, German, Norwegian, Portuguese, Russian, Scots Gaelic, Spanish, and Swedish]
Types of editions -Related to an edition, with relationships not captured at record level Examples: sdnb, DDC Summaries, Guide versus -Related to an edition, with relationships captured at record level Examples: 200 Religion, translations, A15engind
Tracking edition-to-edition relationships Translation of standard edition 084 1# $a ddc $c 15 $e ind Source edition 084 1# $a ddc $c 15 $e eng Authorized derivative version of standard edition 084 8# $a ddc $c 22sdnb $d 22 $e ger Source edition 084 0# $a ddc $c 22 $e eng -Not explicitly full or abridged; “8” is used for value of first indicator -$n should be automatically populated with relevant information about the changes regarding the source edition.
Tracking record-to-record relationships 1. Record has been modified Translation of standard edition 084 1# $a ddc $c 15 $e ind 686 3# $i modified Source record 084 1# $a ddc $c 15 $e eng
Tracking record-to-record relationships (2) 2. Record was created for translation Translation of standard edition 084 1# $a ddc $c 15 $e ind 686 1# $b Source record [does not exist]
Tracking record-to-record relationships (3) 3. Unmodified record from different source edition Translation of standard edition 084 1# $a ddc $c 15 $e ind 686 0# $2 23 Source record 084 0# $a ddc $c 23 $e eng
085 - Synthesized Classification Number Components -085 fields provide information about components of Dewey numbers in linked 082 or 083 fields -Mirror 765 fields in MARC Classification format -Vital for faceted retrieval driven by Dewey numbers -Further enhancements possible by utilizing mappings of Dewey numbers that occur prominently as components, e.g., geographic data, time periods -Definition of new indexes is a requirement for retrieval use for WorldCat data
Exploiting Dewey facets in WorldCat Das Highlander-Kochbuch $8 1\x $a $q DE-101 $2 22/ger 085 ## $8 1\x $b ## $8 1\x $z 2 $s Cooking characteristic of specific continents, countries, localities T2—4115 Highland
Proposed new indexes (083 fields) “Dewey additional” index da index:Add $z and $c ($y) to elements already in dd index Pattern:[z--]a[-c][:a[-c]]
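One possible reading of the da pattern `[z--]a[-c][:a[-c]]` is sketched below. The interpretation is an assumption: square brackets are taken to mark optional parts, `z` is the table number from `$z`, `a` the number from `$a`, and `c` the end of a span from `$c`. The function names are hypothetical, and `$y` is not handled.

```python
from typing import Optional


def number_part(number: str, span_end: Optional[str] = None) -> str:
    """Render 'a' or 'a-c' for a single number or a span."""
    return "%s-%s" % (number, span_end) if span_end else number


def da_key(
    number: str,
    span_end: Optional[str] = None,
    table: Optional[str] = None,
    second_number: Optional[str] = None,
    second_span_end: Optional[str] = None,
) -> str:
    """Build an index key following the assumed reading of [z--]a[-c][:a[-c]]."""
    key = "%s--" % table if table else ""
    key += number_part(number, span_end)
    if second_number:
        key += ":" + number_part(second_number, second_span_end)
    return key


# Illustrative values only; they are not taken from a real 083 field.
print(da_key("930", span_end="990"))   # a span of numbers
print(da_key("4115", table="2"))       # a table notation
print(da_key("930", "990", second_number="01", second_span_end="09"))
```

A single string key per entry keeps the index searchable with ordinary prefix matching, which is presumably why the pattern concatenates the elements rather than indexing them separately.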
Proposed new indexes (085 fields) “Dewey components” index dc index:Index $s and $t concatenated with full address Pattern:[z--]rs|w[-c][:t] “Dewey synthesized” index ds index:Index all components Pattern:[z--]a|b|rs|u|w[-c][:a|b|t|u|v[-c]]
Proposed new indexes (082/083/085 fields) “Dewey general” index dg index:Index all elements in Dewey numbers Pattern:Combine dd, da, and ds indexes
Built number: History & geography +T2—435514Cologne Period of World War II, $8 1\x $a 943/ $ # $8 1\x $b 9 $a 930 $c 990 $z 2 $s $u # $8 1\x $b $a 930 $c 990 $v 01 $c 09 $f 0 $r $s 864 $u Example: History of Cologne during WWII
Access points / findability Components in dc index: Synthesized components in ds index: , 9, , :01-09, , , $8 1\x $a 943/ $ # $8 1\x $b 9 $a 930 $c 990 $z 2 $s $u # $8 1\x $b $a 930 $c 990 $v 01 $c 09 $f 0 $r $s 864 $u
Scenarios / Use cases -Components / facets can be varied independently of each other -Allows for expanding, but also "morphing" the query by changing individual components -Integration of mapped vocabularies into Dewey-driven discovery process -Using terms that have been mapped to any number components -Usage of local hierarchies of number components instead of just the hierarchical relationships of the base number
Example: Dewey-driven discovery Number components + mapped GeoNames T2— Neighboring countries: China T2—51, Pakistan T2—5491, Bangladesh T2—5492, Nepal T2—5496, Bhutan T2—5498, Myanmar T2—591 notational structural T2—51 gn:neighbour Drinking of alcoholic beverages Meals Table manners structural
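The "morphing" idea from the use cases can be sketched with the Table 2 notations listed above. This is a rough illustration, not a WorldCat API: a query is modelled as a plain dictionary of facets, the key format `2--<notation>` follows the assumed `[z--]` prefix of the proposed index patterns, `"TOPIC"` is a placeholder for a topical component, and `morph_query` is a hypothetical helper.

```python
from typing import Dict, List

# Table 2 notations as listed on the slide (mapped via gn:neighbour).
T2_NEIGHBOURS: Dict[str, str] = {
    "China": "51",
    "Pakistan": "5491",
    "Bangladesh": "5492",
    "Nepal": "5496",
    "Bhutan": "5498",
    "Myanmar": "591",
}


def morph_query(
    query: Dict[str, str], facet: str, alternatives: Dict[str, str]
) -> List[Dict[str, str]]:
    """Return one query variant per alternative, changing only one facet."""
    variants: List[Dict[str, str]] = []
    for notation in alternatives.values():
        variant = dict(query)  # copy so the other facets stay untouched
        variant[facet] = "2--%s" % notation
        variants.append(variant)
    return variants


# "TOPIC" stands in for a topical component key; the place facet is varied.
query = {"topic": "TOPIC", "place": "2--5496"}
for variant in morph_query(query, "place", T2_NEIGHBOURS):
    print(variant)
```

Because only the geographic component changes, the topical part of the query stays fixed while the user moves between neighbouring countries.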
Thank You! Questions? Comments? Ideas?
Some useful links -DDC 23: http:// -Abridged Edition 15: http:// -WebDewey 2.0: http://dewey.org/webdewey -dewey.info: http://dewey.info -Dewey webinars & presentations -The Dewey blog -Classify: http://classify.oclc.org/classify2/ -(Dewey Editorial Office) -(Licensing, group purchases, LIS program)