High Level Browse Automatic Assignment of Broad Subject Categories Using Pre-existing Data from Catalog Records Jonathan Rothman Senior Systems Librarian.


1 High Level Browse Automatic Assignment of Broad Subject Categories Using Pre-existing Data from Catalog Records Jonathan Rothman Senior Systems Librarian / Analyst University of Michigan University Library jrothman@umich.edu

2 Context / History
• People like lists (aka browsable, categorized access tools)
• WWW = demand for browsable, clickable lists.
• “Hand-made” web lists.

3 Manually-Maintained Resource List

4 …and another…

5 Demand for Comprehensive Lists
• Manual maintenance is plausible for selected lists
• … but it is not supportable for “comprehensive” tools.

6 Manually Built and Maintained Electronic Journals List

7

8

9

10 The Issues
• Inconsistent Categories

11

12

13 The Issues
• Inconsistent Categories
• Categories require a lot of maintenance work

14

15 Alternatives We Considered
• Using LC Classification As Interface
• Order record fund codes
• Mapping from data in Bibliographic Records

16 LC Class Schedule as Interface
• Doesn’t accommodate local Dewey numbers
• Unintuitive: assumes the user knows how the classification schedule is organized
• Scatters items of interest to departments and programs across categories
• Doesn’t take advantage of local expertise

17 Using Fund Codes
• Not presently available outside of our Acquisitions System
• Codes don’t map neatly to topics
• Master list of codes would need to be carefully maintained along with maps from codes to topics

18 Data Mapping Pros and Cons
Pros
• Uses data that already exists in records.
• Mapping allows adjustments to topics without changing individual records.
Cons
• Some types of material have historically not been given class numbers.
• Some records don’t contain the numbers that would get them to appropriate categories.

19 High Level Browse Project
• High Level Browse??
• Two Project Components
  • Create a single set of topics to be used across access tools
  • Create an infrastructure that allows bibliographic data to be associated with topics in a maintainable way

20 Unified Topic List
• Start with merger of existing lists.
  • Review in light of local programs and units
  • Broad input
• Design principles
  • Limit number of headings at a given level
  • Limit number of levels
• Mostly a political process: a lot of discussion, compromise and iteration.

21 Topic List, Level One Topics
There are nine Level One topics:
• Arts & Humanities
• Business & Economics
• Engineering
• General Reference
• Government Information & Law
• Health Sciences
• News & Current Events
• Science
• Social Sciences

22 Topic List, Level Two Topics
110 total. Some examples:
• African Studies
• African-American Studies
• American and Canadian Studies
• Architecture
• Art and Design
• Art History
• Classical Studies
• East Asian Languages and Cultures
• English Language and Literature
• Film and Video Studies
• Gay/Lesbian/Bisexual/Transgender Studies
• General and Comparative Literature
• Germanic Languages and Literature
• History (General)
• Humanities (General)
• Biological Chemistry
• Biomedical Engineering
• Complementary and Alternative Medicine
• Dentistry
• Dermatology
• Family Medicine and Primary Care
• Genetics
• Geriatrics
• Internal Medicine and Specialties
• Kinesiology and Sports
• Medicine (General)
• Microbiology and Immunology
• Molecular, Cellular and Developmental Biology
• Neurosciences

23 Overview of Work Involved
• Development of initial maps by teams of catalogers and subject selectors.
• Technical infrastructure development.
• Integration of high-level browse infrastructure with existing retrieval tools.
• Evaluation / Tuning.

24 Principles for Technical Development
• Mapping Infrastructure Should be Independent of Any Specific Access Tool
• Regular Maintenance of Maps Should be Possible Without Programmer Intervention

25 What Do We Mean by a Map?
• BC => Philosophy
• BD => Philosophy
• BF 432.N5 => Afro-American and African Studies
• BR 128.A16 => Afro-American and African Studies
• E 185 => Afro-American and African Studies
• F 1435.3.P5 => Philosophy
• HF 5387-5387.5 => Philosophy
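The map entries above can be read as "class letters, optionally narrowed to a number range, lead to a topic". The following is a minimal Python sketch of that idea, not the project's actual engine. The names MapEntry, ENTRIES and topics_for are hypothetical, cutter-level entries such as BF 432.N5 are left out for brevity, and a single number such as E 185 is stored as a one-point range (whether decimal extensions should also match is a policy choice this sketch does not settle).

```python
import re
from typing import List, NamedTuple, Optional


class MapEntry(NamedTuple):
    alpha: str                   # LC class letters, e.g. "HF"
    num_start: Optional[float]   # None means "any number in this class"
    num_end: Optional[float]
    topic: str


# Entries copied from the slide; cutter-level entries are omitted in this sketch.
ENTRIES = [
    MapEntry("BC", None, None, "Philosophy"),
    MapEntry("BD", None, None, "Philosophy"),
    MapEntry("E", 185.0, 185.0, "Afro-American and African Studies"),
    MapEntry("HF", 5387.0, 5387.5, "Philosophy"),
]

# Class letters followed by an optional class number, e.g. "HF 5387.2".
CALL_NO = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+(?:\.\d+)?)?")


def topics_for(call_number: str) -> List[str]:
    """Return the sorted, de-duplicated topics whose entries cover the call number."""
    m = CALL_NO.match(call_number)
    if not m:
        return []
    alpha = m.group(1).upper()
    number = float(m.group(2)) if m.group(2) else None
    found = set()
    for entry in ENTRIES:
        if entry.alpha != alpha:
            continue
        if entry.num_start is None:
            found.add(entry.topic)          # whole class maps to the topic
        elif number is not None and entry.num_start <= number <= entry.num_end:
            found.add(entry.topic)          # number falls inside the range
    return sorted(found)
```

Because the topic is attached to the map entry rather than to the catalog record, changing a topic assignment means editing ENTRIES only; the records themselves stay untouched.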

26 Topic Map African and Afro-American Studies DT1.A N1. A26 E184.7 F189.B19N4 HQ768

27 Revised Topic Map African Studies Afro-American Studies DT1.A N1. A26 E184.7 F189.B19N4 HQ768

28 Map Creation Statistics
• Creation of initial maps is about 80% complete.
• On average, consultation session to define a map takes about 3-4 hours.
• Map size ranges from One entry

29 Science (General) Map

30 Map Statistics
• Creation of initial maps is about 80% complete.
• On average, consultation session to define a map takes about 3-4 hours.
• Map size ranges from one entry to 1656 entries

31 Middle Eastern, Near Eastern and North African Studies Map

32 The Map Database

33 Map Tables 1
Table levelTwoTopic:
  id  Name
  1   History (General)
  2   Religious Studies
  3   West European Studies
Table encompasses:
  levelOne  levelTwo
  1         1
  9         1
  1         2
Table levelOneTopic:
  id  name
  1   Arts & Humanities
  2   Business & Economics
  3   Engineering
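To make the relationship between these three tables concrete, here is a small Python sketch that loads the rows shown on the slide into an in-memory SQLite database and looks up the Level One topics that encompass a Level Two topic. The use of SQLite, the column types and the function names build_topic_db and level_one_for are assumptions for illustration; the slides do not say which database or schema details the project used. Level One id 9 appears in encompasses but not in the levelOneTopic rows shown, so a LEFT JOIN is used and its name comes back as None.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_topic_db() -> sqlite3.Connection:
    """Create the three tables and load only the rows visible on the slide."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE levelOneTopic (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE levelTwoTopic (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE encompasses (levelOne INTEGER, levelTwo INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO levelOneTopic VALUES (?, ?)",
        [(1, "Arts & Humanities"), (2, "Business & Economics"), (3, "Engineering")],
    )
    conn.executemany(
        "INSERT INTO levelTwoTopic VALUES (?, ?)",
        [(1, "History (General)"), (2, "Religious Studies"), (3, "West European Studies")],
    )
    conn.executemany("INSERT INTO encompasses VALUES (?, ?)", [(1, 1), (9, 1), (1, 2)])
    return conn


def level_one_for(conn: sqlite3.Connection, level_two_name: str) -> List[Tuple[int, Optional[str]]]:
    """Return (levelOne id, levelOne name) pairs for a Level Two topic name."""
    return conn.execute(
        """
        SELECT e.levelOne, l1.name
        FROM levelTwoTopic AS l2
        JOIN encompasses AS e ON e.levelTwo = l2.id
        LEFT JOIN levelOneTopic AS l1 ON l1.id = e.levelOne
        WHERE l2.name = ?
        ORDER BY e.levelOne
        """,
        (level_two_name,),
    ).fetchall()
```

The separate encompasses table is what lets one Level Two topic, such as History (General), sit under more than one Level One topic.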

34 Map Tables 2
Table levelTwoTopic:
  id  Name
  1   History (General)
  2   Religious Studies
  3   West European Studies
Table lc:
  Id  alphaStart  numStart  cutStart  alphaEnd  numEnd    cutEnd  notes
  1   az          200.000   NULL      az        361.000   NULL
  29  bl          1.000     NULL      bx        999.000   NULL    religion
  34  z           7963.000  r45       z         7963.000  r45     women
Table lcMap:
  lc  levelTwo
  1   1
  2   1
  1   3
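One way to test a call number against rows of the lc table is to turn both the call number and each row's start and end into comparable (letters, number, cutter) keys. The Python sketch below illustrates this under stated assumptions: the names LcRow, parse_key and matching_lc_ids are hypothetical, a NULL cutStart is treated as "lowest possible cutter" and a NULL cutEnd as "highest possible cutter", and only the first cutter of a call number is considered. It returns matching lc row ids; resolving those ids to topics through lcMap would be a further join and is not shown.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class LcRow(NamedTuple):
    id: int
    alpha_start: str
    num_start: float
    cut_start: Optional[str]
    alpha_end: str
    num_end: float
    cut_end: Optional[str]


# Rows copied from the slide (notes column omitted).
LC_ROWS = [
    LcRow(1, "az", 200.0, None, "az", 361.0, None),
    LcRow(29, "bl", 1.0, None, "bx", 999.0, None),
    LcRow(34, "z", 7963.0, "r45", "z", 7963.0, "r45"),
]

# Class letters, class number, optional first cutter (e.g. "Z7963.R45").
CALL_NO = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+(?:\.\d+)?)\s*(?:\.?\s*([A-Za-z]\d+))?")

Key = Tuple[str, float, str]


def parse_key(call_number: str) -> Optional[Key]:
    """Normalize a call number to (letters, number, cutter); cutter is '' if absent."""
    m = CALL_NO.match(call_number)
    if not m:
        return None
    return (m.group(1).lower(), float(m.group(2)), (m.group(3) or "").lower())


def matching_lc_ids(call_number: str) -> List[int]:
    """Return ids of lc rows whose start..end range contains the call number."""
    key = parse_key(call_number)
    if key is None:
        return []
    hits = []
    for row in LC_ROWS:
        low = (row.alpha_start, row.num_start, row.cut_start or "")
        # "\uffff" sorts after any cutter string, standing in for "no upper cutter limit".
        high = (row.alpha_end, row.num_end,
                row.cut_end if row.cut_end is not None else "\uffff")
        if low <= key <= high:
            hits.append(row.id)
    return hits
```

Comparing tuples lets a single row span several classes, as row 29 does (bl 1 through bx 999), and comparing cutters as strings matches their decimal ordering (r100 sorts before r45).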

35 Map Tables 3
Table levelTwoTopic:
  id  Name
  1   History (General)
  2   Religious Studies
  3   West European Studies
Table dewey:
  Id  numStart  numEnd   notes
  1   350.000   359.000  NULL
  29  840.000   849.990  French
  34  850.000   859.940  Italian
Table deweyMap:
  dewey  levelTwo
  1      1
  2      60
  3      63
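Dewey ranges are simpler than LC ranges because each row is a single numeric interval. As a rough illustration, the Python sketch below loads the dewey rows from the slide into an in-memory SQLite table and finds the rows whose interval contains a given Dewey number. SQLite, the column types and the names build_dewey_db and dewey_matches are assumptions, not details taken from the project. Only dewey row ids and notes are returned; the join through deweyMap to topics is left out.

```python
import re
import sqlite3
from typing import List, Optional, Tuple


def build_dewey_db() -> sqlite3.Connection:
    """Create the dewey table and load the three rows visible on the slide."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dewey (id INTEGER PRIMARY KEY, numStart REAL, numEnd REAL, notes TEXT)"
    )
    conn.executemany(
        "INSERT INTO dewey VALUES (?, ?, ?, ?)",
        [
            (1, 350.000, 359.000, None),
            (29, 840.000, 849.990, "French"),
            (34, 850.000, 859.940, "Italian"),
        ],
    )
    return conn


# Leading Dewey class number, e.g. "843.912" in "843.912 B19".
DEWEY_NO = re.compile(r"^\s*(\d{1,3}(?:\.\d+)?)")


def dewey_matches(conn: sqlite3.Connection, call_number: str) -> List[Tuple[int, Optional[str]]]:
    """Return (id, notes) for every dewey row whose range contains the number."""
    m = DEWEY_NO.match(call_number)
    if not m:
        return []
    number = float(m.group(1))
    return conn.execute(
        "SELECT id, notes FROM dewey WHERE ? BETWEEN numStart AND numEnd ORDER BY id",
        (number,),
    ).fetchall()
```

Numbers that fall between two rows, such as 859.95 here, simply match nothing, which corresponds to the "records that don't reach an appropriate category" issue raised earlier in the deck.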

36 Infrastructure Software Elements
• Mapping Engine
• Batch Load Script
• Map Maintenance Interface

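37 API Call to the Mapping Engine
    #! /l/local/bin/perl
    use CallNoToTopicMap;

    # Open the mapping engine before any lookups.
    CallNoToTopicMap::init();
    print "enter call numbers (ctrl-d when done): ";
    # Read one call number per line and print the topic(s) it maps to.
    while (<STDIN>) {
        print "\ntopic(s): "
            . join("\n ", @{&CallNoToTopicMap::topics($_)})
            . "\n\n: ";
    }
    # Release the engine's resources when input ends.
    CallNoToTopicMap::finish();
    print "\n";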

38 Infrastructure Demonstrations
• Simple Demonstration Interface to Mapping Engine
• Maintenance Interface

39 Integration with Existing Access Tools
• Use to pre-generate categories associated with bibliographic items when data is updated in batch.
• Use to populate menus of categories in real time
• Use to generate categories associated with bibliographic items in real time.
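The three integration modes above differ mainly in when the mapping engine is called. The Python sketch below illustrates that difference with a stand-in engine. Everything in it is hypothetical: stub_topics and STUB_MAP replace the real mapping engine, and the record layout (a dict with "id" and "call_number") is invented for the example.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for the mapping engine: class letters -> topics.
STUB_MAP = {"BC": ["Philosophy"], "BD": ["Philosophy"]}


def stub_topics(call_number: str) -> List[str]:
    """Look up topics by the first whitespace-separated token of the call number."""
    parts = call_number.split()
    if not parts:
        return []
    return STUB_MAP.get(parts[0].upper(), [])


def pregenerate(records: Iterable[dict],
                topics_fn: Callable[[str], List[str]]) -> Dict[int, List[str]]:
    """Batch mode: compute topics once per record when data is loaded."""
    return {rec["id"]: topics_fn(rec["call_number"]) for rec in records}


def category_menu(pregenerated: Dict[int, List[str]]) -> List[str]:
    """Menu mode: list the distinct topics that actually have items."""
    return sorted({topic for topics in pregenerated.values() for topic in topics})


def topics_on_request(record: dict,
                      topics_fn: Callable[[str], List[str]]) -> List[str]:
    """Real-time mode: call the engine when a single item is displayed."""
    return topics_fn(record["call_number"])
```

Passing the engine in as topics_fn keeps the access tool independent of how the mapping is implemented, in line with the principle that the mapping infrastructure should not depend on any specific access tool.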

40 Integration Demonstrations
• New Books – new interface complete
• Ejournals – integration still to be completed

41 Addressing Identified Issues
• Types of materials that do not traditionally contain classification numbers in our system (e.g. newspapers).
• Individual items whose classification does not place them in every desired category.

42 Implementation Status
• New Books – move to production is imminent.
• Electronic Journals and Newspapers – planned by end of 2003
• NetER – Selection remains manual for now but new level one categories are integrated.

43 Work Outstanding
• Completion of Initial Map Definition
• Integration with Electronic Journals and Newspapers List
• Tuning of Maps

44 Contact Information Jonathan Rothman Senior Systems Librarian / Analyst University of Michigan University Library jrothman@umich.edu Questions?

