Understanding Indexes: Headings

1 Understanding Indexes: Headings
Prepared by Marina Spivakov, 2002; Updated by Jerry Specht, June 2003

2 Scope of the Lecture
Points for discussion in each index:
- Index structure (Oracle tables)
- Specifying the index
- Index creation and update
- Performance issues

3 Understanding Indexes
NOTE: This PowerPoint discusses Headings in 14.2 (and in general). It is supplemented by NAAUG.INDEX_CH.ppt, which follows and which describes Headings features new in 15.2.

4 Where to get your own copy
Both this PowerPoint presentation and the following one may be found on the US documentation server ( ) in the NAAUG_Indexes_2003 directory.

5 Headings Index

6 Headings Index
Headings index entries are whole phrases taken from the record, such as author, title, subject, or publisher.

7 Database Tables
Heading index:
- Z01: phrase dictionary
- Z02: pointers to the documents

8 Z01
Z01 record:
- Unique identifier, link to other records
- Filing text (stripped sub-fields, stripped punctuation, leading zeros added to numeric fields, character conversion, etc.)
- Display text
- Authority link

9 Z01-Z02 link
[Diagram: Z02 record, Z01 record, bibliographic record]

10 How to Define the Headings Index?
Tables to remember:
- tab00.lng: defines system index codes and filing procedures
- tab11: defines connections between the bibliographic record fields and the indexes
- tab_filing: defines filing procedures
- tab_expand: defines expand procedures which have to be activated when the index is created
- tab_character_conversion_line: defines character conversion routines
- unicode_to_filing_nn: character conversion table used for normalization of headings

11 How to Define the Structure of the Headings Index – Interrelation of Tables
[Diagram: tab00.lng, tab11, tab_filing, tab_expand]

12 Z01: Display Text and Filing Text (Useful Details)

13 Z01: Display Text and Filing Text
- Display text: data for the display text is taken directly from the record.
- Filing text: data undergoes filing and character conversion processing.

14 Z01: Display Text
If two records generate headings that have a common filing text but different display texts, the system will create two headings, not one.
[Diagram: Bibliographic document 1, Bibliographic document 2, Z01]

15 Z01: Display Text
In order to achieve normalization of headings in 14.2, the headings themselves must be changed to the same form. The only exception is the suppression of end punctuation, specified in tab00.eng (tab00.lng), col. 4:
- 0: no suppression
- 1 or space: suppress punctuation at the end of each sub-field when creating a Z01 heading.
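The end-punctuation rule can be pictured with a minimal Python sketch. It is not ALEPH code: build_display_text is a hypothetical helper, the punctuation set is an assumption, and sub-fields are assumed to be introduced by $$ plus a one-character code.

```python
import re

# Assumed set of end punctuation; the real set is not given in the slides.
END_PUNCT = ".,;:/"

def build_display_text(field: str, suppress_flag: str) -> str:
    """Return the heading text, optionally stripping punctuation at the
    end of each sub-field ('1' or ' ' suppresses, '0' does not)."""
    if suppress_flag == "0":
        return field
    # Split on sub-field markers such as $$a, keeping the markers.
    parts = re.split(r"(\$\$.)", field)
    cleaned = [
        p if p.startswith("$$") else p.rstrip().rstrip(END_PUNCT).rstrip()
        for p in parts
    ]
    return "".join(cleaned)

doc1 = "$$aSmith, John."
doc2 = "$$aSmith, John"
# With suppression on, both documents yield the same heading form.
same_with_suppression = build_display_text(doc1, "1") == build_display_text(doc2, "1")
# With suppression off, the two forms stay different (two headings).
same_without = build_display_text(doc1, "0") == build_display_text(doc2, "0")
```

Only trailing punctuation is touched here; any other difference between two headings (case, internal punctuation) would still produce two display texts.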

16 Z01: Display Text Normalization
[Diagram: Bibliographic document 1, Bibliographic document 2, tab00.lng, two Z01 records]

17 Z01: Display Text Normalization
[Diagram: Bibliographic document 1, Bibliographic document 2, tab00.lng, one Z01 record]
NOTE: Version 15 allows more advanced normalization of headings.

18 Normalization of Headings – Cataloger’s Assistant
Detect Similar Headings (p_manage_26) reports headings which differ in display text only, i.e. headings which are the same except for punctuation and case differences.
[Figure: example of output file]

19 Normalization of headings – Cataloger’s Assistant
Correct allows you to:
- Discover inconsistencies
- Change bibliographic documents without going to the Cataloguing module
- Reindex the documents (creates Z07)

20 Filing of Headings
Headings are filed (organized in the index, sorted) according to the filing text of the heading. Data for the filing text field is processed in two ways:
- Text goes through the appropriate filing routine.
- Characters go through character conversion.

21 Filing Routines
From version 14, the filing routines are made up of a group of individual procedures. Filing routines are defined in tab_filing.

22 tab_filing - Structure
!!-!-!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!>
01 # compress             ’
01 # char_conv            FILING-KEY-01
Col. 1: procedure identifier
Col. 2: alpha of the text
Col. 3: procedure name
Col. 4: procedure parameters

23 Examples of Filing Procedures
- compress: Strips characters listed in col. 4 (e.g., ()[]:,).
- delete_subfield: Changes subfield sign to blank (e.g., $$x).
- to_blank: Changes characters listed in col. 4 to blanks.

24 Examples of Filing Procedures
- to_lower: Changes all characters to lower case.
- to_carat: Changes subfield sign to two caret signs (^^) in order to achieve hierarchical sorting of headings.
- suppress: Suppresses all text contained within <<…>>, as well as the signs themselves.

25 Examples of Filing Procedures
- expand_num: For filing numbers numerically, adds leading zeros to numbers up to a fixed length of 7 (e.g. 17 -> 0000017).
- mc_to_mac: Changes initial “mc” to “mac” (for interfiling McKay and MacKay).
- non_filing: Suppresses initial text according to the non-filing indicator defined in tab11.

26 Examples of Filing Procedures
- compress_blank: Strips blanks (e.g. ISBN).
- numbers: Compresses a comma and a dot between numbers (e.g., 2,153 changes to 2153).
- non_numeric: Deletes all non-numeric characters (e.g. for ISSN).

27 Examples of Filing Procedures
- abbreviation: Compresses a dot between single characters (e.g., I. B. M. changes to I B M, I.B.M. changes to IBM).
- build_filing_key_lc_call_no: Special procedure for correct sequencing of LC call numbers.
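To make the idea of a filing routine as a chain of small procedures concrete, here is a rough Python sketch of a few of the procedures described above. The function bodies are simplified guesses based only on the one-line descriptions on the slides, and ROUTINE_01 is a hypothetical routine, not a shipped ALEPH configuration.

```python
import re

def compress(text, chars):
    """Strip every character listed in the parameter."""
    return "".join(c for c in text if c not in chars)

def to_lower(text, _param=""):
    return text.lower()

def expand_num(text, _param=""):
    """Left-pad every run of digits with zeros to a length of 7."""
    return re.sub(r"\d+", lambda m: m.group().zfill(7), text)

def numbers(text, _param=""):
    """Drop a comma or dot that sits between two digits."""
    return re.sub(r"(?<=\d)[.,](?=\d)", "", text)

def mc_to_mac(text, _param=""):
    """Change an initial 'mc' to 'mac', keeping the case of the 'm'."""
    return re.sub(r"^(m)c", r"\1ac", text, flags=re.IGNORECASE)

def abbreviation(text, _param=""):
    """Drop the dot after a single stand-alone letter."""
    return re.sub(r"\b([A-Za-z])\.", r"\1", text)

PROCEDURES = {
    "compress": compress,
    "to_lower": to_lower,
    "expand_num": expand_num,
    "numbers": numbers,
    "mc_to_mac": mc_to_mac,
    "abbreviation": abbreviation,
}

# Hypothetical routine: (procedure name, parameter) pairs, applied in order.
ROUTINE_01 = [
    ("numbers", ""),
    ("compress", "()[]:,"),
    ("mc_to_mac", ""),
    ("to_lower", ""),
    ("expand_num", ""),
]

def apply_routine(text, routine):
    """Run the text through each procedure of the routine in turn."""
    for name, param in routine:
        text = PROCEDURES[name](text, param)
    return text

filing_text = apply_routine("McKay, (ed.) 2,153", ROUTINE_01)
```

Each procedure takes the text and its col. 4 parameter and returns new text, so a routine is just an ordered list, which mirrors the way tab_filing lists procedures line by line.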

28 Examples of Filing Procedures
char_conv: Performs character conversion. Characters can be:
- filed as themselves
- ignored
- converted to spaces or to one or more different characters.
Examples: ü (00FC) to u (0075) or ue ( ); & (0026) to and ( E 0044)
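The three outcomes of character conversion (filed as itself, ignored, converted) can be sketched in Python as a lookup table. The CONVERSION table below is a made-up illustration and does not reproduce the contents of any unicode_to_filing_nn table.

```python
# Hypothetical conversion table: None means "ignore the character".
CONVERSION = {
    "\u00fc": "u",   # ü converted to a different character
    "&": "and",      # one character converted to several
    "-": " ",        # converted to a space
    "'": None,       # ignored
}

def char_conv(text: str, table: dict) -> str:
    """Convert each character according to the table."""
    out = []
    for ch in text:
        if ch not in table:
            out.append(ch)            # filed as itself
        elif table[ch] is not None:
            out.append(table[ch])     # converted
        # a None entry means the character is ignored
    return "".join(out)

converted = char_conv("M\u00fcller & O'Neil-Smith", CONVERSION)
```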

29 Examples of Filing Procedures – Character Conversion
tab_filing:
01 # char_conv            FILING-KEY-01
$alephe_unicode/tab_character_conversion_line:
FILING-KEY-01 ##### # line_utf2line_sb unicode_to_filing_01
FILING-KEY-02 ##### # line_utf2line_sb unicode_to_filing_02
FILING-KEY-03 ##### # line_utf2line_sb unicode_to_filing_03

30 Character Conversion Tables
- unicode_to_filing_nn: the table actually used by the index creation process.
- unicode_to_filing_nn_source: the raw material, the ‘human interface’ for character conversion definitions. All editing has to be done in this table.
Process unicode_to_filing_nn_source using UTIL P/3 in order to create unicode_to_filing_nn. UTIL P/3 performs an additional translation in order to remove null characters.

31 IMPORTANT NOTE
The procedures must be listed in logical order. For example, the following setup, which lists to_blank before numbers, is not logical:
- to_blank: changes characters specified in col. 4 to blank
- numbers: compresses a comma and a dot between numbers
‘2,153’ has to be turned into ‘2153’ by numbers. But here, it will first be changed to ‘2 153’ by to_blank.
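The ordering problem is easy to reproduce in a few lines of Python. The two functions are simplified stand-ins for the to_blank and numbers procedures, written from their slide descriptions only.

```python
import re

def to_blank(text, chars):
    """Change every listed character to a blank."""
    return "".join(" " if c in chars else c for c in text)

def numbers(text):
    """Drop a comma or dot that sits between two digits."""
    return re.sub(r"(?<=\d)[.,](?=\d)", "", text)

# Illogical order: to_blank runs first, so numbers finds no comma to compress.
wrong_order = numbers(to_blank("2,153", ",."))

# Logical order: numbers runs first, then to_blank has nothing left to change.
right_order = to_blank(numbers("2,153"), ",.")
```

Here wrong_order comes out as '2 153' and right_order as '2153', matching the example on the slide.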

32 Filing of Headings – Putting it Together…
[Diagram of the chain of tables: tab00.lng, tab_filing, tab_character_conversion_line (FILING-KEY-01 ##### # line_utf2line_sb unicode_to_filing_01), unicode_to_filing_01]

33 Index Creation and Update
The headings index is:
- Created by p_manage_02
- Enriched by ue_08
- Updated by ue_01
Note: In the authority libraries the headings are created when the document is updated, before ue_01 indexes it.

34 Maintenance of the Browse Index
- Alphabetize long headings
- Resequencing
- Delete unlinked headings

35 What are Long Headings?
- z01-filing-sequence = 69* characters
- z01-display-text = 2000 characters
* “Effective” length = 34 characters with double-byte characters.
p_manage_17 (Alphabetize Long Headings) sorts those headings whose display text is longer than 69 characters.
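Why long headings need a separate pass can be shown with a small Python sketch. It assumes, for illustration only, that the sort key is simply the first 69 characters of the text; alphabetize_long is a hypothetical stand-in and does not describe how p_manage_17 is actually implemented.

```python
FILING_KEY_LEN = 69  # length of z01-filing-sequence given on the slide

def filing_key(text: str) -> str:
    """Truncated key, as stored in a fixed-length sequence field."""
    return text[:FILING_KEY_LEN]

def is_long_heading(text: str) -> bool:
    return len(text) > FILING_KEY_LEN

def alphabetize_long(headings):
    """Order by the truncated key first, then by the full text, so that
    headings sharing the same first 69 characters end up in order."""
    return sorted(headings, key=lambda h: (filing_key(h), h))

prefix = "a" * FILING_KEY_LEN
headings = [prefix + " zebra", prefix + " apple", "short"]

# Sorting on the truncated key alone cannot separate the two long headings.
by_key_only = sorted(headings, key=filing_key)
fully_sorted = alphabetize_long(headings)
```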

36 Alphabetize Long Headings
[Figures: browse list before p_manage_17 and after p_manage_17]

37 Alphabetize Long Headings: How does it work?
[Diagram: util-g-2 counters. Last heading (Z01) indexed by p_manage_02 or ue_01; last heading (Z01) processed by p_manage_17. START: last-acc-number; FINISH: last-long-acc-number]

38 When to run p_manage_17?
p_manage_17 must be run periodically (e.g. daily) in order to alphabetize long headings that were added since the last time this function was run.

39 If the rules for filing text creation have been changed…
Run p_manage_16 (Alphabetize Headings - Setup). p_manage_16 recreates the filing text.

40 Unlinked Headings
What are unlinked headings? These are headings which do not have pointers to documents (Z01s without corresponding Z02s).
How are unlinked headings created? When a heading is modified, the existing Z01 is NOT updated. Instead, the Z02 record linking the heading to the bib record is deleted and a NEW Z01 record with a new Z02 is created. Thus, “orphaned”, outdated Z01s can accumulate.

41 Unlinked Headings
How to delete unlinked headings? Run p_manage_15 (Delete Unlinked Headings) periodically.
NOTE: The job does not delete Z01 records which have an authority link. This is in order to keep the cross-references, which are not linked to the documents directly (do not have attached Z02 records).
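The selection rule for unlinked headings can be sketched in Python as follows. The Z01 class, its field names, and delete_unlinked are hypothetical simplifications for illustration, not the actual Oracle table layout or the p_manage_15 code.

```python
from dataclasses import dataclass

@dataclass
class Z01:
    acc_number: int          # hypothetical identifier of the heading
    display_text: str
    aut_link: bool = False   # True if the heading has an authority link

def delete_unlinked(z01_records, z02_pointers):
    """Return the headings that survive the clean-up.

    z02_pointers is an iterable of (z01_acc_number, doc_number) pairs.
    A heading is kept if at least one Z02 points to it, or if it has an
    authority link (so cross-references are preserved).
    """
    linked = {acc for acc, _doc in z02_pointers}
    return [h for h in z01_records if h.acc_number in linked or h.aut_link]

headings = [
    Z01(1, "Twain, Mark"),                      # orphaned after modification
    Z01(2, "Twain, Mark, 1835-1910"),           # still linked to a document
    Z01(3, "Clemens, Samuel", aut_link=True),   # cross-reference, no Z02
]
pointers = [(2, 1001)]
kept = [h.acc_number for h in delete_unlinked(headings, pointers)]
```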

42 Performance Issues

43 Performance Issues
In order to display the browse list, the system must count the documents which are connected to a heading (Z02 records attached to Z01).

44 Performance Issues
- Pre-14.2 ALEPH: p_manage_10 updates Z01 (z01_number_of_doc) with the number of documents available for each heading.
- 14.2 and higher: the system allows extensive use of base and denied records (per user profile) functionality. That is why the browse list can benefit from being pre-filtered. The system counts the number of records on the fly.

45 Performance Issues
How to speed up the Z02 count when the headings are displayed? Use a count limit: a heading with more records than the number defined in this counter is displayed with + rather than the number itself.
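The benefit of a count limit is that counting can stop early. A minimal Python sketch, in which count_for_display is a hypothetical helper and showing a bare "+" is an assumption about the exact rendering:

```python
from itertools import islice

def count_for_display(doc_pointers, limit: int) -> str:
    """Count the pointers attached to a heading, but never read more than
    limit + 1 of them. Above the limit, return '+' instead of a number."""
    seen = sum(1 for _ in islice(doc_pointers, limit + 1))
    return "+" if seen > limit else str(seen)

small = count_for_display(iter(range(5)), limit=10)
large = count_for_display(iter(range(5000)), limit=10)
```

For the heading with 5000 pointers only 11 are read before the function gives up counting, which is where the time is saved.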

46 Base Filtered Headings (Z0102)

47 Z0102
Pre-14.2 problem: The smaller a logical base is, the more work the system has to do in order to find 20 headings which are in the base to show in the Browse list display.
Solution: There is a new index, Z0102, which ‘divides’ Z01 into sections in accordance with the existing logical bases.

48 Z0102
[Figure: example of a Z0102 record]

49 Z0102 Structure
A Z0102 record is built for each Z01 in a logical base, giving the filing text and sequence. The record does not include pointers to the doc records; this is still done by Z02.
[Diagram: Z01, Z0102]

50 Z0102
When a logical base is being browsed, the system uses the Z0102 table to “decide” whether to display the heading (Z01) without having to retrieve the documents attached to the heading, read them, and then “decide”.
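One way to picture a per-base index is a list sorted by base and filing text, which can be searched directly. This Python sketch is an illustration under that assumption; the tuple layout and the browse function are hypothetical and say nothing about the real Z0102 key structure.

```python
import bisect

def browse(z0102_rows, base: str, start_text: str, page_size: int = 20):
    """Return up to page_size filing texts for one base, starting at
    start_text. z0102_rows must be a sorted list of
    (base, filing_text, z01_sequence) tuples."""
    i = bisect.bisect_left(z0102_rows, (base, start_text))
    page = []
    while i < len(z0102_rows) and z0102_rows[i][0] == base and len(page) < page_size:
        page.append(z0102_rows[i][1])
        i += 1
    return page

rows = sorted([
    ("LAW", "contracts", 3),
    ("LAW", "torts", 7),
    ("SER", "annals", 1),
    ("LAW", "evidence", 5),
])
law_page = browse(rows, "LAW", "d", page_size=2)
ser_page = browse(rows, "SER", "a")
```

Because the rows for a base sit next to each other, a page of headings is read without opening any document records, which is the point the slide makes.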

51 When to use Z0102
- When you have a heavily-used logical base which is less than 30% of the total database
- When you have a moderately-used base which is less than 15% of the total database
- When it is preferable not to have cross-references. (The Browse list from Z0102s has no cross-references.)
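The two size thresholds can be written down as a small Python check. z0102_recommended is a hypothetical helper; the usage labels "heavy" and "moderate" are informal, and the cross-reference consideration is left out because it is a policy choice rather than a number.

```python
def z0102_recommended(base_docs: int, total_docs: int, usage: str) -> bool:
    """Apply the rule of thumb: under 30% of the database for a heavily
    used base, under 15% for a moderately used one."""
    thresholds = {"heavy": 0.30, "moderate": 0.15}
    if usage not in thresholds or total_docs <= 0:
        return False
    return base_docs / total_docs < thresholds[usage]

heavy_20_percent = z0102_recommended(20_000, 100_000, "heavy")
moderate_20_percent = z0102_recommended(20_000, 100_000, "moderate")
```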

52 System Setup Issues
Use UTIL H/1/10 to check if your tab_base_count setup is reasonable. (It displays “Recommended” for any base which is less than 30% of the total database.)
Base     | # Docs | Recommended | Current
EXU01PUB |        | No          | No
EXU_LAW  |        | Yes         | Yes
EXU_SER  |        | No          | Yes

53 Z0102 Setup
tab_base_z defines which bases work with Z0102.

54 Z0102 Creation and Update
Z0102 creation: p_manage_32
- How does p_manage_32 work? p_manage_32 runs on all Z01 records and builds Z0102.
- When to run p_manage_32? p_manage_32 should be run directly after p_manage_02.

55 Z0102 Creation and Update
Z0102 update: p_manage_34
How does p_manage_34 work? p_manage_34 runs on Z01 records that have been "touched" since the last time p_manage_32 or p_manage_34 was run.
Who touches Z01 records?
- Z01-UPDATE-Z0102=‘Y’ means that the Z01 record has been modified and must be processed by p_manage_34.
- Z01-UPDATE-Z0102=‘N’ means that p_manage_34 processing isn’t needed.
- p_manage_02, ue_01 and ue_08 set Z01-UPDATE-Z0102 to "Y".
- p_manage_32 and p_manage_34 set Z01-UPDATE-Z0102 to "N".
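The flag mechanism amounts to a simple "dirty flag" pattern, sketched here in Python. The dictionary keys and both functions are hypothetical names chosen for illustration; only the Y/N semantics come from the slide.

```python
def mark_touched(record: dict) -> None:
    """What an indexing job does when it creates or changes a heading."""
    record["update_z0102"] = "Y"

def incremental_update(z01_records, rebuild_entry):
    """Process only the records flagged 'Y', then reset their flag to 'N'.
    Returns the identifiers of the records that were processed."""
    processed = []
    for rec in z01_records:
        if rec["update_z0102"] == "Y":
            rebuild_entry(rec)            # rebuild the per-base entries
            rec["update_z0102"] = "N"
            processed.append(rec["acc_number"])
    return processed

records = [
    {"acc_number": 1, "update_z0102": "N"},
    {"acc_number": 2, "update_z0102": "N"},
]
mark_touched(records[1])
rebuilt = []
first_run = incremental_update(records, rebuilt.append)
second_run = incremental_update(records, rebuilt.append)
```

The second run finds nothing to do, which is why a nightly incremental job stays cheap compared with a full rebuild over all Z01 records.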

56 Z0102 Creation and Update
When to run p_manage_34? It should be run on a regular basis, e.g. nightly.
Note: Online update of Z0102 is available in version 15.

57 Z0102
Restriction: Z0102 is used only for the Web OPAC browse. Browse in the GUI search is still based on Z01 and Z02.

58 Batch Jobs for AUT Enrichment

59 Batch Jobs for AUT Enrichment
Problem: Background AUT enrichment and correction (ue_08) is very time-consuming. It is especially problematic after initial conversion or re-indexing.
Solution: Batch jobs for AUT enrichment and correction of BIB libraries. These batch jobs replace the background running of ue_08 after a re-creation of the Z01 indexes. ue_08 must be activated after batch indexing is finished.

60 Batch Jobs for AUT Enrichment
- p_manage_102: pre-enriches the BIB Z01 index from the entire AUT library. Creates Z01 records linked to the authority documents.
- p_manage_02: has a new parameter, sw_force_chk, set in aleph_start_505. When sw_force_chk=‘Y’, all Z01 records are created with z01-aut-library="-CHK-".
- p_manage_103: sends Z07 records to enable re-indexing of the records which are linked to non-preferred headings. (Run only if bib records have not had authority work done on them.)

61 Batch Jobs for AUT Enrichment
Step 1. Run p_manage_102.
Result: All Z01 records created by p_manage_102 are linked to the authority library. It is clear that the new headings created by p_manage_02 have no corresponding authority records.
Step 2. Run p_manage_02.
Setup: aleph_start_505: sw_force_chk=‘Y’
Result: All Z01 records are created with z01-aut-library="-CHK-".
Step 3. Run p_manage_103.
Result: Z07 records are created for the bibliographic records which are linked to non-preferred headings.

62 Understanding Indexes
Continued by the NAAUG.INDEX_CH.ppt presentation, which describes Headings features new in 15.2.

