Download presentation
Presentation is loading. Please wait.
Published byMay Campbell Modified over 9 years ago
1
Indexes Lookups or “ the Good, the Bad and the Ugly ” John S. Lemon
2
Indexes What do these have in common ? One person knows – rest of you will have to wait !
3
Indexes What I hope to cover Based on the Aberdeen Maternity and Neo-natal Data Bank ( AMNDB ) History of lookups – Dummy case SQL tabfiles Second data base Second data base + secondary indexes + lookup command
4
Indexes What is a “lookup” Converting text -> numeric code Or numeric code -> text Data entry staff enter ‘uterine adhesions’ Converted to ( and stored as ) 621.5 Why ? Space
5
Indexes Why use a ‘lookup’ Alternatives are storing full text ‘uterine adhesions’ - stored as 19 bytes / characters This is one of the shorter ones !!! Problem of different ‘spelling’ Uterine Adhesions UTERINE ADHESIONS etc. Which to use when searching
6
Indexes Why use a ‘lookup’ Store ‘uterine adhesions’ as 621.5 - stored as 8 bytes less if using integers or fixed length string codes ‘A6215’ - stored as 7 bytes
7
Indexes Why use a ‘lookup’ Value labels - not practical Size limit on label length Record specific Problems managing an increasing list Ordering / Sorting Large numbers of codes:text pairs Occupations 13872 Drugs 5357 Operations 2754 Diseases / Illnesses / Complications 4404
8
Indexes Dummy case Only option in version 2 Use a CASEID which was unique In AMNDB used –999999 This case only contained data for the ‘lookup’ record types It worked but ….
9
Indexes Dummy case - problems Specific to that data base Not shared – unless SIR FILE DUMP- from ‘active’ READ INPUT DATA- to ‘research’ Have to repeat for all data bases that need lookups
10
Indexes Dummy case - problems Can only go ‘one way’ unless duplicate data with different Key Fields RECORD SCHEMA99, ICD2NAME SORT IDSICDCODE RECORD SCHEMA98, NAME2ICD SORT IDSICDNAME Huge problems of maintenance
11
Indexes Dummy case - problems MAX KEY SIZE May have seen in LIST STATS What is it ? Need to have quick look at the ‘structure’ of a SiR record Simplified not definitive view If you want a complete definition – ask Tony !!
12
Indexes SiR record structure Essentially two components Key - fixed length - same for ALL records May not be ‘filled’ for all records Data- variable length 1 2 3 CASEID Max Key Size Data Actual Size
13
Indexes Dummy case - problems Without dummy case Max Key Size - approx 20 bytes long Text field – 50 characters long Max Key Size - approx 60 bytes long Extra 40 bytes For EVERY record 1.5 million * 40 bytes = approx 60 Mbytes Trivial today but in 1988 ……………
14
Indexes Dummy Case – solutions (sic) How to resolve size problems Reduced length of text field – 30 chars One way only Only in one data base – data entry Used ‘marvellous feature’ of sequential data base access SIR3 file held on tape(s) Only PROCESS CASES ALL No CASE IS
15
Indexes Lookups use in AMNDB Two main programmes / suites of programmes Batch run to convert text -> code Interactive programme for each record that requires lookups Need both for different reasons
16
Indexes Lookups use in AMNDB Batch run to convert text -> code Does all record types in data base that has data for looking up Only converts records where text code is already available Marks successful lookups Leave unsuccessful ones for interactive
17
Indexes Lookups use in AMNDB One interactive programme per record with data for looking up Functions in same way as batch except If no match prompts for one of following - Retype Delete Edit Add new lookup code
18
Indexes SQL tabfiles Stuck with Dummy case until SQL tabfiles SAVE TABLE SQL indexes One external file could hold lookups
19
Indexes SQL tabfiles One external file – many advantages One multi-user file for all lookups Simplified maintenance - one file to backup Multiple indexes per table ( cf. record ) Max Key Size problem removed BUT ……………
20
Indexes SQL tabfiles - problems Can only see the ‘whole’ file picture No equivalent of File dump / Add recs Need SQL+ - clumsy PQL programs- not intuitive Could use Forms but not easy Journalling was suspect Ended up using EXPORT for backup Exporting SQL tabfiles was idiosyncratic
21
Indexes SQL tabfiles - problems If tabfile is ‘volatile’ – small frequent updates Tabfile can get ‘corrupt’ Verify ‘drops’ table instead of correcting Updates appeared to work but later find records missing / corrupt Just had to ‘live with it’ ……… and hope for better
22
Indexes Second data base No worries about Max Key Size Multi user Reliable DBMS utilities and functionality UNLOAD Journalling File Dump / Add Recs Easily look at the data But ……………………
23
Indexes Second data base Considered but not used Why ? No alternative ‘views’ / indexes Back to two copies of data RECORD SCHEMA99, ICD2NAME SORT IDSICDCODE RECORD SCHEMA98, NAME2ICD SORT IDSICDNAME No real advantages – ‘devil you know’
24
Indexes What do these have in common ? Still puzzled ??
25
Indexes Two data bases + Secondary Index Then in SiR2002 Two, or more, data bases Secondary Indexes LOOKUP command Decided to ‘go for broke’ Use all three ‘features’ to remove previous problems
26
Indexes Two data bases + Secondary Index Two things spring to mind Can of worms Snail in a well
27
Indexes Two data bases + Secondary Index The can of worms was trying to understand code written many years ago How many of you have ‘revisited’ PQL you wrote 5 years ago ? Can you remember what you were trying to do ?
28
Indexes Two data bases + Secondary Index Do you add Comments to explain for future benefit C Only get records for Males over 60 Use | for ‘inline’ comments. END PROCESS REC | PREGNANCY It was ‘grey hair’ time – yet again !! Not sure which is worse teenage daughters old SiR code
29
Indexes Two data bases + Secondary Index By hard work and perseverance managed to ‘decode’ the old code This time I added Comments as I worked it out !! So that was the can of worms sorted
30
Indexes Two data bases + Secondary Index Just left the snail in the well How does this relate to SiR code Climb 3 feet during day – slip back two feet at night At almost every stage I encountered ASFs ‘Another SiR Feature’
31
Indexes Two data bases + Secondary Index Enormous thanks to Tony for help, aid, assistance and patience Six new versions of SiR within one week Gradually felt that I was ‘climbing’ higher and higher
32
Indexes Two data bases + Secondary Index One day after yet another new version - programme ran OK Message to Tony “I’m out of the well !! “ Even so did some more testing
33
Indexes Two data bases + Secondary Index Two hours later I sent a message using a phrase from this show “I’ve fallen in the water” Yet another problem
34
Indexes Two data bases + Secondary Index Yet again Tony came to the rescue Apart from one bit of my code I need to sort the programmes are working OK So how do we use A second data base Secondary Indexes The LOOKUP command
35
Indexes Two data bases + Secondary Index All lookups are held in a second, caseless data base The key fields are the numeric codes to keep MAX KEY SIZE to minimum Journalling is turned on
36
Indexes Two data bases + Secondary Index The same code might refer to multiple text strings Threatened Abortion Thr Abr TAbr All mean same Use AUTOKEY to cope Secondary indexes on the text strings
37
Indexes Two data bases + Secondary Index Experiences so far Still testing but looks good Faster and more reliable Can look at the data easily Correcting invalid data is easy All the power and features of vPQL and DBMS utilities for maintenance Only one system to learn / remember
38
Indexes Lookups – a summary Dummy case Rigid Inflexible Cumbersome Can use DBMS etc. for maintenance Now luckily replaced
39
Indexes Lookups – a summary SQL tabfiles Very flexible Obtuse No easy maintenance SQL+ is cumbersome Reliability / integrity of tabfiles Better but for long term – flawed Ad-hoc work – re-building every time
40
Indexes Lookups – a summary Final solution with Second data base, Indexes & LOOKUP Reliable PQL is familiar Fast Combines good features of dummy case and tabfiles Perhaps time will tell – but looks good
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.