April 2002
Information Systems Design
John Ogden & John Wordsworth

Database Design
File organisations and indexes
John Wordsworth
Department of Computer Science, The University of Reading
Room 129, Ext 6544

Lecture objectives
Relate tables and their rows to file structures on secondary storage
Review data organisations and access methods
Introduce the different kinds of indexes supported by access methods

Tables and files

Table: Lecturer
StaffNo   Name         Phone   Dept
          Andrews      6789    AG

File: Lecturer
          Andrews      6789    AG
025891    Eaglefield   7890    CS
103001    Irwin        8901    CS
156990    Ogbourne     6543    AG
          Unwin        8765    MU
253825    Yateley      7654    CS
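
One way to picture how rows become records in a file is the following minimal sketch. It assumes fixed-length records with field widths of 6, 10, 4 and 2 bytes; those widths, and the helper names pack_row, unpack_row and read_record, are illustrative assumptions and do not come from the slides.

```python
import struct

# Assumed fixed-length layout: StaffNo(6) Name(10) Phone(4) Dept(2) = 22 bytes.
RECORD = struct.Struct("6s10s4s2s")

def pack_row(staff_no, name, phone, dept):
    """Turn one table row into one fixed-length record (short fields are null-padded)."""
    fields = (staff_no, name, phone, dept)
    return RECORD.pack(*(f.encode("ascii") for f in fields))

def unpack_row(data):
    """Turn one record back into a row, stripping the padding."""
    return tuple(f.rstrip(b"\x00").decode("ascii") for f in RECORD.unpack(data))

rows = [
    ("025891", "Eaglefield", "7890", "CS"),
    ("103001", "Irwin", "8901", "CS"),
    ("156990", "Ogbourne", "6543", "AG"),
]

# The "file" is simply the records laid end to end.
file_image = b"".join(pack_row(*row) for row in rows)

def read_record(n):
    """Record n starts at byte offset n * record size."""
    start = n * RECORD.size
    return unpack_row(file_image[start:start + RECORD.size])
```

With fixed-length records the position of any record can be computed from its number, which is what makes the binary search on an ordered file practical.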

Access methods
Software provided by the operating system.
Mediates between the programmer's logical view of data and the operating system's view of the I/O hardware (disk architecture).
The implementer of the DBMS works with the file organisations provided by the operating system's access methods.

A heap file
Records stored in the order they are added to the file
New records added at the end
Deletions by marking the records
Retrieval for update means a linear search of the file
Needs reorganisation from time to time
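
The heap file behaviour can be sketched in a few lines. This is an in-memory stand-in, not a real access method: the class name HeapFile and its methods are hypothetical, and a record is assumed to be a tuple whose first element is the key.

```python
class HeapFile:
    """In-memory stand-in for a heap file; a real one lives in disk blocks."""

    def __init__(self):
        self.slots = []  # each slot is [deleted_flag, record]

    def insert(self, record):
        # New records always go at the end of the file.
        self.slots.append([False, record])

    def find(self, key):
        # Linear search: every slot may have to be read.
        for pos, (deleted, record) in enumerate(self.slots):
            if not deleted and record[0] == key:
                return pos
        return None

    def delete(self, key):
        pos = self.find(key)
        if pos is not None:
            self.slots[pos][0] = True  # mark only; the space is not reclaimed

    def reorganise(self):
        # Periodic clean-up: drop the slots that were marked as deleted.
        self.slots = [slot for slot in self.slots if not slot[0]]
```

Marked records still occupy space and still have to be skipped during a search, which is why the file needs reorganising from time to time.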

An ordered file
Records stored in the order of a key field (access method might leave gaps)
New records added in place (some reorganisation)
Deletions by marking the records (space can be reused)
Retrieval for update can be done by binary search
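
A minimal sketch of retrieval from an ordered file, assuming the records are held as a list of tuples sorted on the key in the first position. The function name binary_search is illustrative, and the record with key 120000 is invented for the example.

```python
from bisect import insort

def binary_search(records, key):
    """Return the position of the record with this key, or None."""
    lo, hi = 0, len(records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        mid_key = records[mid][0]
        if mid_key == key:
            return mid
        if mid_key < key:
            lo = mid + 1   # key can only be in the upper half
        else:
            hi = mid - 1   # key can only be in the lower half
    return None

records = [
    ("025891", "Eaglefield"),
    ("103001", "Irwin"),
    ("156990", "Ogbourne"),
    ("253825", "Yateley"),
]

# Adding "in place": the new record is inserted at its key position,
# so the records after it have to move (the reorganisation cost).
insort(records, ("120000", "Hypothetical"))
```

Each probe halves the remaining range, so a file of n records needs about log2(n) probes rather than the n/2 expected for a linear search.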

A hash file (1)
[Figure: file space divided into Bucket 0, Bucket 1 and Bucket 2, with an overflow area]

A hash file (2)
The file space is organised into buckets
When a record is to be added, the key is hashed to a bucket number
When a bucket is full, the overflow area(s) have to be used
To retrieve a record, hash the key and search the bucket
Deletions by marking the records (space can be reused)
A good hash algorithm produces a good spread of bucket numbers from the current and future keys
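
The steps above can be sketched as follows. The bucket count, the bucket capacity and the remainder-based hash function are assumptions chosen to keep the example small; a real access method would use its own hashing scheme and disk blocks.

```python
BUCKETS = 3    # assumed number of buckets
CAPACITY = 2   # assumed number of records per bucket

def bucket_number(key):
    # Hypothetical hash: numeric key modulo the number of buckets.
    return int(key) % BUCKETS

class HashFile:
    def __init__(self):
        self.buckets = [[] for _ in range(BUCKETS)]
        self.overflow = []  # a single shared overflow area

    def insert(self, key, record):
        bucket = self.buckets[bucket_number(key)]
        if len(bucket) < CAPACITY:
            bucket.append((key, record))
        else:
            self.overflow.append((key, record))  # home bucket is full

    def find(self, key):
        # Search the home bucket first, then fall back to the overflow area.
        for k, record in self.buckets[bucket_number(key)]:
            if k == key:
                return record
        for k, record in self.overflow:
            if k == key:
                return record
        return None
```

A poor spread of bucket numbers pushes more records into the overflow area, and every overflow lookup degrades towards a linear search.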

A sparse index
[Figure: data file Track 0 and Track 1, with index entries of the form Key / Track]
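
A sparse index holds one entry per track rather than one per record. The sketch below assumes each entry records the highest key on its track, which is one common arrangement; the slide does not say which key is used. The track contents and the function name find are illustrative.

```python
from bisect import bisect_left

# Ordered data file, two records per track (assumed layout).
tracks = [
    ["025891", "103001"],   # track 0
    ["156990", "253825"],   # track 1
]

# Sparse index: one entry per track, holding the highest key on that track.
index_keys = [track[-1] for track in tracks]

def find(key):
    """Return (track, position) for the key, or None if it is absent."""
    t = bisect_left(index_keys, key)  # first track whose highest key >= key
    if t == len(tracks):
        return None                   # key is beyond the last track
    if key in tracks[t]:
        return (t, tracks[t].index(key))  # scan only the one track
    return None
```

The index only works because the data file is ordered: the entry for a track bounds every key stored on it.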

A secondary index
[Figure: data tracks holding records with keys 012345, 103001, 156900, 212212, 253825 and Dept values CS, AG, CS, AG, MU, CS, together with a secondary index on Dept]
Index entries (Key, T, R): AG 0 1; AG 0 0; CS 1 0; CS 2 0; CS 2 1; MU 1 2
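
A secondary index has one entry per record, on a field that is not the ordering key. The sketch below builds such an index on Dept, reading T and R as a track number and a record position within the track; that reading, the track layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Data file ordered on StaffNo, two records per track (assumed layout).
tracks = [
    [("025891", "Eaglefield", "CS"), ("103001", "Irwin", "CS")],
    [("156990", "Ogbourne", "AG"), ("253825", "Yateley", "CS")],
]

def build_secondary_index(tracks, field):
    """Map each value of the chosen field to the (track, record) addresses holding it."""
    index = defaultdict(list)
    for t, track in enumerate(tracks):
        for r, record in enumerate(track):
            index[record[field]].append((t, r))
    return dict(sorted(index.items()))  # keep index entries in key order

dept_index = build_secondary_index(tracks, field=2)

def lookup(dept):
    """Fetch every record for a department by following the index addresses."""
    return [tracks[t][r] for t, r in dept_index.get(dept, [])]
```

Because the data is not ordered on Dept, the index cannot be sparse: every record needs its own entry.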

A multilevel index
[Figure: Key / Track index entries held on index track 0 and index track 1, with a further Key / Track index over the index tracks]
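
When the index itself spans several tracks, a second, smaller index can be built over it. The sketch below assumes two levels and assumes every entry holds the highest key it covers; the keys, track numbers and the function name locate_data_track are illustrative.

```python
from bisect import bisect_left

# Lower level: each index track holds (highest key on data track, data track number).
index_tracks = [
    [("103001", 0), ("156990", 1)],   # index track 0
    [("212212", 2), ("253825", 3)],   # index track 1
]

# Top level: one entry per index track, holding the highest key it covers.
top_level = [entries[-1][0] for entries in index_tracks]

def locate_data_track(key):
    """Return the data track that would hold the key, or None if out of range."""
    i = bisect_left(top_level, key)       # choose the index track
    if i == len(top_level):
        return None
    entries = index_tracks[i]
    keys = [k for k, _ in entries]
    j = bisect_left(keys, key)            # choose the entry within that track
    return entries[j][1]
```

Only one index track is read at each level, so the cost of a lookup grows with the number of levels rather than with the size of the index.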

Key points
Database tables are stored in files on secondary storage
The heap file, ordered file, and hashed file are common file organisations
An access method is an operating system component that supports a file organisation
The DBMS manages the data by using the access methods
Indexes are used with ordered files to improve access