Published by Kerrie Ellis. Modified over 9 years ago.
1
File Organization and Storage Structures Chapter 5
3
Basic Concepts
The database on secondary storage is organized into one or more files, where each file consists of a number of records. Each record consists of one or more fields. Typically, a record corresponds to an entity and a field to an attribute. The physical record is the unit of transfer between disk and primary storage, in either direction. A physical record, sometimes called a block or page, usually contains several logical records, depending on the size of the records.
4
List structures
- Elementary list
- Singular list
- Circular list
- Symmetric list
- Symmetric circular list
5
Sequential insertion
Before: X(1) X(2) X(3) X(4), followed by the free zone
After inserting Y: X’(1)=X(1), X’(2)=Y, X’(3)=X(2), X’(4)=X(3), X’(5)=X(4), followed by the free zone
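The shifting on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `insert_sequential`, the fixed-size list, and the use of `None` for the free zone are assumptions, not part of the slides.

```python
def insert_sequential(cells, used, position, value):
    """Insert value at 0-based index `position` in a fixed-size area.

    cells: list whose first `used` cells are occupied; the rest is the free zone.
    Returns the new number of occupied cells.
    """
    if used >= len(cells):
        raise OverflowError("free zone exhausted")
    if not 0 <= position <= used:
        raise IndexError("position outside the occupied area")
    # Shift every element from `position` onward one cell to the right,
    # starting at the end so nothing is overwritten.
    for i in range(used, position, -1):
        cells[i] = cells[i - 1]
    cells[position] = value
    return used + 1


cells = ["X1", "X2", "X3", "X4", None, None]  # None marks the free zone
used = insert_sequential(cells, 4, 1, "Y")    # insert Y as the second element
print(cells, used)
```

Every element behind the insertion point has to move, so the cost grows with the length of the list.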
6
Insertion with pointer technique
Stored elements (linked by pointers): X(1), X(2), X(3), X(4), Y
Logical order after inserting Y: X’(1)=X(1), X’(2)=Y, X’(3)=X(2), X’(4)=X(3), X’(5)=X(4)
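With pointers, no element has to move: the new element goes into any free cell and two pointers are adjusted. The sketch below assumes an array-based representation with a parallel `nxt` array and `-1` as the end marker; these names and conventions are illustrative, not taken from the slides.

```python
END = -1  # assumed end-of-list marker

values = ["X1", "X2", "X3", "X4", None]  # last cell is free
nxt = [1, 2, 3, END, END]                # nxt[i] is the cell that follows cell i
head = 0


def insert_after(values, nxt, pred, free_cell, value):
    """Store value in free_cell and link it in right after cell `pred`."""
    values[free_cell] = value
    nxt[free_cell] = nxt[pred]  # new cell points to the old successor
    nxt[pred] = free_cell       # predecessor now points to the new cell


def traverse(values, nxt, head):
    """Return the values in logical (pointer) order."""
    out, i = [], head
    while i != END:
        out.append(values[i])
        i = nxt[i]
    return out


insert_after(values, nxt, pred=0, free_cell=4, value="Y")
logical = traverse(values, nxt, head)
print(values)   # physical order is unchanged apart from the filled free cell
print(logical)  # logical order has Y in second place
```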
7
Multi-list structure
[Figure: storage area of records with pointers, record length 10. Three list heads are shown: list1, list2, and the list of empty places. Addresses shown: 2000, 2010, 2020, 2030, 2040, 2050, 2060, 3000, ... Records shown: A, B, K, L.]
8
Insertion at beginning of list 2
[Figure: the storage area with list heads list1 and list2 after record M has been inserted at the beginning of list 2.]
List1: A B
List2: M K L
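A minimal sketch of this insertion, assuming that the new record takes the first cell from the list of empty places. The exact address of each record is not recoverable from the slide, so the address assignment below (A at 2000, B at 2010, and so on, in steps of the record length 10) is hypothetical.

```python
NIL = None  # assumed end-of-list marker

# address -> [record, pointer to next address]; the layout is hypothetical
store = {
    2000: ["A", 2010],
    2010: ["B", NIL],
    2020: ["K", 2030],
    2030: ["L", NIL],
    2040: [None, 2050],  # empty places are chained together as well
    2050: [None, 2060],
    2060: [None, NIL],
}
heads = {"list1": 2000, "list2": 2020, "empty": 2040}


def insert_at_head(store, heads, list_name, record):
    """Take the first empty place and make it the new head of list_name."""
    addr = heads["empty"]
    if addr is NIL:
        raise MemoryError("no empty places left")
    heads["empty"] = store[addr][1]        # unlink the cell from the empty list
    store[addr] = [record, heads[list_name]]
    heads[list_name] = addr
    return addr


def read_list(store, heads, list_name):
    out, addr = [], heads[list_name]
    while addr is not NIL:
        out.append(store[addr][0])
        addr = store[addr][1]
    return out


insert_at_head(store, heads, "list2", "M")
print(read_list(store, heads, "list1"))
print(read_list(store, heads, "list2"))
```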
9
General tree structure
[Figure: general tree with nodes A, B, C, D, E, F, H, J, K, L, M, N, P, Q, R.]
10
Equivalent binary tree structure
[Figure: binary tree with the same nodes A, B, C, D, E, F, H, J, K, L, M, N, P, Q, R.]
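The edges of the tree did not survive in this transcript, so the sketch below uses a small hypothetical tree rather than the one on the slide. It also assumes that "equivalent binary tree" refers to the usual first-child/next-sibling correspondence, where the left pointer of a node leads to its first child and the right pointer to its next sibling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BinNode:
    label: str
    left: Optional["BinNode"] = None   # first child in the general tree
    right: Optional["BinNode"] = None  # next sibling in the general tree


def to_binary(root_label, children_of):
    """Convert a general tree (label -> ordered child labels) to a binary tree."""

    def build(siblings):
        # Build the sibling chain back to front so each node can point
        # to the sibling that follows it.
        head = None
        for label in reversed(siblings):
            head = BinNode(label, left=build(children_of.get(label, [])), right=head)
        return head

    return build([root_label])


# Hypothetical general tree: A has children B, C, D; B has children E, F.
general = {"A": ["B", "C", "D"], "B": ["E", "F"]}
root = to_binary("A", general)
print(root.left.label, root.left.right.label, root.left.left.label)
```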
11
Pointer Implementation
[Figure: pointer implementation of the tree with nodes A, B, C, D, E, F, H, J, K, L, M, N, P, Q, R.]
12
Bi-directional tree
[Figure: tree with an entry point and nodes X, Y, R, S, Z, U, T; several pointer fields hold the value -1.]
Pointers per node:
- first lower
- higher
- next
13
Ring structure
[Figure: ring structure with an entry point and nodes X, Y, Z, U, V, T, R.]
14
File Organization
The physical arrangement of data into records and pages on secondary storage.
Main types:
- Heap or unordered
- Sorted
- Hash
Access method
The steps involved in storing and retrieving records from a file.
15
Sample Data
SUPPLIER file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
16
Hash Files
Addresses 0 to 12; stored records in address order:
S300  Blanchart  30  Paris
S200  Janssens   10  Paris
S500  Adams      30  Athens
S100  De Smet    20  London
S400  Clark      20  London
Hashing techniques
Duplicate handling:
- open addressing
- unchained overflow
- chained overflow
- multiple hashing
Hashing algorithms:
- folding
- mid-square
- division by prime number
Limitations:
- inappropriate for retrieval by value range
- no support for retrieval on the non-hash fields
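The order of the records on this slide is consistent with hashing the numeric part of SNUM by division by the prime 13 (13 addresses, 0 to 12), although the slide does not state the function. The sketch below assumes that function and resolves duplicates by open addressing with linear probing; the helper names are illustrative.

```python
TABLE_SIZE = 13  # prime; matches the 13 addresses 0..12 on the slide


def hash_address(snum):
    """Division method: numeric part of the key modulo a prime (assumed)."""
    return int(snum[1:]) % TABLE_SIZE


def insert(table, record):
    """Open addressing: probe successive addresses until a free one is found."""
    start = hash_address(record[0])
    for step in range(TABLE_SIZE):
        slot = (start + step) % TABLE_SIZE
        if table[slot] is None:
            table[slot] = record
            return slot
    raise OverflowError("hash file is full")


def lookup(table, snum):
    start = hash_address(snum)
    for step in range(TABLE_SIZE):
        record = table[(start + step) % TABLE_SIZE]
        if record is None:
            return None  # an empty address ends the probe sequence
        if record[0] == snum:
            return record
    return None


table = [None] * TABLE_SIZE
for rec in [
    ("S100", "De Smet", 20, "London"),
    ("S200", "Janssens", 10, "Paris"),
    ("S300", "Blanchart", 30, "Paris"),
    ("S400", "Clark", 20, "London"),
    ("S500", "Adams", 30, "Athens"),
]:
    insert(table, rec)

occupied = [(slot, rec[0]) for slot, rec in enumerate(table) if rec is not None]
print(occupied)
```

A lookup on CITY or on a range such as S200 to S400 gets no help from this structure and has to scan every address, which is the limitation listed on the slide.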
17
An Index
- an index provides an ACCESS PATH to the file it is indexing
- a file may have several associated indexes
- the sequential access path is always available
- an index imposes an ordering on the file it is indexing
- it can be used for direct access
- it speeds up retrieval and slows down updating
- it is not the same thing as a key
- it can be built on combinations of fields
- it can be SRA or symbolic
18
Sample Data
SUPPLIER file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
19
Supplier file with index on city
Supplier file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
City-index (entries point into the supplier file): Athens, London, Paris
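A city index of this kind can be sketched as a sorted mapping from each city to the positions of the matching supplier records. Using list positions as a stand-in for record pointers is an assumption made for the example, as is the function name `build_index`.

```python
from collections import defaultdict

suppliers = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]


def build_index(records, field_pos):
    """Map each value of the indexed field to the positions of its records."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field_pos]].append(position)
    return dict(sorted(index.items()))  # index entries are kept in value order


city_index = build_index(suppliers, field_pos=3)
print(city_index)

# Direct access through the index: all suppliers in Paris
paris = [suppliers[p][0] for p in city_index["Paris"]]
print(paris)
```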
20
Supplier file with two indexes
Status-index: 10, 20, 30
City-index: Athens, London, Paris
Supplier file
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
21
Non-dense index
SNUM-index: S2 (block 1), S4 (block 2), S5 (block 3)
SNUM  SNAME      STATUS  CITY
S1    De Smet    20      London
S2    Janssens   10      Paris
S3    Blanchart  30      Paris
S4    Clark      20      London
S5    Adams      30      Athens
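A lookup through a non-dense index can be sketched as follows. The sketch assumes that each index entry holds the highest SNUM stored in its block, so that block 1 holds S1 and S2, block 2 holds S3 and S4, and block 3 holds S5; the slide lists only the index entries, so this block layout is an assumption.

```python
from bisect import bisect_left

# Assumed layout: the file is sorted on SNUM and split into three blocks.
blocks = [
    [("S1", "De Smet", 20, "London"), ("S2", "Janssens", 10, "Paris")],
    [("S3", "Blanchart", 30, "Paris"), ("S4", "Clark", 20, "London")],
    [("S5", "Adams", 30, "Athens")],
]
# One index entry per block (not per record): the highest key in that block.
# Plain string comparison is enough here because all keys have one digit.
index_keys = ["S2", "S4", "S5"]


def find(snum):
    """Pick the block via the index, then scan only that block."""
    block_no = bisect_left(index_keys, snum)  # first entry >= snum
    if block_no == len(index_keys):
        return None  # key is larger than anything in the file
    for record in blocks[block_no]:
        if record[0] == snum:
            return record
    return None


print(find("S3"))
print(find("S6"))
```

Because the index does not list every key, it cannot answer an existence test on its own; the block always has to be read.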
22
Factoring out a field
Supplier file
SNUM  SNAME      STATUS  CITY-pointer
S1    De Smet    20
S2    Janssens   10
S3    Blanchart  30
S4    Clark      20
S5    Adams      30
CITY-file
CITY
Athens
London
Paris
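As a rough sketch of factoring out, the CITY values can be moved to a file of their own, with each supplier record keeping only a pointer to its city. Positions in the CITY file serve as pointers here, which is an assumption; `factor_out` is an illustrative name.

```python
suppliers = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]


def factor_out(records, field_pos):
    """Store each distinct field value once and replace it by a pointer."""
    values = sorted({record[field_pos] for record in records})
    pointer_of = {value: i for i, value in enumerate(values)}
    factored = [
        record[:field_pos] + (pointer_of[record[field_pos]],) + record[field_pos + 1:]
        for record in records
    ]
    return values, factored


city_file, supplier_file = factor_out(suppliers, field_pos=3)
print(city_file)
print(supplier_file[0])

# Following the CITY-pointer restores the original value.
snum, sname, status, city_ptr = supplier_file[0]
print(city_file[city_ptr])
```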
23
Combining Indexing and factoring out
Supplier records:
S1  De Smet    20
S2  Janssens   10
S3  Blanchart  30
S4  Clark      20
S5  Adams      30
City entries: Athens, London, Paris
24
Parent - Child structure
CITY file: Athens, London, Paris
SUPPLIER file:
S1  De Smet    20
S2  Janssens   10
S3  Blanchart  30
S4  Clark      20
S5  Adams      30
25
Fully inverted file
SNAME-index: De Smet -> S1; Janssens -> S2; Blanchart -> S3; Clark -> S4; Adams -> S5
STATUS-index: 10 -> S2; 20 -> S1, S4; 30 -> S3, S5
CITY-index: Athens -> S5; London -> S1, S4; Paris -> S2, S3
Supplier file: S1, S2, S3, S4, S5
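A fully inverted file can be sketched by building one index per non-key field of the sample SUPPLIER data, with each index entry listing the supplier numbers that carry that value. The function name `invert` and the dictionary layout are assumptions made for the example.

```python
FIELDS = ("SNUM", "SNAME", "STATUS", "CITY")

suppliers = [
    ("S1", "De Smet", 20, "London"),
    ("S2", "Janssens", 10, "Paris"),
    ("S3", "Blanchart", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]


def invert(records):
    """Build one index per non-key field: value -> list of SNUMs."""
    inverted = {name: {} for name in FIELDS[1:]}
    for record in records:
        snum = record[0]
        for name, value in zip(FIELDS[1:], record[1:]):
            inverted[name].setdefault(value, []).append(snum)
    return inverted


inverted = invert(suppliers)
print(inverted["STATUS"])
print(inverted["CITY"])

# Combining two indexes: suppliers in Paris with status 30
both = sorted(set(inverted["CITY"]["Paris"]) & set(inverted["STATUS"][30]))
print(both)
```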
26
File organization: Indexed-sequential
Multi-level index blocks:
- top level: Behr, Dooms, Fagin
- next level: Adams, Albert, Behr, Bodoo, Claes, Codd, Dooms, Ernest, Fagin
Data blocks (as far as shown): Ace, Adamo, Adams, Ademar, Aerts, Alan, Albert, Alois, Ball, Behr, Bens, Bodoo
Parameters:
- index block size
- data block size
27
B-tree concept
BALANCED tree
Non-dense index:
- root: [25 144]
- next level: [9 -] [64 100] [196 -]
Dense index:
[1 4 -] [9 16 -] [25 36 49] [64 81 -] [100 121 -] [144 169 -] [196 225 256]
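A search in this structure can be sketched as below. The tree is transcribed from the slide, reading the last key as 256 as on the following slides. The sketch assumes that each key in the non-dense index is the lowest key reachable through the pointer to its right; the nested tuple/list representation is purely illustrative.

```python
from bisect import bisect_right

# Index node: (keys, children). Leaf (dense index block): plain list of keys.
tree = (
    [25, 144],
    [
        ([9], [[1, 4], [9, 16]]),
        ([64, 100], [[25, 36, 49], [64, 81], [100, 121]]),
        ([196], [[144, 169], [196, 225, 256]]),
    ],
)


def search(node, key):
    """Walk from the root to a leaf; return (found, number of index levels)."""
    levels = 0
    while isinstance(node, tuple):
        keys, children = node
        # Keys smaller than keys[0] go to child 0; a key equal to keys[i]
        # belongs to the child on the right of keys[i].
        node = children[bisect_right(keys, key)]
        levels += 1
    return key in node, levels


print(search(tree, 64))
print(search(tree, 32))
```

Because the tree is balanced, every search passes through the same number of index levels, whichever key is requested.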
28
B-tree insertion
The same B-tree after insertion of record 32
Non-dense index:
- root: [64 -]
- next level: [25 -] [144 -]
- next level: [9 -] [36 -] [100 -] [196 -]
Dense index:
[1 4 -] [9 16 -] [25 32 -] [36 49 -] [64 81 -] [100 121 -] [144 169 -] [196 225 256]
29
B-tree deletion
Deletion of 64
Non-dense index:
- root: [25 81]
- next level: [9 -] [36 -] [144 196]
Dense index:
[1 4 -] [9 16 -] [25 32 -] [36 49 -] [81 100 121] [144 169 -] [196 225 256]