Jun-Ki Min
2 Logical and physical data independence allows the user to focus on logical aspects and not to worry about physical details However, the physical structure of the database affects the performance Thus, the physical schema must be carefully designed random access time of DISK -data access time -seek time + rotational delay+ block transfer time Mechanical procedure ◦ Reduce # of disk I/O
3 database Procedure Record request by Request stored record to file manager file manager finds a block containing the record Request a page to disk manager disk manager Order disk I/O DBMS users stored database file manager disk manager Request record resutl Request Stored record Return record Request Page Return page Transfer block DISK I/O OS
4 Support basic I/O service ◦ Reposible of all DISK I/OKnow all physical disk address ◦ A component of operating system Support file manager ◦ file manager can be see disk as sets of pages ◦ In page set, there is free space page set ◦ Each page set has unique page set ID ◦ Each page is unique page number Manage DISK ◦ Mapp page number to physical address support independence of file manager
5 Support that DBMS vies disk as set of stored files stored file ◦ A file is a sequence of records, where each record is a collection of data values (or data items). ◦ A set of same typed (stored) records ◦ Each page set stores several file ◦ Identified by file name or file ID stored record ◦ Identified by record number or record identifier(RID) ◦ It is unique in DISK ◦
6 disk manager’s page management ◦ Function of disk manger File manager performs logical page I/O rather than physical disk I/O Example of University DB ◦ Page set consist of 28 pages ◦ Each record is a page.
7 snocnograde E1: 100C413A E2: 100E412A E3: 200C123B E4: 300C312A E5: 300C324C E6: 300C413A E7: 400C312A E8: 400C324A E9: 400C413B E10: 400E412C E11: 500C312B snosnameyeardept S1: 100 나수영 4 컴퓨터 S2: 200 이찬수 3 전기 S3: 300 정기태 1 컴퓨터 S4: 400 송병길 4 컴퓨터 S5: 500 박종화 2 산공 enroll cnocnameCpoi nt professor C1: C123 프로그래밍 3 김성국 C2: C312 자료 구조 3 황수관 C3: C324 화일 구조 3 이규찬 C4: C413 데이타베이스 3 이일로 C5: E412 반 도 체반 도 체 3 홍봉진 student course
8 Initial: ◦ A free space page set (1 ~ 27) ◦ Except page 0 : directory file manager: insert five student records ◦ disk manager : allocate page 1~5 as “student page set" from free space page set ◦ 4 page sets “studnet"(1~5), “course"(6~10), “enroll"(11~21), “free space" page set (22~27)
9
10 file manager : insert new stuident S6 (sno 600) ◦ disk manager : get first free page (page 22) from free space page set, and add student page set file manager : delete student S2 (sno 200) ◦ disk manager : return page 2( S2) to free space page set file manager : insert new course C6 (E 515) ◦ disk manager : get first free page (page 2), and add course page set file manager : delete student S4 ◦ disk manager : return page 4(S4) to free space page set
11 I : S6 D : S2 I : C6 D : S4 ◦ Physical adjacency is removed
12 Hard to represent logical order of page set as physical adjacency Store control information in page header ◦ next page pointer Physical address of next page next page pointer is maintained by disk manager (file manager does not concern about that)
13 “next page()”
14 disk directory(page set directory) ◦ cylender 0, track 0 ◦ Stores list of all page sets and their first page address disk directory (page 0) 0 Page set Free space student course enroll address
15 stored record management ◦ Support that DBMS does not know page I/O ◦ Support handling records Example ◦ In a page, several records ◦ Logical order of student records follows sno 1) In page p, five student records (S1~ S5) P S1S2S3 S4S5.
16 P S1S3S4 S5S7S9 2) DBMS : request insert S9(sno 900) In page p, S9 stored after S5 3) DBMS : request delete S2 In page p, remove S2 and compact 4) DBMS : request insert S7(sno 700) S7 locates between S5 and S9 shift S9 and insert.
17 RID = offset= location(# of bytes) of stored record from page start If location of record is changed, contents of offset is changed without change of RID At most, two accesses is required to get a record p Record r p4 record R’s RID Page number offset (slot number)
18 File Organization sequential method Index methodHashing method Direct FileMulti-Key FileIndexed sequential File Key sequential File Entry sequential File (pile) multi-list fileInvert File
store order is same to logical order of records ◦ heap or pile : entry-sequence file ◦ general : key-sequence file Entry Sequence file ◦ New records are inserted at the end of the file. ◦ A linear search through the file records is necessary to search for a record. This requires reading and searching half the file blocks on the average, and is hence quite expensive. ◦ Record insertion is quite efficient. ◦ Reading the records in order of a particular field requires sorting the file records.
Slide Key sequential file. ◦ File records are kept sorted by the values of an ordering field. ◦ Insertion is expensive: records must be inserted in the correct order. It is common to keep a separate unordered overflow (or transact ion) file for new records to improve insertion efficiency; this is p eriodically merged with the main ordered file. ◦ A binary search can be used to search for a record on its ordering field val ue. This requires reading and searching log 2 of the file blocks on th e average, an improvement over linear search. ◦ Reading the records in order of the ordering field is quite efficient.
Slide The following table shows the average access time to access a specific record for a given ty pe of file S1(100)S2(200)S3(300) S4(400)S5(500) S4S1S2 S5S3 entry sequence file key sequence file
22 key address K1 K2 K3 Index file Data File indexed file consists of ◦ index file ◦ data file
23 support sequential access and direct access sequential data file ◦ records are sorted ◦ sequential access method index ◦ direct access method
Slide A single-level index is an auxiliary file that m akes it more efficient to search for a record in the data file. The index is usually specified on one field of t he file (although it could be specified on seve ral fields) One form of an index is a file of entries, which is ordered by field value The index is called an access path on the field.
Slide The index file usually occupies considerably less di sk blocks than the data file because its entries are much smaller A binary search on the index yields a pointer to the file record Indexes can also be characterized as dense or spar se ◦ A dense index has an index entry for every search key value (and hence every record) in the data file. ◦ A sparse (or nondense) index, on the other hand, has i ndex entries for only some of the search values
26 primary index ◦ index with primary key ◦ Defined on an ordered data file ◦ The data file is ordered on a key field ◦ with a key, find a unique record secondary index ◦ index with secondary key-non key attribute. ◦ with a value, find several records ◦ ex) name clustering index ◦ The data file is ordered on a non-key field unlike primary index, which requires that the ordering field of the data file have a distinct value for each record. ◦ records having same search key are physically clustered efficient search ◦ In a data file, one clustering index at most. nonclustering index