Data Management for Decision Support Session-5 Prof. Bharat Bhasker.



1 Data Management for Decision Support Session-5 Prof. Bharat Bhasker

2 Server Hardware Architecture Disk Technology -- RAID
–Disk (i.e. I/O) speed has not kept pace with CPU speed
–I/O throughput is the weakest link in the chain
–Disks carry the greatest possibility of failure => loss of data
–What is required?
–A robust, reliable, possibly fail-safe storage mechanism
–Devices with better I/O throughput

3 Server Hardware Architecture Disk Technology -- RAID
Redundant Array of Independent Disks
–Cheap (small) disks can be combined to offer large storage
–Plug and play
–Hot swappable
–Reliability and availability
–Disk block access time = seek time + rotational latency + block transfer time

4 Server Hardware Architecture Disk Technology -- RAID
RAID-1 Disk Mirroring/Shadowing
–Based on VMS shadowing: uses two disks in place of one
–Both disks contain an exact copy of the same data
–It is a constant backup/shadow/mirror
–Requires twice as many disk drives
–The VMS model has a common point of failure
–RAID-1 has an independent drive, controller, and power supply for each copy

5 Server Hardware Architecture Disk Technology -- RAID
RAID-3 Data Striping for Fault Tolerance
–Doesn’t require twice the disks for backup/mirroring
–Based on a parity drive, i.e. one extra drive for recreating the data
–Assume five drives for data; then RAID-3 needs 6 drives
–Striping is done at the byte/bit level
[Figure: five data drives holding 5, 1, 3, 4, 2 and a parity drive holding 15; when one drive is lost (5, 1, ?, 4, 2), its value is recreated from the parity]
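The parity idea on this slide can be sketched in a few lines. The slide's figure appears to use a simple sum as the check value; RAID parity is normally computed with bitwise XOR, which is what this sketch assumes. The function names `parity_block` and `rebuild_missing` are hypothetical, and the one-byte "drives" simply reuse the values 5, 1, 3, 4, 2 from the slide.

```python
from functools import reduce


def parity_block(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single lost block by XOR-ing the survivors with the parity."""
    return parity_block(list(surviving_blocks) + [parity])


# Five data "drives", each holding one byte (values taken from the slide).
data = [bytes([5]), bytes([1]), bytes([3]), bytes([4]), bytes([2])]
parity = parity_block(data)          # stored on the sixth (parity) drive

# Simulate losing the third drive and rebuilding it from the rest.
lost = data[2]
survivors = data[:2] + data[3:]
recovered = rebuild_missing(survivors, parity)
```

XOR works here because XOR-ing a value with itself cancels it out, so combining the parity with every surviving block leaves only the missing one.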

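6 Server Hardware Architecture Disk Technology -- RAID
RAID-4 Data Striping for Fault Tolerance
–Striping is done at the block level
–Better performance
–Assume five drives for data; then RAID-4 needs 6 drives
–Parallel reads from multiple heads
[Figure: five data drives holding 5, 1, 3, 4, 2 and a parity drive holding 15; when one drive is lost (5, 1, ?, 4, 2), its value is recreated from the parity]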

7 Server Hardware Architecture Disk Technology -- RAID
RAID-5 Data Striping for Fault Tolerance
–Striping is done at the block/record-segment level, but parity is rotated across the drives
–In RAID 3/4, all drives are used for reading/writing
–RAID 5 can access as many drives as it needs at the same time to serve different individual read/write requests
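Rotated parity can be illustrated with a small placement function. This is a sketch under an assumed layout: parity starts on the last drive and moves one drive to the left on each successive stripe. Real arrays use several different rotation schemes, and the slide does not say which one it has in mind. `parity_disk` and `stripe_layout` are hypothetical names.

```python
def parity_disk(stripe, num_disks):
    """Index of the drive holding parity for a stripe (assumed right-to-left rotation)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe, num_disks):
    """Mark each drive in a stripe as data ("D") or parity ("P")."""
    p = parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(num_disks)]


# Six drives: over six consecutive stripes every drive takes one turn as parity.
layouts = [stripe_layout(s, 6) for s in range(6)]
for row in layouts:
    print(" ".join(row))
```

Because no single drive holds all the parity, writes to different stripes update parity on different drives instead of queueing behind one dedicated parity drive.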

8 Data Organizations
Operations on organized data
–Find (Locate)
–Read (Get)
–FindNext
–Delete
–Insert
–Modify
–FindAll
–FindOrdered

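9 Data Organizations
Unordered File Organization (b = number of blocks in the file)
–Find: b/2 block accesses on average, O(b)
–Read: O(1)
–Insert: O(1)
–Modify: O(b)
–Delete: O(b)
[Figure: records stored in arrival order: A, v, b, x, c, d, w, e]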

10 Data Organizations
Ordered File Organization (b = number of blocks in the file)
–Find: O(log b)
–Read: O(1)
–Insert: O(b)
–Modify: O(log b)
–Delete: O(log b)
[Figure: records stored in key order: A, b, c, d, f, t, u, v]
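The difference between the O(b) and O(log b) Find costs can be made concrete by counting block reads. This is a minimal sketch that models a file as a list of blocks, each a list of integer keys; `linear_find` and `binary_find` are hypothetical names, and the 3000-block file of 10 records per block is an assumed size chosen for illustration.

```python
def linear_find(blocks, key):
    """Scan blocks front to back; return how many blocks were read."""
    for accesses, block in enumerate(blocks, start=1):
        if key in block:
            return accesses
    return len(blocks)  # key absent: every block was read


def binary_find(blocks, key):
    """Binary search over blocks kept in key order; return how many blocks were read."""
    lo, hi, accesses = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        accesses += 1                 # reading blocks[mid] costs one access
        block = blocks[mid]
        if key < block[0]:
            hi = mid - 1
        elif key > block[-1]:
            lo = mid + 1
        else:
            return accesses           # key, if present, lives in this block
    return accesses


# Assumed file: 30,000 integer keys, 10 per block, so 3000 blocks.
blocks = [list(range(start, start + 10)) for start in range(0, 30000, 10)]

worst_linear = linear_find(blocks, 29999)
worst_binary = max(binary_find(blocks, k) for k in range(0, 30000, 10))
print(worst_linear, worst_binary)
```

The linear scan only works on any file order; the binary search relies on the blocks being sorted, which is also why Insert becomes O(b) for an ordered file.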

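11 Data Organizations
Primary Index
–An ordered file with fixed-length records and two fields: a key field and a block pointer field
–The primary index is built on the ordering key field of the data file
[Figure: ordered data file (A, b, c, d, ..., t) with primary index entries A, f, j, t pointing to data blocks]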
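A primary index lookup can be sketched with Python's standard `bisect` module. The sketch assumes the index holds the first key of each data block, and that an entry's position in the index doubles as the block pointer. `build_primary_index` and `lookup` are hypothetical names, and the sample blocks are made up for illustration.

```python
from bisect import bisect_right


def build_primary_index(blocks):
    """One entry per data block: the first (anchor) key of that block."""
    return [block[0] for block in blocks]


def lookup(blocks, index, key):
    """Return the number of the block containing key, or None if it is absent."""
    pos = bisect_right(index, key) - 1   # last anchor key <= search key
    if pos < 0:
        return None                      # key sorts before the first block
    return pos if key in blocks[pos] else None


# Assumed ordered data file, four records per block.
blocks = [
    ["a", "b", "c", "d"],
    ["f", "g", "h", "i"],
    ["j", "k", "m", "n"],
    ["t", "u", "v", "w"],
]
index = build_primary_index(blocks)

print(lookup(blocks, index, "k"))
print(lookup(blocks, index, "e"))
```

Only the small index is searched; a single data block is then read to confirm whether the key is actually there.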

12 Data Organizations
–Assume 30,000 records, block size = 1024 bytes, and record size R = 100 bytes
–Each block can store floor(1024/100) = 10 records; total blocks b = 30,000/10 = 3000
–In an ordered file, binary search needs ceil(log2(3000)) = 12 block accesses
–Ordering key = 9 bytes and block pointer = 6 bytes, so a primary index entry is 15 bytes
–Index entries per block: floor(1024/15) = 68
–Blocks required to hold 3000 entries: ceil(3000/68) = 45 blocks
–With the index: ceil(log2(45)) = 6 block accesses + 1 for the data block = 7 block accesses
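The arithmetic on this slide can be reproduced with a short function, which also makes it easy to try other block or record sizes. This is a sketch; `primary_index_cost` is a hypothetical name, and the rounding (floor for entries per block, ceiling for block counts and search steps) is an assumption that matches the slide's figures.

```python
import math


def primary_index_cost(num_records, block_size, record_size, key_size, pointer_size):
    """Return (data blocks, accesses without index, index blocks, accesses with index)."""
    records_per_block = block_size // record_size
    data_blocks = math.ceil(num_records / records_per_block)

    # Binary search directly on the ordered data file.
    without_index = math.ceil(math.log2(data_blocks))

    # Primary index: one (key, pointer) entry per data block.
    entries_per_block = block_size // (key_size + pointer_size)
    index_blocks = math.ceil(data_blocks / entries_per_block)

    # Binary search on the index, plus one access for the data block itself.
    with_index = math.ceil(math.log2(index_blocks)) + 1
    return data_blocks, without_index, index_blocks, with_index


print(primary_index_cost(30000, 1024, 100, 9, 6))  # (3000, 12, 45, 7)
```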

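13 Data Organizations
Clustering Index
–An ordered file with fixed-length records and two fields: a key field and a block pointer field
–A clustering index is built on a file that is ordered on a non-key field
[Figure: data file ordered on a non-key field with repeated values (A, A, A, d, d, d, j, j, j, t) and one index entry per distinct value (A, d, j, t) pointing into the file]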

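14 Data Organizations
Secondary Index
–An ordered file with fixed-length records and two fields: a non-ordering field and a block pointer field
–The secondary index is built on a non-ordering field and is dense (one entry per record)
[Figure: dense index entries in sorted order pointing to records in a data file that is not ordered on the indexed field]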

15 Data Organizations
–Assume 30,000 records, block size = 1024 bytes, and record size R = 100 bytes
–Each block can store floor(1024/100) = 10 records; total blocks b = 3000
–In an unordered file, a linear search needs 3000/2 = 1500 block accesses on average
–Indexed field = 9 bytes and block pointer = 6 bytes, so a secondary index entry is 15 bytes
–Index entries per block: floor(1024/15) = 68
–Each record requires an entry
–Blocks required to hold 30,000 entries: ceil(30,000/68) = 442 blocks
–With the index: ceil(log2(442)) = 9 block accesses + 1 for the data block = 10 block accesses
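The same calculation for the dense secondary index can be sketched as follows. `secondary_index_cost` is a hypothetical name, and the rounding rules are assumptions chosen to match the slide's numbers. The only structural change from the primary index case is that the index needs one entry per record rather than one per block.

```python
import math


def secondary_index_cost(num_records, block_size, record_size, field_size, pointer_size):
    """Return (average linear-scan accesses, index blocks, accesses with index)."""
    records_per_block = block_size // record_size
    data_blocks = math.ceil(num_records / records_per_block)

    # No ordering to exploit: on average half the file is scanned.
    average_scan = data_blocks // 2

    # Dense secondary index: one (field value, pointer) entry per record.
    entries_per_block = block_size // (field_size + pointer_size)
    index_blocks = math.ceil(num_records / entries_per_block)

    # Binary search on the index, plus one access for the data block.
    with_index = math.ceil(math.log2(index_blocks)) + 1
    return average_scan, index_blocks, with_index


print(secondary_index_cost(30000, 1024, 100, 9, 6))  # (1500, 442, 10)
```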

16 Data Organizations
–Multiple data pointers per index entry for handling duplicates
–Multi-level index: created by building a primary index on top of the base-level secondary index
–The 442 blocks of ordered index data can be addressed by the primary index mechanism with 68 entries per block: ceil(442/68) = 7 blocks
–ceil(log2(7)) = 3 block accesses to locate the block in the second level + 1 for the base (secondary) level + 1 for the data block = 5 block accesses
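The two-level count on this slide can be checked with a short sketch. `two_level_cost` follows the slide's own arithmetic (binary search over the second-level blocks). `index_levels` is an addition not on the slide: it shows the common textbook variant in which levels are stacked until the top one fits in a single block, so each level costs exactly one access. Both function names are hypothetical.

```python
import math


def two_level_cost(base_index_blocks, entries_per_block):
    """Slide's method: binary-search the second level, then 1 base block + 1 data block."""
    top_blocks = math.ceil(base_index_blocks / entries_per_block)
    search_top = math.ceil(math.log2(top_blocks))
    return top_blocks, search_top + 1 + 1


def index_levels(first_level_entries, entries_per_block):
    """Textbook variant: number of index levels needed until the top level is one block."""
    levels, blocks = 0, first_level_entries
    while True:
        blocks = math.ceil(blocks / entries_per_block)  # blocks at this level
        levels += 1
        if blocks == 1:
            return levels


print(two_level_cost(442, 68))        # (7, 5)
print(index_levels(30000, 68) + 1)    # levels plus one data-block access
```

With 68 entries per block, the 30,000 base entries occupy 442 blocks, the next level 7 blocks, and a third level a single block, so the fully stacked variant reads one block per level plus the data block.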

