Published by Jessie McKinney. Modified over 9 years ago.
1
SYSTEMS DESIGN ANALYSIS
Chapter 17: Data Modeling
Jerry Post
Copyright © 1997
2
Data Files

Disk drives
- Platters spin
- Head moves/rotates
- Tracks
- Sectors

File systems
- Files
- Directories/folders
- File Allocation Table
  - Maps logical files to sectors
  - Every file is split into pieces
  - Pieces are scattered across the disk
- Minimum sector (cluster) size
  - Old DOS: 32K on a 2GB drive
- The drive always retrieves fixed-length sectors.

Directory entry
- File name
- File extension
- File attribute
- Time of update
- Date of update
- Beginning disk cluster/sector
- File size
- Security attributes

[Figure: disk platter, labeled "Disk spins", "Head moves", "Tracks", "Sectors"]
3
Disks: RAID

Speed limitations
- Rotational speed
- Drive head speed
- Bus transfer rates

RAID: Redundant Array of Independent (Inexpensive) Drives
- Store a file across many drives (striping)
- Some deliberate duplication
- Speed through parallel access
- Parallel searches
- The array is seen as one drive

[Figure: a file divided into Sector 1, Sector 2, Sector 3, Sector 4, Sector 5, ... and spread across the drives]
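Striping can be sketched as a round-robin assignment of a file's sectors to drives. This is a simplified illustration with no redundancy (closest to RAID 0); the function name `stripe` and the three-drive setup are assumptions, not part of the slides.

```python
def stripe(sectors, num_drives):
    """Distribute sectors across drives round-robin (simplified, no redundancy)."""
    drives = [[] for _ in range(num_drives)]
    for i, sector in enumerate(sectors):
        # Sector i goes to drive (i mod num_drives), so consecutive
        # sectors sit on different drives and can be read in parallel.
        drives[i % num_drives].append(sector)
    return drives


layout = stripe(["Sector 1", "Sector 2", "Sector 3", "Sector 4", "Sector 5"], 3)
for drive_number, contents in enumerate(layout):
    print(f"Drive {drive_number}: {contents}")
```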
4
Data Files

File types
- Master
  - Sorted
  - Current data totals (e.g., inventory)
- Transaction
  - Change log
  - Updates to master
- Report files (output)
- Initialization files (software)

Need to separate data storage from physical location.
- Backup and restore
- Changes in physical drive

File organization
- Sequential
- Indexed
- Linked lists
- Hashed
5
Sequential Storage

Common uses
- When large portions of the data are always used at one time (e.g., 25%).
- When the table is huge and space is expensive.
- When transporting or converting data to a different system.

ID  LastName   FirstName  DateHired
1   Reeves     Keith      1/29/96
2   Gibson     Bill       3/31/96
3   Reasoner   Katy       2/17/96
4   Hopkins    Alan       2/8/96
5   James      Leisha     1/6/96
6   Eaton      Anissa     8/23/96
7   Farris     Dustin     3/28/96
8   Carpenter  Carlos     12/29/96
9   O'Connor   Jessica    7/23/96
10  Shields    Howard     7/13/96
6
Indexed Sequential

Common uses
- Large tables.
- Need many sequential lists.
- Some random search, with one or two key columns.
- Hold the index in RAM if possible, for speed.

Data file
Address  ID  LastName   FirstName  DateHired
A11      1   Reeves     Keith      1/29/96
A22      2   Gibson     Bill       3/31/96
A32      3   Reasoner   Katy       2/17/96
A42      4   Hopkins    Alan       2/8/96
A47      5   James      Leisha     1/6/96
A58      6   Eaton      Anissa     8/23/96
A63      7   Farris     Dustin     3/28/96
A67      8   Carpenter  Carlos     12/29/96
A78      9   O'Connor   Jessica    7/23/96
A83      10  Shields    Howard     7/13/96

Index on LastName
LastName   Pointer
Carpenter  A67
Eaton      A58
Farris     A63
Gibson     A22
Hopkins    A42
James      A47
O'Connor   A78
Reasoner   A32
Reeves     A11
Shields    A83
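The index idea can be sketched in a few lines: the data stays in its original order, and a separate sorted list of (LastName, address) pairs supports both random lookup and sequential listing. The addresses and rows come from the slide; the dictionary standing in for the data file and the function names are illustrative assumptions.

```python
import bisect

# Data file: address -> row (ID, LastName, FirstName, DateHired), as on the slide.
data_file = {
    "A11": (1, "Reeves", "Keith", "1/29/96"),
    "A22": (2, "Gibson", "Bill", "3/31/96"),
    "A32": (3, "Reasoner", "Katy", "2/17/96"),
    "A42": (4, "Hopkins", "Alan", "2/8/96"),
    "A47": (5, "James", "Leisha", "1/6/96"),
    "A58": (6, "Eaton", "Anissa", "8/23/96"),
    "A63": (7, "Farris", "Dustin", "3/28/96"),
    "A67": (8, "Carpenter", "Carlos", "12/29/96"),
    "A78": (9, "O'Connor", "Jessica", "7/23/96"),
    "A83": (10, "Shields", "Howard", "7/13/96"),
}

# Index: (LastName, address) pairs kept sorted by LastName.
last_name_index = sorted((row[1], address) for address, row in data_file.items())
index_keys = [key for key, _ in last_name_index]


def find_by_last_name(last_name):
    """Random access: binary-search the index, then follow the pointer."""
    i = bisect.bisect_left(index_keys, last_name)
    if i < len(index_keys) and index_keys[i] == last_name:
        return data_file[last_name_index[i][1]]
    return None


def rows_in_last_name_order():
    """Sequential access: walk the index in key order."""
    return [data_file[address] for _, address in last_name_index]


print(find_by_last_name("Farris"))
print([row[1] for row in rows_in_last_name_order()])
```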
7
Linked Lists
- Separate each element/key.
- Pointers to next element.
- Starting point.

Key        Address  Next
Carpenter  B87      B29
Eaton      B29      B71
Farris     B71      B38
Gibson     B38      00
8
Insert into a Linked List
- Get space/location with address.
  - Data: save row (A97).
  - Key: save key and pointer to data (B14).
- Find insert location.
  - Eccles would be after Eaton and before Farris.
- From prior key (Eaton), put next address (B71) into new key, next pointer.
- Put new address (B14) in prior key, next pointer.

Key     Address  Next                Data
Eaton   B29      B71 (becomes B14)   A58
Eccles  B14      B71                 A97
Farris  B71      B38                 A63

  NewData = new (...)
  NewKey = new (...)
  NewKey->Key = "Eccles"
  NewKey->Data = NewData
  FindInsertPoint(List, PriorKey, NewKey)
  NewKey->Next = PriorKey->Next
  PriorKey->Next = NewKey
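The insertion steps can be sketched as runnable code. This is a minimal illustration that uses object references in place of the B-addresses on the slide; the class name `KeyNode` and the helper names are assumptions.

```python
class KeyNode:
    """One key entry: the key, a pointer to its data row, and the next key."""

    def __init__(self, key, data, next_node=None):
        self.key = key
        self.data = data
        self.next = next_node


def insert_sorted(head, new_node):
    """Insert new_node into a key-ordered list and return the (possibly new) head."""
    if head is None or new_node.key < head.key:
        new_node.next = head
        return new_node
    # Find the prior key: the last node whose key sorts before the new key.
    prior = head
    while prior.next is not None and prior.next.key < new_node.key:
        prior = prior.next
    new_node.next = prior.next  # NewKey->Next = PriorKey->Next
    prior.next = new_node       # PriorKey->Next = NewKey
    return head


def keys_in_order(head):
    """Follow the next pointers from the starting point and collect the keys."""
    result = []
    node = head
    while node is not None:
        result.append(node.key)
        node = node.next
    return result


# Build Carpenter -> Eaton -> Farris -> Gibson, with data addresses as labels.
gibson = KeyNode("Gibson", "A22")
farris = KeyNode("Farris", "A63", gibson)
eaton = KeyNode("Eaton", "A58", farris)
head = KeyNode("Carpenter", "A67", eaton)

head = insert_sorted(head, KeyNode("Eccles", "A97"))
print(keys_in_order(head))
```

Only two pointers change during the insert, which is why no existing rows have to be moved.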
9
Direct Access / Hashed
- Convert key value directly to location (relative or absolute).
- Use prime modulus.
  - Choose a prime number greater than the expected database size (n).
  - Divide and use the remainder.
- Set aside spaces (fixed-length) to hold each row.
- Collision/overflow space for duplicates.
- Extremely fast retrieval.
- Very poor sequential access.
- Reorganize if out of space!

Example
- Prime = 101
- Key = 528
- Modulus = 23 (528 = 5 x 101 + 23)
- Overflow/collisions
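A minimal sketch of prime-modulus hashing with a separate overflow area follows. The slide does not specify how overflow is organized, so the simple overflow list here is an assumption, as are the class name `HashedFile` and the second key 629 (chosen only because it also maps to slot 23).

```python
PRIME = 101  # prime larger than the expected number of rows


class HashedFile:
    """Fixed number of slots addressed by key mod prime, plus an overflow list."""

    def __init__(self, prime=PRIME):
        self.prime = prime
        self.slots = [None] * prime  # one fixed-length space per location
        self.overflow = []           # assumed layout: unsorted list of (key, row)

    def location(self, key):
        return key % self.prime

    def put(self, key, row):
        slot = self.location(key)
        if self.slots[slot] is None or self.slots[slot][0] == key:
            self.slots[slot] = (key, row)
            return
        # Collision: a different key already occupies the slot.
        for i, (existing_key, _) in enumerate(self.overflow):
            if existing_key == key:
                self.overflow[i] = (key, row)
                return
        self.overflow.append((key, row))

    def get(self, key):
        entry = self.slots[self.location(key)]
        if entry is not None and entry[0] == key:
            return entry[1]
        for existing_key, row in self.overflow:
            if existing_key == key:
                return row
        return None


table = HashedFile()
table.put(528, "row for key 528")
table.put(629, "row for key 629")  # hypothetical key that collides with 528
print(table.location(528), table.location(629))
print(table.get(528), "|", table.get(629))
```

Retrieval needs one computation and usually one read, but the slots hold rows in hash order, so listing rows in key order means sorting them first.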
10
Why Normalization?
- Need standardized data definition.
- Advantages of a DBMS require careful design.
  - Define data correctly and the rest is much easier.
  - It especially makes it easier to expand the database later.
- Method applies to most models and most DBMSs.
  - Similar to Entity-Relationship.
  - Similar to objects (without inheritance and methods).
- Goal: define tables carefully.
  - Save space
  - Minimize redundancy
  - Protect data
11
Notation
- Table name
- Primary key is underlined
- Table columns

Customer(CustomerID, Phone, Name, Address, City, State, ZipCode)

CustomerID  Phone         LastName    FirstName  Address              City               State  ZipCode
1           502-666-7777  Johnson     Martha     125 Main Street      Alvaton            KY     42122
2           502-888-6464  Smith       Jack       873 Elm Street       Bowling Green      KY     42101
3           502-777-7575  Washington  Elroy      95 Easy Street       Smith's Grove      KY     42171
4           502-333-9494  Adams       Samuel     746 Brown Drive      Alvaton            KY     42122
5           502-474-4746  Rabitz      Victor     645 White Avenue     Bowling Green      KY     42102
6           615-373-4746  Steinmetz   Susan      15 Speedway Drive    Portland           TN     37148
7           615-888-4474  Lasater     Les        67 S. Ray Drive      Portland           TN     37148
8           615-452-1162  Jones       Charlie    867 Lakeside Drive   Castalian Springs  TN     37031
9           502-222-4351  Chavez      Juan       673 Industry Blvd.   Caneyville         KY     42721
10          502-444-2512  Rojo        Maria      88 Main Street       Cave City          KY     42127
12
Sample: Video Database
[Figure with callouts: "Repeating section", "Possible keys"]
13
Initial Objects

Customers
- Key: assign a CustomerID
- Sample properties: Name, Address, Phone

Videos
- Key: assign a MovieID
- Sample properties: Title, RentalPrice, Rating, Description

RentalTransaction (event/relationship)
- Key: assign TransactionID
- Sample properties: CustomerID, Date

VideosRented (event/repeating list)
- Keys: TransactionID + MovieID
- Sample properties: VideoCopy#
14
Initial Form Evaluation
- Collect forms from users.
- Write down properties.
- Find repeating groups: ( ... )
- Look for potential keys (underlined).
- Identify computed values.
- Notation makes it easier to identify and solve problems.
- Results are equivalent to diagrams, but will fit on one or two pages.

RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode, (VideoID, Copy#, Title, Rent))
15
Problems with Repeating Sections

RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode, (VideoID, Copy#, Title, Rent))

TransID  RentDate  CustomerID  LastName    Phone         Address             VideoID  Copy#  Title                  Rent
1        4/18/95   3           Washington  502-777-7575  95 Easy Street      1        2      2001: A Space Odyssey  $1.50
1        4/18/95   3           Washington  502-777-7575  95 Easy Street      6        3      Clockwork Orange       $1.50
2        4/30/95   7           Lasater     615-888-4474  67 S. Ray Drive     8        1      Hopscotch              $1.50
2        4/30/95   7           Lasater     615-888-4474  67 S. Ray Drive     2        1      Apocalypse Now         $2.00
2        4/30/95   7           Lasater     615-888-4474  67 S. Ray Drive     6        1      Clockwork Orange       $1.50
3        4/18/95   8           Jones       615-452-1162  867 Lakeside Drive  9        1      Luggage Of The Gods    $2.50
3        4/18/95   8           Jones       615-452-1162  867 Lakeside Drive  15       1      Fabulous Baker Boys    $2.00
3        4/18/95   8           Jones       615-452-1162  867 Lakeside Drive  4        1      Boy And His Dog        $2.50
4        4/18/95   3           Washington  502-777-7575  95 Easy Street      3        1      Blues Brothers         $2.00
4        4/18/95   3           Washington  502-777-7575  95 Easy Street      8        1      Hopscotch              $1.50
4        4/18/95   3           Washington  502-777-7575  95 Easy Street      13       1      Surf Nazis Must Die    $2.50
4        4/18/95   3           Washington  502-777-7575  95 Easy Street      17       1      Witches of Eastwick    $2.00

Callouts: the VideoID, Copy#, Title, Rent columns are the repeating section; the repeated customer columns show that it causes duplication.

Storing data in this raw form would not work very well. For example, repeating sections will cause problems. Note the duplication of data. Also, what if a customer has not yet checked out a movie? Where do we store that customer's data?
16
Problems with Repeating Sections
- Store repeating data
- Allocate space
  - How much?
  - Can't be short
  - Wasted space
- e.g., How many videos will be rented at one time?
- A better definition eliminates this problem.

Customer Rentals form (not in first normal form)
Name, Phone, Address, City, State, ZipCode

    VideoID  Copy#  Title             Rent
1.  6        1      Clockwork Orange  1.50
2.  8        2      Hopscotch         1.50
3.
4.
5.
{Unused space}
17
First Normal Form
- Remove repeating sections.
- Split into two tables.
- Bring key from main and repeating section.

RentalLine(TransID, VideoID, Copy#, ...)
- Each transaction can have many videos (key VideoID).
- Each video can be rented on many transactions (key TransID).
- For each TransID and VideoID, only one Copy# (no key on Copy#).

Before:
RentalForm(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode, (VideoID, Copy#, Title, Rent))

After:
RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)
RentalLine(TransID, VideoID, Copy#, Title, Rent)
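The split into RentalForm2 and RentalLine can be sketched on two of the sample transactions. The nested-dictionary layout of the raw form, the `Videos` field name, and the function name are assumptions made for illustration; only a few customer columns are shown to keep the example short.

```python
# Raw forms with a repeating section (a nested list of videos per transaction).
rental_forms = [
    {"TransID": 1, "RentDate": "4/18/95", "CustomerID": 3, "Name": "Washington",
     "Videos": [(1, 2, "2001: A Space Odyssey", 1.50),
                (6, 3, "Clockwork Orange", 1.50)]},
    {"TransID": 2, "RentDate": "4/30/95", "CustomerID": 7, "Name": "Lasater",
     "Videos": [(8, 1, "Hopscotch", 1.50),
                (2, 1, "Apocalypse Now", 2.00),
                (6, 1, "Clockwork Orange", 1.50)]},
]


def to_first_normal_form(forms):
    """Split each form into one header row and one row per repeating entry."""
    rental_form2 = []
    rental_line = []
    for form in forms:
        # Main table: everything except the repeating section.
        header = {name: value for name, value in form.items() if name != "Videos"}
        rental_form2.append(header)
        # Repeating table: carry the main key (TransID) into every line.
        for video_id, copy_number, title, rent in form["Videos"]:
            rental_line.append({"TransID": form["TransID"], "VideoID": video_id,
                                "Copy#": copy_number, "Title": title, "Rent": rent})
    return rental_form2, rental_line


rental_form2, rental_line = to_first_normal_form(rental_forms)
for row in rental_line:
    print(row)
```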
18
First Normal Form Problems (Data)
- 1NF splits repeating groups.
- Still have problems:
  - Replication
  - Hidden dependency: if a video has not been rented yet, then what is its title?

RentalForm2
TransID  RentDate  CustID  Phone         LastName    FirstName  Address             City               State  ZipCode
1        4/18/95   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
2        4/30/95   7       615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
3        4/18/95   8       615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
4        4/18/95   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171

RentalLine
TransID  VideoID  Copy#  Title                  Rent
1        1        2      2001: A Space Odyssey  $1.50
1        6        3      Clockwork Orange       $1.50
2        8        1      Hopscotch              $1.50
2        2        1      Apocalypse Now         $2.00
2        6        1      Clockwork Orange       $1.50
3        9        1      Luggage Of The Gods    $2.50
3        15       1      Fabulous Baker Boys    $2.00
3        4        1      Boy And His Dog        $2.50
4        3        1      Blues Brothers         $2.00
4        8        1      Hopscotch              $1.50
4        13       1      Surf Nazis Must Die    $2.50
4        17       1      Witches of Eastwick    $2.00
19
Second Normal Form Definition
- Each non-key column must depend on the entire key.
  - Only applies to concatenated keys.
  - Some columns only depend on part of the key.
  - Split those into a new table.
- Dependence (definition)
  - If, given a value for the key, you always know the value of the property in question, then that property is said to depend on the key.
  - If you change part of a key and the questionable property does not change, then the table is not in 2NF.

RentalLine(TransID, VideoID, Copy#, Title, Rent)
- Title, Rent: depend only on VideoID
- Copy#: depends on both TransID and VideoID
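The definition of dependence can be checked mechanically against sample rows. The sketch below uses a few RentalLine rows from the sample data; the function name `depends_on` is an assumption. Sample data can only refute a dependency, never prove one, because the real answer comes from the business rules.

```python
rental_line = [
    {"TransID": 1, "VideoID": 1, "Copy#": 2, "Title": "2001: A Space Odyssey", "Rent": 1.50},
    {"TransID": 1, "VideoID": 6, "Copy#": 3, "Title": "Clockwork Orange", "Rent": 1.50},
    {"TransID": 2, "VideoID": 8, "Copy#": 1, "Title": "Hopscotch", "Rent": 1.50},
    {"TransID": 2, "VideoID": 2, "Copy#": 1, "Title": "Apocalypse Now", "Rent": 2.00},
    {"TransID": 2, "VideoID": 6, "Copy#": 1, "Title": "Clockwork Orange", "Rent": 1.50},
]


def depends_on(rows, determinant, attribute):
    """True if every value of the determinant columns maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        # setdefault stores the first value seen for this key and returns it;
        # any later row with a different value breaks the dependency.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


# Title is determined by VideoID alone, which is only part of the key.
print(depends_on(rental_line, ["VideoID"], "Title"))
# Copy# is not determined by VideoID alone; it needs the whole key.
print(depends_on(rental_line, ["VideoID"], "Copy#"))
print(depends_on(rental_line, ["TransID", "VideoID"], "Copy#"))
```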
20
Second Normal Form Example
- Title depends only on VideoID.
  - Each VideoID can have only one title.
- Rent depends on VideoID.
  - This statement is actually a business rule.
  - It might be different at different stores.
  - Some stores might charge a different rent for each video depending on the day (or time).
- Each non-key column depends on the whole key.

Before:
RentalLine(TransID, VideoID, Copy#, Title, Rent)

After:
VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)
21
Second Normal Form Example (Data)

VideosRented(TransID, VideoID, Copy#)
TransID  VideoID  Copy#
1        1        2
1        6        3
2        2        1
2        6        1
2        8        1
3        4        1
3        9        1
3        15       1
4        3        1
4        8        1
4        13       1
4        17       1

Videos(VideoID, Title, Rent)
VideoID  Title                        Rent
1        2001: A Space Odyssey        $1.50
2        Apocalypse Now               $2.00
3        Blues Brothers               $2.00
4        Boy And His Dog              $2.50
5        Brother From Another Planet  $2.00
6        Clockwork Orange             $1.50
7        Gods Must Be Crazy           $2.00
8        Hopscotch                    $1.50

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode) (Unchanged)
22
Second Normal Form Problems (Data)
- Even in 2NF, problems remain:
  - Replication
  - Hidden dependency: if a customer has not rented a video yet, where do we store their personal data?
- Solution: split the table.

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)
TransID  RentDate  CustID  Phone         LastName    FirstName  Address             City               State  ZipCode
1        4/18/95   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
2        4/30/95   7       615-888-4474  Lasater     Les        67 S. Ray Drive     Portland           TN     37148
3        4/18/95   8       615-452-1162  Jones       Charlie    867 Lakeside Drive  Castalian Springs  TN     37031
4        4/18/95   3       502-777-7575  Washington  Elroy      95 Easy Street      Smith's Grove      KY     42171
23
Third Normal Form Definition
- Each non-key column must depend on nothing but the key.
  - Some columns depend on columns that are not part of the key.
  - Split those into a new table.
  - Example: a customer's name does not change for every transaction.
- Dependence (definition)
  - If, given a value for the key, you always know the value of the property in question, then that property is said to depend on the key.
  - If you change the key and the questionable property does not change, then the table is not in 3NF.

RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)
- Phone, Name, Address, City, State, ZipCode: depend only on CustomerID
- RentDate, CustomerID: depend on TransID
24
Third Normal Form Example
- Customer attributes depend only on CustomerID.
  - Split them into a new table (Customers).
  - Remember to leave CustomerID in the Rentals table. We need to be able to reconnect the tables.
- 3NF is sometimes easier to see if you identify primary objects at the start. Then you would recognize that Customer was a separate object.

Before:
RentalForm2(TransID, RentDate, CustomerID, Phone, Name, Address, City, State, ZipCode)

After:
Rentals(TransID, RentDate, CustomerID)
Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)
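The 3NF split amounts to two projections of RentalForm2, each with duplicate rows removed. In this sketch the helper name `project` is an assumption, and the rows carry only a subset of the customer columns (with the last name standing in for Name) to keep the example short.

```python
rental_form2 = [
    {"TransID": 1, "RentDate": "4/18/95", "CustomerID": 3, "Phone": "502-777-7575", "Name": "Washington"},
    {"TransID": 2, "RentDate": "4/30/95", "CustomerID": 7, "Phone": "615-888-4474", "Name": "Lasater"},
    {"TransID": 3, "RentDate": "4/18/95", "CustomerID": 8, "Phone": "615-452-1162", "Name": "Jones"},
    {"TransID": 4, "RentDate": "4/18/95", "CustomerID": 3, "Phone": "502-777-7575", "Name": "Washington"},
]


def project(rows, columns):
    """Keep only the given columns and drop duplicate rows, preserving order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[column] for column in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# CustomerID stays in Rentals so the two tables can be reconnected.
rentals = project(rental_form2, ["TransID", "RentDate", "CustomerID"])
customers = project(rental_form2, ["CustomerID", "Phone", "Name"])

print(len(rentals), "rental rows")
print(len(customers), "customer rows")
```

Customer 3 appears in two transactions but is stored once in `customers`, which removes the replication noted for 2NF.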
25
Third Normal Form Example Data

Rentals(TransID, RentDate, CustomerID)
TransID  RentDate  CustomerID
1        4/18/95   3
2        4/30/95   7
3        4/18/95   8
4        4/18/95   3

Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)
CustomerID  Phone         LastName    FirstName  Address              City               State  ZipCode
1           502-666-7777  Johnson     Martha     125 Main Street      Alvaton            KY     42122
2           502-888-6464  Smith       Jack       873 Elm Street       Bowling Green      KY     42101
3           502-777-7575  Washington  Elroy      95 Easy Street       Smith's Grove      KY     42171
4           502-333-9494  Adams       Samuel     746 Brown Drive      Alvaton            KY     42122
5           502-474-4746  Rabitz      Victor     645 White Avenue     Bowling Green      KY     42102
6           615-373-4746  Steinmetz   Susan      15 Speedway Drive    Portland           TN     37148
7           615-888-4474  Lasater     Les        67 S. Ray Drive      Portland           TN     37148
8           615-452-1162  Jones       Charlie    867 Lakeside Drive   Castalian Springs  TN     37031
9           502-222-4351  Chavez      Juan       673 Industry Blvd.   Caneyville         KY     42721
10          502-444-2512  Rojo        Maria      88 Main Street       Cave City          KY     42127

(Unchanged)
VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)
26
Third Normal Form Tables (3NF)

Rentals(TransID, RentDate, CustomerID)
Customers(CustomerID, Phone, Name, Address, City, State, ZipCode)
VideosRented(TransID, VideoID, Copy#)
Videos(VideoID, Title, Rent)
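One way to express the four 3NF tables is as a relational schema. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the column types, the `REFERENCES` clauses, and the column name `CopyNum` (standing in for Copy#) are assumptions that the slides do not specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces REFERENCES clauses only when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,
    Phone TEXT, Name TEXT, Address TEXT, City TEXT, State TEXT, ZipCode TEXT
);
CREATE TABLE Videos (
    VideoID INTEGER PRIMARY KEY,
    Title TEXT,
    Rent REAL
);
CREATE TABLE Rentals (
    TransID INTEGER PRIMARY KEY,
    RentDate TEXT,
    CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID)
);
CREATE TABLE VideosRented (
    TransID INTEGER NOT NULL REFERENCES Rentals(TransID),
    VideoID INTEGER NOT NULL REFERENCES Videos(VideoID),
    CopyNum INTEGER,
    PRIMARY KEY (TransID, VideoID)  -- concatenated key
);
""")

# Load the first sample transaction.
conn.execute("INSERT INTO Customers (CustomerID, Phone, Name) "
             "VALUES (3, '502-777-7575', 'Washington')")
conn.execute("INSERT INTO Videos VALUES (1, '2001: A Space Odyssey', 1.50)")
conn.execute("INSERT INTO Rentals VALUES (1, '4/18/95', 3)")
conn.execute("INSERT INTO VideosRented VALUES (1, 1, 2)")

# The concatenated key allows only one row per (TransID, VideoID).
duplicate_rejected = False
try:
    conn.execute("INSERT INTO VideosRented VALUES (1, 1, 3)")
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("Duplicate (TransID, VideoID) rejected:", duplicate_rejected)
```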
27
Checking Your Work (Quality Control)
- Look for one-to-many relationships.
  - The many side should be keyed (underlined), e.g., VideosRented(TransID, VideoID, ...).
  - Check each column and ask if it should be 1:1 or 1:M.
  - If you add a key, renormalize.
- Verify no repeating sections (1NF).
- Check 3NF.
  - Check each column and ask: does it depend on the whole key and nothing but the key?
- Verify that the tables can be reconnected (joined) to form the original tables (draw lines).
- Each table represents one object.
- Enter sample data and look for replication.
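The reconnection check can be sketched by joining the normalized tables and comparing the result with rows of the original form. The sketch uses a small subset of the sample data; the dictionary layouts and the function name `rejoin` are assumptions for illustration.

```python
# Normalized tables (subset of the sample data).
customers = {3: "Washington", 7: "Lasater"}        # CustomerID -> Name
videos = {1: ("2001: A Space Odyssey", 1.50),      # VideoID -> (Title, Rent)
          6: ("Clockwork Orange", 1.50),
          8: ("Hopscotch", 1.50)}
rentals = {1: ("4/18/95", 3), 2: ("4/30/95", 7)}   # TransID -> (RentDate, CustomerID)
videos_rented = [(1, 1, 2), (1, 6, 3), (2, 8, 1), (2, 6, 1)]  # (TransID, VideoID, Copy#)


def rejoin():
    """Follow the keys from VideosRented to rebuild the original form rows."""
    rows = []
    for trans_id, video_id, copy_number in videos_rented:
        rent_date, customer_id = rentals[trans_id]
        title, rent = videos[video_id]
        rows.append((trans_id, rent_date, customer_id, customers[customer_id],
                     video_id, copy_number, title, rent))
    return rows


# Rows as they appeared on the original, unnormalized form.
original_form_rows = [
    (1, "4/18/95", 3, "Washington", 1, 2, "2001: A Space Odyssey", 1.50),
    (1, "4/18/95", 3, "Washington", 6, 3, "Clockwork Orange", 1.50),
    (2, "4/30/95", 7, "Lasater", 8, 1, "Hopscotch", 1.50),
    (2, "4/30/95", 7, "Lasater", 6, 1, "Clockwork Orange", 1.50),
]

tables_reconnect = sorted(rejoin()) == sorted(original_form_rows)
print("Tables reconnect to the original form:", tables_reconnect)
```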