Chapter 2 Simple File Storage and Retrieval

Chapter 2 Simple File Storage and Retrieval Fundamentals of Database Management Systems by Mark L. Gillenson, Ph.D. University of Memphis Presentation by: Amita Goyal Chin, Ph.D. Virginia Commonwealth University John Wiley & Sons, Inc.

Chapter Objectives Discuss the nature of data. Define data-related terms such as entity and attribute. Define storage-related terms such as field, record, and file.

Chapter Objectives Identify the four basic operations performed on stored data. Compare sequential access of data with direct access of data. Describe how a disk device works.

Chapter Objectives Describe the principles of file organizations and access methods. Describe how simple linear indexes and B+-tree indexes work. Describe how hashed files work.

What is Data? A single piece of data is a single fact about something that interests us. A fact can be any characteristic of an object.

Records and Files Entity - a “thing” or “object” in our environment that we want to keep track of. Entity set - A collection of entities of the same type (e.g., all of the company’s employees).

Records and Files Attribute - a property of, a characteristic of, or a fact that we know about an entity. Some attributes have unique values within an entity set.

Records and Files key field Record - each row of a structure like above Fields - the columns, representing the facts File - the entire structure

Records and Files Record type - a structural description of each and every record in the file Record occurrence / Record instance - a specific record of the salesperson file

Retrieving and Manipulating Data Four fundamental operations can be performed on stored data: Retrieve or Read - looking at a record’s contents without changing it Insert - adding a new record to the file, as when a new salesperson is hired Delete - deleting a record from the file, as when a salesperson leaves the company Update - changing one or more of a record’s field values
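
The four operations can be sketched against a small in-memory stand-in for the salesperson file. This is a minimal illustration, not the textbook's implementation: the field names, record values, and function names are all hypothetical.

```python
# Hypothetical salesperson file: a list of records, each a dict of fields.
salesperson_file = [
    {"number": 119, "name": "Taylor", "city": "New York"},
    {"number": 137, "name": "Baker", "city": "Detroit"},
]

def retrieve(file, number):
    """Read: return the matching record without changing it, or None."""
    for record in file:
        if record["number"] == number:
            return record
    return None

def insert(file, record):
    """Insert: add a new record to the file."""
    file.append(record)

def delete(file, number):
    """Delete: remove the matching record. Return True if one was removed."""
    for i, record in enumerate(file):
        if record["number"] == number:
            del file[i]
            return True
    return False

def update(file, number, **changes):
    """Update: change one or more field values of the matching record."""
    record = retrieve(file, number)
    if record is None:
        return False
    record.update(changes)
    return True
```

For example, insert(salesperson_file, {"number": 186, "name": "Adams", "city": "Dallas"}) models hiring a new salesperson, and delete(salesperson_file, 119) models one leaving the company.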

Data Retrieval Method Sequential access - the retrieval of all or a portion of the records of a file one after another, in some sequence, starting from the beginning, until all of the required records have been retrieved. Physical sequential access - records are retrieved, one after the other, just as they are stored on the disk device. Logical sequential access - records are retrieved in an order based on the values of one or a combination of the fields.
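
The difference between the two kinds of sequential access can be shown with a short sketch. The records and function names below are hypothetical, and a Python list stands in for the order in which records sit on the disk device.

```python
# Hypothetical records, listed in the order they are physically stored.
records = [
    {"number": 186, "name": "Adams"},
    {"number": 119, "name": "Taylor"},
    {"number": 137, "name": "Baker"},
]

def physical_sequential(file):
    """Yield records one after another, exactly as they are stored."""
    yield from file

def logical_sequential(file, field):
    """Yield records in order of the values of one field."""
    yield from sorted(file, key=lambda record: record[field])
```

Physical sequential access returns the records as 186, 119, 137, while logical sequential access on the name field returns Adams, Baker, Taylor.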

Data Retrieval Method Direct Access - the retrieval of a single record of a file or a subset of the records of a file based on one or more values of a field or a combination of fields in the file. a crucial concept in information systems today requires hardware storage device that will accommodate direct access requires software that will take advantage of the hardware’s capabilities and store and retrieve the data in such a way that it accomplishes direct access.

Disk Storage Primary (Main) Memory - where computers execute programs and process data Very fast Permits direct access Has several drawbacks relatively expensive not transportable is volatile

Disk Storage Secondary Memory - stores the vast volume of data and the programs that process them Data is loaded from secondary memory into primary memory when required for processing.

Primary and Secondary Memory When a person needs some particular information that’s not in her brain at the moment, she finds a book in the library that has the information and, by reading it, transfers the information from the book into her brain.

How Disk Storage Works Disks come in a variety of types and capacities 3.5” diskettes hold 1.44 MB on a single plastic disk or platter Large, multi-platter, aluminum or ceramic disk units Provide a direct access capability to the data.

How Disk Storage Works PC diskettes are designed to be removable. Fixed or hard disk drives in PCs are designed to be nonremovable.

How Disk Storage Works Several disk platters are stacked together, and mounted on a central spindle, with some space in between them. Referred to as “the disk.”

How Disk Storage Works The platters have a metallic coating that can be magnetized, and this is how the data is stored, bit-by-bit.

Access Arm Mechanism The basic disk drive has one access arm mechanism with arms that can reach in between the disks. At the end of each arm are two read/write heads. The platters spin, all together as a single unit, on the central spindle, at a high velocity.

Tracks Concentric circles on which data is stored, serially by bit. Numbered track 0, track 1, track 2, and so on.

Cylinders A collection of tracks, one from each recording surface, one directly above the other. Number of cylinders in a disk = number of tracks on any one of its recording surfaces.

Cylinders The collection of each surface’s track 76, one above the other, seems to take the shape of a cylinder. This collection of tracks is called cylinder 76.

Cylinders Once we have established a cylinder, it is also necessary to number the tracks within the cylinder. Cylinder 76’s tracks.

Steps in Finding and Transferring Data Seek Time - The time it takes to move the access arm mechanism to the correct cylinder from whichever cylinder it is currently positioned over. Head Switching - Selecting the read/write head to access the required track of the cylinder. Rotational Delay - Waiting for the desired data on the track to arrive under the read/write head as the disk is spinning.

Steps in Finding and Transferring Data Transfer Time - The time to actually move the data from the disk to primary memory once the previous 3 steps have been completed.
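
The four steps add up to the total time needed to fetch one piece of data. The sketch below is a rough estimate under stated assumptions: the average rotational delay is taken to be half of one revolution (a common approximation that the slides do not state), and every number in the example is hypothetical.

```python
def access_time_ms(seek_ms, head_switch_ms, rpm, transfer_ms):
    """Estimate total access time in milliseconds for one request."""
    # One full revolution, in milliseconds, at the given rotation speed.
    revolution_ms = 60_000 / rpm
    # Assumption: on average the data is half a revolution away.
    rotational_delay_ms = revolution_ms / 2
    return seek_ms + head_switch_ms + rotational_delay_ms + transfer_ms
```

With a hypothetical 8 ms seek, negligible head switching, a 7,200 rpm spindle, and a 0.5 ms transfer, the estimate is about 12.7 ms, most of it spent on mechanical movement rather than on moving the data.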

File Organizations and Access Methods File Organization - the way that we store the data for subsequent retrieval. Access Method - The way that we retrieve the data, based on it being stored in a particular file organization.

Achieving Direct Access Index - a tool for locating records by the value of a field. Hashing method - a way of storing and retrieving records. If we know the value of a field of a record that we want to retrieve, the index or hashing method will pinpoint its location in the file and instruct the hardware mechanisms of the disk device where to find it.

The Index The principle is the same as that governing the index in the back of a book.

The Index The items of interest are copied over into the index, but the original text is not disturbed in any way. The items in the index are sorted. Each item in the index is associated with a “pointer.”

Simple Linear Index Index is ordered by Salesperson Name field. The first index record shows Adams 3 because the record of the Salesperson file with salesperson name Adams is at relative record location 3 in the Salesperson file.
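
A simple linear index can be sketched as a sorted list of (field value, relative record location) pairs. The file contents and function names below are hypothetical, apart from the Adams entry at location 3, and the binary search via the standard bisect module is one reasonable way to exploit the sorted order.

```python
import bisect

# Hypothetical salesperson file; list position + 1 is the relative record location.
salesperson_file = [
    {"number": 119, "name": "Taylor"},   # location 1
    {"number": 137, "name": "Baker"},    # location 2
    {"number": 186, "name": "Adams"},    # location 3
    {"number": 204, "name": "Dickens"},  # location 4
]

def build_index(file, field):
    """Copy the field values into a sorted list of (value, location) pairs."""
    return sorted((rec[field], loc) for loc, rec in enumerate(file, start=1))

def lookup(index, file, value):
    """Binary-search the index, then follow the pointer into the file."""
    keys = [key for key, _ in index]
    i = bisect.bisect_left(keys, value)
    if i < len(index) and index[i][0] == value:
        location = index[i][1]
        return file[location - 1]
    return None
```

Building the index over the name field puts ("Adams", 3) first, and the file itself is left undisturbed.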

Simple Linear Index An index built over the City field. An index can be built over a field with nonunique values.

Simple Linear Index An index built over the Salesperson Number field. Indexed sequential file - the file is stored on the disk in order based on a set of field values (salesperson numbers), and an index is built over that same field.

Simple Linear Index

Simple Linear Index A new index record, French 8, would have to be inserted between the index records for Dickens and Green to maintain the crucial alphabetic sequence. All of the index records from Green to Taylor would have to be moved down one record position. For this reason, a simple linear index is not a good solution for indexing the records of a file.
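
The cost of that insertion can be made concrete with a short sketch. The index entries and their locations are hypothetical apart from French 8, and the function name is illustrative; the return value counts how many existing index records had to move down.

```python
import bisect

# Hypothetical linear index over the name field: (name, relative record location).
index = [("Adams", 3), ("Baker", 2), ("Dickens", 4), ("Green", 5), ("Taylor", 1)]

def insert_index_record(index, key, location):
    """Insert in sorted position; return how many index records were shifted."""
    position = bisect.bisect_left(index, (key, location))
    shifted = len(index) - position
    index.insert(position, (key, location))  # everything after position moves down
    return shifted
```

Here inserting French 8 shifts only Green and Taylor, but in a large index an insertion near the front would move nearly every index record.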

B+-tree Index The most common data indexing system in use today. Unlike simple linear indexes, B+-trees are designed to comfortably handle the insertion of new records into the file and to handle record deletion.

B+-tree Index An arrangement of special index records in a “tree.” A single index record, the “root,” at the top, with “branches” leading down from it to other “nodes.”

B+-tree Index The lowest level nodes are called “leaves.” Think of it as a family tree.

B+-tree Index Each key value in the tree is associated with a pointer that is the address of either a lower level index record or a cylinder containing the salesperson records. The index records contain salesperson number key values copied from certain of the salesperson records.
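
A search through such a tree index can be sketched as follows. This is an illustration under assumptions, not the figure from the textbook: the key values and cylinder labels are hypothetical, and each key is assumed to be the highest key value reachable through its pointer.

```python
import bisect

# Each index record is a sorted list of (key, pointer) pairs.
# Assumption: a key is the highest key value reachable through its pointer.
leaf_a = [(140, "cylinder 1"), (192, "cylinder 2"), (253, "cylinder 3")]
leaf_b = [(340, "cylinder 4"), (420, "cylinder 5"), (511, "cylinder 6")]
root = [(253, leaf_a), (511, leaf_b)]

def find_cylinder(node, key):
    """Walk from the root down to the cylinder that would hold the key."""
    while True:
        keys = [k for k, _ in node]
        i = bisect.bisect_left(keys, key)  # first pair whose key is >= search key
        if i == len(node):
            return None  # search key is higher than every key in the index
        pointer = node[i][1]
        if isinstance(pointer, str):
            return pointer  # reached a cylinder address
        node = pointer  # descend to the lower-level index record
```

With these hypothetical values, a search for salesperson number 365 passes through the root to the second leaf and ends at cylinder 5.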

B+-tree Index

B+-tree Index Each index record, at every level of the tree, contains space for the same number of key value/pointer pairs. Each index record is at least half full. The tree index is small and can be kept in main memory indefinitely for a frequently accessed file.

B+-tree Index Figure 2.15 shows an indexed-sequential file, because the file is stored in sequence by the salesperson numbers and the index is built over the Salesperson Number field. B+-tree indexes can also be used to index nonkey, nonunique fields. In general, the storage unit for groups of records can be the cylinder or any other physical device subunit.

B+-tree Index Say that a new record with salesperson number 365 must be inserted. Suppose that cylinder 5 is completely full.

B+-tree Index The collection of records on the entire cylinder has to be split between cylinder 5 and an empty reserve cylinder, say cylinder 11. There is no key value/pointer pair representing cylinder 11 in the tree index.

B+-tree Index The index record, into which the key for the new cylinder should go, which happens to be full, is split into two index records. The now five key values and their associated pointers are divided between them.
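
The split of a full index record can be sketched as below. The capacity of four pairs matches the "now five key values" in the example, but the key values, pointer labels, and the choice to put the extra pair in the left-hand record are assumptions. A complete implementation would also add a key value/pointer pair for the new index record to the level above, which this sketch omits.

```python
def split_index_record(pairs, new_pair, capacity):
    """Add a (key, pointer) pair; split into two records if capacity is exceeded.

    Returns (left, right); right is None when no split was needed.
    """
    merged = sorted(pairs + [new_pair], key=lambda pair: pair[0])
    if len(merged) <= capacity:
        return merged, None
    middle = (len(merged) + 1) // 2  # assumption: left record keeps the extra pair
    return merged[:middle], merged[middle:]
```

Both resulting index records stay at least half full, which preserves the property stated earlier for every index record in the tree.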

Indexes Can be built over any field (unique or nonunique) of a file. Can also be built on a combination of fields. In addition to its direct access capability, an index can be used to retrieve the records of a file in logical sequence based on the indexed field.

Indexes Many separate indexes into a file can exist simultaneously. The indexes are quite independent of each other. When a new record is inserted into a file, an existing record is deleted, or an indexed field is updated, all of the affected indexes must be updated.

Hashed Files The number of records in a file is estimated, and enough space is reserved on a disk to hold them. Additional space is reserved for additional overflow records.

Hashed Files To determine where to insert a particular record of the file, the record’s key value is converted by a hashing routine into one of the reserved record locations on the disk. To find and retrieve the record, the same hashing routine is applied to the key value during the search.

Division-Remainder Method Divide the key value of the record that we want to insert or retrieve by the number of record locations that we have reserved. Perform the division, discard the quotient, and use the remainder to tell us where to locate the record.
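
The division-remainder method is a one-line calculation. In this minimal sketch the function name is illustrative, and record locations are assumed to be numbered from 0, so the remainder can be used directly as the location.

```python
def division_remainder(key, num_locations):
    """Divide the key by the number of reserved locations and keep the remainder."""
    return key % num_locations
```

For instance, with 50 reserved locations, a hypothetical key value of 186 gives 186 / 50 = 3 remainder 36, so the record belongs at location 36.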

A Hashed File Storage area for 50 records plus overflow records. Collision - more than one key value hashes to the same location. The two key values are called “synonyms.”
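
Collision handling can be sketched with a small class. The slides do not say how overflow records are organized, so the single shared overflow list searched linearly is an assumption, as are the class name, method names, and key values; the hashing routine is the division-remainder method.

```python
class HashedFile:
    """Sketch of a hashed file: primary locations plus a shared overflow area."""

    def __init__(self, num_locations=50):
        self.num_locations = num_locations
        self.primary = [None] * num_locations
        self.overflow = []   # assumption: one shared, unordered overflow area
        self.collisions = 0

    def _home(self, key):
        # Division-remainder hashing; locations are numbered from 0.
        return key % self.num_locations

    def insert(self, key, record):
        slot = self._home(key)
        if self.primary[slot] is None:
            self.primary[slot] = (key, record)
        else:
            # The home location is taken by a synonym: store in overflow.
            self.collisions += 1
            self.overflow.append((key, record))

    def retrieve(self, key):
        entry = self.primary[self._home(key)]
        if entry is not None and entry[0] == key:
            return entry[1]
        for overflow_key, record in self.overflow:
            if overflow_key == key:
                return record
        return None
```

Keys 186 and 236 are synonyms here, since both leave a remainder of 36 when divided by 50; the second one inserted lands in the overflow area and takes longer to find.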

Hashed Files Hashing disallows any sequential storage based on a set of field values. A file can only be hashed once, based on the values of a single field or a single combination of fields. If a file is hashed on one field, direct access based on another field can be achieved by building an index on the other field.

Hashed Files Many hashing routines have been developed. The goal is to minimize the number of collisions, which can slow down retrieval performance. In practice, several hashing routines are tested on a file to determine the best “fit.” Even a relatively simple procedure like the division-remainder method can be fine-tuned.
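
One way to test a hashing routine against a file is simply to count the collisions it produces. The sketch below does this for the division-remainder method; the function name and key values are hypothetical, and trying a prime number of locations is a common tuning idea rather than something the slides prescribe.

```python
def count_collisions(keys, num_locations):
    """Count keys whose home location is already claimed by an earlier key."""
    occupied = set()
    collisions = 0
    for key in keys:
        slot = key % num_locations
        if slot in occupied:
            collisions += 1
        else:
            occupied.add(slot)
    return collisions
```

With the hypothetical keys 100, 150, 200, and 250, dividing by 50 sends every key to location 0 and causes three collisions, while dividing by 53 spreads them over four different locations with none.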

Hashed Files A hashed file must occasionally be reorganized after so many collisions have occurred that performance is degraded to an unacceptable level. A new storage area with a new number of storage locations is chosen, and the process starts all over again.

“Copyright 2004 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in Section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information contained herein.”