Welcome to ….. File Organization
Data and File organization Instructor: Dr Halla Abdel Hameed e-mail: hala_hamed@hotmail.com Objectives of the Course and Preliminaries
Objectives of the Course: To provide a solid introduction to the topic of file structure design. To discuss a number of advanced data structure concepts that are necessary for achieving high efficiency in file operations. To develop important programming skills in an object-oriented language such as C++ or Java. January 10, 2000
Pre-Requisites Introduction to Computer Science. Knowledge of C++. Course Link….. http://www.acadox.com/class/27279#resources
Required Textbook Title: File Structures: An Object-Oriented Approach with C++ Authors: Michael J. Folk, Bill Zoellick, Greg Riccardi
Course Requirements: Lab work (programming assignments, or a programming project plus an oral exam) (25%). A mid-term (10%). A final exam (65%).
Special thanks to Eng. Mostafa Elmasri for his contribution in preparing this material.
Course Outline: Introduction to File Management. Fundamental File Processing Operations. Secondary Storage and Physical Storage Devices: Disks, Tapes, and CD-ROM. Fundamental File Structures. Managing Files of Records. Organizing Files for Performance (File Compression, Reclaiming Space in Files, Internal Sorting, Binary Searching, Keysorting). Indexing. Cosequential Processing and External Sorting. Multilevel Indexing and B-Trees. Indexed Sequential Files and B+ Trees. Hashing and Extendible Hashing.
Data Processing from a computer science perspective: storage of data, organization of data, access to data, and processing of data. All this will be built on your knowledge of data structures.
File Organization Lecture 1: Introduction to the Design and Specification of File Structures
Lecture Objectives: Introduce the primary design issues that characterize file structure design. Survey the history of file structures. Introduce a conceptual toolkit for file structure design. Develop an object-oriented toolkit that makes file structures easy to use.
Lecture Contents The heart of file structure design. A short history of file structure design. A conceptual toolkit: File structure literacy. An object-oriented toolkit: Making file structure usable.
Section 1.1 The heart of file structure design
File Structure Definition & Functions. Definition: a combination of representations for data in files and of operations for accessing the data. Function: allowing applications to read, write, and modify data.
Memory versus Secondary Storage: Secondary storage such as disks can pack thousands of megabytes into a small physical space. Computer memory (RAM) is limited. Compared to memory, access to secondary storage is extremely slow. Getting information from RAM takes about 120 × 10^-9 seconds (= 120 nanoseconds), while getting information from disk takes about 30 × 10^-3 seconds (= 30 milliseconds), a factor of 250,000. Roughly, 20 seconds on RAM ≈ 58 days on disk.
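
The comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch in C++ that takes the two access times quoted on the slide as given; the constant and function names are illustrative assumptions, not part of the course toolkit.

```cpp
#include <cassert>

// Access times quoted on the slide (taken as given, not measured here).
constexpr double kRamAccessSeconds  = 120e-9;  // 120 nanoseconds
constexpr double kDiskAccessSeconds = 30e-3;   // 30 milliseconds

// How many times slower one disk access is than one RAM access.
constexpr double slowdown_factor() {
    return kDiskAccessSeconds / kRamAccessSeconds;
}

// Time in days that the disk would need for an amount of accessing
// that takes `ram_seconds` seconds when done in RAM.
double equivalent_disk_days(double ram_seconds) {
    assert(ram_seconds >= 0.0);
    constexpr double kSecondsPerDay = 24.0 * 60.0 * 60.0;
    return ram_seconds * slowdown_factor() / kSecondsPerDay;
}
```

With these figures, slowdown_factor() is about 250,000 and equivalent_disk_days(20.0) is about 57.9, which matches the "20 seconds ≈ 58 days" rule of thumb.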
Improving Secondary Storage Access Time: The representation of the data and the implementation of the operations together determine the efficiency of the file structure for particular applications.
General Goals Get the information we need with one access to the disk. If that’s not possible, then get the information with as few accesses as possible. Group information so that we are likely to get everything we need with only one trip to the disk.
Section 1.2 A short history of file structure design
Early Work Early Work assumed that files were on tape. Access was sequential and the cost of access grew in direct proportion to the size of the file.
The emergence of Disks and Indexes As files grew very large, unaided sequential access was not a good solution. Disks allowed for direct access. Indexes made it possible to keep a list of keys and pointers in a small file that could be searched very quickly. With the key and pointer, the user had direct access to the large, primary file.
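
The key-and-pointer idea can be sketched as follows. This is only an illustration under stated assumptions: the record layout (one record per line, in the form key|data) is hypothetical, the names Index, build_index, and fetch are made up for this example, and the index is held in an in-memory std::map rather than in a small file of its own.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>  // std::istringstream can stand in for a disk file when trying this out
#include <string>

// Hypothetical index: key -> position where the record starts in the primary file.
using Index = std::map<std::string, std::streampos>;

// Scan the primary file once, sequentially, remembering where each record starts.
Index build_index(std::istream& file) {
    Index index;
    file.clear();
    file.seekg(0);
    assert(file.good() && "stream must be seekable");
    std::string line;
    for (;;) {
        std::streampos offset = file.tellg();   // start of the next record
        if (!std::getline(file, line)) break;   // no more records
        index[line.substr(0, line.find('|'))] = offset;
    }
    return index;
}

// Look the key up in the small index, then seek directly to the record.
// Returns false if the key is not in the index.
bool fetch(std::istream& file, const Index& index,
           const std::string& key, std::string& record) {
    auto it = index.find(key);
    if (it == index.end()) return false;
    file.clear();                // reset any end-of-file state from earlier reads
    file.seekg(it->second);      // direct access: jump to the stored position
    return static_cast<bool>(std::getline(file, record));
}
```

Once the index is built, each lookup costs one search of the small index plus one seek into the primary file, instead of a sequential scan of the whole file.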
The emergence of Tree Structures: As indexes also have a sequential flavor, they too became difficult to manage when they grew too large. Another problem was keeping them up to date as files changed. The idea of using tree structures to manage the index emerged in the early 1960s. However, trees can grow very unevenly as records are added and deleted, resulting in long searches requiring many disk accesses to find a record.
Hash Tables: Retrieving entries in 3 or 4 accesses is good, but it does not reach the goal of accessing data with a single request. From early on, hashing was a good way to reach this goal with files that do not change size greatly over time. More recently, extendible, dynamic hashing has made it possible to retrieve a record with one or at most two disk accesses no matter how big the file becomes.
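
To make the single-access goal concrete, here is a minimal sketch of static hashing in C++: the record's address is computed from its key, so no index search is needed. The function name home_address and the simple multiply-by-31 hash are assumptions chosen for illustration; this is not the textbook's hash function, and it does not show extendible hashing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Map a key to a bucket address in the range [0, bucket_count).
// The file is assumed to have a fixed number of buckets, which is why
// this simple scheme suits files that do not change size greatly.
std::size_t home_address(const std::string& key, std::size_t bucket_count) {
    assert(bucket_count > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = h * 31 + c;          // simple polynomial hash over the key's bytes
    }
    return h % bucket_count;     // fold the hash value into the address space
}
```

For example, with ASCII character codes home_address("AB", 7) evaluates to 2: the hash value is 65 × 31 + 66 = 2081, and 2081 mod 7 = 2. The same key always yields the same address, so a record can be fetched by going straight to its bucket.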
Section 1.3 A conceptual toolkit: File structure literacy
Conceptual Tools for File Structure Design: Sequential access, direct access, and tree structures. Decrease the number of disk accesses by collecting data into buffers, blocks, or buckets. Manage their growth by splitting them. Find a way to increase our address or index space. Find new ways to combine the basic tools.
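
The effect of collecting records into blocks can be estimated with a small calculation. The sketch below assumes, as a simplification, that every disk access transfers exactly one block holding a fixed number of records; the function name accesses_needed is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Number of disk accesses needed to read `record_count` records when each
// access transfers one block of `records_per_block` records (rounded up,
// because a partly filled last block still costs one access).
std::size_t accesses_needed(std::size_t record_count,
                            std::size_t records_per_block) {
    assert(records_per_block > 0);
    return (record_count + records_per_block - 1) / records_per_block;
}
```

Under this assumption, reading 1000 records one at a time costs 1000 accesses, while blocks of 50 records bring that down to 20.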
Intended Learning Outcomes After completing the course, the student will be able to: Demonstrate knowledge of storage by describing how data is saved on disk. Demonstrate knowledge of how file organization allows applications to read, write and modify data. Demonstrate knowledge of cost-based query optimization by finding the data that match some search criteria.
Lecture Style. Previous Lecture: a brief review of the previous lecture; answers to questions addressed to the instructor's email. New Lecture: introduce and explain the current lecture topics. Next Lecture: follow-up practice and tutorial scheme; a brief proposal for the next lecture.
Next Lecture
Fundamental File Processing Operations Physical and logical file. Opening and closing files. Reading and writing. Seeking. Special Characters in files. Physical devices and logical files. File-related header files.
Questions?