1
CMPUT 291 File and Database Management Systems
2
Objectives
File Processing
  What are files? How are file systems organized? What is the functionality of file systems?
Database Management
  What is database management? How does it differ from file systems? What is the basic functionality that they offer?
Data Modeling
  How does one use a database management system? What is the data modeling process? What are the basic techniques?
3
Outline
4
Course Documents
Available as a single package at the bookstore:
  J. D. Ullman and J. Widom. A First Course in Database Systems. Prentice-Hall, 1997.
  H. Garcia-Molina, J. D. Ullman and J. Widom. Database System Implementation. Prentice-Hall, 1999.
  Lab manual
Available on-line
  Lecture notes are available on-line, accessible from the CMPUT 291 home pages:
5
Administrivia
Office Hours
  B1 (Prof. Nascimento): WF 15: :00 in GSB 773
  B2 (Prof. Özsu): TR 14:00-15:00 in GSB 779
  Also by appointment
Grading
  Assignments 20%
  Project 25%
  Midterm 25%
  Final 30%
Announcements
  In class; material will also be available electronically
Re-examination
  None
Collaboration
  Collaborate on assignments, but do not merely copy.
Newsgroup
  ualberta.cs.c291: make sure you check this regularly
6
Laboratories
Oracle DBA
  Shauna Grabinsky
  Do not contact her as your first source
There are four TAs
  Shu Lin, Vishal Chitkara, Bin Yao, Peng Wang
7
What is “Data”?
ANSI definition of data:
  A representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automatic means.
  Any representation such as characters or analog quantities to which meaning is or might be assigned.
  Generally, we perform operations on data or data items to supply some information about an entity.
Volatile vs. non-volatile data
  Our concern is primarily with non-volatile data
8
Manual Data Management
Data are not stored
Programmer defines both logical data structure and physical structure (storage structure, access methods, I/O modes, etc.)
One data set per program. High data redundancy.
[Diagram: PROGRAM 1, PROGRAM 2, PROGRAM 3 (Data Management); DATA SET 1, DATA SET 2, DATA SET 3]
9
Problems
There is no persistence.
  All data is transient and disappears when the program terminates.
Random access memory (RAM) is expensive and limited.
  Not all data may fit in available memory.
Programmer productivity is low.
  The programmer has to do a lot of tedious work.
10
File Processing
Data are stored in files, with an interface between programs and files.
Various access methods exist (e.g., sequential, indexed, random)
One file corresponds to one or several programs.
[Diagram: PROGRAM 1, PROGRAM 2, PROGRAM 3 (each with Data Management); File System Services; FILE 1, FILE 2; Redundant Data]
11
File System Functions
Mapping between logical files and physical files
  Logical files: a file as viewed by users and programs.
    Data may be viewed as a collection of bytes or as a collection of records (a collection of bytes with a particular structure)
    Programs manipulate logical files
  Physical files: a file as it actually exists on a storage device.
    Data usually viewed as a collection of bytes located at a physical address on the device
    Operating systems manipulate physical files.
A set of services and an application-independent interface (usually called the application programming interface, or API)
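The two logical views above (a collection of bytes versus a collection of records) can be sketched in a few lines. This is only an illustration: the record layout (a 4-byte integer id followed by a 12-byte name) and the helper names are assumptions, not part of the course material.

```python
import struct

# Hypothetical record layout: 4-byte little-endian integer id + 12-byte name.
RECORD = struct.Struct("<i12s")

def pack_record(rec_id, name):
    """Turn one logical record into the bytes that would be stored."""
    return RECORD.pack(rec_id, name.encode("ascii"))

def unpack_records(raw):
    """Interpret a byte string as a sequence of fixed-size records."""
    records = []
    for offset in range(0, len(raw), RECORD.size):
        rec_id, name = RECORD.unpack_from(raw, offset)
        # Short names are padded with NUL bytes by struct; strip them again.
        records.append((rec_id, name.rstrip(b"\0").decode("ascii")))
    return records

raw_bytes = pack_record(1, "Ada") + pack_record(2, "Edgar")
print(len(raw_bytes))             # byte view: just a run of bytes
print(unpack_records(raw_bytes))  # record view: structured tuples
```

The same bytes support both views; the record structure exists only in how the program chooses to interpret them.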
12
File System Services
Mapping a logical file to a physical file
  assign(logical_file, 'physical_file')
Opening a file
  file_desc = open(logical_file, flags, [protect])
  flags indicate the mode in which the file is to be opened, e.g.: create, read only, write only, read/write, append
  protect is the file protection code in case of create
Closing a file
  close(file_desc)
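As a rough illustration, the generic open and close calls above resemble the POSIX-style calls exposed by Python's os module. The mapping is an assumption made for this sketch: there is no separate assign step here (the path string plays that role), and the file name students.dat is hypothetical.

```python
import os
import tempfile

# Hypothetical physical file name inside a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "students.dat")

# flags: create the file if it is missing and open it write-only.
# 0o600 is the protection code: the owner may read and write.
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
os.close(fd)

# Reopen read-only; no protection code is needed when nothing is created.
fd = os.open(path, os.O_RDONLY)
os.close(fd)

print(os.path.exists(path))
```

The integer returned by os.open corresponds to file_desc on the slide: later calls name the open file by this descriptor rather than by its path.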
13
File System Services
Reading from a file
  read(source_file, destination_addr, size)
  source_file is the file descriptor obtained by opening
  destination_addr is the memory address where data will be read into
Writing to a file
  write(destination_file, source_addr, size)
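A minimal sketch of the read and write services, again using Python's os module as a stand-in. One difference from the slide's signatures: os.read returns a new bytes object instead of filling a caller-supplied destination address, and os.write takes the bytes directly rather than an address and size. The file name is hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.dat")  # hypothetical file

# Write: the descriptor identifies the destination file.
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
written = os.write(fd, b"hello, file")  # returns the number of bytes written
os.close(fd)

# Read: each call continues from where the previous one stopped.
fd = os.open(path, os.O_RDONLY)
first = os.read(fd, 5)    # at most 5 bytes
rest = os.read(fd, 100)   # at most 100 bytes; fewer if the file ends
os.close(fd)

print(written, first, rest)
```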
14
File System Services
Seeking a location in a file
  seek(source_file, offset)
  moves the read/write head to a particular position
  avoids sequential reading
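The following sketch shows how seeking avoids sequential reading when records have a fixed size: the offset of record n is simply n times the record size. The 8-byte record format and the helper read_record are assumptions made for illustration; os.lseek is the Python counterpart of the slide's seek.

```python
import os
import tempfile

RECORD_SIZE = 8  # hypothetical fixed record length in bytes
path = os.path.join(tempfile.mkdtemp(), "records.dat")

# Write 100 records of exactly RECORD_SIZE bytes each: rec00000, rec00001, ...
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
for i in range(100):
    os.write(fd, f"rec{i:05d}".encode("ascii"))

def read_record(fd, n):
    """Jump straight to record n instead of reading records 0..n-1 first."""
    os.lseek(fd, n * RECORD_SIZE, os.SEEK_SET)
    return os.read(fd, RECORD_SIZE)

record_42 = read_record(fd, 42)
os.close(fd)
print(record_42)
```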
15
Performance Considerations
Disk access is very slow
  RAM access: 120 nanoseconds
  Disk access: 30 milliseconds
  Disk access is 250,000 times slower
This has direct performance implications for applications
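The 250,000 figure follows directly from the two access times quoted above. A short check, using integer nanoseconds to avoid floating-point rounding; the "one second per RAM access" rescaling is only an illustrative analogy, not from the slides.

```python
RAM_ACCESS_NS = 120                # 120 nanoseconds
DISK_ACCESS_NS = 30 * 1_000_000    # 30 milliseconds expressed in nanoseconds

ratio = DISK_ACCESS_NS // RAM_ACCESS_NS
print(ratio)

# Analogy: if one RAM access took 1 second, one disk access would take
# `ratio` seconds, which is a little under three days.
days = ratio / 86_400
print(round(days, 1))
```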
16
Principles of Disk Access
Go to disk as few times as possible index structures Every time you go, bring as much relevant data as possible clustering Make each access to disk as efficient as possible random access rather than sequential
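These principles can be sketched with a toy file of fixed-size records. Everything here is an assumption for illustration: the in-memory dictionary stands in for an index structure, read_block stands in for fetching clustered records in one access, and real systems work with disk blocks rather than individual os.read calls.

```python
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed record length in bytes
path = os.path.join(tempfile.mkdtemp(), "keys.dat")
keys = ["delta", "alpha", "charlie", "bravo"]

fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
index = {}  # key -> byte offset of its record (the "index structure")
for position, key in enumerate(keys):
    index[key] = position * RECORD_SIZE
    os.write(fd, key.encode("ascii").ljust(RECORD_SIZE, b" "))

def lookup(fd, key):
    """One seek and one read: the index says exactly where to go."""
    os.lseek(fd, index[key], os.SEEK_SET)
    return os.read(fd, RECORD_SIZE).decode("ascii").rstrip()

def read_block(fd, first, count):
    """Fetch several adjacent records with a single read call."""
    os.lseek(fd, first * RECORD_SIZE, os.SEEK_SET)
    raw = os.read(fd, count * RECORD_SIZE)
    return [raw[i:i + RECORD_SIZE].decode("ascii").rstrip()
            for i in range(0, len(raw), RECORD_SIZE)]

found = lookup(fd, "charlie")
block = read_block(fd, 1, 3)
os.close(fd)
print(found, block)
```

Without the index, finding "charlie" would mean reading records from the start until a match turns up; with it, the lookup costs a single positioned read.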