Managing data Resources: An information system provides users with timely, accurate, and relevant information. The information is stored in computer files. When files are properly arranged and maintained, users can easily access and retrieve the information when they need. If the files are not properly managed, they can lead to chaos in information processing. Even if the hardware and software are excellent, the information system can be very inefficient because of poor file management.
File Organization Terms and Concepts A computer system organizes data in a hierarchy that starts with the bit. Bit represents 0 or 1. 8 bits are grouped to form a byte. Each byte represents one character, number, or symbol. Bytes can be grouped to form a field. It can represent a person’s name or age. Related fields can be grouped to form a record. Related fields can be student’s name, course taken and the grade. Related records can be grouped to form a file. Related files can be grouped to form a database
The Data Hierarchy Database NAME COURSE GRADE File Record Field Byte Bit 0 or (letter A in ASCII) Cynthia Lokker (NAME field) Cynthia Lokker CIS 500 A Daniel Boles IST 203 B Course File Financial File Personal History File Student database attribute
More File Organization Terms Key field Every record in a file should contain at least one field that uniquely identifies that record so that the record can be retrieved, updated, or sorted. This identifier field is called a key field. Cynthia Lokker CIS 500 A Daniel Boles IST 203 B NAME STUDENT # COURSE GRADE Key field Entity A person, place, thing, or event on which we maintain information.
Accessing Records from Computer Files Computer stores files on secondary storage devices. Records can be arranged in several ways on storage media. How individual records can be accessed or retrieved depends on how they are arranged on storage media. There are mainly two ways to organize records: sequentially or randomly. In sequential file organization, data records must be retrieved in the same physical sequence in which they are stored. In direct or random file organization, data records can be accessed in any sequence as users desire, without regard to actual physical order on the storage media. Sequential file organization is the only file organization that can be used on magnetic tape. Example: Payroll Direct or random file organization is utilized with magnetic disk. Most computer applications utilize this method.
Indexed Sequential Access Method The indexed sequential access method is a way of storing data records on a physical storage device in sequential order for sequential processing. Example payroll applications. However, ISAM also allows any specific record to be directly accessed without searching through the file sequentially by using the record’s key field to find its storage address in an index.
Direct File Access Method The direct file access method does not store the records in sequential order, as with ISAM. It uses a key field to determine the location of each record. However, rather than carrying that location in an index, the location is calculated each time using an algorithm that translates the key field directly into the record’s physical storage address. Transformation algorithm: Divide key field by the prime number closest to max. number of records in the field. The remainder determines the address location for the record. Divide 2367 by 997=> remainder is 373. So, the record address is 373.
Problems of the Traditional File Environment Data redundancy is the presence of duplicate data in multiple data files. In this situation confusion results because the data can have different meanings in different files. Program-data dependence is the tight relationship between data stored in files and the specific programs required to update and maintain those files. This dependency is very inefficient, resulting in the need to make changes in many programs when a common piece of data (such as zip code) changes. Lack of flexibility refers to the fact that it is very difficult to create new reports from the data when needed. Ad hoc reports are impossible; a new report could require several weeks of work by more than one programmer and the creation of intermediate files to combine data from disparate files. Poor security is a problem resulting from the lack of control over the data because it is widespread and distributed into so many files.
Database and Database Management System Database => A database is a collection of data organized to service many applications efficiently by centralizing the data and minimizing redundant data. Database management System => A database management system is a special software that permits an organization to centralize data, manage it efficiently, and provide access to the stored data by application programs.
The Components of a DBMS The data definition language which is the formal language used by programmer to specify the content and structure of the database. The data manipulation language, which is used to manipulate the data in database. It contains commands that permit end-users and programming specialists to extract data from the database to satisfy information requests and develop applications. The data dictionary, which is an automated or manual file that stores definitions of data elements and data characteristics such as usage, physical representation, ownership and security.
Logical and Physical View of Data A logical view of data is the way data is perceived by end users or business specialists. A physical view of data is the way the data are actually organized and structured on physical storage media.
Advantages of a DBMS Complexity of the information system environment can be reduced. Data redundancy and inconsistency can be reduced. Data confusion can be eliminated. Program-data dependency can be reduced. Program development and maintenance costs can be reduced. Flexibility of IS can be enhanced. Access and availability of information system can be increased.