Databases and Database Management Systems Books on order in waterstones…….
What is the difference between Data and Information? What is Data?
Examples of a Database: Student Records at UCC; Credit Card details; Directory Enquiries; Insurance Broker; Library System; Books on order in Waterstones.
What is data? Data is the raw material from which information is obtained. Processing data means manipulating it into a form that provides information in a format that is meaningful and usable to the manager or other end-user. The arrival of computer processing meant that this process was ‘mimicked’: existing systems were simply automated. This is a traditional file-based system.
Before we consider what a database is, it is sensible to ask what we mean by data. Business applications software generally consists of specific programs designed to create, maintain and update the data in a series of data files, together with programs that process the data to provide the reports required by management. This is the traditional file-processing method. The operation of these computer-based systems closely resembles that of a manual system. For example, a computerised accounts system closely follows standard book-keeping practice: it is certainly faster, with fewer manual operations, but all it achieves is the automation of the existing system.
History of Information: Initially the information needs of an organisation were met using a ‘manual system’. This system was very labour intensive. With the arrival of computers, the manual filing system was moved on to a computer. This early use of computers for gathering information was called the ‘file based approach’.
What is a file-based system? “A collection of application programs that perform services for the end-users such as the production of reports. Each program defines and manages its own data.” (Connolly & Begg) An early attempt to computerise the manual filing system.
The operation of these systems closely resembles that of a manual system. All that is really achieved is the automation of the existing system. In your systems analysis & design course last year you might have touched on how early computer business systems just automated existing systems without looking to alter and improve them.
If one particular item of data requires amendment or updating, it may need to be updated in several different files to ensure data integrity. If the amended data is written to some files but not others, the files may hold conflicting values for the same basic data item, with the consequence that the processed information may be inconsistent.
In early processing systems, an organization's information was stored as groups of records in separate files. These file processing systems consisted of a few data files and many application programs. Each file, called a flat file, held the information for one specific function, such as accounting or inventory. Programmers used programming languages such as COBOL to write applications that directly accessed flat files to perform data management services and provide information for users.
In creating the files and applications, developers focused on business processes, or how business was transacted, and their interactions. However, business processes are dynamic, requiring continuous changes in files and applications. In addition, early programmers focused on physical implementation and access procedures when designing a database. These physical procedures were written into database applications; therefore, physical changes resulted in intensive rework on the part of the programmer.
As systems became more complex, file processing systems offered little flexibility, presented many limitations, and were difficult to maintain.
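File-Based Approach (each application holds its own data, with a separate owner responsible for it)
Application: Payroll Program. Data held: Employee Name, Age, Address, Hours, Pay Rate. Responsibility held: Payroll Dept.
Application: Admin. Program. Data held: Dept. Name, Employee Name, Emp. Address, Office Location. Responsibility held: Dept. Managers.
Application: Project Scheduling Program. Data held: Project Name, Start Date, Staff Name, Staff Address, Project Hours. Responsibility held: Project Leaders.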
What are the limitations of the file-based system? Separation and isolation of data: decentralised data makes cross-referenced searching slow and difficult. Duplication of data: wastes time and money on entry and storage, and leads to loss of data integrity. Program-data dependence. Data integrity: data with integrity can be trusted, but if not all instances of an item are updated, integrity is lost, which would lead managers and users to mistrust the system.
Separated and isolated data. To make a decision, a user might need data from two separate files. First, the files were evaluated by analysts and programmers to determine the specific data required from each file and the relationships between the data. Then applications could be written in a third generation language to process and extract the needed data. Imagine the work involved if data from several files was needed!
Data redundancy. Often, the same information was stored in more than one file. In addition to taking up more file space on the system, this replication of data caused loss of data integrity. For instance, if a customer's address was stored in four different files, an address change would have to be made in each file separately. If a user was not consistent in updating all files, no one would know which information was correct.
Program-data interdependence involving file formats and access techniques. In file processing systems, files and records were described by specific physical formats that were coded into the application program by programmers. If the format of a certain record was changed, the code in every program that used that format had to be updated. For example, a field in the sales file might be coded as "decimal," while the same field in the customer file could be coded as "binary." In order to combine these fields in one application, a programmer would have to write code to convert every value of the "decimal" field in the sales file to "binary" (or the reverse) in addition to coding the application.
Furthermore, instructions for data storage and access were written into the application's code. Therefore, changes in storage structure or access methods could greatly affect the processing or results of an application. Difficulty in representing data from the user's view. To create useful applications for the user, often data from various files must be combined. In file processing it was difficult to determine relationships between isolated data in order to meet user requirements. Data inflexibility. Program-data interdependency and data isolation limited the flexibility of file processing systems in providing users with ad hoc information requests. Because designing applications was so programming-intensive, information requests usually were restricted by MIS department staff. Therefore, users often resorted to manual methods to obtain needed information.
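The data redundancy problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three in-memory lists stand in for the payroll, admin and project scheduling files from the File-Based Approach slide, and the employee name, addresses and helper functions are hypothetical.

```python
# Hypothetical records: each "file" belongs to a different application
# and keeps its own copy of the employee's address.
payroll_file = [{"employee": "A. Murphy", "address": "1 Main St", "pay_rate": 20.0}]
admin_file = [{"employee": "A. Murphy", "address": "1 Main St", "office": "B12"}]
project_file = [{"staff": "A. Murphy", "address": "1 Main St", "project": "Intranet"}]


def update_address(records, name_field, name, new_address):
    """Update the address in ONE file; the other files are not touched."""
    for record in records:
        if record[name_field] == name:
            record["address"] = new_address


def addresses_on_file(name):
    """Collect every address held for a person across all three files."""
    found = set()
    for records, name_field in (
        (payroll_file, "employee"),
        (admin_file, "employee"),
        (project_file, "staff"),
    ):
        for record in records:
            if record[name_field] == name:
                found.add(record["address"])
    return found


# The payroll clerk records a change of address, but nobody updates the
# admin or project scheduling files.
update_address(payroll_file, "employee", "A. Murphy", "9 New Rd")

conflicting = addresses_on_file("A. Murphy")
print(sorted(conflicting))  # two different addresses are now on file
```

Note also that the project file calls the name field "staff" while the other two call it "employee", so even the lookup code has to know each file's private format: a small instance of program-data dependence.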
What are the limitations of the file-based system? Incompatibility of files: structure and format depend on the development language and platform of the application. Fixed queries and proliferation of application programs: ad hoc querying and reporting code has to be written from scratch.
Management might request a one-off report taking information from a number of different systems. This could prove difficult because the file formats for each system may be different; consequently, the data in the files would need to be converted to a common format before it could be processed to provide the requested report.
What is a Database? “a shared collection of logically related data (and a description of this data), designed to meet the information needs of an organisation” (Connolly & Begg)
Implications? Centralised (minimal duplication), self-describing (program independent to an extent), logical structure (entities, attributes and relationships). Logical structure: experience of modelling from first year will obviously be useful when it comes to understanding and designing databases.
“a collection of interrelated data stored together with controlled redundancy, to serve one or more applications, in an optimal fashion; the data is stored so that it is independent of the application programs which use it; a common and controlled approach is used in adding new data and in modifying existing data within the database”
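To make "self-describing" concrete, here is a minimal sketch using SQLite through Python's standard sqlite3 module. SQLite is chosen only for convenience and is not a system named in these slides; the employee table and its columns are hypothetical. The point is that the description of the data is stored in the database itself and can be queried like any other data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " address TEXT)"
)

# The description of the data lives in the database: SQLite records
# every table definition in its sqlite_master catalogue table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(tables)
print(columns)
```

A program can therefore discover the structure at run time instead of having the record layout coded into it, which is what makes it program independent to an extent.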
Advantages of a Database: Data integrity is easier to maintain, as all data is held in one central location. A database system allows ad hoc queries and lets complex questions about the interactions and relationships between the various data items in the database be investigated. Security. Minimisation of data duplication. Control of data redundancy. Improved maintenance.
Database management systems are applications that were developed to create, manage, and use data and to deal with the problems of file processing systems. The data is stored as records in various database files that can be combined to produce meaningful information for users. The DBMS controls all functions of capturing, processing, storing, and retrieving data from databases and generates various forms of data output. The application programs are written either in a separate language or in the DBMS language, and the DBMS can contain hundreds of applications and files.
Modeling business data, as opposed to business processes, allows the definition of data objects that are important to the business. These data objects are more stable and less likely to change than business processes. Because the representation of the data is separate from the physical implementation and access functions, the relationships between the data files are more apparent. Therefore, DBMSs have more flexibility than file processing systems and require less programming maintenance. This allows programmers to focus more on information representation than on physical aspects of data management.
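As a rough illustration of the ad hoc query advantage, the sketch below again uses SQLite via Python's sqlite3 module (an assumption made for convenience). The table names, columns and sample rows are hypothetical. A one-off management question that spans payroll data and project data is answered with a single query, with no conversion between file formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id   INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    address  TEXT,
    pay_rate REAL
);
CREATE TABLE project_hours (
    emp_id  INTEGER REFERENCES employee(emp_id),
    project TEXT NOT NULL,
    hours   REAL NOT NULL
);
INSERT INTO employee VALUES (1, 'A. Murphy', '1 Main St', 20.0);
INSERT INTO employee VALUES (2, 'B. Walsh', '4 High St', 25.0);
INSERT INTO project_hours VALUES (1, 'Intranet', 10);
INSERT INTO project_hours VALUES (2, 'Intranet', 4);
INSERT INTO project_hours VALUES (2, 'Payroll upgrade', 6);
""")

# One-off management question: what is the labour cost of each project?
labour_cost = conn.execute("""
    SELECT p.project, SUM(p.hours * e.pay_rate) AS cost
    FROM project_hours AS p
    JOIN employee AS e ON e.emp_id = p.emp_id
    GROUP BY p.project
    ORDER BY p.project
""").fetchall()
print(labour_cost)

# The address is stored once, so one UPDATE changes it for every application.
conn.execute("UPDATE employee SET address = '9 New Rd' WHERE emp_id = 1")
new_address = conn.execute(
    "SELECT address FROM employee WHERE emp_id = 1").fetchone()[0]
```

Because each employee's address is held in exactly one row, there is no second copy that could fall out of step, which is the integrity advantage listed on the slide.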
Disadvantages of a Database: Complexity: increased functionality means the system is more complex and sophisticated in structure. Size: complexity and functionality make the DBMS a large piece of software, taking up a lot of space. Cost of DBMSs: the cost can vary depending on the functionality required and the environment. Additional hardware costs. Cost of conversion: conversion of existing systems. High impact of failure: as a result of centralisation.
What is a DBMS? The DBMS is a piece of software whose main function is to organise data so it can be retrieved, modified or updated at will. It is the link between the user and the data, giving access to the data required for the systems and their application programs. “A software system that enables users to define, create, and maintain the database and provides controlled access to this database” (Connolly & Begg).
Database Management System (diagram). Applications: Payroll Program, Admin. Program, Project Scheduling Program. Data held: a single database of employee, administration and project details. All three applications reach the database through the DBMS.
Explanation of a DBMS: In the database structure, each system draws its data via the database management system, so each system’s program interacts with the DBMS (e.g. MS Access) rather than with the database files themselves. A DBMS can be described as an intelligent filing cabinet, as it performs all the functions of an efficient filing clerk.
Components of a DBMS: The Data Definition Language (DDL) is used to define the database (types, structure and constraints). The Data Manipulation Language (DML) is used to insert, update, delete and retrieve data, and provides a flexible, ad hoc query language. There are two types of query language: procedural (one record at a time, “specifies how”) and non-procedural (sets of records, “specifies what”). Access control includes security, integrity, concurrency, recovery and catalogues.
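The distinction between data definition and data manipulation, and between "specifies how" and "specifies what", can be sketched as follows. SQL running on SQLite through Python's sqlite3 module is an assumption made for illustration, and the table and sample rows are hypothetical. The record-at-a-time loop only imitates the procedural style; it is not a real procedural query language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, types and constraints.
conn.execute("""
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        pay_rate REAL CHECK (pay_rate > 0)
    )
""")

# DML: insert, update, delete and retrieve data.
conn.executemany(
    "INSERT INTO employee (emp_id, name, pay_rate) VALUES (?, ?, ?)",
    [(1, "A. Murphy", 20.0), (2, "B. Walsh", 25.0), (3, "C. Byrne", 18.0)],
)
conn.execute("UPDATE employee SET pay_rate = 22.0 WHERE emp_id = 1")
conn.execute("DELETE FROM employee WHERE emp_id = 3")

# Non-procedural ("specifies what"): describe the set of rows wanted.
declarative = [row[0] for row in conn.execute(
    "SELECT name FROM employee WHERE pay_rate > 21 ORDER BY name")]

# Procedural style ("specifies how"): walk the records one at a time.
procedural = []
for name, pay_rate in conn.execute("SELECT name, pay_rate FROM employee"):
    if pay_rate > 21:
        procedural.append(name)
procedural.sort()

# The constraint declared in the DDL is enforced by the DBMS, not the program.
try:
    conn.execute(
        "INSERT INTO employee (emp_id, name, pay_rate) VALUES (4, 'D. Kelly', -5)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```

Both approaches return the same names, but only the declarative query leaves the choice of access path to the DBMS.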
Components of a DBMS: End-users use VIEWS, which make the DBMS transparent in its activities. The DBMS environment consists of hardware (machines, network connections, physical storage), software (OS, DBMS, applications), data, procedures and people (administrators, designers (logical and physical), programmers and end-users). Advantages: less redundancy; improved consistency, information, integrity, security, scalability, flexibility, productivity, concurrency, maintenance and recovery. Disadvantages: complexity, size, cost, generalisation, high impact of failure.
Roles in Database Management System: Database Administrator, Database Designers, Application Programmers, End-Users.
Data and Database Administrators: the Data Administrator (DA) is in charge of data resources, including planning, the development and maintenance of standards, policies and procedures, and conceptual/logical database design. The Database Administrator (DBA) is responsible for the physical realisation of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance for the applications and users.
Database Designers: logical and physical. The logical database designer is concerned with identifying the data, the relationships between the data, and the constraints on the data that is to be stored in the database. To be effective, the designer must involve all prospective users in the development of the data model. The physical database designer takes the logical data model and decides how to physically realise it. What is a data model? An integrated collection of concepts for describing data, relationships between data, operations to manipulate the data, and a set of integrity rules or constraints for the data.
Application Programmers: once the database is implemented, the application programs that provide the required functionality for the end-users must be implemented.
End-Users: naïve users and sophisticated users.
Architecture: Most DBMSs use a three-level architecture: external, conceptual and internal. External: defines the users' view of the data. Conceptual: describes what data is stored and the relationships between the data. Internal: describes how the data is stored in the database (space allocation, compression, encryption etc.) and interfaces with the OS to manage files in physical storage.
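One way to picture the three levels is the sketch below, which uses SQLite via Python's sqlite3 module. Mapping tables, views and indexes onto the conceptual, external and internal levels is a simplification made for illustration, and all names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: what data is stored and how it is related.
conn.execute("""
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        address  TEXT,
        pay_rate REAL
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'A. Murphy', '1 Main St', 20.0)")

# External level: a user view that hides pay_rate from, say, project leaders.
conn.execute("CREATE VIEW staff_directory AS SELECT name, address FROM employee")

# Internal level: adding an index changes how the data is stored and accessed...
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")

# ...but the view, and any program that uses it, is unaffected.
cursor = conn.execute("SELECT * FROM staff_directory")
view_columns = [column[0] for column in cursor.description]
directory = cursor.fetchall()
print(view_columns, directory)
```

The storage change (the index) required no change to the view or to the query that reads it, which is the kind of independence between levels that the architecture is meant to provide.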
Reasons for Three-Tier Architecture