Copyright © by Curt Hill
Database Introduction
History
Why we want to use them
Other fun

Six Generations of Data Management
Manual – Prehistory
Punch cards – 1900
Programmed Unit record – 1955
Online database – 1965
Client Server, Relational – 1980
Multimedia – 1995
All of these still continue

Manual
Writing has existed for millennia
Kings used writing to inventory their goods and record their laws
Sumerian tablets date from 2000 BC or before

Punch cards
Originally used by Jacquard to program silk weaving machines
Not really data management
Hollerith used them to record census data in 1890
A suite of machines that would punch, sort, print and tabulate from cards
–Programmed by rewiring control panels
Known as unit record or electronic accounting machines

Machines

Programmed Unit record
Stored program computers change the face of data management
Tapes store the data much more densely than cards
Programming removes the limits on what sort of calculations or transformations may be done on the data
Produced a file-oriented record processing approach

Consider an example
A college has many files that describe their system
–Faculty
–Catalog of courses
–Grades
–Students
–Among many others
We will look at payroll and grades as an example

Payroll File Fields
Name
Address
Salary/wage
Earnings Year to Date
Among very many more

Payroll File
Grades Fields
Course
–Including section
Student name
Term
Letter grade
Instructor

Grades File
Background Vocabulary
fields
–collection of related characters
records
–collection of related fields
files
–collection of related records
database
–collection of related files

How to use
Actions we might want on each file:
–Create
–Update (add, remove, change records)
–Sort
–Generate any of several reports
Each action for each file would be a program for an overworked programming staff
–Typically a COBOL program
–Eight programs, or sections of programs, for two files
These are typically done in a batch environment

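The four actions can be sketched for the grades file as follows. This is only an illustration in Python rather than the COBOL the slide mentions; the in-memory CSV text standing in for a file, the function names, and the sample records are all assumptions made for the sketch. Every function is tied to this one file layout, which is why a second file would need its own set of programs.

```python
import csv
import io

# Field names follow the Grades Fields slide; the CSV layout is an assumption.
FIELDS = ["course", "student", "term", "grade", "instructor"]

def create(records):
    """Create: write records to a new 'file' (an in-memory CSV text here)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()

def load(text):
    """Read every record back from the file text."""
    return list(csv.DictReader(io.StringIO(text)))

def update(text, student, course, new_grade):
    """Update: rewrite the whole file with one record changed."""
    records = load(text)
    for rec in records:
        if rec["student"] == student and rec["course"] == course:
            rec["grade"] = new_grade
    return create(records)

def sort_file(text, key):
    """Sort: rewrite the file ordered by one field."""
    return create(sorted(load(text), key=lambda rec: rec[key]))

def report(text):
    """Report: one formatted line per record."""
    return [f'{r["student"]:10} {r["course"]:8} {r["grade"]}' for r in load(text)]

# Hypothetical sample data, run as one batch job.
grades = create([
    {"course": "CS101-1", "student": "Smith", "term": "F24",
     "grade": "B", "instructor": "Jones"},
    {"course": "CS101-1", "student": "Adams", "term": "F24",
     "grade": "A", "instructor": "Jones"},
])
grades = update(grades, "Smith", "CS101-1", "A")
grades = sort_file(grades, "student")
for line in report(grades):
    print(line)
```
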
Online database
Many things do not work well in batch mode:
–Travel reservations need up-to-the-second information
–The database is born
Started out as disk based unit record, but that is not the best organization for this type of application
Developed into two models:
–Hierarchical and network

Two Models
Both require direct access devices
–Each requires disk addresses in the database
–Required to get to the pointed-to record directly
Programmer as navigator
–Access programs must still be written specially for a particular database
–Must understand the low level structure
–Must run on the same machine as the database

Client Server, Relational
E. F. Codd suggests the relational model, and he and others develop a substantial theoretical base
Queries may now be simple and short
–Needs to know a schema, but not complete organization
–This allows transmission of a simple query
–Client server computing is born

Relational Database
The key
–All the programs previously described are about the same – every update is nearly the same
–All that changes is the underlying file
The solution
–Describe the file in a general way
–Generate a program that handles the file based on the description

How to describe a file
A file is a collection of records
Each record is a collection of fields
–Typically only one type of record in a file
Each field is described by a:
–Name
–Type
  For example numeric, string, boolean, etc.
–Length
  Booleans have a predefined length, others require specification

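One way to picture "handle the file based on the description" is a pair of general routines driven by a list of field descriptions. The sketch below assumes fixed-width text records; the names Field, describe, pack and unpack, the one-character boolean, and the full_time field are hypothetical and are not taken from any particular DBMS.

```python
from dataclasses import dataclass

@dataclass
class Field:
    name: str
    kind: str    # "string", "numeric" or "boolean"
    length: int  # width in characters

BOOLEAN_LENGTH = 1  # assumed predefined length for booleans

def describe(name, kind, length=None):
    """Build one field description; only booleans may omit the length."""
    if kind == "boolean":
        return Field(name, kind, BOOLEAN_LENGTH)
    if length is None:
        raise ValueError(f"{kind} field {name!r} needs a length")
    return Field(name, kind, length)

def pack(description, record):
    """Turn a dict into one fixed-width record, using only the description."""
    parts = []
    for f in description:
        value = record[f.name]
        if f.kind == "boolean":
            text = "T" if value else "F"
        elif f.kind == "numeric":
            text = str(value).rjust(f.length)
        else:
            text = str(value).ljust(f.length)
        if len(text) > f.length:
            raise ValueError(f"value too long for field {f.name!r}")
        parts.append(text)
    return "".join(parts)

def unpack(description, line):
    """Turn one fixed-width record back into a dict."""
    record, pos = {}, 0
    for f in description:
        raw = line[pos:pos + f.length]
        pos += f.length
        if f.kind == "boolean":
            record[f.name] = raw == "T"
        elif f.kind == "numeric":
            record[f.name] = int(raw)
        else:
            record[f.name] = raw.rstrip()
    return record

# A hypothetical description of part of the payroll file.
PAYROLL = [
    describe("name", "string", 20),
    describe("salary", "numeric", 8),
    describe("full_time", "boolean"),
]
line = pack(PAYROLL, {"name": "Jones", "salary": 52000, "full_time": True})
print(repr(line))
print(unpack(PAYROLL, line))
```

The same pack and unpack would serve the grades file with a different description list, which is the point of the slide: the description changes, the program does not.
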
SQL
Structured Query Language
Has become the “standard” for queries
A relational database does not have to accept SQL
–Unless it wants to be commercially viable
SQL is mostly declarative but with some procedural features
–Declarative – what is wanted
–Procedural – how to get it

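The declarative and procedural styles can be compared on the same question, "which students earned an A?". The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table layout and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grades "
    "(course TEXT, student TEXT, term TEXT, grade TEXT, instructor TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?, ?)",
    [
        ("CS101-1", "Smith", "F24", "B", "Jones"),
        ("CS101-1", "Baker", "F24", "A", "Jones"),
        ("CS101-1", "Adams", "F24", "A", "Jones"),
    ],
)

# Declarative: state what is wanted and let the DBMS decide how to get it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT student FROM grades WHERE grade = 'A' ORDER BY student"
    )
]

# Procedural: spell out how to get it, one record at a time.
procedural = []
for course, student, term, grade, instructor in conn.execute(
    "SELECT * FROM grades"
):
    if grade == "A":
        procedural.append(student)
procedural.sort()

print(declarative)
print(procedural)
```

Both lists hold the same students; the difference is who decides the access strategy.
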
The Files
What is wrong with the original example?
Redundancy in faculty description
–Space is wasted
–Discrepancies may occur between grades and payroll
Some reports need to access multiple files
–E.g. transcript generation
–Complicates the programming issue

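A relational layout can address both complaints: the faculty description is stored once, and a report that spans files becomes a single query. The sketch below uses Python's sqlite3 module; the table names, the faculty_id key, and the data are assumptions made for the example, not the layout of the original files.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Faculty details live in one place only.
    CREATE TABLE faculty (
        faculty_id INTEGER PRIMARY KEY,
        name       TEXT,
        salary     INTEGER
    );
    -- Grades refer to the instructor by key instead of repeating the name.
    CREATE TABLE grades (
        course     TEXT,
        student    TEXT,
        term       TEXT,
        grade      TEXT,
        faculty_id INTEGER REFERENCES faculty(faculty_id)
    );
    INSERT INTO faculty VALUES (1, 'Jones', 52000);
    INSERT INTO grades VALUES ('CS101-1', 'Smith', 'F24', 'B', 1);
    INSERT INTO grades VALUES ('CS102-1', 'Smith', 'S25', 'A', 1);
""")

# One update fixes the name everywhere, so no discrepancy can arise.
conn.execute("UPDATE faculty SET name = 'Jones-Lee' WHERE faculty_id = 1")

# A transcript draws on both tables in one query.
transcript = conn.execute("""
    SELECT g.term, g.course, g.grade, f.name
    FROM grades AS g
    JOIN faculty AS f ON f.faculty_id = g.faculty_id
    WHERE g.student = 'Smith'
    ORDER BY g.course
""").fetchall()

for row in transcript:
    print(row)
```
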
Advantages (1 of 3)
Data independence
–Application programs no longer need some or all of the files
–Do not know or care how data is stored, aka abstraction
–Simplifies application development
Efficient access
–The DBMS employs sophisticated access techniques seldom used by normal programmers

Advantages (2 of 3)
Integrity constraints
–The DBMS may check data in a way seldom done in normal file processing
–E.g. account validity
Security
–A DBMS may enforce requirements on who can access the data and in what way

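An integrity constraint can be sketched with a CHECK clause, so the DBMS itself rejects bad data no matter which program tries to store it. The example uses Python's sqlite3 module; the rule that a grade must be one of A, B, C, D or F and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE grades (
        student TEXT NOT NULL,
        course  TEXT NOT NULL,
        grade   TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
    )
""")

# A valid row is accepted.
conn.execute("INSERT INTO grades VALUES ('Smith', 'CS101-1', 'B')")

# An invalid grade is refused by the DBMS, not by application code.
try:
    conn.execute("INSERT INTO grades VALUES ('Adams', 'CS101-1', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM grades").fetchone()[0]
print(rejected, count)
```
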
Advantages (3 of 3)
Administration
–Minimize redundancy
–Manage sharing of the data
–Optimize for the enterprise, not a small group
–Easier to back up the data
Concurrent access
–Manages the simultaneous update problem

Disadvantages
A DBMS is:
–Complex
–Expensive
–Bulky
Simple file access is much quicker and less expensive
The view a DBMS provides may not be helpful to a particular application

Multimedia
The relational model was king for a time
What if what we want to store does not conform to the notion of typed text?
–Sound, pictures, video
One of the results is the object oriented database, which stores data as objects
–Data and programs to manipulate it

NoSQL
Once it is realized that the relational database is not the end-all, all manner of new types of databases appear
These are the NoSQL databases
–SQL is the universal query language
–NoSQL may mean no SQL or Not Only SQL
This is a field that is not finished developing

Finally
The course focuses on relational databases
–They are comparatively standardized
We will also examine the NoSQL databases and the Hierarchical and Network models
Of course, we will also learn SQL