HAP 709: Healthcare Databases Introduction to Database Structures By Farrokh Alemi, Ph.D. Francesco Loaiza, Ph.D. J.D. Updated by Janusz Wojtusiak, Ph.D. Fall 2008
What is a database? Is an Excel table with students’ grades a database? Is your notebook a database? Is a phonebook a database? Is the GMU schedule of classes a database? Is a medical record of a patient a database? Is a list of nurses working in a hospital a database?
What is a database? A database is a collection of data with a defined structure and purpose. Wikipedia: A database is a structured collection of data which is managed to meet the needs of a community of users. WordNet: A database is an organized body of related information.
What is a computer database? A computer database is a database stored in a computer. It is usually managed by special software called a Database Management System (DBMS). Many DBMSs are available: Access, Oracle, MUMPS, dBASE, PostgreSQL, SQL Server, MySQL, DB2, …
Objectives of this lecture Learn about flat, hierarchical, relational, and object-oriented databases. Learn about information-less databases. If checking an information item takes a fraction of a second, why is it that we can go through billions of information items in a fraction of a second?
Types of Data Structures Flat data Hierarchical data Relational data Object-oriented data
Flat Models
Student ID  Name  Midterm grade  Final grade  Address  Zip code  ...
4561  Ali Safaie  B  A  1311 Manor Park  22101
7878  Mike Smith  C  1619 Ozkan Street  44115
8954  Mike Smith Jr.  2121 Euclid 563
Flat Data Advantages Most software includes free access to flat data files. For a small number of cases, flat databases do a reasonably fast job. Most analytical software uses flat data. Disadvantages Flat databases waste computer storage by reserving space for items that logically cannot be available. Flat databases are not conducive to complicated search queries.
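The storage point above can be sketched in Python. This is only an illustration under stated assumptions: the rows, names, and column labels below are hypothetical and are not the slide's data. In a flat file every row carries every column, so a value that does not apply is still stored as a blank cell.

```python
import csv
import io

# Hypothetical flat file: every row has every column, even when a value does not apply.
FLAT_CSV = """student_id,name,midterm,final,address,zip
1,Student A,B,A,10 First Street,11111
2,Student B,C,,20 Second Street,22222
3,Student C,,,30 Third Street,
"""


def count_empty_cells(text):
    """Return (empty_cells, total_cells) for a CSV string with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    total = sum(len(row) for row in rows)
    empty = sum(1 for row in rows for value in row.values() if value == "")
    return empty, total


empty, total = count_empty_cells(FLAT_CSV)
print(f"{empty} of {total} cells are blank")
```

The blank cells still occupy a position in every row, which is the waste the slide refers to.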
Relational Databases In a relational database, data are stored as records of related fields, organized into tables. Tables do not need to be of the same size.
Example
Table for "Students' grades"
Student ID (key column)  Name  Mid-term  Final
4561  Ali Ghadiri  B  A
7878  Mike Smith  C
8954  Mike Smith Jr.
Table for "Students' contact information"
Student ID (key column)  Address  Zip
8954  2121 Euclid 563  22101
4561  1311 Manor Park
7878  1619 Ozkan Street  44115
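A minimal sketch of the two-table idea, using Python's built-in sqlite3 module. The table names, column names, and rows are assumptions made for illustration, not the slide's data. The two tables have different sizes and are linked only through the key column; a student with no contact information simply has no row in the second table.

```python
import sqlite3

# In-memory database with two tables that share the key column student_id.
# All names and rows here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (student_id INTEGER PRIMARY KEY, name TEXT, midterm TEXT, final TEXT);
CREATE TABLE contacts (student_id INTEGER PRIMARY KEY, address TEXT, zip TEXT);
""")
conn.executemany("INSERT INTO grades VALUES (?, ?, ?, ?)", [
    (1, "Student A", "B", "A"),
    (2, "Student B", "C", "B"),
    (3, "Student C", "A", "B"),
])
# The contacts table is smaller: student 3 has no row at all.
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?)", [
    (1, "10 First Street", "11111"),
    (2, "20 Second Street", "22222"),
])


def grades_with_address(connection):
    """Join the two tables on the key column, keeping students without a contact row."""
    query = """
        SELECT g.student_id, g.name, g.final, c.address
        FROM grades AS g
        LEFT JOIN contacts AS c ON c.student_id = g.student_id
        ORDER BY g.student_id
    """
    return connection.execute(query).fetchall()


for row in grades_with_address(conn):
    print(row)
```

A LEFT JOIN is used here so that students missing from the contacts table still appear in the result, with an empty address.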
Advantages of Relational Databases Data can be examined from many different perspectives. No need to enter missing information for variables that are not logically possible. Easy to modify because adding new concepts involves adding new tables, not altering old ones.
Hierarchical models Data models in which the relationships between higher and lower items are inherited.
Example of Hierarchical Model File items on your desktop
Advantages of Hierarchical Models Operations on parents save time and affect all children. Disadvantages Many relationships are not hierarchical.
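The claim that an operation on a parent affects all of its children can be sketched with a small tree in Python. The Node class, the archive operation, and the folder and file names are hypothetical, chosen only to mirror the desktop example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A folder or file in a hierarchy; children belong to exactly one parent."""
    name: str
    archived: bool = False
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child

    def archive(self) -> int:
        """Mark this node and every descendant as archived; return how many nodes changed."""
        changed = 0 if self.archived else 1
        self.archived = True
        for child in self.children:
            changed += child.archive()
        return changed


# Hypothetical desktop layout.
desktop = Node("Desktop")
course = desktop.add(Node("Course folder"))
course.add(Node("lecture1.ppt"))
course.add(Node("homework1.doc"))
desktop.add(Node("notes.txt"))

# One call on the parent folder reaches both files inside it.
print(course.archive())
```

Items outside the folder, such as notes.txt here, are untouched, because the operation only follows parent-to-child links.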
Object-oriented data models Data are organized in the form of “objects” that represent real-world entities. Each object has its own properties, which can be regular values or other objects.
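As a rough sketch of an object whose properties are either regular values or other objects, consider the Python classes below. The class names, fields, and values are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    street: str
    zip_code: str


@dataclass
class Student:
    student_id: int                      # a regular value
    name: str                            # a regular value
    address: Optional[Address] = None    # a property that is itself an object


student = Student(1, "Student A", Address("10 First Street", "11111"))
print(student.address.zip_code)
```

A student without a known address simply has no Address object attached, rather than a row of blank fields.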
Advantages of Object-oriented models High efficiency Use of the actual “real life” entities as objects Integration with object-oriented programming languages (C++, Java, C# …) Disadvantages Lack of one good standard
Distributed data models Data are kept in different settings and on different computers. Distributed databases need not only addresses for where the data are but also an audit trail.
Example of Distributed Database World Wide Web
Advantages of Distributed Databases Data loss is limited to the nodes affected. Decentralized databases are more flexible and allow different units to update and maintain their own data. Disadvantages Security of these databases is difficult to maintain. Many agreements must be made ahead of time. The quality of data varies.
Data-less Information Systems Distributed databases that hold no data until the need arises; fewer problems with the privacy of patients.
Components of a Data-less System Decoder Communicator Analysis
Advantages of the Data-less Information Systems The system is substantially less expensive than centralized registries because it requires no new equipment and few personnel. Use of the system does not require vague and time-independent patient consent. The system does not require duplication of data in different databases.
Inductive Databases Researchers are investigating databases that can answer questions about things that are not stored in those databases. They use artificial intelligence to give “plausible” answers.
Take Home Lesson Structure makes it possible to process and analyze large amounts of data.
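One way to see why structure matters is to compare two lookups over the same sorted list of IDs. This is a sketch in Python and not a description of how any particular DBMS works; the function names and the data are assumptions. The first lookup ignores the ordering and checks items one by one, while the second uses the ordering to discard half of the remaining items at each step.

```python
def linear_lookup(sorted_ids, target):
    """Scan from the start; return (found, number of items examined)."""
    for steps, value in enumerate(sorted_ids, start=1):
        if value == target:
            return True, steps
    return False, len(sorted_ids)


def binary_lookup(sorted_ids, target):
    """Halve the search range each step; return (found, number of items examined)."""
    low, high, steps = 0, len(sorted_ids) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        steps += 1
        if sorted_ids[mid] == target:
            return True, steps
        if sorted_ids[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False, steps


# One million hypothetical IDs, already sorted.
ids = list(range(1_000_000))
print(linear_lookup(ids, 999_999))
print(binary_lookup(ids, 999_999))
```

With one million sorted items, the unstructured scan may examine every item, while the structured search needs at most 20 examinations.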