Managing Data Resources Chapter Seven
Hierarchy of Data
Bit
Byte
Field
Record
File
Database
Database management system
Traditional Data Environment
Files for each application/department/function
Duplicated/redundant data/files
Inability to link data/files
Program-data dependence (change one, you must change the other)
Problems with Traditional Data Environment
Data redundancy, which leads to a lack of data integrity
Program-data dependence
Lack of flexibility (no ad hoc reports/different views)
Poor security (access)
Lack of data sharing & availability
DBMS Approach
Database: Collection of one or more files containing data organized to serve multiple applications while minimizing redundant data.
Database management system (DBMS): Controls the organization of & access to data and database files by acting as the interface between the data & application programs, and as an environment for developing and using databases.
Views/Schemas
Logical view: How end users perceive the data to be organized
–Schema: The view of all the data
–Subschema: A partial view of the data accessible to an end user (e.g., "view only" a subset of screens/data)
Physical view: How the data are actually organized on physical storage media
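A minimal sketch of the schema/subschema distinction, using Python's built-in sqlite3 module. The employee table and the employee_directory view are hypothetical names chosen for illustration; a SQL view is only one possible way to expose a subschema.

```python
import sqlite3

# Hypothetical schema: the complete logical view of an employee table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ada", "Sales", 52000.0), (2, "Ben", "IT", 61000.0)],
)

# Subschema: a partial view that hides the salary column from directory users.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
print(directory_columns)
print(directory_rows)
```

Users who query employee_directory see three columns; how SQLite lays the rows out on disk (the physical view) is invisible to both the table and the view.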
Components of a DBMS #1
Data definition language (DDL)
–Formal language associated with DBMS
–Used by programmers to specify the content & structure of the database
Data manipulation language (DML)
–Used by both end users & programmers to manipulate data
–Commands to modify/extract data & to develop apps
–Structured Query Language (SQL)
–Can use various languages in addition to/instead of SQL
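The difference between defining a database and manipulating its data can be sketched with Python's built-in sqlite3 module. The part table and its rows are made up for illustration; in SQL, CREATE TABLE plays the data-definition role, while INSERT, UPDATE, and SELECT play the data-manipulation role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of a hypothetical part table.
conn.execute(
    "CREATE TABLE part ("
    "part_number INTEGER PRIMARY KEY, part_name TEXT, unit_price REAL)"
)

# Data manipulation: add, change, and extract data.
conn.execute("INSERT INTO part VALUES (137, 'Door latch', 22.00)")
conn.execute("INSERT INTO part VALUES (152, 'Compressor', 54.00)")
conn.execute("UPDATE part SET unit_price = 23.50 WHERE part_number = 137")

prices = conn.execute(
    "SELECT part_number, unit_price FROM part ORDER BY part_number"
).fetchall()
print(prices)
```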
Components of a DBMS #2
Data dictionary
–Defines each data element (# bytes, text/numeric type, format, range, access, use, ownership, physical representation)
–Used for communication between developers & users and for standardization of data/databases/programs
–Some data dictionaries are active: changes to the dictionary automatically propagate to related databases/programs
NOTE: Any properly developed information system should have a data dictionary.
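As a rough illustration, part of a data dictionary can be read from the DBMS's own catalog. The sketch below uses SQLite's PRAGMA table_info on a hypothetical supplier table. It recovers only names, types, and key/required flags; items such as ownership, use, and access rules would have to be documented separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier ("
    "supplier_number INTEGER PRIMARY KEY, "
    "supplier_name TEXT NOT NULL, "
    "supplier_address TEXT)"
)

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = [
    {
        "element": name,
        "type": col_type,
        "required": bool(notnull),
        "key": bool(pk),
    }
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(supplier)"
    )
]

for entry in dictionary:
    print(entry)
```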
Database Models (Types) #1
–Hierarchical
  Upside-down tree-like structure
  Root (topmost data element) is the key field
  Each child record can have only one parent record (1:M relationships); parents can have many children
  Pointers for expressing relationships
  Hard to change & limited retrieval capabilities
  "Legacy" systems
–Network
  Similar to hierarchical but M:M relationships between records
  Complex and hard to change
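The one-parent rule of the hierarchical model can be sketched in a few lines of Python. The Record class and the Employee/Benefits/Pension segments are hypothetical; real hierarchical DBMSs store pointers on disk rather than object references.

```python
class Record:
    """A record in a hierarchical structure: one parent, many children."""

    def __init__(self, name):
        self.name = name
        self.parent = None      # pointer to the single parent (None for the root)
        self.children = []      # pointers to any number of children

    def add_child(self, child):
        # Enforce the 1:M rule: a child may not be attached to a second parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        # Retrieval follows the pointers from this record up to the root.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


employee = Record("Employee")  # root
compensation = employee.add_child(Record("Compensation"))
benefits = employee.add_child(Record("Benefits"))
pension = benefits.add_child(Record("Pension"))
print(pension.path())
```

A network model would relax the check in add_child so that a record could be linked to several parents (M:M).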
Database Models (Types) #2
Relational
–2-dimensional tables (relations)
–Physically appear similar to files (but are not)
–Row/record/tuple
–Column/field/attribute/data element
–Ability to link relations on-the-fly
  Select creates a subset of all records that meet specified criteria
  Join combines tables into a single new table
  Project creates a subset of columns in a table, resulting in new tables/views
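The three relational operations can be sketched over plain Python lists of dictionaries. The function names, part names, and supplier names below are made up for illustration; a real project operation would also remove duplicate rows, which this sketch does not.

```python
def select(rows, predicate):
    """Keep only the rows that meet the criteria."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Keep only the named columns of each row."""
    return [{c: row[c] for c in columns} for row in rows]


def join(left, right, key):
    """Combine rows from two tables whose key values match."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


# Illustrative data only.
parts = [
    {"Part_Number": 137, "Part_Name": "Door latch", "Supplier_Number": 4058},
    {"Part_Number": 150, "Part_Name": "Door seal", "Supplier_Number": 4058},
    {"Part_Number": 152, "Part_Name": "Compressor", "Supplier_Number": 1125},
]
suppliers = [
    {"Supplier_Number": 4058, "Supplier_Name": "CBM Inc."},
    {"Supplier_Number": 1125, "Supplier_Name": "Ace Inc."},
]

joined = join(parts, suppliers, "Supplier_Number")
wanted = select(joined, lambda r: r["Part_Number"] in (137, 152))
result = project(wanted, ["Part_Number", "Supplier_Name"])
print(result)
```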
SQL
Principal data manipulation language for relational DBMS
Versions available for almost any OS & computer (mainframe, PC, etc.)
Easy to learn & use
–SELECT lists desired columns from desired table(s)
–FROM identifies tables/views from which to select columns
–WHERE specifies conditions for selecting specific records & for joining multiple tables
SQL Example
SELECT Part.Part_Number, Supplier.Supplier_Number, Supplier.Supplier_Name, Supplier.Supplier_Address FROM Part, Supplier WHERE Part.Supplier_Number = Supplier.Supplier_Number AND (Part.Part_Number = 137 OR Part.Part_Number = 152)
Note: No line returns in any of the commands
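In SQL, AND binds more tightly than OR, so how the two Part_Number tests are grouped changes the result. The sketch below runs the query both ways in an in-memory SQLite database; the supplier rows are made up, and only two columns are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Supplier (
        Supplier_Number INTEGER PRIMARY KEY,
        Supplier_Name TEXT,
        Supplier_Address TEXT
    );
    CREATE TABLE Part (
        Part_Number INTEGER PRIMARY KEY,
        Supplier_Number INTEGER
    );
    INSERT INTO Supplier VALUES (4058, 'Supplier A', '1 First St');
    INSERT INTO Supplier VALUES (1125, 'Supplier B', '2 Second St');
    INSERT INTO Part VALUES (137, 4058);
    INSERT INTO Part VALUES (152, 1125);
    """
)

columns = "SELECT Part.Part_Number, Supplier.Supplier_Name FROM Part, Supplier WHERE "
join_condition = "Part.Supplier_Number = Supplier.Supplier_Number"

# Without parentheses: (join AND part 137) OR part 152.
# Part 152 is then paired with every supplier, matching or not.
ungrouped = conn.execute(
    columns + join_condition + " AND Part_Number = 137 OR Part_Number = 152"
).fetchall()

# With parentheses: join AND (part 137 OR part 152).
grouped = conn.execute(
    columns + join_condition + " AND (Part_Number = 137 OR Part_Number = 152)"
).fetchall()

print(len(ungrouped), sorted(grouped))
```

The ungrouped form returns an extra row pairing part 152 with a supplier that does not provide it, which is why the parenthesized form is the safer way to write the condition.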
Object-Oriented Databases (OODBMS)
Store data & the procedures that act on the data as objects that can be automatically retrieved & shared
Objects can contain multimedia
Object-relational databases: Relational databases extended to store both traditional data & objects such as graphics & multimedia
Designing Databases
Entity-relationship diagram (E-R diagram)
–Documents database by showing relationships among entities in database
Normalization
–Creates small, stable data structures (tables) from complex groups of data
–Example: Normalizing student data results in several DB tables: name/address, courses taken, funds received/distributed, etc.
–See Figures 7-14 & 7-15
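A minimal sketch of the student example, assuming a flat file in which each row repeats the student's name and address for every course taken. The field names and sample rows are hypothetical; the point is only that the repeated data collapses into one small table per subject.

```python
# Unnormalized: name and address are repeated on every course row.
flat = [
    {"student_id": 1, "name": "Ana", "address": "12 Elm St", "course": "MIS 301"},
    {"student_id": 1, "name": "Ana", "address": "12 Elm St", "course": "ACC 201"},
    {"student_id": 2, "name": "Raj", "address": "9 Oak Ave", "course": "MIS 301"},
]

students = {}      # one row per student: name/address stored once
enrollments = []   # one row per (student, course) pair

for row in flat:
    students[row["student_id"]] = {"name": row["name"], "address": row["address"]}
    enrollments.append((row["student_id"], row["course"]))

print(students)
print(enrollments)
```

After the split, a change of address is made in exactly one place instead of once per course row.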
E-R Diagram
[Figure 7-13, p. 217: ORDER (1) includes (M) PART; PART (M) delivered by (1) SUPPLIER]
Distributed Databases
Stored in more than one physical location
Reduce vulnerability
Increase responsiveness
Can run on cheaper computer systems
Weakness: Vulnerability of telecommunications
Sometimes, local sites can depart from acceptable DB practices
–Partitioned: Each remote processor has its own necessary data
–Duplicated: Each remote site holds a full copy of the database (reconciled periodically)
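The two distribution strategies can be sketched with in-memory dictionaries standing in for sites. The account data, function names, and the "replica overwrites central" reconciliation rule are all simplifying assumptions; real systems need conflict handling that this sketch omits.

```python
# Hypothetical central database: account number -> record.
central = {
    101: {"region": "east", "balance": 250},
    102: {"region": "west", "balance": 900},
    103: {"region": "east", "balance": 40},
}


def partition(db, region):
    """Partitioned: a site receives only the records it needs."""
    return {k: dict(v) for k, v in db.items() if v["region"] == region}


def duplicate(db):
    """Duplicated: a site receives a full copy of the database."""
    return {k: dict(v) for k, v in db.items()}


def reconcile(central_db, replica):
    """Periodic reconciliation; here the replica's values simply win."""
    central_db.update({k: dict(v) for k, v in replica.items()})


east = partition(central, "east")
replica = duplicate(central)

replica[102]["balance"] = 875        # local update at the remote site
stale = central[102]["balance"]      # central copy is out of date until reconciled
reconcile(central, replica)
print(sorted(east), stale, central[102]["balance"])
```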
Data Administration
Information policy: Planning & rules governing DB operations & information use
Data planning: Enterprise analysis
Maintenance of data dictionaries
Data quality standards
Database Administration
–Technical
–Operational
–May include personnel, purchasing, etc. for DB function
DB Trends
Multidimensional Data Analysis
–Online analytical processing (OLAP)
  Ability to "slice and dice" data interactively
  Multiple perspectives
  Matrices or cubes
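"Slicing and dicing" can be sketched with a tiny cube held in a Python dictionary keyed by (product, region, month). The sales figures, dimension names, and helper functions are made up for illustration; OLAP tools do the same thing interactively over far larger cubes.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "month")

# Hypothetical unit sales, one cell per (product, region, month).
sales = {
    ("nuts", "east", "Jan"): 50,
    ("nuts", "west", "Jan"): 60,
    ("bolts", "east", "Jan"): 30,
    ("bolts", "east", "Feb"): 45,
    ("nuts", "east", "Feb"): 55,
}


def slice_cube(cube, **fixed):
    """Slice: keep the cells where the named dimensions have the given values."""
    wanted = {DIMENSIONS.index(dim): value for dim, value in fixed.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] == value for i, value in wanted.items())
    }


def roll_up(cube, dimension):
    """Total the cells along one dimension to get a single perspective."""
    i = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, units in cube.items():
        totals[key[i]] += units
    return dict(totals)


east = slice_cube(sales, region="east")
by_product_in_east = roll_up(east, "product")
by_month = roll_up(sales, "month")
print(by_product_in_east, by_month)
```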
Data Warehouses & Data Mining
Data warehouse
–Consolidates & stores current & historical data extracted from various operational systems
–Metadata (data describing the warehouse's contents) & summaries of transactional data
–Reporting & query tools, including OLAP & data mining
Data mart: Subset of a data warehouse
Data mining: Analysis of data in data warehouses to find patterns/rules that aid decision making
Databases & the Web
Hypermedia database
–Organizes data in a network of nodes linked in user-specified patterns/relationships
–Nodes can hold text, graphics, sound, video, programs
Linking internal DBs to the Web
–Middleware is the interface between DB & browser
–Application server uses middleware to interface between DB & browser
–Common gateway interface (CGI) scripts, written in a programming language, interface between DB, app server, & browser
Ford & Firestone: Tire Disaster
Late September 2001: Firestone recalled another 3.5 million Wilderness tires
How does this crisis represent an information management problem?
Why did the management of these two companies (and government officials) not see the trends in the data?
What would you suggest should have been done in terms of database management & queries?
