This presentation was developed by Dr. Steven C. Ross for use in MIS 320 classes at Western Washington University. Some of the material contained herein is © 2007, John Wiley & Sons, Inc. and other sources, as noted. All rights reserved.
MIS 320: Data and Knowledge Management. Opening music is the Peter Gunn theme.
Managing Data. Why is it difficult? The amount of data is increasing exponentially. Data are scattered throughout the organization. Data come from many sources: internal, personal, and external.
Data Life Cycle* * Figure 4.1 from the Rainer et al., textbook.
The Database Approach. A “single place” for all data, managed by database management system (DBMS) software. Minimizes problems: redundancy, isolation, inconsistency. Maximizes benefits: security, integrity, data-program independence.
DBMS and Users* * Figure 4.3 from the Rainer et al., textbook.
OLTP and OLAP* What is a “database”? Give me an example of a database: a student registration system, customer data, product and production data. Does the database support OLTP (online transaction processing) or OLAP (online analytical processing)? What are examples of transactions? Monetary things like purchases; new entries, like a new customer; changes to data, like price changes; deletions, like purging old data. What sort of analyses do we do? Descriptive statistics, and “predictive” analyses such as correlation and regression, and ANOVA. * Graphic from Haag, et al.
Physical vs. Logical* Do you (a typical business professional) need to know the physical terms? To pass this class, perhaps; to help design the database, only in the vaguest sense. Do you (a typical business professional) need to know the logical terms? To pass this class, definitely; to help design the database, some familiarity is useful. What does an IS/DBMS professional need to know? Much more emphasis on the logical concepts. The OS and DBMS take care of the physical so we can concentrate on the logical. * Graphic from Haag, et al.
Database Structure* This is a simple example of a relational database. What is a field? Give an example. What is a record? Give an example. What is a table? Give an example. We can use it to answer questions like: How many different parts are in our Orion facility? Which facility has the most expensive part? Does James Riley have the 50’ tape measure? You can answer these questions just by looking at the data. With a computer, you can give commands that would answer the same questions quickly in a million-record database. Facility Number links the two tables. * Graphic from Haag, et al.
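To make the field/record/table vocabulary concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table layouts, the second facility ("Vega"), and all part names and costs other than the tape measure are invented for illustration; they are not the contents of the slide's graphic. Each column is a field, each inserted row is a record, and `facility_number` is the link between the two tables.

```python
import sqlite3

# Hypothetical sample data: the slide's actual tables are not reproduced here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE facility (
    facility_number INTEGER PRIMARY KEY,   -- a field
    facility_name   TEXT NOT NULL
);
CREATE TABLE part (
    part_number     INTEGER PRIMARY KEY,
    part_name       TEXT NOT NULL,
    cost            REAL NOT NULL,
    facility_number INTEGER NOT NULL REFERENCES facility (facility_number)
);
INSERT INTO facility VALUES (1, 'Orion'), (2, 'Vega');   -- two records
INSERT INTO part VALUES
    (100, '50'' tape measure', 11.90, 1),
    (101, 'Claw hammer',       15.25, 1),
    (102, 'Cordless drill',    89.00, 2);
""")

# How many different parts are in the Orion facility?
orion_part_count = conn.execute(
    """
    SELECT COUNT(*)
    FROM part JOIN facility USING (facility_number)
    WHERE facility_name = 'Orion'
    """
).fetchone()[0]

# Which facility has the most expensive part?
priciest_facility = conn.execute(
    """
    SELECT facility_name
    FROM part JOIN facility USING (facility_number)
    ORDER BY cost DESC
    LIMIT 1
    """
).fetchone()[0]

print(orion_part_count, priciest_facility)
```

The same two queries would run unchanged against a million-record table; only the data volume differs.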
Follow the links: Which employees have delivered to John Yu? * Graphic from Haag, et al.
The Keys make the Relationship* * Graphic from Haag, et al.
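As a rough illustration of how keys carry the relationship, the sketch below follows the links from a customer to the employees who delivered to that customer. The three tables, their column names, and every row except the name John Yu are hypothetical stand-ins for the graphic. The last step shows the DBMS refusing a delivery that points at a nonexistent employee, assuming SQLite with foreign key enforcement switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE delivery (
    delivery_id INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee (employee_id),  -- foreign key
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id)   -- foreign key
);
INSERT INTO employee VALUES (1, 'Employee A'), (2, 'Employee B'), (3, 'Employee C');
INSERT INTO customer VALUES (10, 'John Yu'), (11, 'Customer B');
INSERT INTO delivery VALUES (500, 1, 10), (501, 2, 11), (502, 3, 10);
""")

# Follow the links: customer -> delivery -> employee.
deliverers = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT e.name
        FROM customer AS c
        JOIN delivery AS d ON d.customer_id = c.customer_id
        JOIN employee AS e ON e.employee_id = d.employee_id
        WHERE c.name = 'John Yu'
        ORDER BY e.name
        """
    )
]

# A delivery that references employee 99 (who does not exist) should be rejected.
try:
    conn.execute("INSERT INTO delivery VALUES (503, 99, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(deliverers, orphan_rejected)
```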
DBMS Components* What is data definition? Defining the structure of the database. What is data manipulation? CRUD (next slide), views, reports, QBE, SQL. What is application generation? Making forms, reports, and menus (and other stuff) to display the results of data manipulation. What is data administration? Backup/recovery, security, query optimization, reorganization (physical arrangement), concurrency control, change analysis. * Graphic from Haag, et al.
CRUD. Database software must enable and manage four types of access to the data: create, read, update, and delete. These are all data manipulation concepts. Create means to add new records. Read means to extract data. Update means to change existing records. Delete means to remove records from tables.
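The four CRUD operations map directly onto four SQL statements. The following is a minimal sketch using Python's `sqlite3` module; the `customer` table and its single row are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Create: add a new record.
conn.execute("INSERT INTO customer VALUES (?, ?, ?)", (1, "Customer A", "Bellingham"))

# Read: extract data.
after_create = conn.execute(
    "SELECT name, city FROM customer WHERE customer_id = ?", (1,)
).fetchone()

# Update: change an existing record.
conn.execute("UPDATE customer SET city = ? WHERE customer_id = ?", ("Seattle", 1))
after_update = conn.execute(
    "SELECT city FROM customer WHERE customer_id = ?", (1,)
).fetchone()[0]

# Delete: remove the record from the table.
conn.execute("DELETE FROM customer WHERE customer_id = ?", (1,))
remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print(after_create, after_update, remaining)
```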
DBMS Concepts. Query languages: SQL (Structured Query Language) and QBE (Query by Example). Data dictionary: describes the fields in tables, including name, type of data (text, numeric, …), size, key or not, and valid values.
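Most DBMS products can report their own data dictionary. As one example, SQLite exposes it through `PRAGMA table_info`, which the sketch below reads into a small list of field descriptions. The `part` table is hypothetical, and other DBMS products expose the same information through different catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE part (
    part_number INTEGER PRIMARY KEY,
    part_name   VARCHAR(40) NOT NULL,
    cost        REAL CHECK (cost >= 0)   -- a valid-values rule
)
""")

# Each row of table_info is (cid, name, declared type, notnull, default, pk).
data_dictionary = [
    {"name": name, "type": col_type, "required": bool(notnull), "key": pk > 0}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(part)"
    )
]

for field in data_dictionary:
    print(field)
```

Name, type, declared size, and key status all appear here. The valid-values rule (the `CHECK`) is stored with the table definition rather than listed by this particular pragma.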
Data Warehouse* “A repository of historical data that are organized by subject to support decision makers in the organization.” (Rainer et al., p. 111) DW data come from many sources; some might even come from outside the organization. ETL: Extract, Transform, Load. Extract: pull data from various sources. Transform: summarize, deal with missing entries, and combine data from related tables (e.g., add facility names to the part data table). Load: put the data in tables that are easily recalled. * Graphic from Haag, et al.
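The three ETL steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the source rows, the choice to drop (rather than estimate) a record with a missing cost, and the `part_fact` table name. The transform step does what the slide's example describes, adding facility names to the part data.

```python
import sqlite3

# Extract: hypothetical rows pulled from two different sources.
facility_rows = [(1, "Orion"), (2, "Vega")]
part_rows = [
    {"part": "Tape measure", "cost": "11.90", "facility_number": 1},
    {"part": "Claw hammer", "cost": None, "facility_number": 1},  # missing entry
    {"part": "Cordless drill", "cost": "89.00", "facility_number": 2},
]


def transform(parts, facilities):
    """Drop rows with a missing cost and attach the facility name to each part."""
    names = dict(facilities)
    cleaned = []
    for row in parts:
        if row["cost"] is None:
            continue  # one simple policy for missing entries; others are possible
        cleaned.append(
            (row["part"], float(row["cost"]), names[row["facility_number"]])
        )
    return cleaned


def load(rows, conn):
    """Put the transformed rows into a single, easily queried warehouse table."""
    conn.execute("CREATE TABLE part_fact (part TEXT, cost REAL, facility_name TEXT)")
    conn.executemany("INSERT INTO part_fact VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
load(transform(part_rows, facility_rows), warehouse)
loaded = warehouse.execute(
    "SELECT part, facility_name FROM part_fact ORDER BY part"
).fetchall()
print(loaded)
```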
Characteristics of Data Warehouse Data: organized by business dimension or subject; consistent coding; historical (but might be recent); nonvolatile (but might be refreshed); OLAP, not OLTP; multidimensional; drawn from DB and other sources.
DW vis-à-vis DB * Figure 4.9 from the Rainer et al., textbook.
Data Marts * Graphic from Haag, et al.
Data Mining. What do query and reporting tools do? Statistics-type inquiries. What kinds of questions can you answer with Q&R tools? How much? Where? What do intelligent agents do? Information discovery. What kinds of questions can you answer with intelligent agents? What patterns and associations exist? What do multi-dimensional analysis tools do? Slice and dice (previous slide). What kinds of questions can you answer with MDA tools? How is x related to y and z? * Graphic from Haag, et al.
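"Slice and dice" can be shown with plain Python: fix a value on one dimension (slice), then total along the dimensions that remain. The sales facts, the three dimension names, and the `roll_up` helper are all hypothetical; real multi-dimensional analysis tools do this interactively over far larger data.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, amount).
sales = [
    ("West", "Hammer", "Q1", 100),
    ("West", "Drill", "Q1", 250),
    ("East", "Hammer", "Q1", 80),
    ("West", "Hammer", "Q2", 120),
    ("East", "Drill", "Q2", 300),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, by, where=None):
    """Total the amounts grouped by the `by` dimensions, keeping facts that match `where`."""
    where = where or {}
    totals = defaultdict(int)
    for *dims, amount in facts:
        record = dict(zip(DIMENSIONS, dims))
        if all(record[key] == value for key, value in where.items()):
            totals[tuple(record[d] for d in by)] += amount
    return dict(totals)


# How is the amount related to region?
by_region = roll_up(sales, by=("region",))

# Slice on region = West, then break the slice down by product.
west_by_product = roll_up(sales, by=("product",), where={"region": "West"})

print(by_region)
print(west_by_product)
```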
Data Warehouse Considerations. Do you need a data warehouse? Consider the expense, what you already have, and whether you can support it. Do you already have a data warehouse? You might already have the capabilities you need. Who will the users be? What do they want to do? How sophisticated are they? Who gets access to what data? This is a critical question. How up-to-date must the data be? How do you decide this? What data mining tools do you need? Query and reporting, intelligent agents, multi-dimensional analysis. What is a data mart? A subset of a data warehouse.
Knowledge Management. “Knowledge is information that is contextual, relevant, and actionable.” (Rainer et al., p. 121) Explicit knowledge: objective, rational, technical. Tacit knowledge: subjective or experiential.
Knowledge Management Cycle * Figure 4.13 from the Rainer et al., textbook.
Information Resource Management Considerations. How will changes in technology affect what we are doing? “Cooler” isn’t always better. What type of DBMS is most appropriate? This is more a question of scale (and expense) than a question of technology. Who should oversee the organization’s information? CIO (chief information officer), DA (data administrator), DBA (database administrator). Who owns the information? Who cares? How “clean” must the information be? What are the ethical concerns? What data should we keep? Who should have access to it?
References Haag, Cummings, and McCubbrey, Management Information Systems for the Information Age (5th Edition), McGraw-Hill Irwin, 2005. Rainer, Turban, and Potter, Introduction to Information Systems: Supporting and Transforming Business, Wiley, 2007.