CHAPTER 5 Data and Knowledge Management
CHAPTER OUTLINE 5.1 Managing Data 5.2 Big Data 5.3 The Database Approach 5.4 Database Management Systems 5.5 Data Warehouses and Data Marts 5.6 Knowledge Management
LEARNING OBJECTIVES 1. Discuss ways that common challenges in managing data can be addressed using data governance. 2. Define Big Data, and discuss its basic characteristics. 3. Explain how to interpret the relationships depicted in an entity-relationship diagram. 4. Discuss the advantages and disadvantages of relational databases.
Learning Objectives (continued) 5. Explain the elements necessary to successfully implement and maintain data warehouses. 6. Describe the benefits and challenges of implementing knowledge management systems in organizations.
Chapter Opening Case Big Data! The data deluge is here
Chapter Opening Case (continued) Big Data and HR
Chapter Opening Case (continued) Big Data and product development
Chapter Opening Case (continued) Big Data and operations
Chapter Opening Case (continued) Big Data and marketing
5.1 Managing Data Topics: The Difficulties of Managing Data; Data Governance. Difficulties in managing data: The amount of data is increasing exponentially. Data are scattered throughout organizations and collected by many individuals using various methods and devices. Data come from many sources. Data security, quality, and integrity are critical.
Difficulties in Managing Data The Data Deluge Difficult to manage data for many reasons: Amount of data increasing exponentially over time; Data are scattered throughout organizations; Data obtained from multiple internal and external sources; Data degrade over time; Data subject to data rot; Data security, quality, and integrity are critical, yet easily jeopardized; Information systems that do not communicate with each other can result in inconsistent data; Federal regulations.
Data Governance Data governance is an approach to managing information across an entire organization. Master data management is a process that spans all of an organization’s business processes and applications. Master data are a set of core data that span all of an enterprise’s information systems.
Data Governance (continued) This image shows where data governance and master data management fit into the organization’s IT governance.
Master Data Management Example transaction: John Stevens registers for Introduction to Management Information Systems (ISMN 3140) from 10 AM until 11 AM on Mondays and Wednesdays in Room 41 Smith Hall, taught by Professor Rainer. Transaction data (with the corresponding master data in parentheses): John Stevens (Student); Intro to Management Information Systems (Course); ISMN 3140 (Course No.); 10 AM until 11 AM (Time); Mondays and Wednesdays (Weekday); Room 41 Smith Hall (Location); Professor Rainer (Instructor).
5.2 Big Data
Annual Flood of Data from: Credit card swipes; RFID tags; Digital video surveillance; E-mails; Blogs; Digital video; Radiology scans; Online TV
Annual Flood of New Data! In the zettabyte range. A zettabyte is 1000 exabytes. According to the annual survey of the global digital output by International Data Corporation, the total amount of global data was expected to pass 1.2 zettabytes sometime during 2010. This is equivalent to the amount of data that would be generated by everyone in the world posting messages on Twitter continuously for a century.
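To make the scale concrete, the unit arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (SI) units, where one exabyte is 10^18 bytes, and the variable names are chosen here for clarity.

```python
# Decimal (SI) storage units, expressed in bytes.
EXABYTE = 10**18
ZETTABYTE = 1000 * EXABYTE  # a zettabyte is 1000 exabytes, i.e. 10**21 bytes

# The IDC estimate quoted on the slide: 1.2 zettabytes during 2010.
global_data_2010 = 1.2 * ZETTABYTE

print(f"{global_data_2010 / EXABYTE:.0f} exabytes")  # 1200 exabytes
print(f"{global_data_2010:.1e} bytes")               # 1.2e+21 bytes
```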
5.3 The Database Approach Database management systems (DBMSs) minimize the following problems: Data redundancy Data isolation Data inconsistency Data redundancy: The same data are stored in many places. Data isolation: Applications cannot access data associated with other applications. Data inconsistency: Various copies of the data do not agree.
Database Approach (continued) DBMSs maximize the following: Data security Data integrity Data independence Data security: Keeping the organization’s data safe from theft, modification, and/or destruction. Data integrity: Data must meet constraints (e.g., student grade point averages cannot be negative). Data independence: Applications and data are independent of one another; that is, they are not linked to each other, so all applications are able to access the same data.
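The grade point average constraint mentioned under data integrity can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed design: the table name, column names, and sample rows are hypothetical, and SQLite simply stands in for whatever DBMS an organization uses.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        gpa        REAL CHECK (gpa >= 0)  -- integrity rule: GPA cannot be negative
    )
    """
)

# A valid row is accepted (hypothetical sample data).
conn.execute("INSERT INTO student VALUES (1, 'Student A', 3.4)")

# A row that violates the constraint is rejected by the DBMS itself.
try:
    conn.execute("INSERT INTO student VALUES (2, 'Student B', -1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```

Because the rule lives in the database rather than in each application, every program that writes to the table is held to the same constraint.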
Database Management Systems
Data Hierarchy Bit Byte Field Record File (or table) Database A bit is a binary digit, or a “0” or a “1”. A byte is eight bits and represents a single character (e.g., a letter, number or symbol). A field is a group of logically related characters (e.g., a word, small group of words, or identification number). A record is a group of logically related fields (e.g., student in a university database). A file is a group of logically related records. A database is a group of logically related files.
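The levels of the hierarchy can be mirrored with ordinary Python values. This is a rough sketch for intuition only: the student IDs and the second student are made-up sample data, and a real DBMS stores these structures very differently.

```python
# Bit and byte: the character "A" is stored as one byte, i.e. eight bits.
char = "A"
byte_value = char.encode("ascii")[0]  # integer code of the character
bits = format(byte_value, "08b")      # the same byte written as eight bits

# Field: a group of logically related characters.
course_no = "ISMN 3140"

# Record: a group of logically related fields describing one entity.
student_record = {"student_id": 1, "name": "John Stevens", "course_no": course_no}

# File (table): a group of logically related records.
student_file = [
    student_record,
    {"student_id": 2, "name": "Student B", "course_no": course_no},
]

# Database: a group of logically related files.
database = {"students": student_file}

print(bits, len(bits))  # 01000001 8
```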
Hierarchy of Data for a Computer-Based File
Data Hierarchy (continued) Bit (binary digit) Byte (eight bits)
Data Hierarchy (continued) Example of Field and Record
Designing the Database Data model Entity Attribute Primary key Secondary keys The data model is a diagram that represents the entities in the database and their relationships. An entity is a person, place, thing, or event about which information is maintained. A record generally describes an entity. An attribute is a particular characteristic or quality of a particular entity. The primary key is a field that uniquely identifies a record. Secondary keys are other fields that have some identifying information but typically do not identify the record with complete accuracy.
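The difference between a primary key and a secondary key can be sketched with Python's sqlite3 module. The table, column names, and rows below are hypothetical; here `student_id` plays the role of the primary key and `major` is treated as a secondary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
# An index on the secondary key speeds up lookups but does not make it unique.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Student A", "MIS"), (2, "Student B", "MIS"), (3, "Student C", "Finance")],
)

# Primary key: identifies exactly one record.
by_key = conn.execute(
    "SELECT name FROM student WHERE student_id = ?", (2,)
).fetchall()

# Secondary key: narrows the search but may match several records.
by_major = conn.execute(
    "SELECT name FROM student WHERE major = ? ORDER BY student_id", ("MIS",)
).fetchall()

print(by_key)    # [('Student B',)]
print(by_major)  # [('Student A',), ('Student B',)]
```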
Entity-Relationship Modeling Database designers plan the database design in a process called entity-relationship (ER) modeling. ER diagrams consist of entities, attributes, and relationships. Entity classes Instance Identifiers Entity classes are groups of entities of a certain type. An instance of an entity class is the representation of a particular entity. Entity instances have identifiers, which are attributes that are unique to that entity instance.
Entity-Relationship Diagram Model
5.4 Database Management Systems Database management system (DBMS) Relational database model Structured Query Language (SQL) Query by Example (QBE) A database management system is a set of programs that provide users with tools to add, delete, access, and analyze data stored in one location. The relational database model is based on the concept of two-dimensional tables. Structured Query Language allows users to perform complicated searches by using relatively simple statements or keywords. Query by Example allows users to fill out a grid or template to construct a sample or description of the data they want.
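A short SQL query shows how a few keywords (SELECT, FROM, WHERE, ORDER BY) express a search over a two-dimensional table. The sketch below runs the query through Python's sqlite3 module; the table, columns, and grade point averages are invented for illustration and are not taken from the student database on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Student A", 3.9), (2, "Student B", 2.8), (3, "Student C", 3.5)],
)

# SELECT names the columns, FROM the table, WHERE the condition.
query = "SELECT name, gpa FROM student WHERE gpa >= ? ORDER BY gpa DESC"
high_gpa = conn.execute(query, (3.4,)).fetchall()

print(high_gpa)  # [('Student A', 3.9), ('Student C', 3.5)]
```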
Student Database Example
Normalization Normalization is a method for analyzing and reducing a relational database to its most streamlined form for: Minimum redundancy; Maximum data integrity; Best processing performance. When data are normalized, the attributes in each table depend only on the primary key.
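The idea can be sketched in plain Python: a flat, non-normalized list repeats the customer name on every order, and splitting it into two tables stores each fact once. The order and customer data below are hypothetical and are not the example shown in the figures on the following slides.

```python
# Non-normalized relation: customer_name is repeated on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Customer One", "item": "Pizza"},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Customer One", "item": "Salad"},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Customer Two", "item": "Pizza"},
]

# Normalized form: customer attributes depend only on customer_id,
# order attributes depend only on order_id.
customers = {}
orders = []
for row in flat_orders:
    customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
    orders.append(
        {"order_id": row["order_id"], "customer_id": row["customer_id"], "item": row["item"]}
    )

def order_with_customer(order):
    """Join an order back to its customer using the customer_id key."""
    return {**order, **customers[order["customer_id"]]}

rebuilt = [order_with_customer(order) for order in orders]
print(len(customers), rebuilt == flat_orders)  # 2 True
```

Each customer name is now stored once, so a name change touches a single entry, and the original rows can still be rebuilt by joining on the key.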
Non-Normalized Relation
Normalizing the Database (part A)
Normalizing the Database (part B)
Normalization Produces Order
5.5 Data Warehouses and Data Marts A data warehouse is a repository of historical data organized by subject to support decision makers in the organization. Characteristics: Organized by business dimension or subject. Multidimensional. Historical. Use online analytical processing. Figure: A Data Cube. The data cube has three dimensions: customer, product, and time.
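A three-dimensional data cube can be approximated in Python as a dictionary keyed by (customer, product, time). This is a simplified sketch with made-up sales figures; the function name `cube_total` is hypothetical, and real OLAP tools are considerably more elaborate.

```python
from collections import defaultdict

# Hypothetical sales facts: (customer, product, year, amount).
sales = [
    ("Customer A", "Nuts", 2012, 50),
    ("Customer A", "Bolts", 2012, 40),
    ("Customer B", "Nuts", 2012, 60),
    ("Customer A", "Nuts", 2013, 70),
    ("Customer B", "Bolts", 2013, 30),
]

# Build the cube: one cell per (customer, product, year) combination.
cube = defaultdict(int)
for customer, product, year, amount in sales:
    cube[(customer, product, year)] += amount

def cube_total(cube, customer=None, product=None, year=None):
    """Sum the cells that match the given dimensions; None means 'all'."""
    total = 0
    for (c, p, y), amount in cube.items():
        if customer is not None and c != customer:
            continue
        if product is not None and p != product:
            continue
        if year is not None and y != year:
            continue
        total += amount
    return total

print(cube_total(cube, product="Nuts"))                    # 180
print(cube_total(cube, customer="Customer A", year=2012))  # 90
print(cube_total(cube))                                    # 250
```

Fixing one or two dimensions and summing over the rest is the kind of slicing that online analytical processing performs on a multidimensional warehouse.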
Data Warehouse Framework & Views
Relational Databases
Multidimensional Database
Equivalence Between Relational and Multidimensional Databases
Benefits of Data Warehousing End users can access data quickly and easily via Web browsers because the data are located in one place. End users can conduct extensive analysis with data in ways that may not have been possible before. End users have a consolidated view of organizational data.
Data Marts A data mart is a small data warehouse, designed for the end-user needs in a strategic business unit (SBU) or a department.
5.6 Knowledge Management Knowledge management (KM) Knowledge Intellectual capital (or intellectual assets) Knowledge management is a process that helps organizations manipulate important knowledge that is part of the organization’s memory, usually in an unstructured format. Knowledge is information that is contextual, relevant, and actionable. Intellectual capital is another term often used for knowledge.
Knowledge Management (continued) Explicit Knowledge (above the waterline) Tacit Knowledge (below the waterline) Explicit knowledge: objective, rational, technical knowledge that has been documented. Examples: policies, procedural guides, reports, products, strategies, goals, core competencies Tacit knowledge: cumulative store of subjective or experiential learning. Examples: experiences, insights, expertise, know-how, trade secrets, understanding, skill sets, and learning
Knowledge Management (continued) Knowledge management systems (KMSs) Best practices Knowledge management systems refer to the use of information technologies to systematize, enhance, and expedite intrafirm and interfirm knowledge management. Best practices are the most effective and efficient ways of doing things.
Knowledge Management System Cycle Create knowledge Capture knowledge Refine knowledge Store knowledge Manage knowledge Disseminate knowledge
Knowledge Management System Cycle