Carnegie Mellon University ©2006 - 2008 Robert T. Monroe 45-875 BI Tools and Techniques Business Intelligence Tools and Techniques Robert Monroe March.

Presentation transcript:

Carnegie Mellon University © Robert T. Monroe BI Tools and Techniques Business Intelligence Tools and Techniques Robert Monroe March 18, 2008

Agenda
Quick survey
Overview of Business Intelligence Tools and Techniques
Course structure, grading, and expectations
Data management fundamentals

Survey
Please complete and hand back the survey
Survey helps me to:
  – Understand your goals and expectations for the course
  – Evaluate your previous IT knowledge and experience
  – … adjust the class accordingly

Introducing Business Intelligence Tools and Techniques

Corporations Are Drowning in Data … but thirsty for actionable knowledge
Our ability to collect and store data seems to have surpassed our ability to make sense of it!
Important trends:
  – Storage capacity continues to rise rapidly
  – Cost of storage continues to drop

Business Intelligence
Core question: How can an organization manage and leverage large data sets to make better business decisions?
Business Intelligence (BI)
  – A broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions. (Wikipedia)
Two common uses for BI tools
  – Measuring where you are / how your business is performing
  – Identifying problems and opportunities

Business Intelligence Systems Improve Decision Making
Source: O’Brien, Management Information Systems, 6th ed.

In-Class Exercise
Take out a piece of paper and pencil
Select a company that you are familiar with and a managerial role in that company
Write down five pieces of quantitative information that you would most want to have to manage your business (or your part of the business) effectively

A Business Intelligence Roadmap

Module 1: Course Intro, Data Management Fundamentals
What is Business Intelligence?
  – How can it help me make better business decisions?
  – What kinds of questions can BI tools help me answer?
What is the relationship between data, information, & knowledge?
What does it mean to ‘Compete on Analytics’?
  – Why would I want to do so?
  – How might I do so effectively?
[Diagram: Data, Info, Knowledge]

Module 2: Data Warehousing
What is a Data Warehouse?
  – How about a Data Mart?
  – How is a Data Warehouse different from a ‘regular’ database?
Why do we need another database that just duplicates data that we already have?
How can we fill a data warehouse with comprehensive, timely, and high-quality data?

Module 3: Reporting and OLAP
How do I convert the data in my data warehouse into actionable information or knowledge?
What tools are available to help non-programmers analyze warehouse data?
What is dimensional modeling? Why is it powerful?
What kinds of questions are OLAP tools designed to answer?

Module 4: Info Viz and Data Mining
What tools are available to help me visualize very large data sets?
Why would (or wouldn’t) I want to use visual tools to explore my data?
What do data mining tools do?
What different kinds of data mining tools and techniques are available?
How do I tell which tools are appropriate for which situations?

Module 5: Dashboards
What is an executive dashboard?
  – Are they only for executives?
  – Why are they useful?
  – What are their drawbacks?
How can I implement dashboards effectively in my organization?

Module 6: ‘Real-Time’ Business Intelligence
How can we move from historical analysis to ‘real-time’ analysis?
Why is this hard to do in practice?
What tools and techniques are available to support real-time analysis?

Module 7: Implementing BI, Ethical use of BI
What does my organization need to do to implement a successful BI program?
What ethical issues arise with BI capabilities?
How can we ensure that our BI capabilities are used ethically?
  – What does it mean to do so?

Dashboard: Expected Effort
First two weeks focus on BI foundation
  – Eat your vegetables, exercise more
Middle classes focus on using various BI tools effectively
  – Use the tools, Luke
Final classes combine fundamentals, tools, people, processes, and ethics
  – Pull it all together

Course Structure, Grading, and Expectations

Course Goals
Understand how to apply various Business Intelligence (BI) tools and techniques to analyze and evaluate large data sets to make better business decisions
Understand the benefits, drawbacks, and applicability of various approaches to BI
Improve awareness of a variety of challenges and ‘gotchas’ that arise when implementing BI systems
  – … and how to avoid them

Course Philosophy
Focus on applying BI technology to solve business problems, not building BI systems
You will develop new skills by doing and participating
  – You will need to use the BI tools
  – When in doubt try something, experiment
  – Most work done in teams – learn from/with your peers
  – Casual interactive class – your participation is important
Many of the technologies we will look at are relatively new
  – Not everything will work perfectly the first time…
  – Flexibility, patience, and a willingness to explore will help a lot
Let’s have some fun – life’s too short to do otherwise

Expectations, Etiquette, and Academic Integrity
Waitlist
Office hours, 3:30 – 4:30 MWF
Expectations and etiquette
Academic integrity
Teaching Assistant
  – Bao-Jun Jiang

Pass/Fail
I allow students to take the course pass/fail provided that they agree to:
  – Attend class regularly
  – Prepare for class as if they were taking it for a grade
  – Complete all of the assignments
  – Take the final exam at its regular time and place
  – Complete all of the necessary administrative paperwork

Blackboard And The Course Wiki
Blackboard is used only for archival postings
Most information is posted on the class wiki
  – Read permissions open to everybody, need to register to get write permissions
  – Contact Bao-Jun if you have not received an invitation by this evening
Wiki participation is strongly encouraged
  – Participation on wiki counts towards course participation grade
  – Add interesting and useful things that you find to the wiki
  – Wiki will remain available as a resource after course ends
  – Please don’t mess with things like assignments, due dates, etc.

Grading
Grades will be computed as follows:
  – Homework exercises (3): 45%
  – Final exam: 30%
  – Class attendance, preparation, and participation: 25%
Late assignments policy: 25% deduction each day late
I curve final grades, not individual assignments
Please see regrade request policy in syllabus document

Assignments
Three homework assignments
  – Groups of 2-4 people
Assignment #1: Data warehousing
  – Analyze data warehousing scenario and make business, technology, and process recommendations based on your analysis (management option)
  – Create a simple data warehouse and ETL process to load it (tech option)
Assignment #2: Reporting and OLAP tools
  – Use Microsoft’s Reporting and/or OLAP tools to retrieve, analyze, and present useful information from a data warehouse and OLAP cubes
Assignment #3: Case analysis, dashboards or visualizations
  – Case analysis – Continental or SYSCO cases (management option)
  – Analyze scenario/case and design dashboard(s) and/or data visualizations to meet business needs (tech option)

Computing Resources
There are many good BI platforms
We will primarily use Microsoft’s SQL Server 2005
  – Client tools
  – Reporting Services
  – Analysis Services
  – Integration Services (ETL tool – optional)
We will also experiment with a variety of other BI tools
You must provide a laptop that can run the SQL Server 2005 client
  – At least client tools, servers are optional
  – 600 MHz processor, 512 MB of RAM, 0.5–2.0 GB of disk space
  – Install instructions are available on Blackboard
  – Please try to install SQL Server 2005 client tools before Monday’s class

Data Management Fundamentals

Definitions
What is the difference between data, information, and knowledge?
Data is a collection of raw value elements or facts used for calculating, reasoning, or measuring. Data may be collected, stored, or processed but not put into a context from which any meaning can be inferred. [Los03]
Information is the result of collecting and organizing data in a way that establishes relationships between data items, which thereby provides context and meaning. [Los03]
Knowledge is information to which experience, interpretation, and reflection are added by individuals so that it becomes a high-value form of information
  – The OR Society

Exercise
[Slide figure: Closing Stock Prices for MSFT, INTC, CSCO, and IBM on 3/21/05 and 3/22/05, shown both as scattered, unlabeled values ($27.74, $19.78, $21.41, $83.81, $27.01, $19.72, $21.50, $84.24) and as a table of dates and prices]

Goal: Convert Data to (Actionable) Knowledge
[Diagram: Data, Info, Knowledge, in order of increasing value]
Why is this so hard to do in practice?

Challenge: What To Capture and Store?
The amount of data that can be captured is enormous
  – Storing data is relatively cheap (at the margin)
  – Structuring and retrieving data is relatively expensive
  – Converting large data sets to actionable knowledge tends to be relatively challenging and expensive
Rules of thumb for deciding what to capture and store
  – Start with what you want to get out and work backwards
  – Evaluate what is already available
  – Ensure that you capture high-quality data
  – Analyze fundamental data requirements for the enterprise, independent of the specific project at hand

Exercise: What To Capture And Store
Scenario 1: You are a marketing VP for a large chain food retailer. You need to figure out how to properly price and promote a specific brand of snack chips over the next year
What questions do you need to ask?
What analyses would you like to do to answer them?
What data will you need to do these analyses?
Where will you get that data?
  – Is your organization likely to already have all the data that you need?
  – Are there other data sources that you should try to take advantage of and incorporate into your analyses?

Exercise: What To Capture And Store
Scenario 2: You are an executive at Ferrari who needs to decide how to allocate the latest and greatest sports car your company is introducing in six months to maximize your company’s profits long-term
What questions do you need to ask?
What analyses would you like to do to answer them?
What data will you need to do these analyses?
Where will you get that data?
  – Is your organization likely to already have all the data that you need?
  – Are there other data sources that you should try to take advantage of and incorporate into your analyses?

Exercise: What To Capture And Store
Scenario 3: You are an HR executive responsible for recruiting salespeople. Your bonus each year is directly tied to how well the salespeople you bring in do in their first three years at your company. You’ve read Moneyball and Competing on Analytics, and you want to take a more analytic approach to your job
What questions do you need to ask?
What analyses would you like to do to answer them?
What data will you need to do these analyses?
Where will you get that data?
  – Is your organization likely to already have all the data that you need?
  – Are there other data sources that you should try to take advantage of and incorporate into your analyses?

The Relational Data Model
The Relational Model has become the de facto standard for managing operational business data
Core concepts in a relational model:
  – Tables (relations)
  – Records (rows)
  – Data fields (columns)
  – Primary keys
  – Foreign keys
Products
Product ID | Description | Color | Size | Qty Available
52 | Shoes (pair) | Blue | 10 | …
145 | Socks (pair) | White | Large | …
62 | Blouse | Green | 7 | …
12 | Pants | Blue | 32/34 | 0
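
To make primary and foreign keys concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the SQL Server tools used in the course. The table and column names follow the slides; the 'Hat' product, order 9001, and product 999 are hypothetical rows added only to trigger the constraint checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE Products (
        ProductID   INTEGER PRIMARY KEY,
        Description TEXT,
        Color       TEXT,
        Size        TEXT
    )""")
conn.execute("""
    CREATE TABLE Purchases (
        OrderID      INTEGER PRIMARY KEY,
        CustomerName TEXT,
        ProductID    INTEGER REFERENCES Products(ProductID),
        Quantity     INTEGER
    )""")

conn.execute("INSERT INTO Products VALUES (52, 'Shoes (pair)', 'Blue', '10')")
conn.execute("INSERT INTO Purchases VALUES (5623, 'Jimmy Hwang', 52, 3)")

def try_insert(sql):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Hypothetical rows: a second product with ID 52, and an order for a product that does not exist.
duplicate_key_ok = try_insert("INSERT INTO Products VALUES (52, 'Hat', 'Red', 'M')")
dangling_ref_ok = try_insert("INSERT INTO Purchases VALUES (9001, 'Test Customer', 999, 1)")
print(duplicate_key_ok, dangling_ref_ok)
```

The primary key rejects the duplicate product, and the foreign key rejects the purchase that points at a missing product.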

Data, Information, Database Example
Data in Database Tables
Purchases
Order ID | Customer Name | Product ID | Quantity | Date
5623 | Jimmy Hwang | 52 | 3 | 12/15/2004
… | Sue Smith | 64 | 5 | 12/16/…
… | Jane Chen | 145 | 1 | 12/16/2004
Products
Product ID | Description | Color | Size | Qty Available
52 | Shoes (pair) | Blue | 10 | …
145 | Socks (pair) | White | Large | …
62 | Blouse | Green | 7 | …
12 | Pants | Blue | 32/34 | 0
Information
Jimmy Hwang purchased 3 pairs of size 10 shoes on 12/15/2004
What other information can we derive from these data tables?
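
One way to see how the "information" statement falls out of the two tables is a join on Product ID. The sketch below uses Python's sqlite3 module as a stand-in for the course tools and loads only rows whose values are legible in the slides. Order ID 5625 for Jane Chen is a placeholder, since the real value is not recoverable from the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Description TEXT, Color TEXT, Size TEXT);
    CREATE TABLE Purchases (OrderID INTEGER PRIMARY KEY, CustomerName TEXT,
                            ProductID INTEGER, Quantity INTEGER, Date TEXT);
    INSERT INTO Products VALUES (52, 'Shoes (pair)', 'Blue', '10');
    INSERT INTO Products VALUES (145, 'Socks (pair)', 'White', 'Large');
    INSERT INTO Purchases VALUES (5623, 'Jimmy Hwang', 52, 3, '12/15/2004');
    -- OrderID 5625 is a placeholder, not a value from the slides
    INSERT INTO Purchases VALUES (5625, 'Jane Chen', 145, 1, '12/16/2004');
""")

# Relating each purchase to its product supplies the context that turns data into information.
rows = conn.execute("""
    SELECT pu.CustomerName, pu.Quantity, pr.Size, pr.Description, pu.Date
    FROM Purchases AS pu
    JOIN Products AS pr ON pr.ProductID = pu.ProductID
    ORDER BY pu.OrderID
""").fetchall()

statements = [
    f"{name} purchased {qty} x {desc}, size {size}, on {date}"
    for name, qty, size, desc, date in rows
]
for s in statements:
    print(s)
```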

Relational Data, Tables, Records, and Metadata Example
Data (Records) in Database Tables
Purchases
Order ID | Customer Name | Product ID | Quantity | Date
5623 | Jimmy Hwang | 52 | 3 | 12/15/2004
… | Sue Smith | 64 | 5 | 12/16/…
… | Jane Chen | 145 | 1 | 12/16/2004
Products
Product ID | Description | Color | Size | Qty Available
52 | Shoes (pair) | Blue | 10 | …
145 | Socks (pair) | White | Large | …
62 | Blouse | Green | 7 | …
12 | Pants | Blue | 32/34 | 0
Metadata
Table Name: Products
  – ProductID: Int (pkey)
  – Description: Text(50)
  – Color: Text(50)
  – Size: Text(20)
  – QtyAvailable: Int
Table Name: Purchases
  – OrderID: Int (pkey)
  – CustomerName: Text(75)
  – ProductID: Int (fkey)
  – Quantity: Decimal
  – Date: DateTime
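
Metadata is itself queryable. As a rough illustration, the sketch below declares the Products table in SQLite (via Python's sqlite3 module) and reads its column definitions back with `PRAGMA table_info`. The VARCHAR and INTEGER type names are an assumed mapping of the slide's Text(n) and Int; SQL Server exposes similar information through its own catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Products (
        ProductID    INTEGER PRIMARY KEY,
        Description  VARCHAR(50),
        Color        VARCHAR(50),
        Size         VARCHAR(20),
        QtyAvailable INTEGER
    )""")

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, primary-key position)
metadata = conn.execute("PRAGMA table_info(Products)").fetchall()
columns = [(name, decl_type, bool(pk)) for _cid, name, decl_type, _nn, _dflt, pk in metadata]
for col in columns:
    print(col)
```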

Normalization And Denormalization
Data normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations
  – Basic idea: each table only holds data about one ‘thing’
Goals of normalization include:
  – Minimize data redundancy
  – Simplify the enforcement of referential integrity constraints
  – Simplify data maintenance (inserts, updates, deletes)
  – Improve representation model to match “the real world”
Normalization sometimes hurts query performance

Example: Denormalized Table
Employee
Emp_ID | Name | Dept_Name | Salary | Course_Title | Date_Completed
100 | Margaret Simpson | Marketing | 48000 | SPSS | 6/19/…
100 | Margaret Simpson | Marketing | 48000 | Surveys | 10/7/…
140 | Alan Beeton | Accounting | 52000 | Tax Acc | 12/8/…
… | Chris Lucero | Info Systems | 43000 | SPSS | 1/12/…
… | Chris Lucero | Info Systems | 43000 | C++ | 4/22/…
… | Lorenzo Davis | Finance | … | |
… | Susan Martin | Marketing | 42000 | Java | 8/12/…
… | Susan Martin | Marketing | 42000 | SPSS | 6/19/2005
Insertion anomaly: when an employee takes a new class we need to add duplicate data (Name, Dept_Name, and Salary)
Deletion anomaly: if we remove employee 140, we lose information about the existence of a Tax Acc class
Modification anomaly: a salary increase for employee 100 forces an update of multiple records
These anomalies exist because two themes (entity types), course and employee, are combined into one relation, resulting in duplication and an unnecessary dependency between the entities
Example derived from Hoffer, Prescott, McFadden, Modern Database Management, 7th ed.
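
The modification and deletion anomalies can be reproduced in a few lines. This is a small sketch with Python's sqlite3 module, using only the rows for employees 100 and 140; the Date_Completed column is left out for brevity, and the new salary of 50000 is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employee (
    Emp_ID INTEGER, Name TEXT, Dept_Name TEXT, Salary INTEGER, Course_Title TEXT)""")
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?, ?)", [
    (100, "Margaret Simpson", "Marketing", 48000, "SPSS"),
    (100, "Margaret Simpson", "Marketing", 48000, "Surveys"),
    (140, "Alan Beeton", "Accounting", 52000, "Tax Acc"),
])

# Modification anomaly: one raise has to be written to every row for employee 100.
rows_touched = conn.execute(
    "UPDATE Employee SET Salary = 50000 WHERE Emp_ID = 100").rowcount

# Deletion anomaly: removing employee 140 also removes the only record of Tax Acc.
conn.execute("DELETE FROM Employee WHERE Emp_ID = 140")
courses_left = {title for (title,) in conn.execute("SELECT DISTINCT Course_Title FROM Employee")}
print(rows_touched, courses_left)
```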

Normalizing Previous Employee/Class Table
Course_Completion
Emp_ID | Course_ID | Date_Completed
100 | 1 | 6/19/…
… | … | …/7/…
… | … | …/8/…
… | … | …/12/…
… | … | …/22/…
… | … | …/19/…
… | … | …/12/2002
Employee
Emp_ID | Name | Dept_Name | Salary
100 | Margaret Simpson | Marketing | …
… | Alan Beeton | Accounting | …
… | Chris Lucero | … | …
… | Lorenzo Davis | Finance | …
… | Susan Martin | Marketing | 42000
Course
Course_ID | Course_Title
1 | SPSS
2 | Surveys
3 | Tax Acc
4 | C++
5 | Java
This seems more complicated
Why might this approach be superior to the previous one?
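
A short experiment suggests one answer. The sketch below (Python's sqlite3 module) builds the three normalized tables for employees 100 and 140, with their names, departments, and salaries taken from the denormalized example. Date_Completed is omitted for brevity and the new salary of 50000 is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept_Name TEXT, Salary INTEGER);
    CREATE TABLE Course (Course_ID INTEGER PRIMARY KEY, Course_Title TEXT);
    CREATE TABLE Course_Completion (
        Emp_ID    INTEGER REFERENCES Employee(Emp_ID),
        Course_ID INTEGER REFERENCES Course(Course_ID),
        PRIMARY KEY (Emp_ID, Course_ID));
    INSERT INTO Employee VALUES (100, 'Margaret Simpson', 'Marketing', 48000);
    INSERT INTO Employee VALUES (140, 'Alan Beeton', 'Accounting', 52000);
    INSERT INTO Course VALUES (1, 'SPSS'), (2, 'Surveys'), (3, 'Tax Acc');
    INSERT INTO Course_Completion VALUES (100, 1), (100, 2), (140, 3);
""")

# A raise now touches exactly one row.
rows_touched = conn.execute("UPDATE Employee SET Salary = 50000 WHERE Emp_ID = 100").rowcount

# Removing employee 140 no longer erases the Tax Acc course.
conn.execute("DELETE FROM Course_Completion WHERE Emp_ID = 140")
conn.execute("DELETE FROM Employee WHERE Emp_ID = 140")
courses_left = [t for (t,) in conn.execute("SELECT Course_Title FROM Course ORDER BY Course_ID")]

# The original wide view is still available through a join.
wide = conn.execute("""
    SELECT e.Emp_ID, e.Name, e.Salary, c.Course_Title
    FROM Employee e
    JOIN Course_Completion cc ON cc.Emp_ID = e.Emp_ID
    JOIN Course c ON c.Course_ID = cc.Course_ID
    ORDER BY c.Course_ID""").fetchall()
print(rows_touched, courses_left, wide)
```

The price is the extra joins needed to rebuild the wide view, which is the query-performance cost that normalization sometimes carries.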

Indexing
An index is a table or other data structure used to determine the location of rows in a file that satisfy some condition
Indices reduce the time needed to retrieve records
  – … but increase the time and cost to insert, update, or delete
Indexing is critical for high performance in large, complex databases
  – Especially data warehouses and data marts
Products
Product ID | Description | Color | Size
52 | Shoes (pair) | Blue | 10
145 | Socks (pair) | White | Large
62 | Blouse | Green | 7
12 | Pants | Blue | 32/34
532 | Skirt | Green | 7
… | … | … | …
Product_Index
Product ID | Row
… | …
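
The effect of an index on retrieval can be observed by asking the database how it plans to run a query. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server; the exact wording of the plan text varies between SQLite versions, so treat the printed strings as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductID INTEGER, Description TEXT, Color TEXT, Size TEXT)")
conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?)", [
    (52, "Shoes (pair)", "Blue", "10"),
    (145, "Socks (pair)", "White", "Large"),
    (62, "Blouse", "Green", "7"),
    (12, "Pants", "Blue", "32/34"),
    (532, "Skirt", "Green", "7"),
])

def plan(sql):
    """Return SQLite's query-plan description for a statement as one string."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT Description FROM Products WHERE ProductID = 62"
before = plan(query)   # without an index, every row has to be scanned
conn.execute("CREATE INDEX Product_Index ON Products (ProductID)")
after = plan(query)    # with the index, the matching row is looked up directly
print(before)
print(after)
```

Each later insert, update, or delete on Products must also maintain Product_Index, which is the write-side cost the slide mentions.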

Alternative Data Models
The relational data model is the current de facto standard for storing and managing corporate data
There are other data storage models, usually associated with legacy systems
  – The data you need for your analysis may be stored in them!
Four common alternative data models
  – Flat file
  – Hierarchical
  – Network
  – Object

Structured Query Language (SQL)
SQL provides a standard language for describing, manipulating, and querying data from relational databases
SQL allows applications to interact with databases without requiring a tight binding between the application and the underlying DBMS
All of the major relational database vendors implement some form of SQL in their database products
Example Query:
  SELECT ProductName, ProductPrice
  FROM Products
  WHERE SupplierName = 'Acme'
  ORDER BY ProductPrice DESC;
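
To try the example query without a database server, the following sketch runs it against an in-memory SQLite database from Python. The three product rows are hypothetical, and the supplier name is passed as a bound parameter instead of being pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, ProductPrice REAL, SupplierName TEXT)")
conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", [
    ("Anvil", 49.50, "Acme"),      # hypothetical rows, not from the slides
    ("Rocket", 120.00, "Acme"),
    ("Widget", 5.25, "Globex"),
])

supplier = "Acme"
result = conn.execute(
    """SELECT ProductName, ProductPrice
       FROM Products
       WHERE SupplierName = ?
       ORDER BY ProductPrice DESC""",
    (supplier,),
).fetchall()
print(result)
```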

Query Example
English: Find the 10 most expensive products that we stock
SQL:
  SELECT TOP 10 Products.ProductName AS TenMostExpensiveProducts, Products.UnitPrice
  FROM Products
  ORDER BY Products.UnitPrice DESC;
Query Results:
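
`TOP 10` is the SQL Server spelling of a top-N query. Other engines express the same idea differently; SQLite, for instance, uses a trailing `LIMIT` clause. A small sketch of that variant, run from Python with 15 made-up products:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, UnitPrice REAL)")
conn.executemany("INSERT INTO Products VALUES (?, ?)",
                 [(f"Product {i}", float(i)) for i in range(1, 16)])  # 15 hypothetical rows

# Sort by price, highest first, and keep only the first ten rows.
top_ten = conn.execute(
    """SELECT ProductName AS TenMostExpensiveProducts, UnitPrice
       FROM Products
       ORDER BY UnitPrice DESC
       LIMIT 10"""
).fetchall()
print(top_ten[0], top_ten[-1], len(top_ten))
```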

Transactional and Analytical Systems
Transactional systems: Systems that are used to run a business in real time, based on current data. Also called “systems of record”
Analytical systems: Systems designed to support decision making based on historical point-in-time and prediction data for complex queries or data mining applications
BI systems are generally analytical systems

Examples of Transactional and Analytical Systems
Transactional System Examples:
  – Supermarket checkout system
  – ATM machines
  – Purchase order processing
  – Student course registration
  – Warehouse/inventory tracker
  – Airline ticketing system
  – E-Z Pass
Analytical System Examples:
  – Data warehouses
  – Data marts
  – Enterprise spend analysis: where do we spend our $$$?
  – Sales force productivity analysis: by sales person, region, or product line
  – Product-line profitability analysis: which products are most profitable, and which do we lose money on?
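
The two workloads look quite different at the query level. As a loose illustration (Python's sqlite3 module; the Sales table, its columns, and the amounts in cents are all invented for this sketch), a checkout system writes one small row per sale, while a product-line profitability analysis reads the whole history in one aggregate query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Sales (
    SaleID INTEGER PRIMARY KEY, ProductLine TEXT, RevenueCents INTEGER, CostCents INTEGER)""")

# Transactional style: each checkout writes one small row as it happens.
checkouts = [("Snacks", 400, 250), ("Snacks", 600, 350), ("Dairy", 300, 340)]
for line, revenue, cost in checkouts:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO Sales (ProductLine, RevenueCents, CostCents) VALUES (?, ?, ?)",
            (line, revenue, cost))

# Analytical style: one query scans the accumulated history and aggregates it.
profit_by_line = conn.execute(
    """SELECT ProductLine, SUM(RevenueCents - CostCents) AS ProfitCents
       FROM Sales
       GROUP BY ProductLine
       ORDER BY ProfitCents DESC"""
).fetchall()
print(profit_by_line)
```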

Why Not Use Operational Data Stores For BI?
It is good practice to separate operational and analytical systems and data
Why?
  – To improve system performance
  – To improve database manageability and maintainability
  – To optimize each type of system for its primary purpose

Wrap Up

For Thursday
We will be discussing part 1 of Competing on Analytics
  – Reading assignment is available on the wiki
Come prepared to apply the concepts in part 1 of the book in class discussions to analyze how some well-known organizations might be able to improve their business by aggressively pursuing the principles of analytic excellence described in the book
  – Feel free to suggest organizations to discuss prior to class: I’ll be taking requests as I spin your favorite on-the-fly cases
  – Post suggestions for organizations to discuss in class, along with a brief description of why they would be interesting to discuss, to the course wiki by Wednesday evening.