Data Warehousing – An Introductory Perspective


Data Warehousing – An Introductory Perspective
DWCC BBSR

Agenda
- Why Data Warehouse
- Definition and Architecture
- Terminology

The Business Need
- Business decisions are not made by rolling dice.
- "I think... errrr, I guess so."
- We don't know what we don't know.

Current Business Environment
- Competitive
- Ever changing
- Chaotic
- Global
- Urgency to make decisions

Competitive advantage stems from well-informed decisions, based on an understanding of:
- Your products
- Your customers' preferences
- The competition
- Your own company's strengths

The Value Pyramid
Each layer provides value en route to a targeted business outcome:
- Increased revenue
- Increased productivity
- Reduced costs
- Competitive advantage

Definitions
- A collection of integrated, subject-oriented databases designed to support the DSS function, where each unit of data is relevant at some moment of time (Inmon 1991)
- A copy of transaction data specifically structured for query and analysis (Kimball 1996)

A data warehouse is NOT a specific technology. It is a series of processes, procedures and tools that help the enterprise understand more about itself, its products, its customers and the market it serves.
It is NOT possible to purchase a data warehouse, but it is possible to build one.

Features
- Non-volatile: used mainly for reporting and independent of the transactional data; once loaded, records are read rather than updated.
- Subject orientation: all relevant data is stored together. Examples: sales, finance, marketing, customer data.
- Historical data: can contain several years of data, depending on company requirements.

Subject Orientation
- Operational systems are organized by application: Auto, Health, Life, Casualty.
- The data warehouse is organized by subject: Customer, Policy, Premium, Claims.

Goals and Applications
Goals of a data warehouse:
- Provide reliable, high-performance access.
- Consistent view of data: same query, same data. All users should be warned if a data load has not come in.
- Slice-and-dice capability.
- Quality of data is a driver for business re-engineering.

Data warehousing applications:
- Customer profitability analysis
- Customer satisfaction and retention
- Buyer behavior
- Pricing and promotion analysis
- Market research
- Inventory optimization

OLTP v/s Data Warehouse
OLTP systems run the business; data warehouses tell you how to run the business.

Characteristic     OLTP                     Data Warehouse
Orientation        Transaction              Analysis
Data access        Record at a time         Set at a time
Updates            Frequent & unscheduled   Periodic & scheduled
Response time      Seconds required         Minutes acceptable
Concurrent users   Many                     Few
Availability       Guaranteed               As needed
Data structures    Highly normalized        Often de-normalized
Data nature        Current                  Historical
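The "record at a time" versus "set at a time" contrast can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module; the `orders` table and its rows are hypothetical and only serve to show the two access patterns side by side.

```python
import sqlite3

# Hypothetical in-memory table, used only to contrast the two access patterns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "West", 250.0), (3, "East", 50.0)],
)

# OLTP style: touch one record, identified by its key.
conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (120.0, 1))
one_order = conn.execute(
    "SELECT order_id, region, amount FROM orders WHERE order_id = ?", (1,)
).fetchone()

# Warehouse style: read a whole set of records and summarize it.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(one_order)
print(totals)
```

In practice the two workloads would run on separate systems; a single table is used here only to keep the sketch short.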

Data warehousing may not be for your business if most of the following are true:
- You need to report on data in a single transaction processing system.
- All the historical data you need is in that system.
- Data in the system is clean.
- Your hardware can support reporting against the live system data.
- The structure of the system data is relatively simple.
- Your firm does not have much interest in end-user ad hoc query/report tools.

Modeling Constructs
- Entity relationship diagram
- Star schema
- Snowflake schema

Within the implementation of a warehouse, several of these constructs may be integrated to form an optimal design.

Entity Relationship Diagram
- Based on set theory and SQL
- Highly normalized
- Optimized for update and fast transaction turnaround
- Not suited for querying in a data warehouse environment: diagrams like these are very difficult for users to visualize and memorize.

Star Schema
A central fact table surrounded by a number of dimension tables.
- Dimensions are business entities on which calculations are done. They can be numeric or alphanumeric. Example: a product table comprising brand name, category, packaging type, and size.
- Facts are numerical measurements of the business with respect to dimensions. They are numeric and additive (summable across any combination of dimensions). Example: a sales fact table could contain time, product, and store keys along with dollars sold, units sold, and dollars cost.
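A minimal sketch of the sales example above, using Python's built-in sqlite3 module. The table names (`dim_time`, `dim_product`, `dim_store`, `fact_sales`), the column names, and the sample rows are all assumptions made for illustration; the slides name only the kinds of columns involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes (hypothetical names).
CREATE TABLE dim_time    (time_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, brand TEXT, category TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, region TEXT);

-- Central fact table: one key per dimension plus additive measures.
CREATE TABLE fact_sales (
    time_key     INTEGER REFERENCES dim_time(time_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    dollars_sold REAL,
    units_sold   INTEGER,
    dollars_cost REAL
);
""")
conn.executemany("INSERT INTO dim_time VALUES (?, ?)", [(1, "Jan"), (2, "Feb")])
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Acme", "Snacks"), (2, "Zest", "Drinks")],
)
conn.executemany("INSERT INTO dim_store VALUES (?, ?)", [(1, "East"), (2, "West")])
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 100.0, 10, 60.0),
        (1, 2, 1, 40.0, 8, 25.0),
        (2, 1, 2, 70.0, 7, 42.0),
        (2, 2, 1, 30.0, 6, 20.0),
    ],
)

# Additive facts can be summed across any combination of dimensions.
sales_by_category_region = conn.execute("""
    SELECT p.category, s.region, SUM(f.dollars_sold), SUM(f.units_sold)
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    JOIN dim_store   s ON f.store_key   = s.store_key
    GROUP BY p.category, s.region
    ORDER BY p.category, s.region
""").fetchall()
print(sales_by_category_region)
```

Each dimension is reached from the fact table by a single join, which is what gives the schema its star shape.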

Snowflake Schema
A variant of the star schema in which the dimension tables are normalized. Normalization helps to reduce redundancy in the dimension tables, but the extra joins affect query performance and make the model harder for users to comprehend.
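As a rough illustration of that trade-off, the sketch below normalizes a product dimension by moving the category into its own table. All table names, column names, and rows are hypothetical. The category name is stored once, but a query by category now needs two joins instead of one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Category attributes split out of the product dimension (hypothetical names).
CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    brand        TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    dollars_sold REAL
);
""")
conn.executemany(
    "INSERT INTO dim_category VALUES (?, ?)", [(1, "Snacks"), (2, "Drinks")]
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Acme", 1), (2, "Crisp", 1), (3, "Zest", 2)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)", [(1, 100.0), (2, 50.0), (3, 40.0)]
)

# Fact -> product -> category: one more join than the equivalent star query.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.dollars_sold)
    FROM fact_sales f
    JOIN dim_product  p ON f.product_key  = p.product_key
    JOIN dim_category c ON p.category_key = c.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(sales_by_category)
```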

DW Terminology
Granularity
Granularity (or grain) defines the level of detail stored in the physical warehouse. Low granularity indicates a lot of detail, while high granularity indicates less detail.

Example: a commercial airline is building a data warehouse. What will the granularity be?
- Choice A: each record represents a flight.
- Choice B: each record represents a customer on a flight.

There is no correct answer. To a large extent, the granularity depends on how business users need to exploit the data. However, you should be aware that the granularity of data affects:
- Volumes of data, data maintenance, and indexing
- Level of data exploration
- Query and reporting constraints
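The airline example can be sketched in a few lines of Python. The record layout, field names, and values are hypothetical. The point is that Choice B data can always be rolled up to Choice A, while the reverse is not possible once the detail has been discarded.

```python
from collections import defaultdict

# Choice B grain: one record per customer on a flight (hypothetical data).
passenger_records = [
    {"flight": "F100", "customer": "C1", "fare": 200.0},
    {"flight": "F100", "customer": "C2", "fare": 180.0},
    {"flight": "F200", "customer": "C1", "fare": 320.0},
]


def roll_up_to_flight(records):
    """Aggregate customer-level records to Choice A grain: one record per flight."""
    flights = defaultdict(lambda: {"passengers": 0, "revenue": 0.0})
    for record in records:
        summary = flights[record["flight"]]
        summary["passengers"] += 1
        summary["revenue"] += record["fare"]
    return dict(flights)


flight_records = roll_up_to_flight(passenger_records)
print(len(passenger_records), "detailed rows ->", len(flight_records), "flight rows")
print(flight_records)
```

The flight-level table is smaller and cheaper to maintain, but a question such as "which customers flew more than once?" can only be answered at the finer grain.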

DW Terminology
Metadata
At all levels of the data warehouse, information is required to support the maintenance and use of the data warehouse. Metadata is data about data. There are two views of metadata:
- Business: warehouse attributes and properties for use by business users.
- Technical: describes the data flow from operational systems into the data warehouse.

OLAP
Online Analytical Processing: tool(s) for analytical reporting, including graphical capabilities.

DW Terminology
OLAP tools available for exploring the information built in a DW:
- Multi-dimensional On-line Analytical Processing (MOLAP): data from the data warehouse is queried and dumped periodically onto a server on the local network, into a data store called a Multi-dimensional Database (MDDB) provided by the OLAP tool. This MDDB forms a data mart, which is then used for querying and reporting.
- Relational On-line Analytical Processing (ROLAP): refers to the ability to conduct OLAP analysis directly against a relational warehouse, without any constraints on the number of dimensions, database size, analytical complexity, or number and type of users.
- Hybrid On-line Analytical Processing (HOLAP): an environment with a combination of MOLAP and ROLAP data storage. Summarized information is typically stored in an MDDB and detailed data is stored in a relational environment.
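The difference between the first two approaches can be sketched in plain Python. This is a toy model under stated assumptions, not how any particular OLAP product works: `detail_rows` stands in for the relational warehouse, `rolap_total` aggregates at query time, and `build_cube` precomputes every total the way a periodic MDDB load might. All names and data are hypothetical.

```python
from itertools import product

# Hypothetical detail rows: (region, month, sales amount).
detail_rows = [
    ("East", "Jan", 100.0),
    ("East", "Feb", 50.0),
    ("West", "Jan", 70.0),
]


def rolap_total(rows, region=None, month=None):
    """ROLAP style: scan the detail rows every time a question is asked."""
    return sum(
        amount
        for row_region, row_month, amount in rows
        if (region is None or row_region == region)
        and (month is None or row_month == month)
    )


def build_cube(rows):
    """MOLAP style: precompute totals for every (region, month) combination.

    None in a key position means "all values" of that dimension.
    """
    cube = {}
    for row_region, row_month, amount in rows:
        for key in product((row_region, None), (row_month, None)):
            cube[key] = cube.get(key, 0.0) + amount
    return cube


cube = build_cube(detail_rows)  # the periodic load step
print(cube[("East", None)])     # answered by lookup
print(rolap_total(detail_rows, region="East"))  # answered by scanning
```

A HOLAP arrangement would, roughly, answer summary questions from `cube` and fall back to `detail_rows` when a user asks for detail the cube does not hold.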

Terminology
- Data mart: contains data about a specific subject, e.g. official data, customer data, campaign data.
- Metadata: data about data. Describes the data stored in the data warehouse.
- Data cube: central object of data containing information in a multidimensional structure.
- Data cleansing: regular cleaning of data, i.e. detecting and correcting inconsistent, duplicate, or invalid values.
- ETL: Extraction, Transformation and Loading of data.
- Data mining: a mechanism which uses intelligent algorithms to discover patterns, clusters and models from data.

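Stages
[Diagram: heterogeneous source systems (operational, legacy, external) feed a staging area through Extraction, Transformation & Loading (ETL); the staging area feeds the data warehouse; business intelligence tools on top of the warehouse cover query & reporting, OLAP, and data mining.]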
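The ETL stage in this flow can be sketched as three small functions. Everything below is a hypothetical illustration: the source rows, the cleansing rules (trim and title-case names, reject non-numeric amounts), and the `fact_payments` table are assumptions, and a SQLite in-memory database stands in for the warehouse.

```python
import sqlite3

# Hypothetical extracts from two heterogeneous source systems.
operational_rows = [
    {"cust": " alice ", "amount": "100.50"},
    {"cust": "BOB", "amount": "20"},
]
legacy_rows = [{"cust": "Alice", "amount": "n/a"}]


def extract():
    """Extraction: pull raw rows from every source into one staging list."""
    return operational_rows + legacy_rows


def transform(rows):
    """Transformation: cleanse names and reject rows with non-numeric amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would log or quarantine the reject
        clean.append((row["cust"].strip().title(), amount))
    return clean


def load(conn, rows):
    """Loading: write the cleansed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_payments (customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_payments VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
staged = transform(extract())
load(conn, staged)
loaded = conn.execute(
    "SELECT customer, amount FROM fact_payments ORDER BY customer"
).fetchall()
print(loaded)
```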

A Typical Data Warehouse
[Diagram: metadata, summarized data, and detailed data in the warehouse, with three data marts.]
- Summarized data facilitates firing queries on detailed data.
- Data marts contain data specific to a subject.

MOLAP/ROLAP/HOLAP
[Diagram. Components: Data Warehouse (RDBMS), Custom Loader, MDD Database Storage, OLAP Engine, Query Tool by MDD Vendor. Interfaces: SQL (rows) and MDD proprietary API (cubes). Loading: periodic, manual data load.]

OLAP Terminology
Drill-down: analytical technique whereby the user navigates from the most summarized to the most detailed level.
[Diagram: cube with Region, Month, and Product axes; the location dimension drills down through Region, State, District, Location.]

OLAP Terminology
Rotation or Dicing
[Diagram: cube with Region, Month, and Product axes, shown rotated so that the axes change places.]

OLAP Terminology
Slicing
[Diagram: a single slice taken from the cube with Region, Month, and Product axes.]
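The three operations on the preceding slides (drill-down, rotation, slicing) can be sketched over a tiny hypothetical cube held as a list of tuples. The data, the `FIELDS` layout, and the helper names `aggregate` and `slice_rows` are all assumptions for illustration; real OLAP tools expose these operations through their own interfaces.

```python
from collections import defaultdict

# Hypothetical cube cells: (region, state, month, product, units sold).
FIELDS = ("region", "state", "month", "product")
cells = [
    ("East", "NY", "Jan", "Tea", 10),
    ("East", "NJ", "Jan", "Tea", 5),
    ("East", "NY", "Feb", "Coffee", 7),
    ("West", "CA", "Jan", "Coffee", 8),
]


def aggregate(rows, by):
    """Sum units sold, grouped by the named fields in the given order."""
    positions = [FIELDS.index(name) for name in by]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in positions)] += row[-1]
    return dict(totals)


def slice_rows(rows, field, value):
    """Slicing: keep only the cells where one dimension has a fixed value."""
    position = FIELDS.index(field)
    return [row for row in rows if row[position] == value]


# Drill-down: from the summarized region level to the more detailed state level.
by_region = aggregate(cells, ["region"])
by_state = aggregate(cells, ["region", "state"])

# Slicing: fix month = "Jan" and look at the remaining dimensions.
jan_slice = aggregate(slice_rows(cells, "month", "Jan"), ["region", "product"])

# Rotation: same totals, with the axes presented in a different order.
by_region_product = aggregate(cells, ["region", "product"])
by_product_region = aggregate(cells, ["product", "region"])

print(by_region)
print(by_state)
print(jan_slice)
print(by_product_region)
```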

Products and Vendors
- Data warehouses: Oracle, Sybase, DB2
- OLAP tools: Oracle Express, Hyperion Essbase
- Data mining: Oracle Darwin, IBM Intelligent Data Miner
- Querying & reporting: Oracle Discoverer, Business Objects