Published by Prudence Singleton. Modified over 9 years ago.
1
Basic Concepts of Data Warehousing: An Overview
Prasanth Gurram
2
How to answer these Business Queries?
What is the sales distribution region wise?
What is the Defaulter’s Profile?
What are the slow movers in my product line?
How did my revenue improve in the past 5 years?
Which of my Sales Agents are doing better?
Who are my profitable customers?
Which channel costs me more and pays less?
Currency Risk, Interest Rate Risk, Liquidity Risk
Strategic Planning / Budgeting
3
DSS
Enable users to get a “Business View” of the data
Facilitate data-based decision making that drives and improves the business
Discover “Hidden Trends”

Decision Support Systems
Decision Support Systems (DSS) are interactive computer-based systems intended to help decision makers use data and models to identify and solve problems and make decisions.
The Data Warehouse is the foundation of the DSS process. It is a strategy and a process for staging corporate data.
4
Driving Forces for DSS
[Diagram labels: RESULT:, Customers, Reform, Technology, Business, Speed, COMPETITION]
5
Scenario without DSS
In earlier days, tools and techniques were unavailable for acquiring data from various sources to answer business questions and make decisions
More effort went into data formatting than into data analysis
Static and inflexible report generation
Time lag in accessing information at a central place
6
OLTP v/s DSS Environment

OLTP Environment                              DSS Environment
get data IN                                   get information OUT
large volumes of simple transaction queries   small number of diverse queries
continuous data changes                       periodic updates only
low processing time                           high processing time
mode of processing                            mode of discovery
transaction details                           subject oriented - summaries
data inconsistency                            data consistency
mostly current data                           historical data is relevant
7
OLTP v/s DSS Environment

OLTP Environment                              DSS Environment
high concurrent usage                         low concurrent usage
highly normalized data structure              fewer tables, but more columns per table
static applications                           dynamic applications
automates routines                            facilitates creativity
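The contrast above between getting data IN and getting information OUT can be sketched with a small example. This is only an illustration: the `sales` table, its columns, and the sample rows are hypothetical and do not come from the presentation, and an in-memory SQLite database stands in for both environments.

```python
import sqlite3

# Hypothetical schema for illustration only; not taken from the presentation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, sale_year INTEGER, amount REAL)"
)

# OLTP style: many small writes that get data IN, one short transaction each.
rows = [
    ("North", 2022, 100.0),
    ("North", 2023, 150.0),
    ("South", 2022, 80.0),
    ("South", 2023, 120.0),
]
for region, year, amount in rows:
    with conn:  # commits (or rolls back) this single insert
        conn.execute(
            "INSERT INTO sales (region, sale_year, amount) VALUES (?, ?, ?)",
            (region, year, amount),
        )

# DSS style: one read-only query that gets information OUT as a summary
# over historical data.
revenue_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(revenue_by_region)
```

In practice the two workloads would run against separate databases, as the later slides describe; a single connection is used here only to keep the sketch short.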
8
Benefits for Business User
Flexible Information Access
High Availability
Ease of Use
Quality & Completeness of Data
Focus on Information Processing
Information Base for Knowledge Discovery
9
Available line of technology
Advances in DBMS technology
Data warehousing
On-line analytical processing
Data mining
10
Data Warehouse
Data warehouses store large volumes of data that are frequently used by DSS. A data warehouse is maintained separately from the organization’s operational databases.
A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data.
Subject-oriented: Contains information regarding objects of interest for decision support: sales by region, by product, etc.
Integrated: Data are typically extracted from multiple, heterogeneous data sources (e.g., from sales, inventory, billing DBs etc.).
Time-variant: Contains historical data, with a longer horizon than the operational system.
Nonvolatile: Data is not (or rarely) directly updated.
11
Data Warehouse
Is the enabling technology that facilitates improved business decision-making
It’s a process, not a product
A technique for assembling and managing a wide variety of data from multiple operational systems for decision support and analytical processing
It’s a journey, not a destination...
12
DW Components
[Diagram labels: Legacy System, feed systems FS1, FS2, ... FSn; Extraction; Transmission (Network); Staging Area; Cleansing; Transformation; ODS; Aggregation; Summarization; DW; Data Mart Population; data marts DM1, DM2, ... DMn; OLAP Analysis; Knowledge Discovery; Metadata Layer]
13
Operational Process
Data extraction
Data cleansing and transformation
Data load and refresh
Build derived data and views
Service queries
Administer the warehouse
14
Extraction Process (Data Capturing)
[Diagram labels: Data Capturing Process, Feed System Application, Business Transactions, Incremental Data, Control Metadata]
Extract the incremental data from the feed system
Store the extracted data in a temporary area
Extract data from multiple, heterogeneous, and external sources
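The slide above does not say how incremental data is identified. One common approach, assumed here, is a timestamp watermark kept in the control metadata. The function name, the row fields (`txn_id`, `updated_at`), and the `last_extracted_at` key are all hypothetical; this is a minimal sketch, not the presenter's method.

```python
from datetime import datetime


def extract_incremental(feed_rows, last_extracted_at):
    """Return rows changed after the watermark, plus the new watermark.

    Assumes every feed row carries an 'updated_at' timestamp
    (an illustrative convention, not something the slides specify).
    """
    delta = [r for r in feed_rows if r["updated_at"] > last_extracted_at]
    new_watermark = max(
        (r["updated_at"] for r in delta), default=last_extracted_at
    )
    return delta, new_watermark


# Hypothetical business transactions held by the feed system.
feed_rows = [
    {"txn_id": 1, "amount": 50.0, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"txn_id": 2, "amount": 75.0, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"txn_id": 3, "amount": 20.0, "updated_at": datetime(2024, 1, 3, 9, 0)},
]

# Control metadata remembers where the previous extraction stopped.
control_metadata = {"last_extracted_at": datetime(2024, 1, 1, 12, 0)}

# The extracted delta goes to a temporary area; the watermark moves forward.
temp_area, control_metadata["last_extracted_at"] = extract_incremental(
    feed_rows, control_metadata["last_extracted_at"]
)
print([r["txn_id"] for r in temp_area])
```

Running the extraction again with the updated watermark returns no rows, which is what keeps each run incremental.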
15
Extraction Process (Data Transmission)
[Diagram labels: Feed System Side, Incremental Data, FTP, Network Cloud, Staging area, Incremental Data]
Transmit the extracted data from the feed system to the staging area
Periodicity of transmission (daily / weekly) depends upon the feed system
16
Cleansing Process
[Diagram labels: Raw data (Staging Area), Cleansing Process, Process Metadata, Cleansing Rules, Control Metadata, Cleansing Reports, Good, Bad, Clean data]
Detect errors in the data and rectify them when possible
Mark each record Good/Bad
Generate the cleansing reports and mail them to the DWA and the feed system representatives
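The Good/Bad marking and the cleansing report described above can be sketched as follows. The rule names, row fields, and the `cleanse` function are hypothetical, and the sketch only detects and reports errors; the rectification step mentioned on the slide is left out.

```python
def cleanse(raw_rows, rules):
    """Apply every rule to every row; return (good_rows, bad_rows, report)."""
    good, bad, report = [], [], []
    for row in raw_rows:
        # Collect the names of all rules this row violates.
        failed = [name for name, check in rules.items() if not check(row)]
        if failed:
            bad.append(row)
            report.append({"txn_id": row.get("txn_id"), "failed_rules": failed})
        else:
            good.append(row)
    return good, bad, report


# Hypothetical cleansing rules; a real warehouse would load these
# from its process metadata.
cleansing_rules = {
    "amount_is_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
    "region_is_present": lambda r: bool(r.get("region")),
}

raw_rows = [
    {"txn_id": 1, "region": "North", "amount": 50.0},
    {"txn_id": 2, "region": "", "amount": 75.0},
    {"txn_id": 3, "region": "South", "amount": -5.0},
]

good, bad, report = cleanse(raw_rows, cleansing_rules)
for entry in report:
    print(entry)
```

The `report` list is the part that would be formatted and mailed to the DWA and the feed system representatives.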
17
Transformation Process
[Diagram labels: Clean Operational Data, Transformation Process, Operational Data Store, Control Metadata, Process Metadata, Mapping Detail, Transformation Rule]
Transform the cleaned operational data into DSS data
Load the DSS data into the ODS
The ODS contains the current DSS data at the lowest level of granularity
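A mapping detail paired with a transformation rule, as named in the diagram above, might look like the following sketch. The source column names (`cust_nm`, `amt`, `rgn_cd`), the target names, and the region codes are invented for illustration.

```python
# Hypothetical mapping detail: source column -> (target column, rule).
REGION_NAMES = {"N": "North", "S": "South"}

mapping = {
    "cust_nm": ("customer_name", str.strip),
    "amt": ("amount", float),
    "rgn_cd": ("region", lambda code: REGION_NAMES.get(code, "Unknown")),
}


def transform(clean_row, mapping):
    """Rename each source column and apply its transformation rule."""
    return {
        target: rule(clean_row[source])
        for source, (target, rule) in mapping.items()
    }


# Cleaned operational data, still in the feed system's own layout.
clean_rows = [
    {"cust_nm": " Acme ", "amt": "120.50", "rgn_cd": "N"},
    {"cust_nm": "Globex", "amt": "80", "rgn_cd": "X"},
]

# Row-level (lowest granularity) DSS data, ready to load into the ODS.
ods_rows = [transform(row, mapping) for row in clean_rows]
print(ods_rows)
```

Keeping the mapping as data rather than code is one way to let it live in the metadata layer alongside the other rules.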
18
Summarization Process
[Diagram labels: ODS, Summarization Process, Weekly, Monthly, Yearly, DW, Control Metadata]
Summarize and aggregate ODS data and populate it to the warehouse
Periodicity of the summarization process depends upon the level of summarization at the warehouse (weekly, monthly, daily)
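The roll-up from ODS rows to weekly, monthly, or yearly warehouse summaries can be sketched as a grouping step with a pluggable period function. The row fields (`sale_date`, `region`, `amount`) and the helper names are hypothetical; weeks follow the ISO calendar, which is an assumption of this sketch.

```python
from collections import defaultdict
from datetime import date


def weekly(d):
    iso = d.isocalendar()  # (ISO year, ISO week, ISO weekday)
    return (iso[0], iso[1])


def monthly(d):
    return (d.year, d.month)


def yearly(d):
    return d.year


def summarize(ods_rows, period_key):
    """Total the amount per (period, region) at the chosen summary level."""
    totals = defaultdict(float)
    for row in ods_rows:
        totals[(period_key(row["sale_date"]), row["region"])] += row["amount"]
    return dict(totals)


# Hypothetical lowest-granularity rows from the ODS.
ods_rows = [
    {"sale_date": date(2024, 1, 5), "region": "North", "amount": 100.0},
    {"sale_date": date(2024, 1, 20), "region": "North", "amount": 50.0},
    {"sale_date": date(2024, 2, 3), "region": "North", "amount": 30.0},
    {"sale_date": date(2024, 1, 9), "region": "South", "amount": 70.0},
]

monthly_summary = summarize(ods_rows, monthly)
yearly_summary = summarize(ods_rows, yearly)
print(monthly_summary)
print(yearly_summary)
```

Each summary level would typically be scheduled at its own periodicity, with the schedule recorded in the control metadata.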
19
Metadata
Data about data
Used to maintain the data warehouse
Control data
  Static: roles, permissions, naming standards, source system names, locations, target names, transformation and mapping rules
  Dynamic: scheduling, scripts, load statistics, space usage, backup statistics
Business data
  Business rules, who validates data, who controls it, how they validate
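The split between static and dynamic control data can be pictured with two small record types. This is only a sketch: the class names, fields, and sample values are hypothetical and cover just a few of the items listed above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StaticControlMetadata:
    """Rarely changing definitions (hypothetical subset of the slide's list)."""
    source_system: str
    target_name: str
    mapping_rules: tuple  # (source_column, target_column) pairs


@dataclass
class DynamicControlMetadata:
    """Values that change with every run, such as load statistics."""
    schedule: str
    load_statistics: list = field(default_factory=list)

    def record_load(self, run_date, rows_loaded):
        self.load_statistics.append(
            {"run_date": run_date, "rows_loaded": rows_loaded}
        )


static_meta = StaticControlMetadata(
    source_system="billing_db",
    target_name="ods_sales",
    mapping_rules=(("amt", "amount"), ("rgn_cd", "region")),
)
dynamic_meta = DynamicControlMetadata(schedule="daily")
dynamic_meta.record_load("2024-01-03", 2)
print(static_meta.target_name, dynamic_meta.load_statistics)
```

Freezing the static record makes accidental changes fail loudly, while the dynamic record grows with each load.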
20
DW Components/Tools
Extraction/transformation/load tool (family of tools including data modeling tool, extraction tool, metadata repository, and DW administration tools)
Metadata exchange architecture (API used to integrate all components of the DW with central metadata)
Target databases (relational, multidimensional, hybrid)
Data access and analysis tools for end users
Database servers, operating systems, networks
21
DW Tools