Copyright © 2004, SAS Institute Inc. All rights reserved. Building and Implementing Integrated Data Models Nancy Wills, Director, Access, Query and Data.

Presentation transcript:

Copyright © 2004, SAS Institute Inc. All rights reserved.
Building and Implementing Integrated Data Models
Nancy Wills, Director, Access, Query and Data Mgmt
Ralph Hollinshead, Manager, Solutions Data Integration

Overview
Part One: Building an Integrated Data Model
Part Two: Deploying and Scaling the Data Architecture

SAS® Banking Intelligence Solutions Framework
[Diagram: solution areas (Customer Retention, Cross-Sell/Up-Sell, Marketing Automation, Credit Scoring, Credit Risk, Strategic Performance Management, New Solutions) and the Banking Intelligence Architecture. Labels: integrated, extendable architecture; focused on business issues; based on experience.]

Independent Solutions
[Diagram layers: Enterprise Source Systems; Extract and Cleanse Files; Solution Data Marts; Solutions]
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management

Integrated Data Model: Not All Customers Are the Same
• Customer A: no data warehouse; interested in multiple SAS solutions
• Customer B: has a data warehouse; averse to data replication
• Customer C: has a data warehouse; no data marts allowed (active data warehousing approach)

Customer A: Full SAS Data Architecture
[Diagram layers: Enterprise Source Systems; Extract and Cleanse Files; SAS Banking Detail Data Store; Solution Data Marts; Solutions]
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!

Customer B: Partial SAS Data Architecture
[Diagram layers: Enterprise Source Systems; Extract and Cleanse Files; Customer Enterprise Data Warehouse; Solution Data Marts; Solutions]
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!

Customer C: Customer Data Architecture
[Diagram layers: Enterprise Source Systems; Extract and Cleanse Files; Customer Enterprise Data Warehouse; Solutions]
Solutions:
• SAS® Marketing Automation

Scorecard for Data Architecture Approach

Data Management Issue                               Score
Sensitivity to data replication                     0 to -5
Sensitivity to H/W processor and storage budget     0 to -5
Existing warehouse quality                          0 to -5
Implementation time constraints                     0 to -5
Intentions to implement >1 SAS solution             0 to +5
Historical data requirements                        0 to +5

Score   Decision
-25     No DDS. Marts only if absolutely necessary. Information maps may be appropriate.
0       Use DDS to persist current extract from source systems. Marts hold multiple extracts up to full history.
+25     Implement full warehouse, persist history in DDS and as much as wanted in the marts.
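
The scorecard arithmetic can be sketched in Python. This is only an illustration, assuming each issue is rated 0 to 5 and signed as on the slide. The dictionary keys, function names, and the nearest-anchor rule for choosing a decision are assumptions, not part of the presentation. The slide gives guidance only at -25, 0, and +25, while the six listed issues sum to between -20 and +10.

```python
# Sign of each issue as listed on the scorecard slide: -1 pushes toward
# "no DDS", +1 pushes toward a full warehouse. Key names are illustrative.
ISSUE_SIGNS = {
    "data_replication_sensitivity": -1,
    "hardware_budget_sensitivity": -1,
    "existing_warehouse_quality": -1,
    "implementation_time_constraints": -1,
    "multiple_sas_solutions": +1,
    "historical_data_requirements": +1,
}

# Decision guidance, abbreviated from the slide, at its three anchor scores.
DECISIONS = {
    -25: "No DDS; marts only if absolutely necessary",
    0: "DDS persists current extract; marts hold history",
    25: "Full warehouse; history persisted in DDS",
}


def total_score(ratings):
    """Sum the signed ratings; each rating must be between 0 and 5."""
    total = 0
    for issue, sign in ISSUE_SIGNS.items():
        rating = ratings[issue]
        if not 0 <= rating <= 5:
            raise ValueError(f"{issue}: rating must be between 0 and 5")
        total += sign * rating
    return total


def nearest_decision(score):
    """Assumption: choose the anchor score closest to the total."""
    anchor = min(DECISIONS, key=lambda a: abs(a - score))
    return DECISIONS[anchor]
```

A site that rates all four negative issues at 5 and both positive issues at 0 totals -20, which this sketch maps to the "No DDS" guidance.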

Techniques for Data Model Integration
• Detail Data Store
  - Varying Industries
  - General Standards
  - Warehousing Techniques
• Data Marts
  - Approach Compared to DDS

Integrating Models at the Industry Level

Detail Data Store: Standards Needed for Integration
• Data Types / Lengths / Classifier Codes
• Naming Conventions
• Standards for Data Structures
  - Hierarchies
  - Subtypes
  - Reference Data

Data Administration Standards

Domain                 Data Type  Width  Applicable Class Codes  Comment/Example
Identifier             Varchar    32     ID                      Typically the identifier from the source system.
Small Code             Varchar    3      CD                      Short length codes such as ADDRESS_TYPE_CD
Medium Code            Varchar    10     CD                      Medium length codes such as EXCHANGE_SYMBOL_CD
Large Code             Varchar    20     CD                      Long length codes such as POSTAL_CD
Standard Count Code    Numeric    6      CNT                     Standard counts such as AUTHORIZED_USERS_CNT
Name                   Varchar    40     NM                      Proper name. For example, LAST_NM, FIRST_NM, etc.
Short Length Text      Varchar    20     TXT                     Short freeform text.
Medium Length Text     Varchar    100    TXT, DESC               Longer freeform text and descriptions associated with code tables.
Indicator Field        Character  1      FLG                     Binary indicator flag (Y or N).
Surrogate Key          Numeric    10     RK, SK                  Generated surrogate keys.
Currency Amount        Numeric    18,5   AMT                     Standard currency amount.
Rates and Percentages  Numeric    9,4    PCT, RT                 For example, exchange rates.
DateTime               Date              DT, DTTM                Accommodate dates as well as date/time.
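
One way to put the standards table to work is an automated naming check. The sketch below is hypothetical: the function names and the rule of reading the class code from the text after the last underscore are assumptions for illustration, and only the code-to-type pairs come from the table.

```python
# Class-code suffix -> (data type, width) taken from the standards table.
# Suffixes shared by several domains (CD, TXT) keep the widest listed width.
CLASS_CODES = {
    "ID": ("Varchar", "32"),
    "CD": ("Varchar", "20"),
    "CNT": ("Numeric", "6"),
    "NM": ("Varchar", "40"),
    "TXT": ("Varchar", "100"),
    "DESC": ("Varchar", "100"),
    "FLG": ("Character", "1"),
    "RK": ("Numeric", "10"),
    "SK": ("Numeric", "10"),
    "AMT": ("Numeric", "18,5"),
    "PCT": ("Numeric", "9,4"),
    "RT": ("Numeric", "9,4"),
    "DT": ("Date", ""),
    "DTTM": ("Date", ""),
}


def class_code(column_name):
    """Return the class code (text after the last underscore), or None."""
    suffix = column_name.rsplit("_", 1)[-1].upper()
    return suffix if suffix in CLASS_CODES else None


def check_column(column_name, data_type):
    """Return a list of problems for one column; empty means it conforms."""
    code = class_code(column_name)
    if code is None:
        return [f"{column_name}: no recognized class code suffix"]
    expected_type, _width = CLASS_CODES[code]
    if data_type != expected_type:
        return [f"{column_name}: expected {expected_type}, got {data_type}"]
    return []
```

For example, `check_column("BALANCE_AMT", "Varchar")` reports a type mismatch because the AMT class code is numeric in the table.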

Detail Data Store: Data Warehousing Standards
Surrogate Keys, Point-in-Time, and Rapidly Changing Data
(… marks values that did not survive in the transcript)

CUSTOMER
CUSTOMER_RK  VALID_FROM_DT  VALID_TO_DT  ACCOUNT_RK  MARITAL_STATUS_CD  FIRST_NM  LAST_NM
100          01JAN1999      29FEB…       …           S                  John      Smith
100          01MAR…         …DEC…        …           M                  John      Smith

FINANCIAL_ACCOUNT
ACCOUNT_RK  VALID_FROM_DT  VALID_TO_DT  CUSTOMER_RK  FINANCIAL_ACCOUNT_TYPE_CD  OPEN_DT
201         01JAN1999      31DEC…       …            SAVINGS                    01JAN2000

FINANCIAL_ACCOUNT_CHNG
ACCOUNT_RK  VALID_FROM_DT  VALID_TO_DT  BALANCE_AMT  CURRENCY_CD
201         01JAN1999      31JAN…       …            USD
201         1FEB1999       28FEB…       …            USD
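
The VALID_FROM_DT / VALID_TO_DT columns make point-in-time queries possible: a surrogate key plus a date selects exactly one version of a row. A minimal Python sketch of that lookup follows. The dates and the far-future end date are invented for illustration and are not a reconstruction of the slide's values. Inclusive bounds are also an assumption.

```python
from datetime import date

# Hypothetical CUSTOMER rows: one surrogate key, two validity intervals.
CUSTOMER = [
    {"customer_rk": 100, "valid_from_dt": date(2001, 1, 1),
     "valid_to_dt": date(2001, 6, 30), "marital_status_cd": "S"},
    {"customer_rk": 100, "valid_from_dt": date(2001, 7, 1),
     "valid_to_dt": date(9999, 12, 31), "marital_status_cd": "M"},
]


def as_of(rows, key_column, key, when):
    """Return the row version valid on `when` (bounds inclusive), or None."""
    for row in rows:
        if (row[key_column] == key
                and row["valid_from_dt"] <= when <= row["valid_to_dt"]):
            return row
    return None
```

Calling `as_of(CUSTOMER, "customer_rk", 100, date(2001, 3, 15))` returns the first version of the row. Any date from 1 July 2001 onward returns the second.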

Conformed Dimensions

Tools: Extending Models
[Diagram entities: CUSTOMER, EXTERNAL_ORG, SUPPLIER, INTERNAL_ORG, INTERNAL_ORG_ASSOC, INTERNAL_ORG_ASSOC_TYPE, COMPETITORS]

Change Analysis Tool

Deploying the Integrated Data Architecture

Option A: Full SAS Data Architecture
[Diagram layers: Enterprise Source Systems; Extract and Cleanse Files; SAS Banking Detail Data Store; Solution Data Marts; Solutions]
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!

Populate DDS and Data Mart
Source Data: Excel, SAS, SAP, Oracle, PeopleSoft
Step 1: Extract, cleanse, and transform source data into a flat file
Step 2: ETL processing to load the data warehouse (DDS)
  - data validation
  - key creation
  - slowly changing dimensions
Step 3: Transform into the data mart model (Banking Data Mart)
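
The key creation and slowly changing dimension work in Step 2 can be sketched as follows. This is a minimal illustration, not the SAS ETL Studio load process. It assumes a history-preserving (Type 2 style) load, which the slide does not name, and all column names, the `customer_id` source identifier, and the 31DEC9999 "still current" marker are hypothetical.

```python
from datetime import date, timedelta

HIGH_DATE = date(9999, 12, 31)  # assumed marker for "still current"


def load_dimension(dim_rows, key_map, source_row, load_date):
    """Apply one source record to a dimension kept as a list of dicts.

    key_map maps source-system identifiers to generated surrogate keys.
    A changed attribute closes the current row and opens a new version.
    """
    source_id = source_row["customer_id"]
    attrs = {k: v for k, v in source_row.items() if k != "customer_id"}

    # Key creation: a new source identifier gets the next surrogate key.
    if source_id not in key_map:
        key_map[source_id] = max(key_map.values(), default=0) + 1
    rk = key_map[source_id]

    # Find the current version of this customer, if one exists.
    current = next(
        (r for r in dim_rows
         if r["customer_rk"] == rk and r["valid_to_dt"] == HIGH_DATE),
        None,
    )
    if current is not None:
        if all(current.get(k) == v for k, v in attrs.items()):
            return dim_rows  # nothing changed, nothing to load
        # Close the old version the day before the new one starts.
        current["valid_to_dt"] = load_date - timedelta(days=1)

    dim_rows.append({"customer_rk": rk, "valid_from_dt": load_date,
                     "valid_to_dt": HIGH_DATE, **attrs})
    return dim_rows
```

Loading the same record twice leaves one row. Loading it again with a changed attribute closes that row and appends a second version under the same surrogate key.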

Deployment Focus: Scalability and Performance
• ETL flows
• Physical data model

Deployment: What Did We Do?
• Create and Generate Data
• Deploy Hardware and Software
• Populate DDS
• Populate Data Mart
• Analyze ETL Flows
• Analyze DDS Model
• Change Management

It All Starts with Data
• Bought and Built Data Generators
• Built Simulated Data
• Applied Business Rules
• Scaled: 5 GB -> 50 GB -> 500 GB -> 1 TB

Deploy Hardware and Software
• Choose Software Components
  - SAS for the DDS or Data Warehouse
  - Databases for the DDS or Data Warehouse
  - SAS for the Data Marts
• Install and Configure SAS Software
• Configure Hardware
• Design for Progressively Larger Deployments

Windows Server
Dell PowerEdge 1600SC
• Windows 2003
• Dual hyper-threaded 2.8 GHz processors
• 4 GB RAM
• 4 internal IDE drives
• 60 GB C drive
• 275 GB D drive
• Single I/O channel
• 5 GB -> 50 GB of data

AIX UNIX Servers
IBM P630 eServer
• AIX … processors
• 4 I/O channels
• 8 GB RAM
• 4 x 72 GB disks
• 14-drive SCSI storage array
• 50 GB -> 500 GB of data
IBM P670 eServer
• AIX … processors
• 8 x 1-gig fiber I/O channels
• Dynamic logical partitioning
• 2 TB disks
• 500 GB -> 1 TB of data

Populate DDS and Data Mart
• Ran ETL Flows
  - Registered in SAS Metadata Repository
  - Loaded Data into Tables
  - Used Slowly Changing Dimension Load Process
• Analyzed ETL Flows

Example of SAS ETL Studio Flow Analysis

Change Management
• Loaded New Release of DDS in TST Repository
• Compared PRD Repository to TST Repository
• Ran Batch Reports to Examine Differences
• Ran Impact Analysis on Column and Table
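
The idea of comparing the PRD and TST repositories can be illustrated with a small difference report. This sketch does not use any SAS metadata API. Plain dictionaries of the form {table: {column: type}} stand in for repository exports, and the function name and report layout are assumptions.

```python
def diff_tables(prd, tst):
    """Compare two {table: {column: type}} snapshots.

    Returns lists of (table, column) pairs that were added, removed,
    or changed type when moving from prd to tst, in sorted order.
    """
    added, removed, changed = [], [], []
    for table in sorted(set(prd) | set(tst)):
        old_cols = prd.get(table, {})
        new_cols = tst.get(table, {})
        for column in sorted(set(old_cols) | set(new_cols)):
            if column not in old_cols:
                added.append((table, column))
            elif column not in new_cols:
                removed.append((table, column))
            elif old_cols[column] != new_cols[column]:
                changed.append((table, column))
    return {"added": added, "removed": removed, "changed": changed}
```

The changed and removed lists are the natural starting points for impact analysis, since every ETL flow that reads one of those columns needs review before the new release is promoted.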

What Did We Find?
• Specific Techniques that Work Best
• Recommendations
Tremendous Performance Gains!

Specific Techniques: Examples
ETL Flows
• Parallel ETL flows
• SAS coding techniques to use
• Use hash table instead of lookup
• Make sure the I/O buffer size is tuned
• Drop constraints
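
The hash table recommendation is about replacing a repeated search of a reference table with a single in-memory load followed by constant-time probes. The slides presumably mean a SAS technique. The Python sketch below only illustrates the general idea, with hypothetical table and column names.

```python
def lookup_by_scan(reference_rows, code):
    """Baseline: scan the reference list on every lookup (O(n) each)."""
    for row in reference_rows:
        if row["currency_cd"] == code:
            return row["currency_nm"]
    return None


def build_hash_lookup(reference_rows):
    """Load the reference table into an in-memory hash table once."""
    return {row["currency_cd"]: row["currency_nm"] for row in reference_rows}


def enrich(transactions, currency_names):
    """Attach the currency name to each transaction via O(1) dict probes."""
    return [
        {**txn, "currency_nm": currency_names.get(txn["currency_cd"])}
        for txn in transactions
    ]
```

With n reference rows and m transactions, the scan costs on the order of n times m comparisons, while the hash approach costs roughly n plus m operations. That difference is why the technique matters as volumes grow.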

Specific Techniques: Examples
DDS Model
• Indexes: when and when not to add
• Denormalize some tables
• Separate tables for data with high-volume changes
• Partition data by usage (date ranges)
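
Partitioning by date range means a query for a given period reads only the partitions that overlap it. The sketch below shows the idea with monthly partitions held in a dictionary. The monthly grain, function names, and column name are assumptions for illustration, not the physical design used in the presentation.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows, date_column):
    """Group rows into partitions keyed by (year, month) of date_column."""
    partitions = defaultdict(list)
    for row in rows:
        d = row[date_column]
        partitions[(d.year, d.month)].append(row)
    return dict(partitions)


def rows_in_range(partitions, date_column, start, end):
    """Read only partitions overlapping [start, end], then filter rows."""
    first = (start.year, start.month)
    last = (end.year, end.month)
    wanted = []
    for month_key in sorted(partitions):
        if first <= month_key <= last:
            wanted.extend(
                r for r in partitions[month_key]
                if start <= r[date_column] <= end
            )
    return wanted
```

Partitions outside the requested months are never touched, which is where the I/O saving comes from.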

Recommendations
• Debugging techniques
• Sorting and memory usage
• Joins
• Understand disk requirements
• I/O optimization
• Compression and performance

Above All
• Write ETL
• Test, Tune
• Test, Tune!!!!

Summary and Conclusions
• Data integration is key
• Different approaches for customers
• Change management is vital
• Performance tuning is vital
• Technology evolving

Questions?