
1 Populating a Data Warehouse

2 Overview Process Overview Methods of Populating a Data Warehouse Tools for Populating a Data Warehouse Populating a Data Warehouse by Using DTS

3 Process Overview
[Diagram] Source OLTP Systems (Oracle, SQL Server, Other) -> Temporary Data Staging Area (Validate, Gather, Make Data Consistent, Transform Data) -> Populate Data Warehouse (Sales Data, Hardware Data) -> Distribute Data -> Data Marts (Sales, Service, Other)

4 Validating Data Validate and Correct Data at the Source Before You Import It Determine and Correct Processes That Invalidate Data Save Invalid Data to a Log for Review
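The "save invalid data to a log for review" step can be sketched in a few lines of Python. This is only an illustration, not something from the slides: the field names (buyer_name, total_sales), the two validation rules, and the function name validate_rows are assumptions.

```python
def validate_rows(rows):
    """Split rows into valid rows and a log of rejected rows with reasons."""
    valid, invalid_log = [], []
    for row in rows:
        problems = []
        # Hypothetical rule 1: a buyer name must be present.
        if not row.get("buyer_name"):
            problems.append("missing buyer_name")
        # Hypothetical rule 2: total_sales must be a non-negative number.
        try:
            if float(row.get("total_sales", "")) < 0:
                problems.append("negative total_sales")
        except (TypeError, ValueError):
            problems.append("total_sales is not a number")
        if problems:
            invalid_log.append({"row": row, "problems": problems})
        else:
            valid.append(row)
    return valid, invalid_log


rows = [
    {"buyer_name": "Barr, Adam", "total_sales": "17.60"},
    {"buyer_name": "", "total_sales": "52.80"},
    {"buyer_name": "Chai, Sean", "total_sales": "n/a"},
]
valid, invalid_log = validate_rows(rows)
```

Only the first row reaches the load; the other two are kept in invalid_log with the reason they were rejected, so the process that produced them can be corrected at the source.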

5 Making Data Consistent Data Can Be Inconsistent in Several Ways: Data in each source is consistent, but you want to represent it differently in the data warehouse Data is represented differently in different sources You Can Make Data Consistent by: Translating codes or values to readable strings Converting multiple versions of the same information into a single representation
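Both techniques named on this slide can be illustrated with a small Python sketch. The status codes and the yes/no spellings below are invented for the example; real sources would supply their own code tables.

```python
# Hypothetical code table: translate terse codes to readable strings.
STATUS_NAMES = {"A": "Active", "I": "Inactive"}

# Hypothetical variants of the same yes/no information in different sources.
TRUE_FORMS = {"y", "yes", "1", "true"}
FALSE_FORMS = {"n", "no", "0", "false"}


def readable_status(code):
    """Translate a status code to a readable string."""
    return STATUS_NAMES.get(code.strip().upper(), "Unknown")


def to_bool(value):
    """Convert multiple versions of a yes/no value into one representation."""
    text = str(value).strip().lower()
    if text in TRUE_FORMS:
        return True
    if text in FALSE_FORMS:
        return False
    raise ValueError(f"unrecognized yes/no value: {value!r}")
```

For example, readable_status(" a ") gives "Active", and to_bool("Yes") and to_bool(1) both give True.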

6 Transforming Data
Transform: Change, Combine, Calculate
Source tables shown on the slide:
    buyer_name      reg_id  total_sales
    Barr, Adam      2       17.60
    Chai, Sean      4       52.80
    O’Melia, Erin   6       8.82
    ...
    buyer_name      price_id  qty_id
    Barr, Adam      .55       32
    Chai, Sean      1.10      48
    O’Melia, Erin   .98       9
    ...
Result tables shown on the slide:
    buyer_name, reg_id (II, IV, VI, ...), total_sales
    buyer_name, price_id, qty_id, total_sales (17.60, 52.80, 8.82, ...)
    buyer_first (Adam, Sean, Erin, ...), buyer_last (Barr, Chai, O’Melia, ...), reg_id, total_sales
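The three result tables on this slide can be reproduced with a short Python sketch, assuming that buyer_name splits on the comma, that total_sales is price_id times qty_id, and that reg_id maps to a Roman numeral. The function names are hypothetical, and the numeral table covers only the values shown on the slide.

```python
from decimal import Decimal

# Only the region values that appear on the slide.
ROMAN = {2: "II", 4: "IV", 6: "VI"}


def split_name(buyer_name):
    """Split 'Last, First' into (first, last)."""
    last, first = [part.strip() for part in buyer_name.split(",", 1)]
    return first, last


def region_label(reg_id):
    """Change a numeric region id to its Roman-numeral label."""
    return ROMAN[reg_id]


def total_sales(price, qty):
    """Calculate total sales; Decimal avoids binary floating-point rounding."""
    return Decimal(price) * qty
```

With the slide's first row, split_name("Barr, Adam") returns ("Adam", "Barr"), region_label(2) returns "II", and total_sales("0.55", 32) equals Decimal("17.60").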

7 Methods of Populating a Data Warehouse Select the Method of Populating a Data Warehouse That Suits Your Business Needs Method 1: Validate, combine, and transform data in a temporary data staging area Method 2: Validate, combine, and transform data during the loading process Migrate Data During Periods of Relatively Low System Use

8  Tools for Populating a Data Warehouse What Is the Appropriate Tool to Use Transact-SQL Query Distributed Query bcp Utility and the BULK INSERT Statement DTS

9 What Is the Appropriate Tool to Use Format of Source and Destination Data Location of Source and Destination Data Import or Export of Database Objects Frequency of Data Transfer Interface Preference Tool Performance

10 Transact-SQL Query
    USE northwind_mart
    SELECT Lastname + ', ' + Firstname AS Fullname
    INTO CustomerSummary
    FROM Northwind.dbo.Customer
Customer (source):
    FirstName  LastName
    Steve      Johnson
    Douglas    Smith
    Les        Wilson
    Paul       Salinger
CustomerSummary (result):
    FullName
    Johnson, Steve
    Smith, Douglas
    Wilson, Les
    Salinger, Paul

11 Distributed Query
    USE northwind_mart
    SELECT productname, companyname
    INTO item_dim
    FROM StockServer.sales.dbo.products p
    JOIN AccountingServer.sales.dbo.suppliers s
      ON p.supplierid = s.supplierid
[Diagram] StockServer (Sales database, Products table) and AccountingServer (Sales database, Suppliers table) -> Local SQL Server (Item_Dim table)

12 bcp Utility and the BULK INSERT Statement
bcp Utility
    bcp accounting.dbo.orders in Orderstbl.dat -c -t, -r \n -Smysqlserver -Usa -Pmypassword
BULK INSERT Statement
    BULK INSERT Accounting.dbo.orders
    FROM 'C:\ordersdir\orderstble.dat'
    WITH (
        DATAFILETYPE = 'char',
        FIELDTERMINATOR = '|',
        ROWTERMINATOR = '|\n')
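To make the terminator options concrete, the following Python sketch parses text laid out the way the BULK INSERT example expects: fields separated by '|' and each row ended by '|' followed by a newline. It illustrates the file format only and is not how bcp or BULK INSERT work internally; the sample rows and the function name parse_rows are made up.

```python
def parse_rows(text, field_terminator="|", row_terminator="|\n"):
    """Split delimited text into rows of fields using the given terminators."""
    rows = []
    for record in text.split(row_terminator):
        if record:  # skip the empty piece left after the final row terminator
            rows.append(record.split(field_terminator))
    return rows


# Hypothetical file content: every row ends with '|' and a newline.
sample = "1|Barr|17.60|\n2|Chai|52.80|\n"
rows = parse_rows(sample)
```

Each element of rows is a list of three string fields, one list per row in the file.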

13 DTS
When to Use DTS
DTS Data Source and Destination Types: OLE DB, ODBC, ASCII text file, Custom, HTML, Spreadsheet
DTS Tools: DTS Import and Export wizards, DTS Designer, dtsrun utility

14  Populating a Data Warehouse by Using DTS Building a DTS Package Transforming Data by Using an ActiveX Script Transforming Data by Using a Lookup Query Defining Transactions Tracking Data Lineage Creating a DTS Package Programmatically

15  Building a DTS Package Mapping Source and Destination Data Defining Data Transformation Tasks Creating and Saving a DTS Package Executing a DTS Package Scheduling and Securing a DTS Package

16 Mapping Source and Destination Data Mapping Columns Decide which columns to copy Choose the columns in the target database that map to the source columns Mapping Data Types Specify transformation rules Specify levels of data conversion

17 Defining Data Transformation Tasks DTS Packages Contain Tasks A Task Can: Execute a Transact-SQL statement Execute a script Launch an external application Transfer SQL Server 7.0 objects Execute or retrieve results from a DTS package

18 Creating and Saving a DTS Package Creating a DTS Package By using DTS wizards By using DTS Designer By using a COM interface exposed by DTS Saving a DTS Package COM-structured storage file Microsoft Repository SQL Server msdb database

19 Executing a DTS Package You Can Execute a DTS Package by Using SQL Server Enterprise Manager or dtsrun Command Prompt Utility File Storage Location Determines the dtsrun Syntax dtsrun /sAccounts /uJose /nOrdersImport

20 Scheduling and Securing a DTS Package Scheduling a DTS Package Use DTS Import or DTS Export wizards when you save the DTS package to the msdb database Use SQL Server Enterprise Manager when you use the dtsrun command prompt utility Implementing DTS Package Security Login permissions Owner and user passwords

21 Demonstration: Transferring Data by Using DTS

22 Transforming Data by Using an ActiveX Script Why Use an ActiveX Script How to Use an ActiveX Script Define a function to contain the transformation script Specify the destination column Specify the source columns Use operators and VBScript or JScript functions and control-of-flow statements Set the return code value for the function How to Handle Errors with Return Codes

23 Examples of ActiveX Scripts
    Function Main()
        DTSDestination("FullName") = DTSSource("Lastname") + ", " + DTSSource("Firstname")
        Main = DTSTransformStat_OK
    End Function
Customer (source):
    FirstName  LastName
    Steve      Johnson
    Douglas    Smith
    Les        Wilson
    Paul       Salinger
CustomerSummary (destination):
    FullName
    Johnson, Steve
    Smith, Douglas
    Wilson, Les
    Salinger, Paul

24 Demonstration: Transforming Data by Using an ActiveX Script

25 Transforming Data by Using a Lookup Query
Source data (Customer_source):
    Name         State
    D. Smith     FL
    L. Wilson    WY
    P. Salinger  AR
Lookup table (State_lookup):
    Abbreviation  State
    FL            Florida
    WY            Wyoming
    AR            Arkansas
Destination data after the transform (Customer_dim):
    Name         State
    D. Smith     Florida
    L. Wilson    Wyoming
    P. Salinger  Arkansas

26 Implementing a Lookup Query Set Up Connections to Source, Destination, and Lookup Tables Create a Task, and Specify the Source and Destination Add a Lookup Query Definition Map the Source and Destination Columns, and Call the Lookup Query from the ActiveX Script
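The lookup pattern from the previous two slides can be sketched in Python, with an in-memory SQLite database standing in for the lookup connection. The table and column names come from the slide; the function name lookup_state and the use of SQLite are assumptions made for illustration, and in DTS the lookup would be called from the ActiveX script instead.

```python
import sqlite3

# In-memory database standing in for the lookup connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE State_lookup (Abbreviation TEXT PRIMARY KEY, State TEXT)"
)
conn.executemany(
    "INSERT INTO State_lookup VALUES (?, ?)",
    [("FL", "Florida"), ("WY", "Wyoming"), ("AR", "Arkansas")],
)


def lookup_state(abbreviation):
    """Parameterized lookup query: return the state name, or None if absent."""
    row = conn.execute(
        "SELECT State FROM State_lookup WHERE Abbreviation = ?",
        (abbreviation,),
    ).fetchone()
    return row[0] if row else None


# Source rows from the slide; the transform calls the lookup once per row.
customer_source = [("D. Smith", "FL"), ("L. Wilson", "WY"), ("P. Salinger", "AR")]
customer_dim = [(name, lookup_state(abbr)) for name, abbr in customer_source]
```

Returning None for an unknown abbreviation leaves the decision about such rows (reject, log, or load with a placeholder) to the caller.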

27 Defining Transactions You Specifically Must Add a Step or Task to the Transaction You Can Specify When a Transaction Commits DTS Only Supports One Transaction Per Package MS DTC Must Be Running The Data Provider for the Data Destination Must Support Transactions
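The all-or-nothing behavior that a transaction gives a load step can be shown with a small Python sketch. SQLite is used here only as a stand-in; it does not involve MS DTC, and the orders table and load_batch function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL NOT NULL)")
conn.commit()


def load_batch(rows):
    """Insert all rows in one transaction; return True only if all were committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


ok = load_batch([(1, 17.60), (2, 52.80)])
failed = load_batch([(3, 8.82), (3, 1.00)])  # duplicate key aborts the batch
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The second batch fails on its duplicate key, so its first row is rolled back as well and the table still holds only the two rows from the first batch.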

28 Tracking Data Lineage Using Data Lineage Tracks history of data at package and table row levels Provides audit trail of data transformation and DTS package execution Implementing Data Lineage Create the table columns in the data warehouse Add data lineage variables to the DTS package Map data lineage source and destination columns Viewing Data Lineage
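Row-level lineage amounts to stamping every loaded row with values that identify the package execution that produced it. The Python sketch below shows the idea; the column names (lineage_package, lineage_run_id, lineage_loaded_at) and the function stamp_lineage are assumptions and not the names DTS uses.

```python
import uuid
from datetime import datetime, timezone


def stamp_lineage(rows, package_name):
    """Return copies of rows with lineage columns for one package execution."""
    run_id = str(uuid.uuid4())  # one identifier shared by every row in this run
    loaded_at = datetime.now(timezone.utc).isoformat()
    return [
        dict(
            row,
            lineage_package=package_name,
            lineage_run_id=run_id,
            lineage_loaded_at=loaded_at,
        )
        for row in rows
    ]


stamped = stamp_lineage(
    [{"buyer_name": "Barr, Adam"}, {"buyer_name": "Chai, Sean"}],
    "OrdersImport",
)
```

Because every row from one run shares the same lineage_run_id, a warehouse row can later be traced back to the execution that loaded it.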

29 Demonstration: Defining Transactions and Tracking Data Lineage

30 Creating a DTS Package Programmatically
[Diagram] Labels recovered from the package object diagram: DTS Package; Connections; Tasks (Create Process, Send Mail, Bulk Insert, Transfer Objects, Execute SQL, Data-driven Query, Data Pump, ActiveX, Custom); Steps (Precedence Constraints); Global Variables; Source Columns; Destination
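The relationship between steps and precedence constraints can be modeled in a few lines of Python. This is a simplified illustration, not the DTS object model: it handles only an "on success" constraint, and the step names and the run_package function are hypothetical.

```python
def run_package(steps):
    """Run steps in order; a step runs only if all its prerequisites succeeded.

    steps is a list of (name, action, depends_on) tuples, where depends_on
    lists the names of the steps that must have succeeded first.
    """
    results = {}
    for name, action, depends_on in steps:
        if all(results.get(dep) == "success" for dep in depends_on):
            try:
                action()
                results[name] = "success"
            except Exception:
                results[name] = "failure"
        else:
            results[name] = "skipped"
    return results


def failing_load():
    raise RuntimeError("simulated load failure")


results = run_package([
    ("create_table", lambda: None, []),
    ("load_data", failing_load, ["create_table"]),
    ("send_mail", lambda: None, ["load_data"]),
])
```

Here create_table succeeds, load_data fails, and send_mail is skipped because its prerequisite did not succeed.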

31 Recommended Practices Correct and Validate Data at the Source Use an ActiveX Script or a Transact-SQL Script to Transfer and Transform Data Use a Temporary Data Storage Area Save and Store DTS Packages in the Microsoft Repository to Maintain Data Lineage

32 Lab A: Populating a Data Warehouse

33 Review Process Overview Methods of Populating a Data Warehouse Tools for Populating a Data Warehouse Populating a Data Warehouse by Using DTS

