Chapter 5 Using SAS ® ETL Studio. Section 5.1 SAS ETL Studio Overview.

1 Chapter 5 Using SAS ® ETL Studio

2 Section 5.1 SAS ETL Studio Overview

3 3 What Is SAS ETL Studio? SAS ETL Studio, a Java application, is a visual design tool that helps organizations quickly build, implement, and manage ETL processes from source to destination, regardless of the data sources or platforms. Users can standardize metadata across the organization and perform in-depth transformations with minimal programming or manual work to meet enterprise data integration requirements and to support business and analytic intelligence.

4 4 What Is SAS ETL Studio? SAS ETL Studio enables you to perform the following tasks: the Extraction of data from operational data stores, the Transformation of this data, and the Loading of the extracted data into your data warehouse or data mart.
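
The three tasks can be sketched in a few lines of Python. This is only an illustration of the extract, transform, load pattern, not code that SAS ETL Studio produces. The record fields, the order_totals table, and the in-memory SQLite database that stands in for the warehouse are all assumptions made for the example.

```python
import sqlite3

# Hypothetical operational records; the field names are illustrative only.
operational_rows = [
    {"order_id": 1, "quantity": 2, "unit_price": 9.50},
    {"order_id": 2, "quantity": 1, "unit_price": 20.00},
]

def extract(rows):
    """Extraction: read records from the operational store."""
    return list(rows)

def transform(rows):
    """Transformation: derive a total for each order."""
    return [(r["order_id"], r["quantity"] * r["unit_price"]) for r in rows]

def load(conn, rows):
    """Loading: write the transformed rows into a warehouse table."""
    conn.execute("CREATE TABLE order_totals (order_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO order_totals VALUES (?, ?)", rows)

# An in-memory SQLite database stands in for the data warehouse.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(operational_rows)))
loaded = warehouse.execute(
    "SELECT order_id, total FROM order_totals ORDER BY order_id"
).fetchall()
```

Keeping each stage in its own function mirrors the way a job separates reading sources, transforming data, and writing targets.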

5 5 What Is SAS ETL Studio? SAS ETL Studio is an application that enables you to manage ETL process flows by allowing: specification of metadata for sources, such as tables in an operational system; specification of metadata for targets (the tables and other data stores in a data warehouse); and creation of jobs that specify how data is extracted, transformed, and loaded from a source to a target.

6 6 SAS ETL Studio: Change Management In SAS ETL Studio, the change management facility enables multiple SAS ETL Studio users to work with the same metadata repository at the same time without overwriting each other’s changes.

7 7 SAS ETL Studio: Data Surveyor Wizards Optional Data Surveyor wizards can be licensed that provide access to the metadata in enterprise applications, such as PeopleSoft, SAP R/3, Siebel, and Oracle Applications.

8 8 SAS ETL Studio: Metadata CWM Compliant The metadata maintained by SAS ETL Studio is CWM (Common Warehouse Metamodel) compliant and portable to other CWM-compliant applications. Likewise, metadata from other CWM-compliant applications (that is, data modeling tools) can be imported easily into SAS ETL Studio.

9 9 SAS ETL Studio: Data Quality SAS ETL Studio is fully integrated with the data quality software from DataFlux Corporation. Both products now use the same Quality Knowledge Base (QKB), which contains rules, routines, and schemes necessary to integrate data quality into the ETL process.

10 10 Extending SAS ETL Studio Functionality The SAS ETL Studio functionality is extended by Java plug-ins packaged with the product. Further extensions can be implemented by writing additional plug-ins (Java programming required) or by using the Transformation Generator Wizard (no Java programming required).

11 11 Server Connections and SAS ETL Studio As a client, SAS ETL Studio must connect to a SAS Metadata Server to read or write metadata. It must connect to other servers to run SAS code, connect to a third-party database management system, or perform other tasks.

12 12 Interaction with SAS Application Servers SAS ETL Studio can use different types of application servers. SAS Metadata Server: required to read and write metadata in a SAS metadata repository. SAS Workspace Server: required to execute SAS code and access data. SAS/CONNECT Server: required to submit generated SAS code to machines that are remote to the default SAS application server....

13 Section 5.2 The SAS ETL Studio Interface

14 14 SAS ETL Studio: The Interface SAS ETL Studio is a Java client developed to control the ETL process. The interface has several “ease-of-use” features, including: copy and paste in any text field; multiple windows open at one time (including multiple process flow diagrams); a Windows look and feel; and wizard-driven interfaces.

15 15 Tools, Menus, and Online Help SAS ETL Studio takes full advantage of toolbars and pull-down menus. The icons available on the toolbar depend on which window is active from within the interface. [Figure: Menus and Tools]

16 16 The Shortcut Bar One of the most significant features of SAS ETL Studio is the new process-driven functionality. Processes are available via a Shortcut bar on the far left side of the main SAS ETL Studio window. [Figure: Shortcut Bar]

17 17 The Shortcut Bar The Shortcut bar is populated with icons for each task an ETL user would typically perform, including: Source Designer defines metadata about the source(s) for a process. Metadata Importer imports metadata from other applications. Metadata Exporter exports metadata to be used by other applications. Process Designer defines metadata about the ETL processes. (continued)

18 18 The Shortcut Bar Target Designer defines metadata about the target table(s) to be created by the process. Options provides numerous options for the SAS ETL Studio user to customize the look and feel of the application....

19 19 Tree View The SAS ETL Studio Tree View enables you to view the metadata associated with the current metadata repository and to display different views, or “trees”, of the current repository. [Figure: Tree View]

20 20 Tree View There are several tabs available in the tree view area: Inventory Tree lists the metadata objects in the default metadata repository (and any dependent repositories), organized by predetermined groupings. (continued)

21 21 Tree View (continued) Custom Tree lists the metadata objects in the default metadata repository (and any dependent repositories), organized by user-defined groupings of objects.

22 22 Tree View Process Library Tree lists the available data transformations to be used in the ETL process....

23 23 Process Library Tree The Process Library tree displays a collection of transformation templates. There are four collections (folders) of templates that are provided with SAS ETL Studio: Analysis, Data Transforms, Output, and Publish.

24 24 Process Designer View The Process Designer window is the workspace for building ETL processes. The Process Designer view appears as a final step in the Process Designer wizard. Once the process is defined, the Process Designer view is populated with icons that represent the chosen processes. The Process Designer window can be used to: view SQL source code; review the SAS log (from submitting jobs); and view the resulting output from running a SAS job.

25 25 Process Designer and Overview Windows [Figure: Process Designer view and Overview window]...

26 26 Overview Window The Overview window shows you the complete process from the process view. From within the Overview window, you can control which part of the process is displayed in the Process View window.

27 27 SAS ETL Studio Wizards There are shortcuts that invoke wizards to aid the user in performing various tasks with SAS ETL Studio. Some of these wizards are Source Designer, Target Designer, and New Job.

28 28 Source Designer The Source Designer is a wizard-driven interface that enables you to define the physical layout of existing tables using a data dictionary or metadata information from the source system. The result of running the Source Designer successfully is a metadata registration that describes the data source.
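
The idea of deriving a metadata registration from the physical layout of an existing table can be sketched as follows. This is a rough analogy rather than the Source Designer's actual mechanism: SQLite's PRAGMA table_info plays the role of the data dictionary, and the function name register_source_table and the Orders column names are assumptions made for the example.

```python
import sqlite3

def register_source_table(conn, table_name):
    """Build a metadata record that describes an existing table's columns."""
    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
    columns = conn.execute(f"PRAGMA table_info({table_name})").fetchall()
    return {
        "table": table_name,
        "columns": [{"name": c[1], "type": c[2]} for c in columns],
    }

# A hypothetical source system holding an Orders table.
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE Orders (Order_ID INTEGER, Order_Date TEXT)")

orders_metadata = register_source_table(source_db, "Orders")
```

The resulting dictionary describes the table without copying any of its rows, which is the sense in which a registration is metadata only.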

29 29 Target Designer The Target Designer is a wizard that allows metadata to be entered for a target. In designing the target table, you can: access any metadata about any source tables and columns registered in the metadata repository; override any metadata that was imported from another source and add new columns to the target table; and create indexes on the target table being created.

30 30 Target Designer The person designing the target table has full control over the type of table being built. The types of targets that can be built include: database types that are supported by the SAS/ACCESS products; SAS data sets (including both data files and data views); SAS/SHARE data sets; and SPDE tables.

31 31 New Job Wizard The New Job wizard enables you to define the metadata necessary to run an ETL process to load data into a target or targets.

32 32 Additional Wizards Other wizards available to provide assistance with various tasks in SAS ETL Studio include the Metadata Importer, Metadata Exporter, Cube Designer, and Transformation Generator wizard. You can also install optional data surveyor wizards, which provide access to the metadata in enterprise applications, such as PeopleSoft, SAP R/3, Siebel, and Oracle.

33 33 Options Window The Options window can be used to define standard settings for the SAS ETL Studio interface. There are several tabs in the Options window: General Process Editor Metadata Tree SAS Server Data Quality.

34 34 Course Case Study Tasks Recall the case study tasks diagram discussed earlier. Each of these tasks involves either reading or writing (or both) metadata. [Diagram labels: Register Source Tables; Define Data Libraries; Create ETL Jobs; Define Target Tables; Create OLAP Cubes; View and Analyze Data; Create Stored Processes; Create Reports; Create Information Maps; Use the Information Delivery Portal; Metadata]

35 35 SAS ETL Studio Case Study Tasks SAS ETL Studio will concentrate on the following four tasks: Register Source Tables, Define Data Libraries, Create ETL Jobs, and Define Target Tables.

36 36 SAS ETL Studio Case Study These tasks will be performed in sequence: 1. Define Data Libraries (+); 2. Define Source Tables Metadata; 3. Define Target Tables Metadata; 4. Define and Run Jobs.

37 37 SAS ETL Studio Case Study – Setup Tasks Build Custom Tree Groupings: Libraries, Jobs, Source Tables, Target Tables. Define Additional Library Definitions: Target Tables Library, Source Tables Library. Covered in: Demo, Exercises...

38 38 SAS ETL Studio Case Study – Define Sources The Source Designer defines metadata for the source tables: Orders, Order_Item, and Product_List. Covered in: Demo, Exercises...

39 39 SAS ETL Studio Case Study – Define Targets The Target Designer defines metadata for the target tables: OrderFact and ProductDim. Covered in: Demo*, Exercises. * Some derived columns for OrderFact are completed in the exercises....

40 40 SAS ETL Studio Case Study – Define Jobs The Process Designer defines metadata for jobs that contain the process flow diagrams necessary to load the target tables: Populate the OrderFact table; Populate the ProductDim table. Covered in: Demo, Exercises...

41 41 Creating the OrderFact Table The OrderFact table will be created from the Orders and Order_Item tables. [Diagram labels: Target Table, Source Tables]...

42 42 Creating the OrderFact Table The source tables, Orders and Order_Item, will be combined using the SQL Join transformation. The SQL Join will be used to define computed columns. [Diagram label: SQL Join]...

43 43 Creating the OrderFact Table The table that is the result of the SQL Join will then be loaded into the OrderFact table. [Diagram label: Loader]...
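
The join-then-load flow for OrderFact can be approximated with plain SQL. The sketch below uses Python's sqlite3 module and is not the SAS code that the job would generate. The table names come from the case study, but every column name (Order_ID, Customer_ID, Quantity, Unit_Price, Total_Price) and every data value is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (Order_ID INTEGER, Customer_ID INTEGER);
    CREATE TABLE Order_Item (Order_ID INTEGER, Quantity INTEGER, Unit_Price REAL);
    INSERT INTO Orders VALUES (1, 100), (2, 101);
    INSERT INTO Order_Item VALUES (1, 2, 5.0), (1, 1, 3.0), (2, 4, 2.5);
    CREATE TABLE OrderFact (Order_ID INTEGER, Customer_ID INTEGER, Total_Price REAL);
""")

# Join the two source tables, define a computed column (Total_Price),
# and load the result into the target table in one statement.
conn.execute("""
    INSERT INTO OrderFact
    SELECT o.Order_ID, o.Customer_ID, i.Quantity * i.Unit_Price AS Total_Price
    FROM Orders AS o
    JOIN Order_Item AS i ON o.Order_ID = i.Order_ID
""")

order_fact = conn.execute(
    "SELECT Order_ID, Customer_ID, Total_Price FROM OrderFact "
    "ORDER BY Order_ID, Total_Price"
).fetchall()
```

In the process flow the join and the load are separate steps (SQL Join, then Loader); they are merged into a single INSERT ... SELECT here only to keep the sketch short.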

44 44 Creating the ProductDim Table The ProductDim table will be created from the Product_List table. [Diagram labels: Target Table, Source Table]...

45 45 Creating the ProductDim Table The Extract transformation will be used so that a computed column can be defined. [Diagram label: SAS Extract]...

46 46 Creating the ProductDim Table The results of the Extract transformation will then be loaded into the target table, ProductDim. [Diagram label: Loader]...
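
A minimal sketch of the extract-with-computed-column step followed by a load, written in plain Python. The slides do not say which column is computed for ProductDim, so the Margin column, the other column names, and the sample rows are all hypothetical.

```python
# Hypothetical Product_List rows; the column names are assumptions.
product_list = [
    {"Product_ID": 1, "Product_Name": "Kids T-Shirt", "Cost": 4.0, "Price": 10.0},
    {"Product_ID": 2, "Product_Name": "Running Shoes", "Cost": 30.0, "Price": 45.0},
]

def extract_with_margin(rows):
    """Extract step: pass each row through and add a computed Margin column."""
    return [{**row, "Margin": row["Price"] - row["Cost"]} for row in rows]

def load(target, rows):
    """Load step: append the extracted rows to the target table."""
    target.extend(rows)
    return target

# An empty list stands in for the ProductDim target table.
product_dim = load([], extract_with_margin(product_list))
```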

47 47 SAS ETL Studio Case Study – Setup Tasks Build Custom Tree Groupings: Libraries, Jobs, Source Tables, Target Tables. Define Additional Library Definitions: Target Tables Library, Source Tables Library. Covered in: Demo, Exercises.

48 48 Demonstration: Creating a Logical Grouping and Adding a Library Definition. This demonstration shows how to define a logical grouping object and create a library definition to store in the new grouping.

49 49 Exercises. This exercise creates logical grouping elements and defines two SAS libraries.

50 50 Using the Source Designer The Source Designer is a wizard that generates metadata for one or more selected tables, based on the physical structure of the table(s). The Source Designer can be used to specify metadata for any existing table, not just tables used as data sources for ETL jobs.

51 51 Using the Source Designer The Source Designer is an easy-to-use wizard interface....

52 52 SAS ETL Studio Case Study – Define Sources The Source Designer defines metadata for the source tables: Orders, Order_Item, and Product_List. Covered in: Demo, Exercises.

53 53 Demonstration: Add a Source Table Definition. This demonstration shows how to add a source table definition for the Orders table.

54 54 Exercises. These exercises add source table definitions for several source tables.

55 55 Using the Target Designer The Target Designer is a wizard that can create new metadata about a single table that might or might not already exist in physical storage. It can also be used to create and edit metadata about an OLAP cube.

56 56 Using the Target Designer The Target Designer is an easy-to-use wizard interface....

57 57 SAS ETL Studio Case Study – Define Targets The Target Designer defines metadata for the target tables: OrderFact and ProductDim. Covered in: Demo*, Exercises. * Some derived columns for OrderFact are completed in the exercises.

58 58 Demonstration: Defining a Target Table. This demonstration illustrates defining a target table definition for the OrderFact table.

59 59 Exercises. These exercises add target table definitions for several tables.

60 60 Using the Process Designer The Process Designer invokes the New Job wizard to create metadata about a job. That metadata is used to build a process flow diagram for the job. A job is a metadata object that specifies processes that create output. SAS ETL Studio organizes sources, targets, and transformations into jobs that can be displayed in a process flow diagram. SAS ETL Studio uses each job to generate and/or retrieve SAS code that reads sources and creates targets on a file system.
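
To make the idea of "a job is a metadata object from which code is generated" concrete, here is a small sketch. It is an analogy only: the Job class, its fields, and generate_code are hypothetical names, and the generator emits a SQL string rather than the SAS code that SAS ETL Studio would produce. It also assumes exactly two sources.

```python
from dataclasses import dataclass

@dataclass
class Job:
    """Hypothetical job metadata: what to read, how to combine it, what to create."""
    name: str
    sources: list
    target: str
    join_condition: str
    columns: list

def generate_code(job):
    """Render the job's metadata as a SQL statement (a stand-in for generated SAS code)."""
    select_list = ", ".join(job.columns)
    # This sketch handles exactly two source tables.
    from_clause = f"{job.sources[0]} JOIN {job.sources[1]} ON {job.join_condition}"
    return f"CREATE TABLE {job.target} AS SELECT {select_list} FROM {from_clause}"

populate_order_fact = Job(
    name="Populate OrderFact",
    sources=["Orders", "Order_Item"],
    target="OrderFact",
    join_condition="Orders.Order_ID = Order_Item.Order_ID",
    columns=["Orders.Order_ID", "Order_Item.Quantity"],
)
generated = generate_code(populate_order_fact)
```

Because the job is only a description, the same metadata can be displayed as a process flow diagram or turned into code without being edited by hand.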

61 61 Using the Process Designer The New Job wizard prompts for information that is used to build a template in the Process Designer....

62 62 SAS ETL Studio Case Study – Define Jobs The Process Designer defines metadata for jobs that contain the process flow diagrams necessary to load the target tables: Populate the OrderFact table; Populate the ProductDim table. Covered in: Demo, Exercises.

63 63 Demonstration: Defining a Job. This demonstration shows how to define a job for the OrderFact target table and enter information about the extraction and transformation of data.

64 64 Exercises. This exercise shows how to create a new job and enter information about the extraction and transformation of data.

65 65 Demonstration: Loading the Target Tables. This demonstration shows how to specify the load process attributes and how to execute and verify a job.

66 66 Exercises. This exercise shows how to specify the load process attributes and how to execute and verify a job.

67 Section 5.3 Advanced SAS ETL Studio Features (Self-Study)

68 68 Advanced Features This section introduces several advanced features to review on your own, including Data Quality Plug-Ins, Importing and Exporting Metadata, and Change Management.

69 69 Data Quality Plug-Ins SAS ETL Studio contains two data quality transformation templates in the Process Library tree. Create Match Code: used to create a job that creates match codes and cluster numbers for a specified source column, based on a set of criteria. Apply Lookup Standardization: used to create a job that standardizes the values of a source column according to the contents of a specified standardization scheme. These templates increase the value of data through data analysis and data cleansing....

70 70 Data Quality Plug-Ins To use the data quality transformation templates: the SAS Data Quality Server software must be installed; a SAS application server must be configured to access a Quality Knowledge Base; and the Quality Knowledge Base must contain the locales needed to reference data quality jobs. When the prerequisites have been met, the data quality transformations can be dragged into process flow diagrams.

71 71 Create Match Code Plug-In The Create Match Code plug-in is a tabbed dialog box that reads the Quality Knowledge Base for the specified locale and creates match codes based on the user-specified criteria. The match code can then be used to de-duplicate data or join data as part of the transformation step in defining the target table.
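
How a match code supports de-duplication can be shown with a toy example. The normalization below (uppercase, keep letters only, drop vowels after the first letter) is an invented stand-in and is not the algorithm used by the Quality Knowledge Base; the function names and sample values are also assumptions.

```python
import re

def match_code(value):
    """Toy match code: uppercase, keep letters only, drop vowels after the first letter."""
    letters = re.sub(r"[^A-Z]", "", value.upper())
    if not letters:
        return ""
    return letters[0] + re.sub(r"[AEIOU]", "", letters[1:])

def deduplicate(values):
    """Keep the first value seen for each match code."""
    first_seen = {}
    for value in values:
        first_seen.setdefault(match_code(value), value)
    return list(first_seen.values())

names = ["Smith, John", "SMITH JOHN", "Smyth, Jon", "Jones, Ann"]
unique_names = deduplicate(names)
```

Values that differ only in case, spacing, or punctuation share a match code and collapse to one entry, while "Smyth, Jon" survives because this toy scheme is stricter than a real, locale-aware one would be.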

72 72 Apply Lookup Standardization Plug-In The Apply Lookup Standardization plug-in is a tabbed dialog box that reads the Quality Knowledge Base for the specified locale and loads all of the available standardization schemes. You can then apply the scheme to one of the source columns as part of the transformation step in defining the target table.
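
Applying a standardization scheme to a source column amounts to a lookup. The sketch below assumes a scheme can be represented as a simple mapping from raw values to standard values; the scheme contents, the apply_scheme function, and the sample column are all hypothetical.

```python
# Hypothetical standardization scheme: raw value (lowercased) -> standard value.
scheme = {
    "intl business machines": "IBM",
    "i.b.m.": "IBM",
    "sas institute inc": "SAS Institute",
}

def apply_scheme(value, scheme):
    """Return the standard form of a value, or the value unchanged if it is not in the scheme."""
    return scheme.get(value.strip().lower(), value)

source_column = ["I.B.M.", " Intl Business Machines", "Acme Corp"]
standardized = [apply_scheme(value, scheme) for value in source_column]
```

Values with no entry in the scheme pass through untouched, so applying a scheme never discards data.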

73 73 Metadata Import Wizard The Metadata Import Wizard is an interface for importing metadata files that are compliant with the Common Warehouse Metamodel (CWM) standard. By using the Import Wizard to import the metadata from a previously defined data model (source tables or target tables), you do not have to enter the metadata for each table individually. You simply reference a location for the model file, which was created by a third-party modeling tool.

74 74 Which Metadata Can Be Imported? The CWM standard for metadata was developed by Object Management Group (OMG). More information about OMG and the CWM metadata standard can be obtained from: http://www.omg.org More information about Meta Integration Technology, Inc., and the purchase of MIMBs, can be obtained from the following location: http://www.metaintegration.net

75 75 Metadata Export Wizard The Metadata Export Wizard is an interface for exporting metadata from within SAS ETL Studio to third-party CWM-compliant applications. The user has the ability to specify the path and the file to create from the export of the metadata. Once the user completes the Metadata Exporter wizard, a confirmation window verifies all of the selections the user has made for the export of the metadata. Upon exiting this window, the metadata is written to the external file that was specified in the wizard.

76 76 Change Management SAS ETL Studio enables you to create metadata objects that define sources, targets, and the transformations that connect them. These objects are saved to one or more metadata repositories. The change management feature (or more specifically, metadata source control) enables multiple SAS ETL Studio users to work with the same metadata repository at the same time without overwriting each other's changes.

77 77 Change Management Change management features in SAS ETL Studio include: menus that support change management operations such as check out and check in; the Inventory tree and the Custom tree for working with metadata that is contained in a change-managed repository; the Project tree for working with metadata that is contained in a project repository; and an audit history for each metadata object.

78 78 Change Management After an object has been checked out by one person, it is locked so that it cannot be updated by another person until the object has been checked back in. The only people who can change the metadata in a change-managed repository are: the person who started the metadata server; administrators who have write access to the repository; and any users who are authorized to use a project repository for the change-managed repository.
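
The check-out and check-in rule can be modeled as a simple lock table. This is a conceptual sketch, not how the SAS Metadata Server implements change management; the class name, method names, and user names are hypothetical, and authorization checks are left out.

```python
class ChangeManagedRepository:
    """Toy model of check-out locking for metadata objects."""

    def __init__(self):
        self._locks = {}  # object name -> user who holds the check-out

    def check_out(self, obj, user):
        """Lock an object for one user; fail if another user already holds it."""
        holder = self._locks.get(obj)
        if holder is not None and holder != user:
            raise PermissionError(f"{obj} is checked out by {holder}")
        self._locks[obj] = user

    def check_in(self, obj, user):
        """Release the lock; only the user who checked the object out may do so."""
        if self._locks.get(obj) != user:
            raise PermissionError(f"{obj} is not checked out by {user}")
        del self._locks[obj]

repo = ChangeManagedRepository()
repo.check_out("OrderFact", "ann")

try:
    repo.check_out("OrderFact", "bob")  # a second user is refused while ann holds the lock
    blocked = False
except PermissionError:
    blocked = True

repo.check_in("OrderFact", "ann")
repo.check_out("OrderFact", "bob")  # succeeds once the object is checked back in
```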

