1
Custom Auditing in SSIS
- Meghana Vasavada
2
Contents
- A bit about myself
- SQL Server Business Intelligence
- SQL Server Integration Services (SSIS)
- Essential aspects in data warehousing
- Key components of package auditing
- Performance tuning
- Demo: Custom auditing
- Takeaways
- References/Resources
- Questions
- Evaluation
3
Briefing Myself
- SQL Server Database Developer and Analyst
- Big Data Certified
- M.S. Bioinformatics (2013)
4
SQL Server Business Intelligence
Business Intelligence: a set of processes, tools, technologies and methodologies that consolidate raw data scattered across functional areas and transform it into meaningful information, providing the insight needed to drive strategic or tactical decision making.
5
SQL Server Business Intelligence
Integrate: Microsoft SQL Server Integration Services
- Data acquisition from source systems and integration
- Data transformation and synthesis
Analyze: Microsoft SQL Server Analysis Services
- Data enrichment, with business logic, hierarchical views
- Data discovery via data mining
Report: Microsoft SQL Server Reporting Services
- Data presentation and distribution
- Data access for the masses
6
SQL Server Business Intelligence
How does BI help business?
- Enables fast, fact-based, informed strategic and tactical decision making in a timely manner
- Aligns the organization toward its key objectives
- Keeps track of progress over time
- Decreases operational costs and improves operational efficiency
- Improves the operational efficiency of each operational area, for example by showing what your true manufacturing costs are and how to optimize them
- Identifies cross-selling and up-selling opportunities for the business
- Shows how the business is doing over a period of time, i.e. where it has been, where it is now and where it is going
7
SQL Server Business Intelligence
Who needs Business Intelligence?
- Top-level executives and business decision makers
- Middle-level management
- Line managers
- External customers/vendors
- Business analysts
8
SQL Server Integration Services (SSIS)
- A component of the Microsoft SQL Server database software.
- A Microsoft SQL Server Business Intelligence application.
- A fast and flexible data warehousing tool used for data extraction, transformation, and loading (ETL).
- The platform for a new generation of high-performance data integration technologies.
9
SQL Server Integration Services (SSIS)
Customer benefits of SSIS
- Performance: data flows process large volumes of data efficiently
- Facility: many prebuilt adapters and transformations reduce hand coding
- Highly productive visual environment
- Data cleansing features
- Data mining
10
SQL Server Integration Services (SSIS)
Basic aspects in data warehouse integration
- ETL
- Package configurations
- Error handling
- Event handling / tracking events
- Logging / auditing
- Package deployment
11
SQL Server Integration Services (SSIS)
ETL = Extract, Transform, Load (Source to Destination)
- Extract: get the data from the source system as efficiently as possible
- Transform: perform calculations on the data
- Load: load the data into the target storage
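The three ETL stages can be sketched in a few lines. This is only an illustration in Python, not an SSIS package: the inline CSV source, the `sales` table, and the in-memory SQLite database standing in for the destination are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real package would read from a source system.
SOURCE_CSV = "order_id,quantity,unit_price\n1,2,10.00\n2,1,4.50\n"


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: compute a sales amount for each row."""
    return [
        (int(r["order_id"]), int(r["quantity"]) * float(r["unit_price"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the transformed rows to the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the destination database
loaded = load(conn, transform(extract(SOURCE_CSV)))
```

Keeping each stage as its own function mirrors the source/transformation/destination split of a data flow and makes it easy to count rows between stages.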
12
SQL Server Integration Services (SSIS)
Dimensional Data Modeling:
- Data structure technique optimized for data warehousing tools.
- The concept was developed by Ralph Kimball.
- Comprised of "fact" and "dimension" tables.
- Designed to read, summarize, and analyze numeric information such as values, balances, counts, weights, etc. in a data warehouse.
- Used in data warehouse systems.
13
SQL Server Integration Services (SSIS)
Components of Dimensional Data Modeling: Star Schema
Fact table:
- Fact_Sales: Customer_ID, Product_ID, Order_ID, Shipping_ID, Sales Amount
Dimension tables:
- Dim_Product: Product_ID, Product_name, Attributes
- Dim_Customer: Customer_ID, Name, Title, Phone
- Dim_Order: Order_ID, Unit_Price, Order_date
- Dim_Shipment: Shipping_ID, Ship_date, Ship_country
Relationships: Fact_Sales links to each dimension table through the shared ID column.
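The star schema on this slide could be sketched as follows. The table and column names come from the slide; the column types, the sample rows, and the use of an in-memory SQLite database in place of SQL Server are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
conn.executescript("""
CREATE TABLE Dim_Product  (Product_ID INTEGER PRIMARY KEY, Product_name TEXT);
CREATE TABLE Dim_Customer (Customer_ID INTEGER PRIMARY KEY, Name TEXT, Title TEXT, Phone TEXT);
CREATE TABLE Dim_Order    (Order_ID INTEGER PRIMARY KEY, Unit_Price REAL, Order_date TEXT);
CREATE TABLE Dim_Shipment (Shipping_ID INTEGER PRIMARY KEY, Ship_date TEXT, Ship_country TEXT);

-- The fact table holds the measure plus one key per dimension.
CREATE TABLE Fact_Sales (
    Customer_ID  INTEGER REFERENCES Dim_Customer(Customer_ID),
    Product_ID   INTEGER REFERENCES Dim_Product(Product_ID),
    Order_ID     INTEGER REFERENCES Dim_Order(Order_ID),
    Shipping_ID  INTEGER REFERENCES Dim_Shipment(Shipping_ID),
    Sales_Amount REAL
);

-- Made-up sample rows.
INSERT INTO Dim_Product  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO Dim_Customer VALUES (1, 'A. Buyer', 'Ms', '555-0100');
INSERT INTO Dim_Order    VALUES (10, 5.0, '2019-01-01'), (11, 7.5, '2019-01-02');
INSERT INTO Dim_Shipment VALUES (100, '2019-01-03', 'US');
INSERT INTO Fact_Sales   VALUES (1, 1, 10, 100, 15.0), (1, 2, 11, 100, 7.5), (1, 1, 11, 100, 5.0);
""")

# A typical star-schema query: join the fact table to a dimension and aggregate.
sales_by_product = conn.execute("""
    SELECT p.Product_name, SUM(f.Sales_Amount)
    FROM Fact_Sales AS f
    JOIN Dim_Product AS p ON p.Product_ID = f.Product_ID
    GROUP BY p.Product_name
    ORDER BY p.Product_name
""").fetchall()
```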
14
SQL Server Integration Services (SSIS)
Slowly Changing Dimensions (SCDs)
15
SQL Server Integration Services (SSIS)
SSIS Package Configuration: A configuration is a property/value pair that you add to a completed package. Used to update the values of properties at run time. For example, by using a configuration, you can change the connection string of a connection manager, or update the value of a variable.
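The idea of a configuration as a property/value pair applied at run time can be sketched as below. This is a conceptual Python model, not the SSIS object model: the `Package` class, the `apply_configurations` helper, and the property path strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical stand-in for a package with configurable properties."""
    properties: dict = field(default_factory=dict)


def apply_configurations(package, configurations):
    """Overwrite design-time property values with run-time values."""
    for prop_path, value in configurations.items():
        if prop_path not in package.properties:
            raise KeyError(f"unknown property: {prop_path}")
        package.properties[prop_path] = value
    return package


# Design-time values stored in the package (property paths are made up).
pkg = Package(properties={
    "ConnectionManagers[Target].ConnectionString": "Server=dev;Database=DW",
    "Variables[BatchSize].Value": 1000,
})

# A configuration supplies a new value for one property at run time.
config = {"ConnectionManagers[Target].ConnectionString": "Server=prod;Database=DW"}
apply_configurations(pkg, config)
```

Only the configured property changes; every other property keeps its design-time value.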
16
SQL Server Integration Services (SSIS)
Types of SSIS Package Configuration:
Configuration type: Use
- XML Configuration File: Saves the configuration settings in an XML file. The most common approach when deploying packages.
- Environment Variable: Lets you use environment variables to save the configuration settings or package variable values.
- Registry Entry: Lets you save the configuration settings in the registry.
- Parent Package Variable: Saves the configuration settings in a variable. You can use this variable to update properties in child packages.
- SQL Server: Configuration settings are saved in a table in the SQL Server database.
17
SQL Server Integration Services (SSIS)
Error Handling in SSIS: reasons a package task fails
- Failure of ancestor controls
- Truncation or source/destination connection issue
- Conversion failure
- Issues from migrating files and file data
- Package error due to privileges of OS controls
- Other failure reasons
Result: Package task failed!
18
SQL Server Integration Services (SSIS)
Error Handling in SSIS
- Control Flow: add a failure constraint that redirects the workflow to an alternate task.
- Data Flow: send the row out to an error path by configuring the error output of the source, destination or transformation to redirect the row, and save it to review later.
- Event Handler: handle the OnError event, which is edited in a separate window. Write a custom script or just send an email to a team to notify them of the error.
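The data flow option, redirecting failing rows to an error path instead of failing the whole load, might look like this in outline. It is a Python sketch of the pattern rather than SSIS itself; the `convert_rows` function and the sample rows are assumptions.

```python
def convert_rows(rows):
    """Split rows into converted rows and redirected error rows."""
    good, errors = [], []
    for row in rows:
        try:
            good.append({"id": int(row["id"]), "amount": float(row["amount"])})
        except (KeyError, ValueError) as exc:
            # Redirect the row to the error path and keep the reason for review.
            errors.append({"row": row, "error": str(exc)})
    return good, errors


# Made-up source rows: the second and third fail type conversion.
source_rows = [
    {"id": "1", "amount": "19.99"},
    {"id": "2", "amount": "n/a"},
    {"id": "x", "amount": "5"},
]
good_rows, error_rows = convert_rows(source_rows)
```

The error list plays the role of the saved error output, and its length is the kind of error row count an audit table can record.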
19
SQL Server Integration Services (SSIS)
Logging in SSIS: Log providers
Log provider: Description
- Text File log provider: writes log entries to ASCII text files in a comma-separated value (CSV) format (.log)
- SQL Server Profiler log provider: writes traces that you can view using SQL Server Profiler (.trc)
- SQL Server log provider: writes log entries to the sysssislog table in a SQL Server database
- Windows Event log provider: writes entries to the Application log in the Windows Event log on the local computer
- XML File log provider: writes log files to an XML file (.xml)
20
SQL Server Integration Services (SSIS)
Event Handling in SSIS
Integration Services packages are event-driven. This means we can specify routines to execute when a particular event occurs. An event can be the completion of a task or an error that occurs during task execution.
Event handler: Description
- OnError: Generated as the result of an error condition
- OnPreValidate: Fired before the validation process starts
- OnQueryCancel: Fired when the user clicks Cancel, or during execution so that an executable can determine whether it should stop running
- OnTaskFailed: Signals the failure of a task and typically follows the OnError event
- OnPreExecute: Indicates that an executable component is about to be launched
- OnPostExecute: Takes place after an executable component finishes running
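The event-driven model can be illustrated with a small handler registry. This Python sketch is not the SSIS runtime: the `Executable` class is hypothetical, and the exact firing order shown (OnPreExecute, OnError, OnTaskFailed, OnPostExecute) is an assumption based on the descriptions in the table.

```python
from collections import defaultdict


class Executable:
    """Hypothetical task that raises named events while it runs."""

    def __init__(self, name):
        self.name = name
        self.handlers = defaultdict(list)

    def on(self, event, handler):
        """Register a routine to execute when the event occurs."""
        self.handlers[event].append(handler)

    def fire(self, event, detail=""):
        for handler in self.handlers[event]:
            handler(self.name, event, detail)

    def run(self, work):
        self.fire("OnPreExecute")
        try:
            work()
        except Exception as exc:
            self.fire("OnError", str(exc))
            self.fire("OnTaskFailed")
        finally:
            self.fire("OnPostExecute")


events = []


def record(name, event, detail):
    """Handler that records each event, e.g. for an audit log."""
    events.append((name, event, detail))


def failing_work():
    raise RuntimeError("conversion failed")


task = Executable("Load staging")
for event_name in ("OnPreExecute", "OnError", "OnTaskFailed", "OnPostExecute"):
    task.on(event_name, record)
task.run(failing_work)
```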
21
SQL Server Integration Services (SSIS)
SSIS Auditing
- Add task(s) to the event handlers of the package.
- Select auditing for the entire package or for a specific task.
- Select events such as OnError, OnPostExecute, OnVariableValueChanged, etc.
- Inside every data flow task, add Row Count components after the source and before the target to track the extracted and loaded row counts.
- Add variables at package-level scope to store the row count for each data flow.
- Add variables in the OnPostExecute event handler scope to store certain information about the data flow source/target (e.g. Query, TableName ...).
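The row-count step can be sketched as a pass-through counter placed after the source and before the target. This is a Python illustration of the idea, not the SSIS Row Count transformation; the `RowCount` class, the sample rows, and the filter are assumptions.

```python
class RowCount:
    """Pass rows through unchanged while counting them."""

    def __init__(self):
        self.count = 0

    def __call__(self, rows):
        for row in rows:
            self.count += 1
            yield row


extract_count = RowCount()  # placed right after the source
load_count = RowCount()     # placed right before the target

# Made-up source rows; the filter stands in for any transformation.
source = [
    {"id": 1, "active": True},
    {"id": 2, "active": False},
    {"id": 3, "active": True},
]
extracted = extract_count(source)
filtered = (row for row in extracted if row["active"])
target = list(load_count(filtered))
```

After the flow finishes, `extract_count.count` and `load_count.count` hold the values that would be copied into package-level variables and written to the audit table.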
22
SQL Server Integration Services (SSIS)
SSIS Auditing: helps to answer the following questions
- Which package was run and for how long?
- Who owns the package? Or who modified the package?
- When was the package executed?
- What kind of data and how many records were written or changed by ETL?
- What kind of errors and how many errors occurred?
23
SQL Server Integration Services (SSIS)
Benefits that custom auditing and logging can bring to your ETL process
- Help you meet regulatory compliance requirements
- Provide a deep understanding of database activity and additional insight into data anomalies that may occur
- Help answer important questions like, “When was that row last updated?”
- Help you identify specific data for targeted rollbacks
24
SQL Server Integration Services (SSIS)
Logging vs Custom Auditing
Logging:
- Captures metadata: information about the package execution itself
- Errors encountered, execution time for the package, data bytes, data flow buffer details, machine name, package name, task name
- Choose the log providers and their location
- Provides limited information on the package
Custom Auditing:
- Captures information about the data, along with package metadata
- Row counts of extracts, inserts, updates, deletes and errors; you can default the status of the changed rows and package execution
- Use Execute SQL tasks to define variables, bind parameters, and assign values to the parameters
- Designed for DBAs/users who can query and request more information about the data
25
SQL Server Integration Services (SSIS)
Question: If Change Data Capture (CDC) reads and tracks all historical data and net changes from the SQL Server transaction logs, then why not use CDC for auditing?
26
SQL Server Integration Services (SSIS)
Answer: There’s a downside to this
- The amount of history data can become huge fast.
- It does not return all the information about the changes that you might need, e.g. who made the change, when and how (when a record is deleted or updated).
- Delay is possible: the history data takes some time to catch up, because it is based on the transaction logs and the operation is asynchronous.
- It depends on SQL Server Agent. If the Agent is not running or crashes, no history is tracked.
27
SQL Server Integration Services (SSIS)
Longer running package? How would you optimize the package execution?
- Execute SSIS tasks in parallel.
- For incremental loads, use an Execute SQL task instead of the OLE DB Command transformation to process the updates and new inserts.
- Avoid processing redundant columns in the data flow task.
- Keep an eye on buffers and the execution tree.
- Avoid using checkpoints while auditing an SSIS package, as they cannot store variables of type Object and cannot integrate with, or are most often ignored by, event handlers.
- Use Lookup and Conditional Split to customize the SCD workflow.
- Enable error handling and logging on package failure.
28
SQL Server Integration Services (SSIS)
Key Components in SSIS Package Custom Auditing
- ETL: data warehouse tables (staging, dimensions, facts, data marts)
- Slowly Changing Dimensions Type 1/2
- Extract metadata and row counts (DML operations)
- Parent-package configuration
- OnError event handler to capture the error message
- Audit SSIS package
29
Demo
An audit table was structured by defining the components with the required information regarding the metadata and the transactional records.
Attribute: Definition
- Audit Key: A globally unique ID assigned to every execution of an ETL package or packages in the target table. Usually it is an auto-identity integer starting from 1.
- ParentAuditKey: Surrogate ID assigned to the execution of child packages as metadata. The ID is inherited from the audit key of the master package. Batch/Load ID mapped to the process of every load/update.
- PkgName: Name of every corresponding package executed, including the master package and child packages.
- PkgID: Internal GUID of every SSIS package.
- ExecStartDT: Start time of package execution.
30
Demo (contd.)
Attribute: Definition
- ExecStopDT: End time of package execution.
- TableName: The table name, assigned when the package executes to define or populate a table.
- PkgName: Name of every corresponding package executed, including the master package and child packages.
- ExecutionInstanceGUID: The globally unique ID for every process, generated by SSIS.
- ExtractRowCnt: The count of records extracted from the source file.
- InsertRowCnt: The count of records inserted in staging and dimension tables by the ETL process.
- UpdateRowCnt: The count of updated records, especially in dimension tables that use the SCD Type II functionality.
31
Demo (contd.)
Attribute: Definition
- ErrorRowCnt: The number of records that were erroneous or not processed in ETL.
- TableInitialRowCnt: The number of records initially existing in any staging and dimension table.
- TableFinalRowCnt: The total number of records in any staging and dimension table after inserts, updates or deletes in the execution process.
- DeleteRowCnt: The number of records deleted in the ETL process.
- SuccessfullyProcessingInd: The execution status of every ETL run. Set to ‘Y’ if executed successfully. The default is ‘N’.
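The audit table described over the last three slides could be declared roughly as follows. The column names are taken from the slides; the table name `PackageAudit`, the column types, and the use of SQLite through Python in place of SQL Server are assumptions, so treat this as a sketch rather than the schema used in the demo.

```python
import sqlite3

# Hypothetical DDL: column names follow the slides, types are assumed.
AUDIT_DDL = """
CREATE TABLE PackageAudit (
    AuditKey                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ParentAuditKey            INTEGER,
    PkgName                   TEXT,
    PkgID                     TEXT,
    ExecStartDT               TEXT,
    ExecStopDT                TEXT,
    TableName                 TEXT,
    ExecutionInstanceGUID     TEXT,
    ExtractRowCnt             INTEGER,
    InsertRowCnt              INTEGER,
    UpdateRowCnt              INTEGER,
    ErrorRowCnt               INTEGER,
    TableInitialRowCnt        INTEGER,
    TableFinalRowCnt          INTEGER,
    DeleteRowCnt              INTEGER,
    SuccessfullyProcessingInd TEXT NOT NULL DEFAULT 'N'
)
"""

conn = sqlite3.connect(":memory:")  # stand-in for the target database
conn.execute(AUDIT_DDL)

# Starting an execution only needs the package name and start time;
# the key is generated and the status defaults to 'N'.
cur = conn.execute(
    "INSERT INTO PackageAudit (PkgName, ExecStartDT) VALUES (?, datetime('now'))",
    ("Master",),
)
first_key = cur.lastrowid
status = conn.execute(
    "SELECT SuccessfullyProcessingInd FROM PackageAudit WHERE AuditKey = ?",
    (first_key,),
).fetchone()[0]
```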
32
Demo (contd.): Workflow
1. Master package (capturing metadata)
- Add an Execute SQL task that checks the number of rows affected and inserts a temporary dummy row at the beginning of the audit table.
- Add Audit Key and Parent Audit Key variables, and add an Execute SQL task that stores the highest Audit Key value in the parameter.
- Add an Execute SQL task that populates the metadata in the audit table. In the Task Editor, connect to the target database and write a T-SQL query that inserts the metadata through the defined parameters, including the audit and parent audit keys.
33
Demo (contd.): Workflow
1. Master package (calling child packages in the data warehouse)
- Add an Execute Package task to call the child package (e.g. the audit package of a staging table), which processes the loading and auditing of the first staging table.
- Add an Execute Package task to call the child package (e.g. the audit package of a dimension table), which processes the loading and auditing of the first dimension table.
- Finally, update the metadata in the audit table, specifically the end time of the execution and the success status of the execution process.
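The master/child workflow can be sketched end to end as below. This is a simplified Python model, not the demo package: it uses SQLite in place of SQL Server, a reduced hypothetical `PackageAudit` table, and the generated row ID instead of the dummy row and highest-key lookup described on the previous slide. The helper names and package names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the target database
conn.execute("""
CREATE TABLE PackageAudit (
    AuditKey                  INTEGER PRIMARY KEY AUTOINCREMENT,
    ParentAuditKey            INTEGER,
    PkgName                   TEXT,
    ExecStartDT               TEXT,
    ExecStopDT                TEXT,
    InsertRowCnt              INTEGER,
    SuccessfullyProcessingInd TEXT NOT NULL DEFAULT 'N'
)""")


def start_audit(conn, pkg_name, parent_key=None):
    """Insert the metadata row at package start and return its audit key."""
    cur = conn.execute(
        "INSERT INTO PackageAudit (ParentAuditKey, PkgName, ExecStartDT) "
        "VALUES (?, ?, datetime('now'))",
        (parent_key, pkg_name),
    )
    return cur.lastrowid


def finish_audit(conn, audit_key, insert_rows=None):
    """Record the end time, row count and success status."""
    conn.execute(
        "UPDATE PackageAudit SET ExecStopDT = datetime('now'), "
        "InsertRowCnt = ?, SuccessfullyProcessingInd = 'Y' WHERE AuditKey = ?",
        (insert_rows, audit_key),
    )


def run_child(conn, pkg_name, parent_key, rows):
    """Child package: inherits the master's audit key as its parent key."""
    key = start_audit(conn, pkg_name, parent_key)
    # A real child package would load its table here.
    finish_audit(conn, key, insert_rows=len(rows))
    return key


master_key = start_audit(conn, "Master")
staging_key = run_child(conn, "Load_Staging", master_key, rows=[1, 2, 3])
dim_key = run_child(conn, "Load_Dimension", master_key, rows=[1, 2])
finish_audit(conn, master_key)

audit_rows = conn.execute(
    "SELECT AuditKey, ParentAuditKey, PkgName, InsertRowCnt, "
    "SuccessfullyProcessingInd FROM PackageAudit ORDER BY AuditKey"
).fetchall()
```

A failed child would simply never reach `finish_audit`, leaving its status at the default 'N', which is what makes the default useful for spotting incomplete runs.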
34
Takeaways
- SQL Server Integration Services is an exceptionally high-performance integration and transformation tool.
- Customize auditing with Execute SQL tasks, Row Count components, parameters, and system package variables to capture transactional information (DML operations) and metadata.
- If implementing SCD Type 2, consider an alternative to the SCD transform component to preserve historical records and counts in dimension tables.
- Implement error-capturing strategies in data flow and control flow tasks.
- Unless required, limit the use of event handlers to the OnError and/or OnTaskFailed events, as event handlers carry a large I/O overhead and can slow application performance dramatically.
- Use the Script component and Script task to customize the error information at the package level and for individual tasks.
36
Questions Linkedin:
37
How did we do? Please complete an Evaluation to provide feedback to our speakers!
38
Sponsors and Raffle Please make sure to visit and thank our fantastic sponsors for their valuable assistance in putting on this event.