Solving ETL Bottlenecks with SSIS Scale Out
About Me
- BI Consultant at Coeo
- Southampton SQL Server User Group Leader
- SQLBits Volunteer!
- Twitter: @stephj_martin
- Email: steph@coeo.com
SSIS Scale Out - Scalability
“the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth” (https://en.wikipedia.org/wiki/Scalability)
For SSIS, this means:
- Peak workloads: needing to process everything in a single short window
- Peak activity
  - During core hours
  - Specific events (Christmas, Black Friday, etc.)
- Increasing data volumes
SSIS Bottlenecks
- A single SSIS instance servicing requests across a large estate
  - Many independent ETL jobs running
  - Guessing how long a job will take before starting a different one
- Data warehouse load
  - Running packages in parallel to the same destination
Scaling Up
- Upgrade hardware
- Add more memory
- Single point of failure
- How much of the time is the shiny new hardware doing nothing?
- Will the SSIS package even make use of the increase?
Scaling Out
- Add more servers to your estate
- Spread application processing
- Take individual servers offline for maintenance without impacting the whole
- Packages could exist on one or more servers
- Does not provide central management
SSIS Scale Out
- Provides the ability to scale while retaining control
- Centralised management
  - Deploy packages to a single location
  - Centrally manage package execution and monitoring
- Distribute workload to worker servers
  - Add new workers as required
  - Pause and resume VMs for busy periods
  - Specific workers can be used for packages if needed
SSIS Scale Out Checklist image - https://openclipart.org/detail/169719/checklist
SSIS Scale Out – Scale Out Master
- Responsible for Scale Out management
  - View information in the SSISDB catalog (catalog.master_properties, catalog.worker_agents)
  - Enable or disable workers (catalog.enable_worker_agent, catalog.disable_worker_agent)
- Execute packages
  - Set @runinscaleout = 1 (True)
  - Use a specific worker (catalog.add_execution_worker) or set @useanyworker = 1 (True)
- Can also be a Scale Out Worker
  - Install the worker on the same machine
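To make the execution parameters above concrete, here is a minimal sketch in Python that only assembles the T-SQL batch as text; it does not connect to anything. The helper name `build_scale_out_execution` and the folder, project and package names in the usage note are hypothetical. The stored procedure and parameter names are the SSISDB catalog ones as I understand them, so check them against your own instance before relying on this.

```python
from typing import Optional


def build_scale_out_execution(folder: str, project: str, package: str,
                              worker_agent_id: Optional[str] = None) -> str:
    """Return a T-SQL batch that runs one package in Scale Out (text only)."""

    def quote(value: str) -> str:
        # Double embedded single quotes so the value is a valid N'...' literal.
        return "N'" + value.replace("'", "''") + "'"

    # With no specific worker requested, let any enabled worker pick the task up.
    use_any_worker = 1 if worker_agent_id is None else 0
    lines = [
        "DECLARE @execution_id BIGINT;",
        "EXEC SSISDB.catalog.create_execution",
        f"    @folder_name = {quote(folder)},",
        f"    @project_name = {quote(project)},",
        f"    @package_name = {quote(package)},",
        "    @runinscaleout = 1,",
        f"    @useanyworker = {use_any_worker},",
        "    @execution_id = @execution_id OUTPUT;",
    ]
    if worker_agent_id is not None:
        # Pin the execution to one worker (an id listed in catalog.worker_agents).
        lines += [
            "EXEC SSISDB.catalog.add_execution_worker",
            "    @execution_id = @execution_id,",
            f"    @workeragent_id = {quote(worker_agent_id)};",
        ]
    lines.append("EXEC SSISDB.catalog.start_execution @execution_id;")
    return "\n".join(lines)
```

For example, `print(build_scale_out_execution("ETL", "Warehouse", "LoadSales.dtsx"))` produces a batch that any worker may run; passing a worker agent id as the fourth argument pins it to that worker instead.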
SSIS Scale Out – Scale Out Worker
- Pulls execution tasks from the Scale Out Master
- Executes packages with ISServerExec.exe
  - ISServerExec is not intended for direct use; it is an internal "runner" of packages in SSISDB (the catalog)
  - DTExec is the tool for running packages from the command line, BAT files, scripts or programmatically
- Execution is retried automatically if it terminates unexpectedly
  - Set the number of retries in the config file
- Permissions
  - Default account is NT Service\SSISScaleOutWorker140
  - The account will need access to the resources used in the package
  - Consider using a specific account or set of accounts
SSIS Scale Out – Installation
1. Install Master: SQL Server Database Engine, Integration Services, Scale Out Master
2. Open firewall ports (default port 8391)
3. Enable SQL Authentication (required so that the execution log can be written to SSISDB)
4. Copy the Master certificate to the Worker
5. Install Worker: Scale Out Worker
6. Copy the Worker certificate to the Master
7. Install the Worker certificate on the Master
8. Repeat…
9. Enable the Scale Out Worker
SSIS Scale Out – Master Set Up
- Default port number is 8391
- Create a new SSL certificate or use an existing one
  - An existing certificate must be stored in Trusted Root Certification Authorities, Local Computer
  - Check the CNs in the certificate
  - If you are mixing on-premises and Azure VMs, ensure that the public IP address is included in the CNs
- Worker certificate
  - Must be installed in Trusted Root Certification Authorities, Local Computer
SSIS Scale Out – Worker Set Up
- No changes needed to the firewall
- Master endpoint is the fully qualified name plus port
  - E.g. https://WinSSISMaster.mydomain:8391
  - Can be configured after install in ...\140\DTS\Binn\WorkerSettings.config
- Make sure the service is running as the correct account
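The endpoint format is easy to get wrong when typing it by hand, so here is a small sketch that builds and sanity-checks it. The function name `master_endpoint` and the validation rule (at least one dot in the host name, as a rough stand-in for "fully qualified") are assumptions for illustration; only the `https://<name>:<port>` shape and the default port 8391 come from the slides.

```python
import re

DEFAULT_MASTER_PORT = 8391  # default Scale Out Master port quoted in the slides


def master_endpoint(fqdn: str, port: int = DEFAULT_MASTER_PORT) -> str:
    """Build the master endpoint value: https://<fully qualified name>:<port>."""
    host = fqdn.strip()
    # Rough check only: labels of letters, digits or hyphens with at least one dot.
    if not re.fullmatch(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+", host):
        raise ValueError(f"not a fully qualified host name: {host!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"https://{host}:{port}"
```

Calling `master_endpoint("WinSSISMaster.mydomain")` gives the example value from the slide, which is what would then go into the master endpoint setting in WorkerSettings.config.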
High Availability
- Make use of Availability Groups
  - SSIS in AGs is supported from SQL Server 2016
- Configure multiple master/worker combinations
- https://docs.microsoft.com/en-us/sql/integration-services/scale-out/scale-out-support-for-high-availability
SSIS Scale Out - Executing
- Manually through SSISDB execution
- Useful for testing
SSIS Scale Out - Executing
- Execute Package through SQL Agent
- Need to set the default execution mode to Scale Out in Catalog Properties
- Useful where multiple jobs are running across a large estate and you want to execute a specific package
SSIS Scale Out - Executing
- Execute Package via T-SQL/Stored Procedures
- Useful to execute multiple packages in parallel in a single load
- Be careful when executing packages via T-SQL:
  - By default they execute asynchronously, so you need to keep checking for when they have finished
  - You can set them to execute synchronously, but there is no way to control the timeout, and a long-running package will fail with “Unexpected Termination”
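One way to handle the asynchronous default is to poll the execution status yourself, which also gives you the timeout control that synchronous mode lacks. The sketch below shows only the polling logic. `get_status` is a hypothetical callable that you would wire to your own database driver, for instance by running `STATUS_QUERY` (written with an ODBC-style `?` marker, which is an assumption about your driver). The status codes are those documented for the `catalog.executions` view.

```python
import time
from typing import Callable

# Status codes documented for the SSISDB catalog.executions view.
STATUS_NAMES = {
    1: "created", 2: "running", 3: "canceled", 4: "failed", 5: "pending",
    6: "ended unexpectedly", 7: "succeeded", 8: "stopping", 9: "completed",
}
FINISHED = {3, 4, 6, 7, 9}  # states in which the execution will not progress further

# Query a caller could run inside get_status (parameter style depends on the driver).
STATUS_QUERY = "SELECT status FROM SSISDB.catalog.executions WHERE execution_id = ?;"


def wait_for_execution(get_status: Callable[[], int], timeout_s: float,
                       poll_s: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the execution finishes; return its final status name."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in FINISHED:
            return STATUS_NAMES[status]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"still {STATUS_NAMES.get(status, status)} after {timeout_s}s")
        sleep(poll_s)
```

To run several packages in parallel in a single load, start each execution first and then call `wait_for_execution` once per execution id, treating anything other than "succeeded" as a failure of the load.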
SSIS Scale Out – Monitoring SSISDB Executions Dashboard
SSIS Scale Out – Log Files
- Scale Out Master logs
  - C:\Users\SSISScaleOutMaster140\AppData\Local\SSIS\ScaleOut\Master
- Scale Out Worker logs
  - C:\Users\SSIS_Svc\AppData\Local\SSIS\ScaleOut\Agent
  - C:\Users\SSIS_Svc\AppData\Local\SSIS\ScaleOut\Tasks
SSIS Scale Out – Licensing
- Scale Out Master
  - Requires Enterprise Licence
  - Needs SQL Server to host SSISDB
- Scale Out Worker
  - Standard or Enterprise Licence
  - Requires Enterprise Licence if the SSIS packages include Enterprise features
    - Oracle/Teradata connectors
    - CDC
    - Etc.
- https://docs.microsoft.com/en-us/sql/integration-services/integration-services-features-supported-by-the-editions-of-sql-server
Thank You
Questions?
Just like Jimi Hendrix … We love to get feedback Please complete the session feedback forms
SQLBits - It's all about the community... Please visit Community Corner. This year we are trying to get more people to learn about the SQL community, so if you would be happy to visit Community Corner we’d really appreciate it.