SEATTLE BI MEETUP
BI & BIG FISH
April 2nd, 2014
Emre Motan
About the Speaker
- Emre Motan, BI Engineer, Big Fish
- In Seattle for 1.5 years; previously in Chicago
- Involved in the BI community
  - Chicago SQL BI PASS chapter
  - UW BI Certificate Program
  - BI Over Beers
  - TDWI
- Random: basketball, co-rec sports, greyhounds
About the Seattle BI Meetup
- Started in 2012; I took over after a period of inactivity
- Meetups will be monthly
- Primary goal is to educate
- Topics will be wide-ranging but more technical
- Networking is encouraged
- Speakers will be people who use the technologies
About BI Meetup (cont.)
- Looking to develop relationships with willing speakers, venues, and sponsors
- The aim is to meet at a different venue each month
- As part of hosting, it would be nice for the venue to “sponsor” with food & drink
About Big Fish
- World’s largest producer of casual games
- Core business used to be PC-based “Hidden Object” games
- Now pushing deep into the mobile space with titles such as Big Fish Casino and Fairway Solitaire
Today’s Story
- BI/DW Implementation
- ETL Framework
- ETL Development
- BIML
  - BIDS Helper
  - Mist BIML IDE w/ Hadron BIML Compiler
- Summary
BI INFRASTRUCTURE
BI Infrastructure
What do we want from our ETL?
- Minimize manual coding errors
- Minimize time spent on boilerplate
- Support agile software development practices
- Use software development best practices
  - Pair / collaborate
  - Source control / diff changes
  - Auto-generate / automate
- Reduce effort spent on operations and support
ETL FRAMEWORK
ETL Framework Overview
- Cycles (subject areas, master package)
- Jobs (restartable units of work, package)
- Steps (logic units: Data Flow / Execute SQL / …)
[Diagram: one Cycle containing five Jobs of three Steps each]
- Job A: Steps 1-3
- Job B: Steps 4-6
- Job C: Steps 7-9
- Job D: Steps 10-12
- Job E: Steps 13-15
- Each unit: Log Start, Do Work, then Log Success or Log Fail
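The Log Start / Do Work / Log Success / Log Fail wrapper in the diagram can be sketched in a few lines. This is only an illustration: the name `run_logged` and the in-memory `log` list are hypothetical stand-ins, and the framework described in this deck logs through SQL Server stored procedures instead.

```python
from typing import Callable, List, Tuple

LogEntry = Tuple[str, str]  # (unit name, event); hypothetical log record shape


def run_logged(unit_name: str, work: Callable[[], None], log: List[LogEntry]) -> bool:
    """Wrap one unit of work with start / success / fail log entries."""
    log.append((unit_name, "start"))        # Log Start
    try:
        work()                              # Do Work
    except Exception:
        log.append((unit_name, "fail"))     # Log Fail
        return False
    log.append((unit_name, "success"))      # Log Success
    return True


def failing_work() -> None:
    raise RuntimeError("simulated failure")


# Usage: one step that succeeds and one that fails.
log: List[LogEntry] = []
ok = run_logged("Step 1", lambda: None, log)
bad = run_logged("Step 2", failing_work, log)
```

Because the wrapper is applied uniformly, the step author writes only the `work` body and never the logging calls.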
1st Run
[Diagram: the same cycle (Jobs A-E, Steps 1-15), with each job and step marked Success, Failure, or Didn’t Run]
2nd Run
[Diagram: the same cycle (Jobs A-E, Steps 1-15) on restart]
- Completed successfully last run: skip next time
- Rerun on failure
- Choose to retry because of a logical dependency
- Choose to skip run because it did not run last time
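The restart behavior across the two runs can be sketched as follows. This is a simplified illustration, not the framework's actual logic: `Job`, `run_cycle`, and the status labels are hypothetical, and the sketch assumes a failed job stops every later job in the cycle. The manual choices to retry or skip are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical status labels borrowed from the slide legend.
SUCCESS, FAILURE, NOT_RUN = "Success", "Failure", "Didn't Run"


@dataclass
class Job:
    name: str
    steps: List[Callable[[], None]]  # each callable stands in for one step


def run_cycle(jobs: List[Job], last_run: Dict[str, str]) -> Dict[str, str]:
    """Run jobs in order, skipping jobs that succeeded in the run being restarted."""
    status: Dict[str, str] = {}
    halted = False
    for job in jobs:
        if halted:
            status[job.name] = NOT_RUN   # assumption: a failure stops later jobs
        elif last_run.get(job.name) == SUCCESS:
            status[job.name] = SUCCESS   # completed last run, so skip it now
        else:
            try:
                for step in job.steps:
                    step()
                status[job.name] = SUCCESS
            except Exception:
                status[job.name] = FAILURE
                halted = True
    return status


# Usage: Job B fails on the first run and is fixed before the second.
executed: List[str] = []
broken = {"B": True}


def flaky_step() -> None:
    if broken["B"]:
        raise RuntimeError("simulated failure")


jobs = [
    Job("A", [lambda: executed.append("A")]),
    Job("B", [flaky_step]),
    Job("C", [lambda: executed.append("C")]),
]
first_run = run_cycle(jobs, last_run={})
broken["B"] = False
second_run = run_cycle(jobs, last_run=first_run)
```

On the second run Job A is not executed again, Job B is rerun, and Job C runs for the first time.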
Example Job Flow
Source Systems (Ecommerce) -> Extract/Load (ELT Server) -> Data Warehouse

Data Warehouse layers:
Staging                ODS                    Reporting Layer
stg_sales              ods_sales              fact_sales
stg_customers          ods_customers          dim_customers
stg_products           ods_products           dim_products
stg_payment_methods    ods_payment_methods    dim_payment_methods
…                      …                      …
ETL Framework Summary
- Framework is injected at compile time
- Uses SQL Server 2008 R2 via stored procedures
- Logical units automatically logged
- Event handlers automatically added
- Metadata-based run-time alterations for flow control (e.g. skip job, skip step, restart)
- Metadata-based balances and validation scripts to detect warning/error conditions
- Variables and values stored for use in ETL
- Custom tools to administer ETL infrastructure and metadata
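A metadata-based balance check might look roughly like the sketch below. The deck does not describe how balances are defined, so everything here is an assumption: the `BalanceRule` fields, the percentage thresholds, and the idea of comparing one source value against one target value (for example, row counts).

```python
from dataclasses import dataclass


@dataclass
class BalanceRule:
    """Hypothetical metadata row describing one balance check."""
    name: str
    warn_pct: float   # difference (in percent) that raises a warning
    error_pct: float  # difference (in percent) that raises an error


def evaluate_balance(rule: BalanceRule, source_value: float, target_value: float) -> str:
    """Return 'ok', 'warning', or 'error' for a source/target comparison."""
    if source_value == 0:
        # Avoid dividing by zero: only an exact match counts as balanced.
        return "ok" if target_value == 0 else "error"
    diff_pct = abs(source_value - target_value) / abs(source_value) * 100
    if diff_pct >= rule.error_pct:
        return "error"
    if diff_pct >= rule.warn_pct:
        return "warning"
    return "ok"


# Usage: compare a source row count of 1000 against three target counts.
rule = BalanceRule(name="sales_row_count", warn_pct=1.0, error_pct=5.0)
results = [evaluate_balance(rule, 1000, target) for target in (1000, 980, 900)]
```

Keeping the thresholds in metadata rather than in code means a check can be tuned without recompiling packages.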
BIML ECOSYSTEM
Language and Compiler
- BIML (BI Markup Language)
  - Lightweight XML dialect
  - Represents SQL Server BI stack objects (SSIS, SSAS, SQL Server)
  - Works like ASP.NET / PHP (combines declarative & imperative language)
- Hadron
  - Compiles BIML to SQL Server BI stack artifacts
  - Called via MSBuild, Mist, …
Tools
- BIDS Helper
  - Free, open-source extension to BIDS
  - Code in BIML, then generate SSIS
  - Subset of functionality
- Mist IDE
  - Graphical & text-based editors
  - Transformers
  - Extensions
ETL DEVELOPMENT
Big Fish BI Engineering
- We integrate a wide variety of data sources
- We don’t develop in BIDS/SSIS
- We code in BIML and compile in Mist or via Hadron directly
BI Engineering ETL Development Flow
- Develop BIML locally; commit to SVN
- Generate most of the code besides business logic
- Run code validations before / during compile
- Compile BIML during development or on deployment to ELT boxes
  - Produce SSIS packages
  - Handle pushing to target environment
- Kick off Cycles using Job Scheduler or DtExec
DEMOS
Demo: Simple SSIS Package
- Demo of BIMLScript1.biml
- Show BIDS environment
- Show BIML
- Generate SSIS
- Run SSIS
Demo: Programmatic BIML
- Demo of BIMLScript2.biml
- Introduce the .NET addition to the BIML script
- Describe how we get the tables from the DB
- Describe how we loop over each table, and then each column of the table, to generate insert commands
- Generate SSIS
- Run SSIS
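The table-and-column loop from this demo can be illustrated outside BIML. The sketch below is not the demo script: it uses Python with an in-memory SQLite database as a stand-in for SQL Server, and the function name and the `stg_` / `ods_` prefixes are assumptions chosen to match the example job flow.

```python
import sqlite3
from typing import List


def generate_insert_commands(conn: sqlite3.Connection,
                             source_prefix: str = "stg_",
                             target_prefix: str = "ods_") -> List[str]:
    """Build one INSERT ... SELECT command per source table found in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    tables = [name for (name,) in rows if name.startswith(source_prefix)]

    commands = []
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        column_list = ", ".join(columns)
        target = target_prefix + table[len(source_prefix):]
        commands.append(
            f"INSERT INTO {target} ({column_list}) SELECT {column_list} FROM {table};"
        )
    return commands


# Usage: two staging tables in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_sales (sale_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE stg_customers (customer_id INTEGER, name TEXT)")
commands = generate_insert_commands(conn)
```

In the BIML version the same loop would emit SSIS tasks rather than plain SQL strings, but the shape of the iteration is the same.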
Demo: Mist Visual Designer
- Show audience the visual designer for one job
- Select elements to see them in the visual designer
- We don’t use the visual designer very often, since most code is now auto-generated and we have established patterns
Demo: Mist Project
- Show Mist environment with sample cycle
- Show cycle file with one job
- Show job file
- Show metadata for sample table (source, ODS)
- Show Extensions
Demo: Auto-generating ETL
- One substantial accelerator of our work is auto-generating ETL for new extracts, loads, and processing
- BIML representation of table
  - Columns, business keys, primary keys, data types
  - Annotations like ETL pattern required (full load, incremental new, incremental new/updated)
- Only need to code transformation logic; all boilerplate is auto-generated
- BimlScript to autogenerate boilerplate ETL code
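Pattern-driven generation can be sketched as a function from table metadata to load statements. The deck does not show the generated code, so the `TableSpec` fields, the pattern names, and the SQL templates below are all hypothetical; the real implementation uses BimlScript and emits SSIS packages.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableSpec:
    """Hypothetical table metadata: columns, business keys, and an ETL pattern."""
    name: str
    columns: List[str]
    business_keys: List[str]
    pattern: str  # "full_load", "incremental_new", or "incremental_new_updated"


def generate_load_sql(spec: TableSpec,
                      source_prefix: str = "stg_",
                      target_prefix: str = "ods_") -> List[str]:
    """Return boilerplate load statements for the pattern annotated on the table."""
    src, tgt = source_prefix + spec.name, target_prefix + spec.name
    cols = ", ".join(spec.columns)
    src_cols = ", ".join(f"s.{c}" for c in spec.columns)
    key_match = " AND ".join(f"t.{k} = s.{k}" for k in spec.business_keys)
    insert_all = f"INSERT INTO {tgt} ({cols}) SELECT {src_cols} FROM {src} s;"
    insert_new = (f"INSERT INTO {tgt} ({cols}) SELECT {src_cols} FROM {src} s "
                  f"WHERE NOT EXISTS (SELECT 1 FROM {tgt} t WHERE {key_match});")

    if spec.pattern == "full_load":
        return [f"TRUNCATE TABLE {tgt};", insert_all]
    if spec.pattern == "incremental_new":
        return [insert_new]
    if spec.pattern == "incremental_new_updated":
        non_keys = [c for c in spec.columns if c not in spec.business_keys]
        assignments = ", ".join(f"t.{c} = s.{c}" for c in non_keys)
        update = f"UPDATE t SET {assignments} FROM {tgt} t JOIN {src} s ON {key_match};"
        return [update, insert_new]
    raise ValueError(f"Unknown ETL pattern: {spec.pattern}")


# Usage: the same table metadata under two different pattern annotations.
columns = ["customer_id", "name", "email"]
full = generate_load_sql(TableSpec("customers", columns, ["customer_id"], "full_load"))
upsert = generate_load_sql(
    TableSpec("customers", columns, ["customer_id"], "incremental_new_updated"))
```

Changing a table's load behavior then means changing one annotation rather than rewriting the load by hand.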
Demo: NZ SSIS Console
- Cycle/Job/Step Status
- Alterations
- Variables
- Deploy / Execute Cycles
SUMMARY
Why did Big Fish choose BIML?
- Non-standard technology needs (extensibility)
- Ease of developing and maintaining ETL, to leave more time for high-business-value work
- Plenty of people with SSIS experience in Seattle
- Cost effective
- Organization already supported SQL Server and SSIS
- Happy developers
Q & A
THANK YOU
Resources
- BIDSHelper.codeplex.com
- Varigence.com
- BimlScript.com
- Biml Tutorials by Andy Leonard