Designing a Data Warehouse from the Ground Up
Dustin Ryan
Data Platform Architect
sqldusty@gmail.com
Dustin Ryan
- Designing analytics solutions for the past 10+ years
- Currently a Data Platform Solution Architect for Microsoft (past 3+ years)
- Blogs at SQLDusty.com
- Author, technical editor, speaker
- Baby wrangler and chicken farmer
- Live in Jacksonville, Florida, with my wife, Angela, and three kiddos: Dallas, Bradley, and Andrew
- Facebook.com/SQLDusty
- Twitter.com/SQLDusty
- LinkedIn.com/in/SQLDusty
- YouTube.com/dustinryan
Why a Data Warehouse?

                      OLTP                     Data Warehouse
Purpose               Execution of business    Analysis of business
Primary Interaction   Single transaction       Aggregated transactions
Interaction Method    Insert, Update, Delete   Select
Temporal Focus        Current                  Current/historic
Design Optimization   Update concurrency       High-performance queries
Design Principle      3NF                      Star Schema
Four Steps
1. Identify the business process
2. Identify the grain
3. Choose the dimensions
4. Choose the measures
Identify the Business Process
- Business process, NOT business department
- If just starting, choose a high-impact, low-risk area of the business
- The business can help you here
- For this example: Retail Sales
[Figure: impact vs. risk quadrant with axes High Impact / Low Impact and Low Risk / High Risk]
Identify the Grain
- What does one fact row represent?
- Choose the most atomic level
- We can’t predict the queries!
- Example grain: “One row represents a movie rented by a customer from an employee in a store on a day.”
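
The grain statement above can be turned into a fact table almost word for word. The following is a minimal sketch, not the deck's own schema: the table name `fact_rental`, its column names, the sample rows, and the helper `rentals_by` are all hypothetical, and SQLite is used only so the example runs anywhere. It shows why the atomic grain is the safe choice: the same rows roll up to any level a later query asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row = one movie rented by one customer from one employee
# in one store on one day (hypothetical names throughout).
conn.execute("""
    CREATE TABLE fact_rental (
        date_key            INTEGER NOT NULL,  -- on a day
        store_key           INTEGER NOT NULL,  -- in a store
        customer_key        INTEGER NOT NULL,  -- by a customer
        employee_key        INTEGER NOT NULL,  -- from an employee
        movie_key           INTEGER NOT NULL,  -- a movie
        rental_amount_cents INTEGER NOT NULL   -- a measure at that grain
    )
""")

sample_rows = [
    (20240101, 1, 10, 100, 1000, 399),
    (20240101, 1, 11, 100, 1001, 499),
    (20240101, 2, 12, 101, 1000, 399),
    (20240102, 1, 10, 100, 1002, 299),
]
conn.executemany("INSERT INTO fact_rental VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

ALLOWED_KEYS = {"date_key", "store_key", "customer_key", "employee_key", "movie_key"}

def rentals_by(key_column):
    """Count rentals grouped by any one dimension key of the fact table."""
    if key_column not in ALLOWED_KEYS:
        raise ValueError(f"unknown dimension key: {key_column}")
    query = (
        f"SELECT {key_column}, COUNT(*) FROM fact_rental "
        f"GROUP BY {key_column} ORDER BY {key_column}"
    )
    return conn.execute(query).fetchall()

# Atomic rows can be rolled up along any dimension after the fact.
by_store = rentals_by("store_key")
by_day = rentals_by("date_key")
print(by_store)
print(by_day)
```

Had the table been stored pre-aggregated by store and day, the per-customer or per-movie questions could no longer be answered from it.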
Define the Dimensions
- Who, what, where, when?
- De-normalized design focuses on high-performance reads
- The best attributes are descriptive
- Use the smallest data type possible
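
As an illustration of the de-normalized approach, here is a small sketch of a "where" dimension joined to a fact table. Everything in it is assumed for the example: the names `dim_store` and `fact_rental`, the columns, and the data. The city and state are stored directly on the store row instead of in separate lookup tables, so a report needs only one join. SQLite is used purely for runnability; it does not enforce column sizes, so the "smallest data type" advice applies to the real warehouse engine, not to this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- De-normalized dimension: descriptive attributes live on one row.
    CREATE TABLE dim_store (
        store_key  INTEGER PRIMARY KEY,  -- surrogate key
        store_name TEXT NOT NULL,
        city       TEXT NOT NULL,
        state      TEXT NOT NULL
    );

    CREATE TABLE fact_rental (
        store_key           INTEGER NOT NULL REFERENCES dim_store (store_key),
        rental_amount_cents INTEGER NOT NULL
    );

    INSERT INTO dim_store VALUES
        (1, 'Downtown',  'Jacksonville', 'FL'),
        (2, 'Beachside', 'Jacksonville', 'FL'),
        (3, 'Midtown',   'Atlanta',      'GA');

    INSERT INTO fact_rental VALUES (1, 399), (1, 499), (2, 299), (3, 399);
""")

# Star-schema query: one join from the fact to the dimension,
# then group by a descriptive attribute.
revenue_by_state = conn.execute("""
    SELECT d.state, SUM(f.rental_amount_cents)
    FROM fact_rental AS f
    JOIN dim_store AS d ON d.store_key = f.store_key
    GROUP BY d.state
    ORDER BY d.state
""").fetchall()
print(revenue_by_state)
```

In a 3NF design the same question would typically need joins through separate city and state tables; the dimension trades some redundancy for simpler, faster reads.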
Define the Measures
- Measures reflect how the business measures success
- The best measures are fully additive
- Non-additive measures (ratios, for example) should be handled in SSAS (SQL Server Analysis Services)
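
A short worked example may help show why ratios are kept out of the fact table. This is a sketch with made-up numbers and hypothetical field names (`revenue_cents`, `cost_cents`): revenue and cost are fully additive, so they can be summed across any rows, while a margin percentage has to be recomputed from the summed components at query time.

```python
from fractions import Fraction

# Hypothetical fact rows carrying two fully additive measures.
fact_rows = [
    {"store": "A", "revenue_cents": 1000, "cost_cents": 500},
    {"store": "A", "revenue_cents": 3000, "cost_cents": 2700},
    {"store": "B", "revenue_cents": 2000, "cost_cents": 1000},
]

# Additive measures: a plain sum is correct at any level of aggregation.
total_revenue = sum(row["revenue_cents"] for row in fact_rows)
total_cost = sum(row["cost_cents"] for row in fact_rows)

# Non-additive measure done right: compute the ratio from the summed parts.
correct_margin = Fraction(total_revenue - total_cost, total_revenue)

# Non-additive measure done wrong: average the per-row ratios.
row_margins = [
    Fraction(row["revenue_cents"] - row["cost_cents"], row["revenue_cents"])
    for row in fact_rows
]
naive_margin = sum(row_margins) / len(row_margins)

print(f"correct margin: {float(correct_margin):.2%}")
print(f"naive margin:   {float(naive_margin):.2%}")
```

The two results differ because the naive version weights a small sale the same as a large one. Storing only the additive components and letting the analysis layer derive the ratio avoids that mistake.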
Resources
http://tinyurl.com/DesignDWGroundUp