Architecting a Large-Scale Data Warehouse with SQL Server 2005 Mark Morton Senior Technical Consultant IT Training Solutions DAT313.

Presentation transcript:

Architecting a Large-Scale Data Warehouse with SQL Server 2005. Mark Morton, Senior Technical Consultant, IT Training Solutions. DAT313

What We Will Cover: The role of partitioning and indexing in a relational data warehouse. Enhancements in Microsoft SQL Server 2005 that facilitate partition and index strategies. Loading large volumes of fact table data for optimal performance. Bringing it all together.

Partitioning: Role of partitioning in a relational data warehouse. Partitioning refers to the process of segmenting rows of data horizontally into smaller, more manageable sets. Benefits: improved scalability and availability, enhanced manageability and maintenance, faster incremental load times, and reduced query time.

Partitioning (cont’d): Criteria for partitioning a relational data warehouse. Partitioning should be considered for large fact tables that require high availability or exhibit poor query or maintenance performance. Partitioning is not typically implemented or recommended for dimension tables. Implementing a partition strategy is complex and should not be considered if the fact table is not sufficiently large, or if query performance, maintenance and availability are not an issue.

Partitioning (cont’d): In SQL Server 2000. A partitioned view combines horizontally partitioned data from a set of member tables across one or more servers (using UNION ALL), making the data appear as if it came from one table. This is accomplished by creating physically separate member tables. Non-overlapping CHECK constraints are required on the partition column. Queries against the partitioned view that filter on the partition column access only the member tables required to resolve the query. Limited to 256 member tables.
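As a rough sketch of the SQL Server 2000 approach, the script below builds a partitioned view over two member tables. All table, column, and constraint names are hypothetical, and the yearly ranges are only an assumption for illustration. GO is the batch separator used by tools such as sqlcmd and Management Studio.

```sql
-- Hypothetical member tables, one per year. The CHECK constraints on
-- SaleDate must not overlap so the optimizer can skip irrelevant tables.
CREATE TABLE dbo.Sales2004 (
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL
        CONSTRAINT CK_Sales2004_SaleDate
        CHECK (SaleDate >= '20040101' AND SaleDate < '20050101'),
    Amount   money    NOT NULL,
    CONSTRAINT PK_Sales2004 PRIMARY KEY (SaleID, SaleDate)
);

CREATE TABLE dbo.Sales2005 (
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL
        CONSTRAINT CK_Sales2005_SaleDate
        CHECK (SaleDate >= '20050101' AND SaleDate < '20060101'),
    Amount   money    NOT NULL,
    CONSTRAINT PK_Sales2005 PRIMARY KEY (SaleID, SaleDate)
);
GO

-- The partitioned view presents the member tables as one logical table.
CREATE VIEW dbo.Sales
AS
    SELECT SaleID, SaleDate, Amount FROM dbo.Sales2004
    UNION ALL
    SELECT SaleID, SaleDate, Amount FROM dbo.Sales2005;
GO

-- Filtering on the partition column lets the optimizer read only Sales2005.
SELECT SUM(Amount) AS MarchSales
FROM dbo.Sales
WHERE SaleDate >= '20050301' AND SaleDate < '20050401';
```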

Partitioning (cont’d): In SQL Server 2005. Data is partitioned horizontally by dividing table and index data into subsets, which may be spread across multiple file groups. A partition function defines the ranges and boundaries by which the partitions are segmented. A partition scheme maps each partition defined by a partition function to a specific file group.
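A minimal sketch of a partition function and partition scheme follows. The object names and boundary values are hypothetical. To keep the script runnable on any database, every partition is mapped to the PRIMARY file group; a production warehouse would more likely list one file group per partition in the TO clause.

```sql
-- Three boundary values produce four partitions. With RANGE RIGHT, each
-- boundary value belongs to the partition on its right:
--   partition 1: key <  20040101
--   partition 2: 20040101 <= key < 20050101
--   partition 3: 20050101 <= key < 20060101
--   partition 4: key >= 20060101
CREATE PARTITION FUNCTION pfOrderDateKey (int)
AS RANGE RIGHT FOR VALUES (20040101, 20050101, 20060101);

-- Map every partition to PRIMARY (an assumption made for simplicity).
CREATE PARTITION SCHEME psOrderDateKey
AS PARTITION pfOrderDateKey
ALL TO ([PRIMARY]);

-- $PARTITION reports which partition a given value would land in.
SELECT $PARTITION.pfOrderDateKey(20050615) AS PartitionNumber;
```

With these boundaries, the final query should report partition 3, since 20050615 falls between the second and third boundary values.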

Partitioning (cont’d): In SQL Server 2005. Tables, indexes or indexed views can be created directly on a partition scheme instead of a file group. Queries and maintenance operations targeting a subset of data are optimized, as only the partitions required to complete the operation are used. Limited to 1000 partitions. Supports local partitions only.
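The sketch below creates a table and its clustered index on a partition scheme rather than on a file group. The names (pfSaleYear, psSaleYear, dbo.FactSales) and the yearly boundaries are hypothetical, and all partitions are placed on PRIMARY only to keep the example self-contained.

```sql
CREATE PARTITION FUNCTION pfSaleYear (int)
AS RANGE RIGHT FOR VALUES (2004, 2005, 2006);

CREATE PARTITION SCHEME psSaleYear
AS PARTITION pfSaleYear
ALL TO ([PRIMARY]);

-- The ON clause names the partition scheme and the partition column.
CREATE TABLE dbo.FactSales (
    SaleYear   int   NOT NULL,
    ProductKey int   NOT NULL,
    Amount     money NOT NULL
) ON psSaleYear (SaleYear);

-- An index created on the same scheme is aligned with the table.
CREATE CLUSTERED INDEX CIX_FactSales
ON dbo.FactSales (SaleYear, ProductKey)
ON psSaleYear (SaleYear);

-- A filter on the partition column lets the engine touch one partition.
SELECT SUM(Amount) AS Sales2005
FROM dbo.FactSales
WHERE SaleYear = 2005;

-- Row counts per partition for the clustered index (index_id = 1).
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('dbo.FactSales')
  AND index_id = 1
ORDER BY partition_number;
```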

Partitioning (cont’d): Defining your partition strategy. Define your partition column: identify the single column on which data should be partitioned. Define your partition function: identify the number of partitions and the ranges (boundaries) into which data should be partitioned. Define your partition scheme: identify and create the file groups required to store the rows of a partitioned table or index.
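Once a strategy has been defined, the catalog views can be used to check that the boundaries and file group mappings match the design. The queries below are a sketch that assumes nothing beyond the system catalog views in SQL Server 2005; they list whatever partition functions and schemes exist in the current database.

```sql
-- Partition functions: partition count (fanout), range direction, boundaries.
SELECT pf.name                    AS function_name,
       pf.fanout                  AS partition_count,
       pf.boundary_value_on_right AS is_range_right,
       prv.boundary_id,
       prv.value                  AS boundary_value
FROM sys.partition_functions AS pf
LEFT JOIN sys.partition_range_values AS prv
       ON prv.function_id = pf.function_id
ORDER BY pf.name, prv.boundary_id;

-- Partition schemes: which file group each partition number maps to.
SELECT ps.name            AS scheme_name,
       dds.destination_id AS partition_number,
       fg.name            AS filegroup_name
FROM sys.partition_schemes AS ps
JOIN sys.destination_data_spaces AS dds
  ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg
  ON fg.data_space_id = dds.data_space_id
ORDER BY ps.name, dds.destination_id;
```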

Defining and Creating Partitions

Indexing: Role of indexing in a relational data warehouse. Dimension tables typically use a surrogate key, created specifically for the data warehouse, as the primary key. Fact tables typically use the composite of all related dimension surrogate keys as the primary key. A data warehouse may contain many additional non-clustered indexes in order to increase the efficiency and responsiveness of ad-hoc queries. Covering indexes (indexes that contain all columns referenced in the query) are often implemented to support well-defined, frequently executed queries.

Indexing (cont’d): Covering indexes in SQL Server 2000. Create a composite index whose key includes every column referenced in the query. This produces a large key, possibly with many key columns that are not used for filtering or lookups. Maximum of 16 key columns (900 total bytes).
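A small sketch of the SQL Server 2000 style of covering index is shown below. The table, index, and column names are hypothetical. Every column the query references is placed in the index key, even though only CustomerKey is used for filtering.

```sql
CREATE TABLE dbo.OrderFact (
    OrderKey     int   NOT NULL,
    CustomerKey  int   NOT NULL,
    OrderDateKey int   NOT NULL,
    Quantity     int   NOT NULL,
    Amount       money NOT NULL
);

-- All referenced columns are key columns, which widens every index level.
CREATE NONCLUSTERED INDEX IX_OrderFact_Covering
ON dbo.OrderFact (CustomerKey, OrderDateKey, Quantity, Amount);

-- This query can be answered from the index alone.
SELECT OrderDateKey, Quantity, Amount
FROM dbo.OrderFact
WHERE CustomerKey = 42;
```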

Indexing (cont’d): Covering indexes in SQL Server 2005. A non-clustered index can be extended to include non-key columns. Non-key columns are stored at the leaf level, similar to the non-key columns of a clustered index. Since all columns used in the query are located at the leaf level, only index pages need to be read to resolve the query. Maximum of 1023 included columns (8060 total bytes).
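The INCLUDE clause introduced in SQL Server 2005 can be sketched as follows. The table, index, and column names are hypothetical. Only the filtering column is a key column; the remaining columns are carried at the leaf level.

```sql
CREATE TABLE dbo.ShipmentFact (
    ShipmentKey int   NOT NULL,
    CustomerKey int   NOT NULL,
    ShipDateKey int   NOT NULL,
    Quantity    int   NOT NULL,
    Freight     money NOT NULL
);

-- CustomerKey is the only key column; the INCLUDE columns are stored
-- at the leaf level only and do not count toward the key size limit.
CREATE NONCLUSTERED INDEX IX_ShipmentFact_CustomerKey
ON dbo.ShipmentFact (CustomerKey)
INCLUDE (ShipDateKey, Quantity, Freight);

-- The index covers this query, so the base table need not be read.
SELECT ShipDateKey, Quantity, Freight
FROM dbo.ShipmentFact
WHERE CustomerKey = 42;
```

Keeping the key narrow keeps the non-leaf levels of the index small, which is the main design advantage over a wide composite key.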

Creating Indexes With Include Columns

Indexing (cont’d): Alternatives to covering indexes. If an effective clustered index can be used, then non-clustered covering indexes may not be required. Alternatively, create single-column non-clustered indexes on all referenced columns in a table and allow the query optimizer to use index intersection.
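The index intersection alternative might look like the sketch below. The names are hypothetical, and whether the optimizer actually intersects the two indexes depends on statistics and data volume, so the execution plan should be inspected rather than assumed.

```sql
CREATE TABLE dbo.CallFact (
    CallKey   int NOT NULL,
    RegionKey int NOT NULL,
    DateKey   int NOT NULL,
    Duration  int NOT NULL
);

-- One narrow non-clustered index per filtered column.
CREATE NONCLUSTERED INDEX IX_CallFact_RegionKey ON dbo.CallFact (RegionKey);
CREATE NONCLUSTERED INDEX IX_CallFact_DateKey   ON dbo.CallFact (DateKey);

-- The optimizer may seek both indexes and join the results on the row
-- locator (index intersection) instead of scanning the table.
SELECT COUNT(*) AS CallCount
FROM dbo.CallFact
WHERE RegionKey = 7
  AND DateKey = 20050615;
```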

Loading and Maintenance: Managing and maintaining partitions. Load data into an empty partition. Remove all data from an existing partition. Relocate all data in one partition from a partitioned table to another partitioned table. Split one partition into two partitions. Merge two partitions into one partition.
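These operations are commonly combined into a sliding-window pattern. The script below is a sketch under stated assumptions: all object names and the monthly boundaries are hypothetical, every partition lives on PRIMARY, and the tables are heaps without indexes so that the SWITCH requirements stay simple.

```sql
CREATE PARTITION FUNCTION pfMonth (int)
AS RANGE RIGHT FOR VALUES (200501, 200502, 200503);

CREATE PARTITION SCHEME psMonth
AS PARTITION pfMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.FactOrders (
    MonthKey int   NOT NULL,
    Amount   money NOT NULL
) ON psMonth (MonthKey);

INSERT INTO dbo.FactOrders (MonthKey, Amount) VALUES (200501, 10.00);

-- Split: add a boundary for the new month, creating empty partition 5.
ALTER PARTITION SCHEME psMonth NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pfMonth() SPLIT RANGE (200504);

-- Load: fill a staging table that has the same structure and file group
-- as the target partition, plus a CHECK constraint matching its range.
CREATE TABLE dbo.StageOrders (
    MonthKey int   NOT NULL
        CONSTRAINT CK_StageOrders_MonthKey CHECK (MonthKey >= 200504),
    Amount   money NOT NULL
) ON [PRIMARY];

INSERT INTO dbo.StageOrders (MonthKey, Amount) VALUES (200504, 25.00);

-- Switch in: a metadata-only move into the empty partition 5.
ALTER TABLE dbo.StageOrders SWITCH TO dbo.FactOrders PARTITION 5;

-- Switch out: move the oldest month (partition 2) to an empty table.
CREATE TABLE dbo.ArchiveOrders (
    MonthKey int   NOT NULL,
    Amount   money NOT NULL
) ON [PRIMARY];

ALTER TABLE dbo.FactOrders SWITCH PARTITION 2 TO dbo.ArchiveOrders;

-- Merge: remove the 200501 boundary, combining two now-empty partitions.
ALTER PARTITION FUNCTION pfMonth() MERGE RANGE (200501);
```

Because the split and merge here act on empty partitions, they avoid moving rows between partitions.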

Managing and Maintaining Partitions

Loading and Maintenance: Optimizing bulk load performance. Prefer native format over character (ASCII) format. Execute multiple bulk loads concurrently. Set the recovery model to bulk-logged. Use the TABLOCK hint to minimize locking. Load each data file in a single batch.
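The recommendations above can be sketched with BULK INSERT. The database name, table name, and file path are hypothetical, the data file is assumed to have been exported in native format (for example with bcp -n), and FULL is assumed to be the database's normal recovery model.

```sql
-- Bulk-logged recovery allows minimally logged bulk operations.
ALTER DATABASE SalesDW SET RECOVERY BULK_LOGGED;

-- Hypothetical heap staging table that receives the load.
CREATE TABLE dbo.OrdersLoad (
    MonthKey int   NOT NULL,
    Amount   money NOT NULL
);

-- Native format avoids character conversion. TABLOCK takes a bulk update
-- lock, which also lets several sessions load the same heap concurrently.
-- BATCHSIZE is omitted so the whole file is loaded as a single batch.
BULK INSERT dbo.OrdersLoad
FROM 'C:\load\orders_200504.dat'
WITH (DATAFILETYPE = 'native', TABLOCK);

-- Return to the assumed original recovery model after the load.
ALTER DATABASE SalesDW SET RECOVERY FULL;
```

To load concurrently, the same BULK INSERT statement would be run from separate sessions, each pointing at a different data file.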

Loading Data

Further Reading: Partitioned Tables and Indexes in SQL Server 2005 (Kimberly L. Tripp), http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5partition.asp. Native Partitioned Tables and Indexes (Itzik Ben-Gan). SQL Server 2005 Books Online.


© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.