Big Data Working with Terabytes in SQL Server Andrew Novick www.NovickSoftware.com.

Presentation transcript:

Big Data Working with Terabytes in SQL Server Andrew Novick

PASS Community Summit 2008 | AD-202 | Big Data: Working with Terabytes in SQL Server

Agenda
- Challenges
- Architecture
- Solutions

Introduction
- Andrew Novick, Novick Software, Inc.
- Business Application Consulting
  - SQL Server
  - .NET
- Books:
  - Transact-SQL User-Defined Functions
  - SQL 2000 XML Distilled

What’s Big?
- Hundreds of gigabytes, up to tens of terabytes
- 100,000,000 rows, up to hundreds of billions of rows

Big Scenarios
- Data warehouse
- Very large OLTP databases (usually with reporting functions)

Big Hardware
- Multi-core: 8 to 64 cores
- RAM: 16 GB to 256 GB
- SANs or direct-attached RAID
- 64-bit SQL Server

Challenges

Challenges
- Load performance (ETL)
- Query performance
- Data management performance

How fast can you load rows into the database?

Load 1,000,000 rows into a 400,000,000-row table that has 12 indexes? 12 hours.

Load 1,000,000 rows into an empty table and add 12 indexes? 5 minutes.
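The load-then-index pattern from the two slides above can be sketched in T-SQL as follows. This is a minimal illustration, not the speaker's demo script: the table name dbo.LoadStage, its columns, and the generated sample rows are all assumptions, and sys.all_objects is used only as a convenient row source.

```sql
-- Hypothetical staging table: an empty heap with no indexes.
CREATE TABLE dbo.LoadStage (
    Fact_Date smalldatetime NOT NULL,
    CustID    int           NOT NULL,
    Amount    money         NOT NULL
);
GO

-- Load into the empty, unindexed table. The TABLOCK hint lets SQL Server 2008
-- and later use minimal logging for this insert when the database is in the
-- SIMPLE or BULK_LOGGED recovery model.
INSERT INTO dbo.LoadStage WITH (TABLOCK) (Fact_Date, CustID, Amount)
SELECT TOP (1000)
       DATEADD(day, ABS(object_id % 28), '20080801'),  -- illustrative dates
       ABS(object_id % 1000),                          -- illustrative customer ids
       CAST(1 AS money)
FROM sys.all_objects;
GO

-- Build the indexes once, after the data is in place, instead of maintaining
-- them row by row during the load.
CREATE CLUSTERED INDEX CIX_LoadStage_Date ON dbo.LoadStage (Fact_Date);
CREATE NONCLUSTERED INDEX IX_LoadStage_Cust ON dbo.LoadStage (CustID);
GO
```

GO is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement. In a real load the SELECT would be replaced by the actual source, for example a BULK INSERT from a file.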

How do you speed up queries on 1,000,000,000-row tables?

Challenge - Backup
- Let’s say you have a 10 TB database.
- Now back that up.

Backup Calculation
- 10 TB ≈ 10,000 GB
- Typical backup speed: 1 to 20 GB/min
- At 10 GB/minute: who’s got 1,000 minutes? Who has 16 hours to spare?

Architecture

What do we have to work with?

SQL Server Storage Architecture
[Diagram: three layers. Physical I/O subsystem (disks); logical disk system, Windows drives (Drive C:, Drive D:, Drive E:); SQL Server storage (FileGroupA with FileA1, FileGroupB with FileB1 and FileB2, holding Table1 and Table2).]

Solutions

Solutions
- Use multiple filegroups/files
- INSERT into empty unindexed tables
- Partitioned tables and/or views
- Use READ_ONLY filegroups

At 3 PM on the 1st of the month: where do you want your data to be?

Spread to as many disks as possible

I/O Performance
- Little has changed in 50 years
- Size for performance, not for space
  - My app needs 1500 reads/sec and 800 writes/sec
- I/O throughput is a function of the number of disk drives.

Solution: Load Performance

Partitioning

Partitioned Views
- Created like any view

CREATE VIEW Fact AS
    SELECT * FROM Fact_
    UNION ALL
    SELECT * FROM Fact_

Partitioned Views: Check Constraints
- Check constraints tell SQL Server which data is in which table

ALTER TABLE Fact_
    ADD CONSTRAINT CK_FACT_ _Date
    CHECK (FactDate >= ' ' AND FactDate < ' ')

Partitioned View - 2
- Looks to a query like any table or view

SELECT FactDate, …
FROM Fact
WHERE CustID= AND FactDate = ' '

Partitioned View
[Diagram: the view Fact drawn over the storage architecture layers (physical I/O subsystem, Windows drives C:, D:, E:, SQL Server storage). Its member tables (Fact_ ...) sit on filegroups FGF1 to FGF4, backed by files F1 to F4.]

Partition Elimination
- The query compiler can eliminate partitions from consideration in the plan.
- Partition elimination happens at query compile time.
- It is often necessary to make partition values string constants.
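The partitioned-view slides above can be sketched end to end as follows. The table suffixes, dates, and values on the original slides were lost in the transcript, so every name here (dbo.PVFact_200801, dbo.PVFact_200802, dbo.PVFact, the Amount column) and every date is an assumption chosen for illustration.

```sql
-- Hypothetical member tables, one per month. Each CHECK constraint tells
-- SQL Server which date range lives in which table.
CREATE TABLE dbo.PVFact_200801 (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL,
    CONSTRAINT CK_PVFact_200801_Date
        CHECK (FactDate >= '20080101' AND FactDate < '20080201')
);
CREATE TABLE dbo.PVFact_200802 (
    FactDate smalldatetime NOT NULL,
    CustID   int           NOT NULL,
    Amount   money         NOT NULL,
    CONSTRAINT CK_PVFact_200802_Date
        CHECK (FactDate >= '20080201' AND FactDate < '20080301')
);
GO

-- The partitioned view is an ordinary view over UNION ALL of the members.
CREATE VIEW dbo.PVFact AS
    SELECT FactDate, CustID, Amount FROM dbo.PVFact_200801
    UNION ALL
    SELECT FactDate, CustID, Amount FROM dbo.PVFact_200802;
GO

INSERT INTO dbo.PVFact_200801 (FactDate, CustID, Amount) VALUES ('20080115', 42, 10.00);
INSERT INTO dbo.PVFact_200802 (FactDate, CustID, Amount) VALUES ('20080215', 42, 20.00);
GO

-- The date is a literal constant, so the optimizer can compare it with the
-- CHECK constraints at compile time and skip dbo.PVFact_200801.
SELECT FactDate, CustID, Amount
FROM dbo.PVFact
WHERE CustID = 42
  AND FactDate = '20080215';
```

The query returns only the February row. Comparing the actual execution plan for this query with one that filters on a variable is a quick way to check whether elimination happened at compile time.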

Demo 1 – Partitioned Views

Partitioned Tables
- SQL Server 2005, Enterprise Edition
- Require a non-null partitioning column
- Check constraints tell SQL Server what data is in each partition
- All tables are partitioned!

Partition Function
- Defines how to split data

CREATE PARTITION FUNCTION Fact_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES (' ', ' ')

Partition Scheme
- Defines where to store each range of data

CREATE PARTITION SCHEME Fact_PS
    AS PARTITION Fact_PF TO ([PRIMARY], FG_ , FG_ )

Creating a Partitioned Table
- The table is created ON the partition scheme

CREATE TABLE Fact (Fact_Date smalldatetime, all my other columns)
    ON Fact_PS (Fact_Date)
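The three steps above (partition function, partition scheme, table) can be put together in one runnable sketch. The boundary dates, object names (PTFact_PF, PTFact_PS, dbo.PTFact), and columns are assumptions. To keep the script runnable in any database, every partition is mapped to [PRIMARY]; a real deployment would more likely name one filegroup per range, as the slide does.

```sql
-- RANGE RIGHT: each boundary value belongs to the partition on its right.
-- Two boundaries give three partitions.
CREATE PARTITION FUNCTION PTFact_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('20080201', '20080301');
GO

-- Assumption for portability: all partitions go to PRIMARY.
CREATE PARTITION SCHEME PTFact_PS
    AS PARTITION PTFact_PF ALL TO ([PRIMARY]);
GO

-- Hypothetical fact table, created ON the scheme and partitioned by Fact_Date.
CREATE TABLE dbo.PTFact (
    Fact_Date smalldatetime NOT NULL,
    CustID    int           NOT NULL,
    Amount    money         NOT NULL
) ON PTFact_PS (Fact_Date);
GO

INSERT INTO dbo.PTFact (Fact_Date, CustID, Amount) VALUES ('20080115', 1, 10.00);
INSERT INTO dbo.PTFact (Fact_Date, CustID, Amount) VALUES ('20080215', 2, 20.00);
INSERT INTO dbo.PTFact (Fact_Date, CustID, Amount) VALUES ('20080315', 3, 30.00);
GO

-- $PARTITION reports which partition each row landed in.
SELECT $PARTITION.PTFact_PF(Fact_Date) AS PartitionNumber,
       COUNT(*)                        AS RowsInPartition
FROM dbo.PTFact
GROUP BY $PARTITION.PTFact_PF(Fact_Date)
ORDER BY PartitionNumber;
```

With these boundaries the January row falls in partition 1, the February row in partition 2, and the March row in partition 3, so the final query reports one row per partition.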

Partitioned Table
[Diagram: the table Fact drawn over the storage architecture layers (physical I/O subsystem, Windows drives C:, D:, E:, SQL Server storage). Its partitions Fact.$Partition=1 through Fact.$Partition=4 sit on filegroups FGF1 to FGF4, backed by files F1 to F4.]

Demo 2 – Partitioned Tables

Partitioning Goals
- Adequate import speed
- Maximize query performance
  - Make use of all available resources
- Data management
  - Migrate data to cheaper resources
  - Delete old data easily

Achieving Query Speed
- Eliminate partitions during query compile
- All disk resources should be used
  - Spread data to use all drives
  - Parallelize by querying multiple partitions
- All available memory should be used
- All available CPUs should be used

Issues with Partitioning
- No foreign keys can reference the partitioned table.
- Identity columns must be more closely managed.
- UPDATEs on partitioned tables with part of the table in READ_ONLY filegroups must have partition elimination that restricts the updates to READ_WRITE filegroups.
- INSERTs into partitioned views require all columns and face additional restrictions.

Solution: Backup Performance
- Back up less!
- Maintain data in a READ_ONLY state
- Compress backups

READ_ONLY FileGroups
- Require only one backup after becoming read_only
- Don’t require page or row locks
- Don’t require maintenance
- The ALTER requires exclusive access to the database before SQL 2008

ALTER DATABASE MODIFY FILEGROUP READ_ONLY
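A hedged sketch of the read-only filegroup workflow described above. The database name DemoDW, the filegroup name FG_2007, and the backup path are all hypothetical; the script assumes that database and filegroup already exist.

```sql
-- Assumption: DemoDW exists and has a filegroup named FG_2007 that holds
-- historical data which will no longer change.
ALTER DATABASE DemoDW MODIFY FILEGROUP FG_2007 READ_ONLY;
GO

-- Take the one backup the filegroup needs after it becomes read-only.
-- The disk path is a placeholder.
BACKUP DATABASE DemoDW
    FILEGROUP = N'FG_2007'
    TO DISK = N'X:\Backup\DemoDW_FG_2007.bak';
GO

-- Check which filegroups are now read-only.
USE DemoDW;
GO
SELECT name, is_read_only
FROM sys.filegroups
ORDER BY name;
```

Switching the filegroup back is the same statement with READ_WRITE, at which point it needs regular backups again.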

Partial Backup
- Partial base: backs up read_write filegroups
- Partial differential: differential backup of read_write filegroups

BACKUP DATABASE READ_WRITE_FILEGROUPS WITH DIFFERENTIAL …
BACKUP DATABASE READ_WRITE_FILEGROUPS …
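The two partial backups named above might look like this in full. The database name DemoDW and both file paths are hypothetical, and the elided parts of the slide's statements are not reconstructed here; this is only one valid form of the syntax.

```sql
-- Partial base: backs up the primary filegroup and every read_write
-- filegroup, and skips the read-only ones.
BACKUP DATABASE DemoDW
    READ_WRITE_FILEGROUPS
    TO DISK = N'X:\Backup\DemoDW_partial_base.bak'
    WITH INIT;
GO

-- Partial differential: only the extents changed since the partial base.
BACKUP DATABASE DemoDW
    READ_WRITE_FILEGROUPS
    TO DISK = N'X:\Backup\DemoDW_partial_diff.bak'
    WITH DIFFERENTIAL, INIT;
GO
```

On editions that support backup compression (introduced in SQL Server 2008), adding COMPRESSION to the WITH clause combines this with the "compress backups" advice.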

Maintenance Operations
- Maintain only READ_WRITE data
  - DBCC CHECKFILEGROUP
  - ALTER INDEX
    - REBUILD PARTITION =
    - REORGANIZE PARTITION =
- Avoid SHRINK
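The per-partition maintenance commands listed above can be sketched as follows. The partitioned table dbo.MaintFact, its index, and the choice of partition 3 as the "current" read_write partition are assumptions made so the script is self-contained; all partitions are mapped to [PRIMARY] for the same reason.

```sql
-- Hypothetical partitioned table with one aligned nonclustered index.
CREATE PARTITION FUNCTION MaintFact_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('20080201', '20080301');
GO
CREATE PARTITION SCHEME MaintFact_PS
    AS PARTITION MaintFact_PF ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.MaintFact (
    Fact_Date smalldatetime NOT NULL,
    CustID    int           NOT NULL
) ON MaintFact_PS (Fact_Date);
CREATE NONCLUSTERED INDEX IX_MaintFact_CustID ON dbo.MaintFact (CustID);
GO

-- Rebuild or reorganize only the partition that still receives writes,
-- rather than the whole index.
ALTER INDEX IX_MaintFact_CustID ON dbo.MaintFact REBUILD PARTITION = 3;
ALTER INDEX IX_MaintFact_CustID ON dbo.MaintFact REORGANIZE PARTITION = 3;
GO

-- Check one filegroup at a time instead of running a full DBCC CHECKDB.
-- In a real layout this would name a read_write data filegroup.
DBCC CHECKFILEGROUP ('PRIMARY') WITH NO_INFOMSGS;
```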

Sliding Window
[Diagram: one block of "Always There Data" followed by five blocks of "Temporal Data".]
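The slide shows the sliding window only as a picture. One common way to implement it on a partitioned table is SWITCH, MERGE, and SPLIT, sketched below; this is an assumption about the technique, not the speaker's script, and all object names and dates are hypothetical.

```sql
-- Three boundaries give four partitions:
-- P1 before Jan 2008, P2 January, P3 February, P4 March and later.
CREATE PARTITION FUNCTION SlideFact_PF (smalldatetime)
    AS RANGE RIGHT FOR VALUES ('20080101', '20080201', '20080301');
GO
CREATE PARTITION SCHEME SlideFact_PS
    AS PARTITION SlideFact_PF ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.SlideFact (
    Fact_Date smalldatetime NOT NULL,
    Amount    money         NOT NULL
) ON SlideFact_PS (Fact_Date);

-- Empty archive table with the same structure, on the same filegroup as the
-- partition that will be switched out.
CREATE TABLE dbo.SlideFact_Old (
    Fact_Date smalldatetime NOT NULL,
    Amount    money         NOT NULL
) ON [PRIMARY];
GO

INSERT INTO dbo.SlideFact (Fact_Date, Amount) VALUES ('20080115', 10.00);
INSERT INTO dbo.SlideFact (Fact_Date, Amount) VALUES ('20080215', 20.00);
INSERT INTO dbo.SlideFact (Fact_Date, Amount) VALUES ('20080315', 30.00);
GO

-- 1. Switch the oldest month out. This is a metadata operation; no rows move.
ALTER TABLE dbo.SlideFact SWITCH PARTITION 2 TO dbo.SlideFact_Old;

-- 2. Remove the boundary of the now-empty range.
ALTER PARTITION FUNCTION SlideFact_PF() MERGE RANGE ('20080101');

-- 3. Name the filegroup for the next partition, then add the new month.
ALTER PARTITION SCHEME SlideFact_PS NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION SlideFact_PF() SPLIT RANGE ('20080401');
GO
```

After these steps dbo.SlideFact_Old holds the January row, ready to be archived or dropped, and dbo.SlideFact has an empty partition waiting for April data.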

SQL Server 2008 – What’s New
- Row, page, and backup compression
- Filtered indexes
- Optimization for star joins
- Lock escalation to the partition level
- Partitioned indexed views
- Fewer operations require exclusive access to the database
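Three of the SQL Server 2008 features listed above have short, concrete syntax, illustrated here on a hypothetical table. The table dbo.NewFeatFact, its columns, and the filter date are assumptions; data compression required Enterprise Edition in SQL Server 2008.

```sql
-- Hypothetical table used only to demonstrate the syntax.
CREATE TABLE dbo.NewFeatFact (
    Fact_Date smalldatetime NOT NULL,
    CustID    int           NOT NULL,
    Amount    money         NOT NULL
);
GO

-- Page compression (ROW is the lighter alternative).
ALTER TABLE dbo.NewFeatFact REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Filtered index: indexes only the rows that match the WHERE clause.
CREATE NONCLUSTERED INDEX IX_NewFeatFact_Recent
    ON dbo.NewFeatFact (CustID)
    WHERE Fact_Date >= '20080101';

-- AUTO lets lock escalation stop at the partition level on partitioned
-- tables; on a non-partitioned table like this one it behaves like TABLE.
ALTER TABLE dbo.NewFeatFact SET (LOCK_ESCALATION = AUTO);
GO
```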

Thanks for Coming
Andrew Novick