Rushabh Mehta Managing Director (India) | Solid Quality Mentors


About me: Rushabh Mehta
- Professional Association for SQL Server: President
- Solid Quality Mentors (SolidQ): Business Intelligence Mentor; Managing Director, India
- SQL Server MVP

Agenda
- Microsoft Data Warehousing Overview
- SMP vs. MPP Architecture
- Microsoft Parallel Data Warehouse Architecture and Components

Microsoft Data Warehousing Offerings

Microsoft's Commitment to DW and BI
[Diagram labels: Pervasive Insight, Data Warehousing, OLAP and ETL, Data Mining, Managed Reporting, Ad-hoc Reporting, DW Scale, Data Profiling, Compression, VS Integration, KPIs, Multiple sources, Resource Governor, Partitioning, Power Pivot, Load, Optimize, Parallel Processing, Scale to 100s of TB]
- Gartner Leaders Quadrant for Business Intelligence, since 2008
- Gartner Leaders Quadrant for Data Warehouse, since 2008
- Leader in "The Forrester Wave: Enterprise Data Warehousing Platforms, Q1 2009"
- Fastest growing of top 5 data warehouse vendors (IDC)
- Microsoft as a company spends $9.1 billion on research annually

SQL Server Fast Track Data Warehouse
- A method for designing a cost-effective, balanced system for data warehouse workloads
- Reference hardware configurations developed in conjunction with hardware partners using this method
- Best practices for data layout, loading and management
- A solution to help customers and partners accelerate their data warehouse deployments

Fast Track Scope
[Diagram: the reference architecture scope (dashed outline) within the wider BI stack. Column headings: Supporting Systems, BI Data Storage Systems, Presentation Layer Systems. Other labels: Data Path; Presentation Data; Integration Services ETL; Data Staging, Bulk Loading; Data Warehouse; SAN, Storage Array; Subject Area Data Marts; Analysis Services Cubes; PerformancePoint Services; Reporting Services; Web Analytic Tools; SharePoint Services; Microsoft Office SharePoint]

Fast Track Value Proposition

SMP Architecture
- SMP = Symmetric Multiprocessing
- Two or more identical processors connected to a single shared main memory and controlled by a single OS instance
- Any processor can work on any task
- Tasks can easily be moved between processors to balance the workload efficiently
- All SQL Server implementations up until now have been SMP

MPP Architecture
- MPP = Massively Parallel Processing
- Uses many separate CPUs running in parallel to execute a single program
- Each CPU has its own memory
- Applications must be segmented, using high-speed communications between nodes

Advantages of MPP Architecture

Parallel Data Warehouse
[Diagram: a control rack and one or more data racks]

Parallel Data Warehouse
[Diagram labels: Compute Nodes, Storage Nodes, Spare Compute Node, Dual Fiber Channel; Compute Node + Storage Node = PDW Node]

Compute Nodes
- Each MPP node is a highly tuned SMP node with standard interfaces
- Dedicated hardware, database and storage
- Running SQL Server 2008 EE
- SQL as primary interface

Architecture: Compute Server Node
[Diagram labels: Hardware Options, Enterprise Class DBMS, TempDB Workspace, Dual Multi-Core Processors, Dual 4Gb FC, Dual InfiniBand, CPU, RAM]
- Pre-configured for each SQL Server instance on each compute node
- Drives configured as RAID1 to avoid appliance failover for a single drive failure
- Dell compute nodes have 2 LUNs (2 RAID1 pairs)
- HP compute nodes have 3 LUNs (3 RAID1 pairs)
- tempdb is used for the following purposes:
  - Sort-work area for data loading into clustered index tables
  - Spill area for hash joins not fitting into memory
  - Temporary PDW tables

Data Layout
- Replicated: a table structure that exists as a full copy within each discrete PDW node.
- Distributed: a table structure that is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the DBMS.
- Ultra shared nothing: the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes.
  - Small sets of data can be more efficiently stored in full (replicated).
  - Certain set operations are more efficient against full sets of data (i.e., single node operations).
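The two table layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PDW's implementation: the node count, the CRC32 hash, the table contents and every function name below are hypothetical. It shows a fact table hashed on one column, a small dimension table copied to every node, and a join that each node can then complete locally without moving rows between nodes.

```python
import zlib

NODE_COUNT = 4  # hypothetical appliance size, chosen for illustration


def node_for(key, node_count=NODE_COUNT):
    """Map a distribution-column value to a node using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, column, node_count=NODE_COUNT):
    """Distributed table: each row lives on exactly one node."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[column], node_count)].append(row)
    return nodes


def replicate(rows, node_count=NODE_COUNT):
    """Replicated table: every node holds a full copy."""
    return [list(rows) for _ in range(node_count)]


def local_join(fact_slice, dim_copy, key):
    """Join one node's fact rows to its local dimension copy."""
    lookup = {d[key]: d for d in dim_copy}
    return [{**f, **lookup[f[key]]} for f in fact_slice if f[key] in lookup]


# Hypothetical sample data: a small dimension and a larger fact table.
stores = [
    {"store_id": 1, "store_name": "North"},
    {"store_id": 2, "store_name": "South"},
]
sales = [{"sale_id": i, "store_id": 1 + i % 2, "qty": i} for i in range(1, 9)]

sales_by_node = distribute(sales, "sale_id")  # hashed on a single column
stores_by_node = replicate(stores)            # full copy on each node

# Each node joins only its own data; no rows cross node boundaries.
joined = []
for node in range(NODE_COUNT):
    joined.extend(
        local_join(sales_by_node[node], stores_by_node[node], "store_id")
    )
```

Because the small table is replicated, the join key does not have to match the distribution column of the fact table; joining two distributed tables on a column other than their distribution column would require moving rows between nodes first.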

Data Layout
[Diagram: an example star schema and its placement across nodes]
- Date Dim: Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day
- Store Dim: Store Dim ID, Store Name, Store Mgr, Store Size
- Item Dim: Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc
- Promo Dim: Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End
- Sales Fact: Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp ID, Qty Sold, Dollars Sold
[Node labels: each node shows the dimension tables (DD, SD, ID, PD) together with one slice of the Sales Fact table (SF 1 through SF 5)]

Parallel Data Warehouse
[Diagram labels: Compute Nodes, Storage Nodes, Spare Compute Node, Dual Fiber Channel, Dual InfiniBand, Control Nodes (Active / Passive), Landing Zone, Backup Node, Management Servers, Client Drivers, ETL Load Interface, Corporate Backup Solution, Support / Patching, Corporate Network, Private Network]

Control Node & Client Drivers
Control Node
- Client connections always go through the control node
- The control node contains no persistent user data
- PDW 'Secret Sauce':
  - Processes SQL requests
  - Prepares the execution plan
  - Orchestrates distributed execution
- Local SQL Server does final query plan processing / result aggregation
Client Drivers
- Provided by DataDirect
- ODBC, OLE-DB, JDBC and ADO.NET client drivers
- Drivers available for 32 and 64 bits
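The control node's role (send the request to the compute nodes, then aggregate the results locally) follows a scatter-gather pattern, which can be sketched as below. This is a minimal illustration, assuming a SUM ... GROUP BY query over a distributed table; the per-node data, the function names and the use of a thread pool to stand in for parallel nodes are all hypothetical and are not taken from PDW itself.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-node slices of a distributed sales table: (store, dollars).
NODE_SLICES = [
    [("North", 10.0), ("South", 5.0)],
    [("North", 2.5)],
    [("South", 7.5), ("East", 1.0)],
]


def run_on_node(rows):
    """Per-node step: partial SUM(dollars) GROUP BY store on local rows only."""
    partial = {}
    for store, dollars in rows:
        partial[store] = partial.get(store, 0.0) + dollars
    return partial


def control_node_query(node_slices):
    """Scatter the query to every node, then aggregate the partial results."""
    # Scatter: each simulated node works on its own slice in parallel.
    with ThreadPoolExecutor(max_workers=len(node_slices)) as pool:
        partials = list(pool.map(run_on_node, node_slices))

    # Gather: the final aggregation happens in one place.
    totals = {}
    for partial in partials:
        for store, subtotal in partial.items():
            totals[store] = totals.get(store, 0.0) + subtotal
    return totals


result = control_node_query(NODE_SLICES)
```

Only the small per-node partial results travel back for the final step, which is why the node coordinating the query does not need to hold any user data itself.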

PDW Benefits – Massive Parallel Processing
[Diagram: control rack and data rack, with Query 1 sent to every node]
- Query 1 is standard T-SQL submitted to SQL Server on the Control Node
- The query is executed on all 10 nodes
- Results are sent back to the client

PDW Benefits – Massive Parallel Processing
[Diagram: control rack and data rack, with many queries running on every node]
- Blazing fast performance by parallelizing queries on highly optimized ultra shared nothing nodes
- Multiple queries are simultaneously executed across all nodes
- PDW supports querying while data is loading

Parallel Data Warehouse
[Diagram: the appliance overview, highlighting Support / Patching and the Management Nodes (Active / Passive Cluster)]

Management Node
- Runs a separate domain controller (Active Directory)
- Used for deploying patches to all nodes in the appliance
- Holds images in case a node needs reimaging
- High availability using Active / Passive clustering

Parallel Data Warehouse
[Diagram: the appliance overview, highlighting the Landing Zone and the ETL Load Interface]

Landing Zone
- Provides high-capacity storage for data files from ETL processes
- Integration Services available on the landing zone
- Connected to the internal network
- Available as a sandbox for other applications and scripts that run on the internal network
[Diagram labels: Source, Landing Zone Files, Data Loader, Compute Nodes]

Parallel Data Warehouse
[Diagram: the appliance overview, highlighting the Backup Node and the Corporate Backup Solution]

Backup Node
- Coordinated backup across the nodes
- Database-level backup
- Full or differential
- Metadata backup
- Can restore to a larger appliance
- Optional item – 1 size per config
- Up to 524TB of capacity
- Available in XS, S, M, L and XL

PDW Software Architecture
[Diagram: software components on each node type, with a legend of three origins: built by DWPU, existing MS software, 3rd party]
[Node labels: Control Node, Compute Nodes, Landing Zone, Backup Node, Management Node]
[Component labels: SQL Server, DW Authentication, DW Configuration, DW Queue, DW Schema, PDW Services, DMS, DMS Manager, DMS Loader Client, IIS, Admin Console, DSQL Core Engine Services, User Data, SSIS, HPC, AD, SQL, OS]
[Client access labels: Nexus Query Tool, MS BI (AS, RS), 3rd Party Tools, JDBC, OLE-DB, ODBC, ADO.NET]

Conclusion
- MPP architecture supports massive scale through increased parallelization and shared-nothing architecture
- Microsoft SQL Server 2008 R2 Parallel Data Warehouse Edition brings massive scale wrapped in the simplicity of an appliance

References Microsoft Parallel Data Warehouse official site

© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.