Introducing SQL Server 2012 Improvements
MODULE OUTLINE
- Introduction
- New Improvements and Features
INTRODUCTION
INTRODUCTION SQL SERVER'S NEW STORAGE MODEL
BUSINESS INTELLIGENCE HIGHLIGHTS
The Vision IMPROVED PRODUCTIVITY FOR DEVELOPERS (CONTINUED)
Introducing SQL Server 2012 Integration Services Improvements SQL11UPD05-DECK-01
INTRODUCTION SQL SERVER ENTERPRISE INFORMATION MANAGEMENT
Complete, Current, Consistent and Clean Data
- Integration Services
  - Primarily designed to implement ETL processes
  - Provides a robust, flexible, fast, scalable and extensible architecture
- Master Data Services
  - Master data management
  - Manages reliable, centralized data
  - Broadens its reach with a new Excel Add-in that can leverage Data Quality Services
- Data Quality Services
  - Knowledge-driven data cleansing
  - Corrects, de-duplicates and standardizes data
  - Integrates with Integration Services
New Improvements and Features IMPROVED USABILITY FOR NEW DEVELOPERS
- “Getting Started” window with links to samples and videos
- New SSIS Toolbox:
  - Exposes a description and links to samples for a selected component
  - Allows categorizing, including Favorites
- Data flow connection assistants to facilitate the configuration of sources and destinations
New Improvements and Features IMPROVED PRODUCTIVITY FOR DEVELOPERS
- Undo/Redo
- Data flow Script component debugging
- SQL Server Data Tools (SSDT) designer
  - Updated look and feel
- New VSTA scripting environment with support for .NET 4
- Auto-save and recovery
- Improved performance when opening packages
- Project connection managers
- Icon marker to indicate when expressions are used
New Improvements and Features IMPROVED DEPLOYMENT, CONFIGURATION AND MANAGEMENT
- SSIS catalog: User database (SSISDB) hosted on a SQL relational instance with a collection of views and stored procedures that provide a T-SQL API
- New project model for bundling project resources together to simplify deployment (.ispac)
- New parameter model to simplify configuration management
  - Similar to parameters in programming functions
  - Read-only variable in a special namespace
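The parameter model on this slide can be pictured with a small Python analogy. This is not SSIS code: design-time defaults play the role of function parameter defaults, values supplied at execution time override them, and the running package sees the result as a read-only mapping. The `$Project::` and `$Package::` prefixes follow SSIS parameter naming; the parameter names, their values and the `resolve_parameters` helper are hypothetical.

```python
from types import MappingProxyType


def resolve_parameters(defaults, overrides):
    """Merge design-time defaults with values supplied at execution time.

    Returns a read-only view, mirroring how a package can read a
    parameter but cannot assign to it while it runs.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Only parameters declared by the project or package may be set.
        raise KeyError(f"undeclared parameters: {sorted(unknown)}")
    merged = {**defaults, **overrides}
    return MappingProxyType(merged)


# Hypothetical parameters; names and values are illustrative only.
design_time_defaults = {
    "$Project::ServerName": "localhost",
    "$Package::BatchSize": 1000,
}
production_values = {"$Project::ServerName": "prod-sql-01"}

params = resolve_parameters(design_time_defaults, production_values)
print(params["$Project::ServerName"])  # value supplied at execution time
print(params["$Package::BatchSize"])   # falls back to the design-time default

try:
    params["$Package::BatchSize"] = 5  # assignment is rejected
except TypeError as exc:
    print("read-only:", exc)
```

Keeping the merged values behind a read-only view is what separates a parameter from an ordinary package variable in this analogy: configuration is decided before execution starts and cannot drift while the package runs.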
New Improvements and Features IMPROVED DEPLOYMENT, CONFIGURATION AND MANAGEMENT (CONTINUED)
- Available with the SSIS catalog:
  - Connection manager properties are automatically exposed on the server
  - Environments can be created to run packages with different settings
  - Automatic capture of package execution logs
  - Data tap functionality, to dynamically capture data as it flows through the data flow, and without modifying the package
  - Server can be managed using PowerShell
New Improvements and Features IMPROVED DEPLOYMENT, CONFIGURATION AND MANAGEMENT (CONTINUED)
- Available with the SSIS catalog (continued):
  - Built-in reports:
    - Integration Services Dashboard – view all packages that have run on the server in the past 24 hours
    - Performance Reports – view a package’s performance over time, at package and component level
    - Error Message Report – details failed package executions and related error messages
  - Support for custom reports
DEVELOPER OPPORTUNITIES
- Manage the SSIS catalog by using the public T-SQL API, or the .NET interface:
  - Develop management tools and solutions
  - Develop custom monitoring and dashboard reports
  - Embed functionality into solutions (e.g. package execution)
  - Automate project deployment as part of an installer package
- Develop dynamic packages or SSIS extensions by using the SSIS object model:
  - Custom tasks
  - Custom data flow components (Transformations, Sources, Destinations)
  - Data source connectors
  - ForEach enumerators
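As one illustration of embedding package execution through the T-SQL API, the Python sketch below only builds the text of a T-SQL batch; it does not connect to a server. The stored procedures named (`catalog.create_execution`, `catalog.set_execution_parameter_value`, `catalog.start_execution`) belong to the SSISDB catalog, but the folder, project, package and parameter names are hypothetical, and the argument lists should be checked against the catalog documentation for the target build before use.

```python
def tsql_literal(value):
    """Render a Python bool, int or str as a T-SQL literal."""
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Double embedded single quotes and use an N'' Unicode literal.
        return "N'" + value.replace("'", "''") + "'"
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def build_execution_batch(folder, project, package, package_parameters=None):
    """Return a T-SQL batch that creates, configures and starts an execution."""
    lines = [
        "DECLARE @execution_id BIGINT;",
        "EXEC [SSISDB].[catalog].[create_execution]",
        f"    @folder_name = {tsql_literal(folder)},",
        f"    @project_name = {tsql_literal(project)},",
        f"    @package_name = {tsql_literal(package)},",
        "    @use32bitruntime = 0,",
        "    @reference_id = NULL,",
        "    @execution_id = @execution_id OUTPUT;",
    ]
    for name, value in (package_parameters or {}).items():
        lines += [
            "EXEC [SSISDB].[catalog].[set_execution_parameter_value]",
            "    @execution_id = @execution_id,",
            "    @object_type = 30,  -- 30 = package parameter",
            f"    @parameter_name = {tsql_literal(name)},",
            f"    @parameter_value = {tsql_literal(value)};",
        ]
    lines.append(
        "EXEC [SSISDB].[catalog].[start_execution] @execution_id = @execution_id;"
    )
    return "\n".join(lines)


# Hypothetical folder, project, package and parameter names.
batch = build_execution_batch(
    "Finance", "LoadWarehouse", "LoadSales.dtsx", {"BatchSize": 500}
)
print(batch)
```

In a real solution the values would normally be passed as bound parameters through a database driver rather than rendered into the batch text; the string form is used here only so the generated calls are visible.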
Introducing SQL Server 2012 Master Data Services Improvements SQL11UPD05-DECK-02
NEW IMPROVEMENTS AND FEATURES IMPROVED MASTER DATA MANAGER
NEW IMPROVEMENTS AND FEATURES IMPROVED MASTER DATA MANAGER (CONTINUED)
NEW IMPROVEMENTS AND FEATURES NEW MICROSOFT EXCEL ADD-IN (CONTINUED)
DEVELOPER OPPORTUNITIES
- For the Systems Integrator:
  - Leverage MDS to connect disparate information systems that share common master data
- For the ISV:
  - By integrating with MDS, customers who deploy your line-of-business solution can incorporate it into their own enterprise-wide master data management solution
  - This ensures that your solution is not an island and stays consistent and compliant with master data that is shared with other applications running in the customer's environment
- Anything that MDS can do can be embedded into, or automated by, your solutions by using the WCF API. For example:
  - Create, read, update, and delete metadata
  - Create, read, update, and delete entity members
Introducing SQL Server 2012 Data Quality Services SQL11UPD05-DECK-03
INTRODUCTION BUSINESS SCENARIOS
- Knowledge Management and Reference Data
  - Create and manage DQS Knowledge Bases
  - Discover knowledge from the organization's data
  - Explore and integrate with 3rd party reference data
- Cleansing and Matching
  - Correct, de-duplicate and standardize data
- Administration
  - Monitor and control data quality processes
INTRODUCTION DQS PROCESS
[Diagram: the DQS process. Labels: Knowledge Management (Build): Discover / Explore Data / Connect; Knowledge (Manage); Data Quality Projects (Use): Match & De-duplicate, Correct & Standardize; Reference Data; Cloud Services; Enterprise Data; Integrated Profiling; Notifications; Progress; Status]
DQS COMPONENTS DATA QUALITY CLIENT
KNOWLEDGE BASES
- Rationale: To cleanse data you need knowledge about it
- The Knowledge Base is a data repository of knowledge that enables professionals to understand their data and maintain its integrity
- Knowledge in a Knowledge Base is maintained in Domains, each of which is specific to a data field
KNOWLEDGE BASES (CONTINUED)
- Domains capture the semantics of the data
- Domains can use online reference data
  - Online DataMarket Reference Data Service
  - Direct Online 3rd Party Reference Data Services
- Processes include:
  - Domain Management – to define domains
  - Knowledge Discovery – to learn domain values
  - Matching Policy – to identify potential duplicates and non-matches
KNOWLEDGE BASES (CONTINUED)
[Diagram labels: Knowledge Base; Domains (represent the data type); Composite Domains; Values; Rules & Relations; Reference Data; Matching Policy]
DATA QUALITY PROJECT
- A Data Quality Project is a means of using a Knowledge Base to improve the quality of source data by performing data cleansing and data matching activities
- Created and managed in the Data Quality Client
- Results can be exported to a SQL Server table or CSV file
- Two types:
  - Cleansing Activity – processed data is categorized as new, invalid, corrected, and correct
  - Matching Activity – used to prevent data duplication by identifying exact and approximate matches
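The difference between exact and approximate matches in a matching activity can be sketched in a few lines of Python. The slides do not describe how DQS scores matches, so this example substitutes the standard library's `difflib.SequenceMatcher` as a stand-in; the company names and the 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_matches(records, threshold=0.85):
    """Return (left, right, kind) tuples for exact and approximate matches."""
    matches = []
    for left, right in combinations(records, 2):
        score = similarity(left, right)
        if score == 1.0:
            matches.append((left, right, "exact"))
        elif score >= threshold:
            matches.append((left, right, "approximate"))
    return matches


# Hypothetical source records.
records = ["Contoso Ltd", "contoso ltd", "Contoso Ltd.", "Fabrikam Inc"]
matches = find_matches(records)
for left, right, kind in matches:
    print(f"{left!r} ~ {right!r}: {kind}")
```

Comparing every pair is quadratic in the number of records, which is acceptable for a toy list but not for real data volumes; it is used here only to keep the idea visible.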
DQS CLEANSING TRANSFORM
- Implements data cleansing in an SSIS 2012 data flow
- Configuration involves:
  - Defining a connection to a Data Quality Server
  - Selecting a knowledge base
  - Mapping input columns to domains
  - Selecting advanced statistical columns
- The output includes the original data and corrected data, together with status
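The shape of the cleansing output (original value, corrected value, status) can be illustrated with a toy Python model. It is a sketch under stated assumptions, not the DQS algorithm: the "domain" is a hard-coded set of city names, the corrections and the validity rule are invented, and the four statuses are the ones named for a cleansing activity (new, invalid, corrected, correct).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CleansedValue:
    original: str
    output: str
    status: str  # one of: correct, corrected, invalid, new


# A toy "domain" for a City field; all values here are made up.
KNOWN_VALUES = {"Seattle", "Portland"}
KNOWN_CORRECTIONS = {"Seatle": "Seattle", "Portand": "Portland"}


def is_valid_city(value):
    """Hypothetical domain rule: letters and spaces only."""
    return value.replace(" ", "").isalpha()


def cleanse(value):
    """Classify one source value against the toy domain."""
    if value in KNOWN_VALUES:
        return CleansedValue(value, value, "correct")
    if value in KNOWN_CORRECTIONS:
        return CleansedValue(value, KNOWN_CORRECTIONS[value], "corrected")
    if not is_valid_city(value):
        return CleansedValue(value, value, "invalid")
    # Passes the rule but is not in the knowledge base yet.
    return CleansedValue(value, value, "new")


source = ["Seattle", "Seatle", "S3attle", "Boise"]
results = [cleanse(v) for v in source]
for r in results:
    print(f"{r.original:10} -> {r.output:10} [{r.status}]")
```

Carrying the original value alongside the output and status lets a downstream step route rows by status without losing what the source system actually held.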
Introducing SQL Server 2012 Analysis Services Improvements SQL11UPD06-DECK-01
INTRODUCTION
- SQL Server 2012 Analysis Services (SSAS) is the fifth release of the product
- New features and enhancements in this release are based on:
  - A vision to expand the reach to a broader user base
  - Embracing the tabular data model
  - Bringing together tabular and multidimensional models under a single unified platform – the BI Semantic Model
- Note: There are no improvements to the data mining component
- Tabular model development is explored in SQL11UPD06-DECK-02
This presentation introduces what is new for Analysis Services in SQL Server 2012. Take care not to cover content, specifically about tabular model development and DAX, that will be delivered in the next presentation.
BI SEMANTIC MODEL One Model for All End User Experiences
[Diagram: Client Tools (Analytics, Reports, Scorecards, Dashboards, Custom Apps); BI Semantic Model with three layers (Data Model, Business Logic and Queries, Data Access); Data Sources (Databases, LOB Applications, OData Feeds, Spreadsheets, Text Files); Personal BI (PowerPivot for Excel), Team BI (PowerPivot for SharePoint), Corporate BI (Analysis Services)]
Use this slide to introduce the BI Semantic Model as one model for all end user experiences. Explain briefly that it works with all existing Microsoft OLAP clients and tools, but do not go into the details at this stage. It is sufficient to introduce the BI Semantic Model and describe its three layers: Data Model, Business Logic and Queries, and Data Access. Respectively, these are the model as exposed to the end user, the logic in the form of calculations, and data access to different types of data stores, either cached or passthrough. The BI Semantic Model supports Personal BI with PowerPivot for Excel, Team BI with PowerPivot for SharePoint, and Corporate BI with Analysis Services.
[Diagram: clients (Third-Party Applications, Reporting Services, Excel, PowerPivot, SharePoint Insights, Power View) query the BI Semantic Model with DAX Query or MDX Query; data sources are LOB Applications, Files, OData Feeds, Cloud Services, Relational Databases, and a Deployed BI Semantic Model]
Project Type             | Design Type      | Development Tool      | Business Logic | Data Access: Cache | Data Access: Passthrough | Deployment
PowerPivot Workbook      | Tabular          | Excel 2010            | DAX            | In-Memory          | N/A                      | SharePoint Library / Analysis Services PowerPivot
Tabular Project          | Tabular          | SQL Server Data Tools | DAX            | In-Memory          | DirectQuery              | Analysis Services Tabular
Multidimensional Project | Multidimensional | SQL Server Data Tools | MDX            | MOLAP              | ROLAP                    | Analysis Services Multidimensional
A PowerPivot workbook can be restored to a Tabular instance, or imported to create a Tabular Project
This comprehensive animated slide has been designed to build up an understanding of the three project types that can be used to develop a BI Semantic Model.
[2] Introduce the three project types: PowerPivot Workbook (existing and enhanced), Tabular Project (new) and Multidimensional Project (renamed from the Unified Dimensional Model).
[3] Each project type is described in terms of Design Type, Development Tool, Business Logic, Data Access and Deployment.
[4] PowerPivot Workbook: Targeted at business analysts, these models are created using PowerPivot for Excel and business logic is implemented with DAX. There is only support for cached data with the In-Memory storage mode. Deployment is optional, and is achieved with SharePoint Server 2010 with the PowerPivot Add-in for SharePoint.
[5] Tabular Project: Targeted at IT Professionals, the development tool is SQL Server Data Tools. These models can be configured with DirectQuery, which is a passthrough mode to the underlying data source. Models (projects) must be deployed to a Tabular mode instance of Analysis Services.
[6] PowerPivot Workbooks can be restored to a Tabular mode instance of Analysis Services or imported into a Tabular Project.
[7] Multidimensional Project: Also targeted at IT Professionals, this model type is what was previously known as the Unified Dimensional Model (UDM). The only major changes in this release are the SQL Server Data Tools designers and the change of name.
[8] All existing OLAP clients, third party applications and the new report authoring tool named Power View can be used to query any type of BI Semantic Model.
[9] Existing clients and third party applications continue to use MDX.
[10] Power View uses a new query language named DAX Query. This new query language is supported only by tabular models in this release.
[11] Data can be integrated into models from a variety of data sources.
[12] The Multidimensional Project only supports producing models based on relational databases.
[13] The tabular models support relational databases, files, OData feeds, and other deployed BI Semantic Models.
BI SEMANTIC MODEL DELIVERABLES
- Flexibility
  - Tabular and multidimensional modeling experiences
  - DAX and MDX for business logic and queries
  - Cached and passthrough storage modes
  - Choice of end-user BI tools
- Richness
  - Rich data modeling capabilities
  - Sophisticated business logic using DAX and MDX
  - Fine-grained security – row and cell level
  - Enterprise capabilities – multi-language and perspectives
- Scalability
  - In-Memory for high performance, MOLAP for mission critical scale
  - DirectQuery and ROLAP for passthrough access to data sources
  - State-of-the-art compression algorithms
  - Scales to the largest of enterprise servers
Use this slide to describe the BI Semantic Model deliverables in terms of flexibility, richness and scalability. This slide should be used to reinforce the material presented in the previous slide.
CHOOSING THE RIGHT DEVELOPMENT APPROACH
- The model developer needs to choose the right development approach
Use this slide to describe the project templates now available to the developer in SQL Server Data Tools.
CHOOSING THE RIGHT DEVELOPMENT APPROACH DATA MODEL
- Tabular
  - Familiar model, easier to build, faster time to solution
  - Some advanced concepts are not available natively in the model and may need calculations to simulate them
  - Easy to wrap a model over a raw database or data warehouse for analytics and reporting
- Multidimensional
  - Sophisticated model involving a higher learning curve
  - Advanced concepts are baked into the model and optimized (parent-child, many-to-many, attribute relationships, key vs. name, etc.)
  - Ideally suited for OLAP type applications (e.g. planning, budgeting, forecasting) that need the power of the multidimensional model
Data model considerations are detailed on this slide for both types of model development.
CHOOSING THE RIGHT DEVELOPMENT APPROACH BUSINESS LOGIC
- DAX
  - Based on Excel formulas and relational concepts – easy to get started
  - Complex solutions require a steeper learning curve – row/filter context, CALCULATE, etc.
  - Calculated columns enable new scenarios, but there are no named sets or calculated members (other than measures)
- MDX
  - Based on an understanding of multidimensional concepts – involves a higher initial learning curve
  - Complex solutions require a steeper learning curve
  - Ideally suited for applications that need the power of multidimensional calculations involving scopes, assignments, and calculated members
Business logic considerations are detailed on this slide for both types of model development.
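The row context and filter context ideas mentioned for DAX can be pictured with a loose Python analogy. This is not DAX and omits much of its semantics (context transition and table filters, for example); the table, the column names and the helper functions are all invented for illustration. A calculated column is evaluated once per row, a measure aggregates over whatever rows the current filters leave visible, and the `calculate` helper mimics the way simple column filters passed to CALCULATE replace existing filters on the same column.

```python
# Toy fact table; the rows and column names are invented for illustration.
SALES = [
    {"Year": 2011, "Region": "East", "Sales": 100, "Cost": 60},
    {"Year": 2011, "Region": "West", "Sales": 150, "Cost": 90},
    {"Year": 2012, "Region": "East", "Sales": 120, "Cost": 70},
    {"Year": 2012, "Region": "West", "Sales": 180, "Cost": 100},
]


def add_calculated_column(rows, name, expression):
    """Row context: the expression is evaluated once per row."""
    return [{**row, name: expression(row)} for row in rows]


def apply_filter_context(rows, filters):
    """Filter context: keep rows whose columns equal the filter values."""
    return [r for r in rows if all(r[c] == v for c, v in filters.items())]


def total_sales(rows, filters):
    """A measure: aggregates over the rows the filter context leaves visible."""
    return sum(r["Sales"] for r in apply_filter_context(rows, filters))


def calculate(measure, rows, filters, **overrides):
    """Evaluate a measure after replacing filters on the given columns."""
    return measure(rows, {**filters, **overrides})


rows = add_calculated_column(SALES, "Margin", lambda r: r["Sales"] - r["Cost"])

cell = {"Year": 2012, "Region": "East"}  # e.g. one cell of a pivot table
this_year = total_sales(rows, cell)
prior_year = calculate(total_sales, rows, cell, Year=2011)
print(this_year, prior_year)
```

The same measure definition returns different numbers in different cells because only the filter context changes, which is the part of DAX that the slide flags as a steeper learning curve.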
CHOOSING THE RIGHT DEVELOPMENT APPROACH DATA ACCESS AND STORAGE
- In-Memory
  - In-memory column store with typical 10x compression
  - Brute-force memory scans are high performance by default, and no tuning is required
  - Basic paging support, and data volume is mostly limited to physical memory
- MOLAP
  - Disk-based store with typical 3x compression
  - Disk scans with in-memory subcube caching, and aggregation tuning required
  - Extensive paging support, and data volumes can scale to multiple terabytes
- DirectQuery
  - Passes through DAX queries and calculations to fully exploit backend database capabilities
  - No support for MDX queries and no support for data sources other than SQL Server
- ROLAP
  - Passes through fact table requests and is not recommended for large dimension tables
  - Supports most relational data sources, though there is no support for aggregations except with SQL Server indexed views
Data access and storage considerations are detailed on this slide for both types of model development.
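Why a column store compresses well can be shown with a short Python sketch. The slides do not say which algorithms the In-Memory engine uses; dictionary encoding followed by run-length encoding are common column-store techniques and are assumed here purely for illustration. The column contents are hypothetical, and a toy ratio says nothing about the "typical 10x" figure quoted on the slide.

```python
from itertools import groupby


def dictionary_encode(column):
    """Replace each value with a small integer id into a list of distinct values."""
    dictionary = sorted(set(column))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[value] for value in column]


def run_length_encode(codes):
    """Collapse runs of equal ids into [id, run_length] pairs."""
    return [[code, len(list(run))] for code, run in groupby(codes)]


def decode(dictionary, runs):
    """Invert both steps to recover the original column."""
    return [dictionary[code] for code, length in runs for _ in range(length)]


# A hypothetical low-cardinality column, as might appear in a fact table.
column = ["East"] * 5 + ["North"] * 3 + ["West"] * 4

dictionary, codes = dictionary_encode(column)
runs = run_length_encode(codes)
print(dictionary)
print(runs)
print(decode(dictionary, runs) == column)
```

Twelve strings reduce to three dictionary entries and three short runs because values within a single column repeat far more often than values across a row, which is the property a column-oriented layout exploits.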
Exploring SQL Server 2012 Power View SQL11UPD06-DECK-04
INTRODUCTION
- Power View is an interactive data exploration, visualization, and presentation experience
  - Highly visual design experience
  - Rich meta-driven interactivity
  - Presentation-ready at all times
- Provides intuitive ad-hoc reporting for business users such as data analysts, business decision makers, and information workers
- Ordinarily, a Power View report needs to be based on a tabular BI Semantic Model that has been optimized for the report authoring tool
Use this slide to describe in broad terms what Power View has been designed to achieve, and that it must be based on a tabular BI Semantic Model (probably optimized for Power View).
POWER VIEW REPORTING AUDIENCE AND AUTHORING TOOLS
To make it clear where this new report authoring tool fits, take time to ensure the audience appreciates that it targets end users for data exploration and visualization. Importantly, it has not been designed to compete with the Report Designer or Report Builder authoring tools.
POWER VIEW EXAMPLE REPORT
This slide provides a preview of the visually impressive layout of a Power View report. It is in fact the report created in the hands-on lab. The amazing part about this report is the ease and speed at which it can be produced, and the supported interactivity.
Tabular BI Semantic Model Optimization
- Ordinarily, the tabular BI Semantic Model needs to be optimized for the Power View experience
- This is required to exploit the unique capabilities of the report authoring tool by supplying hints and directives
- Note: Optimizing a model for Power View may de-optimize it for OLAP clients
As described in this slide, the tabular BI Semantic Model will need to be optimized for use by Power View.
Tabular BI Semantic Model Optimization (CONTINUED)
- The following model resources are not available in the Power View Field List:
  - Hidden tables, columns and measures
  - Hierarchies
  - Implicit measures (defined in the PowerPivot Field List)
  - Key Performance Indicators (KPIs)
- Only the default perspective can be used
This slide details what will not be shown in the Power View Field List. Describe that an implicit measure is one created in the PowerPivot Field List in Excel. Emphasize that it is not a recommended way to create measures – use the explicit approach instead. This will be covered in more detail later in this presentation.
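The exclusions listed on this slide amount to a simple filter over the model's objects, sketched below in Python. The `ModelObject` type and the sample model are hypothetical; they are not part of any Analysis Services or Power View API, and the sketch treats visibility per object rather than per table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelObject:
    name: str
    kind: str               # "column", "measure", "hierarchy" or "kpi"
    hidden: bool = False
    implicit: bool = False  # only meaningful for measures


# Kinds the slide lists as unavailable in the Power View Field List.
UNAVAILABLE_KINDS = {"hierarchy", "kpi"}


def shown_in_field_list(obj):
    """Apply the exclusions listed on the slide to one model object."""
    if obj.hidden or obj.kind in UNAVAILABLE_KINDS:
        return False
    if obj.kind == "measure" and obj.implicit:
        return False
    return True


# Hypothetical model contents.
model = [
    ModelObject("Product Name", "column"),
    ModelObject("ProductKey", "column", hidden=True),
    ModelObject("Total Sales", "measure"),
    ModelObject("Sum of Quantity", "measure", implicit=True),
    ModelObject("Calendar", "hierarchy"),
    ModelObject("Sales Target", "kpi"),
]

field_list = [obj.name for obj in model if shown_in_field_list(obj)]
print(field_list)
```

Only the visible column and the explicit measure survive the filter, which is why the speaker notes recommend defining measures explicitly.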
Tabular BI Semantic Model Optimization (CONTINUED)
- The model can be optimized by:
  - Providing friendly names for tables, columns and measures
  - Hiding unnecessary tables, columns and measures
  - Setting appropriate formats for columns and measures
  - Providing descriptions for tables, columns and measures
    - These are surfaced as tooltips in the Field List
  - Adding columns that contain images (binary data)
    - Images can also be referenced by their URL
- There may be no need to define measures
This slide provides details about how a model can be optimized. The hands-on lab will provide deeper discussion on these optimizations and reporting properties.