Published by Fay Cunningham. Modified over 8 years ago.
Module 09: Introducing SSAS Data Mining Models
Microsoft Corporation

Disclaimer
The information contained in this slide deck represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication. This slide deck is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this slide deck may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this slide deck. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this slide deck does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, email address, logo, person, place or event is intended or should be inferred.
© 2008 Microsoft Corporation. All rights reserved. Microsoft, SQL Server, Office System, Visual Studio, SharePoint Server, Office PerformancePoint Server, .NET Framework, ProClarity Desktop Professional are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Where Are We?
Diagram labels: Data Sources, Staging Area, Manual Cleansing, Data Marts, Data Warehouse, Client Access

Module Overview
− Introducing Data Mining
− Integration with SQL Server 2008 Components
− Data Mining Programmability

Introducing Data Mining
− Purpose of Data Mining
− Business Scenarios
− SQL Server 2008 Data Mining
− Data Preparation
− Data Mining Process
− Data Mining Visualization

Purpose of Data Mining
− Addresses the problem of too much data and not enough information
− Enables data exploration, pattern discovery, and pattern prediction, which lead to knowledge discovery
− Forms a key part of a BI solution

Business Scenarios
− Identifying responsive and unresponsive customers (also known as churn analysis)
− Detecting fraud
− Targeting promotions
− Managing risk
− Forecasting sales
− Cross-selling
− Segmenting customers

SQL Server 2008 Data Mining
− Hides the complexity of an advanced technology
− Includes a full suite of algorithms to automatically extract information from data
− Handles large volumes of data and complex data
− Data can be sourced from relational and OLAP databases
− Uses standard programming interfaces
  − XMLA
  − DMX
− Delivers a complete framework for building and deploying intelligent applications

SQL Server 2008 Algorithms
− Decision Trees
  − The most popular data mining technique
  − Used for classification
− Clustering
  − Finds natural groupings inside data
− Sequence Clustering
  − Groups a sequence of discrete events into natural groups based on similarity
  − Use this algorithm to understand how visitors use your Web site
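
As an illustration of "finding natural groupings", the sketch below runs a small one-dimensional k-means loop in plain Python. It is a generic textbook technique chosen for brevity, not the implementation behind the Microsoft Clustering algorithm; the function name `k_means_1d`, the sample values, and the starting centers are all hypothetical.

```python
def k_means_1d(points, centers, iterations=10):
    """Tiny 1-D k-means: assign points to the nearest center, then re-center."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Index of the center closest to this point.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical customer attribute with two visible groups.
values = [18, 20, 22, 78, 80, 82]
centers, clusters = k_means_1d(values, centers=[0, 100])
print(centers)   # cluster centers after the final iteration
print(clusters)  # members of each group
```

With well-separated values like these, the loop settles after the first pass; real data usually needs several iterations and more than one attribute.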

SQL Server 2008 Algorithms
− Naïve Bayes
  − Used for classification in similar scenarios to Decision Trees
− Linear Regression
  − Finds the best possible straight line through a series of points
  − Used for prediction analysis
− Logistic Regression
  − Fits an S-shaped (logistic) curve, which is built on an exponential function
  − Used for prediction analysis
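
The two regression ideas above can be sketched in a few lines of Python. The example fits a least-squares straight line and evaluates the logistic function that gives logistic regression its S-shaped curve. The helper names `fit_line` and `logistic` and the sample points are assumptions for illustration only; this is not how SSAS implements either algorithm.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)

# The logistic curve passes through 0.5 at z = 0 and flattens toward 0 and 1.
print(logistic(0.0), logistic(4.0), logistic(-4.0))
```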

SQL Server 2008 Algorithms
− Association Rules
  − Supports market basket analysis to learn what products are purchased together
− Time Series
  − Forecasting algorithm used for short-term or long-term predictions of future values from a time series
  − Use multiple series to predict “what if” scenarios
− Neural Network
  − Used for classification and regression tasks
  − More sophisticated than Decision Trees and Naïve Bayes, this algorithm can explore extremely complex scenarios
  − Often challenging to configure, and its results can be hard to interpret
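
Market basket analysis rests on two standard measures, support and confidence. The sketch below computes both over a handful of made-up baskets; the function names and the product data are hypothetical, and the code shows the general idea rather than the Microsoft Association Rules implementation.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bike", "helmet"},
    {"bike", "helmet", "gloves"},
    {"bike"},
    {"gloves"},
]

rule_support = support(baskets, {"bike", "helmet"})
rule_confidence = confidence(baskets, {"bike"}, {"helmet"})
print(rule_support, rule_confidence)
```

A rule such as "bike implies helmet" is usually kept only when both measures clear thresholds chosen by the analyst.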

Data Preparation
− Significant effort is often required to prepare data for mining
  − Transforming data to clean and reformat it
  − Isolating and flagging abnormal data
  − Appropriately substituting missing values
  − Discretizing continuous values into ranges
  − Normalizing values between 0 and 1
− Other requirements
  − Clear business objectives
  − Adequate data
  − Consider all attributes that may be required as inputs for classification
  − For example, demographic data: Age, Gender, Region, etc.
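
Three of the preparation steps listed above (substituting missing values, discretizing into ranges, and normalizing between 0 and 1) can be sketched in plain Python. The function names, the mean-substitution rule, the range boundaries, and the sample ages are all assumptions made for the example; a real project would choose rules that fit its data.

```python
from statistics import mean


def fill_missing(values):
    """Replace None with the mean of the known values (one possible rule)."""
    known = [v for v in values if v is not None]
    fallback = mean(known)
    return [fallback if v is None else v for v in values]


def discretize(value, boundaries):
    """Return the index of the range a value falls into.

    `boundaries` must be sorted; index 0 means value < boundaries[0].
    """
    for i, upper in enumerate(boundaries):
        if value < upper:
            return i
    return len(boundaries)


def normalize(values):
    """Min-max scale values into the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, None, 40, 60]                            # hypothetical Age column
filled = fill_missing(ages)
buckets = [discretize(a, [30, 50]) for a in filled]  # ranges: <30, 30-49, 50+
scaled = normalize(filled)
print(filled, buckets, scaled)
```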

Data Mining Process
Phases: Design time, Process time, Query time
Diagram labels: Mining Model

Data Mining Process
Phases: Design time, Process time, Query time
Diagram labels: Mining Model, Training Data, Data Mining Engine

Data Mining Process
Phases: Design time, Process time, Query time
Diagram labels: Data Mining Engine, Data to Predict, Predicted Data, Mining Model
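
One way to read the three phases is: define the model (design time), train it on known data (process time), then apply it to new cases (query time). The Python sketch below mirrors that split with a deliberately simple frequency-based classifier. The class `TinyMiningModel`, its method names, and the sample rows are hypothetical and are not part of any SSAS API.

```python
from collections import Counter, defaultdict


class TinyMiningModel:
    def __init__(self, input_column, predict_column):
        # Design time: only the structure is defined; no patterns exist yet.
        self.input_column = input_column
        self.predict_column = predict_column
        self.patterns = None

    def process(self, training_rows):
        # Process time: scan the training data and store the patterns found.
        counts = defaultdict(Counter)
        for row in training_rows:
            counts[row[self.input_column]][row[self.predict_column]] += 1
        self.patterns = counts

    def predict(self, row):
        # Query time: apply the stored patterns to a new case.
        if self.patterns is None:
            raise RuntimeError("model has not been processed")
        seen = self.patterns.get(row[self.input_column])
        if not seen:
            return None  # input value never seen during training
        return seen.most_common(1)[0][0]


model = TinyMiningModel("Region", "BikeBuyer")   # design time
model.process([                                  # process time
    {"Region": "North", "BikeBuyer": "Yes"},
    {"Region": "North", "BikeBuyer": "Yes"},
    {"Region": "North", "BikeBuyer": "No"},
    {"Region": "South", "BikeBuyer": "No"},
])
print(model.predict({"Region": "North"}))        # query time
print(model.predict({"Region": "South"}))
```

Keeping training and prediction as separate steps means a model can be processed once and then queried many times.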

Data Mining Visualization
− In contrast to OLTP and OLAP queries, data mining queries typically extract previously unknown information
− Visualizations can effectively present data discoveries
− SQL Server 2008 provides algorithm-specific visualizations that you can use to
  − Test and explore models in Business Intelligence Development Studio
  − Embed into Windows Forms applications
− Developers can construct and plug in custom data mining viewers
− Three data mining add-ins are available for Office 2007
  − Table Analysis tools for Excel 2007
  − Data Mining Client tools for Excel 2007
  − Data Mining Templates for Visio 2007

Integration with SQL Server 2008 Components
− Integration with SSIS
− Integration with SSAS
− Integration with SSRS

Integration with SSIS
− Perform data mining directly in the control flow or the data flow pipeline
− Configure “intelligent” packages based on data mining query results
Note: Enterprise Edition only

Integration with SSAS
− Create data mining models directly from OLAP stores
− Create dimensions from data mining models to slice cubes using discovered patterns
  − Decision Trees
  − Clustering
  − Association Rules

Integration with SSRS
− Present data mining results in SSRS reports
  − Prediction queries
  − Content queries
  − Parameterized queries
− Use a data mining query builder to easily select results
− Apply grouping and aggregation to summarize results
− Distribute data mining results by using subscriptions

Data Mining Programmability
− SSAS Data Mining Programmability Overview
− Programming Interfaces
− Embedding SSAS Data Mining
− Extending SSAS Data Mining

SSAS Data Mining Programmability Overview
Diagram: Data Mining Interfaces
− Analysis Server: OLAP, Data Mining, Server ADOMD.NET, .NET Stored Procedures, Microsoft Algorithms, Third-Party Algorithms
− Transport: XMLA over TCP/IP, XMLA over HTTP (WAN)
− Client interfaces: OLE DB, ADO, ADOMD.NET, AMO
− Client applications: C++ App, VB App, .NET App, Any App (Any Platform, Any Device)

Programming Interfaces
− AMO (Analysis Management Objects)
  − Administer database objects
  − Apply security
  − Manage processing
− ADOMD.NET
  − Connect to SSAS databases
  − Retrieve and manipulate data
− Server ADOMD.NET
  − Extend DMX by using .NET stored procedures

Embedding SSAS Data Mining
− Validate or repair user entry
− Integrate predictions
  − Targeted advertising
  − “Those that bought this book also purchased these books”
− Embed custom visualizations into Windows Forms applications to allow users to explore and understand model patterns
Note: SSAS Data Mining ships with custom visualizations

Extending SSAS Data Mining
− Stored procedures
− Enhanced Visual Studio data mining tools
− Plug-in algorithms
− Plug-in data mining viewers
Classifying Customers Likely to Purchase a Bicycle

Microsoft BI Platform Integration – Data Mining
Diagram labels: Data Platform; Information Worker Platform; Excel / Visio Data Visualization; SharePoint Report Libraries; SSAS Data Mining Models; SSRS Data Mining Reports; Deployment / Management; Performance Management; Integrated BI Solution

Resources
− SQL Server 2008 Books Online
− www.microsoft.com/sql/technologies/dm
  − Links to technical resources, case studies, news, and reviews
− www.sqlserverdatamining.com
  − Site designed and maintained by the SQL Server Data Mining team
  − Live samples
  − Tutorials
  − Webcasts
  − Tips and tricks
  − FAQ
− Data Mining with SQL Server 2008, by Jamie MacLennan, ZhaoHui Tang, and Bogdan Crivat
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.