Copyright 2002, ERP Data Mining & Knowledge Discovery webcast, searchsap.com, Sept 10

ERP Centric Data Mining and Knowledge Discovery
Naeem Hashmi, Chief Technology Officer, Information Frameworks
Web:
Webcast - searchsap.com, September 10, 2002

About the Speaker: Naeem Hashmi
Founder and CTO of Information Frameworks; an author, speaker, and world-renowned expert on emerging information architectures, integration, and business intelligence technologies.
- Author of the best-selling book SAP Business Information Warehouse for SAP
- Technical Editor, SAP BW Certification Guide, authored by Catherine Roze, 2002
- Contributing Author, SAP BW Handbook, 2002
- Member of Intelligent ERP magazine's board of editors
- Frequent speaker at IT industry conferences including SAP TechEd, ASUG, Oracle Open World, DCI, The ERP World, Data Mining and the Data Warehouse Institute
- 25+ years of experience in emerging information technology research, development, and management; information architectures; enterprise application integration; e-business; ERP applications; data warehousing; data mining; CRM; Internet, object, and client/server technologies; and strategic consulting
url:
Tel:

Agenda
- Data Mining and Knowledge Discovery Basics
- ERP Vendors and Data Mining Solutions
- Data Mining in SAP Business Information Warehouse
- Pros and Cons of ERP Centric Data Mining
- Q&A

What is Data Mining and Knowledge Discovery?
- Data mining is a tactical process that uses mathematical algorithms to sift through large data stores and extract data patterns, models, and rules.
- Knowledge discovery is the process of identifying and understanding potentially useful hidden anomalies, trends, and patterns.
- Data mining is an integral part of the knowledge discovery process.

Data Mining and Statistics?
Data mining sounds very similar to regression analysis, but its approach and purpose are quite different:
- Statistical methods test a hypothesis against a data set.
- Data mining starts from the data sets to construct a hypothesis.

Data Mining - Present State
[Chart: Application Domains]
Source:

Data Mining Methodologies
CRISP-DM: CRoss Industry Standard Process for Data Mining, a six-step process
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Source:

Data Mining Process
CRoss Industry Standard Process (CRISP) for Data Mining:
1. Business Understanding
2. Data Understanding
3. Data Preparation
4. Modeling
5. Evaluation
6. Deployment
Data Understanding and Data Preparation, working against the data warehouse, will initially take about 60% to 80% of the data mining project time.
Source:

Data Mining - Tools and Data Formats
Data formats used:
- Flat files: 57%
- Proprietary: 37%
- DBMS: 27%
Source:

Data Mining Technology
Techniques and their usage (discover, understand, predict):
- Visualization: use human pattern recognition capabilities
- Statistics: applying statistical techniques to predict
- Decision Trees: building scripts based on historic data
- Association Rules (Rule Induction): reasoning from specific facts to reach a hypothesis
- Clustering: finding and visualizing groups of facts that were not previously known
- Neural Networks: learning how to solve problems based on examples
- K-Nearest Neighbor: classification by looking at similar data
- Genetic Algorithms: survival of the fittest
- …
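
To make "classification by looking at similar data" concrete, here is a minimal k-nearest-neighbor sketch in plain Python. The customer features, labels, and the function name knn_classify are hypothetical and chosen only for illustration; they do not come from any product discussed in this webcast.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k closest training points."""
    # train is a list of (feature_vector, label) pairs
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical customers: (age, monthly spend) -> spending segment
train = [
    ((25, 40.0), "low"),
    ((30, 55.0), "low"),
    ((28, 60.0), "low"),
    ((45, 200.0), "high"),
    ((50, 220.0), "high"),
]

print(knn_classify(train, (48, 210.0)))
print(knn_classify(train, (27, 50.0)))
```

In practice the features would normally be scaled first, since an unscaled distance lets the column with the largest range dominate the vote.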

Data Mining Models
Two types of data mining models:
Prediction Models (prediction and classification)
- Predict numerical outcome: regression algorithms (neural networks, rule induction)
- Predict symbolic outcome: classification algorithms (CHAID, discriminant analysis)
Descriptive Models (grouping and associations)
- Clustering/grouping algorithms: K-means, Kohonen, factor analysis
- Association algorithms: Apriori, sequence

Traditional DM Vendors
- SPSS Clementine
- SAS Enterprise Miner
- IBM Intelligent Miner
- Salford CART/MARTS
- …more

Database Vendors - DM within the Products
- Data mining engine in Oracle 9i
  - Oracle 9i consists of three key products: Oracle9i Database, Oracle9i Application Server, Oracle9i Developer Suite
- IBM Intelligent Miner into DB2
- TeraMiner into Teradata
- Microsoft: SQL Server 2000
When DM functionality is implemented in a DBMS, you are limited to a specific database engine, which is not very flexible in a typical enterprise application landscape, where the environment is heterogeneous.

Data Mining Standards
- PMML: Predictive Model Markup Language
- OLE DB for Data Mining
- Java Data Mining API
- Other data exchange standards for analytics that need data mining extensions:
  - CWM: Common Warehouse Metamodel
  - XML/A: XML for Analytics
  - CPEX: Customer Profile EXchange
  - xCIL: Extensible Customer Information Language

Enterprise Applications Landscape
ERP solutions:
- Oracle
- PeopleSoft
- SAP
ERP vendors have extended the scope of their applications far beyond traditional ERP functions to a wide array of business solutions such as:
- Customer Relationship Management
- Business Intelligence
- Enterprise Portals
Solutions shown: Siebel; Oracle Business Intelligence Solution; PeopleSoft Enterprise Performance Management; SAP Business Information Warehouse

Oracle Business Intelligence Solution
Business processes (pre-built portlets):
- Response to Lead (27)
- Lead to Quote (56)
- Quote to Order (15)
- Order to Cash (34)
- Demand to Build (40)
- Procure to Pay (28)
- Revenue to Compensation (29)
- Expiration to Renewal (33)
- Issue to Resolution (51)
- HR Family (43)
Oracle 9i DM integration:
- Oracle Marketing Online for Campaign Management
- Oracle9iAS Personalization
- iStore
- more to come…
Oracle 9i Business Intelligence:
- Oracle9iDS Warehouse Builder
- Oracle9iAS Discoverer
- Oracle9iDS Reports
- Oracle9iAS Portal
- Oracle9iAS Clickstream Intelligence
- Oracle9iAS Personalization
- Oracle9i Data Mining
- Oracle9iDS Business Intelligence Beans
Source: Oracle

PeopleSoft Business Intelligence Solution
Enterprise Performance Management (EPM):
- Customer Profitability
- Finance
- Workforce Analytics
- Supply Chain Management Process
- Workforce Rewards
- Enrollment Management
- Retail Merchandise
- Project Analysis
- Student Administration
- Balanced Scorecard
- Employee Scorecard
- Customer Scorecard
- Vendor Scorecard
- CRM Prospect Analysis
- CRM Marketing Analysis
- CRM Sales Effectiveness
- CRM Service Effectiveness
Courtesy: eBusiness Advantage Inc.
Data mining capabilities:
- No word on PeopleSoft data mining tools or technologies for predictive analytics, whether home grown, acquired, or third-party products.
- No response from PeopleSoft contacts.

SAP Business Intelligence Solution
Business Information Warehouse: +420 InfoCubes, Queries, 90 ODS Objects
- SAP CRM: campaign management, opportunity analytics, customer behavior modeling
- SAP SCM: demand planning, spend optimization, SCOR KPIs
- SAP Financials, Human Capital Management, SEM: balanced scorecard, planning, economic profit, benchmarking, employee turnover & retention, corporate investment management
- SAP Portals: e-commerce analysis
- SAP Markets, Procurement: bidding, pattern-based offering, activity reporting, service analytics
Closed-loop platform capabilities:
- Drill-through (report-report i/f)
- Remote cubes (read through)
- Real-time data warehousing
- Data mining
- Write back to operational system
Source: SAP

CRM Vendors - Data Mining Integration
- Oracle CRM
  - Pre 9i: Darwin
  - Post 9i: ODM
- RightPoint and E.piphany
- SPSS and Siebel
- SAP CRM
  - Native data mining built into SAP BW, database independent
  - Interface to IBM Intelligent Miner, which interfaces with SAP BW
- PeopleSoft CRM
  - No official data mining product or vendor solution
  - Waiting for their response on what they have

SAP BW 3.0b Data Mining Implementation
- Currently for the Customer subject area
- Algorithms supported
  - Decision Trees
  - Scoring
  - Clustering/Segmentation
  - Association
- Data mining process
  - Model definition
  - Training the model
  - Performing prediction using the training results
  - Uploading the results back into BW
  - Utilizing the mining results (on the operational side)
  - SAPGUI is the interface to the data mining modeling and analysis
- No extensive data staging

Modeling a Decision Tree: Create a Mining Model
Screenshot callouts:
- Model columns
- Indicating the key column
- The nature of the column content
- Data type of the column
- Indicating the prediction column
- Specifying the column parameters
- Specifying the values, in case the original values in the column are to be treated differently
Source: SAP

Modeling a Decision Tree: Specify Model Parameters
Screenshot callouts:
- Use a portion (%) of the data for training, or the whole data set
- Size of the window (such as 10%)
- The number of repeats with different samples
- Stop training when the number of cases under the given node is less than or equal to the specified value
- Stop training when the accuracy is greater than or equal to the expected accuracy
- If the tree is too big, prune the tree without violating the expected accuracy
- Use the information gain threshold to check the relevance
Source: SAP
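
The information gain threshold and the minimum-cases stopping rule listed above can be sketched as follows. This is a generic entropy-based illustration in plain Python, not SAP BW's implementation; the function names, the default values for min_cases and gain_threshold, and the sample rows are all assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, column, target):
    """Entropy of the target minus its weighted entropy after splitting on `column`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[column], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def should_split(rows, column, target, min_cases=2, gain_threshold=0.1):
    """Apply two stopping rules: too few cases under the node, or too little gain."""
    if len(rows) <= min_cases:
        return False
    return information_gain(rows, column, target) >= gain_threshold

# Hypothetical customer records; "left" is the prediction column
rows = [
    {"profession": "LABOURER", "region": "N", "left": "yes"},
    {"profession": "LABOURER", "region": "S", "left": "yes"},
    {"profession": "CLERK", "region": "N", "left": "no"},
    {"profession": "CLERK", "region": "S", "left": "no"},
]

print(information_gain(rows, "profession", "left"))
print(information_gain(rows, "region", "left"))
print(should_split(rows, "profession", "left"))
```

In this toy data, profession separates the two outcomes completely while region carries no information, so only profession passes the relevance check.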

Modeling a Decision Tree: Create a Training Source and Map the Model Columns
Screenshot callouts:
- BW Query
- Runtime parameters for query
- Model columns
- Selected source columns
- Mapping between model column and source column
Source: SAP

SAP BW Data Mining - Process Steps
- Create a mining model
- Train the model
- Predictions using training results
- Using the data mining results against BW Query
Source: SAP

Viewing Decision Tree Training Results
Screenshot callouts:
- This decision tree predicts whether the customer has left or is still "on board"
- Chances of a customer leaving are 70.7% if the profession is "LABOURER"
- Chart shows the distribution at the selected node
- 28/41 customers are likely to leave
- 13/41 customers are likely to stay
- Out of a total of 705 cases, 41 cases are covered under this node
Source: SAP

Data Mining - Decision Trees
Results are uploaded into BW, then analyzed further in BEX.
Source: SAP

Data Mining - Association
- Create an association model
- Define model columns
- Train the model
- Predictions using training results
- Using the data mining results against BW Query
Source: SAP
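
As a rough illustration of what an association model measures, the sketch below computes support and confidence for one rule over a handful of hypothetical market baskets. It shows only the rule measures, not the Apriori search or SAP BW's own implementation, and every name and data value in it is an assumption.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # rule never applies in this data
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical market baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
```

A rule is usually kept only when both measures clear user-chosen minimums, which is the kind of setting a model definition would expose.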

Data Mining - Association
[Screenshot]
Source: SAP

Data Mining - Cluster Analysis
- Create a cluster model
- Train the model
- Predictions using training results
- Using the data mining results against BW Query
Source: SAP
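
For readers new to clustering, training a cluster model can be pictured with a small k-means loop. This is a generic plain-Python sketch with fixed starting centroids so the run is repeatable; it is not the algorithm SAP BW ships, and the points, names, and iteration count are assumptions.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it in place if the cluster is empty
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical two-dimensional customer measurements
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])

print(centroids)
print(clusters)
```

Assigning a new record to the nearest trained centroid corresponds to the prediction step in the list above.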

Viewing Cluster Analysis Results
[Screenshot with three numbered callouts]
Source: SAP

Viewing Cluster Analysis Results
Results are uploaded into BW, then analyzed further in BEX.
Source: SAP

SAP Data Mining
- Good attempt to implement a few data mining algorithms
- Very traditional data mining approach
- Requires a well-versed statistician or data mining expert to model and interpret the results
- Source: BEX Query, a big limitation in DM
- Weak visualization
- BEX for additional discovery: slicing and dicing

SAP BW - IBM Intelligent Miner
IBM Intelligent Miner is designed to:
- Copy data from SAP BW to IBM Intelligent Miner
  - Results of reports in BW (modeling in Business Explorer Analyzer)
  - Data direct from InfoCubes (for cross-selling analysis)
  - Descriptions, hierarchies
- Bring results data from IBM IM back into SAP BW
  - Results of segmentation can be loaded as master data or hierarchies
Data transport is designed through wizards in SAP BW:
  - Possible to get a good view of Intelligent Miner results from SAP BW

ERPs and Data Mining: The Good and the Bad News
Good news:
- Known business processes
- Few data sources
- Improved data quality
- Metadata integration
- Near real-time data mining
- Closed-loop knowledge discovery
- Consistent infrastructure
Bad news:
- Complex data structures
- Performance
- Availability
- Very few data mining algorithms today
CRISP-DM: 1. Business Understanding, 2. Data Understanding, 3. Data Preparation, 4. Modeling, 5. Evaluation, 6. Deployment

Data Mining Process and ERP Data Mining
Good news for future business applications
CRISP-DM steps: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Steps highlighted for ERP data mining: Business Understanding, Data Understanding, Data Preparation, Deployment
ERP-centric data mining will reduce data mining project time by up to 50%.
Source:

INFORMATION FRAMEWORKS

SOFTWARE AND SOLUTION VENDORS
- Technology/Solution Assessment
- Product Strategy
- Solution Strategy
- Product Positioning
- Competitive Analysis
- Software product architecture
- Marketing Strategy
- Product Performance and Benchmarking
- Consulting
- Hardware Configuration

INFORMATION TECHNOLOGY INVESTORS
- Market Research
- Market Assessment
- Competitive Analysis
- Technology due

KNOWLEDGE TRANSFER
- Seminars
- Webinars
- Keynotes
- Panel Moderator
- Publications
- Hands-on training
- Conferences

INFORMATION TECHNOLOGY ORGANIZATION
- Executive and Senior IT Management Consulting
- Enterprise Information Architectures (EIA): business case development, information architecture, application deployment architectures, implementation
- Legacy application migration strategies
- ERP application deployment strategies
- Enterprise Applications Integration (EAI): architectures, service modeling and design, EAI technology assessment
- Tools and technology assessment, vendor selection and assessment, conference room pilot implementation
- Business Intelligence and Portals: architectures, methodologies, tool/technology/vendor assessment and selection
- Data warehouse, data marts, analytics, information delivery deployment architectures
- Business intelligence and eBusiness integration architectures
- Portals: strategies, business case, assessment, architectures, modeling, planning, and knowledge transfer

Questions
Naeem Hashmi, Chief Technology Officer
September 10,
Web Site:
Tel: