1 Exploring Data Mining Implementation By Karim Hirji, IBM Canada Chichang Jou, Tamkang University.


1 Exploring Data Mining Implementation By Karim Hirji, IBM Canada Chichang Jou, Tamkang University

2 Motivation
Traditional statistical techniques could not scale to millions of records and thousands of variables
Data mining emerged to handle the scalability issue
How should data mining be performed?
The study (2001) reports information and experience about the 5-stage data mining model proposed by Cabena et al.

3 5-Stage Data Mining Model (Cabena et al.)
1. Business objective determination
2. Data preparation
3. Data mining
4. Results analysis
5. Knowledge assimilation

4 Research Method
Case study (Benbasat et al.)
  – Concerned with the larger question of developing a deeper understanding of “how” data mining should be done
  – It was important to find a company willing to participate in the study and provide full access to the organization during the study's time frame
  – Multiple methods of data collection were used: archival records, documentation, interviews, observation

5 The Participating Company
TAKCO is a mature North American fast-food retailer
Its Canadian headquarters is in Toronto
Interesting aspects of the fast-food industry
  – Consumer-driven
  – Striving toward operational efficiency
  – Extensive marketing analysis

6 Data Collection
Direct observation was the primary data collection method
  – Comments recorded and probed
  – Notes reviewed after each site visit for content
  – Final data analysis performed after gathering qualitative data from all site visits, structured by comparison with the Cabena model stage by stage
  – 10 site visits in total, from 1998/07 to 1998/11

7 Members in the Data Mining Project
A data mining specialist
A project manager
A senior director of strategic planning (the executive sponsor)
A research supervisor
A business analyst
An end-user analyst
A data architect
A database administrator (DBA)
The executive sponsor and project manager decided that the entire team should be present during key project activities. Accordingly, the data mining activities were highly interactive.

8 Project Time Line
A 30-gigabyte enterprise-wide transaction data warehouse had been completed eight months earlier
Tool: IBM Intelligent Miner for Data
Functions: clustering, associations, prediction
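
As an illustration of the "associations" function listed above, the following is a minimal sketch of two standard association measures, support and confidence, computed in plain Python. It does not use or imitate the IBM Intelligent Miner for Data interface; the basket contents and the function names `pair_support` and `confidence` are hypothetical and are not data or code from the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical fast-food baskets, invented for illustration only.
baskets = [
    {"burger", "fries", "cola"},
    {"burger", "fries"},
    {"salad", "water"},
    {"burger", "cola"},
    {"fries", "cola"},
]


def pair_support(transactions):
    """Return the fraction of transactions that contain each item pair."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair has one canonical (alphabetical) key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: count / total for pair, count in counts.items()}


def confidence(transactions, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets containing `antecedent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


support = pair_support(baskets)
print(support[("burger", "fries")])
print(confidence(baskets, "burger", "fries"))
```

Support measures how common a product combination is overall, while confidence measures how often the second item accompanies the first.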

9 Project Outcome
The executive sponsor's assessment
  – Not a failure: completed on time and within budget
  – No completely new and unexpected results

10 First Visit
The executive sponsor and project manager discussed the final parameters of the DM project
  – Candidate business problems identified
  – How much historical transaction data to mine
  – Received a formal project number and budget

11 not to develop a production data mining application

12 Stage 1
A workshop was held in 1998/09 to identify the business problem to be mined
  – Team members introduced
  – Roles and responsibilities assigned
  – High-level project plan developed
Extensive discussion about the original candidate business problems
  – Immediate obstacle: ensuring that supporting data was readily available
  – Input from the data architect and DBA was invaluable
2 out of 3 original business problems were replaced due to data issues
  – The research supervisor played a dominant role in framing the business problems

13 Two Project Discontinuity Points in Stage 1
1. Anticipation
  – Expectations about the potential to deliver novel and interesting findings
  – Goal alignment was employed to provide focus and clarity
  – Emphasis placed on establishing and reaching consensus on a realistic, measurable, and achievable business goal and project goal
  – Agreed business goal: understanding DM technology benefits to enhance business decision making
  – Agreed project goal: demonstrating DM potential to provide new and valuable insights into a subset of existing production system data

14 Two Project Discontinuity Points in Stage 1
2. Anxiety/Apprehension
  – Concerns about the nature of the data preparation stage and the potential bias and noise in the data set
  – With data quality efforts in the data warehouse project, members were concerned about incorrect interpretation and improper transformation
  – A data audit stage was added after data preparation to demonstrate the validity, reliability, consistency, completeness, and integrity of the resulting transformed data set
      Minimizes the danger of automatically dismissing potentially anomalous and relevant DM results
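
The slides do not say which checks the data audit stage ran. The sketch below shows, under assumed field names (`store`, `amount`, `region`) and an assumed function name `audit`, how completeness, consistency, and validity checks on a transformed data set might look in plain Python. All records are invented for illustration.

```python
def audit(source_rows, transformed_rows, required_fields):
    """Return simple audit findings comparing a transformed data set to its source."""
    findings = {}

    # Completeness: no rows lost or duplicated during transformation.
    findings["row_count_matches"] = len(source_rows) == len(transformed_rows)

    # Completeness: count required fields that are missing or empty.
    findings["missing_values"] = sum(
        1
        for row in transformed_rows
        for field in required_fields
        if row.get(field) in (None, "")
    )

    # Consistency: the transaction total should survive the transformation.
    source_total = sum(row["amount"] for row in source_rows)
    output_total = sum(
        row["amount"] for row in transformed_rows if row.get("amount") is not None
    )
    findings["amount_total_matches"] = abs(source_total - output_total) < 1e-9

    # Validity: transaction amounts should not be negative.
    findings["invalid_amounts"] = sum(
        1
        for row in transformed_rows
        if row.get("amount") is not None and row["amount"] < 0
    )
    return findings


# Hypothetical records, not taken from the study.
source = [
    {"store": "S1", "amount": 5.0},
    {"store": "S2", "amount": 7.5},
]
transformed = [
    {"store": "S1", "amount": 5.0, "region": "East"},
    {"store": "S2", "amount": 7.5, "region": ""},
]

report = audit(source, transformed, ["store", "amount", "region"])
print(report)
```

A report of this kind gives the team evidence about the transformed data before mining starts, so that a surprising result is less likely to be written off as a data error.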

15 One Project Discontinuity Point in Interactive Data Mining and Results Analysis Stage
3. Frustration
  – OLAP already used extensively to gain knowledge about product offerings and fast-food customer profiles
  – “I already know that” comment
  – Back-end data mining, involving data enrichment and additional DM algorithm execution, introduced to increase the dimensionality of the data set with 3rd-party demographic data
      Effective in providing different and interesting analysis results
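
Data enrichment of the kind described above can be pictured as a left join between transaction records and an external demographic table. The following is a minimal sketch only: the join key `area`, the demographic fields, the function name `enrich`, and all values are assumptions made for illustration, not details from the study.

```python
def enrich(transactions, demographics, key="area"):
    """Left-join each transaction with demographic attributes for its area."""
    enriched = []
    for row in transactions:
        # Rows without a demographic match are kept unchanged.
        extra = demographics.get(row[key], {})
        enriched.append({**row, **extra})
    return enriched


# Hypothetical transaction records and third-party demographic data.
transactions = [
    {"store": "S1", "area": "A1", "item": "burger"},
    {"store": "S2", "area": "B2", "item": "salad"},
]
demographics = {
    "A1": {"median_age": 34, "household_size": 2.1},
}

enriched = enrich(transactions, demographics)
print(enriched[0])
print(enriched[1])
```

Each matched row gains extra attributes, which is what increases the dimensionality of the data set available to later mining runs.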

16 Implications and Discussion
1. A DM project appears to follow a more elaborate set of stages than previously reported
2. Unlike other work, data preparation in this study was not the most resource-intensive stage
3. Several important process aspects are relevant for the Interactive DM and Results Analysis stage
  – A DM briefing would have made the stage more efficient and effective, avoiding “I don’t understand what this means” comments
  – The DM specialist worked as a facilitator
  – Linking DM results with business strategy and using application software to perform sensitivity analysis
      Product combinations
      Importance of contextualizing the stage with business strategy
      Use of a spreadsheet to perform sensitivity analysis
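
The spreadsheet-style sensitivity analysis mentioned above can be sketched as a small what-if table for one product combination. The revenue formula, the function names `combo_revenue` and `sensitivity_table`, and every number below are assumptions chosen for illustration; the slides do not describe the model the team used.

```python
def combo_revenue(price, baseline_units, uptake_change):
    """Revenue for a product combination under a relative change in units sold."""
    return price * baseline_units * (1 + uptake_change)


def sensitivity_table(prices, uptake_changes, baseline_units):
    """Build a what-if grid of revenue keyed by (price, uptake change)."""
    return {
        (price, change): round(combo_revenue(price, baseline_units, change), 2)
        for price in prices
        for change in uptake_changes
    }


# Hypothetical inputs: two candidate prices and three uptake scenarios.
table = sensitivity_table(
    prices=[5.0, 6.0],
    uptake_changes=[-0.1, 0.0, 0.1],
    baseline_units=1000,
)

for (price, change), revenue in sorted(table.items()):
    print(f"price={price:.2f} uptake_change={change:+.0%} revenue={revenue:.2f}")
```

Laying the scenarios out as a grid mirrors what a spreadsheet does, and lets a business analyst see how strongly the outcome depends on each assumption.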