1 Business System Analysis & Decision Making - Lecture 14 Zhangxi Lin ISQS 5340 Summer II 2006.

Presentation transcript:

1 Business System Analysis & Decision Making - Lecture 14 Zhangxi Lin ISQS 5340 Summer II 2006

2 Chapter 12: Improving Decision Making
Outline of the chapter:
- Strategy 1: Acquiring Experience and Expertise
- Strategy 2: Debiasing Judgment
- Strategy 3: Analogical Reasoning
- Strategy 4: Taking an Outsider’s View
- Strategy 5: Using Linear Models and Other Statistical Techniques
- Strategy 6: Understanding Biases in Others

3 Decision Making in Sports
- Statistics has outperformed experts in predicting the outcomes of sports games.
- The Future of NBA Statistics: Part 1, Part 2
- Houston Rockets performance in 2006; Yao Ming’s statistics
Questions:
- Why did it take so long for rationality to enter into decision making in sports (baseball)?
- To what extent are managers in other industries still relying on false expertise when better strategies exist?

4 Experience vs. Expertise
- “Experience is a dear teacher” (Dawes 1988). “Learning from an experience of failure … is indeed ‘dear’, …”
- We need to realize the value of gaining a conceptual understanding of how to make a rational decision, rather than simply depending on the relatively mindless, passive learning obtained via experience.
- The final benefit of developing a strategic conceptualization of decision making concerns transferability: the ability to pass on the knowledge to future generations.
- The key element is to avoid the many biases in individual and group contexts.

5 Debiasing Judgment: Unfreezing, Change, Refreezing

6 Business Intelligence and Data Analysis

7 Adopting Business Intelligence
- Collecting data: database and data warehousing
- Using linear models: regression
- Using other statistical techniques: ANOVA, correlation analysis, time series analysis, etc.
- Applying data mining techniques
  - Classification
  - Clustering
  - Association analysis
  - Link analysis
  - Text mining
- Adopting new business intelligence ideas
  - Web mining
  - Six Sigma
  - Real-time advertising/marketing
  - Accurate marketing
  - Narrowcasting

8 A model of course contents (diagram labels: IT, Business Intelligence, Behavioral Biases, Models, Tools, Methods, Data, Decision Problems)

9 Business Intelligence (restate) Wikipedia.org’s definition: A broad category of applications and technologies for gathering, providing access to, and analyzing data for the purpose of helping enterprise users make better business decisions. The term implies having a comprehensive knowledge of all of the factors that affect your business. It is imperative that you have an in-depth knowledge of factors such as your customers, competitors, business partners, economic environment, and internal operations to make effective, good-quality business decisions. Business intelligence enables you to make these kinds of decisions.

10 Business Intelligence (restate) The Data Warehousing Institute’s definition: The processes, technologies, and tools needed to turn data into information, information into knowledge, and knowledge into plans that drive profitable business action. Business intelligence encompasses data warehousing, business analytic tools, and content/knowledge management.

11 Benefits for MBA Students in Business Intelligence
- Understand the growing trend of demand in data mining from industry
- Know the general concepts and ideas in data analysis
- Be able to manage data mining projects for businesses
  - Understand what technical people are doing
  - Understand the outcomes from data mining projects
- Catch the advanced business concepts, business processes and new working patterns

12 Sending Advertising Materials
- 100,000 customers
- Only 10% of them may be interested in life insurance
- Mailing an insurance advertising package costs $1 (material printing, stamp, processing, etc.)
- If someone purchases the insurance, the company will make $4 net profit. If a letter results in no purchase of the insurance package, the loss is $1.
Questions:
- What is the total profit if the ad is sent to all customers?
- How can we improve the efficiency of advertising and make a positive profit?
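The first question on this slide reduces to one line of arithmetic. A minimal sketch, assuming the same gain/loss convention the later "Total Benefit" slide uses ($4 gained per purchase, $1 lost per letter that produces no purchase); the variable names are illustrative and do not come from the lecture.

```python
# Figures from the slide; the names are illustrative only.
customers = 100_000
interest_rate = 0.10   # share of customers who would buy
gain_per_sale = 4      # net profit in dollars when a mailed customer buys
loss_per_miss = 1      # dollars lost on a letter that produces no sale

buyers = round(customers * interest_rate)
non_buyers = customers - buyers

# Mailing everyone: gains from buyers minus losses on everyone else.
net_mail_all = buyers * gain_per_sale - non_buyers * loss_per_miss
print(buyers, non_buyers, net_mail_all)  # 10000 90000 -50000
```

Under these assumptions, mailing everyone loses money, which is why the second question asks for a way to target the mailing.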

13 Data
What kind of data do we have now?
- Historical dataset. It shows the previous life insurance purchase history.
- Customers’ profile dataset. It contains customers’ properties and other information, but not whether they will purchase the life insurance.

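14 Case: Life Insurance Promotion
Income | Life insurance | Credit card insurance | Gender | Age
40-50,000 | No | … | Male | …
…,000 | Yes | … | Female | …
…,000 | No | … | Female | …
…,000 | No | Yes | Male | …
…,000 | Yes | … | Female | …
…,000 | No | … | Female | …
…,000 | Yes | … | Male | …
…,000 | No | … | Male | …
…,000 | No | … | Male | …
…,000 | Yes | No | Female | 41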

15 Customer Profiles Dataset No: Income Range Magazine Promo Life Ins Promo Credit Card Ins.SexAge ,000Yes?NoMale ,000Yes?NoFemale ,000No? Male ,000Yes? Male ,000Yes?NoFemale ,000No? Female ,000Yes? Male ,000No? Male ,000Yes?NoMale ,000Yes? Female ,000No?YesFemale ,000No?YesMale ,000Yes?NoFemale ,000No? Male ,000No? Female19

16 Performance Analysis
- Originally, 40% of customers purchased life insurance, i.e. P(“Life Ins”) = 0.4.
- We notice that 3 out of 5 females purchased life insurance, i.e. P(“Life Ins” | Female) = 3 / 5 = 0.6.
- 3 out of 4 customers who purchased credit card insurance also purchased life insurance, i.e. P(“Life Ins” | “Credit Ins”) = 3 / 4 = 0.75.
- There is a strong correlation between “Life Ins” and “Credit Ins”, and between “Life Ins” and “Female”.
- So we may send promotion packages to female customers or to those who purchase credit card insurance. This will improve the acceptance rate.
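The conditional probabilities on this slide can be checked by counting rows. The sketch below is illustrative: the gender and life insurance values follow the historical table, but several credit card insurance cells were lost from that table, so the values used here are an assumption chosen to match the counts the slides quote (4 credit card insurance buyers, 3 of whom bought life insurance). The helper `p_life` is a hypothetical name.

```python
# (gender, bought credit card insurance, bought life insurance) for the ten
# historical customers. The credit card column is an ASSUMPTION inferred from
# the counts quoted on the slides, not a verbatim copy of the table.
rows = [
    ("Male",   False, False),
    ("Female", True,  True),
    ("Female", False, False),
    ("Male",   True,  False),
    ("Female", True,  True),
    ("Female", False, False),
    ("Male",   True,  True),
    ("Male",   False, False),
    ("Male",   False, False),
    ("Female", False, True),
]

def p_life(condition=lambda row: True):
    """Estimate P(life insurance = yes | condition) by counting rows."""
    selected = [row for row in rows if condition(row)]
    return sum(row[2] for row in selected) / len(selected)

p_all = p_life()                                   # base rate
p_female = p_life(lambda row: row[0] == "Female")  # given Female
p_credit = p_life(lambda row: row[1])              # given credit card insurance
print(p_all, p_female, p_credit)  # 0.4 0.6 0.75
```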

17 Definitions
- If we send the life insurance promotion package to female customers, the acceptance rate is 0.6, which is called the accuracy rate.
- The strategy will likely improve the acceptance rate from the original 0.4 (based on all customers) to 0.6. The ratio of the two, 0.6 / 0.4 = 1.5, is called lift. A lift value greater than 1 indicates an improvement.
- However, one of the customers who purchased life insurance is male. He will be excluded from the promotion mailing list. Therefore, the rule “Female” covers only 3 out of 4 customers who purchase life insurance. The ratio “# of included targets” / “# of all targets”, i.e. 3 / 4 = 0.75 in this case, is called the coverage rate. A coverage rate less than 1 implies that some valuable customers are lost.
- To improve the accuracy of decision making, we may apply more than one criterion, e.g. “Female” plus “Credit Ins”.

18 Performance Evaluation (Rule: “Female”) Using a Confusion Matrix
                | Actual Accept | Actual Reject | Total
Computed Accept | True or 1: 3  | False or 0: 2 | 5
Computed Reject | False or 0: 1 | True or 1: 4  | 5
Accuracy = 3 / (2 + 3) = 0.6
Coverage = 3 / (3 + 1) = 0.75

19 Performance Evaluation (Rule: “Female”)
                | Actual Accept                 | Actual Reject                 | Total
Computed Accept | P(Actl A|Comp A) = 60% (3)    | P(Actl R|Comp A) = 40% (2)    | 5
Computed Reject | P(Actl A|Comp R) = 20% (1)    | P(Actl R|Comp R) = 80% (4)    | 5
Accuracy = 3 / (2 + 3) = 0.6
Coverage = 3 / (3 + 1) = 0.75
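The three measures used in this lecture can be read directly off the four cells of the confusion matrix for the rule “Female”. A minimal sketch with illustrative variable names; note that the lecture's "accuracy" corresponds to what machine learning texts usually call precision, and "coverage" to recall.

```python
# Cell counts for the rule "Female", as given on the slides.
tp = 3  # computed accept, actual accept
fp = 2  # computed accept, actual reject
fn = 1  # computed reject, actual accept
tn = 4  # computed reject, actual reject

total = tp + fp + fn + tn
accuracy = tp / (tp + fp)      # lecture's "accuracy": share of mailed customers who accept
coverage = tp / (tp + fn)      # lecture's "coverage": share of all accepters who are mailed
base_rate = (tp + fn) / total  # acceptance rate with no rule (0.4)
lift = accuracy / base_rate
print(accuracy, coverage, round(lift, 2))  # 0.6 0.75 1.5
```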

20 Decision Tree (1)
All customers: Total: 10, Accept: 4, Reject: 6, Accuracy: 40%, Coverage: 100%
Split on Gender:
  Female: Total: 5, Accept: 3, Reject: 2, Accuracy: 60%, Coverage: 75%
    Split on Credit Card Insurance:
      Yes: Total: 2, Accept: 2, Reject: 0, Accuracy: 100%, Coverage: 50%
      No: Total: 3, Accept: 1, Reject: 2, Accuracy: 33.3%, Coverage: 25%
  Male: Total: 5, Accept: 1, Reject: 4, Accuracy: 20%, Coverage: 25%

21 Decision Tree (2)
All customers: Total: 10, Accept: 4, Reject: 6, Accuracy: 40%, Coverage: 100%
Split on Gender:
  Female: Total: 4, Accept: 3, Reject: 1, Accuracy: 75%, Coverage: 75%
    Split on Credit Card Insurance:
      Yes: Total: 2, Accept: 2, Reject: 0, Accuracy: 100%, Coverage: 50%
      No: Total: 2, Accept: 1, Reject: 1, Accuracy: 50%, Coverage: 25%
  Male: Total: 6, Accept: 1, Reject: 5, Accuracy: 16.7%, Coverage: 25%
How does this decision tree differ from the last one?

22 Rules from the analysis
1. IF Sex = Female THEN Life Insurance Promotion = Yes
   Rule accuracy: 60%; rule coverage: 75%
2. IF Credit Card Insurance = Yes THEN Life Insurance Promotion = Yes
   Rule accuracy: 75%; rule coverage: 75%
3. IF Sex = Female & Credit Card Insurance = Yes THEN Life Insurance Promotion = Yes
   Rule accuracy: 100%; rule coverage: 50%

23 Total Benefit
- Rule 1: Gain: $4 * 3 = $12; Loss: $1 * 2 = $2; Net = $12 - $2 = $10
- Rule 2: Gain: $4 * 3 = $12; Loss: $1 * 1 = $1; Net = $12 - $1 = $11
- Rule 3: Gain: $4 * 2 = $8; Loss: $1 * 0 = $0; Net = $8
- No Rule: Gain: $4 * 4 = $16; Loss: $1 * 6 = $6; Net = $16 - $6 = $10
Conclusions:
- Choosing the best rule maximizes the profit.
- Sometimes “No Rule” can be better than some rules, depending on the number of instances a rule includes. So we need a rule with a greater coverage rate.
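The comparison on this slide can be reproduced with a small helper. This is a sketch only: `net_benefit` and the dictionary labels are illustrative names, while the accept/reject counts and the $4 / $1 figures are the ones on the slide.

```python
def net_benefit(accepts, rejects, gain_per_accept=4, loss_per_reject=1):
    """Net dollars from mailing a group with the given accept/reject counts."""
    return accepts * gain_per_accept - rejects * loss_per_reject

# (accepts, rejects) among the customers each rule would mail.
rules = {
    "Rule 1 (Female)": (3, 2),
    "Rule 2 (Credit card insurance)": (3, 1),
    "Rule 3 (Female & credit card insurance)": (2, 0),
    "No rule (mail everyone)": (4, 6),
}

benefits = {name: net_benefit(a, r) for name, (a, r) in rules.items()}
best = max(benefits, key=benefits.get)

for name, value in benefits.items():
    print(f"{name}: ${value}")
print("Best:", best)  # Best: Rule 2 (Credit card insurance)
```

Rule 3 has perfect accuracy but the lowest net benefit, because its low coverage leaves half of the buyers unmailed.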

24 Exercise 4
- 100,000 customers
- Only 10% of them may be interested in life insurance
- Mailing an insurance advertising package costs $1 (material printing, stamp, processing, etc.)
- If someone purchases the insurance, the company will make $4 net profit. If a letter results in no purchase of the insurance package, the loss is $1.
If there are three rules available to improve the accuracy of marketing, which one is the best? Calculate the total benefit based on each rule and provide your argument.
- Rule 1: picking out 20,000, 30% accuracy rate (6,000 / 10,000 = 60% coverage)
- Rule 2: picking out 30,000, lift = 2 (accuracy rate = 2 * 10% = 20%; 30,000 * 20% = 6,000; 6,000 / 10,000 = 60% coverage rate)
- Rule 3: picking out 10,000, 60% accuracy rate

- Rule 1: 30% accuracy rate, 60% coverage rate
- Rule 2: lift = 2, 65% coverage
- Rule 3: 60% accuracy rate, 50% coverage rate
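One way to approach the exercise is to turn each rule into counts of hits and misses and then apply the gain/loss figures. The sketch below covers only the first formulation of the rules (the one that states how many customers each rule picks out) and assumes the same convention as the "Total Benefit" slide; the function `evaluate` and its parameter names are hypothetical.

```python
def evaluate(picked, accuracy, buyers_total=10_000, gain=4, loss=1):
    """Hits, coverage and net benefit for a rule that mails `picked` customers."""
    hits = round(picked * accuracy)  # mailed customers who buy
    misses = picked - hits           # mailed customers who do not buy
    return {
        "hits": hits,
        "coverage": hits / buyers_total,
        "net": hits * gain - misses * loss,
    }

base_rate = 0.10
rule1 = evaluate(20_000, 0.30)
rule2 = evaluate(30_000, 2 * base_rate)  # lift = 2 means accuracy = 2 * base rate
rule3 = evaluate(10_000, 0.60)

for name, result in [("Rule 1", rule1), ("Rule 2", rule2), ("Rule 3", rule3)]:
    print(name, result)
```

Under these assumptions all three rules reach the same 6,000 buyers, so the difference in net benefit comes entirely from how many non-buyers each rule mails.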

25 What is Data Mining?
Many definitions:
- Non-trivial extraction of implicit, previously unknown and potentially useful information from data
- Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns

26 Origins of Data Mining
- Draws ideas from machine learning/AI, pattern recognition, statistics, and database systems
- Traditional techniques may be unsuitable due to:
  - Enormity of data
  - High dimensionality of data
  - Heterogeneous, distributed nature of data
(Diagram labels: Machine Learning/Pattern Recognition, Statistics/AI, Data Mining, Database systems)

27 Why Mine Data? Commercial Viewpoint
- Lots of data is being collected and warehoused
  - Web data, e-commerce
  - Purchases at department/grocery stores
  - Bank/credit card transactions
- Computers have become cheaper and more powerful
- Competitive pressure is strong
  - Provide better, customized services for an edge (e.g. in Customer Relationship Management)

28 Why Mine Data? Scientific Viewpoint
- Data collected and stored at enormous speeds (GB/hour)
  - remote sensors on a satellite
  - telescopes scanning the skies
  - microarrays generating gene expression data
  - scientific simulations generating terabytes of data
- Traditional techniques are infeasible for raw data
- Data mining may help scientists
  - in classifying and segmenting data
  - in hypothesis formation

29 Data Mining Tasks
- Prediction methods: use some variables to predict unknown or future values of other variables.
- Description methods: find human-interpretable patterns that describe the data.
From [Fayyad et al.] Advances in Knowledge Discovery and Data Mining, 1996

30 Data Mining Tasks...
- Classification [Predictive]
- Clustering [Descriptive]
- Association Rule Discovery [Descriptive]
- Sequential Pattern Discovery [Descriptive]
- Regression [Predictive]
- Deviation Detection [Predictive]

31 Using Data Mining Tools
- SAS (Statistical Analysis System): “SAS®9 is the most recent release of SAS. It delivers analytical, data manipulation and reporting capabilities within a completely new framework.”
- SPSS: “SPSS customers include telecommunications, banking, finance, insurance, healthcare, manufacturing, retail, consumer packaged goods, higher education, government, and market research.”
- Weka, an open source software product
- Microsoft SQL Server comes with major data mining utilities
- There are more.

32 SAS Data Mining Examples
- Credit Promotion Dataset: CreditProm
- German Credit Data
- Online SAS materials (PDF, 2.24MB)
- P70, dataset description
- P71, decision matrix

33 Life Insurance Promotion Data (more detailed) No: Income Range Magazine Promo Life Ins Promo Credit Card Ins.SexAge ,000YesNo Male ,000Yes NoFemale ,000No Male ,000YesNoYesMale ,000Yes NoFemale ,000No Female ,000Yes Male ,000No Male ,000YesNo Male ,000Yes Female ,000NoYes Female ,000NoYes Male ,000Yes NoFemale ,000No Male ,000No Female19