Lecture 4 MARK2039 Winter 2006 George Brown College Wednesday 9-12.

Presentation transcript:

Lecture 4 MARK2039 Winter 2006 George Brown College Wednesday 9-12

2 Assignment 3-

3 Assignment 3 1) continued

4 Assignment 3 2) Listed below is a table containing a number of variables. Explain why each variable is useful or not useful in a future analysis.
Columns: Variable, # of records, Data Field Format, # of unique values, # of missing values
–1st 3 digits of postal code 100000 character 10000
–Household size 100000 numeric
–Credit score 100000 numeric
–Mortgage account 100000 character 20
–Product code 100000 character 50000
–Median Income of Postal Code of record 100000 numeric 200000

5 Assignment 3 2)continued

6 Recap- Data,Data,Data-What Phase of the Data Mining Process are we in? Data Formats? –Examples? Data Transformations? What do we mean here? Examples? In all data mining projects, what must the final data values be?

7 Recap- Data,Data,Data-What Phase of the Data Mining Process are we in? Data Types? What are they? What is discrete vs. index vs. continuous and how do they relate to Data Type. –Birthdate-Gender –Product category-Model rank –Spending percentile-Income –Promotion Date-Model Score

8 Recap Let’s take a look at postal code. How would you use the info here? Create binary variables for every postal code value. Is there another, better way to group?
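One way to approach the grouping question is to count how many binary (0/1) variables each level of postal code would produce. The sketch below is illustrative only: the sample postal codes and the helper names `group_key` and `indicator_columns` are assumptions, not lecture data.

```python
# Hypothetical sample of customer postal codes (not from the lecture).
postal_codes = ["M5V 2T6", "M5V 1J1", "M4C 1B5", "K1A 0B1", "H2X 1Y4", "H3Z 2Y7"]

def group_key(postal_code, level):
    """Return the grouping value for a postal code at the chosen level."""
    code = postal_code.replace(" ", "").upper()
    if level == "full":
        return code        # all six characters
    if level == "fsa":
        return code[:3]    # first three characters
    if level == "first_letter":
        return code[:1]    # first character only
    raise ValueError(f"unknown level: {level}")

def indicator_columns(codes, level):
    """Build one 0/1 column per distinct group value."""
    keys = [group_key(c, level) for c in codes]
    return {value: [1 if k == value else 0 for k in keys]
            for value in sorted(set(keys))}

for level in ("full", "fsa", "first_letter"):
    print(level, len(indicator_columns(postal_codes, level)))
```

Coarser grouping levels produce fewer binary variables, each covering more customers, which is usually what a response rate analysis needs.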

9 Types of Data Nominal Ordinal Interval Nominal is basically a yes/no variable, or a variable whose outcomes have no order or sense of magnitude –Derived variables are coded as 0,1. –Give me an example of this and how you would create a nominal variable for data mining. Assume you are analyzing response rate trends for customers.

10 Types of Data Ordinal –There is order to the values of the variable –Give me examples of this and what it would look like in a data mining exercise. Assume you are analyzing response rate trends for customers. Interval –There is a sense of magnitude between two values. –Give me examples of this and what it would look like in a data mining exercise. Assume you are analyzing response rate trends for customers. How do ordinal and interval differ? Explain it within the context of a data mining exercise where we analyze response.

11 Data Usefulness When is Data Useful? –Few Missing values –Variable does not consist primarily of one value –Non-Numeric Data consists of only a few values which can be properly grouped into more meaningful categories
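The usefulness checks above can be sketched as a small field profile. This is a minimal illustration: the function name `profile_field` and both thresholds (50% missing, 95% in one value) are assumptions chosen for the example, not rules from the lecture.

```python
from collections import Counter

def profile_field(values, max_missing_rate=0.5, max_dominant_share=0.95):
    """Summarize one field and flag whether it passes two basic checks.

    The thresholds are illustrative assumptions, not fixed rules.
    """
    n = len(values)
    present = [v for v in values if v not in (None, "")]
    missing_rate = (n - len(present)) / n if n else 1.0
    counts = Counter(present)
    # Share of non-missing records taken by the single most common value.
    dominant_share = counts.most_common(1)[0][1] / len(present) if present else 1.0
    return {
        "records": n,
        "unique_values": len(counts),
        "missing_rate": missing_rate,
        "dominant_share": dominant_share,
        "passes": missing_rate <= max_missing_rate
                  and dominant_share <= max_dominant_share,
    }

# Hypothetical fields: gender is well spread, mortgage flag is almost all "N".
gender = ["M", "F", "F", "M", None, "F"]
mortgage = ["N"] * 99 + ["Y"]
print(profile_field(gender))
print(profile_field(mortgage))
```

The `unique_values` count supports the third check: a non-numeric field with thousands of distinct values usually has to be grouped before it is useful.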

12 Examples-Analytical Perspective What fields are useful and why?

13 Examples Closer look at income Closer look at gender Create a data mining response rate trend with each variable For both variables, demonstrate how no response rate trend might exist.
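A response rate trend is simply the response rate computed within each value of a variable. The sketch below uses made-up campaign counts (an assumption, not lecture data) in which income shows a trend and gender shows none.

```python
# Hypothetical campaign results: (income band, gender, customers mailed, responders).
cells = [
    ("low", "M", 50, 1), ("low", "F", 50, 1),
    ("mid", "M", 50, 2), ("mid", "F", 50, 2),
    ("high", "M", 50, 4), ("high", "F", 50, 4),
]

# Expand the summary cells into one record per mailed customer.
records = []
for income, gender, mailed, responders in cells:
    for i in range(mailed):
        records.append({"income": income, "gender": gender,
                        "responded": 1 if i < responders else 0})

def response_rate_by_group(records, field):
    """Return {group value: responders / customers mailed} for one field."""
    mailed, responded = {}, {}
    for rec in records:
        key = rec[field]
        mailed[key] = mailed.get(key, 0) + 1
        responded[key] = responded.get(key, 0) + rec["responded"]
    return {key: responded[key] / mailed[key] for key in mailed}

income_rates = response_rate_by_group(records, "income")
gender_rates = response_rate_by_group(records, "gender")
print(income_rates)  # rate rises from low to high income: a trend
print(gender_rates)  # same rate for M and F: no trend
```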

14 Examples Closer Look at Customer Type Closer look at Product Type Create a data mining response rate trend with each variable For both variables, demonstrate how no response rate trend might exist.

15 Examples What variables would be useful here? What would be the number of unique values? What would some of these look like in a data mining response rate analysis exercise?

16 Examples What variables would be useful here? What would some of these look like in a data mining response rate analysis exercise?

17 Examples-Marketing Perspective A mortgage company is conducting a campaign to its high value customers. One of the key characteristics of value is high income which is self-reported at time of application. As a marketer, how will you use this information and what do you need to consider? What might the results be if you applied this learning to a marketing campaign.

18 Examples-Marketing Perspective An insurance company is marketing an insurance product to people over the age of 60. Listed below is a report indicating the distribution of age. As a marketer, how will you use this information? What might the results be if you applied this learning to a marketing campaign.

19 Examples-Marketing Perspective A retail company has over 1000 product SKUs. After investigation, it has been determined that the 1st digit represents a broader product category. You have been asked to design the product layout for all stores. As a marketer, how will you use this information?

20 Examples-Marketing Perspective What can be done here, if anything and what else can we consider in terms of using gender and income information? What might it look like in a data mining exercise?

21 Examples-Marketing Perspective You have postal code information for each customer. You are asked to design customer reports by province. How would you do this? What would this look like in a data mining response rate analysis exercise?
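One possible approach to the province report uses the first letter of the Canadian postal code, which identifies a province or territory (several letters can share one province). The sketch below is illustrative: the function names and sample codes are assumptions.

```python
from collections import Counter

# First letter of a Canadian postal code -> province/territory abbreviation.
FIRST_LETTER_TO_PROVINCE = {
    "A": "NL", "B": "NS", "C": "PE", "E": "NB",
    "G": "QC", "H": "QC", "J": "QC",
    "K": "ON", "L": "ON", "M": "ON", "N": "ON", "P": "ON",
    "R": "MB", "S": "SK", "T": "AB", "V": "BC",
    "X": "NT/NU",  # shared by Northwest Territories and Nunavut
    "Y": "YT",
}

def province_of(postal_code):
    """Map a postal code to a province; 'UNKNOWN' if blank or unrecognized."""
    code = postal_code.strip().upper()
    return FIRST_LETTER_TO_PROVINCE.get(code[:1], "UNKNOWN")

def customers_by_province(postal_codes):
    """Count customers per province for a simple report."""
    return Counter(province_of(code) for code in postal_codes)

# Hypothetical customer postal codes.
sample = ["M5V 2T6", "K1A 0B1", "H2X 1Y4", "V6B 1A1", "t2p 1j9", ""]
print(customers_by_province(sample))
```

The same province field can then be used as a grouping variable in a response rate analysis.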

22 Examples-Data Mining Perspective You have the following variables and values –Gender: ‘M’: Male ‘F’: Female –Income: ‘B’: <20 ‘F’: – ‘R’: 40-60 ‘S’: 60-80 ‘T’: ‘Z’: 100+ What must be done here? What would this look like in a data mining response rate analysis exercise?

23 Concepts Operational Database –Customer DB –Transactional DB Data Warehouse Data Mart Analysis Flat file vs. OLAP External Data Overlays –Postal Code Overlays –Survey/Registration Data

24 Databases Operational Databases vs. Data Warehouses vs. Data Marts vs. Analytical File Operational data consists of information from the source systems –Customer File –Transaction System –Finance System –Operations –Human Resources –Etc. In practice, what do you think an operations database is really dealing with?

25 Databases Data warehouse –Pulls elements and fields from each source system –May summarize/organize or aggregate information within each system to present the information in a more meaningful way –Warehouse can comprise information from disparate areas of the company –What do we mean by this?

26 Databases Data mart –Can in many cases be very similar to data warehouses in the way that information is summarized and aggregated –Pulls elements and fields from each source system –May summarize/organize or aggregate information within each system to present the information in a more meaningful way –Usually is focussed solely on one functional area of the company, e.g. a Marketing Data Mart. Let’s think of some information that might be contained in a data mart.

27 Databases [Diagram: source systems (Customer, Transaction, Finance, etc.) feed the Data Warehouse, which feeds the Data Marts (Marketing, Finance, etc.)]

28 Database Structure In Database Design, most databases are relational –Creates a key which becomes a database index –This index or key becomes the link between different files: Customer, Transaction, Promotion. Customer ID is the link between all the tables. Why do we need to think about the notion of relational?

29 Database Structure Relational DB –Database indexes allow very quick processing of data when joining and merging files together The key in all database design is to create a database that optimizes processing of all information. In database design, you want the right data to be stored which is useful from a data mining perspective From a marketing standpoint, can you think of some examples? Why is this important from a data mining standpoint?
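The idea of a key acting as an index can be sketched with a dictionary lookup: the index is built once, and each transaction then finds its customer directly rather than by scanning the whole customer file. The table contents and the function name `join_on_customer` are hypothetical.

```python
# Hypothetical customer and transaction tables sharing a Customer ID key.
customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
transactions = [
    {"cust_id": 1, "amount": 20.0},
    {"cust_id": 2, "amount": 35.0},
    {"cust_id": 1, "amount": 15.0},
    {"cust_id": 9, "amount": 5.0},   # no matching customer
]

# Build the index once: Customer ID -> customer row.
customer_index = {row["cust_id"]: row for row in customers}

def join_on_customer(transactions, customer_index):
    """Attach customer fields to each transaction via the Customer ID key."""
    joined = []
    for txn in transactions:
        customer = customer_index.get(txn["cust_id"])
        if customer is not None:      # inner join: skip unmatched transactions
            joined.append({**customer, **txn})
    return joined

joined = join_on_customer(transactions, customer_index)
print(joined)
```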

30 Database Structure Other approaches used in speeding up database processing –Inverted flat files This technology allows each field to be indexed Very common amongst the leading-edge DB suppliers today. Is much faster at processing data than traditional relational DB technology Again, why is this relevant from a data mining perspective?

31 Databases Analytical File For most data mining applications, your analytic file needs to be in the format of one record per customer with all known attributes. Generally, the database is not in that format. ECTL – extraction, clean, transform, load – is the process/methodology for preparing data for data mining. Typically a flat file is used for analysis. What do you think is the most important concept for data mining: Databases or Analytical File? How do they work together?

32 Databases File 1: –Cust ID –Income –Age –Household Size File 2: –Cust ID –Trans. Type –Trans Date –Trans Amt
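Turning these two files into an analytical file means rolling File 2 up to one row per customer and attaching the result to File 1. The sketch below is a minimal illustration: the sample rows and the summary fields (`trans_count`, `trans_total`, `last_trans_date`) are assumptions.

```python
file1 = [  # one row per customer
    {"cust_id": 101, "income": 55000, "age": 42, "household_size": 3},
    {"cust_id": 102, "income": 38000, "age": 29, "household_size": 1},
]
file2 = [  # many rows per customer; dates as ISO strings so they sort correctly
    {"cust_id": 101, "trans_type": "purchase", "trans_date": "2005-11-03", "trans_amt": 40.0},
    {"cust_id": 101, "trans_type": "purchase", "trans_date": "2006-01-15", "trans_amt": 60.0},
    {"cust_id": 102, "trans_type": "return", "trans_date": "2005-12-20", "trans_amt": -25.0},
]

def build_analytical_file(customers, transactions):
    """Return one row per customer with transaction summaries appended."""
    rows = {c["cust_id"]: {**c, "trans_count": 0, "trans_total": 0.0,
                           "last_trans_date": None}
            for c in customers}
    for t in transactions:
        row = rows.get(t["cust_id"])
        if row is None:
            continue  # transaction for a customer not in File 1
        row["trans_count"] += 1
        row["trans_total"] += t["trans_amt"]
        if row["last_trans_date"] is None or t["trans_date"] > row["last_trans_date"]:
            row["last_trans_date"] = t["trans_date"]
    return list(rows.values())

analytical_file = build_analytical_file(file1, file2)
for row in analytical_file:
    print(row)
```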

33 Databases In building databases, the notion of continuity management is important In the context of household or customers on a database, continuity management is the process by which you are able to track customers through events in time. Why is this important?

34 Analytical file All data mining algorithms want their input in tabular form – rows & columns as in a spreadsheet or database table Typically, if we saw data like this, what typically needs to be done? Assume reference number is the customer I.D. What does continuity mean here?

35 What the Data Should Look Like A customer “snapshot” = a single row Each row represents the customer and whatever might be useful for data mining

36 What the Data Should Look Like The columns –Contain data that describe aspects of the customer (e.g., sales $ and quantity for each of product A, B, C) –Contain the results of calculations referred to as derived variables (e.g., total sales $) Derived variables are Total Price in 1 st chart and # of months since last purchase in 2 nd chart
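Derived variables are calculated from fields already on the customer row. A minimal sketch of the two kinds mentioned above, total sales $ and number of months since last purchase, follows; the field names, sample values, and the whole-calendar-month convention are assumptions.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def add_derived(row, as_of):
    """Return a copy of the customer row with two derived variables added."""
    total_sales = row["sales_a"] + row["sales_b"] + row["sales_c"]
    return {
        **row,
        "total_sales": total_sales,
        "months_since_last_purchase": months_between(row["last_purchase"], as_of),
    }

# Hypothetical customer snapshot with sales $ for products A, B and C.
customer = {"cust_id": 1, "sales_a": 10.0, "sales_b": 0.0, "sales_c": 25.5,
            "last_purchase": date(2005, 10, 20)}
snapshot = add_derived(customer, as_of=date(2006, 2, 1))
print(snapshot)
```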

37 Sourcing the Data from External Data Sources Typical Data Sources - External Geo-demographic information –Statistics Canada (aggregated level data) Census data Taxfiler data Geo-demographic Cluster Codes –Generation 5 – Mosaic –Equifax -Psyte Survey Data –ICOM

38 Sourcing the Data (Extraction) Census data Data collected every 5 years. Enumeration Area level. ~ 250 households on average. ~ 440 households in large urban areas. ~125 households in rural areas. ~ 50,000 EA’s in Canada Can be converted to postal code level and appended to your file. Type of data -immigration/ethnicity/language patterns -occupation -education -income/gender/age/employment -religion

39 Sourcing the Data (Extraction) Taxfiler data. Data collected every year. Postal walk level. ~ 450 households on average. ~ 26,000 Postal Walks in Canada. ·Contains data from previous year tax returns. · Income by source and type. Employment, investment. · RRSP contributions and room. Etc. Can also be appended to your files at postal code level.

40 Sourcing the Data (Extraction) Geo-Demographic cluster codes. Uses Stats Can data in most cases plus other external data overlays to determine postal code cluster groups –Quebec farm families –Young and Struggling –Empty Nesters –Upper Income Family-Oriented Equifax –High credit risk –Medium credit Risk –Etc.

41 Sourcing the Data-Stats Can Type Table [Table: one row per postal area; columns Postal Area, Median Income, Avg. Age, Avg. Household Size, % French]
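Appending area-level data of this kind to a customer file is a lookup on the postal area. The sketch below is illustrative only: the area statistics are made-up numbers, the function name `append_overlay` is hypothetical, and the first three characters of the postal code are assumed to be the matching key.

```python
# Hypothetical area-level statistics keyed by postal area (values are made up).
area_stats = {
    "M5V": {"median_income": 62000, "avg_household_size": 1.8},
    "K1A": {"median_income": 71000, "avg_household_size": 2.4},
}

def append_overlay(customers, area_stats, key_len=3):
    """Append area-level fields to each customer; None where no area matches."""
    out = []
    for c in customers:
        area = c["postal_code"].replace(" ", "").upper()[:key_len]
        stats = area_stats.get(area, {})
        out.append({
            **c,
            "area_median_income": stats.get("median_income"),
            "area_avg_household_size": stats.get("avg_household_size"),
        })
    return out

customers = [
    {"cust_id": 1, "postal_code": "M5V 2T6"},
    {"cust_id": 2, "postal_code": "k1a 0b1"},
    {"cust_id": 3, "postal_code": "V6B 1A1"},  # no overlay data for this area
]
overlaid = append_overlay(customers, area_stats)
for row in overlaid:
    print(row)
```

Every customer in the same area receives the same values, so these fields describe the neighbourhood rather than the individual.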

42 Sourcing the Data (Extraction) Typical Data Sources - External Business to Business “Firmographics” –SIC, Number of Employees, Revenue etc. –Sources: D&B CBI / InfoCanada Scott’s

43 Sourcing the Data (Extraction) Typical Data Sources - Survey Attitudinal- Needs, preferences, social values, opinions Behavioral- Buying habits, lifestyle, brand usage  For most data mining projects, we want to assign a value to all customers; therefore the information used must be available for all customers –survey-based information generally cannot be used, as it typically can only be applied to a small portion of the database

44 Sourcing the Data (Extraction) Typical Data Sources - Survey ICOM –Surveys approx. 10MM Canadians –Fully updated every 2 years –Contains attitudinal and purchase behaviours across all industry sectors What do you think the value is here?

45 Examples A marketer wants to target high risk cancels for a retention campaign for a Telco. Information is contained in legacy database systems containing a customer file, transaction file, and call detail file. As a marketer and analyst, answer the following requirements –5 Key Data fields from above files that should be created in analytical exercise –Create a diagram or schema of how this data would be linked into an analytical file –What resources would you need and why? People Software

46 Examples How would the previous example change if the information was available in a data mart or warehouse

47 Examples A university is conducting a fund raising campaign to its alumni( members). On its database, it has the following information: –Age of alumnus –Year graduated –Degree and specialization –Donation value –Current Address It has also collected information from a survey. 10% of members have responded to the survey with the following %’s of members answering the following information: –Current Occupation-5% –Current Income-8% –Why they give?-7% –How much they give As a marketer and analyst, how would you use the information to conduct a campaign to its high value donors

48 Examples A computer company collects information from all customers who purchase a new product. This new product information is collected through a product registration form which the customer fills in at point of purchase. This information relates to the following: –Product preferences,Income,household size and hobbies All customer tombstone information as well as purchase information related to products bought has been summarized and stored onto a data mart. As a marketer and analyst, how would you use the information to develop a cross-sell campaign.

49 Examples A credit card company has customers containing tombstone information and detailed transactional information on their database customers have addresses. 10% of customers have responded to a survey in which 5% have indicated that they consider themselves loyal customers. Web activity of these loyal customers indicate that many of them have clicked on travel-related packages. As a marketer and analyst, how would you use this information to sell travel-related insurance.