Published by Hillary Hodge. Modified over 9 years ago.
1
Beyond Opportunity; Enterprise Miner
  Ronalda Koster, Data Analyst
2
Agenda
  Introduction
  SAS EM at Dalhousie University
  Exploring SAS EM
  Discussion
3
Introduction
  Teaching Assistant with Dalhousie University
  Analyst, Precision BioLogic Inc.
  Consultant
4
Informatics at Dalhousie
  Informatics
    The study of the application of computer and statistical techniques to the management of information (HGSC glossary)
  Dalhousie University
    First marketing informatics MBA major in North America
    The first to use SAS EM for teaching purposes
    Health Informatics program
    New Bachelor of Informatics
  Success story
5
Other courses required for Informatics major
  Multivariate statistics
  Direct marketing
  Marketing research
  Marketing strategy
  Database design
  Internet marketing
6
Our students
  Work for:
    Small consulting companies
    Large financial institutions
    Not-for-profit organizations
    Telecommunications companies
    Insurance companies
    Hospitals
    Loyalty program companies
    Travel companies
    Oil and gas industry
    Publishing houses
  What they have in common: they all work with information
7
SEMMA Process
  Sample: Input, partition and sample data
  Explore: View distributions and associations
  Modify: Transform data, filter outliers, cluster to derive new variables
  Model: Develop models, e.g. decision trees and regression
  Assess: Assess models
8
Business Problem
  Have you ever wanted to understand things that occur together or in sequence?
  Market Basket Analysis: Association Node
  Broad applications
    Basket data analysis, cross-marketing, catalog design, campaign sales analysis
    Web log (click stream) analysis, DNA sequence analysis, etc.
9
Associations Node
  Support: probability that a transaction contains both X and Y
    Frequency with which the combination occurs
  Confidence: conditional probability that a transaction having X also contains Y
    Percentage of cases in which Y occurs, given that X has occurred
  Sequential Association: Y occurs some time period after X occurs
10
Associations Node
  Rule: Avocado (antecedent) => Steak (consequent)
  If a customer purchases avocado, then 80% of the time they will purchase steak
  8,000 transactions: 1,000 with avocados, 2,000 with steak, 800 with avocados and steak
  Confidence = 800 / 1,000 = 80%
  Support = 800 / 8,000 = 10%
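The arithmetic on this slide can be checked with a short Python sketch. It is not how the Association node is implemented; the function name `support_and_confidence` and the synthetic basket list are assumptions made for illustration, with the baskets constructed only to reproduce the slide's counts (8,000 transactions, 1,000 with avocado, 2,000 with steak, 800 with both).

```python
def support_and_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    Assumes at least one transaction contains the antecedent.
    """
    total = len(transactions)
    with_antecedent = sum(1 for t in transactions if antecedent in t)
    with_both = sum(
        1 for t in transactions if antecedent in t and consequent in t
    )
    support = with_both / total              # share of all baskets with both items
    confidence = with_both / with_antecedent  # share of antecedent baskets with the consequent
    return support, confidence


# Synthetic baskets (hypothetical data) built to match the slide's counts.
transactions = (
    [{"avocado", "steak"}] * 800   # both items
    + [{"avocado"}] * 200          # avocado only, so 1,000 avocado baskets in total
    + [{"steak"}] * 1200           # steak only, so 2,000 steak baskets in total
    + [set()] * 5800               # neither item
)

support, confidence = support_and_confidence(transactions, "avocado", "steak")
print(f"support = {support:.0%}, confidence = {confidence:.0%}")
```

Support is divided by all transactions, while confidence is divided only by the transactions that contain the antecedent, which is why the two figures differ.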
11
Business Problem
  Have you ever wanted to classify or segment data on the basis of similar attributes, so that each segment or cluster differs from the others and all objects within a cluster share traits?
  Segmentation: Clustering Node
  Broad applications
    Demographic / psychographic segmentation, campaign segmentation, etc.
12
Clustering Example
  Identify similar objects or groups that are dissimilar from other clusters through disjoint cluster analysis on the basis of Euclidean distances
  Profile clusters graphically within EM
  Use derived segments for further analysis / algorithms (as an input variable or a target)
  Customize clusters based on standardization method, clustering method and clustering criterion
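To make "disjoint cluster analysis on the basis of Euclidean distances" concrete, here is a minimal k-means-style sketch in plain Python. It is an illustration under stated assumptions, not the Clustering node's actual algorithm: the standardization choice (mean 0, standard deviation 1), the fixed iteration count, and the small customer table (age, annual spend) are all hypothetical.

```python
import math
import random


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # "or 1.0" avoids dividing by zero for a constant column.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(rows, k, iterations=20, seed=0):
    """Assign each row to exactly one of k clusters by Euclidean distance."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)  # start from k distinct observations
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest center wins.
        labels = [
            min(range(k), key=lambda j: math.dist(row, centers[j]))
            for row in rows
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(c) for c in zip(*members)]
    return labels


# Hypothetical customers: (age, annual spend).
customers = [(22, 150), (25, 180), (24, 160), (61, 900), (65, 950), (63, 880)]
labels = kmeans(standardize(customers), k=2)
print(labels)
```

The resulting labels could then be used as a derived segment variable, either as an input or as a target, in later analysis. Standardizing first keeps the spend column from dominating the distance simply because its values are larger.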
13
Business Problem
  Have you ever wanted to predict the likelihood of an event (and assign a cost to it)?
  Decision Tree Node
  Broad applications
    Classify observations, predict outcomes based on decision alternatives
14
Decision Tree Example
  A flow-chart-like tree structure
    Internal node denotes a test on an attribute
    Branch represents an outcome of the test
    Leaf nodes represent class labels or class distributions
  Handles missing data well
  Represents knowledge in the form of IF-THEN rules
  Decision tree generation consists of two phases
    Tree construction
      At the start, all the training examples are at the root
      Examples are partitioned recursively based on selected attributes
    Tree pruning
      Identify and remove branches that reflect noise or outliers
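The construction phase and the IF-THEN reading of a tree can be sketched in a few lines of Python. This is a simplified illustration, not the Decision Tree node's algorithm: the slide does not say how attributes are selected, so Gini impurity is an assumption here, pruning is left out, and the campaign data and every function name are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (attribute, value) equality test that lowers impurity most, or None."""
    best, best_score = None, gini(labels)
    for attr in rows[0]:
        for value in sorted({row[attr] for row in rows}):
            yes = [lab for row, lab in zip(rows, labels) if row[attr] == value]
            no = [lab for row, lab in zip(rows, labels) if row[attr] != value]
            if not yes or not no:
                continue  # the test does not separate anything
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if score < best_score:
                best, best_score = (attr, value), score
    return best


def build(rows, labels):
    """Start with all examples at the root and partition them recursively."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    attr, value = split
    yes = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
    no = [(r, l) for r, l in zip(rows, labels) if r[attr] != value]
    return {
        "test": split,
        "yes": build([r for r, _ in yes], [l for _, l in yes]),
        "no": build([r for r, _ in no], [l for _, l in no]),
    }


def predict(tree, row):
    """Follow the tests from the root down to a leaf."""
    while isinstance(tree, dict):
        attr, value = tree["test"]
        tree = tree["yes"] if row[attr] == value else tree["no"]
    return tree


def rules(tree, conditions=()):
    """Flatten the tree into one IF-THEN rule per leaf."""
    if not isinstance(tree, dict):
        return ["IF " + (" AND ".join(conditions) or "TRUE") + f" THEN {tree}"]
    attr, value = tree["test"]
    return (rules(tree["yes"], conditions + (f"{attr} = {value}",))
            + rules(tree["no"], conditions + (f"{attr} != {value}",)))


# Hypothetical campaign data.
rows = [
    {"member": "yes", "channel": "email"},
    {"member": "yes", "channel": "mail"},
    {"member": "no", "channel": "email"},
    {"member": "no", "channel": "mail"},
    {"member": "no", "channel": "mail"},
    {"member": "yes", "channel": "email"},
]
labels = ["buy", "buy", "buy", "no", "no", "buy"]

tree = build(rows, labels)
for rule in rules(tree):
    print(rule)
```

Each internal dictionary is a test on an attribute, each of its two entries is a branch, and each string at the bottom is a leaf's class label, which mirrors the structure described on the slide.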
15
Business Problem
  Have you ever wanted to make sure a campaign targets the people most likely to purchase, even though you have never contacted them before?
  Scoring Node
  Broad applications
    Testing model scalability, applying learning for subsequent events, etc.
16
EM Diagram
17
Lessons learned
  Data cleansing and transformation take most of the time
  Data analysis done using EM gives interpretable results
  Data modeling techniques are very robust
  SAS EM works well with huge datasets
  Knowledge obtained is transferred easily
  Learning never stops: EM reference, tutorial examples
  You can analyze almost any kind of data
  You can use SAS EM regardless of the industry and size of dataset
  You need: a good computer, SAS support, and patience
  While not all students use SAS in their careers, the analytical principles they learn are extremely useful for their careers
18
Discussion