Monday, February 22, 2016
The term analytics is often used interchangeably with:
– Data science
– Data mining
– Knowledge discovery
In essence, analytics means extracting useful business patterns or mathematical decision models from a preprocessed data set.
Analytics techniques come from a variety of disciplines:
– Statistics (e.g., regression)
– Machine learning (e.g., decision trees)
– Biology (e.g., neural networks, genetic algorithms)
Applications exist in numerous areas: retail, travel, health care, actuarial science, credit scoring, movies, sports, marketing, financial services, pharmaceuticals, telecommunications, etc.
1. In predictive analytics, a target variable is typically available. It can be categorical (e.g., churn or not, fraud or not) or continuous (e.g., customer lifetime value, loss given default).
2. In descriptive analytics, no such target variable is available. Clustering is one example.
Missing data values can occur for various reasons:
– The customer decides not to disclose income
– An error occurs while merging records because of typos in names
Popular schemes to deal with missing data:
– Replace: fill in the average or median, or predict the value with a regression on other variables (e.g., age, income)
– Delete: the simplest and most straightforward option; assumes no meaningful interpretation is lost
– Keep: missing data may itself be meaningful (e.g., the customer did not disclose income because he is currently unemployed)
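The three schemes above can be sketched in plain Python (a minimal illustration with hypothetical customer records; in practice a library such as pandas would typically be used):

```python
from statistics import median

# Hypothetical customer records; None marks a missing income.
customers = [
    {"name": "Ann", "age": 34, "income": 52000},
    {"name": "Bob", "age": 45, "income": None},
    {"name": "Cara", "age": 29, "income": 41000},
    {"name": "Dan", "age": 51, "income": 67000},
]

# Scheme 1 – replace: impute missing incomes with the median of observed ones.
observed = [c["income"] for c in customers if c["income"] is not None]
imputed = [dict(c, income=c["income"] if c["income"] is not None
                else median(observed)) for c in customers]

# Scheme 2 – delete: drop any record with a missing income.
complete = [c for c in customers if c["income"] is not None]

# Scheme 3 – keep: add an explicit indicator so "missing" stays visible
# to downstream models (missingness itself may be informative).
flagged = [dict(c, income_missing=c["income"] is None) for c in customers]
```

The regression-based variant of scheme 1 would fit income on the other variables (e.g., age) and predict the missing values instead of using a single constant.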
Two types of outliers can be considered:
– Valid observations (e.g., a salary of $2 million)
– Invalid observations (e.g., an age of 200 years)
Detection can be done statistically. A couple of treatment techniques:
– Trimming/truncating: remove the outliers
– Winsorising: clip the data back to lower and upper limits (e.g., median +/- 3 SD)
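Both treatments can be sketched as follows (a minimal illustration; the median +/- 3 SD limits follow the rule of thumb on the slide, and the data in the tests are hypothetical):

```python
from statistics import median, stdev

def trim(values, k=3.0):
    """Trimming/truncating: drop observations outside median +/- k SD."""
    m, s = median(values), stdev(values)
    return [v for v in values if m - k * s <= v <= m + k * s]

def winsorise(values, k=3.0):
    """Winsorising: clip observations back to median +/- k SD."""
    m, s = median(values), stdev(values)
    lo, hi = m - k * s, m + k * s
    return [min(max(v, lo), hi) for v in values]
```

Note that in small samples an extreme outlier inflates the standard deviation itself, so a tighter limit (smaller k) may be needed before the bounds actually bite.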
Regression – the target variable is continuous:
– Stock prices
– Loss given default (LGD)
– Customer lifetime value (CLV)
Classification – the target is categorical:
– Binary (fraud, churn, credit risk)
– Multiclass (predicting credit ratings)
Active churn – the customer stops the relationship with the firm:
– Contractual setting (e.g., cell phone service): easy to detect – the customer cancels the contract
– Noncontractual setting (e.g., grocery store): must be operationalized – e.g., the customer has not purchased any products in the last 3 months
Passive churn – decreasing product or service usage
Forced churn – the company stops the relationship
Expected churn – the customer no longer needs the product or service (e.g., baby products)
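The noncontractual operationalization ("no purchase in the last 3 months") can be sketched as a simple rule over a purchase log (hypothetical data; the 90-day window is an assumption standing in for "3 months"):

```python
from datetime import date, timedelta

def flag_churners(last_purchase, today, window_days=90):
    """Flag customers whose most recent purchase is older than the window.

    `last_purchase` maps a customer id to the date of that customer's
    most recent purchase (a hypothetical log format).
    """
    cutoff = today - timedelta(days=window_days)
    return {cid for cid, d in last_purchase.items() if d < cutoff}

# Hypothetical log: c1 bought recently, c2 has been silent since October.
last_purchase = {"c1": date(2016, 2, 1), "c2": date(2015, 10, 5)}
churned = flag_churners(last_purchase, today=date(2016, 2, 22))  # {"c2"}
```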
A decision tree is built by a recursive partitioning algorithm (RPA) that represents patterns in the underlying data set. Leaf/terminal nodes represent outcomes. Building a decision tree involves three decisions:
– Splitting: which variables to split on, and at what values?
– Stopping: when to stop growing the tree?
– Assignment: what class to assign to each leaf node?
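The splitting decision is usually scored with an impurity measure; Gini impurity is one common choice (an assumption here — the slide does not name a criterion). A candidate split is rated by how much it decreases impurity:

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum over classes of p_c^2."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# A perfect split of a balanced churn data set removes all impurity.
parent = ["churn"] * 4 + ["stay"] * 4
gain = split_gain(parent, ["churn"] * 4, ["stay"] * 4)  # 0.5
```

The tree grower evaluates such gains over candidate variables and split values, picks the best, and recurses until a stopping rule fires.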
Decision trees essentially model decision boundaries orthogonal to the axes
Decision trees can also be used for continuous targets (regression trees)
In contrast to predictive analytics, no real target variable is available. Descriptive analytics is sometimes called unsupervised learning since there is no target variable to steer the learning process.
Association rule mining typically begins with a database of transactions.
Association rules are stochastic in nature, with a statistical measure of the strength of the association. Rules measure correlation and should not be interpreted in a causal way. Examples:
– If a customer buys spaghetti, then the customer buys red wine in 70% of the cases
– If a customer visits web page A, then the customer will visit web page B in 90% of the cases
– If a customer has a car loan and car insurance, then the customer has a checking account in 80% of the cases
Suppose customer web page visits were logged:
– Session 1: A, B, C
– Session 2: B, C
– Session 3: A, C, D
– Session 4: A, B, D
– Session 5: D, C, A
Consider the sequence rule A -> C. Support and confidence can be measured in various ways.
Support:
– C follows A at any subsequent stage: 2/5
– C immediately follows A: 1/5
Confidence (given that A occurs):
– C follows A at any subsequent stage: 2/4
– C immediately follows A: 1/4
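The counts above can be reproduced directly from the five logged sessions (a minimal sketch):

```python
sessions = [
    ["A", "B", "C"],
    ["B", "C"],
    ["A", "C", "D"],
    ["A", "B", "D"],
    ["D", "C", "A"],
]

def follows(session, a, b):
    """True if b appears anywhere after the first occurrence of a."""
    return a in session and b in session[session.index(a) + 1:]

def follows_immediately(session, a, b):
    """True if b appears directly after a somewhere in the session."""
    return any(session[j] == a and session[j + 1] == b
               for j in range(len(session) - 1))

n = len(sessions)
n_a = sum("A" in s for s in sessions)                                # 4 sessions contain A
support_any = sum(follows(s, "A", "C") for s in sessions) / n        # 2/5
support_imm = sum(follows_immediately(s, "A", "C") for s in sessions) / n   # 1/5
conf_any = sum(follows(s, "A", "C") for s in sessions) / n_a         # 2/4
conf_imm = sum(follows_immediately(s, "A", "C") for s in sessions) / n_a    # 1/4
```

Note how session 5 (D, C, A) contains both A and C but does not support the rule, because C precedes A there.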
Divisive clustering starts with the entire data set in one cluster and breaks it up into smaller and smaller clusters until one observation per cluster remains (right to left below). Agglomerative clustering does the reverse: it starts with one observation per cluster and merges clusters until one big cluster is left (left to right).
The vertical lines on the dendrogram give the distance between the two clusters being amalgamated. The elbow point of a scree plot indicates the optimal number of clusters.
k-means is a non-hierarchical procedure:
1. Select k observations as initial cluster centroids (seeds)
2. Assign each observation to the cluster with the closest centroid
3. When all observations have been assigned, recalculate the positions of the k centroids
4. Repeat steps 2 and 3 until the cluster centroids no longer change
Notes: the number of clusters, k, must be specified before the procedure begins, and different seeds should be tried to verify the stability of the clustering solution.
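The four steps describe the classic k-means algorithm; a minimal sketch for 2-D points follows (random seed selection in step 1 and squared Euclidean distance are assumptions — the slide does not fix either choice):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means for a list of 2-D point tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # step 1: pick k seed observations
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Step 2: assign each observation to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p[0] - centroids[j][0]) ** 2
                                            + (p[1] - centroids[j][1]) ** 2)
            clusters[i].append(p)
        # Step 3: recalculate centroid positions (keep old one if a cluster empties).
        new = [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
               if c else centroids[j] for j, c in enumerate(clusters)]
        # Step 4: stop once the centroids no longer change.
        if new == centroids:
            break
        centroids = new
    return centroids, clusters
```

On two well-separated groups of points the centroids converge to the group means; on harder data, rerunning with different seeds (per the note above) checks that the solution is stable.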
Read Chapter 6 of your textbook. Work on the term project.