Monday, February 22, 2016
The term analytics is often used interchangeably with:
– Data science
– Data mining
– Knowledge discovery
In essence, analytics means extracting useful business patterns or mathematical decision models from a preprocessed data set.
Analytics techniques come from a variety of disciplines:
– Statistics (e.g., regression)
– Machine learning (e.g., decision trees)
– Biology (e.g., neural networks, genetic algorithms)
Applications exist in numerous areas: retail, travel, health care, actuarial science, credit scoring, movies, sports, marketing, financial services, pharmaceuticals, telecommunications, etc.
1. In predictive analytics, a target variable is typically available. It can be categorical (e.g., churn or not, fraud or not) or continuous (e.g., customer lifetime value, loss given default).
2. In descriptive analytics, no such target variable is available. Clustering is one example.
Missing data values can occur for various reasons:
– The customer decides not to disclose income
– An error occurs while merging records because of typos in names
Popular schemes to deal with missing data:
– Replace: fill in the average or median, or predict the value with a regression on other variables (e.g., age, income)
– Delete: the simplest and most straightforward option; assumes no meaningful interpretation is lost
– Keep: missing data may itself be meaningful (e.g., the customer did not disclose income because he is currently unemployed)
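The three schemes above can be sketched in plain Python (a minimal illustration with hypothetical customer records; in practice a library such as pandas would typically be used):

```python
from statistics import median

# Hypothetical customer records; None marks a missing income.
customers = [
    {"name": "Ann", "age": 34, "income": 52000},
    {"name": "Bob", "age": 45, "income": None},
    {"name": "Cara", "age": 29, "income": 41000},
    {"name": "Dan", "age": 51, "income": 67000},
]

# Scheme 1 – replace: impute missing incomes with the median of observed ones.
observed = [c["income"] for c in customers if c["income"] is not None]
imputed = [dict(c, income=c["income"] if c["income"] is not None
                else median(observed)) for c in customers]

# Scheme 2 – delete: drop any record with a missing income.
complete = [c for c in customers if c["income"] is not None]

# Scheme 3 – keep: add an explicit indicator so "missing" stays visible
# to downstream models (missingness itself may be informative).
flagged = [dict(c, income_missing=c["income"] is None) for c in customers]
```

The regression-based variant of scheme 1 would fit income on the other variables (e.g., age) and predict the missing values instead of using a single constant.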
Two types of outliers can be considered:
– Valid observations (e.g., a salary of $2 million)
– Invalid observations (e.g., an age of 200 years)
Detection can be done statistically. A couple of treatment techniques:
– Trimming/truncating: remove the outliers
– Winsorising: clip the data back to lower and upper limits (e.g., median +/- 3 SD)
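Both treatments can be sketched as follows (a minimal illustration; the median +/- 3 SD limits follow the rule of thumb on the slide, and the data in the tests are hypothetical):

```python
from statistics import median, stdev

def trim(values, k=3.0):
    """Trimming/truncating: drop observations outside median +/- k SD."""
    m, s = median(values), stdev(values)
    return [v for v in values if m - k * s <= v <= m + k * s]

def winsorise(values, k=3.0):
    """Winsorising: clip observations back to median +/- k SD."""
    m, s = median(values), stdev(values)
    lo, hi = m - k * s, m + k * s
    return [min(max(v, lo), hi) for v in values]
```

Note that in small samples an extreme outlier inflates the standard deviation itself, so a tighter limit (smaller k) may be needed before the bounds actually bite.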
Regression – the target variable is continuous:
– Stock prices
– Loss given default (LGD)
– Customer lifetime value (CLV)
Classification – the target is categorical:
– Binary (fraud, churn, credit risk)
– Multiclass (predicting credit ratings)
Active churn – the customer stops the relationship with the firm:
– Contractual setting (e.g., cell phone service): easy to detect – the customer cancels the contract
– Noncontractual setting (e.g., grocery store): must be operationalized – e.g., the customer has not purchased any products in the last 3 months
Passive churn – decreasing product or service usage
Forced churn – the company stops the relationship
Expected churn – the customer no longer needs the product or service (e.g., baby products)
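The noncontractual operationalization ("no purchase in the last 3 months") can be sketched as a simple rule over a purchase log (hypothetical data; the 90-day window is an assumption standing in for "3 months"):

```python
from datetime import date, timedelta

def flag_churners(last_purchase, today, window_days=90):
    """Flag customers whose most recent purchase is older than the window.

    `last_purchase` maps a customer id to the date of that customer's
    most recent purchase (a hypothetical log format).
    """
    cutoff = today - timedelta(days=window_days)
    return {cid for cid, d in last_purchase.items() if d < cutoff}

# Hypothetical log: c1 bought recently, c2 has been silent since October.
last_purchase = {"c1": date(2016, 2, 1), "c2": date(2015, 10, 5)}
churned = flag_churners(last_purchase, today=date(2016, 2, 22))  # {"c2"}
```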
A decision tree is built by a recursive partitioning algorithm (RPA) that represents patterns in the underlying data set. Leaf/terminal nodes represent outcomes. Building a decision tree involves three decisions:
– Splitting: which variables to split on, and at what values?
– Stopping: when to stop growing the tree?
– Assignment: what class to assign to each leaf node?
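The splitting decision is usually scored with an impurity measure; Gini impurity is one common choice (an assumption here — the slide does not name a criterion). A candidate split is rated by how much it decreases impurity:

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum over classes of p_c^2."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_gain(parent, left, right):
    """Impurity decrease achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# A perfect split of a balanced churn data set removes all impurity.
parent = ["churn"] * 4 + ["stay"] * 4
gain = split_gain(parent, ["churn"] * 4, ["stay"] * 4)  # 0.5
```

The tree grower evaluates such gains over candidate variables and split values, picks the best, and recurses until a stopping rule fires.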
Decision trees essentially model decision boundaries orthogonal to the axes
Decision trees can also be used for continuous targets (regression trees)
In contrast to predictive analytics, no real target variable is available. Descriptive analytics is sometimes called unsupervised learning since there is no target variable to steer the learning process.
Association rule mining typically begins with a database of transactions.
Association rules are stochastic in nature, with a statistical measure of the strength of the association. Rules measure correlation and should not be interpreted in a causal way. Examples:
– If a customer buys spaghetti, then the customer buys red wine in 70% of the cases
– If a customer visits web page A, then the customer will visit web page B in 90% of the cases
– If a customer has a car loan and car insurance, then the customer has a checking account in 80% of the cases
Suppose customer web page visits were logged:
– Session 1: A, B, C
– Session 2: B, C
– Session 3: A, C, D
– Session 4: A, B, D
– Session 5: D, C, A
Consider the sequence rule A -> C. Support and confidence can be measured in various ways.
Support:
– C follows A at any subsequent stage: 2/5
– C immediately follows A: 1/5
Confidence (given that A occurs):
– C follows A at any subsequent stage: 2/4
– C immediately follows A: 1/4
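The counts above can be reproduced directly from the five logged sessions (a minimal sketch):

```python
sessions = [
    ["A", "B", "C"],
    ["B", "C"],
    ["A", "C", "D"],
    ["A", "B", "D"],
    ["D", "C", "A"],
]

def follows(session, a, b):
    """True if b appears anywhere after the first occurrence of a."""
    return a in session and b in session[session.index(a) + 1:]

def follows_immediately(session, a, b):
    """True if b appears directly after a somewhere in the session."""
    return any(session[j] == a and session[j + 1] == b
               for j in range(len(session) - 1))

n = len(sessions)
n_a = sum("A" in s for s in sessions)                                # 4 sessions contain A
support_any = sum(follows(s, "A", "C") for s in sessions) / n        # 2/5
support_imm = sum(follows_immediately(s, "A", "C") for s in sessions) / n   # 1/5
conf_any = sum(follows(s, "A", "C") for s in sessions) / n_a         # 2/4
conf_imm = sum(follows_immediately(s, "A", "C") for s in sessions) / n_a    # 1/4
```

Note how session 5 (D, C, A) contains both A and C but does not support the rule, because C precedes A there.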
Divisive clustering starts with the entire data set in one cluster and breaks it up into smaller and smaller clusters until one observation per cluster remains (right to left below). Agglomerative clustering does the reverse: it starts with one observation per cluster and merges clusters until one big cluster is left (left to right).
The vertical lines on the dendrogram give the distance between the two clusters being amalgamated. The elbow point of a scree plot indicates the optimal number of clusters.
k-means is a non-hierarchical procedure:
1. Select k observations as initial cluster centroids (seeds)
2. Assign each observation to the cluster with the closest centroid
3. When all observations have been assigned, recalculate the positions of the k centroids
4. Repeat steps 2 and 3 until the cluster centroids no longer change
Notes: the number of clusters, k, must be specified before the procedure begins, and different seeds should be tried to verify the stability of the clustering solution.
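The four steps describe the classic k-means algorithm; a minimal sketch for 2-D points follows (random seed selection in step 1 and squared Euclidean distance are assumptions — the slide does not fix either choice):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal k-means for a list of 2-D point tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # step 1: pick k seed observations
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Step 2: assign each observation to the nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p[0] - centroids[j][0]) ** 2
                                            + (p[1] - centroids[j][1]) ** 2)
            clusters[i].append(p)
        # Step 3: recalculate centroid positions (keep old one if a cluster empties).
        new = [(sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
               if c else centroids[j] for j, c in enumerate(clusters)]
        # Step 4: stop once the centroids no longer change.
        if new == centroids:
            break
        centroids = new
    return centroids, clusters
```

On two well-separated groups of points the centroids converge to the group means; on harder data, rerunning with different seeds (per the note above) checks that the solution is stable.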
Read Chapter 6 of your textbook. Work on the term project.