THE BEGINNING.


1 THE BEGINNING

2 Learning Objective: Be prepared to use the Auto Numeric node in IBM SPSS Modeler. Having used the node, identify the best model. Be prepared to interpret the outputs of the CHAID, Regression, Linear, KNN (K Nearest Neighbors), and C&R (Classification and Regression) Tree models.

3 Creation of Value Through Exchange
Competition (Falling Prices)
Productivity Drivers
Mass Marketing
Customize, target low incidence, high value customers
Mass Customization
DATABASE
Marketer Pushes Product (Queries, Mail Merge)
Customer Pulls Product (Web App)
1 to 1
Segmentation & Targeting
Identifying Segments; Measuring Market Segment Value; Predicting Consumer Response (Models)
Cluster Analysis; Gain Scores; Lifetime Value Analysis
Non-Statistical; Statistical
Table Design
Relationship Editor
--Joins: 1 to 1, 1 to ∞, ∞ to ∞
Queries
--Select Query (Bring data from one or more tables into virtual table.)
● Sort
● And / Or Logic
--Inner / Outer Join
RFM (current customers)
Market Basket Analysis (Web, Directed Web, Apriori)
Data Attributes
--Data Types: Nominal, Ordinal, Interval, Ratio
--Data Attributes: Central Tendency, Spread
Relationship Tests
--Correlation: Pearson (Interval/Ratio), Spearman (Ordinal)
Difference Tests
--T-Test (Nominal/Interval or Ratio)
--Mann-Whitney U (Nominal/Ordinal)
--Chi Square (Nominal/Nominal or just Nominal)
Comprehensive Models
--CHAID
--Regression (simple & multiple)
--IBM Modeler Autonumeric (compares multiple models)

12 KNN Algorithm (K Nearest Neighbors): This procedure plots all observations in multidimensional space, here in three dimensions: volatile acidity, alcohol, and Award. Each wine is a point in this space. Placing the cursor on a dot shows the observation number and the value of its dependent variable. In this case, wine 1002 has a quality value of 5. The value is also shown by the darkness of the dot, as indicated by the key on the left of the chart.
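The idea behind the plot can be sketched in a few lines of Python. This is a minimal illustration, not the Modeler KNN node itself: the wine values, the function name `knn_predict`, and the choice of k are all made up for the example, and no feature scaling is applied (which a real analysis would normally want, since alcohol spans a much wider range than volatile acidity).

```python
import math

def knn_predict(train, query, k=3):
    """Predict a numeric target as the mean target of the k nearest cases.

    train: list of (features, target) pairs; query: a feature tuple.
    Distance is plain Euclidean distance on the raw (unscaled) features.
    """
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical wines: (volatile acidity, alcohol, Award) -> quality
wines = [
    ((0.30, 12.5, 1), 7),
    ((0.35, 12.0, 1), 6),
    ((0.70, 9.5, 0), 5),
    ((0.65, 9.8, 0), 5),
    ((0.50, 10.5, 0), 6),
]

# A new wine is scored from its two closest neighbors in the 3-D space.
prediction = knn_predict(wines, (0.32, 12.2, 1), k=2)
print(prediction)
```

The two nearest wines here are the two award winners, so the prediction is the average of their quality values.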

13 CHAID Output in Viewer

14 C&R Tree uses a Gini impurity measure (as CHAID uses chi-square) to split off branches that differentiate groups with different patterns of response. It always splits into two branches. Improvement measures how much purity increases (that is, how much impurity falls) following the split.
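To make the purity idea concrete, the sketch below computes Gini impurity for a parent node and the improvement from one binary split. It assumes the textbook definition (one minus the sum of squared class proportions) and a size-weighted average of the two branches; the response data and function names are hypothetical, and the figure Modeler reports may be scaled differently.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for more mixed nodes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def improvement(parent, left, right):
    """Drop in impurity when the parent is split into two branches."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# Hypothetical responses: a 50/50 parent split into two mostly pure branches.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

gain = improvement(parent, left, right)
print(round(gain, 4))
```

A split that leaves both branches as mixed as the parent gives an improvement of zero; the tree keeps the candidate split with the largest improvement.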

15 The Regression node functions the same way as it would in SPSS. It uses the Enter method, i.e., all of the predictors you list are entered into the equation directly rather than being selected stepwise. The outputs are interpreted in exactly the same way as they would be in SPSS.

16 Regression Prediction Calculator

17 The Linear node is a regression node that works with transformed data. In other words, the data is modified in certain ways (e.g., trimming outliers) before running the regression. While the printouts are somewhat different from those for the Regression node, they should be interpreted in the same way. The R-square value is given in the Accuracy pane, i.e., 36.5% of the variance is explained by this model. The coefficients and p-values are shown when you put the cursor over one of the lines. For nominal variables, a coefficient is given for every value except the baseline value. Here Award = 1 is the baseline, so no coefficient is given for that value. When Award = 0, multiply the given coefficient by 1.

18 Nominal & Ordinal Variables in Linear Node
If you have a nominal or ordinal variable in the Linear node, a coefficient will be reported for all values of the variable except one, the baseline value. Suppose you have a nominal ethnicity variable with the following values: 1 Hispanic, 2 Black, 3 Asian, 4 Caucasian, 5 Native American. The Linear node would report 4 coefficients, e.g., for values 1, 2, 3, and 4. At least one value will not be given, e.g., 5. The missing value or values are the baseline. To calculate the predicted value when ethnicity is Asian, use the coefficient for 3 Asian (with an X value of 1). If the ethnicity is Native American, calculate the predicted value from all coefficients except the ethnicity ones; this gives the predicted value for the baseline group.
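The calculation described above can be sketched as follows. All coefficient values are invented for illustration and are not taken from the Modeler output; the point is only that a category with a reported coefficient adds that coefficient (times 1), while the baseline category adds nothing.

```python
def predict(intercept, numeric_coefs, numeric_values, category_coefs, category):
    """Linear prediction with one dummy-coded nominal variable.

    Categories missing from category_coefs are treated as the baseline
    and contribute 0 to the prediction.
    """
    total = intercept
    for name, coef in numeric_coefs.items():
        total += coef * numeric_values[name]
    total += category_coefs.get(category, 0.0) * 1  # X = 1 for the case's own group
    return total

# Hypothetical coefficients; ethnicity 5 (Native American) is the baseline.
intercept = 2.0
numeric_coefs = {"Alcohol": 0.25, "Volatileacidity": -1.0}
ethnicity_coefs = {1: 0.5, 2: 0.125, 3: 0.25, 4: -0.25}

case = {"Alcohol": 10.0, "Volatileacidity": 0.5}

asian = predict(intercept, numeric_coefs, case, ethnicity_coefs, 3)
native_american = predict(intercept, numeric_coefs, case, ethnicity_coefs, 5)
print(asian, native_american)
```

The difference between the two predictions equals the coefficient for 3 Asian, which is how a dummy coefficient is read: the expected shift relative to the baseline group with the other predictors held fixed.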

19 We have a coefficient for all of the ethnic groups except group 5, Native Americans. To get the predicted Native American value, just use Intercept, Alcohol, and Volatileacidity.

20 THE END

