Data Preparation for Analytics Using SAS Gerhard Svolba, Ph.D. Reviewed by Madera Ebby, Ph.D.


1 Data Preparation for Analytics Using SAS Gerhard Svolba, Ph.D. Reviewed by Madera Ebby, Ph.D.

2 What is the purpose of this book?
• Introduces the reader to data preparation
• Explains why data preparation is not only important but a must prior to data analysis
• Covers the path from the data preparation process to data analytics

3 The Analysis Path: From raw data to results that can be implemented
Stages: Data Sources, Data Preparation, Analytic Modeling, Results and Actions
Topics shown in the diagram: different data sources; merges, denormalization; modeling, parameter estimation, tuning; usage of results; relational models, star schemas; derived variables; transpositions, aggregations; predictions, classifications or clustering; profiling; interpretations

4 The Analysis Path: From raw data to results that can be implemented
Data availability, adequate preparation, clever modeling, good results

5 Four Dimensions for Analytic Data Preparation
• Business and process knowledge
• Analytical knowledge
• Efficient SAS coding
• Documentation and maintenance

6 Business question: How did students who met the provincial standard in grade 3 perform in grade 6?
• Generates many other questions
• Requires working with people in other departments, such as IT, to carry out the data analytic process

7 Why is this author qualified or not qualified to address this topic?
• He is an experienced SAS user, as exemplified by the many macros in the book
• He addresses issues by presenting examples from different backgrounds

8 What are the strengths or weaknesses of this book?
• The book is written clearly and is easy to read
• It provides the reader with many code examples, together with their inputs and outputs

9 Would you recommend this book? If so, who would you recommend it to and for what purpose?
• Those who prepare data marts for statistics, data mining, or time series analyses
• Those who provide the data used in creating data marts (IT and data warehousing)
• Both new and experienced SAS users who perform data analyses using data marts
• Those who prepare data in relational databases with SQL

10 Does the book achieve its purpose?
Absolutely! It enables one to:
• Understand the business environment in which data preparation occurs
• Extract and structure your data
• Create derived variables from different tables
• Program SAS in an efficient way

11 What is the best tip or technique addressed in this book?
• There are many new techniques that I learnt from this book. For example:
• Examine the mean scores for math by board_mident

12 Continued…
/* Aggregate Math_score per board; one output row per board_mident */
proc means data=datalib.boards noprint nway;
  class board_mident;
  var Math_score;
  output out=datalib.aggr_static(drop=_type_ _freq_)
         mean= sum= n= std= min= max= / autoname;
run;

13 Continued…
• To run the analysis by board_mident, we use a CLASS statement. A BY statement could also be used, but the data would then have to be sorted by board_mident.
• NWAY suppresses the grand total mean and all other subtotals, so that the output data set contains only the rows for the 5 boards, which are the analysis subjects.
• NOPRINT suppresses the printed output, which can otherwise run to thousands of descriptive measures when there are many analysis subjects.
• In the OUTPUT statement we specify the statistics that will be calculated. The AUTONAME option creates the new variable names in the form VARIABLENAME_STATISTIC.
• If we want to calculate different statistics for different input variables, we can specify them in the OUTPUT statement, e.g. SUM(VARIABLE)=sum_variable.
• In the OUTPUT statement we drop the _TYPE_ and _FREQ_ variables, although we could keep _FREQ_ and omit N from the statistics list (the two agree only when the analysis variable has no missing values).
Source: Chapter 18, "Multiple Interval-Scaled Observations per Subject," page 183.
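The points above can be illustrated with a minimal sketch. It is not taken from the book: the sample data, the Reading_score variable, and the WORK data set names (boards, boards_sorted, aggr_custom, aggr_by) are all assumptions made for illustration. It shows statistics named per variable in the OUTPUT statement, and the BY-statement alternative with its required sort.

```sas
/* Hypothetical sample data: two boards, two students each.      */
/* Reading_score is an invented variable, missing for one row.   */
data work.boards;
  input board_mident $ Math_score Reading_score;
  datalines;
B01 2.9 3.1
B01 3.4 2.8
B02 3.0 3.3
B02 2.5 .
;
run;

/* Different statistics for different variables, each named      */
/* explicitly. _FREQ_ is kept here so it can be compared with N. */
proc means data=work.boards noprint nway;
  class board_mident;
  var Math_score Reading_score;
  output out=work.aggr_custom(drop=_type_)
         mean(Math_score)=math_mean
         sum(Reading_score)=reading_sum
         n(Reading_score)=reading_n;
run;

/* BY-statement alternative: the input must be sorted first.     */
proc sort data=work.boards out=work.boards_sorted;
  by board_mident;
run;

proc means data=work.boards_sorted noprint;
  by board_mident;
  var Math_score;
  output out=work.aggr_by(drop=_type_ _freq_) mean= / autoname;
run;
```

In this sketch, board B02 has one missing Reading_score, so reading_n counts only the nonmissing value while _FREQ_ counts both rows. This is why _FREQ_ can stand in for N only when the analysis variable has no missing values.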


15 Are there other books (or sources of information) available with similar content?
• Yes, but they tend to present bits and pieces of information
• E.g., resources on the internet
• The Little SAS Book by Delwiche and Slaughter
If so, how does this book compare?
• Comprehensive, well-illustrated presentation of the material

16 What will your SAS log look like?


