STA 511 Statistical Computing

1 STA 511 Statistical Computing
Changxing Ma Homepage:

2 Statistical Computing
What skills do we need for “biostatistical consulting and projects”?
Biostatistics
SAS (or STATA, SPSS, etc.)
Microsoft Office (or LaTeX for mathematicians/statisticians)
Google

3 Statistical Computing
What skills do we need for research and dissertation work?
Biostatistics
One language: Fortran, C, C++ (optional)
Matrix language: Matlab, R
LaTeX
(Calculus, algebra) Maple/Matlab

4 Contents
LaTeX (Microsoft Word)
SAS basics
Advanced SAS: SAS Macro, SAS SQL, SAS IML
Matlab/R
Matlab (Maple): symbolic calculation

5 Example 1 - SAS Stepwise regression
y=f(x1, x2, … xs)
Stepwise regression is a technique for choosing the variables (i.e., terms) to include in a multiple regression model. Forward stepwise regression starts with no model terms. At each step it adds the most statistically significant term (the one with the highest F statistic or lowest p-value) until no significant terms remain outside the model. Backward stepwise regression starts with all the terms in the model and removes the least significant terms until all the remaining terms are statistically significant. It is also possible to start with a subset of all the terms and then add significant terms or remove insignificant terms.

6 Example 1 - SAS Stepwise regression
The stepwise method is a modification of the forward-selection technique and differs in that variables already in the model do not necessarily stay there. As in the forward-selection method, variables are added one by one to the model, and the F statistic for a variable to be added must be significant at the SLENTRY= level. After a variable is added, however, the stepwise method looks at all the variables already included in the model and deletes any variable that does not produce an F statistic significant at the SLSTAY= level. Only after this check is made and the necessary deletions accomplished can another variable be added to the model. The stepwise process ends when none of the variables outside the model has an F statistic significant at the SLENTRY= level and every variable in the model is significant at the SLSTAY= level, or when the variable to be added to the model is the one just deleted from it.
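
The add-then-recheck loop described above can be sketched in a few lines. This is only an illustration in plain Python, not SAS's implementation: the function names (`solve`, `rss`, `partial_f`, `stepwise`) are hypothetical, and fixed F-statistic cutoffs `f_enter` and `f_stay` are assumed as stand-ins for the SLENTRY= and SLSTAY= significance levels, which avoids needing an F distribution routine.

```python
# Illustrative sketch, not SAS's implementation. The cutoffs f_enter and f_stay
# are hypothetical F-statistic thresholds standing in for SLENTRY= and SLSTAY=.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(cols, y):
    """Residual sum of squares from regressing y on an intercept plus cols."""
    design = [[1.0] * len(y)] + cols
    xtx = [[sum(u * v for u, v in zip(a, b)) for b in design] for a in design]
    xty = [sum(u * v for u, v in zip(a, y)) for a in design]
    beta = solve(xtx, xty)
    fitted = [sum(bj * col[i] for bj, col in zip(beta, design))
              for i in range(len(y))]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_f(small, big, y, data):
    """F statistic for the one extra term that `big` has over `small`."""
    rss_small = rss([data[v] for v in small], y)
    rss_big = rss([data[v] for v in big], y)
    df = len(y) - len(big) - 1          # residual df of the larger model
    return (rss_small - rss_big) / (rss_big / df)


def stepwise(data, y, f_enter=4.0, f_stay=4.0):
    """data maps variable name -> list of values; returns the selected names."""
    model = []
    for _ in range(2 * len(data) + 1):  # guard against cycling
        # Forward step: best candidate outside the model by F-to-enter.
        outside = [v for v in data if v not in model]
        scores = {v: partial_f(model, model + [v], y, data) for v in outside}
        best = max(scores, key=scores.get, default=None)
        if best is None or scores[best] < f_enter:
            break
        model.append(best)
        # Backward check: drop variables that no longer meet F-to-stay.
        while len(model) > 1:
            stay = {v: partial_f([u for u in model if u != v], model, y, data)
                    for v in model}
            worst = min(stay, key=stay.get)
            if stay[worst] >= f_stay:
                break
            model.remove(worst)
    return model
```

Calling `stepwise({"x1": [...], "x2": [...]}, y)` returns the variable names in the order they entered. The sketch assumes the candidate columns are not collinear and that no model fits the data exactly; a real analysis would use p-values, as PROC REG does.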

7 Example 1 - SAS Stepwise regression
proc reg data=yourdata;
  model y=x1-x10 / selection=stepwise SLENTRY= SLSTAY=0.15;
run;

8 Example 1 - SAS Stepwise logistic regression
proc logistic data=yourdata;
  model y=x1-x10 / selection=stepwise SLENTRY= SLSTAY=0.15;
run;
SELECTION=BACKWARD | B
          | FORWARD | F
          | NONE | N
          | STEPWISE | S

9 Example 1 - SAS Stepwise Generalized Linear Model
proc GENMOD data=yourdata;
  model y=x1-x10 / LINK = LOG selection=stepwise SLENTRY= SLSTAY=0.15;
run;
HOW? PROC GENMOD does not offer stepwise selection, so the options are:
Wait for the next SAS release and hope that it adds stepwise selection
Do the selection manually using PROC GENMOD (what most people do)
Write your own SAS code for stepwise selection using SAS MACRO

10 Rule to use MACRO
Basically, any repeated or partially repeated job should be automated with a MACRO. Most jobs are partially repeated, and a job can usually be split into partially repeated parts. Using macros will save you a lot of TIME & $$$.

11 STA Advanced SAS
SAS Macro
SAS IML
SAS SQL

12 Example 2 - Math A = determinant of A: Det (A) ? Inverse (A) ?
Any solution?

13 Example 2 - Math
Any solution? Very simple:
If you are good enough at algebra, you know the answer
If you have just learned algebra, you should know it
If you have a good memory, you should remember it from the algebra you learned years ago
Check an algebra or matrix textbook
As for me, I am too lazy to check a textbook. Then

14 Maple/Matlab
Define the matrix (Matlab mupad)
x:=matrix([[1, r, r^2, r^3, r^4],
           [r, 1, r, r^2, r^3],
           [r^2, r, 1, r, r^2],
           [r^3, r^2, r, 1, r],
           [r^4, r^3, r^2, r, 1]])
Display it

15 Matlab
Inverse of x
1/x
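
The slide's output is not preserved in this transcript. For reference, the inverse of the 5×5 matrix x defined on slide 14 (entries r^|i-j|) has a well-known tridiagonal form, written out below. This is a standard algebra result added here as a worked example, not a copy of the MuPAD output; it assumes r^2 ≠ 1.

```latex
x^{-1} = \frac{1}{1-r^{2}}
\begin{pmatrix}
 1 & -r      & 0       & 0       & 0  \\
-r & 1+r^{2} & -r      & 0       & 0  \\
 0 & -r      & 1+r^{2} & -r      & 0  \\
 0 & 0       & -r      & 1+r^{2} & -r \\
 0 & 0       & 0       & -r      & 1
\end{pmatrix},
\qquad r^{2} \neq 1 .
```

A quick check: the first row of x times the first column of the matrix above is 1 - r^2, which the leading factor turns into 1; the first row times the second column is -r + r(1 + r^2) - r^3 = 0.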

16 Det(x)
factor(det(x)) for k=5
Any k?
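
One way to explore the "Any k?" question without a symbolic package is to test a conjectured formula numerically. For the k×k matrix with entries r^|i-j|, a known result is det = (1 - r^2)^(k-1). The sketch below is not from the slides and its function names (`ar1_matrix`, `det`) are hypothetical; it checks that formula with exact rational arithmetic for a few values of k and one value of r. This is a sanity check, not a proof.

```python
from fractions import Fraction


def ar1_matrix(k, r):
    """k x k matrix with entries r^|i-j|, as on slide 14 for k = 5."""
    return [[r ** abs(i - j) for j in range(k)] for i in range(k)]


def det(matrix):
    """Exact determinant by Gaussian elimination (entries should be Fractions)."""
    m = [row[:] for row in matrix]
    n, sign, d = len(m), 1, Fraction(1)
    for c in range(n):
        # Find a row with a nonzero pivot in column c.
        p = next((i for i in range(c, n) if m[i][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            m[c], m[p] = m[p], m[c]
            sign = -sign              # a row swap flips the sign
        d *= m[c][c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n):
                m[i][j] -= f * m[c][j]
    return sign * d


r = Fraction(1, 3)
for k in range(1, 8):
    # Conjectured closed form: det = (1 - r^2)^(k - 1)
    assert det(ar1_matrix(k, r)) == (1 - r ** 2) ** (k - 1)
```

Using `Fraction` keeps every step exact, so the equality test is not affected by floating-point rounding.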

17 Symbolic calculation – Maple/Matlab
Calculus
Algebra
Helps you to “produce” formulas
More …
Another example

18 Formulated partially with the help of Maple
Ma CX, Fang KT, and Lin DKJ. A note on uniformity and orthogonality. Journal of Statistical Planning and Inference 113 (1).

19 STA 511: Maple/Matlab
Introduce the language
Learn it through
real examples
practice & practice

20 STA511: MATLAB or R
“MATLAB is a high-level language and interactive environment that enables you to perform computationally intensive tasks faster than with traditional programming languages such as C, C++, and Fortran.”

21 MATLAB or R
Introduction and Key Features
Developing Algorithms and Applications
Analyzing and Accessing Data
Visualizing Data
Performing Numeric Computation
Publishing Results and Deploying Applications

22 Example 3: Matlab
The computations for 90% of my publications were done in MATLAB, 10% in Fortran or C
See
All graphs in my papers are produced by Matlab
In my experience, a task written in MATLAB costs you about one tenth of what it would in Fortran or C
Matlab is a matrix-based language

23 Example 4: R
R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.

24 R
R provides a wide variety of statistical and graphical techniques, and is highly extensible. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. R is available as Free Software.

25 Why do we learn and use so many software packages if one package (like SAS) may provide all the functions?
You should always use the “proper part” of the “proper software” for the “proper task”. Each package has a “best part”, even though every package also tries to provide the “other parts”. Sounds strange? For example, the best part of SAS is statistical analysis, although it also provides a full set of graphical functions. You should use its graphs only for drafts, and use Matlab, R, or Microsoft Office for publication-level plots.

26 Miscellaneous
LaTeX: a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation
Statisticians should use it to prepare dissertations and papers
Other useful software?

27 COURSE DESCRIPTION
Statistical packages and computing are an essential part of modern statistical training, as they touch on almost every aspect of statistical theory and practice. This course covers advanced SAS, symbolic calculation (Matlab), and scientific calculation software (R or MATLAB). My goals in teaching this class are:
To help students build the advanced SAS skills needed for statistical consulting and projects
To help students build the programming skills needed for their thesis or dissertation work
To present some examples of computational problems in statistics
To build the ability to learn any new language
The MACROs developed in STA511 can be reused in your future real projects

28 Software
SAS/Matlab is installed in our lab, or is available through the UB virtual machine
UB students can get a free copy of Matlab
R can be freely downloaded from

29 TEXT BOOK: No specific text is required
The course materials will be drawn from the following recommended resources:
SAS
Michele M. Burlew, SAS Macro Programming Made Easy, ISBN:
SAS Macro User Guide, download from here.
SAS IML, download from here.
SAS SQL, download from here.
R, MATLAB

30 Grading
6 homework assignments (100%).

31 SAS basic
base_lrconcept_9196.pdf
The above title comprehensively documents essential concepts for SAS features, the DATA step, and SAS files. This reference is a companion volume to the SAS Language Reference: Dictionary, which provides complete reference information about fundamental SAS language element features and the DATA step debugger.

