1
STA 511 Statistical Computing
Changxing Ma Homepage:
2
Statistical Computing
What skills do we need for “Biostatistical consulting and projects”?
Biostatistics
SAS (or STATA, SPSS, etc.)
Microsoft Office (or LaTeX for mathematicians/statisticians)
Google …
3
Statistical Computing
What skills do we need in research - dissertation work?
Biostatistics
One language: Fortran, C, C++ (Optional)
Matrix language: Matlab, R
LaTeX
(Calculus, Algebra) Maple/Matlab …
4
Contents
LaTeX (Microsoft Word)
SAS Basic
Advanced SAS: SAS Macro, SAS SQL, SAS IML
Matlab/R
Matlab (Maple) – Symbolic calculation
5
Example 1 - SAS Stepwise regression y=f(x1, x2, … xs)
Stepwise regression is a technique for choosing the variables (i.e., terms) to include in a multiple regression model. Forward stepwise regression starts with no model terms. At each step it adds the most statistically significant term (the one with the highest F statistic or lowest p-value) until no remaining term is statistically significant. Backward stepwise regression starts with all the terms in the model and removes the least significant terms until all the remaining terms are statistically significant. It is also possible to start with a subset of all the terms and then add significant terms or remove insignificant terms.
6
Example 1 - SAS Stepwise regression
The stepwise method is a modification of the forward-selection technique and differs in that variables already in the model do not necessarily stay there. As in the forward-selection method, variables are added one by one to the model, and the F statistic for a variable to be added must be significant at the SLENTRY= level. After a variable is added, however, the stepwise method looks at all the variables already included in the model and deletes any variable that does not produce an F statistic significant at the SLSTAY= level. Only after this check is made and the necessary deletions accomplished can another variable be added to the model. The stepwise process ends when none of the variables outside the model has an F statistic significant at the SLENTRY= level and every variable in the model is significant at the SLSTAY= level, or when the variable to be added to the model is the one just deleted from it.
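The add-then-check loop described above can be sketched outside SAS. The following is a minimal Python illustration under stated assumptions, not PROC REG's implementation: it uses fixed partial-F thresholds (`f_enter`, `f_stay`, hypothetical names and values) as stand-ins for the SLENTRY= and SLSTAY= significance levels, because the Python standard library has no F distribution, and it fits each model by ordinary least squares on simulated data.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(X, y, cols):
    """Residual sum of squares of y regressed on an intercept plus the columns in cols."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

def partial_f(X, y, cols, j):
    """Partial F statistic of variable j given the other variables in cols."""
    full = rss(X, y, cols)
    reduced = rss(X, y, [c for c in cols if c != j])
    df = len(y) - len(cols) - 1
    return (reduced - full) / (full / df)

def stepwise(X, y, f_enter=4.0, f_stay=3.9):
    """Stepwise selection; F thresholds are an assumption replacing SLENTRY/SLSTAY."""
    selected, last_removed = [], None
    while True:
        outside = [j for j in range(len(X[0])) if j not in selected]
        if not outside:
            break
        # Candidate with the largest partial F when added to the current model.
        best = max(outside, key=lambda j: partial_f(X, y, selected + [j], j))
        if partial_f(X, y, selected + [best], best) < f_enter or best == last_removed:
            break
        selected.append(best)
        last_removed = None
        # After each addition, drop any variable that no longer meets f_stay.
        while selected:
            worst = min(selected, key=lambda j: partial_f(X, y, selected, j))
            if partial_f(X, y, selected, worst) >= f_stay:
                break
            selected.remove(worst)
            last_removed = worst
    return selected

# Simulated data (assumption): only columns 0 and 2 truly affect y.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(80)]
y = [2 * row[0] - 3 * row[2] + random.gauss(0, 0.5) for row in X]
chosen = stepwise(X, y)
print(sorted(chosen))
```

The loop stops under the same two conditions as the description: no outside variable passes the entry threshold, or the next variable to add is the one just deleted.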
7
Example 1 - SAS Stepwise regression
proc reg data=yourdata;
  model y=x1-x10 / selection=stepwise SLENTRY= SLSTAY=0.15;
run;
8
Example 1 - SAS Stepwise logistic regression
proc logistic data=yourdata;
  model y=x1-x10 / selection=stepwise SLENTRY= SLSTAY=0.15;
run;
SELECTION=BACKWARD | B | FORWARD | F | NONE | N | STEPWISE | S
9
Example 1 - SAS Stepwise Generalized Linear Model
proc GENMOD data=yourdata;
  model y=x1-x10 / LINK = LOG selection=stepwise SLENTRY= SLSTAY=0.15;
run;
HOW?
Wait for the next SAS release and hope that it will have stepwise selection
Do the selection manually using SAS GENMOD (most people do this)
Write your own SAS code for stepwise selection using SAS MACRO
10
Rule to use MACRO
Basically, any repeated or partially repeated job should be automated with a MACRO. Most jobs are partially repeated, and a job can often be split into partially repeated parts. Using macros will save you a lot of TIME & $$$
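As a rough illustration of a "partially repeated" job, the Python sketch below generates the text of one PROC GENMOD step per candidate variable from a single template, which is the kind of repetition a SAS macro would automate. The function names (`genmod_step`, `forward_candidates`) are hypothetical, and the generated SAS text uses only the statements and options already shown on the GENMOD slide (`data=`, `model`, `link=log`).

```python
def genmod_step(data, response, predictors):
    """Return SAS text for one PROC GENMOD fit with a log link."""
    rhs = " ".join(predictors)
    return (
        f"proc genmod data={data};\n"
        f"  model {response} = {rhs} / link=log;\n"
        f"run;"
    )

def forward_candidates(data, response, in_model, candidates):
    """Same fit repeated once per candidate not yet in the model."""
    return [
        genmod_step(data, response, in_model + [x])
        for x in candidates
        if x not in in_model
    ]

# One forward step by hand: x1 is already in the model, x2 and x3 are candidates.
jobs = forward_candidates("yourdata", "y", ["x1"], ["x1", "x2", "x3"])
print("\n\n".join(jobs))
```

Only the variable list changes between steps; everything else is fixed text, which is why such a job is a natural candidate for a macro.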
11
STA Advanced SAS
SAS Macro
SAS IML
SAS SQL
12
Example 2 - Math A = determinant of A: Det (A) ? Inverse (A) ?
Any solution?
13
Example 2 - Math Any solution? Very simple
If you are good enough at algebra, you know the answer
If you have just learned algebra, you should know it
If you have a good memory, you should remember it from the algebra you learned years ago
Check an algebra or matrix textbook
For me: too lazy to check a textbook. Then
14
Maple/Matlab
Define the matrix (Matlab mupad)
x:=matrix([[1, r, r^2, r^3, r^4],
           [r, 1, r, r^2, r^3],
           [r^2, r, 1, r, r^2],
           [r^3, r^2, r, 1, r],
           [r^4, r^3, r^2, r, 1]])
Display it
15
Matlab
Inverse of x
1/x
16
Det(x) factor(det(x)) for k=5 Any k?
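For the "Any k?" question, the standard closed forms for a matrix with entries r^|i-j| are det = (1 - r^2)^(k-1) and a tridiagonal inverse scaled by 1/(1 - r^2), valid for k >= 2. The sketch below is not symbolic like the MuPAD session; it swaps in exact rational arithmetic from Python's `fractions` module and checks those formulas at sample values of r and k. All function names are hypothetical.

```python
from fractions import Fraction

def ar1_matrix(k, r):
    """k x k matrix with (i, j) entry r**|i-j|, the pattern of the 5 x 5 matrix x."""
    return [[r ** abs(i - j) for j in range(k)] for i in range(k)]

def det_and_inverse(a):
    """Exact determinant and inverse by Gauss-Jordan elimination on Fractions."""
    n = len(a)
    # Augment a with the identity matrix.
    m = [list(row) + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(a)]
    det = Fraction(1)
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        if piv != col:
            m[col], m[piv] = m[piv], m[col]
            det = -det  # a row swap flips the sign of the determinant
        pivot = m[col][col]
        det *= pivot
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return det, [row[n:] for row in m]

def tridiagonal_inverse(k, r):
    """Closed-form inverse for k >= 2: tridiagonal, scaled by 1 / (1 - r^2)."""
    c = 1 / (1 - r * r)
    inv = [[Fraction(0)] * k for _ in range(k)]
    for i in range(k):
        inv[i][i] = c * (1 if i in (0, k - 1) else 1 + r * r)
        if i + 1 < k:
            inv[i][i + 1] = inv[i + 1][i] = -r * c
    return inv

# Check both closed forms exactly for several sizes and one sample value of r.
r = Fraction(1, 3)
for k in range(2, 8):
    det, inv = det_and_inverse(ar1_matrix(k, r))
    assert det == (1 - r * r) ** (k - 1)
    assert inv == tridiagonal_inverse(k, r)
print("closed forms agree for k = 2..7 at r = 1/3")
```

A check at sample values is weaker than the symbolic factorisation, but it is a quick way to test a conjectured formula before proving it.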
17
Symbolic calculation – Maple/Matlab
Calculus
Algebra
Helps you “produce” formulas
More …
Another example
18
Ma CX, Fang KT, and Lin DKJ. A note on uniformity and orthogonality. Journal of Statistical Planning and Inference 113 (1).
Formulated as
Partially by Maple's help
19
STA 511: Maple/Matlab
Introduce the language
Learn it by real examples and practice & practice
20
STA511 – MATLAB or R “MATLAB is a high-level language and interactive environment that enables you to perform computationally intensive tasks faster than with traditional programming languages such as C, C++, and Fortran. ”
21
MATLAB or R Introduction and Key Features
Developing Algorithms and Applications
Analyzing and Accessing Data
Visualizing Data
Performing Numeric Computation
Publishing Results and Deploying Applications
22
Example 3: Matlab
The computations in 90% of my publications were done in MATLAB, 10% in Fortran or C
See
All graphs in my papers are produced by Matlab
Based on my experience, a task written in MATLAB will cost you about one tenth of what it costs in Fortran or C
Matlab is a matrix-based language
23
Example 4: R R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues. R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R.
24
R
R provides a wide variety of statistical and graphical techniques, and is highly extensible. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. R is available as Free Software.
25
Why do we learn and use so many software packages if one package (like SAS) may provide all the functions?
You should always use the “proper part” of the “proper software” for the “proper task”. Each package has its “best part”, although every package also tries to provide the “other parts” for you. Sounds strange? For example, the best part of SAS is statistical analysis, although it also provides a full set of graphical functions. You should use its graphs only for drafts. You should use Matlab, R, or Microsoft Office for publication-level plots.
26
Miscellaneous
LaTeX: a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation
Statisticians should use it to prepare dissertations and papers
Other useful software?
27
COURSE DESCRIPTION
Statistical packages and computing are an essential part of modern statistical training, as they touch on almost every aspect of statistical theory and practice. This course covers advanced SAS, symbolic calculation (Matlab), and scientific calculation software (R or MATLAB). My goals in teaching this class are:
To help the students build the advanced SAS skills needed for statistical consulting and projects
To help the students build the programming skills needed for their thesis or dissertation work
To present some examples of computational problems in statistics
To build an ability to learn any new language
The MACROs developed in STA511 can be reused in your future real projects
28
Software
SAS/Matlab is installed in our lab, or available on the UB virtual machine
UB students can get a free copy of Matlab
R can be freely downloaded from
29
TEXT BOOK: No specific text is required. The course materials will be drawn from the following recommended resources:
SAS
Michele M. Burlew, SAS Macro Programming Made Easy, ISBN:
SAS Macro User Guide, download from here.
SAS IML, download from here.
SAS SQL, download from here.
R, MATLAB
30
Grading 6 Homework assignments (100%).
31
SAS basic base_lrconcept_9196.pdf
The above title comprehensively documents essential concepts for SAS features, the DATA step, and SAS files. This reference is a companion volume to the SAS Language Reference: Dictionary, which provides complete reference information about fundamental SAS language element features and the DATA step debugger.