Zoom 30 x 200 Matrix for Red: R(red) G(green) B(blue) Frequency for red: 0 0 1.

Presentation transcript:

30 x 200 Matrix

CUT

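n x m matrix: calculate the distance distribution into the array dist(100)

    dist = zeros(1, 100);   % counts indexed by distance between two 1-pixels
    for i = 1:n
      for j = 1:m
        if x(i,j) == 1
          % compare with every pixel below and to the right of (i,j)
          for i1 = (i+1):n
            k = abs(i - i1);
            for j1 = (j+1):m
              if x(i1,j1) == 1
                dist(k + abs(j - j1)) = dist(k + abs(j - j1)) + 1;
              end
            end
          end
        end
      end
    end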

Dataset

Consider only two letters for now

    data bb;
      set aa;
      if ccc in ("1","O");
    run;

    proc princomp out=pred;
      var x25-x66;
    run;

    proc gplot;
      plot prin1*prin2=ccc;
    run;

1 and O

1 and A

1 and I

Logistic regression: distance distribution

    proc logistic data=bb;
      model ccc=x25-x66;
      output out=pred1 p=p;
    run;

    data pred2;
      set pred1;
      if p>0.5 then a="1";
      else a="A";
    run;

    proc freq;
      table ccc*a;
    run;

Using distance distribution (table: true letter versus letter predicted by the logistic model)

Using distance distribution (table: true letter versus letter predicted by the logistic model)

More statistics
n x m matrix:
Block sums (4 sums): x1-x4
Column sums (use first 20 sums): x5-x24

Using sums and distance

    %macro testtwo(char1, char2);
      data bb;
        set aa;
        if ccc in ("&char1","&char2");
      run;

      proc princomp out=pred;
        var x&start-x66;
      run;

      proc gplot;
        plot prin1*prin2=ccc;
      run;

      proc logistic data=bb;
        model ccc=x&start-x66;
        output out=pred1 p=p;
      run;

      data pred2;
        set pred1;
        if p>0.5 then a="&char1";
        else a="&char2";
      run;

      proc freq;
        table ccc*a;
      run;
    %mend;

    %let start=1;
    %testtwo(1,I);
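One caveat when reusing the macro for other pairs: by default PROC LOGISTIC models the probability of the lower-ordered response value, so p is the probability of whichever character sorts first. For 1 and I that is "1", which matches the rule p>0.5 then a="&char1". For a pair such as S and 8, "8" sorts first on ASCII systems, so the predicted labels would be swapped. The sketch below is one possible way to guard against this, assuming the same dataset aa and variable ccc as in the slides. The macro name testtwo_ev is hypothetical and not part of the original presentation.

```sas
%macro testtwo_ev(char1, char2);   /* hypothetical variant of testtwo */
  data bb;
    set aa;
    if ccc in ("&char1","&char2");
  run;

  proc logistic data=bb;
    /* EVENT= makes p the predicted probability that ccc equals &char1 */
    model ccc(event="&char1") = x&start-x66;
    output out=pred1 p=p;
  run;

  data pred2;
    set pred1;
    if p > 0.5 then a = "&char1";
    else a = "&char2";
  run;

  proc freq data=pred2;
    tables ccc*a;
  run;
%mend testtwo_ev;

%let start=1;
%testtwo_ev(S,8);
```

With the EVENT= response option the cutoff rule no longer depends on the sort order of the two characters, so the same macro can be called for any pair.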

Using distance distribution versus using both sums and distance (tables of true versus predicted letters)

More than two letters
26 letters + 10 digits = 36 categories
Two stages:
Stage 1 - nominal responses: baseline-category logit model
Stage 2 - if the predicted value belongs to, for example, 1 or I, use a two-level logistic regression to classify it further.
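Stage 1 could be sketched in SAS roughly as follows. This is a minimal sketch, not the presenter's code: it assumes the same dataset aa, response ccc and predictors x1-x66 used elsewhere in the slides, and the output dataset name stage1 is made up for illustration. With 36 categories and 66 predictors the fit may converge slowly or run into separation problems.

```sas
proc logistic data=aa;
  /* LINK=GLOGIT fits a baseline-category (generalized) logit model */
  model ccc = x1-x66 / link=glogit;
  /* PREDPROBS=INDIVIDUAL writes one probability per category,
     plus _FROM_ (observed level) and _INTO_ (predicted level) */
  output out=stage1 predprobs=individual;
run;

/* cross-tabulate observed against predicted categories */
proc freq data=stage1;
  tables _from_*_into_ / norow nocol nopercent;
run;
```

The cross-tabulation shows which characters get confused with each other, which is the kind of information needed to choose the groups passed on to stage 2.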

Stage 2: based on the error transition probability
1 and I
4, A, and V
D, H, M, W
S -> 8
Q -> O
6, 9, and 0

    proc discrim data=aa out=discout method=normal outstat=distat;
      class ccc;
      var x1-x66;
    run;

    proc freq data=discout;
      table ccc*_into_ / nofreq nocol norow;
    run;