DATA VISUALIZATION
UNIVARIATE (no review; self-study)
– STEM & LEAF
– BOXPLOT
BIVARIATE
– SCATTERPLOT (review correlation)
– Overlays; jittering
– Regression line overlay (see ASA website:
DATA VISUALIZATION TOPICS
GRAPHICAL DISPLAYS
– UNIVARIATE
– BIVARIATE
ASSUMPTIONS OF MULTIPLE REGRESSION
– LINEARITY
– HOMOSCEDASTICITY
– ERROR INDEPENDENCE
– NORMALITY
FIXING VIOLATIONS
GRAPHICAL DISPLAYS
Stem and Leaf:
– SPSS ANALYZE: Descriptive Statistics: Explore: Plot: Stem and Leaf
Boxplot:
– SPSS GRAPH: Boxplot or INTERACTIVE: Boxplot
Frequency Histogram (normal curve overlay available):
– SPSS GRAPH: Histogram or INTERACTIVE: Histogram or ANALYZE: Frequencies
Number of bins = 1 + log2(N)
Example: N = 500; log2(500) ≈ 8.97, which rounds to 9, so number of bins = 1 + 9 = 10 (compare log2(512) = 9, i.e., 2x2x2x2x2x2x2x2x2 = 512)
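The bin-count rule above can be checked with a few lines of Python. This is a minimal sketch: the function name `sturges_bins` is made up for illustration, and rounding log2(N) up to the next whole number is an assumption (the slide only shows the N = 500 case, where 8.97 becomes 9).

```python
import math

def sturges_bins(n):
    """Number of histogram bins from the rule: bins = 1 + log2(N).

    Assumption: log2(N) is rounded UP to a whole number, which matches
    the slide's example (N = 500 gives 1 + 9 = 10 bins).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 + math.ceil(math.log2(n))

# Print the suggested bin count for a few sample sizes.
for n in (100, 500, 512):
    print(n, sturges_bins(n))
```

The rule gives only a starting point; it is usually worth trying a few bin counts around the suggested value and comparing the resulting histograms.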
ANXIETY Stem-and-Leaf Plot [SPSS output; columns: Frequency, Stem & Leaf; stem width: 10; each leaf: 1 case]
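To make the layout of a stem-and-leaf display concrete, here is a small Python sketch that builds one with a stem width of 10 and one leaf per case. The scores are hypothetical illustration values, not the ANXIETY data from the slide, and the function `stem_and_leaf` is an illustrative helper, not an SPSS routine.

```python
from collections import defaultdict

def stem_and_leaf(scores, stem_width=10):
    """Return display lines of the form 'frequency  stem . leaves'.

    Assumes non-negative integer scores; with stem_width=10 the stem is
    the tens digit(s) and each leaf is the units digit of one case.
    """
    if not scores:
        return []
    leaves = defaultdict(list)
    for s in sorted(scores):
        stem, leaf = divmod(s, stem_width)
        leaves[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = leaves.get(stem, [])
        leaf_text = "".join(str(leaf) for leaf in row)
        lines.append(f"{len(row):5d}  {stem:3d} . {leaf_text}")
    return lines

# Hypothetical scores, for illustration only.
scores = [12, 15, 21, 23, 23, 27, 30, 34, 38, 41, 45, 52]
for line in stem_and_leaf(scores):
    print(line)
```

Each row reads like a histogram bar turned on its side, with the added benefit that the individual scores can still be recovered from the stem and leaf digits.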
GRAPHICAL DISPLAYS
Kernel Smoothing
– SPSS GRAPH: INTERACTIVE: Line: Dots and Lines: Spline or Lagrange 3rd- and 5th-order fits. This does not give you the smoother options (available for bivariate scatterplots; see later slides)
Bivariate Displays
Scatterplots
– Interval data
– Category by interval: jittering
– Regression fits: lowess lines
Scatterplot Matrices
Interval Scatterplot: SPSS Graphics: Interactive: Scatterplot: Fit: Method: Smoother [panels: No Smoother; with Normal Smoother]
Interval Scatterplot: SPSS Graphics: Interactive: Scatterplot: Fit: Method: Smoother [panel: with Uniform Smoother]
Category X-axis: without and with jittering (adding a normal random deviate with SD = .15 to the sex code)
Jittering
Basic idea: when a display shows two or more groups, most plotting programs draw points with identical coordinates on top of one another, so it is hard to tell where the data lie. The remedy is to add a small random value to each “group” score.
– For example, for males (score 1) and females (score 2), add a random number with a standard deviation of, say, .1 to each male and female score
Jittering
The result is a spreading out of all scores around the Male or Female column in a scatterplot [x-axis: Male = 1, Female = 2; y-axis: Y].
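The jittering step can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SPSS procedure: the helper name `jitter`, the fixed seed, and the six example cases are all hypothetical, and the SD of 0.1 is the value suggested in the text.

```python
import random

def jitter(codes, sd=0.1, seed=0):
    """Add a normal random deviate (mean 0, standard deviation sd) to each group code.

    The seed is fixed only so the illustration is reproducible.
    """
    rng = random.Random(seed)
    return [code + rng.gauss(0.0, sd) for code in codes]

# Hypothetical cases: 1 = male, 2 = female.
sex = [1, 1, 1, 2, 2, 2]
sex_jittered = jitter(sex, sd=0.1)

# Plot Y against sex_jittered instead of sex; Y itself is left unchanged.
for original, moved in zip(sex, sex_jittered):
    print(original, round(moved, 3))
```

Only the plotted x-coordinate is perturbed. The original group codes should still be used in any analysis, and the SD should stay small relative to the gap between codes so the two columns do not blend together.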
DATA VISUALIZATION BIVARIATE
Loess lines: in SPSS, an option under GRAPH / Interactive / Scatterplot labeled “FIT”, with METHOD = SMOOTHER. The bandwidth multiplier defaults to 1.0; a smaller value will create more bumps or curves in the overall curve.
GRAPH/INTERACTIVE/SCATTERPLOT/FIT/BANDWIDTH=1.0
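The effect of the bandwidth on a smoother can be illustrated with a simple kernel-weighted local mean. This sketch is not the algorithm SPSS uses and is not a full loess fit (which fits local regressions). It is a simplified stand-in with a "normal" and a "uniform" kernel, and the data, function names, and bandwidth values are all hypothetical.

```python
import math

def normal_kernel(u):
    """Gaussian-shaped weight: nearby points count most, distant ones fade out."""
    return math.exp(-0.5 * u * u)

def uniform_kernel(u):
    """Equal weight for every point within one bandwidth, zero outside."""
    return 1.0 if abs(u) <= 1.0 else 0.0

def kernel_smooth(x, y, grid, bandwidth, kernel=normal_kernel):
    """Kernel-weighted local mean of y at each grid point."""
    fitted = []
    for g in grid:
        weights = [kernel((xi - g) / bandwidth) for xi in x]
        total = sum(weights)
        if total > 0:
            fitted.append(sum(w * yi for w, yi in zip(weights, y)) / total)
        else:
            fitted.append(float("nan"))  # no data within reach of this grid point
    return fitted

def roughness(values):
    """Sum of absolute second differences: larger means a bumpier curve."""
    return sum(abs(values[i - 1] - 2 * values[i] + values[i + 1])
               for i in range(1, len(values) - 1))

# Hypothetical data: a linear trend with an alternating +1 / -1 disturbance.
x = list(range(20))
y = [0.5 * xi + (1.0 if xi % 2 == 0 else -1.0) for xi in x]

smooth_fit = kernel_smooth(x, y, x, bandwidth=3.0)
bumpy_fit = kernel_smooth(x, y, x, bandwidth=0.5)
uniform_fit = kernel_smooth(x, y, x, bandwidth=3.0, kernel=uniform_kernel)

print("roughness, bandwidth 3.0:", round(roughness(smooth_fit), 2))
print("roughness, bandwidth 0.5:", round(roughness(bumpy_fit), 2))
```

With the smaller bandwidth each fitted value depends almost entirely on its nearest points, so the curve follows the disturbance and looks bumpier. The larger bandwidth averages over more neighbors and gives a smoother line.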