1 WHY WE USE EXPLORATORY DATA ANALYSIS Decision diagram: DATA. Outliers or extremes? YES: Why? Can we remove them? Otherwise use quantile (robust) estimates. NO: Do the data come from a normal distribution? YES: estimates based on the normal distribution. NO: check kurtosis and skewness, then apply transformations or use quantile (robust) estimates.
2 METHODS OF EDA Graphical: dot plot, box plot, notched box plot, Q-Q plot, histogram, density plots. Tests: tests of normality, minimal sample size.
3 DOT PLOT
4 BOX PLOT Figure labels: lower quartile, upper quartile, median, interquartile range (H), inner fences, outer fences, number line.
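The quantities labelled on the box plot can be computed directly. The sketch below is a minimal illustration, not the slide's own procedure: the fence multipliers 1.5 and 3 (times H) follow Tukey's common convention and are an assumption, as are the function name `box_plot_summary` and the sample data.

```python
from statistics import quantiles

def box_plot_summary(values, inner_k=1.5, outer_k=3.0):
    """Return quartiles, interquartile range H, and inner/outer fences.

    inner_k and outer_k are assumed multipliers (Tukey's convention);
    the slide does not state them.
    """
    data = sorted(values)
    # Three cut points that split the sorted data into four equal parts.
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    h = q3 - q1  # interquartile range, called H on the slide
    return {
        "lower_quartile": q1,
        "median": q2,
        "upper_quartile": q3,
        "H": h,
        "inner_fences": (q1 - inner_k * h, q3 + inner_k * h),
        "outer_fences": (q1 - outer_k * h, q3 + outer_k * h),
    }

# Hypothetical sample with one suspicious value (30).
summary = box_plot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
beyond_outer = [x for x in [2, 4, 4, 5, 6, 7, 8, 9, 30]
                if not summary["outer_fences"][0] <= x <= summary["outer_fences"][1]]
```

Values outside the inner fences are usually treated as outliers and values outside the outer fences as extremes; here only 30 falls beyond the outer fences.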
5 NOTCHED BOX PLOT The notch shows an interval estimate of the median. Figure label: RF.
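The slide does not give the formula for the notch. One widely used convention (McGill, Tukey and Larsen), assumed here rather than taken from the slide, is median ± 1.57 · H / √n, where H is the interquartile range. The function name `median_notch` and the data are illustrative.

```python
from math import sqrt
from statistics import median, quantiles

def median_notch(values, factor=1.57):
    """Approximate interval estimate of the median used for notches.

    factor=1.57 is the conventional constant (an assumption here);
    it gives roughly a 95 % comparison interval for the median.
    """
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    half_width = factor * (q3 - q1) / sqrt(len(data))
    m = median(data)
    return m - half_width, m + half_width

# Hypothetical sample: n = 9, median = 6, H = 4.
notch_low, notch_high = median_notch([2, 4, 4, 5, 6, 7, 8, 9, 30])
```

When the notches of two box plots do not overlap, the two medians are commonly taken to differ.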
6 Q-Q PLOT X axis: theoretical quantiles of the analysed distribution. Y axis: sample quantiles (measured values). The straight line marks ideal agreement between the sample values and the theoretical distribution.
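The points of a normal Q-Q plot can be sketched as follows. The plotting positions (i + 0.5)/n (with i counted from zero) and the choice of a normal distribution fitted by the sample mean and standard deviation are assumptions; other conventions exist. `normal_qq_points` is an illustrative name.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Return (theoretical quantile, sample quantile) pairs for a normal Q-Q plot."""
    data = sorted(values)
    n = len(data)
    # Reference distribution: normal with the sample mean and standard deviation.
    dist = NormalDist(mean(data), stdev(data))
    # i is zero-based, so (i + 0.5) / n always lies strictly between 0 and 1.
    return [(dist.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(data)]

# Hypothetical, perfectly symmetric sample.
points = normal_qq_points([1.0, 2.0, 3.0, 4.0, 5.0])
```

If the data follow the reference distribution, the pairs lie close to the line y = x; systematic curvature indicates skewness or non-normal kurtosis.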
7 Q-Q PLOT
9 Q-Q PLOT Typical shapes: right-sided (skewed to the left); left-sided (skewed to the right); platykurtic ("flat"); leptokurtic ("steep").
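The shapes seen in a Q-Q plot correspond to sample skewness and kurtosis. The sketch below uses the plain moment estimators (dividing by n), which is an assumption; bias-corrected variants also exist. Positive skewness means a longer right tail; negative excess kurtosis means a platykurtic ("flat") shape and positive excess kurtosis a leptokurtic ("steep") one.

```python
from statistics import fmean

def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal distribution)."""
    n = len(values)
    m = fmean(values)
    # Central moments of order 2, 3 and 4, each divided by n.
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical samples: one symmetric, one with a long right tail.
skew_sym, kurt_sym = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
skew_tail, _ = skewness_and_excess_kurtosis([1, 1, 1, 1, 10])
```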
12 HISTOGRAM
13 HISTOGRAM correct width of interval:
14 HISTOGRAM – kernel density function
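A kernel density estimate replaces the histogram bars with a smooth curve: each observation contributes a small kernel, and the contributions are averaged. The sketch below assumes a Gaussian kernel and, when no bandwidth is given, the normal-reference rule of thumb h = 1.06 · s · n^(-1/5); neither choice comes from the slide.

```python
from math import exp, pi, sqrt
from statistics import stdev

def gaussian_kde(values, bandwidth=None):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(values)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the slide).
        bandwidth = 1.06 * stdev(values) * n ** (-1 / 5)

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        total = sum(exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in values)
        return total / (n * bandwidth * sqrt(2 * pi))

    return density

# Hypothetical sample; the estimated density should integrate to about 1.
density = gaussian_kde([1.0, 2.0, 2.5, 4.0])
step = 0.01
area = sum(density(-10 + k * step) for k in range(2501)) * step
```

A smaller bandwidth follows the data more closely but gives a noisier curve, much like a histogram with narrower intervals.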
15 TRANSFORMATION Aim of transformation: reduction of variance; better symmetry (normality) of the data. Transformation function: non-linear, monotonic.
16 TRANSFORMATION – basic concept
17 TRANSFORMATION – logarithmic transformation
18 TRANSFORMATION – power transformation
19 TRANSFORMATION – Box-Cox
20 TRANSFORMATION – Box-Cox
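The slide's own formula is not reproduced here. In its standard form the Box-Cox transformation is y = (x^λ - 1)/λ for λ ≠ 0 and y = ln x for λ = 0, defined for positive data. A minimal sketch, with `box_cox` as an illustrative name:

```python
from math import log

def box_cox(values, lam):
    """Standard Box-Cox transformation of strictly positive values."""
    if any(x <= 0 for x in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        # Limit of (x**lam - 1) / lam as lam -> 0 is the natural logarithm.
        return [log(x) for x in values]
    return [(x ** lam - 1) / lam for x in values]

# Hypothetical values: lam = 0.5 behaves like a shifted, scaled square root.
transformed = box_cox([1.0, 4.0], 0.5)
logged = box_cox([1.0], 0)
```

With λ = 1 the data are only shifted by 1, so the shape of the distribution is unchanged; this is why λ = 1 serves as the "no transformation needed" reference value.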
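21 TRANSFORMATION – estimate of optimal λ Plot: logarithm of the likelihood function (LF) for various values of λ, with the optimal λ at its maximum. The interval estimate of the parameter λ contains all values whose log-likelihood is at least max LF − 0.5 × quantile of χ². In the example, λ = 1 is not included in the interval estimate of λ, which means that the transformation will probably be successful.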
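The procedure on this slide can be sketched with a grid search. The assumptions are labelled in the code: the standard Box-Cox profile log-likelihood ln L(λ) = -(n/2) ln s²(λ) + (λ - 1) Σ ln x, a 95 % interval (χ² quantile with 1 degree of freedom ≈ 3.841), and a grid of λ from -3 to 3. All function names and the data are illustrative.

```python
from math import exp, log
from statistics import NormalDist

CHI2_95_DF1 = 3.841458820694124  # 0.95 quantile of chi-square, 1 d.f. (assumed level)

def box_cox(values, lam):
    """Standard Box-Cox transformation of strictly positive values."""
    if abs(lam) < 1e-12:
        return [log(x) for x in values]
    return [(x ** lam - 1) / lam for x in values]

def log_likelihood(values, lam):
    """Profile log-likelihood of lam, up to an additive constant."""
    n = len(values)
    y = box_cox(values, lam)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n  # ML variance (divide by n)
    return -0.5 * n * log(var_y) + (lam - 1) * sum(log(x) for x in values)

def estimate_lambda(values, grid=None):
    """Return (optimal lam, (lower, upper)) from a grid search."""
    if grid is None:
        grid = [i / 100 for i in range(-300, 301)]  # assumed search range
    curve = [(log_likelihood(values, lam), lam) for lam in grid]
    max_ll, best = max(curve)
    cutoff = max_ll - 0.5 * CHI2_95_DF1  # "max LF - 0.5 * chi-square quantile"
    # Assumes the curve has a single peak, so the accepted values are contiguous.
    inside = [lam for value, lam in curve if value >= cutoff]
    return best, (min(inside), max(inside))

# Hypothetical right-skewed sample: exponentials of evenly spaced normal quantiles.
sample = [exp(NormalDist().inv_cdf((i + 0.5) / 50)) for i in range(50)]
best_lam, (lam_low, lam_high) = estimate_lambda(sample)
needs_transformation = not (lam_low <= 1.0 <= lam_high)
```

For this log-normal-like sample the optimum lies near λ = 0 (the logarithm) and the interval excludes λ = 1, which is the situation the slide describes as a transformation that will probably be successful.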