WHY WE USE EXPLORATORY DATA ANALYSIS

Presentation transcript:

1 WHY WE USE EXPLORATORY DATA ANALYSIS Flowchart boxes: data; outliers, extremes? (yes / no); why? can we remove them?; quantile (robust) estimates; do the data come from a normal distribution? (yes / no); estimates based on the normal distribution; kurtosis, skewness; transformations; quantile (robust) estimates.

2 METHODS OF EDA Graphical: dot plot, box plot, notched box plot, Q-Q plot, histogram, density plots. Tests: tests of normality, minimal sample size.

3 DOT PLOT

4 BOX PLOT Figure labels: lower quartile, median, upper quartile, interquartile range (H), inner fences, outer fences, number line (axis).
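The quantities labelled on the box plot can be computed directly. The following is a minimal sketch, not the slide's own procedure. It assumes the common convention that the inner fences lie 1.5 H and the outer fences 3 H beyond the quartiles (the slide does not state the multipliers), and that points beyond the inner fences are called outliers and points beyond the outer fences extremes. The helper names `box_plot_summary` and `flag_points` are hypothetical.

```python
from statistics import quantiles


def box_plot_summary(data):
    """Quartiles, interquartile range H, and fences of a box plot."""
    # statistics.quantiles sorts the data and uses the "exclusive" method by default.
    q1, q2, q3 = quantiles(data, n=4)
    h = q3 - q1  # interquartile range, called H on the slide
    return {
        "lower_quartile": q1,
        "median": q2,
        "upper_quartile": q3,
        "H": h,
        # Assumed multipliers: 1.5 H for inner fences, 3 H for outer fences.
        "inner_fences": (q1 - 1.5 * h, q3 + 1.5 * h),
        "outer_fences": (q1 - 3.0 * h, q3 + 3.0 * h),
    }


def flag_points(data):
    """Split suspicious points into outliers and extremes (assumed naming)."""
    s = box_plot_summary(data)
    inner_lo, inner_hi = s["inner_fences"]
    outer_lo, outer_hi = s["outer_fences"]
    flagged = {"outliers": [], "extremes": []}
    for x in sorted(data):
        if x < outer_lo or x > outer_hi:
            flagged["extremes"].append(x)
        elif x < inner_lo or x > inner_hi:
            flagged["outliers"].append(x)
    return flagged


if __name__ == "__main__":
    sample = [5, 8, 12, 15, 15, 18, 20, 20, 20, 30, 35, 40, 95]
    print(box_plot_summary(sample))
    print(flag_points(sample))
```

Other quartile definitions (for example the "inclusive" method) give slightly different fences on small samples, so the flagged points can change with the convention.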

5 NOTCHED BOX PLOT The notch shows the interval estimate of the median. (Figure label: R_F)
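The slide does not give the formula for the notch. One widely used convention (McGill, Tukey and Larsen) draws the notch at median ± 1.57 · IQR / √n, which is roughly a 95 % interval for the median. The sketch below assumes that convention. The function name `median_notch` and its `factor` parameter are hypothetical.

```python
from math import sqrt
from statistics import median, quantiles


def median_notch(data, factor=1.57):
    """Approximate interval estimate of the median used for box-plot notches.

    Assumes the median +/- factor * IQR / sqrt(n) convention; the slide's
    exact formula is not available.
    """
    n = len(data)
    q1, _, q3 = quantiles(data, n=4)  # default "exclusive" quartiles
    half_width = factor * (q3 - q1) / sqrt(n)
    m = median(data)
    return m - half_width, m + half_width


if __name__ == "__main__":
    sample = [5, 8, 12, 15, 15, 18, 20, 20, 20, 30, 35, 40, 95]
    print(median_notch(sample))
```

When the notches of two box plots do not overlap, this is usually read as informal evidence that the medians differ.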

6 Q-Q PLOT X axis: theoretical quantiles of the analysed distribution. Y axis: sample quantiles. Line: ideal coincidence of the sample values and the theoretical distribution. Points: measured values.
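The coordinates of a normal Q-Q plot can be sketched as follows. This assumes the normal distribution as the theoretical distribution and the plotting positions (i − 0.5)/n, which is one common choice among several. The names `normal_qq_points` and `reference_line` are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(data):
    """Return (theoretical quantile, sample quantile) pairs for a normal Q-Q plot."""
    ordered = sorted(data)  # sample quantiles are the ordered measured values
    n = len(ordered)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    # Assumed plotting positions: p_i = (i - 0.5) / n for i = 1..n
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def reference_line(data):
    """Intercept and slope of the 'ideal coincidence' line y = mean + sd * x."""
    return mean(data), stdev(data)


if __name__ == "__main__":
    sample = [9.1, 10.4, 9.8, 10.0, 11.2, 9.5, 10.7, 10.1]
    for x, y in normal_qq_points(sample):
        print(f"{x:7.3f}  {y:6.2f}")
    print(reference_line(sample))
```

Points that follow the line suggest agreement with the normal distribution, while systematic curvature points to skewness or to heavier or lighter tails.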

7 Q-Q PLOT

8

9 Q-Q PLOT Typical shapes: right-sided (skewed to the left), left-sided (skewed to the right), platykurtic ("flat"), leptokurtic ("steep").
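The shapes seen in a Q-Q plot can be checked numerically with the sample skewness and kurtosis. The sketch below assumes the simple moment-based definitions (central moments with divisor n) and reports excess kurtosis, so a normal distribution gives a value near 0, a platykurtic one a negative value and a leptokurtic one a positive value. The function names are hypothetical.

```python
from statistics import fmean


def central_moment(data, k):
    """k-th central moment with divisor n."""
    mu = fmean(data)
    return sum((x - mu) ** k for x in data) / len(data)


def skewness(data):
    """Moment coefficient of skewness; positive when the right tail is longer."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment coefficient of kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0


if __name__ == "__main__":
    symmetric = [1, 2, 3, 4, 5]
    long_right_tail = [1, 1, 1, 2, 10]
    print(skewness(symmetric), excess_kurtosis(symmetric))
    print(skewness(long_right_tail), excess_kurtosis(long_right_tail))
```

Bias-corrected versions of both coefficients exist and differ noticeably for small samples.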

10

11

12 HISTOGRAM

13 HISTOGRAM correct width of interval:
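The formula from this slide did not survive in the transcript. As an illustration only, and not necessarily the rule the slide used, two common choices are the Freedman-Diaconis width h = 2 · IQR / n^(1/3) and the Sturges rule for the number of intervals, k = ⌈1 + log2 n⌉. The function names below are hypothetical.

```python
from math import ceil, log2
from statistics import quantiles


def freedman_diaconis_width(data):
    """Bin width h = 2 * IQR / n**(1/3) (Freedman-Diaconis rule)."""
    q1, _, q3 = quantiles(data, n=4)  # default "exclusive" quartiles
    return 2.0 * (q3 - q1) / len(data) ** (1.0 / 3.0)


def sturges_bins(n):
    """Number of intervals k = ceil(1 + log2(n)) (Sturges rule)."""
    return ceil(1 + log2(n))


if __name__ == "__main__":
    sample = [5, 8, 12, 15, 15, 18, 20, 20, 20, 30, 35, 40, 95]
    print(freedman_diaconis_width(sample))
    print(sturges_bins(len(sample)))
```

Intervals that are too narrow produce a noisy histogram and intervals that are too wide hide the shape of the distribution, which is why such rules tie the width to the sample size and spread.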

14 HISTOGRAM – kernel density function
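A kernel density estimate replaces the histogram bars with a smooth curve built from one kernel per data point. The sketch below assumes a Gaussian kernel and Silverman's rule of thumb for the bandwidth; the slide specifies neither. The names `silverman_bandwidth` and `gaussian_kde` are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import quantiles, stdev


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n**(-1/5)."""
    q1, _, q3 = quantiles(data, n=4)
    spread = min(stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * len(data) ** (-1.0 / 5.0)


def gaussian_kde(data, bandwidth=None):
    """Return a function x -> estimated density, using Gaussian kernels."""
    h = silverman_bandwidth(data) if bandwidth is None else bandwidth
    norm = 1.0 / (len(data) * h * sqrt(2.0 * pi))

    def density(x):
        # Sum of one Gaussian bump per observation, scaled to integrate to 1.
        return norm * sum(exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


if __name__ == "__main__":
    sample = [1.0, 2.0, 2.5, 3.0, 4.5]
    f = gaussian_kde(sample)
    for x in (0.0, 1.0, 2.5, 4.0, 6.0):
        print(x, round(f(x), 4))
```

The bandwidth plays the same role as the interval width of a histogram: larger values give a smoother curve.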

15 TRANSFORMATION Aim of transformation: reduction of variance, better level of symmetry (normality) of data. Transformation function: non-linear function, monotonic function.

16 TRANSFORMATION – basic concept

17 TRANSFORMATION – logarithmic transformation

18 TRANSFORMATION – power transformation

19 TRANSFORMATION – Box-Cox

20 TRANSFORMATION – Box-Cox
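The slide content is not preserved, so the following is a sketch of the standard one-parameter Box-Cox family for positive data: y = (x^λ − 1)/λ for λ ≠ 0 and y = ln x for λ = 0. It covers the logarithmic and power transformations as special cases. The function name `box_cox` is hypothetical.

```python
from math import log


def box_cox(x, lam):
    """Box-Cox transform of a single positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox transformation requires x > 0")
    if lam == 0:
        return log(x)  # limiting case of (x**lam - 1) / lam as lam -> 0
    return (x ** lam - 1.0) / lam


if __name__ == "__main__":
    sample = [0.5, 1.0, 2.0, 4.0, 8.0]
    for lam in (1.0, 0.5, 0.0, -1.0):
        print(lam, [round(box_cox(x, lam), 4) for x in sample])
```

With λ = 1 the data are only shifted by a constant, which is why λ = 1 stands for "no transformation needed".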

21 TRANSFORMATION – estimate of optimal λ Plot: logarithm of the likelihood function (LF) for various values of λ, with the optimal λ and the interval estimate of the parameter λ bounded by the level maxLF − 0.5 · χ² quantile. Here λ = 1 (no transformation) is not included in the interval estimate of λ, which means that the transformation will probably be successful.
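The procedure on this slide can be sketched with a simple grid search. The code assumes the usual normal-theory Box-Cox log-likelihood, LF(λ) = −(n/2) · ln σ²(λ) + (λ − 1) · Σ ln x, where σ²(λ) is the variance (divisor n) of the transformed data, and takes the interval estimate as all λ with LF(λ) ≥ maxLF − 0.5 · χ² quantile (1 degree of freedom). The grid, the 95 % level and all function names are assumptions for illustration.

```python
from math import exp, log
from statistics import NormalDist, fmean


def box_cox(x, lam):
    """Box-Cox transform of a positive value."""
    return log(x) if lam == 0 else (x ** lam - 1.0) / lam


def log_likelihood(data, lam):
    """Profile log-likelihood LF(lam) under the normal model."""
    n = len(data)
    y = [box_cox(x, lam) for x in data]
    mu = fmean(y)
    var = sum((v - mu) ** 2 for v in y) / n  # ML variance estimate
    return -0.5 * n * log(var) + (lam - 1.0) * sum(log(x) for x in data)


def lambda_interval(data, grid=None, level=0.95):
    """Return (optimal lambda, (lower, upper)) from a grid search."""
    if grid is None:
        grid = [i / 100 for i in range(-300, 301)]  # assumed search range
    values = [(log_likelihood(data, lam), lam) for lam in grid]
    max_lf, best = max(values)
    # Chi-square quantile with 1 df equals the squared normal quantile.
    chi2 = NormalDist().inv_cdf(0.5 + level / 2.0) ** 2
    cutoff = max_lf - 0.5 * chi2
    inside = [lam for lf, lam in values if lf >= cutoff]
    return best, (min(inside), max(inside))


if __name__ == "__main__":
    # Illustrative data whose logarithms follow normal quantiles exactly.
    sample = [exp(NormalDist().inv_cdf((i - 0.5) / 30)) for i in range(1, 31)]
    best, (lo, hi) = lambda_interval(sample)
    print(best, lo, hi)
    print("lambda = 1 inside interval:", lo <= 1.0 <= hi)
```

If λ = 1 falls inside the interval, the data give no clear evidence that a transformation is needed.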