Exploratory Data Analysis


1 Exploratory Data Analysis
Analysis of Biological Data/Biometrics Dr. Ryan McEwan Department of Biology University of Dayton

2 A principal task for the padawan data analyst is committing to exploratory data analysis (EDA)
In some cases the scientist will be conducting a discrete experimental test of some variable. If so, skipping right to the final analysis is legitimate. In most studies this will not be the case. Often, many variables have been measured. Sometimes the study will take an unexpected turn, the focus will shift, or the scientist will simply start out looking at a broad suite of variables. Even if you have a pretty good idea of what you are getting at, and what particular comparisons you want to make, exploration is highly advised.

3

4

5 When you have lots of data…
Scattergrams are your friend. You can quickly graph relationships to get a look into the structure of the data set… In this case there is a curvilinear regression line through the points…. What is the strength of this line? Do you buy it?
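
One way to put a number on "the strength of this line" is the coefficient of determination, R², the share of the variation in y that the fitted curve accounts for. The sketch below is only an illustration: it assumes a quadratic curve is a reasonable stand-in for the curvilinear fit on the slide, the data are invented, and the helper names (`solve3`, `fit_quadratic`, `r_squared`) are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]               # sums of x^0 .. x^4
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(a, t)


def r_squared(xs, ys, coefs):
    """Fraction of the variance in ys explained by the fitted quadratic."""
    mean_y = sum(ys) / len(ys)
    pred = [coefs[0] + coefs[1] * x + coefs[2] * x * x for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, pred))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Invented, hump-shaped data standing in for the points on the slide.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 5.2, 6.1, 6.0, 5.4, 3.8, 2.2]
coefs = fit_quadratic(xs, ys)
r2 = r_squared(xs, ys, coefs)
print(f"R^2 = {r2:.3f}")
```

An R² near 1 means the curve tracks the points closely; a value near 0 means the line is mostly decoration. A high R² alone does not show the curve is the right model, so "Do you buy it?" is still a fair question.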

6 Scattergrams are your friend
Blocking panels together in this kind of way can give an interesting idea of the big picture, and can make scanning easy

7 Scattergrams are your friend
Driving regression lines through clouds of points, then panning back, yields a broader view of an extremely large data set. Of course, there may be multivariate relationships or other patterns that are difficult to discern visually, but it is still a good way to understand the data.

8 Scattergrams are your friend
But you don’t have to drive lines; sometimes the pattern is good enough.

9 Scattergrams are your friend.
You can use them for illustration of complex ideas, if you are crafty!

10 Scattergrams are your friend.
Here is a line and scatter plot, showing a temporal relationship

11 Histogram/Bar Charts: Classic way to compare values
In this particular case there was no way to calculate error…more on that later.

12 Histogram/Bar Charts. This is called a stacked bar chart. A pretty nice way to display information…the top of the bars matter, but their composition does too.

13 Histogram/Bar Charts. Here is a standard bar chart, but this time there are little whiskers on the top of the bars…. What are those? Those are “error bars” and are a great way to get a feel for the variation around your result. If the error bars are very large, then your confidence should be very small. Rule of thumb: if the error bars overlap, the bars are probably not significantly different.
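
The whiskers described above can be computed directly. The sketch below assumes the error bars are plus or minus one standard error of the mean (the slide does not say which error measure was used), and the two groups are invented data. The function names `mean_and_se` and `bars_overlap` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error of the mean) for one sample."""
    return mean(values), stdev(values) / sqrt(len(values))


def bars_overlap(a, b):
    """True if the mean +/- SE intervals of the two samples overlap."""
    mean_a, se_a = mean_and_se(a)
    mean_b, se_b = mean_and_se(b)
    return abs(mean_a - mean_b) <= se_a + se_b


# Invented measurements for two groups.
control = [4.1, 5.0, 4.6, 5.3, 4.8]
treated = [6.2, 5.9, 6.8, 6.1, 6.5]

for name, group in (("control", control), ("treated", treated)):
    m, se = mean_and_se(group)
    print(f"{name}: mean = {m:.2f}, SE = {se:.2f}")

print("error bars overlap:", bars_overlap(control, treated))
```

The overlap check is a quick screen only. Whether two bars really differ is settled by a formal test, not by eyeballing whiskers.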

14 Histogram/Bar Charts. This is a bar chart that shows change over time, so you have plus or minus in this one. Note also I have stars…these indicate statistical significance.

15 Histogram/Bar Charts. This is a bar chart with standard error bars. Looks impressive!

16 Histogram/Bar Charts. Here is the same data set, but scaled differently. Less impressive….

17 Box plot! The box plot is the best way to actually get a look at your data set. In this case, outliers are the dots, the line in the middle is the median, the bottom and top of the box are the 25th and 75th percentiles of the data set, and the bars are the 10th and 90th percentiles. hmmmmmmmmmmmm
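
The pieces of the box plot described above can be calculated without drawing anything. This is a minimal sketch using Python's standard `statistics` module; the function name `box_plot_summary` and the sample data are made up, and "outlier" here simply means a value beyond the 10th/90th percentile whiskers, following the convention on this slide. Other percentile conventions will give slightly different numbers.

```python
from statistics import median, quantiles


def box_plot_summary(values):
    """Return the numbers behind a box plot with 10th/90th percentile whiskers."""
    q1, _, q3 = quantiles(values, n=4)      # 25th, 50th, 75th percentiles
    deciles = quantiles(values, n=10)       # 10th, 20th, ..., 90th percentiles
    p10, p90 = deciles[0], deciles[-1]
    return {
        "median": median(values),
        "q1": q1,                           # bottom of the box
        "q3": q3,                           # top of the box
        "p10": p10,                         # lower whisker
        "p90": p90,                         # upper whisker
        "outliers": sorted(v for v in values if v < p10 or v > p90),
    }


# Invented sample: the integers 1 through 19.
summary = box_plot_summary(list(range(1, 20)))
for key, value in summary.items():
    print(key, value)
```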

18 Mixing and matching formats for effect
This is a figure format that allows you to explore the relationship between two variables. In this case, a bar chart is on the y1 axis, and a drought measure on the y2.

19 Mixing and matching formats for effect
This is a figure format that allows you to explore the relationship between two variables. In this case, a bar chart is on the y1 axis, and a drought measure on the y2.

20 Ordination is nice when you have too many variables to handle easily

21 Cluster Analysis can be real useful for sussing out patterns and relationships

22 Refuse to be boxed in! Invent your own figure type
Herbaceous species cover over space

23 Refuse to be boxed in! Invent your own figure type

24 Geospatial!!!!

25 Exploratory Data Analysis
Analysis of Biological Data/Biometrics Dr. Ryan McEwan Department of Biology University of Dayton

