Fundamentals Analysis of Biological Data/Biometrics Dr. Ryan McEwan

Department of Biology University of Dayton

Ultimately, your statistical analyses will depend on how the data were collected, which is linked back to Experimental Design… Experimental design is like a game of chess: you must think first, before you move…

Planning
One good idea is to draft a Prospectus before you start the experiment. The goal is to get your head around what you are trying to accomplish. Don’t worry about format or grammar, etc., here… instead try to identify the critical aspects of the study: rationale, what you are measuring, response variables, etc.

You can sketch this out and then work drafts through with your advisor or collaborators to get on the same page… BEFORE you do any work! In this process you will sometimes identify major issues, or come up with new ideas.

Consider including a section on participant expectations.
Experiments can go in the toilet if the folks involved do not understand their role. If you are going to fight about this, you might as well fight about it before you start doing work! Sometimes writing things out like “X will be an author on all papers” will generate consternation, but better to burst that boil and deal with it than let it fester until you have expended a great deal of time and effort. Maybe there is an impasse that cannot be bridged… suss that out before you start. This is critical if you get a spider sense that you are involved in a poker game… As a student in a lab, some of this is already set by standards and protocols, but once you become a professional things can become murky quickly.

Here is a somewhat more statistics-centric format for a prospectus.
I have known scientists who start counting up Degrees of Freedom in an ANOVA the first day the conversation starts about running a project… More specificity is good, but with the understanding that things are bound to go astray.

Basic Concepts
Sampling
Randomization
Replication & Error
Response Variable
Experimental Unit
Statistical Hypotheses (& testing)
Treatment and Control

Sampling
Everything varies! Through space, and over time. Sampling is a way of estimating that variation, and statistical analysis is a way of asking about the variation among samples. Oftentimes the question is whether the variation is “natural” or due to some factor such as an experimental treatment.

In many cases it is impossible to measure all of the items of interest, so a sample is taken instead.

Sampling Realism Reductionism

Randomization
Humans who observe and design experiments are biased. Many of the biases are unconscious, and thus unavoidable. The only reliable way to eliminate bias in a sampling scheme is to have at least one formal randomization step. If the sampling is not formally randomized, it is in effect an artifact of the human consciousness… not representative of the item being observed.

(2) Randomization
When you claim a sample is random, you are making a specific mathematical claim. Unless you have a protocol to create such randomness, followed fastidiously, what you have is NOT random and should not be stated as such. In ecology the term “haphazard” is increasingly being used for processes in the field which are not mathematically random, but also are not systematic or “overtly intentional.”

Examples of things that are NOT random:
“I threw the quadrat over my head backward into the grassland thus established a random location”
“I selected flies from the container randomly”
“Shrubs were selected randomly along the trail”
“Random insects were collected from the petal of a flower”
“I selected random locations from the image”
“Students were randomly selected from scrolling the list”

(2) Randomization Ways to introduce randomness: Random.org
Random number table:
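A software random number generator can play the same role as Random.org or a printed table. The following is a minimal sketch, not a prescribed protocol: the function name `random_sample`, the shrub numbering, and the seed value are all hypothetical, chosen only for illustration. The key idea is that every unit gets a number first, and the generator (not the observer) decides which numbers are measured.

```python
import random

def random_sample(units, k, seed=None):
    """Return k units drawn without replacement, each equally likely."""
    rng = random.Random(seed)  # recording the seed makes the draw repeatable
    return rng.sample(list(units), k)

# Hypothetical example: 40 numbered shrubs along a trail, measure 8 of them.
shrub_ids = range(1, 41)
chosen = random_sample(shrub_ids, k=8, seed=2024)
print(sorted(chosen))
```

Writing the seed into your field notes or prospectus lets a collaborator reproduce exactly the same selection later.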

Replication & Error
Replicate, Not Replicant

How much replication is needed?
More is better… and what is needed generally depends on the variation in the data set, i.e., how much experimental error is in the system. Error in this case is all of the unaccounted-for variation within an experiment/study. Some of this is likely natural variation based on the organism or study system… but experimental error can also be used as a catch-all to include actual mistakes made by the observer (miscounting seeds in a dish). Increasing replication generally improves the precision of your estimates. Increasing replication increases statistical power in nearly all cases. Increasing replication also increases the expense of an experiment and is costly in terms of time. A statistical rule of thumb I learned was that n = 30 replicates is a good minimum; however, that number is impossible in many studies.
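One way to see why more replication helps is the standard error of the mean, which is the sample standard deviation divided by the square root of n. The sketch below is illustrative only: the simulated "seed count" population (mean 50, standard deviation 10) and the sample sizes are made-up assumptions, not data from any study.

```python
import random
import statistics

def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / len(values) ** 0.5

def expected_se(sd, n):
    """Standard error you would expect if the true SD were known."""
    return sd / n ** 0.5

rng = random.Random(1)
# Hypothetical population of seed counts (assumed mean 50, SD 10).
population = [rng.gauss(50, 10) for _ in range(10_000)]

for n in (5, 30, 100):
    sample = rng.sample(population, n)
    print(
        f"n={n:3d}  sample mean={statistics.mean(sample):5.1f}  "
        f"observed SE={standard_error(sample):4.2f}  "
        f"expected SE={expected_se(10, n):4.2f}"
    )
```

Because the standard error falls with the square root of n, quadrupling the replication only halves it, which is part of why replication gets expensive quickly.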

Response Variable
This is what you are measuring. It is crucial that at the very start of the experiment the scientists figure out what this is, precisely, and that the study is designed based on this observation. Without understanding the response variable, you cannot properly design the experiment… you cannot figure out how to set up randomization and/or replication! What are you measuring, SPECIFICALLY, and what is the expected variation? If this is an experiment, you will be applying treatments. The treatments must be constructed based on the response variable, and replication and randomization must be set based on the Experimental Unit.

Experimental Unit

Here we have 4 treatments (colored rectangles), and 10 plots within each. What is the experimental unit? What is the replication?

Pseudoreplication!!

Here we have 4 treatments (colored rectangles), each repeated 6 times, with 10 plots within each rectangle.
What is the experimental unit? What is the replication?
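A small sketch can make the bookkeeping concrete. It assumes, as the figure descriptions suggest but do not state outright, that each treatment is applied to a whole rectangle, so the rectangle is the experimental unit and the 10 plots inside it are subsamples. Under that assumption the second design has n = 6 per treatment (not 60), and the first design has n = 1. The measurements below are simulated and the names (`unit_means`, `data`) are hypothetical.

```python
import random
import statistics

rng = random.Random(7)
treatments = ["A", "B", "C", "D"]
REPLICATES = 6  # rectangles per treatment (assumed experimental units)
PLOTS = 10      # plots measured inside each rectangle (subsamples)

# Made-up data: data[treatment] is a list of 6 rectangles,
# each rectangle a list of 10 plot-level measurements.
data = {
    t: [[rng.gauss(20, 3) for _ in range(PLOTS)] for _ in range(REPLICATES)]
    for t in treatments
}

def unit_means(plots_by_unit):
    """Collapse the subsamples to one value per experimental unit."""
    return [statistics.mean(plots) for plots in plots_by_unit]

for t in treatments:
    means = unit_means(data[t])
    # len(means) is the replication that an analysis should use.
    print(t, "n =", len(means), "mean =", round(statistics.mean(means), 2))
```

Treating all 60 plots per treatment as independent replicates would overstate the replication, which is the pseudoreplication problem flagged above.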

Statistical Hypotheses (& testing)
Good Hypotheses. Note that, following the principle of Karl Popper, a hypothesis is only a good (valid) hypothesis if it can be falsified.

Statistical Hypotheses (& testing)
Null Hypothesis. A null hypothesis basically says “nothing is happening.” A null hypothesis about two means would be that the means are the same.
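One way to put a null hypothesis about two means to the test is a permutation test: if “nothing is happening,” the group labels are arbitrary, so shuffling them should often produce a difference in means as large as the one observed. This is a sketch of one possible technique, not necessarily the test used in this course. The function name, the germination counts, and the number of permutations are all made up for illustration.

```python
import random

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign the group labels at random
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_perm + 1)

# Hypothetical germination counts per dish.
control = [12, 15, 11, 14, 13, 16]
treated = [18, 21, 17, 20, 19, 22]
p_value = permutation_test(control, treated)
print("p =", round(p_value, 4))
```

A small p-value means a difference this large would rarely arise from label shuffling alone, which is evidence against the null hypothesis that the means are the same.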

Treatment and Control
An experimental treatment is a change/activity/impact that is intentionally enacted on a subset of the experimental units. Common examples would be, in ecology, application of an exudate on seeds to test germination response, or, in medicine, giving a particular experimental drug to a set of patients. In almost all cases an experiment needs to have a control as well: a set of experimental units where the treatment is withheld. The control needs to be carefully considered when establishing an experiment and is a crucial part of most designs.
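Deciding which experimental units receive the treatment and which serve as controls is itself a place for a formal randomization step. The sketch below shows one simple way to do it, assuming a completely randomized design with two equal-sized groups. The function name `assign_groups`, the dish labels, and the seed are hypothetical.

```python
import random
from collections import Counter

def assign_groups(units, groups=("treatment", "control"), seed=None):
    """Randomly assign each unit to a group, keeping group sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    # Deal the shuffled units out in turn so group sizes differ by at most one.
    return {unit: groups[i % len(groups)] for i, unit in enumerate(shuffled)}

# Hypothetical example: 20 petri dishes of seeds.
dishes = [f"dish_{i:02d}" for i in range(1, 21)]
assignment = assign_groups(dishes, seed=42)

print(Counter(assignment.values()))
for dish in dishes[:5]:
    print(dish, "->", assignment[dish])
```

The same function extends to more than two groups by passing a longer `groups` tuple, for example several treatment levels plus a control.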
