1
Introduction Osborn
2
“Legal” Science Daubert is a benchmark!!!:
Daubert (1993): Judges are the “gatekeepers” of scientific evidence and must determine whether the science is reliable:
  Has empirical testing been done? (Falsifiability)
  Has the science been subject to peer review?
  Are there known error rates?
  Is there general acceptance? (Essentially the Frye standard, 1923)
The federal government and 26(-ish) states are “Daubert states”.
3
Measurement and Randomness
Any time an observation is made, one is making a “measurement”.
Experimental error is inherent in every measurement.
  It refers to variation in observations between repetitions of the same experiment.
  It is unavoidable, and many sources contribute.
  Error in a statistical context is a technical term (BHH).
4
Measurement and Randomness
Experimental error is a form of randomness.
Randomness: inherent unpredictability in a process.
  The outcomes of the process follow a probability distribution.
Statistical tools are used to both:
  Describe the randomness
  Make inferences taking into account the randomness
Careful! Bad data, assumptions and models lead to garbage (GIGO: garbage in, garbage out).
5
Probability
Frequency: ratio of the number of observations of interest (ni) to the total number of observations (N).
Probability (frequentist): frequency of observation i in the limit of a very large number of observations.
  We will almost always use this definition.
  It is EMPIRICAL!
6
Frequency
Roll a “fair” die 20 times (N = 20). What is the frequency of obtaining 2 (n2 = ?)?
Let’s do this with simulation (Monte Carlo) in R.
Result: n2 = 2, so freq2 = 2/20 = 0.1
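The R code from this slide did not survive in the transcript. A minimal sketch of such a simulation, assuming the base R function sample is used for the rolls, might look like the following. The seed and the variable names (rolls, N.big, freq2.big) are illustrative assumptions, so the counts from a run will generally differ from the slide's n2 = 2.

```r
# Monte Carlo sketch: roll a fair six-sided die N times and count the 2s.
set.seed(1)  # assumed seed, only so that a run can be repeated

N <- 20
rolls <- sample(1:6, size = N, replace = TRUE)  # N independent fair rolls
n2 <- sum(rolls == 2)                           # number of rolls showing a 2
freq2 <- n2 / N                                 # frequency = n2 / N
print(rolls)
print(n2)
print(freq2)

# Frequentist probability: the same frequency for a very large N.
N.big <- 100000
rolls.big <- sample(1:6, size = N.big, replace = TRUE)
freq2.big <- sum(rolls.big == 2) / N.big
print(freq2.big)  # should land close to 1/6
```

With only 20 rolls the frequency jumps around from run to run; with a large N it settles near 1/6, which is the frequentist definition of probability from the previous slide.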
7
What is Statistics?
Study of relationships in data.
Descriptive statistics: techniques to summarize data.
  E.g. mean, median, mode, range, standard deviation, stem-and-leaf plots, histograms, box-and-whisker plots, etc.
Inferential statistics: techniques to draw conclusions from a given data set, taking into account inherent randomness.
  E.g. confidence intervals, hypothesis testing, Bayes’ theorem, forecasting, etc.
8
Data
Random variables: all measurements have an associated “randomness” component.
Randomness: patternless, unstructured, typical, total ignorance (Chaitin, Claude).
Any experiment/observation recorded is a random variate.
9
T.I.C. of Gasoline
10
Observations from the T.I.C.
GC-MS instrument output for a gasoline:
11
Population and Sample
Almost all of statistics is based on a sample drawn from a population.
Population: the totality of observations that might occur as a result of repeatedly performing an experiment.
Why not measure the whole population?
  Usually impossible
  Likely wasteful
The population should be relevant. Choosing it is:
  Part logic
  Part guess
  Part philosophy…
12
Data and Sampling
Sample representations:
[Figure: a representative sample drawn from a population, contrasted with biased samples drawn from a population]
13
Parameters and Statistics
Parameter: any function of the population.
Statistic: any function of a sample from the population.
  Statistics are used to estimate population parameters.
  Statistics can be biased or unbiased.
  The sample average is an unbiased estimator of the population mean.
We may construct distributions for statistics:
  Populations have distributions for observations.
  Samples have distributions for observations and statistics.
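The claim that the sample average is an unbiased estimator can be illustrated with a small simulation. This is only a sketch: the population (10,000 normal draws with mean 10 and standard deviation 3), the sample size of 25, and all variable names are made-up assumptions for illustration.

```r
# Sketch: the sample average as an unbiased estimator of the population mean.
set.seed(2)  # assumed seed, only for repeatability

# A hypothetical, fully known population (normally impossible in practice)
population <- rnorm(10000, mean = 10, sd = 3)
mu <- mean(population)  # the parameter: a function of the population

# Draw many samples of size 25 and compute the statistic for each one
sample.means <- replicate(2000, mean(sample(population, size = 25)))

print(mu)
print(mean(sample.means))  # the average of the statistic sits close to mu
print(sd(sample.means))    # spread of the statistic's own distribution
```

Each individual sample average misses mu by some amount, but the misses average out. The vector sample.means is an empirical picture of the distribution of a statistic.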
14
What is R? R: a powerful platform for statistical analysis
Why bother learning R?
  Basic graphing
  Basic data summary and analysis tools
  Basic statistical inference tools
We will learn R and RStudio:
  Getting help
  Basic input/output and calculating
  Visualizing with graphing
15
Finding our way around R/RStudio
[Screenshot of RStudio with the script window and the command line labeled]
16
Handy Commands: Basic Input and Output
Numeric input: x <- 4
Text (character) input: x <- "text goes in quotes"
Variables store information; <- is the assignment operator.
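A short sketch of the input/output commands above, as they might be typed at the command line. The second variable name y is an assumption added here so that both values can be kept at once. Note that R needs plain straight quotes; the curly quotes that presentation software substitutes will cause a syntax error.

```r
# Basic input: store values in variables with the assignment operator <-
x <- 4                      # numeric input
y <- "text goes in quotes"  # text (character) input, straight quotes only

# Basic output and calculating
print(x)
print(y)
class(x)   # "numeric"
class(y)   # "character"
x * 2 + 1  # variables can be used in calculations
```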
17
Handy Commands: Get help on an R command
If you know the name: ?command_name
  ?plot brings up the HTML help page for the plot command.
If you don’t know the name:
  Use Google (my favorite)
  ??keyword
18
Handy Commands: R is driven by functions
func(argument1, argument2)
  func is the function name.
  Input to the function goes in parentheses.
x <- func(arg1, arg2)
  The function returns something, which gets dumped into x.
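As a concrete instance of the func(arg1, arg2) pattern, here is a small sketch using two base R functions, seq and mean. The choice of functions and the variable name m are illustrative assumptions.

```r
# seq is the function name; 2, 10 and by = 2 are its arguments.
# Whatever seq returns gets dumped into x.
x <- seq(2, 10, by = 2)
print(x)   # 2 4 6 8 10

# The output of one function can be the input to another.
m <- mean(x)
print(m)   # 6
```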
19
Handy Commands: Matrices and user-defined functions
Matrices (for a matrix X):
  X[,1] returns column 1 of matrix X
  X[3,] returns row 3 of matrix X
  Handy functions for data frames and matrices: dim, nrow, ncol, rbind, cbind
User-defined function syntax:
  func.name <- function(arguments) {
    do something
    return(output)
  }
To use it: func.name(values)
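The matrix commands and the function template above can be tried on a small made-up matrix. In this sketch the matrix contents and the function name col.range are hypothetical, chosen only for illustration.

```r
# A 4 x 3 matrix; R fills it column by column
X <- matrix(1:12, nrow = 4, ncol = 3)

X[, 1]   # column 1: 1 2 3 4
X[3, ]   # row 3: 3 7 11
dim(X)   # 4 3
nrow(X)  # 4
ncol(X)  # 3

X2 <- cbind(X, 13:16)       # bind on a new column -> 4 x 4
X3 <- rbind(X, c(0, 0, 0))  # bind on a new row    -> 5 x 3

# User-defined function following the template above:
# returns max minus min for each column of a matrix M
col.range <- function(M) {
  out <- apply(M, 2, max) - apply(M, 2, min)
  return(out)
}

# To use it:
col.range(X)  # 3 3 3
```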
20
R commands not to forget for today
<- (assignment or “gets”)
? (to get help with a command)
: (range operator)
c (“collect”)
sample
seq (generate a sequence)
plot
library
install.packages (to install libraries you don’t have)
For matrices and vectors: x[,3] vs. x[3,] vs. x[,] vs. x[3,3] vs. x[] vs. x[1:3] etc…
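The differences between the indexing forms in the last item are easiest to see on a small example. The following is a sketch with made-up values; the names v, s and draw are illustrative. The plot, library and install.packages calls are left as comments because they open a graphics device or need a package name, and "somepackage" is a placeholder rather than a real package.

```r
v <- c(5, 10, 15, 20)        # c collects values into a vector
1:3                          # range operator: 1 2 3
v[1:3]                       # first three elements: 5 10 15
s <- seq(0, 1, by = 0.25)    # generate a sequence: 0 0.25 0.5 0.75 1
draw <- sample(v, size = 2)  # two elements of v picked at random

x <- matrix(1:9, nrow = 3)   # 3 x 3 matrix, filled column by column
x[, 3]   # column 3: 7 8 9
x[3, ]   # row 3: 3 6 9
x[3, 3]  # single element in row 3, column 3: 9
x[, ]    # the whole matrix
x[]      # also the whole matrix
x[1:3]   # first three elements in column order: 1 2 3

# plot(s, s^2)                     # scatter plot of s^2 against s
# install.packages("somepackage")  # placeholder name; installs a library
# library(somepackage)             # loads an installed library
```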