Typical logical expressions used in programs with if statements:
if (EXPRESSION) { COMMANDS EXECUTED WHEN TRUE } else { COMMANDS EXECUTED WHEN FALSE }
x<-4
y<-4.0
x==4.0 returns TRUE
x==y returns TRUE
y==y returns TRUE
y==4 may return TRUE, but exact equality is unreliable for floating-point numbers that result from a calculation
Better: abs(y-4)< (a small tolerance), which is TRUE when the numbers are sufficiently close
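The difference between exact equality and a tolerance test can be sketched as follows. The values 0.1, 0.2, 0.3 and the tolerance 1e-8 are assumptions chosen for illustration, not values from the slide.

```r
# Floating-point comparison: exact equality versus a tolerance test.
a <- 0.1 + 0.2
b <- 0.3

exact <- (a == b)            # FALSE: the two doubles differ in their last bits
tol   <- 1e-8                # assumed tolerance; choose one that suits the data
close <- abs(a - b) < tol    # TRUE: the difference is far smaller than tol

if (close) {
  msg <- "a and b are equal within the tolerance"
} else {
  msg <- "a and b differ"
}
print(c(exact = exact, close = close))
print(msg)

# all.equal() performs a similar tolerance-based comparison;
# wrap it in isTRUE() before using it inside if()
builtin <- isTRUE(all.equal(a, b))
```

Note that R is case-sensitive: the keyword is `if`, not `If`.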
Typical logical expressions used in programs with if statements:
x<0
x>0
x<(-1) # negative numbers must be put in (), because x<-1 would assign 1 to x
Logical combinations:
Logical AND: (A) & (B) is TRUE only when both comparisons A and B are TRUE
Logical OR: (A) | (B) is TRUE when at least one of the comparisons A and B is TRUE
Logical NOT: !(x<y)
Strings can also be compared for equivalence:
month<-"Dec"
!(month=="Jan") returns TRUE
Note: Practice the logical operations. Be careful with vectors! Inform yourself about more operators and how they behave with vector objects! The proper use requires experience!
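A small sketch of how these operators behave on a vector. The vector `x` and the thresholds 0 and 3 are made up for illustration.

```r
# Logical operators applied to a vector work element by element.
x <- c(-2, 0, 1, 5)

neg     <- x < (-1)            # TRUE FALSE FALSE FALSE
inside  <- (x > 0) & (x < 3)   # AND: FALSE FALSE TRUE FALSE
outside <- (x < 0) | (x > 3)   # OR:  TRUE FALSE FALSE TRUE
not_neg <- !(x < 0)            # NOT: FALSE TRUE TRUE TRUE

# if() expects a single TRUE or FALSE, so reduce a logical vector first
any_neg <- any(x < 0)          # TRUE: at least one element is negative
all_pos <- all(x > 0)          # FALSE: not every element is positive

if (any_neg) {
  print("x contains negative values")
} else {
  print("x has no negative values")
}

# Why the parentheses matter: without them, '<-' is read as assignment
z <- 5
z<-1                           # this assigns 1 to z, it is not a comparison
is_below <- z<(-1)             # comparison: 1 < -1 is FALSE

# String comparison
month <- "Dec"
not_jan <- !(month == "Jan")   # TRUE
```

Using `any()` or `all()` makes explicit how a vector of comparisons should be collapsed into the single value that `if()` needs.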
albany_climatology_snow.R
Changes: the filename and the variable name (better called ‘object name’)
Changes: all data and variables are derived from object ‘snow’: vector ‘time’
Changes: plot() function call: update the y-axis label
Changes: ‘res’ must be assigned the monthly mean snow data! It is used below for plotting in the function lines()
Changes: use the year information from object snow
Changes: mhelp is a new object that stores the months data, but only for the selected years within our climatological period
Changes: buffer stores the monthly mean snow data of the climatological period
Changes: snowclim is a new object that holds the calculated seasonal cycle (climatological mean)
Changes: plot() function call with adjusted y-axis label string
Changes: lines() plots the values of snowclim
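The roles of mhelp, buffer and snowclim can be sketched as below. This is not the course script: the data are synthetic, the column names year, month and value are hypothetical, and the climatological period 1981-2010 is an assumption.

```r
# Synthetic stand-in for the 'snow' object (hypothetical columns).
years <- 1971:2010
snow <- data.frame(
  year  = rep(years, each = 12),
  month = rep(1:12, times = length(years))
)
# Made-up seasonal cycle: snow in the cold months, none from May to September
cycle <- c(40, 30, 25, 5, 0, 0, 0, 0, 0, 1, 10, 30)
snow$value <- cycle[snow$month]

# Select the climatological period (assumed here: 1981-2010)
in_period <- snow$year >= 1981 & snow$year <= 2010

mhelp  <- snow$month[in_period]   # months, only for the selected years
buffer <- snow$value[in_period]   # monthly mean snow data of the period

# Seasonal cycle: mean over all Januaries, all Februaries, and so on
snowclim <- tapply(buffer, mhelp, mean)
print(snowclim)
```

A plot could then be drawn with `plot(1:12, snowclim, type = "n", xlab = "month", ylab = "snow")` followed by `lines(1:12, snowclim)`; the axis labels here are placeholders.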
albany_climatology_snow.R
Outlier, erroneous data, or record snow?
30-yr climatology: no snow from May to September (but in previous years May had snow!)
Statistical estimates are attempts to quantify the underlying true properties of the population from which a sample is drawn
Random samples are an incomplete description of the full population
The mean of a sample is an estimate of the true mean
The sample variance is likewise only an estimate of the true variance of the population (as is any other estimate, of course)
We have seen that the sample mean is the arithmetic mean of the observations
Albany Airport monthly mean temperature anomalies from the 30-yr climatology:
Sample size: 360
Mean: 0 C (degree Celsius)
Standard Deviation = C
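As a reminder of what these two numbers are, here is a minimal sketch that computes the sample mean and the sample standard deviation by hand and compares them with R's built-in functions. The five anomaly values are made up and are not the Albany data.

```r
# Sample mean and sample standard deviation, by formula and built-in.
anom <- c(-1.5, 0.5, 2.0, -1.0, 0.0)   # made-up anomalies in degrees Celsius
n <- length(anom)

m_manual <- sum(anom) / n                        # arithmetic mean
v_manual <- sum((anom - m_manual)^2) / (n - 1)   # sample variance, divisor n - 1
s_manual <- sqrt(v_manual)                       # sample standard deviation

# The built-in functions use the same definitions
m_builtin <- mean(anom)
s_builtin <- sd(anom)

print(c(mean = m_manual, sd = s_manual))
```

Anomalies taken relative to the mean of the same period average to zero by construction, which is why a mean of 0 C is expected for anomalies from the climatology.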
We have seen that the sample mean is the arithmetic mean of the observations
New incoming data: anomaly with respect to (w.r.t.) the previously estimated mean
With larger sample size each individual sample becomes less influential for updating the mean, and the estimate converges to the true mean (for independent, identically distributed (IID) samples)
This is an ‘online algorithm’
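One common way to write such an update is: new mean = old mean + (new value - old mean) / n. The sketch below assumes this form; the function name `online_mean` and the input values are hypothetical and are not taken from the course script.

```r
# Online (running) mean: update the estimate with each new observation.
online_mean <- function(x) {
  running <- numeric(length(x))
  current <- 0
  for (n in seq_along(x)) {
    # anomaly of the new value w.r.t. the previous mean, weighted by 1/n
    current <- current + (x[n] - current) / n
    running[n] <- current
  }
  running
}

x <- c(2, 4, 6, 8)            # made-up observations
running <- online_mean(x)     # 2 3 4 5

# The result matches the mean recomputed from scratch at every step
batch <- cumsum(x) / seq_along(x)
print(rbind(running, batch))
```

The weight 1/n shrinks as n grows, which is the sense in which each new sample becomes less influential.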
Created with R-program scripts/online_average.R (needs data/USW _tavg_mon_mean_ano.csv)
Histograms give an overview of the sample distribution: mean and standard deviation describe only part of the character of a sample distribution (we will learn about the skewness and tails of distributions in this course)
Albany Airport monthly mean temperature anomalies from the 30-yr climatology:
Sample size: 360
Mean: 0 C (degree Celsius)
Standard Deviation = C
Estimate an unknown mean of the random process (a “population mean”): we only have a sample with a limited number of observations
The sample is drawn randomly from the population
The larger the sample size, the better the estimate
That is, if we repeated an experiment several times, each time with a new sample of size n, the variance among the estimated means would be smaller the larger the sample size n
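The repeated-experiment idea can be tried out with a small simulation. This is a sketch under stated assumptions: the population is taken to be standard normal, each experiment is repeated 2000 times, and the function name `var_of_means` is hypothetical.

```r
# Variance among sample means for different sample sizes n.
set.seed(1)   # fixed seed so the run is reproducible

var_of_means <- function(n, reps = 2000) {
  # repeat the experiment 'reps' times, each with a fresh sample of size n
  means <- replicate(reps, mean(rnorm(n, mean = 0, sd = 1)))
  var(means)
}

sizes <- c(10, 100, 1000)
v <- sapply(sizes, var_of_means)

# For IID samples the variance of the mean is sigma^2 / n, here 1 / n
print(data.frame(n = sizes, var_of_means = v, theory = 1 / sizes))
```

The simulated variances should lie close to 1/n, so a tenfold larger sample reduces the variance of the estimated mean roughly tenfold.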