Introduction to R, Statistics, and the grammar of graphics
Thomas INGICCO
E. Delacroix, Dante et Virgile aux Enfers (Dante and Virgil in Hell)

… but above all it is a language with its own grammar, made of:

Classes – how you present your data
- Vector – series of values of 1 dimension
- Matrix – series of values of 2 dimensions
- Array – series of values of n dimensions
- Data Frame – series of values in columns
- List – series of objects
- Table – contingency table

# Array
ee <- array(1:4, dim=c(2, 3, 2))   # 1:4 is recycled to fill the 2 x 3 x 2 array
ee <- array(1:4, c(2, 3, 2))       # same call, dim matched by position
Data3d1 <- matrix(c(0.72,100.32,0.75,100.36,0.77,100.32,0.81,100.32,0.77,100.29,0.77,100.24,0.73,100.28,0.7,100.26,0.7,100.3,0.67,100.33), 10, 2, byrow=TRUE)
colnames(Data3d1) <- c("x", "y")
rownames(Data3d1) <- paste("Lan", 1:10, sep="")
t(Data3d1)                         # Transposition
Data3d2 <- Data3d1
array(cbind(Data3d1, Data3d2), dim=c(10, 2, 2))   # stack the two matrices along a third dimension
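As a smaller illustration of the same stacking idea, the sketch below builds a 3 x 2 x 2 array from two matrices and indexes it. The matrices `m1` and `m2` and their values are hypothetical and chosen only to keep the output easy to read.

```r
# Two small 3 x 2 matrices (hypothetical values, for illustration only)
m1 <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE,
             dimnames = list(paste0("Lan", 1:3), c("x", "y")))
m2 <- m1 + 10

# cbind() gives a 3 x 4 matrix; array() reads it column by column,
# so the first slice is m1 and the second slice is m2
stacked <- array(cbind(m1, m2), dim = c(3, 2, 2))

dim(stacked)       # rows, columns, slices
stacked[, , 2]     # second slice: the values of m2, without dimnames
stacked[2, 1, 1]   # row 2, column 1 of the first slice
```

Note that `array()` drops the row and column names here; they can be restored through the `dimnames` argument if needed.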

# List
ff <- list(aa, bb, cc, dd)   # combine previously created objects into one list
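A list can hold objects of different classes, which is what distinguishes it from a vector or a matrix. In the sketch below, `aa`, `bb` and `cc` are hypothetical stand-ins, not the objects used in the presentation.

```r
# Hypothetical objects of three different classes
aa <- c(1.5, 2.0, 3.5)
bb <- matrix(1:4, nrow = 2)
cc <- data.frame(id = 1:2, shape = c("Round", "Flat"))

ff <- list(aa, bb, cc)                          # unnamed list
ff_named <- list(vec = aa, mat = bb, df = cc)   # named list

length(ff)                                # number of objects in the list
ff[[2]]                                   # extract the second object (the matrix)
ff_named$df$shape                         # access by name, then by column
sapply(ff, function(obj) class(obj)[1])   # first class of each element
```

Double brackets `[[ ]]` return the object itself, whereas single brackets `[ ]` return a shorter list.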

# Table
hh <- table(gg)                # counts of each value of gg
hh <- table(gg, dd[1:6,11])    # cross-tabulation of gg against column 11 of dd

hhh <- data.frame(gg, dd[1:6,11])
colnames(hhh) <- c("gg","Lip")         # Rename the columns
hhhh <- table(hhh)
data.frame(gg, na.omit(dd[1:6,11]))    # Function na.omit: drop missing values
data.frame(gg, na.omit(dd[1:7,11]))
dim(hhhh)                              # Number of rows and columns
dimnames(hhhh)                         # Names of the rows and columns

margin.table(hhhh)          # Calculate the margins (grand total)
margin.table(hhhh, 1)       # Row totals
margin.table(hhhh, 2)       # Column totals
hhhh[3,] <- c(1000,2000)    # Replace row 3
cbind(hhhh,hhh)             # Concatenate the columns of two tables
t(hhhh)                     # Transposition
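The table functions above can be tried on a self-contained example. The `lip` variable below is hypothetical (the presentation takes it from column 11 of `dd`, which is not reproduced here); `gg` is the rim-shape vector used in the slides.

```r
gg  <- rep(c("Everted", "Round", "Flat"), c(1, 2, 3))
lip <- c("thick", "thin", "thin", "thick", "thick", "thin")  # hypothetical values

tab <- table(gg, lip)   # contingency table: rows = gg, columns = lip

dim(tab)                # number of rows and columns
dimnames(tab)           # category names of each dimension
margin.table(tab)       # grand total
margin.table(tab, 1)    # row totals
margin.table(tab, 2)    # column totals
t(tab)                  # transposed table
```

Categories are sorted alphabetically, so the rows appear as Everted, Flat, Round rather than in order of appearance.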

# Factor
gg <- rep(c("Everted", "Round", "Flat"), c(1,2,3))   # 1 Everted, 2 Round, 3 Flat
is.vector(gg)
is.character(gg)
gg1 <- factor(gg)                                    # convert the character vector to a factor
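To see what the conversion to a factor changes, the short sketch below inspects the same `gg` vector before and after `factor()`. It uses only base R functions.

```r
gg  <- rep(c("Everted", "Round", "Flat"), c(1, 2, 3))
gg1 <- factor(gg)

is.character(gg)   # the original vector stores text
is.factor(gg1)     # the factor stores categories
levels(gg1)        # distinct categories, sorted alphabetically
as.integer(gg1)    # internal integer codes, one per value
table(gg1)         # count of each category
```

A factor stores each value as an integer code pointing to a level, which is why functions such as `table()` and plotting functions treat it as a qualitative variable.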

Individuals (rows) and variables (columns)

Individuals   Variables
              1      …   j      …   p
1             x_11   …   x_1j   …   x_1p
…             …          …          …
i             x_i1   …   x_ij   …   x_ip
…             …          …          …
n             x_n1   …   x_nj   …   x_np

FAM      GEN        SP    ID           UNW   LNW    MTW    ATW    ANW   MDW    ADW    TLN     H      AGE
Gibbons  Hylobates  H.sp  1880_1167_D  7.11  7.74   10.99  9.26   8.16  9.42   9.59   188,31  10.37  A
Gibbons  Hylobates  H.sp  1880_1167_G  6.12  8.53   11.3   9.29   8.54  9.5    9.42   187,5   10.13  A
Gibbons  Hylobates  H.sp  1880_1170_D  6.18  9.72   10.81  8.91   7.69  8.05   8.78   177,24  8.94   A
Gibbons  Hylobates  H.sp  1880_1170_G  6.44  10.09  10.68  8.96   9.07  8.05   8.69   177,59  9.29   A
Gibbons  Hylobates  H.sp  1901_102_D   6.31  11.69  15.19  11.79  9.26  11.83  11.6   206,69  11.49  A
Gibbons  Hylobates  H.sp  1901_102_G   7.14  11.13  14.93  11.68  9.06  11.76  11.3   205,32  11.49  A

Different kinds of data

Continuous quantitative variable: length of dog calcaneum
{67.0 54.7 7.0 48.5 14.0 17.2 20.7 13.0 43.4 40.2 38.9 54.5 59.8 48.3 22.9 11.5 34.4 35.1 38.7 30.8 30.6 43.1 56.8 40.8 41.8 42.5 31.0 31.7 30.2 25.9 49.2 37.0 35.9 15.0 30.2 7.2 36.2 45.5 7.8 33.4 36.1 40.2 42.7 42.5 16.2 39.0 35.0 37.0 31.4 37.6 39.9 36.2 42.8 46.4 24.7 49.1 46.0 35.9 7.8 48.2 15.2 32.5 44.7 42.6 38.8 17.4 40.8 29.1 14.6 59.2}

Discrete quantitative variable: number of flakes per context
{1 0 3 3 0 0 1 1 0 0 1 1 0 2 2 1 0 1 0 0 1 3 0 0 0 2 0 2 5 0 0 0 0 1 1 0 0 0 1 0 0 1 4 0 2 2 1 2 2 2 1 1 0 2 0 0 1 0 4 2 0 0 2 3 1 1 1 0 0 1 0 0 2 0 0 0 2 2 0 0 1 0 2 2 0 1 0 3 3 0 2 0 2 2 3 0 3 1 0 0}

Qualitative variable: colour of the pot
{black, red, black, red, brown, brown, black, grey, red, black}
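One possible way to store these three kinds of variables in R is sketched below: a numeric vector, an integer vector and a factor. Only the first five values of the two quantitative series are used to keep the example short; the object names are assumptions made for illustration.

```r
# Continuous quantitative: first five calcaneum lengths from the series above
calcaneum <- c(67.0, 54.7, 7.0, 48.5, 14.0)

# Discrete quantitative: first five flake counts, stored as integers
flakes <- c(1L, 0L, 3L, 3L, 0L)

# Qualitative: pot colours, stored as a factor
colour <- factor(c("black", "red", "black", "red", "brown",
                   "brown", "black", "grey", "red", "black"))

class(calcaneum)     # storage class of each object
class(flakes)
class(colour)

summary(calcaneum)   # numeric summary suits a continuous variable
table(flakes)        # counts suit a discrete variable
table(colour)        # counts per category for a qualitative variable
```

The class of the object determines which summaries make sense: a mean is meaningful for `calcaneum`, but only counts are meaningful for `colour`.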

Descriptive and inferential statistics
Position parameters: Mean, Mode, Median
Dispersion parameters: Standard deviation, Variance, Maximum, Minimum, Coefficient of variation, Covariance, Coefficient of correlation

Mean: we add all the measurements and we divide by the number of measurements.
sum(DATA[1:49,6]) / length(DATA[1:49,6])   # mean computed by hand
mean(DATA[1:49,6])                         # built-in equivalent
colMeans(DATA[1:49,6:11])                  # means of columns 6 to 11 at once
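Since `DATA` is not reproduced in this transcript, the sketch below repeats the same calculation on the first ten dog calcaneum lengths listed earlier. The small data frame `df` is hypothetical and only serves to show `colMeans()`.

```r
# First ten calcaneum lengths from the continuous-variable example
x <- c(67.0, 54.7, 7.0, 48.5, 14.0, 17.2, 20.7, 13.0, 43.4, 40.2)

manual_mean <- sum(x) / length(x)   # add all values, divide by their number
manual_mean
mean(x)                             # built-in function, same result

# Column means of a small hypothetical data frame
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))
colMeans(df)
```

If the data contain missing values, `mean(x, na.rm = TRUE)` ignores them; without that argument the result is `NA`.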

Example: You are told that you have a serious illness for which the mean survival period is six months… Statistics interest you!

Mode and median:
The mode is the most frequent value.
The median splits the sample in two: half of the values lie above it and half below.
median(DATA[1:49,6])
quantile(DATA[1:49,6])   # minimum, quartiles and maximum
min(DATA[1:49,6])
max(DATA[1:49,6])
range(DATA[1:49,6])      # minimum and maximum together
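Base R has no function that returns the statistical mode (the built-in `mode()` reports the storage type of an object). One possible workaround, shown below on the first ten flake counts from the discrete-variable example, is to tabulate the values and pick the most frequent one. The names `counts` and `mode_value` are assumptions of this sketch.

```r
# First ten flake counts from the discrete-variable example
flakes <- c(1, 0, 3, 3, 0, 0, 1, 1, 0, 0)

counts <- table(flakes)                                      # frequency of each value
mode_value <- as.numeric(names(counts)[which.max(counts)])   # most frequent value
mode_value

median(flakes)     # middle of the sorted values
quantile(flakes)   # minimum, quartiles and maximum
min(flakes)
max(flakes)
range(flakes)      # minimum and maximum together
```

When several values tie for the highest frequency, `which.max()` keeps only the first one, so this sketch reports a single mode.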

Variance:
- We calculate the difference between every value and the mean
- We square this difference
- We sum the squared differences
- And we divide by the number of values
The variance is the mean of the squared differences from the mean.

Standard deviation:
The standard deviation is the square root of the variance.
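The variance steps and the square-root relationship can be checked by hand, as sketched below on the first ten calcaneum lengths. One point worth knowing: R's `var()` and `sd()` divide by n - 1 (sample estimates), not by n as in the definition given here, so both versions are computed.

```r
# First ten calcaneum lengths from the continuous-variable example
x <- c(67.0, 54.7, 7.0, 48.5, 14.0, 17.2, 20.7, 13.0, 43.4, 40.2)
n <- length(x)

deviations <- x - mean(x)      # difference between every value and the mean
squared    <- deviations^2     # squared differences
total      <- sum(squared)     # sum of the squared differences

var_n  <- total / n            # divide by the number of values (definition above)
var_n1 <- total / (n - 1)      # divide by n - 1, which is what var() does

var(x)                         # matches var_n1, not var_n
sqrt(var_n1)                   # standard deviation by hand
sd(x)                          # built-in equivalent
```

The two variances differ by the factor (n - 1) / n, which becomes negligible for large samples.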

Coefficient of variation:
The coefficient of variation is the standard deviation divided by the mean: it expresses the dispersion relative to the scale of the variable.
This makes it possible to compare the dispersion of two variables.
Problem: when the mean is close to zero, it becomes useless.
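Base R has no dedicated function for the coefficient of variation, so the sketch below defines a small hypothetical helper, `cv()`. The two vectors are the first and eighth measurement columns of the six gibbon rows shown earlier, with decimal commas converted to points.

```r
# Hypothetical helper: standard deviation relative to the mean
cv <- function(v) sd(v) / mean(v)

small <- c(7.11, 6.12, 6.18, 6.44, 6.31, 7.14)              # values around 6 to 7
large <- c(188.31, 187.5, 177.24, 177.59, 206.69, 205.32)   # values around 180 to 200

sd(small); sd(large)   # standard deviations are on different scales
cv(small); cv(large)   # coefficients of variation are comparable

cv(small * 10)         # changing the unit leaves the coefficient unchanged

centred <- small - mean(small)   # mean is now (numerically) zero
cv(centred)                      # enormous or infinite: no longer interpretable
```

The last line illustrates the problem stated above: dividing by a mean close to zero makes the ratio meaningless.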

Centred-reduced variable:
To measure the difference from the mean in units of standard deviation, we subtract the mean from each value and divide by the standard deviation: z = (x - mean) / standard deviation.
This is the centred-reduced (standardized) variable, with mean = 0 and variance = 1.
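A brief sketch of the centred-reduced variable, computed by hand and with the base R function `scale()`, on the first ten calcaneum lengths:

```r
# First ten calcaneum lengths from the continuous-variable example
x <- c(67.0, 54.7, 7.0, 48.5, 14.0, 17.2, 20.7, 13.0, 43.4, 40.2)

z <- (x - mean(x)) / sd(x)     # centre on the mean, reduce by the standard deviation

mean(z)                        # zero, up to floating-point rounding
var(z)                         # one
z_scale <- as.numeric(scale(x))   # scale() returns a one-column matrix
all.equal(z, z_scale)          # same values as the manual calculation
```

`scale()` uses the same n - 1 standard deviation as `sd()`, which is why the two results agree.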

Covariance:
Covariance measures the degree of dependence between two variables: do the values of the two measurements drift away from the centre of gravity independently, or do they drift away together?
If x and y are independent, then the covariance is equal to 0.
- We multiply each x-deviation from the mean by its associated y-deviation
- We sum these products
- We divide by the number of values
So the covariance is the mean of the cross-products of the deviations.

Covariance in R:
( sum(DATA[1:49,6] * DATA[1:49,7]) - prod(sum(DATA[1:49,6]), sum(DATA[1:49,7])) / length(DATA[1:49,6]) ) / ( length(DATA[1:49,6]) - 1 )   # shortcut formula; divides by n - 1
cov(DATA[1:49,6], DATA[1:49,7])   # built-in equivalent (lower-case cov; R is case-sensitive)
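The covariance steps can be followed one at a time on a small sample. The sketch below uses the first two measurement columns of the six gibbon rows shown earlier; it computes the deviation-based version, the shortcut formula, and the built-in `cov()`, which divides by n - 1 rather than n.

```r
# First two measurement columns of the six gibbon rows
x <- c(7.11, 6.12, 6.18, 6.44, 6.31, 7.14)
y <- c(7.74, 8.53, 9.72, 10.09, 11.69, 11.13)
n <- length(x)

cross <- (x - mean(x)) * (y - mean(y))   # product of paired deviations
cov_n  <- sum(cross) / n                 # divide by the number of values
cov_n1 <- sum(cross) / (n - 1)           # divide by n - 1, as cov() does

# Shortcut formula: avoids computing the deviations explicitly
shortcut <- (sum(x * y) - sum(x) * sum(y) / n) / (n - 1)

cov(x, y)                                # matches cov_n1 and shortcut
```

The sign of the result tells whether the two variables tend to move in the same direction (positive) or in opposite directions (negative); its size depends on the units of both variables.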

Coefficient of correlation:
Pearson’s correlation coefficient differs from the covariance in that it has no unit and is bounded between -1 and 1.
cov(DATA[1:49,6],DATA[1:49,7]) / (sd(DATA[1:49,6]) * sd(DATA[1:49,7]))   # covariance divided by the product of the standard deviations
cor(DATA[1:49,6],DATA[1:49,7])   # built-in equivalent (lower-case cor; R is case-sensitive)
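The absence of unit can be verified directly: rescaling one variable changes the covariance but not the correlation. The sketch below reuses the first two measurement columns of the six gibbon rows; the rescaling `10 * y + 3` is an arbitrary choice for illustration.

```r
# First two measurement columns of the six gibbon rows
x <- c(7.11, 6.12, 6.18, 6.44, 6.31, 7.14)
y <- c(7.74, 8.53, 9.72, 10.09, 11.69, 11.13)

r_manual <- cov(x, y) / (sd(x) * sd(y))   # covariance scaled by both standard deviations
r_manual
cor(x, y)                                 # built-in Pearson correlation

y_rescaled <- 10 * y + 3                  # change of unit and origin
cov(x, y_rescaled)                        # covariance is multiplied by 10
cor(x, y_rescaled)                        # correlation is unchanged
```

Because the n - 1 factors cancel in the ratio, the correlation is the same whether the covariance and standard deviations are computed with n or with n - 1.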
