What factors are most responsible for height?
Model Specification
Outcome = (Model) + Error
Error? Measurement error, model error, error analysis, unexplained, unknown, unaccounted for, missing variables
Analytics & History: The First “Regression Line”
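The decomposition Outcome = (Model) + Error can be checked directly on a fitted regression. The sketch below is only an illustration: the data are simulated, and the variable names and coefficients are assumptions, not values from the slides.

```r
# Simulated stand-in for the height data (assumed values, not the real data set).
set.seed(1)
midparentHeight <- rnorm(200, mean = 69, sd = 1.8)
childHeight <- 22 + 0.64 * midparentHeight + rnorm(200, sd = 3.4)

# Fit a simple linear model: childHeight explained by midparentHeight.
fit <- lm(childHeight ~ midparentHeight)

model_part <- fitted(fit)     # the "(Model)" term
error_part <- residuals(fit)  # the "Error" term

# Outcome = (Model) + Error holds for every observation.
all.equal(unname(model_part + error_part), childHeight)
```

Everything the model does not capture (measurement error, missing variables, and so on) ends up in `error_part`.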
Men's average height 'up 11cm since 1870s'
Galton’s Notebook on Families & Height
X1, X2, X3, X4, X5 → Y
We find that a 54-loci genomic profile explained 4–6% of the sex- and age-adjusted height variance.
The Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance.
> getwd()
[1] "C:/Users/johnp_000/Documents"
> setwd()
Dataset Input: Function, Filename, Object
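One reading of these labels is the pattern object <- function(filename). The sketch below assumes `read.csv()` as the input function. The file is a throwaway temporary CSV with made-up rows, written only so the example runs on its own.

```r
# Hypothetical file: write a tiny CSV to a temporary path so the example is runnable.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("gender,childHeight", "male,73.2", "female,69.2", "female,69.0"),
  csv_path
)

# Object <- Function(Filename)
heights <- read.csv(csv_path)

nrow(heights)   # number of rows read from the file
names(heights)  # column names taken from the header line
```

With a real file, `csv_path` would be a path relative to the working directory reported by `getwd()`.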
Data Types: Numbers and Factors/Categorical str() summary()
head() summary()
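A minimal sketch of how `str()`, `summary()`, and `head()` report numeric and factor columns. The small data frame is invented for illustration and only borrows the `childHeight` column name from the slides.

```r
# Made-up data: one factor (categorical) column and one numeric column.
heights <- data.frame(
  gender = factor(c("male", "female", "female", "male")),
  childHeight = c(73.2, 69.2, 69.0, 70.5)
)

str(heights)      # type of each column: Factor vs. num
summary(heights)  # counts per level for factors; quartiles and mean for numbers
head(heights, 2)  # first two rows of the data frame
```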
[Figure: choosing a plot and a regression model by variable type. Predictor variable (X-axis): continuous (e.g. Parents' Height) or categorical (e.g. Gender). Outcome, dependent variable (Y-axis): continuous (e.g. Child's Height) or categorical (e.g. Smartphone? Yes or No). Plots shown: histogram, scatter, bar, pie, boxplot, mosaic, cross table. Regression models shown: linear regression, logistic regression.]
Frequency Distribution, Histogram hist(heights$childHeight)
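A hedged sketch of what `hist()` computes behind the plot. The data frame is simulated (the mean and standard deviation passed to `rnorm()` are assumptions), so the bin counts will not match the real data.

```r
# Simulated stand-in for heights$childHeight (assumed mean and SD).
set.seed(1)
heights <- data.frame(childHeight = rnorm(500, mean = 66.5, sd = 3.6))

# hist() draws the plot and also returns the bins it used.
h_info <- hist(heights$childHeight, main = "Child height", xlab = "Height")

h_info$breaks  # bin edges
h_info$counts  # frequency (number of observations) in each bin
```

The counts always add up to the number of observations, which is what makes this a frequency distribution.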
Standard Deviation Mean
Deviation between mean and an actual data point. Calculating Standard Deviation - sd()
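To make the link between deviations and `sd()` concrete, here is a small worked sketch. The five values are made up for easy arithmetic.

```r
# Made-up sample of five heights.
x <- c(60, 64, 66, 68, 72)

# Deviation of each data point from the mean.
deviations <- x - mean(x)

# Sample standard deviation: square the deviations, sum them,
# divide by n - 1, then take the square root.
manual_sd <- sqrt(sum(deviations^2) / (length(x) - 1))

manual_sd
sd(x)  # built-in function; agrees with the manual calculation
```

Note that `sd()` divides by n - 1 (the sample standard deviation), not by n.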
Normal Distribution and SD
Mean = 66.5, S.D. = (value missing in source)
[Table with columns SD, Pct., Z-score, Heights; surviving entries: 90%, 73.6, 59.4]
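A sketch of how percentages, z-scores, and height ranges relate under a normal distribution. The mean of 66.5 is from the slide. The standard deviation of 3.6 and the choice of 90%, 95%, and 99% as the central percentages are assumptions.

```r
m <- 66.5  # mean from the slide
s <- 3.6   # ASSUMED standard deviation (not recoverable from the slide)

# Central percentages and their two-sided z-scores.
pct <- c(0.90, 0.95, 0.99)
z <- qnorm(1 - (1 - pct) / 2)

# Height range containing each central percentage of the distribution.
lower <- m - z * s
upper <- m + z * s

data.frame(pct = pct, z = round(z, 3),
           lower = round(lower, 1), upper = round(upper, 1))
```

With this assumed SD, the 95% row comes out near 59.4 and 73.6, the two heights that survive on the slide.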
Density Plot (Area = 1)
plot(density(h$childHeight))
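The "Area = 1" property can be checked numerically. This sketch uses simulated data as a stand-in for `h$childHeight` (the mean and SD are assumptions) and integrates the estimated density with the trapezoid rule.

```r
# Simulated stand-in for h$childHeight (assumed mean and SD).
set.seed(1)
h <- data.frame(childHeight = rnorm(500, mean = 66.5, sd = 3.6))

# Kernel density estimate: d$x is the grid, d$y the estimated density.
d <- density(h$childHeight)
plot(d, main = "Density of child height")

# Trapezoid rule: sum of (bin width) * (average height of the two edges).
area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
area  # very close to 1
```

Because the total area is 1, the y-axis shows density rather than counts, and the area over a range of heights is the estimated share of observations in that range.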
Mode, Bimodal
hist(h$childHeight, freq = FALSE, breaks = 25, ylim = c(0, 0.14))
curve(dnorm(x, mean = mean(h$childHeight), sd = sd(h$childHeight)), col = "red", add = TRUE)
Bimodal: two modes
ggplot2