Published by Sylvia Parsons. Modified over 9 years ago.
1
Research Question: What determines a person’s height?
2
Hypothesis Brainstorming
Factors: Genetics; Nutrition; Immigration / Origins; Disease
Hypotheses: Sons will be similar in height to their Dad. Daughters will be similar in height to their Mom.
3
Literature Review: Article #1
Francis Galton: Invented Regression
When Mid-Parents are taller than mediocrity, their Children tend to be shorter than they.
When Mid-Parents are shorter than mediocrity, their Children tend to be taller than they.
4
Literature Review: Article #2
Variables: Genes; First two years of life; Illnesses; Infant mortality rates; Smaller families; Higher income; Better education
5
Literature Review: Article #3
“we find that a 54-loci genomic profile explained 4–6% of the sex- and age-adjusted height variance”
“the Galtonian mid-parental prediction method explained 40% of the sex- and age-adjusted height variance”
6
Literature Review: Summary
Variable: Galton / Hatton / Aulchenko
Height: Individuals / Country Average / Individuals
Gender: Men and Women / Men Only / Men and Women
Age: Individuals Countries
Infant Mortality: Country Average
GDP: Country Average
Family Size: Country Average
Time: X
Genome: Individuals
Observations: ~1,000 / 550 / 5,478
7
Variables
Dependent Variable: Y (Height)
Independent Variables: X’s (X1, X2, X3, X4)
8
Height Dataset Variables
heights <- read.csv("GaltonFamilies.csv")
9
Data Types: Numbers and Factors/Categorical Dataset Variables: Type
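To check which variables are numbers and which are categorical, a minimal sketch follows. It does not read GaltonFamilies.csv; it builds a small simulated stand-in whose values are made up for illustration. The column names father, mother and midparentHeight, and the midparent formula, are assumptions about the file's layout. Since R 4.0, read.csv keeps text columns as character by default, so gender may need an explicit factor() call.

```r
# Simulated stand-in for the CSV; values are illustrative, not Galton's data.
set.seed(1)
n <- 200
heights <- data.frame(
  father = rnorm(n, mean = 69, sd = 2.5),
  mother = rnorm(n, mean = 64, sd = 2.3),
  gender = sample(c("female", "male"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
# Assumed midparent definition: mother's height scaled by 1.08 before averaging.
heights$midparentHeight <- (heights$father + 1.08 * heights$mother) / 2
heights$childHeight <- 18 + 0.7 * heights$midparentHeight +
  ifelse(heights$gender == "male", 5, 0) + rnorm(n, sd = 2)

# gender arrives as character here; convert it to a factor (categorical).
heights$gender <- factor(heights$gender)

var_types <- sapply(heights, class)  # one class name per column
print(var_types)
str(heights)
```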
10
Summary Statistics
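Summary statistics of this kind can be produced with base R. The sketch below uses simulated child heights (illustrative values only, not the dataset from the slides) and also reports the mean by gender, since the two groups are compared later.

```r
# Simulated stand-in data; the group means 64 and 69 are made-up values.
set.seed(1)
n <- 200
gender <- factor(sample(c("female", "male"), n, replace = TRUE))
childHeight <- ifelse(gender == "male", 69, 64) + rnorm(n, sd = 2.5)
heights <- data.frame(gender, childHeight)

print(summary(heights))  # min, quartiles, mean, max; counts for the factor

# Mean child height within each gender.
by_gender <- aggregate(childHeight ~ gender, data = heights, FUN = mean)
print(by_gender)

overall_sd <- sd(heights$childHeight)  # spread of the pooled heights
print(overall_sd)
```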
11
Frequency Distribution, Histogram
hist(heights$childHeight)
12
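Mode, Bimodal
Bimodal: two modes
# h is assumed to be the same data frame as heights; the slides use both names
hist(h$childHeight, freq = FALSE, breaks = 25, ylim = c(0, 0.14))
# overlay a normal density with the sample mean and standard deviation
curve(dnorm(x, mean = mean(h$childHeight), sd = sd(h$childHeight)), col = "red", add = TRUE)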
13
Q-Q Plot
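A normal Q-Q plot can be drawn with base R's qqnorm and qqline. The sketch below uses simulated heights built as a mix of two normal groups (made-up values, not the slide's data), which is one way a bimodal sample can arise.

```r
# Simulated child heights: two normal groups pooled together (illustrative).
set.seed(1)
childHeight <- c(rnorm(100, mean = 64, sd = 2.4),
                 rnorm(100, mean = 69, sd = 2.6))

# Sample quantiles against theoretical normal quantiles.
qq <- qqnorm(childHeight, main = "Normal Q-Q Plot: childHeight")
qqline(childHeight, col = "red")  # reference line through the quartiles
```

Points that bend away from the red line indicate departures from normality.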
14
Correlation Matrix for Continuous Variables
chart.Correlation(num2)  # from the PerformanceAnalytics package
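The slides do not show how num2 was built. One plausible construction, sketched here as an assumption, keeps only the numeric columns of the data frame. The data are simulated stand-ins with assumed column names, and the base-R functions cor and pairs are used so the example runs without extra packages.

```r
# Simulated stand-in data; values and column names are illustrative.
set.seed(1)
n <- 200
heights <- data.frame(
  father = rnorm(n, mean = 69, sd = 2.5),
  mother = rnorm(n, mean = 64, sd = 2.3),
  gender = factor(sample(c("female", "male"), n, replace = TRUE))
)
heights$childHeight <- 22 + 0.36 * heights$father + 0.29 * heights$mother +
  ifelse(heights$gender == "male", 5, 0) + rnorm(n, sd = 2)

# Assumption: num2 holds only the continuous (numeric) variables.
num2 <- heights[sapply(heights, is.numeric)]

cor_mat <- cor(num2)      # Pearson correlation matrix
print(round(cor_mat, 2))
pairs(num2)               # base-R scatterplot matrix

# With PerformanceAnalytics installed, the slide's call would be:
# chart.Correlation(num2)
```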
15
Correlation Matrix: Both Types
library(car)
scatterplotMatrix(heights)
Zoom in on Gender
16
Categorical: Revisit Box Plot
Note there is an equation here: Y = mx + b
Correlation will depend on the spread of the distributions
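The Y = mx + b reading of a box plot can be made concrete: when a two-level factor is coded 0/1, the fitted intercept b is the mean of the baseline group and the slope m is the difference between the group means. The sketch below checks this on simulated data (made-up values, with female as the baseline level).

```r
# Simulated stand-in data; the group means 64 and 69 are made-up values.
set.seed(1)
n <- 200
gender <- factor(sample(c("female", "male"), n, replace = TRUE))
childHeight <- ifelse(gender == "male", 69, 64) + rnorm(n, sd = 2.5)
h <- data.frame(gender, childHeight)

fit <- lm(childHeight ~ gender, data = h)
b <- coef(fit)[["(Intercept)"]]  # mean height of the baseline level, female
m <- coef(fit)[["gendermale"]]   # male mean minus female mean

group_means <- tapply(h$childHeight, h$gender, mean)
print(c(b = b, m = m))
print(group_means)

boxplot(childHeight ~ gender, data = h)

# Correlation with the 0/1 dummy: depends on the gap m relative to the spread.
r <- cor(as.numeric(h$gender == "male"), h$childHeight)
print(r)
```

For the same gap between the group means, wider within-group spread gives a smaller correlation.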
17
Children Height by Gender
18
Linear Regression: Model 1 Child’s Height = f(Father’s Height)
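A minimal sketch of fitting such a model with lm follows. The data are simulated, the column name father and the object name model.1 are assumptions, and the printed coefficients will not match the slide's values. The sketch also shows that, for a single predictor, R-squared equals the squared correlation.

```r
# Simulated stand-in data; the coefficients 40 and 0.4 are made-up values.
set.seed(1)
n <- 200
father <- rnorm(n, mean = 69, sd = 2.5)
childHeight <- 40 + 0.4 * father + rnorm(n, sd = 3.4)
h <- data.frame(father, childHeight)

# Child's Height = f(Father's Height); model.1 is a hypothetical name.
model.1 <- lm(childHeight ~ father, data = h)
print(coef(model.1))  # intercept and slope

r2 <- summary(model.1)$r.squared
r <- cor(h$father, h$childHeight)
print(c(r = r, r_squared = r^2, model_R2 = r2))

plot(childHeight ~ father, data = h)
abline(model.1, col = "red")  # fitted regression line
```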
19
Linear Regression: Model 2
Child’s Height = f(Gender)
model.5 <- lm(childHeight ~ gender, data = h)
20
Linear Regression: Additional Models
Mom; MidParent Height
21
Compare Models
Model: 121234
Intercept: 40.1, 46.6, 22.6, 22.63, 22.64
Father: 0.385, 0.36, 0.01
Mom: 0.314, 0.29, NA
midparentHeight: 0.637, 0.538
Gender:
R-squares: 0.070, 0.0395, 0.105, 0.102, 0.1033
r: 0.27, 0.2, 0.32
R^2: 0.073, 0.04, 0.102
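A comparison like this can be assembled by fitting each model and collecting its R-squared. The sketch below uses simulated data with assumed column names and an assumed midparent formula, so its numbers are illustrative and will differ from the table. It also shows one common reason a coefficient is reported as NA: the predictor is an exact linear combination of others already in the model. Whether that explains the NA in the table is an assumption.

```r
# Simulated stand-in data; values and column names are illustrative.
set.seed(1)
n <- 200
h <- data.frame(
  father = rnorm(n, mean = 69, sd = 2.5),
  mother = rnorm(n, mean = 64, sd = 2.3),
  gender = factor(sample(c("female", "male"), n, replace = TRUE))
)
h$midparentHeight <- (h$father + 1.08 * h$mother) / 2  # assumed definition
h$childHeight <- 22 + 0.36 * h$father + 0.29 * h$mother +
  ifelse(h$gender == "male", 5, 0) + rnorm(n, sd = 2)

models <- list(
  father    = lm(childHeight ~ father, data = h),
  mother    = lm(childHeight ~ mother, data = h),
  both      = lm(childHeight ~ father + mother, data = h),
  midparent = lm(childHeight ~ midparentHeight, data = h),
  gender    = lm(childHeight ~ gender, data = h)
)

# One R-squared per model, named after the list entries.
r_squared <- sapply(models, function(m) summary(m)$r.squared)
print(round(r_squared, 3))

# mother is fully determined by father and midparentHeight, so lm cannot
# estimate a separate coefficient for it and reports NA.
full <- lm(childHeight ~ father + midparentHeight + mother, data = h)
print(coef(full))
```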
22
Discussion Summary
Key Findings: Gender was the biggest factor. Parents’ height played a lesser role.
Downsides: The DataSet used did not include more variables of interest. DataSet for X Country for 1877.
23
Future Research
Include More Predictor Variables: Literature review of a few articles suggests several important factors: Nutrition
Analyze a Contemporary DataSet: DataSet used was from 18??; Location Specific as Well