Presentation transcript:

Outline. Research Question: What determines height? Data Input. Look at One Variable. Compare Two Variables: Children’s Height and Parents’ Height; Children’s Height and Gender. Graphic Packages: ggplot2.

What factors are most responsible for height?

Galton’s Family Height Dataset (X1, X2, X3, Y)

Galton’s Notebook on Families & Height

> getwd() [1] "C:/Users/johnp_000/Documents" > setwd()

Dataset Input: h <- read.csv("GaltonFamilies.csv") (Object: h; Function: read.csv; Filename: "GaltonFamilies.csv")

str() summary() Data Types: Numbers and Factors/Categorical
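A minimal sketch of what str() and summary() report for numbers versus factors. The data frame `toy` and its values are hypothetical stand-ins, not rows from GaltonFamilies.csv; only the column names follow the ones used on these slides.

```r
# Toy data frame; values are hypothetical, not taken from GaltonFamilies.csv
toy <- data.frame(
  father      = c(70.0, 68.5, 72.0, 66.0),
  gender      = factor(c("male", "female", "female", "male")),
  childHeight = c(71.0, 64.5, 66.0, 68.0)
)

str(toy)      # structure: one line per column with its type (num, Factor)
summary(toy)  # numeric columns: min/quartiles/mean/max; factors: counts per level

# Programmatic check of the column types
col_types <- sapply(toy, class)
print(col_types)
```

Numeric columns get a five-number-plus-mean summary, while the factor column is summarized as counts per level.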

Steps. Child’s Height (continuous): Histogram. Dad’s Height, Mom’s Height (continuous): Scatter. Gender (categorical): Boxplot.

Frequency Distribution, Histogram: hist(h$childHeight)

Density Plot (Area = 1): plot(density(h$childHeight))
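The "Area = 1" property can be checked numerically. This is a rough sketch on simulated heights (the mean and standard deviation below are assumptions chosen for illustration, not estimates from Galton's data); it approximates the area under a kernel density estimate with the trapezoid rule and sums the bar areas of a density-scale histogram.

```r
# Simulated heights; hypothetical values, not Galton's measurements
set.seed(1)
x <- rnorm(200, mean = 67, sd = 3.5)

# Kernel density estimate: approximate the area with the trapezoid rule
d <- density(x)
density_area <- sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)

# Histogram on the density scale: each bar's area is width * height
hh <- hist(x, plot = FALSE)
hist_area <- sum(diff(hh$breaks) * hh$density)

print(c(density = density_area, histogram = hist_area))
```

The histogram bars sum to 1 by construction; the kernel estimate is only approximately 1 because it is evaluated on a finite grid.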

Mode, Bimodal
hist(h$childHeight, freq = FALSE, breaks = 25, ylim = c(0, 0.14))
curve(dnorm(x, mean = mean(h$childHeight), sd = sd(h$childHeight)), col = "red", add = TRUE)

Grammar of Graphics: Seven Components. Diagram labels: formations, Legend, Axes. ggplot2 is built using the grammar of graphics approach.

Hadley Wickham and ggplot2: Asst. Professor of Statistics at Rice University. Packages: ggplot2, plyr, reshape, rggobi, profr.

In ggplot2 a plot is made up of layers. ggplot2 Plot

ggplot2
library(ggplot2)
h.gg <- ggplot(h, aes(childHeight))
h.gg + geom_histogram(binwidth = 1) + labs(x = "Height", y = "Frequency")
h.gg + geom_density()

ggplot2
h.gg <- ggplot(h, aes(childHeight)) + theme(legend.position = "right")
h.gg + geom_density() + labs(x = "Height", y = "Density")
h.gg + geom_density(aes(fill = factor(gender)), size = 2)

Box Plot

Children’s Height vs. Gender
boxplot(childHeight ~ gender, data = h, col = c("pink", "lightblue"), main = "Children's Height by Gender", xlab = "Gender", ylab = "")

Descriptive Stats: Box Plot
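The numbers a box plot draws can be inspected directly. A small sketch, assuming a made-up vector `sample_heights` (hypothetical values): fivenum() returns the minimum, lower hinge, median, upper hinge, and maximum, and boxplot.stats() reports the same five values that boxplot() uses for the whiskers and box when there are no outliers.

```r
# Hypothetical heights in inches, chosen so the arithmetic is easy to follow
sample_heights <- c(60, 62, 64, 66, 68, 70, 72)

# Tukey's five-number summary
five <- fivenum(sample_heights)
names(five) <- c("min", "lower hinge", "median", "upper hinge", "max")
print(five)

# The same five values as used by boxplot()
box_values <- boxplot.stats(sample_heights)$stats
print(box_values)

# Height of the box: distance between the hinges
box_height <- five[["upper hinge"]] - five[["lower hinge"]]
print(box_height)
```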

Subset Males: men <- subset(h, gender == 'male')

Subset Females: women <- subset(h, gender == 'female')

Children’s Height: Males hist(men$childHeight)

Children’s Height: Females hist(women$childHeight)
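A short sketch of the subset-then-summarize pattern from the previous slides, run on a hypothetical data frame `toy` (made-up values, same column names as `h`) so that it works without the CSV file.

```r
# Hypothetical rows; not taken from GaltonFamilies.csv
toy <- data.frame(
  gender      = c("male", "female", "female", "male", "female"),
  childHeight = c(71.0, 64.5, 66.0, 68.0, 63.5)
)

# Split by gender with subset(), as on the slides
men   <- subset(toy, gender == "male")
women <- subset(toy, gender == "female")

# Compare group sizes and mean heights
print(c(n_men = nrow(men), n_women = nrow(women)))
mean_men   <- mean(men$childHeight)
mean_women <- mean(women$childHeight)
print(c(men = mean_men, women = mean_women))
```

Once the two subsets exist, hist(men$childHeight) and hist(women$childHeight) draw one histogram per group.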

ggplot2
library(ggplot2)
h.bb <- ggplot(h, aes(factor(gender), childHeight))
h.bb + geom_boxplot()
h.bb + geom_boxplot(aes(fill = factor(gender)))

Steps. Y = Child’s Height (continuous): Histogram. X1, X2 = Dad’s Height, Mom’s Height (continuous): Scatter. X3 = Gender (categorical): Boxplot.

Correlation

?cor
cor(h$father, h$childHeight)
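To make clear what cor() computes, here is a small sketch that reproduces the Pearson correlation by hand. The vectors `father` and `child` hold hypothetical values, not Galton's data.

```r
# Hypothetical paired heights in inches
father <- c(70.0, 68.5, 72.0, 66.0, 69.0)
child  <- c(71.0, 67.0, 70.5, 66.5, 69.5)

# Built-in Pearson correlation (the default method of cor())
r_builtin <- cor(father, child)

# Same quantity from the definition: co-deviation over the product of spreads
dx <- father - mean(father)
dy <- child - mean(child)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

print(c(builtin = r_builtin, manual = r_manual))
```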

Scatterplot Matrix: pairs()
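A minimal base R sketch of a scatterplot matrix with its matching correlation matrix. The data frame `toy` is hypothetical (made-up values); with the real data, the same calls would take the numeric columns of `h`.

```r
# Hypothetical numeric columns; not Galton's measurements
toy <- data.frame(
  father      = c(70.0, 68.5, 72.0, 66.0, 69.0),
  mother      = c(64.0, 63.0, 65.5, 62.0, 66.0),
  childHeight = c(71.0, 67.0, 70.5, 66.5, 69.5)
)

# One scatterplot for every pair of columns
pairs(toy)

# Pairwise Pearson correlations for the same columns
cm <- cor(toy)
print(round(cm, 2))
```

The correlation matrix is symmetric with ones on the diagonal, mirroring the layout of the panels drawn by pairs().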

Correlations Matrix
library(car)
scatterplotMatrix(heights)

ggplot2

Analytics & History: The first “Regression Line”
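For comparison with the historical plot, a modern least-squares line can be sketched with lm(). This is an illustration on hypothetical values, not a reconstruction of Galton's original procedure; the column name `midparent` is an assumption introduced here.

```r
# Hypothetical mid-parent and child heights in inches
toy <- data.frame(
  midparent   = c(66, 67, 68, 69, 70, 71),
  childHeight = c(66.5, 66.0, 68.5, 68.0, 69.5, 70.0)
)

# Fit child height on mid-parent height by ordinary least squares
fit <- lm(childHeight ~ midparent, data = toy)
slope <- unname(coef(fit)["midparent"])

# The slope equals covariance divided by the predictor's variance
slope_manual <- cov(toy$midparent, toy$childHeight) / var(toy$midparent)
print(c(lm = slope, manual = slope_manual))

# Scatterplot with the fitted line drawn on top
plot(toy$midparent, toy$childHeight, xlab = "Mid-parent height", ylab = "Child height")
abline(fit, col = "red")
```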

Steps. Child’s Height (continuous): Histogram. Dad’s Height, Mom’s Height (continuous): Scatter. Gender (categorical): Boxplot.

Appendix

What software do you use for creating charts or data visualizations? (May, 2013, N=172): .net, BIRT, cytoscape, flot, gephi, gnuplot, graphite, iDashboards, Incanter, Java, JMP, Javascript (Raphael, Highcharts, Arbor), jfreecharts, BI Tools (Spotfire, Cognos, MicroStrategy, LogiXML), MDX, Mondrian, octave, openlayers, OpenViz, PhP, Powerpoint, precog, Prezi, processing, Ptotobi, Silverlight, splunk, SSRS, talend, webGL, Wijmo, WPF, Xcelcuis, XLMiner

Easy to Use Interactive Standard Visualizations Steep Learning Curve Visualization and Reporting

BI Software: Tableau

The next data visual was produced with about 150 lines of R code

Data Viz Tutorials