Statistical Applications in Biology and Genetics Tian Zheng Wednesday, March 12, 2003
Outline: Biological background; overview of quantitative research areas related to genetics; sample project I: Bayesian regression analysis with application to microarray studies; sample project II: BHTA algorithm for complex traits
Chromosomes and genes: Video from the Human Genome Project. Celebrating the 50th anniversary of the discovery of the DNA double-helix structure. You can also find links to background readings at: http://www.stat.columbia.edu/~tzheng/research/statgen.html
Biology: Science of the 21st century. Everybody talks about it!
Computational Biology (1): Sequence to function: align sequences using wet-lab results; model the aligned sequences; use the fitted model to predict the function of sequences whose function is unknown. Sequence to structure of proteins. Significance: sequence → structure → function
Computational Biology (2): Motif detection; homology detection
Bioinformatics/Genomics: Gene expression analysis (using DNA chips or microarrays); protein regulatory network inference; pedigree inference; phylogeny inference
Genetic Epidemiology: Linkage mapping; association mapping; mapping for complex traits (quantitative traits, epistasis, etc.)
Linkage and Association: Genes, alleles, and haplotypes; transmission; crossover and recombination; linkage
Sample Project: Bayesian Regression Analysis. Mike West et al. (2000), Bayesian Regression Analysis in the “large p, small n” Paradigm with application in DNA Microarray studies.
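To make the “large p, small n” difficulty concrete, here is a minimal sketch in Python with NumPy. It is not the method of West et al.; it swaps in the simplest Bayesian linear regression, with an independent Gaussian prior on each coefficient, purely for illustration. The function name `posterior_mean`, the prior variance `tau2`, the noise variance `sigma2`, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def posterior_mean(X, y, tau2=1.0, sigma2=1.0):
    """Posterior mean of beta when y ~ N(X beta, sigma2 I) and beta ~ N(0, tau2 I).

    Illustrative stand-in only (equivalent to ridge regression),
    not the model used in the cited paper.
    """
    n, p = X.shape
    # Dual form: solve an n x n system instead of a p x p one,
    # which is cheaper when p is much larger than n.
    K = X @ X.T + (sigma2 / tau2) * np.eye(n)
    return X.T @ np.linalg.solve(K, y)

# Simulated data (hypothetical): n = 10 samples, p = 200 predictors.
rng = np.random.default_rng(0)
n, p = 10, 200
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# X'X is p x p but has rank at most n, so ordinary least squares
# has no unique solution when p > n.
rank = np.linalg.matrix_rank(X.T @ X)

# The prior makes the problem well posed: the posterior mean is unique.
beta_hat = posterior_mean(X, y)
```

The design point is that the prior adds a positive term to the diagonal, so the system is solvable even though the data alone cannot identify all p coefficients.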
What is a microarray/DNA chip? How do chips work?
Oligonucleotide Arrays: The current “gold standard”!
Affymetrix GeneChip System
An Affymetrix GeneChip
Gene Expression Data: n experiments (patients, types of cell lines, types of cancer tissues, etc.); p genes on one array; the subtracted and normalized gene expression data form an n by p matrix
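The layout of the data can be sketched in Python with NumPy as follows. This is only an illustration under stated assumptions: the intensities are simulated, "subtracted" is taken to mean background subtraction, and the normalization shown (log2 transform followed by centering each array at its median) is one common choice, not necessarily the one used in the studies discussed. The names `raw`, `background`, and `expr` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 1000  # hypothetical sizes: 6 experiments (rows), 1000 genes (columns)

# Simulated raw intensities and a constant background level (assumptions).
raw = rng.lognormal(mean=7.0, sigma=1.0, size=(n, p))
background = np.full((n, p), 50.0)

# Background subtraction, floored at 1 so the logarithm stays defined.
signal = np.clip(raw - background, 1.0, None)

# Log-transform, then center each array (row) at its own median
# so that arrays are comparable with one another.
log_expr = np.log2(signal)
expr = log_expr - np.median(log_expr, axis=1, keepdims=True)

# expr is the n by p matrix: one row per experiment, one column per gene.
print(expr.shape)
```

With p much larger than n, this matrix is exactly the "large p, small n" setting of the sample project.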