1 Paper Outline Specific Aim Background & Significance Research Description Potential Pitfalls and Alternate Approaches Class Paper: 5-7 pages (with figures)

Presentation transcript:

1 Paper Outline: Specific Aim; Background & Significance; Research Description; Potential Pitfalls and Alternate Approaches. Class paper: 5-7 pages (with figures). Scope is 1 aim (one main topic, can have multiple sub-aims). * Topic must utilize at least one genomic sequence in some way. Short description of topic due: THURSDAY, March 25. Final paper due: THURSDAY, April 29.

2 What contributes to the evolution of gene expression? How many loci underlie expression variation? Few major effectors or many minor contributors? What are the mechanisms of expression evolution? Relative prominence of cis vs. trans effects? How much of expression variation has been selected for?

3 Expression QTL (eQTL) mapping: treat expression levels as a quantitative trait. * With DNA microarrays (and now next-gen sequencing), one can simultaneously phenotype (measure) ALL genes at once, and so quickly measure all the reproducible (heritable) expression differences between two different parental lines.

4 eQTL mapping in yeast spore clones (from Rockman & Kruglyak 2006). First done by Rachel Brem et al.

7 LOD threshold in standard QTL mapping (* 1 trait): 1000 permutations. 10% LOD score threshold: % LOD score threshold: 3.52. The challenge with eQTL mapping is that there are thousands of traits.
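
The permutation idea on this slide can be sketched as follows. This is a minimal illustration, not the procedure used in any cited study: it assumes haploid segregants with genotypes coded 0/1, a single-marker LOD computed from residual sums of squares, and simulated data. The function names (`lod_score`, `permutation_threshold`) and all numbers in the demo are hypothetical.

```python
import math
import random

def lod_score(genotypes, phenotypes):
    """LOD at one marker: two-group-means model versus a single-mean model."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_alt = 0.0
    for allele in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        if group:
            m = sum(group) / len(group)
            rss_alt += sum((y - m) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0       # no phenotypic variation at all
    if rss_alt == 0.0:
        return math.inf  # degenerate case: marker explains everything
    return (n / 2.0) * math.log10(rss_null / rss_alt)

def max_lod(marker_genotypes, phenotypes):
    """Genome-wide maximum LOD for one trait."""
    return max(lod_score(g, phenotypes) for g in marker_genotypes)

def permutation_threshold(marker_genotypes, phenotypes,
                          n_perm=1000, alpha=0.05, seed=0):
    """Shuffle phenotypes, record the genome-wide max LOD each time,
    and return the (1 - alpha) quantile of that null distribution."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_max.append(max_lod(marker_genotypes, shuffled))
    null_max.sort()
    index = min(n_perm - 1, max(0, math.ceil((1 - alpha) * n_perm) - 1))
    return null_max[index]

# Simulated example (assumption): 60 segregants, 30 markers,
# marker 0 shifts the trait by 1.5 standard deviations.
rng = random.Random(1)
n_ind, n_markers = 60, 30
markers = [[rng.randint(0, 1) for _ in range(n_ind)] for _ in range(n_markers)]
trait = [rng.gauss(0, 1) + 1.5 * markers[0][i] for i in range(n_ind)]

thr_10 = permutation_threshold(markers, trait, n_perm=200, alpha=0.10)
thr_05 = permutation_threshold(markers, trait, n_perm=200, alpha=0.05)
observed = lod_score(markers[0], trait)
print(f"10% threshold {thr_10:.2f}, 5% threshold {thr_05:.2f}, observed {observed:.2f}")
```

With thousands of expression traits, repeating this per trait is expensive and the per-trait thresholds still leave a multiple-testing problem across traits.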

8 Challenge of multiple testing. Imagine doing a single t-test with p = 0.01 as the significance threshold. * At this p-value there is a 1 in 100 chance that data this extreme could be randomly generated. But if you do 10,000 t-tests and EACH uses p = 0.01, expect 100 positive tests to have occurred by chance. In genomics it is common to apply a multiple-test correction to the p-value cutoff. * Simplest is the Bonferroni correction, but it is way too stringent: divide the p-value cutoff by the number of tests (e.g. by 10,000 tests) to get the new cutoff. * Better methods adjust for the False Discovery Rate (FDR) (e.g. Benjamini & Hochberg or Storey's q-value): out of the total set of what was called significant, how many of those are likely to be false positives?
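
The arithmetic on this slide, and the step-up rule of Benjamini & Hochberg, can be sketched in a few lines. The helper names and the ten example p-values are made up for illustration; Storey's q-value is not implemented here.

```python
import math

def expected_false_positives(alpha, n_tests):
    """Expected number of chance 'hits' if every null hypothesis is true."""
    return alpha * n_tests

def bonferroni_cutoff(alpha, n_tests):
    """Bonferroni: divide the p-value cutoff by the number of tests."""
    return alpha / n_tests

def benjamini_hochberg(p_values, fdr=0.05):
    """Return the (sorted) indices of tests called significant at the given FDR.

    Step-up rule: find the largest rank k with p_(k) <= fdr * k / m and
    call the k smallest p-values significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            largest_rank = rank
    return sorted(order[:largest_rank])

# Numbers from the slide: 10,000 tests at p = 0.01.
print(expected_false_positives(0.01, 10_000))  # about 100 chance positives
print(bonferroni_cutoff(0.01, 10_000))         # about 1e-06

# Hypothetical p-values for ten tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bonf_hits = [i for i, p in enumerate(pvals) if p <= bonferroni_cutoff(0.05, len(pvals))]
bh_hits = benjamini_hochberg(pvals, fdr=0.05)
print(bonf_hits, bh_hits)
```

In this toy example Bonferroni keeps only the smallest p-value, while the FDR procedure also keeps the second, which illustrates why FDR control is less stringent.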

9 Lessons for eQTL studies. Only ~25% of heritable expression traits can even be mapped - on average the mapped loci explain only 30% of heritable variation. Most traits are explained by many loci - only 3% are explained by 1 locus - Allen Orr's exponential QTL model: few big effectors with lots of modifiers. The majority of traits show transgressive segregation - distribution of F2 phenotypes extends beyond parental phenotypes - indicates many small effectors - suggests stabilizing selection - also consistent with epistasis.

10 Lessons for eQTL studies. Fewer traits show directional segregation - phenotypic distribution of F2s lies between the parents - also implies many minor effectors - suggests directional selection by 'tweaking'.
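
The distinction between transgressive and directional segregation can be illustrated with a crude check on where segregant phenotypes fall relative to the two parents. This is only a sketch: published analyses use formal statistical tests, and the 0.5 cutoff and function names here are arbitrary assumptions.

```python
def fraction_outside_parents(parent_a, parent_b, segregants):
    """Fraction of segregant phenotypes lying beyond the parental range."""
    low, high = min(parent_a, parent_b), max(parent_a, parent_b)
    outside = sum(1 for x in segregants if x < low or x > high)
    return outside / len(segregants)

def describe_segregation(parent_a, parent_b, segregants, cutoff=0.5):
    """Label a trait 'transgressive' if most segregants fall outside the
    parental range, otherwise 'directional' (cutoff is an assumption)."""
    frac = fraction_outside_parents(parent_a, parent_b, segregants)
    return "transgressive" if frac > cutoff else "directional"

# Hypothetical expression values (arbitrary units).
print(describe_segregation(1.0, 1.2, [0.2, 0.5, 1.1, 1.9, 2.4]))  # transgressive
print(describe_segregation(0.0, 3.0, [0.5, 1.0, 1.5, 2.0, 2.5]))  # directional
```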

11 Lessons for eQTL studies (Brem et al., PNAS; highly heritable traits): ~82% were transgressive (2093), ~23% showed epistasis (583), ~16% were directional (406).

12 Local vs. Distant and cis vs. trans. Local eQTL: "near" the affected gene. Distant eQTL: "far" from the affected gene. cis effect: often taken to mean acting on the same DNA molecule as the affected gene. trans effect: often taken to mean acting through a protein or RNA. Local QTL that work in cis: a TF binding site that affects transcription; a 3' UTR that affects RNA stability.

13 Local vs. Distant and cis vs. trans. Local QTL that work in trans: coding polymorphism that affects TF activity.

14 Local vs. Distant and cis vs. trans. Distant QTL that work in trans (figure: ORF diagram).

15 Local vs. Distant and cis vs. trans. Distant QTL that work in trans: PHYSIOLOGY. Most trans-acting effects are likely secondary responses (distantly acting loci are NOT enriched for TFs).

16 Local vs. Distant and cis vs. trans: which is more prevalent? Estimates vary: - Brem et al. papers: ~25% of traits explained by local polymorphisms - other studies say close to 100% - Many MORE individual genes are explained by distant polymorphisms * but because many link to the same loci, there are fewer distantly acting loci. But ... statistical challenges likely enrich for local polymorphisms: - the FDR hurdle is higher for trans-acting loci - cis (local) polymorphisms may have larger effect sizes - it also depends on how "local" is defined.
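
The last point, that the answer depends on how "local" is defined, can be made concrete with a small sketch. The window sizes, coordinates, and the function name `classify_eqtl` are hypothetical and chosen only to show that the same marker can switch class when the window changes.

```python
def classify_eqtl(marker_chrom, marker_pos, gene_chrom, gene_start, gene_end,
                  window=10_000):
    """Call an eQTL 'local' if the marker lies within `window` bp of the gene
    on the same chromosome, otherwise 'distant'. Says nothing about cis/trans."""
    if marker_chrom != gene_chrom:
        return "distant"
    if gene_start - window <= marker_pos <= gene_end + window:
        return "local"
    return "distant"

# Hypothetical marker 24 kb downstream of a hypothetical gene.
for w in (10_000, 50_000):
    print(w, classify_eqtl("IV", 125_000, "IV", 100_000, 101_000, window=w))
```

Note that "local" and "distant" describe position only; a local eQTL can still act in trans, as in the coding-polymorphism case on the earlier slide.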

17 Local vs. Distant and cis vs. trans: which is more prevalent? Using hybrid diploids and allele-specific expression (figure: ORF-1, ORF-2): a cis-acting polymorphism will affect only the allele it is physically linked to.

18 Local vs. Distant and cis vs. trans: which is more prevalent? Using hybrid diploids and allele-specific expression (figure: ORF-1, ORF-2): a trans-acting polymorphism will affect BOTH alleles.
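
The logic of the hybrid experiment can be sketched as a simple decomposition. In the hybrid both alleles share the same trans environment, so any allelic imbalance is attributed to cis; whatever part of the parental difference is not seen in the hybrid is attributed to trans. This log-ratio decomposition is a commonly used simplification and an assumption here, not necessarily the exact analysis of the studies cited in these slides; the expression values are made up.

```python
import math

def decompose_cis_trans(parent1, parent2, hybrid_allele1, hybrid_allele2):
    """Split a parental expression difference into cis and trans parts.

    total = log2(parent1 / parent2)               difference between parents
    cis   = log2(hybrid_allele1 / hybrid_allele2) allelic imbalance in hybrid
    trans = total - cis                           remainder
    """
    total = math.log2(parent1 / parent2)
    cis = math.log2(hybrid_allele1 / hybrid_allele2)
    return total, cis, total - cis

# Purely cis: the 2-fold parental difference persists between alleles in the hybrid.
print(decompose_cis_trans(200.0, 100.0, 200.0, 100.0))  # (1.0, 1.0, 0.0)

# Purely trans: both alleles are expressed equally in the hybrid.
print(decompose_cis_trans(200.0, 100.0, 150.0, 150.0))  # (1.0, 0.0, 1.0)
```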

19 Local vs. Distant and cis vs. trans: which is more prevalent? Tricia Wittkopp et al., Nature: differentially expressed genes between D. melanogaster & D. simulans - measured allele-specific expression in D. mel/D. sim hybrids with pyrosequencing. 28 out of 29 show cis variation in expression; 16 out of 29 are affected by both trans and cis variation. Conclusion: cis-acting variation more commonly explains interspecific expression variation.

20 Local vs. Distant and cis vs. trans: which is more prevalent? Tricia Wittkopp et al., Nature Genetics: genes examined (48 within, 49 between species ... 16 genes overlapping); 4 D. melanogaster strains and 4 D. simulans strains. Conclusion: trans-acting variation is more common within species (over shorter time frames) but is more likely to have pleiotropic and deleterious effects, so trans-acting variation is more likely to be removed over time. - cis-regulatory effects explained more variation between species (64%) than within species (35%) ... argues against neutrality, since effects should occur at the same 'rate' over time. - Compensatory cis + trans effects are also more common between species.