The role of variation in finding functional genetic elements Andy Clark – Cornell Dave Begun – UC Davis
The Human ENCODE project Seeks to identify all functional genetic elements in 1% of the human genome by: Sequencing the regions in multiple mammals. Identifying genes by conservation (Twinscan), by promoter signatures, etc. Identifying motifs by ChIP-chip and interspecific conservation. Identifying motifs by phylogenetic shadowing.
What ENCODE won’t do Identify the genes that are important in determining variation in risk of common complex diseases.
Low resolution of linkage methods The typical confidence interval for a gene's location spans megabases (Mbp).
Equilibrium relation between LD and recombination rate: E(r²)
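One widely used form of this relation is Sved's (1971) approximation, E(r²) ≈ 1 / (1 + 4·Ne·c), where Ne is the effective population size and c is the recombination fraction between the two sites. Whether this is the exact expression intended on this slide is an assumption. The sketch below evaluates it at a few distances. The values Ne = 10,000 and 1 cM/Mb are illustrative assumptions, not figures from the talk, and the function names are hypothetical.

```python
def recomb_fraction(distance_bp, cm_per_mb=1.0):
    """Approximate recombination fraction between two sites.

    Assumes a uniform rate (default 1 cM/Mb, an illustrative value) and
    a linear map, which is only reasonable over short distances.
    """
    return distance_bp / 1e6 * cm_per_mb / 100.0


def expected_r2(ne, c):
    """Sved (1971) approximation: E(r^2) ~ 1 / (1 + 4 * Ne * c)."""
    return 1.0 / (1.0 + 4.0 * ne * c)


if __name__ == "__main__":
    NE = 10_000  # assumed effective population size, for illustration only
    for kb in (1, 10, 100, 500):
        c = recomb_fraction(kb * 1000)
        print(f"{kb:>4} kb  E(r^2) ~ {expected_r2(NE, c):.3f}")
```

Under these assumptions the expected r² is about 0.71 at 1 kb, about 0.024 at 100 kb, and about 0.005 at 500 kb, so it falls off steeply with distance.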
Linkage disequilibrium is rare beyond 100 kb or so (533 regions, The SNP Consortium data)
Beyond 500 kb, there is almost no linkage disequilibrium
…so observing LD means the sites are likely to be close together
Empirical observation of linkage disequilibrium Reich et al. (2001 Nature 411: ) [Figure axes: |D′| and distance in kb]
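For readers unfamiliar with the |D′| statistic, here is a minimal sketch of how D, |D′| and r² are computed for a pair of biallelic sites from haplotype and allele frequencies. These are the standard textbook definitions; the function name and the example frequencies are hypothetical and are not taken from Reich et al.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic sites.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first site
    p_b  : frequency of allele B at the second site
    """
    d = p_ab - p_a * p_b
    # D' rescales D by the largest magnitude it could reach given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Made-up frequencies: allele A occurs only on B-bearing haplotypes.
    d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.5)
    print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this example |D′| is 1 while r² is only 0.25, which illustrates why the two measures can give different impressions of how strongly two sites are associated.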
LD in the human genome has a “block” structure
The HapMap project Genotype 2.5 million SNPs across the genome in 90 individuals from each of Europe, Asia and Africa. Identify regions of high and low linkage disequilibrium. Identify the SNPs that are most informative about neighboring regions of the genome. All of this is in the hope that HapMap SNPs will help to map genes for complex traits by association tests.
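The idea of picking SNPs that are informative about their neighbors can be sketched as a greedy "tagging" procedure: repeatedly choose the SNP that is in strong LD with the largest number of SNPs not yet covered. This is a generic set-cover heuristic shown for illustration, not the HapMap project's actual selection method. The r² threshold of 0.8, the function name, and the example matrix are assumptions.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with some tag.

    r2 is a symmetric matrix (list of lists) of pairwise r^2 values.
    Returns the indices of the chosen tags in selection order.
    """
    n = len(r2)
    untagged = set(range(n))
    tags = []

    def covered_by(i):
        # A SNP always covers itself; otherwise it needs strong LD.
        return {j for j in untagged if j == i or r2[i][j] >= threshold}

    while untagged:
        # Most newly covered SNPs wins; ties go to the lowest index.
        best = max(range(n), key=lambda i: (len(covered_by(i)), -i))
        untagged -= covered_by(best)
        tags.append(best)
    return tags


# Made-up example: SNPs 0-2 form one block of high LD, SNP 3 stands alone.
r2_example = [
    [1.00, 0.90, 0.85, 0.10],
    [0.90, 1.00, 0.95, 0.05],
    [0.85, 0.95, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
]

if __name__ == "__main__":
    print(greedy_tag_snps(r2_example))  # [0, 3]
```

Here two tags stand in for four SNPs: one for the block and one for the isolated site. Where LD is low, nearly every SNP must serve as its own tag.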
HapMap woes HapMap is a big gamble – no model system. Assumes variation in risk of complex diseases is driven by a few simple, common SNPs of large effect.
The intersection between human ENCODE and HapMap Resequence the ENCODE regions in 48 humans from the HapMap project. Genotype every SNP discovered there in all 270 HapMap individuals. Use the ENCODE regions to ask whether a higher-density HapMap would be more informative, and whether a low-density HapMap distorts the picture.
ENCODE needs to have measurements of functional variation Not feasible in human ENCODE study. But it would be feasible in flies. The Drosophila community has a long history of innovation in studies of the genetics of complex traits.
The role of variation in a Drosophila ENCODE project Nearly all functional sites will exhibit variation in sequence. Do they exhibit variation in function? Natural populations provide an important source of mutations for understanding function.
Advantages of a Drosophila ENCODE project Validation – we can do efficient experiments to test function. Flies have more complex phenotypes. Experimental studies are needed for the challenge of assembling gene regulatory networks. The Drosophila community is best prepared to deal with VARIATION at multiple levels.