BI820 – Seminar in Quantitative and Computational Problems in Genomics


1 BI820 – Seminar in Quantitative and Computational Problems in Genomics
Gabor T. Marth, Department of Biology, Boston College, Chestnut Hill, MA 02467

2 Sequence variations
The Human Genome Project produced a reference genome sequence that is 99.9% common to all human beings.
Sequence variations make our genetic makeup unique.
Single-nucleotide polymorphisms (SNPs) are the most abundant, but other types of variation exist and are important.

3 Why do we care about variations?
phenotypic differences; inherited diseases; demographic history

4 How do we find polymorphisms?
Look at multiple sequences from the same genome region.
Diverse sequence resources can be used: EST, WGS, BAC.
Diversion: sequencing informatics.

5 SNP discovery – Methods
Sequence clustering; cluster refinement; multiple alignment; SNP detection

6 SNP discovery – Computer tools

7 SNP discovery – Mining Projects
~30,000 clones; 25,901 clones (7,122 finished, 18,779 draft with base quality values); 21,020 clone overlaps (124,356 fragment overlaps); 507,152 high-quality candidate SNPs (validation rate 83-96%). Marth et al., Nature Genetics 2001.
[Figure: clone sequences in FASTA format (>CloneX ACGTTGCAACGT GTCAATGCTGCA, >CloneY ACGTTGCAACGT GTCAATGCTGCA) and two overlapping fragments that differ at one base: ACCTAGGAGACTGAACTTACTG and ACCTAGGAGACCGAACTTACTG]
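The SNP detection step can be illustrated on the two overlapping fragments shown on this slide. The following is a minimal sketch, not the method used in the mining project: it assumes the sequences are already aligned and of equal length, ignores base quality values, and the function name `candidate_snps` and its `min_minor_count` parameter are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned, min_minor_count=1):
    """Return (position, allele_counts) for alignment columns showing more than one base.

    Assumption: `aligned` holds equal-length, already aligned sequences.
    Positions are 0-based. Gaps and ambiguous bases are ignored.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    snps = []
    for pos in range(length):
        column = [seq[pos].upper() for seq in aligned]
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic column, nothing to report
        minor = sorted(counts.values())[-2]  # count of the second most common base
        if minor >= min_minor_count:
            snps.append((pos, dict(counts)))
    return snps


# The two overlapping fragments from the slide.
fragments = ["ACCTAGGAGACTGAACTTACTG", "ACCTAGGAGACCGAACTTACTG"]
print(candidate_snps(fragments))
```

On these two fragments the sketch reports a single T/C candidate at 0-based position 11 (the 12th base). A real pipeline would also weigh base quality values before calling a candidate high-quality.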

8 SNP databases and characteristics
access to variation data; SNP properties; reliability of information; characterizing known polymorphic sites in sample collections (genotyping)

9 Where do variations come from?
Sequence variations are the result of mutation events.
Mutations are propagated down through generations.
[Figure: genealogy descending from the MRCA (most recent common ancestor), with lineages carrying TAAAAAT or TAACAAT]

10 Mutation rate
A higher mutation rate (µ) gives rise to more SNPs.
[Figure: genealogies descending from the MRCA, with sequences accgttatgtaga, accgctatgtaga, actgttatgtaga, accgctatataga]
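The effect of the mutation rate can be checked on the slide's own sequences. The sketch below makes two assumptions that are not stated in the transcript: that the four sequences form two panels of two (a lower-rate pair and a higher-rate pair), and that the standard neutral, constant-size model applies, under which the expected number of segregating sites in a sample of n sequences is θ(1 + 1/2 + ... + 1/(n-1)) with θ = 4Nµ per site. The function names and the numeric parameters are hypothetical.

```python
def count_differences(seq_a, seq_b):
    """Count positions at which two equal-length aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def expected_segregating_sites(sample_size, pop_size, mu_per_site, num_sites):
    """Expected number of segregating sites under the standard neutral model.

    Assumption: diploid population of constant effective size `pop_size`,
    so theta = 4 * N * mu summed over `num_sites` sites.
    """
    theta = 4 * pop_size * mu_per_site * num_sites
    harmonic = sum(1.0 / i for i in range(1, sample_size))
    return theta * harmonic


# Assumed panel grouping of the slide's sequences.
print(count_differences("accgttatgtaga", "accgctatgtaga"))  # first pair
print(count_differences("actgttatgtaga", "accgctatataga"))  # second pair

# Illustrative (hypothetical) parameters: doubling mu doubles the expectation.
low = expected_segregating_sites(10, 10_000, 1e-8, 100_000)
high = expected_segregating_sites(10, 10_000, 2e-8, 100_000)
print(low, high)
```

Under this reading the first pair differs at one site and the second at three, which matches the slide's point that a higher µ yields more SNPs.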

11 Recombination
[Figure: four copies of the sequence accgttatgtaga]

12 Demographic history
[Figure panels: large (effective) population size N; small (effective) population size N]
Different world populations have varying long-term effective population sizes (e.g., African N is larger than European).

13 Modeling history
Models: stationary; collapse; expansion; bottleneck (time axis: past to present)
MD (simulation); AFS (direct form)
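Assuming AFS here stands for the allele frequency spectrum, the observed spectrum can be tabulated from a set of sampled haplotypes and compared with a model prediction. The sketch below is illustrative only: the 0/1 coding (1 = derived allele), the toy data, and the function names are hypothetical, and the only model shown is the stationary (constant-size) neutral expectation θ/i, not the collapse, expansion, or bottleneck models named on the slide.

```python
def allele_frequency_spectrum(haplotypes):
    """Count sites by the number of sampled chromosomes carrying the derived allele.

    `haplotypes` is a list of equal-length 0/1 lists (1 = derived allele).
    Entry i-1 of the result is the number of sites where exactly i of the
    n chromosomes carry the derived allele; fixed and absent sites are skipped.
    """
    n = len(haplotypes)
    spectrum = [0] * (n - 1)
    for site in zip(*haplotypes):
        derived = sum(site)
        if 0 < derived < n:
            spectrum[derived - 1] += 1
    return spectrum


def stationary_expectation(n, theta):
    """Expected spectrum under a constant-size neutral model: theta / i."""
    return [theta / i for i in range(1, n)]


# Toy data (hypothetical): 4 chromosomes, 4 sites.
sample = [
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
print(allele_frequency_spectrum(sample))
print(stationary_expectation(4, theta=3.0))
```

Departures of the observed spectrum from the θ/i shape are the kind of signal that distinguishes the demographic models listed above.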

14 Ancestral inference
modest but uninterrupted expansion; bottleneck

15 The effects and signatures of selection
Mutations under selection influence the genealogy itself; for neutral mutations, the processes of mutation and genealogy are decoupled.

16 Allelic association and haplotype structure
“linkage disequilibrium” “haplotype blocks”
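Linkage disequilibrium between two biallelic sites is commonly summarized by the coefficient D and by r². The slide does not say which measure the course uses, so the following is a minimal sketch under that assumption; the function name and the toy haplotypes are hypothetical, and alleles are coded 0/1.

```python
def linkage_disequilibrium(haplotypes, i, j):
    """Return (D, r_squared) between sites i and j of 0/1-coded haplotypes.

    D = P(AB) - P(A) * P(B); r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)).
    r^2 is NaN when either site is monomorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


# Toy examples (hypothetical data).
perfectly_linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(linkage_disequilibrium(perfectly_linked, 0, 1))
print(linkage_disequilibrium(independent, 0, 1))
```

Runs of adjacent sites with high pairwise r² are one informal way to think about haplotype blocks.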

17 Computer simulations: the Coalescent
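The slide gives only the title, so the following is a minimal sketch of the basic (Kingman) coalescent rather than the simulation used in the course. It assumes a constant-size population with no recombination or selection, measures time in units of 2N generations, and places mutations on the tree as a Poisson process with rate θ/2 per unit of branch length. The function names are hypothetical.

```python
import random


def simulate_tree_length(sample_size, rng):
    """Total branch length of one coalescent genealogy (units of 2N generations).

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k * (k - 1) / 2, and all k branches grow by that time.
    """
    total = 0.0
    for k in range(sample_size, 1, -1):
        waiting_time = rng.expovariate(k * (k - 1) / 2)
        total += k * waiting_time
    return total


def simulate_segregating_sites(sample_size, theta, rng):
    """Number of mutations dropped on one simulated genealogy."""
    mean = theta * simulate_tree_length(sample_size, rng) / 2
    # Draw a Poisson(mean) count by counting unit-rate arrivals before `mean`.
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


rng = random.Random(1)  # fixed seed so the run is repeatable
replicates = [simulate_segregating_sites(5, theta=2.0, rng=rng) for _ in range(10_000)]
print(sum(replicates) / len(replicates))
```

For a sample of 5 and θ = 2, the average should land near the neutral expectation θ(1 + 1/2 + 1/3 + 1/4), about 4.17; the exact printed value depends on the random stream.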

18 Medical utility?
clinical phenotype; molecular markers; functional understanding

19 Mapping disease-causing loci
genetic linkage; association between allele and phenotype
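One common way to test for association between an allele and a phenotype is a chi-square test on allele counts in cases versus controls. The slide does not name a specific test, so this is a sketch under that assumption; the function name and the counts are hypothetical, not data from the presentation.

```python
import math


def allelic_association(case_a, case_b, control_a, control_b):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows are cases and controls; columns are counts of allele A and allele B.
    Returns (chi_square, p_value).
    """
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if 0 in (row_cases, row_controls, col_a, col_b):
        raise ValueError("every row and column must have a nonzero total")
    cross = case_a * control_b - case_b * control_a
    chi_square = n * cross * cross / (row_cases * row_controls * col_a * col_b)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical counts: allele A seen 60 times in cases, 40 times in controls.
chi_square, p_value = allelic_association(60, 40, 40, 60)
print(chi_square, p_value)
```

A small p-value suggests the allele frequency differs between cases and controls; it does not by itself show that the allele is causal, since a nearby linked variant can produce the same signal.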

20 Forensic applications

