Genomics Chapter 22. What is a Genome? It is the total DNA content in a typical cell of an organism In multicellular organisms, most cells will have exact.


1 Genomics Chapter 22

2 What is a Genome? It is the total DNA content in a typical cell of an organism. In multicellular organisms, most cells have exactly the same genome. Exceptions: –Human red blood cells, which lack nuclei –Sperm and ova, which have haploid genomes –Cancer-transformed cells, which may have amplified or deleted genes

3 Genomics Studying the genome: Karyotyping: (largest) –Examining chromosomes and banding Linkage mapping: –Genetic distances (cM), position of markers Physical mapping: –Physical distances (Mb), position of genes Sequencing: (smallest) –Actual order of bases

4 Levels of Mapping

5 Sequencing Sequencing can only be done about 800 bases at a time. Beyond that length the read degrades and can no longer be made with any accuracy. Sequencing an entire genome is done by first shredding the DNA into fragments, then piecing it back together 800 bases at a time. Encyclopedia analogy

6 Sanger Method 1977, Frederick Sanger: dideoxy sequencing. Dideoxy nucleotides lack the 3' hydroxyl (OH) group on the sugar –Without a 3' OH, the next nucleotide cannot be attached, so the chain cannot be extended –Polymerase stops copying the DNA sequence when it adds a dideoxy base –A dideoxy version is made of each nucleotide (A, C, G and T)

7 Sanger Method Dideoxy nucleotides: [Figure: sugar structures of deoxyribose, dideoxyribose and ribose]

8 Sanger Method In each of four tubes add: –DNA of unknown sequence –Everything necessary for DNA replication –One of the four dideoxy nucleotides. Each tube has a different dideoxy nucleotide in it (A, C, G or T). DNA polymerase will stop working once it adds a dideoxy base. Therefore, each tube yields copied DNA of different lengths

9 Sanger Method Run each tube in its own lane on a gel –The “A” lane, the “T” lane, etc. Important: the regular nucleotide is also added, so that some of the time synthesis continues past a given position instead of stopping there. This gives fragments of different lengths. Read the four lanes to determine the sequence of the complete DNA fragment
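
The four-tube procedure on slides 8 and 9 can be sketched in a few lines of Python. This is a simplified illustration, not a model of real chemistry: the function names and the example template are made up for this sketch, strand orientation is ignored, and every possible termination position is assumed to show up as a band.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sanger_lanes(template):
    """Return, for each dideoxy tube, the lengths of the terminated fragments."""
    # The polymerase builds the strand complementary to the template.
    new_strand = "".join(COMPLEMENT[base] for base in template)
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        # In the tube holding dd-<base>, some copies stop at this position,
        # producing a fragment of exactly this length.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the bands from shortest fragment to longest across all four lanes."""
    band_to_base = {}
    for base, lengths in lanes.items():
        for length in lengths:
            band_to_base[length] = base
    return "".join(band_to_base[length] for length in sorted(band_to_base))


template = "ATGCCGTA"          # hypothetical unknown DNA
lanes = sanger_lanes(template)
print(lanes)                   # fragment lengths per lane
print(read_gel(lanes))         # the newly synthesized (complementary) strand
```

Note that the gel gives the sequence of the copied strand, so the template is its complement.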

10 Sanger Method

11 Modern Sequencing Added fluorescent dyes: –A is red –C is blue –T is green –G is yellow. Now all four reactions can be run in the same lane. The entire process is automated: –An automated sequencer and a computer program were invented to read the gels
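
As a rough sketch of what the gel-reading program does, the snippet below turns a series of dye colors, ordered from shortest fragment to longest, into bases using the color scheme on slide 11. The function name and the example peak list are hypothetical; real base callers work on noisy signal intensities rather than clean color labels.

```python
# Dye colors as listed on the slide.
DYE_TO_BASE = {"red": "A", "blue": "C", "green": "T", "yellow": "G"}


def call_bases(peak_colors):
    """Translate dye colors (shortest fragment first) into a base sequence."""
    return "".join(DYE_TO_BASE[color] for color in peak_colors)


peaks = ["green", "yellow", "blue", "red"]  # made-up single-lane readout
print(call_bases(peaks))
```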

12 Modern Sequencing

13 Practice Interpreting Sanger Determine the sequence from this gel (lanes: dd-A, dd-C, dd-T, dd-G). Direct read: TGCACTGAATCAGTGCT Actual sequence: ACGTGACTTAGTCACGA
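
The step from the direct read to the actual sequence is base-by-base complementing, because the gel shows the copied strand rather than the template. A minimal sketch, with a hypothetical function name and with strand direction ignored as on the slide:

```python
# Translation table mapping each base to its pairing partner.
COMPLEMENT_TABLE = str.maketrans("ACGT", "TGCA")


def template_from_read(direct_read):
    """Complement the read position by position to recover the template."""
    return direct_read.translate(COMPLEMENT_TABLE)


print(template_from_read("TGCACTGAATCAGTGCT"))
```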

14 Building sequences Based on overlapping fragments

15 Building sequences Based on overlapping fragments
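
Building a sequence from overlapping fragments can be illustrated with a small greedy merge: repeatedly join the two reads that share the longest suffix-to-prefix overlap. This is a toy sketch under strong assumptions (error-free reads, no repeats, all reads from the same strand); the function names and the three example reads are invented for illustration, and production assemblers use far more elaborate methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps; leave the remaining pieces separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


fragments = ["ATCAGTGCT", "TGCACTGA", "CTGAATCA"]
print(greedy_assemble(fragments))
```

With these three fragments the merges recover a single 17-base sequence. Repeated regions break this simple approach, because a repeat makes more than one overlap look equally good.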

16 Why 8X Coverage? Genome projects aim for about 8X coverage, meaning each base is read about eight times on average, and they typically use DNA from several different individuals. Why? Ensure fragments will overlap often. Ensure each base is covered with at least two good clean “reads”. Identify common polymorphisms
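
To get a feel for the numbers, coverage can be computed as (number of reads × read length) / genome size. The sketch below uses the 800-base read length and the 3 x 10^9 base human genome quoted in these slides. The estimate of uncovered bases assumes reads land at random positions (a Poisson approximation), which is an assumption of this sketch and not something stated in the slides.

```python
import math


def reads_needed(genome_size, read_length, coverage):
    """Number of reads required to reach the target average coverage."""
    return math.ceil(coverage * genome_size / read_length)


def fraction_uncovered(coverage):
    """Expected fraction of bases hit by no read, assuming random placement."""
    return math.exp(-coverage)


print(reads_needed(3_000_000_000, 800, 8))  # reads for 8X of a human genome
print(fraction_uncovered(8))                # expected share of bases missed
print(fraction_uncovered(1))                # the same at only 1X coverage
```

At 1X coverage roughly a third of the bases would be missed entirely, which is one reason a single pass over the genome is not enough.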

17 Important Pieces: Need to know: Dideoxy nucleotides STSs BACs –NIH and International Consortium’s (Public) Genome Project ESTs –Celera’s Genome Project

18 STS = Sequence Tagged Sites Short sequences that are completely unique in the genome: Already mapped to an exact physical position in the genome. Typically a few hundred bases long (about 200 to 500), detected by PCR using primers of 20 to 30 bases. Genome-wide sequencing uses STSs to identify where a given sequence lies within the genome. Sort of like road signs

19 BAC = Bacterial Artificial Chromosome Cloning vector to hold fragments of DNA Entire genome divided into BACs –Each BAC can hold ~100,000 bases Know which chromosomal regions are in which BACs BACs then sequenced: –800 bases at a time Sequences overlapped and built up

20 EST = Expressed Sequence Tag A small piece of DNA that is known to be expressed in a certain cell type. ESTs are derived from mRNA (via cDNA), so they represent only expressed sequences, mostly from protein-coding genes. EST libraries have been developed and can be checked against: –For different organisms or cell types. Sequencing can now focus on only the protein-coding regions of the genome

21 Human Genome Project Public Consortium: –Used BACs –Posted their sequencing results every night on GenBank Celera Genomics: –Skipped the BACs and shotgunned the entire genome into fragments –Only sequenced EST positive fragments –Used STSs to align sequences at the end –Updated their analysis from GenBank every morning

22 Human Genome Currently All protein coding regions are completely sequenced and aligned Many non-coding regions are sequenced but unaligned Many repeat regions still unsequenced Annotation is ongoing: –Determining where genes are –Determining gene function –Determining gene involvement in diseases

23 Positional Cloning Gene mapping: Begin with pedigrees of affected individuals. Step One – Linkage –Identify regions of genome where gene lies Step Two – Fine Mapping –Use more markers to determine exactly where gene is on chromosome Step Three – Gene Identification –Pinpoint exact gene causing disease

24 Positional Cloning Time consuming. Only works for Mendelian disorders. However it does work: –Cystic Fibrosis –Huntington’s Disease –Early-onset Alzheimer's. Sort of the opposite of the Human Genome Project –“Sequence now – Interpret later” – HGP

25 Positional Cloning Still happening today Book says “…a graduate student can find a gene in weeks” Not true – even with entire genome sequenced, even with Mendelian disorders Still have to analyze the sequences and identify which genes are involved in which disorders Not to mention complex disorders…

26 What is more important? Sequencing the entire human genome or positional cloning all the genes? Should the non-coding regions be sequenced? Why or why not? What about all the annotation? –Genome comparisons (to model organisms) –Identifying gene positions –Identifying gene function Focus on Mendelian or Complex disorders?

27 Genome Comparisons The human genome is 3 x 10^9 base pairs (3000 Mb). Prokaryotic genomes are between 1 and 6 x 10^6 base pairs (1 to 6 Mb). Model organisms –S. cerevisiae (yeast) 12 Mb –C. elegans (roundworm) 100 Mb –D. melanogaster (fruit fly) 170 Mb –M. musculus (mouse) 3000 Mb. Some plant genomes are much larger (e.g., onion is 15000 Mb)

28 Genome Comparisons Larger genome doesn’t mean more genes. Prokaryotes have about 1,000 to 6,000 genes. Yeast has about 6,000 genes. But most multicellular organisms (animals and plants) have around 30,000, humans included. Rest of space in genome? –Repeats (especially in plants) –More/larger introns in genes
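
The point that genome size and gene count are decoupled shows up clearly as gene density (genes per Mb). The sketch below uses only the approximate figures quoted on slides 27 and 28; the dictionary and function name are illustrative.

```python
# (genome size in Mb, approximate gene count), taken from the slides.
GENOMES = {
    "yeast": (12, 6_000),
    "human": (3_000, 30_000),
}


def genes_per_mb(size_mb, gene_count):
    """Average number of genes per megabase of genome."""
    return gene_count / size_mb


for name, (size_mb, gene_count) in GENOMES.items():
    print(name, genes_per_mb(size_mb, gene_count))
```

By these figures the human genome is 250 times larger than the yeast genome but holds only about 5 times as many genes, so most of the extra DNA is not gene sequence.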

29 Synteny Comparing genomes shows regions of synteny Allows faster identification of homologous genes Human chromosomes with mouse pieces labeled

30 Bioinformatics Compare sequences to model organisms: –Determine gene function (homology) Search for gene positions: –Known genes are labeled on Human Genome Browser (www.genome.ucsc.edu) –Gene-like sequences are searched for to try to identify position of unknown genes Gene expression profiles: –Determine which genes are expressed when Gene pathways/networks

31 Example – post HGP: Searching for the genes involved in Autism Disorder: 1. Linkage Analysis –Use 345 pedigrees to find linkage 2. Fine Mapping –Narrow down region of linkage 3. Bioinformatics –Determine what genes exist in linked region 4. Follow up on a candidate gene

32 Outcomes of HGP

33 Summary Know: How Sanger (dideoxy) sequencing works Modern changes to sequencing BACs, ESTs, STSs Positional Cloning basics Genome comparison generalizations What bioinformatics is/can determine

34 Ethics Discussion Get into groups of 3 or 4 Discuss the following ethical decisions regarding genome sequencing Make notes of your discussions

35 Genome Ethics 1. Would you want to know all the diseases that you are predisposed to? Why or why not? What if the disease has a cure? What if there is only a painful treatment that works in 30% of people, but no cure? Should parents be allowed to determine their minor children’s predispositions?

36 Genome Ethics 2. Let's say that personalized medicine is a real possibility, but it still costs 10,000 dollars to sequence your genome. Who should pay the cost? You, your insurance, the government? How is that information kept private? What sort of diseases would be worth spending the money for? Which ones wouldn’t make sense?

37 Genome Ethics 3. Let's say the genome can be sequenced for $1,000. What if people could get their analysis through the mail, without ever talking to a doctor? What are the pros and cons of this situation? What are some potential consequences of receiving your genome analysis this way? How should the law control this procedure?

38 Genome Ethics 4. Think about behavioral disorders such as depression, anxiety or schizophrenia. Where is the line between treatment and medical enhancement? Who should enforce this line? Will things change once we learn all the genes involved in specific disorders? What about homosexuality?

39 Next Class: Homework – Chapter 22 Problems: –Review: 1, 3, 4 –Applied: 1, 5, 6, 9, 11 –Also – write out at least 5 questions about material to review on Wednesday. Review all chapters, notes and exams

40 Next Class: Review Chapters 1, 3-18, 20-22. Go through your review questions. Final Exam: Twice as long as regular tests (200 pts), cumulative. Monday – December 12th – 8 pm

41 Test dd-A dd-C dd-T dd-G

