BB30055: Genes and genomes Major insights from the HGP.


Major insights from the HGP
- Gene size, content and distribution
- Proteome content
- SNP identification
- Distribution of GC content
- CpG islands
- Recombination rates
- Repeat content
Nature (2001) 15th Feb, Vol 409 special issue; pgs 814 & 875-914.

1) Gene size

Gene content
- More genes: twice as many as Drosophila / C. elegans
- Uneven gene distribution: gene-rich and gene-poor regions
- More paralogs: some gene families have expanded their number of paralogs, e.g. the olfactory gene family has 1000 genes
- More alternative transcripts: more RNA splice variants are produced, expanding the number of primary proteins about 5-fold (e.g. neurexin genes)

Gene distribution
- Genes generally dispersed (~1 gene per 100 kb)
- Class III complex at HLA 6p21.3
- Overlapping genes (transcribed from the 2 DNA strands): rare
- Genes within genes, e.g. NF1 gene
HMG3 Fig 9.8

Uneven gene distribution
- Gene-rich regions, e.g. MHC on chromosome 6 has 60 genes with a GC content of 54%
- Gene-poor regions: 82 gene deserts identified (large or unidentified genes?)
What is the functional significance of these variations?

2) Proteome content
- Proteome more complex than in invertebrates
- Protein domains: sections with identifiable shape/function
- Domain arrangements in humans:
  - largest total number of domains is 130
  - largest number of domain types per protein is 9
  - mostly identical arrangement of domains
[Figure: Protein X, with domains arranged A A B B B C C C C C]
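The difference between the total number of domains and the number of domain types in a protein can be made concrete with the Protein X arrangement from the slide. This is a minimal sketch only; the function name `domain_summary` and the list representation of a protein are assumptions made for illustration.

```python
from collections import Counter


def domain_summary(arrangement):
    """Summarise a protein given as an ordered list of domain labels."""
    counts = Counter(arrangement)
    return {
        "total_domains": len(arrangement),  # every copy counts
        "domain_types": len(counts),        # distinct labels only
        "copies_per_type": dict(counts),
    }


# Protein X from the slide: A A B B B C C C C C
protein_x = "A A B B B C C C C C".split()
summary = domain_summary(protein_x)
print(summary)
```

Here Protein X has 10 domains in total but only 3 domain types, which is the distinction behind the two "largest" figures quoted on the slide.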

Proteome more complex than in invertebrates
- No huge difference in domain number in humans
- BUT the frequency of domain sharing is very high in human proteins (structural proteins and proteins involved in signal transduction and immune function)
- However, there are only 3 cases where a combination of 3 domain types is shared by human and yeast proteins, e.g. carbamoyl-phosphate synthase (involved in the first 3 steps of de novo pyrimidine biosynthesis) has 7 domain types, an arrangement which occurs once in human and yeast but twice in Drosophila

3) SNPs (single nucleotide polymorphisms)
- Sites that result from point mutations in individual base pairs
- Biallelic
- ~60,000 SNPs lie within exons and untranslated regions (85% of exons lie within 5 kb of a SNP)
- May or may not affect the ORF
- Most SNPs may be regulatory
- More than 1.4 million SNPs identified
- One every 1.9 kb on average
- Densities vary over regions and chromosomes, e.g. the HLA region has a high SNP density, reflecting maintenance of diverse haplotypes over many millions of years
Nature (2001) 15th Feb, Vol 409 special issue; pgs 821-823 & 928
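The two headline SNP figures (more than 1.4 million SNPs, one every 1.9 kb on average) can be cross-checked with simple arithmetic. The sketch below uses only the numbers from the slide; the function names are illustrative assumptions.

```python
def implied_sequence_length_bp(snp_count, spacing_bp):
    """Length of sequence implied by a SNP count and an average spacing."""
    return snp_count * spacing_bp


def snps_per_mb(spacing_bp):
    """Average number of SNPs per megabase for a given average spacing."""
    return 1_000_000 / spacing_bp


SNP_COUNT = 1_400_000  # "more than 1.4 million SNPs identified"
SPACING_BP = 1_900     # "one every 1.9 kb on average"

surveyed_bp = implied_sequence_length_bp(SNP_COUNT, SPACING_BP)
density = snps_per_mb(SPACING_BP)
print(f"{surveyed_bp / 1e9:.2f} Gb surveyed, about {density:.0f} SNPs per Mb")
```

The two figures together imply roughly 2.66 Gb of surveyed sequence, which is in line with the size of the draft genome sequence.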

How does one distinguish sequence errors from polymorphisms?
- Sequence errors: each piece of genome is sequenced at least 10 times to reduce the error rate (0.01%)
- Polymorphisms:
  - sequence variation between individuals is 0.1%
  - to be defined as a polymorphism, the altered sequence must be present in a significant fraction of the population
  - rate of polymorphisms in the diploid human genome is about 1 in 500 bp
Nature (2001) 15th Feb, Vol 409 special issue; pgs 821-823 & 928

SNPs and disease

SNPs and risk of disease: N(291)S

SNPs and pharmacogenomics

4) Distribution of GC content
- Genome-wide average of 41%
- Huge regional variations exist, e.g. the distal 48 Mb of chromosome 1p has 47% but chromosome 13 has only 36%
- Confirms cytogenetic staining with G-bands (Giemsa):
  - dark G-bands: low GC content (37%)
  - light G-bands: high GC content (45%)
Nature (2001) 15th Feb, Vol 409 special issue; pgs 876-877
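Regional GC content of the kind quoted above is a simple per-window base count. The following is a minimal sketch on a toy sequence; the window size, the function names and the toy sequence are assumptions for illustration, not the method used in the Nature analysis.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq, window):
    """GC fraction for consecutive non-overlapping windows."""
    return [gc_fraction(seq[i:i + window]) for i in range(0, len(seq), window)]


# Toy sequence: an AT-rich window, a GC-rich window and a mixed window.
toy = "ATATATATAT" + "GCGCGCATGC" + "AATTGGCCAT"
print(windowed_gc(toy, 10))
print(gc_fraction(toy))
```

On real chromosomes the windows would be tens of kilobases or more, and the resulting profile is what is compared with the dark and light G-bands.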

5) CpG islands
[Figure: CpG methylated at C (methyl CpG) is converted to TpG by deamination; CpG islands show no methylation]
Significance of CpG islands
- Non-methylated CpG islands are associated with the 5’ ends of genes
- Usually overlap the promoter region
- Aberrant methylation of CpG islands is linked to pathologies like cancer or epigenetic diseases like Rett syndrome
http://www.sanger.ac.uk/HGP/cgi.shtml

Inheritance of CpG methylation

Epigenetic disease: Rett syndrome
- Characterised by neurodevelopmental problems after birth
- Caused by mutations in a gene on the X chromosome, MECP2 (methyl CpG-binding protein 2), whose protein normally binds to methylated CpG and represses gene expression
- Symptoms are associated with the failure of mutated MECP2 to regulate transcription of a specific gene, DLX5, one allele of which is normally imprinted
- Without the MeCP2 protein, production of the Dlx5 protein is increased, which influences production of the neurotransmitter GABA in the brain

CpG islands
- CpG dinucleotides are greatly under-represented in the human genome (about 5 times less frequent than expected)
- ~28,890 CpG islands in number
- ~56% of human genes and 47% of mouse genes have CpG islands
- Variable density, e.g. Y has 2.9/Mb but chromosomes 16, 17 & 22 have 19-22/Mb
- Average is 10.5/Mb
Nature (2001) 15th Feb, Vol 409 special issue; pg 877-888
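"Under-represented" is usually expressed as an observed/expected CpG ratio: the number of CpG dinucleotides, times the sequence length, divided by the product of the C and G counts. The sketch below computes that ratio and applies a simple island test. The slides do not say which island definition was used, so the thresholds here (length at least 200 bp, GC at least 50%, ratio at least 0.6, in the style of the Gardiner-Garden and Frommer criteria) are an assumption for illustration, as are the function names and toy sequences.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply assumed thresholds; not necessarily those used in the Nature paper."""
    seq = seq.upper()
    if len(seq) < min_len:
        return False
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return gc >= min_gc and cpg_obs_exp(seq) >= min_ratio


island_like = "CG" * 100     # 200 bp, CpG-rich
depleted = "CATGCATG" * 25   # 200 bp, 50% GC, but no CpG at all

print(cpg_obs_exp(island_like), looks_like_cpg_island(island_like))
print(cpg_obs_exp(depleted), looks_like_cpg_island(depleted))
```

The second toy sequence has the same GC content threshold met but contains only TpG and CpA, the products expected when methylated CpG is lost by deamination, so it fails the ratio test.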

6) Recombination rates
2 main observations:
- Recombination rate increases with decreasing arm length
- Recombination rate is suppressed near the centromeres and increases towards the distal 20-35 Mb

7) Repeat content
a) Age distribution
b) Comparison with other genomes
c) Variation in distribution of repeats
d) Distribution by GC content
e) Y chromosome
Nature (2001) 409: pp 881-891

Repeat content
a) Age distribution
- Most interspersed repeats predate the eutherian radiation (confirms the slow rate of clearance of nonfunctional sequence from vertebrate genomes)
- LINEs and SINEs have extremely long lives
- 2 major peaks of transposon activity
- No DNA transposition in the past 50 Myr
- LTR retroposons teetering on the brink of extinction

a) Age distribution
- Overall decline in interspersed repeat activity in the hominid lineage in the past 35-40 Myr, compared to the mouse genome, which is younger and more dynamic

b) Comparison with other genomes
- Higher density of transposable elements in the euchromatic portion of the genome
- Higher abundance of ancient transposons
- 60% of interspersed repeats are made up of LINE1 and Alu repeats, whereas DNA transposons represent only 6%
(A few human genes appear likely to have resulted from horizontal transfer from bacteria!)

c) Variation in distribution of repeats
Some regions show either
- High repeat density, e.g. chromosome Xp11: a 525 kb region shows 89% repeat density
- Low repeat density, e.g. HOX homeobox gene cluster (<2% repeats), indicative of regulatory elements which have low tolerance for insertions
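Repeat density for a region is the fraction of its bases annotated as repeats. A minimal sketch follows, assuming the region is supplied as a soft-masked sequence in which repeat bases are lowercase (a common convention, not something stated on the slide); the function name and toy region are illustrative.

```python
def repeat_density(soft_masked_seq):
    """Fraction of bases in lowercase, i.e. soft-masked as repeats (assumed convention)."""
    if not soft_masked_seq:
        return 0.0
    masked = sum(1 for base in soft_masked_seq if base.islower())
    return masked / len(soft_masked_seq)


# Toy region: 80 repeat (lowercase) bases followed by 20 unique bases.
toy_region = "acgtacgtac" * 8 + "ACGTACGTAC" * 2
print(repeat_density(toy_region))

# Scale of the Xp11 example from the slide: 89% of a 525 kb region.
xp11_repeat_bp = round(0.89 * 525_000)
print(xp11_repeat_bp)
```

For the Xp11 example this works out to roughly 467 kb of repeat sequence within the 525 kb region.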

d) Distribution by GC content
- High GC: gene rich; high AT: gene poor
- LINEs abundant in AT-rich regions
- SINEs lower in AT-rich regions
- Alu repeats in particular retained in actively transcribed GC-rich regions, e.g. chromosome 19 has 5% Alus compared to Y chromosome

e) The Y chromosome
- Unusually young genome (high tolerance to gaining insertions)
- Mutation rate is 2.1x higher in the male germline, possibly due to cell division rates or different repair mechanisms

- Working draft published: Feb 2001
- Finished sequence: April 2003
- Annotation of genes ongoing (refer: International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature, 21 October 2004; doi: 10.1038/nature03001)

References
- HMG 3 by Strachan and Read: Chapter 9 pp 265-268; Chapter 10 pp 339-348
- Genetics: from genes to genomes by Hartwell et al (2/e): Chapter 10 pp 339-348
- Nature (2001) 409: pp 879-891