Genome Organization and Evolution
Assignment for 2/24/04 ● Read: Lesk, Chapter 2 ● Exercises 2.1, 2.5, 2.7, p 110 ● Problem 2.2, p 112 ● Weblems 2.4, 2.7, pp
Assignment for 3/02/04 ● Pick any two bioinformatics projects or resources, such as those in the previous lecture. For each, write a brief survey (~1000 words) covering the history of the project, the participants, the funding, and its purpose and scope. ● Sources: web site, mailing lists, FAQs, published papers.
Genes ● Definition: A gene is a segment of DNA which codes for a protein – Caveats: – DNA which codes for functional RNA? – Control regions?
Gene organization ● A gene may occur on either strand of DNA ● Genes are (almost always) continuous stretches in prokaryotes ● Genes are (often) discontinuous stretches (exons) in eukaryotes. The intervening regions are called introns ● Upstream of the gene is a binding site for the transcription machinery (the promoter) ● The location of other regulatory regions is less predictable
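Because a gene may lie on either strand, a sequence search has to consider the reverse complement as well. The following is a minimal sketch, assuming plain strings of unambiguous bases; the function names `reverse_complement` and `find_on_either_strand` are illustrative, not from any library.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_on_either_strand(genome: str, gene: str) -> list[tuple[str, int]]:
    """Return (strand, start) pairs for every occurrence of `gene`.

    `start` is a 0-based index on the forward strand as given.
    A "-" hit means the gene is encoded on the opposite strand.
    """
    hits = []
    for strand, query in (("+", gene), ("-", reverse_complement(gene))):
        start = genome.find(query)
        while start != -1:
            hits.append((strand, start))
            start = genome.find(query, start + 1)
    return hits
```

A sequence that equals its own reverse complement would be reported on both strands by this sketch.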
The Central Dogma ● One gene, one protein ● Like most dogmas, not entirely true ● Alternative splicing permits the manufacture of many products from a single gene ● The protein products are sometimes called the proteome ● With current technology, more gene information is available than protein information
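To see how one gene can yield many products, consider a deliberately simplified model of alternative splicing in which the first and last exons are always kept and each internal exon is either kept or skipped. This is an assumption made for illustration (real splicing is regulated and far more constrained), and `exon_skipping_isoforms` is a hypothetical helper name.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Yield transcripts under a toy exon-skipping model.

    The first and last exons are always retained; every ordered
    subset of the internal exons gives one candidate transcript.
    """
    if len(exons) < 2:
        yield "".join(exons)
        return
    first, *internal, last = exons
    # Go from "keep all internal exons" down to "skip all of them".
    for k in range(len(internal), -1, -1):
        for kept in combinations(internal, k):
            yield first + "".join(kept) + last
```

Under this model a gene with n internal exons gives up to 2^n transcripts, which is why the number of products can greatly exceed the number of genes.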
Transmission of information ● The continuity of life is a reflection of the (nearly) faithful transmission of genetic information ● The adaptation of life (evolution) is a result of imperfect transmission of information, and natural selection
Genetic maps ● Variable number tandem repeats (VNTRs – minisatellites), bp, are a sort of genetic fingerprint ● Short tandem repeat polymorphisms (STRPs – microsatellites), 2-5 bp, are another kind of marker ● A sequence tagged site (STS), bp, is a known unique location in the genome
Identifying genes ● A long open reading frame (ORF) is probably a gene (but what about eukaryotes, where introns interrupt the reading frame? Introns typically begin with GT and end with AG) ● A gene promoter site has identifiable characteristics (TATA box) ● If it looks like a known gene, it's a gene
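The "long ORF" heuristic can be sketched as a scan of all six reading frames for an ATG followed, in frame, by a stop codon. This is a minimal illustration that assumes the standard stop codons and uninterrupted genes, so it suits prokaryote-style sequence rather than intron-containing eukaryotic genes. The name `find_orfs` and the `min_codons` threshold are choices made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=100):
    """Return (strand, start, end, orf) for each ORF of sufficient length.

    Coordinates are 0-based, half-open, and relative to the strand
    that was scanned (for "-" hits, the reverse complement).
    Length is counted in codons from ATG up to, not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ATG in this frame
    return orfs
```

Choosing `min_codons` trades sensitivity for specificity: by chance alone a stop codon appears fairly often in random sequence, so long stop-free stretches are unlikely unless they are coding.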
Prokaryote genomes ● Example: E. coli ● 89% coding ● 4,285 genes ● 122 structural RNA genes ● Prophage remains ● Insertion sequence elements ● Horizontal transfers
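A figure such as "89% coding" is the fraction of the genome covered by annotated genes. As a rough sketch of the arithmetic, assuming genes are given as half-open (start, end) intervals and that overlapping genes should be counted only once (the helper name `coding_fraction` and the toy coordinates are illustrative, not real E. coli data):

```python
def coding_fraction(genome_length, genes):
    """Fraction of the genome covered by at least one gene interval.

    `genes` is an iterable of (start, end) pairs, 0-based and half-open.
    Overlapping or nested intervals are not double-counted.
    """
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(genes):
        start = max(start, current_end)
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length
```

For example, intervals (0, 50), (40, 60) and (70, 99) on a 100 bp toy genome cover 89 bases, giving a coding fraction of 0.89.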
Eukaryotic genome ● Example: C. elegans ● 6 chromosomes (five autosomes plus X) ● 19,099 genes ● Coding region – 27% ● Average of 5 introns/gene ● Both long and short duplications
Evolution of genomes ● Adaptation of species is coterminous with adaptation of genomes ● Where do genes come from? (Answer: from other genes) ● Homologs: orthologs and paralogs ● Lateral transfer ● Molecular species each have their own family tree ● Genes are widely shared
Close relatives ● Yeast, fly, worm and human share at least 1308 groups of proteins ● Unique to vertebrates: immune proteins (for example) ● Unique molecules are adapted from ancient molecules of different purpose but similar design ● Most new proteins come from domain rearrangement ● Most new species come from control region variation