The Human Genome Project (Lecture 7)


The Human Genome Project (Lecture 7) What did they do? Why did they do it? What will it mean for humankind?

Brief history of the work…
- Proposed in 1985.
- 1988. Initiated and funded by NIH and US Dept. of Energy ($3 billion set aside).
- 1990. Work begins.
- 1998. Celera announces a 3-year plan to complete the project years early.
- Published in Science and Nature in February 2001.

The Human Genome Project
Began in 1990.
The Mission of the HGP: The quest to understand the human genome and the role it plays in both health and disease.
“The true payoff from the HGP will be the ability to better diagnose, treat, and prevent disease.” --- Francis Collins, Director of the HGP and the National Human Genome Research Institute (NHGRI)

Goals of HGP
- Create a physical map of the 24 human chromosomes (22 autosomes, X & Y)
- Identify the entire set of genes & map them all to their chromosomes
- Determine the nucleotide sequence of the estimated 3 billion base pairs
- Analyze genetic variation among humans
- Map and sequence the genomes of model organisms

Model organisms
- Bacteria (E. coli, Haemophilus influenzae, several others)
- Yeast (Saccharomyces cerevisiae)
- Plant (Arabidopsis thaliana)
- Roundworm (Caenorhabditis elegans)
- Fruit fly (Drosophila melanogaster)
- Mouse (Mus musculus)

Goals of HGP (cont’d)
- Develop new laboratory and computing technologies to make all this possible
- Disseminate genome information
- Consider ethical, legal, and social issues associated with this research

The Beginning of the Project
Most of the first 10 years of the project were spent improving the technology to sequence and analyze DNA. Scientists all around the world worked to make detailed maps of our chromosomes and to sequence model organisms such as the worm, fruit fly, and mouse.

How they did it…
- DNA from 5 humans: 2 males, 3 females; 2 Caucasian, one each Asian, African, and Hispanic
- Cut up DNA with restriction enzymes
- Ligated into BACs & YACs, then grew them up
- Sequenced the BACs
- Let a supercomputer put the pieces together
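
The "cut up DNA with restriction enzymes" step can be illustrated with a minimal sketch. The slide does not name an enzyme, so EcoRI (recognition site GAATTC, cutting between G and AATTC) is used here purely as an assumption, and the function name `digest` and the example sequence are hypothetical. Only one strand is modelled.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is how many bases into the site the cut falls.
    The defaults model EcoRI (G^AATTC), chosen only for illustration.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])  # piece up to the cut point
        start = cut
        pos = sequence.find(site, pos + 1)     # look for the next site
    fragments.append(sequence[start:])         # whatever is left after the last cut
    return fragments


if __name__ == "__main__":
    dna = "AAGAATTCTTGGGAATTCCC"  # made-up sequence with two EcoRI sites
    for fragment in digest(dna):
        print(fragment)
```

Joining the fragments back together in order reproduces the input, which is a quick sanity check that no bases are lost at the cut points.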

[Figure: DNA cut into segments that are inserted into BACs. Labels: lots of overlap; known sequence.]

What did they find?

5000 bases per page CACACTTGCATGTGAGAGCTTCTAATATCTAAATTAATGTTGAATCATTATTCAGAAACAGAGAGCTAACTGTTATCCCATCCTGACTTTATTCTTTATG AGAAAAATACAGTGATTCC AAGTTACCAAGTTAGTGCTGCTTGCTTTATAAATGAAGTAATATTTTAAAAGTTGTGCATAAGTTAAAATTCAGAAATAAAACTTCATCCTAAAACTCTGTGTGTTGCTTTAAATAATC AGAGCATCTGC TACTTAATTTTTTGTGTGTGGGTGCACAATAGATGTTTAATGAGATCCTGTCATCTGTCTGCTTTTTTATTGTAAAACAGGAGGGGTTTTAATACTGGAGGAACAA CTGATGTACCTCTGAAAAGAGA AGAGATTAGTTATTAATTGAATTGAGGGTTGTCTTGTCTTAGTAGCTTTTATTCTCTAGGTACTATTTGATTATGATTGTGAAAATAGAATTTATCC CTCATTAAATGTAAAATCAACAGGAGAATAGCAAAAACTTATGAGATAGATGAACGTTGTGTGAGTGGCATGGTTTAATTTGTTTGGAAGAAGCACTTGCCCCAGAAGATACACAAT GAAATTCATGTTATTGAGTAGAGTAGTAATACAGTGTGTTCCCTTGTGAAGTTCATAACCAAGAATTTTAGTAGTGGATAGGTAGGCTGAATAACTGACTTCCTATC ATTTTCAGGTT CTGCGTTTGATTTTTTTTACATATTAATTTCTTTGATCCACATTAAGCTCAGTTATGTATTTCCATTTTATAAATGAAAAAAAATAGGCACTTGCAAATGTCAGATCACTTGCCTGTGGT CATTCGGGTAGAGATTTGTGGAGCTAAGTTGGTCTTAATCAAATGTCAAGCTTTTTTTTTTCTTATAAAATATAGGTTTTAATATGAGTTTTAAAATAAAATTAATTAGAAAAAGGCAA ATTACTCAATATATATAAGGTATTGCATTTGTAATAGGTAGGTATTTCATTTTCTAGTTATGGTGGGATATTATTCAGACTATAATTCCCAATGAAAAAACTTTAAAAAATGCTAGTGA TTGCACACTTAAAACACCTTTTAAAAAGCATTGAGAGCTTATAAAATTTTAATGAGTGATAAAACCAAATTTGAAGAGAAAAGAAGAACCCAGAGAGGTAAGGATATAACCTTACC AGTTGCAATTTGCCGATCTCTACAAATATTAATATTTATTTTGACAGTTTCAGGGTGAATGAGAAAGAAACCAAAACCCAAGACTAGCATATGTTGTCTTCTTAAGGAGCCCTCCCCT AAAAGATTGAGATGACCAAATCTTATACTCTCAGCATAAGGTGAACCAGACAGACCTAAAGCAGTGGTAGCTTGGATCCACTACTTGGGTTTGTGTGTGGCGTGACTCAGGTAATCT CAAGAATTGAACATTTTTTTAAGGTGGTCCTACTCATACACTGCCCAGGTATTAGGGAGAAGCAAATCTGAATGCTTTATAAAAATACCCTAAAGCTAAATCTTACAATATTCTCAAG AACACAGTGAA ACAAGGCAAAATAAGTTAAAATCAACAAAAACAACATGAAACATAATTAGACACACAAAGACTTCAAACATTGGAAAATACCAGAGAAAGATAATAAATAT TTTACTCTTTAAAAATTTAGTTAAAAGCTTAAACTAATTGTAGAGAAAA AACTATGTTAGTATTATATTGTAGATGAAATAAGCAAAACATTTAAAATACAAATGTGATTACTTAAAT TAAATATAATAGATAATTTACCACCAGATTAGATACCATTGAAGGAATAATTAATATACTGAAATACAGGTCAGTAGAATTTTTTTCAATTCAGCATGGAGATGTAAAAAATGAAAA 
TTAATGCAAAAAATAAGGGCACAAAAAGAAATGAGTAATTTTGATCAGAAATGTATTAAAATTAATAAACTGGAAATTTGACATTTAAAAAAAGCATTGTCATCCAAGTAGATGTG TCTATTAAATAGTTGTTCTCATATCCAGTAATGTAATTATTATTCCCTCTCATGCAGTTCAGATTCTGGGGTAATCTTTAGACATCAGTTTTGTCTTTTATATTATTTATTCTGTTTACTAC ATTTTATTTTGCTAATGATATTTTTAATTTCTGACATTCTGGAGTATTGCTTGTAAAAGGTATTTTTAAAAATACTTTATGGTTATTTTTGTGATTCCTATTCCTCTATGGACACCAAGGCT ATTGACATTTTCTTTGGTTTCTTCTGTTACTTCTATTTTCTTAGTGTTTATATCATTTCATAGATAGGATATTCTTTATTTTTTATTTTTATTTAAATATTTGGTGATTCTTGGTTTTCTCAGCC ATCTATTGTCAAGTGTTCTTATTAAGCATTATTATTAAATAAAGATTATTTCCTCTAATCACATGAGAATCTTTATTTCCCCCAAGTAATTGAAAATTGCAATGCCATGCTGCCATGTGG TACAGCATGGGTTTGGGCTTGCTTTCTTCTTTTTTTTTTAACTTTTATTTTAGGTTTGGGAGTACCTGTGAAAGTTTGTTATATAGGTAAACTCGTGTCACCAGGGTTTGTTGTACAGATCA TTTTGTCACCTAGGTACCAAGTACTCAACAATTATTTTTCCTGCTCCTCTGTCTCCTGTCACCCTCCACTCTCAAGTAGACTCCGGTGTCTGCTGTTCCATTCTTTGTGTCCATGTGTTCTC ATAATTTAGTTCCCCACTTGTAAGTGAGAACATGCAGTATTTTCTAGTATTTGGTTTTTTGTTCCTGTGTTAATTTGCCCAGTATAATAGCCTCCAGCTCCATCCATGTTACTGCAAAGAA CATGATCTCATTCTTTTTTATAGCTCCATGGTGTCTATATACCACATTTTCTTTATCTAAACTCTTATTGATGAGCATTGAGGTGGATTCTATGTCTTTGCTATTGTGCATATTGCTGCAAG AACATTTGTGTGCATGTGTCTTTATGGTAGAATGATATATTTTCTTCTGGGTATATATGCAGTAATGCGATTGCTGGTTGGAATGGTAGTTCTGCTTTTATCTCTTTGAGGAATTGCCATG CTGCTTTCCACAATAGTTGAACTAACTTACACTCCCACTAACAGTGTGTAAGTGTTTCCTTTTCTCCACAACCTGCCAGCATCTGTTATTTTTTGACATTTTAATAGTAGCCATTTTAACT GGTATGAAATTATATTTCATTGTGGTTTTAATTTGCATTTCTCTAATGATCAGTGATATTGAGTTTGTTTTTTTTCACATGCTTGTTGGCTGCATGTATGTCTTCTTTTAAAAAGTGTCTGTT CATGTACTTTGCCCACATTTTAATGGGGTTGTTTTTCTCTTGTAAATTTGTTTAAATTCCTTATAGGTGCTGGATTTTAGACATTTGTCAGACGCATAGTTTGCAAATAGTTTCTCCCATTC TGTAGGTTGTCTGTTTATTTTGTTAATAGTTTCTTTTGCTATGCAGAAGCTCTTAATAAGTTTAATGAGATCCTGATATGTTAGGCTTTGTGTCCCCACCCAAATCTCATCTTGAATTATA TCTCCATAATCACCACATGGAGAGACCAGGTGGAGGTAATTGAATCTGGGGGTGGTTTCACCCATGCTGTTCTTGTGATAGTGAATGAGTTCTCACGAGATCTAATGGTTTTATGAGG GGCTCTTCCCAGCTTTGCCTGGTACTTCTCCTTCCTGCCGCTTTGTGAAAAAGGTGCATTGCGTCCCTTTCACCTTCTTCTATAATTGTAAGTTTCCTGAGGCCTTCCCAGCCATGCTGAA 
CTTCAAGTCAATTAAACCTTTTTCTTTATAAATTACTCAGTCTCTGGTGGTTCTTTATAGCAGTGTGAAAATGGACTAATGAAGTTCCCATTTATGAATTTTTGCTTTTGTTGCAATTGCTT TTGACATCTTAGTCATGAAATCCTTGCCTGTTCTAAGTACAGGACGGTATTGCCTAGGTTGTCTTCCAGGGTTTTTCTAATTTTGTGTTTTGCATTTAAGTGTTTAATCCATCTTGAGTTGA TTTTTGTATATTGTGTAAGGAAGGGGTCCAGTTTCAATCTTTTGCATATGGCTAGTTAGTTATCCCAGTACCATTTATTGAAAAGACAGTCTTTTCCCCATCGCTCGTTTTTGTCAGTTTT ATTGATGATCAGATAATCATAGCTGTGTGGCTTTATTTCTGGGTTCTTTATTCTGTTCTATTGGTTTATGTCCCTGTTTTTGTGCCAGTACCATGCTGTTTTGGTTAACATAGCCCTGTAGT ATAGTTTGAGGTCAGATAGCCTGATGCTTCCAGCTTTGTTCTTTTTCTTAAGATTGCCTTGGCTATTTGGCCTCTTTTTTGGTTCCACATGAATTTTAAAACAGTTGTTTCTAGTTTTTGAA GAATGTCATTGGTAGTTTGATAGAAATAGCATTTAATCTGTAAATTGATTTGTGCAGTATGGCCTTTTAATGATATTGATTCTTCCTATCCATGAGCATGATATGTTTTCCATTTTGTTTG TATCCTCTCTGATTTCTTTGTGCAGTGTTTTGTAATTCTCAT TGTAGAGATTTTTCACCTCCCTGGTTAGTTGTATTTTACCCTAGATATTT TATTCTTTTTGTGAAAATTGTGAATGGGAT TGCCTTCCTGATTTGACTGC CAGCTTGGTTACTGTTGGTTTATAGAAATGCTAGTGATTTTTGTACATTG ATTTTCTTTCTAAAACTTTGCTGAAGTTTTTTTTATTAGCAGAAGGAGCT TTGGGGCTGAGACTATGGGGTTTTCTAGATATAGAATCATGTCAGCTTCAAATAGGGATAATTTTACTTCCTCTCTTCCTATTTGGATGCCCTTTATTTCTTTCTCTTGCCTGATTACTCTG GCTGGGATTTCCTATGTTGAATAGGAGT CATGAGAGAGGGCATCAAATCTACACATATCAAATACTAACCTTGAATGTCTAGATATTT TATTCTTTTTGTGAAAATTGTGAATGGGAT

How much data make up the human genome? 3 pallets with 40 boxes per pallet x 5000 pages per box x 5000 bases per page = 3,000,000,000 bases! To get accurate sequence requires 6-fold coverage. Now: Shred 18 pallets and reassemble.
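
The slide's arithmetic can be checked with a few lines of Python. This is only a restatement of the numbers quoted above; the constant and function names are made up for the sketch.

```python
# Figures taken directly from the slide.
PALLETS = 3
BOXES_PER_PALLET = 40
PAGES_PER_BOX = 5000
BASES_PER_PAGE = 5000
COVERAGE = 6  # fold coverage the slide says is needed for accurate sequence


def genome_bases(pallets=PALLETS):
    """Number of bases printed on the given number of pallets."""
    return pallets * BOXES_PER_PALLET * PAGES_PER_BOX * BASES_PER_PAGE


def pallets_for_coverage(coverage=COVERAGE):
    """Pallets of paper needed to hold `coverage` copies of the genome."""
    return PALLETS * coverage


if __name__ == "__main__":
    print(f"{genome_bases():,} bases in one copy of the genome")
    print(f"{pallets_for_coverage()} pallets at {COVERAGE}-fold coverage")
    print(f"{genome_bases(pallets_for_coverage()):,} bases to read in total")
```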

Human genome content
- 1-2% codes for protein products
- 24% important for translation
- 75% “junk”
Repetitive elements
- Satellites (regular, mini-, micro-)
- Transposons
- Retrotransposons
- Parasites
BOOK THAT WROTE ITSELF

Comparative Genomics

Yeast
- 70 human genes are known to repair mutations in yeast
- Nearly all we know about the cell cycle and cancer comes from studies of yeast
- Advantages: fewer genes (6000), few introns
- 31% of yeast genes give the same products as human homologues

Drosophila
- Nearly all we know of how mutations affect gene function comes from Drosophila studies
- We share 50% of their genes
- 61% of genes mutated in 289 human diseases are found in fruit flies
- 68% of genes associated with cancers are found in fruit flies
- Knockout mutants
- Homeobox genes

C. elegans
- 959 somatic cells in the adult hermaphrodite, 302 of them in the nervous system
- 131 cells are programmed for apoptosis during development
- Apoptosis is involved in several human genetic neurological disorders: Alzheimer's, Huntington's, Parkinson's

Mouse known as “mini” humans Very similar physiological systems Share 90% of their genes

The Human Genome Project at UC Santa Cruz Phoenix Eagleshadow November 9, 2004

The Challenges were Overwhelming
First there was the Assembly. The DNA sequence is so long that no technology can read it all at once, so it was broken into pieces. There were millions of clones (small sequence fragments, reproduced so there was enough material to sequence). The assembly process included finding where the pieces overlapped in order to put the draft together. 3,200,000 piece puzzle anyone?
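
The idea of "finding where the pieces overlapped" can be sketched as a toy greedy merge: repeatedly join the two fragments with the longest suffix-to-prefix overlap. This is not the software used for the working draft, which also relied on clone maps and had to cope with repeats and sequencing errors; the function names and the example fragments are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments pairwise, always taking the largest overlap first."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave the pieces as separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]  # made-up overlapping fragments
    print(greedy_assemble(reads))
```

A greedy merge like this is easy to follow but can be misled by repeated sequence, which is one reason real assemblers are far more elaborate.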

The “Working Draft” of the human genome
[Figure: overlapping sequence fragments (ACCTTGG, CCTGAAT, CTAGGCT, TTGCATC, CCTAGTC, CTGATCG) feeding into the working draft assembly. Labels: freeze of sequence data generated by NCBI; clone layouts generated by Washington University; sequence; clone maps; assembly generated by UCSC; working draft assembly.]

UCSC put the human genome sequence on the web on July 7, 2000. Cyber geeks searched for hidden messages, and for “GATTACA”. UCSC put the human genome sequence on CD in October 2000, with varying results.

The Completion of the Human Genome Sequence
- June 2000: White House announcement that the majority of the human genome (80%) had been sequenced (working draft).
- July 2000: Working draft made available on the web at genome.ucsc.edu.
- February 2001: Publication of 90 percent of the sequence in the journal Nature.
- July 2003: Completion of 99.99% of the genome as finished sequence.

The Project is not Done… Next there is the Annotation: The sequence is like a topographical map, the annotation would include cities, towns, schools, libraries and coffee shops! So, where are the genes? How do genes work? And, how do scientists use this information for scientific understanding and to benefit us?

What do genes do anyway? We only have ~27,000 genes, so that means that each gene has to do a lot. Genes make proteins that make up nearly all we are (muscles, hair, eyes). Almost everything that happens in our bodies happens because of proteins (walking, digestion, fighting disease). Eye color and hair color are determined by genes.

Of Mice and Men: It’s all in the genes Humans and Mice have about the same number of genes. But we are so different from each other, how is this possible? One human gene can make many different proteins while a mouse gene can only make a few! Did you say cheese? Mmm, Cheese!

Genes are important By selecting different pieces of a gene, your body can make many kinds of proteins. (This process is called alternative splicing.) If a gene is “expressed” that means it is turned on and it will make proteins.
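
As a rough illustration of how "selecting different pieces of a gene" multiplies the number of possible products, the sketch below enumerates transcripts made by skipping internal exons. It assumes, for simplicity, that the first and last exons are always kept and that exon order is preserved; real alternative splicing has other modes as well. The exon sequences and the function name `isoforms` are invented for the example.

```python
from itertools import combinations


def isoforms(exons):
    """Yield every transcript formed by keeping the first and last exon
    and any in-order subset of the exons between them."""
    first, *middle, last = exons
    for r in range(len(middle) + 1):
        for chosen in combinations(middle, r):  # combinations preserves exon order
            yield "".join((first, *chosen, last))


if __name__ == "__main__":
    toy_gene = ["ATG", "GCC", "TTT", "TAA"]  # four made-up exons
    for transcript in isoforms(toy_gene):
        print(transcript)
```

With n internal exons this simple model gives 2**n transcripts, so even a handful of optional exons produces many variants from one gene.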

What we’ve learned from our genome so far…
There are a relatively small number of human genes, fewer than 30,000, but they have a complex architecture that we are only beginning to understand and appreciate.
-We know where 85% of genes are in the sequence.
-We don’t know where the other 15% are because we haven’t seen them “on” (they may only be expressed during fetal development).
-We only know what about 20% of our genes do so far.
So it is relatively easy to locate genes in the genome, but it is hard to figure out what they do.

How do scientists find genes? The genome is so large that useful information is hard to find. Researchers at UCSC decided to make a computational microscope to help scientists search the genome. Just as you would use “google” to find something on the internet, researchers can use the “UCSC Genome Browser” to find information in the human genome. Explore it at http://genome.ucsc.edu

The UCSC Genome Browser

The browser takes you from early maps of the genome . . .
Chromosomes were first examined through the microscope. Giemsa staining produced chromosome bands. Later, Fluorescence In Situ Hybridization was developed. The genetic map grew to 5000 markers (places where distinctive variation occurs, like those used in DNA fingerprinting). (The picture shows chr 18, about 85 million bases.) The order of the markers was determined by studying family inheritance of variations. These studies also led to the identification of genes associated with some diseases: the famous hunts in the 80’s for the genes behind Huntington’s disease, Duchenne muscular dystrophy, retinoblastoma, cystic fibrosis, and others.

. . . to a multi-resolution view . . . Ochre is mouse chr 5, yellow is mouse chr7

. . . at the gene cluster level . . . HD = Huntington’s disease gene. First success of RFLP (restriction fragment length polymorphism) mapping. Found linkage in 1983 before the first RFLP map was even constructed. Disease claimed Woody Guthrie. Nancy Wexler, whose mother had died of the disease, became director of the Huntington’s commission (congressional) and of NIH project. Collected family data in Venezuela. “Lucky Jim” Gusella found a link between HD and one of the first RFLP markers he tested. Took another 10 years to actually locate the gene. Takes seconds on the browser.

. . . the single gene level . . .

. . . the single exon level . . .

. . . and at the single base level
caggcggactcagtggatctggccagctgtgacttgacaag
caggcggactcagtggatctagccagctgtgacttgacaag
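
The two sequences on this slide differ at a single base. A minimal sketch for locating such a difference is shown below; the function name `point_differences` is hypothetical, and the sketch assumes the sequences are already aligned and of equal length.

```python
def point_differences(ref, alt):
    """Return (1-based position, ref base, alt base) for every mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, a)
        for i, (r, a) in enumerate(zip(ref, alt))
        if r != a
    ]


if __name__ == "__main__":
    first = "caggcggactcagtggatctggccagctgtgacttgacaag"
    second = "caggcggactcagtggatctagccagctgtgacttgacaag"
    for position, ref_base, alt_base in point_differences(first, second):
        print(f"position {position}: {ref_base} -> {alt_base}")
```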

The Continuing Project
- Finding the complete set of genes and annotating the entire sequence. Annotation is like detailing; scientists annotate sequence by listing what has been learned experimentally and computationally about its function. Annotation includes, for example, not only the genes, but the proteins the genes make and what diseases they are associated with.
- Proteomics is studying the structure and function of groups of proteins. Proteins are really important, but we don’t really understand how they work. Proteins are really important for drug development and discovery.
- Comparative Genomics is the process of comparing different genomes in order to better understand what they do and how they work, such as comparing humans, chimpanzees, and mice, which are all mammals but all very different.