Published by Norman Carson. Modified over 9 years ago.
1
New issues in storage and analysis
Christophe Roos - MediCel ltd
christophe.roos@medicel.fi
Annotating genomes with functional information: automatic but without errors?
High throughput data acquisition
2
Spring 2002 Christophe Roos - 6/6 Functional genomics
Genome annotation
Annotation is the sum of all non-sequence information that can be connected to a sequence.
[Diagram labels: Gene; Phylogenetic inference; Connectors to other maps; Metabolic profiles; Cofactors and metabolites; Sequence homologs in other genomes; Metabolic map locator; Sequence; Genome location; Expression info; Functional chemistry; Structure; Raw images; Numerical values; Cluster genes; SS assignments; Structure annotation; Electron density; Raw data; Experimental data]
3
Genome annotation
Primary sources of information about what genes do are laboratory experiments. It may take several experiments to obtain one data point.
All that data should ideally be associated, that is, hyperlinked among databases.
–Magpie is an environment for genome annotation
Compare genomes to learn how their structure affects function
–Bacteria have modules of genes functioning together, organised in ‘operons’
–Higher organisms need to pack the DNA to fit it in the nucleus. Activating a gene means unpacking, which is inefficient if it is done for each gene separately
4
Functional genomics
High throughput technologies give us long lists of the parts of systems (chromosomes, genomes, cells, etc.). We can now analyse how they work together to produce the complexity of organisms.
The function of the genome is
–Metabolism: metabolic pathways convert chemical energy derived from food into useful work in the cell.
–Regulation: regulatory pathways are biochemical mechanisms that control what genomic DNA does. They switch genes on and off in a controlled way.
–Signalling: signalling pathways control the movement of information (chemicals) from one component to another on many levels
–Construction
Functional genomics tries to map these pathways
5
Analysing the activity of the genome
Genomics: look at transcriptional activity of genes
–Transcription: When a gene is transcriptionally active, messenger RNA (mRNA) is synthesised. The amount of mRNA from each active gene varies over time.
–Turnover: Different mRNA species have different half-lives.
–Translation: The production of an mRNA does not imply that the corresponding protein is translated. Transcripts can also be produced for storage and later use.
–Technically feasible: it is possible to isolate all mRNAs from cells and to quantify them within certain limits.
Proteomics: look at proteins instead of transcripts
–Limited: at present, acceptable efficiency comes at the expense of insufficient quality
–Closer to ‘reality’ since the proteins are the players
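The turnover point can be made concrete with a small sketch. It assumes simple first-order decay (a common textbook model, not something stated on the slide); the function names and the numbers in the usage note are hypothetical.

```python
import math


def decay_constant(half_life_min):
    """First-order decay constant (per minute) for a given half-life."""
    return math.log(2) / half_life_min


def mrna_remaining(m0, half_life_min, t_min):
    """mRNA level t_min minutes after transcription stops (assumed model)."""
    return m0 * math.exp(-decay_constant(half_life_min) * t_min)


def steady_state_level(synthesis_rate, half_life_min):
    """Level at which constant synthesis balances first-order decay."""
    return synthesis_rate / decay_constant(half_life_min)
```

Under this assumption, two genes transcribed at the same rate but with half-lives of 5 and 50 minutes would settle at steady-state levels that differ tenfold, which is one reason why a measured mRNA amount cannot be read directly as transcriptional activity.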
6
EST: Expressed sequence tags
ESTs are partial sequences of cDNA clones. cDNA clones are DNA synthesised in vitro using mRNA as template.
–Why? cDNA is more stable than mRNA
–How? cDNA can be made ‘en masse’ starting from total cellular mRNA isolates. cDNA libraries are specific for tissue, developmental time, stimulation etc.
–Therefore, looking at cDNA is looking at mRNA is looking at active genes.
–To look at cDNA means sequencing (part of) it.
  Clones are picked at random (10’000-200’000)
  Sequenced from one or both ends once (no proofreading)
  Sequences entered into EST sequence databases
7
EST: Expressed sequence tags
Construct a clone by inserting a piece of DNA into a ‘vector’. The vector and its insert behave as an independent unit (‘plasmid’) in the bacterial host, and the vector carries some additional genes to allow for selection (only those bacteria with the vector will survive on antibiotics).
Amplify and sequence
Iterate (in parallel)
8
DNA hybridisation
DNA is a double helix and can be separated by denaturing treatment into two strands. Each strand becomes ‘sticky’ and attempts to renature with homologous single-strand sequences to form hybrids.
Single-strand DNA from all known genes of a given species can be attached to a matrix, then probed with labelled cDNA molecules from a given sample.
Only complementary probes will hybridise, and they can be detected if they have been previously labelled (radioactivity, fluorescent stain,...)
The technique can be multiplexed:
–High density arrays carrying sticky probes from a full genome
–Parallel hybridisation with cDNA from various sources
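The complementarity rule behind hybridisation can be sketched in a few lines. This is a deliberately idealised model: it assumes perfect Watson-Crick matches only and ignores mismatches, temperature and salt. All names and sequences are hypothetical.

```python
# Watson-Crick pairing table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hybridises(labelled, attached):
    """True if the labelled molecule pairs perfectly with part of the attached strand."""
    return reverse_complement(labelled) in attached.upper()


def detect(array_spots, labelled_sample):
    """Names of spots where at least one labelled molecule binds."""
    return [name for name, spot_seq in array_spots.items()
            if any(hybridises(mol, spot_seq) for mol in labelled_sample)]
```

For example, with spots `{"geneA": "ATGGCCATTG", "geneB": "TTTTTTTTTT"}` and a labelled sample `["CAATGG"]`, only `geneA` is reported, because `CAATGG` is the reverse complement of the `CCATTG` stretch in that spot.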
9
The process of using microarrays
Building the chip: massive PCR, PCR purification and preparation, preparing slides, printing
Preparing RNA: cell culture and harvest, RNA isolation, cDNA production
Hybridising the chip: post processing, array hybridization, probe labeling, data analysis
10
The output: the image
[Diagram labels: raw data; laser 1; laser 2; emission; scanning; analysis; overlay images and normalise]
cDNA is prepared from two samples (in this example) and labelled, each sample with a distinct color. Then the array is hybridised with the double probe and the signal is recorded as images.
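The "overlay images and normalise" step is often summarised as one log ratio per spot. The sketch below assumes spot intensities have already been extracted from the two images, and it uses median centring as the normalisation, which rests on the assumption that most genes do not change between the samples. Neither choice is stated on the slide, and the parameter names are hypothetical.

```python
import math
from statistics import median


def log_ratios(channel1, channel2, background=0.0, floor=1.0):
    """log2(channel1 / channel2) per spot after background subtraction.

    Intensities are clipped at `floor` so that the ratio stays defined.
    """
    ratios = []
    for c1, c2 in zip(channel1, channel2):
        c1 = max(c1 - background, floor)
        c2 = max(c2 - background, floor)
        ratios.append(math.log2(c1 / c2))
    return ratios


def median_normalise(ratios):
    """Shift the log ratios so that their median is zero."""
    m = median(ratios)
    return [r - m for r in ratios]
```

A spot with a normalised value near 0 is equally bright in both channels, positive values mean more signal from the first sample, and negative values more from the second.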
11
Problems in image analysis
–Noise
–Spot detection and intensity
–Alignment if overlay
12
A set of experiments on yeast...
Each row represents one gene
Each column represents one experiment
–The columns have been organised into related sets of experiments (ALPH, ELU,...)
The colors indicate gene activity (from high to absent)
13
Clustering the resulting data
Looking at 10’000 genes is not easy
Group genes into clusters of genes that behave the same way over a set of several experiments
–Hierarchical clustering
–K-means clustering
–Self-organising maps (SOM)
–Etc.
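As an illustration of the first method in the list, here is a minimal sketch of agglomerative hierarchical clustering. The choice of correlation distance and average linkage is an assumption made for the example, not something the slide specifies, and the gene names in the usage note are hypothetical.

```python
import math


def pearson_distance(x, y):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite ones."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # flat profile: treat as uncorrelated
    return 1.0 - cov / (sx * sy)


def hierarchical_clusters(profiles, k):
    """Merge the two closest clusters (average linkage) until k remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        dists = [pearson_distance(profiles[a], profiles[b])
                 for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

With profiles such as `{"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1], "g4": [8, 6, 4, 2]}` and `k=2`, the rising genes `g1` and `g2` end up in one cluster and the falling genes `g3` and `g4` in the other, regardless of their absolute levels. The pairwise search is quadratic at each step, so this form is only suitable for small examples.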
14
The overall process with microarrays
Microarray data must be interpreted within a larger framework of experimentation
15
Making a model of the data
[Diagram labels: Sequence, Structure, Function; Interaction, Network, Function; Genome, Transcriptome, Proteome]
1. Elements
2. Binary relations
3. Networks
[Diagram labels: Pathway, Assembly, Neighbour, Cluster, Hierarchical Tree, Genome]
16
Comparing networks
Gain new biological information by comparison of networks
What is the metric? How is it done? Is it simply a problem of graph isomorphism?
–Pathway vs. Pathway
–Pathway vs. Genome
–Genome vs. Genome
–Cluster vs. Pathway
17
Biological graph comparison
Search heuristically for clusters of correspondence
Correspondences: A - a, B - b, C - c, D - d, ...
[Figure: Graph 1 (nodes A to K) and Graph 2 (nodes a to k) linked by correspondences, which a clustering algorithm groups into clusters]
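One simple way to read "clusters of correspondence" is sketched below. It is an assumed simplification, not the algorithm from the slide: two correspondences are linked only when their nodes are directly adjacent in both graphs, and a cluster is a connected group of linked correspondences. The graphs in the usage note are hypothetical.

```python
def correspondence_clusters(graph1, graph2, correspondences):
    """Group correspondences (u, v) into clusters.

    graph1 and graph2 map a node to the nodes it is connected to.
    Edges are treated as undirected.
    """
    def adjacent(graph, a, b):
        return b in graph.get(a, ()) or a in graph.get(b, ())

    unvisited = list(correspondences)
    clusters = []
    while unvisited:
        stack = [unvisited.pop(0)]
        cluster = []
        while stack:
            u, v = stack.pop()
            cluster.append((u, v))
            # Correspondences whose nodes neighbour (u, v) in both graphs.
            linked = [(x, y) for (x, y) in unvisited
                      if adjacent(graph1, u, x) and adjacent(graph2, v, y)]
            for pair in linked:
                unvisited.remove(pair)
                stack.append(pair)
        clusters.append(sorted(cluster))
    return clusters
```

Given `graph1 = {"A": ["B"], "B": ["C"], "D": ["E"]}`, `graph2 = {"a": ["b"], "b": ["c"], "d": ["x"]}` and the correspondences A-a, B-b, C-c, D-d, E-e, the chain A-B-C is conserved as a-b-c and forms one cluster, while D-d and E-e stay separate because D and E are neighbours only in the first graph. A practical heuristic would probably also tolerate gaps, that is, nodes a few steps apart rather than strictly adjacent.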
18
Example: genomic, metabolic, structural
Genome-pathway comparison reveals the correlation between (a) the physical coupling of genes in the genome (operon structure) and (b) the functional coupling of gene products in the pathway.
E. coli genome: hisL hisG hisD hisC hisB hisH hisA hisF hisI (neighbouring labels: yefM, yzzB)
19
Example: genomic, metabolic, structural
[Figure: pathway map HISTIDINE METABOLISM, with links to the Pentose phosphate cycle and Purine metabolism.
Enzymes are labelled by EC number, including 2.4.2.17, 3.6.1.31, 3.5.4.19, 5.3.1.16, 2.4.2.-, 4.2.1.19, 2.6.1.9, 3.1.3.15 and 1.1.1.23.
Compounds include PRPP, Phosphoribosyl-ATP, Phosphoribosyl-AMP, Phosphoribosyl-Formimino-AICAR-P, Phosphoribulosyl-Formimino-AICAR-P, 5P-Ribosyl-5-amino-4-imidazole carboxamide (AICAR), Imidazole-glycerol-3P, Imidazole-acetol-P, L-Histidinol-P, L-Histidinal, L-Histidine, 1-Methyl-L-histidine, Histamine, Carnosine, Anserine, Imidazole acetaldehyde, Imidazole-4-acetate, Imidazolone acetate and N-Formyl-L-aspartate.]
20
Example: genomic, metabolic, structural
[Figure: pathway map ……..NE, TYROSINE AND TRYPTOPHAN BIOSYNTHESIS, with links to Tyrosine metabolism, Tryptophan metabolism, Alkaloid biosynthesis I, Folate biosynthesis and Ubiquinone biosynthesis.
Enzymes are labelled by EC number, including 4.6.1.3, 4.2.1.10, 1.1.1.25, 2.7.1.71, 2.5.1.19, 4.6.1.4, 5.4.99.5, 4.1.3.27, 2.4.2.18, 5.3.1.24, 4.1.1.48 and 4.2.1.20.
Compounds include 3-deoxy-D-arabino-heptonate, 3-Dehydroquinate, 3-Dehydroshikimate, Shikimate, Protocatechuate, Chorismate, 4-Aminobenzoate, Prephenate, Pretyrosine, Phenylpyruvate, 4-Hydroxyphenylpyruvate, Phenylalanine, Tyrosine, Tyr-tRNA, Anthranilate, (3-Indolyl)-glycerol phosphate, Indole and L-Tryptophan.]
SCOP hierarchical tree
1. All alpha
2. All beta
3. Alpha and beta (a/b)
  3.1 beta/alpha (TIM)-barrel
  3.2 Cellulases
  .......
  3.74 Thiolase
  3.75 Cytidine deaminase
4. Alpha and beta (a+b)
5. Multi-domain (alpha and beta)
6. Membrane and cell surface pro
7. Small proteins
8. Peptides
9. Designed proteins
10. Non-protein
21
More challenges?
The list of genes that are activated, inactivated or unaffected when two samples are compared becomes more informative if the genes can be placed on maps from which functions can be deduced.
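A minimal sketch of that mapping step follows. It assumes a lookup table from pathway names to member genes is already available; the table in the usage note is illustrative only, and `geneX` and `geneY` are hypothetical placeholders.

```python
def map_to_pathways(changed_genes, pathway_members):
    """For each pathway, list the changed genes it contains.

    Pathways without any changed gene are left out of the result.
    """
    changed = set(changed_genes)
    hits = {}
    for pathway, members in pathway_members.items():
        found = sorted(changed & set(members))
        if found:
            hits[pathway] = found
    return hits
```

For instance, with `pathway_members = {"histidine metabolism": ["hisG", "hisD", "hisC"], "other map": ["geneX"]}` and a list of changed genes `["hisG", "hisC", "geneY"]`, the result is `{"histidine metabolism": ["hisC", "hisG"]}`: two of the changed genes fall on the same map, and `geneY` cannot be placed on any map in the table.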
22
More challenges?