
1 BioSci 145B lecture 7 page 1 © copyright Bruce Blumberg 2004. All rights reserved
BioSci 145B Lecture #7 5/18/2004
Bruce Blumberg
  –2113E McGaugh Hall - office hours Wed 12-1 PM (or by appointment)
  –phone 824-8573
  –blumberg@uci.edu
TA – Curtis Daly cdaly@uci.edu
  –2113 McGaugh Hall, 924-6873, 3116
  –Office hours Tuesday 11-12
Lectures will be posted on web pages after lecture
  –http://eee.uci.edu/04s/05705/ - link only here
  –http://blumberg-serv.bio.uci.edu/bio145b-sp2004
  –http://blumberg.bio.uci.edu/bio145b-sp2004

2 Identification of gene function
You have identified a gene – what is its function?
  –Always look for similarity to known sequences
    –Book suggests Swiss-Prot
    –GenBank translated database is best
    –BLAST is the tool to use
    –Amino acid searches are more sensitive than nucleotide searches
      –Because identical amino acid sequences might be only 67% identical at the nucleotide level
  –What might you find?
    –Match may predict biochemical and physiological function
      –e.g., a known enzyme from another organism
    –Match may predict biochemical function only
      –e.g., a kinase
    –Match a gene from another organism with no known function
      –May match ESTs or ORFs from other organisms
    –Match a known gene with partly characterized function
      –Search leads to clarification of function – NifS in book
    –Might not match anything at all
      –Expect this will happen less and less
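The 67% figure follows from codon degeneracy: if every codon differs only at its third position, two genes can encode the same protein while sharing two of every three nucleotides. A minimal Python sketch of that arithmetic is below. The two DNA sequences are invented for illustration and are not taken from the lecture.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA string into a one-letter protein string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical sequences: same six codon families, different third bases.
SEQ_A = "CTTTCTCGTGGTGCTCCT"
SEQ_B = "CTCTCCCGCGGCGCCCCC"

if __name__ == "__main__":
    print(translate(SEQ_A), translate(SEQ_B))
    print("nucleotide identity:", round(identity(SEQ_A, SEQ_B), 3))
    print("protein identity:", identity(translate(SEQ_A), translate(SEQ_B)))
```

A nucleotide search sees these two sequences as only two-thirds identical, while a protein search sees a perfect match, which is why translated searches pick up more distant relatives.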

3 Identification of gene function (contd)
You have identified a gene – what is its function? (contd)
  –Does the sequence contain an obvious functional motif?
    –Homeobox or other consensus DNA binding domain?
    –Kinase domain? Serine protease, etc.
  –InterPro database allows one to compare a protein sequence with a whole family of structural databases
    –http://www.ebi.ac.uk/interpro/
    –HICAICGDRSSGKHYGVYSCEGCKGFFKRTVRKDLTYTCRDSKDCMIDKRQRNRCQYCRYQKCLAMGM
  –Other sorts of similarity searches
    –Identify protein secondary structure motifs
      –Alpha helix, beta sheets, hydrophobicity
      –Amphipathic helices
      –Overall polarity of sequences
    –Not used much

4 Identification of gene function (contd)
You have identified a gene – what is its function? (contd)
  –Gene ontology – highly structured vocabulary for gene classification
    –Genes are classified using this vocabulary
    –Relates protein function with cellular or organismal functions
      –Nucleic acid replication
      –Cell division

5 Genome annotation
Extremely important as number of sequences increases
  –Goals are to identify
    –All of the sequences
    –All of the features of each sequence
    –All of the functions of the identified genes
  –Often annotation does not agree with known function
    –Human error
    –New and updated information not propagated to database
    –Inaccurate sequencing
    –Sometimes annotation is correct but protein lacks function under certain conditions
  –Gold standard for functional analysis is loss-of-function analysis
    –Most accurate annotation
  –Common to have “annotation jamborees” where biologists and bioinformaticians come together to annotate new sequences
    –Xenopus tropicalis jamboree will be in Spring 2005

6 Comparative genomics
Study of similarities and differences in genome structure and organization
  –How many genes?
  –How many chromosomes?
  –Genome duplications
  –Gene loss
Driving forces behind comparative genomics
  –Understanding evolution in molecular terms
  –Sequence annotation and function identification
    –Sequences with important functions are conserved across evolution
  –How can functions be determined computationally?
    –Protein functions?
    –Regulatory element functions?

7 Comparative genomics (contd)
Orthology vs paralogy
  –Homology – descended from a common ancestor (e.g. Hox genes)
    –Often grossly misused in sequence comparisons to mean similarity
  –Orthologs are homologous genes in different organisms that encode proteins with the same function and which have evolved by direct vertical descent (e.g., mouse and human Hoxa-1)
    –Evolve by gradual accumulation of mutations
  –Paralogs are homologous genes that encode proteins with related but non-identical functions (e.g. mouse Hoxa-1 and human Hoxb-1)
    –Evolve by gene duplication followed by gradual accumulation of mutations

8 Comparative genomics (contd)
Functional equivalency does not require homology, sequence similarity or even similar 3D structure
  –Same chemical reaction can be catalyzed by totally unrelated enzymes
  –Non-orthologous gene displacement – when non-orthologous genes encode the same essential cellular function
    –Better term would be analogous gene
    –Convergent evolution also sometimes used

9 Comparative genomics (contd)
Genes with very different functions can be related
  –3-D structure may indicate that proteins are related (evolved from the same ancestral protein) but sequence identity is too low to detect
    –Expected when genes diverge from a distant common ancestor
    –< 20% amino acid sequence identity is too little to establish homology (although proteins may be homologous)
      –For example, the 3-D structures of
        »D-alanine ligase
        »Glutathione synthetase
        »ATP-binding domains of carbamoyl phosphate synthetase and succinyl-CoA synthetase
      –are all so similar that homology is not in doubt, but sequence comparisons do not detect homology

10 Comparative genomics (contd)
Protein evolution
  –Observation – many proteins are composed of discrete domains
  –Observation – many proteins have multiple domains shared with other proteins
  –Conclusion – domain shuffling must have occurred during evolution
  –Some correlation between exons and protein domains
    –Protein domains tend to be encoded in one or two exons
    –New combinations of protein domains can be created by recombination
      –LINEs
      –Between repetitive elements in introns
    –Exon shuffling – process of transferring exons (and hence functional domains) between proteins

11 Comparative genomics (contd)
Protein evolution (contd)
  –Haemostatic proteins as an exon shuffling paradigm
    –Family of proteases that are activated by proteolysis
    –Protein domains show strong correlation with exons

12 Comparative genomics (contd)
Protein evolution (contd)
  –Horizontal gene transfer – transfer of genes or protein domains across unrelated species
    –Frequently identifiable by different patterns of codon usage from other genes, particularly ribosomal proteins
    –Fairly rare in eukaryotes
    –Happens in prokaryotes all the time
      –e.g., transfer of antibiotic resistance among bacteria
      –Plasmid exchange, phage infections and transfer
      –Often associated with pathogenicity
        »Pathogenic variants of bacteria frequently have lots of inserted DNA
        »e.g., E. coli O157:H7 has 800 kb more than lab strains of E. coli, much of which is virulence factors, prophages and prophage-like elements

13 Comparative genomics (contd)
Comparison of two Salmonella strains with E. coli shows a variety of sequence differences
  –S. typhi infects only humans; S. typhimurium infects a wide range of mammals
  –S. typhimurium has many more inserted sequences than S. typhi, consistent with virulence argument

14 Comparative genomics (contd)
Is there a minimal genome?
  –Encoding the essential set of proteins required for life?
  –Compare genomes of archaebacteria, eubacteria and yeast
    –Issues with how genes are classified but a reasonably good approximation can be made
    –Can identify 322 clusters of orthologous groups required for all key biosynthetic pathways that might be required in free-living organisms
Some lessons from bacterial genomics (we will hear more on Thursday)
  –Nearly half of ORFs are of unknown function
  –About 25% of all ORFs are unique to a particular species!
    –Suggests that many new protein families remain to be discovered
    –Many new functions may be uncovered
  –Periodic re-evaluation of sequenced genomes is useful
    –Compare with newly acquired data
      –Often find additional ORFs and genes
  –Much conservation of gene position
    –Same genes found in many genomes at same positions (good for evolutionary studies)

15 Comparative genomics (contd)
Phylogenetic footprinting
  –Powerful method to identify regulatory elements in DNA sequences
  –Central assumption is that functional sequences evolve much more slowly than nonfunctional DNA sequences
    –Due to selective pressure on function
  –Sequences conserved in related organisms are likely to be functional
  –Species selection
    –Must be sufficiently diverged that functional domains stand out
    –Sufficiently conserved so that they can be identified
  –A variety of algorithms exist – typical approach is to use multiple programs and look for what is found in common

16 Comparative genomics (contd)
Phylogenetic footprinting comparison of zebrafish, mouse and Xenopus caudal orthologs (cdx4, Cdx4, Xcad3)
  –A number of putative conserved elements identified, including
    –TTCATTTGAATGCAAATGTA
    –Absolutely conserved in all 3 promoters
  –Compare with database
    –http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&PAGE=Formating&NCBI_GI=yes&SHOW_OVERVIEW=yes&AUTO_FORMAT=yes&SHOW_LINKOUT=yes
    –1084857936-20585-185161197847.BLASTQ3
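One very simple way to look for absolutely conserved elements such as the 20-mer above is to intersect the sets of k-mers found in each orthologous promoter. The Python sketch below illustrates that idea only; it is not one of the published footprinting algorithms, it ignores mismatches and reverse complements, and the flanking sequences in the toy promoters are invented placeholders rather than real cdx4, Cdx4 or Xcad3 sequence.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(seqs, k):
    """Return the k-mers present in every sequence (exact matches only)."""
    shared = kmers(seqs[0], k)
    for seq in seqs[1:]:
        shared &= kmers(seq, k)
    return shared


# Element reported on the slide as conserved in all three promoters.
MOTIF = "TTCATTTGAATGCAAATGTA"

# Hypothetical promoters: made-up flanks around the real motif.
TOY_PROMOTERS = [
    "ACGTACGGTC" + MOTIF + "GGATCCA",
    "TTGACCAGT" + MOTIF + "CCGTA",
    "GCGCGATA" + MOTIF + "TAGCTAGG",
]

if __name__ == "__main__":
    for hit in sorted(conserved_kmers(TOY_PROMOTERS, len(MOTIF))):
        print(hit)
```

Real promoters are rarely conserved base for base, so practical tools work from alignments and tolerate mismatches; the exact-match intersection is only the limiting case.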

17 Comparative genomics (contd)
What do we get from comparative genomics?
  –Powerful new tools to identify conserved sequences
    –Important regulatory elements
    –Unidentified genes
    –Features (promoters, splice sites, etc.)
  –Important information about genome evolution
    –Where did related genes originate?
    –When did genome duplications arise?
    –What is the history of life on earth?
    –What is the genetic diversity in wild populations?
      –Environmental shotgun sequencing
  –Information required to identify gene function
    –Protein sequence and structure comparisons

18 Functional Genomics - Analysis of gene function on a whole genome basis
Genome projects
  –DNA sequencing
  –Human genome, mouse, rat, Drosophila, C. elegans “finished”
  –Model organisms progressing rapidly
  –Many new genes but many lack known function
Functional genomics
  –Identification of gene functions
    –Associate functions with new genes coming from genome projects
    –Function of genes identified from characterizing diseases or mutants
  –Identification of genes by their function
    –Discovery of new genes

19 Functional Genomics - The challenge: Many new genes of unknown function
Where/when are they expressed?
  –Known genes (e.g. from genome projects)
    –Gene chips (Affymetrix)
    –Microarrays (oligo, cDNA, protein)
  –Novel genes
    –Differential display
    –Expression profiling
  –SAGE and related approaches
What do they interact with? (next week)
  –Biochemical methods
  –Yeast two, three hybrid screening
  –Phage display
  –Expression cloning
  –Proteomics
    –2 dimensional gel electrophoresis
    –Mass spectrometry
    –Protein microarrays

20 Methods of profiling gene expression
How to evaluate gene expression?
  –Prepare RNA sample and use one of the following methods
    –Northern blot – immobilize RNA on filter, probe
      –Quantitative
    –Reverse northern blot – immobilize cDNA on filter, probe with labeled RNA
      –Not quantitative
    –Nuclease protection
      –Quantitative
    –RT-PCR
      –Can be quantitative
    –In situ hybridization
      –Not quantitative
  –Or prepare protein samples and use
    –Western blot - detect protein of interest with specific antibody
    –ELISA – enzyme linked immunosorbent assay - quantitative
    –RIA – radioimmunoassay - quantitative

21 Analysis of mRNA - size and splicing
Quantitation of mRNA levels
  –Possible methods
    –Northern analysis
    –Nuclease protection
    –RT-PCR
  –Measure steady state mRNA levels (production/degradation)
mRNA size determination
  –Northern blot is the only way
  –Good RNA size markers = accurate sizing
  –Which to use, poly A+ or total RNA?
    –A+ is much more sensitive (50-100x)
      –What about mRNAs with no or short tails?
    –Total RNA is much simpler
      –Gel limitations – 20 μg/lane is practical limit
  –What is a key factor in sizing mRNAs?

22 Analysis of mRNA - quantitation (contd)
Nuclease protection assays
  –Approach
    –Hybridize a single-stranded probe (DNA or RNA) to RNA sample
      –Probe must be larger than protected region
    –Digest remaining single stranded regions
    –Electrophorese on denaturing PAGE
  –Advantages
    –Less sensitive to slightly degraded mRNA
    –Absolutely quantitative
    –Can tolerate large amounts of RNA (100+ μg)
      –Allows detection of rare transcripts
      –But gives high background
    –Multiple simultaneous detection
  –Disadvantages
    –More tedious than Northern
    –No blot to reuse
    –Multiple simultaneous detection is very difficult to optimize

23 Analysis of mRNA - quantitation (contd)
Nuclease protection assays (contd)
  –RNase protection is now the dominant flavor of the assay
    –Widespread availability of bacteriophage RNA polymerases (T3, T7, SP6) and cloning vectors
      –Easy to make high specific activity probes
      –Or huge quantities of RNA in vitro
  –Relative quantitation
    –Up to 12 probes can be detected simultaneously
      –Hard to optimize
    –Once optimized, this is the best way to quantitate multiple mRNAs
    –Requires careful probe design and construction
    –One probe must be an “invariant” reference RNA

24 Analysis of mRNA - quantitation (contd)
RT-PCR - reverse transcriptase mediated PCR
  –Approach
    –Reverse transcribe mRNA -> cDNA
    –Amplify with specific primers
    –Quantitate
  –Flavors
    –Relative quantitation – compare to invariant gene
    –Absolute quantitation
      –By comparison to synthetic reference
      –Competitive PCR
      –Various fluorescent dye mediated methods
  –Advantages
    –Very fast and simple
    –Works with tiny amounts of material
  –Limitations
    –Efficiency of RT reaction is not identical for all mRNAs
    –Easy to fall outside of linear amplification range
    –Errors increase exponentially with amplification
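The last limitation can be made concrete with the usual idealized model of PCR, in which the copy number after n cycles is N0 × (1 + E)^n for a per-cycle efficiency E between 0 and 1. The Python sketch below assumes that model, which holds only in the exponential phase, and uses made-up efficiencies to show how a small difference compounds.

```python
def amplified_copies(n0, efficiency, cycles):
    """Copies after `cycles` rounds, assuming constant per-cycle efficiency.

    Idealized exponential-phase model: N = N0 * (1 + E) ** cycles.
    """
    return n0 * (1.0 + efficiency) ** cycles


def fold_error(eff_a, eff_b, cycles):
    """Ratio of product from two reactions that start with equal template
    but amplify with different efficiencies."""
    return ((1.0 + eff_a) / (1.0 + eff_b)) ** cycles


if __name__ == "__main__":
    # Hypothetical efficiencies: perfect doubling vs. 90% efficiency.
    for cycles in (10, 20, 30):
        print(cycles, round(fold_error(1.0, 0.9, cycles), 2))
```

Under these assumptions, two primer sets that differ by a few percent in efficiency give product ratios that drift further from the true template ratio with every added cycle.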

25 Analysis of mRNA - quantitation (contd)
RT-PCR - reverse transcriptase mediated PCR
  –Relative concentration determination
    –Perform multiplex reaction using two primer sets
      –1 for reference, 1 experimental
    –Advantages
      –No fancy equipment required
    –Disadvantages
      –Careful attention to linear region for both primer sets
      –Often must add one set during reaction
        »Companies claim to have products that eliminate this need
        »More than 2 primer sets are not reliable

26 Analysis of mRNA - quantitation (contd)
RT-PCR (contd)
  –Absolute concentration determination
    –Real time PCR: Taqman, molecular beacons
      –Fluorescent methods that allow direct quantitation of PCR product
    –Approach
      –Special oligonucleotide that has a fluor and a quenching group on it
        »When whole, no fluorescence
      –Perform PCR reaction; if the oligonucleotide anneals, Taq polymerase removes the reporter group, which can now fluoresce

27 Analysis of mRNA - quantitation (contd)
RT-PCR (contd)
  –Absolute concentration determination - Taqman, etc.
    –Fluorescence detected continuously in real time
    –Advantages
      –Can be detected in real time with proper instrument
      –No difficulties with linearity
      –Multiplexing of probes possible (limited by available dyes)
      –Very good for clinical diagnostics
    –Disadvantages
      –Requires instrument
        »Varies from expensive to extremely expensive
        »Not of equal quality
      –Need to make custom oligos - can be expensive
      –Must know something about relative abundance of mRNAs before setting up reactions
      –Careful optimization required for best results
        »Primer concentrations
        »Target concentrations

28 Analysis of mRNA - quantitation (contd)
RT-PCR (contd)
  –Absolute concentration determination – Sybr Green
    –Alternative real time RT-PCR utilizes a single dye
    –Approach
      –Extend a single template
      –Detect ds DNA with a specific dye

29 Analysis of mRNA - quantitation (contd)
RT-PCR (contd)
  –Absolute concentration determination – Sybr Green
    –Plot lift off time
    –Generate standard curve
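A standard curve of this kind is usually a straight-line fit of the lift-off (threshold) cycle against log10 of the starting template quantity, which can then be inverted to estimate an unknown. The Python sketch below assumes that linear form; the dilution series and cycle values are invented for illustration and do not come from the lecture.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_curve(quantities, lift_off_cycles):
    """Fit lift-off cycle = slope * log10(quantity) + intercept."""
    return fit_line([math.log10(q) for q in quantities], lift_off_cycles)


def quantity_from_cycle(cycle, slope, intercept):
    """Invert the standard curve to estimate starting quantity."""
    return 10 ** ((cycle - intercept) / slope)


def efficiency(slope):
    """Per-cycle amplification efficiency implied by the slope
    (1.0 corresponds to perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series (copies) and lift-off cycles.
STANDARDS = [1e2, 1e3, 1e4, 1e5, 1e6]
CYCLES = [33.36, 30.04, 26.72, 23.40, 20.08]

if __name__ == "__main__":
    slope, intercept = standard_curve(STANDARDS, CYCLES)
    print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
    print("unknown at cycle 25:", round(quantity_from_cycle(25.0, slope, intercept)))
```

A slope near -3.3 cycles per ten-fold dilution corresponds to roughly one doubling per cycle, so the slope doubles as a quick check on reaction quality.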

30 Analysis of mRNA - quantitation (contd)
RT-PCR Sybr Green (contd)
  –Advantages
    –No special primers needed
    –Single dye, simple
    –Fast, robust and quantitative
    –Good for routine use
  –Disadvantages
    –Need instrument
    –Single dye, can’t multiplex
    –Problems with multiple fragments
      »Melting curves required
    –Absolute quantitation requires standard curve

31 Methods of profiling gene expression – large scale
What are the possibilities?
  –Array – micro or macro
  –Sequence sampling
  –SAGE – serial analysis of gene expression
  –Massively parallel signature sequencing
DNA microarray analysis is now the totally dominant method
  –Two basic flavors
    –Spotted (spot DNA onto support)
      –cDNA microarrays
      –Oligonucleotide arrays
      –Moderately expensive
    –Synthesized (use photolithography to synthesize oligos onto silicon or other suitable support)
      –Affymetrix Gene Chips dominate
      –VERY expensive
  –Both are in wide use and suitable for whole genome analysis

32 Affymetrix GeneChips
High density arrays are synthesized directly on support
  –4 masks required per cycle (one per base) -> 4 x 25 = 100 masks per chip (25-mers)
  –Pentium IV requires about 30 masks
  –G.P. Li in Engineering directs a UCI facility that can make just about anything using photolithography

33 Affymetrix GeneChips
Streptavidin/phycoerythrin

34 Affymetrix GeneChips
  –Each gene is represented by a series of oligonucleotide pairs
    –One perfect match
    –One with a single mismatch
  –Only hybridization to perfect match but not mismatch is considered to be real
  –Gene is considered “detected” if > ½ of oligo pairs are positive
  –Number of pairs depends on organism and how well characterized array behavior is
    –Human uses 8 pairs
    –Xenopus uses 16 pairs
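The detection rule on this slide can be written down directly. The Python sketch below implements only the rule as stated (a pair is positive when the perfect-match signal exceeds the mismatch signal, and a gene is detected when more than half of its pairs are positive). The `min_diff` margin and the intensity values are assumptions added for illustration; the actual Affymetrix software uses more elaborate statistics.

```python
def pair_is_positive(perfect_match, mismatch, min_diff=0.0):
    """A probe pair counts as positive when the perfect-match signal exceeds
    the mismatch signal by more than min_diff (a hypothetical margin)."""
    return (perfect_match - mismatch) > min_diff


def gene_detected(pairs, min_diff=0.0):
    """Apply the slide's rule: detected if more than half of pairs are positive."""
    positives = sum(1 for pm, mm in pairs if pair_is_positive(pm, mm, min_diff))
    return positives > len(pairs) / 2


# Hypothetical intensities for an 8-pair probe set: (perfect match, mismatch).
EXAMPLE_PAIRS = [
    (900, 200), (750, 300), (820, 790), (400, 450),
    (1000, 150), (650, 640), (300, 500), (880, 210),
]

if __name__ == "__main__":
    print(gene_detected(EXAMPLE_PAIRS))               # any excess counts
    print(gene_detected(EXAMPLE_PAIRS, min_diff=100)) # require a clear margin
```

Raising `min_diff` shows how sensitive the call is to what counts as a positive pair, which is one reason the number of pairs per gene matters.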

35 Affymetrix GeneChips
Result is in single color
  –Need two chips – control and experimental for each condition
Advantages
  –Commercially available
  –Standardized
Disadvantages
  –About $1000 to buy, probe and process each chip!
  –May not be available for your organism of interest
  –No ability to compare probes directly on the same chip
    –Must rely on technology

36 Spotted arrays
Source material is prepared
  –cDNAs are PCR amplified OR
  –Oligonucleotides synthesized
Spotted onto treated glass slides
RNA prepared from 2 sources
  –Test and control
Labeled probes prepared from RNAs
  –Incorporate label directly
  –Or incorporate modified NTP and label later
  –Or chemically label mRNA directly
Hybridize, wash, scan slide
Express as ratio of one channel to other after processing
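The final step, expressing each spot as a ratio of the two channels, is commonly reported on a log2 scale so that two-fold up and two-fold down are symmetric around zero. The Python sketch below is a minimal version under stated assumptions: a single background value is subtracted from both channels and a floor avoids taking the log of zero or negative signal. Both parameters are illustrative choices, and real pipelines also normalize between channels.

```python
import math


def log2_ratio(test, control, background=0.0, floor=1.0):
    """log2(test / control) for one spot after background subtraction.

    `background` and `floor` are hypothetical processing parameters:
    signals are background-subtracted, then clipped to at least `floor`.
    """
    test_signal = max(test - background, floor)
    control_signal = max(control - background, floor)
    return math.log2(test_signal / control_signal)


if __name__ == "__main__":
    # Made-up spot intensities: (test channel, control channel).
    spots = {"geneA": (2000, 1000), "geneB": (500, 1000), "geneC": (800, 800)}
    for name, (test, control) in spots.items():
        print(name, log2_ratio(test, control))
```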

37 Strategy to identify RAR target genes
Treatments: Agonist - TTNPB; Antagonist - AGN193109
Harvest st 18
Poly A+ RNA
Amino-allyl labeled 1st strand cDNA
Alexa Fluor 555 (cy3) / Alexa Fluor 647 (cy5)
Probe microarrays
Upregulated / downregulated

38 DNA microarray
Statistical analysis of output – VERY IMPORTANT!
Replicates are very important
Preprocessing of data is needed
  –To remove spurious signals

39 DNA microarray
Advantages
  –Custom arrays possible and affordable
  –Ratio of fluorescence is robust and reproducible
Disadvantages
  –Availability of chips
  –Expense of production on your own
  –Technical details in preparation
If you want to see a microarrayer, drop by my lab and I will show you several types

