Biological background: Gene Expression and Molecular Laboratory Techniques Class web site: Statistics for Microarrays
Basic principles in physics, chemistry and biology Principles Known? Physics Matter Chemistry Compound Biology Organism Elementary Particles Yes Elements Yes Genes No
Central Paradigm
(RT)
Protein Synthesis
Transcription Transcription is a complex process involving several steps and many proteins (enzymes) RNA polymerase synthesizes a single strand of RNA against the DNA template strand (anti-sense strand), adding nucleotides to the 3’ end of the RNA chain Initiation is regulated by transcription factors, including promoters, usually an initiator element and TATA box, usually lying just upstream (at the 5’ end) of the coding region 3’ end cleaved at AAUAAA, poly-A tail added
Exons and Introns Most of the genome consists of non-coding regions Some non-coding regions (centromeres and telomeres) may have specific chomosomal functions Other non-coding regions have regulatory purposes Non-coding, non-functional DNA often called junk DNA, but may have some effect on biological functions The terms exon and intron refer to coding and non-coding DNA, respectively
Intron Splicing
Transcription Overview
Transcription Illustration
Translation The AUG start codon is recognized by methionyl-tRNA i Met Once the start codon has been identified, the ribosome incorporates amino acids into a polypeptide chain RNA is decoded by tRNA (transfer RNA) molecules, which each transport specific amino acids to the growing chain Translation ends when a stop codon (UAA, UAG, UGA) is reached
Translation Illustrated
From Primary Transcript to Protein
Alternative Splicing (of Exons) How is it possible that there are over 1,000,000 human antibodies when there are only about 30,000 genes? Alternative splicing refers to the different ways the exons of a gene may be combined, producing different forms of proteins within the same gene-coding region Alternative pre-mRNA splicing is an important mechanism for regulating gene expression in higher eukaryotes
Molecular Laboratory Techniques Hybridizing DNA Copying DNA Cutting DNA Probing DNA
Hybridization Hybridization exploits a potent feature of the DNA duplex – the sequence complementarity of the two strands Remarkably, DNA can reassemble with perfect fidelity from separated strands Strands can be separated (denatured) by heating
Polymerase Chain Reaction (PCR) PCR is used to amplify (copy) specific DNA sequences in a complex mixture when the ends of the sequence are known Source DNA is denatured into single strands Two synthetic oligonucleotides complementary to the 3’ ends of the segment of interest are added in great excess to the denatured DNA, then the temperature is lowered The genomic DNA remains denatured, because the complementary strands are at too low a concentration to encounter each other during the period of incubation, but the specific oligonucleotides hybridize with their complementary sequences in the genomic DNA
PCR, ctd The hybridized oligos then serve as primers for DNA synthesis, which begins upon addition of a supply of nucleotides and a temperature resistant polymerase such as Taq polymerase, from Thermus aquaticus (a bacterium that lives in hot springs) Taq polymerase extends the primers at temperatures up to 72˚C When synthesis is complete, the whole mixture is heated further (to 95˚C) to melt the newly formed duplexes Repeated cycles (25—30) of synthesis (cooling) and melting (heating) quickly provide many DNA copies
(BREAK)
Types of Viruses Reverse transcriptase makes a complementary DNA copy from RNA. A virus is a nucleic acid in a protein coat.
Reverse transcription Clone cDNA strands, complementary to the mRNA G U A A U C C U C Reverse transcriptase mRNA cDNA C A T T A G G A G T T A G G A G C A T T A G G A G
RT-PCR
Restriction Enzymes Cut DNA
Restriction Enzymes When a bacterium is invaded by a DNA- containing organism (e.g. virus), it can defend itself with restriction enzymes (REs; also called restriction endonucleases) REs recognize a specific short sequence of DNA and cut both strands The recognition sequence is typically a palindrome – i.e. the sequence in one strand is the same as in the other, read in the other direction (e.g. GAATTC) REs named after the bacteria in which they occur, plus sequence number (e.g. Eco RI)
RE Example (Eco RI) (cut) 5’ – GAATTC – 3’ 3’ – CTTAAG – 5’ (cut)
Probing DNA One way to study a specific DNA fragment within a genome is to probe for the sequence of the fragment A probe is a labeled (usually radioactive or fluorescent) single-stranded oligonucleotide, synthesized to be complementary to the sequence of interest – probe sequence is known Attach single-stranded DNA to a membrane (or other solid support) and incubate with the probe so that it hybridizes Visualize the probe (e.g. by X-ray for radioactive probes)
The Southern blotting technique
Sample Autoradiogragh (Gel)
Types of Blots Southern Blot – use DNA to probe DNA Northern Blot – use DNA to probe RNA Western Blot – use antibodies to probe Protein
Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein would be more direct, but is currently harder. Measuring Gene Expression
Microarrays provide a means to measure gene expression
Areas Being Studied with Microarrays Differential gene expression between two (or more) sample types Similar gene expression across treatments Tumor sub-class identification using gene expression profiles Classification of malignancies into known classes Identification of “marker” genes that characterize different tumor classes Identification of genes associated with clinical outcomes (e.g. survival)
cDNA microarray experiments mRNA levels compared in many different contexts Different tissues, same organism (brain v. liver) Same tissue, same organism (ttt v. ctl, tumor v. non-tumor) Same tissue, different organisms (wt v. ko, tg, or mutant) Time course experiments (effect of ttt, development) Other special designs (e.g. to detect spatial patterns).
Web animation of a cDNA microarray experiment chip.html
Yeast genome on a chip
Brief outline of steps for producing a microarray cDNA probes attached or synthesized to solid support Hybridize targets Scan array
cDNA microarrays cDNA clones
cDNA microarrays Compare the genetic expression in two samples of cells PRINT cDNA from one gene on each spot SAMPLES cDNA labelled red/green e.g. treatment / control normal / tumor tissue
HYBRIDIZE Add equal amounts of labelled cDNA samples to microarray. SCAN Laser Detector
Quantification of expression For each spot on the slide we calculate Red intensity = Rfg - Rbg (fg = foreground, bg = background) and Green intensity = Gfg - Gbg and combine them in the log (base 2) ratio Log 2 ( Red intensity / Green intensity)
Gene Expression Data On p genes for n slides: p is O(10,000), n is O(10-100), but growing, Genes Slides Gene expression level of gene 5 in slide 4 = Log 2 ( Red intensity / Green intensity) slide 1slide 2slide 3slide 4slide 5 … These values are conventionally displayed on a red (>0) yellow (0) green (<0) scale.
Biological question Differentially expressed genes Sample class prediction etc. Testing Biological verification and interpretation Microarray experiment Estimation Experimental design Image analysis Normalization Clustering Discrimination R, G 16-bit TIFF files (Rfg, Rbg), (Gfg, Gbg)