Hutchinson-Gilford Progeria
• premature aging, lifespan = 13.4 years
• retarded growth
• midface hypoplasia
• micrognathia
• alopecia
• low adiposity
• osteodysplasia
• premature, severe atherosclerosis - death due to MI
De Sandre-Giovannoli, Science Express, 17 April 2003
Lamin A mutations in HGS
• Exons 11 and 12 code the Lamin A tail (not Lamin C)
• Red: coiled-coil domains; blue: globular domains
• 1824C>T is silent at the amino acid level (G608G) but - in 300 con.
• 1824C>T creates a cryptic splice donor site at 1819, giving a 50 aa deletion
Best guess Most diseases are probably interactions between polygenic heritable events, and environmental pressures leading to somatic epigenetic changes. Translation: diseases are complicated.
Gene by Environment Interaction
Predisposition (DNA): FAP, MSH, BRCA, LDLr
Event: hydrocarbons, radiation, estrogens, low fiber
Disease: colon CA, breast CA, atherosclerosis
Microarrays-the big net.
• Ideal disease-hunter: genomic scale protein quantitation and sequencing.
• Imperfect solution A: genomic scale detection of mRNA level. Problem: little information on protein level.
• Imperfect solution B: genome-wide SNP/haplotype. Problem: statistical limits on patient populations.
• Common compromise: microarray profiling of mRNA transcripts (transcript profiling) to identify target areas. Target genes are then followed up by proteomics and SNPs.
Array flavors
DNA detection (SNP, genotyping, etc.)
• short oligonucleotides to detect mismatches
RNA detection (transcript profiling)
• Plasmid
• Inserts
• Long oligonucleotides (60 mers)
• Short oligonucleotides (20 mers)
Hybridization-basic elements
Hybridization = Annealing - Melting
CRUCIAL: non-covalent, hydrogen bonds --> equilibrium rules, binding is statistical
Best hybridization occurs with:
• long sequences (no hyb when nt<4)
• high salt concentration (hybrids melt in water)
• low temperatures (hybrids melt with heat)
• G and C (3 H-bonds), which bind better than A and T (2 H-bonds)
• low self-complementarity (high GC is bad)
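The G/C versus A/T point can be made concrete by counting Watson-Crick hydrogen bonds in a perfectly matched duplex. The sketch below is only an illustration: it assumes a DNA sequence written 5' to 3' using A, C, G and T only, and the function names are made up for this example rather than taken from any library.

```python
# Hydrogen bonds per Watson-Crick base pair: A-T has 2, G-C has 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the strand that would anneal to seq, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def duplex_h_bonds(seq):
    """Total hydrogen bonds when seq is paired with its perfect complement."""
    return sum(H_BONDS[base] for base in seq.upper())


def is_self_complementary(seq):
    """True if seq can pair with another copy of itself (a palindromic site)."""
    return seq.upper() == reverse_complement(seq)


for probe in ("ATATAT", "GCGCGC", "CTACG"):
    print(probe, reverse_complement(probe), duplex_h_bonds(probe),
          is_self_complementary(probe))
```

For the same length, the all-GC hexamer carries 18 hydrogen bonds against 12 for the all-AT hexamer, but both are self-complementary, which is the trade-off the last bullet warns about.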
Base-pairing (the stuff of life)
[Figure: base pairing. Lewin, Genes VII, page 8.]
Tm-a good thing.
Tm is a measure of the stability of DS-DNA under a given set of conditions. Stability, and therefore Tm, is affected by:
• Strand length - the longer the strand, the higher the Tm.
• Base composition - the higher the GC content, the higher the Tm.
• Ionic strength - as the ionic strength increases, so does Tm. Double helical DNA is stabilised by cations. Divalent cations (e.g. Mg2+) are more effective than monovalent cations (Na+ or K+).
• Organic solvents - formamide, for instance, lowers the Tm by weakening the hydrophobic interactions.
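For short oligos, a common rule of thumb (the Wallace rule: about 2 °C per A/T and 4 °C per G/C) captures the length and base-composition effects listed above. The sketch below is a rough estimate only: it ignores salt and formamide, is usually quoted for oligos of roughly 14-20 nt, and is not presented here as the method behind these slides.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Wallace rule estimate of Tm in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two hypothetical 20-mers at the extremes of base composition.
for oligo in ("ATATATATATATATATATAT", "GCGCGCGCGCGCGCGCGCGC"):
    print(oligo, f"GC={gc_fraction(oligo):.0%}", f"Tm~{wallace_tm(oligo)} C")
```

Lengthening either oligo, or swapping A/T for G/C, raises the estimate, in line with the first two bullets.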
Melting Curves-Tm measured
PCR Primer design www.oligo.net
Array Choice Factors
Expression profiling:
Sequence known?      Not known?
Oligo arrays         cDNA arrays
High confidence      Clone drift/cross hyb
Immediate ID         Sequence clones
Sample selection
• isolate the purest phenotypic examples of test and control
• laser capture microdissection (LCM)
• always control for treatment and manipulation
• people are the most meaningful, but least controllable
• animals are highly controllable, but less meaningful
• cell systems (in vitro) are controlled, but meaningful?
• small amounts of RNA can be amplified
• while purifying cells is good, the processing is bad.
The quality of the results is directly proportional to the quality of the samples that are chosen.
Laser Capture Microdissection
The importance of purity Human colon cancer Blue are normal cells Red are tumor cells
Assessing sample quality
Amount: > 5 ug total RNA or 500 ng of poly A+
Basic: O.D. 260/280 ratio >2.1
• nucleic acids absorb at 260 nm, protein at 280 nm
• thus, increasing impurity reduces the ratio
Better: agarose gel electrophoresis, EtBr stained
• if total RNA, 28S = 2 x 18S ribosomal (Lab-on-chip)
• or Q-PCR of a low and high gene, against standard
Best: test chip
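The "Amount" and "Basic" checks above can be sketched as a small calculation. It assumes the standard conversion of 1 A260 unit = 40 ug/mL for RNA; the function name, the example readings, and the default ratio cutoff of 2.0 (the value reported later for the atherosclerosis samples) are illustrative assumptions, and a stricter cutoff can be passed in.

```python
def rna_qc(a260, a280, volume_ul, dilution=1.0, min_ratio=2.0, min_total_ug=5.0):
    """Basic spectrophotometric QC for a total RNA prep.

    a260, a280 : absorbance readings of the (diluted) sample
    volume_ul  : volume of the undiluted stock in microlitres
    dilution   : dilution factor used for the reading
    min_ratio  : assumed purity cutoff for A260/A280
    """
    ratio = a260 / a280
    conc_ug_per_ml = a260 * 40.0 * dilution        # 1 A260 unit ~ 40 ug/mL RNA
    total_ug = conc_ug_per_ml * volume_ul / 1000.0
    return {
        "ratio": ratio,
        "total_ug": total_ug,
        "ratio_ok": ratio >= min_ratio,
        "amount_ok": total_ug > min_total_ug,       # slide asks for > 5 ug total RNA
    }


# Hypothetical reading: 1:10 dilution of a 50 uL prep.
print(rna_qc(a260=0.5, a280=0.25, volume_ul=50, dilution=10))
```

This only covers the "Basic" tier; gel, Q-PCR and test-chip checks still need the wet-lab steps listed above.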
GeneChip® Probe Arrays
[Figure: GeneChip probe array, 1.28 cm; a hybridized probe cell (11 µm) holds millions of copies of a specific oligonucleotide probe, bound to single stranded, labeled RNA target; image of a hybridized probe array with >1 million probes]
The core of the platform is our unique arrays
• Oligonucleotides synthesized de novo (photolithography & combinatorial chemistry)
• Currently 65,000 (50 micron features) to 250,000 (24 micron) different oligos on commercially available products
• Each oligo represented in 10^7 to 10^8 full-length copies
Synthesis of Ordered Oligonucleotide Arrays
[Figure: light-directed synthesis cycle: light passes through a mask onto the substrate (deprotection), a nucleotide (T, then C) couples at the deprotected positions, and the cycle repeats]
• Light removes protecting groups at defined positions.
• A single nucleotide is washed over the chip and binds where the protecting group was removed.
• Through successive steps, any sequence can be built up in any position on the chip.
• The number of steps corresponds with the length of the oligo, so the # of genes can increase without increasing the # of steps.
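The last point, that step count tracks oligo length rather than probe count, can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the vendor's process: every probe has the same length, each layer uses at most one mask per base (A, C, G, T), and chemistry details such as synthesis direction and coupling efficiency are ignored. The function name is hypothetical.

```python
def synthesize(targets, bases="ACGT"):
    """Build all target oligos in parallel, one masked coupling step at a time.

    Returns the sequences built at each chip position and the number of
    mask/coupling steps used.
    """
    length = len(targets[0])
    if any(len(t) != length for t in targets):
        raise ValueError("this sketch assumes equal-length oligos")

    built = ["" for _ in targets]
    steps = 0
    for pos in range(length):                 # one layer per oligo position
        for base in bases:                    # at most four masks per layer
            # The mask exposes only positions whose next base is `base`.
            mask = [i for i, t in enumerate(targets) if t[pos] == base]
            if not mask:
                continue                      # no position needs this base
            for i in mask:
                built[i] += base              # nucleotide couples where deprotected
            steps += 1
    return built, steps


few = ["ACGT", "TTGA", "CATG"]
many = [a + b + c + d for a in "ACGT" for b in "ACGT" for c in "ACGT" for d in "ACGT"]
for probes in (few, many):
    built, steps = synthesize(probes)
    print(len(probes), "probes built in", steps, "steps; correct:", built == probes)
```

Going from 3 probes to all 256 possible 4-mers never needs more than 4 x 4 = 16 steps, which is the scaling argument made in the final bullet.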
GeneChip® Expression Array Design Multiple oligo probes 5´ 3´ Gene Sequence Probes designed to be Perfect Match Probes designed to be Mismatch
Procedures for Target Preparation
[Figure: Cells -> Poly (A)+ RNA -> cDNA -> IVT (Biotin-UTP, Biotin-CTP) -> labeled transcript -> Fragment (heat, Mg2+) -> labeled fragments -> Hybridize (16 hours) -> Wash & Stain -> Scan]
Target Preparation
• RNA isolation is the first step. 1-2 hours
• The messenger RNA is then reverse transcribed into cDNA (we then go on to make the second strand of cDNA). 4 hours
• An in vitro transcription reaction using biotinylated nucleotides is then done to both amplify and label the transcripts. 4-6 hours
• These are then fragmented in order to get a more efficient hybridization (30-100 bases is the goal). 1.0 hours
• The fragmented target is then hybridized overnight to a GeneChip expression array. 16 hours
• When washing and staining of the array is complete it can then be scanned. 1-2 hours
Streptavidin-Phycoerythrin (SAPE): fluorescent stain, laser stimulated
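Adding up the per-step times quoted above gives a feel for the overall turnaround. The sketch below simply sums the stated ranges; it assumes the steps run back to back with no waiting time, which is unlikely to hold in practice.

```python
# (step, minimum hours, maximum hours), taken from the step times listed above.
STEPS = [
    ("RNA isolation", 1, 2),
    ("cDNA synthesis (first and second strand)", 4, 4),
    ("In vitro transcription with biotinylated nucleotides", 4, 6),
    ("Fragmentation", 1, 1),
    ("Hybridization", 16, 16),
    ("Wash, stain and scan", 1, 2),
]


def total_hours(steps):
    """Return (shortest, longest) total time if steps run back to back."""
    shortest = sum(low for _, low, _ in steps)
    longest = sum(high for _, _, high in steps)
    return shortest, longest


low, high = total_hours(STEPS)
print(f"Total hands-on plus incubation time: {low}-{high} hours")
```

The 16-hour hybridization dominates, so the protocol spans at least two working days either way.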
Analysis of expression level from probe sets
A single, contiguous gene set for the rat β-actin gene.
• Each pixel is quantitated and integrated for each oligo feature (range 0-25,000)
• Perfect Match (PM), Mismatch (MM) control
• PM - MM = difference score
• All significant difference scores are averaged to create “average difference” = expression level of the gene.
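The PM - MM calculation can be sketched in a few lines. The slide does not say how "significant" is decided, so the sketch assumes a simple absolute noise threshold (`min_diff`, a hypothetical parameter); the intensities are invented for illustration and this is not the vendor's algorithm.

```python
from statistics import mean


def average_difference(pm, mm, min_diff=0.0):
    """Average the PM - MM difference scores for one probe set.

    pm, mm   : intensities for the perfect-match and mismatch features,
               in the same probe order
    min_diff : assumed noise threshold; pairs whose |PM - MM| does not
               exceed it are treated as not significant and dropped
    """
    if len(pm) != len(mm):
        raise ValueError("each PM probe needs a matching MM probe")
    diffs = [p - m for p, m in zip(pm, mm)]
    kept = [d for d in diffs if abs(d) > min_diff]
    return mean(kept) if kept else 0.0


# Hypothetical intensities for a 4-probe set (real sets use more pairs).
pm = [1200, 900, 1400, 300]
mm = [200, 400, 500, 290]
print(average_difference(pm, mm, min_diff=50))
```

Here the last pair (300 vs 290) falls under the threshold and is excluded, so the average is taken over the three remaining difference scores.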
Affymetrix® Instrument System Platform for GeneChip® Probe Arrays Integrated Exportable Easy to use Versatile
GeneChip analysis of human atherosclerosis
• Dissect normal media from atherosclerotic lesion
• Prepare highly purified RNA, O.D. 260/280 = 2.0
• Reverse transcribe w/poly dT + T7 = cDNA
• Transcribe with T7 + biotin-UTP = cRNA
• Purify probe/hybridize to chip
• Wash and detect with avidin/PE + ab amplification
• Read fluorescent label
• And deconvolve genes
Basic Bioinformatics-Scatterplot
Transcript profiling of aged rat aorta. Affymetrix GeneChip analysis of 10 aortas @ 20 mo. vs. 3 mo.
FAQs: How many replicates?
Simple fold changes Crude, insensitive--but effective Criteria: Present 1.5-fold up/down
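The two criteria above (called present, and changed at least 1.5-fold in either direction) can be sketched as a simple filter. The gene names and expression values are invented, the function name is hypothetical, and how a "present" call is made is left outside the sketch.

```python
def call_changes(genes, fold=1.5):
    """Classify genes by simple fold change.

    genes : dict of name -> (control_level, test_level, present)
    fold  : minimum fold change up or down to count as changed
    """
    calls = {}
    for name, (control, test, present) in genes.items():
        if not present or control <= 0 or test <= 0:
            calls[name] = "no call"          # absent or unusable signal
            continue
        ratio = test / control
        if ratio >= fold:
            calls[name] = "up"
        elif ratio <= 1 / fold:
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls


# Hypothetical expression levels: (control, test, present call).
example = {
    "geneA": (100, 200, True),
    "geneB": (300, 150, True),
    "geneC": (100, 120, True),
    "geneD": (5, 50, False),
}
print(call_changes(example))
```

Requiring a present call first keeps large ratios between two noise-level signals, like the last entry, out of the hit list.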
Hierarchical clustering
Statistical testing and ontology
Pathways of genetic information
Expression of Egr-1 mRNA in human lesions.
Egr-1 mRNA and protein in lesions vs normal cells.
[Figure: Egr-1 mRNA levels and B) Western blot of Egr-1 and actin, media (M) vs lesion (L); sample labels E196, E197, E221, E240, E243]
Expression screening by GeneChip
• each oligo sequence (20 mer) is synthesized as an 11 µm square (feature)
• each feature contains > 1 million copies of the oligo
• scanner resolution is about 2 µm (pixel)
• each gene is quantitated by 11 oligos and compared to an equal # of mismatched controls
• 44,000 genes are evaluated with 11 matching oligos and 11 mismatched oligos = about 1 x 10^6 features/chip (44,000 x 22 = 968,000)
• features are photolithographically synthesized onto a 2 x 2 cm glass substrate
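A quick arithmetic check of the per-gene probe counts and sizes in the bullets above. The sketch assumes a square grid of features with no border, control or spacer features, so the real layout will differ; the function name is hypothetical.

```python
import math


def chip_layout(genes, pm_per_gene=11, mm_per_gene=11, feature_um=11.0, pixel_um=2.0):
    """Estimate feature count, pixels per feature and grid size for one chip."""
    features = genes * (pm_per_gene + mm_per_gene)
    pixels_per_side = feature_um / pixel_um            # scanner pixels across a feature
    side_features = math.ceil(math.sqrt(features))     # features along one grid edge
    side_cm = side_features * feature_um / 10_000      # 10,000 um per cm
    return {
        "features": features,
        "pixels_per_feature": pixels_per_side ** 2,
        "side_features": side_features,
        "side_cm": side_cm,
    }


layout = chip_layout(44_000)
for key, value in layout.items():
    print(key, value)
```

With these inputs each feature is covered by roughly 30 scanner pixels, and the whole grid is about 1.1 cm on a side, which fits comfortably on the 2 x 2 cm substrate.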
GeneChip® Array Advantages – Specificity
                     Oligo arrays         cDNA arrays
Feature size         24 µm                ~ 150 µm
Gene “on” / “off”    Detection pattern    Single spot
Limitations to all microarrays.
- dynamic range of gene expression: very difficult to simultaneously detect low and high abundance genes accurately
- each gene has multiple splice variants
  2 splice variants may have opposite effects (e.g. trk)
  arrays can be designed for splicing, but complexity increases 5X
- translational efficiency is a regulated process: mRNA level does not always correlate with protein level
- proteins are modified post-translationally
  glycosylation, phosphorylation, etc.
- pathogens might have little ‘genomic’ effect
CardioChip in silico workup Lipoprotein genes/variants Heart failure predictors Atherosclerosis markers Restenosis markers Coagulation factors Stress markers Infectious agents Inflammatory markers