1
GE 07 © Ron Shamir 1 DNA Chips Based on slides by Ron Shamir
2
Monitoring Gene Expression
Goal: Simultaneous measurement of expression levels of all genes in one experiment.
Two fundamental biological assumptions:
–Transcription level indicates genes’ regulation.
–Only genes that contribute to organism fitness are expressed in a particular condition.
=> Detecting changes in a gene’s expression level provides clues to the function of its product.
3
DNA chips / Microarrays
Perform thousands of hybridizations in a single experiment.
Variants:
–Oligonucleotide arrays
–cDNA microarrays
Allow a global view of cellular processes: monitor transcription levels of numerous/all genes simultaneously.
4
GeneChip® Manufacturing Process
Photolithographic synthesis
[Figure: lamp, mask, wafer]
5
Synthesis of Ordered Oligonucleotide Arrays
[Figure: a mask exposes selected positions on the wafer to light (deprotection); the exposed positions are then coupled with one nucleotide (T, then C, and so on); the mask, light, and coupling cycle repeats until the probes are complete.]
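The mask, deprotect, couple cycle on this slide can be sketched as a small simulation. This is an illustrative toy, not Affymetrix's process: the function name `synthesize`, the fixed A, C, G, T cycling order, and the three short probes are all assumptions, and strand orientation on the wafer is ignored.

```python
from itertools import cycle

def synthesize(probes, base_order="ACGT"):
    """Simulate masked synthesis: each step uses one mask and couples one base."""
    if any(b not in base_order for p in probes for b in p):
        raise ValueError("probes may only contain bases from base_order")
    grown = [""] * len(probes)   # oligo built so far at each feature
    steps = []                   # (base, indices of features the mask exposes)
    bases = cycle(base_order)
    while any(g != p for g, p in zip(grown, probes)):
        base = next(bases)
        # The mask exposes only features whose next required base is `base`.
        exposed = [i for i, (g, p) in enumerate(zip(grown, probes))
                   if len(g) < len(p) and p[len(g)] == base]
        if exposed:              # skip a base when no feature needs it now
            for i in exposed:
                grown[i] += base  # light deprotects, then `base` couples
            steps.append((base, exposed))
    return grown, steps

# Hypothetical 3-mer probes, one per feature.
probes = ["TTC", "TCG", "CAT"]
grown, steps = synthesize(probes)
for base, exposed in steps:
    print(base, exposed)
```

Each printed line corresponds to one mask: the base being coupled and the features left unmasked for that step.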
6
Wafers, Chips, and Features
[Figure: wafer, chip, and feature at increasing magnification]
Wafer: 5” x 5”, cut into chips
Chip: 1.28 cm x 1.28 cm, >1,300,000 features / chip
Feature: 11 µm x 11 µm, millions of identical probes / feature
7
GeneChip Array Design
mRNA reference sequence (5´ to 3´): ···TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAG···
Perfect Match Oligo: TTACCCAGTCTTCCTGAGGATACAC
Mismatch Oligo: TTACCCAGTCTTGCTGAGGATACAC
[Figure: spaced DNA probe cells along the reference sequence; fluorescence intensity image showing perfect match probe cells and mismatch probe cells]
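The two oligos on this slide can be reproduced from the reference sequence. In the sketch below the perfect match is the base-by-base complement of a 25-base window, written in the slide's orientation (aligned under the reference), and the mismatch swaps the middle base for its complement. The function names and the window offset 12 are assumptions chosen to match the slide's example.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def perfect_match(reference, start, length=25):
    """Complement of reference[start:start+length], aligned under the reference."""
    window = reference[start:start + length]
    if len(window) != length:
        raise ValueError("window runs past the end of the reference")
    return "".join(COMPLEMENT[b] for b in window)

def mismatch(pm):
    """Same probe with the middle base replaced by its complement."""
    mid = len(pm) // 2
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]

reference = "TGTGATGGTGGGAATGGGTCAGAAGGACTCCTATGTGGGTGACGAG"
pm = perfect_match(reference, 12)
mm = mismatch(pm)
print(pm)  # TTACCCAGTCTTCCTGAGGATACAC
print(mm)  # TTACCCAGTCTTGCTGAGGATACAC
```

The mismatch probe serves as a specificity control: it differs from the perfect match only at position 13 of 25.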
8
Expression Assay Using Affymetrix GeneChips
[Figure: workflow. Cells → total RNA, 5ug (10ng SSP) → cDNA (T7-dT primer) → IVT (Biotin-UTP, Biotin-CTP) → labeled transcript → fragment (heat, Mg2+) → labeled fragments → hybridize (16 hours) → wash & stain (SAPE) → scan]
9
cDNA Microarrays
11
cDNA Microarrays (2)
12
Using mirrors to manufacture DNA chips
http://www.nimblegen.com/technology/manufacture.html
NimbleGen builds its arrays using photo deposition chemistry with its MAS system. At the heart of the system is a Digital Micromirror Device (DMD), similar to Texas Instruments' Digital Light Processor (DLP), employing a solid-state array of miniature aluminum mirrors to pattern up to 786,000 individual pixels of light. The DMD creates "virtual masks" that replace the physical chromium masks used in traditional arrays.
[Figure: Digital Light Processor (DLP). The Digital Micromirror Device's (DMD) micromirrors are displayed in comparison to the tip of a pin.]
Each of the 786,000 micromirrors is individually addressable, giving unparalleled precision and control over DNA array fabrication chemistry and structure.
13
These "virtual masks" reflect the desired pattern of UV light with individually addressable aluminum mirrors controlled by the computer. The DMD controls the pattern of UV light on the microscope slide in the reaction chamber, which is coupled to the DNA synthesizer. The UV light deprotects the oligo strand, allowing the synthesis of the appropriate DNA molecule.
Unlike conventional oligo synthesis, arrays are synthesized on glass slides rather than controlled pore glass supports. Another key difference is that the deprotection steps are performed by photodeprotection rather than by acid deprotection.
[Figure: digital micromirrors reflecting a pattern of UV light, which deprotects the nascent oligonucleotide and allows addition of the next base.]
14
Agilent SurePrint Technology
SurePrint: inkjet-based, industrial-scale microarray printing and production technology.
15
SurePrint in-situ Process
[Figure: in-situ synthesis with the four nucleotides (A, G, T, C) …]
16
Human 1A Array: Description
22,575-feature microarrays on 1” x 3” glass slides
17,986 60-mer probes representing 17,086 genes
10 replicates present for each of 100 probes for intra-array reproducibility evaluation
> 3,000 blank features for custom use
17
Candidate Probe Design
Apply design algorithms to select the 10 best candidate probes for each transcript.
Select a sequence that represents a consensus across all transcripts for each LifeSeq Foundation gene, and merge when possible.
[Figure: merged consensus region; candidate probes]
18
The Raw Data
[Figure: expression levels, the “Raw Data” matrix; rows are genes, columns are experiments]
Entries of the Raw Data matrix:
–Ratio values
–Absolute values
–Distributions…
Row = gene’s expression pattern / fingerprint vector
Column = experiment/condition’s profile
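A minimal sketch of the Raw Data matrix as a list of rows, showing how a gene's pattern (row) and a condition's profile (column) are read out, and how absolute values can be turned into ratio values. The gene names, condition names, and intensities are made up for illustration, and taking log2 ratios against the first condition is an assumption rather than something the slide prescribes.

```python
import math

genes = ["geneA", "geneB", "geneC"]                # hypothetical names
conditions = ["cond1", "cond2", "cond3", "cond4"]  # hypothetical names
raw = [                                            # absolute values, genes x conditions
    [120.0, 240.0, 480.0, 960.0],
    [500.0, 500.0, 500.0, 500.0],
    [800.0, 400.0, 200.0, 100.0],
]

def gene_pattern(matrix, g):
    """Row g: the expression pattern (fingerprint vector) of one gene."""
    return list(matrix[g])

def condition_profile(matrix, c):
    """Column c: the profile of one experiment/condition."""
    return [row[c] for row in matrix]

def log_ratios(matrix, reference_col=0):
    """Convert absolute values to log2 ratios against a reference column."""
    return [[math.log2(v / row[reference_col]) for v in row] for row in matrix]

ratios = log_ratios(raw)
print(gene_pattern(ratios, 0))      # geneA doubles at every step
print(condition_profile(raw, 1))    # all genes in cond2
```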
19
Computational Challenges
Normalization: How does one best normalize thousands of signals from the same or different conditions/experiments?
Identify differentially expressed genes between experiments.
Clustering: Partition genes into subsets that manifest similar expression patterns.
Biclustering: Find subsets of genes and conditions that manifest a common expression sub-pattern.
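To make the first two challenges concrete, here is a sketch of one simple approach: scale every array to a common median, then flag genes whose log2 fold change between two conditions exceeds a cutoff. The slides do not say which normalization or test to use; median scaling, the cutoff of 1.0, and the toy intensities are all assumptions.

```python
import math
from statistics import median

def median_normalize(matrix):
    """Scale each column (array) so that all columns share the same median."""
    n_cols = len(matrix[0])
    col_medians = [median(row[c] for row in matrix) for c in range(n_cols)]
    target = median(col_medians)
    return [[v * target / col_medians[c] for c, v in enumerate(row)]
            for row in matrix]

def differential_genes(matrix, col_a, col_b, min_abs_log2_fc=1.0):
    """Return (gene index, log2 fold change) for genes passing the cutoff."""
    found = []
    for g, row in enumerate(matrix):
        log2_fc = math.log2(row[col_b] / row[col_a])
        if abs(log2_fc) >= min_abs_log2_fc:
            found.append((g, log2_fc))
    return found

# Toy data: the second array is uniformly twice as bright,
# except for gene 2, which is genuinely induced.
raw = [
    [100.0, 200.0],
    [50.0, 100.0],
    [200.0, 1600.0],
    [80.0, 160.0],
    [40.0, 80.0],
]
normalized = median_normalize(raw)
changed = differential_genes(normalized, 0, 1)
print(changed)
```

Without normalization every gene would appear twofold induced; after scaling, only gene 2 stands out.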
20
Computational Challenges (2)
Classification: Given a partition of the conditions into types, classify new conditions by type.
Feature selection: Given a partition of the conditions into types, find a subset of the genes for each type that distinguishes it.
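As a hedged illustration of both tasks, the sketch below uses a nearest-centroid classifier and ranks genes by how far apart the two type centroids are. The slides do not name a method; this choice, the function names, and the toy profiles (conditions as vectors over three genes) are assumptions.

```python
import math

def centroids(profiles, labels):
    """Average the training profiles of each condition type."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for g, value in enumerate(profile):
            acc[g] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def classify(profile, cents):
    """Assign a new condition to the type with the nearest centroid."""
    return min(cents, key=lambda label: math.dist(profile, cents[label]))

def rank_genes(cents, type_a, type_b):
    """Gene indices, most discriminating first (largest centroid gap)."""
    gaps = [abs(a - b) for a, b in zip(cents[type_a], cents[type_b])]
    return sorted(range(len(gaps)), key=lambda g: gaps[g], reverse=True)

# Hypothetical training conditions, each measured over three genes.
train = [[1.0, 5.0, 3.0], [1.2, 4.8, 3.1], [4.0, 1.0, 3.0], [4.2, 0.8, 2.9]]
labels = ["typeA", "typeA", "typeB", "typeB"]

cents = centroids(train, labels)
predicted = classify([1.1, 5.1, 3.0], cents)
ranking = rank_genes(cents, "typeA", "typeB")
print(predicted, ranking)
```

Gene 2 barely differs between the types, so it ranks last and would be the first gene dropped by this crude feature selection.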
22
Computational Challenges (3)
Experiment design: Choose which (pairs of) conditions will be most informative.
Detect regulatory signals in promoter regions of co-expressed genes.
Assign statistical significance to your answers for each of the above.
23
PRIMA
PRomoter Integration in Microarray Analysis
Chaim Linhart, Rani Elkon, Roded Sharan
24
PRIMA – Motivation
Microarrays reveal systems-level alterations in cellular transcriptional programs induced by treatment/manipulation.
But they DO NOT directly disclose the transcriptional regulatory networks that underlie the observed alterations.
Integration of computational promoter analysis can shed light on those networks.
25
PRIMA – Goal: ‘Reverse engineering’ of transcriptional networks
Reverse engineering of transcriptional networks infers regulatory mechanisms from gene expression patterns.
–Assumption: co-expression → transcriptional co-regulation → common cis-regulatory promoter elements
Step 1: Identification of co-expressed genes using microarray technology (clustering algorithms)
Step 2: Computational identification of cis-regulatory elements that are over-represented in promoters of the co-expressed genes
Such methodologies were successfully demonstrated in prokaryotes and lower eukaryotes.
26
PRIMA – General description
Input:
–Target set (e.g., co-expressed genes)
–Background set (e.g., all genes on the chip)
Analysis:
–Identify transcription factors whose binding site signatures are enriched in the ‘Target set’ with respect to the ‘Background set’.
Required ‘databases’:
–Promoter sequences on a genomic scale
–Models for binding sites recognized by TFs
27
PRIMA: PWM models for TFs’ binding sites
Consensus pattern
–Example: E2F: TTTSGCGCS, S = G or C
Position weight matrix (PWM) (E2F’s PWM)
–This model is used in PRIMA
–Scan promoters for hits of the PWM
–Each hit is a hypothetical binding site for the TF that corresponds to the PWM
TRANSFAC DB - PWMs for over 100 distinct mammalian TFs
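The simpler of the two models, the consensus pattern, can be matched with a regular expression built from IUPAC ambiguity codes. This is only a sketch of consensus matching, not the PWM model that PRIMA uses; the function name, the example promoter string, and the forward-strand-only scan are assumptions.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (subset sufficient for DNA motifs).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[CG]", "W": "[AT]", "R": "[AG]", "Y": "[CT]",
    "M": "[AC]", "K": "[GT]", "D": "[AGT]", "N": "[ACGT]",
}

def consensus_hits(consensus, sequence):
    """Return (position, matched site) for every match, overlaps included."""
    body = "".join(IUPAC[code] for code in consensus)
    pattern = re.compile("(?=(" + body + "))")   # lookahead allows overlapping hits
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

# Hypothetical promoter fragment containing two E2F-like sites.
promoter = "AAGTTTCGCGCCATTTGGCGCGA"
hits = consensus_hits("TTTSGCGCS", promoter)
print(hits)  # [(3, 'TTTCGCGCC'), (13, 'TTTGGCGCG')]
```

A consensus pattern accepts or rejects a site outright, whereas a PWM assigns each candidate site a graded score, which is why the slide singles out the PWM as the model used in PRIMA.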
28
PRIMA - steps
Given a Target set and a Background set:
For each PWM:
–Scan the promoters of the Background set and of the Target set for hits of the PWM
–Apply a statistical test to identify TFs whose PWM hits in the target set are significantly over-represented given their prevalence in the background set.
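The second step can be sketched with a hypergeometric tail probability, one common choice for over-representation. The slide does not name the test PRIMA applies, so treat this as an assumption, along with the function names, the `alpha` cutoff, and the toy promoter sets. The sketch assumes the target set is a subset of the background set and counts a promoter once however many hits it contains.

```python
from math import comb

def hypergeom_tail(N, K, n, k):
    """P(X >= k) when drawing n promoters from N, of which K carry a hit."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def enriched_tfs(hits_by_tf, target, background, alpha=0.05):
    """hits_by_tf maps each TF to the set of promoters with >= 1 PWM hit."""
    results = []
    for tf, promoters in hits_by_tf.items():
        K = len(promoters & background)   # promoters with hits in the background
        k = len(promoters & target)       # promoters with hits in the target
        p = hypergeom_tail(len(background), K, len(target), k)
        if p <= alpha:
            results.append((tf, p))
    return sorted(results, key=lambda item: item[1])

# Toy example: 20 background promoters, 4 of them in the target set.
background = set(range(20))
target = {0, 1, 2, 3}
hits_by_tf = {
    "TF_A": {0, 1, 2, 10},     # 3 of its 4 hits fall in the target set
    "TF_B": {5, 6, 7, 8, 9},   # no hits in the target set
}
enriched = enriched_tfs(hits_by_tf, target, background)
print(enriched)
```

When many PWMs are tested, the raw p-values would normally need a multiple-testing correction; that step is left out here.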
29
Genome-wide in silico identification of TFs controlling cell cycle progression in human cells (Elkon et al., Genome Research 2003)
Whitfield et al. (MBC, June 2002) recorded expression profiles during the progression of the human cell cycle.
874 genes showed periodic expression patterns (Fourier analysis).
These cell cycle-regulated genes were partitioned into five clusters (G1/S, S, G2, G2/M and M/G1).
30
Human cell cycle – results (I)
PRIMA’s human promoter set contained sequences for 568 of these cell cycle-regulated genes.
PRIMA revealed 8 TFs whose binding sites are significantly (p<0.0005) over-represented in promoters of cell cycle-regulated genes.
Enrichment of some of these factors was biased to certain phases of the cell cycle.
31
[Figure: enrichment results, each given as a p-value with (promoters, hits):
p = 1.2x10^-8 (true positive), 78 promoters (92 hits)
p = 1.2x10^-11 (152, 203)
p = 8x10^-4 (20, 25)]
32
I. Enriched TFs in cell cycle-regulated promoters
33
II. Location distribution of the computationally identified putative binding sites
34
In-silico Identification of Regulation Modules
Transcriptional regulation is combinatorial.
Decipher transcriptional modules.
Computational identification of pairs of PWMs that tend to appear together.
[Figure: Promoter #1 … Promoter #n]
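One way to look for PWM pairs that tend to appear together is to compare how many promoters carry hits for both PWMs with the number expected if the two occurred independently. The slide gives no details of the scoring, so the independence baseline, the function name, and the toy hit sets below are assumptions.

```python
from itertools import combinations

def cooccurrence(hits_by_tf, n_promoters):
    """For each TF pair: (promoters hit by both, expected count under independence)."""
    table = {}
    for a, b in combinations(sorted(hits_by_tf), 2):
        observed = len(hits_by_tf[a] & hits_by_tf[b])
        expected = len(hits_by_tf[a]) * len(hits_by_tf[b]) / n_promoters
        table[(a, b)] = (observed, expected)
    return table

# Toy example over 10 promoters (ids 0..9).
hits_by_tf = {
    "TF_A": {0, 1, 2, 3},
    "TF_B": {0, 1, 2, 9},
    "TF_C": {4, 5},
}
pairs = cooccurrence(hits_by_tf, n_promoters=10)
for pair, (observed, expected) in pairs.items():
    print(pair, observed, expected)
```

Pairs whose observed count clearly exceeds the expected one are candidates for a shared module; a significance test would still be needed before reporting them.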
35
PRIMA: results on HCC (III)
Co-occurring pairs of TFs:
36
E2F - Position Weight Matrix (PWM)
This PWM model is based on 45 empirically validated E2F binding sites.

           P1  P2  P3  P4  P5  P6  P7  P8  P9  P10  P11  P12
A           4   4   0   1   0   0   0   0   1   24   26   24
C           4   2   2  23   4  45   0  32  13   11    1    5
G           4   3   4  21  41   0  45  13  26    5    8   12
T          33  36  39   0   0   0   0   0   5    5   10    4
Consensus   T   T   T   S   G   C   G   C   S    M    D    R
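The count matrix above can be turned into a scoring matrix and slid along a sequence. The sketch below uses log-odds scores against a uniform background with a pseudocount of 1; those two choices, the threshold of 10, and the test sequence are assumptions, not PRIMA's published parameters.

```python
import math

# Counts copied from the slide: 45 validated E2F sites, positions P1..P12.
COUNTS = {
    "A": [4, 4, 0, 1, 0, 0, 0, 0, 1, 24, 26, 24],
    "C": [4, 2, 2, 23, 4, 45, 0, 32, 13, 11, 1, 5],
    "G": [4, 3, 4, 21, 41, 0, 45, 13, 26, 5, 8, 12],
    "T": [33, 36, 39, 0, 0, 0, 0, 0, 5, 5, 10, 4],
}

def log_odds(counts, pseudocount=1.0, background=0.25):
    """One dict per position: log2(smoothed frequency / background)."""
    pwm = []
    for j in range(len(counts["A"])):
        total = sum(counts[b][j] for b in "ACGT") + 4 * pseudocount
        pwm.append({b: math.log2((counts[b][j] + pseudocount) / total / background)
                    for b in "ACGT"})
    return pwm

def score(pwm, site):
    """Sum of per-position scores for a site as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, site))

def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at least threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        s = score(pwm, sequence[i:i + width])
        if s >= threshold:
            hits.append((i, s))
    return hits

pwm = log_odds(COUNTS)
best = "".join(max(column, key=column.get) for column in pwm)
hits = scan(pwm, "GGG" + best + "CCC", threshold=10.0)
print(best, hits)
```

The highest-scoring site, TTTCGCGCGAAA, agrees with the consensus row TTTSGCGCSMDR, and it is the only window in the test sequence that clears the threshold.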
37
Conservation of Regulatory Elements
Gene “DNA replication licensing factor MCM6” (G1/S)
38
Human Cell Cycle Revisited
We detected global enrichments that pointed to major TFs in human cell cycle regulation.
However, we did not report specific target genes because of the high rate of false positive hits.
Comparative genomics greatly boosts the specificity of in-silico detection of regulatory elements.
It now allows us to pinpoint TF targets with high confidence.
40
E2F Human-Mouse Conserved Hits
16,299 human-mouse ortholog promoters (Ensembl)

                     Cell Cycle Promoters   Rest of Genome
Total                697                    15,602
E2F hits             75                     525
% with E2F hits      11%                    3.3%
Enrichment Factor    3.3
P-value              E-17
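The enrichment factor in this table is the ratio of the two hit fractions, which can be checked directly from the counts. The function name below is an assumption; the counts are the ones on the slide.

```python
def enrichment(target_hits, target_total, rest_hits, rest_total):
    """Return (target fraction, rest fraction, enrichment factor)."""
    target_frac = target_hits / target_total
    rest_frac = rest_hits / rest_total
    return target_frac, rest_frac, target_frac / rest_frac

cc_frac, rest_frac, factor = enrichment(75, 697, 525, 15602)
print(f"cell cycle: {cc_frac:.1%}, rest: {rest_frac:.1%}, factor: {factor:.2f}")
```

From the unrounded counts the factor comes out at about 3.2; the slide's 3.3 is what one gets by dividing the rounded percentages (11% by 3.3%).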
41
E2F Conserved Hits: Phase Distribution

                     Num of targets   Enrich. Factor   P-val
Cell Cycle – Total   75               x3.3             E-17
G1/S                 33               x7.3             E-18
S                    15               x3.9             E-5
G2+M                 18               ---              ---
M/G1                 9                ---              ---