Gene Expression Profiling
Brad Windle, Ph.D.
Profile: a set of data or characteristics pertaining to an item. Profiles are sometimes referred to as signatures or fingerprints.
Cellular Profiles: Gene Expression, Protein Expression, Misc. Data, SNPs, Methylation, Cell State, Drug Response, Metabolomics, Structural Genomic, Protein States, Disease, Gene/Protein Sequence, Protein Structure, Drug Structure
Cellular Profiles: Gene Expression, Protein Expression, Misc. Data, SNPs, Methylation, Drug Response, Metabolomics, Structural Genomic, Protein States, Gene/Protein Sequence, Protein Structure, Drug Structure
20,000-30,000 human genes
Factors in Gene Expression
1. Presence or absence of the genes, and the number of genes: differences within the human population, and big differences that occur during oncogenesis.
2. Epigenetics, chromatin state: cell-to-cell and host-to-host variability unknown.
3. Homeostasis and environmental factors: cell-to-cell and host-to-host variability unknown, but environmental factors are the variable of interest.
What do we want to know?
The bigger picture:
Are cells or tissues related based on the genes they express?
For an experimental cell model, are there conditions that are similar based on changes in gene expression?
For certain experimental conditions, are there genes that show similar patterns of change (co-regulated)?
The smaller picture:
What genes went up or down under an experimental condition?
What are the differences in gene expression between two cell types?
Profiles Have Two Sides: a gene profile across samples and a sample profile across genes.
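The two sides of a profile can be illustrated with a small expression matrix. This is only a sketch: the gene names, sample names, and values are hypothetical, and the rows-as-genes, columns-as-samples layout is an assumption rather than something the slides specify.

```python
# Hypothetical expression matrix: each row is a gene, each column is a sample.
genes = ["gene_A", "gene_B", "gene_C"]
samples = ["sample_1", "sample_2", "sample_3", "sample_4"]
expression = [
    [2.0, 4.0, 6.0, 8.0],  # gene_A
    [1.0, 1.5, 1.0, 1.5],  # gene_B
    [9.0, 7.0, 5.0, 3.0],  # gene_C
]


def gene_profile(matrix, gene_names, gene):
    """Return one gene's values across all samples (a row of the matrix)."""
    return list(matrix[gene_names.index(gene)])


def sample_profile(matrix, sample_names, sample):
    """Return one sample's values across all genes (a column of the matrix)."""
    col = sample_names.index(sample)
    return [row[col] for row in matrix]


if __name__ == "__main__":
    print("gene_B across samples:", gene_profile(expression, genes, "gene_B"))
    print("sample_2 across genes:", sample_profile(expression, samples, "sample_2"))
```

Comparing rows asks whether genes behave alike; comparing columns asks whether samples are alike. The same matrix supports both questions.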
Microarrays and Gene Expression Profiling
How did we use to do this? Probe for 1 gene; analyze ~10 samples.
Analyzing 1 gene for 10 samples. Now, with microarrays: analyzing thousands of genes for 1 sample.
Spotting Arrays
Affymetrix
cDNA synthesis, labeling
Gene Expression Profiling: cell or condition of interest; control or reference cell; hybridize to microarray
Issues of Multiplicity: 10,000 genes analyzed at a significance threshold of p = 0.001. Therefore, about 10 genes (10,000 × 0.001) should be found by chance alone, even when there is no real difference.
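The arithmetic behind the multiplicity issue can be sketched as follows. The expected count reproduces the figure on the slide. The chance of at least one false positive assumes the tests are independent. The Bonferroni threshold is one common remedy, shown here as an illustration; the slides do not prescribe it. All function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing the threshold by chance alone."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, family_alpha):
    """Per-test threshold that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


if __name__ == "__main__":
    n_genes = 10_000
    alpha = 0.001
    print("expected false positives:", expected_false_positives(n_genes, alpha))
    print("P(at least one):", prob_at_least_one_false_positive(n_genes, alpha))
    print("Bonferroni per-gene threshold for 0.05:", bonferroni_threshold(n_genes, 0.05))
```

With 10,000 genes, a per-gene threshold of 0.001 is far too permissive for a list of "changed" genes to be trusted without some correction or independent validation.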
Applications of Gene Expression Profiling: Tissue or Tumor Classification, Gene Classification, Drug Classification, Drug Target Identification, Drug Response Prediction
Gene Expression Array, Genomic Content Array, Methylation Array (Chromatin Array), SNP Array
Comparative Genomic Hybridization (CGH): Structural Genomic Profiling. Cell with losses or gains; normal cell; hybridize to metaphase chromosomes.
Genome Representation Profiling Using Arrays: normal and cancer DNA; label DNA; BAC or oligo array
CGH: Detection mainly for cancer and inherited deletions. Tumor suppressor genes are deleted. Oncogenes are amplified.
Methylation Profiling [diagram: CCGG/GGCC sites, some methylated (me); PCR linkers; Hpa II; PCR amplify / label; hybridize to CpG island array]
Profile cells based on methylation state: cell-type profiles. Differences in the methylated state of cancers. Compare methylation profiles to gene expression profiles.
Combinatorial Elements Regulating Transcription
Profiling Transcription Factor-Interactive DNA: total genomic DNA; immunoprecipitate with Ab to protein (chromatin IP, or ChIP)
Analyzing Gene Expression Data and Profiles: Cluster Analysis
Cluster Analysis: We start with no hypothesis or a very general hypothesis. We want the data to reveal what is relatively significant. This approach is a hypothesis generator.
Why can't we observe the patterns unaided? The patterns are too complex or abstract. There's too much data. There's too much noise.
Pearson
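As an illustration of how Pearson correlation compares two profiles, here is a minimal sketch using only the standard library. The gene names and values are hypothetical, and using 1 - r as a clustering distance is a common convention assumed here, not something the slide states.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def correlation_distance(x, y):
    """Assumed convention: 0 for perfectly correlated, 2 for anti-correlated."""
    return 1.0 - pearson(x, y)


if __name__ == "__main__":
    gene_a = [1.0, 2.0, 3.0, 4.0]  # hypothetical profile across 4 samples
    gene_x = [2.0, 4.0, 6.0, 8.0]  # same shape, twice the magnitude
    gene_y = [8.0, 6.0, 4.0, 2.0]  # opposite trend
    print(pearson(gene_a, gene_x), pearson(gene_a, gene_y))
```

Because Pearson correlation ignores overall magnitude, two genes with the same pattern of change score as similar even when their absolute expression levels differ.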
Two-way Clustering
Gene A, Gene X
Clustering Methods: Divisive; Agglomerative (Aggregative)
Cluster Linkage Methods
Nearest Neighbor or Single Linkage
Furthest Neighbor or Complete Linkage
Average Neighbors or Average Linkage
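The three linkage methods differ only in how the distance between two clusters is measured. The sketch below is a deliberately naive agglomerative clustering in pure Python, assuming Euclidean distance between profiles (a correlation-based distance could be substituted). The function names and toy data are hypothetical, and real analyses would normally use an established library instead.

```python
def euclidean(p, q):
    """Euclidean distance between two profiles of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def cluster_distance(c1, c2, points, method):
    """Distance between two clusters (lists of point indices) for a linkage method."""
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    if method == "single":      # nearest neighbor
        return min(dists)
    if method == "complete":    # furthest neighbor
        return max(dists)
    if method == "average":     # mean of all pairwise distances
        return sum(dists) / len(dists)
    raise ValueError("unknown linkage method: " + method)


def agglomerative(points, n_clusters, method="average"):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], points, method)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical profiles: two tight pairs that are far apart from each other.
    profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
    for method in ("single", "complete", "average"):
        print(method, agglomerative(profiles, 2, method))
```

On well-separated data all three linkages agree. They diverge on elongated or noisy groups, where single linkage tends to chain clusters together and complete linkage favors compact ones.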
The Color Map [figure: color map of genes versus cells]
The Profile Data Sources
SNPs: DNA microarray, oligos, millions of SNP sites
Protein expression: Ab microarray, 2D gels, mass spectrometry
Protein states: 2D gels, <1000 proteins resolved
Drug response: brute force, 70,000 compounds screened
Metabolomics: GC-MS
DNA/protein sequence: sequencing, <20 people sequenced, brute force
Drug structure: in silico
Protein structure: 3D crystallography, NMR, brute force
Gene expression: DNA microarrays, oligo or PCR, 20,000-30,000 genes
Structural genomics: DNA microarrays, BACs, ~one per 1 Mb
Methylation: DNA microarrays, upstream sequences, CpG islands
Questions?