Public data - available for projects
6 data sets:
–Human Tissues
–Leukemia
–Spike-in
–Arabidopsis mutants
–Yeast Cell Cycle
–Yeast Rosetta



Human Tissue Atlas
79 human tissues (in duplicate)
Among the tissues are:
–brain samples
–heart and liver samples
–fetal samples
Probe sets (HGU133A subset)
Preprocessed data:
–normalized
–expression index calculated
Used for investigation of global trends and chromosomal organization of transcription, and for evaluation of gene prediction
Su et al.,

Leukemia Data
84 bone marrow samples from children with acute lymphoblastic leukemia (ALL)
70 B-cell ALL, with 4 different subtypes:
–15 BCR-ABL
–18 E2A-PBX1
–17 Hyperdiploid
–20 TEL-AML
14 T-cell ALL
Platform: Affymetrix HGU133A
The data set has previously been used for classification problems
For more information: Ross et al., Blood,

Spike-in Data Set
Subset of the SPIKE-IN HGU95 Latin square data
"Normal" sample + spike-in of transcripts that hybridize to 14 probe sets (the concentrations of the spike-ins are known)
2 series of concentrations: each probe set is spiked in at two different concentrations (pM)
12 replicates for each series: four replicates on each of three GeneChip batches (24 GeneChip CEL files are available in total)
Previous usage: a benchmark data set for preprocessing methods
Table: spike-in concentrations (pM) of probe sets A to N in Series 1 and Series 2 (values not preserved in this transcript)
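The replicate structure described on this slide (2 concentration series x 3 GeneChip batches x 4 replicates) can be enumerated as a quick sanity check before loading the CEL files. This is only a minimal Python sketch: the series and batch labels are hypothetical placeholders and do not reflect the actual file names in the data set.

```python
from itertools import product

# Design factors taken from the slide; the labels themselves are hypothetical.
SERIES = ["series1", "series2"]
BATCHES = ["batch1", "batch2", "batch3"]
REPLICATES = [1, 2, 3, 4]


def spike_in_design():
    """Return one (series, batch, replicate) tuple per expected CEL file."""
    return list(product(SERIES, BATCHES, REPLICATES))


design = spike_in_design()

# Number of arrays per concentration series (expected: 12 each).
per_series = {s: sum(1 for d in design if d[0] == s) for s in SERIES}

print(len(design))
print(per_series)
```

Comparing such a design list against the files actually present in the data directory is one way to spot a missing or duplicated array.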

Arabidopsis Mutants Data Set
Samples (3x each):
–WT: wild type
–mpk4: MAP Kinase 4
–ctr1: Constitutive Triple Response
–mpk4/ctr1: double mutant
Platform:
–All data are from the Affymetrix ATH1 GeneChip®
–22810 probe sets, ~ all genes
Background:
–MPK4 is central to the response to the plant hormone salicylic acid (SA).
–CTR1 plays a key role in ethylene (ET) perception.
–SA and ET are partially antagonistic.
–MPK4 may play a key role in this mechanism.

Figure: Arabidopsis systemic immunity pathways. Labels in the figure: LOCAL RESPONSE, SA, NPR1, PR1, PR2, PR5, BIOTROPH RESISTANCE, MPK4, ?, ET, ETR1-s, CTR1, EINs-ERFs, MPK6, PDF1.2, b-CHI, GST, NECROTROPH RESISTANCE.

Yeast Cell Cycle Data
The experiment:
–Three time series, in which samples were taken from a synchronized yeast cell culture as it progressed through the cell cycle.
Three different synchronization methods to arrest the cell cycle:
–Two temperature-sensitive mutant strains (Cdc15 and Cdc28) that cannot pass through the cell cycle at high temperature
–Rapid removal of mating factor alpha from the culture, which releases it from arrest
Aim of the original studies:
–to determine the genes that fluctuate in expression during the cell cycle
–to characterize when in the cell cycle these genes are expressed and repressed
The data set:
–three separate files, normalized and preprocessed data
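One simple way to approach both aims (which genes fluctuate, and when they peak) is to project each expression profile onto a sine and cosine at an assumed cell cycle period. The Python sketch below illustrates that idea only; it is not the method used in the original studies. The function name, the 60-minute period, and the synthetic profiles are all assumptions made for the example.

```python
import math


def periodicity(times, values, period):
    """Return (amplitude, peak_time) of the sinusoidal component at `period`.

    The mean-centred profile is projected onto cos and sin at the given
    period. The result is exact only for evenly spaced samples that cover
    a whole number of periods; treat it as a rough score otherwise.
    """
    n = len(values)
    mean = sum(values) / n
    a = sum((v - mean) * math.cos(2 * math.pi * t / period)
            for t, v in zip(times, values))
    b = sum((v - mean) * math.sin(2 * math.pi * t / period)
            for t, v in zip(times, values))
    amplitude = 2 * math.hypot(a, b) / n
    peak_time = (math.atan2(b, a) / (2 * math.pi) * period) % period
    return amplitude, peak_time


# Synthetic example: 12 samples taken 10 minutes apart, assumed 60-minute cycle.
PERIOD = 60.0
times = [10.0 * k for k in range(12)]
cycling = [math.cos(2 * math.pi * (t - 15.0) / PERIOD) for t in times]  # built to peak at t = 15
flat = [1.0] * len(times)  # a gene that does not fluctuate

amp_cycling, peak_cycling = periodicity(times, cycling, PERIOD)
amp_flat, _ = periodicity(times, flat, PERIOD)

print(amp_cycling, peak_cycling)
print(amp_flat)
```

Ranking genes by amplitude gives a candidate list of cycling genes, and the peak time orders them within the cycle. With real data the period would have to be estimated for each synchronization method, and the score is sensitive to noise and to uneven sampling.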

Yeast Rosetta Compendium
Data set consisting of a compendium of expression profiles:
–276 deletion mutants (69 of which were unknown at the time)
–11 tetracycline-regulatable essential genes
–13 compound treatments
Data:
–P-values and logratios
–generated by comparison with 63 control experiments
Data originally used for identifying gene clusters and for profiling unknown ORFs and drug targets.
For more information: Hughes et al., Cell, 2000

Data Set Overview

data set             # genes / probe sets  platform                     organism              # samples       data
Human Tissue         22215                 Affymetrix HGU133A (custom)  Homo sapiens          158 (2 x 79)    expression values
Leukemia             22215                 Affymetrix HGU133A           Homo sapiens          84              expression values
Spike-in             12559                 Affymetrix HGU95A            Homo sapiens          24 (3 x 4 x 2)  CEL files
Arabidopsis mutants  22746                 Affymetrix ATH1              Arabidopsis thaliana  12 (3 x 4)      CEL files
Cell cycle           ~6000                 cDNA arrays, Affymetrix      S. cerevisiae         59              scaled expression values
Rosetta              6251                  cDNA arrays                  S. cerevisiae         300             logratios + p-values
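The sample totals in the overview can be cross-checked against the breakdowns given on the individual data set slides. The Python sketch below does this arithmetic; the dictionary layout and variable names are illustrative only, and the cell cycle data set is left out because no breakdown of its 59 samples is given here.

```python
# Total sample count per data set, with the breakdown stated on its slide.
SAMPLE_BREAKDOWN = {
    "Human Tissue": (158, [79, 79]),              # 79 tissues in duplicate
    "Leukemia": (84, [15, 18, 17, 20, 14]),       # 4 B-cell subtypes + T-cell
    "Spike-in": (24, [12, 12]),                   # 2 series x 12 replicates
    "Arabidopsis mutants": (12, [3, 3, 3, 3]),    # 4 genotypes x 3 samples
    "Rosetta": (300, [276, 11, 13]),              # deletions, tet-regulatable genes, compounds
}

# Data sets whose breakdown does not add up to the stated total.
mismatches = [
    name
    for name, (total, parts) in SAMPLE_BREAKDOWN.items()
    if sum(parts) != total
]

print(mismatches)
```

An empty list means the per-slide counts and the overview totals agree.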

Practical stuff
Where: Data.sets directory, see link in your home directories
When:
Week 1:
–Wednesday: Problem formulation
–Thursday: Public data - available for project, discussion (Human tissue, Spike-in, Cell cycle)
Week 2:
–Monday: Public data - available for project, discussion (Leukemia, Plant mutants, Rosetta compendium)
–Tuesday: Project outline
–Wednesday: 13:00, deadline for problem formulation - hand in written P.F.