FINAL PROJECT- Key dates


FINAL PROJECT - Key dates
- 9.1: last day to decide on a project
- 18, 23, 24/1: presenting a proposed project in small groups. A very short presentation (max. 5 minutes) covering the title, background, main question, and the major tools you are planning to use to answer the question
- 6.3: final submission

Gene Expression Analysis

Gene Expression: DNA → RNA → protein

Gene Expression [Figure: mRNA transcripts of gene1, gene2 and gene3, with poly(A) tails]

Studying Gene Expression, 1987-2010
- Spotted microarray
- One channel microarray
- RNA-seq (Next Generation Sequencing)

Applications
- Identify gene function: similar expression can imply similar function
- Find tissue- or development-specific genes: different expression in different cells/tissues
- Diagnostics and therapy: differences in gene expression can indicate a disease state, and genes whose expression changes in a disease can be good candidates for drug targets

Different types of microarray technologies (classical methods)
- Spotted microarrays: two channel cDNA microarrays
- DNA chips: one channel microarrays (Affymetrix, Agilent)

Microarray Experiment http://www.bio.davidson.edu/Courses/genomics/chip/chip.html

One channel DNA chips
- Each sequence is represented by a probe set, colored with one fluorescent dye
- The target hybridizes to complementary probes only
- The fluorescence intensity is indicative of the expression of the target sequence

Expression Data Format
Rows are genes / mRNAs, columns are experiments.

gene     cold     normal   hot
uch1     -2.0     0.0      0.924
gut2     0.398    0.402    -1.329
fip1     0.225    0.225    -2.151
msh1     0.676    0.685    -0.564
vma2     0.41     0.414    -1.285
meu26    0.353    0.286    -1.503
git8     0.47     0.47     -1.088
sec7b    0.39     0.395    -1.358
apn1     0.681    0.636    -0.555
wos2     0.902    0.904    -0.149
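The table above can be held in memory as one expression vector per gene. The following is a minimal sketch, assuming the data arrives as whitespace-separated text with a header row; the names `RAW_TABLE` and `parse_expression_table` are hypothetical and only the first three rows of the slide are copied in.

```python
# Hypothetical in-memory copy of the first rows of the table on the slide.
RAW_TABLE = """\
gene cold normal hot
uch1 -2.0 0.0 0.924
gut2 0.398 0.402 -1.329
fip1 0.225 0.225 -2.151
"""


def parse_expression_table(text):
    """Return (experiment_names, {gene: [value per experiment]})."""
    rows = [line.split() for line in text.strip().splitlines()]
    experiments = rows[0][1:]  # header row: skip the "gene" column label
    profiles = {}
    for fields in rows[1:]:
        gene = fields[0]
        values = [float(v) for v in fields[1:]]
        if len(values) != len(experiments):
            raise ValueError(f"row for {gene} has {len(values)} values")
        profiles[gene] = values
    return experiments, profiles


experiments, profiles = parse_expression_table(RAW_TABLE)
print(experiments)
print(profiles["uch1"])
```

Each value in `profiles` is the expression profile of one gene across the experiments, which is the vector form used when genes are compared with each other.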

RNA-seq

Gene Expression Analysis
Unsupervised methods
- Hierarchical clustering
- Partition methods (K-means)
Supervised methods
- Analysis of variance
- Discriminant analysis
- Support Vector Machine (SVM)

Clustering genes according to their expression profiles [Figure: expression matrix; axes: Experiments, Genes]

Clustering
Clustering organizes things that are close into groups.
- What does it mean for two genes to be close?
- Once we know this, how do we define groups?
Notice that we do this ourselves all the time: we divide people by race, animals into families, etc.

What does it mean for two genes to be close?
We need a mathematical definition of the distance between the expression of two genes, where each gene is a vector of N expression values:
$\text{Gene1} = (E_{11}, E_{12}, \ldots, E_{1N})'$
$\text{Gene2} = (E_{21}, E_{22}, \ldots, E_{2N})'$
For example, the Euclidean distance between gene 1 and gene 2:
$d(\text{Gene1}, \text{Gene2}) = \sqrt{\sum_{i=1}^{N} (E_{1i} - E_{2i})^2}$
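The Euclidean distance can be computed directly from two expression vectors. This is a small sketch; the function name `euclidean_distance` is an assumption, and the three profiles are the cold / normal / hot values of gut2, sec7b and uch1 from the expression table earlier in the slides.

```python
import math


def euclidean_distance(profile_a, profile_b):
    """Square root of the summed squared differences over all experiments."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same experiments")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Expression values (cold, normal, hot) taken from the table in the slides.
gut2 = [0.398, 0.402, -1.329]
sec7b = [0.39, 0.395, -1.358]
uch1 = [-2.0, 0.0, 0.924]

print(euclidean_distance(gut2, sec7b))  # small: the two profiles are nearly identical
print(euclidean_distance(gut2, uch1))   # large: the profiles move in opposite directions
```

Other distance definitions (for example, ones based on correlation) are also used for expression data; Euclidean distance is simply the example given on the slide.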

Once we know this, how do we define groups?
Hierarchical clustering (Michael Eisen, 1998): generate a tree based on similarity (similar to a phylogenetic tree).
- Each gene is a leaf on the tree
- Distances reflect similarity of expression
[Figure: gene cluster; axes: Genes, Experiments]
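The tree-building idea can be sketched as a simple agglomerative procedure: start with every gene as its own cluster and repeatedly merge the two closest clusters. The sketch below assumes Euclidean distance and average linkage, which the slides do not specify; the function names are hypothetical, and the profiles are four rows of the expression table from the slides.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the genes of two clusters."""
    total = sum(euclidean(profiles[g], profiles[h])
                for g in cluster_a for h in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def hierarchical_clustering(profiles):
    """Return the merge history as [(cluster_a, cluster_b, distance), ...]."""
    clusters = [(gene,) for gene in sorted(profiles)]  # every gene starts as a leaf
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest average distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        distance = average_linkage(a, b, profiles)
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, distance))
    return merges


# Expression values (cold, normal, hot) taken from the table in the slides.
profiles = {
    "uch1": [-2.0, 0.0, 0.924],
    "gut2": [0.398, 0.402, -1.329],
    "sec7b": [0.39, 0.395, -1.358],
    "wos2": [0.902, 0.904, -0.149],
}
merges = hierarchical_clustering(profiles)
for a, b, distance in merges:
    print(a, "+", b, "at distance", round(distance, 3))
```

The merge history is the tree: early merges join genes with very similar profiles, and later merges join increasingly different groups. This brute-force version recomputes all distances in every round, so it is only meant for a handful of genes.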

Internal nodes represent different functional groups of genes (A, B, C, D, E). One gene may belong to more than one cluster.

Clusters can be presented by graphs

What can we learn from clusters with similar gene expression?
Similar expression between genes can mean that:
- The genes have similar function
- One gene controls the other in a pathway
- All genes are controlled by a common regulatory gene
Clusters can help identify regulatory motifs: search for motifs in the upstream promoter regions of all the genes in a cluster.

EXAMPLE - hnRNP A1 and SRp40
Genes with similar expression patterns tend to have common functions. hnRNP A1 and SRp40 have a similar gene expression pattern in different tissues.

EXAMPLE - hnRNP A1 and SRp40
Genes with similar expression patterns tend to have common functions. [Figure: hnRNP A1 and SRp40]

Are they regulated by the same transcription factor?
1. Extract their promoter regions
2. Find a common motif in both sequences (MEME)
3. Identify the transcription factor related to the motif: http://jaspar.cgb.ki.se/
[Figure: promoters of the hnRNP A1 and SRp40 genes with the common motif marked]

Extract the promoters of the genes in the cluster and find a common motif (using MEME)
>GGATAACAATTTCACAAGTGTGTGAGCGGATAACAA
>AAGGTGTGAGTTAGCTCACTCCCCTGTGATCTCTGTACATAG
>ACGTGCGAGGATGAGAACACAATGTGTGTGCTCGGTTTAGTCACC
>TGTGACACAGTGCAAACGCGCCTGACGGAGTTCACA
>AATTGTGAGTGTCTATAATCACGATCGATTTGGAATATCCATCACA
>TGCAAAGGACGTCACGATTTGGGAGCTGGCGACCTGGGTCATG
>TGTGATGTGTATCGAACCGTGTATTTATTTGAACCACATCGCAGGTGAGAGCCATCACAG
>GAGTGTGTAAGCTGTGCCACGTTTATTCCATGTCACGAGTGT
>TGTTATACACATCACTAGTGAAACGTGCTCCCACTCGCATGTGATTCGATTCACA

Create a Multiple Sequence Alignment
GGATAACAATTTCACA
TGTGAGCGGATAACAA
TGTGAGTTAGCTCACT
TGTGATCTCTGTTACA
CGAGGATGAGAACACA
CTCGGTTTAGTTCACC
TGTGACACAGTGCAAA
CCTGACGGAGTTCACA
AGTGTCTATAATCACG
TGGAATATCCATCACA
TGCAAAGGACGTCACG
GGCGACCTGGGTCATG
TGTGATGTGTATCGAA
TTTGAACCACATCGCA
GGTGAGAGCCATCACA
TGTAAGCTGTGCCACG
TTTATTCCATGTCACG
TGTTATACACATCACT
CGTGCTCCCACTCGCA
TGTGATTCGATTCACA

Generate a PSSM (position-specific scoring matrix), then find the transcription factor which binds the motif
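One common way to turn aligned sites into a PSSM is to count the bases in each column and convert the frequencies into log-odds scores. The sketch below assumes a uniform background of 0.25 per base and a pseudocount of 1, neither of which is stated in the slides; `build_pssm` and `score_site` are hypothetical names, and the five sites are a subset of the aligned sequences from the multiple sequence alignment slide.

```python
import math


def build_pssm(aligned_sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds score} dict per alignment column."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    total = len(aligned_sites) + 4 * pseudocount
    pssm = []
    for column in zip(*aligned_sites):  # iterate over alignment columns
        scores = {}
        for base in "ACGT":
            frequency = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(frequency / background)
        pssm.append(scores)
    return pssm


def score_site(pssm, site):
    """Sum the column scores of a candidate site of the same length."""
    return sum(column[base] for column, base in zip(pssm, site))


# Subset of the aligned sites shown on the slide.
sites = [
    "TGTGAGTTAGCTCACT",
    "TGTGATCTCTGTTACA",
    "TGTGACACAGTGCAAA",
    "TGTGATGTGTATCGAA",
    "TGTGATTCGATTCACA",
]
pssm = build_pssm(sites)
print(round(score_site(pssm, "TGTGATCTCTGTTACA"), 2))  # a site used to build the matrix
print(round(score_site(pssm, "CCCCCCCCCCCCCCCC"), 2))  # an unrelated sequence
```

Sequences resembling the motif score higher than unrelated sequences. The resulting matrix is the kind of object that can be compared against known transcription factor binding profiles.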

How can we use microarrays for diagnostics?

Gene-Expression Profiles in Hereditary Breast Cancer
- cDNA microarrays: parallel gene expression analysis
- Breast tumors studied: BRCA1, BRCA2, sporadic tumors
- Log-ratio measurements of 3226 genes for each tumor after initial data filtering
RESEARCH QUESTION: Can we distinguish BRCA1 from BRCA2 cancers based solely on their gene expression profiles?

How can microarrays be used as a basis for diagnostics?
[Figure: +/- expression matrix of genes Gen1 to Gen5 for 5 breast cancer patients (patient 1 to patient 5)]

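How can microarrays be used as a basis for diagnostics?
[Figure: the +/- expression matrix reordered, with a BRCA1 label; patients in the order patient 1, patient 2, patient 4, patient 3, patient 5; genes in the order Gen1, Gen3, Gen4, Gen2, Gen5; informative genes marked]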

Specific Examples: Cancer Research
Hundreds of genes that differentiate between cancer tissues at different stages of the tumor were found. The arrow shows an example of tumor cells which were not detected correctly by histological or other clinical parameters. Ramaswamy et al., 2003, Nat Genet 33:49-54

Supervised approaches for predicting gene function based on microarray data
An SVM would begin with a set of genes that have a common function (red dots). In addition, a separate set of genes that are known not to be members of the functional class (blue dots) is specified.

Using this training set, an SVM would learn to differentiate between the members and non-members of a given functional class based on expression data. Having learned the expression features of the class, the SVM could recognize new genes (marked "?" on the slide) as members or as non-members of the class based on their expression data.

Using SVMs to diagnose tumors based on expression data
Each dot represents a vector of the expression pattern taken from a microarray experiment, for example the expression pattern of all genes from a cancer patient.

How do SVMs work with expression data?
In this example, red dots can be primary tumors and blue dots can be tumors at the metastasis stage. The SVM is trained on data that was classified based on histology. After training the SVM, we can use it to diagnose the unknown tumor (marked "?" on the slide).
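The train-then-diagnose workflow can be illustrated with a very small linear SVM. This is a sketch under stated assumptions, not the method used in the cited studies: it minimizes a regularized hinge loss with plain sub-gradient descent, the function names are hypothetical, and the two-gene expression values are made-up toy data (label +1 standing for primary tumors, -1 for metastases).

```python
def train_linear_svm(samples, labels, learning_rate=0.01, reg=0.01, epochs=1000):
    """Fit weights w and bias b by sub-gradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin: pull the boundary.
                w = [wi - learning_rate * (reg * wi - y * xi)
                     for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Sample is safely classified: apply only the regularization step.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: expression of two genes per tumor sample.
training_samples = [
    [2.0, 1.5], [1.8, 2.2], [2.5, 1.9],        # labeled +1 (primary tumors)
    [-1.5, -2.0], [-2.2, -1.7], [-1.9, -2.4],  # labeled -1 (metastases)
]
training_labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(training_samples, training_labels)
unknown_tumor = [2.1, 1.7]  # hypothetical unclassified sample
print(predict(w, b, unknown_tumor))
```

Real expression data has thousands of genes per sample and far fewer samples, so in practice an established SVM library with cross-validation would be the more appropriate choice; the toy version only shows how labeled training vectors lead to a decision rule for an unlabeled one.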

Gene Expression Databases and Resources on the Web
- GEO (Gene Expression Omnibus): http://www.ncbi.nlm.nih.gov/geo/
- List of gene expression web resources: http://industry.ebi.ac.uk/~alan/MicroArray/
- Another list with literature references: http://www.gene-chips.com/
- Cancer Genome Anatomy Project: http://cgap.nci.nih.gov/
- Stanford Microarray Database: http://genome-www.stanford.edu/microarray/