Project of CZ5225 Zhang Jingxian:


1 Project of CZ5225 Zhang Jingxian: g0800791@nus.edu.sg

2 Identifying biomarkers of drug response for cancer patients
Aims:
- To develop predictors of response to drugs
- To learn how to get public microarray data
- To learn how to preprocess raw microarray data
- To annotate the genes of interest

3 Requirements
Each group investigates:
- ONE kind of cancer patient drug response
- Two datasets from different studies are needed
- Download the raw data
- Use Bioconductor in R to preprocess the raw data
- Identify a certain number of genes
- Annotate the identified genes in your report
- Each group needs only ONE report

4 Requirements
- Any Affymetrix expression dataset related to drug response of cancer patients may be used
- Each dataset needs to contain at least 20 samples
- Each dataset needs two comparable outcome groups: response vs. non-response, resistance vs. non-resistance, etc.

5 Bioconductor & R
http://www.bioconductor.org

6 Advantages
- Cross-platform: Linux, Windows and Mac OS
- Comprehensive and centralized: analyzes both Affymetrix and two-color spotted microarrays, and covers various stages of data analysis in a single environment
- Cutting-edge analysis methods: new methods/functions can easily be incorporated and implemented
- Quality check of data analysis methods: algorithms and methods have undergone evaluation by statisticians and computer scientists before launch, and in many cases there are also literature references
- Good documentation: comprehensive manuals, documentation, course materials, course notes and a discussion group are available
- A good chance to learn statistics and programming

7 Installation of R & Bioconductor
- Install R from: http://cran.stat.nus.edu.sg/
- Open R, then execute:
  > source("http://bioconductor.org/biocLite.R")
  > biocLite()
- Check the installed libraries by executing:
  > library()

8 Case study
- Dataset source (GSE19697): http://www.ncbi.nlm.nih.gov/geo

9 Extract the raw data into: D://gse19697
- Create title.txt:

10 Open R
- Set the working directory by executing:
  > setwd('d://gse19697')
- Load the simpleaffy package by executing:
  > library(simpleaffy)
- Load the data by:
  > eset <- read.affy('title.txt')

11 Calculate expression by:
  > eset.rma <- call.exprs(eset, 'rma')
Compare two groups by:
  > pc.result <- pairwise.comparison(eset.rma, "title", c("pCR", "RD"), eset)

12 Filter significantly changed markers between the two groups by:
  > significant <- pairwise.filter(pc.result, fc=log2(1.5), tt=0.001)
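The filter on this slide keeps probes that pass both a fold-change cutoff (fc) and a t-test p-value cutoff (tt). The following is a minimal, language-independent sketch of that idea in plain Python, not simpleaffy's implementation. Assumptions: values are already on the log2 scale, the test is a pooled-variance two-sample t-test (the slides do not say which variant simpleaffy applies), and the probe names and expression values are made up for illustration.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t, by Simpson integration of the density on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    # Normalising constant of the t density, via lgamma to avoid overflow.
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def compare_groups(a, b):
    """Return (log2 fold change, p-value) for two lists of log2 expression values."""
    na, nb = len(a), len(b)
    diff = mean(a) - mean(b)  # difference of log2 means = log2 fold change
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    if se == 0:
        return diff, (1.0 if diff == 0 else 0.0)
    return diff, t_two_sided_p(diff / se, na + nb - 2)


def filter_markers(data, min_log2_fc, max_p):
    """Keep probes whose |log2 fold change| and p-value both pass the cutoffs."""
    selected = []
    for probe, (group_a, group_b) in data.items():
        fc, p = compare_groups(group_a, group_b)
        if abs(fc) >= min_log2_fc and p < max_p:
            selected.append(probe)
    return selected


# Hypothetical log2 expression values: (pCR samples, RD samples) per probe.
example = {
    "probe_A": ([8.0, 8.1, 7.9, 8.2, 8.0, 7.9], [6.0, 6.1, 5.9, 6.2, 6.0, 5.9]),
    "probe_B": ([7.0, 7.4, 6.8, 7.2, 7.1, 6.9], [7.1, 6.9, 7.3, 7.0, 6.8, 7.2]),
    "probe_C": ([9.0, 9.1, 8.9, 9.0, 9.1, 8.9], [8.8, 8.9, 8.7, 8.8, 8.9, 8.7]),
}
selected = filter_markers(example, min_log2_fc=math.log2(1.5), max_p=0.001)
print(selected)
```

Only the probe with a large and consistent difference passes. A probe with a small but consistent shift fails the fold-change cutoff, which is why the two criteria are combined.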

13 Plot significantly changed markers:
  > plot(significant)
Annotate selected markers:
  > significant

14

15 Annotate selected markers:

16 Heatmap:

17
  > significant <- pairwise.filter(pc.result, fc=log2(1), tt=0.001)
  > pid <- rownames(significant@means)
  > eset.hm <- eset.rma[pid,]
  > install.packages("RColorBrewer")
  > library(RColorBrewer)
  > hmcol <- colorRampPalette(brewer.pal(10, "RdBu"))(256)
  > spcol <- ifelse(eset.hm$title == "pCR", "goldenrod", "skyblue")
  > heatmap(exprs(eset.hm), col = hmcol, ColSideColors = spcol)

18 Assignment 2
- Genetics of gene expression (eQTL)
- Aim: to identify potential genetic variants that cause differential expression
- Deadline of report: two weeks before the final examination

19 Genetics of gene expression

20 SNP

21

22

23

24 expression Quantitative Trait Locus (eQTL)
- eQTL mapping tries to find genomic variation that explains expression traits.
- One difference between eQTL mapping and traditional QTL mapping is that a traditional mapping study focuses on one or a few traits, while most eQTL studies analyze thousands of expression traits and declare thousands of QTLs.
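To make "genomic variation that explains an expression trait" concrete, here is a minimal single-SNP association sketch in plain Python. It is not the GGtools method. Assumptions: genotypes are coded additively as 0, 1 or 2 copies of one allele, the test is a score-type statistic n·r² referred to a chi-square distribution with 1 degree of freedom, and all genotype and expression values are made up for illustration.

```python
import math
from statistics import mean


def snp_expression_test(genotypes, expression):
    """Return (slope, chi2, p) for an additive single-SNP association test."""
    n = len(genotypes)
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((e - e_bar) ** 2 for e in expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    if sxx == 0 or syy == 0:
        # Monomorphic SNP or constant trait: nothing to test.
        return 0.0, 0.0, 1.0
    slope = sxy / sxx                    # expression change per allele copy
    r2 = sxy * sxy / (sxx * syy)         # squared correlation
    chi2 = n * r2                        # score-type statistic, 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))   # exact upper tail of chi-square with 1 df
    return slope, chi2, p


# Hypothetical data for 12 samples: one expression trait, two SNPs.
expression = [5.1, 4.9, 5.0, 5.2, 6.0, 6.1, 5.9, 6.2, 7.0, 7.1, 6.9, 7.2]
snp_linked = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]    # tracks the expression level
snp_unlinked = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]  # unrelated to expression

for name, snp in [("snp_linked", snp_linked), ("snp_unlinked", snp_unlinked)]:
    slope, chi2, p = snp_expression_test(snp, expression)
    print(f"{name}: slope={slope:.2f} chi2={chi2:.2f} p={p:.2g}")
```

An eQTL study repeats a test of this kind for every SNP against every expression trait, which is why thousands of traits lead to thousands of declared QTLs.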

25 GGdata: all 90 HapMap CEU samples, 47K expression, 4mm SNP

26 Chromosome 17

27
  > biocLite("GGtools")
  > biocLite("GGdata")
  > library(GGtools)
  > library(GGdata)
  > c17 = getSS("GGdata", "17")
  > # get("CSDA", revmap(illuminaHumanv1SYMBOL))
  > t1 = gwSnpTests(genesym("CSDA") ~ male, c17, chrnum("17"))
  > # t1 = gwSnpTests(probeId("GI_21359983-S") ~ male, c17, chrnum("17"))
  > topSnps(t1)
  > plot_EvG(genesym("CSDA"), rsid("rs7212116"), c17)
  > # c_full = getSS("GGdata", as.character(1:22))

28 Requirements for assignment 2
- Identify the genetic cause (eQTL) of the genes selected in assignment 1
- Get SNPs with significant association (<10e-4) from each chromosome
- Paste the plot image for each association
- Annotate the SNPs in dbSNP
- Submit one report for each group

