GxDb a universal tool to collect, analyse, manage and visualize transcriptomic data Wolfgang Raffelsberger, Raymond Ripp and Laetitia Poidevin BingGi Days.


1 GxDb: a universal tool to collect, analyse, manage and visualize transcriptomic data. Wolfgang Raffelsberger, Raymond Ripp and Laetitia Poidevin. BingGi Days, January 2010

2 Introduction
What is transcriptomics? -> a high-throughput analysis of gene expression by measuring the amount of mRNA
What are the techniques? -> DNA microarrays -> SAGE -> Differential Display -> ... => large quantities of data
GxDb: an integrative tool to collect, treat, analyze, manage and visualize these data

3 GxDb is a website and a database

4 Organization of data in GxDb
Sample (ex: mouse wt aged 9 days): Individual (name, age, description), Organism, Genotype, Tissue, Treatment, SampleCondition
Arraytype (ex: Mouse430_2)

5 Organization of data in GxDb
Arraytype (ex: Mouse430_2)
RealExp: Arraytype + Sample (ex: wt_d9); CEL files r1, r2, r3
RealExp 2: Arraytype + Sample 2 (ex: wt_d11); CEL files r3, r4, r5
RealExp 3: Arraytype + Sample 3 (ex: wt_d13); CEL files r6, r7, r8
RealExp 4: Arraytype + Sample 4 (ex: wt_d15); CEL files r9, r10, r11

6 Organization of data in GxDb
Experiment:
- RealExp: Arraytype + Sample; CEL files r1, r2, r3
- RealExp 2: Arraytype + Sample 2; CEL files r3, r4, r5
- RealExp 3: Arraytype + Sample 3; CEL files r6, r7, r8
- RealExp 4: Arraytype + Sample 4; CEL files r9, r10, r11
Treatment and Analysis protocol => Signal Intensity, Ratio, Cluster, differentially (≠) expressed genes, Quality
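The hierarchy on this slide (Experiment > RealExp > Arraytype + Sample > replicate CEL files) can be sketched as a few plain data classes. This is only an illustration, not the GxDb schema: the class and field names, the experiment name and the CEL file names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sample:
    # Field names are illustrative; the slides only list the concepts.
    name: str                      # e.g. "wt_d9"
    organism: str = ""
    genotype: str = ""
    tissue: str = ""
    treatment: str = ""


@dataclass
class RealExp:
    # One RealExp pairs an Arraytype with a Sample and holds its replicates.
    arraytype: str                 # e.g. "Mouse430_2"
    sample: Sample
    cel_files: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    real_exps: List[RealExp] = field(default_factory=list)

    def cel_file_count(self) -> int:
        """Total number of replicate CEL files across all RealExps."""
        return sum(len(r.cel_files) for r in self.real_exps)


def build_example() -> Experiment:
    """Two RealExps with three replicates each (file names are hypothetical)."""
    exp = Experiment(name="hypothetical_time_course")
    for sample_name in ("wt_d9", "wt_d11"):
        files = [f"{sample_name}_r{i}.CEL" for i in (1, 2, 3)]
        exp.real_exps.append(
            RealExp("Mouse430_2", Sample(sample_name, organism="mouse", genotype="wt"), files)
        )
    return exp
```

Keeping the replicates inside the RealExp makes the later "average between replicates" step a simple per-RealExp operation.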

7 Treatment and Analysis protocol
1) Normalization
- 6 methods: RMA, gcRMA, dChip, MAS5.0, plier, vsn => signal intensity
2) Calculate average (between replicates) and ratio
3) Filtering
- Eliminate probesets that are not expressed in any array of one experiment, based on distribution or call (according to normalization method)
- Eliminate probesets with very low changes between condition and reference, based on fold change; based on standard deviation
4) Statistical analysis
- method: t-test combined with empirical Bayes for shrinkage
- estimation of FDR (false discovery rate)
- tag probesets with differential expression (automatic threshold finding)
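Steps 2 to 4 can be illustrated with a small sketch. It assumes linear-scale (not log-transformed) signal intensities, a fixed two-fold cut-off, and the Benjamini-Hochberg procedure for the FDR; the slides do not say which FDR estimator or thresholds GxDb uses, and the empirical Bayes t-test itself is not reproduced here. All function names are hypothetical.

```python
import math
from statistics import mean


def average_replicates(signals):
    """signals: {probeset_id: [replicate intensities]} -> {probeset_id: mean}."""
    return {probeset: mean(values) for probeset, values in signals.items()}


def log2_ratios(condition, reference):
    """log2(condition / reference) for probesets present and positive in both."""
    return {
        p: math.log2(condition[p] / reference[p])
        for p in condition
        if p in reference and condition[p] > 0 and reference[p] > 0
    }


def filter_by_fold_change(ratios, min_fold=2.0):
    """Keep probesets whose absolute fold change reaches min_fold (assumed cut-off)."""
    threshold = math.log2(min_fold)
    return {p: r for p, r in ratios.items() if abs(r) >= threshold}


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted
```

In practice the p-values passed to `bh_adjust` would come from the moderated t-test of step 4; probesets whose adjusted value falls below the chosen FDR level would then be tagged as differentially expressed.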

8 1) Normalization
2) Calculate average (replicates) and ratio
3) Filtering
4) Statistical analysis
5) Clustering
- tool: Cluspack
- methods: k-means (DPC), mixture models (AIC and BIC) => clusters
6) Quality Control Report
- tool: RReportGenerator for Automatic Statistical Analysis
- to estimate the quality of arrays
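To make the clustering step concrete, here is a minimal sketch of plain k-means (Lloyd's algorithm) on expression profiles. It is not Cluspack and does not implement DPC or the mixture models named above: the number of clusters is fixed by the caller and the first `k` profiles are used as starting centroids, both of which are simplifying assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric profiles.

    Returns (assignment, centroids). The first k points seed the centroids,
    which keeps the example deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assignment[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids
```

With time-course profiles as input, each resulting cluster groups probesets whose expression rises or falls in a similar way across the time points.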

9 Upload form

10 Step 1: Selection of Arraytype and Experiment

11 Upload form Step 1 Create your new experiment

12 Upload form Step 1 Create your new samples: Organism, Genotype, SampleCondition, Individual, TreatmentType, Treatment, Tissue, Sample

13 Upload form Step 1: Selection of Arraytype and Experiment

14 Upload form Step 2: Upload of .cel files

15 Upload form Step 3: Select the sample corresponding to each .cel file

16 Upload form Step 4: Select the comparisons of interest to calculate the ratio
Ratio: condition / reference
Example: C3H_rd1_d10 / C3H_wt_d10
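A comparison written as "condition / reference" can be turned into per-probeset ratios roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the averaged signals are assumed to be on a linear scale, and the probeset identifiers and intensities in the usage note are made up for illustration.

```python
def parse_comparison(text):
    """Split 'condition / reference' into its two sample names."""
    condition, separator, reference = text.partition("/")
    if not separator:
        raise ValueError(f"expected 'condition / reference', got {text!r}")
    return condition.strip(), reference.strip()


def ratios_for(comparison, mean_signal):
    """mean_signal: {sample_name: {probeset_id: averaged intensity}}.

    Returns {probeset_id: condition / reference}, skipping probesets that are
    missing from the reference or have a zero reference signal.
    """
    condition, reference = parse_comparison(comparison)
    cond, ref = mean_signal[condition], mean_signal[reference]
    return {p: cond[p] / ref[p] for p in cond if p in ref and ref[p] != 0}
```

For example, `ratios_for("C3H_rd1_d10 / C3H_wt_d10", {"C3H_rd1_d10": {"ps_1": 300.0}, "C3H_wt_d10": {"ps_1": 100.0}})` gives a ratio of 3.0 for the made-up probeset `ps_1`.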

17 Upload form Step 5: Launch Treatment and Analysis protocol

18 Upload form Step 5: Clustering, Quality analysis and loading in database

19 Organization of data in GxDb: Arraytype-Probeset, Experiment, Sample, Cel file, RealExp, Signal Intensity, Ratio, differentially (≠) expressed genes, Clustering, Quality

20 Query GxDb

21 Experiment, Probeset, Sample, RealExp, Signal Intensity, Ratio, Cluster

22 Visualization in GxDb: time-course of retinal development

23 GxDb resources
GxDb Website (http://gx.igbmc.fr): Upload, Querying, Display
alnitak; Star3, Star4, Star5, Star6, Star7, Star8; /GxData; GxDb SQL database
Web Services; Café des sciences; QSub scheduler
Languages used:
- PHP (HTML): Upload, PipeWork, RadarGenerator, Fed
- R: Treatment and analysis protocol, RReportGenerator
- SQL
- Tcl: Gx (~ Gscope), Probeset loading
- C: Cluspack

24 Conclusion and Prospects
Automated raw-data upload, storage, treatment and analysis
- multiple treatment protocols
- multiple clustering methods
- multiple human and automatic expert analyses
=> Comparisons
=> Analyse the strengths and weaknesses of the different protocols
Improvement of website
- More user friendly
- Visualization of clusters, ratio
- Tools for meta-analysis
- Possibility to upload data directly from GEO
- Diagnostic report to make the data easier to analyze
- Links to other databases and tools: STRING, GSEA...

25

26 Ratio Pipework: Organism, Normalization, Ratio minimum, Ratio maximum

27 Advantages of GxDb
Integration and storage in a unifying format
Automated raw-data upload, storage, treatment and analysis
- multiple treatment protocols
- multiple clustering methods
- multiple human and automatic expert analyses
=> Comparisons
=> Analyse the strengths and weaknesses of the different protocols
Facilitated querying and data visualization

28

29 GxDb transcriptomics
Arraytype
- RealExp: Arraytype + Sample; CEL files r1, r2, r3
- RealExp 2: Arraytype + Sample 2; CEL files r3, r4, r5
- RealExp 3: Arraytype + Sample 3; CEL files r6, r7, r8
- RealExp 4: Arraytype + Sample 4; CEL files r9, r10, r11

30 Experiment
- RealExp 1: Arraytype + Sample; CEL files r1, r2, r3
- RealExp 2: Arraytype + Sample; CEL files r1, r2, r3
- RealExp 3: Arraytype + Sample; CEL files r1, r2, r3
- RealExp 4: Arraytype + Sample 4; CEL files r9, r10, r11
Arraytype: PROBESET, PROBESET 2, PROBESET 3 ... 45000, each with the fields probeset_id, genename, genedescription, species, speciessymbol, representpublicid, refseqtranscriptid, gscope_id, swissprot, unigene_id, entrezgene, ensembl, mgi, cytoband, chromoloc, omim, tissuespecificity, linkeddiseases, go_biologicalprocess, go_cellularcomponent, go_molecularfunction, pathway, interpro, transmembrane
Sample: Individual (name, age, description), Organism, Genotype, Tissue, Treatment, SampleCondition
Signal Intensity, Ratio, Cluster
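A reduced version of this layout can be tried out with an in-memory SQLite database. This is a sketch only: the `probeset` columns are a subset of the field names on the slide, while the `ratio` table, its columns, the query function and all inserted rows are assumptions made for the example, not the actual GxDb schema or data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a probeset and a ratio table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE probeset (
            probeset_id TEXT PRIMARY KEY,
            genename TEXT,
            genedescription TEXT,
            species TEXT
        );
        -- Hypothetical table: one ratio per probeset and comparison.
        CREATE TABLE ratio (
            probeset_id TEXT REFERENCES probeset(probeset_id),
            comparison TEXT,
            value REAL
        );
        """
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO probeset VALUES (?, ?, ?, ?)",
        [("ps_1", "geneA", "hypothetical gene A", "mouse"),
         ("ps_2", "geneB", "hypothetical gene B", "mouse")],
    )
    comparison = "C3H_rd1_d10 / C3H_wt_d10"
    conn.executemany(
        "INSERT INTO ratio VALUES (?, ?, ?)",
        [("ps_1", comparison, 3.5), ("ps_2", comparison, 1.1)],
    )
    conn.commit()
    return conn


def genes_with_ratio_between(conn, comparison, ratio_min, ratio_max):
    """Return (genename, ratio) rows whose ratio lies in [ratio_min, ratio_max]."""
    return conn.execute(
        """
        SELECT p.genename, r.value
        FROM ratio AS r
        JOIN probeset AS p ON p.probeset_id = r.probeset_id
        WHERE r.comparison = ? AND r.value BETWEEN ? AND ?
        ORDER BY r.value DESC
        """,
        (comparison, ratio_min, ratio_max),
    ).fetchall()
```

Joining the annotation table to the measured values is what lets a query by ratio range return gene names rather than bare probeset identifiers.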

31 GxDb protocol from upload to display
Arraytypes: already exists? If not, create new Arraytype
Sample: already exists? If not, create new Sample with existing or new Individual, existing or new Organism, existing or new Tissues, existing or new Genotype, existing or new Treatment
Upload your .CEL files
Enter their association to Arraytypes and Samples
Define Couples of RealExps for the Ratio Calculation
Fill in the other information for the Experiment
Run Automatic Analysis
Query and Display Results: Quality Report, Signal Intensity, Ratio, Cluster, Differentially Expressed Genes
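The "already exists? / create new" decisions at the start of this protocol follow a get-or-create pattern, sketched below with a plain dictionary standing in for the database. The function name, the registry layout and the attribute names are hypothetical.

```python
def get_or_create(registry, kind, name, **attributes):
    """Return (record, created) for the entry `name` of the given kind.

    registry: {kind: {name: attribute dict}}. An existing record is returned
    unchanged; otherwise a new one is stored with the given attributes.
    """
    table = registry.setdefault(kind, {})
    created = name not in table
    if created:
        table[name] = dict(attributes)
    return table[name], created


def register_sample(registry, name, organism, genotype):
    """Create a Sample, reusing or creating the Organism and Genotype it needs."""
    get_or_create(registry, "Organism", organism)
    get_or_create(registry, "Genotype", genotype)
    return get_or_create(registry, "Sample", name, organism=organism, genotype=genotype)
```

Registering two samples of the same organism creates the Organism entry only once, which mirrors the "existing or new" wording of the upload form.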

