Published by Gervase Floyd. Modified over 9 years ago.
1
GxDb: a universal tool to collect, analyse, manage and visualize transcriptomic data
Wolfgang Raffelsberger, Raymond Ripp and Laetitia Poidevin
BingGi Days, January 2010
2
Introduction
What is transcriptomics?
-> a high-throughput analysis of gene expression by measuring the amount of mRNA
What are the techniques?
-> DNA microarrays
-> SAGE
-> Differential Display
-> …
=> large quantities of data
GxDb: an integrative tool to collect, treat, analyze, manage and visualize these data
3
GxDb is a website and a database
4
Organization of data in GxDb
Sample: Individual(s) (name, age, description), Organism, Genotype, Tissue, Treatment, SampleCondition
ex: mouse wt aged 9 days
Arraytype
ex: Mouse430_2
5
Organization of data in GxDb
Arraytype (ex: Mouse430_2)
RealExp 1: Arraytype + Sample 1 (ex: wt_d9), CEL files r1, r2, r3
RealExp 2: Arraytype + Sample 2 (ex: wt_d11), CEL files r3, r4, r5
RealExp 3: Arraytype + Sample 3 (ex: wt_d13), CEL files r6, r7, r8
RealExp 4: Arraytype + Sample 4 (ex: wt_d15), CEL files r9, r10, r11
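The entities on these two slides can be pictured as a small set of record types. The sketch below is only an illustration of that organization, not the actual GxDb schema: the class and field names are assumptions taken from the slide labels, the SampleCondition entity is not modeled, and the individual, tissue and treatment values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Individual:
    # Fields listed on the slide: name, age, description.
    name: str
    age: str
    description: str = ""


@dataclass
class Sample:
    # A sample is described by its biological context and its individuals.
    name: str
    organism: str
    genotype: str
    tissue: str
    treatment: str
    individuals: List[Individual] = field(default_factory=list)


@dataclass
class RealExp:
    # A RealExp ties one Arraytype to one Sample and its replicate CEL files.
    arraytype: str
    sample: Sample
    cel_files: List[str] = field(default_factory=list)


# Only "Mouse430_2", "wt_d9" and the r1..r3 labels come from the slides;
# the individual, tissue and treatment values are hypothetical placeholders.
sample = Sample(
    name="wt_d9", organism="mouse", genotype="wt",
    tissue="unspecified", treatment="none",
    individuals=[Individual(name="individual_1", age="9 days")],
)
real_exp = RealExp(arraytype="Mouse430_2", sample=sample,
                   cel_files=["r1.CEL", "r2.CEL", "r3.CEL"])
print(real_exp.sample.name, len(real_exp.cel_files))
```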
6
Organization of data in GxDb
Experiment:
RealExp 1: Arraytype + Sample 1, CEL files r1, r2, r3
RealExp 2: Arraytype + Sample 2, CEL files r3, r4, r5
RealExp 3: Arraytype + Sample 3, CEL files r6, r7, r8
RealExp 4: Arraytype + Sample 4, CEL files r9, r10, r11
Treatment and Analysis protocol
Experiment => Signal Intensity, Ratio, Cluster, differentially expressed genes, Quality
7
Treatment and Analysis protocol
1) Normalization
   6 methods: RMA, gcRMA, dChip, MAS5.0, plier, vsn => signal intensity
2) Calculate average (between replicates) and ratio
3) Filtering
   - Eliminate probesets that are not expressed in any array of one experiment
     based on distribution or call (according to normalization method)
   - Eliminate probesets with very low changes between condition and reference
     based on fold change
     based on standard deviation
4) Statistical analysis
   - method: t-test combined with empirical Bayes for shrinkage
   - estimation of FDR (false discovery rate)
   - tag probesets with differential expression (automatic threshold finding)
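Steps 2 and 3 of the protocol can be sketched in a few lines. This is a minimal illustration, not the GxDb pipeline itself: it assumes normalized intensities on a linear (not log) scale, the probeset identifiers and values are made up, and the fold-change threshold of 2 is an arbitrary assumption.

```python
from statistics import mean


def average_replicates(signals):
    """Average replicate intensities: {probeset: [values]} -> {probeset: mean}."""
    return {probeset: mean(values) for probeset, values in signals.items()}


def compute_ratios(condition, reference):
    """Ratio = condition / reference, for probesets with a positive reference."""
    return {
        probeset: condition[probeset] / reference[probeset]
        for probeset in condition
        if probeset in reference and reference[probeset] > 0
    }


def filter_by_fold_change(ratios, min_fold=2.0):
    """Keep probesets whose change, up or down, reaches min_fold."""
    kept = {}
    for probeset, ratio in ratios.items():
        if ratio <= 0:
            continue  # no meaningful fold change for a non-positive ratio
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold >= min_fold:
            kept[probeset] = ratio
    return kept


# Hypothetical intensities, three replicates per condition.
condition = average_replicates({
    "ps_a": [200, 220, 210], "ps_b": [50, 50, 50], "ps_c": [30, 30, 30],
})
reference = average_replicates({
    "ps_a": [100, 110, 105], "ps_b": [48, 52, 50], "ps_c": [120, 120, 120],
})
kept = filter_by_fold_change(compute_ratios(condition, reference))
print(sorted(kept))  # ps_b is dropped: its ratio is 1.0
```

Treating up- and down-regulation symmetrically (a ratio of 0.25 counts as a 4-fold change) is a design choice of this sketch; the slides only state that filtering is based on fold change and standard deviation.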
8
1) Normalization
2) Calculate average (replicates) and ratio
3) Filtering
4) Statistical analysis
5) Clustering
   tool: Cluspack
   methods: k-means (DPC), mixture models (AIC and BIC) => clusters
6) Quality Control Report
   tool: RReportGenerator for Automatic Statistical Analysis
   automatic statistical analysis to estimate the quality of arrays
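To give an idea of what the clustering step produces, here is a plain k-means sketch over expression profiles. It is a stand-in only: it does not use Cluspack, it does not implement the DPC variant or the mixture models named on the slide, the initialization (first k profiles) is deliberately naive, and the profiles are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Basic Lloyd's k-means; returns (labels, centroids)."""
    centroids = [list(p) for p in points[:k]]  # naive, deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical ratio profiles over three time points.
profiles = [
    [1.0, 2.0, 4.0],   # rising
    [0.9, 0.5, 0.2],   # falling
    [1.1, 2.2, 3.8],   # rising
    [1.0, 0.6, 0.3],   # falling
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```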
9
Upload form
10
Step 1: Selection of Arraytype and Experiment
11
Upload form Step 1: Create your new experiment
12
Upload form Step 1: Create your new samples
Organism, Genotype, SampleCondition, Individual, TreatmentType, Treatment, Tissue, Sample
13
Upload form Step 1: Selection of Arraytype and Experiment
14
Upload form Step 2: Upload of .cel files
15
Upload form Step 3: Select the sample corresponding to each .cel file
16
Upload form Step 4: Select the comparisons of interest for the ratio calculation
Ratio: Condition / reference
Example: C3H_rd1_d10 / C3H_wt_d10
17
Upload form Step 5: Launch Treatment and Analysis protocol
18
Upload form Step 5: Clustering, Quality analysis and loading in database
19
Organization of data in GxDb
Experiment, RealExp, Sample, CEL file, Arraytype-Probeset
Signal Intensity, Ratio, differentially expressed genes, Clustering, Quality
20
Query GxDb
21
Experiment Probeset Sample RealExp Signal Intensity Ratio Cluster
22
Visualization in GxDb
time-course of retinal development
23
GxDb resources
GxDb Website (http://gx.igbmc.fr): Upload, Querying, Display
GxDb SQL database
/GxData
alnitak, Star3, Star4, Star5, Star6, Star7, Star8
Web Services, Café des sciences, QSub scheduler
Languages used:
PHP (HTML)
- Upload
- PipeWork
- RadarGenerator
- Fed
R
- Treatment and analysis protocol
- RReportGenerator
SQL
Tcl
- Gx (~ Gscope)
- Probeset loading
C
- Cluspack
24
Conclusion and Prospects
Automated raw-data upload, storage, treatment and analysis
- multiple treatment protocols
- multiple clustering methods
- multiple human and automatic expert analyses
=> Comparisons
=> Analyse the strengths and weaknesses of the different protocols
Improvement of the website
- More user-friendly
- Visualization of clusters and ratios
- Tools for meta-analysis
- Possibility to upload data directly from GEO
- Diagnostic report to make data analysis easier
- Links to other databases and tools: STRING, GSEA...
26
Ratio Pipework Organism Normalization Ratio minimum Ratio maximum
27
Advantages of GxDb
Integration and storage in a unifying format
Automated raw-data upload, storage, treatment and analysis
- multiple treatment protocols
- multiple clustering methods
- multiple human and automatic expert analyses
=> Comparisons
=> Analyse the strengths and weaknesses of the different protocols
Facilitated querying and data visualization
29
GxDb transcriptomics
RealExp 1: Arraytype + Sample, CEL files r1, r2, r3
RealExp 2: Arraytype + Sample 2, CEL files r3, r4, r5
RealExp 3: Arraytype + Sample 3, CEL files r6, r7, r8
RealExp 4: Arraytype + Sample 4, CEL files r9, r10, r11
30
Experiment
RealExp 1: Arraytype + Sample, CEL files r1, r2, r3
RealExp 2: Arraytype + Sample, CEL files r1, r2, r3
RealExp 3: Arraytype + Sample, CEL files r1, r2, r3
RealExp 4: Arraytype + Sample 4, CEL files r9, r10, r11
Arraytype: PROBESET, PROBESET 2, PROBESET 3, ... 45000
PROBESET fields: probeset_id, genename, genedescription, species, speciessymbol, representpublicid, refseqtranscriptid, gscope_id, swissprot, unigene_id, entrezgene, ensembl, mgi, cytoband, chromoloc, omim, tissuespecificity, linkeddiseases, go_biologicalprocess, go_cellularcomponent, go_molecularfunction, pathway, interpro, transmembrane
Sample: Individual(s) (name, age, description), Organism, Genotype, Tissue, Treatment, SampleCondition
Signal Intensity, Ratio, Cluster
31
GxDb protocol from upload to display
1) Arraytypes: already exists? If not, create new Arraytype
2) Sample: already exists? If not, create new Sample with
   existing or new Individual
   existing or new Organism
   existing or new Tissues
   existing or new Genotype
   existing or new Treatment
3) Upload your .CEL files
4) Enter their association to Arraytypes and Samples
5) Define couples of RealExps for the ratio calculation
6) Fill in the other information for the Experiment
7) Run Automatic Analysis
8) Query and Display Results: Quality Report, Signal Intensity, Ratio, Cluster, Differentially Expressed Genes
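The "already exists? / create new" decisions in this flow follow a get-or-create pattern. The sketch below illustrates that pattern with an in-memory registry; the `Registry` class and its method are hypothetical names for illustration and say nothing about how the GxDb upload form or its SQL database actually implement the check.

```python
class Registry:
    """In-memory stand-in for the lookup tables behind the upload form."""

    def __init__(self):
        self._items = {}

    def get_or_create(self, kind, name, **attributes):
        """Return (record, created); reuse the record if it already exists."""
        key = (kind, name)
        created = key not in self._items
        if created:
            self._items[key] = {"kind": kind, "name": name, **attributes}
        return self._items[key], created


registry = Registry()

# First upload: the Arraytype and the Sample are not known yet.
arraytype, arraytype_created = registry.get_or_create("Arraytype", "Mouse430_2")
sample, sample_created = registry.get_or_create(
    "Sample", "wt_d9", organism="mouse", genotype="wt"
)

# Second upload on the same Arraytype: the existing record is reused.
same_arraytype, created_again = registry.get_or_create("Arraytype", "Mouse430_2")
print(arraytype_created, sample_created, created_again)
```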