1
Genomic Arrays: Tools for cancer gene discovery
Ian Roberts
MRC Cancer Cell Unit, Hutchison MRC Research Centre
ir210@cam.ac.uk
2
What’s a genomic array?
- A platform of regularly spaced genomic sequences
  - All known genes or a subset of genes of interest
- A tool for querying the genome about damage
  - Genomic gains (oncogenes)
  - Genomic losses (tumour suppressor genes)
- Applications
  - Research: disease gene discovery
  - Clinical: diagnostic tests
3
Comparative genomic hybridisation
- Tumour DNA (test) + normal DNA (reference) are hybridised together to the available probe on the array platform
- GAIN: more test probe than reference probe (oncogene)
- LOSS: reference probe in excess of test (tumour suppressor)
- The vast majority of the genome is normal
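The gain/loss rule on this slide can be sketched as a comparison of the log2(test/reference) ratio against a threshold. This is only an illustration: the function name `classify_ratio` and the 0.3 default threshold are assumptions, not values from the presentation.

```bash
#!/usr/bin/env bash
# classify_ratio TEST_SIGNAL REF_SIGNAL [THRESHOLD]
# Prints GAIN, LOSS or NORMAL from the log2(test/reference) ratio.
# The 0.3 default threshold is an illustrative assumption.
classify_ratio() {
  awk -v t="$1" -v r="$2" -v th="${3:-0.3}" 'BEGIN {
    # A log ratio is undefined for non-positive intensities.
    if (t <= 0 || r <= 0) { print "NA"; exit }
    lr = log(t / r) / log(2)          # awk log() is the natural log
    if (lr > th)       print "GAIN"   # more test than reference
    else if (lr < -th) print "LOSS"   # reference in excess of test
    else               print "NORMAL"
  }'
}
```

For example, `classify_ratio 2000 1000` reports a gain, while equal test and reference signals are reported as normal.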
4
New generation arrays produce large amounts of data
- Agilent 244K array: 243,504 defined spots
- Raw data is foreground and background signal intensities in two channels
- Median ratio of foreground is important
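To make the raw data concrete, the sketch below turns per-spot foreground and background intensities in two channels into a log2 ratio. The five-column file layout and simple background subtraction are assumptions for illustration; the slides do not say which correction method was used.

```bash
#!/usr/bin/env bash
# log2_ratios FILE
# FILE is tab-separated with a hypothetical layout:
#   spot_id  test_fg  test_bg  ref_fg  ref_bg
# Prints spot_id and the background-subtracted log2(test/ref) ratio.
log2_ratios() {
  awk -F'\t' 'BEGIN { OFS = "\t" }
  {
    t = $2 - $3                       # test channel, background removed
    r = $4 - $5                       # reference channel, background removed
    if (t > 0 && r > 0)
      printf "%s\t%.3f\n", $1, log(t / r) / log(2)
    else
      print $1, "NA"                  # ratio undefined for this spot
  }' "$1"
}
```

Spots whose corrected signal is zero or negative are reported as `NA` rather than dropped, so the output keeps one row per spot.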
5
aCGH data analysis... using camgrid
6
Genomic array analysis strategy using R
1. Array data is processed by the snapCGH R package
   - Correct array data for background noise and mean distribution
   - Order data by genomic location
   - Apply an aCGH segmentation algorithm
   - Draw some plots
2. Determine significant findings (in-house R functions)
   - Common and minimum genomic regions of gain and loss
   - Summarise output
- R: www.cran.r-project.org
- snapCGH: www.bioconductor.org
- parrot R on camgrid: http://www.bio.cam.ac.uk/local/condor-parrot.html
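The "order data by genomic location" step is done inside snapCGH in the presentation. Purely as an illustration of the idea, a shell sketch might sort a tab-separated table by chromosome and then by position. The column layout and the mapping of X and Y to 23 and 24 are assumptions.

```bash
#!/usr/bin/env bash
# order_by_location FILE
# FILE is tab-separated with a hypothetical layout: chromosome, position, log2 ratio.
# Sorts numerically by chromosome, then by position; X and Y sort after 22.
order_by_location() {
  tab=$(printf '\t')
  awk -F'\t' 'BEGIN { OFS = "\t" }
  {
    key = $1
    if (key == "X") key = 23          # numeric sort key for the sex chromosomes
    else if (key == "Y") key = 24
    print key, $0                     # prepend the sort key as a new first column
  }' "$1" |
    sort -t "$tab" -k1,1n -k3,3n |    # key column, then position
    cut -f2-                          # drop the temporary key column
}
```

The temporary key column keeps the original chromosome labels intact in the output.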
7
Old vs. new genomic array plots
[Plots: chromosome 7]
8
Significant region detection is computationally intensive
9
Distributed aCGH analysis
[Flow diagram; recoverable labels:]
- Preprocess data
- Input data to snapCGH (e.g. 3 chrs, 2 analysis methods)
- Generate genome ordered data and condor dagman analysis batch files
- Perform aCGH analysis + region detection (1 run per chr per analysis method): Chr 1, Chr 2 and Chr 3, each with DNAcopy and GLAD
- Condor Job 1, Condor Job 2
- Consolidate output
- DNAcopy dagman description file: segmentation step; clone call scoring 1 … n (dagman job 1 … n); score combining; CRI/MRI detection
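A one-run-per-chromosome-per-method layout like the one in this diagram could be described to DAGMan with a generated file of `JOB` and `PARENT ... CHILD` lines. The sketch below is a simplified assumption, not the author's script: the function name, the job names and the three-stage shape (preprocess, fan out, consolidate) are all hypothetical, and the per-method scoring stages are left out.

```bash
#!/usr/bin/env bash
# write_dag DAG_FILE "CHROMOSOMES" "METHODS"
# Writes a DAGMan description with one job per chromosome per method,
# bracketed by a preprocess job and a consolidate job.
# All job and submit-file names are hypothetical.
write_dag() {
  dag_file=$1
  chrs=$2
  methods=$3
  jobs=""
  {
    echo "JOB preprocess preprocess.submit"
    for chr in $chrs; do
      for method in $methods; do
        job="chr${chr}_${method}"
        echo "JOB ${job} ${job}.submit"
        jobs="${jobs} ${job}"
      done
    done
    echo "JOB consolidate consolidate.submit"
    # Every analysis job waits for preprocessing and feeds consolidation.
    echo "PARENT preprocess CHILD${jobs}"
    echo "PARENT${jobs} CHILD consolidate"
  } > "$dag_file"
}
```

Calling `write_dag acgh.dag "1 2 3" "DNAcopy GLAD"` would produce six analysis jobs, matching the "3 chrs, 2 analysis methods" example; the resulting file would then be handed to `condor_submit_dag`.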
10
Condor job scripting in BASH & R
BASH function
- Responsible for producing the required condor files for discrete jobs
- Default_submit has 2 positional parameters
  - R script name: $1
  - Data files: $2
- Initiates aCGH analysis on the grid
Condor dagman R function set
- R-scripter: writes the appropriate R script for the current job
- R-condor-submitter: writes the condor job submission file
- R-condor-executer: writes the condor job executable file
- R-job-descriptor: writes the condor dagman description file
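The slide gives only the two positional parameters of Default_submit, not its body. As a rough sketch of what a function with that interface might emit, the hypothetical `write_submit_file` below prints a Condor submit description for one R job. The wrapper script name `run_<job>.sh`, the comma-separated data file list and the choice of the vanilla universe are assumptions.

```bash
#!/usr/bin/env bash
# write_submit_file R_SCRIPT DATA_FILES
#   $1  R script name
#   $2  comma-separated data files (assumed format)
# Prints a Condor submit description to stdout.
# The executable name run_<job>.sh is a hypothetical wrapper script.
write_submit_file() {
  r_script=$1
  data_files=$2
  job_name=$(basename "$r_script" .R)
  cat <<EOF
universe                = vanilla
executable              = run_${job_name}.sh
transfer_input_files    = ${r_script},${data_files}
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
output                  = ${job_name}.out
error                   = ${job_name}.err
log                     = ${job_name}.log
queue
EOF
}
```

Usage would look like `write_submit_file seg_chr1.R "chr1.txt" > seg_chr1.submit`, keeping file generation separate from the actual submission.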
11
End user abstraction – start_aCGH.sh
- aCGH analysis undertaken by a single shell command
- Manages array data input
- Collects user-specified parameters
  - Chromosome range
  - Segmentation algorithms
  - Significance thresholds
- Links condor R job scripting
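How start_aCGH.sh actually collects its parameters is not shown in the slides. One plausible approach is `getopts`. In the sketch below, the option letters, the default values and both function names are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# parse_acgh_args [-c first-last] [-a alg1,alg2] [-t threshold] data_file...
# Sets chr_range, algorithms, threshold and data_files.
# Option letters and defaults are hypothetical.
parse_acgh_args() {
  chr_range="1-22"
  algorithms="DNAcopy"
  threshold="0.05"
  OPTIND=1                            # reset so the function can be called again
  while getopts "c:a:t:" opt; do
    case $opt in
      c) chr_range=$OPTARG ;;
      a) algorithms=$OPTARG ;;
      t) threshold=$OPTARG ;;
      *) echo "usage: [-c first-last] [-a alg1,alg2] [-t threshold] data_file..." >&2
         return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  data_files=$*                       # remaining arguments are the array data files
}

# expand_chr_range FIRST-LAST
# Prints the chromosomes in the range as a space-separated list.
expand_chr_range() {
  chr=${1%-*}
  last=${1#*-}
  out=""
  while [ "$chr" -le "$last" ]; do
    out="${out:+$out }$chr"
    chr=$((chr + 1))
  done
  echo "$out"
}
```

The expanded chromosome list and the comma-separated algorithm names could then drive the per-chromosome, per-method job generation.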
12
start_aCGH.sh session on mole
13
… continued …
1 hr – 6 hr later! aCGH region information and plots
14
Summary findings (38 arrays)
- Rapid identification of regions of interest
- Easy comparison of aCGH analysis via different algorithms
[Plots: BioHMM and DNAcopy; labels: sample percentage, region size]
15
Real life application
- Retrospective analysis confirms initial findings! (summary of 38 samples)
[Plot: OSMR; labels: sample percentage, region size]
16
Future development
- Tailor output for specific user requirements
- Produce overall summary plot
- Apply approach to expression arrays
17
www.bio.cam.ac.uk/~ir210
Grace Ng, Steph Carter, Konstantina Karagavriliidou, Jenny Barna, Mark Calleja, Nick Coleman