SIGMA: A Platform to Visualize and Analyze DNA Copy Number Microarray Data. Raj Chari, PhD Student, BC Cancer Research Centre, Department of Cancer Genetics and Developmental Biology. APIII Conference, August 17th, 2006
Overview: DNA microarrays and array comparative genomic hybridization (array CGH); architecture of SIGMA; examples; current/future directions
Studying DNA changes: Methods to study DNA aberrations are improving => movement to array-based approaches. Different from expression microarrays: they measure genomic content rather than RNA transcript levels; the dynamic range of values is much smaller; the data are discrete rather than continuous (hence segmentation algorithms)
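To illustrate why the dynamic range is small, the sketch below computes the idealized log2 ratio for a region at a given copy number. It assumes a diploid reference and a pure tumor sample with no normal-cell contamination; the function name `log2_ratio` is hypothetical and is not part of SIGMA.

```python
import math

def log2_ratio(sample_copies, reference_copies=2):
    """Idealized log2 ratio of sample signal to reference signal.

    Assumes signal is proportional to copy number and that the
    reference is diploid (an assumption made for this illustration).
    """
    return math.log2(sample_copies / reference_copies)

# Single-copy loss, normal, single-copy gain, two-copy gain
for copies in (1, 2, 3, 4):
    print(copies, round(log2_ratio(copies), 3))
# prints:
# 1 -1.0
# 2 0.0
# 3 0.585
# 4 1.0
```

A single-copy gain therefore shifts the ideal log2 ratio by only about 0.58, and measured shifts are smaller still when normal cells are mixed in, which is why copy number calls rely on segmentation rather than on individual spot values.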
Array CGH Technology (Chari et al., Cancer Informatics, 2006, 2, 48-58)
Rationale for SIGMA: Many different platforms exist for array CGH; software developed tends to be platform-specific; the data processing pipeline is inefficient. Need to encapsulate data processing and support different types of data => System for Integrative Genomic Microarray Analysis (SIGMA)
Architecture of SIGMA (diagram): a Java application connects through JDBC to a MySQL database on the server and to a local MySQL database, and through JGR to R for analysis
Main interface
Functionalities of SIGMA: Importing data from multiple array CGH platforms. Built-in segmentation algorithms: DNAcopy; edge-detection-based segmentation (Poster #105). Integration with other types of DNA microarray-based assays: chromatin immunoprecipitation on microarray chips (ChIP on chip) (Poster #116) => histone acetylation; methylated DNA immunoprecipitation array CGH (MeDIP array CGH) (Poster #120) => DNA methylation; gene expression => RNA levels
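As a rough illustration of what a segmentation step does, the sketch below splits a profile of log2 ratios into runs of roughly constant level. It is a plain least-squares binary segmentation written for this example; it is not the circular binary segmentation used by DNAcopy, nor the edge-detection method from Poster #105. The names `segment`, `min_size`, and `min_gain`, and the threshold values, are assumptions.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(values, min_size):
    """Return (index, gain) for the split that most reduces squared error."""
    total = sse(values)
    best = None
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:i]) - sse(values[i:])
        if best is None or gain > best[1]:
            best = (i, gain)
    return best

def segment(values, min_size=3, min_gain=0.1, offset=0):
    """Recursively split values into segments.

    Returns a list of (start, end, mean) tuples with end exclusive.
    min_size and min_gain are illustrative thresholds, not tuned values.
    """
    split = best_split(values, min_size) if len(values) >= 2 * min_size else None
    if split is None or split[1] < min_gain:
        return [(offset, offset + len(values), mean(values))]
    i = split[0]
    return (segment(values[:i], min_size, min_gain, offset)
            + segment(values[i:], min_size, min_gain, offset + i))

# Made-up profile: a neutral region, a gained region, a neutral region
profile = [0.0, 0.1, -0.1, 0.0, 0.05, 0.6, 0.55, 0.65, 0.6, 0.0, -0.05, 0.05]
segments = segment(profile)
print([(start, end) for start, end, _ in segments])
# prints [(0, 5), (5, 9), (9, 12)]
```

Each segment mean can then be thresholded into a discrete call (gain, neutral, loss), which is the form the frequency summaries later in the talk operate on.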
Example: cancer cell line database. "Stripped-down" version of SIGMA; database of pre-processed data (Poster #104). Case #1: Examining a single sample for copy number aberrations. Case #2: Identifying recurrent alterations in lung adenocarcinoma
H2087 lung cancer cell line: A. Whole genome karyogram; B. Chromosome 8; C. Region on arm 8q; D. Highlight and find genes
Segment and curate changes (figure): individual profile => detection of alterations => frequency of alterations (aligning many profiles)
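The frequency view can be sketched as a per-clone tally over aligned samples. The example below assumes each sample has already been reduced to one call per clone (+1 gain, 0 neutral, -1 loss) and that all samples list clones in the same genomic order. The function name `alteration_frequency` and the sample data are made up for illustration.

```python
def alteration_frequency(calls):
    """Fraction of samples with a gain and with a loss at each clone.

    calls: list of per-sample lists of calls (+1, 0, -1), aligned by clone.
    Returns a list of (gain_fraction, loss_fraction), one per clone.
    """
    n_samples = len(calls)
    frequencies = []
    for clone_calls in zip(*calls):  # iterate clone by clone across samples
        gains = sum(1 for c in clone_calls if c > 0)
        losses = sum(1 for c in clone_calls if c < 0)
        frequencies.append((gains / n_samples, losses / n_samples))
    return frequencies

# Four hypothetical samples, four clones each
samples = [
    [1, 1, 0, -1],
    [1, 0, 0, -1],
    [0, 1, 0, 0],
    [1, 1, 0, -1],
]
print(alteration_frequency(samples))
# prints [(0.75, 0.0), (0.75, 0.0), (0.0, 0.0), (0.0, 0.75)]
```

Clones whose gain or loss fraction is high across many samples are candidates for recurrent alterations, as in Case #2.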
Summary of 24 Lung Adenocarcinomas
Current / Future Directions: Database of cancer cell lines will soon be publicly available. Full application to be completed by October. Integration with proteomics (DNA-RNA-protein). Multi-dimensional views of the cell will enhance understanding of pathogenesis => "systems" approach
Acknowledgements Wan Lam lab Calum MacAulay Funding organizations: