TCGA The Cancer Genome Atlas Project January 24, 2008.

TCGA Program
Goal: find genomic alterations that cause cancer (mutations, CNA, methylation, …)
Pilot project
  – $100M (NCI/NHGRI)
  – 3 years
  – 3 diseases
      brain (glioblastoma multiforme)
      lung (squamous)
      ovarian (serous cystadenocarcinoma)

TCGA Organization
Biospecimen Core Resource (BCR)
Genome Sequencing Centers (GSCs) (3)
Cancer Genome Characterization Centers (CGCCs) (7)
Data Coordinating Center (DCC)
Project Team (NCI/NHGRI)
Steering Committee (NCI/NHGRI & PIs)
External Scientific Committee
Working Groups

TCGA PIs
Role   Institution    PI
BCR    IGC/TGEN       Robert Penny
GSC    Baylor         Richard Gibbs
GSC    Broad          Eric Lander
GSC    WashU          Rick Wilson
CGCC   Broad/DFCI     Matthew Meyerson
CGCC   Harvard/B&W    Raju Kucherlapati
CGCC   JHU            Steve Baylin
CGCC   LBL            Joe Gray
CGCC   MSKCC          Marc Ladanyi
CGCC   Stanford       Rick Myers
CGCC   UNC            Chuck Perou
DCC    SRA            Ari Kahn

TCGA URLs
project site:
gforge: (search for TCGA)
data:
portal: [coming]

TCGA Data Types
Institution    Analysis                         Platform
Broad/DFCI     Transcription and Copy Number    Affymetrix U133 Plus 2.0 & SNP Array 6.0
Harvard/B&W    Transcription and Copy Number    Agilent 244K Array
LBL            Transcription                    Affymetrix Exon 1.0 ST Array
MSKCC          Copy Number                      Agilent 244K Array
JHU            Methylation                      Illumina GoldenGate
UNC            Transcription                    Agilent 44K Array
Stanford       Copy Number                      Illumina Infinium 550K BeadChip Array
Broad          Somatic Mutations                DNA sequencing
Baylor         Somatic Mutations                DNA sequencing
WashU          Somatic Mutations                DNA sequencing
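As a rough illustration of how the table above could be queried, the sketch below transcribes its rows into Python and groups the centers by analysis type. The names `ROWS` and `centers_by_analysis` are hypothetical and not part of any TCGA tool; rows that list two analyses are kept as a single row with two analysis labels, since the slide does not say which platform serves which analysis.

```python
from collections import defaultdict

# Rows transcribed from the slide: (institution, analyses, platform).
# The tuple layout is an assumption made for this sketch only.
ROWS = [
    ("Broad/DFCI", ("Transcription", "Copy Number"),
     "Affymetrix U133 Plus 2.0 & SNP Array 6.0"),
    ("Harvard/B&W", ("Transcription", "Copy Number"), "Agilent 244K Array"),
    ("LBL", ("Transcription",), "Affymetrix Exon 1.0 ST Array"),
    ("MSKCC", ("Copy Number",), "Agilent 244K Array"),
    ("JHU", ("Methylation",), "Illumina GoldenGate"),
    ("UNC", ("Transcription",), "Agilent 44K Array"),
    ("Stanford", ("Copy Number",), "Illumina Infinium 550K BeadChip Array"),
    ("Broad", ("Somatic Mutations",), "DNA sequencing"),
    ("Baylor", ("Somatic Mutations",), "DNA sequencing"),
    ("WashU", ("Somatic Mutations",), "DNA sequencing"),
]


def centers_by_analysis(rows):
    """Group institutions by each analysis type they perform, in table order."""
    index = defaultdict(list)
    for institution, analyses, _platform in rows:
        for analysis in analyses:
            index[analysis].append(institution)
    return dict(index)


if __name__ == "__main__":
    for analysis, centers in centers_by_analysis(ROWS).items():
        print(f"{analysis}: {', '.join(centers)}")
```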

TCGA Data Levels
raw
  – low-level data for a single sample, not normalized (e.g., trace file, .cel file)
processed
  – single-sample, normalized & interpreted (e.g., mutation call, amplification call for a locus, .snp, .chp)
segmented (n/a for mutation & expression)
  – single-sample, aggregation of loci into regions (e.g., amplification call for a region of a sample)
summary finding (aka “region of interest”)
  – cross-sample findings (e.g., minimal common region of amplification across a sample set)
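The four data levels can be sketched as a small enumeration. This is only an illustration: the names `DataLevel`, `SINGLE_SAMPLE_LEVELS`, and `applicable_levels`, and the data-kind strings, are hypothetical. The one rule encoded is the one stated on the slide, that the segmented level does not apply to mutation and expression data.

```python
from enum import Enum


class DataLevel(Enum):
    RAW = "raw"                    # single sample, not normalized
    PROCESSED = "processed"        # single sample, normalized & interpreted
    SEGMENTED = "segmented"        # single sample, loci aggregated into regions
    SUMMARY = "summary finding"    # cross-sample findings


# Per the slide, every level except the summary finding describes one sample.
SINGLE_SAMPLE_LEVELS = {DataLevel.RAW, DataLevel.PROCESSED, DataLevel.SEGMENTED}

# Data kinds for which the slide marks "segmented" as not applicable.
# The string labels are assumptions made for this sketch.
NO_SEGMENTED = {"mutation", "expression"}


def applicable_levels(data_kind):
    """Return the levels that apply to a data kind, in definition order."""
    return [
        level for level in DataLevel
        if not (level is DataLevel.SEGMENTED and data_kind in NO_SEGMENTED)
    ]


if __name__ == "__main__":
    for kind in ("copy number", "mutation", "expression"):
        print(kind, "->", [level.value for level in applicable_levels(kind)])
```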

TCGA Flow
Tissue Source (MD Anderson, Henry Ford, …)
BCR
  1. check pathology, quality/quantity
  2. extract analytes
  3. prepare data file
[Flow diagram; remaining labels: GSC, WGA, CGCC; DNA, mRNA; DNA; NCBI Trace Archive; DCC; sample data; Bulk Download; caTissue Core; caArray; caIntegrator; “tracking database”]

TCGA Data Formats
BCR
  – XML (tags are CDEs)
  – images
GSC
  – Called mutations (Genboree LFF format)
  – Linking table sample-trace-target
CGCC
  – MAGE-TAB
      IDF: Investigation Definition Format
      SDRF: Sample and Data Relationship Format
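Since an SDRF is a tab-delimited table whose first row holds the column headings, a minimal reading sketch is possible with the Python standard library alone. The sample rows, file names, and the helper `files_by_source` below are invented for illustration and are not TCGA data. The sketch also assumes every column heading is unique, which real SDRF files do not guarantee (headings such as "Protocol REF" may repeat), so a fuller reader would need to handle duplicates.

```python
import csv
import io

# A made-up, minimal SDRF fragment: tab-delimited, first row is the header.
SAMPLE_SDRF = (
    "Source Name\tExtract Name\tHybridization Name\tArray Data File\n"
    "sample-A\textract-A1\thyb-1\thyb-1.cel\n"
    "sample-A\textract-A2\thyb-2\thyb-2.cel\n"
    "sample-B\textract-B1\thyb-3\thyb-3.cel\n"
)


def files_by_source(sdrf_text):
    """Map each 'Source Name' to the data files listed for it, in row order."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    mapping = {}
    for row in reader:
        mapping.setdefault(row["Source Name"], []).append(row["Array Data File"])
    return mapping


if __name__ == "__main__":
    for source, files in files_by_source(SAMPLE_SDRF).items():
        print(source, files)
```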

TCGA Where Does/Will the Data Go?
ftp site (now with a simple web wrapper: “portal #1”)
“tracking database”
repositories with caBIG APIs
  – caArray
  – caTissue CORE
  – caIntegrator
  – NCIA
NCBI trace archive
a richer “portal #2”
  – more convenient download capability
  – filtering datasets by clinical information
  – summary level data
  – genome browser view
  – gene info page
  – visualization on pathways
  – etc.