www.bioinformatics.ca CCRC Cancer Conference, November 8, 2015

Presentation transcript:

CCRC Cancer Conference November 8, 2015


CCRC Workshop 2015 – Module 2 bioinformatics.ca
The ICGC Data Portal
Part 1: Data submission, processing and release

ICGC Data Release Cycle
Each release (Release 1, Release 2) passes through the same stages over time:
– Data files: submission and validation
– Data annotation & ETL
– Sign-off
– Open portal release

Data Types Submitted
To the Data Coordination Center (DCC):
– Simple somatic mutations and germline variants
– Copy number somatic mutations and germline variants
– Structural somatic mutations and germline variants
– DNA methylation
– Gene expression (RNA-Seq, microarrays)
– Protein expression
– miRNA
– Exon junctions
To the European Genome-phenome Archive (EGA) and CGHub:
– Raw sequencing data (FASTQ, BAM)

Data Validation at Submission

Data Annotations & ETL Pipeline
Annotations:
– Mutation frequencies
– Mutation consequences: protein changes and their consequences for genes and transcripts (e.g. amino acid substitution, frameshift, nonsense-mediated decay)
– Mutation functional impact: high-impact mutation prediction by FATHMM
– Gene sets: Gene Ontology terms, Reactome pathways, Cancer Gene Census
ETL data processing pipeline:
– Annotations and data are transformed and indexed using Elasticsearch to support highly integrated search
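The "mutation frequencies" annotation and the indexing step can be illustrated with a small in-memory sketch. This is not the DCC pipeline: the record layout, the project codes, and the plain dictionary standing in for Elasticsearch are all assumptions made for illustration, and the denominator counts only donors that appear in the observations.

```python
from collections import defaultdict

# Hypothetical observations: (donor_id, project_code, mutation_id).
OBSERVATIONS = [
    ("DO1", "PROJ-A", "MU1"),
    ("DO2", "PROJ-A", "MU1"),
    ("DO2", "PROJ-A", "MU2"),
    ("DO3", "PROJ-B", "MU1"),
]


def mutation_frequencies(observations):
    """Return {(project, mutation): fraction of that project's donors affected}."""
    donors_in_project = defaultdict(set)
    donors_with_mutation = defaultdict(set)
    for donor, project, mutation in observations:
        donors_in_project[project].add(donor)
        donors_with_mutation[(project, mutation)].add(donor)
    return {
        key: len(donors) / len(donors_in_project[key[0]])
        for key, donors in donors_with_mutation.items()
    }


def build_index(frequencies):
    """Group annotated records by mutation ID so a lookup avoids a full scan.

    A toy stand-in for a search index; a real deployment would send these
    documents to a search engine instead.
    """
    index = defaultdict(list)
    for (project, mutation), frequency in frequencies.items():
        index[mutation].append({"project": project, "frequency": frequency})
    return dict(index)


INDEX = build_index(mutation_frequencies(OBSERVATIONS))
```

Looking up `INDEX["MU2"]` returns the single PROJ-A record, in which one of the two PROJ-A donors carries the mutation.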

The ICGC Data Portal
Part 2: Portal feature highlights

ICGC Data Portal
– Quick keyword search
– Major functional sections

– Top 20 mutated genes with high functional impact SSMs in selected cancer projects
– Simple somatic mutation (SSM) rate per donor across selected cancer projects
– Facets

Project Entity Page
– Filter on high impact mutations
Also:
– Most frequent mutations
– Most affected donors
– Publications

Gene Entity Page
– Pfam domains for all transcripts
– Frequencies by cancer projects
– Mutations

Reactome Pathway Entity Page

Mutation Entity Page
– Permanent ID across releases
– Consequences for all transcripts
– View the mutation in Genome Viewer
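The slides do not say how IDs are kept permanent across releases. One plausible approach, sketched here purely as an assumption, is to key each mutation on its normalized coordinates and change, and to reuse the ID whenever the same key is seen again. The class name, the key fields, the default assembly, and the ID format are all hypothetical.

```python
class MutationIdRegistry:
    """Assigns each distinct mutation key one ID and never reassigns it."""

    def __init__(self):
        self._ids = {}

    def get_id(self, chromosome, start, end, mutation_type, change,
               assembly="GRCh37"):
        # Normalize so "chr7" and "7", or "a>t" and "A>T", map to one key.
        chrom = str(chromosome).upper()
        if chrom.startswith("CHR"):
            chrom = chrom[3:]
        key = (assembly, chrom, int(start), int(end),
               mutation_type, change.upper())
        if key not in self._ids:
            # Hypothetical ID format: a prefix plus a running counter.
            self._ids[key] = f"MU{len(self._ids) + 1}"
        return self._ids[key]


registry = MutationIdRegistry()
first = registry.get_id("chr7", 140453136, 140453136,
                        "single base substitution", "A>T")
again = registry.get_id("7", 140453136, 140453136,
                        "single base substitution", "a>t")
other = registry.get_id("7", 140453136, 140453136,
                        "single base substitution", "A>G")
```

In this sketch the registry would have to be persisted between releases, so that a mutation resubmitted later maps to the ID it already has.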

Genome Viewer

– Search data of interest by applying filters at Donor, Gene, and/or Mutation
– Donors, mutated genes and mutations found simultaneously
– Facets: filter + count
– Current filters
– Save the current donors
– Export table
– Download data files for filtered donors only
– Search for donor files in external repositories (e.g. raw data)
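"Facets: filter + count" can be sketched with plain Python over a few donor records. The field names and values below are hypothetical and the portal's real implementation is not shown in the slides; this only illustrates applying a filter and then counting the values that remain for each facet.

```python
from collections import Counter

# Hypothetical donor records; field names are illustrative only.
DONORS = [
    {"id": "DO1", "primary_site": "Brain", "gender": "male"},
    {"id": "DO2", "primary_site": "Brain", "gender": "female"},
    {"id": "DO3", "primary_site": "Liver", "gender": "female"},
]


def apply_filters(records, filters):
    """Keep records whose value for every filtered field is an allowed value."""
    return [
        record for record in records
        if all(record.get(field) in allowed
               for field, allowed in filters.items())
    ]


def facet_counts(records, fields):
    """Count how many of the given records fall under each value of each field."""
    return {
        field: Counter(record[field] for record in records if field in record)
        for field in fields
    }


SELECTED = apply_filters(DONORS, {"primary_site": {"Brain"}})
COUNTS = facet_counts(SELECTED, ["gender", "primary_site"])
```

A simplification to note: many faceted interfaces compute the counts for a facet while ignoring that facet's own filter, so the other choices stay visible. This sketch counts only within the filtered result.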

– Customized saved donor, gene and mutation sets
– Analyses: Enrichment Analysis, Phenotype Comparison, Set Operation
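Two of the analyses named above lend themselves to a short sketch. Set operations on saved sets map directly onto Python sets. For enrichment, the slides do not state which test the portal applies, so the one-sided hypergeometric test below is an assumption (it is a common choice for gene-set enrichment). All identifiers and numbers are made up.

```python
from math import comb

# Set operations on two hypothetical saved donor sets.
SET_A = {"DO1", "DO2", "DO3"}
SET_B = {"DO2", "DO3", "DO4"}
UNION = SET_A | SET_B
INTERSECTION = SET_A & SET_B
ONLY_IN_A = SET_A - SET_B


def hypergeom_pvalue(overlap, set_size, input_size, universe_size):
    """One-sided hypergeometric p-value P(X >= overlap).

    X is the number of gene-set members among input_size genes drawn
    without replacement from universe_size genes, of which set_size
    belong to the gene set.
    """
    upper = min(set_size, input_size)
    total = comb(universe_size, input_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, input_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 genes in the universe, 5 in the pathway,
# 4 in the input gene set, 3 of which are in the pathway.
P_VALUE = hypergeom_pvalue(overlap=3, set_size=5, input_size=4,
                           universe_size=20)
```

When many gene sets are tested at once, the resulting p-values would normally be corrected for multiple testing, which this sketch leaves out.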

File filters: Repository, Data Type, Experimental Strategy, File Format, Access

Acknowledgment
– Principal Investigator: Vincent Ferretti
– Project Manager: Francois Gerthoffert
– Lead Bioinformatician: Junjun Zhang
– Software Architect and Tech Lead: Bob Tiernay
– Business Analyst: Phuong-My Do
– Software Developers: Dusan Andric, Terry Lin, Michael Moncada, Vitalii Slobodianyk

The ICGC Data Portal
Part 3: Live demo