Data and Hartwig Medical Foundation December 2016
Molecular and clinical data both required. Molecular data (WGS), e.g. genetic variants. Clinical data (ECRF), e.g. primary tumor location. Outputs: Patient Report and Data for Research. [Figure labels: genetic profiles, Treatment A/B/C]
Biopsies per center. Note: stats from patients reported until 26 Oct 2016 and ECRF data from 2 Nov 2016 (total of 425 biopsies).
Clinical data: primary tumor location. Note: "Other" includes 15 types (vaginal, skin, cervical, bladder, ovarian, lung (non-NSCLC), gallbladder, stomach, pancreatic, small intestine, eye, urinary, salivary gland, anal, oesophageal). Note: stats from patients reported until 26 Oct 2016 and ECRF data from 2 Nov 2016 (total of 425 biopsies).
Clinical data: first documented treatment. Note: stats from patients reported until 26 Oct 2016 and ECRF data from 2 Nov 2016 (total of 425 biopsies).
The ‘pipeline’: Genomic Variants
sequencing (lab) -> bcl2fastq conversion -> FASTQ file collection for one sample -> read-mapping -> mark-duplicates -> indel-realignment -> BAM
BAM -> GATK SNV + INDEL calling -> gVCF -> genotyping -> variant filtering -> annotation -> VCF
QC: SNP Check
Main msg: ready to share data (but start with only own center)
The ‘pipeline’: Somatic Mutations
Tumor / Normal pair? YES -> somatic variant calling
Steps: somatic variant calling with strelka, mutect, freeBayes and varscan; merge; filter; annotate; CNV calling
Outputs: VCF, TXT
QC: Kinship Check
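The "merge" and "filter" steps for the four somatic callers could be sketched roughly as follows. The slide does not state the actual merge or filter rule, so the consensus threshold (`min_callers`), the variant key and all function names here are illustrative assumptions, not the pipeline's implementation.

```python
from collections import defaultdict

# Assumption: a variant is identified by (chromosome, position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def merge_calls(calls_by_caller: dict[str, set[Variant]]) -> dict[Variant, set[str]]:
    """Union the per-caller call sets, recording which callers reported each variant."""
    support: dict[Variant, set[str]] = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support[variant].add(caller)
    return dict(support)


def filter_by_support(merged: dict[Variant, set[str]], min_callers: int = 2) -> dict[Variant, set[str]]:
    """Keep variants reported by at least `min_callers` callers (hypothetical rule)."""
    return {v: callers for v, callers in merged.items() if len(callers) >= min_callers}


# Toy call sets, invented for illustration only.
calls = {
    "strelka": {("1", 100, "A", "T"), ("2", 200, "G", "C")},
    "mutect": {("1", 100, "A", "T")},
    "freebayes": {("3", 300, "C", "G")},
    "varscan": {("1", 100, "A", "T"), ("2", 200, "G", "C")},
}
merged = merge_calls(calls)
consensus = filter_by_support(merged, min_callers=2)
for variant, callers in sorted(consensus.items()):
    print(variant, sorted(callers))
```

Keeping the list of supporting callers next to each variant makes it possible to tighten or loosen the threshold later without re-running the callers.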
The ‘pipeline’. Goal: move from a molecular report to a "treatment report".
Patient Report: time to report. Goal Hartwig Medical Foundation: report within 6 weeks (= 42 days). Note: stats from patients reported until 26 Oct 2016.
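As a small illustration of the 6-week goal, the check below computes the number of days between two dates and compares it with 42. Which event starts the clock (biopsy, sample arrival, or something else) is not stated on the slide, so the `start` date and the function names are assumptions.

```python
from datetime import date

REPORT_GOAL_DAYS = 6 * 7  # 6 weeks = 42 days, the goal stated on the slide


def days_to_report(start: date, reported: date) -> int:
    """Calendar days from the (assumed) start event to the patient report."""
    return (reported - start).days


def within_goal(start: date, reported: date, goal_days: int = REPORT_GOAL_DAYS) -> bool:
    """True if the report was delivered within the goal."""
    return days_to_report(start, reported) <= goal_days


# Invented dates, for illustration only.
on_time = within_goal(date(2016, 9, 1), date(2016, 10, 13))  # exactly 42 days
late = within_goal(date(2016, 9, 1), date(2016, 10, 14))     # 43 days
print(on_time, late)
```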
Data sharing
Data will be stored centrally (partner Schuberg-Philis).
Relevant clinical data are retrieved from medical centers via ECRF.
Only data from patients who sign the informed consent form.
Data is owned by patient and treating hospital.
Supplying centre has access to its own data.
Anonymised data (the database) accessible for research.
Data requests handled by Data Access Board.
Data sharing: Portal (currently in test phase). Goal: allow healthcare professionals to access/download the relevant data from their own centre.
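The sharing rules above (consent required, each centre sees only its own data) can be expressed as a simple filter. This is a minimal sketch under assumed names: the `Record` fields and `visible_to_centre` are hypothetical and do not describe the portal's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical fields, chosen only to illustrate the rules on the slide.
    patient_id: str
    centre: str
    consent_signed: bool


def visible_to_centre(records: list[Record], centre: str) -> list[Record]:
    """Records a supplying centre may access: consented patients from that centre only."""
    return [r for r in records if r.consent_signed and r.centre == centre]


# Invented records, for illustration only.
records = [
    Record("P1", "Centre A", True),
    Record("P2", "Centre A", False),  # no informed consent: never shared
    Record("P3", "Centre B", True),
]
centre_a_view = visible_to_centre(records, "Centre A")
print([r.patient_id for r in centre_a_view])
```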
Data quality
Central theme: Genome in a Bottle (GIAB) sample NA12878
Germline: periodically sequence single sample NA12878
Somatic: use artificially created "tumor" by mixing two GIAB samples
PrecisionFDA: https://precision.fda.gov/
DREAM challenge: https://www.synapse.org/#!Synapse:syn2813589/wiki/401435

            Germline        Somatic
Internal    GIAB NA12878    GIAB-based 70/30 mix-in
External    precisionFDA    DREAM / Synapse

Main msg: ready to share data (but start with only own center)
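Benchmarking against a reference sample such as NA12878 usually comes down to comparing the called variants with a truth set. The sketch below shows that comparison in its simplest form; the exact-match variant key and the function name are assumptions, and real benchmarking tools handle variant representation differences that this does not.

```python
def benchmark(called: set, truth: set) -> dict:
    """Precision, recall and F1 of a call set against a truth set (exact matching)."""
    tp = len(called & truth)   # called and in the truth set
    fp = len(called - truth)   # called but not in the truth set
    fn = len(truth - called)   # in the truth set but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall, "f1": f1}


# Invented variants keyed by (chromosome, position, ref, alt), for illustration only.
truth_set = {("1", 100, "A", "T"), ("1", 250, "G", "A"), ("2", 50, "C", "T"), ("3", 75, "T", "G")}
call_set = {("1", 100, "A", "T"), ("1", 250, "G", "A"), ("2", 50, "C", "T"), ("4", 10, "A", "C")}
metrics = benchmark(call_set, truth_set)
print(metrics)
```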
Tumor coverage depth. Aim Hartwig Medical Foundation: minimum 90x when % tumor cells = 30. Other metrics to add to our catalogue: mutational patterns, variant distributions. Note: stats from patients reported until 26 Oct 2016 (total of 425 biopsies).
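A rough back-of-the-envelope calculation shows what 90x at 30% tumor cells means for a single variant. The sketch assumes a clonal heterozygous variant in a diploid region with no copy-number change in tumor or normal cells; these assumptions and the function names are not from the slide.

```python
def expected_vaf(tumor_fraction: float, alt_copies: int = 1, ploidy: int = 2) -> float:
    """Expected variant allele fraction of a clonal somatic variant.

    Assumes tumor and normal cells both carry `ploidy` copies of the locus.
    """
    return tumor_fraction * alt_copies / ploidy


def expected_alt_reads(depth: float, tumor_fraction: float) -> float:
    """Expected number of reads carrying the variant at a given coverage depth."""
    return depth * expected_vaf(tumor_fraction)


vaf = expected_vaf(0.30)                  # about 0.15
alt_reads = expected_alt_reads(90, 0.30)  # about 13.5 supporting reads
print(f"expected VAF: {vaf:.2f}, expected variant reads at 90x: {alt_reads:.1f}")
```

Under these assumptions only about 15% of reads carry the variant, so lower depth or lower tumor content quickly leaves too few supporting reads to call it.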