Candidate Gene Resource Steering Committee Meeting July 25, 2006
Goals for Today
  Strengthen relationships among CARE investigators
  Define pilot project (phenotypes & SNPs)
  Establish principles of data release
  Discuss genotyping study design
  Select phenotypes to be analyzed
CARE Governance
  Steering committee
    –Representative of each CARE organization
    –Subcommittees: Data Release, Phenotypes, Study Design, Informatics, SNP Selection, DNA/Genotyping
  NHLBI staff
  NHLBI-appointed oversight committee
CARE: timeline
  RFP released March 2005
  Response submitted July 15, 2005
  Awarded April 1, 2006
  Four-year award
    –Y1: Create DNA and phenotype database
    –Y2: Genotyping
    –Y3/4: Joint analysis and data distribution
Resources Provided by NHLBI
  $18.3M over 4 years to create a resource to relate genotype-phenotype across cohorts:
    –Create a consortium among CARE cohorts
    –Database DNA and phenotypes
    –Genotype a common set of SNPs across cohorts
    –Create software tools to enable joint analysis
    –Data distribution as per CARE data release policy
    –Project management and coordination
      -PM hired: Deb Farlow
Areas for Discussion Today Data Release Study Design Phenotypes NHLBI Current state of genotyping technology Presentation of informatics tools
Data release
  Data release policy to be established by the CARE steering committee with NHLBI and local IRBs
  Broad proposed a secure, HIPAA-compliant web architecture to implement this policy and to provide an access-controlled environment for data sharing and analysis
Original CARE Study Design
  Candidate Gene Study
    –50,000 samples
    –Average 10 SNPs/gene x 1700 genes = 17,000 SNPs
    –Requirement: $0.01/genotype (fully loaded)
  Whole Genome Association Study
    –500 cases / 1,000 controls
    –At least 300,000 SNPs genome-wide
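The candidate gene figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes every sample is genotyped at every SNP at the flat $0.01 fully loaded rate.

```python
# Illustrative check of the candidate gene study arithmetic.
# All names are hypothetical; the inputs are the figures quoted on the slide.

def candidate_gene_totals(snps_per_gene, genes, samples, cents_per_genotype):
    """Return (SNP count, genotype count, total cost in whole dollars)."""
    snps = snps_per_gene * genes
    genotypes = snps * samples
    # Work in integer cents to avoid floating-point rounding.
    total_dollars = genotypes * cents_per_genotype // 100
    return snps, genotypes, total_dollars


snps, genotypes, total_dollars = candidate_gene_totals(
    snps_per_gene=10, genes=1700, samples=50_000, cents_per_genotype=1
)
print(f"{snps:,} SNPs, {genotypes:,} genotypes, ${total_dollars:,}")
```

Under these assumptions the total is $8.5M, which agrees with the candidate gene cost quoted in the summary slide.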
Candidate gene study
  Targeted genotyping technology has remained stable: same price and throughput as in approved proposal
  Key issue: criteria for selecting 17,000 candidate gene-based SNPs – biological hypotheses
Developments since RFP
  Whole genome scans promise new hypotheses for candidate genes
  Evaluation of coverage / performance of whole genome arrays
  Price for whole genome genotyping technology has improved
Whole genome scanning
  SHARE will genotype 15,000 people from NHLBI cohorts (FHS and TBA)
  RFA for 4-5 whole genome scans
  GAIN, WTCCC, etc.
  Implication: hypotheses that could be confirmed and extended by CARE
  Challenge: timing doesn’t sync up well with original CARE timeline
Coverage
Do they work? * from
Do They Work at High Scale?
  Recent Call Rate Data (at Broad)
    Product      Chips    Call Rate
    Affy 500K    12,      %
    ILMN 317K             %
  In-Process QC test: HapMap sample vs HapMap
    Avg=99.62% (7,947,748 comparisons)
QC statistics: MS and T2D Scans
DM vs. BRLMM 2500 chips <5% of chips fail
MIP (20K)
WGAS: Then and Now
  Original Plan
    Product: Affymetrix 500K
    Total cost per sample: $1600 (chip+reagents+equipment+labor+IDC)
    Study Design: 500 cases / 1,000 controls
    Budget=$2,400,000
WGAS: Then and Now
  Now possible
    Product: Affymetrix 500K
    Total cost per sample: $530 (chip+reagents+equipment+labor+IDC)
    Study Design: 4,500 samples
    Budget=$2,400,000
WGAS: Then and Now
  January 2007
    Product: Affymetrix 500K
    Total cost per sample: $410 (chip+reagents+equipment+labor+IDC)
    Study Design: 5,800 samples
    Budget=$2,400,000
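The three "Then and Now" slides hold the budget fixed and vary the cost per sample, so the sample counts follow from a single division. A minimal sketch, with hypothetical names and the slide figures as inputs:

```python
# Illustrative only: how many samples a fixed budget buys at each quoted price.
# Names are hypothetical; dollar figures are those shown on the slides.

BUDGET_DOLLARS = 2_400_000

COST_PER_SAMPLE = {
    "Original plan": 1600,
    "Now possible": 530,
    "January 2007": 410,
}


def affordable_samples(budget, cost_per_sample):
    """Whole number of samples that fit within the budget."""
    return budget // cost_per_sample


samples_by_scenario = {
    label: affordable_samples(BUDGET_DOLLARS, cost)
    for label, cost in COST_PER_SAMPLE.items()
}

for label, n in samples_by_scenario.items():
    print(f"{label}: {n:,} samples")
```

The exact quotients for the two later scenarios (4,528 and 5,853) are slightly above the 4,500 and 5,800 shown, which suggests the slides quote round figures.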
In Summary
  Date      SNPs      Samples   Cost
  7/15/05   500,000   1,500     $2.4M
            17,000    50,000    $8.5M
  7/25/06   500,000   4,500     $2.4M
            17,000    50,000    $8.5M
  1/07      500,000   5,800     $2.4M
            17,000    50,000    $8.5M
Conclusions: genotyping
  Targeted genotyping (custom set of candidate genes): $0.01 / gt
  Timing of candidate gene selection
  Improved cost and performance of whole genome: $0.001 / gt
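The whole genome figure of roughly $0.001 per genotype can be recovered by dividing the per-sample chip cost by the number of SNPs on the chip. The sketch below assumes, for simplicity, that all 500,000 SNPs yield a genotype for every sample; the names are hypothetical.

```python
# Illustrative only: per-genotype cost of a 500K whole genome chip
# compared with the $0.01 targeted genotyping rate.
# Assumes every SNP on the chip produces a genotype (100% call rate).

SNPS_PER_CHIP = 500_000
TARGETED_COST_PER_GENOTYPE = 0.01  # dollars, from the slide

WHOLE_GENOME_COST_PER_SAMPLE = {
    "Original plan": 1600,
    "Now possible": 530,
    "January 2007": 410,
}


def cost_per_genotype(cost_per_sample, snps_per_chip=SNPS_PER_CHIP):
    """Dollars per genotype for one sample run on one chip."""
    return cost_per_sample / snps_per_chip


per_genotype = {
    label: cost_per_genotype(cost)
    for label, cost in WHOLE_GENOME_COST_PER_SAMPLE.items()
}

for label, dollars in per_genotype.items():
    ratio = TARGETED_COST_PER_GENOTYPE / dollars
    print(f"{label}: ${dollars:.5f}/gt ({ratio:.1f}x cheaper than targeted)")
```

A real call rate below 100% would raise the effective per-genotype cost slightly, so these values are a lower bound under the stated assumption.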
High Level Workflow – for CaRE
  (workflow diagram; labels:) Upload Samples, Peds, Individuals, Phenotypes; Create Experiments (Samples x Features); Summarize/Filter; PLINK; Data Vault; QC/Curate Results; Design and Execute Experiments; Project DB; LIMS DBs; BSP DB; Association & Statistics Viewers; Cohort’s Custom Algorithms, Viewers; Web Services; Data Compile; Feature DB
  Analysis: GenePattern + CaRE analysis tools
  Production: BSP/GAP + CaRE enhancements
Designing a Pilot
  A trial run for DNA quality, genotyping, phenotype and joint analysis, and publication
  Scale and content of pilot to be refined; topic for today’s discussion sessions
CARE
  Our shared aspiration: the greatest genetic epidemiology experiment to date
  (figure label: CSSCD)
Technological Advance
  Current 500K assay | New 500K assay
  DNA
How? Smaller format BRLMM Sequence Variability (DNA Analysis) A/A B/B A/B Mismatch probes not needed Fewer probes needed Single format
No drop in Het Calls
Mendel Errors Per Plate Accuracy 99.4% Sty/Nsp : one family 25,000 errors
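For readers unfamiliar with the metric, a Mendel error is a child genotype that cannot be formed from one allele of each parent. The sketch below shows a generic check on a toy trio; it is not the pipeline behind this slide, the genotypes are made up, and it assumes autosomal biallelic SNPs with no missing calls.

```python
# Generic Mendelian consistency check for parent-child trios.
# Toy data and function names are hypothetical; genotypes are two-letter
# strings such as "AA", "AB", "BB" for autosomal biallelic SNPs.

def is_mendelian_consistent(father, mother, child):
    """True if the child can inherit one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def count_mendel_errors(trio_genotypes):
    """Count SNPs where the child's genotype is incompatible with the parents."""
    return sum(
        1
        for father, mother, child in trio_genotypes
        if not is_mendelian_consistent(father, mother, child)
    )


# One (father, mother, child) tuple per SNP.
toy_trio = [
    ("AA", "AA", "AA"),  # consistent
    ("AA", "BB", "AB"),  # consistent
    ("AA", "AA", "AB"),  # error: no parent carries B
    ("AB", "AB", "BB"),  # consistent
    ("BB", "BB", "AA"),  # error: no parent carries A
]

errors = count_mendel_errors(toy_trio)
print(f"{errors} Mendel errors in {len(toy_trio)} SNPs")
```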
Coverage of Common Variants by Whole-genome Products
  Tag SNPs
  Affymetrix Mapping 500K GeneChip
  Illumina HumanHap300 BeadChip
Coverage Mostly Provided by Pairwise Correlations (figure: example haplotypes)
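Pairwise coverage is usually expressed through the r² correlation between a genotyped tag SNP and an untyped SNP. The sketch below computes r² from phased haplotypes and the fraction of sites captured by a tag set. The toy haplotypes, the function names, and the 0.8 threshold are assumptions chosen for illustration, not values taken from the slides.

```python
# Illustrative pairwise LD (r^2) and tag SNP coverage on toy phased haplotypes.
# Assumes biallelic sites; each haplotype is a string with one allele per site.

def pairwise_r2(haplotypes, i, j):
    """r^2 between sites i and j: D^2 / (p_i (1 - p_i) p_j (1 - p_j))."""
    n = len(haplotypes)
    ref_i = haplotypes[0][i]  # arbitrary reference allele at site i
    ref_j = haplotypes[0][j]  # arbitrary reference allele at site j
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    # A monomorphic site has no variance, so r^2 is reported as 0.
    return 0.0 if denom == 0 else d * d / denom


def coverage(haplotypes, tag_sites, threshold=0.8):
    """Fraction of sites with r^2 >= threshold to at least one tag site."""
    n_sites = len(haplotypes[0])
    covered = sum(
        1
        for site in range(n_sites)
        if any(pairwise_r2(haplotypes, site, tag) >= threshold for tag in tag_sites)
    )
    return covered / n_sites


# Sites 0 and 1 are perfectly correlated; site 2 is independent of both.
toy_haplotypes = ["AGT", "AGC", "TCT", "TCC"]

print(pairwise_r2(toy_haplotypes, 0, 1))
print(coverage(toy_haplotypes, tag_sites=[0]))
```

In this toy example, genotyping site 0 alone captures sites 0 and 1 but not site 2, so adding site 2 as a second tag is needed for full coverage.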
Specified Multimarker Tests Improve Effective Coverage (figure: example haplotypes)
Coverage of the genome
Other recent developments
  Whole genome scan planned in 9,000 FHS participants (SHARE)
  Other whole genome scans will be funded (recent NHLBI RFA)