Vanderbilt’s DNA Databank: BioVU
Personalized Medicine
  Integration of genomic information into clinical decision making
  Personalized disease treatment and also preventative therapies
Personalized Medicine
  A SNP (single-nucleotide polymorphism) is a single base-pair variation at a specific site in the DNA sequence that occurs in at least 1% of the population
  SNPs are responsible for over 80% of the variation between two individuals, which makes them ideal for establishing correlations between genotype and phenotype
  Because some SNPs predispose individuals to a certain disease or trait, or to reacting to a drug in a different way, they will be highly useful in diagnostics and drug development
What is BioVU?
  The move towards personalized medicine requires very large sample sets for discovery and validation
  BioVU: a biobank intended to support a broad view of biology and enable personalized medicine
  Contains de-identified DNA extracted from leftover blood after clinically indicated testing of Vanderbilt patients who have not opted out
  Linked to the Synthetic Derivative: a de-identified EMR
  Current sample number: 116,551
    105,910 adult samples
    10,641 pediatric samples
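The “synthetic derivative”
  [Diagram: a patient identifier (John Doe) is passed through a one-way hash to give a research ID (A7CCF99DE65732….). The record is scrubbed and stored under that ID, and DNA is extracted from eligible samples under the same ID.]
  The “synthetic derivative” (SD) can be updated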
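The one-way hash step can be sketched as follows. This is a minimal illustration only: the slide says a one-way hash is used but does not name the algorithm, so SHA-512, the secret salt, the function name `research_id`, and the example record number are all assumptions.

```python
import hashlib


def research_id(mrn: str, secret_salt: str) -> str:
    """Map a medical record number to a de-identified research ID.

    SHA-512 and the salt are assumptions for illustration; the source
    only states that a one-way hash is applied.
    """
    digest = hashlib.sha512((secret_salt + mrn).encode("utf-8")).hexdigest()
    return digest.upper()


# Hypothetical inputs: the same patient always maps to the same ID, so the
# scrubbed record and the DNA sample can be linked without the identifier.
rid_for_record = research_id("MRN-0012345", "hypothetical-salt")
rid_for_sample = research_id("MRN-0012345", "hypothetical-salt")
rid_other = research_id("MRN-0099999", "hypothetical-salt")
```

Because the mapping is deterministic, new clinical data for the same patient lands on the same research ID, which is consistent with the SD being updatable over time.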
Synthetic Derivative vs. BioVU
  [Diagram: scrubbed records and DNA samples are linked by the same hashed ID (A7CDE6532….)]
  Synthetic Derivative: ~1.9 million records
  BioVU: ~116,000 samples
The Synthetic Derivative
  A derivative of the EMR: information content reduced by ‘scrubbing’ identifiers
  Systematically shifted event dates
  Contains ~1.9 million records
    ~1 million with detailed longitudinal data
    averaging 100,000 bytes in size
    an average of 27 codes per record
  Records are updated over time and are current through 9/31/09
  Can be searched with results restricted to records for which DNA is available
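One way to picture "systematically shifted event dates" is a fixed offset per record, which hides the real dates while preserving the intervals between events. The sketch below is an assumption-laden illustration: the slide does not describe the actual shifting scheme, so the hash-derived offset, the backward direction, the 365-day maximum, and the names `shift_days` and `shift_events` are all hypothetical.

```python
import hashlib
from datetime import date, timedelta


def shift_days(research_id: str, max_shift: int = 365) -> int:
    """Derive a fixed per-record offset of 1..max_shift days, backwards.

    Deriving the offset from the research ID is an assumption; it simply
    guarantees that every event in one record moves by the same amount.
    """
    h = int(hashlib.sha256(research_id.encode("utf-8")).hexdigest(), 16)
    return -(1 + h % max_shift)


def shift_events(research_id, events):
    """Apply the record's offset to every (date, label) event."""
    offset = timedelta(days=shift_days(research_id))
    return [(d + offset, label) for d, label in events]


# Hypothetical record with two events 14 days apart.
events = [(date(2008, 3, 1), "admission"), (date(2008, 3, 15), "discharge")]
shifted = shift_events("RID-EXAMPLE", events)
interval_before = events[1][0] - events[0][0]
interval_after = shifted[1][0] - shifted[0][0]
```

The interval between admission and discharge is the same before and after shifting, which is what keeps longitudinal analyses meaningful.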
Synthetic Derivative Data Types
  Narratives, such as:
    Clinical Notes
    Discharge Summaries
    History and Physicals
    Problem Lists
    Surgical Reports
    Progress Notes
    Letters
  Diagnostic Codes, Procedural Codes
  Forms (intake, assessment)
  Reports (pathology, ECGs, echocardiograms)
  Clinical Communications
  Lab Values and Vital Signs
  Medication Orders
  TraceMaster (ECGs)
BioVU Program Review
Sample accrual
BioVU Sample Management
  Samples are stored in a robotic sample storage system (RTS SmaRTStore) that allows for fully automated, high-speed sample picking.
  Genotyping of BioVU samples is currently done in the DNA Resources Core at Vanderbilt.
Validation in BioVU
  Sample handling algorithms
    Gender match: 1/384 gender mismatches
  Ancestry
    Characterize sample ancestry; assess usefulness of ‘race’ as defined in the EMR
    Provide a panel of ancestry informative markers that define ancestry
    No significant difference between the concordance of self-report or observer-report with genetic ancestry
  Demonstration project (American Journal of Human Genetics)
    Can known associations between genetic variants and common diseases be identified in the EMR?
The “demonstration project”
  Genotype “high-value” SNPs in the first 8,000 samples accrued, including SNPs associated by replicated genome-wide experiments with common diseases and traits:
    Atrial fibrillation
    Crohn’s disease
    Multiple sclerosis
    Rheumatoid arthritis
    Type 2 diabetes
  Develop Natural Language Processing methods to identify cases and controls
  Are genotype-phenotype relations replicated?
First results
  [Figure: forest plot of odds ratios (axis from 0.5 to 5.0) for each marker, labeled by gene/region and grouped by disease: atrial fibrillation, Crohn's disease, multiple sclerosis, rheumatoid arthritis, type 2 diabetes]
  Markers shown (gene/region): rs10033464 (Chr. 4q25), rs11805303 (IL23R), rs17234657 (Chr. 5), rs1000113 (Chr. 5), rs17221417 (NOD2), rs2542151 (PTPN22), rs3135388 (DRB1*1501), rs2104286 (IL2RA), rs6897932 (IL7RA), rs6457617 (Chr. 6), rs6679677 (RSBN1), rs2476601 (PTPN22), rs4506565 (TCF7L2), rs12255372 (TCF7L2), rs12243326 (TCF7L2), rs10811661 (CDKN2B), rs8050136 (FTO), rs5219 (KCNJ11), rs5215 (KCNJ11), rs4402960 (IGF2BP2)
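For readers unfamiliar with the quantity on the plot's axis, an odds ratio for a single marker can be computed from a 2x2 table of allele counts in cases and controls. The sketch below uses a standard Wald 95% confidence interval; the counts are made-up illustrative numbers, not BioVU results, and the function name `odds_ratio_ci` is hypothetical. The slides do not say which estimator the demonstration project used.

```python
import math


def odds_ratio_ci(case_alt, case_ref, ctrl_alt, ctrl_ref, z=1.96):
    """Allelic odds ratio with a Wald confidence interval.

    case_alt / case_ref: risk-allele and reference-allele counts in cases.
    ctrl_alt / ctrl_ref: the same counts in controls.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Illustrative counts only (hypothetical, not taken from the study).
or_value, ci_low, ci_high = odds_ratio_ci(120, 280, 150, 650)
```

An interval that excludes 1.0 corresponds to a marker whose plotted bar does not cross the vertical line at an odds ratio of 1.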
Types of projects
  Discovery or validation of genotype-phenotype relations for disease susceptibility or drug responses
  Discovery of new disease/susceptibility genes: resequence in patients (obesity, Cushing's, susceptibility to infection, insomnia, pre-term birth)
  Access samples without disease X, or “normals” of specified ancestry, or old normals
  Phenome-wide association study (PheWAS): in development
Research Use Cases
  Retrospective chart reviews
  Rapid preliminary data for grant submissions
  Feasibility assessment
  Hypothesis generation
Examples of ICD-9 codes for rare diseases
  Example Rare Disease            Number in SD   Number in BioVU
  Microcephalus                   1,070          85
  Pica                            115            22
  Septicemic Plague               21
  Pick’s Disease                  45             8
  Acromegaly and Gigantism        571            123
  Ehlers-Danlos Syndrome          285            34
  Narcolepsy without Cataplexy    438            76
  Spina Bifida                    1,968          238
  Stiff-Man Syndrome              82             17
  Tourette Syndrome               667
  Bell’s Palsy                    2,534          402
  Bulimia Nervosa                 919            88
  Cushing’s                       1,443          298
  Peyronie’s Disease              694            157
  Wilson’s Disease                140            49
  Meningioma                      1,444          355
  Wegener’s                       363            141
[Workflow diagram, built up over four slides. Steps shown: Investigator query -> cases + controls -> Data use agreement + IRB Approval -> Manual Review -> Sample retrieval -> Genotyping, genotype-phenotype relations]
Data Use Agreement
Genotyping Data Accrual
Nationally Prevalent Diseases in the African American Population
  Disease                    BioVU Count
  Hypertension               1,095
  Type 2 Diabetes            714
  Coronary Artery Disease    273
  Kidney Disease             252
  Asthma                     210
  Pneumonia                  193
  Stroke                     133
  Lupus                      48
  Lung cancer                21
  Genotype data on 1,786 AA subjects
BioVU Home page. Description of resource. On the left, you can click for application instructions, the status of your application, information about support for BioVU projects, and Record Counter, which determines the approximate number of records that meet specific criteria (useful for sample size estimations and power calculations when developing BioVU projects/proposals). There are also tools for assistance with grant language and with the statistical analysis plan for BioVU proposals. Click the “Genotyping Data” button for more information about current sample counts and genotyping data (the page that comes up is on the next slide).
Record Counter is on the BioVU homepage.
This is the screen when you log into Record Counter.
You can query by ICD-9 code, for example for MS (multiple sclerosis).
You can also query for medications as shown here.
It will instruct you to enter the medication as shown here, with Avonex as an example.
You can search for multiple medications as well.
For example, if you want to query the records for MS cases with interferon treatment, you can add the medications to the same Criteria Group, as shown here for 3 different interferon medications. This results in a query for MS cases with Avonex, Betaseron, OR Rebif; it will pull MS records with an occurrence of any of the 3 medications. If you wanted to query for MS cases on all 3 medications, you would need to add each medication to a SEPARATE Criteria Group. This is just “and” / “or” logic: criteria within a group are combined with “or”, and separate groups are combined with “and”. If you want to query only cases with DNA samples in BioVU, you have to click “Limit by DNA availability”. When you have all your criteria entered, click “Execute Query”.
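The Criteria Group behavior described here can be modeled in a few lines. This is only a sketch of the and/or logic, not Record Counter's actual implementation: the record layout, the `has_dna` flag, the function names, and the four toy records are hypothetical, and ICD-9 code 340 is used as the multiple sclerosis code.

```python
def matches(record, groups):
    """A record matches if EVERY group has AT LEAST ONE satisfied criterion."""
    return all(
        any(value in record[field] for field, value in group)
        for group in groups
    )


def run_query(records, groups, limit_by_dna=False):
    """Return the IDs of matching records, optionally only those with DNA."""
    return sorted(
        rid
        for rid, rec in records.items()
        if matches(rec, groups) and (rec["has_dna"] or not limit_by_dna)
    )


# Hypothetical toy records keyed by research ID.
records = {
    "r1": {"icd9": {"340"}, "meds": {"avonex"}, "has_dna": True},
    "r2": {"icd9": {"340"}, "meds": {"avonex", "betaseron", "rebif"}, "has_dna": False},
    "r3": {"icd9": {"340"}, "meds": set(), "has_dna": True},
    "r4": {"icd9": {"250.00"}, "meds": {"rebif"}, "has_dna": True},
}

ms = [("icd9", "340")]

# One medication group: MS AND (Avonex OR Betaseron OR Rebif).
any_interferon = [ms, [("meds", "avonex"), ("meds", "betaseron"), ("meds", "rebif")]]

# Separate groups: MS AND Avonex AND Betaseron AND Rebif.
all_three = [ms, [("meds", "avonex")], [("meds", "betaseron")], [("meds", "rebif")]]

on_any = run_query(records, any_interferon)
on_all = run_query(records, all_three)
on_any_with_dna = run_query(records, any_interferon, limit_by_dna=True)
```

In this toy data, the single-group query returns both treated MS records, the separate-groups query returns only the record with all three medications, and limiting by DNA availability drops the record without a sample.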
This will give you a readout of the total number of cases in BioVU that meet your criteria, with a demographic breakdown.
When you click on “Application Instructions”, an explanation of the 5 different application options will appear. Which option applies depends on whether the investigator will be requesting genotyping and on whether the investigator intends to apply for VICTR funding to support their BioVU project. The first option is just for SD users (the process will now be electronic).
This is an example of the web-based application process (this particular example is for an investigator who wants to request genotyping as well as apply for VICTR funding). Investigators will need IRB approval for their project, a 5-page research proposal, and a biosketch. They will also need to sign a Data Use Agreement. Once all of these are completed, they will complete the electronic application (in REDCap), where they will upload all of their documents and complete the application information. The link to the VICTR resource request is also there. This process will not change: they will apply for VICTR funding as usual, but their application will not be processed until Shraddha receives the go-ahead that the BioVU proposal has been approved.
For assistance with BioVU project development or application submission, there are office hours, information clinics, and studios. Investigators may also apply for VICTR funding.
BioVU Application Process
  Proposal Review Process:
    1. Proposal reviewed by BioVU Review Committee
    2. BioVU program office contacts investigators with any necessary revisions, if applicable
    3. Investigator resubmits proposal if necessary
    4. Proposal approved / access granted
  Funding Decision:
    1. Investigator(s) completes VICTR Resource Request
    2. Funding request reviewed by VICTR SRC
  Total Time:
    Data Requests: 4-6 weeks
    DNA Access: 8-12 weeks
BioVU Genotyping Process
  1. Investigator selects cases and controls from Synthetic Derivative
  2. Investigator signals BioVU program to initiate sample selection
  3. BioVU notifies DNA Resources Core that samples are ready for selection and picking
  4. Samples are provided to appropriate lab and are genotyped
  5. Investigator and BioVU program receive genotype data
  6. Genotyped data analyzed by investigator
BioVU Requests
  37 Total Requests
  24 Approvals
FAQ “answers”
  SD access: “non-human subjects” IRB review (days)
  Current access costs: $4/sample
  Genotyping:
    Investigator-funded
    Consider VICTR as a funding source
    Genotyping/sequencing performed in VUMC Core Facilities
    Justification must be provided for outside genotyping, including quality control plans
    Genotype “redeposit” is part of the data use agreement
  Anticipate 16,000 BioVU subjects will have GWAS-type genotyping data by fall 2011
Questions?
  Contact: Erica Bowton, PhD
  BioVU Program Manager
  erica.bowton@vanderbilt.edu
  322-1975