Welcome to UW-Madison, the WNPRC, and O’Connor Lab! MHC Genotyping Workshop November 7th – 11th, 2011 Madison, Wisconsin
Introductions Trainers (WNPRC Genetics Service) – Roger Wiseman – Julie Karl – Simon Lank – Gabe Starrett – Francesca Norante Participants – Wendy Garnica – Mark Garthwaite – Julie Holister-Smith – Suzanne Queen – Premeela Rajakumar – Yuko Yuki
Schedule of Events
Monday
– Welcome and Overview Presentation
– Begin bench work: cDNA synthesis & PCR (run #1)
Tuesday
– PCR product purification, quantification & pooling (run #1)
– Begin emulsion PCR (run #1)
– Begin bench work (run #2)
Wednesday
– Break & enrich DNA beads (run #1)
– Run Roche/454 GS Junior instrument (run #1)
– emPCR (run #2)
Thursday
– View run #1 results
– Continue work on run #2
– Informatics presentation
– Data analysis
Friday
– Run #2 results
– Continue Data Analysis & Wrap-up
Overview of Presentation Our lab & research focus Evolution of DNA sequencing technology Discussion of Roche/454 technology & sample multiplexing MHC genotyping method overview – NHP immunogenetics – Genotyping strategy – Workflow Genotyping results
Welcome to Madison! WNPRC
The Wisconsin National Primate Research Center (WNPRC) Only federally funded National Primate Research Center in the Midwest Center holds ~1,100 rhesus macaques, 200 marmosets, and 100 cynomolgus macaques Research strengths: – Immunogenetics & Virology – Aging & Metabolism – Reproductive & Regenerative Medicine
The O’Connor Laboratory Genetics Services Members
The O’Connor Laboratory: Research NHP immunogenetics (MHC class I, class II, KIR) – Cynomolgus Macaque (Mauritian, Indonesian, SE Asian) – Rhesus Macaque (Indian & Chinese) – Japanese Macaque, Vervet, Sooty Mangabey SIV pathogenesis (immunology) and viral evolution Human immunogenetics (HLA) and HIV variation
Sequencing Technology is Changing Micro sequencing reactions – Pyrosequencing – Single molecule sequencing Higher throughput – Millions of sequences per day Lower cost – $10,000 human genome (original HGP = $3 billion)
Sequencing Technology: Overview 1st Generation (previous): Sanger sequencing Applied Biosystems 3730xl: 1 x 10^3 reads / day, up to 1,000 bp read length
Sequencing Technology: Overview 2nd Generation (current): 454, Illumina, SOLiD, Ion Torrent Roche / 454: 1 x 10^6 reads / day, up to 800 bp read length Illumina: 2 x 10^9 reads / week, up to 200 bp read length
Sequencing Technology: Overview 3rd Generation (future): Pacific Biosciences, Nanopore sequencing, Complete Genomics Pacific Biosciences: 1 x 10^5 sequences / hour - 1,000 to 10,000 bp reads (?) - Single molecule sequencing - Goal = $1,000 genome!
Sequencing Technology: Overview 1st Generation (previous): Sanger – Slow, Expensive, Not clonal, easy to analyze 2nd Generation (current): 454, Illumina, SOLiD, Ion Torrent – Faster, Cheaper, Clonal, hard to analyze 3rd Generation (future): Pacific Biosciences, Nanopore sequencing, Complete Genomics, Helicos – Very fast, Very cheap, Impossible to analyze
Roche / 454 Sequencing How does it work?
Flowgram (instead of chromat)
O’Connor Laboratory Sequencing Sanger sequencing NHP MHC class I genotyping with E. coli based cloning and Sanger sequencing: Throughput of ~ 8 animals per week.
O’Connor Laboratory Sequencing Sanger sequencing Pilot with Roche sequencing center MHC class I genotyping pilot project: ~24 samples per week
O’Connor Laboratory Sequencing Sanger sequencing GS FLX at UIUC Pilot with Roche sequencing center MHC class I genotyping at UIUC, ~ 48 samples per week
O’Connor Laboratory Sequencing Sanger sequencing GS FLX at UIUC Pilot with Roche sequencing center Titanium pilot with Roche sequencing center MHC class I full-length sequencing project with Roche using Titanium chemistry
O’Connor Laboratory Sequencing Sanger sequencing GS Junior in lab GS FLX at UIUC Pilot with Roche sequencing center Titanium pilot with Roche sequencing center MHC class I and viral sequencing projects run in-house (> 48 samples per week)
Roche/454 Sequencing Advantages Inherently clonal (no bacterial cloning needed) Far cheaper per base than Sanger (3 – 4 orders of magnitude) Reliable read number and data regularity Easy protocol: many people trained
GS Junior 5 Month Run Summary
MHC Class I 568bp Amplicon – 9 runs: Average 70,848 HQ reads, 523 bp median length; Highest 101, Lowest 33,
SIV Whole Genome – 16 runs: Average 101,846 HQ reads, 360 bp median length; Highest 177, Lowest 42,
SIV Epitope Amplicons (Various Sizes) – 5 runs: Average 80,244 HQ reads, 369 bp median length; Highest 107, Lowest 37,066356
Ease of Use Access to instrument since Jan 2010 34 different fully-trained operators to date 7 additional people have begun training, but have not yet completed a solo run
Ultra-Deep vs. Ultra-Wide Sequencing 2nd & 3rd Generation = thousands / millions of sequences per run Cost per run is high ($1000s) Can examine polymorphic target at high depth (ultra-deep) – expensive Can sequence many samples at the same time (ultra-wide) – cheap
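To make the ultra-deep versus ultra-wide trade-off concrete, the sketch below splits one run's cost and reads evenly across multiplexed samples. The $1,000 run cost and 70,000 reads per run are illustrative assumptions, not measured values, and `per_sample_budget` is a hypothetical helper name.

```python
def per_sample_budget(run_cost, reads_per_run, n_samples):
    """Split a run's cost and reads evenly across multiplexed samples."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return run_cost / n_samples, reads_per_run // n_samples


# Assumed figures for illustration only: $1,000 per run, 70,000 reads per run.
for n in (1, 12, 48, 96):
    cost, reads = per_sample_budget(1000.0, 70_000, n)
    print(f"{n:>3} samples: ${cost:,.2f} per sample, ~{reads:,} reads per sample")
```

One sample per run gets every read (ultra-deep) but carries the full run cost; many samples per run (ultra-wide) share the cost and leave fewer reads for each.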
Ultra-Deep vs. Ultra-Wide Sequencing Significantly improves sensitivity over traditional Sanger-based sequencing (500x vs 2x coverage)
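One way to see why depth improves sensitivity: if reads sample the template population independently and sequencing error is ignored (both simplifying assumptions), the chance of seeing a variant present at frequency f at least once in N reads is 1 - (1 - f)^N. The sketch below evaluates that expression at the two coverage levels named on the slide; the 1% variant frequency is an assumed example value.

```python
def detection_probability(variant_frequency, depth):
    """Probability that at least one of `depth` independent reads carries the variant."""
    if not 0.0 <= variant_frequency <= 1.0:
        raise ValueError("variant_frequency must be between 0 and 1")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - (1.0 - variant_frequency) ** depth


# Assumed example: a variant present in 1% of templates.
for depth in (2, 500):
    p = detection_probability(0.01, depth)
    print(f"{depth}x coverage: P(variant seen at least once) = {p:.4f}")
```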
Ultra-Deep vs. Ultra-Wide Sequencing
Ultra-deep: Low frequency ARV resistance, TCR sequencing, Antibody sequencing
Ultra-wide: HLA Typing, Allele frequencies, SNP detection
Multiplexed (Ultra-wide) Amplicon Sequencing Multiplex Identifier MID Tag
Methods to increase multiplexing
1. Physically subdividing plate (gasket)
2. Sample specific MID sequence tags
3. Uniquely mixing 5’ & 3’ MID tags
Patient  MID
1  ATCGTAGTCA
2  TCCGATCGA
3  GTGTAACGT
4  CCATGGATC
5  TGGATGCAG
6  TAGTAGCCA
7  GTAGTCTAA
8  AACGATGCA
9  GCGCTAGCA
Patient  5' MID  3' MID
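A minimal sketch of method 2, assuming each read begins with its MID tag and that tags are matched exactly (real pipelines usually tolerate sequencing errors in the tag). The tag-to-patient table is copied from the slide; the `demultiplex` function and the example reads are hypothetical.

```python
# MID tags as listed on the slide (patient 1's tag is one base longer than the rest).
MID_TO_PATIENT = {
    "ATCGTAGTCA": 1,
    "TCCGATCGA": 2,
    "GTGTAACGT": 3,
    "CCATGGATC": 4,
    "TGGATGCAG": 5,
    "TAGTAGCCA": 6,
    "GTAGTCTAA": 7,
    "AACGATGCA": 8,
    "GCGCTAGCA": 9,
}


def demultiplex(reads):
    """Group reads by patient using an exact match on the leading MID tag.

    The tag is trimmed from each assigned read; unassigned reads go under None.
    """
    bins = {}
    for read in reads:
        for mid, patient in MID_TO_PATIENT.items():
            if read.startswith(mid):
                bins.setdefault(patient, []).append(read[len(mid):])
                break
        else:
            bins.setdefault(None, []).append(read)
    return bins


# Made-up reads for illustration.
print(demultiplex(["TCCGATCGAACGT", "GTGTAACGTTTT", "NNNN"]))
```

For method 3, using a different tag at each end multiplies the number of distinguishable samples: nine 5' tags combined with nine 3' tags give up to 9 x 9 = 81 unique pairs.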
O’Connor lab sequencing projects NHP comprehensive MHC genotyping & allele discovery (amplicons)
Importance of MHC Class I MHC class I molecules dictate immunity to disease High degree of polymorphism within the MHC class I peptide-binding domain Specific MHC alleles associated with superior control of HIV infection Source: modified from Yewdell et al., Nature Reviews Immunology 2003 Host Immune Genetics
NHP MHC Class I Allele Libraries Total # Alleles in GenBank Human HLA class I = 5,400 alleles
Human HLA vs NHP MHC Class I [Figure: haplotype diagrams. Human HLA class I: loci A, B, C. Nonhuman primate MHC class I: loci A1, A2, A3, A4 and B1, B2, B3, B4 ... BN]
MHC Genotyping Design 568bp amplicon captures highly variable peptide binding region flanked by conserved sequences Amplifies in multiple primate species Longer reads provide better resolution of alleles [Figure: % MHC Class I Variability by Amino Acid Position; regions labeled Leader Peptide, α1 Domain, α2 Domain, α3 Domain, Transmembrane, Cytoplasmic; F / R, 568bp Amplicon]
MHC Genotyping Design 568bp Amplicon Primer = Adapter (A or B) + MID + sequence-specific Within a single nonhuman primate sample:
MHC Genotyping Design 568bp Amplicon Primer = Adapter (A or B) + MID + sequence-specific Within an MHC class I amplicon genotyping pool:
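The primer layout above (adapter, then MID, then sequence-specific region) can be sketched as simple string concatenation. All adapter and sequence-specific strings below are placeholders, not the real Roche/454 adapters or the lab's MHC primers, and using the same MID on the forward and reverse primer is an assumption made for the example.

```python
# Placeholder sequences for illustration only (NOT real adapters or primers).
ADAPTER_A = "ACGTACGTAC"
ADAPTER_B = "TGCATGCATG"
FORWARD_SPECIFIC = "GGGGAAAACCCC"
REVERSE_SPECIFIC = "TTTTCCCCGGGG"


def fusion_primer(adapter, mid, specific):
    """Assemble a fusion primer as adapter + MID + sequence-specific region (5' to 3')."""
    return adapter + mid + specific


def primer_pairs(mids):
    """Build one forward (adapter A) and one reverse (adapter B) primer per MID."""
    return {
        mid: (
            fusion_primer(ADAPTER_A, mid, FORWARD_SPECIFIC),
            fusion_primer(ADAPTER_B, mid, REVERSE_SPECIFIC),
        )
        for mid in mids
    }


for mid, (forward, reverse) in primer_pairs(["TCCGATCGA", "GTGTAACGT"]).items():
    print(mid, forward, reverse)
```

Because every sample's amplicons carry their own MID, products from many samples can be pooled into one genotyping run and separated again after sequencing.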
Roche/454 MHC Workflow
Total RNA isolation and cDNA synthesis – RNA isolation ~4 hrs; cDNA synthesis ~2 hrs
Primary PCR amplification – plus SPRI purification, quantification, pooling ~3 hrs
emPCR – set-up ~1 hr, run ~5.5 hrs
Breaking and enrichment – ~3 hrs
GS Junior run – set-up ~1.5 hrs; run time ~10 hrs
Data processing and analysis – run processing ~2 hrs; analysis time varies
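For planning purposes, the approximate step times above can be added up. The sketch below uses the durations from the slide and leaves out analysis, whose time varies; treating the emPCR run and the GS Junior run as unattended instrument time is an assumption made here, not something the slide states.

```python
# (step, approximate hours from the slide, assumed to be unattended instrument time?)
WORKFLOW_HOURS = [
    ("RNA isolation", 4.0, False),
    ("cDNA synthesis", 2.0, False),
    ("Primary PCR, SPRI purification, quantification, pooling", 3.0, False),
    ("emPCR set-up", 1.0, False),
    ("emPCR run", 5.5, True),
    ("Breaking and enrichment", 3.0, False),
    ("GS Junior set-up", 1.5, False),
    ("GS Junior run", 10.0, True),
    ("Run processing", 2.0, False),
]


def total_hours(steps):
    """Sum the hours of every step."""
    return sum(hours for _, hours, _ in steps)


def instrument_hours(steps):
    """Sum the hours of steps flagged as instrument time."""
    return sum(hours for _, hours, unattended in steps if unattended)


total = total_hours(WORKFLOW_HOURS)
instrument = instrument_hours(WORKFLOW_HOURS)
print(f"Total: ~{total} hrs; instrument: ~{instrument} hrs; bench: ~{total - instrument} hrs")
```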
GS Junior Run Metrics – MHC
Reads per Sample [Table: Sample, MID, Read Count for each monkey sample in the run; individual values not recoverable]
Allele Calls & Transcript Profiles % Total Reads MHC Class I Alleles
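A transcript profile of this kind reports each allele's share of a sample's reads. The sketch below assumes every read has already been assigned to an allele by an upstream matching step that is not shown; `transcript_profile`, the `min_percent` cutoff, and the allele labels are hypothetical.

```python
from collections import Counter


def transcript_profile(read_assignments, min_percent=0.0):
    """Return {allele: percent of total reads}, highest first, dropping alleles below the cutoff."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {
        allele: 100.0 * n / total
        for allele, n in ranked
        if 100.0 * n / total >= min_percent
    }


# Made-up assignments for one sample (allele labels are placeholders).
reads = ["allele_1"] * 60 + ["allele_2"] * 30 + ["allele_3"] * 10
for allele, percent in transcript_profile(reads).items():
    print(f"{allele}: {percent:.1f}% of reads")
```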
Lymphocyte Specific Expression % Total Reads MHC Class I Alleles
Same methods applicable to HLA typing We have developed a similar assay to genotype human samples: HLA Class I and DRB loci Cheaper, higher-resolution, and higher-throughput than existing methods Can genotype up to 96 individuals per GS-Jr run
High Resolution HLA Genotyping [Figure: HLA class I transcript map with regions LP, α1 Domain, α2 Domain, α3 Domain, TM, CT; labels: 581-F / 1kb-R, bp, SBT (Amplicon 2)]
High-resolution Typing for 40 Reference Cell Lines
UW ID# | A* | B* | C*
HLA-Ref01 | A*31:01:02 | B*51:01:01 | C*15:02:01
HLA-Ref02 | A*32:01:01 | B*38:01:01 | C*12:03:01:01/02
HLA-Ref03 | A*02:16, A*03:01:01:01/03 | B*51:01:01 | C*07:04:01, C*15:02:01
HLA-Ref04 | A*24:02:01:01/02L, A*26:02 | B*40:06:01:01/02, B*51:01:01 | C*08:01:01, C*14:02:01
HLA-Ref05 | A*30:01:01 | B*13:02:01 | C*06:02:01:01/02
HLA-Ref06 | A*02:01:01:01/02L/03, A*02:07 | B*46:01:01 | C*01:02:01
HLA-Ref07 | A*33:03:01 | B*44:03:01 | C*14:03
HLA-Ref08 | A*30:01:01, A*68:02:01:01/02/03 | B*42:01:01 | C*1701
HLA-Ref09 | A*02:06:01, A*11:01:01 | B*15:01:01:01, B*35:01:01:01/02 | C*03:03:01, C*04:01:01:01/02/03
HLA-Ref10 | A*26:01:01 | B*08:01:01 | C*07:01:01
HLA-Ref11 | A*02:04 | B*51:01:01 | C*15:02:01
HLA-Ref12 | A*03:01:01:01/03 | B*47:01:01:01/02 | C*06:02:01:01/02
HLA-Ref13 | A*01:01:01:01 | B*57:01:01 | C*06:02
HLA-Ref14 | A*02:01:01:01/02L/03 | B*35:03:01 | C*12:03:01:01/02
HLA-Ref15 | A*02:01:01:01/02L/03 | B*35:01:01:01/02 | C*04:01:01:01/02/03
HLA-Ref16 | A*34:01:01 | B*15:21, B*15:35 | C*04:03, C*07:02:01:01/02/03
HLA-Ref17 | A*02:01:01:01/02L/03 | B*15:01:01:01 | C*03:04:01:01/02
HLA-Ref18 | A*01:01:01:01 | B*49:01:01 | C*07:01:01
HLA-Ref19 | A*25:01 | B*51:01:01 | C*01:02
HLA-Ref20 | A*30:02:01 | B*18:01:01:01 | C*05:01:01:01/02
HLA-Ref21 | A*01:01:01:01, A*02:05:01 | B*08:01:01, B*50:01:01 | C*06:02:01:01/02, C*07:01:01
HLA-Ref22 | A*01:01:01:01, A*03:01:01:01/03 | B*07:02:01, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref23 | A*01:01:01, A*02:01 | B*05:801, B*07:02 | C*07:01, C*07:02
HLA-Ref24 | A*01:01:01:01, A*24:02:01:01/02L | B*39:06:02, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref25 | A*01:01:01:01, A*01:37 | B*35:01:01:01/02, B*58:01:01 |
HLA-Ref26 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref27 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref28 | A*01:01:01:01, A*03:01:01:01/03 | B*35:01:01:01/02, B*58:01:01 | C*04:01:01:01/02/03, C*07:18 (701?)
HLA-Ref29 | A*03:01:01:01/03, A*24:02:01:01/02L | B*35:01:01:01/02, B*51:01:04 | C*04:01:01:01/02/03, C*07:04:01
HLA-Ref30 | A*02:01:01:01/02L/03, A*03:01:01:01/03 | B*07:02:01, B*37:01:01 | C*06:02:01:01/02, C*07:02:01:01/02/03
HLA-Ref31 | A*01:01:01:01, A*24:02:01:01/02L | B*39:06:02, B*58:01:01 | C*07:01:01, C*07:02:01:01/02/03
HLA-Ref32 | A*24:02:01:01/02L | B*07:02:01, B*51:01:01 | C*07:117
HLA-Ref33 | A*03:01:01:01/03 | B*07:02:01, B*35:01:01:01/02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref34 | A*03:01:01:01/03, A*24:02:01:01/02L | B*35:01:01:01/02, B*39:06:02 | C*04:01:01:01/02/03, C*07:02:01:01/02/03
HLA-Ref35 | A*02:01:01:01/02L/03, A*24:02:01:01/02L | B*07:02:01, B*13:02:01 | C*06:02:01:01/02, C*07:02:01:01/02/03
HLA-Ref36 | A*24:02:01:01/02L, A*31:01:02 | B*07:02:01, B*40:01:02 | C*03:04:01:01/02, C*07:02:01:01/02/03
HLA-Ref37 | A*02:01:01:01/02L/03, A*24:02:01:01/02L | B*15:01:01:01, B*39:06:02 | C*03:03:01, C*07:02:01:01/02/03
HLA-Ref38 | A*3402, A*7401 | B*801, B*1503 | C*02:10, C*701
HLA-Ref39 | A*2308N, A*301 | B*440301, B*5129 | C*02:02:02, C*04
HLA-Ref40 | A*02:01:01:01/02L/03, A*29:02:01 | B*35:01:01:01/02, B*44:03:01 | C*04:01:01:01/02/03, C*16:01:01
Example High-Resolution HLA Genotypes with DRB
Columns: Sample, Allele, Reads, 1kbF, 581F, 581R, 1kbR, DRB-F, DRB-R [per-amplicon read counts not recoverable; allele names shown as extracted]
HIV_114: A*36: ; A*68:01: ; B*41:02: ; B*53:01: ; C*04:01: ; C*17:01:01 (primer) ; DRB1*01:02: ; DRB1*16:02: ; DRB5*02-novel?60.
HIV_115: A*03:01: ; A*11:01: ; B*07:02: ; B*51:01: ; C*07:02: ; C*15:02: ; DRB1*04:04: ; DRB1*07:01: ; DRB4*01:01:01: ; DRB4*01:03:01:
HIV_116: A*01:01: ; A*02:01: ; B*08:01: ; B*15:01: ; C*03:04: ; C*07:01: ; DRB1*03:01: ; DRB1*04:01: ; DRB3*01:01: ; DRB4*01:03:01:
HIV_117: A*26:01: ; A*29:02: ; B*44:03:01 (putative) ; B*44:10 (putative) ; C*04:01: ; DRB1*03:01: ; DRB1*07:01: ; DRB3*02:02: ; DRB4*01:03:01:
HIV_118: A*02:01: ; A*23:01: ; B*40:01: ; B*44:03: ; C*03:04: ; C*14: ; DRB1*04:01: ; DRB1*10:01: ; DRB4*01:03:01:
HIV_119: A*29:01:01: ; A*68:01: ; B*07:05: ; B*44:02:01: ; C*05:01: ; C*15:05:01/ ; DRB1*04:04: ; DRB1*07:01: ; DRB4*01:03:01: