High-Throughput Sequencing of T-cell Receptors
PI: Harlan Robins, Fred Hutchinson Cancer Research Center
Collaborator: Adaptive TCR Technologies

Our research showing tens of thousands of shared TCRs between individuals (vs. a statistical prediction of virtually zero overlap) has sparked interest among autoimmune specialists studying type 1 diabetes (T1D), multiple sclerosis (MS), and rheumatoid arthritis, as TCRs are thought to play a causative role in autoimmune disease. Adaptive TCR’s success will further encourage the model of technology transfer from academic institutions to private industry, stimulating economic activity in the state of Washington.

Fig. 2. Increasing Plate Real Estate [One sample per lane vs. 3 samples per lane vs. 8 samples per lane]

Relevance of LSDF Goals
Discovery of a new platform technology to analyze the adaptive immune system at unprecedented depth, coupled with a commercial partner in Adaptive TCR that has already committed resources towards its commercialization, inherently promotes Washington life sciences competitiveness. From a timeline perspective, we have already initiated collaborations with the Benaroya Research Institute to develop diagnostics for autoimmune diseases (specifically T1D and MS) and have already started a project within the FHCRC to develop a protocol for cord blood transplant (CBT) clinicians to correlate drug regimens with TCR repertoire reconstitution. While the outcomes are not certain, we already have promising preliminary research on shared TCRs in T1D patients that has the potential to lead to a biomarker. The timeline to develop a laboratory-developed test (LDT) for disease is 3-4 years.
Similarly, we currently have access to a multitude of samples from cord blood transplant recipients, frozen at different time points, with which to assess the correlation between TCR repertoire reconstitution and post-transplant infections. Our study can therefore proceed rapidly without waiting on new trial patients.

Fig. 3. Primer Design

For additional information about immunoSEQ assays and the immunoSEQ Analyzer suite of bioinformatics applications at Adaptive TCR, contact us on the web at and
Work Design & Methods
Thirteen unique J primers were synthesized for each barcode. We designed ten sets of J primers, each set containing a specific barcode in all thirteen primers. The barcodes consist of a six-base-pair sequence, consistent with Illumina’s previous protocol. Our standard primers include 33 bp of J segment sequence and 25 bp of adapter sequence, so this six-base-pair sequence is unlikely to substantially alter the hybridization kinetics of the PCR primers. To minimize this risk, we used barcodes with consistent GC content. Illumina’s barcodes are robust to as many as two sequence errors in the six-base index, allowing for confident assignment of each sequence back to its original indexed library.

We developed 10 unique nucleotide tags that were incorporated into sample libraries so that we could pool multiple samples into a single lane of the sequencer (see Fig. 4). We generated tagged libraries of previously sequenced samples to ensure that the introduction of the nucleotide tags did not alter the results obtained. This allowed us to divide the 20 million reads per lane among 10 different samples, so that our technology can target applications that require fewer reads per sample, increasing our scale of analysis and reducing sequencing costs. We also extended our suite of software to automatically decode the indexes and to rapidly query the resulting data.

Our basic method for tagging the libraries with the appropriate barcode is to adapt the current methodology to our assay by inserting the barcode into the J segment PCR primers, between the existing adapter sequence for the solid-phase amplification and the J segment-specific sequence of each of the thirteen primers (Fig. 5).

Fig. 4. Raw Copy Counts of Total TCR Sequences Obtained Using Several Barcodes

Fig. 5. J Segment Usage is Not Barcode-Dependent
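The primer layout described above (adapter, then barcode, then J segment-specific sequence) can be sketched in a few lines of Python. This is a minimal illustration, not the actual primer design: every sequence below is a randomly generated placeholder, and the helper names (`gc_fraction`, `pick_barcodes`, `build_primer_sets`) are hypothetical. Only the lengths (25 bp adapter, 6 bp barcode, 33 bp J segment) and the counts (ten barcodes, thirteen J primers) are taken from the text. "Consistent GC content" is interpreted here, as an assumption, to mean the same number of G/C bases in every barcode.

```python
"""Sketch of barcoded J-primer assembly: adapter + barcode + J segment.

All sequences are made-up placeholders, not real primer sequences.
"""
import random

ADAPTER_LEN, BARCODE_LEN, J_LEN = 25, 6, 33  # lengths stated in the text
N_BARCODES, N_J_PRIMERS = 10, 13             # counts stated in the text


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def random_seq(rng, length):
    """Placeholder sequence generator (stands in for real designs)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def pick_barcodes(rng, n, length, gc_count):
    """Draw n distinct barcodes, each with exactly gc_count G/C bases."""
    chosen = set()
    while len(chosen) < n:
        candidate = random_seq(rng, length)
        if sum(base in "GC" for base in candidate) == gc_count:
            chosen.add(candidate)
    return sorted(chosen)


def build_primer_sets(adapter, barcodes, j_segments):
    """One primer set per barcode; the barcode sits between adapter and J."""
    return {bc: [adapter + bc + j for j in j_segments] for bc in barcodes}


rng = random.Random(0)
adapter = random_seq(rng, ADAPTER_LEN)
j_segments = [random_seq(rng, J_LEN) for _ in range(N_J_PRIMERS)]
barcodes = pick_barcodes(rng, N_BARCODES, BARCODE_LEN, gc_count=3)
primer_sets = build_primer_sets(adapter, barcodes, j_segments)

total_primers = sum(len(primers) for primers in primer_sets.values())
primer_length = len(primer_sets[barcodes[0]][0])
print(total_primers, primer_length)  # 130 primers, each 64 bp long
```

With ten barcode sets of thirteen primers each, the sketch yields 130 primers of 64 bp (25 + 6 + 33), which makes it easy to see how small the barcode is relative to the rest of the primer.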
Milestones
Design and order index primer sets (completed 8/15/10)
Generate index libraries (completed 10/31/10)
Quantify library production (completed 11/15/10)
Sequence initial libraries (completed 2/1/11)
Assess bias (completed 3/1/11)
Sorting algorithm (completed 3/1/11)
Sequence multiplexed samples (completed 5/1/11)
Scale error correction (completed 5/1/11)
Evaluate multiplexing results (completed 6/30/11)
Annual meeting with technology transfer (completed 7/31/11)

Fig. 6. V Segment Usage is Not Barcode-Dependent

Outcomes & Future Plans
The next step in the commercialization pathway is to optimize the workflow to facilitate taking multiple orders of varying resolution on a single flow cell. Once the workflow incorporating the multiplexing assay is running smoothly, we intend to continue business development efforts to bring in additional revenue. Revenue from the service business will further fund research and development efforts on diagnostics for autoimmune disease, post-cord blood transplant measurement protocols, and efficacy trials for vaccine development. Our end goal is more data at a lower cost with faster results.
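The trade-off between reads per sample and samples per lane can be made concrete with simple arithmetic. The sketch below uses figures stated elsewhere on this poster (roughly 20 million reads per lane, eight lanes per flow cell, ten available barcodes); the function names and the two example orders are hypothetical, and the sketch assumes, for simplicity, that different orders do not share a lane.

```python
import math

READS_PER_LANE = 20_000_000  # approximate figure from the text
LANES_PER_FLOW_CELL = 8      # lanes on the sequencer described in the text
MAX_BARCODES = 10            # number of nucleotide tags developed


def samples_per_lane(reads_per_sample):
    """How many samples of a given depth fit into one lane."""
    if reads_per_sample > READS_PER_LANE:
        raise ValueError("a single sample would need more than one lane")
    return min(MAX_BARCODES, READS_PER_LANE // reads_per_sample)


def lanes_needed(n_samples, reads_per_sample):
    """Lanes required for one order (orders are not mixed within a lane)."""
    return math.ceil(n_samples / samples_per_lane(reads_per_sample))


# Hypothetical orders: (number of samples, requested reads per sample)
orders = {"deep": (4, 20_000_000), "survey": (25, 2_000_000)}
lanes = {name: lanes_needed(n, depth) for name, (n, depth) in orders.items()}
fits_on_one_flow_cell = sum(lanes.values()) <= LANES_PER_FLOW_CELL
print(lanes, fits_on_one_flow_cell)
```

In this example the four deep samples occupy four lanes and the 25 survey samples, pooled ten per lane at about 2 million reads each, occupy three more, so both orders fit on one eight-lane flow cell.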
Specific Objectives
Over the past two years, our team at the Fred Hutchinson Cancer Research Center (FHCRC) has invented a method to sequence T-cell receptor (TCR) genes at very high throughput. Two of the three inventors of the technology founded a private company, Adaptive TCR Corporation (Adaptive TCR or the Company), which currently offers TCR sequencing as a service to the academic and pharmaceutical research communities. The applications of this technology are broad and far-reaching. Generally, researchers and clinicians interested in how the TCR repertoire relates to their field (e.g., the role TCRs are playing, or altering the course of clinical treatment) now have a tool to measure and analyze their areas of study.
More specifically, a clinical application of this technology has the potential to improve treatment decisions for patients receiving bone marrow or cord blood transplants, as well as patients who suffer from autoimmune disorders. Additionally, the technology gives pharmaceutical companies a way to measure the efficacy of vaccines in vaccine development trials.

Our objectives are to:
1. Scale up our existing methodology by implementing a multiplexing assay
2. Extend our software pipeline and analysis tools to include the multiplexing readout and subdivided samples

Multiplexing will reduce the price point for researchers desiring TCR sequencing services, providing the capability to a broader scientific community. Our current technology provides approximately 20 million sequence reads for each sample, with each sample comprising one lane out of eight on the Illumina Genome Analyzer IIx sequencer. Increasing the robustness of our suite of software applications to accommodate multiplexing renders the vast amounts of data accessible and useful to the client.

Significance of Work
Dr. Robins and the team have already developed the technology to sequence millions of unique TCRs in parallel. The multiplexing technology under development will drastically reduce the cost of the methodology to $500 per sample, greatly enhancing market penetration within the research community. Fig. 1 shows an amplified and sequenced TCRB sequence using the immunoSEQ assay.

Fig. 1. Assay

Our methodology provides a 1000-fold increase in TCR resolution vs. existing technologies, giving researchers the capability to measure millions of TCR sequences from each blood sample and learn the relative abundance of the different clonotypes, answering questions that were not previously tractable.
We are working to analyze T1D and MS samples using TCR profiling and have entered discussions with large pharmaceutical companies interested in utilizing TCR sequencing services in the context of vaccine development.
Adaptive TCR Technologies
Suite Westlake Ave N
Seattle, WA

A. TCRB primers (red and yellow) are tagged with universal capture sequences (orange and green) to amplify a library of 200 bp CDR3 (gray) fragments. A barcode (blue), unique to each DNA sample being amplified, is inserted in the PCR primer.
B. Following capture and solid-phase amplification of the TCR library on the surface of a flow cell, 60 bp of CDR3 sequence is captured using TCRB sequencing primers (red).
C. A second round of sequencing captures 6 bp of barcode (blue), after priming with a universal primer (orange). The barcode is associated with the TCR sequence by its physical location on the flow cell.
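The pairing and decoding steps in panels B and C can be sketched as follows. This is an illustration under stated assumptions, not the immunoSEQ software: clusters are identified by a hypothetical (tile, x, y) coordinate, the barcodes, sample names, and reads are invented toy data, and the mismatch threshold is a parameter whose default of 1 is chosen only for the example. A read is assigned to a sample only when exactly one barcode lies within the threshold, so ambiguous index reads are left unassigned.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, barcode_to_sample, max_mismatches=1):
    """Return the sample whose barcode uniquely matches, else None."""
    hits = [sample for barcode, sample in barcode_to_sample.items()
            if hamming(index_read, barcode) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


def demultiplex(cdr3_reads, index_reads, barcode_to_sample, max_mismatches=1):
    """Pair each CDR3 read with the index read from the same cluster
    location, then group CDR3 reads by decoded sample."""
    by_sample = defaultdict(list)
    for cluster, cdr3 in cdr3_reads.items():
        index_read = index_reads.get(cluster)
        if index_read is None:
            continue  # no barcode read was recorded for this cluster
        sample = assign_sample(index_read, barcode_to_sample, max_mismatches)
        if sample is not None:
            by_sample[sample].append(cdr3)
    return dict(by_sample)


# Toy data: hypothetical barcodes and shortened placeholder reads.
barcode_to_sample = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
cdr3_reads = {
    (1, 10, 20): "TGTGCCAGCAGT",
    (1, 11, 25): "TGTGCCAGCTCA",
    (1, 30, 40): "TGTGCCAGCGGA",
    (2, 5, 5): "TGTGCCAGCAAA",
    (2, 6, 6): "TGTGCCAGCCCC",
}
index_reads = {
    (1, 10, 20): "ACGTAC",  # exact match to sample_A
    (1, 11, 25): "ACGTAA",  # one mismatch from sample_A
    (1, 30, 40): "TGCATG",  # exact match to sample_B
    (2, 5, 5): "GGGGGG",    # too far from every barcode
}
result = demultiplex(cdr3_reads, index_reads, barcode_to_sample)
print(result)
```

Keying both read sets on the cluster location mirrors the statement in panel C that the barcode is associated with the TCR sequence by its physical position on the flow cell. Raising `max_mismatches` is only safe when the barcode set is far enough apart that no index read can fall within the threshold of two barcodes at once.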