shRNA library sequencing using DNA Sudoku. Yaniv Erlich, Hannon Lab.

Presentation transcript:

Preparing DNA libraries
Steps: programmable microarray, cloning into plasmids, transformation, arraying single colonies.
Outline: Introduction, Naïve Solutions, Chinese Pooling, Analysis, Results

The problem
Input: 40,000 bacterial colonies
Output: the sequence of the shRNA inserts
[Figure: insert type]

Motivation
- Filtering the correct fragments
- Balanced representation
- Subset selection

Clone-by-clone sequencing
Sequence each clone on a capillary platform.
Caveat: the cost is ~$40,000.
Conclusion: use next-generation sequencing.

Naïve next-gen
Workflow: pooling, Solexa sequencing, then ?? (the source clone of each read is unknown)
Conclusion: we need to add a source clone identifier (barcode).

Naive barcoding
Workflow: barcoding, pooling, Solexa sequencing.
Barcode / Sequence: 214AGTGC CTCAA TTTCG.. 88TTGAA..
Caveats:
- Order 40,000 barcodes, each ~95 nt long.
- 40,000 PCR reactions.
Conclusion: we need fewer barcodes.

Naive pooling (1)
Case #1:
[Figure: specimen array with row barcodes A B C D E F and numbered column barcodes]
Genotype ACACA is found with barcodes 5 and B.
Which specimen appears in both barcode #5 and barcode #B? Specimen #13!

Naive pooling (2)
Case #2:
[Figure: specimen array with row barcodes A B C D E F and numbered column barcodes]
Genotype ACGTT is found with barcodes 1, 2, D and E.
Is ACGTT associated with specimens #25 (D,1) and #34 (E,2)? Or maybe with the specimens at (D,2) and (E,1)? Ambiguity.
Conclusion: we should deal with shRNA 'duplicates'.
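The two cases can be reproduced with a small sketch of row/column pooling. The 8-column, row-by-row numbering is an assumption inferred from the specimen numbers quoted on the slides (#13 at (B,5), #25 at (D,1), #34 at (E,2)); the helper names `specimen_id` and `candidates` are hypothetical.

```python
from itertools import product

ROWS = "ABCDEF"   # row barcodes shown on the slide
N_COLS = 8        # assumed layout, inferred from #13=(B,5), #25=(D,1), #34=(E,2)

def specimen_id(row, col):
    """Number specimens row by row, starting at 1 (assumed numbering)."""
    return ROWS.index(row) * N_COLS + col

def candidates(row_pools, col_pools):
    """All wells consistent with a genotype seen in these row and column pools."""
    return sorted(specimen_id(r, c) for r, c in product(row_pools, col_pools))

# Case #1: one row pool and one column pool give exactly one candidate well.
case1 = candidates(["B"], [5])

# Case #2: a duplicated genotype lights up two row pools and two column pools,
# so four wells are candidates and the true pair cannot be told apart.
case2 = candidates(["D", "E"], [1, 2])

print(case1)
print(case2)
```

With a single copy of a genotype the intersection is one well; with two copies the row/column design leaves four candidate wells for two true sources, which is the ambiguity the slide points out.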

Lessons learned for the desired scheme
Features of the required encoding scheme:
- Compactness: using a small set of barcodes.
- Dealing with duplicates: every specimen should be resolved without ambiguity.
- Experimental overhead: while reducing the number of barcodes, we should also pay attention to the resources allocated to the pooling itself.
- Simple: this is not a computer program. Encoding is done by a robot and chemistry - so keep it simple.

Overview of our solution
Steps: 'Chinese' pooling, barcoding, paired-end (PE) sequencing, decoding.

The pooling design
Combinatorial pooling using the Chinese Remainder Theorem (CRT).
"I have never done anything 'useful'. No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world." (G. H. Hardy, A Mathematician's Apology, 1940)

Chinese remainder riddle
“An old woman goes to market and a horse steps on her basket and crushes the eggs. The rider offers to pay for the damages and asks her how many eggs she had brought. She does not remember the exact number, but when she had taken them out 2 at a time, there was one egg left. The same happened when she picked them out 3 and 5 at a time, but when she took them 7 at a time they came out even. What is the smallest number of eggs she could have had?”
Answer: 91 eggs
The Chinese Remainder Theorem says:
- There is a one-to-one correspondence between n (0 ≤ n < 2*3*5*7) and the residues.
- There is an easy algorithm to solve the equation system.
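The riddle can be checked numerically. The sketch below takes the moduli 2, 3, 5 and 7 from the range 0 ≤ n < 2*3*5*7 quoted on the slide, with remainder 1 for the first three and remainder 0 for 7. The function name `crt` is hypothetical, and the textbook CRT construction is used here as one possible "easy algorithm"; the slide does not say which one the author had in mind. It needs Python 3.8 or later for `math.prod` and the modular inverse via `pow`.

```python
from math import prod

def crt(residues, moduli):
    """Return the unique n in [0, prod(moduli)) with n % m == r for each pair.

    Assumes the moduli are pairwise coprime.
    """
    total = prod(moduli)
    n = 0
    for r, m in zip(residues, moduli):
        partial = total // m                    # product of the other moduli
        n += r * partial * pow(partial, -1, m)  # pow(x, -1, m) is the inverse of x mod m
    return n % total

moduli = [2, 3, 5, 7]
eggs = crt([1, 1, 1, 0], moduli)
print(eggs)

# One-to-one correspondence: every n below 2*3*5*7 has a distinct residue tuple.
residue_tuples = {tuple(n % m for m in moduli) for n in range(prod(moduli))}
print(len(residue_tuples) == prod(moduli))
```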

Pooling construction with modular equations
[Equation with labelled terms: specimen, pooling window, destination well (different plates)]
One-to-one correspondence…

Example of Chinese pooling
Source array: [figure]

Chinese Remainder Pooling Design
Inputs:
- N (number of specimens in the experiment)
- W, the weight (pooling effort)
Algorithm:
1. Find W numbers {x_1, x_2, …, x_W} such that they are:
   (a) bigger than [bound not shown]
   (b) pairwise coprime
   For instance: {5, 8, 9} but not {5, 6, 9}.
2. Generate W modular equations: [equations not shown]
3. Construct the pooling design from the modular equations.
Output: pooling design
The Chinese Remainder Theorem asserts:
(1) Two specimens will meet in no more than one pool.
(2) The number of pools: [formula not shown]
Number of barcodes: [formula not shown]
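The algorithm can be sketched in a few lines. Two details are assumptions, because the transcript does not preserve them: the lower bound in step (a) is taken to be the square root of N (which guarantees that any two windows multiply to more than N), and each modular equation is taken to send specimen n to well n mod x of the plate for window x. The function names are hypothetical.

```python
from itertools import combinations
from math import gcd, isqrt

def choose_windows(n_specimens, weight):
    """Greedily pick `weight` pairwise-coprime integers above sqrt(n_specimens).

    The sqrt bound is an assumption; the slide's bound is not preserved.
    """
    windows = []
    candidate = isqrt(n_specimens) + 1
    while len(windows) < weight:
        if all(gcd(candidate, w) == 1 for w in windows):
            windows.append(candidate)
        candidate += 1
    return windows

def pooling_design(n_specimens, windows):
    """Map each specimen n to one destination well per window (assumed: n mod x)."""
    return {n: tuple(n % x for x in windows) for n in range(n_specimens)}

def max_shared_pools(design):
    """Largest number of pools that any two distinct specimens have in common."""
    return max(
        sum(a == b for a, b in zip(design[p], design[q]))
        for p, q in combinations(design, 2)
    )

N = 24
good = pooling_design(N, [5, 8, 9])  # pairwise coprime, as on the slide
bad = pooling_design(N, [5, 6, 9])   # 6 and 9 share the factor 3

print(choose_windows(N, 3))
print(max_shared_pools(good), sum([5, 8, 9]))  # shared pools, total number of pools
print(max_shared_pools(bad))
```

With the coprime windows {5, 8, 9}, no two of the 24 specimens share more than one pool, using 5 + 8 + 9 = 22 pools in total. With {5, 6, 9}, specimens 0 and 18 land in the same well for both window 6 and window 9, so the guarantee fails.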

How good is our method?
Checked against the features of the required encoding scheme: compactness, dealing with duplicates, experimental overhead, and simplicity.

Barcode reduction
IEEE Transactions on Information Theory (1964): proved from purely combinatorial constraints, the theoretical lower bound on the number of barcodes is [formula not shown].
Our method is very close to the theoretical lower bound.

Dealing with duplicates - simulation
[Plot: probability of correct decoding versus duplicate size, for 40,000 specimens with only 384 barcodes; marked level: 0.99]

How good is our method? Experimental overhead
W=5:
- 5 lanes of Solexa
- One and a half weeks of robotics

Real results…
- Arabidopsis shRNA library with 17,000 shRNA fragments.
- Picked 40,320 bacterial colonies.
- Sequenced 3,000 colonies with capillary sequencing for comparison.
- Decoded ~20,500 bacterial colonies with correct inserts.
- 96% of the assignments were correct.
- ~8,000 unique fragments of the library.

Future directions
- Developing a more advanced decoder using a machine-learning approach
- 2-stage algorithm

DNA Sudoku: acknowledgements
Greg Hannon, Ken Chang, Michelle Rooks, Assaf Gordon, Oron Navon, and Roy Ronen