Design and Optimization of Universal DNA Arrays Ion Mandoiu CSE Department & BME Program University of Connecticut.

Slides:



Advertisements
Similar presentations
Improved Models and Algorithms for Universal DNA Tag Systems continued … a.k.a. what did we do?
Advertisements

Quality and Error Control Coding for DNA Microarrays Olgica Milenkovic ECE Department University of Colorado, Boulder IEEE Denver ComSoc.
Modeling sequence dependence of microarray probe signals Li Zhang Department of Biostatistics and Applied Mathematics MD Anderson Cancer Center.
Optimal Testing of Digital Microfluidic Biochips: A Multiple Traveling Salesman Problem R. Garfinkel 1, I.I. Măndoiu 2, B. Paşaniuc 2 and A. Zelikovsky.
Molecular Computation : RNA solutions to chess problems Dirk Faulhammer, Anthony R. Cukras, Richard J. Lipton, and Laura F. Landweber PNAS 2000; vol. 97:
Exact and Approximation Algorithms for DNA Tag Set Design
Introduction to DNA Microarrays Todd Lowe BME 88a March 11, 2003.
Evaluation of Placement Techniques for DNA Probe Array Layout Andrew B. Kahng 1 Ion I. Mandoiu 2 Sherief Reda 1 Xu Xu 1 Alex Zelikovsky 3 (1) CSE Department,
Design and Optimization of Universal DNA Arrays
Universal DNA Arrays Ion Mandoiu Computer Science & Engineering Department.
Exact and Approximation Algorithms for DNA Tag Set Design Ion Mandoiu and Dragos Trinca Computer Science & Engineering Department University of Connecticut.
Improved Tag Set Design and Multiplexing Algorithms for Universal Arrays Ion Mandoiu Claudia Prajescu Dragos Trinca Computer Science & Engineering Department.
Combinatorial Algorithms for Maximum Likelihood Tag SNP Selection and Haplotype Inference Ion Mandoiu University of Connecticut CS&E Department.
Selection of Optimal DNA Oligos for Gene Expression Arrays Reporter : Wei-Ting Liu Date : Nov
Algorithms for Biochip Design and Optimization Ion Mandoiu Computer Science & Engineering Department University of Connecticut.
Lecture 11. Matching A set of edges which do not share a vertex is a matching. Application: Wireless Networks may consist of nodes with single radios,
RNA-Seq An alternative to microarray. Steps Grow cells or isolate tissue (brain, liver, muscle) Isolate total RNA Isolate mRNA from total RNA (poly.
Exact and Approximation Algorithms for DNA Tag Set Design Ion Mandoiu and Dragos Trinca Computer Science & Engineering Department University of Connecticut.
1 Combinatorial Optimization Methods for Reliable Genomic-Based Detection Systems Ion Mandoiu University of Connecticut Computer Science & Engineering.
Accurate Method for Fast Design of Diagnostic Oligonucleotide Probe Sets for DNA Microarrays Nazif Cihan Tas CMSC 838 Presentation.
Design and Optimization of Universal DNA Arrays Ion Mandoiu Computer Science & Engineering Department University of Connecticut
Introduce to Microarray
Carmine Cerrone, Raffaele Cerulli, Bruce Golden GO IX Sirmione, Italy July
May 25, GSU Biotech Symposium1 Minimum PCR Primer Set Selection with Amplification Length and Uniqueness Constraints Ion Mandoiu University of.
Rational DNA Sequence Design for Molecular Nanotechnology Ion Mandoiu, CSE Department DNA is well-known as the carrier of information in living organisms.
APBC Improved Algorithms for Multiplex PCR Primer Set Selection with Amplification Length Constraints Kishori M. Konwar Ion I. Mandoiu Alexander.
Optimization Methods for Reliable Genomic- Based Pathogen Detection Systems K.M. Konwar, I.I. Mandoiu, A.C. Russell, and A.A. Shvartsman Computer Science.
Why microarrays in a bioinformatics class? Design of chips Quantitation of signals Integration of the data Extraction of groups of genes with linked expression.
Genomics I: The Transcriptome RNA Expression Analysis Determining genomewide RNA expression levels.
and analysis of gene transcription
Chapter 14 Jizhong Zhou and Dorothea K. Thompson.
with an emphasis on DNA microarrays
Special Topics in Genomics Lecture 1: Introduction Instructor: Hongkai Ji Department of Biostatistics
CDNA Microarrays Neil Lawrence. Schedule Today: Introduction and Background 18 th AprilIntroduction and Background 25 th AprilcDNA Mircoarrays 2 nd MayNo.
1 EE381V: Genomic Signal Processing Lecture #13. 2 The Course So Far Gene finding DNA Genome assembly Regulatory motif discovery Comparative genomics.
1 Outline Last time: –Molecular biology primer (sections ) –PCR Today: –More basic techniques for manipulating DNA (Sec. 3.8) Cutting into shorter.
Strand Design for Biomolecular Computation
Detection and Compensation of Cross- Hybridization in DNA Microarray Data Joint work with Quaid Morris (1), Tim Hughes (2) and Brendan Frey (1) (1)Probabilistic.
Development and Evaluation of a Comprehensive Functional Gene array for Environmental Studies Zhili He 1,2, C. W. Schadt 2, T. Gentry 2, J. Liebich 3,
Gene expression and DNA microarrays Old methods. New methods based on genome sequence. –DNA Microarrays Reading assignment - handout –Chapter ,
Microarray Technology
Variables: – T(p) - set of candidate transcripts on which pe read p can be mapped within 1 std. dev. – y(t) -1 if a candidate transcript t is selected,
Graph Theory And Bioinformatics Jason Wengert. Outline Introduction to Graphs Eulerian Paths & Hamiltonian Cycles Interval Graph & Shape of Genes Sequencing.
National Taiwan University Department of Computer Science and Information Engineering Haplotype Inference Yao-Ting Huang Kun-Mao Chao.
Adrian Caciula Department of Computer Science Georgia State University Joint work with Serghei Mangul (UCLA) Ion Mandoiu (UCONN) Alex Zelikovsky (GSU)
JM - 1 Introduction to Bioinformatics: Lecture III Genome Assembly and String Matching Jarek Meller Jarek Meller Division of Biomedical.
Computational Molecular Biology Non-unique Probe Selection via Group Testing.
Combinatorial Optimization Problems in Computational Biology Ion Mandoiu CSE Department.
CHAPTER SIX Nucleic acid hybridization: principles and applications 생물정보학협동과정 강민호.
Finnish Genome Center Monday, 16 November Genotyping & Haplotyping.
Taqman Technology and Its Application to Epidemiology Yuko You, M.S., Ph.D. EPI 243, May 15 th, 2008.
Lecture 6. Functional Genomics: DNA microarrays and re-sequencing individual genomes by hybridization.
Computational Molecular Biology Non-unique Probe Selection via Group Testing.
ANALYSIS OF GENE EXPRESSION DATA. Gene expression data is a high-throughput data type (like DNA and protein sequences) that requires bioinformatic pattern.
Microarrays and Other High-Throughput Methods BMI/CS 576 Colin Dewey Fall 2010.
EE150a – Genomic Signal and Information Processing On DNA Microarrays Technology October 12, 2004.
DNA Microarray Overview and Application. Table of Contents Section One : Introduction Section Two : Microarray Technique Section Three : Types of DNA.
Solution of Satisfiability Problem on a Gel-Based DNA computer Ji Yoon Park Dept. of Biochem Hanyang University.
Transcriptome What is it - genome wide transcript abundance How do you obtain it - Arrays + MPSS What do you do with it when you have it - ?
From: Duggan et.al. Nature Genetics 21:10-14, 1999 Microarray-Based Assays (The Basics) Each feature or “spot” represents a specific expressed gene (mRNA).
Introduction to Oligonucleotide Microarray Technology
DNA Fingerprinting Maryam Ahmed Khan February 14, 2001.
Detecting DNA with DNA probes arrays. DNA sequences can be detected by DNA probes and arrays (= collection of microscopic DNA spots attached to a solid.
Part 3 Gene Technology & Medicine
Relationship between Genotype and Phenotype
Solution of Satisfiability Problem on a Gel-Based DNA computer
Introduction to Bioinformatics II
DNA computing on surfaces
Relationship between Genotype and Phenotype
Presentation transcript:

Design and Optimization of Universal DNA Arrays Ion Mandoiu CSE Department & BME Program University of Connecticut

DNA Microarrays Exploit Watson-Crick complementarity to simultaneously perform a large number of substring tests Used in a variety of high-throughput genomic analyses –Transcription (gene expression) analysis –Single Nucleotide Polymorphism (SNP) genotyping –Genomic-based microorganism identification –Alternative splicing, ChIP-on-chip, tiling arrays,… Common microarray formats involve direct hybridization between labeled DNA/RNA sample and DNA probes attached to a glass slide

Universal DNA Arrays Limitations of direct hybridization formats: –Arrays of cDNAs: inexpensive, but can only be used for transcription analysis –Oligonucleotide arrays: flexible, but expensive unless produced in large quantities Universal DNA arrays: “programable” arrays –Array consists of application independent oligonucleotides –Detection carried by a sequence of reactions involving application specific primers –Flexible AND cost effective Universal array architectures: DNA tag arrays, APEX arrays, SBE/SBH arrays

+ Mix tag+primer probes with genomic DNA DNA Tag Arrays Solution phase hybridization Single-Base Extension Tag Primer Solid phase hybridization Antitag

Tag Hybridization Constraints (H1) Tags hybridize strongly to complementary antitags (H2) No tag hybridizes to a non-complementary antitag (H3) Tags do not cross-hybridize to each other t1 t2 t1t2t1 Tag Set Design Problem: Find a maximum cardinality set of tags satisfying (H1)-(H3)

Hamming distance model, e.g., [Marathe et al. 01] –Models rigid DNA strands LCS/edit distance model, e.g., [Torney et al. 03] –Models infinitely elastic DNA strands c-token model [Ben-Dor et al. 00]: –Duplex formation requires formation of nucleation complex between perfectly complementary substrings –Nucleation complex must have weight  c, where wt(A)=wt(T)=1, wt(C)=wt(G)=2 (2-4 rule) Hybridization Models

c-h Code Problem c-token: left-minimal DNA string of weight  c, i.e., –w(x)  c –w(x’) < c for every proper suffix x’ of x A set of tags is a c-h code if (C1) Every tag has weight  h (C2) Every c-token is used at most once c-h Code Problem [Ben-Dor et al.00] Given c and h, find maximum cardinality c-h code [Ben-Dor et al.00] give approximation algorithm based on DeBruijn sequences

Periodic Tags [MT05] Key observation: c-token uniqueness constraint in c-h code formulation is too strong –A c-token should not appear in two different tags, but can be repeated in a tag –Periodic tags use fewer c-tokens!  Tag set design can be cast as a cycle packing problem

c-token factor graph, c=4 (incomplete) CC AAG AAC AAAA AAAT

Cycle Packing Algorithm 1.Construct c-token factor graph G 2.T  {} 3.For all cycles C defining periodic tags, in increasing order of cycle length, Add to T the tag defined by C Remove C from G 4.Perform an alphabetic tree search and add to T tags consisting of unused c-tokens 5.Return T – Gives an increase of over 40% in the number of tags compared to previous methods

Experimental Results h

More Hybridization Constraints… Enforced during tag assignment by - Leaving some tags unassigned and distributing primers across multiple arrays [Ben-Dor et al. 03] - Exploiting availability of multiple primer candidates [MPT05] t1t2 t1

Herpes B Gene Expression Assay TmTm # pools Pool size 500 tags1000 tags2000 tags # arrays% Util.# arrays% Util.# arrays% Util TmTm # pools Pool size 500 tags1000 tags2000 tags # arrays% Util.# arrays% Util.# arrays% Util GenFlex Tags Periodic Tags

New SBE/SBH Assay T A T T A T TTGCA GATAA T A T AAACCCCA ATAGCGCT TTTGGGGT TATCGCGA Primer

SBE/SBH Throughput (c=13, r=5) See poster for more details

Conclusions and Ongoing Work Combinatorial algorithms yield significant increases in multiplexing rates of universal DNA arrays –New SBE/SBH architecture particularly promising based on preliminary simulation results Ongoing work: –Extend methods to more accurate hybridization models, e.g., use NN melting temperature models –More complex (e.g., temperature dependent) DNA tag set non-interaction requirements for DNA self/mediated assembly –Probabilistic decoding in presence of hybridization errors

Acknowledgments UCONN Research Foundation Claudia Prajescu Dragos Trinca