Watson School of Biological Sciences, Cold Spring Harbor Laboratory

Presentation transcript:

Compressed Sensing Approaches for High Throughput Carrier Screen
Yaniv Erlich, Watson School of Biological Sciences, Cold Spring Harbor Laboratory
Joint work with Noam Shental, Amnon Amir and Or Zuk

Outline
- What is a carrier screen?
- Our vision - compressed sensing carrier screen
- Unique features of our setting
- Bayesian reconstruction algorithm
- Simulations
Sections: Intro - carrier screens, CS vision, Unique features, BP solver, Simulations
Compressed sensing carrier screen

Rare recessive genetic diseases
Name:             Normal   Carrier    Affected
Phenotype:        Healthy  Healthy!   Disease
Cystic Fibrosis:  ~29/30   ~1/        %

Carrier breeding may lead to devastating results
Children of a carrier couple:
- Affected: 1:4
- Carrier: 1:2
- No carrier: 1:4

What can we do?
Several countries employ nationwide programs:
- screen the bulk population
- very limited set of genes

Carrier screen - the current mechanism
Input: Thousands of specimens.
Output: Finding carriers for rare genetic diseases.
A needle-in-a-haystack problem.
Serial processing:
- sequence: 1 region of 1 person per reaction
- expensive and does not scale

Carrier screens - our vision
Ultra-high throughput carrier screen: many specimens + many regions
- Adding more genes to the test panel while keeping the task at a tractable scale
- Increasing participation by reducing the cost

Next generation sequencers - parallel processing
Sequence 100 million DNA molecules in a single batch (~1 week).
BUT: on pooled samples, the output is only a histogram of the DNA sequence types.
How to multiplex many specimens with next generation sequencers?
Example: when pooling 4 normal specimens and 1 carrier
[Figure: fraction of reads, WT allele vs. mutant]
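
The pooling example above can be worked through numerically. The sketch below assumes that every specimen contributes the same amount of DNA, that each carrier holds one mutant and one wild-type (WT) allele, and that sequencing is error-free; the function name is hypothetical.

```python
from fractions import Fraction

def expected_mutant_fraction(n_normal, n_carriers):
    """Expected share of mutant reads in a pool of diploid specimens.

    Assumes equal DNA contribution per specimen and no sequencing errors.
    """
    total_alleles = 2 * (n_normal + n_carriers)
    mutant_alleles = n_carriers  # one mutant allele per carrier
    return Fraction(mutant_alleles, total_alleles)

mutant = expected_mutant_fraction(n_normal=4, n_carriers=1)
wild_type = 1 - mutant
print("mutant:", mutant, "WT:", wild_type)
```

Under these assumptions, one carrier among five specimens contributes 1 of the 10 alleles in the pool, so the histogram is expected to show 1/10 mutant reads and 9/10 WT reads.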

Multiplexing - the compressed sensing approach
y = Φx
CS principle: when x is sparse, very few measurements are sufficient for faithful reconstruction.
x: vector over the N specimens marking the carriers
Φ: pooling design, a 0-1 matrix with T pools (rows) and N specimens (columns)
y: the ratio of carrier reads in each pool
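
A toy instance of y = Φx may help fix the roles of the three symbols. The design matrix below is made up for illustration, and the normalization of y by twice the pool size (one mutant allele per carrier) is an assumption that the slide does not spell out.

```python
import numpy as np

# Hypothetical toy design: 4 pools (rows) x 6 specimens (columns).
phi = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
])

# Sparse signal: specimen 4 is the only carrier.
x = np.zeros(6, dtype=int)
x[4] = 1

carriers_per_pool = phi @ x          # number of carriers in each pool
pool_sizes = phi.sum(axis=1)         # number of specimens in each pool
# Assumed normalization: each carrier adds one mutant allele out of two.
carrier_read_ratio = carriers_per_pool / (2 * pool_sizes)

print(carriers_per_pool)
print(carrier_read_ratio)
```

Only the two pools that contain specimen 4 report a nonzero ratio, which is the information a decoder has to work back from.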

Distinctions from traditional CS
- 'On a budget' compressed sensing
- Not all pools were born equal
- Signal domain

On a budget compressed sensing
- Heavy weight design requires long pooling steps and higher material consumption
- Higher compression level is more prone to technical difficulties
We want a very sparse sensing matrix.
[Figure: sensing matrix Φ with specimens (N) and pools (t), annotated with weight (w) and compression level; shown against a random matrix with p=0.5]
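
To make the cost of a dense design concrete, the sketch below draws a random 0-1 matrix with p=0.5 and measures its weight. The sizes (1000 specimens, 100 pools) and the seed are arbitrary choices for illustration, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, illustrative only
n_specimens, n_pools = 1000, 100      # hypothetical experiment size

# Random sensing matrix: each entry is 1 with probability 0.5.
dense = rng.random((n_pools, n_specimens)) < 0.5

dense_weight = dense.sum(axis=0).mean()     # pools each specimen is added to
dense_pool_size = dense.sum(axis=1).mean()  # specimens mixed into each pool

print(f"average weight per specimen: {dense_weight:.1f}")
print(f"average specimens per pool:  {dense_pool_size:.1f}")
```

With these sizes every specimen has to be pipetted into roughly half of the pools, which is the pooling effort and material consumption a sparse design tries to avoid.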

Light Chinese Design
Inputs:
- N (number of specimens in the experiment)
- Weight (pooling efforts)
Algorithm:
1. Find W numbers {x_1, x_2, …, x_w} such that they are:
   (a) Bigger than …
   (b) Pairwise coprime
2. Generate W modular equations: …
3. Construct the pooling design upon the modular equations
Output: Sparse pooling design with …
Advantages:
- (w-1)-disjunct matrix
- The weight does not explicitly depend on the number of specimens
- The compression level is …
- Easy to debug
[Figure: example pooling windows, mod 6 and mod 7]
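
The three steps above can be sketched as follows. Because the formulas on the slide did not survive, two details are assumptions: the moduli are taken to be at least √N, and specimen n is assigned to pool (n mod x_i) of the i-th window, in the spirit of a Chinese-remainder construction. The function names are hypothetical.

```python
from math import gcd, isqrt
import numpy as np

def choose_moduli(n_specimens, weight):
    """Greedily pick `weight` pairwise-coprime integers >= ceil(sqrt(N)).

    The sqrt(N) lower bound is an assumption, not taken from the slide.
    """
    candidate = isqrt(n_specimens - 1) + 1  # ceil(sqrt(N)) for N >= 1
    moduli = []
    while len(moduli) < weight:
        if all(gcd(candidate, m) == 1 for m in moduli):
            moduli.append(candidate)
        candidate += 1
    return moduli

def chinese_design(n_specimens, weight):
    """Build a 0-1 pooling matrix: one block of pools per modulus."""
    moduli = choose_moduli(n_specimens, weight)
    phi = np.zeros((sum(moduli), n_specimens), dtype=int)
    offset = 0
    for m in moduli:
        for n in range(n_specimens):
            phi[offset + n % m, n] = 1  # specimen n joins pool (n mod m)
        offset += m
    return moduli, phi

moduli, phi = chinese_design(n_specimens=36, weight=2)
overlap = phi.T @ phi            # shared pools between pairs of specimens
np.fill_diagonal(overlap, 0)

print("moduli:", moduli)
print("pools x specimens:", phi.shape)
print("max pools shared by two specimens:", overlap.max())
```

For 36 specimens and weight 2 this picks the windows mod 6 and mod 7, uses 13 pools, places every specimen in exactly two pools, and lets any two specimens meet in at most one pool.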

Not all pools were born equal
The sequencer does not report the absolute number of carriers in the pool.
Instead: # carrier reads ~ # total sequence reads × (fraction of carriers in the pool / 2)
- Pools with more sequence reads and fewer carriers provide more reliable information.
- The noise is not additive; it is correlated with the content of the pool.
We need a reconstruction algorithm that takes into account the reliability of the data from each pool.
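
One way to see why pools differ in reliability is to model the carrier reads as a binomial sample. That model is an assumption made here for illustration (the slide only gives the proportionality), and the function name is hypothetical.

```python
from math import sqrt

def ratio_standard_error(total_reads, carriers, pool_size):
    """Standard error of the observed carrier-read ratio in one pool.

    Assumes carrier reads ~ Binomial(total_reads, carriers / (2 * pool_size)).
    """
    p = carriers / (2 * pool_size)
    return sqrt(p * (1 - p) / total_reads)

pool_size = 20
for reads, carriers in [(1000, 1), (10000, 1), (1000, 3)]:
    se = ratio_standard_error(reads, carriers, pool_size)
    print(f"reads={reads:6d} carriers={carriers} standard error={se:.5f}")
```

Under this model the error shrinks as the read count grows and grows with the number of carriers in the pool, so the noise level depends on the pool content rather than being a fixed additive term.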

Signal Domain
In traditional CS: …
In compressed carrier screen: …
Traditional CS decoder solves: …
What are the implications of using a traditional decoder and employing a rounding procedure?
Can we find a reconstruction procedure that directly finds …

Bayesian reconstruction algorithm
Biological expectations (biological data):
- Biologically, the genotype of one specimen does not depend on the genotype of another (unless they are relatives)
Pooling model and sequencing (pooling data):
- Only the specimens in a pool affect that pool's results
Approximation by loopy Belief Propagation…
[Figure: biological data and pooling data connected through Φ]
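
The factorization described above (an independent prior per specimen times one likelihood term per pool) can be illustrated on a tiny instance by exact enumeration. This is not loopy belief propagation; it is the brute-force posterior that BP approximates when N is too large to enumerate. The binomial read model, the small error floor, the pool layout and the read counts are all made up for this sketch.

```python
from itertools import product
from math import exp, lgamma, log

def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of k successes."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def posterior_marginals(pools, n_specimens, reads, mutant_reads,
                        prior, error=1e-3):
    """Exact P(specimen i is a carrier | data) by enumerating all genotypes."""
    configs = list(product([0, 1], repeat=n_specimens))
    log_weights = []
    for x in configs:
        # Prior: specimens are independent carriers with probability `prior`.
        lw = sum(log(prior) if xi else log(1 - prior) for xi in x)
        # Likelihood: each pool depends only on its own members.
        for members, r, k in zip(pools, reads, mutant_reads):
            p = sum(x[i] for i in members) / (2 * len(members))
            p = min(max(p, error), 1 - error)  # assumed error floor
            lw += log_binom_pmf(k, r, p)
        log_weights.append(lw)
    top = max(log_weights)
    weights = [exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [sum(w for w, x in zip(weights, configs) if x[i]) / total
            for i in range(n_specimens)]

# Hypothetical design for 6 specimens: windows mod 2 and mod 3.
pools = [[0, 2, 4], [1, 3, 5], [0, 3], [1, 4], [2, 5]]
reads = [1000] * 5
mutant_reads = [160, 2, 1, 255, 0]   # invented counts; specimen 4 is the carrier

marginals = posterior_marginals(pools, 6, reads, mutant_reads, prior=1 / 25)
print([round(m, 3) for m in marginals])
```

The output is a carrier probability per specimen, so a binary call follows directly from the posterior; enumeration costs 2^N evaluations, which is why an approximation is needed at realistic scale.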

Advantages of Belief Propagation
- Bottom-up approach: weighs the reliability of each individual pool
- Bayesian: everything speaks the same language. Can incorporate a priori medical information and familial connections.
- Encoding advantage: Chinese pooling ensures that there are no short cycles
- Binary results directly: no rounding procedure at the end

Simulations of compressed carrier screen in Ashkenazi Jews

Genetic Disorder         Carrier rate
Tay-Sachs                1:25
Cystic Fibrosis          1:30
Familial Dysautonomia    1:30
Usher Syndrome           1:40
Canavan                  1:40
Glycogen Storage         1:71
Fanconi Anemia C         1:80
Niemann-Pick             1:80
Mucolipidosis type 4     1:100
Bloom                    1:102
Nemaline Myopathy        1:108

- Finding carriers for two Ashkenazi Jewish diseases: Tay-Sachs and Bloom syndrome
- Chinese pooling design
- Comparing GPSR (traditional solver) and BP
- Evaluating N max: the largest number of specimens for which at least 48 out of 50 runs give 100% accuracy
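
The N max criterion above can be written as a small evaluation loop. The sketch assumes a hypothetical `run_trial(n, seed)` callback that simulates one screen with n specimens and returns True when every carrier call is correct; the toy callback used below is a stand-in, not a real decoder.

```python
def evaluate_n_max(candidate_sizes, run_trial, runs=50, required=48):
    """Largest candidate size for which at least `required` of `runs`
    simulated screens are decoded with 100% accuracy (None if no size passes)."""
    best = None
    for n in sorted(candidate_sizes):
        successes = sum(bool(run_trial(n, seed)) for seed in range(runs))
        if successes >= required:
            best = n
    return best

def toy_trial(n, seed):
    """Stand-in for a real simulation: fails one run in ten once n > 2000."""
    return n <= 2000 or seed % 10 != 0

n_max = evaluate_n_max([1000, 2000, 3000], toy_trial)
print("N max:", n_max)
```

In an actual study `run_trial` would generate genotypes at the disease's carrier rate, pool them with the chosen design, simulate sequencing, and compare the decoder's output with the truth.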

Results
[Figure: BP vs. GPSR for Bloom and Tay-Sachs; panel annotations: Pools/Specimens = 6.5% and Pools/Specimens = 13%]

Conclusions
- The CS framework can be utilized for ultra-high throughput carrier screens.
- Our setting shows several unique features not found in the traditional framework. We suggest tailored encoding (light Chinese) and decoding (BP) procedures.
- At least in our setting, a tailored decoder, BP, has an advantage over reconstruction with an off-the-shelf CS solver.
- CS carrier screens have the potential to dramatically reduce the cost of sequencing.

An ongoing study…
The real thing

Acknowledgements
- Greg Hannon
- Noam Shental
- Or Zuk & Amnon Amir
- Igor Carron (Nuit Blanche)
Funding: Lindsay Goldberg PhD Fellowship, ACM/IEEE-CS HPC PhD Fellowship
For more information: hannonlab.cshl.edu/labmembers/erlich

Loopy belief propagation is tricky Damping is the key DNA Sudoku
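
Damping can be illustrated with a toy fixed-point iteration. The scalar update below is not a real BP message; it is a made-up function whose undamped iteration oscillates, chosen only to show how mixing the new value with the old one restores convergence. The damping factor 0.5 is likewise an arbitrary choice.

```python
from math import exp

def update(m):
    """Toy update rule with fixed point 0.5 and a steep negative slope."""
    return 1.0 / (1.0 + exp(8.0 * (m - 0.5)))

def iterate(m, damping, steps=50):
    """Repeat m <- damping * m + (1 - damping) * update(m)."""
    for _ in range(steps):
        m = damping * m + (1.0 - damping) * update(m)
    return m

undamped = iterate(0.2, damping=0.0)
damped = iterate(0.2, damping=0.5)
print(f"undamped: {undamped:.4f}  damped: {damped:.4f}")
```

Without damping the value keeps jumping between two extremes, while the damped iteration settles at the fixed point; in loopy BP the same mixing is applied to each message between successive rounds.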

Pooling imperfections
- Background contamination
- Pooling failures (erasures)
[Figure: data from a real experiment, mod 377; # reads per pool, with pools not in use marked]

Distinctions from traditional CS
- 'On a budget' compressed sensing
- Not all pools were born equal
- Pooling imperfections
- Signal domain
