Variable Penalty Dynamic Time Warping For Aligning Chromatography Data. David Clifford, Research Scientist, June 2009.



CSIRO Issues in aligning multiple - MS spectra

Talk Outline: Gas Chromatography Mass Spectrometry, examples and properties. Dynamic time warping and its origins in speech recognition. Uses in the 21st century: aligning GC-MS data. Central idea of the talk: variable penalty DTW, joint work with Glenn Stone. Results of alignment and how to do it.

Gas Chromatography: Separates a gas into its constituent parts. These elute from the machine over a period of 40 minutes. Measures quantity several times a second. Does not identify compounds. Gold standard in analytical chemistry. Slow process, expensive technology.

Uses of Gas Chromatography: wine chemistry, meat quality, metabolomic studies. The data format is similar to Liquid Chromatography-MS etc.

Goal of this talk: How can we align the two signals? How can we align many signals? Dynamic time warping can, but it overdoes the warping. Variable penalty DTW balances warping with alignment needs. The VPdtw package is now available on CRAN.

Before and After Alignment [figure]

Calling for a taxi: the system matches what you say with a database of placenames. Dynamic time warping was invented in the late 1960s and early 1970s to do this kind of matching. DTW can expand or contract your words to match placenames. DTW is a natural choice for matching speech: the speed of speech differs between individuals, and ums and ahs need to be cut out. DTW is a very fast algorithm and achieves the global optimum.
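The expand-or-contract matching described on this slide can be sketched as a small dynamic programme. This is a minimal illustration, not code from the talk or from the VPdtw package: the function name `dtw_distance`, the absolute-difference local cost, and the three-move step pattern are all assumptions chosen for simplicity.

```python
def dtw_distance(reference, query):
    """Minimum cumulative |r - q| cost over all monotone warping paths.

    Assumed step pattern: diagonal, vertical and horizontal moves,
    with both end points of the two signals forced to match.
    """
    n, m = len(reference), len(query)
    inf = float("inf")
    # cost[i][j] = best cost of aligning reference[:i] with query[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - query[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # diagonal: both signals advance
                cost[i - 1][j],      # query sample stretched over two reference samples
                cost[i][j - 1],      # two query samples squeezed onto one reference sample
            )
    return cost[n][m]
```

Because every cell is filled from its three predecessors, the value in the final cell is the global optimum over all allowed paths, at a cost proportional to the product of the two signal lengths.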

Dynamic Time Warping [figure: Reference vs. Query]

No alignment [figure: Reference vs. Query]

Alignment by Shift [figure: Reference vs. Query]

Linear Transformation (Shift and Stretch) [figure: Reference vs. Query]

Parametric Time Warping [figure: Reference vs. Query]

Symmetric Dynamic Time Warping [figure: Reference vs. Query]

Asymmetric Dynamic Time Warping [figure: Reference vs. Query]

Sakoe-Chiba DTW (bound on shift): a memory-efficient variation of DTW and a faster method. [figure: Reference vs. Query]
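A bound on the shift can be sketched by filling only the cells within a fixed distance of the diagonal and keeping a single previous row in memory. The names `dtw_banded` and `max_shift` are hypothetical, and the absolute-difference cost and step pattern are assumptions for illustration rather than the talk's implementation.

```python
def dtw_banded(reference, query, max_shift):
    """DTW cost restricted to cells with |i - j| <= max_shift.

    Only the previous row of the band is stored, so memory grows with
    max_shift rather than with the full reference-by-query matrix.
    """
    n, m = len(reference), len(query)
    inf = float("inf")
    prev = {0: 0.0}  # row 0: only the empty-to-empty alignment is free
    for i in range(1, n + 1):
        curr = {}
        first = max(1, i - max_shift)
        last = min(m, i + max_shift)
        for j in range(first, last + 1):
            local = abs(reference[i - 1] - query[j - 1])
            best = min(
                prev.get(j - 1, inf),  # diagonal move
                prev.get(j, inf),      # vertical move
                curr.get(j - 1, inf),  # horizontal move
            )
            curr[j] = local + best
        prev = curr
    # inf means no path fits inside the band
    return prev.get(m, inf)
```

With `max_shift = 0` only the diagonal is allowed, so signals of different lengths cannot be aligned and the function returns infinity.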

Dynamic Time Warping: guaranteed global optimum, but lots of non-diagonal moves. [figure: Reference vs. Query]

DTW and GC-MS: DTW overdoes the warping. Let's examine the path. [figure: Reference vs. Query]

Rotate our view: it's a complicated warp. [figure]

Paths found with two different penalties [figure]

Why do we need to care about this? Analysis is based on peak area, and overwarping will affect peak shape and area. Overwarping introduces artificial features into the data. Overwarping occurs due to too many non-diagonal moves. Solution #1: penalise non-diagonal moves. Solution #2: a variable penalty dependent on the size of peaks.
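To make the peak-area concern concrete, here is a small made-up example. The toy peak, the index paths and the helper names `warp` and `peak_area` are all hypothetical; area is approximated as a plain sum of samples at unit spacing.

```python
def warp(signal, source_index):
    """Build a warped signal by reading one source sample per output position."""
    return [signal[k] for k in source_index]


def peak_area(signal):
    """Approximate area under the curve as the sum of samples (unit spacing)."""
    return sum(signal)


peak = [0.0, 1.0, 4.0, 1.0, 0.0]

# A purely diagonal path keeps every sample exactly once.
identity = [0, 1, 2, 3, 4]

# A path with non-diagonal moves: the apex is repeated three times
# and both shoulders are dropped.
overwarped = [0, 2, 2, 2, 4]

area_before = peak_area(warp(peak, identity))
area_after = peak_area(warp(peak, overwarped))
```

Here the overwarped peak has twice the area of the original, even though both signals have the same length and the same maximum. A change of this kind would feed directly into any analysis based on peak area.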

Variable penalty DTW: Minimise over paths w [equation on slide]. Choose the penalty vector using a dilation of the signals, so that large peaks get a large penalty. Minimise this function using dynamic programming. Easy to implement. How does it compare to DTW, constant penalty DTW, and parametric time warping?
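The idea of charging for non-diagonal moves can be sketched by adding a penalty term to the usual DTW recursion. This is an illustration under stated assumptions, not the objective function shown on the slide and not the VPdtw package's algorithm: the name `penalised_dtw`, the symmetric three-move step pattern, and the choice to index the penalty by reference position are all hypothetical.

```python
def penalised_dtw(reference, query, penalty):
    """DTW cost where each non-diagonal move landing on reference
    position i pays an extra penalty[i] (assumed formulation)."""
    n, m = len(reference), len(query)
    if len(penalty) != n:
        raise ValueError("penalty needs one entry per reference sample")
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lam = penalty[i - 1]
        for j in range(1, m + 1):
            local = abs(reference[i - 1] - query[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],    # diagonal move: no penalty
                cost[i - 1][j] + lam,  # non-diagonal move: penalised
                cost[i][j - 1] + lam,  # non-diagonal move: penalised
            )
    return cost[n][m]
```

A penalty vector of all zeros recovers unpenalised DTW, and a vector with one repeated value gives constant penalty DTW. A vector that is large near peaks and small in the baseline steers the unavoidable non-diagonal moves away from the peaks.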

Key Ingredient for VPdtw: the penalty vector, proportional to a dilation of the signal. There is some subjectivity here in balancing the need for alignment against the effect on the raw signals.
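One common reading of "dilation" for a one-dimensional signal is a running maximum over a sliding window, which spreads each peak's height over its neighbourhood. The sketch below assumes that reading. The names `dilate`, `penalty_vector`, `span` and `scale` are hypothetical, and this is not the `dilation` function from the VPdtw package.

```python
def dilate(signal, span):
    """Running maximum over a window of half-width `span` (flat dilation)."""
    n = len(signal)
    return [
        max(signal[max(0, i - span): min(n, i + span + 1)])
        for i in range(n)
    ]


def penalty_vector(signal, span, scale):
    """Penalty proportional to the dilated signal; `scale` is the
    subjective tuning constant mentioned on the slide."""
    return [scale * value for value in dilate(signal, span)]
```

For example, `dilate([0, 0, 3, 0, 0, 0, 1, 0], 1)` widens the peak of height 3 to cover its two neighbours, so a non-diagonal move anywhere near that peak is penalised as heavily as one at its apex.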

Before Alignment: can't see detail but [figure]

Check Alignment #1 [figure]

Check Alignment #2 [figure]

Check Alignment #3 [figure]

How far are points moved by alignment? [figure]

VPdtw package: now on CRAN, GPL 2. Functions: VPdtw, dilation, plot.VPdtw, print.VPdtw.

    result <- VPdtw(reference, query, penalty, maxshift = 350)
    print(result)
    plot(result, "Before")
    plot(result, "After")
    plot(result, "Shifts")
    plot(result)

Many queries, one penalty. One query, many penalties. Reference can be NULL.

Comparisons: Time [figure]

Summary: Introduced GC-MS data. This talk is really about improving data quality. Improvement via alignment: without data reduction, without unnatural features, via fast computation. VPdtw is available on CRAN. Faster. Better than available alternatives.

References

DTW: Vintsyuk, T. K. Kibernetika. Sakoe, H., and Chiba, S. Proceedings of the International Congress on Acoustics, Budapest, Hungary, 1971; paper 20 c 13.

Parametric Time Warping: Eilers, P.H.C. Anal. Chem.

Alignment Using Variable Penalty Dynamic Time Warping, by Clifford, Stone, Montoliu, Rezzi, Martin, Guy, Bruce and Kochhar. Anal. Chem., 2009, 81 (3), pp 1000–1007.

Thank you. Statistical Bioinformatics - Agribusiness. David Clifford, Research Scientist, CSIRO Division of Mathematics, Informatics and Statistics.

VPdtw package: plot(result, "Before") [figure]

VPdtw package: plot(result, "After") [figure]

VPdtw package: print(result)

Reference is NULL. Query column # 13 is chosen at random.
Query matrix is made up of 16 samples of length [value missing].
Single Penalty vector supplied by user.
Max allowed shift is 150.

Columns: Cost, Overlap, Max Obs Shift, # Diag Moves, # Expanded, # Dropped. Rows: Query #1 to Query #16 [values missing].