Similar Techniques For Molecular Sequencing and Network Security. Doug Madory, 27 APR 05.

Presentation transcript:

Similar Techniques For Molecular Sequencing and Network Security
Doug Madory, 27 APR 05
- Big Picture
- Protein Structure
- Sequencing using Profile HMM

Big Picture
PQS for Network Security (Us)
- Design HMM for network event
- Find event within linear stream of observed network events
- Q: Did an event happen?
Sequencing using Profile HMM (Bioinformatics)
- Train HMM using known information about subsequence
- Find subsequence within linear protein / genome sequence
- Q: If it exists, where is sequence?

Profile HMM - Simple Case
- Train HMM
- Viterbi Scoring
- Backtrace Viterbi
Query: A-, AA, TA
DB: ATA

HMM Training
Build HMM with 2 M states because there are 2 columns in the query.
[Diagram: Begin, M1, M2, End; insert states I0, I1, I2; delete states D1, D2. Each M state emits A, C, G, T.]

HMM Training, Step 1: add pseudocount to each transition and emission
[Diagram: Begin, M1, M2, End, I0, I1, I2, D1, D2]
M1 emission counts: A 1, C 1, G 1, T 1
M2 emission counts: A 1, C 1, G 1, T 1

HMM Training, Step 2: train with A-
[Diagram: Begin, M1, M2, End, I0, I1, I2, D1, D2]
M1 emission counts: A 2, C 1, G 1, T 1
M2 emission counts: A 1, C 1, G 1, T 1

HMM Training, Step 3: train with AA
[Diagram: Begin, M1, M2, End, I0, I1, I2, D1, D2]
M1 emission counts: A 3, C 1, G 1, T 1
M2 emission counts: A 2, C 1, G 1, T 1

HMM Training, Step 4: train with TA
[Diagram: Begin, M1, M2, End, I0, I1, I2, D1, D2]
M1 emission counts: A 3, C 1, G 1, T 2
M2 emission counts: A 3, C 1, G 1, T 1

HMM Training: fully trained HMM
[Diagram: Begin, M1, M2, End, I0, I1, I2, D1, D2]
M1 emission counts: A 3, C 1, G 1, T 2
M2 emission counts: A 3, C 1, G 1, T 1
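The counting steps on the training slides can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's demo code: the function names `train_match_emissions` and `to_probabilities` are hypothetical, and only match-state emissions are counted (transition counts are left out).

```python
from collections import Counter

ALPHABET = "ACGT"

def train_match_emissions(aligned_queries, pseudocount=1):
    """Count emissions per match column, starting every symbol at `pseudocount`."""
    n_columns = len(aligned_queries[0])
    counts = [Counter({s: pseudocount for s in ALPHABET}) for _ in range(n_columns)]
    for row in aligned_queries:
        for col, symbol in enumerate(row):
            if symbol != "-":  # a gap emits nothing in that column
                counts[col][symbol] += 1
    return counts

def to_probabilities(column_counts):
    """Normalize one column's counts into emission probabilities."""
    total = sum(column_counts.values())
    return {s: c / total for s, c in column_counts.items()}

# Query alignment from the slides: A-, AA, TA
m1, m2 = train_match_emissions(["A-", "AA", "TA"])
print(dict(m1), dict(m2))
print(to_probabilities(m1)["A"])  # e(A) for M1, the 3/7 used in Viterbi scoring
```

Normalizing the M1 counts gives e(A) = 3/7, which is the emission probability that appears in the Viterbi scoring slides.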

Viterbi Scoring
Observations (columns): X, A, T, A. States (rows): B/I0, M1, I1, D1, M2, I2, D2, E.
Legend: insert, match and delete moves; illegal moves.
Initialization:
$V_B = 0$
$V_{I_0}(1) = \log a_{B I_0}$
$V_{M_1}(0) = 0$
$V_{I_1}(0) = 0$
$V_{D_1}(0) = \log a_{B D_1}$

Viterbi Scoring
Observations (columns): X, A, T, A. States (rows): B/I0, M1, I1, D1, M2, I2, D2, E.
Filled so far: $V_B = 0$, $V_{M_1}(0) = 0$, $V_{I_1}(0) = 0$.
Row I0:
$V_{I_0}(2) = V_{I_0}(1) + \log a_{I_0 I_0}$
$V_{I_0}(3) = V_{I_0}(2) + \log a_{I_0 I_0}$

Viterbi Scoring
Observations (columns): X, A, T, A. States (rows): B/I0, M1, I1, D1, M2, I2, D2, E.
Filled so far: $V_B = 0$, $V_{I_0}(2) = -1.25$, $V_{I_0}(3) = -1.72$, $V_{M_1}(0) = 0$, $V_{I_1}(0) = 0$.
Match state M1, first observation (A):
$V_{M_1}(1) = \log \frac{e(A)}{q} + V_B + \log a_{B M_1}$
$V_{M_1}(1) = \log \frac{3/7}{1/4} + 0 + \log a_{B M_1}$
$V_{M_1}(1) = 0.23 - 0.17 = 0.06$
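The match-state score above can be checked numerically. The sketch below assumes the slides use base-10 logarithms, an inference from the fact that log10((3/7)/(1/4)) is about 0.23; the helper name `match_score` is hypothetical, and the transition term -0.17 is taken as printed on the slide rather than derived.

```python
import math

def match_score(emission_prob, background_prob, v_prev, log_a):
    """Log-odds emission term plus the previous score and the log transition."""
    return math.log10(emission_prob / background_prob) + v_prev + log_a

# e(A) = 3/7 from the trained M1 counts, q = 1/4 background, V_B = 0.
log_a_b_m1 = -0.17  # log transition Begin -> M1, as printed on the slide
v_m1_1 = match_score(3 / 7, 1 / 4, 0.0, log_a_b_m1)
print(round(v_m1_1, 2))
```

The emission term works out to about 0.234, so the sum rounds to the 0.06 shown on the slide.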

Viterbi Scoring
Observations (columns): X, A, T, A. States (rows): B/I0, M1, I1, D1, M2, I2, D2, E.
Filled so far: $V_B = 0$, $V_{I_0}(2) = -1.25$, $V_{I_0}(3) = -1.72$, $V_{M_1}(0) = 0$, $V_{M_1}(1) = 0.06$, $V_{I_1}(0) = 0$.
Delete state D1, first observation:
$V_{D_1}(1) = V_{I_0}(1) + \log a_{I_0 D_1}$
$V_{D_1}(1) = V_{I_0}(1) - 0.47 = -1.25$
Insert state I1, first observation (the leading 0 is the emission log-odds term):
$V_{I_1}(1) = 0 + \max \{ V_{M_1}(0) + \log a_{M_1 I_1},\ V_{I_1}(0) + \log a_{I_1 I_1},\ V_{D_1}(0) + \log a_{D_1 I_1} \}$
$V_{I_1}(1) = -0.47$
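The insert-state step takes the best of three predecessors. The sketch below shows that max in Python under loudly stated assumptions: the transition probabilities 1/6, 1/3 and 1/3 are not given on the slides and are assumed here (they are what pseudocount-only training on the three queries would suggest), base-10 logs are assumed, and `insert_score` is a hypothetical helper name. The previous scores 0, 0 and -0.78 are the values shown in the scoring table.

```python
import math

def insert_score(log_odds_emission, candidates):
    """Best (previous score + log transition) plus the emission log-odds term.

    `candidates` is a list of (previous_score, log_transition) pairs.
    """
    return log_odds_emission + max(v + log_a for v, log_a in candidates)

# ASSUMED transition probabilities (not stated on the slides).
log_a_m1_i1 = math.log10(1 / 6)
log_a_i1_i1 = math.log10(1 / 3)
log_a_d1_i1 = math.log10(1 / 3)

v_i1_1 = insert_score(
    0.0,  # insert emissions equal the background, so the log-odds term is 0
    [
        (0.0, log_a_m1_i1),    # from M1: V_M1(0) = 0
        (0.0, log_a_i1_i1),    # from I1: V_I1(0) = 0
        (-0.78, log_a_d1_i1),  # from D1: V_D1(0) = -0.78
    ],
)
print(v_i1_1)
```

Under these assumptions the I1 self-transition wins the max, giving roughly -0.477, which agrees with the -0.47 in the table up to rounding.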

Viterbi Scoring
Observations (columns): X, A, T, A. States (rows): B/I0, M1, I1, D1, M2, I2, D2, E.
Filled so far: $V_B = 0$, $V_{I_0}(2) = -1.25$, $V_{I_0}(3) = -1.72$, $V_{M_1}(0) = 0$, $V_{M_1}(1) = 0.06$, $V_{I_1}(0) = 0$, $V_{D_1}(0) = -0.78$.

Viterbi Scoring
Completed table of scores V_state(i); columns are observations, rows are states.
State | X (i=0) | A (i=1) | T (i=2) | A (i=3)
B/I0 | 0 | | -1.25 | -1.72
M1 | 0 | 0.06 | -1.19 |
I1 | 0 | -0.47 | -0.72 |
D1 | -0.78 | -1.25 | -1.72 |
M2 | 0 | -0.47 | -0.41 |
I2 | 0 | -1.85 | -1.07 |
D2 | -1.25 | -1.25 | -0.58 |
E | | | |
Legend: insert, match and delete moves.

Profile HMM - Simple Case
- Demo in Python

Big Picture Revisited
PQS for Network Security (Us)
- Design HMM for network event
- Find event within linear stream of observed network events
- Q: Did an event happen?
Sequencing using Profile HMM (Bioinformatics)
- Train HMM using known information about subsequence
- Find subsequence within linear protein / genome sequence
- Q: If it exists, where is sequence?