Overview
- A Quantum Computation Simulation Language
- Anomaly Detection in the Windows Registry
- Detecting Splice Sites in Genes
- Rotationally Invariant Face Detection

Q-HSK: A Quantum Programming Language and Compiler
Katherine Heller, Krysta Svore, Maryam Kamvar (Al Aho)

What is Q-HSK?
- A quantum computation simulation language
- A quantum compiler
- Q-HSK enables simplified programming of quantum algorithms with built-in graphics

Many Worlds Interpretation
- One formulation of quantum theory
- Each universe has a corresponding amplitude (i.e., a complex number)
- |amplitude|² = probability of existence
(diagram: a state x branching into universes u1–u4)

Qubits
- Quantum analogue of a classical bit
- Takes on the value 0, 1, or a superposition of states:
    |ω⟩ = α|0⟩ + β|1⟩, where |α|² + |β|² = 1
- Equivalently, in Bloch-angle form:
    |ω⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩
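As a quick check on these definitions, here is a minimal numpy sketch (not Q-HSK code; the angle values are arbitrary) showing that the Bloch-angle form automatically satisfies the normalization constraint:

    import numpy as np

    theta, phi = 1.2, 0.7                      # arbitrary Bloch angles
    alpha = np.cos(theta / 2)
    beta = np.exp(1j * phi) * np.sin(theta / 2)

    omega = np.array([alpha, beta])            # |w> = alpha|0> + beta|1>
    print(abs(alpha)**2 + abs(beta)**2)        # 1.0: normalization holds
    print(abs(omega[0])**2, abs(omega[1])**2)  # probabilities of measuring 0 and 1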

Quantum Gates
- Reversible – all gates are unitary operators (U†U = I)
- Universal quantum gate sets – {U(2), XOR}, Toffoli
- Some common gates – Hadamard, QFT, CNOT
- Example: H applied to |0⟩ gives 1/√2 (|0⟩ + |1⟩)
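A small numpy sketch of the Hadamard gate makes the unitarity condition and the superposition in the example concrete:

    import numpy as np

    H = np.array([[1, 1],
                  [1, -1]]) / np.sqrt(2)

    # Unitarity check: conjugate transpose times itself is the identity.
    print(np.allclose(H.conj().T @ H, np.eye(2)))  # True

    ket0 = np.array([1, 0])
    print(H @ ket0)  # [0.7071..., 0.7071...] = equal superposition of |0> and |1>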

Key Features of the Q-HSK Compiler
- Familiar C-style syntax
- Matrix operations via CBLAS
- Complex and real data types
- A quantum type, qreg
- A graphical view of quantum algorithms
- Lucid representation of qubits, registers, and gates
- Interactive user options (start, stop, pause, change animation rate)
- Detailed text output to trace the algorithm

A Simple Example

    int main() {
        int a, i;
        qreg *q;
        q = create(5);              /* allocate a 5-qubit register */
        i = 0;
        while (i < 5) {
            q[i] = (0.0, 0.0);      /* initialize each qubit */
            i = i + 1;
        }
        q = computeHadamard(q);     /* apply the Hadamard transform */
        a = Measure(q);             /* measure the register */
        printf("This is the measure: %d", a);
        return 0;
    }

(circuit diagram: q → H → M)
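For readers without the Q-HSK toolchain, the following numpy sketch mirrors what the program above appears to do, assuming the register starts in |00000⟩ and computeHadamard applies H to every qubit:

    import numpy as np

    n = 5
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

    U = np.array([[1.0]])
    for _ in range(n):                 # build H tensored with itself n times
        U = np.kron(U, H)

    state = np.zeros(2**n)
    state[0] = 1.0                     # register initialized to |00000>
    state = U @ state                  # uniform superposition over 2^n states

    probs = np.abs(state)**2           # Born rule: each outcome has prob 1/32
    outcome = np.random.choice(2**n, p=probs)  # simulate the measurement
    print(f"This is the measure: {outcome}")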

Shor's Algorithm
- Factors large numbers
- n – the number to factorize
- x – a random number
- a – ranges from 0 to q−1, where n² ≤ q ≤ 2n²
- r – the period of x^a (mod n) – exponential classically
- One factor of n is gcd(x^{r/2} − 1, n) – fast classically
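The division of labor is easy to see in code. In this sketch the period r is found by brute force, which stands in for the quantum subroutine and is only feasible because n is tiny:

    from math import gcd

    n = 15
    x = 7                              # random base, must be coprime to n
    assert gcd(x, n) == 1

    # Find the period r: the smallest r > 0 with x^r = 1 (mod n).
    # This is the step Shor's algorithm does on a quantum computer.
    r = 1
    while pow(x, r, n) != 1:
        r += 1
    print("period r =", r)             # r = 4 for x = 7, n = 15

    # Classical post-processing: if r is even and x^(r/2) != -1 (mod n),
    # gcd yields a nontrivial factor.
    if r % 2 == 0 and pow(x, r // 2, n) != n - 1:
        print("factor:", gcd(pow(x, r // 2) - 1, n))  # 3 (and 15/3 = 5)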

Graphical Interface

Architecture of the Q-HSK Compiler

    Program.q → Lexical Analyzer (lex.yy.c) → Syntax Analyzer (y.tab.c)
              → Semantic Analyzer → Translator (translate.c) → Program.cpp
              → g++ → Executable (Java graphics compiled separately with javac)

One-Class Support Vector Machines for Detecting Anomalous Windows Registry Accesses
Collaborators: Krysta Svore, Angelos Keromytis, Sal Stolfo

Host-Based Intrusion Detection Systems
- Microsoft Windows – the most frequently attacked platform
- Current methods to combat attacks: virus scanners and security patches
  - Problem: these do not combat unknown attacks, so frequent updates are needed
- Host-based IDS
  - Monitors system accesses to detect intrusions
  - An application of data mining techniques

The Windows Registry and RAD
- Windows Registry
  - Stores configuration settings for system parameters – security information, programs, etc.
  - Programs query the registry for information
- Registry Anomaly Detection (RAD): audit sensor, model generator, anomaly detector
- Example registry access record:
    Process: EXPLORER.EXE
    Query: OpenKey
    Key: HKCR\CKSUD\{B41DB860-8EE4-11D EA9FADC173CA}\shellex\MayChangeDefaultMenu
    Response: SUCCESS
    ResultValue: NOTFOUND

Probabilistic Anomaly Detection Algorithm
- Computes 25 consistency checks: P(X_i) and P(X_i | X_j)
- Multinomial with a hierarchical prior
- For observed symbols i:
    P(X = i) = C (N_i + α) / (k_0 α + N)
  where
    N – total number of observations
    N_i – number of observations of symbol i
    α – "pseudo count" for each observed symbol
    k_0 – number of observed symbols
    L – number of possible symbols
- For unobserved symbols i:
    P(X = i) = (1 − C) · 1 / (L − k_0), with C = N / (N + L − k_0)
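A direct transcription of this estimator into Python (the function name and test data are illustrative, not taken from the RAD system):

    from collections import Counter

    def pad_probability(symbol, observations, L, alpha=0.01):
        """P(X = symbol) with pseudocount smoothing over L possible symbols."""
        counts = Counter(observations)
        N = len(observations)         # total number of observations
        k0 = len(counts)              # number of distinct observed symbols
        C = N / (N + L - k0)          # probability mass kept for observed symbols
        if symbol in counts:
            return C * (counts[symbol] + alpha) / (k0 * alpha + N)
        return (1 - C) / (L - k0)     # remaining mass spread over unseen symbols

    obs = ["OpenKey", "OpenKey", "QueryValue", "CloseKey"]
    print(pad_probability("OpenKey", obs, L=10))    # frequent symbol: higher prob
    print(pad_probability("DeleteKey", obs, L=10))  # unseen symbol: small prob

Note that the two cases sum to one: the observed symbols together receive mass C and the L − k_0 unseen symbols share the remaining 1 − C.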

One-Class SVMs
- Analogous to a two-class SVM where all the data lies in the first class and the origin is the sole member of the second class
- Solve an optimization problem to find the rule f with maximal margin:
    f(x) = ⟨w, x⟩ + b
- Equivalent to solving the dual quadratic programming problem:
    min_α (1/2) Σ_{i,j} α_i α_j K(x_i, x_j)
    s.t. 0 ≤ α_i ≤ 1/(νl), Σ_i α_i = 1
- The kernel function projects input vectors into a feature space, allowing for non-linear decision boundaries:
    Φ: X → R^N, K(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩
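This formulation is available off the shelf; here is a hedged sketch using scikit-learn's OneClassSVM (its nu parameter plays the role of ν above) on synthetic data rather than registry features:

    import numpy as np
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(0)
    X_train = rng.normal(0, 1, size=(200, 5))            # "normal" behavior only
    X_test = np.vstack([rng.normal(0, 1, size=(5, 5)),   # 5 normal points
                        rng.normal(6, 1, size=(5, 5))])  # 5 anomalous points

    clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
    clf.fit(X_train)                    # trained on one class only
    print(clf.predict(X_test))          # +1 = normal, -1 = flagged anomalous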

Experiments
- Kernels:
  - Linear: K(x, y) = x·y
  - Polynomial: K(x, y) = (x·y + 1)^d
  - Gaussian: K(x, y) = e^{−‖x−y‖²/(2σ²)}
- Feature vectors:
  - Binary
  - Frequency-based
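The three kernels transcribe directly into numpy (the d and σ values below are arbitrary demo choices):

    import numpy as np

    def linear(x, y):
        return x @ y

    def polynomial(x, y, d=2):
        return (x @ y + 1) ** d

    def gaussian(x, y, sigma=1.0):
        return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

    x, y = np.array([1.0, 0.0, 2.0]), np.array([0.5, 1.0, 1.0])
    print(linear(x, y), polynomial(x, y), gaussian(x, y))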

Results

Sequence Information for the Splicing of Human Pre-mRNA Identified by Support Vector Machine Classification
Collaborators: Xiang Zhang, Ilana Hefter, Christina Leslie, Larry Chasin

What Is Splicing?
(diagram: DNA carries Exon1 – Intron – Exon2 with donor, branch, and acceptor sites marked; splicing removes the intron so the mRNA contains Exon1 joined directly to Exon2)

Pseudo Exons
- Consensus sequences:
  - Donor site: MAG|gtragt (M = A/C, r = a/g)
  - Acceptor site: (y)₁₀ncag|G (y = c/t, n = a/c/g/t)
- Donor and acceptor sites are scored by closeness to the consensus (a toy scorer is sketched below)
- Pseudo exons: intronic segments that have high-scoring "donor" and "acceptor" sites
- We look for discriminative signals in intronic regions near real and pseudo exons
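One plausible way to score closeness to a degenerate consensus is to count matching positions; this scorer is my construction for illustration, not necessarily the authors' exact scheme:

    # Map each degenerate IUPAC symbol to the bases it allows.
    IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t",
             "m": "ac", "r": "ag", "y": "ct", "n": "acgt"}

    def consensus_score(site, consensus):
        """Fraction of positions matching the degenerate consensus string."""
        assert len(site) == len(consensus)
        hits = sum(base in IUPAC[sym]
                   for base, sym in zip(site.lower(), consensus.lower()))
        return hits / len(consensus)

    # Donor consensus MAG|gtragt with the exon/intron bar removed:
    print(consensus_score("caggtaagt", "maggtragt"))  # 1.0 = perfect match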

String Kernels
- Feature map: the number of times each k-length (contiguous) string occurs in the sequence
- Dimension of the feature space is N^k (N = alphabet size)
- Example: k = 2, sequence = ACCTGGTG
    AA 0  AC 1  AG 0  AT 0
    CA 0  CC 1  CG 0  CT 1
    GA 0  GC 0  GG 1  GT 1
    TA 0  TC 0  TG 2  TT 0
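The feature map is a few lines of Python; this sketch reproduces the k = 2 table above:

    from collections import Counter
    from itertools import product

    def spectrum(seq, k=2, alphabet="ACGT"):
        """Map a sequence to its vector of k-mer counts (dimension N^k)."""
        counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
        return {"".join(kmer): counts["".join(kmer)]
                for kmer in product(alphabet, repeat=k)}

    features = spectrum("ACCTGGTG")
    print({k: v for k, v in features.items() if v})
    # {'AC': 1, 'CC': 1, 'CT': 1, 'GG': 1, 'GT': 1, 'TG': 2}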

Splice Kernels
- Hypothesis: false splice sites are intrinsically defective due to bad internal nucleotide (nt) combinations
- All possible size-k internal nt combinations are features
- Example (k = 2): if the internal combination (3g, 5a) occurs, that feature value is 1; otherwise it is 0

Recursive Feature Selection
- Normal vector to the hyperplane:
    w = Σ_{i=1..m} y_i α_i x_i
- If |w_j| is large, the jth feature is important for SVM discrimination
- Approximation for the degree-2 polynomial kernel – calculate w_up and w_down separately, then eliminate the bottom 50% of features for each
- Stop when the ROC score drops below 90% of its original value on an untouched test set
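A sketch of the elimination loop using a linear SVM, where w is available explicitly (the slide instead approximates w for the degree-2 polynomial kernel); the data here is synthetic:

    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 64))
    y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)  # 3 informative features
    X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]

    active = np.arange(X.shape[1])
    baseline = None
    while len(active) > 1:
        clf = LinearSVC(C=1.0, dual=False).fit(X_tr[:, active], y_tr)
        score = roc_auc_score(y_te, clf.decision_function(X_te[:, active]))
        if baseline is None:
            baseline = score
        elif score < 0.9 * baseline:                    # stopping rule from the slide
            break
        order = np.argsort(np.abs(clf.coef_[0]))
        active = active[order[len(active) // 2:]]       # drop bottom 50% by |w_j|
    print("surviving features:", sorted(active.tolist()))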

Results
(table: ROC and specificity, under cross-validation, for feature sets built from flanks (US/DS), splice sites (3′/5′), and exon bodies, individually and in combination)
True positives detected – splice sites: 32/37, flanks: 35/37, exon bodies: 37/…

Rotationally Invariant Face Detection Using Multi-Resolution Histograms
Collaborators: Shikher Bisaria, Tony Jebara

Face Detection
- Given a picture with faces, how do we determine where the faces are in the image? Which pixels are face pixels?
- We would like to determine this with a system that:
  - Runs in real time
  - Recognizes rotations of faces (e.g., when someone tilts their head to one side)

Gaussian Blurring
- Face images are greyscale (.pgm)
- Successive levels of blur are obtained by re-convolving the previous blur level with a two-dimensional Gaussian
- Mathematically equivalent to two passes of a one-dimensional Gaussian:
    g(i,j) = 1/(2πσ²) Σ_m Σ_n e^{−(m²+n²)/(2σ²)} · f(i−m, j−n)
           = 1/(2πσ²) Σ_m e^{−m²/(2σ²)} · Σ_n e^{−n²/(2σ²)} · f(i−m, j−n)
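The separability claim can be verified numerically with scipy; the test image here is just random noise:

    import numpy as np
    from scipy.ndimage import gaussian_filter, gaussian_filter1d

    img = np.random.default_rng(0).random((64, 64))
    sigma = 2.0

    blur_2d = gaussian_filter(img, sigma)                 # single 2-D pass
    blur_sep = gaussian_filter1d(                         # two 1-D passes,
        gaussian_filter1d(img, sigma, axis=0),            # one per axis
        sigma, axis=1)
    print(np.allclose(blur_2d, blur_sep))                 # True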

Multi-Resolution Histograms
- Histogram-equalize the image
- Concatenate the histograms of the image after successive levels of Gaussian blurring
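Putting the two steps together in a sketch (the bin count and blur levels below are arbitrary choices, not the authors' settings):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def equalize(img, bins=256):
        """Simple histogram equalization mapping intensities to [0, 1]."""
        hist, edges = np.histogram(img, bins=bins)
        cdf = np.cumsum(hist) / img.size
        return np.interp(img, edges[:-1], cdf)

    def multires_histogram(img, sigmas=(0, 1, 2, 4), bins=32):
        img = equalize(img)
        # One histogram per blur level, concatenated into a feature vector.
        hists = [np.histogram(gaussian_filter(img, s), bins=bins,
                              range=(0, 1))[0] for s in sigmas]
        return np.concatenate(hists)

    img = np.random.default_rng(0).random((64, 64))
    print(multires_histogram(img).shape)   # (128,) = 4 levels x 32 bins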

Average Histograms
- Compute the average face and average non-face multi-resolution histograms from the training set
(plots: average face histogram, average non-face histogram)

Optimization Problem

    C(α) = min_α ‖H_FAVG − h_F‖² + ‖H_NFAVG − h_NF‖²

where
    h_F  = (1 / Σ_i α_i) Σ_i α_i h_i
    h_NF = (1 / Σ_i (1 − α_i)) Σ_i (1 − α_i) h_i
such that 0 ≤ α_i ≤ 1 and Σ_i α_i = 1.

Let β_i = 1 − α_i (so Σ_i β_i = N − 1, which is where the 1/(N−1) factors below come from), and define
    Q = ⟨h_i, h_j⟩
    c_α = ⟨h_i, H_FAVG⟩ · constant
    c_β = ⟨h_i, H_NFAVG⟩ · constant

The problem becomes
    min_{α,β} α^T Q α + (1/(N−1)²) β^T Q β − 2 c_α^T α − (2/(N−1)) c_β^T β

Solve Using SMO

    α_i^new = [ (1/(N−1)²) Q_ii − (1/(N−1)²) Σ_{k≠i,j} α_k Q_jj + (1 − Σ_{k≠i,j} α_k) Q_jj
              − (1 − Σ_{k≠i,j} α_k) Q_ij + (1/(N−1)²) Σ_{k≠i,j} α_k Q_ij − (1/(N−1)²) Q_ij
              − c_α,i + c_β,i + c_α,j − c_β,j
              + Σ_{k≠i,j} α_k Q_ik − Σ_{k≠i,j} α_k Q_jk
              − (1/(N−1)²) Σ_{k≠i,j} α_k Q_ik + (1/(N−1)²) Σ_{k≠i,j} α_k Q_jk ]
            / [ Q_ii + Q_jj − 2 Q_ij + (1/(N−1)²) Q_ii + (1/(N−1)²) Q_jj − (2/(N−1)²) Q_ij ]

Bounds for α_i^new: L = 0, H = 1 − Σ_{k≠i,j} α_k
α_j^new = (1 − Σ_{k≠i,j} α_k) − α_i^new

Results