Published by Chester Clarke. Modified over 9 years ago.
1
Overview
A Quantum Computation Simulation Language
Anomaly Detection in the Windows Registry
Detecting Splice Sites in Genes
Rotationally Invariant Face Detection
2
Q-HSK: A Quantum Programming Language and Compiler
Katherine Heller, Krysta Svore, Maryam Kamvar (Al Aho)
3
What is Q-HSK?
Quantum Computation Simulation Language
Quantum Compiler
Q-HSK enables simplified programming of quantum algorithms with built-in graphics
4
Many Worlds Interpretation
One formulation of quantum theory
Each universe has a corresponding amplitude (i.e., a complex number)
$|\text{amplitude}|^2$ = probability of existence
[Figure: diagram with labels x, u1, u2, u3, u4]
5
Qubits
Quantum analogue of a classical bit
Takes on values 0, 1, or a superposition of states:
  $|\omega\rangle = \alpha|0\rangle + \beta|1\rangle$, where $|\alpha|^2 + |\beta|^2 = 1$
  $|\omega\rangle = \cos(\theta/2)|0\rangle + e^{i\varphi}\sin(\theta/2)|1\rangle$
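The two forms of the qubit state above can be checked numerically. The following is a minimal sketch, assuming the angle form is the usual Bloch-sphere parameterization; the function names are illustrative and do not come from Q-HSK.

```python
import cmath
import math


def qubit_from_angles(theta, phi):
    """Return (alpha, beta) for cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    alpha = complex(math.cos(theta / 2), 0.0)
    beta = cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha, beta


def measurement_probabilities(alpha, beta):
    """Squared magnitudes of the amplitudes: probabilities of reading 0 and 1."""
    return abs(alpha) ** 2, abs(beta) ** 2


# theta = pi/2 gives an equal superposition of |0> and |1>.
alpha, beta = qubit_from_angles(math.pi / 2, 0.0)
p0, p1 = measurement_probabilities(alpha, beta)
```

For any choice of theta and phi the two probabilities sum to 1, which is the normalization condition on alpha and beta.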
6
Quantum Gates
Reversible: all are unitary operators ($U^{\dagger} U = I$)
Universal quantum gates: {U2, XOR}, Toffoli
Some common gates: Hadamard, QFT, CNOT
[Figure: Hadamard gate H, with labels $|1\rangle$, $|0\rangle$, and $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$]
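As an illustration of the unitarity condition and of the Hadamard gate, here is a small sketch in plain Python. The helper names are hypothetical; a 2x2 list of lists stands in for the gate matrix.

```python
import math

# Hadamard gate as a 2x2 matrix.
H = [[1 / math.sqrt(2), 1 / math.sqrt(2)],
     [1 / math.sqrt(2), -1 / math.sqrt(2)]]


def apply_gate(gate, state):
    """Multiply a gate matrix by a state vector."""
    return [sum(gate[r][c] * state[c] for c in range(len(state)))
            for r in range(len(gate))]


def dagger(gate):
    """Conjugate transpose of a square matrix."""
    n = len(gate)
    return [[complex(gate[c][r]).conjugate() for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def is_unitary(gate, tol=1e-12):
    """Check that (gate^dagger)(gate) equals the identity within tol."""
    prod = matmul(dagger(gate), gate)
    n = len(gate)
    return all(abs(prod[r][c] - (1 if r == c else 0)) < tol
               for r in range(n) for c in range(n))


plus = apply_gate(H, [1, 0])     # H|0> = (|0> + |1>)/sqrt(2)
back = apply_gate(H, plus)       # applying H again returns |0>
```

Applying H twice recovers the input state, which is one concrete way to see that the gate is reversible.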
7
Key Features of the Q-HSK Compiler
Familiar C-style syntax
Matrix operations via CBLAS
Complex and real data types
A quantum type qreg
A graphical view of quantum algorithms
Lucid representation of quantum qubits, registers, and gates
Interactive user options (start, stop, pause, change animation rate)
Detailed text output to trace algorithm
8
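A Simple Example

    int main( )
    {
        int a, i;
        qreg *q;

        q = create(5);              /* create a register of size 5 */
        i = 0;
        while (i < 5) {
            q[i] = (0.0, 0.0);      /* initialize each element */
            i = i + 1;
        }
        q = computeHadamard(q);     /* apply the Hadamard gate (H) */
        a = Measure(q);             /* measure the register (M) */
        printf("This is the measure: %d", a);
        return 0;
    }

[Figure: circuit showing register q with five entries set to 0, followed by H and M]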
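For readers without the Q-HSK compiler, the following Python sketch mimics what the example appears to do under standard quantum semantics: five qubits start in 0, a Hadamard is applied to each, and the register is measured. It is an assumption-laden illustration, not Q-HSK's implementation; `hadamard_all` and `measure` are hypothetical names.

```python
import math
import random


def hadamard_all(n_qubits):
    """State vector after applying H to every qubit of |00...0>.

    The result is the uniform superposition over all 2**n_qubits basis states.
    """
    dim = 2 ** n_qubits
    return [1 / math.sqrt(dim)] * dim


def measure(state, rng):
    """Sample a basis-state index with probability |amplitude|**2."""
    probs = [abs(amplitude) ** 2 for amplitude in state]
    return rng.choices(range(len(state)), weights=probs, k=1)[0]


rng = random.Random(0)        # fixed seed so the run is repeatable
state = hadamard_all(5)
a = measure(state, rng)
print("This is the measure:", a)
```

Under these assumptions every integer from 0 to 31 is equally likely to be printed.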
9
Shor’s Algorithm
Factors large numbers
n: number to factorize
x: random number
a: ranges from 0 to q-1
$n^2 \le q \le 2n^2$
r: period of $x^a \pmod{n}$ (exponential time classically)
One factor of n is $\gcd(x^{r/2} - 1, n)$ (fast classically)
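The classical post-processing step can be shown on a toy number. This sketch finds the period by brute force, which is exactly the part that is exponential classically and that the quantum algorithm replaces; the function names are illustrative.

```python
import math


def classical_period(x, n):
    """Smallest r > 0 with x**r = 1 (mod n), found by brute force."""
    if math.gcd(x, n) != 1:
        raise ValueError("x must be coprime to n for a period to exist")
    r, value = 1, x % n
    while value != 1:
        value = (value * x) % n
        r += 1
    return r


def factor_from_period(x, n):
    """Return gcd(x**(r/2) - 1, n) if it is a nontrivial factor, else None."""
    r = classical_period(x, n)
    if r % 2 != 0:
        return None          # odd period: pick another x
    candidate = math.gcd(pow(x, r // 2) - 1, n)
    return candidate if 1 < candidate < n else None


r = classical_period(7, 15)        # 7, 4, 13, 1 -> period 4
factor = factor_from_period(7, 15)  # gcd(7**2 - 1, 15) = gcd(48, 15) = 3
```

When the function returns None, the usual remedy is to draw a different random x and try again.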
10
Graphical Interface
11
Architecture of Q-HSK Compiler
[Figure: compiler pipeline diagram. Stages shown: Program.q, Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Translator, Program.cpp, g++, Java Graphics, Executable. Files and tools shown: lex.yy.c, y.tab.c, translate.c, javac]
12
One Class Support Vector Machines for Detecting Anomalous Windows Registry Accesses
Collaborators: Krysta Svore, Angelos Keromytis, Sal Stolfo
13
Host Based Intrusion Detection Systems
Microsoft Windows: most often attacked
Current method to combat attacks
  Virus scanners and security patches
  Problem: these do not combat unknown attacks, so frequent updates are needed
Host based IDS
  Monitor system accesses to detect intrusions
  Application of data mining techniques
14
The Windows Registry and RAD
Windows Registry
  Stores configuration settings for system parameters: security information, programs, etc.
  Programs query the registry for information
Registry Anomaly Detection (RAD)
  Audit sensor
  Model generator
  Anomaly detector
Example registry access record:
  Process: EXPLORER.EXE
  Query: OpenKey
  Key: HKCR\CKSUD\{B41DB860-8EE4-11D2-9906-EA9FADC173CA}\shellex\MayChangeDefaultMenu
  Response: SUCCESS
  ResultValue: NOTFOUND
15
Probabilistic Anomaly Detection Algorithm
Computes 25 consistency checks: $P(X_i)$ and $P(X_i \mid X_j)$
Multinomial with Hierarchical Prior
For observed elements i:
  $P(X = i) = C \cdot \frac{N_i + \alpha}{k_0 \alpha + N}$
where
  $N$: total number of observations
  $N_i$: number of observations of symbol i
  $\alpha$: "pseudo count" for each observed symbol
  $k_0$: number of observed symbols
  $L$: number of possible symbols
For unobserved elements i:
  $P(X = i) = (1 - C) \cdot \frac{1}{L - k_0}$
  $C = \frac{N}{N + L - k_0}$
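The estimator above is short enough to write out directly. The sketch below follows the slide's formulas; the default pseudo count `alpha=1.0` and the function name are assumptions, since the slide gives no value.

```python
from collections import Counter


def hierarchical_multinomial(observations, num_possible, alpha=1.0):
    """Return a function prob(symbol) following the slide's estimator.

    observations : list of observed symbols (N = len(observations))
    num_possible : L, the number of possible symbols
    alpha        : pseudo count per observed symbol (assumed value)
    """
    counts = Counter(observations)
    n_total = len(observations)
    k0 = len(counts)                                  # distinct observed symbols
    c = n_total / (n_total + num_possible - k0)       # C = N / (N + L - k0)

    def prob(symbol):
        if symbol in counts:
            return c * (counts[symbol] + alpha) / (k0 * alpha + n_total)
        unseen = num_possible - k0
        return (1 - c) / unseen if unseen > 0 else 0.0

    return prob


prob = hierarchical_multinomial(["a", "a", "b"], num_possible=5)
p_a = prob("a")        # 0.5 * (2 + 1) / (2 + 3) = 0.3
p_b = prob("b")        # 0.5 * (1 + 1) / (2 + 3) = 0.2
p_new = prob("z")      # (1 - 0.5) / (5 - 2)
```

With two observed symbols and three unobserved ones, the five probabilities sum to 1, so the estimate is a proper distribution over all L symbols.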
16
One Class SVMs
Analogous to a two class SVM where all data lies in the first class and the origin is the sole member of the second class
Solve an optimization problem to find the rule f with maximal margin:
  $f(x) = \langle w, x \rangle + b$
Equivalent to solving the dual quadratic programming problem:
  $\min_{\alpha} \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)$ s.t. $0 \le \alpha_i \le \frac{1}{\nu l}$, $\sum_i \alpha_i = 1$
The kernel function implicitly maps input vectors into a feature space, allowing for non-linear decision boundaries:
  $\Phi: X \to \mathbb{R}^N$
  $K(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$
17
Experiments
Kernels:
  Linear: $K(x, y) = (x \cdot y)$
  Polynomial: $K(x, y) = (x \cdot y + 1)^d$
  Gaussian: $K(x, y) = e^{-\|x - y\|^2 / (2\sigma^2)}$
Feature Vectors:
  Binary
  Frequency-based
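The three kernels translate directly into code. This is a minimal sketch on plain Python lists; the default values `d=2` and `sigma=1.0` are placeholders, not the settings used in the experiments.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, d=2):
    return (dot(x, y) + 1) ** d


def gaussian_kernel(x, y, sigma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


# Two binary feature vectors of the kind listed on the slide.
u = [1, 0, 1, 1]
v = [1, 1, 0, 1]
k_lin = linear_kernel(u, v)          # 2 shared features
k_poly = polynomial_kernel(u, v)     # (2 + 1)**2 = 9
k_gauss = gaussian_kernel(u, u)      # identical vectors give 1.0
```

On binary vectors the linear kernel simply counts shared features, while the Gaussian kernel depends only on how many features differ.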
18
Results
19
Sequence Information for the Splicing of Human Pre-mRNA Identified by Support Vector Machine Classification
Collaborators: Xiang Zhang, Ilana Hefter, Christina Leslie, Larry Chasin
20
What Is Splicing?
[Figure: DNA with Exon1, Intron, and Exon2, marked with the donor, branch, and acceptor sites; the mRNA contains Exon1 and Exon2 joined together]
21
Pseudo Exons
Consensus Sequences
  Donor site: MAG|gtragt (M = A/C, r = a/g)
  Acceptor site: $(y)_{10}$ncag|G (y = c/t, n = a/c/g/t)
  Donor and acceptor sites are scored based on closeness to consensus
Identifying Pseudo Exons
  Intronic segments
  Have high scoring "donor" and "acceptor" sites
We look for discriminative signals in intronic regions near real and pseudo exons
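The slide does not say how closeness to consensus is scored. As a stand-in, the sketch below uses the simplest possible measure, the fraction of positions that match the donor consensus; this is an assumption for illustration and is likely cruder than the scoring actually used.

```python
# Donor consensus from the slide, with the exon/intron boundary marker removed.
DONOR_CONSENSUS = "MAGgtragt"

# Ambiguity codes defined on the slide (M = A/C, r = a/g, y = c/t, n = any).
CODES = {"m": "ac", "r": "ag", "y": "ct", "n": "acgt"}


def matches(code, base):
    """True if a nucleotide is allowed by a consensus symbol (case-insensitive)."""
    code, base = code.lower(), base.lower()
    return base in CODES.get(code, code)


def consensus_score(site, consensus=DONOR_CONSENSUS):
    """Fraction of positions in `site` that agree with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    hits = sum(matches(c, b) for c, b in zip(consensus, site))
    return hits / len(consensus)


perfect = consensus_score("CAGgtaagt")     # matches every position
one_off = consensus_score("TAGgtaagt")     # first base violates M = A/C
```

A pseudo exon in this picture is an intronic segment whose flanking sites score highly under such a measure even though it is not spliced.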
22
String Kernels
Feature map: number of times each k-length (contiguous) string occurs in the sequence
Dimension of feature space is $N^k$
Example: k = 2, Sequence = ACCTGGTG

  AA 0   AC 1   AG 0   AT 0
  CA 0   CC 1   CG 0   CT 1
  GA 0   GC 0   GG 1   GT 1
  TA 0   TC 0   TG 2   TT 0
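The feature map in the example can be reproduced with a few lines of Python. This is a sketch with illustrative function names; the kernel value is taken to be the inner product of the two count vectors.

```python
from collections import Counter
from itertools import product


def kmer_features(sequence, k, alphabet="ACGT"):
    """Count every contiguous k-length string over the alphabet in `sequence`."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {"".join(p): counts.get("".join(p), 0)
            for p in product(alphabet, repeat=k)}


def kmer_kernel(s, t, k):
    """Inner product of the k-mer count vectors of two sequences."""
    fs, ft = kmer_features(s, k), kmer_features(t, k)
    return sum(fs[key] * ft[key] for key in fs)


features = kmer_features("ACCTGGTG", 2)
self_similarity = kmer_kernel("ACCTGGTG", "ACCTGGTG", 2)
```

For a four-letter alphabet and k = 2 the feature vector has 16 entries, matching the table in the example.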
23
Splice Kernels
Hypothesis: false splice sites are intrinsically defective due to bad internal nt combinations
All possible size k internal nt combinations are features
Example (k = 2): if the internal combination (3g, 5a) occurs, that feature value is 1, otherwise it is 0
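One way to read the feature definition is as the set of all position/nucleotide combinations of size k that are present in a site. The sketch below enumerates them, assuming 1-based positions as in the (3g, 5a) example; the function name is hypothetical.

```python
from itertools import combinations


def internal_combination_features(site, k=2):
    """Return the set of size-k (position, nucleotide) combinations in `site`.

    Each feature is a tuple such as ("3g", "5a"); features in the returned set
    have value 1 and all other combinations have value 0.
    """
    site = site.lower()
    return {
        tuple(f"{pos}{site[pos - 1]}" for pos in positions)
        for positions in combinations(range(1, len(site) + 1), k)
    }


features = internal_combination_features("ctgaa", k=2)
has_3g_5a = ("3g", "5a") in features     # position 3 is g and position 5 is a
```

A site of length n has one active feature for each of its C(n, k) position subsets, so a 5-nucleotide site with k = 2 activates 10 features.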
24
Recursive Feature Selection
Normal vector to the hyperplane: $w = \sum_{i=1}^{m} y_i \alpha_i x_i$
If $|w_j|$ is large, the jth feature is important for SVM discrimination
Approximation due to the degree 2 polynomial kernel: calculate $w_{\text{up}}$ and $w_{\text{down}}$ separately, then eliminate the bottom 50% of features for each
Stop when the ROC score drops below 90% of its original value on an untouched test set
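A single elimination round can be sketched as follows. The code computes w for the linear case only and does not reproduce the degree 2 polynomial approximation mentioned on the slide; the names and the toy numbers are illustrative.

```python
def weight_vector(alphas, labels, examples):
    """w = sum_i y_i * alpha_i * x_i, computed feature by feature."""
    n_features = len(examples[0])
    return [sum(a * y * x[j] for a, y, x in zip(alphas, labels, examples))
            for j in range(n_features)]


def keep_top_half(weights):
    """Indices of the features with the largest |w_j|, dropping the bottom 50%."""
    ranked = sorted(range(len(weights)), key=lambda j: abs(weights[j]), reverse=True)
    keep = max(1, len(weights) // 2)
    return sorted(ranked[:keep])


# Toy support vectors: two examples, four features.
alphas = [0.5, 0.5]
labels = [1, -1]
examples = [[2, 0, 1, 0],
            [0, 0, 1, 4]]

w = weight_vector(alphas, labels, examples)   # [1.0, 0.0, 0.0, -2.0]
surviving = keep_top_half(w)                  # features 0 and 3 remain
```

In the full procedure the SVM would be retrained on the surviving features and the round repeated until the stopping rule on the ROC score is met.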
25
Results

  Flanks     Splice Sites   Exon Body   ROC     Specificity [a]
  US   DS    3'   5'
  CV [b]                                0.609   0.484
  +    -     -    -         -           0.791   0.638
  -    +     -    -         -           0.784   0.618
  +    +     -    -         -           0.855   0.695
  -    -     +    -         -           0.823   0.672
  -    -     -    +         -           0.837   0.698
  -    -     +    +         -           0.907   0.777
  +    +     +    +         -           0.932   0.825
  -    -     -    -         +           0.946   0.841
  +    +     -    -         +           0.984   0.956
  -    -     +    +         +           0.987   0.964
  +    +     +    +         +           0.991   0.976

                                        True positives detected
  Splice Sites   Flanks   Exon Bodies   32/37   35/37   37/37
  -              -        -             1225
  -              +        -             164     259     668
  -              -        +             108     232     383
  +              -        +             58      111     180
  +              +        +             19      53      90
26
Rotationally Invariant Face Detection Using Multi-Resolution Histograms
Collaborators: Shikher Bisaria, Tony Jebara
27
Face Detection
Given a picture with faces, how do we determine where the faces are in the image? Which pixels are face pixels?
We would like to determine this with a system that:
  Runs in real time
  Recognizes rotations of faces (e.g. when someone tilts their head to one side)
28
Gaussian Blurring
Face images are greyscale (.pgm files)
Successive levels of blur are obtained by reconvolving the previous level of blurred images with a 2 dimensional Gaussian function
Mathematically equivalent to two passes of a one dimensional Gaussian function:
  $g(i,j) = \frac{1}{2\pi\sigma^2} \sum_m \sum_n e^{-(m^2+n^2)/(2\sigma^2)} \cdot f(i-m, j-n)$
  $\phantom{g(i,j)} = \frac{1}{2\pi\sigma^2} \sum_m e^{-m^2/(2\sigma^2)} \cdot \sum_n e^{-n^2/(2\sigma^2)} \cdot f(i-m, j-n)$
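The equivalence of the 2D convolution and two 1D passes can be verified on a small image. The sketch below truncates the sums at a fixed radius and treats pixels outside the image as zero; both choices are assumptions, since the slide does not specify boundary handling.

```python
import math


def gaussian_weights(sigma, radius):
    """Unnormalized 1D Gaussian weights for offsets -radius..radius."""
    return [math.exp(-(m * m) / (2 * sigma * sigma))
            for m in range(-radius, radius + 1)]


def pixel(image, i, j):
    """Pixel value, or 0.0 outside the image (zero padding)."""
    if 0 <= i < len(image) and 0 <= j < len(image[0]):
        return image[i][j]
    return 0.0


def blur_2d(image, sigma, radius):
    """Direct 2D convolution with the Gaussian, as in the first formula."""
    w = gaussian_weights(sigma, radius)
    norm = 1 / (2 * math.pi * sigma * sigma)
    offsets = range(-radius, radius + 1)
    return [[norm * sum(w[m + radius] * w[n + radius] * pixel(image, i - m, j - n)
                        for m in offsets for n in offsets)
             for j in range(len(image[0]))]
            for i in range(len(image))]


def blur_separable(image, sigma, radius):
    """Two 1D passes: first along each row (n), then along each column (m)."""
    w = gaussian_weights(sigma, radius)
    norm = 1 / (2 * math.pi * sigma * sigma)
    offsets = range(-radius, radius + 1)
    rows, cols = len(image), len(image[0])
    inner = [[sum(w[n + radius] * pixel(image, i, j - n) for n in offsets)
              for j in range(cols)] for i in range(rows)]
    return [[norm * sum(w[m + radius] * pixel(inner, i - m, j) for m in offsets)
             for j in range(cols)] for i in range(rows)]


image = [[0.0, 50.0, 100.0, 150.0],
         [200.0, 250.0, 0.0, 50.0],
         [100.0, 150.0, 200.0, 250.0]]
direct = blur_2d(image, sigma=1.0, radius=2)
two_pass = blur_separable(image, sigma=1.0, radius=2)
```

The two-pass version does work proportional to the kernel width per pixel instead of its square, which matters for a system meant to run in real time.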
29
Multi-Resolution Histograms
Histogram equalize the image
Concatenate histograms of the image together after successive levels of Gaussian blurring
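The concatenation step might look like the sketch below. It takes the blur as a parameter, skips histogram equalization, and uses a crude neighbor-averaging blur purely as a placeholder for the Gaussian; the bin count, level count, and all names are assumptions.

```python
def histogram(image, bins=4, max_value=255):
    """Count pixel intensities into equal-width bins over 0..max_value."""
    counts = [0] * bins
    for row in image:
        for value in row:
            index = min(int(value * bins / (max_value + 1)), bins - 1)
            counts[index] += 1
    return counts


def multi_resolution_histogram(image, blur, levels=3, bins=4):
    """Concatenate histograms of the image at successive blur levels.

    Level 0 is the unblurred image (an assumption); each later level
    re-blurs the previous one.
    """
    features = []
    current = image
    for _ in range(levels):
        features.extend(histogram(current, bins))
        current = blur(current)
    return features


def neighbor_mean_blur(image):
    """Placeholder blur (NOT Gaussian): average each pixel with its right neighbor."""
    return [[(row[j] + row[min(j + 1, len(row) - 1)]) / 2 for j in range(len(row))]
            for row in image]


image = [[0, 64, 128, 255],
         [255, 128, 64, 0]]
features = multi_resolution_histogram(image, neighbor_mean_blur, levels=3, bins=4)

# A histogram ignores pixel positions, so rotating the image leaves it unchanged.
rotated = list(zip(*image[::-1]))
same_after_rotation = histogram(image) == histogram(rotated)
```

The rotation check at the end hints at why histograms are attractive here: a single histogram carries no positional information, and blurring with a rotationally symmetric Gaussian preserves that property across levels.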
30
Average Histograms
Compute average face and non-face multi-resolution histograms from the training set
[Figures: Average Non-Face Histogram; Average Face Histogram]
31
Optimization Problem
$$C(\alpha) = \min_{\alpha} \; \|H_{FAVG} - h_F\|^2 + \|H_{NFAVG} - h_{NF}\|^2$$
where
$$h_F = \frac{1}{\sum_i \alpha_i} \sum_i \alpha_i h_i, \qquad h_{NF} = \frac{1}{\sum_i (1 - \alpha_i)} \sum_i (1 - \alpha_i) h_i$$
such that $0 \le \alpha_i \le 1$, $\sum_i \alpha_i = 1$
Let
  $\beta_i = (1 - \alpha_i)$
  $Q_{ij} = \langle h_i, h_j \rangle$
  $c_{\alpha} = \langle h_i, H_{FAVG} \rangle \cdot \text{constant}$
  $c_{\beta} = \langle h_i, H_{NFAVG} \rangle \cdot \text{constant}$
$$= \min_{\alpha, \beta} \; \alpha^T Q \alpha + \frac{1}{(N-1)^2} \beta^T Q \beta - 2 c_{\alpha}^T \alpha - \frac{2}{N-1} c_{\beta}^T \beta$$
32
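Solve Using SMO
$$\alpha_i^{NEW} = \frac{A}{B}$$
with numerator
$$A = \frac{1}{(N-1)^2} Q_{ii} - \frac{1}{(N-1)^2} \Big(\sum_{k \ne i,j} \alpha_k\Big) Q_{jj} + \Big(1 - \sum_{k \ne i,j} \alpha_k\Big) Q_{jj} - \Big(1 - \sum_{k \ne i,j} \alpha_k\Big) Q_{ij} + \frac{1}{(N-1)^2} \Big(\sum_{k \ne i,j} \alpha_k\Big) Q_{ij} - \frac{1}{(N-1)^2} Q_{ij}$$
$$\qquad - c_{\alpha_i} + c_{\beta_i} + c_{\alpha_j} - c_{\beta_j} + \sum_{k \ne i,j} \alpha_k Q_{ik} - \sum_{k \ne i,j} \alpha_k Q_{jk} - \frac{1}{(N-1)^2} \sum_{k \ne i,j} \alpha_k Q_{ik} + \frac{1}{(N-1)^2} \sum_{k \ne i,j} \alpha_k Q_{jk}$$
and denominator
$$B = Q_{ii} + Q_{jj} - 2 Q_{ij} + \frac{1}{(N-1)^2} Q_{ii} + \frac{1}{(N-1)^2} Q_{jj} - \frac{2}{(N-1)^2} Q_{ij}$$
Bounds for $\alpha_i^{NEW}$:
  $L = 0$
  $H = 1 - \sum_{k \ne i,j} \alpha_k$
$$\alpha_j^{NEW} = \Big(1 - \sum_{k \ne i,j} \alpha_k\Big) - \alpha_i^{NEW}$$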
33
Results