Privacy-Preserving Support Vector Machines via Random Kernels
Olvi Mangasarian, UW Madison & UCSD La Jolla
Edward Wild, UW Madison
The 2008 International Conference on Data Mining
March 3, 2016

Horizontally Partitioned Data
[Figure: the m × n data matrix A (m examples, n features) with its rows split into blocks A_1, A_2, A_3.]

Problem Statement
- Entities with related data wish to learn a classifier based on all of the data
- The entities are unwilling to reveal their data to each other
- If each entity holds a different set of examples with all features, the data is said to be horizontally partitioned
- Our approach: a privacy-preserving support vector machine (PPSVM) using random kernels
  - Provides accurate classification
  - Does not reveal private information

Outline
- Support vector machines (SVMs)
- Reduced and random kernel SVMs
- Privacy-preserving SVM for horizontally partitioned data
- Summary

Support Vector Machines
- A point x ∈ R^n is classified by the nonlinear surface K(x′, A′)u = γ; the SVM is defined by the parameters u and the threshold γ
- A contains all data points: {+…+} ⊂ A+, {−…−} ⊂ A−; e is a vector of ones
- The bounding surfaces K(x′, A′)u = γ + 1 and K(x′, A′)u = γ − 1 give the constraints K(A+, A′)u ≥ eγ + e and K(A−, A′)u ≤ eγ − e
- A slack variable y ≥ 0 allows points to be on the wrong side of their bounding surface
- Minimize e′s (which equals ‖u‖_1 at the solution) to reduce overfitting
- Minimize e′y (the hinge loss, i.e. the plus function max{·, 0}) to fit the data
- Linear kernel: (K(A, B))_ij = (AB)_ij = A_i B_·j = K(A_i, B_·j)
- Gaussian kernel with parameter μ: (K(A, B))_ij = exp(−μ‖A_i′ − B_·j‖²)
[Figure: + and − training points separated by the surface K(x′, A′)u = γ, with the two bounding surfaces at γ ± 1.]
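The two kernels defined above are straightforward to compute. The following Python/NumPy sketch (illustrative only; the helper names linear_kernel and gaussian_kernel are ours, not from the talk) evaluates K(A, B′) with the rows of A played off against the rows of B.

```python
import numpy as np

def linear_kernel(A, B):
    """Linear kernel: (K(A, B'))_ij = A_i . B_j (rows of A against rows of B)."""
    return A @ B.T

def gaussian_kernel(A, B, mu=1.0):
    """Gaussian kernel: (K(A, B'))_ij = exp(-mu * ||A_i - B_j||^2)."""
    # Pairwise squared Euclidean distances between rows of A and rows of B
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-mu * np.maximum(sq, 0.0))
```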

Support Vector Machine → Reduced SVM → Random Reduced SVM
- Support vector machine: uses the full kernel matrix K(A, A′)
- Reduced support vector machine (L&M, 2001): replace the kernel matrix K(A, A′) with K(A, Ā′), where Ā consists of a randomly selected subset of the rows of A
- Random reduced support vector machine (M&T, 2006): replace the kernel matrix K(A, A′) with K(A, B′), where B is a completely random matrix
- Using the random kernel K(A, B′) is a key result for generating a simple and accurate privacy-preserving SVM
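As a rough illustration of the three kernel choices, here is a sketch that builds the full, reduced, and random kernels on toy data, reusing the gaussian_kernel helper above; the sizes (10% as many rows as A) follow the description of B on the experimental slides and are otherwise arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 30                                  # toy data: 200 examples, 30 features
A = rng.standard_normal((m, n))

# Full kernel: K(A, A'), an m x m matrix
K_full = gaussian_kernel(A, A)

# Reduced kernel (L&M, 2001): Abar is a random subset of the rows of A
Abar = A[rng.choice(m, size=m // 10, replace=False)]
K_reduced = gaussian_kernel(A, Abar)            # m x (m/10)

# Random kernel (M&T, 2006): B is a completely random matrix with n columns
B = rng.standard_normal((m // 10, n))
K_random = gaussian_kernel(A, B)                # m x (m/10)
```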

Error of Random Kernels is Comparable to Full Kernels: Linear Kernels
[Scatter plot: full kernel AA′ error vs. random kernel AB′ error, with the diagonal marking equal error for random and full kernels.]
- Each point represents one of 7 datasets from the UCI repository
- B is a random matrix with the same number of columns as A and either 10% as many rows, or one fewer row than columns

Error of Random Kernels is Comparable to Full Kernels: Gaussian Kernels
[Scatter plot: full kernel K(A, A′) error vs. random kernel K(A, B′) error.]

Horizontally Partitioned Data: Each entity holds different examples with the same features
[Figure: the rows of A split into blocks A_1, A_2, A_3, one block per entity.]

Privacy-Preserving SVMs for Horizontally Partitioned Data via Random Kernels
- Each of the q entities privately owns a block of data A_1, …, A_q that it is unwilling to share with the other q − 1 entities
- The entities all agree on the same random matrix B, and each distributes its kernel block K(A_j, B′) to all the entities
- The blocks stack to give the full kernel: K(A, B′) = [K(A_1, B′); K(A_2, B′); …; K(A_q, B′)]
- A_j cannot be recovered uniquely from K(A_j, B′)
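A minimal sketch of this protocol is below, reusing the gaussian_kernel helper above. For brevity it fits u and γ by regularized least squares (a proximal-SVM-style surrogate), not the 1-norm SVM linear program of the earlier slide, and the function name ppsvm_horizontal is ours.

```python
import numpy as np

def ppsvm_horizontal(blocks, labels, B, mu=1.0, nu=1.0):
    """Horizontal PPSVM sketch: entities share only K(A_j, B'), never A_j.

    blocks : list of private data blocks A_1, ..., A_q (each m_j x n)
    labels : list of corresponding +/-1 label vectors
    B      : the random matrix all q entities agreed on (fewer rows than columns)
    """
    # Each entity computes its own kernel block locally and publishes it;
    # stacking the blocks gives K(A, B') without revealing any A_j.
    K = np.vstack([gaussian_kernel(A_j, B, mu) for A_j in blocks])
    d = np.concatenate(labels)
    # Least-squares surrogate: K u - e*gamma ~ d, with regularization 1/nu
    H = np.hstack([K, -np.ones((K.shape[0], 1))])
    w = np.linalg.solve(H.T @ H + np.eye(H.shape[1]) / nu, H.T @ d)
    u, gamma = w[:-1], w[-1]
    return u, gamma

# A new point x (a 1 x n row) is then classified by sign(K(x, B') u - gamma).
```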

Privacy Preservation: Infinite Number of Solutions for A_i Given A_i B′
- Consider solving for row r of A_i, 1 ≤ r ≤ m_i, from the equation B A_ir′ = P_ir, with A_ir′ ∈ R^n; since B has fewer rows than columns, the system is underdetermined
- Every square submatrix of the random matrix B is nonsingular (Feng and Zhang, 2007: every submatrix of a random matrix has full rank)
- Thus there are infinitely many solutions A_i to the equation B A_i′ = P_i, even when, for example, each entity has only 20 points in R^30
- Furthermore, each of the infinite number of matrices in the affine hull of these solutions is itself a solution
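For the linear kernel A_i B′, the underdetermined-system argument is easy to check numerically; the sketch below (NumPy/SciPy, illustrative only) constructs a second, different private row that produces exactly the same shared data.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
k, n = 10, 30                           # B has fewer rows (k) than columns (n)
B = rng.standard_normal((k, n))
a_true = rng.standard_normal(n)         # one private row A_ir' of some entity
p = B @ a_true                          # the row P_ir that other entities see

# Any vector in a_true + null(B) reproduces p exactly, so p cannot
# determine the private row: there are infinitely many candidates.
Z = null_space(B)                       # n x (n - k) orthonormal basis of null(B)
a_fake = a_true + Z @ rng.standard_normal(n - k)
print(np.allclose(B @ a_fake, p))       # True
```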

Results for PPSVM on Horizontally Partitioned Data
- Compare classifiers that share examples with classifiers that do not
- Seven datasets from the UCI repository
- Simulate a situation in which each entity has only a subset of about 25 examples
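One way such a comparison might be scored, assuming the ppsvm_horizontal and gaussian_kernel sketches above (the helper error_rate is ours):

```python
import numpy as np

def error_rate(u, gamma, A_test, d_test, B, mu=1.0):
    """Fraction of held-out points misclassified by sign(K(x', B')u - gamma)."""
    pred = np.sign(gaussian_kernel(A_test, B, mu) @ u - gamma)
    return float(np.mean(pred != d_test))

# "Without sharing": each entity calls ppsvm_horizontal on its own ~25 examples.
# "With sharing": one call on all q kernel blocks; compare the two error rates.
```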

Error Rate of Sharing Data is Better than Not Sharing: Linear Kernels
[Scatter plot: error rate without sharing data vs. error rate with sharing data; the 7 UCI datasets are represented by one point each.]

Error Rate of Sharing Data is Better than Not Sharing: Gaussian Kernels
[Scatter plot: error rate without sharing data vs. error rate with sharing data.]

Summary
- Privacy-preserving SVM for horizontally partitioned data
  - Based on using the random kernel K(A, B′)
  - Learns a classifier using all the data, but without revealing privately held data
  - Classification accuracy is better than an SVM without sharing, and comparable to an SVM in which all data is shared
- Related work
  - A similar approach for vertically partitioned data is to appear in ACM TKDD
  - Liu et al., 2006: properties of multiplicative data perturbation based on random projection
  - Yu et al., 2006: secure computation of K(A, A′)

Questions?
Websites with links to papers and talks: