Introduction to Support Vector Machines for Data Mining. Mahdi Nasereddin, Ph.D., Pennsylvania State University, School of Information Sciences and Technology.

1 Introduction to Support Vector Machines for Data Mining. Mahdi Nasereddin, Ph.D., Pennsylvania State University, School of Information Sciences and Technology

2 Agenda: Introduction; Support Vector Machines; Preliminary Experimentation; Conclusion; Questions?

3 Data Mining Techniques: Neural Networks; Decision Trees; Multivariate Adaptive Regression Splines (MARS); Rule Induction; Nearest Neighbor Method and Discriminant Analysis; Genetic Algorithms; Support Vector Machines

4 First introduced at COLT-92 by Boser, Guyon, and Vapnik, building on earlier work by Vapnik and Chervonenkis. Based on Statistical Learning Theory. Applications. Basic Theory: Classification; Regression

5 Successful Applications of SVMs: Protein Structure Prediction (http://www.cs.umn.edu/~hpark/papers/surface.pdf); Intrusion Detection (www.cs.nmt.edu/~IT); Handwriting Recognition; Detecting Steganography in Digital Images (http://www.cs.dartmouth.edu/~farid/publications/ih02.html)

6 Successful Applications of SVMs: Breast Cancer Prognosis: Chemotherapy Effect on Survival Rate (Lee, Mangasarian and Wolberg, 2001); Particle and Quark-Flavour Identification in High Energy Physics (http://wwwrunge.physik.uni-freiburg.de/preprints/EHEP9901.ps); Function Approximation

7 Support Vector Machines (Linearly separable case)
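The figures for this part of the talk did not survive in the transcript. As a rough illustration of the linearly separable case, the sketch below computes the geometric margin of a candidate hyperplane: the distance from the hyperplane to the nearest labelled point. The function name, the toy points, and the two candidate hyperplanes are hypothetical and are not taken from the slides; a linear SVM looks for the separating hyperplane that makes this margin as large as possible.

```python
import math


def geometric_margin(w, b, samples):
    """Smallest signed distance from any labelled sample to the hyperplane w.x + b = 0.

    A positive result means the hyperplane separates the two classes.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in samples
    )


# Hypothetical toy data: (point, label) pairs with labels in {-1, +1}.
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Two candidate separating hyperplanes; both separate the data,
# but the first leaves a wider margin and is the one an SVM would prefer.
margin_a = geometric_margin((1.0, 1.0), -2.0, samples)
margin_b = geometric_margin((1.0, 0.0), -1.0, samples)
```

Both candidates classify every toy point correctly, so comparing them only makes sense in terms of margin width, which is the quantity the SVM optimizes.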

10 Non-linearly separable case

11 SVM for Regression: In the case of regression, the goal is to construct a hyperplane that is close to as many points as possible. For both classification and regression, learning is done via quadratic programming, which has a single optimum point.
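The slide does not say how "close" is measured. One common choice in SVM regression is the epsilon-insensitive loss, which ignores errors smaller than a tolerance and penalizes larger ones linearly. The sketch below assumes that loss; the function name, the tolerance value, and the sample numbers are illustrative only. (The LS-SVM variant used later in the talk uses a squared-error loss instead.)

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical targets and predictions.
losses = epsilon_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 2.0], epsilon=0.1)
# The first prediction is within the tube and costs nothing;
# the other two are charged only for the part of the error beyond epsilon.
```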

12 Strengths and Weaknesses of SVM. Strengths: training is relatively easy; no local optima, unlike in neural networks; scales relatively well to high-dimensional data. Weaknesses: needs a “good” kernel function

13 Preliminary Experimentation: Forecasting GDP using Oil Prices (with F. Malik). Forecasting model objective: to predict the Gross Domestic Product (GDP) for the next quarter using oil prices (including time lag). [Figure: GDP over time]
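A forecasting setup of this kind is often framed as supervised learning by pairing each quarter's GDP value with oil-price values from the preceding quarters. The sketch below shows one way to build such pairs; the function name, the number of lags, and the sample numbers are assumptions, since the slides do not state how many lags were used.

```python
def make_lagged_dataset(oil, gdp, n_lags=2):
    """Pair each quarter's GDP value with the oil values of the previous n_lags quarters."""
    if len(oil) != len(gdp):
        raise ValueError("oil and gdp series must have the same length")
    features, targets = [], []
    for t in range(n_lags, len(gdp)):
        features.append(oil[t - n_lags:t])  # oil values for quarters t-n_lags .. t-1
        targets.append(gdp[t])              # GDP value for quarter t
    return features, targets


# Hypothetical series, four quarters long.
features, targets = make_lagged_dataset(
    [0.1, 0.2, 0.3, 0.4], [1.0, 2.0, 3.0, 4.0], n_lags=2
)
```

The first `n_lags` quarters produce no training example because they lack a full history of lagged oil values.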

14 Data Set: We looked at quarterly oil price and GDP data from January 1947 to December 2002. Oil price data were obtained from the Bureau of Labor Statistics; GDP data were obtained from the Bureau of Economic Analysis. We used the growth rate of GDP and the growth rate of oil prices.
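The slide does not define the growth rate. A minimal sketch, assuming the simple period-over-period percentage change (a log difference would be another common choice), with made-up level values:

```python
def growth_rates(levels):
    """Period-over-period growth: (current - previous) / previous for each consecutive pair."""
    return [(cur - prev) / prev for prev, cur in zip(levels, levels[1:])]


# Hypothetical quarterly levels, not actual GDP or oil price figures.
rates = growth_rates([100.0, 110.0, 99.0])
```

A series of n levels yields n - 1 growth rates, so the first quarter of the sample has no growth value.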

15 Models. Neural Networks: back-propagation; one hidden layer; delta rule was used for training. LS-SVM (Van Gestel, 2001): Matlab toolbox

16 Experimentation: Created the training data to predict the last 40 quarters of GDP (test data). Trained the neural network and the SVM. Used each model to predict GDP, and calculated the prediction error.
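For time-ordered data, holding out the last 40 quarters means splitting by position rather than at random. A minimal sketch of such a split follows; the function name and the stand-in data are hypothetical, and the slides do not show the authors' actual procedure.

```python
def split_last_n(features, targets, n_test=40):
    """Keep the final n_test observations as test data and everything before them as training data."""
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    if not 0 < n_test < len(targets):
        raise ValueError("n_test must be positive and smaller than the dataset")
    train = (features[:-n_test], targets[:-n_test])
    test = (features[-n_test:], targets[-n_test:])
    return train, test


# Hypothetical stand-in data: 100 time-ordered observations.
xs = [[float(i)] for i in range(100)]
ys = [float(i) for i in range(100)]
(train_x, train_y), (test_x, test_y) = split_last_n(xs, ys, n_test=40)
```

Splitting by position keeps every test observation later in time than every training observation, so the models are never trained on data from the period they are asked to forecast.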

17 Results
Model            MAE
Neural Network   0.0044
LS-SVM           0.0052
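MAE here presumably stands for mean absolute error, the average size of the prediction errors regardless of sign. A minimal sketch of that metric, using made-up values rather than the study's data:

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical growth rates and forecasts.
mae = mean_absolute_error([0.010, 0.020, 0.030], [0.012, 0.018, 0.035])
```

Because the inputs are growth rates, an MAE in this setting is expressed in growth-rate units, and a lower value means the forecasts were closer on average.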

18 Good References. Introductions: Martin Law, “An Introduction to Support Vector Machines”; Andrew Moore, “Support Vector Machines”, www.cs.cmu.edu/~awm; N. Cristianini, www.support-vector.net/tutorial.html. In depth: Support Vector Machines book, www.support-vector.net

19 Questions. E-mail: mxn16@psu.edu. Presentation will be posted (by Friday) at http://www.bklv.psu.edu/faculty/nasereddin

