
Geometry at Work
Adapted by Dr. Sarah from a talk by Dr. Catherine A. Gorini

Computer Learning To Diagnose and Categorize
We will use these tools:
• higher-dimensional vector spaces
• convex sets
• inner products

Geometry in Learning
Kristin P. Bennett, Rensselaer Polytechnic Institute
Erin J. Bredensteiner, University of Evansville
Goal: to classify objects into two classes based on specific measurements made on each object. Examples:
• tumors: benign or malignant
• patients: healthy or with heart disease
• Congressmen: Republicans or Democrats

• Data is collected for a large sample of individuals.
• Individuals are assigned to one of two classes by experts.
• A perceptron is created. A perceptron is a linear model that is used to classify points into two sets.
• New individuals are then classified by a computer using the perceptron.

• Each individual corresponds to a point in R^n, where n is the number of measurements recorded for each individual.
• The perceptron corresponds to a plane that separates R^n into two half-spaces, each half-space containing points of only one type.

A plane with normal vector w ∈ R^n is given by the vector equation x · w = γ for some γ ∈ R. If p is the position vector of a point on the plane, then
    (x − p) · w = 0
    x · w − p · w = 0
    x · w = p · w = γ
and γ gives the location of the plane relative to the origin: taking ‖w‖ = 1, this plane is γ units away from the parallel plane through the origin, x · w = 0.
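A quick numeric check of these identities, as a Python sketch (the particular numbers are my own, chosen for illustration):

```python
import numpy as np

w = np.array([3.0, 4.0])            # normal vector, ||w|| = 5
gamma = 10.0                        # the plane x . w = 10
p = np.array([2.0, 1.0])            # a point on the plane: 3*2 + 4*1 = 10
x = p + np.array([-4.0, 3.0])       # p plus a vector orthogonal to w
print(x @ w)                        # 10.0: x still satisfies x . w = gamma
print(gamma / np.linalg.norm(w))    # 2.0: distance of the plane from the origin
```

(Here ‖w‖ ≠ 1, so the distance from the origin is γ/‖w‖ rather than γ.)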

Definition: Let x be a point in R^n to be classified as a member of class A or class B. A perceptron with weights w ∈ R^n and threshold γ ∈ R assigns x to class A or to class B using the following rule:
    If x · w − γ > 0, then x ∈ A.
    If x · w − γ < 0, then x ∈ B.
By convention, if x · w = γ, then x ∈ B.
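The rule is a single comparison; a minimal sketch in Python (the function name is mine):

```python
import numpy as np

def perceptron_classify(x, w, gamma):
    """Apply the rule above: x . w - gamma > 0 puts x in class A, otherwise B."""
    if x @ w - gamma > 0:
        return "A"
    return "B"                      # by convention, x . w = gamma also goes to B

print(perceptron_classify(np.array([1.0, 2.0]), np.array([1.0, 1.0]), 2.0))  # A
print(perceptron_classify(np.array([0.0, 1.0]), np.array([1.0, 1.0]), 2.0))  # B
```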

Our Goal
Use the data to solve for the w ∈ R^n and γ ∈ R that give the “best” plane separating the two sets of points, A and B, in R^n. There are two cases:
• Linearly Separable Case
• Linearly Inseparable Case
We need some terminology before we can define the cases.

A set is convex if the segment connecting any two points in the set is also in the set. The convex hull of a set of points is the smallest convex set that contains the set. Let A1, A2, …, Am be the points in set A. Then
    u1 A1 + u2 A2 + … + um Am
is in the convex hull of A if u1 + u2 + … + um = 1 and ui ≥ 0.
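For instance, a point of the convex hull can be produced directly from such a convex combination (a Python sketch; the random points are only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))         # four points A1..A4 in R^3
u = rng.random(4)
u /= u.sum()                        # now u_i >= 0 and u1 + ... + u4 = 1
point = u @ A                       # u1*A1 + ... + u4*A4, a point in the hull
```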

Our Goal
Use the data to solve for the w ∈ R^n and γ ∈ R that give the “best” plane separating the two sets of points, A and B, in R^n. There are two cases:
• Linearly Separable Case: there are many planes lying between the two sets. This happens when the convex hulls of the sets A and B are disjoint.
• Linearly Inseparable Case: there is no plane lying between the two sets. This happens when the convex hulls of the sets A and B intersect.

Solution of the Linearly Separable Case
Find the plane that perpendicularly bisects the segment connecting the two closest points in the convex hulls of A and B. Locating those closest points is an optimization problem:
    minimize ‖ u1 A1 + u2 A2 + … + um Am − (v1 B1 + v2 B2 + … + vk Bk) ‖
    such that u1 + u2 + … + um = 1
              v1 + v2 + … + vk = 1
              ui ≥ 0, vj ≥ 0
This standard optimization problem can be solved on a computer, as sketched below.
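One way to carry this out is with a general-purpose constrained solver; a hedged sketch in Python using scipy (the function and variable names are mine, and the squared distance is minimized because it is smooth and has the same minimizer as the norm):

```python
import numpy as np
from scipy.optimize import minimize

def closest_hull_points(A, B):
    """A: (m, n) points of class A; B: (k, n) points of class B."""
    m, k = len(A), len(B)

    def objective(z):
        u, v = z[:m], z[m:]
        diff = u @ A - v @ B        # point in hull(A) minus point in hull(B)
        return diff @ diff          # squared distance between the two points

    constraints = [
        {"type": "eq", "fun": lambda z: z[:m].sum() - 1.0},  # u sums to 1
        {"type": "eq", "fun": lambda z: z[m:].sum() - 1.0},  # v sums to 1
    ]
    bounds = [(0.0, 1.0)] * (m + k)  # u_i >= 0, v_j >= 0
    z0 = np.concatenate([np.full(m, 1.0 / m), np.full(k, 1.0 / k)])
    res = minimize(objective, z0, bounds=bounds, constraints=constraints)
    u, v = res.x[:m], res.x[m:]
    return u @ A, v @ B             # closest points of the two hulls

# Example: two linearly separable clusters in R^2
rng = np.random.default_rng(0)
A = rng.normal([0, 0], 0.5, size=(20, 2))
B = rng.normal([4, 4], 0.5, size=(20, 2))
p, q = closest_hull_points(A, B)
w = p - q                           # normal vector of the separating plane
gamma = w @ (p + q) / 2             # plane through the midpoint of [p, q]
print((A @ w - gamma > 0).all())    # True: all of A on the positive side
print((B @ w - gamma < 0).all())    # True: all of B on the negative side
```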

[Figure: the separating plane lying between Set A and Set B]

Solution of the Linearly Inseparable Case
If no plane separates the two sets, there are many different planes that could define a perceptron. Three approaches:
• Robust Linear Programming: minimize the maximum distance from any misclassified point to the separating plane. This can give too much weight to a single hard-to-classify point.
• Multisurface Method of Pattern Recognition: find the plane that misclassifies the fewest points.
• Generalized Optimal Plane: reduce the average distance of misclassified points from the separating plane and decrease the maximum classification error.
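To make one of these ideas concrete, penalizing the average amount by which points land on the wrong side can be posed as a linear program. This sketch is my own formulation in that spirit, not the exact method described in the talk:

```python
import numpy as np
from scipy.optimize import linprog

def average_violation_plane(A, B):
    """Return (w, gamma) minimizing the average margin violation of A and B."""
    m, n = A.shape
    k = B.shape[0]
    # Variables: x = [w (n), gamma (1), y (m), z (k)], with slacks y, z >= 0.
    # Want: A_i . w - gamma >= 1 - y_i  and  B_j . w - gamma <= -1 + z_j;
    # the margin of 1 rules out the trivial solution w = 0.
    c = np.concatenate([np.zeros(n + 1), np.full(m, 1.0 / m), np.full(k, 1.0 / k)])
    top = np.hstack([-A, np.ones((m, 1)), -np.eye(m), np.zeros((m, k))])
    bot = np.hstack([B, -np.ones((k, 1)), np.zeros((k, m)), -np.eye(k)])
    res = linprog(c, A_ub=np.vstack([top, bot]), b_ub=-np.ones(m + k),
                  bounds=[(None, None)] * (n + 1) + [(0, None)] * (m + k))
    return res.x[:n], res.x[n]
```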

Practical Applications
• Heart Disease
• Breast Cancer
• Sonar Signals to Distinguish Mines from Rocks
• Voting Patterns of Congressmen

Practical Applications and Dimensions of the Spaces
Heart Disease
• 13 attributes, such as age, cholesterol, and resting blood pressure
• Cleveland Heart Disease Database: 297 patients
Breast Cancer
• 9 attributes obtained via needle aspiration of a tumor, such as clump thickness, uniformity of cell size, and uniformity of cell shape
• Wisconsin Breast Cancer Database: 682 patients
• 100% correctness on computer diagnosis of 131 new cases

Practical Applications and Dimensions of the Spaces
Sonar Signals to Distinguish Mines from Rocks
• 60 attributes
• 208 mines and rocks
• a linearly separable example
Voting Patterns of Congressmen
• 1984 Voting Records Database of 16 key votes
• 435 Congressmen