K Nearest Neighbors
Saed Sayad
www.ismartsoft.com
KNN - Definition
KNN is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure.
KNN – Different Names
– K-Nearest Neighbors
– Memory-Based Reasoning
– Example-Based Reasoning
– Instance-Based Learning
– Case-Based Reasoning
– Lazy Learning
KNN – Short History
– Nearest neighbors have been used in statistical estimation and pattern recognition since the beginning of the 1970s (non-parametric techniques).
– Dynamic Memory: A Theory of Reminding and Learning in Computers and People (Schank, 1982).
– People reason by remembering and learn by doing.
– Thinking is reminding, making analogies.
– Examples = Concepts???
KNN Classification
[Figure: axes Age and Loan$]
KNN Classification – Distance
Age | Loan | Default | Distance
25 | $40,000 | N |
 | $60,000 | N |
 | $80,000 | N |
 | $20,000 | N |
 | $120,000 | N |
 | $18,000 | N |
 | $95,000 | Y |
 | $62,000 | Y |
 | $100,000 | Y |
 | $220,000 | Y |
 | $150,000 | Y |
 | $142,000 | ? |
Euclidean Distance
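The Euclidean distance named on this slide can be sketched as follows. The function name `euclidean_distance` and the two (age, loan) pairs are assumptions made up for illustration; they are not rows from the slide's table.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical (age, loan) pairs, chosen only for illustration.
case = (30, 50_000)
query = (40, 60_000)
d = euclidean_distance(case, query)
print(round(d, 3))
```

With raw values, the loan difference (10,000) swamps the age difference (10), so the distance is driven almost entirely by the loan amount. This is the motivation for standardizing the variables before measuring distance.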
KNN Classification – Standardized Distance
Columns: Age, Loan, Default, Distance
Default: N, N, N, N, N, N, Y, Y, Y, Y, Y; query case: ?
Standardized Variable
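One way to standardize a variable is min-max scaling to the range [0, 1]. The slide's own formula did not survive, so treating "standardized" as min-max scaling is an assumption here, as is the helper name `min_max_scale`. The loan amounts are the eleven training values from the distance table.

```python
def min_max_scale(values):
    """Rescale numeric values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map every value to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Loan amounts of the eleven training cases from the distance table.
loans = [40_000, 60_000, 80_000, 20_000, 120_000, 18_000,
         95_000, 62_000, 100_000, 220_000, 150_000]
scaled = min_max_scale(loans)
print([round(s, 3) for s in scaled])
```

A new case would need to be scaled with the same minimum and maximum as the training data, so that its distances are measured on the same scale.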
KNN Regression - Distance
Age | Loan | House Price Index | Distance
25 | $40,000 | |
 | $60,000 | |
 | $80,000 | |
 | $20,000 | |
 | $120,000 | |
 | $18,000 | |
 | $95,000 | |
 | $62,000 | |
 | $100,000 | |
 | $220,000 | |
 | $150,000 | |
 | $142,000 | ? |
KNN Regression – Standardized Distance
Columns: Age, Loan, House Price Index, Distance
Query case: House Price Index = ?
KNN – Number of Neighbors
If K=1, select the nearest neighbor.
If K>1,
– For classification, select the most frequent class among the K nearest neighbors.
– For regression, calculate the average of the target values of the K nearest neighbors.
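The two rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a tuned implementation: the function names, the `(features, target)` layout, the choice of Euclidean distance, and the toy data are all made up for the example.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k training cases closest to query by Euclidean distance.

    train is a list of (features, target) pairs (a layout assumed for this sketch).
    """
    return sorted(train, key=lambda case: math.dist(case[0], query))[:k]


def knn_classify(train, query, k):
    """Classification: the most frequent label among the k nearest cases."""
    labels = [target for _, target in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k):
    """Regression: the mean target value of the k nearest cases."""
    targets = [target for _, target in k_nearest(train, query, k)]
    return sum(targets) / len(targets)


# Hypothetical, already-standardized toy data for illustration only.
labeled = [((0.0, 0.0), "N"), ((0.1, 0.2), "N"),
           ((0.9, 0.8), "Y"), ((1.0, 1.0), "Y"), ((0.8, 1.0), "Y")]
numeric = [((0.0,), 10.0), ((1.0,), 20.0), ((2.0,), 30.0), ((10.0,), 100.0)]

print(knn_classify(labeled, (0.85, 0.9), k=3))
print(knn_classify(labeled, (0.0, 0.1), k=1))
print(knn_regress(numeric, (1.1,), k=3))
```

With K=1 the vote reduces to copying the label of the single nearest case. When two labels tie, `Counter.most_common` returns the one encountered first, which here is the label of the closer neighbor; an odd K avoids ties in two-class problems.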
Distance – Categorical Variables
X | Y | Distance
Male | Male | 0
Male | Female | 1
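The 0/1 rule in this table is the Hamming distance: count the positions where two cases disagree. A small sketch, with the function name `hamming_distance` chosen here for illustration:

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences of categories differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


print(hamming_distance(("Male",), ("Male",)))
print(hamming_distance(("Male",), ("Female",)))
```

For cases with several categorical variables, each mismatching variable adds 1 to the distance.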
Instance Based Reasoning
– IB1 is based on the standard KNN.
– IB2 is an incremental KNN learner that only incorporates misclassified instances into the classifier.
– IB3 discards instances that do not perform well by keeping success records.
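The IB2 idea described above, keeping only the instances that the current store gets wrong, might look roughly like this. It is a sketch under assumptions: it uses a 1-nearest-neighbor check with Euclidean distance, seeds the store with the first instance, and the name `ib2_fit` and the toy stream are hypothetical.

```python
import math


def ib2_fit(stream):
    """Build a reduced case store from an iterable of (features, label) pairs.

    An instance is stored only if the nearest stored case has a different label,
    that is, if the current store would misclassify it.
    """
    store = []
    for features, label in stream:
        if not store:
            # Assumption: with nothing to compare against, keep the first instance.
            store.append((features, label))
            continue
        nearest = min(store, key=lambda case: math.dist(case[0], features))
        if nearest[1] != label:
            store.append((features, label))
    return store


# Hypothetical toy stream for illustration only.
stream = [((0.0, 0.0), "N"), ((0.1, 0.0), "N"),
          ((1.0, 1.0), "Y"), ((1.1, 1.0), "Y")]
kept = ib2_fit(stream)
print(kept)
```

Here only two of the four instances are retained, which shows how the approach reduces memory cost. The result depends on the order in which instances arrive.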
Case Based Reasoning
KNN - Applications
– Classification and interpretation: legal, medical, news, banking
– Problem-solving: planning, pronunciation
– Function learning: dynamic control
– Teaching and aiding: help desk, user training
Summary
– KNN is conceptually simple, yet able to solve complex problems
– Can work with relatively little information
– Learning is simple (no learning at all!)
– Memory and CPU cost
– Feature selection problem
– Sensitive to representation
Questions?