1
Classification 3 (Nearest Neighbor Classifier)
COMP1942
Prepared by Raymond Wong
The examples used in Decision Tree are borrowed from LW Chan’s notes
Screenshots captured by Kai Ho Chan
Presented by Raymond Wong
2
Classification Methods
Decision Tree
Bayesian Classifier
Nearest Neighbor Classifier
3
Nearest Neighbor Classifier
How to use the data mining tool
4
Nearest Neighbor Classifier
Computer    History
100         40
90          45
20          95
…

[Scatter plot: Computer vs. History]
5
Nearest Neighbor Classifier
Computer    History    Buy Book?
100         40         No (-)
90          45         Yes (+)
20          95         …

[Scatter plot: Computer vs. History, each point marked + (Yes) or - (No)]
6
Nearest Neighbor Classifier
Step 1: Find the nearest neighbor
Step 2: Use the “label” of this neighbor

Computer    History    Buy Book?
100         40         No (-)
90          45         Yes (+)
20          95         …

[Scatter plot: Computer vs. History, each point marked + (Yes) or - (No)]

Suppose there is a new person:

Computer    History    Buy Book?
95          35         ?
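The two steps above can be sketched in Python. This is a minimal sketch, not the tool's implementation: it assumes Euclidean distance (the slides do not name a distance measure), uses only the two training rows that are fully specified in the table, and the function name `nearest_neighbor_label` is hypothetical.

```python
import math

# Training rows that are fully specified on the slide:
# (Computer, History, Buy Book?)
training = [
    (100, 40, "No"),
    (90, 45, "Yes"),
]

def nearest_neighbor_label(training, query):
    """Return the label of the training row closest to `query`.

    Assumption: closeness is Euclidean distance on (Computer, History).
    """
    # Step 1: find the nearest neighbor
    nearest = min(training, key=lambda row: math.dist(row[:2], query))
    # Step 2: use the label of this neighbor
    return nearest[2]

new_person = (95, 35)
print(nearest_neighbor_label(training, new_person))
```

With only these two rows, the new person (95, 35) is closer to (100, 40) than to (90, 45), so the sketch predicts No. The full training set on the slide may give a different answer.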
7
Nearest Neighbor Classifier
k-Nearest Neighbor Classifier:
Step 1: Find k nearest neighbors
Step 2: Use the majority of the labels of the neighbors

Computer    History    Buy Book?
100         40         No (-)
90          45         Yes (+)
20          95         …

[Scatter plot: Computer vs. History, each point marked + (Yes) or - (No)]

Suppose there is a new person:

Computer    History    Buy Book?
95          35         ?
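The k-nearest-neighbor variant can be sketched the same way. This is an illustrative sketch under stated assumptions: Euclidean distance is assumed, only the first two training rows come from the slide, the remaining rows are hypothetical and exist only so that a vote with k > 1 is possible, and `knn_label` is a hypothetical name.

```python
import math
from collections import Counter

# First two rows come from the slide; the rest are HYPOTHETICAL rows
# added only so that k > 1 has enough neighbors to vote.
training = [
    (100, 40, "No"),
    (90, 45, "Yes"),
    (88, 32, "Yes"),  # hypothetical
    (30, 90, "No"),   # hypothetical
    (25, 80, "No"),   # hypothetical
]

def knn_label(training, query, k):
    """Majority label among the k training rows nearest to `query`.

    Assumption: closeness is Euclidean distance on (Computer, History).
    """
    # Step 1: find the k nearest neighbors
    neighbors = sorted(training, key=lambda row: math.dist(row[:2], query))[:k]
    # Step 2: use the majority of the labels of the neighbors
    votes = Counter(label for _, _, label in neighbors)
    return votes.most_common(1)[0][0]

new_person = (95, 35)
print(knn_label(training, new_person, k=1))
print(knn_label(training, new_person, k=3))
```

On this made-up data the prediction changes with k, which is why the choice of k matters. An odd k avoids tied votes when there are two classes.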
8
Nearest Neighbor Classifier
How to use the data mining tool
9
How to use the data mining tool
We can use XLMiner for classification (NN Classifier).
Open “classification-NNClassifier.xls” in MS Excel.
Note:
The input variables must be real-valued (they cannot be categorical).
The output variable must be categorical.
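The note about variable types can be checked before any distances are computed. The sketch below is not part of XLMiner; `check_rows` is a hypothetical helper, and treating "categorical" as "a string label" is an assumption made for illustration.

```python
from numbers import Real

def check_rows(rows):
    """Raise ValueError unless inputs are real numbers and the label is a string.

    Each row is (input_1, ..., input_n, label).
    Assumption: a categorical output is represented as a string.
    """
    for *inputs, label in rows:
        for value in inputs:
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, Real):
                raise ValueError(f"input {value!r} is not a real number")
        if not isinstance(label, str):
            raise ValueError(f"label {label!r} is not categorical")
    return True

print(check_rows([(100, 40, "No"), (90, 45, "Yes")]))
```

A distance such as the Euclidean one is only defined on numbers, which is one way to see why categorical inputs are not accepted as they are.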
15
Data source
Worksheet
Workbook
Data range
17
First row contains header
# Columns: 4
# Rows in Training set: 10
Variables
Variables in input data
24
Classes in the output variable
# Classes: 2
Specify “Success” class (for Lift Chart): Yes
Specify initial cutoff probability value for success: 0.5
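One way to read the cutoff setting is sketched below: the probability of success is estimated as the share of the k nearest neighbors labeled with the “Success” class, and the record is labeled a success when that share reaches the cutoff. This reading is an assumption, as are the hypothetical function name `classify_with_cutoff` and the choice to treat a probability equal to the cutoff as success.

```python
def classify_with_cutoff(neighbor_labels, success="Yes", failure="No", cutoff=0.5):
    """Label a record from its neighbors' labels using a success-probability cutoff.

    Returns (predicted_label, estimated_probability_of_success).
    """
    # Estimated probability of success = share of neighbors labeled `success`
    hits = sum(1 for label in neighbor_labels if label == success)
    p_success = hits / len(neighbor_labels)
    # Assumption: a probability equal to the cutoff counts as success
    predicted = success if p_success >= cutoff else failure
    return predicted, p_success

print(classify_with_cutoff(["Yes", "Yes", "No"]))
print(classify_with_cutoff(["Yes", "No", "No"]))
print(classify_with_cutoff(["Yes", "No", "No"], cutoff=0.3))
```

Lowering the cutoff makes the classifier label more records as successes; with a cutoff of 0.5 and two classes this behaves like a majority vote.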
25
Number of nearest neighbors (k): 1
Normalize input data
Scoring option:
Score on specified value of k as above
Score on best k between 1 and specified value
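The options on this dialog can be illustrated with a small sketch. The slides do not say how the tool normalizes or on which data it compares values of k, so two assumptions are made here: normalization is z-scoring of each input column, and "best k" is the k with the fewest leave-one-out errors on the training rows. All function names are hypothetical, and every data row after the first two is made up.

```python
import math
import statistics
from collections import Counter

def normalize(rows):
    """Z-score each input column: (value - mean) / standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero
    stdevs = [statistics.pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
            for row in rows]

def knn_label(points, labels, query, k):
    """Majority label among the k points nearest to `query` (Euclidean)."""
    order = sorted(range(len(points)),
                   key=lambda i: math.dist(points[i], query))[:k]
    return Counter(labels[i] for i in order).most_common(1)[0][0]

def best_k(points, labels, max_k):
    """k in 1..max_k with the fewest leave-one-out errors (smallest k wins ties)."""
    def errors(k):
        wrong = 0
        for i in range(len(points)):
            # Hold out row i and classify it from the remaining rows
            rest_points = points[:i] + points[i + 1:]
            rest_labels = labels[:i] + labels[i + 1:]
            if knn_label(rest_points, rest_labels, points[i], k) != labels[i]:
                wrong += 1
        return wrong
    return min(range(1, max_k + 1), key=errors)

# (Computer, History): first two rows from the slide, the rest hypothetical
inputs = [(100, 40), (90, 45), (88, 32), (30, 90), (25, 80)]
labels = ["No", "Yes", "Yes", "No", "No"]

scaled = normalize(inputs)
print(best_k(scaled, labels, max_k=3))
```

Normalizing matters when input columns are on different scales, because otherwise the column with the larger range dominates the distance.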
26
Prior class probabilities
According to relative occurrences in training data
User specified prior probabilities
Use equal prior probabilities
27
Score training data
Detailed report
Summary report
Lift charts
28
Score new data
In worksheet
29
Data source
Worksheet
Workbook
Data range
31
First row contains headers
# Rows: 1
# Cols: 4
Variables
32
Continuous variables in input data
Variables in new data
Match sequentially
Match selected
Unmatch all
Unmatch selected
Match by name