Classification 3 (Nearest Neighbor Classifier)

Presentation on theme: "Classification 3 (Nearest Neighbor Classifier)"— Presentation transcript:

1 Classification 3 (Nearest Neighbor Classifier)
COMP1942
Classification 3 (Nearest Neighbor Classifier)
Prepared by Raymond Wong
The examples used in Decision Tree are borrowed from LW Chan’s notes
Screenshot captured by Kai Ho Chan
Presented by Raymond Wong

2 Classification Methods
Decision Tree
Bayesian Classifier
Nearest Neighbor Classifier

3 Nearest Neighbor Classifier
How to use the data mining tool

4 Nearest Neighbor Classifier
Computer  History
100       40
90        45
20        95
[Plot: Computer vs. History]

5 Nearest Neighbor Classifier
Computer  History  Buy Book?
100       40       No (-)
90        45       Yes (+)
20        95
[Plot: Computer vs. History, each point marked + (Yes) or - (No)]

6 Nearest Neighbor Classifier
Step 1: Find the nearest neighbor
Step 2: Use the “label” of this neighbor
Computer  History  Buy Book?
100       40       No (-)
90        45       Yes (+)
20        95
[Plot: Computer vs. History, each point marked + (Yes) or - (No)]
Suppose there is a new person:
Computer  History  Buy Book?
95        35       ?
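
The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: it assumes Euclidean distance (the slide does not name a distance measure) and uses only the two rows whose labels are legible in the transcript, so the result is not necessarily the one obtained on the full data set. The function name `nearest_neighbor_label` is hypothetical.

```python
import math

# Labeled rows recoverable from the slide: (Computer, History) -> Buy Book?
training = [
    ((100, 40), "No"),
    ((90, 45), "Yes"),
]

def nearest_neighbor_label(training, query):
    """Step 1: find the closest training point. Step 2: return its label."""
    # Euclidean distance is an assumption; the slide does not specify one.
    _, label = min(training, key=lambda row: math.dist(row[0], query))
    return label

new_person = (95, 35)
print(nearest_neighbor_label(training, new_person))  # prints "No"
```

With these two rows, the new person (95, 35) is about 7.07 away from (100, 40) and about 11.18 away from (90, 45), so the label of (100, 40) is used.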

7 Nearest Neighbor Classifier
k-Nearest Neighbor Classifier:
Step 1: Find k nearest neighbors
Step 2: Use the majority of the labels of the neighbors
Computer  History  Buy Book?
100       40       No (-)
90        45       Yes (+)
20        95
[Plot: Computer vs. History, each point marked + (Yes) or - (No)]
Suppose there is a new person:
Computer  History  Buy Book?
95        35       ?
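
A rough sketch of the k-nearest-neighbor version follows. The first two training rows come from the slide; the other three are hypothetical rows added only so that a majority vote has something to count. Euclidean distance and the function name `knn_predict` are likewise assumptions.

```python
import math
from collections import Counter

# (Computer, History) -> Buy Book?
training = [
    ((100, 40), "No"),   # from the slide
    ((90, 45), "Yes"),   # from the slide
    ((92, 30), "No"),    # hypothetical row, added for illustration
    ((30, 80), "Yes"),   # hypothetical row, added for illustration
    ((25, 90), "Yes"),   # hypothetical row, added for illustration
]

def knn_predict(training, query, k):
    # Step 1: find the k nearest neighbors (Euclidean distance assumed).
    neighbors = sorted(training, key=lambda row: math.dist(row[0], query))[:k]
    # Step 2: use the majority of the neighbors' labels.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

new_person = (95, 35)
print(knn_predict(training, new_person, k=3))  # prints "No"
print(knn_predict(training, new_person, k=5))  # prints "Yes"
```

On this toy data the answer changes with k: the three closest rows vote No, No, Yes, while all five rows vote Yes three times out of five. Choosing an odd k avoids tied votes when there are two classes.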

8 Nearest Neighbor Classifier
How to use the data mining tool

9 How to use the data mining tool
We can use XLMiner for classification (NN Classifier).
Open “classification-NNClassifier.xls” in MS Excel.
Note:
The input variables: real (cannot be categorical)
The output variable: categorical

15 Data source: Worksheet, Data range, Workbook

17 First row contains header
# Columns: 4
# Rows in Training set: 10
Variables
First row contains header
Variables in input data

24 Classes in the output variable
Specify “Success” class (for Lift Chart): Yes
# Classes: 2
Specify initial cutoff probability value for success: 0.5
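
One plausible reading of the cutoff setting, sketched under assumptions: the probability of the “Success” class is estimated as the fraction of the k neighbors carrying that label, and the record is classified as a success when that fraction reaches the cutoff. Whether XLMiner uses exactly this estimate, or `>=` rather than `>`, is not stated on the slide; the function names below are hypothetical.

```python
def success_probability(neighbor_labels, success="Yes"):
    """Fraction of the k neighbors that belong to the success class."""
    return sum(label == success for label in neighbor_labels) / len(neighbor_labels)

def classify_with_cutoff(neighbor_labels, cutoff=0.5, success="Yes", other="No"):
    """Predict the success class when its estimated probability reaches the cutoff."""
    if success_probability(neighbor_labels, success) >= cutoff:
        return success
    return other

labels = ["Yes", "No", "Yes"]  # hypothetical labels of k = 3 neighbors
print(classify_with_cutoff(labels, cutoff=0.5))  # prints "Yes" (2/3 >= 0.5)
print(classify_with_cutoff(labels, cutoff=0.7))  # prints "No"  (2/3 < 0.7)
```

Raising the cutoff makes the classifier more reluctant to predict the success class.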

25 Number of nearest neighbors (k)
Normalize input data
Number of nearest neighbors (k): 1
Scoring option:
Score on specified value of k as above
Score on best k between 1 and specified value
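
The two options on this slide can be illustrated roughly as follows. The sketch assumes that “normalize” means rescaling each input column to zero mean and unit standard deviation, and that the “best k” is the one with the lowest error on a held-out validation set; neither detail is confirmed by the slide. All function names and the data in the usage lines are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]

def knn_predict(train_x, train_y, query, k):
    """Majority label among the k training points closest to query."""
    pairs = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in pairs[:k]).most_common(1)[0][0]

def best_k(train_x, train_y, valid_x, valid_y, max_k):
    """Try k = 1..max_k and return the k with the lowest validation error."""
    errors = {}
    for k in range(1, max_k + 1):
        wrong = sum(knn_predict(train_x, train_y, x, k) != y
                    for x, y in zip(valid_x, valid_y))
        errors[k] = wrong / len(valid_y)
    return min(errors, key=errors.get), errors

# Hypothetical one-dimensional data with one noisy "b" point at 1.5.
train_x = [(0,), (1,), (1.5,), (10,), (11,), (12,)]
train_y = ["a", "a", "b", "b", "b", "b"]
valid_x = [(1.6,), (10.5,)]
valid_y = ["a", "b"]
k, errors = best_k(train_x, train_y, valid_x, valid_y, max_k=3)
print(k)
```

Normalizing matters because a column with a large numeric range would otherwise dominate the distance. In the toy data, k = 1 is misled by the noisy point next to 1.6, while k = 3 outvotes it.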

26 Prior class probabilities
According to relative occurrences in training data
User specified prior probabilities
Use equal prior probabilities

27 Detailed report, Summary report, Score training data, Lift charts

28 Score new data: In worksheet

29 Data source: Worksheet, Workbook, Data range

31 First row contains headers
# Rows: 1
Variables
# Cols: 4
First row contains headers

32 Continuous variables in input data
Variables in new data
Continuous variables in input data
Match sequentially
Match selected
Unmatch all
Unmatch selected
Match by name
