Data Mining (and machine learning)


1 Data Mining (and machine learning)
ROC curves
Rule Induction
CW3

2 Two classes is a common and special case

3 Two classes is a common and special case
Medical applications: cancer, or not?
Computer Vision applications: landmine, or not?
Security applications: terrorist, or not?
Biotech applications: gene, or not?
…

4 Two classes is a common and special case
Medical applications: cancer, or not?
Computer Vision applications: landmine, or not?
Security applications: terrorist, or not?
Biotech applications: gene, or not?
…

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative

5 Two classes is a common and special case
True Positive: these are ideal. E.g. we correctly detect cancer.

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative

6 Two classes is a common and special case
True Positive: these are ideal. E.g. we correctly detect cancer.
False Positive: to be minimised – causes a false alarm. It can be better to be safe than sorry, but false alarms can be very costly.

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative

7 Two classes is a common and special case
True Positive: these are ideal. E.g. we correctly detect cancer.
False Positive: to be minimised – causes a false alarm. It can be better to be safe than sorry, but false alarms can be very costly.
False Negative: also to be minimised – missing a landmine / cancer is very bad in many applications.

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative

8 Two classes is a common and special case
True Positive: these are ideal. E.g. we correctly detect cancer.
False Positive: to be minimised – causes a false alarm. It can be better to be safe than sorry, but false alarms can be very costly.
False Negative: also to be minimised – missing a landmine / cancer is very bad in many applications.
True Negative?

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative
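A minimal Python sketch of how these four counts come out of paired actual/predicted labels (the function name and data are illustrative, not from the slides):

# Tally the four cells of the 2-class confusion matrix.
def confusion_counts(actual, predicted):
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == 'Y' and p == 'Y':
            tp += 1
        elif a == 'N' and p == 'Y':
            fp += 1
        elif a == 'Y' and p == 'N':
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Example: four cases, one false positive.
print(confusion_counts(['Y', 'N', 'Y', 'N'], ['Y', 'Y', 'Y', 'N']))  # (2, 1, 0, 1)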

9 Sensitivity and Specificity: common measures of accuracy in this kind of 2-class task

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative

10 Sensitivity and Specificity: common measures of accuracy in this kind of 2-class task
Sensitivity = TP/(TP+FN) – how many of the real ‘Yes’ cases are detected? How well can it detect the condition?
Specificity = TN/(TN+FP) – how many of the real ‘No’ cases are correctly classified? How well can it rule out the condition?

            Predicted Y      Predicted N
Actually Y  True Positive    False Negative
Actually N  False Positive   True Negative
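Continuing the sketch above, both measures follow directly from the four counts:

def sensitivity(tp, fn):
    # How many of the real 'Yes' cases are detected?
    return tp / (tp + fn)

def specificity(tn, fp):
    # How many of the real 'No' cases are correctly classified?
    return tn / (tn + fp)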

11 [Figure: YES and NO cases plotted, with a classifier decision line]

12 [Figure: the same YES/NO cases with the decision line]

13 Sensitivity: 100%, Specificity: 25%
[Figure: YES/NO scatter with the decision line]

14 Sensitivity: 93.8%, Specificity: 50%
[Figure: YES/NO scatter with the decision line]

15 Sensitivity: 81.3%, Specificity: 83.3%
[Figure: YES/NO scatter with the decision line]

16 Sensitivity: 56.3%, Specificity: 100%
[Figure: YES/NO scatter with the decision line]

17 Sensitivity: 100%, Specificity: 25%
[Figure: YES/NO scatter with the decision line]
100% Sensitivity means: detects all cancer cases (or whatever), but possibly with many false positives

18 Sensitivity: 56.3%, Specificity: 100%
[Figure: YES/NO scatter with the decision line]
100% Specificity means: misses some cancer cases (or whatever), but no false positives

19 Sensitivity and Specificity: common measures of accuracy in this kind of 2-class task
Sensitivity = TP/(TP+FN) – how many of the real TRUE cases are detected? How sensitive is the classifier to TRUE cases? A highly sensitive test for cancer: if it says “NO”, you can be sure it’s “NO”.
Specificity = TN/(TN+FP) – how sensitive is the classifier to the negative cases? A highly specific test for cancer: if it says “Y”, you can be sure it’s “Y”.
With many trained classifiers, you can ‘move the line’ in this way. E.g. with NB, we could use a threshold indicating how much higher the log likelihood for Y should be than for N.
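As a hedged illustration of ‘moving the line’ with Naive Bayes: rather than predicting ‘Y’ whenever the log likelihood for Y beats the log likelihood for N, require it to win by at least a threshold t. Raising t trades sensitivity for specificity. (The log-likelihood values below are invented for the example.)

def classify(log_like_y, log_like_n, t=0.0):
    # Predict 'Y' only if Y's log likelihood beats N's by at least t.
    return 'Y' if (log_like_y - log_like_n) >= t else 'N'

print(classify(-3.1, -3.4, t=0.0))  # 'Y'  (margin 0.3 >= 0)
print(classify(-3.1, -3.4, t=0.5))  # 'N'  (margin 0.3 < 0.5)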

20 ROC curves
David Corne and Nick Taylor, Heriot-Watt University – These slides and related resources:
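An ROC curve traces sensitivity against (1 − specificity) as that threshold sweeps across all values. A small self-contained sketch, with scores and labels invented for illustration:

def roc_points(scores, labels):
    # One ROC point per distinct threshold: predict 'Y' when score >= t.
    pos = labels.count('Y')
    neg = labels.count('N')
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 'Y')
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 'N')
        points.append((fp / neg, tp / pos))  # (1 - specificity, sensitivity)
    return points

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = ['Y', 'Y', 'N', 'Y', 'N', 'N']
print(roc_points(scores, labels))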

21 Rule Induction
Rules are useful when you want to learn a clear / interpretable classifier, and are less worried about squeezing out as much accuracy as possible.
There are a number of different ways to ‘learn’ rules or rulesets. Before we go there, what is a rule / ruleset?

22 Rules
IF Condition … Then Class Value is …

23 Rules are Rectangular
IF (X>0)&(X<5)&(Y>0.5)&(Y<5) THEN YES
[Figure: YES/NO scatter with the rule’s rectangle drawn on it]

24 Rules are Rectangular
IF (X>5)&(X<11)&(Y>4.5)&(Y<5.1) THEN NO
[Figure: YES/NO scatter with the rule’s rectangle drawn on it]
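A rectangular rule is just a conjunction of interval tests, so it translates directly into code (bounds copied from the first rule above):

def rule_yes(x, y):
    # IF (X>0)&(X<5)&(Y>0.5)&(Y<5) THEN YES
    return 0 < x < 5 and 0.5 < y < 5

print(rule_yes(2.0, 1.0))  # True: inside the rectangle
print(rule_yes(6.0, 1.0))  # False: x falls outside (0, 5)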

25 A Ruleset
IF Condition1 … Then Class = A
IF Condition2 … Then Class = A
IF Condition3 … Then Class = B
IF Condition4 … Then Class = C
…

26 What’s wrong with this ruleset? (two things)
[Figure: YES/NO scatter with a candidate set of rule rectangles]

27 What about this ruleset?
[Figure: YES/NO scatter with another candidate set of rule rectangles]

28 Two ways to interpret a ruleset:

29 Two ways to interpret a ruleset:
As a Decision List:
IF Condition1 … Then Class = A
ELSE IF Condition2 … Then Class = A
ELSE IF Condition3 … Then Class = B
ELSE IF Condition4 … Then Class = C
…
ELSE … predict Background Majority Class
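A sketch of decision-list evaluation: try the rules in order and let the first condition that fires decide the class. (Representing conditions as functions is my choice here, not the slides’.)

def predict_decision_list(rules, example, default_class):
    # rules: ordered list of (condition_fn, class_label) pairs.
    for condition, label in rules:
        if condition(example):
            return label       # first matching rule wins
    return default_class       # ELSE: background majority class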

30 Two ways to interpret a ruleset:
As an unordered set:
IF Condition1 … Then Class = A
IF Condition2 … Then Class = A
IF Condition3 … Then Class = B
IF Condition4 … Then Class = C
Check each rule and gather votes for each class.
If no winner, predict the background majority class.
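The unordered reading instead fires every matching rule and takes a vote; a sketch under the same (condition_fn, class_label) representation:

from collections import Counter

def predict_unordered(rules, example, default_class):
    votes = Counter(label for condition, label in rules if condition(example))
    if not votes:
        return default_class   # no rule fired
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return default_class   # tie: no clear winner
    return ranked[0][0]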

31 Three broad ways to learn rulesets

32 Three broad ways to learn rulesets
1. Just build a decision tree with ID3 (or something else) and you can translate the tree into rules!
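For instance, with scikit-learn (whose trees are CART-style rather than ID3, but the translation works the same way), each root-to-leaf path reads off as one rule; the toy data here is invented:

from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1, 1], [2, 4], [6, 5], [8, 1]]
y = ['YES', 'YES', 'NO', 'NO']

tree = DecisionTreeClassifier().fit(X, y)
print(export_text(tree, feature_names=['X', 'Y']))
# Each printed path, e.g. "X <= 4.00 ... class: YES",
# becomes one IF ... THEN ... rule.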

33 Three broad ways to learn rulesets
2. Use any good search/optimisation algorithm. Evolutionary (genetic) algorithms are the most common. You will do this in coursework 3 (CW3). This means simply guessing a ruleset at random, and then trying mutations and variants, gradually improving them over time.
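A heavily simplified sketch of that idea, with each rule held as an (x_lo, x_hi, y_lo, y_hi, class) rectangle; the representation, mutation, and hill-climbing loop are illustrative assumptions, not the actual CW3 program:

import random

def mutate(ruleset):
    # Copy the ruleset, then nudge one random bound of one random rule.
    new = [list(r) for r in ruleset]
    rule = random.choice(new)
    i = random.randrange(4)              # one of the four numeric bounds
    rule[i] += random.uniform(-0.5, 0.5)
    return [tuple(r) for r in new]

def evolve(ruleset, fitness, steps=1000):
    # (1+1)-style loop: keep any mutant at least as fit as the incumbent.
    best, best_fit = ruleset, fitness(ruleset)
    for _ in range(steps):
        candidate = mutate(best)
        f = fitness(candidate)
        if f >= best_fit:
            best, best_fit = candidate, f
    return best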

34 Three broad ways to learn rulesets
3. A number of ‘old’ AI algorithms exist that still work well, and/or can be engineered to work with an evolutionary algorithm. The basic idea is: iterated coverage

35 Take each class in turn…
[Figure: YES/NO scatter on the X-Y plane]

36 Pick a random member of that class in the training set
[Figure: one point of the current class highlighted]

37 Extend it as much as possible without including another class
[Figure: a small rectangle grown around the chosen point]

38 Extend it as much as possible without including another class
[Figure: the rectangle extended further]

39 Extend it as much as possible without including another class
[Figure: the rectangle extended further still]

40 Extend it as much as possible without including another class
[Figure: the rectangle at its fullest extent without touching the other class]

41 Next class
[Figure: a rectangle now being grown for the other class]

42 Next class
[Figure: the rectangle for the next class extended]

43 And so on…
[Figure: the scatter progressively covered by rule rectangles]
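A sketch of the iterated-coverage pictures in code: seed a degenerate rectangle on an uncovered point of the current class, greedily widen each side while no other-class point falls inside, then repeat with a fresh seed until the class is covered. This is my reconstruction of the figures, not a specific published algorithm:

import random

def grow_rule(seed, other, step=0.1, rounds=100):
    x, y = seed
    rect = [x, x, y, y]                  # [lo_x, hi_x, lo_y, hi_y]
    def clean(r):
        # True if no other-class point lies inside the rectangle.
        return not any(r[0] <= px <= r[1] and r[2] <= py <= r[3]
                       for px, py in other)
    for _ in range(rounds):
        grown = False
        for i, d in [(0, -step), (1, step), (2, -step), (3, step)]:
            trial = rect[:]
            trial[i] += d
            if clean(trial):             # extend this side if still pure
                rect, grown = trial, True
        if not grown:
            break                        # no side can extend any further
    return tuple(rect)

def cover_class(points, other, **kw):
    rules, uncovered = [], list(points)
    while uncovered:
        rect = grow_rule(random.choice(uncovered), other, **kw)
        rules.append(rect)
        uncovered = [(px, py) for px, py in uncovered
                     if not (rect[0] <= px <= rect[1]
                             and rect[2] <= py <= rect[3])]
    return rules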

44 CW3
Run the expts program, which evolves a ruleset.
Try different sizes of training and test set.
Observe ‘overfitting’ and report.
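The expts program itself isn’t reproduced here, but the shape of the experiment is generic: train on n examples, score on both the training data and held-out test data, and watch the gap. A hedged sketch with scikit-learn (any classifier would do in place of the tree):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
for n_train in [20, 50, 100, 200, 400]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=n_train, random_state=0)
    clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    # A big train/test accuracy gap, worst at small n_train,
    # is the overfitting to observe and report.
    print(n_train, clf.score(X_tr, y_tr), clf.score(X_te, y_te))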

