Slide 1: Appendix D: Application of Genetic Algorithm in Classification
Duong Tuan Anh, 5/2014
Slide 2: Classification with decision trees
[Figure: a training data table; each row is a training example described by several attributes plus a Class column with values Yes/No.]
Slide 3: Decision tree
Algorithms exist (e.g., ID3, C4.5) that build a decision tree from the training set.
Slide 4: Classification rules from a decision tree
- Represent the knowledge in the form of IF-THEN rules.
- One rule is created for each path from the root to a leaf.
- Each attribute-value pair along a path forms a conjunct; the leaf node holds the class prediction.
- Rules are easier for humans to understand.
Example:
IF age = "<=30" AND student = "no" THEN buys_computer = "no"
IF age = "<=30" AND student = "yes" THEN buys_computer = "yes"
IF age = "31…40" THEN buys_computer = "yes"
IF age = ">40" AND credit_rating = "excellent" THEN buys_computer = "no"
IF age = ">40" AND credit_rating = "fair" THEN buys_computer = "yes"
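A rule read off the tree can be checked against a data record. The sketch below shows one way; the attribute names come from the buys_computer example above, while the rule representation and the `rule_matches` helper are illustrative choices, not from the slides:

```python
# A rule is a list of (attribute, value) conjuncts plus a predicted class.

def rule_matches(conjuncts, example):
    """True if the example satisfies every attribute-value pair in the antecedent."""
    return all(example.get(attr) == value for attr, value in conjuncts)

# IF age = "<=30" AND student = "no" THEN buys_computer = "no"
rule = {"if": [("age", "<=30"), ("student", "no")],
        "then": ("buys_computer", "no")}

example = {"age": "<=30", "student": "no", "credit_rating": "fair"}
print(rule_matches(rule["if"], example))  # True: the rule fires, predicting "no"
```

When the rule fires, the classifier outputs the class in the rule's consequent; when no rule fires, a default class is usually returned.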
Slide 5: GA for classification rule discovery
Individual representation:
- Each individual encodes a single classification rule.
- Each rule is represented as a bit string.
Example: instances in the training set are described by two Boolean attributes, A1 and A2, and there are two classes, C1 and C2.
- Rule: IF A1 AND NOT A2 THEN C2 → bit string "100"
- Rule: IF NOT A1 AND NOT A2 THEN C1 → bit string "001"
If an attribute has k values, k > 2, then k bits are used to encode the attribute's values. Classes can be encoded in a similar fashion.
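Under this encoding, a bit string can be decoded back into rule text. A minimal sketch, assuming one bit per Boolean attribute (1 asserts the attribute, 0 its negation) and a final class bit (1 → C1, 0 → C2), which matches the two examples above:

```python
def decode_rule(bits):
    """Decode a 3-bit string into a rule over Boolean attributes A1, A2.

    Bits 0 and 1 assert A1 and A2 when set, their negations otherwise;
    the last bit selects the class (1 -> C1, 0 -> C2).
    """
    a1 = "A1" if bits[0] == "1" else "NOT A1"
    a2 = "A2" if bits[1] == "1" else "NOT A2"
    cls = "C1" if bits[2] == "1" else "C2"
    return f"IF {a1} AND {a2} THEN {cls}"

print(decode_rule("100"))  # IF A1 AND NOT A2 THEN C2
print(decode_rule("001"))  # IF NOT A1 AND NOT A2 THEN C1
```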
Slide 6: Genetic operators for rule discovery
Generalizing/specializing crossover:
- Overfitting: a rule covers only one (or very few) training examples; the remedy is generalization.
- Underfitting: a rule covers too many training examples; the remedy is specialization.
- The generalizing and specializing crossover operators can be implemented as the logical OR and AND, respectively, applied to the segment between the two crossover points.
Example (two crossover points, marked by |):

Parents      Children (generalizing, OR)   Children (specializing, AND)
0|10|1       0|11|1                        0|00|1
1|01|0       1|11|0                        1|00|0
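The two operators can be sketched directly on bit strings; this assumes the segment between the crossover points is replaced in both children by the OR (respectively AND) of the parents' segments, as in the slide's example (function names are my own):

```python
def generalizing_crossover(p1, p2, i, j):
    """Two-point crossover: both children get the bitwise OR of the
    parents' segments between points i and j, generalizing the rules."""
    seg = "".join("1" if a == "1" or b == "1" else "0"
                  for a, b in zip(p1[i:j], p2[i:j]))
    return p1[:i] + seg + p1[j:], p2[:i] + seg + p2[j:]

def specializing_crossover(p1, p2, i, j):
    """Same, but with bitwise AND, specializing the rules."""
    seg = "".join("1" if a == "1" and b == "1" else "0"
                  for a, b in zip(p1[i:j], p2[i:j]))
    return p1[:i] + seg + p1[j:], p2[:i] + seg + p2[j:]

# The slide's example: parents 0|10|1 and 1|01|0, crossover points 1 and 3.
print(generalizing_crossover("0101", "1010", 1, 3))  # ('0111', '1110')
print(specializing_crossover("0101", "1010", 1, 3))  # ('0001', '1000')
```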
Slide 7: Fitness function
Let a rule be of the form IF A THEN C, where A is the antecedent and C is the predicted class.
The predictive accuracy of a rule, called its confidence factor (CF), is defined as:
CF = |A & C| / |A|
- |A|: the number of examples satisfying all the conditions in the antecedent A.
- |A & C|: the number of examples that both satisfy the antecedent A and have the class predicted by the consequent C.
Example: a rule covers 10 examples (i.e., |A| = 10), of which 8 have the class predicted by the rule (i.e., |A & C| = 8); the CF of the rule is then 8/10 = 80%.
The performance of a rule can be summarized by a matrix called a confusion matrix.
Slide 8: Confusion matrix
- TP (true positives): number of examples satisfying A and C.
- FP (false positives): number of examples satisfying A but not C.
- FN (false negatives): number of examples not satisfying A but satisfying C.
- TN (true negatives): number of examples satisfying neither A nor C.

                         Actual class C    Actual class not C
Predicted C (A holds)         TP                  FP
Predicted not C               FN                  TN

The CF measure, in terms of the above notation: CF = TP / (TP + FP).
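The four counts and CF can be computed directly from labelled examples. A sketch, assuming the rule is given as an antecedent predicate plus a predicted class (the data and names are illustrative):

```python
def confusion_counts(antecedent, predicted_class, examples):
    """Count TP, FP, FN, TN for a rule over (example, actual_class) pairs.

    `antecedent` is a predicate returning True when the example
    satisfies A; `predicted_class` is the consequent C.
    """
    tp = fp = fn = tn = 0
    for example, actual in examples:
        covers = antecedent(example)          # does the example satisfy A?
        correct = actual == predicted_class   # does it have class C?
        if covers and correct:
            tp += 1
        elif covers:
            fp += 1
        elif correct:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def confidence_factor(tp, fp):
    """CF = TP / (TP + FP)."""
    return tp / (tp + fp) if tp + fp else 0.0

# Toy data: the antecedent is "x > 5", the predicted class is "yes".
data = [(3, "no"), (7, "yes"), (8, "yes"), (9, "no"), (2, "yes")]
tp, fp, fn, tn = confusion_counts(lambda x: x > 5, "yes", data)
print(tp, fp, fn, tn)             # 2 1 1 1
print(confidence_factor(tp, fp))  # 2/3
```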
Slide 9: Fitness function (cont.)
We can now measure the predictive power of a rule by taking into account not only its CF but also how "complete" the rule is.
Completeness of the rule: the proportion of examples having the predicted class C that are actually covered by the rule antecedent.
The rule completeness measure: Comp = TP / (TP + FN).
The fitness function combines the CF and Comp measures: Fitness = CF × Comp.
An initial population is created consisting of randomly generated rules. The process of generating a new population from prior populations of rules continues until a population P evolves in which each rule satisfies a prespecified fitness threshold.
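The combined measure is then a one-liner over the confusion counts. A minimal sketch; the TP = 8, FP = 2 figures reuse the CF example from slide 7, while FN = 2 is an assumed value for illustration:

```python
def fitness(tp, fp, fn):
    """Fitness = CF * Comp, with CF = TP/(TP+FP) and Comp = TP/(TP+FN)."""
    cf = tp / (tp + fp) if tp + fp else 0.0
    comp = tp / (tp + fn) if tp + fn else 0.0
    return cf * comp

# TP = 8, FP = 2 as on slide 7 (10 covered, 8 correct); FN = 2 assumed.
# Then CF = Comp = 0.8, so fitness = 0.64.
print(fitness(tp=8, fp=2, fn=2))
```

A rule must therefore be both accurate on the examples it covers and cover a good share of its class to score well.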
Slide 10: Reference
A. A. Freitas, "A Survey of Evolutionary Algorithms for Data Mining and Knowledge Discovery", in: Advances in Evolutionary Computing, Springer, 2003.