Mining Disjunctive Association Rules Using Genetic Programming Michelle Lyman Gary Lewandowski Department of Mathematics and Computer Science Xavier University.


1 Mining Disjunctive Association Rules Using Genetic Programming Michelle Lyman Gary Lewandowski Department of Mathematics and Computer Science Xavier University Celebration of Student Research April 4, 2005

2 Genetic Algorithms
What are they? (Example: string evolution)
Target string: xavier
Generation 1: \iTksr, fitness = 1
Generation 504: \aT:sr, fitness = 2
Generation 1143: \aT:er, fitness = 3
Generation 1498: \av:er, fitness = 4
Generation 1701: xav:er, fitness = 5
Generation 1857: xavier, fitness = 6
Crossover
First string: xPki;&
Second string: \aT:er
Index: 3
Resulting strings: xPkier and \aT:;&
Selection (Roulette Wheel)
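The string-evolution example above can be sketched in Python. This is only an illustration under stated assumptions, not the presenters' code: fitness is assumed to count the positions where a candidate matches the target (which agrees with the values on the slide), "Index: 3" is assumed to be a 0-based position after which the tails are swapped (which reproduces the slide's resulting strings), and the function names are hypothetical.

```python
import random

TARGET = "xavier"

def fitness(candidate, target=TARGET):
    """Count the positions where candidate and target hold the same character."""
    return sum(1 for c, t in zip(candidate, target) if c == t)

def crossover(first, second, index):
    """Single-point crossover: swap everything after 0-based position `index`."""
    cut = index + 1
    return first[:cut] + second[cut:], second[:cut] + first[cut:]

def roulette_select(population, rng=random):
    """Pick one string with probability proportional to its fitness."""
    weights = [fitness(s) for s in population]
    if sum(weights) == 0:
        # Every candidate scores 0, so fall back to a uniform choice.
        return rng.choice(population)
    return rng.choices(population, weights=weights, k=1)[0]

print(fitness("\\aT:er"))                  # 3, as on the slide
print(crossover("xPki;&", "\\aT:er", 3))   # the slide's two resulting strings
```

Roulette-wheel selection gives fitter strings a proportionally larger chance of being chosen as parents, while still letting weaker strings through occasionally.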

3 Genetic Programming Trees
[Figures: a tree for an arithmetic operation; a tree for a logical operation]

4 Mutation
Change an operator or a value.
Replace an existing subtree with a new subtree.
[Figure: the original tree and two mutated trees]

5 Crossover
[Figure: the original two trees and the resulting trees]
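Tree mutation and crossover can be sketched with nested tuples. This is a minimal illustration, not the presenters' implementation: a tree is assumed to be either a leaf value or a tuple (operator, left, right), a node is addressed by a path of 0 (left) and 1 (right) steps, and all names here are hypothetical.

```python
import random

OPERATORS = ["+", "-", "*"]

def get_at(tree, path):
    """Return the subtree reached by following `path` (0 = left, 1 = right)."""
    for step in path:
        tree = tree[1 + step]
    return tree

def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return new_subtree
    op, left, right = tree
    if path[0] == 0:
        return (op, replace_at(left, path[1:], new_subtree), right)
    return (op, left, replace_at(right, path[1:], new_subtree))

def mutate_operator(tree, path, rng=random):
    """Mutation by changing the operator of the internal node at `path`."""
    op, left, right = get_at(tree, path)
    new_op = rng.choice([o for o in OPERATORS if o != op])
    return replace_at(tree, path, (new_op, left, right))

def mutate_subtree(tree, path, new_subtree):
    """Mutation by replacing an existing subtree with a new subtree."""
    return replace_at(tree, path, new_subtree)

def crossover(tree_a, path_a, tree_b, path_b):
    """Swap the subtree at `path_a` in tree_a with the one at `path_b` in tree_b."""
    sub_a, sub_b = get_at(tree_a, path_a), get_at(tree_b, path_b)
    return replace_at(tree_a, path_a, sub_b), replace_at(tree_b, path_b, sub_a)

a = ("+", 1, ("*", 2, 3))   # 1 + (2 * 3)
b = ("-", ("+", 4, 5), 6)   # (4 + 5) - 6
print(crossover(a, [1], b, [0]))
```

Because the functions build new tuples rather than modifying their inputs, the parent trees remain available after the offspring are created.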

6 Association Rules
If some condition a exists, then condition c exists. a is the antecedent. c is the consequent.
Regular: If a customer buys peanut butter, then the customer also buys jelly.
Disjunctive: If a customer buys ham or turkey, then the customer also buys wheat bread exclusive-or white bread.
Conjunctive: If a customer buys turkey and cranberry sauce, then the customer also buys corn and pumpkin pie.

7 Card Sorting
 1 function         8 if-then-else   15 encapsulation   22 tree
 2 method           9 boolean        16 parameter       23 thread
 3 procedure       10 scope          17 variable        24 iteration
 4 dependency      11 list           18 constant        25 array
 5 object          12 recursion      19 type            26 event
 6 decomposition   13 choice         20 loop
 7 abstraction     14 state          21 expression

8 Rule Format
[Figure: a rule tree whose antecedent is "loop and iteration" and whose consequent is "tree and recursion"]
If a student groups loop and iteration, then the student also groups tree and recursion.

9 A Small Problem
[Figure: a rule tree whose antecedent is "recursion and tree and loop and iteration" and whose consequent is "function xor method xor scope xor thread"]
Assume we want to build a tree for the rule: if there exists a group that contains recursion and tree, and there exists a group that contains loop and iteration, then there exists a group that contains function exclusive-or method, exclusive-or there exists a group that contains scope exclusive-or thread.
Another possible interpretation of the same tree is: if there exists a group that contains recursion and tree and loop and iteration, then there exists a group that contains function exclusive-or method exclusive-or scope exclusive-or thread.
Solution: the operators SAND, SXOR, SOR, GAND, and GXOR.
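One way to read the five operators is sketched below. The slide does not define them, so the semantics here are assumptions: a G-operator is taken to be tested inside a single group of cards, an S-operator is taken to combine conditions over the whole sort, and G-operators are assumed to appear only beneath S-operators. A sort is modelled as a list of sets of card names, and the function names are hypothetical.

```python
# Assumed semantics: G-operators test one group, S-operators combine
# "there exists a group such that ..." conditions across the sort.
G_OPS = {
    "GAND": lambda l, r: l and r,
    "GXOR": lambda l, r: l != r,
}
S_OPS = {
    "SAND": lambda l, r: l and r,
    "SOR": lambda l, r: l or r,
    "SXOR": lambda l, r: l != r,
}

def holds_in_group(node, group):
    """Evaluate a card name or a G-operator subtree against one group."""
    if isinstance(node, str):
        return node in group
    op, left, right = node
    return G_OPS[op](holds_in_group(left, group), holds_in_group(right, group))

def holds_in_sort(node, sort):
    """Evaluate a rule side against a whole sort (a list of groups)."""
    if not isinstance(node, str) and node[0] in S_OPS:
        op, left, right = node
        return S_OPS[op](holds_in_sort(left, sort), holds_in_sort(right, sort))
    # A card or G-subtree holds if some single group satisfies it.
    return any(holds_in_group(node, group) for group in sort)

sort = [{"recursion", "tree"}, {"loop", "iteration"},
        {"function", "method"}, {"scope"}, {"thread"}]

two_groups = ("SAND", ("GAND", "recursion", "tree"), ("GAND", "loop", "iteration"))
one_group = ("GAND", ("GAND", "recursion", "tree"), ("GAND", "loop", "iteration"))

print(holds_in_sort(two_groups, sort))  # True: two separate groups satisfy it
print(holds_in_sort(one_group, sort))   # False: no single group holds all four cards
```

Under these assumptions the two readings of the antecedent become two different trees, which is the ambiguity the separate S- and G-operators are meant to remove.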

10 Goals
To understand how students understand and relate concepts.
high support - Let the set of sorts be called D. Let the set of sorts that fulfill the antecedent of rule R be called A. Let the set of sorts that fulfill the consequent be called C. Then the support for R is given by [formula missing from transcript].
high confidence - Using the same notation, the confidence for R is given by [formula missing from transcript].
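The formulas on this slide did not survive in the transcript. As a hedged stand-in, the sketch below uses the textbook definitions of support and confidence for association rules, support = |A ∩ C| / |D| and confidence = |A ∩ C| / |A|; the presenters' exact formulas may differ (for example, by scaling to integers between 0 and 100). Sorts are represented by hypothetical integer ids.

```python
def support(num_sorts, antecedent_ids, consequent_ids):
    """Fraction of all sorts that fulfill both antecedent and consequent."""
    return len(antecedent_ids & consequent_ids) / num_sorts

def confidence(antecedent_ids, consequent_ids):
    """Fraction of antecedent-fulfilling sorts that also fulfill the consequent."""
    if not antecedent_ids:
        return 0.0  # assumption: a rule whose antecedent never holds scores 0
    return len(antecedent_ids & consequent_ids) / len(antecedent_ids)

# Ten sorts (|D| = 10); ids of the sorts fulfilling each side of a rule R.
A = {0, 1, 2, 3}
C = {2, 3, 4}
print(support(10, A, C))   # 0.2
print(confidence(A, C))    # 0.5
```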

11 Goals (Continued)
a high number of cards used - Let the lower bound be called low, and the upper bound be called high. Using integer division, if the number of cards used in a tree is n, the score is given by [formula missing from transcript].
a high balance
1. Card nodes have a balance of 1.
2. G-operator nodes have a balance of 1, because each subtree must be true for the g-operator to evaluate to true.
3. S-operator nodes have a balance given by [formula missing from transcript], where leftCount is the number of times the left side was true and rightCount is the number of times the right side was true.
The balance for the rule is the minimum balance of the antecedent and the consequent.
a large percentage of g operators

12 Fitness Function
Each of the following four factors is evaluated. Each can take on an integer value between 0 and 100:
1. support
2. confidence
3. card points
4. g operator points
Each is weighted by a user-specified multiplier, and the values are multiplied together. The result is then scaled by the balance factor, which is a value between 0 and 1.
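The combination described above can be sketched as follows. This is one reading of the slide, not the presenters' code: each factor is assumed to be multiplied by its own weight before the four products are multiplied together, and the function name, dictionary keys, and sample numbers are hypothetical.

```python
def rule_fitness(factors, weights, balance):
    """Multiply the weighted factors together, then scale by the balance."""
    score = 1
    for name, value in factors.items():
        score *= weights[name] * value
    return score * balance

# Illustrative values only; each factor is an integer between 0 and 100.
factors = {"support": 40, "confidence": 50, "card_points": 60, "g_operator_points": 80}
weights = {"support": 1, "confidence": 1, "card_points": 1, "g_operator_points": 1}

print(rule_fitness(factors, weights, balance=0.5))  # 4800000.0
```

One consequence of multiplying rather than adding is that a rule scoring 0 on any single factor gets a fitness of 0, however well it does on the others.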

13 Data
1044 student sorts
158 educator sorts
Performance subsets:
[1, 2): 37 people
[2, 3): 140 people
[3, 4): 447 people
[4, 5): 223 people
[5]: 97 people

14 Results
90% Balance, 50% Confidence
Educators have more rules.
Many sorts separate high-level concepts (e.g., abstraction, encapsulation) from low-level concepts (e.g., array, variable).
Concepts that often appear together:
(procedure, function)
(encapsulation, choice, decomposition, abstraction)
(constant, variable, boolean, array)
(tree, list)
(loop, if-then-else)
(choice, thread)
I don’t know.

