1 Mining Binary Constraints in Feature Models: A Classification-based Approach 2011.10.10 Yi Li

2 Outline Approach Overview Approach in Detail The Experiments

3 Basic Idea If we focus on binary constraints… – Requires – Excludes We can classify a feature-pair as: – Non-constrained – Require-constrained – Exclude-constrained

4 Approach Overview (pipeline diagram) Training & Test FM(s) → Make Pairs → Training & Test Pair(s) → Vectorize (with the Stanford Parser) → Training Vector(s) / Test Vector(s) → Optimize & Train → Trained Classifier → Test → Classified Test Pair(s)

5 Outline Approach Overview Step 1: Make Pairs The Experiment

6 Rules of Making Pairs Unordered – This means that if (A, B) is a “requires” pair, then A requires B, B requires A, or both. – Why? Because “non-constrained” and “excludes” are unordered; if we used ordered pairs <A, B>, there would be redundant pairs in the “non-constrained” and “excludes” classes. Cross-Tree Only – Pair (A, B) is valid ⇔ A and B have no “ancestor/descendant” relation. – Why? An “excludes” constraint between an ancestor and a descendant is an error, and a “requires” constraint between them is better expressed by optionality.
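A minimal sketch of how this pairing rule could be implemented; the feature names, the parent map, and the helper functions are illustrative assumptions, not the authors' code.

```python
# Sketch of the pair-making step: unordered pairs, cross-tree only.
from itertools import combinations

def ancestors(feature, parent):
    """Walk the parent map upwards and collect all ancestors of a feature."""
    result = set()
    while feature in parent:
        feature = parent[feature]
        result.add(feature)
    return result

def make_pairs(features, parent):
    """Return unordered, cross-tree-only feature pairs as frozensets."""
    pairs = []
    for a, b in combinations(features, 2):          # unordered: each pair appears once
        if b in ancestors(a, parent) or a in ancestors(b, parent):
            continue                                # skip ancestor/descendant pairs
        pairs.append(frozenset((a, b)))
    return pairs

# Toy feature tree (hypothetical): GPL -> {Search, Weighted}, Search -> {BFS, DFS}
parent = {"Search": "GPL", "Weighted": "GPL", "BFS": "Search", "DFS": "Search"}
features = ["GPL", "Search", "Weighted", "BFS", "DFS"]
print(make_pairs(features, parent))   # pairs such as {'Search','Weighted'}, {'BFS','DFS'}, ...
```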

7 Outline Approach Overview Step 2: Vectorize the Pairs The Experiment

8 Vectorization: Text to Number A pair contains 2 features’ names and descriptions (i.e. textual attributes). To work with a classifier, a pair must be represented as a group of numerical attributes. We calculate 4 numerical attributes for pair (A, B): – Similarity(A, B) = Pr(A.description == B.description) – Overlap(A, B) = Pr(A.objects == B.objects) – Target(A, B) = Pr(A.name == B.objects) – Target(B, A) = Pr(B.name == A.objects)
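A sketch of turning a feature pair into the four numerical attributes. The Feature fields and the sim() parameter are assumptions made for illustration; in the approach, sim() would be the tf-idf cosine measure described on a later slide.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    name: str          # feature name
    description: str   # feature description text
    objects: str       # objects extracted from the description by the parser

def vectorize_pair(a: Feature, b: Feature, sim) -> list:
    """Return [Similarity, Overlap, Target(A,B), Target(B,A)] for the pair."""
    return [
        sim(a.description, b.description),   # Similarity(A, B)
        sim(a.objects, b.objects),           # Overlap(A, B)
        sim(a.name, b.objects),              # Target(A, B)
        sim(b.name, a.objects),              # Target(B, A)
    ]

# Usage with a crude word-overlap stand-in for sim():
jaccard = lambda x, y: len(set(x.split()) & set(y.split())) / max(1, len(set(x.split()) | set(y.split())))
a = Feature("Search", "search the graph", "graph")
b = Feature("Weighted", "assign weights to graph edges", "weights graph edges")
print(vectorize_pair(a, b, jaccard))
```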

9 Reasons for Choosing the Attributes Constraints indicate some kind of dependency or interaction between features – Similar feature descriptions – Overlapping objects – A feature being targeted by another – These phenomena increase the chance that such a dependency or interaction exists

10 Use the Stanford Parser to Find Objects The Stanford Parser can perform grammatical analysis on sentences in many languages, including English and Chinese. For English sentences, we extract objects (direct, indirect, prepositional) and any adjectives modifying those objects. The parser works well even for incomplete sentences, which are common in feature descriptions.
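A sketch of this extraction step using Stanford NLP's stanza package as a stand-in for the Stanford Parser used in the talk (my substitution, not the authors' setup); note that stanza emits Universal Dependencies labels (obj/iobj/obl/amod), which differ from the older Stanford typed dependencies (dobj/pobj).

```python
import stanza

# stanza.download('en')   # one-time model download
nlp = stanza.Pipeline('en', processors='tokenize,pos,lemma,depparse')

def extract_objects(text):
    """Collect object words and the adjectives modifying them."""
    objects = []
    for sent in nlp(text).sentences:
        # obj = direct object, iobj = indirect object, obl = oblique nominal
        # (covers most prepositional objects).
        object_ids = {w.id for w in sent.words if w.deprel in ("obj", "iobj", "obl")}
        # conjoined nouns ("links, files and notes") share the object role
        object_ids |= {w.id for w in sent.words
                       if w.deprel == "conj" and w.head in object_ids}
        objects += [w.text for w in sent.words if w.id in object_ids]
        # adjectives whose head is one of the extracted objects
        objects += [w.text for w in sent.words
                    if w.deprel == "amod" and w.head in object_ids]
    return objects

print(extract_objects("Add web links, document files, image files and notes to any event."))
```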

11 Examples – “Add web links, document files, image files and notes to any event.” – “Use a PDF driver to output or publish web calendars so anyone on your team can view scheduled events.” (The slide annotates the direct objects, a prepositional object and an adjective modifier in these sentences.)

12 Calculate the Attributes Each of the 4 attributes follows the general form Pr(Text_A == Text_B), where Text is either the description, objects or name. To calculate: – Stem the words in the Text and remove stop words. – Compute the tf-idf (term frequency, inverse document frequency) value v_i for each word i, so that Text = (v_1, v_2, …, v_n), where n is the total number of distinct words in Text_A and Text_B. – Pr(Text_A == Text_B) = (Text_A · Text_B) / (|Text_A| · |Text_B|)
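A sketch of this cosine measure using scikit-learn's TfidfVectorizer (my substitution; the slide does not name a library). Stemming is omitted here for brevity; a stemmer such as NLTK's PorterStemmer could be plugged into the vectorizer's preprocessing. This function could serve as the sim() argument in the earlier vectorize_pair sketch.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

def tfidf_cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity of the two texts' tf-idf vectors."""
    vectorizer = TfidfVectorizer(stop_words="english")        # removes stop words
    # Fit on just the two texts, so idf is computed over this 2-document corpus.
    vectors = vectorizer.fit_transform([text_a, text_b])
    # Rows are l2-normalised by default, so the dot product equals the cosine.
    return float(vectors[0].multiply(vectors[1]).sum())

print(tfidf_cosine("search the graph", "assign weights to graph edges"))
```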

13 Outline Approach Overview Step 3: Optimize and Train the Classifier The Experiment

14 The Support Vector Classifier A (binary) classification technique that has shown promising empirical results in many practical applications. Basic Idea – Data = Points in k-dimensional space (k is the number of attributes) – Classification = Find a hyperplane (a line in 2-D space) to separate these points

15 Find the Line in 2D (scatter plot over Attribute 1 and Attribute 2) There are infinitely many candidate lines.

16 SVC: Find the Best Line Best = maximum margin (figure: the margins for the red and green classes on the Attribute 1 / Attribute 2 plot). A larger margin leads to fewer prediction errors. The points that define the margin are called “support vectors”.

17 LIBSVM: A Practical SVC Chih-Chung Chang and Chih-Jen Lin, National Taiwan University – See http://www.csie.ntu.edu.tw/~cjlin/libsvm/ Key features of LIBSVM – Easy to use – Integrated support for cross-validation (discussed later) – Built-in support for multi-class classification (more than 2 classes) – Built-in support for unbalanced classes (there are far more NO_CONSTRAINED pairs than the others)

18 LIBSVM: Best Practices 1. Optimize (Find best SVC parameters) – Run cross-validation to compute classification accuracy. – Apply an optimization algorithm to find best accuracy and corresponding parameters. 2. Train with best parameters
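A sketch of this optimize-then-train loop with LIBSVM's Python interface (the libsvm package). The RBF-kernel parameter grid is my own assumption, and grid search here is only a simple stand-in for the genetic algorithm the talk actually uses (see the later sketch).

```python
from libsvm.svmutil import svm_train

def cv_accuracy(y, x, c, gamma, folds=5):
    """k-fold cross-validation accuracy for an RBF-kernel SVC with cost c and gamma."""
    # '-v' makes svm_train return the cross-validation accuracy instead of a model.
    return svm_train(y, x, f"-s 0 -t 2 -c {c} -g {gamma} -v {folds} -q")

# y: class labels (e.g. 0/1/2 for non-/requires-/excludes-constrained),
# x: list of 4-attribute dicts, e.g. {1: sim, 2: overlap, 3: tgt_ab, 4: tgt_ba}
def optimize_and_train(y, x):
    # 1. Optimize: pick the parameters with the best cross-validation accuracy.
    best = max(((cv_accuracy(y, x, c, g), c, g)
                for c in (2 ** k for k in range(-5, 16, 2))
                for g in (2 ** k for k in range(-15, 4, 2))))
    _, c, g = best
    # 2. Train the final classifier with the best parameters found.
    return svm_train(y, x, f"-s 0 -t 2 -c {c} -g {g} -q")
```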

19 Cross-Validation (k-Fold) Divide the training data set into k equal-sized subsets. Run the classifier k times. – During each run, one subset is chosen for testing and the others for training. Compute the average accuracy: accuracy = number of correctly classified / total number

20 The Optimization Algorithm Basic concepts – Solution: a set of parameters to be optimized – Cost function: a function that yields higher values for worse solutions – Optimization tries to find a solution with the lowest cost For the classifier – Cost = 1 – accuracy We use a genetic algorithm for optimization

21 Genetic Algorithm Basic idea – Start with random solutions (initial population) – Produce the next generation from the top elites of the current population: Mutation: slightly change an elite solution, e.g. [0.3, 2, 5] → [0.4, 2, 5]. Crossover (breeding): combine random parts of 2 elite solutions into a new one, e.g. [0.3, 2, 5] and [0.5, 3, 3] → [0.3, 3, 3]. – Repeat until the stop condition has been reached – The best solution of the last generation is taken as the global best
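A small genetic-algorithm sketch for the parameter search; the population size, mutation width, and the toy cost function are illustrative assumptions. In the approach, cost(solution) would be 1 minus the cross-validation accuracy of the SVC trained with those parameters (see the LIBSVM sketch above).

```python
import random

def genetic_search(cost, dims=2, pop_size=20, elite=5, generations=30):
    """Minimise cost() over dims parameters via mutation and crossover of elites."""
    population = [[random.uniform(0.01, 10) for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        elites = sorted(population, key=cost)[:elite]          # lowest cost = best
        population = list(elites)
        while len(population) < pop_size:
            if random.random() < 0.5:                          # mutation
                parent = random.choice(elites)
                child = [max(0.01, v + random.gauss(0, 0.5)) for v in parent]
            else:                                              # crossover (breeding)
                a, b = random.sample(elites, 2)
                cut = random.randrange(1, dims) if dims > 1 else 0
                child = a[:cut] + b[cut:]
            population.append(child)
    return min(population, key=cost)

# Toy demo: find (c, gamma) minimising a known quadratic instead of 1 - accuracy.
print(genetic_search(lambda s: (s[0] - 2) ** 2 + (s[1] - 0.5) ** 2))
```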

22 Outline Overview Details The Experiments

23 Preparing Data We need – 2 feature models, with constraints already added We use 2 feature models from the SPLOT Feature Model Repository – Graph Product Line, by Don Batory – Weather Station, by Pure-Systems Most of the features are terms defined in Wikipedia; we use the first paragraph of each definition as the feature’s description

24 Experiment Settings There are 2 types of experiments. Without Feedback: generate the training & test set → optimize, train and test → result. With Limited Feedback: generate the initial training & test set → optimize, train and test → check a few results → add the checked results to the training set and remove them from the test set → repeat with the updated training & test set.

25 Experiment Settings For each type of experiment, we compare 4 train/test methods (widely used in data mining): 1. Training set = FM1, test set = FM2. 2. Training set = FM1 + a small part of FM2, test set = the rest of FM2. 3. Training set = a small part of FM2, test set = the rest of FM2. 4. The same as 3, but with iterated LU training.

26 What Are the Experiments For? Comparison of the 4 methods: can a trained classifier be applied to different feature models (domains)? – Or: do the constraints in different domains follow the same pattern? Comparison of the 2 experiment categories: does limited feedback (an expected practice in the real world) improve the results?

27 Preliminary Results (A bug was found in the implementation of Methods 2–4, so only Method 1 was run.) Feedback strategy: constrained pairs and higher-similarity pairs first.

Accuracy, Test Model = Graph Product Line: Without feedback 83.95%, Feedback (5) 86.85%, Feedback (10) 88.73%, Feedback (15) 95.45%, Feedback (20) 98.36%.
Accuracy, Test Model = Weather Station: Without feedback 97.84%, Feedback (5) 99.44%, Feedback (10) 99.44%, Feedback (15) 99.44%, Feedback (20) 99.44%.

28 Outline Overview Preparing Data Classification Cross Validation & Optimization The Experiment What’s Next

29 Future Work More FMs for experiments Use the Stanford Parser for Chinese to integrate constraint mining into CoFM

