Active Learning based on Bayesian Networks Luis M. de Campos, Silvia Acid and Moisés Fernández.

Presentation transcript:

Active Learning based on Bayesian Networks Luis M. de Campos, Silvia Acid and Moisés Fernández

2 Table of Contents
1. Introduction: the scenario is a pool-based active learning cycle.
2. Data and evaluation: we participated in five of the six datasets considered; evaluation uses AUC and ALC.
3. Methods: features, implemented modules, the general procedure, how to query labels, and a practical example.
4. Results: our best result is sixth position.
5. Conclusions.
6. Acknowledgments.

3 1. Introduction

4 2. Data and evaluation
There are six datasets in the test (final) phase. We participated in five of the six: A, C, D, E and F.
These datasets come from different application domains: chemoinformatics, embryology, marketing and text ranking.
Evaluation is based on:
- Area under the ROC curve (AUC).
- Area under the Learning Curve (ALC).
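The slides do not define ALC beyond its name. As a rough sketch, assuming the learning curve plots AUC against the (log-scaled) number of queried labels, as in the AISTATS 2010 Active Learning Challenge, and ignoring the organisers' normalisation, the area can be approximated with the trapezoidal rule:

```java
// Sketch of an Area-under-the-Learning-Curve computation (trapezoidal rule).
// Assumption: the x-axis is log2(number of queried labels); the exact
// normalisation used by the challenge organisers is omitted here.
public final class LearningCurve {

    /** Trapezoidal area of AUC values plotted against log2(labels queried). */
    static double alc(int[] labelsQueried, double[] auc) {
        double area = 0.0;
        for (int i = 1; i < auc.length; i++) {
            double x0 = Math.log(labelsQueried[i - 1]) / Math.log(2);
            double x1 = Math.log(labelsQueried[i]) / Math.log(2);
            area += 0.5 * (auc[i] + auc[i - 1]) * (x1 - x0);
        }
        return area;
    }

    public static void main(String[] args) {
        int[] labels = {1, 2, 4, 8, 16};               // exponential query schedule
        double[] auc = {0.55, 0.60, 0.68, 0.74, 0.80}; // illustrative AUC values
        System.out.printf("raw ALC = %.3f%n", alc(labels, auc));
    }
}
```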

5 3. Methods. Features
Hardware used: a laptop running Ubuntu 8.10, with 4 GB of memory and an Intel Core Duo at 2.53 GHz.
We used three base classifiers based on Bayesian networks:
- Naive Bayes, used on dataset D.
- TAN (Tree Augmented Network) with the BDeu score, used on dataset F.
- CHillClimber, a new classifier that searches a reduced space of structures centered on the class node; used with the BDeu score on datasets A, C and E.
Discretization of numerical variables: Fayyad & Irani's MDL method for TAN and CHillClimber; none for Naive Bayes.
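The authors' modified Weka code is not shown in the slides. The following is only a sketch of how a comparable setup could look in stock Weka, assuming the standard weka.filters.supervised.attribute.Discretize filter (which implements Fayyad & Irani's MDL criterion), the stock NaiveBayes classifier, and a hypothetical file name dataset_D.arff:

```java
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.supervised.attribute.Discretize;

public class TrainSketch {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("dataset_D.arff").getDataSet(); // hypothetical file name
        data.setClassIndex(data.numAttributes() - 1);

        // Fayyad & Irani MDL-based supervised discretization: per the slides it was
        // applied for TAN and CHillClimber, while Naive Bayes ran on raw attributes.
        Discretize disc = new Discretize();
        disc.setInputFormat(data);
        Instances discretized = Filter.useFilter(data, disc); // would feed TAN / CHillClimber

        Classifier nb = new NaiveBayes();
        nb.buildClassifier(data); // raw data for Naive Bayes, per the slides

        // Class posterior for the first instance: the quantity used by uncertainty sampling.
        double[] posterior = nb.distributionForInstance(data.instance(0));
        System.out.printf("P(class 0 | x) = %.3f (discretized copy has %d attributes)%n",
                posterior[0], discretized.numAttributes());
    }
}
```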

6 3. Methods. Features and Modules
Active learning method: uncertainty sampling. We did not use the unlabeled data for training.
Software implemented (several modules):
- Matlab: main module; it calls the C++ module.
- C++: intermediate module; it calls the Weka-Java module.
- Weka-Java: final module, implemented in Java on top of Weka with several modifications.

7 3. Methods. Procedure
The procedure is as follows:
1. The algorithm trains on all currently labeled instances; initially only the seed is available.
2. It selects new examples to query using one of the methods (a, b, c) described on the next slide.
3. It adds the newly labeled examples to the set of labeled instances.
4. Are all instances labeled? If not, go to step 1; if so, stop.
The number of instances queried in each iteration follows one of three fixed schedules, where n is the total number of labels in the dataset (a sketch of the schedules appears after this slide):
- Exponential: 1, 2, 4, 8, ... labels per iteration (doubling each time).
- Equal10-All: (n/2)/10 labels in each of the first ten iterations, then the remaining n/2 in a final iteration.
- All-Equal10: n/2 labels in the first iteration, then (n/2)/10 in each of the following ten iterations.
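A minimal sketch of the three schedules, assuming simple integer division and a final batch that absorbs any remainder (the exact rounding used by the authors is not stated in the slides):

```java
import java.util.ArrayList;
import java.util.List;

public class QuerySchedules {

    /** Exponential: 1, 2, 4, 8, ... labels per iteration until n have been requested. */
    static List<Integer> exponential(int n) {
        List<Integer> batches = new ArrayList<>();
        int remaining = n;
        for (int b = 1; remaining > 0; b *= 2) {
            int size = Math.min(b, remaining);
            batches.add(size);
            remaining -= size;
        }
        return batches;
    }

    /** Equal10-All: ten batches of (n/2)/10, then everything that is left at once. */
    static List<Integer> equal10All(int n) {
        List<Integer> batches = new ArrayList<>();
        int small = (n / 2) / 10;
        int used = 0;
        for (int i = 0; i < 10; i++) { batches.add(small); used += small; }
        batches.add(n - used);
        return batches;
    }

    /** All-Equal10: half the labels first, then ten batches of (n/2)/10. */
    static List<Integer> allEqual10(int n) {
        List<Integer> batches = new ArrayList<>();
        batches.add(n / 2);
        int small = (n / 2) / 10;
        for (int i = 0; i < 10; i++) batches.add(small);
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(exponential(100)); // [1, 2, 4, 8, 16, 32, 37]
        System.out.println(equal10All(100));  // [5, 5, ..., 5, 50]
        System.out.println(allEqual10(100));  // [50, 5, 5, ..., 5]
    }
}
```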

8 3. Methods. How to query examples (a, b or c)
In each iteration we sort the unlabeled examples in increasing order of the probability of their most probable class. Then we choose x examples with the selected method (a sketch of method a follows this slide):
a. We query the x examples having the lowest probabilities.
b. We query the x1 examples predicted as class -1 and the x2 examples predicted as class 1 having the lowest probabilities, keeping the proportion of examples of each class known so far; x = x1 + x2.
c. Like method b, but x1 and x2 are computed from the class proportions estimated using both the labels returned by the oracle and the predictions of our classifier.
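A minimal sketch of method (a), assuming a generic Model interface as a stand-in for the actual Bayesian network classifier (the authors' implementation lives in their modified Weka code and is not shown in the slides):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class UncertaintySampling {

    interface Model {
        /** Posterior distribution over the classes for one example. */
        double[] distributionFor(double[] example);
    }

    /** Indices of the x least confident unlabeled examples. */
    static List<Integer> selectQueries(Model model, List<double[]> unlabeled, int x) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < unlabeled.size(); i++) indices.add(i);

        // Sort by the probability of the most probable class, ascending:
        // the smaller this value, the less confident the classifier is.
        indices.sort(Comparator.comparingDouble(i -> maxProb(model, unlabeled.get(i))));
        return indices.subList(0, Math.min(x, indices.size()));
    }

    private static double maxProb(Model model, double[] example) {
        double best = 0.0;
        for (double p : model.distributionFor(example)) best = Math.max(best, p);
        return best;
    }

    public static void main(String[] args) {
        // Toy model for illustration: "confidence" is just the first feature value.
        Model toy = x -> new double[] {x[0], 1.0 - x[0]};
        List<double[]> pool = List.of(new double[]{0.9}, new double[]{0.55}, new double[]{0.7});
        System.out.println(selectQueries(toy, pool, 2)); // least confident first: [1, 2]
    }
}
```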

9 3. Methods. An example
Prior knowledge: 6 labeled examples of class -1 and 4 of class 1. In addition, our classifier gives class probabilities for the unlabeled examples (the slide shows three tables: the per-example class probabilities, the probability of each example's most probable class, and those values sorted in increasing order; the numeric values are not reproduced here).
Our exponential schedule indicates that we have to choose 4 examples (we are in iteration three):
- With method a we would choose examples 3, 5, 4, 6.
- With method b we would choose examples 3, 5, 2, 1.
- With method c we would choose examples 3, 5, 4, 2.
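The slide does not say how the 4 queries are split between the classes for method (b); assuming simple proportional rounding over the known 6:4 class counts, a hypothetical computation would be:

```java
// Hypothetical split of the x = 4 queries between the classes for method (b),
// assuming simple proportional rounding over the known 6:4 class counts
// (the slide does not state how the rounding is actually done).
public class MethodBSplit {
    public static void main(String[] args) {
        int knownNeg = 6, knownPos = 4, x = 4;
        int x1 = Math.round((float) x * knownNeg / (knownNeg + knownPos)); // 2 queries among class -1 predictions
        int x2 = x - x1;                                                   // 2 queries among class 1 predictions
        System.out.println("x1 = " + x1 + ", x2 = " + x2);
    }
}
```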

10 4. Results
Our results are rather modest, obtaining reasonable performance only on two datasets, C and E. The slide shows the plot for dataset E on the left and the plot for dataset C on the right (plots not reproduced here).

Dataset | Method                        | Ranking
A       | CHillClimber, exponential, a) | 20/22
C       | TAN, equal10-all, c)          | 6/14
D       | NaiveBayes, all-equal10, a)   | 15/19
E       | CHillClimber, exponential, b) | 12/20
F       | TAN, exponential, b)          | 13/16

11 5. Conclusions
We could improve our process by applying additional processing, such as clustering, when only a few labeled instances are available.
Advantages: simple; not time consuming.
Disadvantages: static behavior; lack of knowledge in the early stages of the process.

12 Acknowledgments This work has been supported by the Spanish research programme Consolider Ingenio 2010: MIPRCV (CSD ).