1
Dynamic Integration of Virtual Predictors Vagan Terziyan Information Technology Research Institute, University of Jyvaskyla, FINLAND e-mail: vagan@it.jyu.fi URL: http://www.cs.jyu.fi/ai CIMA2001/AIDA’2001 Bangor, June 19-22, 2001
2
Acknowledgements Information Technology Research Institute (University of Jyvaskyla): Customer-oriented research and development in Information Technology http://www.titu.jyu.fi/eindex.html Multimeetmobile (MMM) Project (2000-2001): Location-Based Service System and Transaction Management in Mobile Electronic Commerce http://www.cs.jyu.fi/~mmm Academy of Finland Project (1999): Dynamic Integration of Classification Algorithms
3
Acknowledgements also to a Team (external to MMM project) Alexey Tsymbal Irina Skrypnik Seppo Puuronen Department of Computer Science and Information Systems, University of Jyvaskyla http://www.cs.jyu.fi
4
Contents
- The problem
- Virtual Predictor
- Classification Team
- Team Direction
- Dynamic Selection of Classification Team
- Implementation for Mobile e-Commerce
- Conclusion
5
Inductive learning with integration of predictors
[Figure: sample instances with known outcomes y_t are drawn from the learning environment and used to train the predictors/classifiers P_1, P_2, ..., P_n]
6
Virtual Classifier
A Virtual Classifier is a group of seven cooperative agents:
- TC - Team Collector
- TM - Training Manager
- TP - Team Predictor
- TI - Team Integrator
- FS - Feature Selector
- DE - Distance Evaluator
- CL - Classification Processor
7
Classification Team: Feature Selector (FS)
8
Feature Selection
- Feature selection methods try to pick a subset of features that are relevant to the target concept;
- Each of these methods has its strengths and weaknesses depending on data types and domain characteristics;
- The choice of a feature selection method depends on various data set characteristics: (i) data types, (ii) data size, and (iii) noise.
9
Classification of feature selection methods [Dash and Liu, 1997]
10
Feature Selector: finds the minimally sized feature subset that is sufficient for correct classification of the instances
[Figure: the Feature Selector operates on the sample instances]
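The Feature Selector's task above can be sketched as an exhaustive search over feature subsets of increasing size, scored here by leave-one-out 1-NN accuracy on the sample instances. This is a minimal illustrative sketch, not the MLS implementation; all function names are hypothetical.

```python
from itertools import combinations

def leave_one_out_accuracy(X, y, features):
    """Leave-one-out 1-NN accuracy using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        best_d, best_j = None, None
        for j in range(len(X)):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_d is None or d < best_d:
                best_d, best_j = d, j
        if y[best_j] == y[i]:
            correct += 1
    return correct / len(X)

def minimal_feature_subset(X, y):
    """Smallest feature subset sufficient for correct classification of the
    sample instances; falls back to the best subset found if none is perfect."""
    n_features = len(X[0])
    best_subset, best_acc = tuple(range(n_features)), 0.0
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            acc = leave_one_out_accuracy(X, y, subset)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
            if acc == 1.0:          # minimal sufficient subset found
                return subset
    return best_subset
```

The exhaustive search is exponential in the number of features; the methods surveyed by Dash and Liu (1997) exist precisely to avoid this cost on realistic data.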
11
Classification Team: Distance Evaluator (DE)
12
Use of distance evaluation
- The distance between instances is useful for recognizing the nearest neighborhood of any classified instance;
- The distance between classes is useful for defining the misclassification error;
- The distance between classifiers is useful for evaluating the weight of each classifier for their further integration.
13
Well-known distance functions [Wilson & Martinez, 1997]
14
Distance Evaluator: measures the distance between instances based on their numerical or nominal attribute values
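One of the heterogeneous distance functions from the Wilson & Martinez (1997) survey cited above, HEOM (Heterogeneous Euclidean-Overlap Metric), handles exactly this mix of numerical and nominal attributes. A minimal sketch (the function name and argument layout are illustrative):

```python
def heom(x, y, ranges, nominal):
    """Heterogeneous Euclidean-Overlap Metric (Wilson & Martinez, 1997):
    overlap distance for nominal attributes, range-normalized absolute
    difference for numeric ones, maximal distance for missing values."""
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:       # missing value -> distance 1
            d = 1.0
        elif a in nominal:                 # overlap metric for nominal attrs
            d = 0.0 if xa == ya else 1.0
        else:                              # range-normalized numeric difference
            d = abs(xa - ya) / ranges[a] if ranges[a] else 0.0
        total += d * d
    return total ** 0.5
```

Here `ranges[a]` is the observed range (max minus min) of numeric attribute `a` over the sample instances, and `nominal` is the set of nominal attribute indices.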
15
Classification Team: Classification Processor (CL)
16
Classification Processor: predicts the class of a new instance based on its selected features and its location relative to the sample instances
[Figure: the Classification Processor combines the Feature Selector, the Distance Evaluator, and the sample instances]
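The three team members compose naturally: the Feature Selector supplies the feature indices, the Distance Evaluator supplies the distance function, and the Classification Processor predicts from the nearest sample instances. A hypothetical sketch using distance-weighted k-NN as the classification rule:

```python
def classify(new_instance, sample_X, sample_y, features, distance, k=3):
    """Predict the class of a new instance from its k nearest sample
    instances, comparing only the selected feature indices."""
    project = lambda x: [x[f] for f in features]
    # Distance from the new instance to every sample instance
    neighbours = sorted(
        ((distance(project(new_instance), project(xi)), yi)
         for xi, yi in zip(sample_X, sample_y)),
        key=lambda pair: pair[0])[:k]
    votes = {}
    for d, yi in neighbours:
        votes[yi] = votes.get(yi, 0.0) + 1.0 / (1.0 + d)  # closer -> stronger vote
    return max(votes, key=votes.get)
```

Any of the distance functions from the previous slides (e.g. HEOM) can be passed in as `distance`, which is what makes the team members interchangeable.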
17
Team Instructors: Team Collector (TC) - completes classification teams for training
18
Team Collector - completes classification teams for future training
[Figure: the Team Collector assembles each team FS_i + DE_j + CL_k from the available feature selection methods, distance evaluation functions, and classification rules]
19
Team Instructors: Training Manager (TM) - trains all completed teams on sample instances
20
Training Manager - trains all completed teams on sample instances
[Figure: the Training Manager runs the classification teams FS_i1 DE_j1 CL_k1, ..., FS_in DE_jn CL_kn on the sample instances and records the results as sample metadata]
21
Team Instructors: Team Predictor (TP) - predicts a weight for every classification team at a certain location
22
Team Predictor - predicts a weight for every classification team at a certain location
[Figure: the Team Predictor (e.g. a WNN algorithm) takes the sample metadata and a location and outputs the predicted weights of the classification teams]
23
Team Predictor - predicts a weight for every classification team at a certain location
[Figure: the weight of team j at a new point P_i is predicted from its nearest neighbors NN_1 ... NN_4 in the sample metadata and their distances: w_ij = F(w_1j, d_1, w_2j, d_2, w_3j, d_3, d_max)]
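One concrete (hypothetical) choice of the function F above is a distance-weighted average of the team's recorded weights at the k nearest metadata points, so closer neighbors contribute more:

```python
def predict_team_weight(location, metadata, team, distance, k=3):
    """Predict a team's weight at a new location as the distance-weighted
    average of its weights at the k nearest sample-metadata points --
    one possible instantiation of w_ij = F(w_1j, d_1, ..., d_max)."""
    # metadata: list of (point, {team_name: weight}) pairs from training
    nearest = sorted(((distance(location, p), ws) for p, ws in metadata),
                     key=lambda pair: pair[0])[:k]
    num = den = 0.0
    for d, ws in nearest:
        c = 1.0 / (1.0 + d)        # closer neighbours get larger coefficients
        num += c * ws[team]
        den += c
    return num / den
```

The choice of coefficient (here 1/(1+d)) is an assumption; any monotonically decreasing function of the distances, bounded by d_max, fits the same scheme.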
24
Team Prediction: Locality assumption
- Each team has certain subdomains in the space of instance attributes where it is more reliable than the others;
- This assumption is supported by the experience that classifiers usually work well not only at certain points of the domain space, but in certain subareas of it [Quinlan, 1993];
- If a team does not work well on the instances near a new instance, then it is quite probable that it will not work well on this new instance either.
25
Team Instructors: Team Integrator (TI) - produces the classification result for a new instance by integrating the appropriate outcomes of the learned teams
26
Team Integrator - produces the classification result for a new instance by integrating the appropriate outcomes of the learned teams
[Figure: each classification team FS_i1 DE_j1 CL_k1, ..., FS_in DE_jn CL_kn classifies the new instance; the Team Integrator combines their outcomes y_t1, y_t2, ... into the final prediction y_t, using the weights of the classification teams at the location of the new instance]
27
Simple case: static or dynamic selection of one of two classification teams
Assume that we have two different classification teams that have been trained on the same sample set of n instances. Let the first team classify m1 sample instances correctly, and the second one m2. We consider two possible ways to select the best team for further classification: static selection and dynamic selection.
28
Static Selection
- Static selection means that we try all teams on the sample set and, for further classification, select the one that achieved the best classification accuracy over the whole sample set. Thus we select a team only once and then use it to classify all new domain instances.
29
Dynamic Selection
- Dynamic selection means that a team is selected for every new instance separately, depending on where that instance is located. If it has been predicted that a certain team can classify the new instance better than the other teams, then that team is used to classify it. In this case we say that the new instance belongs to the "competence area" of that classification team.
30
Theorem
- The average classification accuracy in the case of (dynamic) selection of a classification team for every instance is expected to be not worse than in the case of (static) selection for the whole domain.
- The accuracies of the two cases are equal if and only if k = min(m1, m2), where k is the number of instances correctly classified by both teams.
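For the two-team case the theorem can be checked directly: static selection scores max(m1, m2)/n, while ideal dynamic selection classifies an instance correctly whenever at least one team does, scoring (m1 + m2 - k)/n. A small sketch with illustrative function names:

```python
def static_accuracy(correct1, correct2):
    """Accuracy of picking the single best team for the whole domain.
    correct1/correct2 are per-instance 0/1 indicators of correctness."""
    n = len(correct1)
    return max(sum(correct1), sum(correct2)) / n

def dynamic_accuracy(correct1, correct2):
    """Accuracy of ideal per-instance selection: an instance is handled
    correctly when at least one team classifies it correctly."""
    n = len(correct1)
    return sum(1 for c1, c2 in zip(correct1, correct2) if c1 or c2) / n
```

Since m1 + m2 - k >= max(m1, m2) always holds (with equality exactly when k = min(m1, m2)), dynamic selection can never score below static selection on the same data.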
31
"Competence areas" of classification teams in dynamic selection
[Figure: of the n sample instances, team 1 classifies m1 correctly and team 2 classifies m2 correctly; the two sets overlap in k instances]
32
M-Commerce LBS system http://www.cs.jyu.fi/~mmm In the framework of the Multi Meet Mobile (MMM) project at the University of Jyväskylä, an LBS pilot system, the MMM Location-based Service system (MLS), has been developed. MLS is a general LBS system for mobile users, offering maps and navigation across multiple geographically distributed services, together with access to location-based information through the map on the terminal's screen. MLS is based on Java and XML and uses dynamic selection of services for customers based on their profile and location.
33
Architecture of the LBS system
[Figure: a Personal Trusted Device connects over the mobile network to the Positioning Service and the Location-Based Service, which draws on geographical/spatial data and location-based data: (1) a services database (access history); (2) a customers database (profiles)]
34
Sample from the location-based services' access history
[Table: each row pairs a mobile customer description with the ordered service]
35
Adaptive interface for the MLS client
Only the predicted services for a customer with a known profile and location will be delivered from MLS and displayed on the mobile terminal screen as clickable "points of interest".
36
Conclusion
- Knowledge discovery with an ensemble of classifiers is known to be more accurate than with any single classifier [e.g. Dietterich, 1997].
- If a classifier in effect consists of a certain feature selection algorithm, distance evaluation function, and classification rule, then why not consider these parts as ensembles too, making the classifier itself more flexible?
- We expect that classification teams assembled from different feature selection, distance evaluation, and classification methods will be more accurate than any ensemble of known classifiers alone, and we focus our research and implementation on this assumption.