1
The application of rough sets analysis in activity-based modelling. Opportunities and constraints Speaker: Yanan Yean
2
OUTLINE 1.Introduction 2.Activity-based modelling 3.Rough sets 4.Data 5.Application of rough sets in the SAMBA-project –Case study 1 –Case study 2 6.Conclusions and further challenges
3
1.Introduction Ⅰ Knowledge on travel behavior increases continuously, as researchers constantly make improvements to obtain more accurate and realistic models. If databases grow too large, human inspection and interpretation are no longer feasible, resulting in a gap between data generation and data understanding. A vast multitude of methods exist that ‘learn’ from examples and can be used to extract patterns from data for classification, e.g. rough sets. The rough sets technique is a mathematical tool to search large, complex databases for meaningful decision rules.
4
1.Introduction Ⅱ The aim is to explore how travel data from a Belgian travel behavior survey can be analyzed using rough sets analysis. We first introduce the basic concepts of the rough sets technique. We then explore the possibilities of using rough sets by conducting two case studies in which we assess the performance of the approach in pattern generation, classification and choice prediction.
5
2.Activity-based modelling Ⅰ The most popular and advanced approach in passenger transport modelling is the activity-based modelling approach. It aims at forecasting which activities are done, where, at what time, with whom, for how long and with which transport mode. The application of such models is characterized by many problems. A modelling approach that avoids these problems is qualitative modelling (IF … THEN … ELSE rules).
6
2.Activity-based modelling Ⅱ Another widely used technique within AI is rough sets. Rough sets have already been successfully applied in a wide variety of research fields (medicine, tourism travel demand, geography). Unlike many other data mining (DM) techniques, the obtained results are expressed in a more or less natural language, which makes the results easier to interpret.
7
3.Rough sets-some basic concepts Ⅰ Indiscernibility –Indiscernibility is related to similarity. –Sets of objects will probably not be determined unambiguously; hence, they have to be described roughly through a pair of sets, i.e. a lower and an upper approximation. –An important advantage of the rough set approach is that it can deal with a set of inconsistent examples, i.e. objects indiscernible by condition attributes but discernible by decision attributes.
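The lower and upper approximation can be illustrated with a minimal Python sketch. The toy trips, the attribute names and the helper functions below are hypothetical and are not taken from the SAMBA data; they only show how indiscernibility classes lead to the two approximations.

```python
from collections import defaultdict

def indiscernibility_classes(objects, condition_attrs):
    """Group object ids that share the same values on all condition attributes."""
    classes = defaultdict(set)
    for obj_id, row in objects.items():
        key = tuple(row[a] for a in condition_attrs)
        classes[key].add(obj_id)
    return list(classes.values())

def approximations(objects, condition_attrs, target):
    """Return (lower, upper) approximation of a target set of object ids."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(objects, condition_attrs):
        if cls <= target:      # class lies entirely inside the target set
            lower |= cls
        if cls & target:       # class overlaps the target set
            upper |= cls
    return lower, upper

# Hypothetical toy trips: objects 1 and 2 are indiscernible by the condition
# attributes but have different decisions (an inconsistent pair).
trips = {
    1: {"gender": "female", "purpose": "shopping", "mode": "car"},
    2: {"gender": "female", "purpose": "shopping", "mode": "bike"},
    3: {"gender": "male", "purpose": "work", "mode": "car"},
    4: {"gender": "male", "purpose": "shopping", "mode": "bike"},
}
car_trips = {i for i, row in trips.items() if row["mode"] == "car"}
lower, upper = approximations(trips, ["gender", "purpose"], car_trips)
print(lower, upper)
```

In this toy table only object 3 certainly belongs to the car trips (lower approximation), while objects 1, 2 and 3 possibly belong to them (upper approximation).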
8
3.Rough sets-some basic concepts Ⅱ Reduct and core –In large data sets some attributes may be redundant, and thus can be eliminated without losing essential classificatory information. –A reduct is a minimal subset of attributes that still provides the same object classification as the full set of attributes. –The intersection of all reducts is called the core. –The core is the set of all indispensable attributes. Decision rule –As a DM technique, one of the most important reasons for applying rough sets is the generation of decision rules.
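Reducts and the core can be sketched with a brute-force search over attribute subsets. This is an illustration only: the toy table and the function names are hypothetical, and an exhaustive search like this is feasible only for a handful of attributes.

```python
from itertools import combinations

def classes(rows, attrs):
    """Partition row indices by their values on the given attributes."""
    groups = {}
    for i, row in enumerate(rows):
        groups.setdefault(tuple(row[a] for a in attrs), set()).add(i)
    return {frozenset(g) for g in groups.values()}

def reducts(rows, attrs):
    """Find all minimal attribute subsets that keep the full partition."""
    full = classes(rows, attrs)
    found = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            # Skip supersets of a reduct found earlier: they are not minimal.
            if any(set(r) <= set(subset) for r in found):
                continue
            if classes(rows, subset) == full:
                found.append(subset)
    return found

def core(all_reducts):
    """The core is the intersection of all reducts."""
    return set.intersection(*(set(r) for r in all_reducts))

# Hypothetical toy table in which 'licence' and 'car' carry the same information.
rows = [
    {"gender": "female", "licence": "yes", "car": "yes"},
    {"gender": "female", "licence": "no", "car": "no"},
    {"gender": "male", "licence": "yes", "car": "yes"},
    {"gender": "male", "licence": "no", "car": "no"},
]
attrs = ["gender", "licence", "car"]
all_reducts = reducts(rows, attrs)
print(all_reducts, core(all_reducts))
```

Here either `licence` or `car` can be dropped, so there are two reducts, and `gender` is the only attribute that appears in both.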
9
3.Rough sets-modelling process Ⅰ Usually, the rough sets modelling process can be divided into five main stages. –Data selection –Pre-processing and transformation The selected data set can be split into a training set and a test set, so that the decision rules in the output can be assessed in the final step. –Creation of reducts –Rule generation, e.g. If gender (female) and age (35-45) and purpose (shopping) then mode (car) or mode (bike)
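The rule generation stage can be sketched as follows. This is a simplified illustration, not the procedure of any particular rough sets software: every distinct combination of condition values becomes one rule, and the number of supporting objects is kept per decision. The training rows and function names are hypothetical.

```python
from collections import Counter, defaultdict

def generate_rules(rows, condition_attrs, decision_attr):
    """Build one rule per distinct combination of condition values."""
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple((a, row[a]) for a in condition_attrs)
        counts[key][row[decision_attr]] += 1
    # 'then' maps each observed decision to its number of supporting objects.
    return [{"if": dict(k), "then": dict(c)} for k, c in counts.items()]

def format_rule(rule, decision_attr):
    """Render a rule in the If ... then ... notation used on the slide."""
    lhs = " and ".join(f"{a} ({v})" for a, v in rule["if"].items())
    rhs = " or ".join(f"{decision_attr} ({d})" for d in sorted(rule["then"]))
    return f"If {lhs} then {rhs}"

# Hypothetical training rows, already restricted to the attributes of a reduct.
training = [
    {"gender": "female", "age": "35-45", "purpose": "shopping", "mode": "car"},
    {"gender": "female", "age": "35-45", "purpose": "shopping", "mode": "bike"},
    {"gender": "female", "age": "35-45", "purpose": "shopping", "mode": "car"},
    {"gender": "male", "age": "25-35", "purpose": "work", "mode": "train"},
]
rules = generate_rules(training, ["gender", "age", "purpose"], "mode")
for rule in rules:
    print(format_rule(rule, "mode"), rule["then"])
```

A rule with more than one decision on its right-hand side, as in the slide's example, arises when indiscernible objects have different decisions.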
10
3.Rough sets-modelling process Ⅱ –Evaluation The overall performance can be evaluated by testing how well the generated decision rules classify objects. In this paper we will only make use of the standard voting and the Naïve Bayes procedures.
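The evaluation step can be sketched with a simplified, support-weighted voting scheme. This is an assumption about how a voting classifier might be set up, not a reproduction of the standard voting procedure used in the study; the rules, test rows and function names are hypothetical.

```python
from collections import Counter

def matches(rule, obj):
    """A rule fires when all of its conditions hold for the object."""
    return all(obj.get(a) == v for a, v in rule["if"].items())

def classify_by_voting(rules, obj, default=None):
    """Each firing rule casts votes equal to its support per decision."""
    votes = Counter()
    for rule in rules:
        if matches(rule, obj):
            votes.update(rule["then"])
    if not votes:
        return default  # no rule fires for this object
    return votes.most_common(1)[0][0]

def accuracy(rules, test_rows, decision_attr):
    """Share of test objects whose predicted decision equals the observed one."""
    hits = sum(classify_by_voting(rules, r) == r[decision_attr] for r in test_rows)
    return hits / len(test_rows)

# Hypothetical rules with support counts per decision.
rules = [
    {"if": {"purpose": "shopping"}, "then": {"car": 5, "bike": 2}},
    {"if": {"gender": "female", "purpose": "shopping"}, "then": {"bike": 4}},
    {"if": {"purpose": "work"}, "then": {"train": 3}},
]
test_rows = [
    {"gender": "female", "purpose": "shopping", "mode": "bike"},
    {"gender": "male", "purpose": "shopping", "mode": "car"},
    {"gender": "male", "purpose": "work", "mode": "car"},
]
print(accuracy(rules, test_rows, "mode"))
```

Two of the three hypothetical test objects are classified correctly; the third is predicted as train although the observed mode is car.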
11
4.Data The applied data are part of a broader research project called Spatial Analysis and Modelling Based on Activities (SAMBA), funded by the Belgian Federal Government. The final aim is to build an origin-destination matrix, which allows deducing travel demand in the Belgian spatial context.
12
5.Application of rough sets in the SAMBA-project Ⅰ The aim is to find out how the rough sets techniques perform with SAMBA data as input. In the first case study we will try to find patterns in spatial preferences. In the second case study the aim is to retrieve patterns in transport mode choice.
13
5.Application of rough sets in the SAMBA-project Ⅱ Condition variables Decision variables
14
5.Application of rough sets in the SAMBA-project Ⅲ Case study 1 The reducts were calculated based on a genetic algorithm (GA) and on Johnson's algorithm. The GA is a heuristic for function optimization that promotes ‘survival of the fittest’; it may find more than one reduct. Johnson's algorithm has a natural bias towards finding a single prime implicant of minimal length. Based on the reduct of nine variables, over 4000 rules were generated.
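The greedy idea behind Johnson's algorithm can be sketched as follows: repeatedly select the attribute that discerns the most remaining pairs of objects with different decisions. This is a textbook-style illustration on a hypothetical toy table, not the implementation used in the case study, and the greedy choice does not guarantee a reduct of minimal length.

```python
from collections import Counter
from itertools import combinations

def discernibility_entries(rows, attrs, decision_attr):
    """For each pair with different decisions, list the attributes that differ."""
    entries = []
    for r1, r2 in combinations(rows, 2):
        if r1[decision_attr] != r2[decision_attr]:
            diff = frozenset(a for a in attrs if r1[a] != r2[a])
            if diff:  # inconsistent pairs cannot be discerned and are skipped
                entries.append(diff)
    return entries

def johnson_reduct(rows, attrs, decision_attr):
    """Greedily pick the attribute that occurs in the most remaining entries."""
    remaining = discernibility_entries(rows, attrs, decision_attr)
    chosen = []
    while remaining:
        freq = Counter(a for entry in remaining for a in entry)
        best = max(attrs, key=lambda a: freq[a])  # ties: first in attrs order
        chosen.append(best)
        remaining = [e for e in remaining if best not in e]
    return chosen

# Hypothetical toy table with destination as the decision attribute.
rows = [
    {"purpose": "shopping", "distance": "short", "gender": "f", "dest": "local"},
    {"purpose": "shopping", "distance": "long", "gender": "m", "dest": "city"},
    {"purpose": "work", "distance": "long", "gender": "f", "dest": "city"},
    {"purpose": "work", "distance": "short", "gender": "m", "dest": "local"},
]
print(johnson_reduct(rows, ["purpose", "distance", "gender"], "dest"))
```

In this toy table `distance` alone discerns every pair of objects with different destinations, so the greedy search stops after one step.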
15
5.Application of rough sets in the SAMBA-project Ⅳ This is a rather large number since we started with only 8500 objects. This means that most rules are supported by just one or two objects. e.g. x1(3) x5(4)
16
5.Application of rough sets in the SAMBA-project Ⅴ However, the number of rules is still too high for direct human interpretation. Some additional treatment will be necessary in order to understand the relation between destination choice and the condition variables.
17
5.Application of rough sets in the SAMBA-project Ⅰ Case study 2 In the second case study we will try to find rules describing the transportation mode choice.
18
5.Application of rough sets in the SAMBA-project Ⅱ People with the same background can choose very different options, and even the same person facing the same choice options will not always choose the same transportation mode. One way to surmount this problem is to reduce the value set of the attributes. But then the classification performance could be lower, because less detailed information in the condition attributes could lead to more approximate rules and indecisiveness.
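Reducing the value set of the attributes amounts to mapping detailed values onto a few coarse classes. The sketch below is hypothetical throughout: the attribute names, the 2 km threshold and the frequency classes are assumptions chosen for illustration, not the coding used in the SAMBA tables.

```python
def coarsen(rows, mappings):
    """Apply a value-reducing function to each mapped attribute."""
    out = []
    for row in rows:
        new = dict(row)
        for attr, fn in mappings.items():
            new[attr] = fn(row[attr])
        out.append(new)
    return out

def distinct_patterns(rows, attrs):
    """Count distinct combinations of condition values."""
    return len({tuple(r[a] for a in attrs) for r in rows})

# Hypothetical detailed records.
trips = [
    {"distance_km": 0.5, "car_days_per_week": 7},
    {"distance_km": 1.5, "car_days_per_week": 6},
    {"distance_km": 3.0, "car_days_per_week": 2},
    {"distance_km": 12.0, "car_days_per_week": 1},
    {"distance_km": 8.0, "car_days_per_week": 0},
]
# Assumed class boundaries, for illustration only.
mappings = {
    "distance_km": lambda d: "<2km" if d < 2 else ">=2km",
    "car_days_per_week": lambda n: "daily" if n >= 5 else ("sometimes" if n > 0 else "never"),
}
attrs = ["distance_km", "car_days_per_week"]
coarse = coarsen(trips, mappings)
print(distinct_patterns(trips, attrs), distinct_patterns(coarse, attrs))
```

Fewer distinct condition patterns mean fewer candidate rules, at the price of making more objects indiscernible from each other.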
19
5.Application of rough sets in the SAMBA-project Ⅲ daily;sometimes;never =2km
20
5.Application of rough sets in the SAMBA-project Ⅲ Although the number of variable classes was reduced, the results are quite similar: the reduct is now equal to 16 for both the uncompleted and the completed table. The number of rules is a little lower: 7669 (8077) without completion and 3535 (3740) after conducting the completion task. The classification performance is equal to the first results. So, based on less detailed information, the same classification performance was reached by the generated rules.
21
5.Application of rough sets in the SAMBA-project Ⅳ
22
5.Application of rough sets in the SAMBA-project Ⅳ Based on this new table, the reduct is still equal to six, but the number of rules is clearly reduced: 1408 (7669) rules without completion task and 1212 (3535) rules with completion task, while the classification performance is still around 0.72. However, a set of over 1000 rules is not yet interpretable.
23
5.Application of rough sets in the SAMBA-project Ⅴ With the ROSETTA software, it is possible to filter the rules based on several criteria. Table 5 displays the outcome of the filtering procedure based on minimal support. Thus, although the filtering improves human interpretation of the rule set, it biases the result by filtering away less supported, but not necessarily less important categories.
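The effect of filtering on minimal support can be sketched in plain Python. This does not use the ROSETTA software; the rules, support counts and threshold are hypothetical and only illustrate how rarely chosen categories can disappear from the filtered rule set.

```python
def rule_support(rule):
    """Total number of objects supporting a rule."""
    return sum(rule["then"].values())

def filter_by_support(rules, min_support):
    """Keep only rules supported by at least min_support objects."""
    return [r for r in rules if rule_support(r) >= min_support]

def decisions_covered(rules):
    """Decision categories that still appear in at least one rule."""
    return {d for r in rules for d in r["then"]}

# Hypothetical rule set with support counts per decision.
rules = [
    {"if": {"purpose": "work", "distance": ">=2km"}, "then": {"car": 40}},
    {"if": {"purpose": "shopping", "distance": "<2km"}, "then": {"bike": 12}},
    {"if": {"purpose": "work", "season_ticket": "yes"}, "then": {"train": 2}},
    {"if": {"purpose": "leisure", "distance": "<2km"}, "then": {"walk": 1}},
]
kept = filter_by_support(rules, min_support=5)
lost = decisions_covered(rules) - decisions_covered(kept)
print(len(kept), sorted(lost))
```

With a minimal support of 5, only the car and bike rules remain, and the train and walk categories are no longer represented at all.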
24
6.Conclusions and further challenges We found a solution for tempering the number of rules by filtering the set of rules based on rule support. –It also means a loss in predictive ability. A major challenge is finding a balance in producing clear, comprehensible rule sets, while still maintaining maximum detail for better pattern prediction.