A Modified Naïve Possibilistic Classifier for Numerical Data


1 A Modified Naïve Possibilistic Classifier for Numerical Data
Presented by: Karim Baati, Ph.D. student at REGIM-Lab, University of Sfax, Tunisia; teaching and research assistant at ESPRIT School of Engineering, Tunis, Tunisia.

2 Presentation outline
Context
Possibilistic classifier
Estimation of possibility beliefs
G-Min algorithm
Experimental results
Conclusion and perspectives

3 Context

4 Context (1) Machine learning covers the techniques that help to reach a decision automatically. These techniques fall into two major categories, namely supervised and unsupervised. The difference is that a supervised technique assigns the final decision to one of a set of predefined classes, whereas an unsupervised technique allocates the final decision to a hidden class that it discovers itself. The current work belongs to the first, supervised, category.

5 Context (2) Regardless of the type of supervised machine learning technique, the main objective is usually the same: find the right class. To do so, each technique requires data, often depicted as vectors in which every value stands for a particular attribute. For example:
C = {c1 = No disease, c2 = Heart disease, c3 = Lung disease}
A = {a1 = Gender {Man, Woman}, a2 = Temperature {Real}, a3 = Blood pressure {Real}, a4 = Chest pain {Yes, No}}
vt = {Man, 39, 17, No}. What is the final decision c*?
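The toy diagnosis instance above can be sketched in code; this is only an illustrative representation of the classes, attributes, and the vector to classify (all names come from the slide, the dictionary layout is an assumption).

```python
# Illustrative encoding of the slide's toy classification problem.
classes = ["No disease", "Heart disease", "Lung disease"]

attributes = {
    "Gender": ["Man", "Woman"],   # categorical
    "Temperature": float,         # numerical (real-valued)
    "Blood pressure": float,      # numerical (real-valued)
    "Chest pain": ["Yes", "No"],  # categorical
}

# The vector to classify: one value per attribute, in order.
vt = {"Gender": "Man", "Temperature": 39.0,
      "Blood pressure": 17.0, "Chest pain": "No"}

# A classifier must map vt to a single final decision c* in `classes`.
```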

6 Context (3) Ideally, a perfect classification process would be based on perfect data. Yet the data handled in real life are never perfect. Main types of imperfection: uncertainty, imprecision, heterogeneity, insufficiency, etc. The current work deals with poor data: data that are not sufficient to acquire the knowledge needed to make a decision.

7 Context (4) Poor data may exist for different reasons: an insufficient number of instances, many missing values, imbalanced data, etc. They can be encountered in many fields, especially in medical diagnosis when a new pathology emerges. Challenge: poor data often lead to ambiguity when making the final decision. Ambiguity refers to the fact that the final decision has a possibility estimate very close to those of the other alternatives of the classification problem.

8 Possibilistic classifier

9 Possibilistic classifier (1)
The Naïve Possibilistic Classifier (NPC) hybridizes the naïve Bayesian structure, a pattern that has proven its efficiency despite the strong assumption of attribute independence, with the possibilistic framework, a strong tool for dealing with poor data.

10 Possibilistic classifier (2)
In the estimation step, possibility distributions must be normalized: at least one event must have a possibility equal to 1, but there is no need for the values to sum to 1 as in probability theory. In the fusion step, possibilistic classification requires either the product or the minimum as the fusion rule.
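The two requirements above can be sketched as follows; the normalization (max equals 1) and the min/product fusion rules come from the slide, while the helper names and the example numbers are illustrative only.

```python
# Sketch of the two NPC requirements: a normalized possibility
# distribution (largest value is 1, no sum-to-1 constraint) and
# minimum- or product-based fusion across attributes.

def normalize(pi):
    """Scale a possibility distribution so its largest value is 1."""
    m = max(pi.values())
    return {c: v / m for c, v in pi.items()}

def fuse_min(dists):
    """Minimum-based fusion of per-attribute possibility distributions."""
    return {c: min(d[c] for d in dists) for c in dists[0]}

def fuse_product(dists):
    """Product-based fusion of per-attribute possibility distributions."""
    out = {}
    for c in dists[0]:
        p = 1.0
        for d in dists:
            p *= d[c]
        out[c] = p
    return out

pi1 = normalize({"c1": 0.4, "c2": 0.8, "c3": 0.2})  # c2 becomes 1.0
pi2 = normalize({"c1": 1.0, "c2": 0.5, "c3": 0.9})
print(fuse_min([pi1, pi2]))      # per-class minimum
print(fuse_product([pi1, pi2]))  # per-class product
```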

11 Estimation of possibility beliefs

12 Estimation of possibility beliefs (1)
The proposed method is based on the probability-to-possibility transformation of Dubois et al. in the continuous case. First advantage: the method is based on maximum specificity, producing the upper bound of the probability of a given event. Second advantage: possibility theory is based on fuzzy set theory, so the method allows convergence from probability-based estimation and its "strict" Bayes rule to a fuzzy-set-based estimation, which is more suitable for handling ambiguity.

13 Estimation of possibility beliefs (2)
To estimate possibilistic beliefs from numerical data, attribute values are first normalized. Afterward, we make use of the probability-to-possibility transformation of Dubois et al. in the continuous case, where G is a Gaussian cumulative distribution function that may be assessed using the table of the standard normal distribution.
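A minimal sketch of this estimation step, assuming min-max normalization of attribute values and the standard symmetric-Gaussian form of the Dubois et al. transform, pi(x) = 1 - |2G(z) - 1| with z the standardized value; the exact formulas shown on the slide are not reproduced here, so treat this as one common reading of the method.

```python
# Assumed form of the maximally specific probability-to-possibility
# transform for a Gaussian: pi(x) = 1 - |2*G(z) - 1|, z = (x - mean)/std.
from statistics import NormalDist

def minmax_normalize(values):
    """Rescale attribute values into [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def possibility(x, mean, std):
    """Possibility of x under a Gaussian N(mean, std)."""
    g = NormalDist().cdf((x - mean) / std)  # standard normal CDF G
    return 1.0 - abs(2.0 * g - 1.0)

# The possibility peaks at 1 at the mean (the most typical value)
# and decays toward 0 for values far from the mean.
print(possibility(0.5, mean=0.5, std=0.1))            # -> 1.0
print(round(possibility(0.9, mean=0.5, std=0.1), 4))
```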

14 G-Min algorithm

15 G-Min algorithm (1) The Generalized Minimum-based algorithm (G-Min) aims to avoid ambiguity when making the final decision from possibilistic beliefs. It is based on two steps: the first builds a set of possible decisions, and the second filters this set in order to find a final class with a high reliability score. The principle behind the proposed algorithm is to simulate a wise human behavior that delays the final decision in case of ambiguity until a reliable decision can be made.
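The two-step idea above can be sketched in code. The exact rules are given on the next slide's figure, so the following is only one plausible reading under stated assumptions: step 1 keeps every class whose min-based possibility is within a tolerance `eps` of the best (the set of possible decisions); step 2 resolves ambiguity by repeatedly discarding each candidate's weakest per-attribute degree and re-fusing, delaying the decision until one class dominates. The function name, `eps` threshold, and tie-breaking are all assumptions, not the authors' specification.

```python
# Hypothetical sketch of a generalized-minimum decision rule.
def g_min(per_attr, eps=0.05):
    """per_attr: dict mapping class -> list of per-attribute
    possibility degrees. Returns the selected class label."""
    terms = {c: sorted(v) for c, v in per_attr.items()}
    candidates = set(terms)
    while True:
        # Min-based fusion over the remaining terms of each candidate.
        scores = {c: min(terms[c]) for c in candidates}
        best = max(scores.values())
        # Step 1: keep classes whose score is close to the best.
        candidates = {c for c in candidates if best - scores[c] <= eps}
        if len(candidates) == 1 or len(next(iter(terms.values()))) == 1:
            # Unambiguous, or no more terms to drop: decide now.
            return max(candidates, key=lambda c: scores[c])
        # Step 2: ambiguity -> delay, drop each candidate's weakest term.
        terms = {c: terms[c][1:] for c in candidates}

# Two close classes (c1, c2) are disambiguated by their stronger terms.
beliefs = {"c1": [0.9, 0.3, 0.8],
           "c2": [0.7, 0.32, 0.6],
           "c3": [0.1, 0.9, 0.9]}
print(g_min(beliefs))  # -> c1
```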

16 G-Min algorithm (2)

17 Experimental results

18 Experimental results Experiments are conducted on 15 datasets from the UCI Machine Learning Repository.

19 Experimental results The proposed classifier is the best in terms of average rank.

20 Conclusion and perspectives

21 Conclusion and perspectives
The new version hybridizes the capability of the former NPC to estimate possibilistic beliefs from numerical data with the efficiency of the G-Min, a novel algorithm for the fusion of possibilistic beliefs. Experimental results have shown that the proposed classifier largely outperforms the former NPC in terms of accuracy. The good behavior of the proposed G-Min-based NPC is promising in the sense that this classifier could be combined with one of the previously proposed possibilistic classifiers for categorical data, in order to treat uncertainty stemming from mixed numerical and categorical data.

22 Thank You for Your Attention! Please send your questions to:

