Abdur Rahman Department of Statistics

Presentation transcript:

Predicting Depression Occurrence Using Classification Algorithm in Data Mining. Abdur Rahman, Department of Statistics, Shahjalal University of Science and Technology, Sylhet, Bangladesh. E-mail: airdipu@gmail.com

Introduction: A universal definition of old age is elusive. Only 6.13 percent of the population of Bangladesh is elderly (60+). Elderly people may become senile and lose physical and mental ability. Aging is one of the emerging problems in Bangladesh. Self-assessments of health are common components of population-based surveys. The elderly are found to suffer from conditions such as depression, sleeping problems, gastric problems, diabetes and mental problems.

Methodology: Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA), Logistic Regression Analysis, K-Nearest Neighbor (KNN).

Figure: Architecture of Classification Algorithm

Sampling Method: Cluster sampling. Clusters: urban area, rural area, tea garden area and ethnic area. Data were collected from the whole population of each cluster.

Data: Primary data, collected from March to September 2015. The sample comprises 229 elderly people aged 60 and above. Data were gathered through face-to-face personal interviews using questionnaires.

Linear Discriminant Analysis: LDA undertakes the same task as logistic regression: it classifies cases into the groups of a categorical variable. Examples: making a profit or not; buying a product or not; satisfied customer or not; political party voting intention.

Linear Discriminant Analysis: LDA involves determining a linear equation (just like linear regression) that predicts which group a case belongs to. In that equation, D is the discriminant function, v is the discriminant coefficient or weight for a variable, X is the variable, and a is a constant.
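As a sketch only, the symbols defined above combine in the usual form of a linear discriminant function. The subscripts 1, ..., p that index the predictor variables are an assumed notation, not taken from the slides:

```latex
D = v_1 X_1 + v_2 X_2 + \dots + v_p X_p + a
```

A case is then assigned to a group according to the value of D.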

Quadratic Discriminant Analysis: Quadratic discriminant analysis calculates a quadratic score function. This is a function of the population mean vector and the variance-covariance matrix for the ith group.
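One common textbook form of such a quadratic score function is sketched below. All symbols are assumptions introduced for illustration: \mu_i and \Sigma_i are the mean vector and variance-covariance matrix of the ith group, and p_i is the prior probability of that group. The slides may have used a different but equivalent notation.

```latex
d_i^{Q}(x) = -\tfrac{1}{2}\ln\lvert\Sigma_i\rvert
             - \tfrac{1}{2}\,(x - \mu_i)^{\top}\,\Sigma_i^{-1}\,(x - \mu_i)
             + \ln p_i
```

Under this form, an observation x is assigned to the group with the largest score. Because each group keeps its own \Sigma_i, the resulting decision boundary is quadratic in x rather than linear.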

Logistic Regression: In logistic regression, the dependent variable is binary or dichotomous, i.e. it only contains data coded as 1 (TRUE, success, pregnant, etc.) or 0 (FALSE, failure, non-pregnant, etc.). The logit transformation is defined as the logged odds.
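The logged odds can be written as logit(p) = ln(p / (1 - p)), where p is the probability that the outcome is coded 1. The following is a minimal Python sketch of that transformation and its inverse; the function names are hypothetical and chosen only for illustration.

```python
import math


def logit(p: float) -> float:
    """Return the logged odds ln(p / (1 - p)) for a probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map a logit value back to a probability (the logistic function)."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# A probability of 0.8 corresponds to odds of 4 to 1.
log_odds = logit(0.8)
recovered = inverse_logit(log_odds)
```

Logistic regression models the logit as a linear function of the predictors, so `inverse_logit` is what turns a fitted linear score back into a predicted probability.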

KNN: KNN is completely non-parametric: no assumptions are made about the shape of the decision boundary. We can expect KNN to dominate both LDA and logistic regression when the decision boundary is highly non-linear. The most intuitive nearest-neighbour classifier is the one-nearest-neighbour classifier, which assigns a point x to the class of its closest neighbour in the feature space.
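The nearest-neighbour rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in the study: it assumes Euclidean distance and majority voting, and the function name and toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, x, k=1):
    """Classify x by majority vote among its k nearest training points."""
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels must have equal length")
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Order the training indices by Euclidean distance to x (assumed metric).
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], x))
    # Count the labels of the k closest points; with k=1 this is the
    # one-nearest-neighbour rule.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated groups in a two-dimensional feature space.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["A", "A", "A", "B", "B", "B"]

one_nn = knn_predict(points, labels, (0.5, 0.5), k=1)
three_nn = knn_predict(points, labels, (5.5, 5.5), k=3)
```

Larger values of k smooth the decision boundary, which is why the error rate is usually examined across several values of k before one is chosen.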

Figure: Error Rate for Different Value of K

Results & Discussions: Which method performs best depends on the true decision boundary. Linear: LDA and logistic regression perform best. Moderately non-linear: QDA performs best. More complicated: KNN is superior.

Results & Discussions

Method                Correctly Classified (%)   Misclassified (%)
QDA                   93.67                      6.33
LDA                   94.94                      5.06
Logistic Regression
KNN                   98.73                      1.27
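The two percentages reported for each method are complementary: they sum to 100. A minimal sketch of how such figures are typically computed from true and predicted class labels is shown below; the function name and the example labels are hypothetical and are not the study's data.

```python
def classification_rates(y_true, y_pred):
    """Return (correctly classified %, misclassified %) rounded to 2 decimals."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    # Count the cases where the predicted class matches the true class.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    correct_pct = 100.0 * correct / len(y_true)
    return round(correct_pct, 2), round(100.0 - correct_pct, 2)


# Hypothetical labels: 1 = depressed, 0 = not depressed.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
rates = classification_rates(y_true, y_pred)
```

In this toy example four of the five predictions match, so the function reports 80 percent correctly classified and 20 percent misclassified.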

Figure: Graphical Representation of Accuracy

Conclusions: LDA and logistic regression show the same accuracy. QDA has the lowest accuracy. KNN performs better than LDA, QDA and logistic regression.

THANK YOU