MBTI Personality Predictor using ML

Presentation transcript:

MBTI Personality Predictor using ML. Salman Ahmed, Andy Sin, Ashrarul Haq Sifat. Instructor: Dr. Bert Huang. Virginia Tech, 12/13/2017.

What is MBTI? Myers-Briggs Type Indicator.

Motivation: Freeform writing allows a great degree of personal expression. Neuroscientific background. Applications in research, business, fun, and many more.

Objectives: Identify a correlation between writing styles and psychological personality types. Evaluate the accuracy of the MBTI predictor. Convert the textual representation of freeform writing into a feature representation. Explore state-of-the-art techniques for this prediction task. Assess the necessity of fancy machine learning models (RNNs, ConvNets/CNNs, etc.) in this area.

Prior Work: Big Five Personality Inventory. MBTI: web crawlers to collect data, SVM model, estimation accuracy 80%. MBTI: close relation of brain neurons to written communication, short-term memory based recurrent neural network, 37% accuracy.

Dataset: MBTI dataset from Kaggle. Not balanced: possibility of bias. 8675 examples, 1500 words in each.
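One way to make the imbalance concern concrete is to compare any model's accuracy against a majority-class baseline. The sketch below is illustrative only: the label list is hypothetical toy data, not the real Kaggle distribution, and `majority_baseline` is a helper name introduced here.

```python
from collections import Counter


def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label, count / len(labels)


# Hypothetical toy labels (assumption: NOT the real dataset distribution).
toy_labels = ["INFP"] * 5 + ["INFJ"] * 3 + ["ENTP"] * 2

baseline_label, baseline_accuracy = majority_baseline(toy_labels)
print(baseline_label, baseline_accuracy)
```

On an unbalanced dataset this baseline can be well above uniform chance, so reported accuracies are easier to interpret next to it.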

Data Cleaning

Traditional Model: Naïve Bayes (count method): tried this method to see whether learning works for a basic model. Multi-Layer Perceptron (vector representation): Gensim Word2Vec embeddings; turn each example into a 32-dimensional vector, giving a matrix of dimension 8675 x 32.
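The slides do not say how the word embeddings were combined into one vector per example. A common choice, assumed here, is to average the vectors of the words in the example. The sketch uses a hypothetical 2-dimensional lookup table in place of trained Gensim Word2Vec embeddings (the slides use 32 dimensions), and `document_vector` is an illustrative helper name.

```python
def document_vector(tokens, embeddings, dim):
    """Average the embedding vectors of the tokens found in `embeddings`."""
    total = [0.0] * dim
    known = 0
    for token in tokens:
        vector = embeddings.get(token)
        if vector is None:
            continue  # skip out-of-vocabulary words
        known += 1
        for i, value in enumerate(vector):
            total[i] += value
    if known == 0:
        return total  # all-zero vector when no token is known
    return [value / known for value in total]


# Hypothetical toy embeddings (assumption: stand-in for trained Word2Vec vectors).
toy_embeddings = {"i": [1.0, 0.0], "think": [0.0, 1.0], "feel": [1.0, 1.0]}

examples = ["i think i feel", "i feel", "unknown words only"]
matrix = [document_vector(text.split(), toy_embeddings, dim=2) for text in examples]
print(matrix)  # one row per example, one column per embedding dimension
```

Stacking one such row per example yields the examples-by-dimensions matrix that a Multi-Layer Perceptron can take as input.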

Improved Model: Principal Component Analysis. CountVectorizer with maximum number of features: 5000. Normalized TF or TF-IDF representation.
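To illustrate what a capped-vocabulary TF-IDF representation looks like, here is a minimal pure-Python sketch. It is not the authors' code: the slides name CountVectorizer but not its settings, so the smoothed IDF formula and L2 normalization used below are assumptions (they mirror the defaults documented for scikit-learn's TfidfTransformer). The helper names, toy texts, and the cap of 4 features are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(docs, max_features):
    """Keep the `max_features` most frequent terms across all documents."""
    totals = Counter(token for doc in docs for token in doc)
    # Sort by descending count, then alphabetically for a deterministic order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:max_features]]


def tfidf_matrix(docs, vocabulary):
    """Return one L2-normalized TF-IDF row per document."""
    n_docs = len(docs)
    vocab_set = set(vocabulary)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc) & vocab_set)
    # Smoothed IDF (assumption): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocabulary}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        row = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row] if norm > 0 else row)
    return rows


# Hypothetical toy texts; the slides cap the vocabulary at 5000 features.
texts = ["i love quiet evenings", "i love loud parties", "parties drain me"]
docs = [text.split() for text in texts]
vocab = build_vocabulary(docs, max_features=4)
X = tfidf_matrix(docs, vocab)
print(vocab)
```

Capping the vocabulary keeps the feature matrix at a fixed, manageable width regardless of how many distinct words appear in the posts.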


Multinomial Naïve Bayes with TF-IDF and Count Vectorizer; Logistic Regression with TF-IDF and Count Vectorizer; Multi-Layer Perceptron with TF-IDF and Count Vectorizer.
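As a rough illustration of the first of these models, the sketch below implements a multinomial Naïve Bayes classifier on raw word counts with Laplace smoothing. It is a teaching sketch under stated assumptions, not the presenters' implementation: the two-class labels ("I" and "E"), the toy texts, and the function names are all hypothetical, and `alpha=1.0` is an assumed smoothing value.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    vocab = sorted({token for doc in docs for token in doc})
    class_sizes = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)
    model = {}
    for label, size in class_sizes.items():
        total = sum(term_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(size / len(labels))
        log_likelihood = {
            t: math.log((term_counts[label][t] + alpha) / total) for t in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, doc):
    """Return the class with the highest posterior log score."""
    def score(label):
        log_prior, log_likelihood = model[label]
        # Words never seen in training are ignored.
        return log_prior + sum(log_likelihood[t] for t in doc if t in log_likelihood)
    return max(model, key=score)


# Hypothetical toy data with two made-up classes.
train_texts = ["quiet book home", "book quiet alone", "party friends loud", "loud party crowd"]
train_labels = ["I", "I", "E", "E"]
nb_model = train_multinomial_nb([t.split() for t in train_texts], train_labels)

print(predict(nb_model, "quiet book".split()))
print(predict(nb_model, "loud party".split()))
```

The same training and prediction interface extends to more classes by supplying more label values; the TF-IDF variant would replace the raw counts with weighted ones.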

Results. Naïve Bayes (basic counting): 19% accuracy

MLP (Word2Vec): 22% accuracy

Multinomial Naïve Bayes: 53% accuracy

Logistic Regression: 64% accuracy

MLP (Count Vectorizer and TF-IDF): 48% accuracy

Comparison of Models
Model name | Accuracy
Naïve Bayes (basic counting) | 19%
Multilayer Perceptron (Word to Vector) | 22%
Multinomial Naïve Bayes (Count Vectorizer and TF-IDF Similarity) | 53%
Logistic Regression (Count Vectorizer and TF-IDF Similarity) | 64%
Multilayer Perceptron (Count Vectorizer and TF-IDF Similarity) | 48%

Summary

Conclusion