Author Identification for LiveJournal
Alyssa Liang

The problem
LiveJournal: a blogging website. Given a document (an entry), identify the author.
Hierarchical classification:
- first classify by gender
- then classify the author based on gender
[Slide diagram: Document branches into Male and Female; Male branches into Male 1 and Male 2; Female branches into Female 1, Female 2, and Female 3]
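The two-stage scheme on this slide can be sketched as below. This is a minimal illustration, not the author's code: `MajorityClassifier` is a toy stand-in for the trained models, and any classifier exposing a `predict()` method would slot in the same way.

```python
class MajorityClassifier:
    """Toy stand-in for a trained model: always predicts its majority label."""
    def __init__(self, labels):
        self.label = max(set(labels), key=labels.count)

    def predict(self, doc):
        return self.label


def hierarchical_predict(doc, gender_clf, author_clf_by_gender):
    """First classify the document's author gender, then hand the document
    to an author classifier trained only on that gender's entries."""
    gender = gender_clf.predict(doc)
    return author_clf_by_gender[gender].predict(doc)


# Usage with toy training labels (names are hypothetical).
gender_clf = MajorityClassifier(["male", "female", "female"])
author_clfs = {
    "male": MajorityClassifier(["male_1"]),
    "female": MajorityClassifier(["female_2", "female_2", "female_3"]),
}
print(hierarchical_predict("some entry text", gender_clf, author_clfs))
# -> female_2
```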

Features
- Unigrams and bigrams
- Average sentence and word length
- Number of words and distinct words
- Number of sentences in a paragraph
- Number of UPPERCASE characters
- Number of words not in the dictionary
- Number of words with length <= 4
- Number of characters in italics, bold, and struck out
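Several of these surface features can be computed with simple counting. The sketch below covers a subset; the markup-based counts (italics, bold, strikethrough) are omitted because they depend on LiveJournal's HTML, which the slides do not show, and the tokenization is a simplifying assumption.

```python
import re

def style_features(text):
    """Compute a few of the stylometric counts listed on the slide."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "num_words": len(words),
        "num_distinct_words": len(set(w.lower() for w in words)),
        "num_sentences": len(sentences),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "num_uppercase_chars": sum(c.isupper() for c in text),
        "num_short_words": sum(len(w) <= 4 for w in words),
    }

print(style_features("I wrote this. It was FUN!"))
```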

The 3 Classifiers
Naïve Bayes (generative model)
- Assumes the features in a document are independent of each other
- Implemented as a multi-variate Bernoulli model: represents only whether a feature appears in a document, not how many times it appears
Decision Trees
- An internal node is a test of a feature, and each branch from the node represents a value the feature can take
- A leaf node represents a classification
- Build a smallish tree from the training data using minimum average entropy
Maximum Entropy (conditional model)
- "Model all that is known and assume nothing about what is unknown"
- Tries to find the most uniform model that satisfies the constraints, i.e. maximizes the entropy
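The multi-variate Bernoulli Naïve Bayes variant named above can be sketched as follows. This is an illustrative implementation, not the author's: documents are reduced to sets of features, so only presence or absence matters, and absent features also contribute to the score. Add-one smoothing is assumed.

```python
import math

def train_bernoulli_nb(docs, labels):
    """Multi-variate Bernoulli NB: estimate P(class) and, for each class,
    P(feature present | class) with add-one smoothing."""
    vocab = set(w for d in docs for w in d)
    classes = set(labels)
    prior = {c: labels.count(c) / len(labels) for c in classes}
    cond = {c: {} for c in classes}
    for c in classes:
        class_docs = [set(d) for d, l in zip(docs, labels) if l == c]
        for w in vocab:
            present = sum(w in d for d in class_docs)
            cond[c][w] = (present + 1) / (len(class_docs) + 2)
    return vocab, prior, cond

def predict_bernoulli_nb(doc, vocab, prior, cond):
    """Score each class; absent features contribute log(1 - p)."""
    doc = set(doc)
    best, best_lp = None, float("-inf")
    for c in prior:
        lp = math.log(prior[c])
        for w in vocab:
            p = cond[c][w]
            lp += math.log(p if w in doc else 1 - p)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Toy data (features and labels are hypothetical).
docs = [["lol", "omg"], ["lol", "cute"], ["game", "score"], ["game", "win"]]
labels = ["female", "female", "male", "male"]
model = train_bernoulli_nb(docs, labels)
print(predict_bernoulli_nb(["game", "omg", "score"], *model))
# -> male
```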

Results
- Hierarchical classification showed no benefit; gender classification needs to improve first, perhaps with different features
- Hierarchical feature reduction (on gender classification): took the 512 most important features and reran maxent training, then took the 256 most important features, and so on. Proved to be very stable.
- The best features were mostly bigrams (many of which contained punctuation), plus features with a large male/female difference (number of distinct words, UPPERCASE letters, short words)
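The feature-reduction loop described above (512 features, then 256, and so on) amounts to repeated halving of a ranked feature list. A minimal sketch, assuming `importance` stands in for whatever score the maxent model assigns to each feature:

```python
def iterative_halving(features, importance, min_size=1):
    """Rank features by importance, then keep the top k, halving k each
    round; each subset would be used to retrain the model."""
    ranked = sorted(features, key=importance, reverse=True)
    k = len(ranked)
    rounds = []
    while k >= min_size:
        rounds.append(ranked[:k])  # retrain on this subset
        k //= 2
    return rounds

# Usage with 8 hypothetical features; lower index = more important here.
feats = ["f%d" % i for i in range(8)]
rounds = iterative_halving(feats, importance=lambda f: -int(f[1:]))
print([len(r) for r in rounds])
# -> [8, 4, 2, 1]
```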