Published by Betty Allen. Modified over 9 years ago.
1
Author Age Prediction from Text using Linear Regression
Dong Nguyen, Noah A. Smith, Carolyn P. Rose
2
Introduction
– Frame author age prediction from text as a regression problem.
– Use a multi-corpus approach: blogs, telephone conversations, and online forum posts.
– Investigate age prediction with age modeled as a continuous variable.
4
Data description
Fisher telephone corpus
Blog corpus
Breast cancer forum
– Information such as gender and age was indicated.
– Every document consists of all posts from a particular user.
5
Data description
6
Experiment Linear regression
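The formula from this slide did not survive in the transcript. As a rough illustration of what treating age as a continuous target looks like, here is a minimal ordinary least-squares sketch in plain Python. The function names and the toy data are hypothetical, and the fit is unregularized; the source does not say which estimator or regularization the authors used.

```python
def fit_linear_regression(rows, ages):
    """Least-squares fit of age ~ b0 + sum_j w[j] * x[j] via the normal equations."""
    # Prepend 1.0 to every feature vector so the intercept is learned as weight 0.
    A = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(A[0])
    # Build the augmented system [A^T A | A^T y].
    M = [[sum(a[i] * a[j] for a in A) for j in range(n)]
         + [sum(a[i] * y for a, y in zip(A, ages))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    coef = [M[i][n] / M[i][i] for i in range(n)]
    return coef[0], coef[1:]


def predict_age(row, intercept, weights):
    """Predicted age for one feature vector."""
    return intercept + sum(w * float(x) for w, x in zip(weights, row))


# Made-up toy data: ages generated as 20 + 3*x1 + 0.5*x2.
X_toy = [[1.0, 0.0], [2.0, 4.0], [3.0, 2.0], [5.0, 6.0]]
y_toy = [20.0 + 3.0 * x1 + 0.5 * x2 for x1, x2 in X_toy]
b0, w = fit_linear_regression(X_toy, y_toy)
```

With real text features, which are high-dimensional and sparse, a regularized solver from a numerical library would be the more practical choice; the sketch only shows the shape of the model.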
7
Experiment JOINT Model:
8
Experiment
Overview of the different models
– INDIV: models trained on each of the three corpora individually.
– JOINT: model trained on all three corpora with features represented.
– JOINT-Global: uses the learned JOINT model but keeps only the global features.
– JOINT-Global-Retrained: uses the global features discovered by the JOINT model, retrained on each specific dataset.
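The transcript does not spell out how the JOINT model represents its features. The mention of "global features" suggests each feature exists in a shared version alongside corpus-specific ones, so the sketch below assumes exactly that: every feature is copied once under a global prefix and once under the prefix of its corpus. The function names and the prefix scheme are hypothetical.

```python
def augment(features, corpus):
    """Return a global copy plus a corpus-specific copy of every feature.

    Assumption: this is one plausible way to get both "global" and
    per-corpus features in a single JOINT model; the slides do not confirm it.
    """
    out = {}
    for name, value in features.items():
        out["global:" + name] = value
        out[corpus + ":" + name] = value
    return out


def keep_global(weights):
    """Drop corpus-specific weights, as a JOINT-Global style model would."""
    return {k: v for k, v in weights.items() if k.startswith("global:")}


# A blog document with two unigram counts.
blog_doc = augment({"lol": 2, "mortgage": 0}, "blog")

# Hypothetical learned weights, then reduced to the global ones only.
learned = {"global:lol": -1.5, "blog:lol": -0.4, "fisher:lol": 0.1}
global_only = keep_global(learned)
```

Under this assumption, JOINT-Global-Retrained would reuse the names in `global_only` as the feature set and refit the weights on each corpus separately.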
9
Experiment
Features
– Gender: binary feature (Male=1, Female=0)
– Textual features:
  Unigrams
  POS unigrams and bigrams
  LIWC (Linguistic Inquiry and Word Count), a word-counting program that captures word classes such as inclusion words (LIWC-incl: "with," "and," "include," etc.), causation words (LIWC-cause: "because," "hence," etc.), and stylistic characteristics such as the percentage of words longer than six letters (LIWC-Sixltr).
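A few of the features above can be sketched with the standard library alone. The example below computes the binary gender feature, unigram counts, and a stand-in for the Sixltr measure (percentage of words longer than six letters). The tokenizer, the feature names, and the function itself are hypothetical; LIWC is a separate program with its own dictionaries, and POS features are left out because they need a tagger.

```python
import re
from collections import Counter


def extract_features(text, gender):
    """Map one document to a feature dictionary (illustrative only)."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())

    # Unigram counts, prefixed so they cannot collide with other features.
    features = dict(Counter("uni:" + t for t in tokens))

    # Binary gender feature as on the slide: Male=1, Female=0.
    features["gender"] = 1 if gender == "male" else 0

    # Approximation of LIWC-Sixltr: percentage of words longer than six letters.
    long_words = sum(1 for t in tokens if len(t) > 6)
    features["sixltr"] = 100.0 * long_words / len(tokens) if tokens else 0.0
    return features


example = extract_features("Because the weather was nice", "female")
```

Here "because" and "weather" are the only words longer than six letters, so two of five tokens count toward `sixltr`.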
10
Results and discussion
13
Reference
Dong Nguyen, Noah A. Smith, Carolyn P. Rose. Author Age Prediction from Text using Linear Regression.