Presentation is loading. Please wait.

Presentation is loading. Please wait.

Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The Languages of Emotion and Financial News Ann Devitt Khurshid.

Similar presentations


Presentation on theme: "Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The Languages of Emotion and Financial News Ann Devitt Khurshid."— Presentation transcript:

1 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The Languages of Emotion and Financial News Ann Devitt Khurshid Ahmad

2 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Sentiment and the Markets

3 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Sentiment and the Markets

4 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones. Specialised Language of Financial News Bloomberg.com, 18/2/09

5 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones. Specialised Language of Financial News Bloomberg.com, 18/2/09

6 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones. Specialised Language of Financial News Bloomberg.com, 18/2/09

7 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Global chipmakers, battling slower technology demand, are betting size matters as they pin their hopes for future growth on small and easy to carry mobile devices such as netbooks and smartphones. Specialised Language of Financial News Bloomberg.com, 18/2/09

8 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Sentiment and the Markets

9 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Sentiment and the Markets

10 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Engle Ng (1993) Asymmetry Curve

11 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Outline  Current psychological theory of emotion  Evaluation of lexical “emotion” resources  Corpus analysis of language of “emotion”

12 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Outline  Current psychological theory of emotion  Evaluation of lexical “emotion” resources  Corpus analysis of language of “emotion”

13 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Cognitive Theory of Emotion: Categorical Ekman (1975)

14 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Cognitive Theory of Emotion: Dimensions  Osgood / Russell  Evaluation  Activity  Potency  Mehabrian PAD  Pleasure  Activation  Dominance

15 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Cognitive Theory of Emotion Watson and Tellegen (1985)

16 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Outline  Current psychological theory of emotion  Evaluation of lexical “emotion” resources  Corpus analysis of language of “emotion”

17 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation SentiWordNet Whissel WNA General Inquirer

18 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Whissel Lexical Resource Evaluation Senti WordNet SentiWordNet WNA General Inquirer

19 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Senti WordNet WordPositiveValNegativeVal Happy 0.90.0 Sad0.00.9  39066 terms  Evaluation dimension scale: 0 - 1  Low average: Pos=0.18, Neg=0.23  More extreme Neg values  Error-prone: rude (pos 0.875), gladsome (neg 0.875)

20 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation General Inquirer SentiWordNet Whissel WNA General Inquirer

21 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation General Inquirer ECSTATIC Pos Pleasure SORROWFULNeg Pain  Hand-coded, content analysis basis  8641 terms  184 binary categories (including MAB dimensions)  Negative > Positive  Active > Passive  Strong > Weak

22 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Whissel Dictionary of Affect SentiWordNet Whissel WNA General Inquirer

23 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Whissel Dictionary of Affect WordEvalActivImag great 2.6250 2.1250 1.0 disastrous 1.4444 2.4000 2.0  Corpus selection, hand-coded  8742 terms  Dimensional representation: 1-3 scale  Evaluation, Activation, Imagery

24 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation WordNet Affect SentiWordNet Whissel WNA General Inquirer

25 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation WordNet Affect WordBinaryFeatures Lonelinesscognitive state, emotion Happinesscognitive state, emotion  5432 terms  Domains of emotional experience  No Polarity  Short-term: Mood, Manner  Long-term: Attribute, Trait

26 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Lexical Overlap  Are the lexica consistent?  Are they mutually exclusive?  Dice, Jaccard, Asymmetric coefficients

27 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Whissel Lexical Resource Evaluation Lexical Overlap SentiWordNet WNA General Inquirer

28 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin SentiWordNet Lexical Resource Evaluation Lexical Overlap General Inquirer Whissel WNA 1.Statistically significant agreement for Polarity Assignment (Chi square test) 2.Very weak correlation for activation features.

29 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin SentiWordNet Lexical Resource Evaluation Lexical Overlap General Inquirer Whissel WNA 1.Weak correlation of SWN with Whissel evaluation 2. No correlation with Whissel activation dimension 3. SWN positive negatively correlated with imageability

30 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin SentiWordNet Lexical Resource Evaluation Lexical Overlap WNA General Inquirer Whissel 1.SWN tends to negative for short term WNA features 2.SWN tends to positive for long- term WNA features

31 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Whissel Lexical Resource Evaluation Lexical Overlap SentiWordNet WNA General Inquirer

32 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Lexical Overlap  WNA feature division: Short-termLong-term NegativePositive PhysicalCognitive More activeLess active InternalExternal Less abstractMore concrete

33 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Lexical Resource Evaluation Some conclusions  The lexica:  Are quite consistent  Can be used in combination  SentiWN: Largely unexplored territory

34 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Outline  Current psychological theory of emotion  Evaluation of lexical “emotion” resources  Corpus analysis of language of “emotion” General Language

35 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Emotion in General Language Corpus Study Aims  Does “emotion” constitute a distinct sub- language?  Is there a polarity bias in General Language? (the Polyanna Hypothesis of Boucher and Osgood)  What is the impact of using different lexica?

36 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis The Data BNC  100 million words  Balanced, broad corpus

37 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Methodology Is emotion a distinct sub-language?  Examine distribution type  Examine distribution spread  Bootstrap sampling distribution

38 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Distribution Type  Zipfian: BNC=Emotion Lexica

39 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Distribution shape  Comparison of means: student t-test BNC ≠ Emotion Lexica (p<0.000)  Different sample means  5-30 times more frequent than gen. language  Assumptions of test?

40 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Bootstrap Sampling Distribution  Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?  1000 random samples of terms from BNC  Sample size = size of sentiment lexicon  H0: Observed sample falls inside within 95% of bootstrap random sampling distribution of means

41 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Bootstrap Sampling Distribution  Are sentiment-bearing terms a statistically distinct and highly frequent subset of English?  For all lexica:  Mean term frequency of lexicon well outside 95%  Sentiment lexica are not representative of BNC (p<0.05)

42 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Sentiment Features  Is there a polarity bias in General Language?  Positive polarity bias  Statistically significant for all lexica (χ2 test of independence)

43 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Sentiment Features  Is there a polarity bias in General Language when you include intensity of polarity?  Positive polarity bias  Statistically significant for all lexica  χ2 = 158.5, df=1, p<0.0001 for General Inquirer  χ2 = 63.6, df=1, p<0.0001 for Whissel

44 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Corpus Analysis Some conclusions  Sentiment-bearing terms are a distinct subset of English  Positive polarity bias in BNC  General Inquirer and Whissel:  Low coverage and high frequency  SentiwordNet:  Wide coverage and much lower frequency

45 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Outline  Current psychological theory of emotion  Evaluation of lexical “emotion” resources  Corpus analysis of language of “emotion” Comparative

46 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Aims  Examine affective term use  Identify statistically different distributions  Is there a dominant feature/polarity?

47 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis The Data  Financial Language  2 million words  On-line financial news: Reuters, CNN, Bloomberg Newspapers  General Language BNC  100 million words  Balanced, broad corpus

48 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis The Data  BNC sub-corpora  Imaginative written English  16 million words  Informative written English  70 million words

49 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Methodology  Compare proportions of Sentiment Features  χ 2 Test of Independence  H 0 : π FinCorpus = π BNC

50 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Methodology  Statistical significance of different proportion  χ 2 > 7.8794  p >= 0.005  Features:  41 Lexicon Sentiment Features from 4 lexica  Frequency per million words

51 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Financial Corpus  WRT Imaginative: More affective terms  WRT Informative: Many more affective terms  WRT BNC:  Dependent on feature type  Distributions are statistically distinct

52 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Positive GI Features

53 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Positive GI Features

54 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Negative GI Features

55 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Negative GI Features

56 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Comparative Corpus Analysis Negative GI Features

57 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Some conclusions  Lexical resources for sentiment are consistent  Financial news is a sub-language:  Affective content is statistically distinct relative to general language  Text polarity is asymmetric, positive skew  Different skews for different domains

58 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Something to think about  If different language varieties and domains have distinct use of sentiment terms and their own polarity bias:  Individual sentiment values are not informative  So what do we need??

59 Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Thank You! Ann.Devitt@cs.tcd.ie


Download ppt "Wednesday 18 February 2009 Ann Devitt Trinity College Dublin Wednesday 18 February 2009 The Languages of Emotion and Financial News Ann Devitt Khurshid."

Similar presentations


Ads by Google