
2 Dan Jurafsky Lecture 8: Medical Applications: Intoxication, Depression, Trauma, Alzheimers, General Medical Health CS 424P/ LINGUIST 287 Extracting Social Meaning and Sentiment

3 Topic 1: Intoxication

4 Hollien et al 2001 Methods: 35 young adults (19 males, 16 females) given a series of doses of alcohol; speech collected at 4 BAC stages: Rainbow passage; difficult words (buttercup, shapupie); extemp speech (“Tell us about your favorite TV program”); head-mounted mikes. Investigated: F0 mean and variance; duration/rate of speech; intensity; disfluencies

5 Hollien et al 2001 Results: F0

6 Hollien et al 2001 Results: Duration

7 Hollien et al 2001 Results: Disfluencies

8 Hollien et al 2001 Results: Magnitudes

9 Hollien et al 2001 Results: Speaker-Specific Effects. What did they find?

10 A famous case study Johnson, K., Pisoni, D. & Bernacki, R. (1990) Do voice recordings reveal whether a person is intoxicated?: A case study. Phonetica. 47: 215-237.

11 Exxon Valdez

12 Was Captain Hazelwood drunk? Not clear if this is relevant, since he was asleep below deck, the third mate was in charge of the wheelhouse, and the ship’s radar was broken. But it is a well-studied case.

13 Johnson et al examined 3 kinds of cues: Segmental Effects; Disfluencies; Suprasegmental Effects

14 Keith Johnson's /s/ and /ʃ/

15 /ʃ/: Captain Hazelwood


18 Duration

19 F0

20 Summary

21 Questions: Johnson et al. examined various possible causes. What other kinds of speaker state could cause a drop in F0, slower speech, and disfluencies?

22 New Corpus! Alcohol Language Corpus, Florian Schiel et al 2009, 2010. http://www.bas.uni-muenchen.de/forschung/Bas/BasALCeng.html 124 speakers, 11,160 recordings; recorded in a car (sometimes with engine running); tongue twisters; command and control speech (“turn off the radio”); spontaneous dialogue and monologue

23 Automatic Classification: Use of prosodic speech characteristics for automated detection of alcohol intoxication (Michael Levit, Richard Huber, Anton Batliner, Elmar Noeth). Break utterance into phrases automatically, based on: fundamental frequency (where possible); zero-crossing rate; energy

24 Then use 4 classes of features: Prosodic (F0 max, F0 min, energy max, energy min, pause length); Duration of voiced regions, unvoiced regions, etc.; Jitter and shimmer; Average cepstrum and cepstral slope

25 Methods: Alcoholized speech samples collected at the Police Academy of Hessen, Germany; 120 readings (87 minutes) of a fable; 33 male speakers; BAC between 0 and .24/mille. Binary task: above or below 0.8/mille; leave-one-out cross-validation; neural net classifier

26 Results of Levit et al.: Used dev set to find best classifier. This used two feature classes: prosodic features and jitter/shimmer. Results with this classifier: 62% phrase accuracy; 69% for the whole speech sample, by voting of the phrases
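The "voting of the phrases" step can be illustrated with a minimal sketch. It assumes a plain majority vote over phrase-level labels; the slide does not say which voting scheme Levit et al. actually used, and the function name and labels below are hypothetical.

```python
from collections import Counter


def classify_sample(phrase_labels):
    """Return the majority label among phrase-level predictions.

    Assumption for this sketch: simple majority voting. Ties go to the
    label that appears first in the input, which is an arbitrary choice.
    """
    if not phrase_labels:
        raise ValueError("need at least one phrase-level prediction")
    counts = Counter(phrase_labels)
    return counts.most_common(1)[0][0]


# One speech sample split into five phrases, each classified separately.
phrases = ["above", "below", "above", "above", "below"]
print(classify_sample(phrases))
```

A vote over several noisy phrase decisions can be more reliable than any single phrase decision, which is one possible reading of why the whole-sample figure is higher than the phrase-level one.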

27 Automatic detection features in the Bavarian corpus. Humans: 62%-75%. Machine: features used to date: F0; duration; rhythm (correlated with duration but doesn’t require word transcripts); formants (F1 mean and F4 variance). Future work!!! disfluencies; other segmental features: s versus sh; but Schiel finding: more hyperarticulation in vowels in women in their corpus

28 Topic 2: Depression

29 Stirman and Pennebaker: Suicidal poets. 300 poems from early, middle, late periods of 9 suicidal poets and 9 non-suicidal poets

30 Stirman and Pennebaker: 2 models. Durkheim disengagement model: suicidal individual has failed to integrate into society sufficiently, is detached from social life; they detach from the source of their pain, withdraw from social relationships, become more self-oriented; prediction: more self-reference, fewer group references. Hopelessness model: suicide takes place during extended periods of sadness and desperation, pervasive feelings of helplessness, thoughts of death; prediction: more negative emotion, fewer positive, more refs to death

31 Methods: 156 poems from 9 poets who committed suicide; published, well-known in English; have written within 1 year of committing suicide. Control poets matched for nationality, education, sex, era.

32 The poets

33 Stirman and Pennebaker: Results

34 Significant factors. Disengagement theory: I, me, mine; we, our, ours. Hopelessness theory: death, grave. Other: sexual words (lust, breast)

35 Rude et al: Language use of depressed and depression-vulnerable college students. Beck (1967) cognitive theory of depression: depression-prone individuals see the world and themselves in pervasively negative terms. Pyszczynski and Greenberg (1987): think about themselves after the loss of a central source of self-worth, unable to exit a self-regulatory cycle concerned with efforts to regain what was lost; results in self-focus, self-blame. Durkheim social integration/disengagement: perception of self as not integrated into society is key to suicidality and possibly depression

36 Methods: College freshmen: 31 currently-depressed (standard inventories), 26 formerly-depressed, 67 never-depressed. Session 1: take depression inventory. Session 2: write essay: “please describe your deepest thoughts and feelings about being in college… write continuously off the top of your head. Don’t worry about grammar or spelling. Just write continuously.”

37 Results: depressed used more “I, me” than never-depressed (turned out to be only “I”) and used more negative emotional words; not enough “we” to check Durkheim model; formerly depressed participants used more “I” in the last third of the essay

38 Ramirez-Esparza et al: Depression in English and Spanish. Study 1: Use LIWC counts on posts from 320 English and Spanish forums: 80 posts each from depression forums in English and Spanish; 80 control posts each from breast cancer forums. Run the following LIWC categories: I; we; negative emotion; positive emotion
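A rough sketch of this kind of category counting follows. The word lists are tiny hypothetical stand-ins, not the real LIWC dictionaries, and the tokenizer is a deliberately crude assumption; the point is only to show how a per-post percentage for each category could be computed.

```python
import re

# Hypothetical miniature word lists for illustration only; the real LIWC
# dictionaries are far larger and are not reproduced here.
CATEGORIES = {
    "I": {"i", "me", "my", "mine", "myself"},
    "we": {"we", "us", "our", "ours", "ourselves"},
    "negative emotion": {"sad", "hopeless", "ugly", "guilty"},
    "positive emotion": {"happy", "good", "nice"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_rates(text):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = tokenize(text)
    total = len(tokens)
    rates = {}
    for name, words in CATEGORIES.items():
        hits = sum(1 for token in tokens if token in words)
        rates[name] = 100.0 * hits / total if total else 0.0
    return rates


print(category_rates("I feel sad and we feel good"))
```

Expressing counts as a percentage of all tokens keeps long and short posts comparable, which matters when forum posts vary widely in length.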

39 Results of Study 1

40 Conclusions?

41 Study 2: From depression forums: 404 English posts, 404 Spanish posts. Create a term-by-document matrix of content words (200 most frequent content words). Do a factor analysis (dimensionality reduction in term-document matrix). Used 5 factors

42 English Factors

43 Spanish Factors
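As a minimal sketch of the first step in Study 2, the code below builds a term-by-document count matrix over the most frequent content words. The stop list, the whitespace tokenizer, and the function name are assumptions made for illustration; the slide does not say how content words were selected.

```python
from collections import Counter

# Hypothetical stop list; a real study would use a much fuller one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "i", "my", "it"}


def term_document_matrix(posts, vocab_size=200):
    """Return (vocab, matrix) with one row per term and one column per post."""
    tokenized = [
        [word for word in post.lower().split() if word not in STOP_WORDS]
        for post in posts
    ]
    totals = Counter(word for doc in tokenized for word in doc)
    # Keep only the most frequent content words as the vocabulary.
    vocab = [word for word, _ in totals.most_common(vocab_size)]
    matrix = [[doc.count(term) for doc in tokenized] for term in vocab]
    return vocab, matrix


posts = ["my family and friends", "friends help", "the medication and the doctor"]
vocab, matrix = term_document_matrix(posts, vocab_size=2)
print(vocab)
print(matrix)
```

The resulting matrix is the input that a factor analysis or other dimensionality reduction would then operate on.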

44 Implications? Problems? New applications?

45 Topic 3: Trauma

46 Cohn, Mehl, Pennebaker: Linguistic Markers of Psychological Change Surrounding September 11, 2001. 1084 LiveJournal users; all blog entries for 2 months before and after 9/11. Lumped prior two months into one “baseline” corpus. Investigated changes after 9/11 compared to that baseline, using LIWC categories

47 Variables examined. Emotional positivity: difference between LIWC scores for positive emotion words (happy, good, nice) and negative emotion words (kill, ugly, guilty). Cognitive processing: think, question, because; concerned with organizing and intellectually understanding issues. Social orientation: talk, share, friends, and personal pronouns besides I/me (essentially counts # of references to other people)

48 Last factor: Psychological Distancing. Factor-analytic: + articles, + words > 6 letters long, − I/me/mine, − would/should/could, − present tense verbs. Low score = personal, experiential language, focus on here and now; high score = abstract, impersonal, rational tone
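The emotional positivity variable is a difference of two category scores, which can be sketched as below. The word sets contain only the example words given on the slide and are not the LIWC dictionaries; expressing each score as a percentage of tokens is an assumption of this sketch.

```python
import re

POSITIVE = {"happy", "good", "nice"}   # example words from the slide only
NEGATIVE = {"kill", "ugly", "guilty"}  # example words from the slide only


def emotional_positivity(text):
    """Positive-word percentage minus negative-word percentage."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    positive_hits = sum(1 for token in tokens if token in POSITIVE)
    negative_hits = sum(1 for token in tokens if token in NEGATIVE)
    return 100.0 * (positive_hits - negative_hits) / len(tokens)


print(emotional_positivity("a good nice day"))
print(emotional_positivity("an ugly day"))
```

A positive value means positive emotion words outnumber negative ones in the entry, and a negative value means the reverse.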

49 Results

50 Implications? Methodological problems? Ideas for exciting new studies?

51 Topic 4: Alzheimers

52 The Nun Study. “Linguistic Ability in Early Life and the Neuropathology of Alzheimer’s Disease and Cerebrovascular Disease: Findings from the Nun Study”, D.A. SNOWDON, L.H. GREINER, AND W.R. MARKESBERY. The Nun Study: a longitudinal study of aging and Alzheimer’s disease. Cognitive and physical function assessed annually. All participants agreed to brain donation at death. At the first exam, given between 1991 and 1993, the 678 participants were 75 to 102 years old. This study: subset of 74 participants for whom we had handwritten autobiographies from early life, all of whom had died.

53 The data: In September 1930 the leader of the School Sisters of Notre Dame religious congregation requested each sister write “a short sketch of her own life. This account should not contain more than two to three hundred words and should be written on a single sheet of paper... include the place of birth, parentage, interesting and edifying events of one's childhood, schools attended, influences that led to the convent, religious life, and its outstanding events.” Handwritten diaries found in two participating convents, Baltimore and Milwaukee

54 The linguistic analysis. Grammatical complexity: Developmental Level metric (Cheung/Kemper); sentences classified from 0 (simple one-clause sentences) to 7 (complex sentences with multiple embedding and subordination). Idea density: average number of ideas expressed per 10 words; elementary propositions, typically verb, adjective, adverb, or prepositional phrase. Complex propositions that stated or inferred causal, temporal, or other relationships between ideas also were counted. Prior studies suggest: idea density is associated with educational level, vocabulary, and general knowledge; grammatical complexity is associated with working memory, performance on speeded tasks, and writing skill.

55 Idea density “I was born in Eau Claire, Wis., on May 24, 1913 and was baptized in St. James Church.” (1) I was born, (2) born in Eau Claire, Wis., (3) born on May 24, 1913, (4) I was baptized, (5) was baptized in church (6) was baptized in St. James Church, (7) I was born...and was baptized. There are 18 words or utterances in that sentence. The idea density for that sentence was 3.9 (7/18 * 10 = 3.9 ideas per 10 words).
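The arithmetic in this example can be checked with a small sketch. Counting the ideas themselves was done by hand in the study; the function below (a hypothetical helper) only applies the "ideas per 10 words" scaling to counts that are already known.

```python
def idea_density(num_ideas, num_words):
    """Ideas expressed per 10 words, given hand-counted ideas and words."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return 10.0 * num_ideas / num_words


# The sentence on the slide: 7 ideas in 18 words.
print(round(idea_density(7, 18), 1))  # → 3.9
```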

56 Results: Participants with neuropathologically defined Alzheimer’s disease had lower idea density scores than those without Alzheimer’s. Correlations between idea density scores and mean neurofibrillary tangle counts: −0.59 for the frontal lobe, −0.48 for the temporal lobe, −0.49 for the parietal lobe

57 Explanations? Early studies found same results with a college-educated subset of the population who were teachers, suggesting education was not the key factor. They suggest: low linguistic ability in early life may reflect suboptimal neurological and cognitive development, which might increase susceptibility to the development of Alzheimer’s disease pathology in late life

58 Garrard et al. 2005: British writer Iris Murdoch; last novel published 1995; diagnosed with Alzheimers 1997. Compared three novels: Under the Net (first); The Sea, The Sea (in her prime); Jackson's Dilemma (final novel). All her books written in longhand with little editing

59 Type to token ratio in the 3 novels
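For reference, a type-to-token ratio can be computed as in the sketch below. The tokenizer is a simple assumption and is not the procedure used in the Murdoch study; the function name is hypothetical.

```python
import re


def type_token_ratio(text):
    """Number of distinct word forms divided by total number of words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no word tokens")
    return len(set(tokens)) / len(tokens)


print(type_token_ratio("He had seen her die. He had no other siblings."))
```

Because the ratio tends to fall as a text gets longer, comparisons between books are usually made on samples of equal size.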

60 Syntactic Complexity

61 Mean proportions of usages of the 10 most frequently occurring words in each book that appear twice within a series of short intervals, ranging from consecutive positions in the text to a separation of three intervening words. Garrard P et al. Brain 2005;128:250-260 Brain Vol. 128 No. 2 © Guarantors of Brain 2004; all rights reserved

62 Parts of speech

63 Comparative distributions of values of: (A) frequency and (B) word length in the three books. Garrard P et al. Brain 2005;128:250-260 Brain Vol. 128 No. 2 © Guarantors of Brain 2004; all rights reserved

64 From Under the Net, 1954: "So you may imagine how unhappy it makes me to have to cool my heels at Newhaven, waiting for the trains to run again, and with the smell of France still fresh in my nostrils. On this occasion, too, the bottles of cognac, which I always smuggle, had been taken from me by the Customs, so that when closing time came I was utterly abandoned to the torments of a morbid self-scrutiny." From Jackson's Dilemma, 1995: "His beautiful mother had died of cancer when he was 10. He had seen her die. When he heard his father's sobs he knew. When he was 18, his younger brother was drowned. He had no other siblings. He loved his mother and his brother passionately. He had not got on with his father. His father, who was rich and played at being an architect, wanted Edward to be an architect too. Edward did not want to be an architect."

65 Lancashire and Hirst Vocabulary Changes in Agatha Christie’s Mysteries as an Indication of Dementia: A Case Study Ian Lancashire and Graeme Hirst 2009


67 Lancashire and Hirst 2009: Examined all of Agatha Christie’s novels. Features (Nicholas, M., Obler, L. K., Albert, M. L., Helm-Estabrooks, N. (1985). Empty speech in Alzheimer’s disease and fluent aphasia. Journal of Speech and Hearing Research, 28: 405–10): number of unique word types; number of different repeated n-grams up to 5; number of occurrences of “thing”, “anything”, and “something”
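The three features listed here can be sketched as follows. This is an illustrative reading, not the authors' implementation: the tokenizer is an assumption, and "repeated n-grams up to 5" is interpreted as every distinct n-gram with n from 1 to 5 that occurs more than once.

```python
import re
from collections import Counter

INDEFINITE = {"thing", "anything", "something"}


def vocabulary_features(text, max_n=5):
    """Compute three vocabulary features for one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Count every n-gram for n = 1 .. max_n (assumed reading of the slide).
    ngram_counts = Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    )
    return {
        "word_types": len(set(tokens)),
        "repeated_ngrams": sum(1 for c in ngram_counts.values() if c > 1),
        "indefinite_words": sum(1 for t in tokens if t in INDEFINITE),
    }


print(vocabulary_features("Something is a thing. Something is."))
```

In practice such counts would be taken over fixed-size samples from each novel so that books of different lengths remain comparable.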


69 Results

70 Topic 5: Writing and physical health. People asked to write about traumatic experiences subsequently exhibit better physical health than people asked to write about superficial topics. Intuition: people who write about emotional topics report that the experiment makes them think differently about their experience. Hypothesis: Do changes in writing style correlate with improved health? Could we find these changes automatically?

71 Singular Value Decomposition: Singular Value Decomposition (SVD) is a form of factor analysis. Any $m \times n$ matrix $A$ can be written using an SVD of the form $A = U D V^{T}$, where: $U$ is an $m \times n$ matrix (a ‘hanger’ matrix); $D$ is an $n \times n$ diagonal matrix (a ‘stretcher’ matrix); $V^{T}$ is an $n \times n$ matrix (an ‘aligner’ matrix) (see http://www.uwlax.edu/faculty/will/svd/index.html)

72 Application of SVD to LSA: Assemble a large corpus of natural language; parse corpus into meaningful passages; form matrix with passages as rows and words as columns; SVD applied to re-represent the words and passages as vectors in a high-dimensional ‘semantic space’

73 SVD: an example (1) Titles of Technical Memos. c1: Human machine interface for ABC computer applications; c2: A survey of user opinion of computer system response time; c3: The EPS user interface management system; c4: System and human system engineering testing of EPS; c5: Relation of user perceived response time to error measurement; m1: The generation of random, binary, ordered trees; m2: The intersection graph of paths in trees; m3: Graph minors IV: Widths of trees and well-quasi-ordering; m4: Graph minors: A survey

74 LSA: This example is taken from: Deerwester, S., Dumais, S.T., Landauer, T.K., Furnas, G.W. and Harshman, R.A. (1990). "Indexing by latent semantic analysis." Journal of the American Society for Information Science, 41(6), 391-407. Slides are from a presentation by Tom Landauer and Peter Foltz, adapted by Melanie Martin

75 A Small Example: Technical Memo Titles. c1: Human machine interface for ABC computer applications; c2: A survey of user opinion of computer system response time; c3: The EPS user interface management system; c4: System and human system engineering testing of EPS; c5: Relation of user perceived response time to error measurement; m1: The generation of random, binary, ordered trees; m2: The intersection graph of paths in trees; m3: Graph minors IV: Widths of trees and well-quasi-ordering; m4: Graph minors: A survey

76 A Small Example – 2: r(human, user) = -.38; r(human, minors) = -.29

77 A Small Example – 3: Singular Value Decomposition: $A = U S V^{T}$. Dimension Reduction: $\tilde{A} \approx \tilde{U} \tilde{S} \tilde{V}^{T}$
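The decomposition and dimension reduction can be tried on the nine memo titles with the sketch below. It assumes NumPy is available, that the index terms are the twelve words occurring in more than one title, and that the reduction keeps two dimensions; none of these choices is stated on the slides, so treat the term matrix and the rank as assumptions.

```python
import numpy as np

TERMS = ["human", "interface", "computer", "user", "system", "response",
         "time", "EPS", "survey", "trees", "graph", "minors"]

# Term-by-title counts; columns are c1..c5 followed by m1..m4.
A = np.array([
    [1, 0, 0, 1, 0, 0, 0, 0, 0],  # human
    [1, 0, 1, 0, 0, 0, 0, 0, 0],  # interface
    [1, 1, 0, 0, 0, 0, 0, 0, 0],  # computer
    [0, 1, 1, 0, 1, 0, 0, 0, 0],  # user
    [0, 1, 1, 2, 0, 0, 0, 0, 0],  # system
    [0, 1, 0, 0, 1, 0, 0, 0, 0],  # response
    [0, 1, 0, 0, 1, 0, 0, 0, 0],  # time
    [0, 0, 1, 1, 0, 0, 0, 0, 0],  # EPS
    [0, 1, 0, 0, 0, 0, 0, 0, 1],  # survey
    [0, 0, 0, 0, 0, 1, 1, 1, 0],  # trees
    [0, 0, 0, 0, 0, 0, 1, 1, 1],  # graph
    [0, 0, 0, 0, 0, 0, 0, 1, 1],  # minors
], dtype=float)


def reduce_rank(matrix, k):
    """Rebuild the matrix from its k largest singular values only."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


def term_correlation(matrix, term_a, term_b):
    """Pearson correlation between two term rows across the titles."""
    row_a = matrix[TERMS.index(term_a)]
    row_b = matrix[TERMS.index(term_b)]
    return float(np.corrcoef(row_a, row_b)[0, 1])


A2 = reduce_rank(A, 2)  # assumed: keep two dimensions
for other in ("user", "minors"):
    print(other,
          round(term_correlation(A, "human", other), 2),
          round(term_correlation(A2, "human", other), 2))
```

On the raw counts, "human" and "user" never share a title, so their correlation is negative; after the rank reduction the two rows are rebuilt from shared dimensions, which is how terms that never co-occur can still end up looking similar.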

78 A Small Example – 4: matrix {U}

79 A Small Example – 5: matrix {S}

80 A Small Example – 6: matrix {V}

81 A Small Example – 7: r(human, user) = .94; r(human, minors) = -.83

82 A Small Example – 2 reprise: r(human, user) = -.38; r(human, minors) = -.29

83 Pennebaker results Pronouns: I, my, it, you, me, she, he, her, we, they, your, him, his, them, our, myself, their, us, its

84 Implications?

