Jonathan Simon Elizabeth Langdon COM 633, Fall 2010.


1 Jonathan Simon Elizabeth Langdon COM 633, Fall 2010

2
- The function of the General Inquirer (GI) is to count the words that fall into various dictionary-supplied categories
- It uses categories from the Harvard IV-4 dictionary and the Lasswell dictionary, as well as five categories based on the social cognition work of Semin and Fiedler
- 182 categories in all
- Each category is a list of words and word senses
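The counting idea on this slide can be illustrated with a minimal Python sketch. The word lists below are hypothetical toy entries, not actual General Inquirer content, and the sketch ignores word senses and disambiguation, which the real program takes into account.

```python
import re

# Hypothetical toy dictionary: these word lists are illustrative only,
# not actual General Inquirer entries.
TOY_CATEGORIES = {
    "Pstv": {"good", "support", "help"},
    "Affil": {"support", "help"},  # a subset of the Pstv words
    "Ngtv": {"bad", "attack"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_categories(text, categories):
    """Count how many tokens fall into each category.

    A token is counted once for every category that lists it, so a word
    tagged both Pstv and Affil adds to both counts.
    """
    counts = {name: 0 for name in categories}
    for token in tokenize(text):
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return counts


example_counts = count_categories(
    "Good friends support and help; bad ones attack.", TOY_CATEGORIES
)
print(example_counts)
```

Because categories may overlap, the per-category counts can sum to more than the number of words in the text.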

3 Examples of Harvard IV-4 categories:
- Pstv: 1045 positive words, plus a subset of 557 words tagged Affil for words indicating affiliation or supportiveness
- Ngtv: 1160 negative words, plus a subset of 833 words tagged Hostile for words indicating an attitude or concern with hostility or aggressiveness
- Strong: 1902 words implying strength, plus a subset of 689 words tagged Power, indicating a concern with power, control or authority
- Weak: 755 words implying weakness, plus a subset of 284 words tagged Submit, indicating submission to authority or power, dependence on others, vulnerability to others, or withdrawal

4 Examples of Lasswell categories:
- PowGain: 65 words about power increasing
- PowLoss: 109 words about power decreasing
- PowEnds: 30 words about the goals of the power process
- PowAren: 53 words referring to political places and environments
- PowCon: 228 words for ways of conflicting

5
- For names and basic descriptions of each category: http://www.wjh.harvard.edu/~inquirer/homecat.htm
- For a list of all words contained in each of the 182 categories: http://www.webuse.umd.edu:9090/tags/

6
- Users CAN add new categories
- Considerations for adding categories:
  - “Somewhat comparable to producing a set of survey questions that everyone agrees has validity in measuring a well-specified construct”
  - Mapping categories accurately requires attention to word use, word senses, and disambiguation routines

7 Purpose: Analyze content of news articles from three different sources
- Articles are about the same Ted Strickland fundraiser
- Sources include a newscast (via closed captioning) from WKYC, an online article from FOX8, and an online article from The Plain Dealer

8  Beginning Screens:

9 Input: Select the content you wish to analyze
- Use plain text format (.txt)
- Analyze a single file or multiple files at one time
  - To analyze multiple files simultaneously, save them to a directory (e.g. F:\NewsArticles)
  - In the output, each file will have its own line of data within your Excel file (one row for a single file, multiple rows for multiple files)

10
- Output: Specify where you want the data output to be saved, name the file, and add the .xls extension
- Dictionary: You will not need to change this! GI will analyze your content using all of its 182 categories

11 Tags: Output is a matrix of counts and percentages of words falling into the dictionaries’ semantic categories
- Format column includes r (raw count, or simple count of words) and s (scaled count, or percentage of words in each category)
- Wordcount column is the total number of words in the file
- Leftovers column shows words not found in any dictionary
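As a rough illustration of the Tags layout, the sketch below builds one r row and one s row for a single file. It rests on several assumptions not confirmed by the slides: the scaled value is taken as 100 × raw count / total words, Leftovers is reported as a count, and the category words, column names, and file name are all hypothetical.

```python
import re

# Hypothetical toy categories; not actual General Inquirer word lists.
TOY_CATEGORIES = {
    "Pstv": {"good", "support"},
    "Ngtv": {"bad"},
}


def tag_rows(filename, text, categories):
    """Return a raw-count row ("r") and a scaled-percentage row ("s")."""
    tokens = re.findall(r"[a-z']+", text.lower())
    wordcount = len(tokens)
    # Every word that appears in at least one category list.
    known = set().union(*categories.values())
    # Assumption: leftovers are reported as a count of unmatched tokens.
    leftovers = sum(1 for t in tokens if t not in known)
    raw = {
        name: sum(1 for t in tokens if t in words)
        for name, words in categories.items()
    }
    # Assumption: scaled = percentage of all words in the file.
    scaled = {
        name: (100.0 * n / wordcount if wordcount else 0.0)
        for name, n in raw.items()
    }
    base = {"file": filename, "wordcount": wordcount, "leftovers": leftovers}
    return [
        {**base, "format": "r", **raw},
        {**base, "format": "s", **scaled},
    ]


rows = tag_rows(
    "wkyc.txt",  # hypothetical file name
    "Good news: voters support the good plan despite bad weather",
    TOY_CATEGORIES,
)
for row in rows:
    print(row)
```

Keeping both rows side by side makes it easy to compare files of different lengths, since only the s values are normalized by file size.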


13 Words: Output is a count of all words appearing in your file
- Rows are words, columns are file names
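The word-by-file layout of the Words output can be sketched as follows. This is only an approximation of the idea, with hypothetical file names and contents; it does not reproduce GI's actual tokenization or file format.

```python
import re
from collections import Counter


def word_matrix(documents):
    """Build {word: {filename: count}} with one column per file."""
    per_file = {
        name: Counter(re.findall(r"[a-z']+", text.lower()))
        for name, text in documents.items()
    }
    # Rows: every word seen in any file, in alphabetical order.
    vocabulary = sorted(set().union(*per_file.values()))
    return {
        word: {name: per_file[name][word] for name in documents}
        for word in vocabulary
    }


# Hypothetical file names and contents, for illustration only.
docs = {
    "wkyc.txt": "jobs and schools",
    "fox8.txt": "jobs jobs taxes",
}
matrix = word_matrix(docs)
for word, row in matrix.items():
    print(word, row)
```

A word that is absent from a file gets a count of 0 in that file's column, so every row has the same set of columns.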


15
- Overall, the WKYC article can be viewed as more positive and affiliative than the FOX and Plain Dealer (PD) articles
- The WKYC story showed the highest percentages in all positively valenced categories
- FOX or Plain Dealer showed higher percentages in all negatively valenced categories
- CATA / GI findings reflect the overall tone of the articles, as experienced by readers (e.g. pulled quotes, emphasis on political / economic climates, etc.)


17
- Yoshikoder provides a general word count, a custom dictionary word count, KWIC (keyword in context), and a reading highlight function
- The program can handle multiple documents and analyze them individually or side by side
- All dictionaries must be either custom built or downloaded from an external source; several dictionaries are available on the Yoshikoder website

18
- Dictionaries consist of 2 levels: Categories and Patterns
- Categories are concept words that fall into a larger construct
- Patterns are individual words or phrases that fall into a category and are actually searched for
- Yoshikoder dictionaries allow wild cards (*)
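The category and pattern structure with * wild cards can be mimicked in a few lines of Python. This is a sketch under assumptions, not Yoshikoder's own matching logic: the category names and patterns are hypothetical, only single-word patterns are handled (no phrases), and * is assumed to match any run of characters within a word.

```python
import re

# Hypothetical dictionary: category names and patterns are illustrative only.
ISSUES = {
    "Education": ["school*", "teacher*", "tuition"],
    "Jobs": ["job*", "employ*"],
}


def pattern_to_regex(pattern):
    """Turn a pattern with * wild cards into a compiled regex."""
    # Escape everything, then let each escaped * match any characters.
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))


def count_issue_mentions(text, dictionary):
    """Count the tokens that match any pattern in each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {}
    for category, patterns in dictionary.items():
        regexes = [pattern_to_regex(p) for p in patterns]
        counts[category] = sum(
            1 for t in tokens if any(r.fullmatch(t) for r in regexes)
        )
    return counts


issue_counts = count_issue_mentions(
    "Teachers and schools need jobs; unemployment hurts employers.", ISSUES
)
print(issue_counts)
```

Note that `employ*` matches "employers" but not "unemployment", because the whole word must match from its first letter; a dictionary author who wants both would need a separate pattern.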

19 Purpose: Analyze content of news articles from three different sources
- Articles are about the same Ted Strickland fundraiser
- Sources include a newscast (via closed captioning) from WKYC, an online article from FOX8, and an online article from The Plain Dealer
- This analysis will identify which issues were most frequently mentioned in these stories, given a list of predetermined possible issues

20  Beginning Screen:

21 Add Document: Documents must be .txt files

22  Multiple Documents can be uploaded


25 It is important to make sure that the proper level is highlighted when adding a category or pattern. Yoshikoder can nest categories within one another

26  Pre-made or downloaded dictionaries can be imported

27 A Yoshikoder “concordance” is a KWIC analysis
- Concordance > Make Concordance
- Results can be exported to HTML or Excel
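For readers unfamiliar with KWIC (keyword in context), the sketch below shows the general technique: each occurrence of a keyword is listed with a few words of context on either side. The function name, the window size, and the sample sentence are assumptions for illustration; Yoshikoder's own concordance options are not reproduced here.

```python
import re


def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each keyword hit."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Hypothetical sample sentence, for illustration only.
lines = kwic(
    "The governor said jobs matter and jobs will return", "jobs", window=2
)
for left, key, right in lines:
    print(f"{left:>20} | {key} | {right}")
```

Aligning the keyword in a center column, as the print loop does, is what makes patterns of surrounding usage easy to scan.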

28 Report
- Document Word Frequencies reports the frequencies of all words in an individual document
- All Word Frequencies reports the frequencies of all words in all documents, sorted by document
- Unified Word Frequencies reports the frequencies of all words in all selected documents

29 Report
- Dictionary Report shows the frequencies of dictionary words, by category or pattern, for an individual document
- A unified dictionary report exports the category frequencies into an Excel spreadsheet
- Document Comparison will compare any two documents
- Statistical Comparison Report will compare any two documents in terms of percent difference


32 The Channel 3 newscast contained more issue keywords than the FOX 8 and PD stories, with the biggest difference in focus being in education issues. The “Jobs” issue was the most frequently mentioned; however, it was emphasized more in the FOX 8 and PD stories than in Channel 3’s coverage. The remaining issue mentions were sporadic, with little overlap between the sources.

