+ CATPAC & WordStat Anne D. Sito & Erin Sonenstein COM 633: FA 09
+ CATPAC
+ Overview of CATPAC: Designed to recognize frequently used words in text. Identifies and groups patterns of similar words. Provides output of clustering algorithms, perceptual maps, and interactive clustering.
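As a rough illustration of the "frequently used words" idea, the sketch below counts word tokens in a string. It is not CATPAC's implementation; the function name `word_frequencies`, the tokenizing rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Return the top_n most frequent lowercase word tokens in text."""
    # Assumed tokenizing rule: runs of letters, optionally with one
    # internal apostrophe so that "you'll" stays a single token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens).most_common(top_n)

# Made-up sample text, not the text analyzed in the slides.
sample = "You read and you learn, and you'll grow."
print(word_frequencies(sample, top_n=2))
```

Note that function words such as "you" and "and" dominate a raw count like this unless an exclusion list is applied.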
+ Data Preparation: Text
+ 1. Convert document into a .txt file
+ 2. Inputting Data
+ 3. Select Text File You Want to Analyze
+ 4. Select “Make Dendrogram”
+ 5. Initial Output Screen
+ 6. Output Data Screen
+ 7. Output: Dendrogram
+ 8. Data Presented in ThoughtView 2D
+ 9. Data Presented in ThoughtView 3D
+ 10. ThoughtView 3D (Rotated)
+ Discussion and Limitations. +’s: Found words like “you”, “you’ll”, and “and” to be the most used in this text. Examines relationships between words based on proximity in the text. -’s: Words are measured based on frequency, not importance. Focuses less on what words “mean” or how they fit together based on dictionaries.
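One simple way to picture "relationships between words based on proximity" is to count how often two words fall inside the same sliding window of text. The sketch below is a simplified stand-in under that assumption, not CATPAC's actual algorithm; `window_cooccurrence`, the window size, and the sample sentence are hypothetical.

```python
from collections import Counter

def window_cooccurrence(tokens, window=3):
    """Count unordered word pairs that appear together inside a sliding window."""
    pairs = Counter()
    # Slide the window one token at a time; texts shorter than the
    # window are treated as a single window.
    for start in range(max(len(tokens) - window + 1, 1)):
        seen = sorted(set(tokens[start:start + window]))
        for i, first in enumerate(seen):
            for second in seen[i + 1:]:
                # Each pair is stored in alphabetical order.
                pairs[(first, second)] += 1
    return pairs

# Made-up sample text for illustration only.
tokens = "the book was great the book was long".split()
counts = window_cooccurrence(tokens, window=3)
print(counts[("book", "was")])
```

Pairs with high counts are the ones a clustering step would tend to group together. Like the frequency measure, this says nothing about what the words mean.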
+ WordStat
+ Overview of WordStat: A content analysis module for SIMSTAT. Specifically designed to process textual information; geared toward open-ended data, including journal articles, speeches, electronic communication, interviews, etc. Has an existing dictionary library and can also run analyses from new dictionaries built by the user. Can perform statistical analyses (e.g., factor analysis, word frequencies, multiple regression). KWIC (Key Word In Context) tables are available for any included or excluded word or word pattern.
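To make the KWIC idea concrete, the sketch below lists each occurrence of a keyword with a few words of context on either side. It only illustrates the concept and is not WordStat's implementation; the function `kwic`, its `width` parameter, and the sample review are assumptions.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each keyword occurrence."""
    words = text.split()
    rows = []
    for i, word in enumerate(words):
        # Compare with surrounding punctuation stripped and case ignored.
        if word.strip('.,;:!?"\'').lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            rows.append((left, word, right))
    return rows

# Made-up review text, not taken from the slides' data.
review = "I found the book terrific. My sister thought the book was too long."
for left, word, right in kwic(review, "book", width=2):
    print(f"{left:>20} | {word} | {right}")
```

Reading the keyword in context is what lets an analyst check whether a dictionary match reflects the intended sense of the word.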
+ Data: Comparing Reviews of the Book on Amazon.com Between Men and Women
+ 1. Create a Text File
+ 2. Input Text File to WordStat
+ 3. Define Your Variables
+ 4. Running the Analysis
+ 5. Existing Dictionary Was Not Relevant for Our Data
+ 6. New Dictionary Available Online!
+ 7. (Free) New Dictionary Download
+ 8. Import New Dictionary; Maintain Exclusion List
+ 9. Level 1 Analysis
+ 10. Level 2 Analysis
+ 11. Overall Frequencies
+ 12. Gender Differences
+ 13. Dendrogram
+ 14. Clustering
+ 15. Figure of Output
+ 16. Co-occurrence Matrix
+ 17. KWIC by Gender
+ 18. Words by each Text Case
+ 19. Word Count Category Frequency
+ 20. Aggression Example
+ 21. Limitations: Terrific=Anxiety?
+ Discussion & Limitations: Allows multiple independent variables. Dictionaries may not always be complete. Words in the .txt file must be spelled correctly. Could not distinguish between quotes from the book and original thoughts. May not account for different usages of certain words (e.g., combating, terrific).
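The word-usage limitation can be seen in a small sketch of dictionary-based categorization split by an independent variable. Everything here is hypothetical: `TOY_DICTIONARY` is a made-up stand-in (not the dictionary used in the slides), the two review snippets are invented, and `category_counts_by_group` is an illustrative name, not a WordStat function.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy dictionary: category -> words.
# This is NOT the dictionary used in the slides.
TOY_DICTIONARY = {
    "anxiety": {"terrific", "afraid", "worried"},
    "aggression": {"combating", "fight", "attack"},
}

def category_counts_by_group(cases, dictionary):
    """cases: iterable of (group, text). Returns {group: Counter(category -> hits)}."""
    # Invert the dictionary so each word maps to its category.
    word_to_category = {w: cat for cat, words in dictionary.items() for w in words}
    totals = defaultdict(Counter)
    for group, text in cases:
        for token in re.findall(r"[a-z]+", text.lower()):
            category = word_to_category.get(token)
            if category is not None:
                totals[group][category] += 1
    return totals

# Invented example cases, labeled by a grouping variable.
cases = [
    ("female", "A terrific read about combating stress."),
    ("male", "I was worried it would be dull, but it was terrific."),
]
result = category_counts_by_group(cases, TOY_DICTIONARY)
for group, counts in result.items():
    print(group, dict(counts))
```

Both invented reviews use "terrific" as praise, yet a word-list match scores it under anxiety, and "combating stress" counts as aggression. A purely dictionary-based count cannot tell these usages apart.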
+ Any Questions? Thank You!