Published by Eugene Cole. Modified over 9 years ago.
1
Application of Confidence Intervals to Text-based Social Network Construction
By CDT Julie Jorgensen, 06, G4
Advisors: MAJ Ian McCulloh, D/MATH; LTC John Graham, D/BS&L
2
Agenda
The Real-World Problem
Text Analysis/Social Network Analysis Solution
Social Network Analysis
Simple Text Analysis
A Better Solution
Themed Analysis
Example Case – Jihadist Texts
Theme Scores
Network Construction Procedure
Jihadist Network Results
Importance and Conclusions
3
The Real-World Problem
Commanders need to understand “Human Terrain.”
The majority of ‘HT’ information is in text form.
The Combating Terrorism Center receives volumes of data every day.
The Harmony Database is being rapidly declassified.
Analysts need an efficient way to plow through large amounts of text data and see the linkages.
Solution: Text Analysis displayed in Social Network Analysis.
4
Social Network Analysis
A mathematical method of quantifying connections between individuals or groups and drawing conclusions from those connections.
Assumes rational beings are interdependent.
Nodes: Key Actors
Links: Relationships between Nodes
5
“Human Terrain” Example: 9/11 Hijacker Network
6
[Figure labels: Barzani, Khamenei, Iraq Elections]
7
Demonstration Data Set: Jihadist Texts
Approx. 250 translated texts
  MEMRI
  FBIS
  Other Sources
15 Authors
  More than 1 text
  Not well known
8
Simple Text Analysis: The Plagiarism Check
Problem: Word matching is overly simple.
  Ignores context
  Actors can be overly weighted by writing more
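The length bias named on this slide can be illustrated with a minimal sketch. It assumes that "word matching" means counting the distinct words two texts share; the function name `shared_word_count` and the sample strings are hypothetical and do not come from the presentation.

```python
def shared_word_count(text_a, text_b):
    """Count distinct words that appear in both texts (case-insensitive)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    return len(words_a & words_b)

# Hypothetical sample texts, chosen only to show the effect of length.
reference = "the council met to discuss the election"
short_text = "the election"
# The longer text covers the same topic but simply contains more words.
long_text = "the election was the topic when the council met to discuss policy"

short_matches = shared_word_count(reference, short_text)
long_matches = shared_word_count(reference, long_text)
print(short_matches, long_matches)
```

Under this assumption the longer text scores higher only because it has more chances to match, which is one way an actor can be overweighted by writing more.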
9
Alternative: Themed Analysis
Traditional Network Analysis Methods
  Citation Analysis
  Physical Network
  Communication or Financial Network
Themed Analysis
  Relates nodes across multiple fields
  One similar theme versus many similar themes
10
Demonstration: Text Analysis
11
Theme Scores
*A Theme Score is the sum of each word’s score per text.
Problem
  The commander needs information in representations he/she understands.
  Networks can compare authors across single themes, but it is difficult to compare authors across multiple themes.
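A minimal sketch of the theme score definition follows. It assumes that every occurrence of a scored word contributes its score, which the slide does not state explicitly. The lexicon `THEME_LEXICON`, its words and weights, and the function name `theme_score` are hypothetical illustrations, not the vocabulary used in the study.

```python
from collections import Counter

# Hypothetical theme lexicon (word -> score). The slides do not list the
# actual scored vocabulary, so these entries are placeholders.
THEME_LEXICON = {"struggle": 2.0, "occupation": 1.5, "unity": 1.0}

def theme_score(text, lexicon):
    """Sum the lexicon score of every scored word occurrence in one text."""
    counts = Counter(text.lower().split())
    return sum(score * counts[word] for word, score in lexicon.items())

sample = "Unity in the struggle and unity against occupation"
print(theme_score(sample, THEME_LEXICON))
```

Each text then yields one number per theme, so an author with several texts has a small sample of scores for each theme.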
12
Constructing a Network Across Multiple Themes
Scrub Texts
Construct Theme Scores
Construct Confidence Intervals
Discern Similarity between Nodes
  Binary or Standardized Difference of Means
Create Square Matrix
Draw Network
*Why not ANOVA?
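The "Standardized Difference of Means" option could look like the sketch below. The slides do not give the formula, so this assumes the absolute difference of two authors' mean theme scores divided by an unpooled standard error; the function name `standardized_difference` and the sample scores are hypothetical. How the result would be converted into a similarity value is not specified in the source and is left out.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(scores_a, scores_b):
    """Absolute difference of means divided by its estimated standard error.

    Assumes each author has at least two texts and that the scores are not
    all identical for both authors (otherwise the standard error is zero).
    """
    se = sqrt(variance(scores_a) / len(scores_a)
              + variance(scores_b) / len(scores_b))
    return abs(mean(scores_a) - mean(scores_b)) / se

# Hypothetical theme scores for two authors on one theme.
author_a = [4.0, 6.0, 5.0, 7.0, 3.0]
author_b = [8.0, 10.0, 9.0, 11.0, 7.0]
print(standardized_difference(author_a, author_b))
```

A larger value means the two authors' mean scores on that theme are further apart relative to the spread of their texts.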
13
Confidence Intervals
95% Confidence Interval =
Each Author, Each Theme
Example:
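The formula on this slide did not survive extraction. As a sketch only, one common way to build a 95% confidence interval for an author's mean theme score is shown below. It assumes a normal approximation (a z critical value); with only a handful of texts per author, a t critical value would give a wider interval. The function name `confidence_interval` and the sample scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(scores, level=0.95):
    """Return (low, high) for the mean of `scores`, normal approximation."""
    if len(scores) < 2:
        raise ValueError("need at least two texts to estimate spread")
    # Two-sided critical value, about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(scores) / sqrt(len(scores))
    centre = mean(scores)
    return centre - half_width, centre + half_width

# Hypothetical theme scores for one author across five texts.
low, high = confidence_interval([4.0, 6.0, 5.0, 7.0, 3.0])
print(round(low, 3), round(high, 3))
```

One such interval would be computed for each author on each theme.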
14
Relationship Scores
Each possible pair of authors per theme
Overlapping Confidence Intervals
Disparate Confidence Intervals
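A binary version of the relationship score can be sketched as an interval-overlap test. The scoring values (1 for overlapping, 0 for disparate) and the function names are assumptions made for illustration; the slide does not state the numeric values used.

```python
def intervals_overlap(ci_a, ci_b):
    """True when the closed intervals (low, high) share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def relationship_score(ci_a, ci_b):
    """Assumed binary score: 1 for overlapping intervals, 0 for disparate ones."""
    return 1 if intervals_overlap(ci_a, ci_b) else 0

# Hypothetical confidence intervals for two pairs of authors on one theme.
print(relationship_score((3.6, 6.4), (6.0, 8.0)))  # intervals share [6.0, 6.4]
print(relationship_score((1.0, 2.0), (3.0, 4.0)))  # intervals do not meet
```

Applying this to every pair of authors gives one score per pair for each theme.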
15
Matrix Construction
Multiplication of Scores for each author and each theme
Resultant Square Matrix
Geometric Mean =
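The geometric mean formula on this slide was not preserved. The sketch below assumes the standard definition: for each pair of authors, multiply the per-theme relationship scores and take the k-th root, where k is the number of themes. The function name `combine_themes` and the example matrices are hypothetical.

```python
from math import prod

def combine_themes(theme_matrices):
    """Element-wise geometric mean of equally sized square matrices.

    `theme_matrices` holds one author-by-author matrix of non-negative
    relationship scores per theme.
    """
    k = len(theme_matrices)
    n = len(theme_matrices[0])
    return [
        [prod(m[i][j] for m in theme_matrices) ** (1 / k) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical scores for two authors on two themes.
theme_one = [[1.0, 0.25], [0.25, 1.0]]
theme_two = [[1.0, 1.0], [1.0, 1.0]]
combined = combine_themes([theme_one, theme_two])
print(combined)
```

Under this assumption, a score of 0 on any single theme (for example, a binary score for disparate intervals) makes the combined link between that pair of authors 0.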
16
Themed Network
17
Theme Analysis: Confidence Interval vs Average
Able to look at each theme individually.
Average Rank does not account for connection importance, weighting, or predictors.
Themes are combined.
Can see connections between authors across a combination of themes.
18
Method Comparison
19
Conclusions
Socially Engineered Algorithms involve extensive tradeoffs and decisions by the mathematician that can significantly impact the commander’s decision-making.
Multiple views of the same data are a critical requirement.
Find Linkages in large amounts of data
Find Connections across multiple fields
  Non-Tangible Relationships
Real World: Track / Catch criminals / radical ideologues
Representation of Human Terrain
20
Future Work
Publish method in Journal of Computational and Mathematical Organization Theory
Integration into ORA (Organizational Risk Analyzer) Statistical Software: In use by Intelligence Analysts.
Analysis of change over time
21
Questions?
22
References
Dr. Jaret Brachman. Combating Terrorism Center, USMA.
Dr. Steven Corman. Hugh Downs School of Human Communication, Arizona State University.
http://www.checkpoint-online.ch/CheckPoint/Images/N-HusseinCapture.jpg
http://www.salmac.co.za/profile-writing-arabic.gif
Wasserman, Stanley and Katherine Faust. Social Network Analysis: Methods and Applications. New York: Cambridge University Press, 1994, 4.