Social Networks and Surveillance: Evaluating Suspicion by Association Ryan P. Layfield Dr. Bhavani Thuraisingham Dr. Latifur Khan Dr. Murat Kantarcioglu.

Social Networks and Surveillance: Evaluating Suspicion by Association Ryan P. Layfield Dr. Bhavani Thuraisingham Dr. Latifur Khan Dr. Murat Kantarcioglu The University of Texas at Dallas {layfield, bxt043000, lkhan, muratk}@utdallas.edu

Overview  Introduction ► Our Goal ► System Design ► Social Networks ► Threat Detection ► Correlation Analysis  The Experiment ► Setup ► Current Results ► Issues ► Future Work

Introduction  Automated message surveillance is essential to communication monitoring ► Widespread use of electronic communication ► Exponential data growth ► Impossible to sift through all ‘by hand’  Going beyond basic surveillance ► Identifying groups rather than individuals ► Monitoring conversations rather than messages

Our Goal  Design new techniques and apply existing algorithms to… ► Create a machine-understandable model of existing social networks ► Identify abnormal conversations and behavior ► Monitor a given communications system in real-time ► Continuously learn and adapt to a dynamic environment

System Design  Three major components: ► Social Network Modeler ► Initial Activity Detector ► Correlated Activity Investigator

Social Networks  Individuals engaged in suspicious or undesirable behavior rarely act alone  We can infer than those associated with a person positively identified as suspicious have a high probability of being either: ► Accomplices (participants in suspicious activity) ► Witnesses (observers of suspicious activity)  Making these assumptions, we create a context of association between users of a communication network

Social Networks  Within our model: ► Every node is a unique user ► Every message creates or strengthens a link between nodes  Over time, the network changes ► Frequent communication leads to stronger links ► Intermittent messaging implies weakening social ties  The strength of the link implies how strong an association between individuals is  From this data, we can theoretically identify ► Hubs ► Groups ► Liaisons

Social Networks

Threat Detection  Every message sent is scrutinized in the interest of identifying suspicious communication ► Keywords analysis ► Prior context (i.e. previous message content)  When a detection algorithm yields a strong result, a token is created ► The token is created at the origin and passed to the recipient(s) ► Existing tokens, if any, are cloned instead  The result is a web that potentially reflects the dissemination of suspicious information activity

Correlation Analysis  Future messages with similar suspicious topics are not always identifiable with the same ‘initial’ techniques ► Quick replies ► Pronoun use ► Assumption that recipient is aware of topic  If a token is present at the sender when a message is sent: ► Message token is associated with and new message are analyzed ► If analysis yields a strong match, the token is further cloned and passed to recipient

The Experiment  A rare set of words shared between two or more messages are candidates for keyword analysis, but they are not always easily sifted from ‘noise’  Noise within text-based messages comes in a variety of forms ► Misspelled words ► Unusual word choice ► Incompatible variations of the same language (i.e. British vs. American English) ► Unexpected language  However, we do not want to eliminate potential keywords ► Document names ► Terminology specific to a subject ► ‘Buzz’ words

The Experiment  We proposed an experiment that attempts to eliminate false positives due to noisy data while strengthening and expanding our correlation techniques

Setup  Tools ► Running word ‘rank’ database ► Implementation of word set theory infrastructure ► JAMA Matrix Library  Singular Value Decomposition  Our Approach ► Apply SVD noise filtering based on 100 messages ► Analyze word frequency correlation between current message and prior suspicious messages ► Generate a score based on the results

Setup  Construct a matrix based on the last 100 messages words messages More common Less common

Setup  Decompose and rebuild U VTVT  A Eliminate ‘weak’ singular values

Setup  [Scoring formula not recovered from the slide; its annotations read:] ► Pulled from messages j and k ► ‘Raw’ total score for word w_i ► Pulled from ‘running’ word database ► Counts only intersection of words ► Predefined fixed threshold

Current Results  Method is not currently accurate  Large fluctuations ► Correlation easily swayed by plethora of common words ► Uncommon words not given enough weight

Current Results 1000 messages evaluated, first 100 used to seed word ranks.

Issues  Word frequencies fluctuate wildly during beginning of experiment (0.0 – 10.0+)  Extreme cost for current construction methods and computation  Filtering context limited to recent global history  Affected by large bodies of text

Future Work  Tap potential of existing matrix for further analysis  Adaptive filtering feedback algorithms  Speed improvements to accommodate real-time streams  Flexible communication platform monitoring  Addition of pipe architecture for modular threat detection and correlation
