Authors: S. Volkova, J. Jang, Presenter: Maria Glenski

Presentation transcript:

Misleading or Falsification? Inferring Deceptive Strategies and Types in Online News and Social Media
Authors: S. Volkova, J. Jang
Presenter: Maria Glenski
Data Sciences and Analytics, National Security Directorate, Pacific Northwest National Laboratory
WWW Track on Journalism, Misinformation and Fact-Checking, April 25th, Lyon, France
WWW Track on Journalism, Misinformation and Fact-Checking November 19, 2018

Deceptive News Shared Online

Contributions
Recent work:
  Psycholinguistic analysis across deception types in the news pages (Rashkin et al., 2017)
  Predicting credibility of PolitiFact statements (Rashkin et al., 2017; Wang et al., 2017) and analyzing credibility of tweets (Mitra et al., 2017)
  Models to classify deceptive news types on Twitter (Volkova et al., 2017)
Our approach:
  Focusing on deception types and deception strategies: misleading and falsification
  Verifying model generalizability across domains: news pages, tweets and summary statements
  Qualitatively analyzing writers' intent behind misinformation:
    psycholinguistic signals
    moral foundations
    connotations

Deception Types and Strategies
Deception types:
  Disinformation: false facts spread to deliberately deceive the audience
    vs. Misinformation: conveyed in the honest but mistaken belief that the relayed incorrect facts are true
  Propaganda: a form of persuasion to influence audiences via controlled transmission of deceptive, selectively omitting, and one-sided messages
  Hoax: type of misinformation that aims to deliberately deceive the reader
Deception strategies:
  Misleading: topic changes, irrelevant information, and equivocations
  Falsification: contradictions or distortions

Task Definition
Build generalizable predictive models to differentiate between deception types and strategies in news across domains
  Deception strategies: misleading vs. falsification
  Deception types: propaganda vs. hoax vs. disinformation
[Figure: scale from "More Intent to Deceive" to "Less Intent to Deceive"; labels in slide order: Falsification, Misleading]
[Figure: scale from "More Intent to Deceive" to "Less Intent to Deceive"; labels in slide order: Disinformation, Propaganda, Hoax]

Datasets: Deception Strategies

Domains      Misleading   Falsification
Summaries    616          1,376
News Pages   81           85
Tweets       96           109

Confirmed cases of disinformation summaries from the European Union's East Strategic Communications Task Force: https://euvsdisinfo.eu/ and @EUvsDisinfo
  Falsification: unprovable, no evidence, no proof, no supporting evidence
  Crowdsourcing: pairwise inter-annotator agreement kappa is 0.64 (5 annotators)
Followed URLs in disinformation summaries to collect the original news pages
Queried Twitter public API using SVO and timestamps to extract unique disinfo tweets
Parsed summaries, news pages and tweets using SyntaxNet to extract SVO tuples
  Understand agents and themes of deception
  Contrast connotations/perspectives across deception types
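The slide reports a pairwise inter-annotator agreement of kappa = 0.64 across 5 annotators but does not say which kappa statistic was used. As a minimal sketch, assuming Cohen's kappa averaged over all annotator pairs, the computation could look like the following. The function names and the label values are hypothetical and are not taken from the talk.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout (assumed convention).
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(annotations):
    """Average Cohen's kappa over every pair of annotators.

    `annotations` is a list with one label sequence per annotator.
    Averaging over pairs is an assumption about how "pairwise" agreement
    was aggregated; the slide does not specify it.
    """
    pairs = list(combinations(annotations, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)
```

With five annotators, `mean_pairwise_kappa` averages over ten pairs. Passing one list of "misleading"/"falsification" labels per annotator would give a single agreement figure comparable in kind to the one on the slide.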

Datasets: Deception Types

Domains      Propaganda   Hoaxes   Disinformation
News Pages   17,872       5,297    166
Tweets       3,834        453      205

Collecting propaganda and hoax news pages and tweets:
  Downloaded 17,872 propaganda (ActivistPost) and 5,297 hoax (DCGazette) news pages
  Collected the corresponding propaganda and hoax tweets using public Twitter API
Collecting disinformation news pages and tweets:
  Followed URLs in disinformation summaries to collect the original news pages
  Queried Twitter public API using SVO and timestamps to extract disinformation tweets

Predictive Models and Signals
Machine learning models: MaxEntropy and RandomForest [1]
Neural network models: Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN) [2]
Predictive signals:
  Content: TFIDF, dimensionality reduction, GloVe embeddings [3]
  Style, syntax, complexity and readability: Automated Readability Index (ARI), Flesch-Kincaid readability tests, Coleman-Liau index [4]
  Biased language: intensifiers, dramatic adverbs, assertive, imperative, report verbs
  Moral foundations: care and harm, fairness and cheating, loyalty and betrayal, authority and subversion, purity and degradation
  Psycholinguistic signals: imperative commands, personal pronouns, emotional language, quotations, and inclusions
Callouts on the slide: what is being discussed; how the content is being discussed; lexicons; how emotional and subjective the discussion is
[1] http://scikit-learn.org/stable/
[2] https://keras.io/
[3] https://nlp.stanford.edu/projects/glove/
[4] https://github.com/nltk/nltk_contrib/tree/master/nltk_contrib/readability
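Two of the readability signals named on this slide, the Automated Readability Index and the Coleman-Liau index, are simple functions of character, word, and sentence counts. The sketch below computes both from the published formulas. It does not use the nltk_contrib readability package cited on the slide, and its regex tokenizer and sentence splitter are simplifying assumptions, so scores may differ from those used in the talk.

```python
import re


def text_counts(text):
    """Return (characters, words, sentences) using a naive regex tokenizer.

    Assumption: words are runs of ASCII letters or digits, and sentences are
    separated by '.', '!' or '?'. Real tokenizers handle many more cases.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    chars = sum(len(w) for w in words)
    return chars, len(words), len(sentences)


def automated_readability_index(text):
    """ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    chars, words, sentences = text_counts(text)
    if words == 0 or sentences == 0:
        return 0.0
    return 4.71 * chars / words + 0.5 * words / sentences - 21.43


def coleman_liau_index(text):
    """CLI = 0.0588 * L - 0.296 * S - 15.8.

    L is the number of letters per 100 words and S the number of sentences
    per 100 words. Digits are counted as letters here, which is a
    simplification of this sketch.
    """
    chars, words, sentences = text_counts(text)
    if words == 0:
        return 0.0
    letters_per_100 = 100.0 * chars / words
    sentences_per_100 = 100.0 * sentences / words
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8
```

Each function returns one number per document, so the scores can be appended to a feature vector alongside content features such as TFIDF weights.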

Deception Strategy Classification Results: Misleading vs. Falsification
  The best models are LSTM and MaxEntropy
  Falsification strategy is easier to identify than misleading strategy
  Deceptive strategies are easier to predict in tweets than in summaries and news
  Predictive signals: connotations (summaries), moral foundations, biased language and psycholinguistic cues (news pages), syntax and connotations (tweets)

Deception Type Classification Results: Propaganda vs. Disinformation vs. Hoax
  The best performing model is LSTM
  Disinformation is easier to predict than propaganda or hoaxes
  Deceptive news types (disinformation, propaganda, and hoaxes), unlike deceptive strategies, are more salient and easier to identify in tweets than in news pages
  Predictive signals: content (summaries and tweets)

Connotation Analysis: Background
Identify writers' intent behind digital misinformation by analyzing psycholinguistic signals (moral foundations and connotations) extracted from different types of deceptive news

Connotation Analysis Results: Disinformation
[Figure panels: Writer → agent; Writer → theme]
Implications:
  Quantitatively demonstrate how agents and themes of strategic deception vary across deception types
  Qualitatively identify the hidden agenda of content

Linguistic Realizations of Deception: Misleading vs. Falsification
Significant differences in subjective language and moral foundations
  Misleading statements are more subjective than falsified statements in summaries and news pages but not tweets
  Falsified compared to misleading statements include more:
    Harm+ and Ingroup+ signals in tweets
    Affect terms in tweets
[Figure panel: Tweets]
Implications:
  Build models for factuality assessment without external knowledge
  Improve fact-checking systems by going beyond fake news classification

Summary and Future Work
Predictive signals:
  Content + moral foundations and connotations are more predictive of deception strategies than style and syntax
  Content is the most predictive of deception types
Predictive models:
  LSTMs achieve higher performance compared to ML models
Deception types:
  Disinformation is less difficult to predict compared to hoaxes and propaganda
Deception strategies:
  Falsification strategy is easier to infer than misleading strategy
  Content is the most predictive of misleading strategy across all domains
Callouts on the slide: how emotional and subjective the discussion is; what is being discussed
Domains:
  Deception types, unlike deception strategies, are easier to identify in tweets than in news pages

Future Work
  Multilingual, multimodal (text and images) deception classification
  Misinformation propagation and influence (deception types, languages)
  Reactions to deceptive news across platforms: Reddit and Twitter
References:
  Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. S. Volkova, K. Shaffer, J. Jang and N. Hodas. ACL 2017.
  Truth of Varying Shades: On Political Fact-Checking and Fake News. H. Rashkin, E. Choi, J. Jang, Y. Choi and S. Volkova. EMNLP 2017.
  Fishing for Clickbaits in Social Images and Texts with Linguistically-Infused Neural Network Models. M. Glenski, E. Ayton, D. Arendt and S. Volkova. Proceedings of Google Clickbait Workshop. 2017.

Svitlana Volkova
Senior Research Scientist
Data Sciences and Analytics Group
Computational and Statistical Analytics Division
svitlana.volkova@pnnl.gov
http://www.cs.jhu.edu/~svitlana/