Quantifying Deception Propagation on Social Networks

Quantifying Deception Propagation on Social Networks
Maria Glenski † and Svitlana Volkova Data Sciences and Analytics Group, National Security Directorate † Computer Science and Engineering, University of Notre Dame NOTE: Print this poster file at 100% SCALE to result in a physical print measuring 48” wide x 36” tall. All type size notations shown above are based on the final printed size of the poster. • Contact Digital Duplicating ( , to order poster printing and finishing services for your completed poster design. • Remember to have your poster cleared for public display/distribution through the ERICA Information Release system ( • Sidebar “About PNNL” box is considered optional, and can be removed if space is needed for technical content. Data Collection Inferring User Attributes Predicted User Demographics 282 Twitter accounts previously identified1 as spreading verified (V) or deceptive news. Deceptive news sources are categorized as clickbait (CB), conspiracy (CS), propaganda (P), or disinformation (D). Twitter data: collected all tweets posted in 2016 directly retweeting or mentioning verified or deceptive sources. 145,688 users with ≥ 5 English-interactions with deceptive sources. Each user classified as person or organization using the Humanizr classifier2. 66,171 person-users with ≥200 public tweets available. Learned attribute-specific CNN models initialized with 200-dim GLOVE embeddings using annotated data3: AUC ROC Gender 0.89 Age 0.72 Income 0.72 Education 0.76 Inferred attributes of each user from their 200 most recent tweets. Summary of attributes and the number of users classified as each demographic by our models: Attribute Class N Gender M : Male 63,259 F : Female 2,912 Age Y : 24 and younger 3,336 O : 25 and older 62,835 Income B : Below $35k 12,885 A : At least $35k 53,286 Education H : High School 11,807 D : Bachelor’s Degree 54,364 V CB CS P D % Person 91.47 92.33 91.50 93.03 89.76 # Users 83,785 8,367 16,520 57,314 30,456 Intent to Deceive V CB CS P D . . . Dense Layer (100 units) Convolutional Layer (100 units) Embedding Layer (200 units) Word Sequence Input Attribute Class Activation Layer (soft max) CNN architecture Distributions of each attribute within each population and all users overall: % Population Population Type # sources # tweets # users posting # users mentioned V 182 6,567,002 1,423,227 390,164 CB 11 40,347 19,361 6,002 CS 13 126,246 35,799 9,171 P 26 609,251 233,799 34,532 D 50 3,487,732 292,437 82,638 Total 282 10,819,357 1,784,655 471,967 Who is more likely to propagate? Is there some group of users responsible for most of the propagation? Mann Whitney U tests comparing binary features of whether users shared from sources of each type at least once, find who is more likely (p < 0.01) to propagate from each type: We look at the Palma Ratios for the distributions of retweet frequencies of user-source pairs within each population where Palma Ratio= 𝑇𝑜𝑝 10% 𝐵𝑜𝑡𝑡𝑜𝑚 40% . If all users share an equal number of tweets, Palma Ratio= 1 4 The most active 10% of younger users retweeting disinformation sources share times what the least active 40% do. Propagation of disinformation sources is more often spread by the same top 10% of users with higher incomes than lower incomes. V CB CS P D Gender Age Income Education Women retweet conspiracy theory more evenly than men. #1 Overall, disinformation is propagated most unevenly, followed by conspiracy theory. The biggest differences in behavior also occur with disinformation and conspiracy theory sources. Key M: Male F: Female There is always a group of users sharing more than others. All Palma Ratios > 0.25 Users in different age brackets have the greatest difference in Palma Ratios for verified sources. Y: 24 and younger O: 25 and older V CB CS P D M-F 1.39 0.32 5.14 0.54 7.03 Y-O -2.30 -0.35 -4.30 -0.67 23.19 B-A 0.72 2.75 1.02 26.50 H-D -0.40 0.51 2.86 24.86 V CB CS P D M/F 1.45 1.23 1.84 1.18 1.39 Y/O 0.49 0.80 0.35 0.81 2.28 B/A 1.07 1.43 1.46 1.3 3.95 H/D 0.91 1.30 1.48 1.21 3.33 B: Below $35k A: At least $35k H: High school D: Bachelor’s Degree Comparing populations using deltas (left) and ratios (right) of the Palma Ratios of each, by attribute. Volkova, S., et al. "Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter." ACL Volkova, S. and Bachrach, Y. Inferring Perceived Demographics from User Emotional Tone and User-Environment Emotional Contrast. ACL 2016. 2. McCorriston, J., et al. "Organizations Are Users Too: Characterizing and Detecting the Presence of Organizations on Twitter." ICWSM 2015. This research was conducted under the Laboratory Directed Research and Development Program at Pacific Northwest National Laboratory, a multi-program national laboratory operated by Battelle for the U.S. Department of Energy

Quantifying Deception Propagation on Social Networks

Similar presentations

Presentation on theme: "Quantifying Deception Propagation on Social Networks"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Quantifying Deception Propagation on Social Networks

Similar presentations

Presentation on theme: "Quantifying Deception Propagation on Social Networks"— Presentation transcript:

Similar presentations

About project

Feedback