Disinformation on the Web: Impact, Characteristics and Detection of Wikipedia Hoaxes

Presentation transcript:

Disinformation on the Web: Impact, Characteristics and Detection of Wikipedia Hoaxes. Srijan Kumar (Univ. of Maryland), Robert West (Stanford Univ.), Jure Leskovec (Stanford Univ.)

Web: Source of information

Web: Source of false information

Types of false information. Misinformation: honest mistake. Disinformation: deliberate lie to mislead. Hoax: “deliberately fabricated falsehood made to masquerade as truth” (Wikipedia).

Why Wikipedia? “The free encyclopedia that anyone can edit.” Freely accessible; large reach; major source of information for many; easy to add (false) information.

Hoaxes on Wikipedia

Today’s talk. Impact of hoaxes: Quantify their impact? Characteristics of hoaxes: What are the hoaxes like? Detection of hoaxes: Can we find them?

Data: Wikipedia hoaxes. Hoax articles vs. hoax facts. 21,218 hoax articles. Hoax lifecycle: [figure not preserved]

Today’s talk. Impact of hoaxes: Quantify their impact? Characteristics of hoaxes: What are the hoaxes like? Detection of hoaxes: Can we find them?

Impact of hoaxes. “The worst hoaxes are those which (a) last for a long time, (b) receive significant traffic, (c) are relied upon by credible news media.” (Jimmy Wales on Quora)

Impact of hoaxes. “The worst hoaxes are those which (a) last for a long time” [figure not preserved; surviving value: 0.99]

Impact of hoaxes. “The worst hoaxes are those which (b) receive significant traffic” [figure not preserved; surviving values: 10, 100, 500]

Impact of hoaxes. “The worst hoaxes are those which (c) are relied upon by credible news media”: 1.08 active inlinks.

Today’s talk. Impact of hoaxes: Most hoaxes are caught soon, but some hoaxes are impactful! Characteristics of hoaxes: What are the hoaxes like? Detection: Can we find them?

Hoax: Successful hoax (passes patrol, survives for a month, viewed frequently); Failed hoax (flagged and deleted during patrol). Non-hoax: Legitimate articles (never flagged); Wrongly flagged (temporarily flagged).
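
The four article classes on this slide can be read as a small labeling rule. The following is a minimal sketch only: the `Article` fields, the 30-day cutoff for "a month", and the daily-view threshold are all assumptions made for illustration, since the slide does not give numeric thresholds.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds: the slide says "a month" and "viewed frequently"
# but gives no numbers, so both values below are assumptions.
MIN_SURVIVAL_DAYS = 30
MIN_DAILY_VIEWS = 5.0


@dataclass
class Article:
    is_hoax: bool          # article was eventually deleted as a hoax
    passed_patrol: bool    # not flagged during the initial patrol
    days_survived: float   # days between patrol and flagging (hoaxes only)
    daily_views: float     # average page views per day
    ever_flagged: bool     # carried a hoax flag at some point


def label(a: Article) -> Optional[str]:
    """Map an article to one of the four classes named on the slide."""
    if a.is_hoax:
        if not a.passed_patrol:
            return "failed hoax"
        if a.days_survived >= MIN_SURVIVAL_DAYS and a.daily_views >= MIN_DAILY_VIEWS:
            return "successful hoax"
        return None  # hoaxes between the two extremes are not named on the slide
    return "wrongly flagged" if a.ever_flagged else "legitimate"
```

Hoaxes that pass patrol but are short-lived or rarely viewed fall into neither hoax class here, which is why the function returns `None` for them.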

Characteristics of hoaxes. Appearance: how the article looks. Link-network: how the article connects. Support: how other articles refer to it. Editor: how the article creator looks.

Characteristics of hoaxes. Appearance: how the article looks. Link-network: how the article connects. Support: how other articles refer to it. Editor: how the article creator looks. Appearance features: plain-text length; plain-text-to-markup ratio; wiki-link density; web-link density. Hoax articles are longer, but they consist mostly of plain text and have fewer web and wiki links.
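
As a rough illustration of the four appearance features, the sketch below computes them from raw wikitext with regular expressions. The exact definitions are assumptions (length in characters, densities as links per word, ratio as plain-text characters over remaining markup characters), and the markup stripping is deliberately crude; the function names are hypothetical and not taken from the talk.

```python
import re

WIKI_LINK = re.compile(r"\[\[[^\[\]]*\]\]")      # [[Target]] or [[Target|label]]
WEB_LINK = re.compile(r"https?://[^\s\]|}<]+")   # bare or bracketed external URLs


def strip_markup(wikitext: str) -> str:
    """Very rough markup removal; real wikitext needs a proper parser."""
    text = re.sub(r"<ref[^>]*>.*?</ref>|<[^>]+>", " ", wikitext, flags=re.S)
    text = re.sub(r"\{\{[^{}]*\}\}", " ", text)                        # non-nested templates
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]]*)\]\]", r"\1", text)  # keep wiki-link labels
    text = re.sub(r"\[https?://\S+\s*([^\]]*)\]", r"\1", text)         # keep web-link labels
    text = re.sub(r"'{2,}|={2,}", "", text)                            # bold/italic, headings
    return re.sub(r"\s+", " ", text).strip()


def appearance_features(wikitext: str) -> dict:
    """Compute the four appearance features under the assumed definitions."""
    plain = strip_markup(wikitext)
    n_words = max(len(plain.split()), 1)
    markup_chars = max(len(wikitext) - len(plain), 1)
    return {
        "plain_text_length": len(plain),
        "plain_to_markup_ratio": len(plain) / markup_chars,
        "wiki_link_density": len(WIKI_LINK.findall(wikitext)) / n_words,
        "web_link_density": len(WEB_LINK.findall(wikitext)) / n_words,
    }
```

Under these definitions, a long article with little markup and few links gets a high plain-text-to-markup ratio and low link densities, which is the pattern the slide attributes to hoaxes.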

Characteristics of hoaxes. Appearance: hoaxes mostly have text and few references. Link-network: how the article connects. Support: how other articles refer to it. Editor: how the article creator looks. CC = 0: incoherent article. CC > 0: coherent article. Legitimate articles are more coherent than successful hoaxes.
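
The slide does not expand "CC". Assuming it denotes a clustering coefficient over the articles that a given article links to, a minimal sketch looks like this. The `out_links` mapping is a hypothetical input format, and treating links between neighbors as undirected is an assumption.

```python
from itertools import combinations


def ego_clustering_coefficient(article, out_links):
    """Fraction of pairs of linked-to articles that also link to each other.

    out_links: dict mapping an article title to the set of titles it links to.
    Links between neighbors are treated as undirected (an assumption).
    """
    neighbors = sorted(out_links.get(article, set()) - {article})
    if len(neighbors) < 2:
        return 0.0
    pairs = list(combinations(neighbors, 2))
    connected = sum(
        1
        for u, v in pairs
        if v in out_links.get(u, set()) or u in out_links.get(v, set())
    )
    return connected / len(pairs)
```

An article whose outgoing links point to mutually unrelated pages scores 0 (incoherent in the slide's terms); any link among the linked-to pages pushes the score above 0.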

Characteristics of hoaxes. Appearance: hoaxes mostly have text and few references. Link-network: hoaxes have incoherent wikilinks. Support: how other articles refer to it. Editor: how the article creator looks. Support features: number of prior mentions; time since first mention; creator of first mention. Mentions of hoaxes are fewer, more recently created, and mostly created by IP addresses or by the article creator.
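
The three support features can be sketched as follows. The input format (a list of `(timestamp, editor)` pairs for mentions of the article title in other articles) and the returned field names are assumptions for illustration; anonymous editors are assumed to appear under their IP address as user name.

```python
import ipaddress
from datetime import datetime


def is_ip(username: str) -> bool:
    """True if the user name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(username)
        return True
    except ValueError:
        return False


def support_features(created_at: datetime, creator: str, mentions) -> dict:
    """mentions: iterable of (timestamp, editor) for mentions in other articles."""
    # Only mentions that existed before the article itself was created count.
    prior = sorted(m for m in mentions if m[0] < created_at)
    if not prior:
        return {"prior_mentions": 0, "days_since_first_mention": 0.0,
                "first_mention_by": None}
    first_time, first_editor = prior[0]
    if first_editor == creator:
        who = "article creator"
    elif is_ip(first_editor):
        who = "ip address"
    else:
        who = "other editor"
    return {
        "prior_mentions": len(prior),
        "days_since_first_mention": (created_at - first_time).total_seconds() / 86400,
        "first_mention_by": who,
    }
```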

Characteristics of hoaxes. Appearance: hoaxes mostly have text and few references. Link-network: hoaxes have incoherent wikilinks. Support: hoaxes have few, recent, suspicious mentions. Editor: how the article creator looks. Editor features: creator’s age; creator’s experience. Hoax creators are more recently registered and have less editing experience.
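
A minimal sketch of the two editor features, assuming "age" means time since account registration and "experience" means the number of edits made before the article was created. Both readings, and the function and field names, are assumptions.

```python
from datetime import datetime


def editor_features(registered_at: datetime, article_created_at: datetime, edit_times) -> dict:
    """edit_times: iterable of timestamps of all edits by the article creator."""
    age_days = (article_created_at - registered_at).total_seconds() / 86400
    # Count only edits made before the article in question was created.
    prior_edits = sum(1 for t in edit_times if t < article_created_at)
    return {"creator_age_days": age_days, "creator_prior_edits": prior_edits}
```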

Today’s talk. Impact of hoaxes: Most hoaxes are caught soon, but some hoaxes are impactful! Characteristics of hoaxes: Hoaxes are different from non-hoaxes in many respects! Detection: Can we find them?

Detection of hoaxes. Is an article a hoax? AUC = 98% (editor and network features). Is an article flagged as hoax really one? AUC = 86% (editor and support features). Will a hoax get past patrol? AUC = 71% (appearance features).
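
For readers unfamiliar with the metric, AUC (area under the ROC curve) equals the probability that a randomly chosen positive example receives a higher classifier score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; it illustrates the metric only and is not the evaluation code behind the numbers on the slide.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (e.g. hoax) or 0 (non-hoax); scores: classifier outputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.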

We discovered new hoaxes! Flagged by us, deleted by Wikipedia administrators. Steve Moertel (popcorn entrepreneur): 6 years 11 months! Maurice Foxell (children’s book author and Knight Commander of the Royal Victorian Order): 1 year 7 months! More at cs.umd.edu/~srijan/hoax

Human guessing experiment. What makes a hoax credible? Can readers identify hoaxes? 64 public successful hoaxes, each paired with similar legitimate articles: 320 random hoax/non-hoax pairs x 10 raters on Mechanical Turk. Results: Random 50%; Human 66%; Classifier 86%. If a hoax “looks” like a genuine Wikipedia article, it is assumed to be credible. Accurate detection needs non-appearance features.
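
The accuracy figures in this experiment are fractions of hoax/non-hoax pairs in which the hoax was correctly picked out. A minimal sketch of that bookkeeping, assuming each pair's ratings are stored as booleans (the `judgments` format is hypothetical):

```python
def guessing_accuracy(judgments):
    """judgments: dict mapping pair id -> list of bools,
    True when a rater picked the actual hoax in that pair."""
    all_votes = [v for votes in judgments.values() for v in votes]
    overall = sum(all_votes) / len(all_votes)
    per_pair = {pid: sum(votes) / len(votes) for pid, votes in judgments.items()}
    return overall, per_pair
```

Because every pair contains exactly one hoax, random guessing yields an expected accuracy of 50%, which is the baseline the slide compares against.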

Conclusion. Impact of hoaxes: Most hoaxes are caught soon, but some hoaxes are impactful! Characteristics of hoaxes: Hoaxes are different from non-hoaxes in many respects! Detection: Non-appearance features are important to detect hoaxes!

Thanks! srijan@cs.umd.edu cs.umd.edu/~srijan/hoax hoax articles vs hoax facts