Review on Fact Checking and Automatic Fact Checking Systems
Deep Learning Research & Application Center 17 October 2017 Claire Li
Fact Checking
Factual claims are those that can be verified, e.g. "The room measures ten feet by twelve feet."
Fact checking is a form of knowledge-based news content verification: it assigns a truth value (possibly a degree of truth) to a factual claim made in a particular context.
Important features include context, time, speaker, multiple sources (URLs), evidence, etc.
Fact-checkers assess factual claims by investigating relevant data and documents, then publish their verdicts.
An automatic fact-checking system:
- pre-restricts the task to claims that can be fact-checked objectively (the scope of the task), for example:
  - spotting check-worthy factual claims (related publications: [1], [2])
  - verdicting check-worthy factual claims automatically (related publications: [1], [2]; websites: fullfact.org)
- uses hybrid technologies: deep learning approaches + reasoning techniques over a world knowledge base
- integrates existing tools
Unsuitable claims for the task of automatic fact-checking [2]
- assessing causal relations, e.g. whether a statistic should be attributed to a particular law
- concerning the future, e.g. speculation about oil prices
- not concerning facts, e.g. whether a politician supports certain policies (opinions, beliefs)
- statements whose verdict relies on data not available online, e.g. requiring personal communications
- and more…
Automatic Fact-checking System
Pipeline: Monitor Model → Claim Spotting Model → Claim Verdict Model → Create & publish
- Monitor Model: collect information and data; extract natural-language sentences from textual/audio sources
- Claim Spotting Model: separate factual claims from opinions, beliefs, hyperbole, and questions
- Claim Verdict Model: analyze claims and extract evidence; match claims with evidence; validation and explanations
- Create & publish: publish the verdicts
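The four stages above can be sketched as a minimal pipeline. All function bodies below are toy stand-ins of our own invention (a real system would plug an NLP model into each stage):

```python
def monitor(source_text):
    """Monitor Model: split raw text into candidate sentences (toy splitter)."""
    return [s.strip() for s in source_text.split(".") if s.strip()]

def spot_claims(sentences):
    """Claim Spotting Model: keep sentences that look factual.
    Toy heuristic: contains a digit or a measurement/statistic word."""
    markers = ("feet", "percent", "million")
    return [s for s in sentences
            if any(ch.isdigit() for ch in s)
            or any(m in s.lower() for m in markers)]

def verdict(claim):
    """Claim Verdict Model: stand-in that returns a placeholder label."""
    return (claim, "unverified")

def publish(verdicts):
    """Create & publish: render the results."""
    return [f"{c}: {v}" for c, v in verdicts]

text = "The room measures ten feet by twelve feet. I think the room is cozy."
claims = spot_claims(monitor(text))
print(publish([verdict(c) for c in claims]))
# → ['The room measures ten feet by twelve feet: unverified']
```

Note how the opinion sentence is dropped at the spotting stage, matching the Monitor/Spotting division of labor described above.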
Spot claims worth checking [1][2]
Match statements against already fact-checked claims (a k-nearest-neighbor / semantic-similarity problem between statements):
- Create a Google Custom Search Engine over the claim corpus [4]
- Use Google's structured data with ClaimReview markup
- From fact-checking websites, construct a publicly available local database of fact-checked claims:
  - Hoax-Slayer: archives of fact-checked claims
  - PolitiFact: more than 6,000 fact-checks
  - Google's Schema.org: more than 7,000 fact-checks with ClaimReview markup
  - Channel 4, The Washington Post
- Calculate semantic similarity between sentences based on word2vec
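The word2vec-based matching step can be sketched as follows. The tiny 3-d vectors below are invented for illustration; a real system would load pretrained word2vec embeddings and index a large claim database:

```python
import math

# Toy "pretrained" vectors (made up); real word2vec vectors are 100-300d.
VECS = {
    "earth": [0.9, 0.1, 0.0], "world": [0.85, 0.15, 0.05],
    "flat": [0.1, 0.9, 0.0], "round": [0.1, 0.8, 0.2],
    "is": [0.3, 0.3, 0.3], "the": [0.3, 0.3, 0.3],
}

def embed(sentence):
    """Average the word vectors of the in-vocabulary words."""
    words = [w for w in sentence.lower().split() if w in VECS]
    return [sum(VECS[w][i] for w in words) / len(words) for i in range(3)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Nearest-neighbor lookup of a new statement in the fact-checked corpus.
checked = ["The world is flat"]
new_claim = "the earth is flat"
best = max(checked, key=lambda c: cosine(embed(c), embed(new_claim)))
print(best, round(cosine(embed(best), embed(new_claim)), 2))
```

The new statement matches the already-checked claim with high cosine similarity even though the surface words differ ("world" vs. "earth"), which is exactly what the semantic-similarity step buys over exact string matching.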
ClaimReview as a subtype of Review.
"A fact-checking review of claims made in some creative work." claimReviewed as a property of ClaimReview. "A short summary of the specific claims reviewed in a ClaimReview." author property on Review to indicate the organization behind the review. claimReviewSiteLogo on the (Claim)Review the fact-checking organization's logo. itemReviewed property the document that carries the claims being reviewed (which could include as shown here, offline newspaper articles). rating how the claim was judged by the article
<script type="application/ld+json">
{ " "ClaimReview", "datePublished": " ", "url": " "author": { "Organization", "url": " "sameAs": " "claimReviewed": "More than 3,000 homicides were committed by \"illegal aliens\" over the past six years.", "reviewRating": { "Rating", "ratingValue": 1, "bestRating": 6, "worstRating": "1", "alternateName": "False", "image": " "itemReviewed": { "CreativeWork", "Person", "name": "Rich Perry", "jobTitle": "Former Governor of Texas", "image": " "datePublished": " ", "name": "The St. Petersburg Times interview [...]" }}</script>
<script type="application/ld+json">
{ " "ClaimReview", "datePublished": " ", "url": " "itemReviewed":{ "CreativeWork", "author": { "Organization", "name": "Square World Society", "sameAs": "datePublished": " “}, "claimReviewed": "The world is flat", "author":{ "name": "Example.com science watch“}, "reviewRating": { "Rating", "ratingValue": "1", "bestRating": "5", "worstRating": "1", "alternateName" : "False“} }</script>
Spot claims worth checking
Identify new factual claims in fresh text that have not been fact-checked before.
Use machine learning algorithms to detect "check-worthy claims" (related publications: [1], [2]).
ClaimBuster: a platform that scores political sentences by how check-worthy they are. It:
- uses a human-labeled dataset of check-worthy factual claims from U.S. general election debate transcripts
- learns from the labeled check-worthy sentences, identifies features they have in common, and then looks for those features in new sentences
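A minimal sketch of this learn-from-labels idea, using Naive-Bayes-style word statistics plus one hand-crafted feature. The tiny training set below is invented; ClaimBuster itself trains a richer classifier on labeled debate transcripts:

```python
from collections import Counter
import math

# Invented placeholder labels: 1 = check-worthy factual claim, 0 = not.
TRAIN = [
    ("Unemployment fell to 4.9 percent last year", 1),
    ("Crime rates have doubled since 2010", 1),
    ("We had over 3000 homicides in six years", 1),
    ("I believe our best days are ahead", 0),
    ("Thank you all for coming tonight", 0),
    ("We will make this country great", 0),
]

def train(data):
    """Count word occurrences per class."""
    counts = {0: Counter(), 1: Counter()}
    for sent, label in data:
        counts[label].update(sent.lower().split())
    return counts

def checkworthy_score(sentence, counts):
    """Naive-Bayes-style log-odds that a sentence is check-worthy."""
    score = 0.0
    for w in sentence.lower().split():
        p1 = (counts[1][w] + 1) / (sum(counts[1].values()) + 1)  # add-1 smoothing
        p0 = (counts[0][w] + 1) / (sum(counts[0].values()) + 1)
        score += math.log(p1 / p0)
    if any(ch.isdigit() for ch in sentence):  # numbers often signal factual claims
        score += 1.0
    return score

counts = train(TRAIN)
print(checkworthy_score("Homicides doubled since 2010", counts))
print(checkworthy_score("I believe in this great country", counts))
```

The statistical claim scores higher than the opinion, illustrating how shared features of labeled check-worthy sentences transfer to new text.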
Claim Verdict Model: information retrieval from free, open-source search engines.
Given the spotted claims, search for documents containing relevant fact checks or evidence (a ranking and classification problem).

Apache Lucene (Java, cross-platform):
- fuzzy search: roam~0.8 finds terms similar in spelling to "roam"
- proximity query: "barack michelle"~10 (terms within 10 words of each other)
- range query: title:{Aida TO Carmen}
- phrase query: "new york"
- used by Infomedia, Bloomberg, and Twitter's real-time search

Apache Solr (better for text search) and Elasticsearch (better for complex time-series search and aggregations); both are built on top of Lucene:
- basic query: text:obama — all docs whose text field contains "obama"
- phrase query: text:"obama michelle"
- proximity query: text:"big analytics"~1 matches both "big analytics" and "big data analytics"
- Boolean query: solr AND search OR facet NOT highlight
- range query: age:[18 TO 30]
- used by Netflix, eBay, Instagram, and Amazon CloudSearch
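The query types listed above can be assembled programmatically before being sent to Lucene/Solr. The helper names below are our own; the strings they emit follow standard Lucene query syntax:

```python
def phrase(field, words):
    """Phrase query: all words adjacent, in order."""
    return f'{field}:"{words}"'

def proximity(field, words, distance):
    """Proximity query: words within `distance` positions of each other."""
    return f'{field}:"{words}"~{distance}'

def fuzzy(field, term, similarity=None):
    """Fuzzy query: terms similar in spelling to `term`."""
    suffix = "" if similarity is None else str(similarity)
    return f"{field}:{term}~{suffix}"

def range_query(field, lo, hi, inclusive=True):
    """Range query: [a TO b] inclusive, {a TO b} exclusive."""
    open_, close = ("[", "]") if inclusive else ("{", "}")
    return f"{field}:{open_}{lo} TO {hi}{close}"

print(proximity("text", "big analytics", 1))  # text:"big analytics"~1
print(range_query("age", 18, 30))             # age:[18 TO 30]
print(fuzzy("text", "roam", 0.8))             # text:roam~0.8
```

Building queries through helpers like these keeps field names and escaping in one place when the verdict model fires many evidence-retrieval queries per claim.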
Claim Verdict Model [1][2]
Pipeline recap: Monitor Model → Claim Spotting Model → Claim Verdict Model → Create & publish.
The verdict model uses LSTMs to classify each claim into one of six labels: True, Mostly true, Half true, Half false, Mostly false, False.
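To make the LSTM component concrete, here is a single LSTM time step implemented from scratch in NumPy. The weights are random and the toy sizes are our own choices; in the verdict model the final hidden state would feed a 6-way softmax over the labels above:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. W: (4h, d), U: (4h, h), b: (4h,)."""
    z = W @ x + U @ h + b
    hs = h.shape[0]
    i, f, o, g = z[:hs], z[hs:2*hs], z[2*hs:3*hs], z[3*hs:]   # gates
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    c_new = sig(f) * c + sig(i) * np.tanh(g)   # forget old, write new
    h_new = sig(o) * np.tanh(c_new)            # expose gated output
    return h_new, c_new

d, hdim = 8, 4                       # toy embedding and hidden sizes
W = rng.normal(size=(4 * hdim, d))
U = rng.normal(size=(4 * hdim, hdim))
b = np.zeros(4 * hdim)

h, c = np.zeros(hdim), np.zeros(hdim)
for _ in range(5):                   # run over a 5-token "sentence"
    x = rng.normal(size=d)           # stand-in for a word embedding
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)                       # (4,)
```

In practice one would use a deep-learning framework's LSTM layer rather than this hand-rolled cell; the sketch only shows the recurrence the slides refer to.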
Claim Verdict Model: extraction of evidence
Claim Verdict Model: Claim Validation
True, mostly true, half true, half false, mostly false, false
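This six-point scale maps naturally onto the numeric reviewRating used in the ClaimReview markup shown earlier. The mapping below is our own illustration (1 = worst, 6 = best), not a standard:

```python
# Six-point verdict scale, ordered from worst to best.
SCALE = ["False", "Mostly false", "Half false",
         "Half true", "Mostly true", "True"]

def to_rating(label):
    """Build a ClaimReview-style reviewRating dict for a verdict label."""
    return {
        "@type": "Rating",
        "ratingValue": SCALE.index(label) + 1,  # 1 = worst, 6 = best
        "bestRating": 6,
        "worstRating": 1,
        "alternateName": label,
    }

print(to_rating("Mostly true")["ratingValue"])  # 5
```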
Related works
- Computational Fact Checking from Knowledge Networks, PLoS ONE, 2015
- Toward Automated Fact-Checking: Detecting Check-worthy Factual Claims by ClaimBuster, 2017
- Fully Automated Fact Checking Using External Sources, 2017
- ClaimBuster: The First-ever End-to-end Fact-checking System, 2017
- Truth of Varying Shades: Analyzing Language in Fake News and Political Fact-Checking, EMNLP 2017
References
[1] Sarah Cohen, Chengkai Li, Jun Yang, and Cong Yu. Computational journalism: A call to arms to database researchers. In Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2011, pages 148–151.
[2] Fact Checking: Task definition and dataset construction. In Proceedings of the ACL Workshop on Language Technologies and Computational Social Science, Baltimore, MD.
[3] N. Hassan, B. Adair, J. T. Hamilton, C. Li, M. Tremayne, J. Yang, and C. Yu. The quest to automate fact-checking. In Computation + Journalism Symposium, 2015.
[4] Creating a Custom Search Engine with configuration files.