
FaitCrowd: Fine Grained Truth Discovery for Crowdsourced Data Aggregation
Fenglong Ma (1), Yaliang Li (1), Qi Li (1), Minghui Qiu (2), Jing Gao (1), Shi Zhi (3), Lu Su (1), Bo Zhao (4), Heng Ji (5), Jiawei Han (3)
Presenter: Jing Gao
(1) SUNY Buffalo; (2) Singapore Management University; (3) University of Illinois Urbana-Champaign; (4) LinkedIn; (5) Rensselaer Polytechnic Institute

Which of these square numbers also happens to be the sum of two smaller square numbers? Options: 16, 25, 36, 49. https://www.youtube.com/watch?v=BbX44YSsQ2I

A Straightforward Aggregation Method: Voting/Averaging. Take the value that is claimed by the majority of the sources (users), or compute the mean of all the claims. Limitation: this ignores source reliability (user expertise). Source reliability is crucial for finding the true fact, but it is unknown.
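The two baseline aggregators can be sketched in a few lines. This is a minimal illustration rather than code from the presentation; the function names and the five example claims are hypothetical.

```python
from collections import Counter
from statistics import mean


def majority_vote(claims):
    """Return the most frequently claimed value.

    Ties are broken in favour of the value that appears first.
    """
    return Counter(claims).most_common(1)[0][0]


def average(claims):
    """Return the arithmetic mean of numeric claims."""
    return mean(claims)


# Hypothetical claims from five users about one object.
claims = [25, 25, 36, 25, 49]
print(majority_vote(claims))
print(average(claims))
```

Both functions weight every user equally, which is exactly the limitation noted on the slide: a single expert cannot outvote several unreliable users.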

[Figure: claims about an object from Source 1, Source 2, Source 3, Source 4 and Source 5 are combined by aggregation.]

Truth Discovery Principle: learn users' reliability degrees and discover trustworthy information (i.e., the truths) from conflicting data provided by various users on the same object. A user is reliable if they provide many pieces of true information. A piece of information is likely to be true if it is provided by many reliable users.
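The mutual dependence between reliability and truth suggests an iterative procedure. The sketch below is a generic illustration of that principle, not FaitCrowd and not any specific baseline from the talk: the function name, the smoothed error rate and the negative-log weight update are all assumptions chosen for the example.

```python
import math
from collections import defaultdict


def truth_discovery(answers, iterations=10):
    """Alternate between estimating truths and user weights.

    answers maps (user, obj) -> claimed value.
    Returns (truths, weights).
    """
    users = {u for u, _ in answers}
    objects = {o for _, o in answers}
    weights = {u: 1.0 for u in users}  # start from plain voting
    truths = {}
    for _ in range(iterations):
        # Step 1: a value is likely true if reliable users claim it.
        for o in objects:
            scores = defaultdict(float)
            for (u, obj), value in answers.items():
                if obj == o:
                    scores[value] += weights[u]
            truths[o] = max(scores, key=scores.get)
        # Step 2: a user is reliable if they claim many true values.
        for u in users:
            claimed = [(o, v) for (usr, o), v in answers.items() if usr == u]
            wrong = sum(1 for o, v in claimed if truths[o] != v)
            # Smoothing keeps the logarithm finite (assumed update rule).
            error = (wrong + 1) / (len(claimed) + 2)
            weights[u] = -math.log(error)
    return truths, weights


# Hypothetical data: u1 is always right; u2 and u3 are each right on half
# of q1..q8 and agree with each other (wrongly) on q9.
answers = {}
for i in range(1, 9):
    obj = f"q{i}"
    answers[("u1", obj)] = "A"
    answers[("u2", obj)] = "A" if i % 2 else "B"
    answers[("u3", obj)] = "B" if i % 2 else "A"
answers[("u1", "q9")] = "A"
answers[("u2", "q9")] = "B"
answers[("u3", "q9")] = "B"

truths, weights = truth_discovery(answers)
print(truths["q9"], weights)
```

In this toy data, plain majority voting picks "B" for q9, while the iterative estimate settles on "A" once u1 has earned a weight larger than u2 and u3 combined.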

Existing Work on Truth Discovery. Existing methods assign a single expertise (reliability degree) to each user (source). [Figure labels: Expertise; Barack Obama, Albert Einstein, Michael Jackson.]

Example: Existing Truth Discovery Methods. Input: question set, user set, answer set. Output: users' expertise and truths. Estimated expertise: u1 = 5.00E-11, u2 = 0.961, u3 = 3.989. [Tables: answers of users u1, u2, u3 to questions q1 to q6; estimated truths for q1 to q6; ground truths for q1 to q6.]

Overview of Our Work. Goal: learn fine-grained (topic-level) user expertise and the truths from conflicting crowd-contributed answers. Example topics: Politics, Physics, Music.

Example: Our Model. Input: question set, user set, answer set, and question content (words). Output: questions' topics, topic-level users' expertise, and truths. Questions' topics: K1 = {q1, q2, q3}; K2 = {q4, q5, q6}. Topic-level expertise (users u1, u2, u3): K1: 2.34, 2.70E-4, 1.00; K2: 1.30E-4, 2.35. [Tables: answers and words (a to f) of users u1, u2, u3 for questions q1 to q6; estimated truths for q1 to q6; ground truths for q1 to q6.]

FaitCrowd Model Overview. FaitCrowd jointly models question content and users' answers by introducing latent topics. Modeling question content helps estimate reasonable user reliability and, in turn, modeling answers leads to the discovery of meaningful topics. The model learns topic-level user expertise, truths and topics simultaneously.

Modeling Question Content: Word Generation. Assume that each question is about a single topic, since questions are short: draw a topic indicator for the question. Assume that a word can be drawn from either the topical word distribution or the background word distribution: draw a word category, then draw the word from the corresponding distribution.
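The word-generation steps can be sketched as a small sampler. The transcript does not preserve the distributions used on the slides, so the categorical draws, the Bernoulli word-category switch, the parameter names and the toy vocabularies below are all assumptions made for illustration.

```python
import random


def generate_question(topic_probs, topic_word_dists, background_dist,
                      topical_prob, length, rng):
    """Sample one short question under the single-topic assumption."""
    topics = list(topic_probs)
    # Draw one topic indicator for the whole question.
    topic = rng.choices(topics, weights=[topic_probs[t] for t in topics])[0]
    words = []
    for _ in range(length):
        # Draw the word category: topical or background.
        is_topical = rng.random() < topical_prob
        dist = topic_word_dists[topic] if is_topical else background_dist
        # Draw the word from the chosen distribution.
        vocab = list(dist)
        words.append(rng.choices(vocab, weights=[dist[w] for w in vocab])[0])
    return topic, words


# Hypothetical toy parameters.
topic_probs = {"Politics": 0.4, "Physics": 0.3, "Music": 0.3}
topic_word_dists = {
    "Politics": {"president": 0.6, "election": 0.4},
    "Physics": {"relativity": 0.5, "energy": 0.5},
    "Music": {"album": 0.5, "singer": 0.5},
}
background_dist = {"which": 0.5, "the": 0.5}

rng = random.Random(0)
topic, words = generate_question(topic_probs, topic_word_dists,
                                 background_dist, topical_prob=0.7,
                                 length=6, rng=rng)
print(topic, words)
```

Separating background words from topical words keeps common function words from dominating every topic's word distribution.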

Modeling Answers: Answer Generation. The correctness of a user's answer may be affected by the question's topic, the user's expertise on that topic, and the question's bias. The generative steps are: draw the user's expertise, draw the truth, draw the bias, and draw the user's answer.
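The four draws for answer generation can be sketched as follows. The transcript does not preserve the distributions behind each draw, so everything specific here is an assumption for illustration: Gaussian expertise and bias, a uniformly drawn truth, and a logistic link that turns expertise plus bias into the probability of answering correctly.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def generate_answer(truth, choices, expertise, bias, rng):
    """Return the truth with probability sigmoid(expertise + bias).

    Otherwise return one of the wrong choices uniformly at random.
    """
    if rng.random() < sigmoid(expertise + bias):
        return truth
    return rng.choice([c for c in choices if c != truth])


rng = random.Random(0)
choices = ["A", "B", "C", "D"]
expertise = rng.gauss(0.0, 1.0)  # user's expertise on the question's topic (assumed Gaussian)
truth = rng.choice(choices)      # the truth (assumed uniform over the choices)
bias = rng.gauss(0.0, 1.0)       # the question's bias (assumed Gaussian)
answer = generate_answer(truth, choices, expertise, bias, rng)
print(truth, answer)
```

Under these assumptions a user with high expertise on the question's topic almost always reproduces the truth, while the bias term shifts that probability for every user answering the same question.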

Inference Method: Gibbs-EM. Gibbs sampling is used to learn the hidden variables, and gradient descent is used to learn the hidden factors.

Datasets & Measure. Datasets: The Game Dataset was collected from a crowdsourcing platform via an Android app based on the TV game show "Who Wants to Be a Millionaire": 2,103 questions, 37,029 sources, 214,849 answers and 12,995 words. The SFV Dataset was extracted from the Slot Filling Validation (SFV) task of the NIST Text Analysis Conference Knowledge Base Population (TAC-KBP) track: 328 questions, 18 sources, 2,538 answers and 5,587 words. Measure: error rate (the lower, the better).
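The slide names the measure without giving a formula. A minimal sketch, assuming the usual definition (the fraction of questions with a known ground truth whose estimated truth is wrong), could look like this; the function name and the example values are hypothetical.

```python
def error_rate(estimated, ground_truth):
    """Fraction of ground-truth questions whose estimated answer is wrong.

    Both arguments map question id -> answer. Questions without an
    estimate are ignored.
    """
    labeled = [q for q in ground_truth if q in estimated]
    if not labeled:
        raise ValueError("no overlapping questions to evaluate")
    wrong = sum(1 for q in labeled if estimated[q] != ground_truth[q])
    return wrong / len(labeled)


# Hypothetical estimates for four questions, one of them wrong.
estimated = {"q1": 1, "q2": 2, "q3": 1, "q4": 2}
ground_truth = {"q1": 1, "q2": 2, "q3": 2, "q4": 2}
print(error_rate(estimated, ground_truth))
```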

Baseline Methods. Basic method: MV. Truth discovery: Truth Finder, AccuPr, Investment, 3-Estimates, CRH, CATD. Crowdsourcing: D&S, ZenCrowd. Variations of FaitCrowd: FaitCrowd-b, FaitCrowd-b-g.

Performance Validation. Table 1: Performance on the Game Dataset. Table 2: Performance on the SFV Dataset. Analysis: For easy questions (Level 1 to Level 7), all the methods estimate most answers correctly. For difficult questions (Level 8 to Level 10), FaitCrowd performs much better than the baseline methods. FaitCrowd performs well on both the Game and SFV datasets.

Model Validation. Goal: illustrate the importance of jointly modeling question content and answers by comparing with a method that conducts topic modeling and true answer inference separately. Explanation: dividing the whole dataset into topic-specific sub-datasets reduces the number of responses per topic, which leaves insufficient data for the baseline approaches. Table 3: Results of Model Validation.

Topical Expertise Validation. Goal: validate the correctness of the topical expertise learned by FaitCrowd. Ideally, the expertise estimated by the proposed method is consistent with the ground truth accuracy. Figure 1: Topic 2 on the Game Dataset. Figure 2: Topic 4 on the SFV Dataset.

Expertise Diversity Analysis. Goal: demonstrate that the topical expertise of each source varies across topics. Ideally, the topical expertise should correspond to the ground truth accuracy, i.e., the higher the expertise, the higher the ground truth accuracy. Figure 3: Source 7 on the Game Dataset. Figure 4: Source 16 on the SFV Dataset.

Summary. Problem: recognize the difference in source reliability among topics in the truth discovery task, and propose to incorporate the estimation of fine-grained reliability into truth discovery. Solution: propose a probabilistic model that simultaneously learns the topic-specific expertise for each source, aggregates true answers, and assigns topic labels to questions. Results: empirically show that the proposed model outperforms existing methods in multi-source aggregation on two real-world datasets.

Thank you! Questions?