Published by Juniper Hutchinson. Modified over 9 years ago.
MOTIVATION AND CHALLENGE

5 Vs of Big Data: Volume, Velocity, Variety, Veracity, Value
3 Cs of Veracity: Contributor, Content, Context
ALETHIOMETER FRAMEWORK: Contributor, Content, Context
C1 CONTRIBUTOR
Contributor modalities

Reputation
- Analyse comments over time to discover sentiments and opinions towards a source.
- Measured by the number of upvotes or likes.

History
- Information about activity on different social media platforms, combined with validity data.
- Measured by the update frequency of valid posts.

Popularity
- Information about how widely the source's activity is followed (readings, recommendations).
- Measured by the number of friends/followers and the number of responses.
Contributor modalities (continued)

Influence
- Information about activities triggered by this source (re-posts, discussions or comments).
- Measured by the number of retweets/shares and the Klout influence score.

Presence
- Information about the type of source (individual, organisation, officially verified account, fake identity, etc.) and its presence on multiple social media platforms.
- Measured by the number of accounts on different social media platforms.
C2 CONTENT
Content modalities

Reputation of linked web content
- Measured in terms of domain reputation, page rank (Google PageRank or Alexa rank), or properties of the contributors to the content.

Provenance
- Finding the original occurrence of the content and its whole path across sources, places and time, and measuring the reputation of these sources.

Popularity
- Information about how many people are following this content.
- Measured by the number of followers and the number of responses.
Content modalities (continued)

Influence
- Analyse whether this content is triggering discussions or other actions in the social sphere.
- Measured by the number of retweets/shares.

Originality
- Check whether the content or parts thereof have been used in the past (e.g., reused text or images that have appeared in the past).

Authenticity
- Check whether the content has been changed with respect to its original state (e.g., changed text or attached multimedia content).

Objectivity and Diversity
- Measured by the variation of opinions found for people, content, or general entities.
C3 CONTEXT
Context modalities

Cross-checking
- Measured by the number of different reports or mentions about the same thing coming from independent sources.

Coherence
- Measurement of text coherence (e.g., Coh-Metrix) and coherence between the content and tags, attached web-links, or attached multimedia.

Proximity
- Measurement of coherence between reference location/time and publication location/time.
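The slides do not say how proximity is computed. One possible reading, sketched here purely as an assumption, is to take the great-circle distance between the reference location and the publication location, together with the absolute gap between the two timestamps. The function names, the haversine formula, and the mean Earth radius are illustrative choices, not part of the Alethiometer framework.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

# Assumption: a spherical Earth with the commonly used mean radius.
EARTH_RADIUS_KM = 6371.0


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


def time_gap_hours(reference_time: datetime, publication_time: datetime) -> float:
    """Absolute gap between the reference time and the publication time, in hours."""
    return abs((publication_time - reference_time).total_seconds()) / 3600.0
```

Both values are raw measurements; smaller values would indicate closer proximity, and they would still need to be mapped onto a rating scale before being combined with other parameters.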
How to combine all these parameters?
Approach for rating of modality parameters

Rate parameters on a 5-point discrete scale, from 0 to 4
- [0, a_0) → 0, [a_0, a_1) → 1, [a_1, a_2) → 2, [a_2, a_3) → 3, [a_3, ∞) → 4.
- a_0: 20th percentile, a_1: 40th percentile, a_2: 60th percentile, a_3: 80th percentile (adjust the scale so it follows a uniform distribution).

Weight the ratings of the parameters, either uniformly or based on their significance, to derive a total score
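The rating step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the percentiles are estimated with Python's `statistics.quantiles` (inclusive method), and the total score is assumed to be a weighted average of the individual ratings.

```python
from bisect import bisect_right
from statistics import quantiles


def percentile_thresholds(values):
    """Estimate [a_0, a_1, a_2, a_3]: the 20th, 40th, 60th and 80th percentiles."""
    # n=5 yields the four cut points that split the sample into five equal groups.
    return quantiles(values, n=5, method="inclusive")


def rate(value, thresholds):
    """Map a raw value to a rating 0..4 using half-open intervals [a_i, a_{i+1})."""
    # bisect_right counts the thresholds that are <= value, so a value equal
    # to a_0 already receives rating 1, and anything >= a_3 receives rating 4.
    return bisect_right(thresholds, value)


def total_score(ratings, weights=None):
    """Weighted average of ratings; uniform weights when none are given."""
    if weights is None:
        weights = {name: 1.0 for name in ratings}
    total_weight = sum(weights[name] for name in ratings)
    return sum(ratings[name] * weights[name] for name in ratings) / total_weight
```

For example, with hypothetical thresholds `[10, 100, 1000, 10000]`, `rate(250, [10, 100, 1000, 10000])` returns 2. Because the thresholds are percentiles of the observed data, each rating should cover roughly one fifth of the sample, which matches the slide's note about a uniform distribution.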
Preliminary statistical results

Parameters studied
- Number of followers
- Number of tweets
- User account age

Sample: ~10 M tweets, 5 K users
Collection period: July-September 2013
Empirical distributions
- Heavy-tailed distributions
- Multimodal heavy-tailed distributions with three different peaks (6.7 months, 23.3 months, 4.4 years)
Correlation coefficients
- Friends - followers: 0.1222
- Friends - tweets: 0.08
- Followers - tweets: 0.0197

Conclusion:
- all parameters are relatively independent of one another
- they need to be studied independently
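The slides do not state which correlation coefficient was used. Assuming Pearson's linear correlation coefficient, the pairwise values could be computed along these lines; the function name and the input lists are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

Values close to 0, such as those reported above, indicate a weak linear relationship between the two parameters. Pearson's coefficient is sensitive to extreme values, so for heavy-tailed data a rank-based coefficient may be worth comparing.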
Summary and future work

Summary
- Defined Alethiometer: a framework taking into account all aspects: Contributor, Content and Context
- Showed an approach for combining the ratings of all parameters
- Attested the relative independence of parameters and the need to consider a variety of measures (also previously emphasized in the literature)

Future work
- Investigate statistical properties of other modalities
- Extract the significance of modalities
- Study correlation between content, contributor and context modalities
Questions & Answers

find us at http://ilab.atc.gr
follow us @iLabATC