Twitter Hashtags. RMBI4310, Spring 2016, Group 14: Cheung Hiu Yan, Debbie (20120038); Chow Miu Lam, Carman (20121408); Tsang Wing Wah, Denise (20124917).

Presentation transcript:

Twitter Hashtags. RMBI4310, Spring 2016, Group 14: Cheung Hiu Yan, Debbie; Chow Miu Lam, Carman; Tsang Wing Wah, Denise

Terminology
1. Tweet: an individual message of at most 140 characters
2. Hashtag: a string of characters preceded by the symbol #
3. Follower & Followee: if User A follows User B, then A is a follower of B and B is a followee of A
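The definitions above can be sketched in a few lines of Python. This is only an illustration: the pattern `#(\w+)` is an assumed reading of "a string of characters preceded by #" (Twitter's real tokenization rules are more involved), and the names `extract_hashtags`, `is_valid_tweet`, and `follow` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a hashtag is '#' followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")
MAX_TWEET_LENGTH = 140  # limit stated on the slide


def extract_hashtags(tweet):
    """Return the hashtags in a tweet, lowercased and without the '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(tweet)]


def is_valid_tweet(tweet):
    """A tweet is a non-empty message within the 140-character limit."""
    return 0 < len(tweet) <= MAX_TWEET_LENGTH


# followers[b] holds everyone who follows b; followees[a] holds everyone a follows.
followers = defaultdict(set)
followees = defaultdict(set)


def follow(a, b):
    """Record that user a follows user b."""
    followers[b].add(a)
    followees[a].add(b)
```

For example, `extract_hashtags("Loving #Python and #DataScience!")` yields the two tags without the leading `#`, and after `follow("A", "B")` user A appears in `followers["B"]` while B appears in `followees["A"]`.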

Why were hashtags invented?

Hashtags and followers: An experimental study of the online social network Twitter. Eva García Martín

Motivations ➢ To spread the information more widely ➢ To help marketing companies to correctly target their customers

Objectives ➢ To investigate whether hashtag use correlates with the change in the number of followers

Experiment
1. Gathered random users who tweeted with hashtags and without hashtags
   Control Group: does not use hashtags
   Experimental Group: uses hashtags
2. Computed the difference in the number of followers every 30 minutes
3. Collected data for a period of 7 days
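As a rough sketch of step 2, the follower change per user can be computed from consecutive follower-count snapshots. The function name `follower_changes` and the sample numbers are made up for illustration; the slides do not say how the data was stored.

```python
# Sampling schedule from the slide: one snapshot every 30 minutes for 7 days.
SAMPLES_PER_DAY = 24 * 2
EXPECTED_SAMPLES = 7 * SAMPLES_PER_DAY


def follower_changes(snapshots):
    """Return the difference between each pair of consecutive follower counts."""
    return [later - earlier for earlier, later in zip(snapshots, snapshots[1:])]


def net_change(snapshots):
    """Total change in followers over the whole observation window."""
    return sum(follower_changes(snapshots))


# Hypothetical follower counts for one user over four snapshots.
example = [100, 102, 101, 105]
print(follower_changes(example))
print(net_change(example))
```

The net change equals the last count minus the first, so the per-interval differences matter mainly when the analysis looks at how followers move within the week.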


Evaluation
➢ Compare the change in the number of followers between the two groups
➢ Non-parametric Mann-Whitney U test: tests whether the two groups have different median values
➢ H0: median of Control Group = median of Experimental Group
➢ H1: median of Experimental Group > median of Control Group
➢ Outcome: the null hypothesis is rejected
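A minimal sketch of the one-sided Mann-Whitney U test, in plain Python, may help make the evaluation concrete. It makes simplifying assumptions that are not from the slides: a normal approximation for the p-value, no continuity correction, and no tie correction in the variance. The function names are hypothetical, and the sample data in the usage note is invented.

```python
import math


def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u_greater(experimental, control):
    """Return (U, p) for H1: experimental values tend to exceed control values."""
    n1, n2 = len(experimental), len(control)
    ranks = midranks(list(experimental) + list(control))
    rank_sum = sum(ranks[:n1])
    u = rank_sum - n1 * (n1 + 1) / 2          # U statistic of the experimental group
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # assumption: no tie correction
    z = (u - mean_u) / sd_u
    p = 0.5 * math.erfc(z / math.sqrt(2))     # upper-tail normal probability
    return u, p
```

With invented follower changes such as `mann_whitney_u_greater([5, 6, 7], [1, 2, 3])`, every experimental value exceeds every control value, so U reaches its maximum of n1 × n2 and the p-value falls below 0.05. For real analyses, a statistics library such as SciPy (`scipy.stats.mannwhitneyu` with `alternative="greater"`) is the safer choice because it handles ties and small samples.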

Result ➢ A correlation is shown between hashtags and followers ➢ Tweets that contain hashtags are more likely to lead to a larger increase in the number of followers than tweets without hashtags.

Result ➢ Tweets with more than two hashtags result in a decrease in the number of followees

Future Work ➢ Further investigation of which types of hashtags attract followers

On Analyzing Hashtags in Twitter. Paolo Ferragina, Francesco Piccinno, Roberto Santoro

Motivations ➢ Extracting information from hashtags is difficult ➢ Composition is not constrained by any rule ➢ Usually appear in short and poorly written messages ➢ Difficult to analyze with classic IR techniques

Objectives
➢ Introduce the Hashtag-Entity Graph and suitable algorithms
➢ Solve IR problems formulated on Twitter hashtags
➢ Achieve better hashtag classification

What is the Hashtag-Entity Graph?
➢ A weighted, labeled graph made up of hashtags and entities drawn from a set of tweets
➢ Purple node: hashtag
➢ Green node: entity (a page drawn from Wikipedia)
➢ Black edge: links two entities that are semantically related
➢ Green edge: links a hashtag to an entity iff they co-occur in the same tweet
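One way to picture this structure is as two edge maps, sketched below. The slides do not specify how edge weights are computed, so this sketch assumes that hashtag-entity weights are co-occurrence counts and that entity-entity relatedness scores are supplied from outside. The function `build_he_graph` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_he_graph(tweets, related_entities):
    """Build the two edge sets of a Hashtag-Entity graph.

    tweets: iterable of (hashtags, entities) pairs, one pair per tweet.
    related_entities: iterable of (entity_a, entity_b, weight) triples.
    """
    # Hashtag-entity edges: present iff the two co-occur in a tweet.
    # Assumption: the weight is the number of tweets in which they co-occur.
    counts = defaultdict(lambda: defaultdict(int))
    for hashtags, entities in tweets:
        for tag in set(hashtags):
            for entity in set(entities):
                counts[tag][entity] += 1
    hashtag_entity = {tag: dict(edges) for tag, edges in counts.items()}

    # Entity-entity edges: undirected, keyed by the unordered pair of entities.
    entity_entity = {
        frozenset((a, b)): weight for a, b, weight in related_entities
    }
    return hashtag_entity, entity_entity


# Hypothetical annotated tweets: hashtags plus the Wikipedia entities found in the text.
tweets = [
    (["nba"], ["Basketball", "LeBron James"]),
    (["nba"], ["Basketball"]),
]
he_edges, ee_edges = build_he_graph(tweets, [("Basketball", "LeBron James", 0.8)])
```

Keeping the two edge types in separate maps mirrors the slide's distinction between green (hashtag to entity) and black (entity to entity) edges.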

Relatedness
➢ Devise a relatedness function for two hashtags, with an output value in the range [0, 1]
➢ 0: semantically unrelated
➢ 1: semantically related
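As an illustration of a relatedness function with output in [0, 1], the sketch below computes plain cosine similarity between the entity-weight vectors of two hashtags. This is a simplified stand-in, not the method from the paper: the function name and the example vectors are assumptions, and the result stays in [0, 1] only because the weights are non-negative.

```python
import math


def cosine_relatedness(vec_a, vec_b):
    """Cosine similarity of two sparse vectors given as {entity: weight} dicts."""
    dot = sum(weight * vec_b.get(entity, 0.0) for entity, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a hashtag with no entities is treated as unrelated
    return dot / (norm_a * norm_b)


# Hypothetical entity vectors for three hashtags.
nba = {"Basketball": 2, "LeBron James": 1}
playoffs = {"Basketball": 1, "LeBron James": 1}
election = {"Politics": 3}

print(cosine_relatedness(nba, playoffs))
print(cosine_relatedness(nba, election))
```

Hashtags that share no entities score exactly 0 here, which is one reason a graph-based measure that also follows entity-entity edges can be useful.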

Relatedness ➢ Expanded Cosine Similarity (ExpCosEntity) & Personalized PageRank Relatedness (CosPPR) ➢ If the relatedness scores from both methods are high (low), then the two hashtags are related (unrelated) ➢ If the output of CosPPR is significantly lower than that of ExpCosEntity, then the two hashtags are weakly related
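The decision rule on this slide can be written as a small function. The slide does not say what counts as "high", "low", or "significantly lower", so the thresholds `high`, `low`, and `gap` below are placeholder assumptions, as is the "undetermined" fallback for score pairs the slide does not cover.

```python
def interpret_relatedness(exp_cos, cos_ppr, high=0.7, low=0.3, gap=0.3):
    """Combine two relatedness scores in [0, 1] into a qualitative label.

    The threshold defaults are illustrative placeholders, not values from the paper.
    """
    # CosPPR significantly lower than ExpCosEntity: weakly related.
    if exp_cos - cos_ppr >= gap:
        return "weakly related"
    # Both scores high: related.
    if exp_cos >= high and cos_ppr >= high:
        return "related"
    # Both scores low: unrelated.
    if exp_cos <= low and cos_ppr <= low:
        return "unrelated"
    # Assumption: anything else is left undecided.
    return "undetermined"


print(interpret_relatedness(0.9, 0.8))
print(interpret_relatedness(0.8, 0.3))
```

Checking the gap first means a pair with a high ExpCosEntity score but a much lower CosPPR score is labeled weakly related rather than falling through to the other cases.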

Classification [diagram: HE-Graph and Wikipedia category graph; classifier; 8 categories]

Conclusion
➢ Tweets with hashtags
   ➢ result in a larger increase in the number of followers
   ➢ help companies accurately target their customers in social media campaigns
➢ Precise classification algorithm for hashtags
   ➢ allows further investigation of which types of hashtags could attract followers

Q&A