
A Probabilistic Graphical Model for Brand Reputation Assessment in Social Networks
Kunpeng Zhang, Yu Cheng, Yusheng Xie, Doug Downey, Ankit Agrawal, Alok Choudhary
{kzh980,ych133, yxi389,
Department of Electrical Engineering and Computer Science
ASONAM

Acknowledgement

Outline
Introduction
Problem Definition
Methodology
  Social Sentiment Identification
  Proposed Graphical Model
Experimental Results
Related Work
Future Work

Introduction
Social media data
  Mining social data helps individuals and businesses make informed decisions.
  User opinions from reviews, blogs, comments, etc.
  Marketing analysis, competitor analysis.
  Brand reputation …

Challenges
Understanding user opinions (positive, negative, objective)
  Social sentiment identification
Bias in users' opinions
  How do we reduce bias and fairly evaluate a social brand?
Big data
  How do we efficiently measure brand reputation?

An Example
[Figure: a Facebook page, annotated with its number of fans.]

[Figure: posts on a Facebook page, with labels marking a post, a comment, and a post like.]

Statements
Each user can comment on or like multiple posts on different pages.
Each page can receive comments or likes from different users.
A user can make positive, negative, or objective comments.
How do we use this network information and textual information to infer the reputation of social brands while reducing bias?

Sentiment Identification *
Ensemble method
Extended compositional semantic rules
  12 semantic rules and 2 compose functions
  One example rule: if a sentence contains the keyword "but", consider only the sentiment of the "but" clause.
Frequency-based method
  The strength of a sentiment is expressed by the adjectives and adverbs used in the sentence.
  Adverb-Adjective-Noun (abbreviated AAN) and Verb-Adverb (VA) phrases.
Bag-of-words method
  Positive/negative/negation word lists
  Internet language, emoticons
  Domain-specific words
*: previous work at ICDM 2011, SIGIR 2012
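
The "but" rule and the bag-of-words method can be illustrated with a minimal sketch. The word lists, the function names, and the way the two steps are chained are assumptions made for illustration only; the slides do not give the actual lexicons, the other semantic rules, or how the ensemble combines its members.

```python
import re

# Tiny illustrative lexicons (hypothetical; the slides do not list the actual words).
POSITIVE = {"good", "great", "love", "tasty", ":)"}
NEGATIVE = {"bad", "awful", "hate", "slow", ":("}
NEGATION = {"not", "never", "no"}


def but_clause(sentence: str) -> str:
    """Apply the 'but' rule: keep only the text after the last 'but'."""
    parts = re.split(r"\bbut\b", sentence, flags=re.IGNORECASE)
    return parts[-1]


def bag_of_words_score(text: str) -> int:
    """Count positive minus negative words; a negation flips the very next token."""
    tokens = re.findall(r"[a-z']+|[:;][()]", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATION:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        score += -polarity if negate else polarity
        negate = False
    return score


def classify(sentence: str) -> str:
    """Label a sentence positive, negative, or objective (score of zero)."""
    score = bag_of_words_score(but_clause(sentence))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "objective"
```

For instance, `classify("The food is great but the service is slow")` looks only at "the service is slow" and returns "negative".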

Problem Statement
Given large amounts of user activity (comments) in social networks, we want to infer brand reputation.
U_i: user i
R_j: brand j
S_ij: sentiment of comments made by user U_i on brand R_j
[Figure: graph linking users U_1, U_2, U_3, …, U_n to brands R_1, R_2, R_3, …, R_m through sentiment nodes S_11, S_21, S_32, S_23, …, S_nm; the quantities to infer are P(R_1), P(R_2), P(R_3), …, P(R_m).]
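
Because each user comments on only a few brands, the S_ij values form a sparse table. One possible in-memory layout, sketched below, indexes that table from both the user side and the brand side. The toy observations and the names `observations`, `build_adjacency`, `by_user`, and `by_brand` are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy observations: (user index i, brand index j) -> S_ij in {0, 1},
# where 1 means the user's comments on that brand were positive.
observations = {
    (0, 0): 1,
    (1, 0): 1,
    (1, 2): 0,
    (2, 1): 1,
}


def build_adjacency(obs):
    """Index the sparse S_ij table from both sides of the user-brand graph."""
    by_user = defaultdict(dict)   # i -> {j: S_ij}
    by_brand = defaultdict(dict)  # j -> {i: S_ij}
    for (i, j), s in obs.items():
        by_user[i][j] = s
        by_brand[j][i] = s
    return dict(by_user), dict(by_brand)


by_user, by_brand = build_adjacency(observations)
```

Keeping both indexes means that updating a brand only touches the users who commented on it, and vice versa.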

Observations
1. Different people have different positivity (e.g., star ratings on Amazon.com).
2. Positive people are likely to give positive comments to brands with high reputation.
3. Sentiments of comments can be "observed" (we have state-of-the-art techniques to identify sentiments).

The Probabilistic Graphical Model
S: observed variable
R, U: hidden variables
All variables are binary
m: number of brands
n: number of users

Collective Inference
The goal is to infer all P(R).
Intractable: the partition function (denominator) is difficult to calculate because of the large discrete state space.
Millions of users, billions of comments
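
To make the size of the denominator concrete, suppose (as an assumption; the slides do not show the factorization) that the joint distribution is a product of one potential φ per observed user-brand pair, with E the set of pairs (i, j) for which S_ij is observed:

```latex
P(R, U \mid S) = \frac{1}{Z(S)} \prod_{(i,j) \in E} \phi(U_i, R_j, S_{ij}),
\qquad
Z(S) = \sum_{R \in \{0,1\}^m} \; \sum_{U \in \{0,1\}^n} \; \prod_{(i,j) \in E} \phi(U_i, R_j, S_{ij})
```

With binary variables, the sum defining Z(S) ranges over 2^(m+n) joint states, which cannot be enumerated for thousands of brands and millions of users. Sampling methods avoid this sum because they only need ratios of probabilities, in which Z(S) cancels.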

Gibbs Sampling (MCMC)
Brand reputation

Gibbs Sampling (MCMC)
User positivity
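
The update equations on these two slides did not survive in the transcript. The following is only a sketch of what a Gibbs update for one binary variable can look like, under a loudly assumed model: a uniform prior on each R_j and U_i, and a hypothetical table `THETA[u][r]` giving P(S_ij = 1 | U_i = u, R_j = r). The table values and all function names are illustrative and are not the authors' parameters.

```python
import math
import random

# Hypothetical likelihood table: THETA[u][r] = P(S_ij = 1 | U_i = u, R_j = r).
# These numbers are illustrative assumptions, not values from the slides.
THETA = [[0.2, 0.5],
         [0.5, 0.9]]


def log_lik(s, u, r):
    """Log-probability of observing sentiment s for a user state u and brand state r."""
    p = THETA[u][r]
    return math.log(p if s == 1 else 1.0 - p)


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def prob_brand_is_reputable(j, users, by_brand, prior=0.5):
    """P(R_j = 1 | all U, all S): only users who commented on brand j matter."""
    log1 = math.log(prior)
    log0 = math.log(1.0 - prior)
    for i, s in by_brand.get(j, {}).items():
        log1 += log_lik(s, users[i], 1)
        log0 += log_lik(s, users[i], 0)
    # Normalise over the two states of R_j; the global partition function never appears.
    return _sigmoid(log1 - log0)


def prob_user_is_positive(i, brands, by_user, prior=0.5):
    """P(U_i = 1 | all R, all S): the mirror image of the brand update."""
    log1 = math.log(prior)
    log0 = math.log(1.0 - prior)
    for j, s in by_user.get(i, {}).items():
        log1 += log_lik(s, 1, brands[j])
        log0 += log_lik(s, 0, brands[j])
    return _sigmoid(log1 - log0)


def gibbs_draw(p, rng=random):
    """Sample a binary state that equals 1 with probability p."""
    return 1 if rng.random() < p else 0
```

Here `by_brand` maps a brand index to `{user index: S_ij}` and `by_user` maps a user index to `{brand index: S_ij}`; both are hypothetical containers introduced for this sketch.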

Important Observation: Conditional Independence
R_1, R_2, …, R_m are independent of each other given all U_1, U_2, …, U_n and all observed variables S_ij.
Similarly, the U's are independent of each other given all the R's and all S_ij.

Parallelized Block-based MCMC
Consider users and brands as two separate blocks.
In each sampling round, we alternately sample all R_j and then all U_i.
The method scales to large problems by parallelizing within each block.
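
The alternation between the two blocks can be sketched as follows. This is not the authors' implementation: the likelihood table `THETA`, the uniform priors, the burn-in length, and every function name are assumptions made for illustration. A thread pool stands in for the parallel workers; in CPython it shows the structure rather than a real speedup, and a production version would use processes or a cluster.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical likelihood table: THETA[u][r] = P(S_ij = 1 | U_i = u, R_j = r).
THETA = [[0.2, 0.5],
         [0.5, 0.9]]


def conditional(neighbours, states, own_is_brand, prior=0.5):
    """P(variable = 1 | the other block, S) for one brand or one user."""
    logit = math.log(prior) - math.log(1.0 - prior)
    for k, s in neighbours.items():
        for value, sign in ((1, 1.0), (0, -1.0)):
            u, r = (states[k], value) if own_is_brand else (value, states[k])
            p = THETA[u][r]
            logit += sign * math.log(p if s == 1 else 1.0 - p)
    logit = max(-700.0, min(700.0, logit))  # keep math.exp within float range
    return 1.0 / (1.0 + math.exp(-logit))


def sample_block(adjacency, size, other_states, own_is_brand, rng, pool):
    """Resample every variable of one block; given the other block they are independent."""
    uniforms = [rng.random() for _ in range(size)]  # drawn serially for reproducibility

    def draw(k):
        p = conditional(adjacency.get(k, {}), other_states, own_is_brand)
        return 1 if uniforms[k] < p else 0

    return list(pool.map(draw, range(size)))


def run(by_user, by_brand, n_users, n_brands, rounds=200, burn_in=50, seed=0):
    """Alternate the two blocks and return the estimated P(R_j = 1) per brand."""
    rng = random.Random(seed)
    users = [rng.randint(0, 1) for _ in range(n_users)]
    totals = [0] * n_brands
    with ThreadPoolExecutor(max_workers=8) as pool:
        for t in range(rounds):
            brands = sample_block(by_brand, n_brands, users, True, rng, pool)
            users = sample_block(by_user, n_users, brands, False, rng, pool)
            if t >= burn_in:
                totals = [a + b for a, b in zip(totals, brands)]
    return [a / (rounds - burn_in) for a in totals]
```

Because every variable in a block depends only on the other block, the calls to `draw` within one block never read each other's results, which is what makes the per-block work safe to distribute.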

Parallelized Block-based MCMC
[Figure: the graph of users U_1, U_2, U_3, …, U_n, brands R_1, R_2, R_3, …, R_m, and sentiments S_11, S_21, S_32, S_23, …, S_nm, partitioned into Block 1 and Block 2.]

Experimental Data
Facebook data
  Also applicable to other platforms.
  Facebook Graph API
  11,140 brand pages and 270M users by May 1,

Data Cleaning
Remove pages whose major language is not English;
Ignore pages receiving very few comments (<= 10000);
Filter out spam users;
Ignore users who make comments on only 1 brand (<= 2);
Ignore users who make very few total comments across all brands (<= 5).

Data Stats
# of unique users        15,528,173
# of social brands       7,523
# of comments            126,613,072
# of positive comments   93,233,898
# of total posts         8,186,454

Spam Users
On average, a user comments on 4 to 5 brands.
We set a threshold of 100: users who comment on more than 100 brands are discarded.
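
The count-based cleaning rules and the spam threshold can be sketched as a single filtering pass. The function `clean`, its parameter names, the input layout, the order of the filters, and the exact boundary handling are assumptions; in particular, the "only 1 brand" rule is read here as "keep users seen on at least 2 brands". The English-language filter is left out because it needs a language detector.

```python
from collections import defaultdict


def clean(comments, min_page_comments=10_000, max_brands=100,
          min_brands=2, min_user_comments=5):
    """Filter (user, brand) comment pairs with the thresholds from the slides.

    `comments` is a list of (user_id, brand_id) tuples, one per comment.
    The boundary handling (strict vs. inclusive) is an assumption.
    """
    per_brand = defaultdict(int)
    for _, brand in comments:
        per_brand[brand] += 1
    # Ignore pages receiving very few comments (<= min_page_comments).
    kept = [(u, b) for u, b in comments if per_brand[b] > min_page_comments]

    brands_of = defaultdict(set)
    count_of = defaultdict(int)
    for user, brand in kept:
        brands_of[user].add(brand)
        count_of[user] += 1

    def keep_user(user):
        n_brands = len(brands_of[user])
        return (n_brands <= max_brands          # spam filter: more than max_brands brands
                and n_brands >= min_brands      # drop users seen on only 1 brand
                and count_of[user] > min_user_comments)  # drop users with too few comments

    return [(u, b) for u, b in kept if keep_user(u)]
```

The defaults mirror the numbers on the slides; smaller values are convenient when trying the function on toy data.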

Evaluation (1)
Convergence of the parallelized block-based MCMC
[Figure: X-axis: sampling round; Y-axis: reputation probability.]

Evaluation (2)
How efficient is the parallelized block-based MCMC?
[Figure: Speedup. X-axis: sampling round; Y-axis: speedup S_p; P = 8.]
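
The slide does not spell out how S_p is computed. Under the standard definition of parallel speedup, which is assumed here, T_1 is the running time on one processor and T_p the running time on p processors:

```latex
S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p}
```

With P = 8, ideal (linear) scaling would give S_8 = 8 and an efficiency E_8 of 1; measured values normally fall below that because of synchronization between the two blocks in each round.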

Model Evaluation
Existing IMDb (Internet Movie Database) movie ranking

Model Evaluation
Rank correlation (Spearman correlation) between our reputation and IMDb indices (rating score, votes, box office revenue)
Our reputation vs. IMDb rating score               0.757
Our reputation vs. number of votes in IMDb         0.440
Our reputation vs. box office revenue in IMDb      0.283
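
For reference, Spearman's rank correlation is the Pearson correlation computed on ranks. The sketch below is a plain-Python version with average ranks for ties; the function names are chosen for illustration, and it is not the evaluation code behind the reported numbers.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

A value of 1 means the two lists order the items identically, so a correlation of 0.757 indicates that the inferred reputations order the movies much as the IMDb rating scores do, without matching them exactly.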

Model Evaluation
Business school ranking from US News & World Report

Model Evaluation
Rank correlation (Spearman correlation) between our reputation and the business school ranking from US News & World Report
Our reputation vs. top business school ranking from US News & World Report

Not significant

Learning Models Based on All Those Metrics
Least absolute deviation, Poisson regression, logistic regression, and SVM regression.
Features: all metrics listed on the previous slide.
Train on movie data. Test on business school data.
Rank correlation between predicted values and existing values
The best we obtained is 0.52, through SVM regression.

Parameter Setting
Gamma (γ) is the threshold for positive vs. non-positive sentiment.
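
The slide does not say which quantity γ is applied to. One plausible reading, shown here purely as an assumption, is that S_ij is set to 1 when the share of positive comments made by user i on brand j reaches γ. The function name and the default value are hypothetical.

```python
def binarize(positive_count, total_count, gamma=0.5):
    """Return S_ij = 1 when the share of positive comments reaches gamma, else 0."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 1 if positive_count / total_count >= gamma else 0
```

Raising γ makes the model stricter about what counts as a positive observation, so fewer S_ij values are set to 1.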

Future Work
Incorporate more factors to make the model more comprehensive.
Integrate data from other social platforms such as Twitter, Google+, LinkedIn, etc. to make inference more reliable.

Related Work
Behavior targeting
  Learning from past user behaviors, especially feedback (e.g., comments, clicks), to match the best advertisements to users. [Chen; Kumar]
Recommender systems
  [Han, et al] proposed a network-based refinement approach utilizing the patent information network for prediction, smoothing and optimization.
Sentiment analysis
  From rule-based, bag-of-words approaches to machine learning techniques that classify text as positive or negative. [Pang, et al]

Questions?