Mining Cross-network Association for YouTube Video Promotion
Ming Yan, Jitao Sang, Changsheng Xu*
Institute of Automation, Chinese Academy of Sciences

Presentation transcript:

Mining Cross-network Association for YouTube Video Promotion
Ming Yan, Jitao Sang, Changsheng Xu*
Institute of Automation, Chinese Academy of Sciences, Beijing; China-Singapore Institute of Digital Media, Singapore
{ming.yan, jtsang,

Introduction
Large quantities of videos are consumed on YouTube, and the trend is growing year by year. However, YouTube exhibits limited propagation efficiency: many videos remain unknown to the wider public because of its limited internal propagation mechanism. External referrers such as social media websites have become important sources that lead users to YouTube videos, and Twitter has recently grown to be the top referrer among them. Its followee-follower, user-centric architecture gives Twitter significant information propagation efficiency.

Motivation: For a specific YouTube video, identify proper Twitter followees with the goal of maximizing video dissemination to their followers.

For our video promotion application, we claim that large exposure to more audiences alone is not enough. What video promotion cares about is the number of “effective” audiences: those who are likely to show interest in the target video and have a higher probability of taking subsequent consuming actions. Therefore, the core of video promotion is both to match the interests of the Twitter followers and to be cost-effective.

Figure 1. Problem illustration.

Two Challenges
The heterogeneous knowledge association between YouTube videos and Twitter followees.
How to define the “properness” of a candidate Twitter followee for a specific YouTube video.

To address the challenges one by one, we propose a three-stage framework as our solution:
Heterogeneous Topic Modeling: to discover the latent structure within the YouTube video and Twitter user spaces, respectively.
Cross-network Topic Association: to address the discrepancy between the heterogeneous YouTube video and Twitter user spaces by mining cross-network topic association at a collective user level.
Referrer Identification: to define the “properness” of a candidate Twitter followee for a specific YouTube video and to match videos to followees with a ranking-based method.

Figure 2. The framework.

Heterogeneous Topic Modeling
Goal: To discover the latent structure within the YouTube video and Twitter user spaces, and to facilitate the subsequent analysis and applications at the topic level. Through this stage, each YouTube video and each Twitter user can be represented as a distribution in the corresponding derived topic space.

YouTube Video Topic Modeling
The video topics are expected to span both textual and visual spaces. We introduce a modification of the multi-modal topic model Corr-LDA, as depicted in Figure 3.

Figure 3. The graphical representation of inverse Corr-LDA.

Twitter Followee Topic Modeling
Since the properness of a Twitter followee is decided by the followers, we are interested in investigating the followee-follower architecture of Twitter. Therefore, we represent each Twitter user (document) by all of his/her followees (words) and apply standard LDA to the user social graph for topic modeling.

Topic Number Selection
Figure 4. Perplexities for different topic numbers on YouTube and Twitter.

Discovered Topic Visualization
Figures 5 and 6 show visualizations of some of the discovered topics in YouTube and Twitter, respectively.

Figure 5. Visualization of discovered YouTube topics.
Figure 6. Visualization of discovered Twitter topics.

Cross-network Topic Association
Innovation: To address the discrepancy issue, we propose a solution that first aggregates the YouTube video distributions to the user level, and then exploits the overlapped users among different networks as a bridge for association mining.
Assumption: If the same group of users is heavily involved with topic A in network X and topic B in network Y, it is very likely that topics A and B are closely associated.

YouTube User-Topic Distribution Aggregation

Various Approaches for Association Mining via Overlapped Users
Goal: To enable topic distribution transfer between different networks, i.e., given a user’s topical interest in YouTube videos, we can infer the Twitter followee topics that he/she is most likely to follow.

2. Regression-based Association
To deal with the noisy user distribution issue, a regression-based optimization approach is proposed, in which q denotes the norm used in the regularization term. When q = 1, this is a lasso problem and can be solved effectively by the feature-sign search algorithm. When q = 2, this is a ridge regression problem and has an analytical solution.

3. Latent Attribute-based Association
Assumption: We assume that the latent structure is actually the user attributes, i.e., it is the same user’s unique attribute values that give rise to his/her different activities and thus to the cross-network topic distributions. A YouTube factor and a Twitter factor are coupled to the same user attribute, and the same user should have identical coefficients when projected onto the coupled user factors.
Innovation: We try to discover the shared latent structure behind the YouTube and Twitter topic spaces at the user level.
Advantage
Does not need an explicit association matrix.
Non-linear.
Non-overlapped users can also be utilized.
Objective Function

Evaluation for the final video promotion
The result for the final video promotion problem is shown in Figure 8. It demonstrates the advantages of our proposed framework for promoting YouTube videos via the Twitter network compared with the other baselines.

Figure 7. MAE for distribution transfer in Stage 2.

Twitter Referrer Identification
With topical distribution transfer enabled, in this third stage we first transfer the test video distribution to the Twitter topic space. We then match the YouTube video with Twitter followees in the same Twitter topic space using a ranking-SVM-based scheme. Two critical issues should be addressed when using the ranking-SVM training scheme:
How to extract the video-followee pair features?
We define the video-followee pair features as the vector product between the transferred video distribution and the Twitter followee distribution.
How to define the Ground-Truth (GT) properness for each video-followee pair in the training set?
Both the follower interest and the cost of asking a Twitter followee for help should be considered. Here, we treat the follower number of the Twitter followee as the virtual cost, i.e., the more popular a Twitter followee is, the higher the cost of asking him/her for help. The GT properness is defined to account for both of these factors.

Evaluation of Transfer Error
We use the Mean Absolute Error (MAE) on half of the overlapped users to evaluate the performance of topical distribution transfer between YouTube and Twitter. The result is shown in Figure 7.

Figure 8. Comparison of different settings: Random, Popularity, Regression+Direct, Regression+Weighted, LA_all+Weighted, LA_all+Direct.
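
The transfer-error evaluation described above can be illustrated with a small sketch. It is not the authors' implementation: the poster's objective function is not preserved in this transcript, so the code assumes the common ridge form min_W ||Y W - T||_F^2 + lam * ||W||_F^2 for the q = 2 case, which has the closed-form solution W = (Y^T Y + lam I)^{-1} Y^T T. The names `Y` (user-by-YouTube-topic distributions), `T` (user-by-Twitter-topic distributions), `W`, `lam`, the synthetic data, and the choice to fit on one half of the overlapped users and measure MAE on the other half are all assumptions made for illustration.

```python
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, B):
    """Solve A X = B (A is n x n, B is n x m) by Gauss-Jordan elimination
    with partial pivoting; returns X as an n x m list of lists."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

def fit_ridge(Y, T, lam):
    """Assumed q = 2 objective: W = (Y^T Y + lam I)^{-1} Y^T T."""
    Yt = transpose(Y)
    G = matmul(Yt, Y)
    for i in range(len(G)):
        G[i][i] += lam
    return solve(G, matmul(Yt, T))

def mae(P, T):
    """Mean absolute error over all users and all topic entries."""
    total = sum(abs(p - t) for pr, tr in zip(P, T) for p, t in zip(pr, tr))
    return total / (len(T) * len(T[0]))

def random_dist(k, rng):
    raw = [rng.random() for _ in range(k)]
    s = sum(raw)
    return [v / s for v in raw]

# Synthetic overlapped users (hypothetical sizes): 4 YouTube topics, 3 Twitter topics.
rng = random.Random(0)
K_Y, K_T, N = 4, 3, 60
W_true = [random_dist(K_T, rng) for _ in range(K_Y)]  # hidden topic association
Y = [random_dist(K_Y, rng) for _ in range(N)]         # YouTube user-topic distributions
T = matmul(Y, W_true)                                 # Twitter user-topic distributions

half = N // 2
W = fit_ridge(Y[:half], T[:half], lam=1e-3)           # fit on one half of the users
pred = matmul(Y[half:], W)                            # transfer the held-out half
err = mae(pred, T[half:])

# Naive reference: predict the uniform Twitter topic distribution for everyone.
uniform = [[1.0 / K_T] * K_T for _ in range(N - half)]
baseline = mae(uniform, T[half:])
print("MAE ridge transfer:", err, "MAE uniform baseline:", baseline)
```

Because the synthetic Twitter distributions are generated as an exact linear map of the YouTube distributions, the ridge transfer should recover them almost perfectly; on real, noisy user distributions the gap to a naive baseline would be smaller, and the regularization weight would need tuning.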