Mining Cross-network Association for YouTube Video Promotion. Ming Yan, Institute of Automation, Chinese Academy of Sciences. May 15, 2014
Outline: Motivation; Three-stage Framework; Some Visualization; Further Discussion
Background: More than 1 billion unique users visit YouTube each month. Over 6 billion hours of video are watched each month on YouTube. 100 hours of video are uploaded to YouTube every minute. Large quantities of video are consumed on YouTube, and the trend is growing year by year. However, YouTube exhibits limited propagation efficiency, and many videos remain unknown to the wider public: the video view count distribution has a long tail, and most videos have a short active life span.
Background: YouTube video popularity is limited by YouTube's internal mechanisms: internal search, related video recommendation, channel subscription, and front page highlight. External referrers such as social media websites have become important sources leading users to YouTube videos. Twitter has been quickly growing as the top referrer source for web video discovery.
Motivation: For a specific YouTube video, identify proper Twitter followees with the goal of maximizing video dissemination to their followers. [Diagram labels: YouTube video; Twitter followee; Twitter follower; watch. Example caption: got 1 billion views in 5 months.]
Challenge: (1) The heterogeneous knowledge association between YouTube video and Twitter followee (user-perceived). (2) How to define the “properness” of a candidate Twitter followee for a specific YouTube video (interestingness, virtual cost). Our Twitter followee identification scheme aims to find the optimal Twitter followee, whose followers are more likely to show interest in the target video.
User-perceived Solution: illustration example. [Diagram labels: view; favor; follow; User Association; follow; better promotion; referrer.]
Framework: Three Stages
Heterogeneous Topic Modeling. Approach: On the YouTube side, propose an inverse Corr-LDA (iCorr-LDA) model to discover YouTube video multimodal topics. On the Twitter side, apply standard LDA to the Twitter followee-follower social graph, treating each user as a document and the user's followees as words. [Diagram labels: iCorr-LDA; LDA; example followees ACM Multimedia, Bill, Britney, …; Following; Topic Modeling.]
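The Twitter-side setup (user as document, followees as words) can be illustrated by building the document-word count matrix that a standard LDA implementation would take as input. This is a minimal sketch, not the authors' pipeline: the follower names and the `build_corpus` helper are hypothetical, the followee names are borrowed from the slide's diagram, and the LDA fitting step itself is left out.

```python
from collections import Counter

# Hypothetical follow graph: each follower maps to the accounts they follow.
follows = {
    "alice": ["ACM Multimedia", "Bill"],
    "bob": ["Bill", "Britney"],
    "carol": ["Britney", "Bill", "ACM Multimedia"],
}


def build_corpus(follows):
    """Treat each user as a document and each followee as a word."""
    # Vocabulary = every account that is followed by at least one user.
    vocab = sorted({f for followees in follows.values() for f in followees})
    index = {f: i for i, f in enumerate(vocab)}
    users = sorted(follows)
    matrix = []
    for user in users:
        counts = Counter(follows[user])
        row = [0] * len(vocab)
        for followee, n in counts.items():
            row[index[followee]] = n
        matrix.append(row)
    return users, vocab, matrix


users, vocab, doc_word = build_corpus(follows)
for user, row in zip(users, doc_word):
    print(user, row)
```

Each row of `doc_word` is one user's "document"; after LDA is fitted on such a matrix, the topic-word distributions range over followees and the document-topic distributions range over users.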
Cross-network Topic Association. Approach: (1) YouTube User Aggregation; (2) Association Mining. [Diagram labels: username; interested videos; Aggregation; overlapped users; Association Mining; ….]
Cross-network Topic Association: YouTube User Aggregation …
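The slide does not spell out how a user's interested videos are aggregated into a user-level YouTube topic distribution. One simple possibility, shown here purely as an assumption, is to average the topic distributions of those videos. The function name, video ids, and numbers are hypothetical.

```python
def aggregate_user_topics(video_topics, interested_videos):
    """Average the topic distributions of the videos a user is interested in.

    Assumption: plain averaging; the original aggregation rule is not
    given in the slides.
    """
    rows = [video_topics[v] for v in interested_videos if v in video_topics]
    if not rows:
        raise ValueError("user has no videos with a known topic distribution")
    num_topics = len(rows[0])
    return [sum(row[t] for row in rows) / len(rows) for t in range(num_topics)]


# Hypothetical per-video topic distributions over three YouTube topics.
video_topics = {
    "v1": [0.5, 0.5, 0.0],
    "v2": [0.0, 0.5, 0.5],
}
user_dist = aggregate_user_topics(video_topics, ["v1", "v2"])
print(user_dist)
```

Averaging keeps the result a valid probability distribution whenever each input row sums to one.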
Cross-network Topic Association: Association Mining (on overlapped users)
Cross-network Topic Association: (1) Transition Probability-based Association. (2) Regression-based Association. With q=1 this is a lasso problem, which can be solved effectively by LARS and the feature-sign algorithm. With q=2 this is a ridge regression problem with an analytical solution.
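For the q=2 case, the standard ridge regression closed form can be sketched as follows. The notation is an assumption, since the slide's own formula did not survive: `x` holds the overlapped users' YouTube topic distributions (one row per user), `y` holds the same users' Twitter topic distributions, and `lam` is the regularization weight. The sketch computes W = (XᵀX + λI)⁻¹XᵀY, the minimizer of ‖XW − Y‖² + λ‖W‖², using only the standard library.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ w = b for w by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_association(x, y, lam):
    """Return W = (X^T X + lam * I)^-1 X^T Y (standard ridge closed form)."""
    xt = transpose(x)
    gram = matmul(xt, x)
    for i in range(len(gram)):
        gram[i][i] += lam
    return solve(gram, matmul(xt, y))


# Hypothetical data: two overlapped users, two topics on each side.
x = [[1.0, 0.0], [0.0, 1.0]]  # YouTube topic distributions
y = [[0.2, 0.8], [0.6, 0.4]]  # Twitter topic distributions
w = ridge_association(x, y, lam=1.0)
print(w)
```

With `lam > 0` the matrix XᵀX + λI is positive definite, so the solve step always succeeds; the q=1 (lasso) case has no closed form and needs an iterative solver instead.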
Cross-network Topic Association: Latent Attribute-based Association (non-linear), in two variants: only on overlapped users, and on all users. Innovation: discover the shared latent structure behind the two topic spaces. After projection to the latent attribute spaces, a user's YouTube and Twitter distributions share the same coefficients (the shared latent user attributes). Only on overlapped users: with a simple transformation, the problem can be solved efficiently by the sparse coding algorithm.
Cross-network Topic Association: Latent attribute discovery on all users (plenty of non-overlapped users are considered in this scheme). The objective function is solved iteratively via three sub-problems.
Referrer Identification. Approach: transfer the test YouTube video's distribution to the Twitter side (Distribution Transfer), then match it against the candidate Twitter followees (Matching), using either direct product-based matching or weighted product-based matching.
Referrer Identification: Direct product-based matching. Formula annotations: one component is in charge of the coverage of the interested audiences; another is in charge of the virtual cost.
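The matching formula itself is not recoverable from the slide, so the following is only a sketch of the general idea under stated assumptions: the test video's YouTube topic distribution is mapped into the Twitter topic space through a non-negative association matrix, and each candidate followee is scored by the inner product between the transferred distribution and the followee's own Twitter topic distribution. All names and numbers are hypothetical, and the coverage and virtual-cost components mentioned on the slide are not modeled separately.

```python
def transfer(video_topics_yt, association):
    """Map a YouTube topic distribution into the Twitter topic space.

    Assumption: association[i][j] is a non-negative weight linking
    YouTube topic i to Twitter topic j.
    """
    num_tw = len(association[0])
    mapped = [
        sum(p * association[i][j] for i, p in enumerate(video_topics_yt))
        for j in range(num_tw)
    ]
    total = sum(mapped)
    return [m / total for m in mapped] if total > 0 else mapped


def rank_followees(video_topics_tw, followee_topics):
    """Score each candidate by an inner product and sort best-first."""
    scores = {
        name: sum(a * b for a, b in zip(video_topics_tw, dist))
        for name, dist in followee_topics.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical association matrix and candidate followees.
association = [[0.9, 0.1], [0.2, 0.8]]
followee_topics = {"Bill": [0.7, 0.3], "Britney": [0.1, 0.9]}

video_tw = transfer([1.0, 0.0], association)
ranked = rank_followees(video_tw, followee_topics)
print(ranked)
```

A weighted variant would multiply each topic's contribution by a per-topic weight before summing; the slide names this option but does not define the weights, so it is not sketched here.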
Some Visualization
Further Discussion: Some Extensible Applications. (1) Examining the value of Twitter followees: our work can be viewed as valuing a Twitter followee w.r.t. promotion efficiency for YouTube videos (e.g., the followee has a lot of young female followers). (2) Advertising: our work corresponds to advertising media selection; related tasks include anchor text generation (i.e., optimizing the video description for promotion) and advertising slot bidding (i.e., followee reshare time selection).
Other user-bridged cross-network applications: e.g., associating Tweet topics with Taobao topics through shared users, in order to recommend advertisements or videos. Challenge: the data is hard to get!