Cold Start Problem in Movie Recommendation JIANG CAIGAO, WANG WEIYAN Group 20.


1 Cold Start Problem in Movie Recommendation JIANG CAIGAO, WANG WEIYAN Group 20

2 Outline 1. Introduction 2. Problem Statement 3. Transfer Learning Based Solution 4. Semantics Extracting Based Solution 5. Experiments

3 Introduction: Problem
Collaborative filtering: a classical recommendation solution
1) Look for users who share the same rating patterns as the active user (the user who needs a recommendation).
2) Use the ratings from those like-minded users to predict the active user's ratings for unrated items.
Cold start problem: the system lacks sufficient information to recommend.
New users take a long time to rate enough movies for their preferences to be predicted.
Browsers (users who only browse) provide no direct preference information at all.
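The two collaborative-filtering steps above can be sketched in a few lines of Python. This is a minimal user-based sketch, not the presenters' implementation: the Pearson weighting, the mean-centering, the toy `ratings` dictionary and the function names are all assumptions chosen for illustration. A user with no ratings gets no prediction, which is the cold start case.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = [i for i in a if i in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Predict `user`'s rating for `item`; None when there is nothing to go on."""
    own = ratings.get(user)
    if not own:                      # cold start: no rating history at all
        return None
    user_mean = sum(own.values()) / len(own)
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        w = pearson(own, other_ratings)   # step 1: find like-minded users
        if w <= 0:
            continue
        other_mean = sum(other_ratings.values()) / len(other_ratings)
        num += w * (other_ratings[item] - other_mean)  # step 2: use their ratings
        den += w
    return user_mean + num / den if den else None

# Toy data (hypothetical users and ratings).
ratings = {
    "alice": {"m1": 5, "m2": 1, "m3": 4},
    "bob": {"m1": 5, "m2": 2, "m3": 5, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
    "newbie": {},
}
alice_m4 = predict(ratings, "alice", "m4")
newbie_m4 = predict(ratings, "newbie", "m4")
```

Here `alice` agrees with `bob` and disagrees with `carol`, so only `bob` contributes to her prediction, while `newbie` receives `None`.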

4 Introduction: Solutions
With click data: recommend based on transfer learning. Extract knowledge from one or more source tasks and apply it to the target task.
With semantic data: recommend based on semantics extraction. Extract a movie's semantic information from its tags, and respond to users' actions, e.g. visiting a movie's page or issuing a fuzzy query.

5 Introduction

6 Problem Statement

7

8 Methods

9

10

11 Learning Process: Use Jensen's inequality to derive a lower bound on the log-likelihood, where H is the entropy, and:
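For orientation, the generic form of such a bound is shown below. The symbols are assumptions rather than the presenters' notation: x stands for the observed ratings, z for the latent variables, \theta for the model parameters and q for the variational distribution over z.

```latex
\log p(x \mid \theta)
  = \log \sum_{z} q(z)\,\frac{p(x, z \mid \theta)}{q(z)}
  \;\ge\; \sum_{z} q(z) \log \frac{p(x, z \mid \theta)}{q(z)}
  = \mathbb{E}_{q}\!\left[\log p(x, z \mid \theta)\right] + H(q),
\qquad
H(q) = -\sum_{z} q(z) \log q(z).
```

The inequality is Jensen's inequality applied to the concave logarithm, and it holds with equality when q(z) equals the posterior p(z \mid x, \theta).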

12 Methods

13 Learning Process: Variational expectation-maximization (VEM)
1. VE-step: Fix the model parameters and optimize the bound w.r.t. the variational parameters to make the bound as tight as possible.
2. VM-step: Fix the variational parameters and optimize the bound w.r.t. the model parameters to raise the bound.
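The alternation between the two steps can be illustrated on a much simpler model than the one in the slides. The sketch below runs plain EM on a two-component, unit-variance 1-D Gaussian mixture, which is a swapped-in toy model and not the presenters' rating model. In plain EM the E-step makes the bound exactly tight, whereas a variational E-step only tightens it within a restricted family. All names and the synthetic data are assumptions.

```python
import math
import random

def normal_pdf(x, mean):
    """Density of a unit-variance Gaussian (variance fixed to keep the sketch short)."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2 * math.pi)

def log_likelihood(data, weight, means):
    return sum(
        math.log(weight * normal_pdf(x, means[0])
                 + (1 - weight) * normal_pdf(x, means[1]))
        for x in data
    )

def e_step(data, weight, means):
    """Fix the model parameters; compute each point's responsibility for component 0."""
    resp = []
    for x in data:
        a = weight * normal_pdf(x, means[0])
        b = (1 - weight) * normal_pdf(x, means[1])
        resp.append(a / (a + b))
    return resp

def m_step(data, resp):
    """Fix the responsibilities; re-estimate the mixing weight and the two means."""
    n0 = sum(resp)
    n1 = len(data) - n0
    weight = n0 / len(data)
    mean0 = sum(r * x for r, x in zip(resp, data)) / n0
    mean1 = sum((1 - r) * x for r, x in zip(resp, data)) / n1
    return weight, [mean0, mean1]

# Synthetic data: two clusters centred at -2 and 3.
random.seed(0)
data = [random.gauss(-2.0, 1.0) for _ in range(100)] + \
       [random.gauss(3.0, 1.0) for _ in range(100)]

weight, means = 0.5, [-1.0, 1.0]
history = [log_likelihood(data, weight, means)]
for _ in range(20):
    resp = e_step(data, weight, means)
    weight, means = m_step(data, resp)
    history.append(log_likelihood(data, weight, means))
```

The values in `history` never decrease from one iteration to the next, which is the "raise the bound" behaviour the two steps are designed to give.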

14 Experiment Data Sets: Four datasets in total are used in our experiments, namely Netflix, MovieLens, Book-Crossing and EachMovie.
Netflix: contains about 10^8 ratings in the range {1,2,3,4,5}, given by about 7x10^4 users on around 1.7x10^4 movies.
MovieLens: contains about 10^7 ratings, given by 7x10^4 users on around 10^4 movies.
EachMovie: contains approximately 2.8x10^6 ratings, given by 7.2x10^4 users on 1628 movies.

15 Experiment

16 Experiment Setting: 2. Learning EachMovie with MovieLens: All the preprocessing steps are the same as before, but the dimensions are 1000x800.

17 Experiment Results:

18 Experiment Results:

19 Semantics Extracting Solution
Goal: To deal with the situation where nothing is known about the user.
Motivation: Most movies are tagged with short phrases and words by users. Extract the semantics from the tags to describe the movie's content, and recommend in response to users' browsing and fuzzy queries.

20 Semantics Extracting: Related Work
Tags, as brief and informative data, have been used for recommendation and prediction:
(1) As binary variables only [1][2].
(2) Otherwise, users manually provide a relevance value between tag and item [3].
Tags are regarded as features instead of words of a language, and their semantics are ignored.
[1] Guan, Z., et al. Document recommendation in social tagging services. WWW'10. ACM.
[2] Tso-Sutter, et al. Tag-aware recommender systems by fusion of collaborative filtering algorithms. ACM Symposium on Applied Computing (SAC'08).
[3] Vig, Jesse, et al. "The tag genome: Encoding community knowledge to support novel interaction." ACM Transactions on Interactive Intelligent Systems (2012).

21 Semantics Extracting: Word Embedding
NLP perspective: Treat tags as brief and informative descriptions of the movie and extract the semantics by generating word embeddings [4].
Diagram labels: Tag; Vector; Neural Network; Lookup table containing the vectors; Hierarchical Softmax (Huffman tree).
Sample: shorten the distance between similar words.
Unsample: enlarge the distance between dissimilar words.
[4] Mikolov, et al. Distributed Representations of Words and Phrases and their Compositionality.

22 Semantics Extracting: Word Embedding
Modified for extracting tag semantics: generate 100-dimensional vectors representing tags, in which the vectors of similar and related tags have a large cosine similarity (inner product).

Aspect | Original Word2vec | Modified Tag2vec | Reason
Context | Fixed context window size: 5~10 | All tags of the movie, regardless of the length | All tags are related, regardless of order and position
Unsample | 5~10 random unsamples | >=1000 random unsamples | To effectively enlarge the distances
Dataset | Large corpus: Wikipedia | Tags of each movie + movie name + category | To extract the special semantics in tags, e.g. names, special phrases
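The "Context" change can be made concrete with a small sketch: instead of sliding a fixed window over a sentence, every other tag of the same movie serves as context for a given tag. The function name, the toy `movie_tags` data and the decision to drop duplicate tags are assumptions for illustration; the slides do not say how repeated tags are handled.

```python
from itertools import permutations

def tag_training_pairs(movie_tags):
    """Return (target, context) pairs where the context of a tag is
    every other tag attached to the same movie, regardless of order."""
    pairs = []
    for tags in movie_tags.values():
        unique = sorted(set(tags))          # assumption: duplicates are dropped
        pairs.extend(permutations(unique, 2))
    return pairs

# Toy data (hypothetical movies and tags).
movie_tags = {
    "movie_a": ["sci-fi", "action", "dystopia"],
    "movie_b": ["sci-fi", "robots"],
}
pairs = tag_training_pairs(movie_tags)
```

Tags that share a movie end up paired in both directions, and tags that never share a movie are never paired, so no window size needs to be chosen.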

23 Semantics Extracting : Movie & Query Embedding for Recommending
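The transcript does not preserve how movies and queries are embedded. One common approach, assumed here purely for illustration, is to average the vectors of a movie's tags (or of a query's words) and rank movies by cosine similarity to the query. The 3-dimensional `tag_vectors`, the movie names and all function names below are hypothetical toy values.

```python
import math

def mean_vector(vectors):
    """Component-wise average of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def embed(words, tag_vectors):
    """Average the vectors of the known words; None if no word is known."""
    known = [tag_vectors[w] for w in words if w in tag_vectors]
    return mean_vector(known) if known else None

def recommend(query_words, movie_tags, tag_vectors, top_k=2):
    """Rank movies by cosine similarity between query and movie embeddings."""
    query = embed(query_words, tag_vectors)
    if query is None:
        return []
    scored = []
    for movie, tags in movie_tags.items():
        vec = embed(tags, tag_vectors)
        if vec is not None:
            scored.append((movie, cosine(query, vec)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy embeddings and movies (hypothetical values).
tag_vectors = {
    "funny": [1.0, 0.0, 0.0],
    "kid": [0.8, 0.2, 0.0],
    "war": [0.0, 1.0, 0.0],
    "documentary": [0.0, 0.8, 0.6],
    "action": [0.0, 0.3, 1.0],
}
movie_tags = {
    "movie_a": ["funny", "kid"],
    "movie_b": ["war", "documentary"],
    "movie_c": ["action"],
}
single = recommend(["funny"], movie_tags, tag_vectors, top_k=1)
multi = recommend(["war", "documentary"], movie_tags, tag_vectors, top_k=1)
```

Because a visited movie's own tag average can be used as the query vector, the same ranking routine would cover both the "similar movies" and the fuzzy-query scenarios under this assumption.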

24 Semantics Extracting: Experiment
Data set: Full MovieLens (last updated 8/2015)

Item | Num
Movies | ~30,000
Users | 230,000
Tags | 510,000
Vocabulary Size | 11363
Training Words | 100949

Tag vectors shown for: funny, inspiring, moving

25 Semantics Extracting: Experiment Recommend movies similar to the one being visited: Matrix; The Lord of the Rings; I, Robot; Pride and Prejudice (2005)

26 Semantics Extracting: Experiment Recommend in response to a single-word fuzzy query: bond, China, kid, funny

27 Semantics Extracting: Experiment Recommend in response to a multi-word fuzzy query: french funny kid action Magic book war documentary

28 THANKS

