Published by Barrie Clark. Modified over 9 years ago.
1
Cold Start Problem in Movie Recommendation JIANG CAIGAO, WANG WEIYAN Group 20
2
Outline
1. Introduction
2. Problem Statement
3. Transfer Learning based
4. Semantic Extracting based
5. Experiments
3
Introduction: Problem
Collaborative filtering: a classical recommendation solution
1) Look for users who share the same rating patterns as the active user (the user who needs a recommendation).
2) Use the ratings from those like-minded users to predict the active user’s ratings for unrated items.
Cold start problem: there is not enough information to make recommendations.
New users take a long time to rate enough movies for their preferences to be predicted.
Browsers do not even provide any direct preference information.
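The two collaborative filtering steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy ratings, the user names, and the choice of cosine similarity as the measure of "like-minded" are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "alice": {"Matrix": 5, "I, Robot": 4, "Pride and Prejudice": 1},
    "bob":   {"Matrix": 4, "I, Robot": 5, "The Lord of the Rings": 5},
    "carol": {"Matrix": 1, "Pride and Prejudice": 5, "The Lord of the Rings": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity over the movies that both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = sqrt(sum(a[m] ** 2 for m in common))
    norm_b = sqrt(sum(b[m] ** 2 for m in common))
    return dot / (norm_a * norm_b)

def predict(user, movie):
    """Step 1: weight every other user by similarity to `user`.
    Step 2: return the similarity-weighted average of their ratings."""
    own = ratings.get(user, {})  # a brand-new user has no ratings at all
    weighted, total = 0.0, 0.0
    for other, theirs in ratings.items():
        if other == user or movie not in theirs:
            continue
        sim = cosine_similarity(own, theirs)
        weighted += sim * theirs[movie]
        total += sim
    # Cold start: with no overlapping neighbours there is nothing to predict from.
    return weighted / total if total > 0 else None

known = predict("alice", "The Lord of the Rings")
unknown = predict("dave", "Matrix")  # "dave" is a hypothetical new user
```

Here `known` leans towards bob's rating because bob's pattern is closer to alice's, while `unknown` is `None`: the cold start case, in which the method has no information to work with.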
4
Introduction: Solutions
With click data: recommend based on transfer learning. Extract knowledge from one or more source tasks and apply it to the target task.
With semantic data: recommend based on semantics extraction. Extract a movie’s semantic information from its tags, and respond to users’ actions, e.g. visiting a movie’s page or issuing a fuzzy query.
5
Introduction
6
Problem Statement
8
Methods
11
Learning Process: Use Jensen’s inequality to derive a lower bound on the log-likelihood, where H is the entropy.
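The slide's own equation is not part of this transcript. For orientation only, the generic form of such a bound is sketched below. The symbols are assumptions rather than the authors' notation: x is the observed data, z the latent variables, θ the model parameters, and q(z) any variational distribution.

```latex
\log p(x \mid \theta)
  = \log \sum_{z} q(z)\,\frac{p(x, z \mid \theta)}{q(z)}
  \;\ge\; \sum_{z} q(z)\,\log \frac{p(x, z \mid \theta)}{q(z)}
  = \mathbb{E}_{q}\!\left[\log p(x, z \mid \theta)\right] + H(q),
\qquad
H(q) = -\sum_{z} q(z)\,\log q(z).
```

The inequality is Jensen's inequality applied to the concave logarithm. It holds with equality when q(z) equals the posterior p(z | x, θ).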
12
Methods
13
Learning Process: Variational expectation-maximization (VEM)
1. VE-step: fix the model parameters and optimize the bound w.r.t. the variational parameters to make the bound as tight as possible.
2. VM-step: fix the variational parameters and optimize the bound w.r.t. the model parameters to raise the bound.
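The alternation between the two steps can be illustrated on a toy problem. The sketch below is not the model from these slides: it assumes a mixture of two biased coins, where the variational parameters are per-observation responsibilities `q` and the model parameters are the mixing weights `pi` and coin biases `theta`. All data and names are hypothetical.

```python
from math import exp, log

# Hypothetical toy data: number of heads in 10 flips, from two unknown coins.
N_FLIPS = 10
heads = [9, 8, 2, 1, 9, 3, 8, 2]

def log_lik(h, theta):
    """Log-probability of h heads in N_FLIPS flips (binomial constant dropped)."""
    return h * log(theta) + (N_FLIPS - h) * log(1.0 - theta)

def ve_step(pi, theta):
    """Fix (pi, theta); set each q_i to the posterior over the two coins."""
    q = []
    for h in heads:
        joint = [pi[k] * exp(log_lik(h, theta[k])) for k in range(2)]
        z = sum(joint)
        q.append([j / z for j in joint])
    return q

def vm_step(q):
    """Fix q; re-estimate mixing weights and coin biases in closed form."""
    pi, theta = [], []
    for k in range(2):
        mass = sum(qi[k] for qi in q)
        pi.append(mass / len(heads))
        theta.append(sum(qi[k] * h for qi, h in zip(q, heads)) / (mass * N_FLIPS))
    return pi, theta

def bound(q, pi, theta):
    """Lower bound on the log-likelihood: E_q[log joint] + entropy of q."""
    total = 0.0
    for qi, h in zip(q, heads):
        for k in range(2):
            if qi[k] > 0.0:
                total += qi[k] * (log(pi[k]) + log_lik(h, theta[k]) - log(qi[k]))
    return total

pi, theta = [0.5, 0.5], [0.6, 0.4]  # arbitrary starting point
history = []
for _ in range(20):
    q = ve_step(pi, theta)               # VE-step: make the bound tight
    history.append(bound(q, pi, theta))
    pi, theta = vm_step(q)               # VM-step: raise the bound
```

In this toy case the VE-step can use the exact posterior, so the bound touches the log-likelihood after every VE-step and `history` never decreases. In a genuinely variational setting q is restricted to a simpler family and the bound stays strictly below the log-likelihood.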
14
Experiment
Data sets: Four datasets are used in our experiments: Netflix, MovieLens, Book-Crossing and EachMovie.
Netflix: contains about 10^8 ratings in the range {1,2,3,4,5}, given by about 7x10^4 users on around 1.7x10^4 movies.
MovieLens: contains about 10^7 ratings, given by 7x10^4 users on around 10^4 movies.
EachMovie: contains approximately 2.8x10^6 ratings, given by 7.2x10^4 users on 1628 movies.
15
Experiment
16
Experiment Setting: 2. Learning EachMovie with MovieLens: all the preprocessing steps are the same as before, but the dimensions are 1000x800.
17
Experiment Results:
18
Experiment Results:
19
Semantics Extracting Solution
Goal: handle the situation where nothing is known about the user.
Motivation: most movies are tagged with short phrases and words by users.
Approach: extract the semantics from the tags to describe each movie’s content, then recommend in response to users’ browsing and fuzzy queries.
20
Semantics Extracting: Related Work
Tags, as brief and informative data, have been used for recommendation and prediction:
(1) As binary variables only [1][2].
(2) Otherwise, with users manually providing a relevance value between each tag and item [3].
In both cases tags are treated as features rather than as words of a language, so their semantics are ignored.
[1] Guan, Z., et al. Document recommendation in social tagging services. WWW’10. ACM.
[2] Tso-Sutter, et al. Tag-aware recommender systems by fusion of collaborative filtering algorithms. ACM Symposium on Applied Computing (SAC’08).
[3] Vig, Jesse, et al. The tag genome: Encoding community knowledge to support novel interaction. ACM Transactions on Interactive Intelligent Systems (2012).
21
Semantics Extracting: Word Embedding
NLP perspective: treat tags as a brief and informative description of the movie and extract the semantics by generating word embeddings [4].
Diagram labels: Tag; Vector; Neural Network; Lookup table containing the vectors; Hierarchical SoftMax (Huffman Tree).
Sample: shorten the distance between similar words.
Unsample: enlarge the distance between dissimilar words.
[4] Mikolov, et al. Distributed Representations of Words and Phrases and their Compositionality.
22
Semantics Extracting: Word Embedding
Modified for extracting tag semantics: generate 100-dimensional vectors representing tags, in which similar and related tags’ vectors have a large cosine similarity (inner product).
Aspect | Original Word2vec | Modified Tag2vec | Reason
Context | Fixed context window size: 5~10 | All tags of the movie, regardless of length | All tags are related, regardless of order and position
Unsample | 5~10 random unsamples | >=1000 random unsamples | To effectively enlarge the distances
Dataset | Large corpus: Wikipedia | Tags of each movie + movie name + category | To extract the special semantics in tags, e.g. names, special phrases
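The Context difference in the comparison above can be made concrete by looking at which (target, context) training pairs each scheme produces. This is a small sketch under assumptions: the function names and the tag list are hypothetical, and only pair generation is shown, not the training itself.

```python
from itertools import permutations

def window_pairs(tokens, window):
    """Word2vec-style (target, context) pairs inside a fixed-size window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        pairs.extend((target, tokens[j]) for j in range(lo, hi) if j != i)
    return pairs

def all_tag_pairs(tags):
    """Tag2vec-style pairs as described on the slide: every tag of a movie
    is context for every other tag, whatever the order or position."""
    return list(permutations(tags, 2))

# Hypothetical tag list for one movie, including its name and category.
movie_tags = ["matrix", "sci-fi", "dystopia", "hacker", "kung-fu", "action"]

narrow = window_pairs(movie_tags, window=1)
full = all_tag_pairs(movie_tags)
```

With a window of 1, the first and last tags never appear together, although both describe the same movie. The all-tags context pairs them regardless of where they happen to sit in the list.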
23
Semantics Extracting : Movie & Query Embedding for Recommending
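The body of this slide is not part of the transcript, so the authors' embedding method is unknown. A common baseline, assumed here purely for illustration, is to represent a movie or a query by the mean of its tag vectors and to rank movies by cosine similarity. The vectors, movie titles, and function names below are hypothetical.

```python
from math import sqrt

# Hypothetical 3-dimensional tag vectors (the slides use 100 dimensions).
tag_vectors = {
    "funny":  [0.9, 0.1, 0.0],
    "kid":    [0.7, 0.0, 0.3],
    "war":    [0.0, 0.9, 0.2],
    "french": [0.1, 0.2, 0.9],
}

# Hypothetical movies and their tags.
movie_tags = {
    "Movie A": ["funny", "kid"],
    "Movie B": ["war"],
    "Movie C": ["french", "funny"],
}

def mean_vector(words):
    """Average the vectors of the known words; None if no word is known."""
    vecs = [tag_vectors[w] for w in words if w in tag_vectors]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def recommend(query_words, top_n=2):
    """Rank movies by similarity between the query and movie embeddings."""
    query = mean_vector(query_words)
    if query is None:
        return []
    scored = []
    for title, tags in movie_tags.items():
        movie = mean_vector(tags)
        if movie is not None:
            scored.append((cosine(query, movie), title))
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_n]]

single = recommend(["war"])
multi = recommend(["french", "funny"])
```

The same function serves both use cases on the following slides: passing the tags of the movie being visited gives "similar movies", and passing the words of a fuzzy query gives query-driven recommendations.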
24
Semantics Extracting: Experiment
Data set: Full MovieLens (last updated 8/2015)
Tag vectors shown for: funny, inspiring, moving
Item | Num
Movies | ~30,000
Users | 230,000
Tags | 510,000
Vocabulary Size | 11363
Training Words | 100949
25
Semantics Extracting: Experiment
Recommend movies similar to the one being visited: Matrix; The Lord of the Rings; I, Robot; Pride and Prejudice (2005)
26
Semantics Extracting: Experiment
Recommend in response to single-word fuzzy queries: bond; China; kid; funny
27
Semantics Extracting: Experiment
Recommend in response to multi-word fuzzy queries: french funny; kid action; Magic book; war documentary
28
THANKS