Hybrid-ε-greedy for Mobile Context-Aware Recommender System
Djallel Bouneffouf, Amel Bouzeghoub & Alda Lopes Gançarski
Institut Télécom, Télécom SudParis, France
OUTLINE
1. Introduction
2. State of the art
3. Proposition
4. Experimental evaluation
5. Conclusion
Software company: access to and navigation through corporate data
MOBILE INFORMATION SYSTEMS: CONTEXT
Context-based Recommender System
- To reduce search and navigation time
- To assist users in finding information
PROBLEMS IN CONTEXT-BASED RECOMMENDER SYSTEM
A contextual recommender system algorithm:
- Selects item(s) to show to the user
- Gets feedback (click, time spent, ...)
- Refines the models
- Repeats (a large number of times), optimizing metrics of interest (total number of clicks, total reward, ...)
Item inventory: articles, web pages, documents, ...
Context: location, time, ...
Two questions arise:
- How to recommend information to users taking into account their surrounding environment (location, time, nearby people)?
- How to follow the evolution of the user's interest?
USER OR EXPERT SPECIFICATION
Constraints:
- Laborious
- Not a dynamic system
- Not a personalized system
Advantage:
- Context management
References: [Panayiotou, 2006] [Bila, 2008] [Bellotti, 2008] [Dobson, 2005] [Lakshmish, 2009] [Alexandre de Spindler, 2006] [Mieczysław, 2009] [Wei, 2010] [Lihong, 2010]
CONTENT-BASED AND COLLABORATIVE FILTERING
[Figure: dataset of situations (Meeting, Home, Drive, Office) with actions, rewards, and social group]
Constraints:
- Cold-start problem
- Slow training
Advantages:
- Context management
- Automatic process
References: [Panayiotou, 2006] [Bila, 2008] [Bellotti, 2008] [Dobson, 2005] [Lakshmish, 2009] [Alexandre de Spindler, 2006] [Mieczysław, 2009] [Wei, 2010] [Lihong, 2010]
MACHINE LEARNING: REINFORCEMENT LEARNING
- The greedy strategy does exploitation only.
- The ε-greedy strategy adds some random actions (exploration).
Advantages:
- Solves the cold-start problem
- Follows the evolution of user interest
Constraints:
- No context management
- Slow training
References: [Panayiotou, 2006] [Bellotti, 2008] [Bila, 2008] [Dobson, 2005] [Lakshmish, 2009] [Alexandre de Spindler, 2006] [Mieczysław, 2009] [Wei, 2010] [Lihong, 2010]
[Figure: number of clicks per document D1–D10 across displays, comparing exploration (mean = 0.48) and exploitation (mean = 0.79)]
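To make the ε-greedy strategy concrete, here is a minimal Python sketch (not the authors' implementation): with probability 1-ε the document with the best estimated CTR is exploited, otherwise a random document is explored. The document set, the ε value, and the simulated click feedback are illustrative assumptions.

```python
import random

def epsilon_greedy(ctr, documents, epsilon=0.1):
    """Pick a document: exploit the best estimated CTR with
    probability 1 - epsilon, otherwise explore at random."""
    if random.random() < epsilon:
        return random.choice(documents)          # exploration
    return max(documents, key=lambda d: ctr[d])  # exploitation

# Illustrative loop: CTR estimates are refined from observed clicks.
documents = ["D1", "D2", "D3"]
clicks = {d: 0 for d in documents}
displays = {d: 0 for d in documents}
ctr = {d: 0.0 for d in documents}

for _ in range(1000):
    d = epsilon_greedy(ctr, documents)
    clicked = random.random() < 0.3  # stand-in for real user feedback
    displays[d] += 1
    clicks[d] += int(clicked)
    ctr[d] = clicks[d] / displays[d]
```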
STATE OF THE ART: SUMMARY
A comparison of the three families of approaches (user or expert specification; content-based and collaborative filtering; reinforcement learning) against six criteria: context management, semantic context representation, content-based recommendation, automatic process, following the evolution of user interest, and solving the cold start.
References: [Panayiotou, 2006] [Bila, 2008] [Bellotti, 2008] [Dobson, 2005] [Lakshmish, 2009] [Alexandre de Spindler, 2006] [Mieczysław, 2009] [Wei, 2010] [Lihong, 2010]
MULTI-ARMED BANDITS (MAB)
A (basic) MAB problem has:
- A set D of possibilities (documents)
- An expected reward CTR(d) ∈ [0,1] for each d ∈ D
In each round:
- The algorithm picks a document d ∈ D based on past history
- The reward is an independent sample in [0,1] with expectation CTR(d)
This is the classical setting modeling the exploration/exploitation trade-off.
[Figure: CTR per document D1–D10]
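The round structure can be sketched as follows; the per-document CTR values are invented for illustration, and each reward is drawn as a Bernoulli sample so that its expectation is CTR(d).

```python
import random

# Hypothetical expected rewards, one per document.
true_ctr = {"D1": 0.10, "D2": 0.35, "D3": 0.20}

def play_round(pick):
    """One bandit round: a policy picks a document, then a reward
    in {0, 1} is sampled with expectation CTR(d)."""
    d = pick(list(true_ctr))
    reward = 1 if random.random() < true_ctr[d] else 0
    return d, reward

# Example round with a uniformly random policy.
d, r = play_round(lambda docs: random.choice(docs))
```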
CONTEXTUAL BANDITS
X is a set of situations (e.g., Meeting, Home, Drive, Office), D is a set of arms, and CTR: X × D → [0,1] gives the expected rewards.
In each round:
- A situation x ∈ X occurs
- The algorithm picks an arm d ∈ D
- The reward is an independent sample in [0,1] with expectation CTR(x, d)
[Figure: one CTR profile over documents D1–D10 for each situation x1, x2, x3]
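A natural way to implement this is to keep one CTR estimate per (situation, document) pair instead of one per document; a minimal sketch with illustrative situation names:

```python
import random
from collections import defaultdict

situations = ["Meeting", "Home", "Drive", "Office"]
documents = ["D1", "D2", "D3"]

clicks = defaultdict(int)    # keyed by (situation, document)
displays = defaultdict(int)

def ctr(x, d):
    """Estimated CTR of document d in situation x."""
    return clicks[x, d] / displays[x, d] if displays[x, d] else 0.0

def pick(x, epsilon=0.1):
    """epsilon-greedy restricted to the CTR profile of situation x."""
    if random.random() < epsilon:
        return random.choice(documents)
    return max(documents, key=lambda d: ctr(x, d))

def update(x, d, clicked):
    """Record the feedback observed for (x, d)."""
    displays[x, d] += 1
    clicks[x, d] += int(clicked)
```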
GET SITUATION FROM CONTEXT: SENSING
Raw data sensed on the mobile device, e.g. the time "Mon Oct 3 12:10", the GPS coordinates, and the client "NATIXIS".
GET SITUATION FROM CONTEXT: ABSTRACTION
The raw data (time "Mon Oct 3 12:10", GPS coordinates, client "NATIXIS") is abstracted through the Time, Location, and Social ontologies:

situation | time    | location | social
x1        | workday | Paris    | Bank
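A sketch of this abstraction step, with the three ontology lookups reduced to hypothetical dictionaries (a real system would query the Time, Location, and Social ontologies):

```python
# Hypothetical stand-ins for the three ontologies.
TIME_ONTOLOGY = {"Mon": "workday", "Sat": "holiday", "Sun": "holiday"}
LOCATION_ONTOLOGY = {(48.85, 2.35): "Paris"}  # GPS point -> place
SOCIAL_ONTOLOGY = {"NATIXIS": "Bank"}         # client -> category

def abstract_situation(day, gps, client):
    """Map raw sensed context to an abstract situation
    (time, location, social)."""
    return (TIME_ONTOLOGY.get(day, "unknown"),
            LOCATION_ONTOLOGY.get(gps, "unknown"),
            SOCIAL_ONTOLOGY.get(client, "unknown"))

# abstract_situation("Mon", (48.85, 2.35), "NATIXIS")
# -> ("workday", "Paris", "Bank")
```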
GET SITUATION FROM CONTEXT: RETRIEVING THE RELEVANT SITUATION
Past situations:

IDS | User    | Time    | Place    | Client
1   | Paul    | Workday | Paris    | BNP
2   | Fabrice | Holiday | Evry     | MGET
3   | Paul    | Workday | Gentilly | AMUNDI

Current situation:

IDS | User | Time    | Place | Client
1   | Paul | Workday | Paris | NATIXIS

RetrieveSituation, using the Time, Location, and Social ontologies, returns the most similar past situation:

IDS | User | Time    | Place | Client
1   | Paul | Workday | Paris | BNP
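A sketch of the retrieval step: the current situation is compared to each past situation dimension by dimension, and the most similar one is returned. The exact-match similarity and the uniform weights below are crude stand-ins for the ontology-based similarity measures.

```python
def sim(a, b):
    """Stand-in for an ontology-based concept similarity."""
    return 1.0 if a == b else 0.0

def retrieve_situation(current, past, weights=(1.0, 1.0, 1.0)):
    """Return the past (time, place, client) situation most similar
    to the current one (weighted sum of per-dimension similarities)."""
    def score(s):
        return sum(w * sim(a, b) for w, a, b in zip(weights, s, current))
    return max(past, key=score)

past = [("Workday", "Paris", "BNP"),
        ("Holiday", "Evry", "MGET"),
        ("Workday", "Gentilly", "AMUNDI")]
# retrieve_situation(("Workday", "Paris", "NATIXIS"), past)
# -> ("Workday", "Paris", "BNP"), as in the example above
```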
SELECT DOCUMENTS
[Figure: CTR per document d1–d10]

Hybrid-ε-greedy:
  d_t = argmax_d(CTR(d))  with probability 1-ε
        Random(D)         with probability ε

CBR-ε-greedy:
  d_t = argmax_d(CTR(d))  with probability 1-ε
        CBF(d)            with probability z
        Random(D)         with probability k

where ε = z + k is the probability of exploration and Content-Based Filtering, CBF(d), gives documents similar to document d.
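A minimal Python sketch of the selection rule with content-based exploration (the CBR-ε-greedy case above); the `cbf` function, which should return a document similar to a given one, is stubbed with a fixed similarity map for illustration.

```python
import random

def select_document(ctr, documents, cbf, z=0.05, k=0.05):
    """Exploit the best estimated CTR with probability 1 - (z + k);
    otherwise explore via content-based filtering (probability z)
    or uniformly at random (probability k)."""
    best = max(documents, key=lambda d: ctr[d])
    u = random.random()
    if u < z:
        return cbf(best)                 # CBF exploration
    if u < z + k:
        return random.choice(documents)  # random exploration
    return best                          # exploitation

# Illustrative usage with a stubbed CBF.
similar = {"d1": "d2", "d2": "d3", "d3": "d1"}
pick = select_document({"d1": 0.2, "d2": 0.5, "d3": 0.1},
                       ["d1", "d2", "d3"],
                       cbf=lambda d: similar[d])
```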
EXPERIMENTAL DATASETS
Data from the Nomalys diary: situation entries and navigation entries.

Diary situation entries:

IDS | User    | Time       | Place    | Client
1   | Paul    | 11/05/2011 | Paris    | AFNOR
2   | Fabrice | 15/05/2011 | Evry     | MGET
3   | Paul    | 19/05/2011 | Gentilly | AMUNDI

Diary navigation entries:

IdDoc | IDS | Click | Time  | Interest | Documents
1     | 1   | 2     | 2 min | 3/5      | Demand
2     | 1   | 3     | 3 min | 1/5      | Contact
...   | ... | ...   | ... sec | null   | Person
RECOMMEND DOCUMENTS: ε-VARIATION
d_t = argmax_d(CTR(d)) with probability 1-ε, Random(D) with probability ε, where ε is the probability of exploration.
[Figure: CTR as a function of ε, during learning and during deployment]
RECOMMEND DOCUMENTS: DATA SIZE VARIATION
[Figure: CTR as a function of data size, during learning and during deployment]
CONCLUSION
Our experiments lead to the following conclusion:
- Considering the user's context in the exploration/exploitation strategy significantly increases the performance of the recommender system.
In the future:
- We plan to investigate methods that automatically learn the optimal exploration/exploitation trade-off.